Re: [SystemSafety] OpenSSL Bug

From: Dewi Daniels < >
Date: Tue, 15 Apr 2014 17:23:29 +0100


Derek Jones sent me a private communication, but he has kindly agreed I can respond to this list.  

I wrote:

>> How about Andy German's paper on "Software Static Code Analysis
>> Lessons Learned"?
>>
>> http://www.crosstalkonline.org/storage/issue-archives/2003/200311/200311-German.pdf
>>
>> "Table 1 shows that the poorest language for safety-critical
>> applications is C with consistently high anomaly rates. The best
>> language found is SPARK (Ada), which consistently achieves one
>> anomaly per 250 software lines of code".

Derek replied:

> Thanks. I have read this paper.
>
> It is difficult to read anything into the results because nothing is
> said about the usage (i.e., applications that are more heavily used
> are more likely to experience more faults) or about the testing
> investment that happened prior to release (obviously more testing
> means fewer faults in the field).

Andy German's paper reports on the results of conducting static code analysis on the airborne software for the Lockheed C-130J between 1995 and 1996. This work was carried out by Lloyd's Register and by Aerosystems. I worked for Lloyd's Register at the time. The Lockheed C-130J was undergoing FAA certification, and the airborne software was therefore being developed to DO-178B. As the launch customer, the UK Ministry of Defence required the airborne software to be subjected to static code analysis in addition to the DO-178B software development and verification process. We used the SPARK Examiner for the C-130J mission computer software, which was written in SPARK, and MALPAS for the remaining code (a mix of C, Ada, Pascal, PLM and Lucol). Only the DO-178B Level A and Level B software was analysed.  
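To give a concrete sense of what was being counted, the fragment below is a minimal, hypothetical C example (it is not from the C-130J code base, and the names are made up) of the sort of data-flow anomaly that flow-analysis tools such as MALPAS report: a variable that can be read before it has been assigned on one path through the code, even though the code compiles cleanly.

    /* Hypothetical illustration only: not taken from the C-130J code base.
     * It shows a data-flow anomaly of the kind flow analysis reports: a
     * variable that may be read before it has been assigned on one path,
     * even though the compiler accepts the code without complaint.
     */
    #include <stdio.h>

    static int scale_reading(int raw, int use_default_gain)
    {
        int gain;                 /* declared but not initialised        */

        if (use_default_gain) {
            gain = 100;           /* assigned on this path only          */
        }
        /* No else-branch: when use_default_gain is 0, 'gain' is read
         * below while still uninitialised, which flow analysis flags as
         * an anomaly.
         */
        return raw * gain;
    }

    int main(void)
    {
        printf("%d\n", scale_reading(3, 1));  /* well-defined: gain set  */
        printf("%d\n", scale_reading(3, 0));  /* anomalous: gain not set */
        return 0;
    }

The SPARK Examiner's flow analysis rejects the equivalent construct in SPARK outright, which may go some way towards explaining the much lower anomaly rates for SPARK in Table 1.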

The static code analysis found a number of defects in the software. Andy German's paper presents several interesting findings:  

  1. "When Level A was compared to Level B, no significant difference in anomaly rates identified by static analysis was found".
  2. "Table 1 shows that the poorest language for safety-critical applications is C with consistently high anomaly rates. The best language found is SPARK (Ada), which consistently achieves one anomaly per 250 software lines of code".
  3. "The average number of safety-critical anomalies found is a small percentage of the overall anomalies found with about 1 percent identified as having safety implications".
  4. "Automatically generated code was found to have considerably reduced syntactic and data flow errors".

I reproduce Table 1 below:  

Software Language   Range                         Software Lines of   Anomalies Per Thousand
                                                  Code Per Anomaly    Lines of Code
-----------------   --------------------------    -----------------   ----------------------
C                   Worst                         2                   500
                    Average                       6 - 38              167 - 26
                    Best (Auto Code Generated)    80                  12.5
Pascal              Worst                         6                   167
                    Average/Best                  20                  50
PLM                 Average                       50                  20
Ada                 Worst                         20                  50
                    Average                       40                  25
                    Best (Auto Code Generated)    210                 4.8
Lucol               Average                       80                  12.5
SPARK               Average                       250                 4
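The two right-hand columns are simply reciprocals of one another: anomalies per thousand lines of code is 1000 divided by the number of software lines of code per anomaly. For example, SPARK's figure of one anomaly per 250 lines corresponds to 1000 / 250 = 4 anomalies per thousand lines of code, while the worst C figure of one anomaly per 2 lines corresponds to 1000 / 2 = 500.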

I also recollect that significant differences in anomaly rates were found between the software produced by different vendors, though this is not highlighted in Andy's paper.  

I think these findings are very interesting and are worthy of further investigation. However, I also believe there are a number of reasons why we need to be careful not to read too much into the C-130J experience:  

  1. Due to schedule constraints, many software sub-systems were delivered for static code analysis before they had been subjected to formal software verification. We therefore do not know how many of the defects found by static code analysis would have been found anyway by the DO-178B software verification process. Some of the software baselines delivered for static code analysis were very immature, so the number of defects found is not in any way representative of the output of a DO-178B-compliant software process. I personally helped conduct the static analysis of the software sub-system with the highest anomaly rate (500 anomalies per 1000 lines of code according to Table 1). Although it was supposedly developed to DO-178B Level B, the code had clearly been subjected to only cursory review and test. The design did not satisfy the requirements, and the code did not satisfy the design. It would not have been signed off by any DER in that state. Indeed, I believe that this sub-system was rejected by Lockheed and replaced with an alternate part from another vendor.
  2. We counted all defects found, ranging from software defects that could have resulted in a catastrophic failure condition to spelling mistakes in supporting documentation. This accounts for the unusually high anomaly rates quoted. For example, Table 1 shows that the anomaly rate for the mission computer software, which was written in SPARK, was 4 anomalies per thousand lines of code. The mission computer software was formally specified using Parnas tables, coded in SPARK, and program proof was conducted using the SPARK Simplifier and SPADE Proof Checker. This is the same software process that Praxis reported to result in defect rates of fewer than 1 anomaly per thousand lines of code on other projects.
  3. While a significant difference was found in the anomaly rates resulting from the use of different programming languages, there was an even greater difference between the anomaly rates discovered in software developed by different vendors. While the average C program had a higher anomaly rate than the average Ada program, the best C programs had a lower anomaly rate than the worst Ada programs.

Yours,  

Dewi Daniels | Managing Director | Verocel Limited

Direct Dial +44 1225 718912 | Mobile +44 7968 837742 | Email ddaniels_at_xxxxxx  

Verocel Limited is a company registered in England and Wales. Company number: 7407595. Registered office: Grangeside Business Support Centre, 129 Devizes Road, Hilperton, Trowbridge, United Kingdom BA14 7SZ


