Re: [SystemSafety] Qualifying SW as "proven in use" [Measuring Software]

From: Michael Jackson < >
Date: Wed, 26 Jun 2013 11:49:14 +0100


Michael (if I may):

Your example also illustrates another fundamental point, as it was surely meant to.

The idea of 'software safety' is inherently absurd. What would be unsafe software?
Perhaps it might crash an (old-fashioned) operating system by leaking storage, or
it might wear out a hard disk by causing an excessively demanding pattern of arm
movements: but it's hard to think of many such possibilities.

By 'safety', as the example perfectly illustrates, we mean some set of properties of
the behaviour of the problem world outside the computer. Within the envelope of the
given characteristics of the relevant parts of the problem world, this behaviour will be
governed by the behaviour of the computer executing the controlling software.

Complexity in software design suggests that someone may not have understood something well enough. Among the many ways to misunderstand are:

  1. Making a 'pure software error'. The programmer's intention was correct, but the code does not achieve it.
  2. Misunderstanding the problem world behaviour required. The code achieves the programmer's intention, but the intention---possibly expressed in a specification---was faulty: valve W should be opened, not valve V.
  3. Misunderstanding the complexity of the intended behaviour in the problem world. Software may be complex because it tries to satisfy two functional requirements whose combination has not been adequately understood. This is the 'feature interaction' problem.
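
The first two kinds of misunderstanding can be told apart in a small sketch. It is illustrative only: the controller functions, the names and the return values are hypothetical, and it assumes (as in item 2) that the problem world really needs valve W opened on input 10.

```python
REQUIRED_VALVE = "W"  # assumption: what the problem world actually needs on input 10


def pure_software_error(value):
    """Kind 1: the intention (open W on input 10) is right, the code is wrong."""
    # Bug: strict comparison, so the input 10 itself never opens the valve.
    return "W" if value > 10 else None


def faulty_intention(value):
    """Kind 2: the intention (open V on input 10) is wrong, the code matches it."""
    return "V" if value == 10 else None


def satisfies(controller, intended_valve):
    """True if the controller opens intended_valve when given input 10."""
    return controller(10) == intended_valve


# Kind 1 fails against its own, correct, intention.
print(satisfies(pure_software_error, "W"))
# Kind 2 passes against its own intention but fails the problem world.
print(satisfies(faulty_intention, "V"))
print(satisfies(faulty_intention, REQUIRED_VALVE))
```

Checking code against the programmer's intention catches only the first kind; the second is visible only when the behaviour is compared with what the problem world requires.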

Even if every part of the problem world---misleadingly called 'the environment'---is to
be taken as given, the task of software development must still be seen as the task
of designing a behaviour in that given problem world. Safety is specifically a property
of that designed behaviour and the reliability with which it is evoked and controlled
by the software.

At 20:13 25/06/2013, C. Michael Holloway wrote:
>Greetings,
>
>Coming up with an argument that supports the premise that it is
>possible for software to have many defects, but still be safe is
>trivial. Consider software that is designed to meet the following
>requirements:
>
>A. If the input received is the integer 10, open valve V within 5 seconds.
>B. For integer inputs 1-9, 11-26, return the corresponding letter
>of the English alphabet.
>C. For all other integer inputs, return the letter 'Q'.
>D. For all non-integer inputs, return the string "Invalid input."
>
>Suppose further that requirement A is the only requirement with any
>safety implications. So long as the software always opens valve V
>within 5 seconds, safety is maintained, even if the software never
>once satisfies requirements B, C, or D. Since the input domain is
>potentially infinite, one could say that the software is infinitely
>defective but perfectly safe.
>
>Of course, the requirements for real systems are rarely this simple;
>but there are real systems for which it is possible to distinguish
>between requirements that have safety implications and those that
>don't. For any such system, lack of defects is not a necessary
>condition for safety.
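
A minimal sketch of the example above, under stated assumptions: the 5-second deadline is not modelled, opening the valve is represented by a hypothetical token, letters are assumed to be upper case, and the function names are invented for illustration. It shows an implementation that violates B, C and D on every input tried, yet still satisfies A.

```python
import string

VALVE_OPENED = "OPEN_VALVE_V"  # hypothetical stand-in for the physical action


def is_integer(value):
    # bool is a subclass of int in Python; treat it as a non-integer input.
    return isinstance(value, int) and not isinstance(value, bool)


def required_output(value):
    """Requirements A-D as quoted above (timing ignored)."""
    if not is_integer(value):
        return "Invalid input."                    # D
    if value == 10:
        return VALVE_OPENED                        # A
    if 1 <= value <= 26:
        return string.ascii_uppercase[value - 1]   # B
    return "Q"                                     # C


def defective_software(value):
    """Satisfies A only; every other input gets a wrong answer."""
    if is_integer(value) and value == 10:
        return VALVE_OPENED
    return "?"


def is_safe(software, inputs):
    """Only requirement A has safety implications."""
    return all(software(v) == VALVE_OPENED
               for v in inputs if is_integer(v) and v == 10)


def count_defects(software, inputs):
    """Number of inputs on which the software deviates from A-D."""
    return sum(1 for v in inputs if software(v) != required_output(v))


sample_inputs = [10, 1, 9, 11, 26, 0, 27, -5, "x", 3.5]
print(is_safe(defective_software, sample_inputs))        # True
print(count_defects(defective_software, sample_inputs))  # 9
```

Nine of the ten sample inputs expose a defect, and the safety check still passes, because the check looks only at requirement A.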
>
>--
>
>C. Michael Holloway
>
>Disclaimer: My opinions are mine alone. Give neither blame nor
>credit to my employer for them.
>
>On 6/25/13 2:47 PM, Steve Tockey wrote:
>>
>>Matthew,
>>Have you looked into the work done by Nancy Leveson? The issue is,
>>I think, that defects (i.e., reliability) and safety are related
>>but distinct. One could have software with few defects (reliable)
>>that is safe, just as one could have software with few defects
>>(reliable) that wasn't safe--considering the environment in which
>>the software is being used. I'm just hard pressed to come up with
>>an argument that supports the premise that software with many
>>defects (unreliable) *is* safe. So again, I'm proposing this is
>>another one of those "necessary but not sufficient" cases. One
>>should always be paying attention to building software with a low
>>probability of defects in the first place, particularly in safety
>>critical systems. But just paying attention to low probability of
>>defects leaves a whole slew of other important failure modes where
>>safety goes out the window.
>>
>>
>>-- steve
>>
>>
>>
>>Date: Monday, June 24, 2013 7:37 PM
>>Cc: Bielefield Safety List
>>Subject: Re: [SystemSafety] Qualifying SW as "proven in use"
>>[Measuring Software]
>>
>>Isn't the underlying question whether what you are measuring
>>actually correlates with what you're interested in?
>>
>>If you're not measuring 'safety' directly but indirectly through
>>'faults' then there's the issue of systematic error in your measurement.
>>
>>Should I care about the entire population of faults from a safety
>>perspective? If so why (e.g. what's the causal reasoning)?
>>
>>Taking the argument to an extreme, are the proponents positing that
>>if a fault density below some predefined level is achieved then one
>>can turn around and say the software was 'safe'? If not what else
>>do I have to do?
>>
>>
>>
>>
>>On Tue, Jun 25, 2013 at 9:38 AM, Derek M Jones
>>All,
>>
>>Actually, getting the evidence isn't that tricky, it's just a lot of work.
>>
>>
>>This is true of most things (+ getting the money to do the work).
>>
>>Essentially all one needs to do is to run a correlation analysis
>>(correlation coefficient) between the proposed quality measure on the one
>>hand, and defect tracking data on the other hand.
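
The correlation analysis described above can be sketched as follows. This is only an illustration: the per-function numbers are invented, the variable names are hypothetical, and the Pearson coefficient is just one possible choice of correlation coefficient.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Invented example data, one entry per function (NOT from any real project).
quality_measure = [2, 4, 5, 8, 12, 15]   # e.g. a complexity score
defects_logged = [0, 1, 1, 2, 4, 3]      # defects traced to that function

r = pearson(quality_measure, defects_logged)
print(f"r = {r:.2f}")
```

A coefficient computed this way says nothing by itself about causation, or about whether both columns are simply tracking a third quantity such as lines of code.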
>>
>>
>>There is plenty of dirty data out there that needs to be cleaned up
>>before it can be used:
>><http://shape-of-code.coding-guidelines.com/2013/06/02/data-cleaning-the-next-step-in-empirical-software-engineering/>
>>
>>For example, the code quality measure "Cyclomatic Complexity" (reference:
>>Tom McCabe, A Complexity Measure, IEEE Transactions on Software
>>Engineering, December, 1976) was validated many years ago by simply
>>
>>
>>I am not aware of any study that validates this metric to a reasonable
>>standard. There are a few studies that have found a medium
>>correlation in a small number of data points.
>>
>>I have some data whose writeup is not yet available in a good enough
>>draft form to post to my blog. I only plan to write about this
>>metric because it is widely cited and is long overdue for relegation
>>to the history of good ideas that did not stand the scrutiny of
>>empirical evidence.
>>
>>finding a strong positive correlation between the cyclomatic complexity of
>>functions and the number of defects that were logged against those same
>>
>>
>>Correlation is not causation.
>>
>>Cyclomatic complexity correlates well with lines of code, which
>>in turn correlates well with number of faults.
>>
>>functions (i.e., code in that function needed to be changed in order to
>>repair that defect).
>>
>>
>>Changing the function may increase the number of faults. Creating two
>>functions where there was previously one will reduce an existing peak
>>in the distribution of values, but will it result in fewer faults
>>overall?
>>
>>All this stuff with looking for outlier metric values is pure hand
>>waving. Where is the evidence that the reworked code is better not
>>worse?
>>
>>According to one study of 18 production applications, code in functions
>>with cyclomatic complexity <=5 was about 45% of the total code base but
>>this code was responsible for only 12% of the defects logged against the
>>total code base. On the other hand, code in functions with cyclomatic
>>complexity of >=15 was only 11% of the code base but this same code was
>>responsible for 43% of the total defects. On a per-line-of-code basis,
>>functions with cyclomatic complexity >=15 have more than an order of
>>magnitude increase in defect density over functions measuring <=5.
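
The "order of magnitude" claim can be checked directly from the percentages quoted above. The sketch assumes that the share of defects divided by the share of code is a fair proxy for relative defect density per line; the function name is invented for illustration.

```python
def relative_defect_density(defect_share, code_share):
    """Defect share divided by code share; 1.0 means 'average' density."""
    return defect_share / code_share


# Figures quoted above: cyclomatic complexity <= 5 and >= 15 respectively.
low_cc = relative_defect_density(defect_share=0.12, code_share=0.45)
high_cc = relative_defect_density(defect_share=0.43, code_share=0.11)

ratio = high_cc / low_cc
print(f"low CC:  {low_cc:.2f} x average density")
print(f"high CC: {high_cc:.2f} x average density")
print(f"ratio:   {ratio:.1f}")
```

With these figures the ratio comes out at roughly 14.7, consistent with "more than an order of magnitude". The calculation does not address whether function size explains the difference.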
>>
>>What I find interesting, personally, is that complexity metrics for
>>object-oriented software have been around for about 20 years and yet
>>nobody (to my knowledge) has done any correlation analysis at all (or, at
>>a minimum they have not published their results).
>>
>>The other thing to remember is that such measures consider only the
>>"syntax" (structure) of the code. I consider this to be *necessary* for
>>code quality, but far from *sufficient*. One also needs to consider the
>>"semantics" (meaning) of that same code. For example, to what extent is
>>the code based on reasonable abstractions? To what extent does the code
>>exhibit good encapsulation? What are the cohesion and coupling of the
>>code? Has the code used "design-to-invariants / design-for-change"? One can
>>have code that's perfectly structured in a syntactic sense and yet it's
>>garbage from the semantic perspective. Unfortunately, there isn't a way
>>(that I'm aware of, anyway) to do the necessary semantic analysis in an
>>automated fashion. Some other competent software professionals need to
>>look at the code and assess it from the semantic perspective.
>>
>>So while I applaud efforts like SQALE and others like it, one needs to be
>>careful that it's only a part of the whole story. More work--a lot
>>more--needs to be done before someone can reasonably say that some
>>particular code is "high quality".
>>
>>
>>Regards,
>>
>>-- steve
>>
>>
>>
>>
>>
>>
>>-----Original Message-----
>>Date: Friday, June 21, 2013 6:04 AM
>>To:
>>Subject: Re: [SystemSafety] Qualifying SW as "proven
>>in use" [Measuring Software]
>>
>>I agree with Derek
>>
>>Code quality is only a means to an end
>>We need evidence to show the means actually helps to achieve the ends.
>>
>>Getting this evidence is pretty tricky, as parallel developments for the
>>same project won't happen.
>>But you might be able to infer something on average over multiple projects.
>>
>>Derek M Jones wrote:
>>Thierry,
>>
>>To answer your questions:
>>1) Yes, there is some objective evidence that there is a correlation
>>between a low SQALE index and quality code.
>>
>>
>>How is the quality of code measured?
>>
>>Below you say that SQALE DEFINES what is "good quality" code.
>>In this case it is to be expected that a strong correlation will exist
>>between a low SQALE index and its own definition of quality.
>>
>>For example ITRIS has conducted a study where the "good quality" code
>>is statistically linked to a lower SQALE index, for industrial
>>software actually used in operations.
>>
>>
>>Again how is quality measured?
>>
>>No, there is not enough evidence, we wish there would be more people
>>working on getting the evidence.
>>
>>
>>Is there any evidence apart from SQALE correlating with its own
>>measures?
>>
>>This is a general problem, lots of researchers create their own
>>definition of quality and don't show a causal connection to external
>>attributes such as faults or subsequent costs.
>>
>>Without running parallel development efforts that
>>follow/don't follow the guidelines it is difficult to see how
>>reliable data can be obtained.
>>
>>
>>
>>--
>>Derek M. Jones                  tel: +44 (0) 1252 520 667
>>Knowledge Software Ltd          blog: shape-of-code.coding-guidelines.com
>>Software analysis               http://www.knosof.co.uk
>>_______________________________________________
>>The System Safety Mailing List
>>
>>
>>
>>
>>--
>>Matthew Squair
>>
>>Mob: +61 488770655
>
>_______________________________________________
>The System Safety Mailing List
>systemsafety_at_xxxxxx



The System Safety Mailing List
systemsafety_at_xxxxxx Received on Wed Jun 26 2013 - 12:49:29 CEST
