[SystemSafety] FW: Software reliability (or whatever you would prefer to call it) [UNCLASSIFIED]

From: Smith, Brian E. (ARC-TH) < >
Date: Fri, 6 Mar 2015 17:39:28 +0000


This is a most interesting discussion. I very much appreciate the comments, especially those from my distinguished colleague, Michael Holloway at NASA Langley.

Next week, NASA Ames is hosting a technical workshop entitled, "Transition to Autonomy." Every morning I harvest general ideas and comments from this discussion thread to give my little grey cells something to contribute at this event.

The topic of how to characterize, measure, and assure the 'performance' of safety-critical software in new autonomous, automated systems is either (1) a potential show-stopper or (2) an enabler for implementing such advanced software in aviation, depending on one's perspective...

Another colleague of mine at Langley is what he terms an 'optimistic skeptic' when it comes to automation. He asserts that software-enabled autonomy may be a great thing, but we are implementing it incorrectly, because it just shuts the human out of the loop and expects him/her to 'pay attention.' But we know that humans can't sustain that over long periods of time. There are many who believe that we just need better displays and better information driven by more 'reliable' (or whatever term we can agree on) software. That helps, but it's not the answer. Better displays and information won't help in routine commercial flight. We need to look at what we want the human to do and then provide a function allocation between human and machine that allows and even enhances his/her ability to do that. Instead, we put them in a quiet, dark flight deck with a nice engine thrum (except on the A380) and tell them to pay attention to the outputs from the avionics software. And then we usually startle the heck out of them and demand that they respond quickly. Similar arguments could be made with respect to air traffic controllers and how they interact with ground-based Performance Based Navigation systems.

Let's not forget that today's remarkable record in flight safety has been achieved by the triad of software, hardware, and 'liveware' (people), to borrow a term from the ICAO SCHELL model. Much of the discussion on this forum has centered on the SW/HW dichotomy. These two elements facilitate the total system capabilities we want; they are not goals to be pursued in and of themselves. They need to be set in context.

DOT/FAA/AR-10/27 is a 2010 document entitled, "Flight Crew Intervention Credit in System Safety Assessments: Evaluation by Manufacturers and Definition of Application Areas." The research effort described in that reference is motivated by the following statement:

"According to current regulations for type certification of large commercial aircraft, certification credit may be taken for correct and appropriate action for both quantitative and qualitative assessments provided that some general criteria are fulfilled. According to the same regulations, quantitative assessments of the probabilities of flight crew errors are not considered feasible. As a consequence, the system designer is allowed to take 100% credit for correct flight crew action in response to a failure. Previous research indicates that this leads to an overestimation of flight crew performance."

So assessing human 'reliability' may be as difficult, and as critical to improving system-wide safety, as assessing software 'reliability.' Each has its own thorny challenges in how we define terms and measure performance.
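To see what is at stake numerically, here is a back-of-the-envelope sketch in Python. All of the rates below are invented for illustration; none come from DOT/FAA/AR-10/27 or from any certification basis.

    # Flight crew intervention credit: illustrative arithmetic only.
    # Model: a system failure leads to a catastrophic outcome only if
    # the flight crew then fails to respond correctly.

    p_failure = 1e-5               # assumed rate of the triggering system
                                   # failure, per flight hour (invented)
    p_crew_error_credited = 0.0    # current practice: 100% credit for
                                   # correct and appropriate crew action
    p_crew_error_assumed = 1e-2    # hedged alternative: assume the crew
                                   # mishandles 1 in 100 such events

    print(p_failure * p_crew_error_credited)  # 0.0  -- the crew branch
                                              # vanishes from the assessment
    print(p_failure * p_crew_error_assumed)   # 1e-7 -- two orders of magnitude
                                              # above the usual 1e-9 target for
                                              # catastrophic failure conditions

With full credit, the human branch contributes exactly nothing to the computed hazard probability; with even a modest assumed error rate, the same event tree busts the budget. That is the overestimation the report warns about.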

I very much look forward to following this thread in the days ahead...

Brian E. Smith

Special Assistant for Aeronautics
Human Systems Integration Division
Bldg N262, Room 120; Mail Stop 262-11
NASA Ames Research Center
P.O. Box 1000
Moffett Field, CA 94035

(v) 650.604.6669, (c) 650.279.1068, (f) 650.604.3323

Never let an airplane or a motorcycle take you somewhere your brain didn't go five seconds earlier.  

On 3/6/15, 6:17 AM, "RICQUE Bertrand (SAGEM DEFENSE SECURITE)" <bertrand.ricque_at_xxxxxx> wrote:

>Right, and this is the problem at least for process industries making
>huge use of this type of behaviour.
>
>Bertrand Ricque
>Program Manager
>Optronics and Defence Division
>Sights Program
>Mob : +33 6 87 47 84 64
>Tel : +33 1 58 11 96 82
>Bertrand.ricque_at_xxxxxx
>
>-----Original Message-----
>From: systemsafety-bounces_at_xxxxxx
>[mailto:systemsafety-bounces_at_xxxxxx] On Behalf Of King, Martin (NNPPI)
>Sent: Friday, March 06, 2015 2:31 PM
>To: systemsafety_at_xxxxxx
>Subject: Re: [SystemSafety] Software reliability (or whatever you would
>prefer to call it) [UNCLASSIFIED]
>
>This message has been marked as UNCLASSIFIED by King, Martin (NNPPI)
>
>
>Many safety shutdown systems will spend a considerable proportion of
>their time (90%+) in one of two plant states (operational and
>maintenance) with parameters that are quite limited in range. The two
>dominant states usually have parameter values that are quite disparate.
>Most of the remainder of the time is spent transitioning between these
>two states. In an ideal world the limited range of parameter values that
>will cause a shutdown will never occur - in practice they will normally
>occur extremely rarely over the life of the plant. Is this really the
>input value distribution that we want to test our equipment with?
>
>Martin King
>(My opinions etc, not necessarily those of my employer or colleagues!)
>
>
>-----Original Message-----
>From: systemsafety-bounces_at_xxxxxx >[mailto:systemsafety-bounces_at_xxxxxx >Martyn Thomas
>Sent: 06 March 2015 13:04
>To: systemsafety_at_xxxxxx
>Subject: Re: [SystemSafety] Software reliability (or whatever you would
>prefer to call it)
>
>I agree. That's why I added the point about explicit assumptions before
>using such measurements to predict the future.
>
>There is usually a hidden assumption that the future input distribution
>will match that encountered during the measurement. But it's hard to
>justify having high confidence that such an assumption will prove correct.
>
>Martyn
>
>On 06/03/2015 12:32, Derek M Jones wrote:
>> Martyn,
>>
>>> The company calculates some measure of the amount of usage before
>>> failure. Call it MTBF.
>>
>> Amount of usage for a given input distribution.
>>
>> A complete reliability model has to include information on the
>> software's input distribution.
>>
>> There is a growing body of empirical work that builds fault models
>> based on reported faults over time. Nearly all of them suffer from
>> the flaw of ignoring the input distribution (they also tend to ignore
>> the fact that the software is changing over time, but that is another
>> story).
>>
>



The System Safety Mailing List
systemsafety_at_xxxxxx
Received on Fri Mar 06 2015 - 18:39:42 CET

This archive was generated by hypermail 2.3.0 : Fri Apr 26 2019 - 00:17:07 CEST