Re: [SystemSafety] Degraded software performance [diverged from Fault, Failure and Reliability Again]

Date: Thu, 5 Mar 2015 10:46:00 +0100

This distinction is important as soon as you have the possibility (and the duty) to choose between full hardware technology (e.g. relays) and HW+SW. One must be able to compare the performances before choosing…

Bertrand Ricque
Program Manager
Optronics and Defence Division
Sights Program
Mob : +33 6 87 47 84 64
Tel : +33 1 58 11 96 82

Sent: Wednesday, March 04, 2015 6:34 PM
To: Nick Tudor
Cc: systemsafety_at_xxxxxx
Subject: Re: [SystemSafety] Degraded software performance [diverged from Fault, Failure and Reliability Again]

I think you've reversed the point I was making, and then disagreed with the opposite of what I was saying. What I really should have done is use "computer system reliability" and refuse to buy into the hardware/software demarcation issue.

I disagree with claiming failure rates for software, regardless of whether they are carefully concocted statistical estimates or "software doesn't fail". BOTH rely on making some arbitrary distinction between what is software and what is hardware. Whoever makes that distinction, wherever they make it, has an obligation to state clear assumptions about the other side of the distinction, and to have grounds for believing those assumptions to be realistic.

You want to say that each of my failure modes for software "is a hardware issue". Fine. But you don't want to make claims for software reliability either. If you're not going to make a claim for reliability, any distinction between software and hardware you want to create is fine by me. Anyone who wants to claim either hardware or software reliability though, and also wants to make a distinction between "software issues" and "hardware issues", needs to consider both sides of the distinction. If someone wants to say "the processor that the software runs on is not software", then their standard needs to specifically address how they'll make sure that their software requirements consider the aging of the processor. If they want to say that changes in the input profile for the software are not a software issue, then they need to go back to software engineering school, because there's no universe in which a changed pattern of inputs does not change the probability of an incorrect output.

On the plus side, if you'll let me characterise your message as a strawman (instead of an honest misinterpretation of intent, which I'm sure it was) I can complete my mailing list fallacy bingo card. We've already had argument from antiquity, argument from authority, "is" equals "ought", equivocation, false equivalence, and not understanding the difference between false and falsifiable. I don't think we've had anyone blatantly misrepresent anyone else's position though.

Drew

My mobile (from October 6th): 0450 161 361

In-line responses, Andrew:

I need to give more than one example, because the point is general, rather than specific to the individual causes. In each case the cumulative probability of software failure increases over time.

>> if you can determine the wear-out mechanism for software I would agree, but you can't, so I don't.

1) Damage to the instruction set

e.g. the physical record of the instructions on a storage medium changes
very specific e.g. bit flip on a magnetic storage device holding the executable files

>> this is a hardware issue.

2) Increased unreliability of the physical execution environment

e.g. an increased rate of processor errors
very specific e.g. dust accumulates on part of the processor card, making it run hot and produce calculation errors

>> this too is hardware.

3) Increased unreliability of input hardware

e.g. software is required to detect and respond correctly to an increased rate and variety of sensor failure combinations

Note: This is the one that challenges "but we're running the software in exactly the same hardware environment". Hardware environments change as they get older.


4) Software accumulates information during runtime

e.g. a count of elapsed time
e.g. increasing volume of stored data
e.g. memory leak
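
A minimal sketch of the "count of elapsed time" case, in Python. It assumes a hypothetical 32-bit millisecond tick counter; the names MASK32, naive_expired and wrap_safe_expired are illustrative and do not come from the thread. Such a counter wraps after 2^32 ms, roughly 49.7 days, so a comparison that behaves correctly throughout testing can begin to give wrong answers only after long uptime:

```python
# Assumption: elapsed time is held in a 32-bit unsigned millisecond counter.
MASK32 = 0xFFFFFFFF


def naive_expired(now_ticks: int, deadline_ticks: int) -> bool:
    """Plain comparison: correct only while the counter has not wrapped."""
    return now_ticks >= deadline_ticks


def wrap_safe_expired(now_ticks: int, deadline_ticks: int) -> bool:
    """Compare via the 32-bit difference, treated as a signed value.

    Valid as long as the deadline is less than 2**31 ticks away.
    """
    diff = (now_ticks - deadline_ticks) & MASK32
    return diff < 0x80000000


# A timer armed shortly before the counter wraps.
start = 0xFFFFFF00
deadline = (start + 1000) & MASK32    # wraps around to a small number
halfway = (start + 500) & MASK32      # also past the wrap point

# The naive check reports "expired" the moment the timer is armed.
naive_at_start = naive_expired(start, deadline)
# The wrap-aware check waits for the full 1000 ticks.
safe_at_start = wrap_safe_expired(start, deadline)
safe_at_halfway = wrap_safe_expired(halfway, deadline)
safe_at_deadline = wrap_safe_expired(deadline, deadline)
```

The instructions are identical on day 1 and day 50; only the accumulated state differs, which is why this kind of fault is easy to miss in short test campaigns.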

>> bad requirements and/or bad verification.

NB1: In all of these cases I've heard arguments "that's not the software, that's X". Those arguments are only relevant if you can control for X when collecting data for software reliability calculation. Software without an execution environment is a design. It "never fails" in the way that _no_ design fails. When it does fail, it is subject to the same degradation over time as any physical implementation.
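
To illustrate why the input side has to be controlled for when collecting such data, here is a rough Python sketch of the point made earlier in the thread that a changed pattern of inputs changes the probability of an incorrect output. It applies the law of total probability over input classes. Every number and class name below is invented for illustration, and failure_probability is a hypothetical helper, not anything taken from a standard:

```python
import math


def failure_probability(profile: dict, fail_given_class: dict) -> float:
    """P(failure) = sum over input classes of P(class) * P(failure | class)."""
    total = sum(profile.values())
    if not math.isclose(total, 1.0):
        raise ValueError("input profile must sum to 1")
    return sum(p * fail_given_class[c] for c, p in profile.items())


# Assumed per-demand failure probability of the *same* software for each
# class of input (illustrative values only).
fail_given_class = {
    "nominal": 0.0,
    "one_sensor_bad": 1e-4,
    "two_sensors_bad": 1e-2,
}

# Assumed input profile when the sensors are new.
new_profile = {"nominal": 0.999, "one_sensor_bad": 0.001, "two_sensors_bad": 0.0}
# Assumed input profile once the sensors have aged.
aged_profile = {"nominal": 0.90, "one_sensor_bad": 0.08, "two_sensors_bad": 0.02}

p_new = failure_probability(new_profile, fail_given_class)
p_aged = failure_probability(aged_profile, fail_given_class)
```

Under these made-up figures the per-demand failure probability rises from about 1e-7 to about 2e-4 without a single instruction changing, so a rate measured under the first profile says little about the second.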

>> there is no such thing as software reliability so don't use maths (or rather statistics and claim they are maths) inappropriately.

NB2: I'm not claiming that failure due to physical degradation is significant compared to failure due to errors in the original instructions. I'm saying that we don't know, and that not knowing becomes a big issue once we've tested to the point of not finding errors in the original instructions. At that point, absent evidence to the contrary, we should be assuming that physical degradation is significant.

>> No one (I hope) denies that hardware effects may influence software calculations. That still doesn't mean that the maths, er, statistics are the right tool for the job.


On 4 March 2015 at 12:27, Michael J. Pont <M.Pont_at_xxxxxx> wrote:

Drew,

“The underlying point holds, that software _can_ exhibit degraded performance over time.”

Can you please give me a simple example of what you mean by this.



The System Safety Mailing List

Nick Tudor
Tudor Associates Ltd
Mobile: +44(0)7412 074654

77 Barnards Green Road
WR14 3LR
Company No. 07642673
VAT No: 116495996

"This e-mail and any attached documents may contain confidential or proprietary information and may be subject to export control laws and regulations. If you are not the intended recipient, you are notified that any dissemination, copying of this e-mail and any attachments thereto or use of their contents by any means whatsoever is strictly prohibited. Unauthorized export or re-export is prohibited. If you have received this e-mail in error, please advise the sender immediately and delete this e-mail and all attached documents from your computer system."

The System Safety Mailing List
systemsafety_at_xxxxxx
Received on Thu Mar 05 2015 - 10:46:10 CET