Re: [SystemSafety] Degraded software performance [diverged from Fault, Failure and Reliability Again]

From: Michael J. Pont < >
Date: Wed, 4 Mar 2015 14:22:19 -0000


Thanks for taking the time to compile this list.  

I’d view Examples 1 to 3 as hardware-related faults.  

(If we don’t do so, then we would – I think – end up having to class a broken wire that connects a switch to our processor as a “software failure”).  

In the case of Example 4, I agree that it could be argued that the software behaviour may change over time. However, I’d view this as the consequence of a design or coding error rather than anything else – rather along the lines of Matthew’s example from earlier.  

At the end of the day, it may simply come down to definitions (and, of course, you are free to define these things as you see fit).  


From: DREW Rae [mailto:d.rae_at_xxxxxx
Sent: 04 March 2015 13:25
To: M.Pont_at_xxxxxx
Cc: <systemsafety_at_xxxxxx
Subject: Degraded software performance [diverged from Fault, Failure and Reliability Again]  


I need to give more than one example, because the point is general, rather than specific to the individual causes. In each case the cumulative probability of software failure increases over time.

1) Damage to the instruction set

e.g. the physical record of the instructions on a storage medium changes

very specific e.g. bit flip on a magnetic storage device holding the executable files  

2) Increased unreliability of the physical execution environment

e.g. an increased rate of processor errors

very specific e.g. dust accumulates on part of the processor card, making it run hot and produce calculation errors

3) Increased unreliability of input hardware

e.g. software is required to detect and respond correctly to an increased rate and variety of sensor failure combinations

Note: This is the one that challenges "but we're running the software in exactly the same hardware environment". Hardware environments change as they get older.

4) Software accumulates information during runtime

e.g. a count of elapsed time

e.g. increasing volume of stored data

e.g. memory leak  

NB1: In all of these cases I've heard arguments "that's not the software, that's X". Those arguments are only relevant if you can control for X when collecting data for software reliability calculation. Software without an execution environment is a design. It "never fails" in the way that _no_ design fails. Once it is executed, it is subject to the same degradation over time as any physical implementation.  

NB2: I'm not claiming that failure due to physical degradation is significant compared to failure due to errors in the original instructions. I'm saying that we don't know, and that not knowing becomes a big issue once we've tested to the point of not finding errors in the original instructions. At that point, absent evidence to the contrary, we should be assuming that physical degradation is significant.


On 4 March 2015 at 12:27, Michael J. Pont <M.Pont_at_xxxxxx


“The underlying point holds, that software _can_ exhibit degraded performance over time.”

Can you please give me a simple example of what you mean by this.



The System Safety Mailing List

systemsafety_at_xxxxxx
Received on Wed Mar 04 2015 - 15:22:33 CET

This archive was generated by hypermail 2.3.0 : Tue Jun 04 2019 - 21:17:07 CEST