Re: [SystemSafety] Does "reliable" mean "safe" and/or "secure" or neither?

From: Roberto Bagnara < >
Date: Mon, 25 Apr 2016 09:46:23 +0200


On 24/04/2016 20:11, Roberto Bagnara wrote:
> On 24/04/2016 18:36, Michael J. Pont wrote:

>> Roberto asks:
>>
>> "Can we talk about the reliability of the components in the context
>> of the overall system, without any knowledge about how they implement
>> their functionality (e.g., hardware only, hardware + little bit of
>> software, hardware + lots of software, hardware + software + humans)?"
>>
>> If our definition of reliability is something like this (from my previous
>> email):
>>
>> "the extent to which an experiment, test, or measuring procedure yields the
>> same results on repeated trials"

>
> OK.

Here is the complete argument I was trying to make (slightly revised). I would be grateful to anyone who can point out the flaws in it.

Trying to match Michael's definition of reliability, in my example a "trial" would consist in exercising the overall system under in-spec conditions. By "in-spec conditions" I mean all conditions for which the specification of the overall system has something to say: these include normal conditions and some abnormal conditions (in which the overall system is still expected to behave gracefully, so to speak). Whenever I use the terms "in-spec" and "out-of-spec" below, the same considerations apply.

Suppose the overall system is composed of a number of interacting components. Suppose also that these components are black boxes: we cannot look inside them (for the time being). However, we know everything about the interactions between the components, because we can monitor them with precision. Suppose we also have specifications of each component that are detailed enough so that, in case the system exhibits an out-of-spec behavior, we are able to point the finger at small sets of components and tell which component(s) originated the first out-of-spec behavior, which component(s) meant to mitigate this misbehavior failed to do so, and so on.

The "experiment" would consist in recording the in-spec and out-of-spec behaviors of the various system components. We would say that two outcomes are "the same result" if they are either both in-spec or both out-of-spec. We perform many "repeated trials" and we thus determine the "reliability" of each component in the context of the overall system.

At this point, I would say we are entitled to talk about the reliability of each component in the context of the overall system.

Now, suppose we find that a particular component C, currently implemented by black box A, is not reliable enough, i.e., during the repeated trials it has exhibited too many out-of-spec behaviors. We thus replace black box A with black box B, a different implementation of the same component C, and now the results are much better.

At this point, I would say we are entitled to say that B is more reliable than A in the context in which they operate, even if we have no idea what is in the boxes.

So we open black boxes A and B: inside we find both hardware and software. I take it for granted that this discovery does not invalidate our previous conclusion that B is more reliable than A in the context in which they operate.

We examine the hardware in boxes A and B and find that A and B are based on exactly the same hardware: there is no difference whatsoever.

At this point, I would say we are entitled to say that B's software is more reliable than A's software in the context in which both operate (which includes the hardware on which they both run).

In order to conclude, I think it is enough to show that what I have described can happen in practice. The first thing that comes to mind concerns hardware bit flips due to radiation: they do happen all the time and, given that A and B have the same hardware, the hardware of A and B is affected in the same way. However, the software of B, unlike the software of A, makes extensive use of variable mirroring: it keeps two or more copies of the same variable, using different data representations for the different copies (e.g., one copy holds the bitwise negation of another copy). In this way B's software can detect, and sometimes correct, the effects of bit flips.
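
For what it is worth, here is a minimal sketch in C of the kind of mirroring I have in mind (the names and details are mine, purely for illustration): each critical variable is stored together with its bitwise complement, so that a single bit flip in either copy is detected on every read. Note that with only two copies the corruption can be detected but not corrected; correction would require, e.g., a third copy or a majority vote.

  #include <stdint.h>
  #include <stdbool.h>

  /* A mirrored variable: the plain value plus its bitwise negation. */
  typedef struct {
      uint32_t value;
      uint32_t inverted;
  } mirrored_u32;

  static void mirrored_write(mirrored_u32 *m, uint32_t v)
  {
      m->value    = v;
      m->inverted = ~v;
  }

  /* Returns true and stores the value in *out if the two copies still
     agree; returns false if a bit flip has corrupted one of them. */
  static bool mirrored_read(const mirrored_u32 *m, uint32_t *out)
  {
      if (m->value == (uint32_t)~m->inverted) {
          *out = m->value;
          return true;
      }
      return false;  /* detected corruption: the caller must recover */
  }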

Kind regards,

    Roberto

-- 
     Prof. Roberto Bagnara

Applied Formal Methods Laboratory - University of Parma, Italy
mailto:bagnara_at_xxxxxx
                              BUGSENG srl - http://bugseng.com
                              mailto:roberto.bagnara_at_xxxxxx
_______________________________________________
The System Safety Mailing List
systemsafety_at_xxxxxx
Received on Mon Apr 25 2016 - 09:57:22 CEST
