There are a number of problems with trying to construct such a taxonomy. One is the trade-off between making the fault classes as broad as possible (to make sure we have covered as many faults as possible) and making the fault class definitions concise, with useful properties (e.g., being able to map appropriate fault avoidance or fault tolerance techniques to a particular fault class). Another problem is trying to simplify the high dimensionality of this space. When reducing the dimensionality by aggregating fault classes into supersets, different hierarchies can result. For example, should faults first be divided into Value faults and Timing faults, with each having a Byzantine subset? Or should Byzantine be the superior set, with Value and Timing subsets? Whichever way this is done, there always seems to be some lack of orthogonality.
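The two alternative nestings can be sketched as class hierarchies; all the names below are illustrative, not from any published taxonomy. The sketch shows why neither ordering is fully orthogonal: single inheritance forces a fault that is both value- and timing-related into one branch or the other.

```python
# Hypothetical sketch of the two candidate hierarchies.

# Hierarchy A: split by symptom first; Byzantine appears as a leaf
# under each symptom class.
class Fault: ...
class ValueFault(Fault): ...
class TimingFault(Fault): ...
class ByzantineValueFault(ValueFault): ...
class ByzantineTimingFault(TimingFault): ...

# Hierarchy B: Byzantine as the superior set, with Value and Timing
# as subsets of it.
class FaultB: ...
class ByzantineFaultB(FaultB): ...
class ValueFaultB(ByzantineFaultB): ...
class TimingFaultB(ByzantineFaultB): ...

# In Hierarchy A, a Byzantine fault that corrupts both value and timing
# has no single natural home; in Hierarchy B, non-Byzantine value faults
# have none. Either way, some lack of orthogonality remains.
```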
There is no consensus on how a fault taxonomy should be constructed. When a group of people is assembled for some purpose and the individuals disagree on the taxonomy, some compromise taxonomy usually is created (often one specific to the task at hand). There is also a lack of consensus on much of the terminology. For example, I disagree with the use of "arbitrary" as a synonym for, or a description of, "Byzantine" (need to edit the Wikipedia "Byzantine fault tolerance" page someday). I don't think "arbitrary" should be used for a fault set that doesn't include power source overvoltage, shrapnel from exploding capacitors, common mode failures due to compiler/linker or synthesizer bugs, ...
Even the basic definitions of fault, failure, and error are not completely agreed upon. I think the definitions created by IFIP WG10.4 are the best published and should be the ones generally used. However, I think the term "error" should apply only to a difference in state for those elements of a device that are intended to hold state. I vehemently disagree with those (including other members of WG10.4) who use "error" for a difference in any state of the device, including a structural state. That is, I would not classify a broken wire as an "error".
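The fault/error/failure chain in the WG10.4 style can be sketched in code. Everything here (the `Register` class, the stuck-at fault injector) is my own illustrative construction, not from the WG10.4 documents: the fault is the cause, the error is the resulting deviation in state held by an element intended to hold state, and the failure is the deviation of delivered service from specification.

```python
class Register:
    """An element intended to hold state; per the distinction above,
    only deviations here count as errors."""
    def __init__(self):
        self.value = 0

def inject_stuck_at_one(reg, bit):
    """A fault: forces one bit of the register high.
    Its activation creates an error in the stored state."""
    reg.value |= (1 << bit)

def service(reg):
    """The delivered service: reading out the stored value."""
    return reg.value

reg = Register()
reg.value = 4                  # intended state (binary 100)
inject_stuck_at_one(reg, 0)    # fault activates -> error: state is now 5
specified = 4
failed = service(reg) != specified  # failure: service deviates from spec
```

On this reading, a broken wire would be a fault (a structural defect that may cause errors), never itself an "error".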
From: systemsafety-bounces_at_xxxxxx
Sent: Tuesday, August 27, 2013 14:06
To: Peter Bernard Ladkin
Cc: systemsafety_at_xxxxxx
Subject: Re: [SystemSafety] Critical Design Checklist
"It never seems to be exactly what we want."
Is "it" (the revised taxonomy) just re-inventing the wheel, or is there something else going on?
On 27 Aug 2013, at 18:07, Robert Schaefer at 300 <schaefer_robert_at_xxxxxx
Would a complete taxonomy even be possible? As the possibility of fault-contexts-in-the-world appears to be infinite or near infinite, wouldn't the number of fault types be near infinite as well?
Since "fault type" is a human classification, it is guaranteed not to be anywhere near infinite, but indeed quite finite. Perrow has a classification he called "DEPOSE". That has just six categories, one for each letter.
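A classification like DEPOSE is small enough to write down exhaustively, which is the point about finiteness. A sketch, with the usual expansions of Perrow's six letters (the expansions are the commonly cited ones; treat them as my assumption rather than a quotation):

```python
from enum import Enum, auto

class Depose(Enum):
    """Perrow's DEPOSE classification: one category per letter."""
    DESIGN = auto()
    EQUIPMENT = auto()
    PROCEDURES = auto()
    OPERATORS = auto()
    SUPPLIES_AND_MATERIALS = auto()
    ENVIRONMENT = auto()
```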
Whether it does what one wants it to do is another question, as Kevin points out.
I would also propose that fault is a human classification (since you talk about a fault in language, no matter how precise, your words may have another instance which fulfils them, and it is the words/concepts which define what you are talking about), whereas failure has at least a time/space stamp. Ideally. Unfortunately, in the current state of the (lack of) art, I think failure might often be lacking objectivity too, if a specification exists and is ambiguous.
PBL Prof. Peter Bernard Ladkin, University of Bielefeld and Causalis Limited
> such a list should possess orthogonality, decidability, atomicity, criticality and a rationale.
Addressing orthogonality (and completeness), the list should have a proper taxonomy. But, that's hard to do.
Internally, we keep revisiting the creation of a taxonomy for fault types, even though much has been published on the subject. It never seems to be exactly what we want.
Sent: Tuesday, August 27, 2013 04:12
To: martyn_at_xxxxxx
Cc: systemsafety_at_xxxxxx
Subject: Re: [SystemSafety] Critical Design Checklist
Not so much a list but a comment that the items in such a list should possess orthogonality, decidability, atomicity, criticality and a rationale.
The criticality should address Martyn's 'and what then' comment.
On Tuesday, 27 August 2013, Martyn Thomas wrote:
On 26/08/2013 21:37, Driscoll, Kevin R wrote:
For NASA, we are creating a Critical Design Checklist:
* Objective - A checklist for designers to help them determine if a safety-critical design has met its safety requirements
Kevin
For this purpose, I interpret your phrase "safety requirements" for a "safety-critical design" as meaning that any system that can be shown to implement the design correctly will meet the safety requirements for such a system in some required operating conditions.
Here's my initial checklist:
Of course, there is a lot of detail concealed within these top-level questions. For example, the specification of operating conditions is likely to contain detail of required training for operators, which will also need to be shown to be adequate.
But there's probably no need to go into more detail as you will probably get at least one answer "no" to the top six questions.
What will you do then?
Regards
Martyn
-- Sent from Gmail Mobile
_______________________________________________
The System Safety Mailing List
systemsafety_at_xxxxxx
Received on Wed Aug 28 2013 - 19:44:54 CEST
This archive was generated by hypermail 2.3.0 : Sat Feb 16 2019 - 02:17:05 CET