Re: [SystemSafety] Critical Design Checklist

From: Matthew Squair
Date: Thu, 29 Aug 2013 11:36:11 +1000


Hi Kevin,

I've run into similar definitional issues with what constitutes a functional failure in FHAs and the taxonomy (usually fairly simple) of failure that is used. My conclusion was that a lot (perhaps a majority?) of FHAs suffer from incompleteness because their definition of what a function is, and therefore its specification, is incomplete. Which may, or may not, matter given the level of abstraction you're working at. I wrote a post here, http://wp.me/px0Kp-16s, if you're interested.

On Thu, Aug 29, 2013 at 3:44 AM, Driscoll, Kevin R <kevin.driscoll_at_xxxxxx> wrote:

> > Is "it" (the revised taxonomy) just re-inventing the wheel, or is
> > there something else going on?
>
> As one who abhors re-inventing the wheel (particularly when the result may
> have some corners on it), we don’t do this unless we need to.
>
> There are a number of problems with trying to make such a taxonomy.
>
> One is the trade between making the fault classes as broad as possible (to
> make sure we have covered as many faults as possible) versus making the
> fault class definitions concise and having useful properties (e.g., being
> able to map appropriate fault avoidance or fault tolerance techniques to a
> particular fault class).
>
> Another problem is trying to simplify the high dimensionality of this
> space. When reducing the dimensionality by aggregating fault classes into
> supersets, different hierarchies can result. For example, should faults
> first be divided into Value faults and Timing faults with each having a
> subset that is Byzantine? Or, should Byzantine be the superior set with
> Value and Timing subsets? Whichever way this is done, it seems there is
> always some lack of orthogonality.
>
> There is no consensus on how a fault taxonomy should be constructed. When
> a group of people is assembled for some purpose, in which individuals
> disagree on the taxonomy, some compromise taxonomy usually is created
> (often specific to the task at hand). There is also a lack of consensus on
> a lot of the terminology. For example, I disagree with the use of
> "arbitrary" as a synonym for or a description of “Byzantine” (need to edit
> the Wikipedia "Byzantine fault tolerance" page someday). I don't think
> "arbitrary" should be used for a fault set that doesn't include power
> source overvoltage, shrapnel from exploding capacitors, common mode
> failures due to compiler/linker or synthesizer bugs, ...
>
> Even the basic definitions of fault, failure, and error are not completely
> agreed. I think the definitions created by IFIP WG10.4 are the best
> published and should be the ones generally used. However, I think the term
> "error" should apply only to the difference in state for those elements of
> a device that are intended to hold state. I vehemently disagree with those
> (including other members of WG10.4) who use "error" as the difference in
> any state of the device, including a structural state. That is, I would
> not classify a broken wire as an "error".
>
> *From:* systemsafety-bounces_at_xxxxxx *On Behalf Of* Robert Schaefer at 300
> *Sent:* Tuesday, August 27, 2013 14:06
> *To:* Peter Bernard Ladkin
> *Cc:* systemsafety_at_xxxxxx
> *Subject:* Re: [SystemSafety] Critical Design Checklist
>
> "It never seems to be exactly what we want."
>
> Is "it" (the revised taxonomy) just re-inventing the wheel, or is there
> something else going on?
>
> ------------------------------
>
> *From:* Peter Bernard Ladkin <ladkin_at_xxxxxx>
> *Sent:* Tuesday, August 27, 2013 1:01 PM
> *To:* Robert Schaefer at 300
> *Cc:* Driscoll, Kevin R; systemsafety_at_xxxxxx
> *Subject:* Re: [SystemSafety] Critical Design Checklist
>
> On 27 Aug 2013, at 18:07, Robert Schaefer at 300 <schaefer_robert_at_xxxxxx> wrote:
>
> Would a complete taxonomy even be possible?
>
> As the possibility of fault-contexts-in-the-world appears to be infinite
> or near infinite, wouldn't the number of fault types be near infinite as
> well?
>
> Since "fault type" is a human classification, it is guaranteed not to be
> anywhere near infinite, but indeed quite finite. Perrow has a
> classification he called "DEPOSE". That has just six categories, one for
> each letter.
>
> Whether it does what one wants it to do is another question, as Kevin
> points out.
>
> I would also propose that fault is also a human classification (since you
> talk about a fault in language, no matter how precise, your words may have
> another instance which fulfils them, and it is the words/concepts which
> define what you are talking about) whereas failure has at least a
> time/space stamp. Ideally. Unfortunately, in the current state of the (lack
> of) art, I think failure might often be lacking objectivity too, if a
> specification exists and is ambiguous.
>
> PBL
>
> Prof. Peter Bernard Ladkin, University of Bielefeld and Causalis Limited
>
> ------------------------------
>
> *From:* systemsafety-bounces_at_xxxxxx *On Behalf Of* Driscoll, Kevin R <kevin.driscoll_at_xxxxxx>
> *Sent:* Tuesday, August 27, 2013 11:28 AM
> *To:* Matthew Squair
> *Cc:* systemsafety_at_xxxxxx
> *Subject:* Re: [SystemSafety] Critical Design Checklist
>
> > such a list should possess orthogonality, decidability, atomicity,
> > criticality and a rationale.
>
> Addressing orthogonality (and completeness), the list should have a proper
> taxonomy. But, that’s hard to do.
>
> Internally, we keep revisiting the creation of a taxonomy for fault types,
> even though much has been published on the subject. It never seems to be
> exactly what we want.
>
> *From:* systemsafety-bounces_at_xxxxxx *On Behalf Of* Matthew Squair
> *Sent:* Tuesday, August 27, 2013 04:12
> *To:* martyn_at_xxxxxx
> *Cc:* systemsafety_at_xxxxxx
> *Subject:* Re: [SystemSafety] Critical Design Checklist
>
> Not so much a list but a comment that the items in such a list should
> possess orthogonality, decidability, atomicity, criticality and a
> rationale.
>
> The criticality should address Martyn's 'and what then' comment.
>
> On Tuesday, 27 August 2013, Martyn Thomas wrote:
>
> On 26/08/2013 21:37, Driscoll, Kevin R wrote:
>
> For NASA, we are creating a Critical Design Checklist:
>
> • *Objective*
>
> - *A checklist for designers to help them determine if a
> safety-critical design has met its safety requirements*
>
> Kevin
>
> For this purpose, I interpret your phrase "safety requirements" for a
> "safety-critical design" as meaning that any system that can be shown to
> implement the design correctly will meet the safety requirements for such a
> system in some required operating conditions.
>
> Here's my initial checklist:
>
> 1. Have you stated the "safety requirements" unambiguously and completely?
> How do you know? Can you be certain? If not, what is your confidence level
> and how was it derived?
> 2. Have you specified unambiguously and completely the range of operating
> conditions under which the safety requirements must be met? How do you
> know? Can you be certain? If not, what is your confidence level and how was
> it derived?
> 3. Do you have scientifically sound evidence that the safety-critical
> design meets the safety requirements?
> 4. Has this evidence been examined by an independent expert and certified
> to be scientifically sound for this purpose?
> 5. Can you name both the individual who will be personally accountable
> if the design later proves not to meet its safety requirements and the
> organisation that will be liable for any damages?
> 6. Has the individual signed to accept accountability? Has a Director of
> the organisation signed to accept liability?
>
> Of course, there is a lot of detail concealed within these top-level
> questions. For example, the specification of operating conditions is likely
> to contain detail of required training for operators, which will also need
> to be shown to be adequate.
>
> But there's probably no need to go into more detail as you will probably
> get at least one answer "no" to the top six questions.
>
> What will you do then?
>
> Regards
>
> Martyn
>
>
>
> --
> Sent from Gmail Mobile
>
> _______________________________________________
> The System Safety Mailing List
> systemsafety_at_xxxxxx
>

-- 
*Matthew Squair*
Mob: +61 488770655
Email: MattSquair_at_xxxxxx



_______________________________________________ The System Safety Mailing List systemsafety_at_xxxxxx
Received on Thu Aug 29 2013 - 03:36:25 CEST
