Re: [SystemSafety] Claims for formal methods

From: Derek M Jones < >
Date: Thu, 20 Feb 2014 13:45:45 +0000


Hello Monsieur MENTRE,

> Hello Mr. Jones,
>
>> The French group in the above links did some interesting work on
>> compiler code generation and I believe they now refer to it as a
>> verified compiler backend (which is what it does) rather than a
>> verified C compiler.
>
> They are doing more than just a C backend and they are progressing
> slowly but surely towards parsing the C grammar. For example, you can

The issue was about the gulf between the claims made and what had actually been achieved.

A soap powder manufacturer cannot make excessive advertising claims about its current products just because it has researchers in a lab trying to cook up a better powder.

> http://gallium.inria.fr/blog/verifying-a-parser-for-a-c-compiler/
> http://gallium.inria.fr/blog/verifying-a-parser-for-a-c-compiler-2/

As always with this kind of work the devil is in the detail.

They currently have what they consider to be a reliable proof of the correctness of one of the tools they used.

As they freely admit there are issues with their current approach: "... our solution worked, but was not fully satisfactory, because the "lexer" was so complicated that it included a full non-certified parser, which could, if buggy, in some corner cases, introduce bugs in the whole parser."

I think that holes like this will always exist in formal proofs, and perhaps a better approach would be to try to figure out how many nines might be claimed for the likelihood of the 'proof' behaving correctly.

>> 2. The CompCert system may successfully translate programs
>> containing undefined behavior. Any proof statements made about such
>> programs may not be valid.
>
> Yes. But this is not the issue with the Verified C compiler but with the
> input C program itself. You can use other tools (like Frama-C, Astrée or
> Polyspace) to check your C program has no undefined behaviour.

So you are agreeing that any proof statement made by the 'Verified' C compiler may be wrong.

Your proposed solution is to first run the code through unverified tools to weed out those cases that the 'Verified' compiler gets wrong?

I have spoken to some people who were at a recent talk by the French group, who tell me that they now (correctly) refer to what they have done as a verified backend and not a verified C compiler.

>> 3. The support tools are assumed to be correct; primarily the Coq
>> proof assistant, which is written in OCaml.
>
> Yes. But Coq is designed in such a way (a very small, manually verified
> trusted core and then libraries built upon that core) that the
> possibility to have an issue in Coq is very very low (but not zero, I
> agree).

Essentially this is one of the axioms of the proof.

This is the kind of information soap powder manufacturers put in the small print.

>> 4. The CompCert system makes decisions about implementation
>> dependent behaviors and any proofs only apply in the context of these
>> decisions.
>
> Yes. As any compiler. At least those decisions are formally documented.

More small print.

>> 5. The CompCert system makes decisions about unspecified behaviors
>> and any proofs only apply in the context of these decisions.
>
> Yes. As any compiler. At least those decisions are formally documented.

I did not manage to find this document. But more small print.

> As pointed out on the comment of your blog, an independent research
> paper (http://www.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf) has
> shown that CompCert is a very strong compiler, much stronger than any
> others. Quoting Regehr's paper:

CompCert put in a good performance here. However, Regehr's research targets code generation issues, the part of the compiler on which the verification work has been done.

There is also the issue of the quantity of optimization performed. CompCert does some optimizations, but not nearly as many as gcc and llvm; you would expect to find fewer faults simply because there is less code in CompCert to go wrong. Were no faults found in CompCert because insufficient tests were run? Calculating a scaling factor to adjust for the quantity of optimization performed is a large, open research problem.

-- 
Derek M. Jones                  tel: +44 (0) 1252 520 667
Knowledge Software Ltd          blog:shape-of-code.coding-guidelines.com
Software analysis               http://www.knosof.co.uk
_______________________________________________
The System Safety Mailing List
systemsafety_at_xxxxxx
Received on Thu Feb 20 2014 - 14:48:38 CET

This archive was generated by hypermail 2.3.0 : Tue Apr 23 2019 - 05:17:06 CEST