Re: [SystemSafety] Claims for formal methods

From: David MENTRE < >
Date: Thu, 20 Feb 2014 16:58:57 +0100

Dear Mr. Jones,

Le 20/02/2014 14:45, Derek M Jones a écrit :
> The issue was about the gulf between the claims made and what had
> actually been achieved.
> A soap powder manufacturer cannot make excessive advertising claims
> about its current products just because it has researchers in a lab
> trying to cook up a better powder.

Talking about a "gulf" between the claims made and what has been achieved is excessive, in my view. :-)

Formal methods are not a silver bullet, and they should be applied carefully. But when applied correctly, they do bring significant assurance for the parts they cover (which is not everything, we agree).

> As always with this kind of work the devil is in the detail.
> They currently have what they consider to be a reliable proof of the
> correctness of one of the tools they used.
> As they freely admit there are issues with their current approach:
> "... our solution worked, but was not fully satisfactory, because the
> "lexer" was so complicated that it included a full non-certified parser,
> which could, if buggy, in some corner cases, introduce bugs in the whole
> parser."

That's why the second blog post explains how they can avoid such issues.

The CompCert authors are aware of such potential issues (and warn about them): """
This part of CompCert (transformation of C source text to CompCert C abstract syntax trees) is not formally verified. However, CompCert C is a subset of C, and the compiler can output the generated CompCert C code in C concrete syntax (flag -dc), therefore the result of this transformation can be manually inspected. Moreover, most static analysis and program verification tools for C operate on a simplified C language similar to CompCert C. By conducting the analysis or the program verification directly on the CompCert C form, bugs potentially introduced by this first part of the compiler can be detected. """

> I think that holes like this will always exist in formal proofs and
> perhaps a better approach would be to try and figure out how many nine's
> might be claimed for the likelihood of the 'proof' behaving correctly.

I think most real-world practitioners of formal methods understand perfectly well the limits of the approaches they use, and complement them with the necessary reviews, tests, etc.

>>> 2. The CompCert system may successfully translate programs
>>> containing undefined behavior. Any proof statements made about such
>>> programs may not be valid.
>> Yes. But this is not the issue with the Verified C compiler but with the
>> input C program itself. You can use other tools (like Frama-C, Astrée or
>> Polyspace) to check your C program has no undefined behaviour.
> So you are agreeing that any proof statement made by the 'Verified' C
> compiler may be wrong.

No, not at all.

> Your proposed solution is to first run the code through unverified tools
> to weed out those cases that the 'Verified' compiler gets wrong?

What I wanted to say is that there is a separation of concerns between the program and the compiler: tools like Frama-C, Astrée or Polyspace address properties of the program (for example, absence of undefined behaviour), while CompCert addresses the compiler (preservation of the program's semantics down to generated code).

Combined, you get "perfect" software, e.g. a program proved free of run-time errors, compiled correctly into running code.

Of course, the definition of "perfect" depends on your needs. For example, absence of run-time errors might be a minimal requirement, but it does not ensure that the software will never put the system into an unsafe state (other formal tools can help you prove that; CompCert cannot).

If for you "Verified Compiler" means "a compiler that stops on bugs in my program", then CompCert does not fulfil that definition.

> CompCert's put in a good performance here. However, Regehr's research
> targets code generations issues, the part of the compiler on which the
> verification work has been done.

As I said, the verification work does not cover all of the compiler (the CompCert authors never claimed it did, from the very beginning). But it does cover a significant part, and that part has been growing over the years.

> There is the issue of quantity of optimization performed.
> CompCert does some optimizations, but not nearly as much as gcc and
> llvm;

For tight loops, gcc and llvm are much better, but for regular code the CompCert authors claim performance about 20% slower than gcc -O3.

> you would expect to find fewer faults because there is less
> code in CompCert to go wrong. Were no faults found in CompCert because
> insufficient tests were run?

No, the guarantees from CompCert come not from testing but from proof, by construction. This is very different from the design of gcc or llvm, and it is why the result is valid *for any kind of input*. To take a simple example, the result of the unverified register allocation algorithm is formally checked by a verified checker; therefore there cannot be errors in register allocation. By construction.

Of course, it is your right to be sceptical about such claims (that is a healthy attitude). That's why the CompCert authors are putting effort into gathering all the fine print behind the claimed proved properties (assumed axioms, how the compiler and its proof are structured, assumptions placed on Coq or OCaml, etc.) and presenting it to certification authorities (in a DO-178B/C context, if I am correct), the objective being to convince them that the claimed properties are indeed ensured.

Sincerely yours,
D. Mentré

The System Safety Mailing List
systemsafety_at_xxxxxx Received on Thu Feb 20 2014 - 16:59:11 CET
