Re: [SystemSafety] Claims for formal methods

From: Peter Bernard Ladkin < >
Date: Thu, 20 Feb 2014 12:11:32 +0100


I think you are underselling somewhat.

On 2014-02-20 10:16 , David MENTRE wrote:

>> 3.    The support tools are assumed to be correct; primarily the Coq
>> proof assistant, which is written in OCaml.

> Yes. But Coq is designed in such a way (a very small, manually verified trusted core and then
> libraries built upon that core) that the possibility to have an issue in Coq is very very low (but
> not zero, I agree).
> Regarding OCaml, yes some bugs in the OCaml compiler or run-time could theoretically impact Coq. But
> having such bugs is very remote.

It is important here not just to talk about "correct" and "bugs" indiscriminately but to explain its and their nature. Of course, some understanding of logic is needed to make necessary distinctions, and as we have been discussing this may not always be present.

A general-purpose theorem prover such as Coq has been well exercised. That is, its functionality as a usable computer program has been well exercised. All of that functionality as a usable program is peripheral to its functional use as a prover. If you start the thing up and it barfs, you restart it; if it barfs continually, you go away and have a cup of coffee, admit defeat and do something else. If it runs and runs and runs and runs, you do similarly. If it gives you nonsense output, you do similarly.

A prover is asked to say whether A follows logically from B. It does so by finding a proof, automatically, quasi-automatically, or with considerable human help. When it says "yes", it has found what its internal rules (not just the explicitly programmed ones but the emergent ones) say is a proof.

That proof can be submitted for inspection, either by people or by a separate piece of SW that checks it.

The important correctness property of a prover and its associated activity is that (a) when it says "yes", (b) it produces an object, and (c) that object can be subsequently checked to see that it is indeed a proof. That is the property of soundness. This property is not wisely applied to the prover alone, but to the entire three-step a-b-c process. The first two steps are transparent. The third step involves soundness of the checker, which if it is SW may or may not be part of the prover package.

If the prover is a general-purpose prover, then the third step is straightforward and also transparent. But many provers nowadays rely on so-called decision procedures, sort-of oracles which take a specific mathematical structure and perform operations peculiar to that structure to come up with a yes/no decision on whether a property holds of that structure. They vary considerably in how hard it may be to verify whether a "yes" answer is correct (for a "no" answer, mostly they produce a counterexample, so again this may be checked by hand or by machine and is usually not problematic).

For example, answers from Pressburger arithmetic are easy to check by hand.

However, the extensive combinatorics involved in the decision procedures for such proofs as that of the four-color theorem or Hales's theorem are obscure. The decision procedure has searched through practically-innumerable examples, using combinatorial-decomposition techniques, and has told you what its findings are about all those examples. It just has to be wrong about one, somehow, and its result is out the window. And when it's checking more than what any mathematician or group of mathematicians could check in a lifetime, then there is a real question about how such a proof can ever be accepted (human mathematicians are also not infallible). It is, though, questionable how relevant such a phenomenon is to algorithm or program verification. I have never seen such a questionable example arise in any formal algorithm/program verification I have ever worked with or around.

Also, it is not easy to build a subtly-unsound prover. It is of course very easy to build an unsound prover and when this prover is exercised its unsoundness is usually discovered moderately quickly. But once you have something which most of the time carefully follows the inference mechanisms you have built in and checked, it is not going to condone random inferences, and if it does do odd things sometimes then you can discover that by inspection and hedge it.

The point is that routine verification is open to secondary checking. The key to understanding that things amiss with a prover are benign is to understand what "correctness" means in this case and that such correctness is more or less transparent. That's what makes such SW different from Excel.

PBL Prof. Peter Bernard Ladkin, Faculty of Technology, University of Bielefeld, 33594 Bielefeld, Germany Tel+msg +49 (0)521 880 7319

The System Safety Mailing List
systemsafety_at_xxxxxx Received on Thu Feb 20 2014 - 12:11:41 CET

This archive was generated by hypermail 2.3.0 : Tue Jun 04 2019 - 21:17:06 CEST