Re: [SystemSafety] systemsafety Digest, Vol 34, Issue 16

From: Roderick Chapman < >
Date: Fri, 08 May 2015 15:08:37 +0100

On 08/05/2015 11:00, systemsafety-request_at_xxxxxx wrote:
> On my calculations, your results imply there were 1896 conditions that
> had to be assessed / proved manually.
>
> We have done this type of manual review of tool output - though not for
> SPARK - and it takes quite a while to review, assess and document (i.e.
> measured in man-days).
>
> How long did this assessment/proof stage take in your case?
The actual numbers (in the ITP conference paper) are:

Total VCs: 152927

Proved automatically (no user-supplied lemmas): 151026

Proved automatically (with user-supplied lemmas): 1701

Proved by "review" (i.e. human): 200

It's hard to break down the effort required. A user-defined lemma (essential an inference rule which is added to the theorem prover) is developed, reviewed for soundness (or proved sound using a general-purpose interactive prover like our own Checker, Coq, or Isabelle), added to the prover, but then might get used to prove several VCs, so the effort to develop a "good" lemma might pay-off in automating the proof of hundreds of VCs.

Once you've got a good set of additional lemmas, those 1701 VCs are re-proved automatically, so there's no repeated effort.

The remaining 200 were reviewed by people and marked as "OK" - this gets picked up by the tool as "proved by review."

The effort to do this is not recorded - it was done by the developers as part of
their normal development activity, not as a separate activity, so the time to develop the lemmas and reviews was not recorded separately.

The proof lives on - it has been maintained and is reproduced daily for all subsequent changes and builds of the system.


The System Safety Mailing List
systemsafety_at_xxxxxx Received on Fri May 08 2015 - 16:09:08 CEST

This archive was generated by hypermail 2.3.0 : Sun Apr 21 2019 - 23:17:07 CEST