A formal semantics for SPARK in Coq

This page describes the current state of the spark semantics in Coq. This owrk started during the internship of Zhi Zhang from SAntoS lab (Kansas University) in the CPR team of CEDRIC lab during summer 2013.

Anonymous access to the development archive:

git clone --recursive https://forge.open-do.org/anonscm/git/spark2014/spark2014.git

directory: language_formal/spark2014_semantics

Current Status

Zhi developed the semantics of the following subset of the targeted language (itself a subset of spark 2014):

Lexical elements:
- Identifiers (2.3); OK
- for literals: integer literals (2.4); OK
- but also boolean litterals (not in the original proposal but well...) OK
Declarations and types:
- No user type declarations;
- No pre-defined types other than:
  - Boolean (3.5.3); OK
  - Integer (3.5.4 (11)); OK
- Object declarations (3.3.1 (5/2)) currently trying (not aliased, with only one defining identifier in the defining_identifier_list);
Names and expressions:
- Integer literals (4.2); OK
- Operators:
  - non-short-circuit logical operator (4.5.1); OK
  - relational operators (4.5.2); OK
  - adding operator (4.5.3, 4.5.4) OK
  - multiplying operator (4.5.5); OK (mod, rem and exponentiation are not included)
Statements:
- Assignments (5.2); OK
- if statements (5.3) without else / elsif; OK
- while loops (5.5 (8)) with no exit statements; OK
- Loop invariants (SPARK RM 5.5.3); NO
- procedure calls (not in the original proposal but well...) NO
Subprograms:
- Procedure bodies (6.1-6.3) NO that are the initial declaration of the subprogram (and not the completion: leaving "explicit" declarations)
- IN and OUT parameters, no IN OUT (6.2); OK
- Pre and Post aspects (6.1.1. (2/3) and (4/3)); NO
- with clauses targeting the spec unit N/A implied by a subprogram unit;

In short, annotation were not planned for Zhi internship, and procedure bodies and calls was a too big piece of formalization for such a short period of time. We knew that from the beginning.

The whole formalization is clean and is clearly a very good starting point to achieve the formalization of the subset. We defined two semantics:

one semantics defined as an inductive relation supposed to be the reference semantic. This is the one that must be validated by experts. In this semantics we model the runtime checks as on-the-fly verifications that raise an exception on failure. Once an exception is raised, it is propagated to the statement that raised it and the execution is then stuck on the exceptional state (there is no exception handler in the language anyway).
One semantics defined as a function (therefore executable in coq and also extractable to ml or haskell). This semantics is parametrized by the maximum number of steps allowed (as it is not possible to define non terminating functions in coq). If the execution is not finished when the maximum number of steps is reached then the execution stops on an "unfinished" state. This semantics is also parametrized by the set of runtime checks to be performed. Zhi has proved that on any (well formed) program (there exist a max number above which) this semantics returns the same result as the reference semantics (iff there is such a result). Since this semantics may be called with an incomplete set of runtime checks, it has one more kind of return state : so-called Abnormal state (which should be renamed since abnormal has a different meaning in the ada RM).

This work has been presented as an industrial paper at ACM HILT2013 conference.