A Formal Semantics for the C Programming Language

Nikolaos S. Papaspyrou

Doctoral Dissertation, February 1998

National Technical University of Athens
Department of Electrical and Computer Engineering
Division of Computer Science
Software Engineering Laboratory


C together with its descendants represents a strong and indisputable status quo in the current software industry. It is a very popular general-purpose programming language, characterized by its economy of expression, its large set of operators and data types, and its concern for source code portability. The current reference document for C is the international standard ISO/IEC 9899:1990. The semantics of C is informally defined in the standard, using natural language. This causes a number of ambiguities and problems of interpretation about the intended semantics of the language.

In this thesis, a formal denotational semantics for the ANSI C programming language is proposed, with emphasis on its accuracy and completeness with respect to the standard. It is demonstrated that a programming language as useful in practice and as inherently complicated as C can nonetheless be defined formally. The proposed semantics could be used as a precise, unambiguous, abstract and implementation-independent standard for the language. Moreover, it would be a basis for the formal reasoning about C programs and a valuable theoretical tool in the software development process.

In order to improve in modularity and elegance, the proposed semantics uses several monads and monad transformers to model different aspects of computations. Interesting results have been achieved in the attempt to accurately model complex characteristics of C, such as the unspecified order of evaluation and sequence points, using monad notation. These results may be useful in specifying the semantics of programming languages supporting non-deterministic features and parallelism.

An implementation of an abstract interpreter for C programs based on the proposed semantics has also been developed, using Haskell as the implementation language. The implementation has been used to evaluate the accuracy and completeness of the proposed semantics. Although this process is still under way, the results so far have been entirely satisfactory.

Keywords:    ANSI C programming language, ISO/IEC 9899:1990 standard, formal definition, denotational semantics, monads, monad transformers.

At work! Available as:

Happiness is not a state to arrive at, but a manner of travelling.
-- Margaret Lee Runbeck.

Nikos Papaspyrou (nickie@softlab.ntua.gr).
Last updated: 12/03/2002 11:06.