# Third International Workshop on Formal Techniques for Safety-Critical Systems (FTSCS 2014)

Preliminary Proceedings

Editors: Cyrille Artho and Peter Csaba Ölveczky

#### **Preface**

This volume contains the preliminary proceedings of the *Third International Workshop on Formal Techniques for Safety-Critical Systems* (FTSCS 2014), held in Luxembourg on November 6–7, 2014, as a satellite event of the ICFEM conference.

The aim of this workshop is to bring together researchers and engineers who are interested in the application of formal and semi-formal methods to improve the quality of safety-critical computer systems. FTSCS strives to promote research and development of formal methods and tools for industrial applications, and is particularly interested in industrial applications of formal methods. Specific topics include, but are not limited to:

- case studies and experience reports on the use of formal methods for analyzing safety-critical systems, including avionics, automotive, medical, and other kinds of safety-critical and QoS-critical systems;
- methods, techniques and tools to support automated analysis, certification, debugging, etc., of complex safety/QoS-critical systems;
- analysis methods that address the limitations of formal methods in industry (usability, scalability, etc.);
- formal analysis support for modeling languages used in industry, such as AADL, Ptolemy, SysML, SCADE, Modelica, etc.; and
- code generation from validated models.

The workshop received 42 submissions; 40 of these were regular papers and 2 were work-in-progress/position papers. Each submission was reviewed by at least three referees. Based on the reviews and extensive discussions, the program committee selected 16 regular papers and both work-in-progress papers for presentation at the workshop and inclusion in this volume. In addition, our program also includes invited talks by Klaus Havelund and Thomas Noll.

Revised versions of accepted regular papers will appear in the post-proceedings of FTSCS 2014 that will be published as a volume in Springer's *Communications in Computer and Information Science* (CCIS) series. Extended versions of selected papers from the workshop will also appear in a special issue of the *Science of Computer Programming* journal.

Many colleagues and friends have contributed to FTSCS 2014. First, we would like to thank Kokichi Futatsugi and Hitoshi Ohsaki for initiating this series of workshops. We thank Klaus Havelund and Thomas Noll for accepting our invitation to give invited talks, and the authors who submitted their work to FTSCS 2014 and who, through their contributions, make this workshop an interesting event. We are particularly grateful that so many well-known researchers agreed to serve on the program committee, and that they all provided timely, insightful, and detailed reviews.

We also thank the editors of Communications in Computer and Information Science for agreeing to publish the proceedings of FTSCS 2014 as a volume in their series, and Jan A. Bergstra and Bas van Vlijmen for accepting our proposal to devote a special issue of the Science of Computer Programming journal to extended versions of selected papers from FTSCS 2014. Furthermore, Jun Pang has been very helpful with the local arrangements. Finally, we thank Andrei Voronkov for the excellent EasyChair conference system.

We hope that you will all enjoy the workshop!

November, 2014

Cyrille Artho Peter Csaba Ölveczky

### **Program Chairs**

Cyrille Artho AIST

Peter Csaba Ölveczky University of Oslo

### **Program Committee**

Erika Ábrahám RWTH Aachen University

Musab AlTurki King Fahd University of Petroleum and Minerals

Toshiaki Aoki JAIST

Farhad Arbab Leiden University and CWI

Cyrille Artho AIST

Kyungmin Bae Carnegie-Mellon University

Saddek Bensalem Verimag

Armin Biere Johannes Kepler University

Ansgar Fehnker University of the South Pacific

Mamoun Filali IRIT

Bernd Fischer Stellenbosch University

Klaus Havelund NASA JPL

Marieke Huisman University of Twente

Ralf Huuck NICTA

Fuyuki Ishikawa National Institute of Informatics

Takashi Kitamura AIST

Alexander Knapp Augsburg University

Yang Liu Nanyang Technological University

Robi Malik University of Waikato

Frédéric Mallet Université Nice Sophia Antipolis

Cesar Munoz NASA Langley

Thomas Noll RWTH Aachen University

Peter Csaba Ölveczky University of Oslo

Charles Pecheur Université catholique de Louvain

Paul Pettersson Mälardalen University

Camilo Rocha Escuela Colombiana de Ingeniería

Ralf Sasse ETH Zürich

Oleg Sokolsky University of Pennsylvania

Sofiène Tahar Concordia University

Carolyn Talcott SRI International

Osaka University

Chen-Wei Wang

Mike Whalen University of Minnesota

Huibiao Zhu East China Normal University

### **Additional Reviewers**

Dunchev, Cvetan; Enoiu, Eduard Paul; Fang, Huixing; Gao, Sa; Hatvani, Leo; Huang, Yanhong; Hung, Dang Van; Jansen, Christina; Jansen, Nils; Johnsen, Andreas; Kremer, Gereon; Li, Qin; Limbrée, Christophe; Mentis, Anakreon; Mu, Chunyan; Siddique, Umair; Soualhia, Mbarka; Wu, Xi

# Table of Contents

## **Invited Presentations**

- Experience with Rule-Based Analysis of Spacecraft Logs
- Safety, Dependability and Performance Analysis of Aerospace Systems (Thomas Noll)

## **Concurrency in Hardware and Systems**

- Parallelism Analysis: Precise WCET Values for Complex Multi-Core Systems
- Formal Verification of Distributed Task Migration for Thermal Management in On-chip Multi-core Systems using nuXmv (Syed Ali Asadullah Bukhari, Faiq Khalid Lodhi, Osman Hasan, Muhammad Shafique and Joerg Henkel)
- Coalgebraic Semantic Model for the Clock Constraint Specification Language
- Modelling Hybrid Systems in Hy-tccp

## **Railway Systems**

- Formal Modeling and Verification of Interlocking Systems Featuring Sequential Release
- Dynamic State Machines for Formalizing Railway Control System Specifications (Ugo Gentile, Roberto Nardone, Adriano Peron, Valeria Vittorini, Stefano Marrone, Renato De Guglielmo, Nicola Mazzocca and Luigi Velardi)
- Modelling and Analysing the European Rail Traffic Management System in Real-Time Maude

## **Program Analysis**

- Expression-based aliasing for OO-languages
- Specifying and Verifying Concurrent C Programs with TLA+

## **Protocol Analysis; Refinement Checking, Automata**

- Key-Secrecy of PACE with OTS/CafeOBJ
- A Normalized Form for FIFO Protocols Traces, Application to the Replay of Mode-based Protocols
- A Formal Model of SysML Blocks using CSP for Assured Systems Engineering
- Checking Integral Real-time Automata for Extended Linear Duration Invariants

## **Automotive Systems**

- A Spin-based Approach for Checking OSEK/VDX Applications
- Checking the Conformance of a Promela Design to Its Formal Specification in Event-B
- Analyzing Industrial Architectural Models by Simulation and Model-Checking

# Experience with Rule-Based Analysis of Spacecraft Logs\*

Klaus Havelund and Rajeev Joshi

Jet Propulsion Laboratory California Institute of Technology California, USA

Runtime verification (RV) consists in part of checking execution traces against user-provided formalized specifications. Throughout the last decade many new systems have emerged, most of which support specification notations based on state machines, regular expressions, temporal logic, or grammars. The field of Artificial Intelligence (AI) has for an even longer period of time studied rule-based production systems, which at a closer look appear to be relevant for RV, although seemingly focused on slightly different application domains, such as business processes and expert systems. The core algorithm in many of these systems is the Rete algorithm. We have in previous work implemented a rule-based system, named LogFire, for runtime verification, founded on the Rete algorithm, as an internal DSL in the Scala programming language (in essence a library). Using Scala's support for defining DSLs allows us to write rules elegantly as part of Scala programs. This combination appears attractive from a practical point of view. LogFire is currently running daily, analyzing telemetry emitted by the Curiosity rover on Mars. The analysis provides input to a visualization tool, which gives mission operators an overview of the activities on board the rover. We illustrate the effectiveness of combining specification-based log analysis and visualization. We will discuss this application and identify pros and cons of the approach.

<sup>\*</sup> The work described in this publication was carried out at Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration.

# Safety, Dependability and Performance Analysis of Aerospace Systems

Thomas Noll\*

Software Modeling and Verification Group RWTH Aachen University, Germany http://moves.rwth-aachen.de/

#### 1 Introduction

Building modern aerospace systems is highly demanding. They should be extremely dependable, offering service without interruption (i.e., without failure) for a very long time – typically years or decades. Whereas "five nines" dependability, i.e., a 99.999% availability, is satisfactory for most safety-critical systems, for aerospace on-board systems it is not. Faults are costly and may severely damage reputations. Dramatic examples are known. Fatal defects in the control software of the Ariane-5 rocket and the Mars Pathfinder have led to headlines in newspapers all over the world. Rigorous design support and analysis techniques are called for. Bugs must be found as early as possible in the design process while performance and reliability guarantees need to be checked whenever possible. The effect of fault diagnosis, isolation and recovery must be quantifiable.

Tailored effective techniques exist for specific system-level aspects. Peer reviewing and extensive testing detect most of the software bugs, performance is checked using queueing networks or simulation, and hardware safety levels are analysed using a profiled Failure Modes and Effects Analysis (FMEA) approach. Fine. But how is the consistency between the analysis results ensured? What is the relevance of a zero-bug confirmation if its analysis is based on a system view that ignores critical performance bottlenecks? There is a clear need for an integrated, coherent approach! This is easier said than done: the inherent heterogeneous character of on-board systems involving software, sensors, actuators, hydraulics, electrical components, etc., each with its own specific development approach, severely complicates this.

#### 2 Modeling Using an AADL Dialect

About five years ago we took up this grand challenge. Within the ESA-funded COMPASS (COrrectness, Modelling and Performance of AeroSpace Systems) project, an overarching model-based approach has been developed. The key is to model on-board systems at an adequate level of abstraction using a general-purpose modelling and specification formalism based on the Architecture Analysis & Design Language (AADL) as standardised by SAE International. This enables engineers to use an industry-standard, textual and graphical notation with precise semantics to model system designs, including both hardware and software components. Ambiguities about the meaning of designs are thereby eliminated. System aspects that can be modelled are, amongst others,

- (timed) hardware operations, specified on the level of processors, buses, etc.,
- software operations, supporting concepts such as processes and threads,
- hybrid aspects, i.e., continuous, real-valued variables with (linear) time-dependent dynamics, and
- faults with probabilistic failure rates and their propagation between components.

<sup>\*</sup> We thank all co-workers in the COMPASS project for their contributions, including the groups of Alessandro Cimatti (FBK, Trento, IT), Xavier Olive (Thales Alenia Space, FR), David Lesens (Airbus Defence and Space, FR) and Yuri Yushtein (ESA/ESTEC, NL). This research has been funded by the European Space Agency via several grants.

A complete system specification describes three parts: (1) nominal behaviour, (2) error behaviour, and (3) a fault injection that relates the former and the latter by defining in which ways the occurrence of failures affects the system's nominal behaviour. Systems are described in a hierarchical, component-based manner such that the structure of the model strongly resembles the real system's structure. A detailed description of the language and its formal semantics can be found in [3].

#### 3 Formal Verification

This coherent and multi-disciplinary modelling approach is complemented by a rich palette of analysis techniques. The richness of the AADL dialect gives the power to specify and generate a single system model that can be analysed for multiple qualities: reliability, availability, safety, performance, and their mixture. All analysis outcomes are related to the same system's perspective, thus ensuring compatibility. First and foremost, mathematical techniques are used to enable an early integration of bug hunting in the design process. This reduces the time that is typically spent on a posteriori testing – in on-board systems, more time and effort is spent on verification than on construction! – and allows for early adaptations of the design. The true power of the applied techniques is their almost full automation: once a model and a property (e.g., can a system ever reach a state in which it cannot progress?) are given, running the analysis is push-button technology. In case the property is violated, diagnostic feedback is provided in terms of a counterexample which is helpful to identify the cause of the property refutation. These model-checking techniques [1,5] are based on a full state space exploration, and detect all kinds of bugs, in particular also those that are due to the intricacies of concurrency: multiple threads acting on shared data structures. Bugs of this type are becoming increasingly frequent, as multi-threading grows at a staggering rate.
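To make the idea of push-button state space exploration with counterexample feedback concrete, the following is a minimal sketch of an explicit-state search for the example property quoted above (can the system reach a state in which it cannot progress?). It is not the algorithm used by the COMPASS toolset; the function name `find_deadlock`, the dictionary encoding of the transition relation, and the two-thread toy model are assumptions made purely for illustration.

```python
from collections import deque


def find_deadlock(initial, transitions):
    """Breadth-first search over an explicit state graph.

    Returns the shortest path from `initial` to a state without outgoing
    transitions (a deadlock), or None if no such state is reachable.
    """
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        successors = transitions.get(state, [])
        if not successors:
            # Rebuild the counterexample by following parent links backwards.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors:
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


# Hypothetical model of two threads acquiring two shared locks in opposite order.
toy_model = {
    "idle": ["t1_has_a", "t2_has_b"],
    "t1_has_a": ["t1_has_a_b", "both_waiting"],
    "t2_has_b": ["t2_has_a_b", "both_waiting"],
    "t1_has_a_b": ["idle"],
    "t2_has_a_b": ["idle"],
    "both_waiting": [],  # each thread waits for the lock held by the other
}

counterexample = find_deadlock("idle", toy_model)
print(counterexample)
```

The returned path plays the role of the diagnostic counterexample: it lists the states leading from the initial state to the offending one. Real model checkers work on far larger, symbolically or compactly represented state spaces.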

#### 4 Requirements Specification

Whereas academic tools rely on properties defined in mathematical logic, a formalism that is a major obstacle for usage by design engineers, COMPASS uses specification patterns [6, 8]. These patterns act as parametrised "templates" for the engineers and thus offer a comprehensible and easy-to-use framework for requirement specification. In order to ensure the quality of requirements, they can be validated independently of the system model. This includes property consistency (i.e., checking that requirements do not exclude each other) and property assertion (i.e., checking whether an assertion is a logical consequence of the requirements).
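The two validation checks can be illustrated on a deliberately simplified, purely propositional example. This is only a sketch: requirements in COMPASS are pattern-based and typically temporal, whereas here each requirement is assumed to be a Boolean predicate over a handful of hypothetical variables (`heater_on`, `overheated`, `alarm`), and both checks are done by brute-force enumeration.

```python
from itertools import product


def assignments(variables):
    """Enumerate every truth assignment over the given variable names."""
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))


def consistent(requirements, variables):
    """Property consistency: some assignment satisfies all requirements."""
    return any(all(r(a) for r in requirements) for a in assignments(variables))


def entails(requirements, assertion, variables):
    """Property assertion: the assertion holds wherever all requirements hold."""
    return all(
        assertion(a)
        for a in assignments(variables)
        if all(r(a) for r in requirements)
    )


# Hypothetical requirements for a thermal regulation component.
variables = ["heater_on", "overheated", "alarm"]
requirements = [
    lambda a: not a["overheated"] or a["alarm"],          # overheated implies alarm
    lambda a: not a["overheated"] or not a["heater_on"],  # overheated implies heater off
]

print(consistent(requirements, variables))
print(entails(requirements, lambda a: not (a["overheated"] and a["heater_on"]), variables))
```

Adding a requirement that contradicts an existing one, for example "overheated and no alarm", makes `consistent` return `False`, which is the kind of defect the consistency check is meant to expose before any system model exists.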

#### 5 Safety

Analysing system safety and dependability is supported by key techniques such as (dynamic) fault tree analysis (FTA), (dynamic) Failure Modes and Effects Analysis (FMEA), fault tolerance evaluation, and criticality analysis [2]. System models can include a formal description of both the fault detection and isolation subsystems, and the recovery actions to be taken. Based on these models, tool facilities are provided to analyse the operational effectiveness of the FDIR (Fault Detection, Isolation and Recovery) measures, and to assess whether the observability of system parameters is sufficient to make failure situations diagnosable.

#### 6 Toolset

All techniques and the full modelling approach are supported by the COMPASS toolset [4], which is freely downloadable for all ESA member states from http://compass.informatik.rwth-aachen.de/. The tool features a graphical user interface and runs under the Linux OS.

#### 7 Industrial Evaluation

The COMPASS approach and toolset were intensively tested on serious industrial cases by Thales Alenia Space in Cannes (France). These cases include thermal regulation in satellites and satellite mode management with its associated FDIR strategy. It was concluded that the modelling approach based on AADL provides sufficient expressiveness to model all hardware and software subsystems in satellite avionics. The hierarchical structure of specifications and the component-based paradigm enable the reuse of models. Incremental modelling is also very well supported. The Reliability, Availability, Maintainability and Safety (RAMS) analyses as provided by the toolset were found to be mature enough to be adopted by industry, indicating that the integrated COMPASS approach significantly reduces the time and cost for safety analysis compared to traditional on-board design processes [9]. Those findings were confirmed by applying our formal modelling and analysis techniques on a regular industrial-size design of a modern satellite platform in parallel with the conventional software development of the platform [7].

#### References

- 1. Baier, C., Katoen, J.P.: Principles of Model Checking. MIT Press (2008)
- 2. Bozzano, M., Villafiorita, A.: Design and Safety Assessment of Critical Systems. CRC Press (2010)
- 3. Bozzano, M., Cimatti, A., Katoen, J.P., Nguyen, V.Y., Noll, T., Roveri, M.: Safety, dependability, and performance analysis of extended AADL models. The Computer Journal 54(5) (2011) 754–775
- 4. Bozzano, M., Cimatti, A., Katoen, J.P., Nguyen, V.Y., Noll, T., Roveri, M., Wimmer, R.: A model checker for AADL (tool presentation). In: Proc. of 22nd Int. Conf. on Computer Aided Verification (CAV 2010). Volume 6174 of LNCS, Springer (2010) 562–565
- 5. Clarke, E., Grumberg, O., Peled, D.: Model Checking. MIT Press (1999)
- 6. Dwyer, M., Avrunin, G., Corbett, J.: Patterns in property specifications for finite-state verification. In: Int. Conf. on Software Engineering (ICSE), IEEE CS Press (1999) 411–420
- 7. Esteve, M.A., Katoen, J.P., Nguyen, V.Y., Postma, B., Yushtein, Y.: Formal correctness, safety, dependability and performance analysis of a satellite. In: 34th Int. Conf. on Software Engineering (ICSE 2012), ACM and IEEE CS Press (2012) 1022–1031
- 8. Grunske, L.: Specification patterns for probabilistic quality properties. In: Int. Conf. on Software Engineering (ICSE), ACM (2008) 31–40
- 9. Yushtein, Y., Bozzano, M., Cimatti, A., Katoen, J.P., Nguyen, V., Noll, T., Olive, X., Roveri, M.: System-software co-engineering: Dependability and safety perspective. In: Proc. 4th IEEE Int. Conf. on Space Mission Challenges for Information Technology (SMC-IT 2011), IEEE CS Press (2011) 18–25

# Parallelism Analysis: Precise WCET Values for Complex Multi-Core Systems

Timon Kelter and Peter Marwedel

Department of Computer Science, TU Dortmund Otto-Hahn-Straße 16, 44227 Dortmund, Germany {timon.kelter,peter.marwedel}@tu-dortmund.de

Abstract. In the verification of safety-critical real-time systems, the problem of determining the worst-case execution time (WCET) of a task is of utmost importance. Safe formal methods have been established for solving the single-task, single-core WCET problem. The de-facto standard approach uses abstract interpretation to derive basic block execution times and a combinatorial path analysis which derives the longest path through the program. WCET analyses for multi-core computers have extended this methodology by assuming that shared resources are partitioned in either time or space and that therefore each core can still be analyzed separately. For real-world multi-cores this assumption is often not true, making the classic WCET analysis approach either inapplicable or highly pessimistic. To overcome this, we present a new technique to explore the interleavings of a parallel task system as well as an exclusion criterion to prove that certain interleavings can never occur. We show how this technique can be integrated into existing WCET analysis approaches and finally provide results for the application of this new analysis type to a collection of real-time benchmarks, where average WCET reductions of 32% were observed.

Keywords: WCET, Multi-Core, Parallelism, Shared Resources

#### 1 Introduction

WCET analysis is an important prerequisite for schedulability analysis and for overall system validation of safety-critical real-time systems, i.e. systems in which tasks must complete within a given deadline. The runtime of any task  $\tau$  depends on its inputs, on the system state at the start of  $\tau$  and on the interference imposed on  $\tau$  by preempting tasks on the same core or by parallel tasks running on other cores. To compute the WCET, first an abstract interpretation on the domain of abstract system hardware states is run. With the resulting hardware state overestimations a safe bound on the runtime of each basic block can be derived. This procedure is called *microarchitectural analysis* (MA). As the last step, the path analysis determines the longest path through the program with the help of the basic block runtimes determined by the MA [19]. In this paper we propose an abstract interpretation of the system hardware state that is able to efficiently explore all possible interactions between multiple concurrently running tasks.
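The last step, path analysis over per-block runtime bounds, can be sketched for the simplest possible case. The snippet below assumes an acyclic control-flow graph and hypothetical block times; production analyses handle loops through loop bounds and usually formulate the problem as an integer linear program (IPET) rather than a plain graph traversal.

```python
from functools import lru_cache


def longest_path(cfg, block_time, entry):
    """Longest entry-to-exit path length in an acyclic CFG.

    `cfg` maps each basic block to its successors and `block_time` maps
    each block to the runtime bound derived by the microarchitectural analysis.
    """

    @lru_cache(maxsize=None)
    def wcet_from(block):
        # Worst case from here: own time plus the worst successor path.
        tail = max((wcet_from(s) for s in cfg[block]), default=0)
        return block_time[block] + tail

    return wcet_from(entry)


# Hypothetical if-then-else task with per-block bounds in cycles.
cfg = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}
block_time = {"entry": 3, "then": 10, "else": 4, "exit": 2}

bound = longest_path(cfg, block_time, "entry")
print(bound)  # 3 + 10 + 2 = 15 cycles via the "then" branch
```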


As soon as multiple cores may access a shared hardware resource in parallel, the runtimes of parallel tasks are no longer independent but they depend on

- 1. the order in which the requests arrive at the shared resource and
- 2. the policy with which requests to the shared resource are arbitrated.

Previous work has eliminated the first dependency by choosing a state-partitioned arbitration strategy which guarantees that the actions of any core C cannot modify the state of the shared resource as seen by cores $C_o \neq C$. This implies that the delay for any access from C is independent of the potential concurrent accesses from all $C_o \neq C$. Therefore, we can still perform a per-core analysis and the state space does not become much bigger than for the single-core case. An example of such a state-partitioned strategy is time-division multiple access (TDMA) [5]. However, state-partitioned arbitration increases the average access delay compared to state-permeable strategies like fair arbitration (FAIR) and fixed-priority arbitration (PRIO) [6]. WCET analysis for these types of arbitration has been nonexistent or pessimistic at best. Therefore our main goal in this paper is to make a first step towards a precise WCET analysis for shared state-permeable resources, since they are often found in real-world systems.
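Why TDMA permits a per-core analysis can be seen from a small sketch of the access delay. The model below is an assumption for illustration only: slots of equal length are assigned round-robin by core index, and an access fits into the remainder of a slot. The function `tdma_delay` and its parameters are hypothetical names, not taken from [5].

```python
def tdma_delay(t, core, n_cores, slot_len):
    """Cycles a request issued at time `t` by `core` waits for its TDMA slot.

    Assumes equal-length slots assigned round-robin by core index and an
    access short enough to fit into the rest of the current slot.
    """
    period = n_cores * slot_len
    slot_start = core * slot_len
    offset = t % period
    if slot_start <= offset < slot_start + slot_len:
        return 0  # the request arrives inside the core's own slot
    return (slot_start - offset) % period


# Two cores, slots of two cycles: core 0 owns [0, 2), core 1 owns [2, 4).
print([tdma_delay(t, core=0, n_cores=2, slot_len=2) for t in range(4)])
```

The delay is a function of the request time and the core's own slot only. No argument describes what the other cores are doing, which is exactly the independence property that FAIR and PRIO arbitration lack.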

#### 2 Related Work

**WCET analysis.** There is an extensive body of work on single-core WCET analysis as summarized in [19], which led to the standard approach of separating the microarchitectural analysis from the path analysis. Our techniques also build upon this concept by extending the former analysis to multi-cores.

The first known approach to multi-core WCET analysis is based on the Real-Time Calculus (RTC) [14, 15]. It uses "access curves" to strongly abstract from the concrete system, which introduces strong pessimism in the results and is restricted to timing-compositional architectures [4]. The only known non-RTC-based approach to the analysis of shared state-permeable resources is based on parallel summaries [10]. For a shared cache, it precomputes worst-case interference summaries for each core which contain the effects that all program points in all possibly concurrently running tasks can have on the state of the shared resource, which also introduces considerable pessimism. The authors of [1] combined the summary-based shared cache approach from [10] with a safe abstraction for the analysis of TDMA buses [5], which results in a scalable but pessimistic WCET analysis for multi-core WCET estimation. Finally, model checkers have been used to determine multi-core WCETs [3] and these could potentially also handle state-permeable resources. Unfortunately the approach does not scale to bigger programs or realistic systems, since the generic model checker has few possibilities of pruning the huge search space.

**Parallel program analysis.** Static analysis of the synchronization structure of concurrent programs was first considered by [17], where the analysis of the "concurrency state" of the system and the notion of a parallel execution graph were first established. We build our work on this, though the analysis in [17] worked at a far more coarse-grained level. A reference approach to bit-vector-based abstract interpretation on programs with explicit fork-join parallelism is given in [9]. Unfortunately, the microarchitectural analysis that we are examining here is not a bit-vector problem. In reachability analysis for parallel programs "stubborn sets" [18] can be used to prune the search space, but again the microarchitectural analysis differs significantly from reachability analysis. Finally, a recent publication [12] examines the computation of feasible synchronization-aware parallel interleavings. Their approach focuses on path analysis and is thus orthogonal to ours.

#### 3 System and Task Model

We assume a task set T containing only strictly periodic tasks, as often found in hard real-time systems. In the following sections, we will need a common reference point in time for all running tasks, where times are measured in multiples of the shortest clock cycle. Therefore we first require that all  $\tau_i \in T$  are sharing the same period  $p_i = p_T$  and that each task is executed non-preemptively on a separate core. We will discuss how to lift these restrictions in Section 5. Each task  $\tau_i$  may have a different release time  $r_i$  within the common period.

The analysis can be adapted to any topology, but for our experiments we will use an example architecture with n = |T| ARM7TDMI cores,<sup>1</sup> each having a private cache and a scratchpad. The cores are connected to a shared bus which is arbitrated under either TDMA, FAIR round-robin or fixed core priorities. Behind the bus, shared instruction and data caches are located as well as non-cached memories.

#### 4 Parallelism Analysis

Before starting with the formal part of the framework, we briefly sketch the intuition behind the analysis procedure. Our goal will be to efficiently explore all feasible interleavings of multiple tasks running in parallel. As an example, consider the execution of the tasks from Figure 1 under the assumption that both tasks start concurrently at time 0. For this assumption we can find all valid parallel execution scenarios from the parallel execution graph (PEG) shown in Figure 2. The construction of this graph starts with nodes corresponding to the initial system states, in this case with only the node AE (the $\delta$-values will be explained below). From these start nodes, we iteratively simulate cycle steps of the system. To keep our example PEG from Figure 2 sufficiently small, we assume that every block will take one cycle to complete. Therefore, our initial block AE is terminated after the first cycle and the execution must continue in one of the nodes AE, BE, BF and AF. To generate these successors we simply follow all combinations of successor blocks in the task CFGs. The loop bounds are not used here. If we continue the graph construction in this manner, we will end up with a full product graph of the task CFGs. When every core has reached the end of its task, indicated by the $\perp$ sign in Figure 2, we add a back-edge from $\perp$ to AE to account for the repeated execution of the tasks in the cyclic schedule. The purpose of this final PEG is that it contains each basic block of each task in all possible parallel execution scenarios. Thus we can derive the WCET of each basic block from the PEG and use these to compute the task WCETs.

<sup>1</sup> The choice of ARM7TDMI cores is motivated by the fact that we already have an implementation of the abstract pipeline model for these cores (compare Section 4.4).

Fig. 2: The final Parallel Execution Graph for tasks $\tau_1$ and $\tau_2$ from Figure 1, starting synchronously at time 0.
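The successor generation step described in the text, following all combinations of successor blocks in the task CFGs, can be sketched as below. Only the edges A→A, A→B, E→E and E→F are implied by the text (node AE leads to AE, BE, BF and AF); the rest of the CFGs from Figure 1 is not reproduced here, so the dictionaries are a partial, assumed encoding, and the sketch relies on the simplification that every block takes exactly one cycle.

```python
from itertools import product


def peg_successors(node, cfgs):
    """All PEG successor nodes of `node` under the one-cycle-per-block assumption.

    `node` holds one current basic block per core; `cfgs` holds the matching
    per-task CFGs as successor dictionaries.
    """
    per_core_choices = [cfg[block] for cfg, block in zip(cfgs, node)]
    return [tuple(combo) for combo in product(*per_core_choices)]


# Partial CFGs: only the edges leaving A and E are taken from the text.
cfg_task1 = {"A": ["A", "B"]}
cfg_task2 = {"E": ["E", "F"]}

successors = peg_successors(("A", "E"), (cfg_task1, cfg_task2))
print(["".join(s) for s in successors])
```

With blocks of differing lengths, cores would not advance in lockstep, so an actual implementation has to track how far each core has progressed inside its current block rather than stepping whole blocks at once.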

As visible, the PEG in Figure 2 is *not* a full product graph of the graphs from Figure 1. The construction of the graph has been stopped at nodes BE, AG, BG, DF and DG. To explain why this was done, and why it is correct, we need the $\delta$-values and the loop bounds. We define $\delta^{(i)}$ as an interval containing all points in time, measured from the beginning of the common period $p_T$, at which a node may be entered on core i. Initially we set $\delta^{(1)} = \delta^{(2)} = [0,0]$ for node AE, since core 1 enters block A and core 2 enters block E at time 0. From here on, every time we visit a node X in the analysis, we recompute its $\delta$ intervals with the help of a path analysis which computes the length of the shortest and longest paths to the basic blocks in X. As an example, when we visit node AE the second time, we have already seen that both block A and E complete within one cycle. Therefore, since A can be executed at most three times and E at most two times (see Figure 1), the path analysis can infer that any execution of block A must begin in the time frame $\delta^{(1)} = [0,2]$ and similarly any execution of block E must begin within $\delta^{(2)} = [0,1]$. Thus, the path analysis always operates only on the CFGs of the individual tasks, *not* on the PEG. The PEG is only used to compute the possible runtimes of the basic blocks within the tasks.

The path analysis for node BE yields $\delta^{(1)} = [2,3]$ (due to the loop at A which must complete before B) and $\delta^{(2)} = [0,1]$. Here we can see the application of the computed $\delta$-values: we can exclude this node from the PEG and thus from the analysis. Through the $\delta$-values we know that at this point blocks B and E cannot be executed concurrently because their execution time windows do not overlap. All blocks for which we can prove this can be removed from the PEG as long as their $\delta$-values stay unmodified. In Figure 2 these removed blocks are marked by a dotted border. If accesses to a shared resource with a duration of one cycle occurred in B and E, we would still obtain the same PEG, which shows that these accesses can never interfere with each other.
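The overlap test behind this exclusion can be sketched in a few lines of Python. This is only an illustration under the assumption that each $\delta$-window is a closed integer interval; the names `intersect` and `can_run_concurrently` are introduced here and are not part of the described framework.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # closed interval [lo, hi], measured in cycles


def intersect(windows: List[Interval]) -> Optional[Interval]:
    """Return the common overlap of all windows, or None if it is empty."""
    lo = max(w[0] for w in windows)
    hi = min(w[1] for w in windows)
    return (lo, hi) if lo <= hi else None


def can_run_concurrently(windows: List[Interval]) -> bool:
    """A PEG node is feasible only if the entry windows of all cores overlap."""
    return intersect(windows) is not None


# Node BE from the text: delta^(1) = [2, 3], delta^(2) = [0, 1].
print(can_run_concurrently([(2, 3), (0, 1)]))  # False, so BE is excluded
# Node AE from the text: delta^(1) = [0, 2], delta^(2) = [0, 1].
print(can_run_concurrently([(0, 2), (0, 1)]))  # True, so AE stays in the PEG
```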

#### 4.1 Framework

The phases of our WCET analysis framework are shown in Figure 3. We use the same CFG reconstruction, value analysis and path analysis stages as the classical WCET analysis [19]. These stages also work on each task in isolation. Only the microarchitectural analysis differs: we first construct the initial PEG states, based on the system schedule. Then we conduct a data-flow analysis on the PEG until the PEG itself as well as the associated system states have reached a fixed point. From this converged PEG we extract the basic block runtimes that are finally used to compute the WCET and BCET in an IPET-based path analysis.



Fig. 3: The analysis framework. The dashed parts are new contributions compared to [19] and will be discussed in the next sections.

#### 4.2 Prerequisites

To precisely define our analysis procedure we will need some terminology which is introduced in the following.

Given a set of tasks $T$ together with CFGs $G_{\tau} = (V_{\tau}, E_{\tau})$ for all $\tau \in T$, a task execution position $\psi_{\tau}$ is a tuple $(v, i, c, d)$, where $v \in V_{\tau}$ is a basic block, $i \in v$ is an instruction within that basic block and $c$ is the number of cycles that were already spent on the processing of this instruction. Finally, $d$ is the number of cycles that the task must wait until its execution will begin. A system execution position (SEP) $\Psi$ on $n$ cores is an $n$-tuple with $\Psi \in \hat{\Psi} = \times_{i=1}^n \left(\hat{\psi}_{\tau_i} \cup \{\bot\}\right)$, $\tau_i$ being the task mapped to core $i$. The special token $\bot$ indicates that the respective core is currently idle. Here and in the following we use $\hat{A}$ to denote the set of all tuples of type $A$. The motivation for this definition is that, unlike in our introductory example from Figure 2, real basic blocks will contain more than one instruction,<sup>2</sup> each of which may take multiple cycles to complete. Still, we need to be able to split the execution of each basic block into chunks which may be as small as a single CPU cycle, as we will see in the following. We will use SEPs to specify the point at which the execution is resumed in a PEG block; SEPs therefore correspond to the block labels from Figure 2 (e.g. AE, BE, AF, etc.).
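To make the tuple structure concrete, the following Python sketch shows one possible encoding of task execution positions and SEPs. The class and function names are assumptions made for illustration only; `None` stands in for the idle token $\bot$, and basic blocks are identified by plain strings.

```python
from typing import NamedTuple, Optional, Set, Tuple


class TaskPosition(NamedTuple):
    """One task execution position (v, i, c, d)."""
    block: str   # v: basic block identifier
    instr: int   # i: index of the current instruction within the block
    cycles: int  # c: cycles already spent on this instruction
    delay: int   # d: cycles the task still has to wait before it starts


# A system execution position holds one entry per core; None models an idle core.
SEP = Tuple[Optional[TaskPosition], ...]


def cores_without_delay(sep: SEP) -> Set[int]:
    """Return the 1-based indices of cores whose task has no remaining delay."""
    return {k for k, pos in enumerate(sep, start=1)
            if pos is not None and pos.delay == 0}


# Two tasks starting synchronously, as in the introductory example.
synchronous = (TaskPosition("A", 0, 0, 0), TaskPosition("E", 0, 0, 0))
# Hypothetical variant in which the second task is released three cycles later.
staggered = (TaskPosition("A", 0, 0, 0), TaskPosition("E", 0, 0, 3))

print(sorted(cores_without_delay(synchronous)))  # [1, 2]
print(sorted(cores_without_delay(staggered)))    # [1]
```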

An abstract parallel system state (APSS) $\Sigma \in \hat{\Sigma}$ is a structure which models a set of concrete states of an entire parallel system, including all cores and memory hierarchy elements. Again, $\hat{\Sigma}$ is the set of all possible APSSs. We give more detail on how to form proper APSSs at a later point; for now we only require a cycle step function $\xi_{\Sigma}: \hat{\Sigma} \times \hat{\Psi} \times 2^{\{1,\dots,n\}} \to (\{0,1\}^n \times \hat{\Sigma})$. The invocation of $\xi_{\Sigma}(\Sigma, \Psi, \alpha)$ must simulate all possible state transitions that may happen when a single clock cycle is executed at position $\Psi$ in system state $\Sigma$. However, only the cores in the set $\alpha \subseteq \{1,\dots,|T|\}$ may perform a cycle step, so that different release times can be accounted for. For any instruction completion vector $c \in \{0,1\}^n$ which may occur in this cycle, it must specify the result state, where $c$ defines for each core whether it has completed the execution of its current instruction (1) or not (0). The "current instruction" is always given by the "program counter" register value.

The APSSs will be subject to a data-flow analysis; therefore we also require a partial order $\sqsubseteq$ on $\hat{\Sigma}$ such that $(\hat{\Sigma}, \sqsubseteq)$ is a lattice [7], with a *supremum* or *join* function $\sqcup : \hat{\Sigma} \times \hat{\Sigma} \to \hat{\Sigma}$. Intuitively, since APSSs represent sets of concrete states, $\Sigma_1 \sqsubseteq \Sigma_2$ specifies whether $\Sigma_2$ completely contains $\Sigma_1$. To ensure the termination of the data-flow framework, $\xi_{\Sigma}$ must also be monotonic with respect to $\sqsubseteq$.
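A minimal Python sketch of such a lattice is given below, under the simplifying assumption that an APSS is just a finite set of opaque state labels, so that $\sqsubseteq$ is set inclusion and $\sqcup$ is set union. The real abstract domains are richer; the names `leq` and `join` and the state labels are hypothetical.

```python
from typing import FrozenSet

APSS = FrozenSet[str]  # assumption: an APSS is a set of opaque state labels


def leq(a: APSS, b: APSS) -> bool:
    """a ⊑ b: b contains every state that a contains."""
    return a <= b


def join(a: APSS, b: APSS) -> APSS:
    """a ⊔ b: the smallest APSS that contains both a and b."""
    return a | b


s1 = frozenset({"hit", "miss"})
s2 = frozenset({"miss", "bus_wait"})
joined = join(s1, s2)

print(leq(s1, joined), leq(s2, joined))  # True True
print(leq(joined, s1))                   # False
```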

A Parallel Execution Graph $G_P = (V_P, E_P)$ is a directed graph with node set $V_P \subseteq \hat{\Psi}$ and edge set $E_P \subseteq V_P \times V_P$. For any PEG we define a block time window function $\delta: V_P \to \hat{I}^n$, an edge state function $\lambda: E_P \to \hat{\Sigma}$ and a block length function $\omega: V_P \to \mathbb{N}$. $\hat{I} = \{[x,y] \in 2^{\mathbb{N}} \mid x \leq y\}$ is the set of all execution time intervals, measured in cycles from the last point where all cores were synchronized. The time window function will be used to rule out infeasible SEPs as indicated in Figure 2, the edge state function is used to propagate the possible hardware states from one PEG node to the other, and the block length function specifies how many cycles were spent on the execution of a PEG node. The three functions are not defined a priori. They will be computed by the algorithms presented in the following.

We denote by  $v_1 \leadsto_G v_2$  that there is a path in the directed graph G = (V, E) from  $v_1 \in V$  to  $v_2 \in V$ , i.e. that  $v_2$  is reachable from  $v_1$ .
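Reachability in this sense can be computed with a standard breadth-first search. The sketch below assumes that the graph is given as an adjacency dictionary and that a path contains at least one edge; the function name `reachable` and the node labels are illustrative only.

```python
from collections import deque
from typing import Dict, Hashable, List, Set


def reachable(edges: Dict[Hashable, List[Hashable]], source: Hashable) -> Set[Hashable]:
    """Return all nodes z such that a path with at least one edge leads from source to z."""
    seen: Set[Hashable] = set()
    queue = deque(edges.get(source, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(edges.get(node, []))
    return seen


# Hypothetical graph: u -> v -> w, with a back-edge w -> u and an isolated node x.
graph = {"u": ["v"], "v": ["w"], "w": ["u"], "x": []}
print(sorted(reachable(graph, "u")))  # ['u', 'v', 'w']
print(sorted(reachable(graph, "x")))  # []
```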

#### Algorithm 1 PEG-driven parallelism analysis

```
 1 function ParallelismAnalysis(\Sigma_{\text{start}}, G_{\tau_1}, ..., G_{\tau_n})
 2     Q \leftarrow (v_{\tau_1}^{\text{start}}, 0, 0, r_1) \times \dots \times (v_{\tau_n}^{\text{start}}, 0, 0, r_n)    ▷ Initialize start block
 3     G_P \leftarrow (Q, \emptyset)
 4     \forall v \in Q : \delta(v) \leftarrow [0,0], \lambda((\bot,v)) \leftarrow \Sigma_{\text{start}}    ▷ Initialize start state
 5     while Q \neq \emptyset do    ▷ Analyze next block
 6         v \leftarrow \text{PopFront}(Q)
 7         for i \in \{1, ..., n\} do
 8             \delta(v)^{(i)} \leftarrow \bigcup_{(u,v) \in E_P} \delta(u)^{(i)} + \omega(u)    ▷ Update \delta-window for all cores
 9             if IsLoopHeadOrExit(v^{(i)}) then
10                 \delta(v)^{(i)} \leftarrow r_i + \text{PATHANALYSIS}(v^{(i)}, G_{\tau_i}, G_P, \omega)
11         if \bigcap_{i=1}^n \delta(v)^{(i)} = \emptyset then    ▷ If the exclusion criterion holds ...
12             V_P \leftarrow V_P \setminus \{v\}    ▷ ... inhibit the block creation ...
13         else    ▷ ... else analyze the current block
14             \lambda_{\text{prev}} \leftarrow \lambda, G_{P,\text{prev}} \leftarrow G_P
15             (G_P, \lambda, \omega) \leftarrow \text{AnalyzeBlock}(v, G_P, \lambda, \omega)
16             if \lambda_{\text{prev}} \neq \lambda \vee G_{P,\text{prev}} \neq G_P then    ▷ If graph or states were altered ...
17                 \forall (v,z) \in E_P : \text{PushBack}(Q, z)
18                 if E_{P,\text{prev}} \neq E_P then    ▷ If edges were added ...
19                     \forall v \leadsto_{G_P} z : \text{PushBack}(Q, z)    ▷ ... propagate \delta-changes
20     return (G_P, \omega)
```

#### 4.3 Analysis Algorithm

The outline of the main analysis is shown in Algorithm 1. It starts with an initialization of the work-list $Q$ in line 2. According to the system schedule, the initial SEP consists of the beginning of the start block of each task $(v_{\tau_i}^{\text{start}})$ with a delay of $r_i$ cycles. This SEP is assigned a time window of $[0,0]$. We also create a virtual edge $(\bot,v)$ pointing to it, which is assigned the initial APSS $\Sigma_{\text{start}}$. Then we process items from the queue $Q$ until it is empty (line 5). In the main loop, we extract the first block $v$ from the queue. In line 8 we infer the block time window for all task positions $v^{(i)} \in v$ from the windows and runtimes of its predecessors.<sup>3</sup>
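The window update in line 8 can be illustrated with a short Python sketch. It assumes that the union of the shifted predecessor windows is represented by their interval hull, since each core keeps a single interval per node; this representation, the function names and the example numbers are assumptions for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # closed interval [lo, hi], measured in cycles


def shift(window: Interval, omega: int) -> Interval:
    """Delay a predecessor's entry window by that predecessor's block length."""
    return (window[0] + omega, window[1] + omega)


def update_window(predecessors: List[Tuple[Interval, int]]) -> Interval:
    """Combine (window, block length) pairs of all predecessors into one entry window."""
    shifted = [shift(window, omega) for window, omega in predecessors]
    return (min(lo for lo, _ in shifted), max(hi for _, hi in shifted))


# Hypothetical node with two predecessors on the same core:
# one entered within [0, 2] and running 1 cycle, one entered at [4, 4] and running 2 cycles.
print(update_window([((0, 2), 1), ((4, 4), 2)]))  # (1, 6)
```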

If $v$ is part of a sequential block chain, the $\delta$-update in line 8 is sufficient. On the other hand, if $v$ is a loop head (like A in Figure 1) or a loop exit (like B in Figure 1), then we have to take the loop bounds into account to determine the block time window, as we did in the computation of $\delta^{(1)}$ for e.g. AE and BE in Figure 2. This is done in line 10, where the existing path analysis of our framework is used to compute the shortest and the longest path from $v_{\tau_i}^{\text{start}}$ to $v^{(i)}$. We currently use an adapted IPET analysis based on Integer Linear Programming [11] here, but advanced single-source all-sinks analyses would be even better suited [8]. It follows the given loop bounds and uses the BBRUNTIME function as shown in Algorithm 2 to determine the runtime of individual basic blocks in $G_{\tau_i}$. If any block $u \in G_{\tau_i}$ with $u \leadsto_{G_{\tau_i}} v^{(i)}$ is not yet covered in $G_P$, then BBRUNTIME will return $\emptyset$ for its runtime. In that case, the path analysis will return $[0, \infty]$ for the path length to $v^{(i)}$.

<sup>2</sup> In the example we have not even differentiated between basic blocks and instructions.

<sup>3</sup> Here and in the following we use $()^{(i)}$ to access the *i*-th element of a tuple.

The  $\delta$  values are used in line 11, where we try to apply the *block exclusion criterion* by intersecting all block time windows. If the intersection is empty, this SEP cannot be reached from its current predecessors and we remove it from the graph in line 12. This is exactly what we have done with BE in Figure 2. Still, we may re-add v in the future when it becomes accessible via new edges. Then we will re-check whether our exclusion criterion still holds.

If the exclusion criterion does not hold (line 13), we analyze the parallel execution block (PEB) beginning at node $v$ (line 15). This analysis will determine a block runtime $\omega(v)$ and an output APSS for all out-edges of $v$, and it may alter $G_P$. If the output states or the graph are changed, we push the successors of $v$ into the work-list in line 17. By doing this, all changes to the block time windows $\delta$, edge states $\lambda$ and block runtimes $\omega$ will be propagated through the graph. Finally, if we have added edges to the PEG, we also push all blocks $z$ which are reachable from $v$ into $Q$ (line 19), to ensure that a new attempt to compute $\delta(z)$ is started if $z$ is a loop head or exit. The algorithm terminates when no more edges are added and all edge states have converged.

#### **Algorithm 2** Basic block runtime extraction

All in all, Algorithm 1 is a standard data-flow analysis work-list algorithm, with the difference that we are dynamically re-sorting the work-list (line 5) and dynamically expanding (line 15) the underlying graph. The block removal in line 12 can be viewed as "delaying" or "inhibiting" part of the graph expansion, since it can only happen when a block is first visited. When ParallelismAnalysis has finished, all reachable blocks of all tasks will have been visited in one or more parallel execution blocks and BBRUNTIME will therefore yield valid runtimes for all basic blocks.

To complete the view on the analysis, Algorithm 3 shows the function AnalyzeBlock, which is tightly coupled with Algorithm 1. First, the incoming APSSs are joined in line 2. The current system execution position $\Psi_{\text{run}}$ is initialized to $v$ (remember that $V_P \subseteq \hat{\Psi}$) and the block duration $\omega(v)$ is set to zero. Then we simulate the effect of successive system cycle steps on $\Psi_{\text{run}}$ and $\Sigma_{\text{run}}$, until on any core either a) the end of a basic block is reached or b) the successor SEP is ambiguous. The latter happens when it is uncertain in APSS $\Sigma_{\text{run}}$ whether the current instruction of at least one core will complete or not. In this case we track all completion combinations in separate successor blocks.

The first step in each cycle is to invoke the APSS cycle step function $\xi_{\Sigma}$, which is done in line 5, but only for those cores with zero delay cycles (set $\alpha$). The APSS cycle step function $\xi_{\Sigma}$ returns a mapping $\kappa \subseteq \{0,1\}^n \times \hat{\Sigma}$, i.e. it associates instruction completion vectors with successor APSSs. Line 6 checks the two block termination conditions a) and b) mentioned above. The helper function $\phi_c^{\alpha}: \hat{\Psi} \to \hat{\Psi}$ generates the successor SEP for a given SEP $\Psi$, instruction completion vector $c$ and active core set $\alpha$. If neither a basic block end is reached nor the successor SEP is ambiguous, we take over the results of the cycle step as our new working SEP $\Psi_{\text{run}}$ and APSS $\Sigma_{\text{run}}$ in line 7 and increment the cycle counter for this block in line 8. Here, $\Psi_{\text{run}}^{(i)(1)}$ is the basic block executed by core $i$, $\kappa^{(1)(1)}$ is the first instruction completion vector and $\kappa^{(1)(2)}$ is its associated successor APSS.

If the block end is detected, we terminate the current block as shown from line 9 on. It is an invariant of our analysis that the length of a block can only stay the same or be reduced in successive analyses of the same block. Therefore we only check in line 10 whether the block has been shortened. This may happen due to a newly joined-in APSS that triggers an earlier ambiguous successor SEP. In this case, we remove all previous out-edges of the current block $v$ (line 11). In any case, we add for each instruction completion vector $c$ an out-edge to $\phi_c^{\alpha}(\Psi_{\text{run}})$ which gets annotated with the respective out-state $\Sigma_c$ (lines 13–16). In the end, the modified graph, edge states and block lengths are returned in line 18.

#### Algorithm 3 PEG block analysis

```
 1 function AnalyzeBlock(v, G_P, \lambda, \omega)
 2     \Sigma_{\text{run}} \leftarrow \bigsqcup_{\forall e = (u,v) \in E_P} \lambda(e)    ▷ Join incoming states
 3     \Psi_{\text{run}} \leftarrow v, \omega_{\text{prev}} \leftarrow \omega, \omega(v) \leftarrow 0
 4     while true do
 5         \kappa \leftarrow \xi_{\Sigma}(\Sigma_{\text{run}}, \Psi_{\text{run}}, \alpha = \{i | \Psi_{\text{run}}^{(i)} = (\cdot, \cdot, \cdot, 0)\})    ▷ Simulate next cycle in block
 6         if |\kappa| = 1 \land \nexists i : (\phi_{\kappa^{(1)(1)}}^{\alpha}(\Psi_{\text{run}}))^{(i)(1)} \neq \Psi_{\text{run}}^{(i)(1)} then    ▷ Split/Basic block end?
 7             \Sigma_{\text{run}} \leftarrow \kappa^{(1)(2)}, \Psi_{\text{run}} \leftarrow \phi_{\kappa^{(1)(1)}}^{\alpha}(\Psi_{\text{run}})    ▷ If not, prepare next cycle
 8             \omega(v) \leftarrow \omega(v) + 1
 9         else    ▷ Else terminate the current block
10             if \omega(v) < \omega_{\text{prev}}(v) then
11                 E_P \leftarrow E_P \setminus \{(v, w) \in E_P\}
12             for (c \to \Sigma_c) \in \kappa do    ▷ Add new successors and out-states
13                 V_P \leftarrow V_P \cup \{v_{\text{new}} = \phi_c^{\alpha}(\Psi_{\text{run}})\}
14                 \delta(v_{\text{new}}) \leftarrow [0, \infty], \omega(v_{\text{new}}) \leftarrow \infty
15                 E_P \leftarrow E_P \cup \{e_{\text{new}} = (v, v_{\text{new}})\}
16                 \lambda(e_{\text{new}}) \leftarrow \Sigma_c
17             break
18     return (G_P, \lambda, \omega)
```

With Algorithm 3 we have completed the macroscopic side of the analysis. In the next subsection we will examine the microscopic perspective, namely how to efficiently represent abstract parallel system states.

#### 4.4 Parallel System State Models


An APSS must model the *state* of all microarchitectural components which are relevant to the timing of the system, i.e. all cores and their pipelines and all *memory hierarchy elements* (MHEs) like private and/or shared caches, buses and memories. Here, *state* denotes an approximation of the relevant content of the component as well as the operation that the component is currently performing.

Therefore we define an APSS $\Sigma$ as a set of tuples, where each tuple contains abstract states for each pipeline and memory hierarchy element in the system. The rationale behind $\Sigma$ being a set of tuples is that we may have to split the state, e.g. when two different paths in the pipeline must be considered. These different execution paths may have identical instruction completion vectors, but we still need to maintain them separately in a common $\Sigma$ set, to trace the different microarchitectural behaviors.

The driving force behind the microarchitectural simulation is the set of core pipelines, which are modeled as non-deterministic finite-state machines [19]. In each cycle, the abstract pipeline states follow all transitions which are enabled according to their current state, which includes the currently executing instructions. Multiple transitions may be enabled due to uncertainty in the analysis, e.g. due to statically unknown memory access targets and register values. In such a case, one successor state is generated for every possible transition. During the abstract cycle step, the pipeline models issue memory transactions as dictated by the machine specification. Completion of such transactions is signaled back from the abstract MHE states to the affected pipeline state. Finally, the completion of instructions, known as the *commit* of an instruction, is communicated to our framework via an entry in the instruction completion vector as introduced in Section 4.2.

In every cycle step, i.e. every invocation of $\xi_{\Sigma}$, we perform the cycle step independently on each tuple $\sigma \in \Sigma$. The results are then sorted by completion vector and returned to the PEG block analysis (Algorithm 3). Inside the individual $\sigma$ tuples we use established abstract domains, namely abstract finite state machines for pipelines [19], cache block age maps for caches [19] and TDMA offset sets for TDMA buses [5]. For FAIR and PRIO arbitration no suitable abstractions were found in the per-core analysis. Since we explicitly track parallel interleavings in the PEG, we can analyze these protocols for the first time by providing abstract arbitration functions as shown in the following.

**Arbitration functions.** A simplified version of the bus state is illustrated in Figure 4, where a PEG block $\Psi$ is shown. The state $\Sigma_{\text{run}}$ for this block (see Algorithm 3) holds two sub-states, of which $\sigma_2$ is presented in more detail. In this sub-state the two cores in this example are currently performing a multiplication and an instruction fetch. Bus B1 is a TDMA bus; from its state we know that we are currently either at cycle 0 or 4 in the fixed-length, cyclic TDMA schedule. The state for FAIR-arbitrated buses like B2 holds an overapproximation of the cores which may have last accessed the bus. In the case of B2 this reveals that the last access has definitely been carried out by core 2.



Fig. 4: An example PEG block  $\Psi$  with attached APSS  $\Sigma_{\rm run}$ .

With these state definitions we can easily define the abstract arbitration functions which determine possible arbitration winners:

- **TDMA**: All cores whose *grant window* has a non-empty intersection with the current TDMA offsets may be granted. If we assume a schedule of length 10 cycles, in which cycles [0-4] are assigned to core 1 (grant window of core 1) and cycles [5-9] are assigned to core 2 (grant window of core 2), then in the state from Figure 4 a request to B1 would only be granted for core 1.
- **FAIR**: All cores which are the next in the core list for at least one previously accessing core  $c_p$  may be granted. In Figure 4 if both cores request access to B2, only the request from core 1 will be granted.
- **PRIO**: All requests with the highest priority may be granted. Thus for PRIO we do not need to maintain any kind of state, since the arbitration can be done solely based on the fixed priorities.

Different arbitration outcomes are then distributed to different result tuples $\sigma$. Since the PEG already carries the burden of constructing all possible interleaving scenarios, we can formulate the arbitration analysis in a rather simple manner. By construction, this has not been possible for the standard per-core WCET analysis approach.
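The three abstract arbitration functions can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation used in the framework: cores are numbered from 1, FAIR is read as round-robin over a fixed core order, a larger number means a higher priority, and all function and parameter names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tdma_may_grant(offsets: Set[int], grant_windows: Dict[int, Tuple[int, int]],
                   requesters: Set[int]) -> Set[int]:
    """Cores whose grant window contains at least one possible TDMA offset."""
    return {core for core in requesters
            if any(grant_windows[core][0] <= o <= grant_windows[core][1] for o in offsets)}


def fair_may_grant(last_accessors: Set[int], core_order: List[int],
                   requesters: Set[int]) -> Set[int]:
    """For each possible last accessor, the next requesting core in cyclic order may win."""
    winners: Set[int] = set()
    n = len(core_order)
    for previous in last_accessors:
        start = core_order.index(previous)
        for step in range(1, n + 1):
            candidate = core_order[(start + step) % n]
            if candidate in requesters:
                winners.add(candidate)
                break
    return winners


def prio_may_grant(priorities: Dict[int, int], requesters: Set[int]) -> Set[int]:
    """All requesting cores that share the highest fixed priority; no bus state is needed."""
    if not requesters:
        return set()
    top = max(priorities[core] for core in requesters)
    return {core for core in requesters if priorities[core] == top}


# State from the running example: TDMA offsets {0, 4}, windows [0, 4] and [5, 9].
print(tdma_may_grant({0, 4}, {1: (0, 4), 2: (5, 9)}, {1, 2}))  # {1}
# FAIR bus whose last access was definitely made by core 2; both cores request.
print(fair_may_grant({2}, [1, 2], {1, 2}))                     # {1}
```

If the set of possible last accessors contains more than one core, `fair_may_grant` can return several winners, which corresponds to the split into different result tuples $\sigma$.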

#### 4.5 Correctness

Formally complete proofs cannot be given here due to space constraints, but we try to provide some intuition on why the analysis is correct. First of all, through the monotonicity of  $\xi_{\Sigma}$ , we can prove Lemma 1:

**Lemma 1.** During the runtime of function ParallelismAnalysis in Algorithm 1, every block $v \in V_P$ which has once been analyzed with AnalyzeBlock will never be removed from the graph again. For the $i$-th and $j$-th invocation of AnalyzeBlock$(v,\cdot,\cdot,\cdot)$ with $i < j$, we have

- $\omega^{i}(v) \geq \omega^{j}(v)$,
- $\bigsqcup_{e=(u,v)\in E_{P}^{i}} \lambda^{i}(e) \sqsubseteq \bigsqcup_{e=(u,v)\in E_{P}^{j}} \lambda^{j}(e)$, and
- $\forall k \in \{1,\ldots,n\} : \left(\delta^{i}(v)\right)^{(k)} \subseteq \left(\delta^{j}(v)\right)^{(k)}$.

With rising analysis iteration count, for each  $v \in V_P$  the block runtime will only shrink, the incoming APSS will only get more imprecise and the execution time intervals for each task execution position will only become wider.


For any possible task set execution, which we model as a sequence $S$ of SEPs, we can prove with Lemma 1 that the APSSs attached to the converged PEG are safe over-approximations of the concrete system states with which $S$ is traversed. This yields Theorem 1.

**Theorem 1.** The PEG provides safe over-approximations of the parallel system state with which each system execution position can be entered. Therefore the basic block durations derived by BBRUNTIME are safe over-approximations of the block runtimes in any parallel execution context.

#### 5 Analysis Extensions

If the underlying architecture is guaranteed to be free of timing anomalies [4], then in each block analysis (Algorithm 3, line 5) we can skip all instruction completion vectors  $c \in \kappa$  which are dominated by another vector, i.e.  $c_1 <_c c_2 \Leftrightarrow \forall i \in \{1,\ldots,n\}: c_2^{(i)} \Rightarrow c_1^{(i)}$ . The dominated vectors correspond to an earlier termination of an instruction and since in a timing-anomaly-free architecture every local worst-case action is always also the global worst-case action, we can assume that they are never part of the worst-case path. This can drastically reduce the state space and the PEG size.
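The pruning of dominated completion vectors can be sketched as follows. The code follows the relation $<_c$ as defined above, with the added assumption that a vector is never considered dominated by itself; the function names are introduced here for illustration.

```python
from typing import Iterable, List, Tuple

Vector = Tuple[int, ...]  # one 0/1 entry per core


def is_dominated_by(c1: Vector, c2: Vector) -> bool:
    """c1 <_c c2: every instruction that completes in c2 also completes in c1."""
    return all((not b) or a for a, b in zip(c1, c2))


def prune_dominated(vectors: Iterable[Vector]) -> List[Vector]:
    """Keep only the vectors that are not dominated by a different vector."""
    unique = list(dict.fromkeys(vectors))  # drop duplicates, keep input order
    return [c for c in unique
            if not any(c != other and is_dominated_by(c, other) for other in unique)]


# (1, 1) and (1, 0) complete earlier than (0, 0) and are therefore skipped.
print(prune_dominated([(1, 1), (1, 0), (0, 0)]))  # [(0, 0)]
# Incomparable vectors are both kept.
print(prune_dominated([(1, 0), (0, 1)]))          # [(1, 0), (0, 1)]
```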

In task sets with explicit synchronization points we have to consider these points in the path analysis as shown in [13]. In addition, we can also use them to prune the PEG as we have done in Section 4, since a task which is waiting for synchronization cannot progress until a partner has arrived to complete the rendezvous. This idea has already been used in [17] and, as there, it can be used on top of the timing information to further prune the PEG.

The extension of our framework to task sets with non-uniform periods is also possible. With non-uniform task periods we can still compute the global hyperperiod, i.e. the least common multiple of all task periods, and build a PEG for this hyperperiod. The problem that we face here is that with the current framework we cannot determine the absolute point in time at which we are when a task instance has finished executing, since then we can no longer compute the block time window on the basis of the local CFG and a task release time. This means we would have to assume in every successive cycle step that the next task instance might start or not, which would drastically increase the PEG size. However, this can be limited if we take into account synchronization structures or if timing-based approximations of the task instance spawn behavior can be found.
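Computing the hyperperiod itself is straightforward, as the following Python sketch shows. The task periods are made-up example values and the function name is an assumption.

```python
from functools import reduce
from math import gcd
from typing import Iterable


def hyperperiod(periods: Iterable[int]) -> int:
    """Least common multiple of all task periods (in cycles)."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)


# Hypothetical task set with periods of 10, 15 and 20 cycles.
print(hyperperiod([10, 15, 20]))  # 60
```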

#### 6 Evaluation

We implemented the analysis algorithms inside the WCC compiler framework [2], which was also used in [6]. We ran our evaluations on single-core tasks from the MRTC and DSPStone real-time benchmark suites. Out of these single-core tasks we formed packages of 2 to 4 tasks, all of which were assigned a release time of 0. We analyzed the system topology from Section 3 with 2 or 4 cores, depending on the task set. In the evaluation, we focus on analyzing state-permeable bus arbitration methods (PRIO and FAIR) which were not analyzable (PRIO) or not precisely analyzable (FAIR) without the presented parallelism analysis. The bus which is arbitrated by these methods is the shared memory bus introduced in Section 3.

Fig. 5: Efficiency of the block exclusion criterion on example benchmarks for varying number of cores and arbitration policies. The solid line is a linear regression of the data points.

In Figure 5 the results of our block exclusion criterion (BEC) from Algorithm 1, line 11 are shown. Each mark represents one analysis run on one task set. The circle marks indicate runs where the shared bus was configured for FAIR arbitration, the triangles correspond to fixed priority-based arbitration and the squares correspond to TDMA. Non-filled (filled) marks are analysis runs with the 2-core (4-core) system. The x-axis value is the number of PEG blocks that are generated during the analysis, when the BEC is used compared to the case when it is not used (100%). On the y-axis the required analysis time is shown, also compared to the case that the BEC was not used (100%). From the data points and the solid regression curve it is visible that the analysis time scales roughly linearly with the number of PEG blocks, which was expected, since the runtime of the main loop in Algorithm 1 depends on the total number of blocks. The variations stem from the convergence behavior of the individual benchmarks, i.e. how often loops have to be visited until the attached APSSs converge. More importantly, we can see from Figure 5 that the BEC is effective, as on average it rules out 35.6% of all blocks and leads to a reduction in analysis time of 49.7%.

The average resulting analysis time is presented in Table 1. The column "Analysis type" shows which type of WCET analysis was tested. We compare the classical multi-core WCET analysis [1] (abbr. "C") to our new parallelism analysis with (abbr. "P|B") and without (abbr. "P|O") usage of the block exclusion criterion. As already seen in Figure 5, "P|B" is always superior to "P|O" but both are slower than the classical approach "C" by a factor of 130 on average. This is a result of the more complex system state and of the thousands of parallel interleavings that have to be explored, whereas the classical analysis only operates on the CFG of a single task and the state of a single core. The last element of the "Analysis type" column shows whether the architecture was assumed to have timing anomalies (abbr. "T") or not (abbr. "N"). As presented in Section 5, the absence of timing anomalies can be used to drastically reduce the PEG size, which is visible in Table 1 in column "Ø PEBs", which holds the average number of PEG blocks for this analysis scenario. The configurations where absence of timing anomalies was assumed ("N") produce far lower PEG sizes and analysis times than their counterparts ("T").

Table 1: Average analysis time and PEG sizes.

| Schedule | Analysis type | Ø Duration (seconds) | Ø PEBs (number) |
|----------|---------------|----------------------|-----------------|
| FAIR     | C\|N          | 4                    | 0               |
| FAIR     | P\|O\|N       | 1,695                | 2,177           |
| FAIR     | P\|B\|N       | 583                  | 1,223           |
| FAIR     | C\|T          | 6                    | 0               |
| FAIR     | P\|O\|T       | 2,065                | 9,595           |
| FAIR     | P\|B\|T       | 801                  | 7,828           |
| PRIO     | P\|O\|N       | 1,438                | 1,800           |
| PRIO     | P\|B\|N       | 514                  | 1,175           |
| PRIO     | P\|O\|T       | 1,971                | 6,971           |
| PRIO     | P\|B\|T       | 808                  | 5,118           |

Fig. 6: Relative WCET results.

The benefits we get from the parallelism analysis ("P"-configurations) at the price of increased analysis times are that we can analyze the PRIO arbitration for the first time and that we can significantly reduce the arbitration delay estimations for FAIR arbitration.

Details on both aspects are presented in Figure 6, where the average of the quotient of WCET and measured runtime (MRT) is shown for different analysis configurations from Table 1. Recall that we can only determine a safe upper bound $WCET_{\rm est}$ on the real $WCET_{\rm real}$ in all of our analyses. Therefore the above quotient is an upper bound on the WCET overestimation, since by $WCET_{\rm est} \geq WCET_{\rm real} \geq MRT$ we have that $WCET_{\rm est} \div MRT \geq WCET_{\rm est} \div WCET_{\rm real}$. Each MRT was determined by simulating the task set execution for the given system configuration on the cycle-true virtual prototyping IDE COMET [16].
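As a purely illustrative instance with made-up numbers (not measurements from the evaluation), assume $MRT = 1000$ cycles, $WCET_{\rm real} = 1200$ cycles and $WCET_{\rm est} = 1500$ cycles. Then

$$\frac{WCET_{\rm est}}{MRT} = \frac{1500}{1000} = 1.5 \;\geq\; \frac{WCET_{\rm est}}{WCET_{\rm real}} = \frac{1500}{1200} = 1.25,$$

so the plotted quotient of 1.5 would overstate the true overestimation factor of 1.25, but it could never understate it.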

First of all, we can see in Figure 6 that the PEG-based WCET analyses (all configurations containing "P") for a system with PRIO arbitration yield results that are comparable to those for FAIR arbitration. The remaining overestimation is mostly due to other unavoidable sources of imprecision, like loose loop bounds and pipeline and value analysis overestimation. Also, we see that the restriction to timing-anomaly-free architectures (all configurations with "N") enables not only reduced analysis times (cf. Table 1) but also tighter WCET estimations. The usage of the block exclusion criterion (configurations with "B") also leads to slightly decreased overestimation.

Finally, the "C"-configurations show the overestimation for the classical WCET analysis framework, which can only assume the maximum possible delay for every access in state-permeable arbitration policies. Our new parallelism-based analysis is able to clearly outperform this approach, being 32% more accurate on average, but of course at the expense of increased analysis times.

#### 7 Conclusions

We have presented a new type of WCET analysis which can precisely bound the runtime of safety-critical tasks running on complex multi-core systems. This is achieved by exploring all possible execution interleavings of a parallel periodic task set. A parallel execution graph (PEG) is employed to represent the interleavings in compressed form, a concept that was already used in [17]. What is new in the application of the PEG to WCET analysis is, firstly, that we must work at the granularity of single machine cycles, which drastically increases the graph size. Secondly, and more importantly, we can use the timing information that we generate to prune parts of the graph which we prove to be unreachable in any real execution, through the use of a new timing-based block exclusion criterion.

We tested this analysis on a prototype implementation. For a shared bus scheduled under a fair round-robin policy we observed WCET reductions of 32% on average, compared to previous analysis approaches. For fixed priority-based scheduling no previous individual-access analysis methods exist. Here we could derive WCET values with a tightly bounded maximum overestimation of only 30–50% on average, which is comparable to the single-core WCET overestimation ratio of our analyzer. In the future we plan to explore combinations of the block exclusion criterion and synchronization-aware analysis to further reduce the PEG size and lift the restriction that all tasks must have a uniform period. We also seek to evaluate the performance of the PEG-based analysis for systems with shared caches, for which up to now only pessimistic analyses existed.

#### 8 Acknowledgments

This work was partially supported by EU COST Action IC1202: Timing Analysis On Code-Level (TACLe). The authors would also like to thank Synopsys for the provision of the virtual prototyping IDE CoMET.

#### References

- 1. Chattopadhyay, S., Kee, C., Roychoudhury, A., Kelter, T., Marwedel, P., Falk, H.: A Unified WCET Analysis Framework for Multi-Core Platforms. In: Real-Time and Embedded Technology and Applications Symposium (2012)
- 2. Falk, H., Lokuciejewski, P.: A Compiler Framework for the Reduction of Worst-Case Execution Times. Journal on Real-Time Systems 46(2), 251–300 (October 2010)

- 3. Gustavsson, A.: Worst-Case Execution Time Analysis of Parallel Systems. In: Nyström, D., Nolte, T. (eds.) Real Time in Sweden 2011. pp. 104–107. Dag Nyström and Thomas Nolte (June 2011)
- 4. Hahn, S., Reineke, J., Wilhelm, R.: Towards Compositionality in Execution Time Analysis – Definition and Challenges. In: International Workshop on Compositional Theory and Technology for Real-Time Embedded Systems (December 2013)
- 5. Kelter, T., Falk, H., Marwedel, P., Chattopadhyay, S., Roychoudhury, A.: Bus-Aware Multicore WCET Analysis through TDMA Offset Bounds. In: Euromicro Conference on Real-Time Systems. pp. 3–12. Porto, Portugal (July 2011)
- 6. Kelter, T., Harde, T., Marwedel, P., Falk, H.: Evaluation of Resource Arbitration Methods for Multi-Core Real-Time Systems. In: International Workshop on Worst-Case Execution Time Analysis (July 2013)
- 7. Kildall, G.A.: A Unified Approach to Global Program Optimization. In: Symposium on Principles of Programming Languages. pp. 194–206. ACM, New York, USA (1973)
- 8. Kleinsorge, J.C., Falk, H., Marwedel, P.: Simple Analysis of Partial Worst-Case Execution Paths on General Control Flow Graphs. In: Proceedings of the International Conference on Embedded Software. pp. 1–10 (September 2013)
- 9. Knoop, J., Steffen, B., Vollmer, J.: Parallelism for Free: Efficient and Optimal Bitvector Analyses for Parallel Programs. ACM Trans. Program. Lang. Syst. 18(3), 268–299 (May 1996)
- 10. Li, Y., Suhendra, V., Liang, Y., Mitra, T., Roychoudhury, A.: Timing Analysis of Concurrent Programs Running on Shared Cache Multi-Cores. In: IEEE Real-Time Systems Symposium. pp. 57–67. IEEE Computer Society, Washington, USA (2009)
- 11. Li, Y.T.S., Malik, S.: Performance Analysis of Embedded Software Using Implicit Path Enumeration. In: Proceedings of the Annual ACM/IEEE Design Automation Conference. pp. 456–461. ACM, New York, USA (1995)
- 12. Mittermayr, R., Blieberger, J.: Timing Analysis of Concurrent Programs. In: International Workshop on Worst-Case Execution Time Analysis. pp. 59–68 (2012)
- 13. Potop-Butucaru, D., Puaut, I.: Integrated Worst-Case Execution Time Estimation of Multicore Applications. In: Maiza, C. (ed.) International Workshop on Worst-Case Execution Time Analysis. pp. 21–31. Dagstuhl, Germany (2013)
- 14. Schliecker, S., Negrean, M., Nicolescu, G., Paulin, P., Ernst, R.: Reliable Performance Analysis of a Multicore Multithreaded System-on-chip. In: International Conference on Hardware/Software Codesign and System Synthesis. pp. 161–166. ACM, New York, USA (2008)
- 15. Schranzhofer, A., Pellizzoni, R., Chen, J.J., Thiele, L., Caccamo, M.: Worst-Case Response Time Analysis of Resource Access Models in Multi-Core Systems. In: Design Automation Conference (2010)
- 16. Synopsys Inc.: CoMET System Engineering IDE. http://www.synopsys.com
- 17. Taylor, R.N.: A General-purpose Algorithm for Analyzing Concurrent Programs. Communications of the ACM 26(5), 361–376 (May 1983)
- 18. Valmari, A.: Eliminating redundant interleavings during concurrent program verification. In: Odijk, E., Rem, M., Syre, J.C. (eds.) Parallel Architectures and Languages Europe, Lecture Notes in Computer Science, vol. 366, pp. 89–103. Springer Berlin Heidelberg (1989)
- 19. Wilhelm, R., Engblom, J., Ermedahl, A., Holsti, N., Thesing, S., Whalley, D., Bernat, G., Ferdinand, C., Heckmann, R., Mitra, T., Mueller, F., Puaut, I., Puschner, P., Staschulat, J., Stenström, P.: The Worst-Case Execution Time Problem - Overview of Methods and Survey of Tools. ACM Trans. Embed. Comput. Syst. 7(3) (2008)

# Formal Verification of Distributed Task Migration for Thermal Management in On-chip Multi-core Systems using nuXmv

Syed Ali Asadullah Bukhari<sup>1</sup>, Faiq Khalid Lodhi<sup>1</sup>, Osman Hasan<sup>1</sup>, Muhammad Shafique<sup>2</sup>, and Jörg Henkel<sup>2</sup>

<sup>1</sup>School of Electrical Engineering and Computer Science (SEECS)
National University of Sciences and Technology (NUST)
Islamabad, Pakistan
{ali.asadullah,faiq.khalid,osman.hasan}@seecs.nust.edu.pk

<sup>2</sup> Chair for Embedded Systems (CES)
Karlsruhe Institute of Technology (KIT)
Karlsruhe, Germany
{muhammad.shafique,henkel}@kit.edu

**Abstract.** With the growing interest in using distributed task migration algorithms for dynamic thermal management (DTM) in multi-core chips comes the challenge of their rigorous verification. Traditional analysis techniques, like simulation and emulation, cannot cope with the design complexity and distributed nature of such algorithms and thus compromise on the rigor and accuracy of the analysis results. Formal methods, especially model checking, can play a vital role in alleviating these issues. Due to the presence of continuous elements, such as temperatures, and the large number of cores running the distributed algorithms in this analysis, we propose to use the nuXmv model checker to analyze distributed task migration algorithms for DTM. The main motivations behind this choice include the ability to handle the real numbers and the scalable SMT-based bounded model checking capabilities in nuXmv that perfectly fit the stability and deadlock analysis requirements of the distributed DTM algorithms. The paper presents the detailed analysis of a state-of-the-art task migration algorithm of distributed DTM for many-core systems. The functional and timing verification is done on a larger grid size of  $9 \times 9$  cores, which is thermally managed by the selected DTM approach. The results indicate the usefulness of the proposed approach, as we have been able to catch a couple of discrepancies in the original model and gain many new insights about the behavior of the algorithm.

**Keywords:** Model Checking, Thermal Management, Task Migration, Multi-core Architectures

#### 1 Introduction

The ever-increasing need for computing power and technological advances have led to many cores on a chip [33,39]. This accelerated increase, accompanied by higher power densities, has opened up the challenge of coping with the elevated chip temperatures, which pose serious threats to the reliability of the computing systems. Various Thermal Management (TM) techniques [1,8,21] have recently been proposed to overcome these issues. In particular, Dynamic Thermal Management (DTM) [40,27] for multi-core systems via the task migration mechanism has been identified by the ITRS roadmap of 2013 [18] as a very promising solution to the heating problems in many-core systems with high core integration.

The DTM techniques can be broadly classified into two categories: central and distributed [19]. Central DTM (cDTM) is done by a central controller, which is responsible for the overall thermal management of the chip and thus has visibility of all the global parameters of the system, such as the core temperatures [7]. This approach has an inherent scalability issue, as cDTM often encounters performance degradation while dealing with many-core systems [36,12,34]. On the other hand, Distributed DTM (dDTM) manages the heating issues of the chip by employing several thermal management agents as opposed to a single controller [19,34]. An agent in a distributed system perceives the environment through communication with other agents and can take decisions on its own to a certain extent [37]. The obvious gain in this method is that global knowledge is no longer needed, which resolves the above-mentioned scalability issue of cDTM. Since the dDTM agents are not aware of the overall thermal scenario of the complete chip, it is customary to approximate the required data by information exchange among the neighboring agents only. Based on this information, some dDTM techniques develop an overall thermal model of the system for predicting the core temperatures [28,32,41]. If this estimate is above a certain threshold, then the task migration is activated. Other dDTM techniques, like [11,13,23], make the task migration decisions based on certain algorithms that manipulate the temperature values obtained from the neighboring cores only. For example, a recent task migration technique [23] estimates the average temperature of the complete chip by taking the inputs of the neighboring cores using the distributed signal average tracking algorithm [5,4].

A thorough analysis of these thermal management techniques is of vital importance, as an inefficient task migration decision may lead to the creation of hot spots (regions with excessive temperatures within the chip) and thus endanger the reliability of the chip. Traditionally, the dDTM techniques are analyzed using either simulations or by running on real hardware systems. Both of these methods compromise on the accuracy of the analysis results by analyzing only a subset of the possible scenarios due to their large design space, which is in turn caused by the distributed nature of DTM techniques and the presence of hundreds of cores in the present-age systems where the distributed DTM techniques are employed. Moreover, choosing the sample set is another major issue while analyzing the dDTM techniques due to the enormous number of possible options; for example, the possible temperature values for all the cores are actually infinite due to the continuous nature of temperature. This non-exhaustiveness and incompleteness of the analysis may lead to unwanted scenarios, like the delayed release of the Montecito chip using the Foxton DTM algorithm [10].

Formal verification [9] can overcome the above-mentioned inaccuracy limitations of simulation-based verification due to its inherent soundness and completeness. Given the reactive nature of DTM techniques, model checking has been used for their analysis [29,35,25]. Moreover, the SPIN model checker [16] has been recently used in conjunction with Lamport timestamps [22] to analyze the functional and timing properties of the Thermal-aware Agent-based Power Economy (TAPE) [17], which is a state-of-the-art agent-based dDTM scheme. However, this analysis is only done for a 9-core, i.e., $3 \times 3$, system, and the continuous values of algorithm parameters and the temperature have been abstracted by discrete values in order to cope with the state-space explosion problem of model checking [6]. These abstractions limit the usefulness of applying model checking for analyzing dDTM techniques, as the exhaustiveness of the analysis is compromised to a certain degree.

The main focus of the current paper is to alleviate the above-mentioned issues encountered in [17]. For this purpose, we propose to use the recently released nuXmv model checker [3] to analyze dDTM systems. The distinguishing features of the nuXmv model checker include the ability to handle real numbers and implicit handling of state counters. Thus, the continuous values in dDTM approaches can be modeled more appropriately and the timing properties of the DTM approaches can be analyzed without using the Lamport timestamps explicitly. Moreover, the SAT and SMT based engines of the nuXmv model checker facilitate analyzing larger models and we can thus analyze large grids of multi-core systems.

In order to illustrate the usefulness of the proposed approach, this paper presents the formal analysis of a recently proposed task migration algorithm for hot spot reduction in many-core systems [23]. The algorithm executes the task migration based on a simple criterion of comparing the temperature of the core(s) with the neighboring cores and the average temperature of the chip. The average temperature of the chip in turn is computed using the recently proposed technique of distributed average estimation for time-varying signals [4]. Besides the generic and simplistic nature of this algorithm (as it just manipulates the temperature values from its neighbors to make decisions for task migration), another main motivation for choosing this as our case study is its close relationship with other advanced task migration algorithms, such as [24]. Moreover, model checking is not suitable for dDTM techniques like [28,32,41], due to their predictive nature.

#### 2 Preliminaries

In this section, we give a brief introduction to the nuXmv model checker and the task migration algorithm for many-core systems [23], which we have formally verified in this paper. The intent is to facilitate the understanding of the rest of the paper for both the dDTM technique design and the formal methods communities.

#### 2.1 nuXmv Model Checker

The nuXmv symbolic model checker [3,31] is a very recent formal verification tool that extends the NuSMV model checker [30], which in turn is a finite state transitions model checker. nuXmv extends the capabilities of the NuSMV by complementing NuSMV's verification techniques by SAT algorithms for finite state systems. For infinite state systems, it introduces new data types of *Integers* and *Reals* and also provides the support of Satisfiability Modulo Theories (SMT), using MathSAT [26], for verification.

The system that needs to be modeled is expressed in the nuXmv language, which supports the modular programming approach where the overall system is divided into several modules that interact with one another in the MAIN module. The properties to be verified can be specified in nuXmv using Linear Temporal Logic (LTL) and Computation Tree Logic (CTL). The LTL specifications are written in nuXmv with the help of logical operations, like AND (&), OR (|), Exclusive OR (xor), Exclusive NOR (xnor), implication (->) and equivalence (<->), and temporal operators, like Globally (G), Finally (F), next (X) and until (U). Similarly, the CTL specifications can be written by combining logical operations with quantified temporal operators, like exists globally (EG), exists next state (EX) and forall finally (AF). In case a property turns out to be false, a counterexample in the form of an execution trace of the FSM is provided.

#### 2.2 Task Migration Algorithm for Hot Spot Reduction

The main goal of any dynamic DTM technique is to maintain an acceptable average temperature across all the cores. This reduction in the temperature does not always guarantee a balanced distribution that is actually required for the reduction of thermal hot spots. The algorithm proposed in [23], which is under consideration in this paper, overcomes this limitation by performing distributed task migration with the primary goal of achieving thermal reliability and reduced temperature variance across the chip. The algorithm makes use of the recently proposed distributed average signal tracking algorithm [5,4], which shows that the states of all the distributed agents converge to the average value of the time-varying reference signals. The following equation is used to estimate the average:

$$\begin{aligned}
\dot{z}_i(t) &= \alpha \sum_{j \in N_i} sgn[x_j(t) - x_i(t)] \\
x_i(t) &= z_i(t) + r_i(t)
\end{aligned} \tag{1}$$

where sgn(x) is the signum function defined as:

$$sgn(x) = \begin{cases} -1 & \text{if } x < 0\\ 0 & \text{if } x = 0\\ 1 & \text{if } x > 0 \end{cases} \tag{2}$$

and  $z_i(t)$  is the estimated average signal,  $x_i(t)$  and  $N_i$  are the states of the distributed agent i and its neighborhood, respectively,  $r_i(t)$  is the reference signal with bounded derivatives in a finite time and  $\alpha$  is a constant value greater than 0. The task migration algorithm makes use of this fact to estimate the average temperature of a core, without the need of global knowledge of the temperature of every core. The task migration policy is then executed only on the cores having a temperature greater than the estimated average temperature  $T_{avg}$ . As a result, a considerable amount of data exchange is avoided among the cores and only necessary task migration is done for effectively reducing the temperature. If a core has a temperature greater than the estimated  $T_{avg}$ , then the following task migration criterion is used to check if the task can be migrated from the current core to some destination core among the neighbors:

- 1.  $T_{destination} < T_{current}$ , where  $T_{destination}$  and  $T_{current}$  are the temperatures of the destination and current cores, respectively.
- 2.  $P_{destination} < P_{current}$ , where  $P_{destination}$  is the task load of the destined core and  $P_{current}$  is the counterpart of the *current* core.
- 3.  $TNP_{destination} < TNP_{current}$ , where  $TNP_{destination}$  and  $TNP_{current}$  are the workloads of the destination and current cores, respectively.
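
The activation check and the three criteria above can be sketched as a single predicate. This is a minimal illustration and not the authors' nuXmv model: the names `CoreState` and `may_migrate` are hypothetical, the numeric values are invented, and treating a core whose temperature exactly equals $T_{avg}$ as inactive is an assumption (the source only covers the strictly greater and strictly smaller cases).

```python
from dataclasses import dataclass


@dataclass
class CoreState:
    temperature: float  # T, in degrees Celsius
    task_load: float    # P
    workload: float     # TNP


def may_migrate(current: CoreState, destination: CoreState, t_avg: float) -> bool:
    """Return True if a task may move from `current` to `destination`."""
    # The policy is only activated on cores hotter than the estimated average.
    # Assumption: equality counts as "not hotter".
    if current.temperature <= t_avg:
        return False
    # Criteria 1-3: the destination must be strictly smaller in T, P and TNP.
    return (
        destination.temperature < current.temperature
        and destination.task_load < current.task_load
        and destination.workload < current.workload
    )


# Hypothetical values, for illustration only.
hot = CoreState(temperature=56.0, task_load=0.9, workload=3.2)
cool = CoreState(temperature=44.0, task_load=0.4, workload=1.5)
print(may_migrate(hot, cool, t_avg=50.0))  # True
print(may_migrate(cool, hot, t_avg=50.0))  # False
```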

If the temperature T of the core is less than $T_{avg}$, then the task migration policy is not activated and the core retains its temperature; otherwise the above-mentioned conditions are checked to decide whether the task migration is done for a core or not. All 4 neighbors are passed through the criterion: tasks are exchanged if the conditions are met, and the next neighbor is then checked. By the end of the algorithm execution, the most appropriate core is found for task exchange. The pseudo-code for this algorithm is given in Algorithm 1 [23], and Fig. 1 presents a typical execution of the algorithm to illustrate the above-mentioned behavior. The node 0, in Fig. 1, represents the current core and the neighboring cores are denoted by 1, 2, 3 and 4. Each core is checked for the satisfiability of the task migration conditions and the right core (shown black) is chosen. The results from the MATLAB implementation of this DTM technique on a $6 \times 6$ grid show a 30 percent hot spot reduction and smaller temperature variance [23].

#### 3 Modeling the DTM algorithm in nuXmv

In this section, we explain the FSM for Algorithm 1 and its modeling in the language of the nuXmv model checker.



Fig. 1: A typical execution of the selected algorithm [23].

#### 3.1 Our Refinements to the original Task Migration Algorithm

While modeling Algorithm 1 in the nuXmv language, we had to handle some of the scenarios that were not mentioned in the paper [23] where the original algorithm was published. Before going into the implementation details of the model, we find it appropriate to point out the discrepancies in the existing algorithm and our proposed solutions.

**Algorithm 1** Distributed thermal management algorithm for avoiding hot spots [23]

Require: Task loads, many-core processor configuration

Ensure: Optimized temperature distribution

Start simulation at room temperature

for each execution cycle do

- 1. Simulate power traces under different task loads
- 2. Obtain temperature responses of the many-core microprocessor, and estimate average temperature using distributed state tracking algorithm

if migration criteria are met then

Perform distributed task migration using the proposed scheme in Fig. 1 core by core.

end if

end for

1. Since the migration algorithm executes concurrently on all the nodes, it may happen that two different nodes A and B want to migrate their tasks to the same node C at the same time. The algorithm proposed in [23] does not resolve this conflict. In our model, we have resolved this conflict by giving priority to the node that has a lower value of estimated $T_{avg}$. This means that all the nodes not only need to know the temperatures of their neighboring nodes, but also of the nodes that could possibly migrate tasks with their neighbors. This revision caters for the conflict resolution but increases the complexity of the algorithm.

2. Another conflict of a similar nature arises when a node A desires to retain its value, because its current temperature is lower than the estimated $T_{avg}$, while one of its neighboring nodes wants to exchange tasks, based on the execution of its task migration policy. This situation is also resolved by priority assignment based on $T_{avg}$ in our refinement.
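
The priority rule used in both refinements can be sketched as follows. The function name `resolve_conflict` and the sample estimates are hypothetical, and the tie-breaking behavior (first entry wins) is an assumption, since the text does not state how equal estimates are handled.

```python
from typing import Dict, Hashable


def resolve_conflict(t_avg_by_node: Dict[Hashable, float]) -> Hashable:
    """Pick the competing node with the lowest estimated T_avg."""
    if not t_avg_by_node:
        raise ValueError("no competing nodes")
    # Assumption: ties are broken by insertion order; the source gives no tie rule.
    return min(t_avg_by_node, key=t_avg_by_node.get)


# Nodes A and B both target node C; B holds the lower estimate and gets priority.
print(resolve_conflict({"A": 49.2, "B": 48.7}))  # B
```

Because each node must compare its own estimate with those of its competitors, it needs the $T_{avg}$ values of nodes beyond its immediate neighbors, which is the added complexity mentioned in item 1.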

#### 3.2 FSM for the Revised Algorithm

The FSM, depicted in Fig. 2, details the working of the refined algorithm for core 0. The temperature, task load, workloads including the neighboring cores and estimated average temperature by each core are represented by Ts, Ps, TNPs and $T_{avg}$, respectively. The temperature of the core is compared with the estimated average temperature, i.e., $T_{avg}$. If the core temperature is greater, then the task migration policy is activated and the migration criterion is executed on the neighboring cores one by one to select the core for the migration. Once the destination core is selected, the improved condition for the task migration is checked to finalize the core selection. The respective conditions are shown in the FSM.

#### 3.3 Modeling the Average Estimation Algorithm

In order to model Eq. 1, we first have to take the integral of $\dot{z}_i(t)$. The integral of the signum function is given as [38]:

$$\int sgn(x) \, dx = |x| \tag{3}$$

and

$$\int \dot{z}_i(t) dt = \int \alpha \sum_{j \in N_i} sgn[x_j(t) - x_i(t)] dt$$



Fig. 2: Finite state machine showing the working of the algorithm for core 0.

Thus, the equation for  $z_i(t)$  becomes

$$z_i(t) = \alpha \sum_{j \in N_i} |x_j(t) - x_i(t)|$$

and we have

$$x_i(t) = \alpha \sum_{j \in N_i} |x_j(t) - x_i(t)| + r_i(t) \tag{4}$$

In our modeling,  $x_i(t)$  becomes equivalent to  $T_{avg}$  that a core i estimates, and  $r_i$  becomes the core i's temperature.
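
For intuition about the behavior of Eq. 1 itself, the sketch below integrates it numerically with a forward-Euler step on a small grid. It is not the nuXmv encoding of Eq. 4. The $3 \times 3$ grid, the constant core temperatures, the values of `alpha` and `dt`, the initial condition $z_i(0) = 0$ and all function names are assumptions made only for this illustration.

```python
from typing import Dict, List


def grid_neighbors(n: int) -> Dict[int, List[int]]:
    """4-neighborhood (north, south, west, east) on an n x n grid, row-major ids."""
    nbrs: Dict[int, List[int]] = {}
    for r in range(n):
        for c in range(n):
            cand = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nbrs[r * n + c] = [rr * n + cc for rr, cc in cand
                               if 0 <= rr < n and 0 <= cc < n]
    return nbrs


def sgn(v: float) -> int:
    """Signum function of Eq. 2."""
    return (v > 0) - (v < 0)


def track_average(r: List[float], nbrs: Dict[int, List[int]],
                  alpha: float, dt: float, steps: int) -> List[float]:
    """Forward-Euler integration of Eq. 1 with constant r_i and z_i(0) = 0."""
    z = [0.0] * len(r)
    for _ in range(steps):
        x = [z[i] + r[i] for i in range(len(r))]  # x_i = z_i + r_i
        z = [z[i] + dt * alpha * sum(sgn(x[j] - x[i]) for j in nbrs[i])
             for i in range(len(r))]
    return [z[i] + r[i] for i in range(len(r))]


# Hypothetical constant core temperatures (degrees Celsius) within the 41-56 range.
temps = [41.0, 56.0, 47.0, 50.0, 44.0, 53.0, 49.0, 55.0, 42.0]
est = track_average(temps, grid_neighbors(3), alpha=1.0, dt=0.005, steps=30000)
true_avg = sum(temps) / len(temps)
print(round(true_avg, 2), [round(e, 2) for e in est])
```

With a symmetric neighbor relation the signum terms cancel pairwise, so the mean of the estimates stays equal to the mean of the core temperatures at every step. The individual estimates approach that mean and then chatter around it within a band whose width depends on `alpha * dt`.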

#### 3.4 Model for the $9 \times 9$ grid

The algorithm under verification allows its nodes to exchange information with a maximum of four neighbors, i.e., north, south, east and west. Information exchange with the diagonal neighboring nodes is not allowed. In order to construct the model of any arbitrary $n \times n$ grid, which supports the originally proposed algorithm of [23], we need three distinct types of nodes, i.e., nodes that can communicate with 2, 3 and 4 neighbors, depending on their location in the grid. However, our refinement of the original algorithm requires 6 different types of nodes, as a node with 3 neighbors may need information of 4 or 5 second-level neighboring nodes depending on its location in the grid. We have defined the second-level neighbors of a core x as the cores that can communicate with the neighbors of that core x. Similarly, a four-neighbor node may require information of 6, 7 or 8 second-level neighboring nodes depending on its location in the grid. Therefore, we have modeled the $9 \times 9$ grid using six different modules: $n2_3$, $n3_4$, $n3_5$, $n4_6$, $n4_7$ and $n4_8$, as shown in Fig. 3. The name $nx_y$ of a module indicates that the cores modeled by this module have x immediate neighbors and y other second-level neighbors that can exchange tasks with this core or its neighbors. The MAIN module calls the instances of these six distinct modules to complete the overall model of a $9 \times 9$ grid. This model is then used for verifying both functional and timing properties of the given algorithm in the next section. The code listing for these modules and more implementation details are available at [2].

| 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  |
|----|----|----|----|----|----|----|----|----|
| 9  | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 |
| 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 |
| 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 |
| 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 |
| 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 |
| 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 |
| 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 |
| 72 | 73 | 74 | 75 | 76 | 77 | 78 | 79 | 80 |

Module types (figure legend): n2_3, n3_4, n3_5, n4_6, n4_7, n4_8

Fig. 3: Categorization of cores based on the amount of information exchange.
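
The categorization can be reproduced from the grid geometry alone. The sketch below assumes row-major core numbering as in Fig. 3 and reads "second-level neighbors" as the cores at Manhattan distance exactly 2, i.e., neighbors of neighbors other than the core itself. The function name `module_type` is hypothetical.

```python
from collections import Counter


def module_type(core: int, n: int = 9) -> str:
    """Return the module name nx_y for a core id in a row-major n x n grid."""
    r, c = divmod(core, n)

    def count_at_distance(d: int) -> int:
        # Cores at Manhattan distance exactly d that lie inside the grid.
        return sum(
            1
            for rr in range(n)
            for cc in range(n)
            if abs(rr - r) + abs(cc - c) == d
        )

    x = count_at_distance(1)  # immediate neighbors (north, south, east, west)
    y = count_at_distance(2)  # neighbors of neighbors, excluding the core itself
    return f"n{x}_{y}"


print([module_type(k) for k in (0, 1, 2, 10, 11, 20)])
# ['n2_3', 'n3_4', 'n3_5', 'n4_6', 'n4_7', 'n4_8']
print(sorted(Counter(module_type(k) for k in range(81)).items()))
```

Under this reading, exactly six distinct types occur in the $9 \times 9$ grid, and cores 0, 1, 2, 10, 11 and 20 are one representative of each.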

#### 4 Verification of the DTM Algorithm

#### 4.1 Experimental Setup

We used version 1.0 of the nuXmv model checker on the Windows 8.1 Professional OS running on an i3 processor, 2.93 GHz (4 CPUs), with 4 GB memory for our experiments. In order to assume realistic values of temperatures for our experimentation, we used the temperature range between 41°C and 56°C for a single core, as has been reported in [14]. The verification is done for a $9 \times 9$ grid of nodes (cores) with all of them running processes as described in the previous section. The complete model contains 81 processes.


#### 4.2 Functional Verification

We have done the functional verification of the DTM algorithm by verifying the following properties using nuXmv's bounded model checking (BMC) support for real numbers:

**Deadlock.** A deadlock state in a system leads to an undesired cyclic behavior. In case of DTM, deadlock happens if the temperature of some core x is greater than the estimated temperature and it is unable to exchange its load with some other core. This behavior could result in the creation of thermal hot spots across the chip. In order to make sure that the DTM algorithm is free of deadlocks, the following property needs to be satisfied:

$$G(core_x.T_0 > core_x.T_{avg} \rightarrow F(core_x.T_0 \le core_k.T_0))$$

This property checks that any core having temperature greater than the average temperature will eventually get a reduction in temperature.
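
The shape of this property, $G(p \rightarrow F\,q)$, can be illustrated on a finite, explicitly listed trace. This is only a didactic sketch: LTL is defined over infinite paths and nuXmv checks the property symbolically, so the finite-trace reading, the function name and the sample traces below are all assumptions. Here `T0` and `Tavg` stand for $core_x.T_0$ and $core_x.T_{avg}$, and `Tk` for $core_k.T_0$.

```python
from typing import Callable, Dict, List

State = Dict[str, float]


def globally_implies_eventually(trace: List[State],
                                p: Callable[[State], bool],
                                q: Callable[[State], bool]) -> bool:
    """Finite-trace reading of G(p -> F q): every p-state is followed, now or later, by a q-state."""
    for i, state in enumerate(trace):
        if p(state) and not any(q(s) for s in trace[i:]):
            return False
    return True


def hot(s: State) -> bool:
    return s["T0"] > s["Tavg"]   # core_x.T0 > core_x.Tavg


def relieved(s: State) -> bool:
    return s["T0"] <= s["Tk"]    # core_x.T0 <= core_k.T0


# Hypothetical traces: in `good` the core cools down, in `stuck` it never does.
good = [{"T0": 56, "Tavg": 50, "Tk": 48}, {"T0": 47, "Tavg": 50, "Tk": 52}]
stuck = [{"T0": 56, "Tavg": 50, "Tk": 48}] * 3
print(globally_implies_eventually(good, hot, relieved))   # True
print(globally_implies_eventually(stuck, hot, relieved))  # False
```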

**Liveliness.** The liveliness property in a system makes sure that the system returns to its good working or desired state. In our verification, we have defined liveliness using the following specification:

$$G(core_x.T_0 < core_x.T_{avg} \rightarrow X(core_x.n = x))$$

This property states that if the temperature of a core is less than the estimated average temperature of the core, then in the very next state, the core does not need to migrate tasks to its neighbors.

**Stability.** Stability is one of the most important properties for any DTM algorithm. In the given algorithm, stability is attained when the temperatures of all the cores are eventually less than or equal to the estimated average temperature of the chip.

$$GF((core_0.T_0 \le core_0.T_{avg}) \,\&\, (core_1.T_0 \le core_1.T_{avg}) \,\&\, \cdots \,\&\, (core_{80}.T_0 \le core_{80}.T_{avg}))$$

The stability condition, $core_n.T_0 \le core_n.T_{avg}$, for a core n is defined using the fact that the algorithm tries to achieve stability by executing the task migration until the core has a temperature equal to or less than the estimated average, as shown in Fig. 1. Also, the GF operator is used to ensure that our stability property holds true (somewhere in the future, i.e., eventually) across any execution path (globally) of the algorithm. For our $9 \times 9$ grid, it means that eventually there would be a state where all the cores of the system have a temperature that does not exceed the estimated average temperature.

**Verification of Temperature Estimation Algorithm.** An interesting observation is the estimation of the average chip temperature using Eq. 1. The graphs below show the average chip temperature estimation behavior of six cores, each corresponding to one of the six different neighbor configurations in our grid.



Fig. 4: Temperature Estimation in °C

Initially, each core sees the initial value of the $T_{avg}$ as the average temperature of the core. The cores then estimate the temperature of the chip with the help of the underlying average tracking algorithm in the DTM. For illustration purposes, we have shown the actual average temperature of the grid and the estimated temperature of the selected cores on the same plot. It shows that the average estimation algorithm, making use of the temperature information from the neighboring nodes, gives a good average estimate of the overall chip temperature, confirming the functionality of the average estimation algorithm. Moreover, the averages estimated by different cores follow a similar pattern.

The verification times and the memory consumption for some of the functional properties verified in this work are given in Table 1. The time measurements in Table 1 were obtained using the nuXmv function time.

| Properties          | Core | Module | Memory Usage (MB)     | Time (s) |
|---------------------|------|--------|-----------------------|----------|
| Liveliness Property | 0    | n2_3   | 1015.69               | 745.67   |
| Liveliness Property | 1    | n3_4   | 1051.71               | 751.25   |
| Liveliness Property | 2    | n3_5   | 1025.64               | 749.65   |
| Liveliness Property | 10   | n4_6   | 1041.52               | 758.75   |
| Liveliness Property | 11   | n4_7   | 1031.74               | 781.85   |
| Liveliness Property | 20   | n4_8   | 1033.85               | 790.65   |
| Deadlock Property   | 0    | n2_3   | 1351.41               | 1245.59  |
| Deadlock Property   | 1    | n3_4   | 1325.35               | 1235.61  |
| Deadlock Property   | 2    | n3_5   | 1315.63               | 1241.91  |
| Deadlock Property   | 10   | n4_6   | 1359.54               | 1249.71  |
| Deadlock Property   | 11   | n4_7   | 1359.54               | 1239.41  |
| Deadlock Property   | 20   | n4_8   | 1343.51               | 1251.11  |
| Stability Property  | -    |        | 2051.51               | 2253.56  |

**Table 1:** Timing and memory resources for some of the properties verified by our technique.

#### 4.3 Timing Verification

The functional properties, presented in the previous section, have been verified for the initial temperature values taken randomly from the allowable range given in [14]. In this section, we verify various timing-related properties for specific scenarios with particular initial temperatures. In order to measure the time stamps between the state transitions, we have used the built-in nuXmv commands execute\_trace and execute\_partial\_trace. Table 2 shows the time to stability for different selected cores (one from each module) under 16 possible conditions. Here, n2\_3, n3\_4, n3\_5, n4\_6, n4\_7 and n4\_8 represent the different modules. $T_0$ represents the temperature of the tested core and $T_1$, $T_2$, $T_3$ and $T_4$ represent the temperatures of the neighbors of the tested core. The first case, nb0, in Table 2 represents the case when the temperature of all the neighbor cores is less than the threshold. Similarly, nb1234 represents the case when the temperatures of all neighbor cores, i.e., 1, 2, 3 and 4, exceed the threshold. The other cases represent the intermediate possibilities between these extreme scenarios. It can be seen from Table 2 that a maximum of 141 state transitions are required to reach stability when the given core, of type n4\_8, and three of its neighbors are at a temperature of 56°C.

| Scenario | T0 (°C) | T1 (°C) | T2 (°C) | T3 (°C) | T4 (°C) | n2_3 (core 0) | n3_4 (core 1) | n3_5 (core 2) | n4_6 (core 10) | n4_7 (core 11) | n4_8 (core 20) |
|----------|---------|---------|---------|---------|---------|---------------|---------------|---------------|----------------|----------------|----------------|
| nb0       | 56           |    |    |    |       | 49        | 52   | 51   | 63        | 67        | 62   |
| nb1       | 56           | 56 |    |    |       | 53        | 53   | 54   | 61        | 63        | 61   |
| nb2       | 56           |    | 56 |    |       | 61        | 57   | 55   | 71        | 73        | 69   |
| nb3       | 56           |    |    | 56 |       | -         | 20   | 57   | 72        | 79        | 80   |
| nb4       | 56           |    |    |    | 56    | -         | -    | -    | 89        | 87        | 91   |
| nb12      | 56           | 56 | 56 |    |       | 63        | 59   | 53   | 94        | 92        | 85   |
| nb13      | 56           | 56 |    | 56 |       | 62        | 61   | 54   | 91        | 98        | 96   |
| nb14      | 56           | 56 |    |    | 56    | -         | 57   | 51   | 107       | 114       | 113  |
| nb23      | 56           |    | 56 | 56 |       | 62        | 62   | 59   | 117       | 112       | 105  |
| nb24      | 56           |    | 56 |    | 56    | -         | 58   | 57   | 103       | 109       | 112  |
| nb34      | 56           |    |    | 56 | 56    | -         | 60   | 65   | 100       | 105       | 102  |
| nb123     | 56           | 56 | 56 | 56 |       | -         | -    | -    | 115       | 120       | 117  |
| nb124     | 56           | 56 | 56 |    | 56    | -         | -    | -    | 119       | 124       | 118  |
| nb134     | 56           | 56 |    | 56 | 56    | -         | -    | -    | 121       | 125       | 131  |
| nb234     | 56           |    | 56 | 56 | 56    | -         | -    | -    | 132       | 129       | 141  |
| nb1234    | 56           | 56 | 56 | 56 | 56    | -         | -    | -    | 129       | 131       | 137  |

**Table 2:** No. of transitions required to achieve stability for different test scenarios.

## 5 Conclusion

This paper presents the formal verification of both functional and timing properties of a recent dDTM technique [23] for on-chip many-core systems using the nuXmv model checker. The ability of nuXmv to handle real numbers, together with its powerful verification methods based on SAT and SMT solvers, has allowed us to gain many new insights into the given algorithm. While modeling the selected task migration algorithm [23] in nuXmv, we identified a couple of ambiguities in the original algorithm, which have been fixed in our implementation in the nuXmv language. The analyzed model has 81 cores, and the analysis covers the temperature range of 41 to 56°C. To the best of our knowledge, a model of this size cannot be handled rigorously by simulation-based testing. We plan to extend this work by proposing a common ground to analyze and compare dDTM schemes, such as [11,15,34,20], in terms of both functional and timing properties.

# References

- 1. Brooks, D., Martonosi, M.: Dynamic thermal management for high-performance microprocessors. In: High-Performance Computer Architecture. pp. 171–182. IEEE (2001)
- 2. Bukhari, S.A.A., Lodhi, F.K.: Formal verification of distributed task migration for thermal management in on-chip multi-core systems using nuXmv, National University of Sciences and Technology (2014), http://save.seecs.nust.edu.pk/projects/fdDTM/fdDTM.html
- 3. Cavada, R., Cimatti, A., Dorigatti, M., Griggio, A., Mariotti, A., Micheli, A., Mover, S., Roveri, M., Tonetta, S.: The nuXmv Symbolic Model Checker. In: Computer Aided Verification, LNCS, vol. 8559, pp. 334–342. Springer (2014)
- 4. Chen, F., Cao, Y., Ren, W.: Distributed computation of the average of multiple time-varying reference signals. In: American Control Conference. pp. 1650–1655 (2011)
- 5. Chen, F., Cao, Y., Ren, W.: Distributed average tracking of multiple time-varying reference signals with bounded derivatives. Automatic Control, IEEE Transactions on 57(12), 3169–3174 (2012)
- 6. Clarke, Jr., E.M., Grumberg, O., Peled, D.A.: Model Checking, MIT Press (1999)
- 7. Donald, J., Martonosi, M.: Techniques for multicore thermal management: Classification and new exploration. In: Computer Architecture. pp. 78–88 (2006)
- 8. Donald, J., Martonosi, M.: Techniques for multicore thermal management: Classification and new exploration. In: ACM SIGARCH Computer Architecture News. vol. 34, pp. 78–88. IEEE Computer Society (2006)
- 9. Drechsler, R.: Advanced Formal Verification. Falk Symposium Series, Springer (2004)
- 10. Dunn, D.: Intel delays Montecito in roadmap shakeup. EE Times, Manufacturing/Packaging (Oct 2005)
- 11. Ebi, T., Faruque, M., Henkel, J.: Tape: Thermal-aware agent-based power economy for multi/many-core architectures. In: Computer-Aided Design. pp. 302–309 (2009)
- 12. Ebi, T., Kramer, D., Karl, W., Henkel, J.: Economic learning for thermal-aware power budgeting in many-core architectures. In: Hardware/Software Codesign and System Synthesis. pp. 189–196. ACM (2011)
- 13. Ge, Y., Malani, P., Qiu, Q.: Distributed task migration for thermal management in many-core systems. In: Design Automation Conference. pp. 579–584. ACM (2010)
- 14. Glocker, E., Schmitt-Landsiedel, D.: Modeling of temperature scenarios in a multicore processor system. Advances in Radio Science 11, 219–225 (2013)
- 15. Henkel, J., Ebi, T., Amrouch, H., Khdr, H.: Thermal management for dependable on-chip systems. In: Asia and South Pacific Design Automation Conference. pp. 113–118 (2013)
- 16. Holzmann, G.J.: The Model Checker SPIN. IEEE Transactions on software engineering 23(5), 279–295 (1997)
- 17. Ismail, M., Hasan, O., Ebi, T., Shafique, M., Henkel, J.: Formal verification of distributed dynamic thermal management. In: Computer-Aided Design. pp. 248–255. IEEE (2013)
- 18. ITRS: (2014), http://www.itrs.net/Links/2013ITRS/2013Chapters/2013Overview.pdf
- 19. Kadin, M., Reda, S., Uht, A.: Central vs. distributed dynamic thermal management for multi-core processors: Which one is better? In: Great Lakes Symposium on VLSI. pp. 137–140. ACM (2009)
- 20. Khdr, H., Ebi, T., Shafique, M., Amrouch, H., Henkel, J.: mdtm: multi-objective dynamic thermal management for on-chip systems. In: Design, Automation & Test in Europe. p. 330 (2014)
- 21. Kong, J., Chung, S.W., Skadron, K.: Recent thermal management techniques for microprocessors. ACM Comput. Surv. 44(3), 13:1–13:42 (2012)
- 22. Lamport, L.: Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21(7), 558–565 (1978)
- 23. Liu, Z., Huang, X., Tan, S.D., Wang, H., Tang, H.: Distributed task migration for thermal hot spot reduction in many-core microprocessors. In: ASIC. pp. 1–4 (2013)
- 24. Liu, Z., Xu, T., Tan, S.D., Wang, H.: Dynamic thermal management for multicore microprocessors considering transient thermal effects. In: Design Automation Conference. pp. 473–478 (2013)
- 25. Lungu, A., Bose, P., Sorin, D.J., German, S., Janssen, G.: Multicore power management: Ensuring robustness via early-stage formal verification. In: Formal Methods and Models for Codesign. pp. 78–87. IEEE (2009)
- 26. MathSAT 5: (2014), http://mathsat.fbk.eu/
- 27. Mukherjee, R., Memik, S.O.: Physical aware frequency selection for dynamic thermal management in multi-core systems. In: Computer-aided Design. pp. 547–552. ACM (2006)
- 28. Nath, R., Carmean, D., Rosing, T.S.: Power modeling and thermal management techniques for manycores. In: Computers and Communications. pp. 740–746. IEEE (2013)
- 29. Norman, G., Parker, D., Kwiatkowska, M., Shukla, E., Gupta, R.: Using probabilistic model checking for dynamic power management. Formal Aspects of Computing 17, 202–215 (2003)
- 30. NuSMV: (2014), http://nusmv.fbk.eu/
- 31. nuXmv: (2014), https://nuxmv.fbk.eu/
- 32. Salami, B., Baharani, M., Noori, H.: An adaptive temperature threshold schema for dynamic thermal management of multi-core processors. In: Computer Architecture and Digital Systems. pp. 119–120 (2013)
- 33. Schauer, B.: Multicore processors – a necessity. ProQuest discovery guides pp. 1–14 (2008)
- 34. Shafique, M., Henkel, J.: Agent-based distributed power management for kilo-core processors. In: Computer-Aided Design. pp. 153–160. IEEE (2013)
- 35. Shukla, S., Gupta, R.: A model checking approach to evaluating system level dynamic power management policies for embedded systems. In: High-Level Design Validation and Test Workshop, IEEE. pp. 53–57 (2001)
- 36. Singh, A., Shafique, M., Kumar, A., Henkel, J.: Mapping on multi/many-core systems: Survey of current and emerging trends. In: Design Automation Conference (DAC), ACM / EDAC / IEEE. pp. 1–10 (2013)
- 37. Weiss, G.: Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. MIT Press (1999)
- 38. Wolfram: (2014), http://functions.wolfram.com/ComplexComponents/Sign/21/01/01/
- 39. Wyngaard, J., Inggs, M., Collins, J., Farrimond, B.: Towards a many-core architecture for HPC. In: Field Programmable Logic and Applications. pp. 1–4 (2013)
- 40. Yang, J., Zhou, X., Chrobak, M., Zhang, Y., Jin, L.: Dynamic thermal management through task scheduling. In: Performance Analysis of Systems and software. pp. 191–201 (2008)
- 41. Yun, B., Shin, K.G., Wang, S.: Predicting thermal behavior for temperature management in time-critical multicore systems. In: IEEE Real-Time and Embedded Technology and Applications Symposium. pp. 185–194 (2013)

# Coalgebraic Semantic Model for the Clock Constraint Specification Language

Frédéric Mallet<sup>1</sup> and Grygoriy Zholtkevych<sup>2</sup>

<sup>1</sup> Univ. Nice Sophia Antipolis, CNRS, I3S, UMR 7271, Sophia Antipolis, France; INRIA Sophia Antipolis Méditerranée, Sophia Antipolis, France; East China Normal University/Software Engineering Institute, Shanghai, PRC

<sup>2</sup> Dep. Theor. and Appl. CS, V.N. Karazin Kharkiv National Univ., Kharkiv, Ukraine

Abstract. The Clock Constraint Specification Language (CCSL) was initially introduced as part of the UML Profile for MARTE, which is dedicated to the modeling and analysis of real-time and embedded systems. CCSL proposes a set of simple patterns classically used to specify causal and temporal properties of (UML/EMF) models. The paper proposes a new semantic model for CCSL based on the notion of "clock coalgebra". Coalgebras promise to give a unified framework to study the behavior and semantics of reactive systems and, more generally, infinite data structures. They appear to be an adequate mathematical structure to capture the infinite nature of CCSL operators. This paper proposes a coalgebraic structure for CCSL, or rather for a natural generalization of CCSL that we call generalized clock constraints: GenCCSL. We establish that GenCCSL covers the class of CCSL constraints and we give examples of GenCCSL constraints that cannot be expressed with classical CCSL. Then, we discuss the properties of the newly introduced class, including ways to detect valid and invalid GenCCSL behaviors, as well as ways to decide whether a GenCCSL constraint is also a CCSL one.

**Keywords:** concurrent system, behavior model, clock model, transition system, coalgebra

## 1 Introduction

The UML profile for MARTE (Modeling and Analysis of Real-Time Embedded systems) [13] is dedicated to the modeling and analysis of real-time and embedded systems. Its time model [3] builds on the notion of logical clock that was concurrently made popular in distributed systems [8] and in synchronous languages [4]. Logical clocks offer a good abstraction to describe causal relationships between the occurrences of events in a distributed system, but also synchronization constraints in synchronous (software or circuit) implementations. MARTE offers a stereotype to identify clocks and then promotes the use of the Clock Constraint Specification Language (Annex C.3) as a concrete syntax to handle and constrain those clocks. An operational semantics of CCSL has been defined separately [1] as a basis for building a simulator, called TimeSquare [6]. Later, a denotational semantics was defined [9] in a bid to provide some exhaustive verification support for CCSL. The equivalence of these two semantics has also been proven [17]. The operational semantics of CCSL is inspired by the approach proposed by G. Plotkin for defining the operational semantics of software systems [14]. The theory of universal coalgebra [15] proposes another mathematical model to study the semantics of reactive systems and, more generally, of infinite data structures. It appears to be well suited to dealing with the infinite nature of CCSL operators.

This paper proposes a coalgebraic semantic model to reason about CCSL constraints. This opens the path to the use of coalgebraic bisimulation. The model is used here to identify a lack of expressiveness in CCSL and to define an extension, called GenCCSL.

We proceed by associating a coalgebra with each CCSL clock constraint. This is useful to identify valid schedules with tracks in the corresponding coalgebra. Such an identification embeds the class of clock constraints into a wider class of constraints, GenCCSL. The principal problem in this context is to understand whether this embedding is bijective or not. Showing that this embedding is not bijective leads us to conclude that CCSL is incomplete. We then discuss the properties of the constraints characterized by the generalized class, GenCCSL.

The remainder of the paper is organized as follows. Section 2 introduces the vocabulary, the syntax and the semantics of the considered CCSL constraints as well as the notion of coalgebra. Section 3 proposes our clock coalgebra. Section 4 presents the notion of stationary clock constraint and stresses the incompleteness of CCSL with regard to the proposed coalgebra. Section 5 discusses related work and sources of inspiration. Section 6 concludes and summarizes the main results.

## 2 Preliminaries

In this section we recall the notation and definitions of the key terms used.

A (logical) clock denotes a repetitive event of relevance for the system under consideration, together with its sequence of occurrences. For instance, in a train system, a clock can be any command from the driver to the control systems (brake pressed, power on, ...) or any occurring event (door opening, emergency brake requested, ...). In a computer architecture, clocks can represent the processor clock, but also any kind of request on buses, fetch operations, interrupts, and so on. We do not assume a regular rhythm in the occurrences (interrupt), but we do not preclude it (processor clock). The occurrences of the event associated with a clock are also called its ticks: when the clock ticks, the event occurs. We do not need any specific property of clocks here and therefore we do not give a formal definition. In the following, we consider that we operate on a set of clocks, $\mathcal{C}$.

#### 2.1 Clock Constraints. Syntax

Syntactically a clock constraint is a finite set of primitive clock constraints, which are classified as clock relations and clock definitions.

There are four kinds of clock relations: subclocking, exclusion, causality, and precedence. All these relations are binary relations over clocks. The following notation is used to denote these relations between clocks  $a \in \mathcal{C}$  and  $b \in \mathcal{C}$ :

- $a \subseteq b$ denotes that clock a is a subclock of clock b,
- $a \mathrel{\#} b$ denotes that clocks a and b are mutually exclusive,
- $a \preccurlyeq b$ denotes that clock a causes clock b, and
- $a \prec b$ denotes that clock a precedes clock b.

Intuitively, the first two relations are synchronous constraints. Subclocking allows a (sub)clock to tick only when its master (super)clock ticks. We say that the subclock is coarser than the superclock, and the superclock is finer than the subclock. Exclusion forbids two clocks to tick simultaneously, without giving priority to either clock. The last two relations are asynchronous. Causality relates an effect to its cause, e.g., the sending of a message in a queue and its reception. It is the classical reflexive and transitive causality relation of event structures [12]. If $a \preccurlyeq b$, we say that a is faster than b, or b is slower than a. Precedence is similar to causality but excludes instantaneous communications. The formal definitions of these relations are given in the next subsection.

There is one kind of unary clock definition, parametrized by a positive natural number $n \in \mathbb{N}$:

- $b \triangleq a \$ n$ denotes that clock b is an n-times delay of clock a.

Intuitively, b is the same clock as a except that the first n ticks are ignored.

Finally, there are four kinds of binary clock definitions: union, intersection, infimum, and supremum. The following notation is used for these definitions:

- $c \triangleq a + b$ denotes that clock c is a union of clocks a and b,
- $c \triangleq a * b$ denotes that clock c is an intersection of clocks a and b,
- $c \triangleq a \land b$ denotes that clock c is an infimum of clocks a and b, and
- $c \triangleq a \lor b$ denotes that clock c is a supremum of clocks a and b.

Intuitively, the union of two clocks a and b is the coarsest clock c that is a superclock of both a and b. The intersection is the finest clock that is a subclock of both a and b. These two operators are related to the synchronous relation of subclocking. As an example, consider a system in which a command is sent to two actuators. Let a and b be clocks that represent the instants at which the command is actually received by the actuators. Then, a + b represents the instants at which a command is received by at least one actuator, whereas a \* b represents the instants at which the command arrives simultaneously at the two actuators.

Infimum and supremum play a dual role with respect to the asynchronous relation of causality. The infimum of two clocks a and b is the slowest clock (with regard to causality) that is faster than both a and b. The supremum is the fastest clock that is slower than both a and b. Using the same example as before, $a \land b$ represents the instants of the earliest reception on either a or b. Sometimes a may receive the command first, sometimes it is b. On the other hand, $a \lor b$ represents the instants of the latest reception on either a or b.

#### 2.2 Clock Constraints. Semantics

One way to define the semantics for a clock constraint system is to specify the corresponding set of schedules, *i.e.*, the scenarios of the valid system behavior. In general, for a clock system, there are several (infinitely many) valid schedules.

**Definition 1.** Let $\mathcal{C}$ be a given finite set of clocks. Then a map $\sigma: \mathbb{N}_{>0} \to \mathcal{P}(\mathcal{C})$ is called a schedule, and an element $\boldsymbol{\chi}^{\sigma}$ of $\mathbb{N}^{\mathcal{C}}$ is called a configuration for a given schedule $\sigma$. The component $\chi_a^{\sigma} \in \mathbb{N}$ of $\boldsymbol{\chi}^{\sigma}$ denotes the configuration of clock a.

Intuitively, a schedule is a sequence of steps. For a given step  $t \in \mathbb{N}_{>0}$ ,  $\sigma(t)$  is the set of clocks that tick simultaneously.

With each schedule  $\sigma$ , the sequence of configurations  $\langle \chi^{\sigma}(t) | t \in \mathbb{N} \rangle$  can be defined in the following manner:

$$\begin{split} & \boldsymbol{\chi}^{\sigma}(0) = \boldsymbol{0} \,; \\ & \chi_{a}^{\sigma}(t) = \begin{cases} \chi_{a}^{\sigma}(t-1) \,, & \text{if } a \notin \sigma(t) \\ \chi_{a}^{\sigma}(t-1) + 1 \,, & \text{if } a \in \sigma(t) \end{cases} \text{ for all } t \in \mathbb{N}_{>0} \,, a \in \mathcal{C} \,. \end{split}$$

In other words, for a given schedule, $\boldsymbol{\chi}^{\sigma}(t)$ records for each clock the number of its activations (ticks) up to and including step t.
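The recurrence above can be illustrated on a finite prefix of a schedule. The following is a minimal Python sketch, not part of the original formalization: the representation of a schedule as a list of sets of clock names (index `t-1` standing for $\sigma(t)$) and the function name `configurations` are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Sequence


def configurations(schedule: Sequence[FrozenSet[str]],
                   clocks: Sequence[str]) -> List[Dict[str, int]]:
    """Return [chi(0), chi(1), ..., chi(T)] for a finite prefix sigma(1..T)."""
    chi = {c: 0 for c in clocks}          # chi(0) is the zero vector
    trace = [dict(chi)]
    for step in schedule:                 # step plays the role of sigma(t)
        # a clock's counter grows by one exactly when the clock ticks at t
        chi = {c: chi[c] + (1 if c in step else 0) for c in clocks}
        trace.append(dict(chi))
    return trace


# Hypothetical schedule prefix over two clocks a and b.
sigma = [frozenset({"a"}), frozenset({"a", "b"}), frozenset(), frozenset({"b"})]
trace = configurations(sigma, ["a", "b"])
print(trace)
```

Each entry of `trace` is one configuration; consecutive entries differ by at most one in every component, which is exactly the property exploited in Proposition 1.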

There is a close interrelation between schedules and sequences of configurations. This interrelation is established in the following simple proposition.

**Proposition 1.** Let $\langle \boldsymbol{\chi}(t) \mid t \in \mathbb{N} \rangle$ be a sequence of configurations. Then there exists a schedule $\sigma$ such that $\chi_a(t) = \chi_a^{\sigma}(t)$ for all $t \in \mathbb{N}$ and $a \in \mathcal{C}$ if and only if the following conditions hold:

$$\boldsymbol{\chi}(0) = \mathbf{0}$$

and

$$0 \le \chi_a(t+1) - \chi_a(t) \le 1 \text{ for all } t \in \mathbb{N} \text{ and } a \in \mathcal{C}.$$

Moreover, the schedule  $\sigma$  is uniquely determined by  $\langle \chi(t) | t \in \mathbb{N} \rangle$ .

*Proof.* The proof is straightforward and is omitted. We only give the construction of $\sigma$: $\sigma(t) = \{a \in \mathcal{C} \mid \chi_a(t) > \chi_a(t-1)\}$ for $t \in \mathbb{N}_{>0}$. $\Box$
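The construction used in the proof can be sketched for a finite prefix of a configuration sequence. The Python function below is an illustrative assumption (its name, the dictionary representation of configurations, and the use of `None` to signal an invalid sequence are not from the paper): it checks the two conditions of Proposition 1 and, if they hold, rebuilds the schedule prefix.

```python
from typing import Dict, List, Optional, Set


def schedule_from_configurations(
        chis: List[Dict[str, int]]) -> Optional[List[Set[str]]]:
    """Return sigma(1..T) if the prefix chi(0..T) satisfies Prop. 1, else None.

    All dictionaries are assumed to share the same set of clock names.
    """
    if not chis or any(v != 0 for v in chis[0].values()):
        return None                       # chi(0) must be the zero vector
    sigma = []
    for prev, cur in zip(chis, chis[1:]):
        if any(not 0 <= cur[a] - prev[a] <= 1 for a in prev):
            return None                   # each clock ticks at most once per step
        # sigma(t) collects the clocks whose counter increased at step t
        sigma.append({a for a in prev if cur[a] > prev[a]})
    return sigma


good = [{"a": 0, "b": 0}, {"a": 1, "b": 0}, {"a": 2, "b": 1}]
print(schedule_from_configurations(good))
```

Because $\sigma(t)$ is read off directly from the increments, two different schedules cannot produce the same configuration sequence, which matches the uniqueness claim.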

Below, we define the semantics of a primitive clock constraint as a set of schedules that satisfy this constraint (similarly to [11]). We shall use the abbreviated notation  $\sigma \models Cons$  to represent the statement "schedule  $\sigma$  satisfies clock constraint Cons" and  $\llbracket Cons \rrbracket$  to refer to the set of all the schedules satisfying clock constraint Cons.

Bold font denotes vectors: $\boldsymbol{\chi}^{\sigma} \in \mathbb{N}^{\mathcal{C}}$ whereas $\chi_a^{\sigma} \in \mathbb{N}$.

**Subclocking.** Let a and b be clocks belonging to $\mathcal{C}$. Then $\sigma \models a \subseteq b$ means that the following statement holds:

$$a \in \sigma(t) \text{ implies } b \in \sigma(t) \text{ for all } t \in \mathbb{N}_{>0}.$$

Subclocking is the basic synchronous construct allowing one event to occur only if its master event also occurs.

**Exclusion.** Let a and b be clocks belonging to $\mathcal{C}$. Then $\sigma \models a \mathrel{\#} b$ means that the following statement holds:

$$a \notin \sigma(t) \text{ or } b \notin \sigma(t) \text{ for all } t \in \mathbb{N}_{>0}.$$

Exclusion, here, is a purely synchronous notion, which is very different from the notion of exclusion in event structures. Indeed, it only prevents two clocks from ticking simultaneously. In event structures, two occurrences being mutually exclusive means that if one occurs, the other can never occur.

**Causality.** Let a and b be clocks belonging to $\mathcal{C}$. Then $\sigma \models a \preccurlyeq b$ means that the following statement holds:

$$\chi_a^{\sigma}(t) \ge \chi_b^{\sigma}(t) \text{ for all } t \in \mathbb{N}.$$

In contrast, causality is a purely asynchronous notion, classical in process networks and Petri nets.

**Precedence.** Let a and b be clocks belonging to $\mathcal{C}$. Then $\sigma \models a \prec b$ means that the following statement holds:

$$\chi_a^{\sigma}(t) = \chi_b^{\sigma}(t) \text{ implies } b \notin \sigma(t+1) \text{ for all } t \in \mathbb{N}.$$

Let us recall that  $\chi^{\sigma}(0) = \mathbf{0}$  so the equality of configurations is at least achieved initially. Then a is bound to tick strictly faster than b unless they both never tick. This latter pathological case denotes a classical liveness problem in CCSL specifications. That is why we usually attempt to establish that all the clocks tick infinitely often.

**Delay.** Let a and b be clocks belonging to $\mathcal{C}$. We say that b is delayed by $m \in \mathbb{N}$ compared to a (denoted by $b \triangleq a \$ m$) and that $\sigma \models b \triangleq a \$ m$ if the following statement holds:

$$\chi_b^{\sigma}(t) = \max(\chi_a^{\sigma}(t) - m, 0) \text{ for all } t \in \mathbb{N}.$$

**Union.** Let a, b, and c be clocks belonging to $\mathcal{C}$. We say that c is the union of a and b and that $\sigma \models c \triangleq a + b$ if the following statement holds:

$$c \in \sigma(t) \text{ iff } a \in \sigma(t) \text{ or } b \in \sigma(t) \text{ for all } t \in \mathbb{N}_{>0}.$$

**Intersection.** Let a, b, and c be clocks belonging to $\mathcal{C}$. We say that c is the intersection of a and b and that $\sigma \models c \triangleq a * b$ if the following statement holds:

$$c \in \sigma(t) \text{ iff } a \in \sigma(t) \text{ and } b \in \sigma(t) \text{ for all } t \in \mathbb{N}_{>0}.$$

**Infimum.** Let a, b, and c be clocks belonging to $\mathcal{C}$. We say that c is the infimum of a and b and that $\sigma \models c \triangleq a \land b$ if the following statement holds:

$$\chi_c^{\sigma}(t) = \max(\chi_a^{\sigma}(t), \chi_b^{\sigma}(t)) \text{ for all } t \in \mathbb{N}.$$

**Supremum.** Let a, b, and c be clocks belonging to $\mathcal{C}$. We say that c is the supremum of a and b and that $\sigma \models c \triangleq a \lor b$ if the following statement holds:

$$\chi_c^{\sigma}(t) = \min(\chi_a^{\sigma}(t), \chi_b^{\sigma}(t)) \text{ for all } t \in \mathbb{N}.$$

**Definition 2.** If S is a finite set of primitive clock constraints described above then  $\sigma \models S$  means that  $\sigma \models \gamma$  for each  $\gamma \in S$  and  $\llbracket S \rrbracket$  denotes the set of all schedules  $\sigma$  such that  $\sigma \models S$ .
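The primitive constraints defined above can be checked mechanically on a finite prefix of a schedule. The following Python sketch is only an illustration under stated assumptions: the function names, the list-of-sets representation of a schedule (index `t-1` standing for $\sigma(t)$), and the example schedule are hypothetical, and a check on a finite prefix gives only a necessary condition for an infinite schedule to satisfy a constraint.

```python
from typing import List, Sequence, Set

Schedule = Sequence[Set[str]]   # schedule[t-1] plays the role of sigma(t)


def counts(s: Schedule, clock: str) -> List[int]:
    """chi_clock(t) for t = 0..T."""
    out = [0]
    for step in s:
        out.append(out[-1] + (1 if clock in step else 0))
    return out


def subclock(s, a, b):          # a is a subclock of b
    return all(b in step for step in s if a in step)


def exclusion(s, a, b):         # a and b never tick together
    return all(not (a in step and b in step) for step in s)


def causality(s, a, b):         # chi_a(t) >= chi_b(t) for all t
    return all(x >= y for x, y in zip(counts(s, a), counts(s, b)))


def precedence(s, a, b):        # chi_a(t) = chi_b(t) implies b not in sigma(t+1)
    ca, cb = counts(s, a), counts(s, b)
    return all(b not in s[t] for t in range(len(s)) if ca[t] == cb[t])


def delay(s, b, a, m):          # chi_b(t) = max(chi_a(t) - m, 0)
    return all(y == max(x - m, 0) for x, y in zip(counts(s, a), counts(s, b)))


def union(s, c, a, b):
    return all((c in step) == (a in step or b in step) for step in s)


def intersection(s, c, a, b):
    return all((c in step) == (a in step and b in step) for step in s)


def infimum(s, c, a, b):        # chi_c(t) = max(chi_a(t), chi_b(t))
    return all(z == max(x, y) for x, y, z in
               zip(counts(s, a), counts(s, b), counts(s, c)))


def supremum(s, c, a, b):       # chi_c(t) = min(chi_a(t), chi_b(t))
    return all(z == min(x, y) for x, y, z in
               zip(counts(s, a), counts(s, b), counts(s, c)))


# Hypothetical prefix: a causes b, but a does not precede b (step 3).
s = [{"a", "inf"}, {"b", "sup"}, {"a", "b", "inf", "sup"}]
print(causality(s, "a", "b"), precedence(s, "a", "b"))
```

In the example prefix, a and b have both ticked once after step 2 and then tick together at step 3, which causality allows and precedence forbids.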

#### 2.3 Coalgebra as a Tool to Model Computer Systems

Gordon Plotkin explains [14] that transition structures are adequate models of computer systems: "In discrete (digital) computer systems behaviour consists of elementary steps which are occurrences of operations. Such elementary steps are called here, (and also in many other situations in Computer Science) transitions (= moves). Thus a transition step from one configuration to another and as a first idea we take it to be a binary relation between configurations." In [15] it has been shown that considering transition systems as coalgebras gives useful, non-trivial results. It allows mathematical reasoning on infinite data structures, such as the behavior of reactive systems, and it paves the way to coalgebra homomorphisms and bisimulation. This is essential to prove semantic preservation when transforming MARTE/CCSL into other formal models. In this paper, we rely on it to define a notion of incompleteness for CCSL and to propose an extension, called GenCCSL.

This subsection gives a minimal review of the definitions and notation used for transition systems and coalgebras. We then define a *clock coalgebra* for MARTE/CCSL.

**Definition 3.** A transition system is a structure  $\langle \Gamma, \longrightarrow \rangle$  where  $\Gamma$  is a set (of elements,  $\gamma$ , called configurations) and  $\longrightarrow \subset \Gamma \times \Gamma$  is a binary relation (called the transition relation). Read  $\gamma \longrightarrow \gamma'$  as saying that there is a transition from configuration  $\gamma$  to configuration  $\gamma'$ .

Using the notion of coalgebra we obtain an alternative way to describe transition systems.

**Definition 4.** A (powerset) coalgebra [15] is a structure  $\langle \Gamma, \alpha \rangle$  where  $\alpha$  is a map from  $\Gamma$  into the set of all subsets of  $\Gamma$ ,  $\mathcal{P}(\Gamma)$ . In this context  $\Gamma$  is called the carrier of the coalgebra.

It is evident that any transition system  $\langle \Gamma, \longrightarrow \rangle$  determines the coalgebra  $\langle \Gamma, \alpha \rangle$ , where  $\gamma' \in \alpha(\gamma)$  if and only if  $\gamma \longrightarrow \gamma'$ , and conversely, any coalgebra  $\langle \Gamma, \alpha \rangle$  determines the transition system  $\langle \Gamma, \longrightarrow \rangle$ , where  $\gamma \longrightarrow \gamma'$  if and only if  $\gamma' \in \alpha(\gamma)$ .

**Definition 5.** Let $\langle \Gamma, \alpha \rangle$ be a coalgebra and B be a subset of $\Gamma$. Then the structure $\langle B, \alpha \rangle$ is called a subcoalgebra of $\langle \Gamma, \alpha \rangle$ if the inclusion $\alpha(\gamma) \subset B$ holds for each $\gamma \in B$.

One can check that any coalgebra $\langle \Gamma, \alpha \rangle$ is a subcoalgebra of itself and that the intersection of a family of subcoalgebras is a subcoalgebra too. Hence, for each subset $X \subset \Gamma$ there exists a least subcoalgebra whose carrier contains X. The carrier of this subcoalgebra is denoted by $\langle X \rangle$.

To calculate $\langle X \rangle$ one can use Tarski's fixed point theorem [16] for the monotonic operator $\Psi_X$ on the lattice $\mathcal{P}_X(\Gamma)$, where $\mathcal{P}_X(\Gamma)$ is the set of all subsets of $\Gamma$ that contain X. This operator is defined by the following formula:

$$\Psi_X(V) = V \cup \{ \gamma' \in \Gamma \mid (\exists \gamma \in V) \ \gamma' \in \alpha(\gamma) \}.$$

**Calculation Schema.** To calculate  $\langle X \rangle$  one can build the following sequence of sets

$$V_0 = X,$$

and for n > 0

$$V_n = V_{n-1} \cup \{ \gamma' \in \Gamma \mid (\exists \gamma \in V_{n-1}) \ \gamma' \in \alpha(\gamma) \} ;$$

then

$$\langle X \rangle = \bigcup_{n \ge 0} V_n \,.$$

This computational schema ensures that an element $\gamma \in \Gamma$ belongs to $\langle X \rangle$ if and only if there exists a finite sequence $\gamma_0, \ldots, \gamma_{n-1}, \gamma_n$ of elements of $\Gamma$ such that

$$\gamma_0 \in X \text{ and } \gamma_n = \gamma; \qquad (1)$$

$$\gamma_k \in \alpha(\gamma_{k-1}) \text{ for } k = 1, \dots, n. \qquad (2)$$

Finite or infinite $\Gamma$-valued sequences satisfying (2) are used repeatedly below; we therefore call them "tracks".

Hence, conditions (1) and (2) mean that an element  $\gamma \in \Gamma$  belongs to  $\langle X \rangle$  if and only if there exists a track that links some element of X and  $\gamma$ .
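The calculation schema can be sketched as a fixed-point iteration. The Python function below is a minimal illustration that assumes the part of $\Gamma$ reachable from X is finite (otherwise the loop does not terminate); the function name `generated_carrier` and the example coalgebra are hypothetical.

```python
from typing import Callable, Dict, Hashable, Iterable, Set


def generated_carrier(xs: Iterable[Hashable],
                      alpha: Callable[[Hashable], Set[Hashable]]) -> Set[Hashable]:
    """Carrier <X> of the least subcoalgebra containing xs (finite case only)."""
    v = set(xs)                                         # V_0 = X
    while True:
        # V_n = V_{n-1} united with all one-step successors of V_{n-1}
        nxt = v | {g2 for g in v for g2 in alpha(g)}
        if nxt == v:                                    # fixed point reached
            return v
        v = nxt


# Hypothetical finite powerset coalgebra given as a successor table.
succ: Dict[int, Set[int]] = {0: {1}, 1: {2}, 2: {0}, 3: {0}}
print(generated_carrier({0}, lambda g: succ[g]))
```

In this example, state 3 is not reachable from 0, so it is absent from the carrier generated by {0}, while the carrier generated by {3} contains all four states.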

## 3 Clock Constraints and Coalgebras

In this section interrelations between clock constraints and powerset coalgebras are studied.

Below we assume that some finite set of clocks  $\mathcal{C}$  has been given. Let us define the constraint-free coalgebra over a clock set  $\mathcal{C}$  as the coalgebra with the carrier  $\mathbb{N}^{\mathcal{C}}$  and the map  $\alpha: \mathbb{N}^{\mathcal{C}} \to \mathcal{P}(\mathbb{N}^{\mathcal{C}})$  defined by the formula:

$$\boldsymbol{\chi}' \in \alpha(\boldsymbol{\chi}) \text{ if and only if } 0 \le \chi'_a - \chi_a \le 1 \text{ for all } a \in \mathcal{C}.$$

It is evident that for any $\boldsymbol{\chi} \in \mathbb{N}^{\mathcal{C}}$ the value of the map $\alpha$ can be written in the form

$$\alpha(\boldsymbol{\chi}) = \boldsymbol{\chi} + \{0,1\}^{\mathcal{C}}.$$

The following statement is, in fact, a reformulation of Prop. 1, which states that a clock can tick at most once at each instant and that all evolutions are possible when no constraint is specified.

**Proposition 2.** Let  $\langle \chi(t) | t \in \mathbb{N} \rangle$  be a sequence of configurations then there exists a schedule  $\sigma$  such that  $\chi_a(t) = \chi_a^{\sigma}(t)$  for all  $t \in \mathbb{N}$  and  $a \in \mathcal{C}$  if and only if this sequence is a track in the coalgebra  $\langle \mathbb{N}^{\mathcal{C}}, \alpha \rangle$  such that  $\chi(0) = \mathbf{0}$ .

A track  $\langle \boldsymbol{\chi}(t) \mid t \in \mathbb{N} \rangle$  is called *initial* if the condition  $\boldsymbol{\chi}(0) = \mathbf{0}$  holds.
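As an illustration of the constraint-free coalgebra, the following Python sketch enumerates $\alpha(\boldsymbol{\chi}) = \boldsymbol{\chi} + \{0,1\}^{\mathcal{C}}$ and checks whether a finite sequence of configurations is a prefix of an initial track. Representing configurations as tuples indexed by a fixed clock ordering, and the names `alpha_free` and `is_initial_track`, are assumptions made for this example.

```python
from itertools import product
from typing import List, Set, Tuple

Config = Tuple[int, ...]       # one counter per clock, in a fixed clock order


def alpha_free(chi: Config) -> Set[Config]:
    """All successors of chi: every subset of clocks may tick."""
    return {tuple(x + d for x, d in zip(chi, delta))
            for delta in product((0, 1), repeat=len(chi))}


def is_initial_track(track: List[Config]) -> bool:
    """True if track starts at the zero vector and follows alpha_free."""
    if not track or any(x != 0 for x in track[0]):
        return False
    return all(nxt in alpha_free(prev) for prev, nxt in zip(track, track[1:]))


print(sorted(alpha_free((2, 0))))
print(is_initial_track([(0, 0), (1, 0), (2, 1)]))
```

With n clocks every configuration has $2^n$ successors, one per subset of clocks that may tick, including the configuration itself (no clock ticks).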

The natural and simplest way to take into account some constraints is to specify a map $\triangle: \mathbb{N}^{\mathcal{C}} \to \mathcal{P}(\{0,1\}^{\mathcal{C}})$ such that $\mathbf{0} \in \triangle(\boldsymbol{\chi})$ for any $\boldsymbol{\chi} \in \mathbb{N}^{\mathcal{C}}$ and to define

$$\alpha_{\triangle}(\boldsymbol{\chi}) = \boldsymbol{\chi} + \triangle(\boldsymbol{\chi}).$$

A map  $\triangle: \mathbb{N}^{\mathcal{C}} \to \mathcal{P}(\{0,1\}^{\mathcal{C}})$  that satisfies the condition  $\mathbf{0} \in \triangle(\chi)$  for any  $\chi \in \mathbb{N}^{\mathcal{C}}$  is called an *actuation distribution* on  $\mathcal{C}$ . The actuation distribution captures the set of sets of clocks that are allowed to tick simultaneously at one instant given a configuration.

**Definition 6.** Let $\triangle : \mathbb{N}^{\mathcal{C}} \to \mathcal{P}(\{0,1\}^{\mathcal{C}})$ be an actuation distribution and $\langle \mathbb{N}^{\mathcal{C}}, \alpha_{\triangle} \rangle$ be a coalgebra, where $\alpha_{\triangle}(\boldsymbol{\chi}) = \boldsymbol{\chi} + \triangle(\boldsymbol{\chi})$. Then an element of $\mathbb{N}^{\mathcal{C}}$ is called a $\triangle$-reachable configuration if it belongs to the carrier of the minimal subcoalgebra containing $\mathbf{0}$.

Such a set of reachable configurations is denoted below by  $R(\triangle)$ .
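As a rough illustration of Definition 6, the following sketch explores $\alpha_{\triangle}$ breadth-first from $\mathbf{0}$. Since $R(\triangle)$ is in general infinite, the exploration is truncated after a fixed number of steps. The function names `all_ticks` and `reachable`, and the encoding of configurations as integer tuples, are assumptions made for this example only.

```python
from itertools import product

def all_ticks(n):
    """All vectors of {0,1}^C for n clocks, as tuples."""
    return list(product((0, 1), repeat=n))

def reachable(delta, n, steps):
    """Configurations reachable from 0 in at most `steps` applications of alpha_delta.

    `delta(chi)` returns the set of tick vectors allowed in configuration `chi`.
    """
    zero = (0,) * n
    seen = {zero}
    frontier = {zero}
    for _ in range(steps):
        nxt = set()
        for chi in frontier:
            for tau in delta(chi):
                succ = tuple(x + t for x, t in zip(chi, tau))
                if succ not in seen:
                    seen.add(succ)
                    nxt.add(succ)
        frontier = nxt
    return seen

# Unconstrained case: delta(chi) = {0,1}^C, so alpha_delta coincides with alpha.
free = lambda chi: set(all_ticks(2))
R2 = reachable(free, 2, 2)
```

With two unconstrained clocks and two steps, `R2` contains exactly the nine configurations whose counters are at most 2.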

**Definition 7.** Let  $\triangle : \mathbb{N}^{\mathcal{C}} \to \mathcal{P}(\{0,1\}^{\mathcal{C}})$  be an actuation distribution then the coalgebra  $\langle R(\triangle), \alpha_{\triangle} \rangle$  is called the clock coalgebra associated with  $\triangle$ .

**Proposition 3.** Let $\triangle : \mathbb{N}^{\mathcal{C}} \to \mathcal{P}(\{0,1\}^{\mathcal{C}})$ be an actuation distribution. Then $\langle \chi(t) \mid t \in \mathbb{N} \rangle$ is an initial track in the coalgebra $\langle \mathbb{N}^{\mathcal{C}}, \alpha_{\triangle} \rangle$ if and only if it is an initial track in the subcoalgebra $\langle R(\triangle), \alpha_{\triangle} \rangle$.

*Proof.* This is an immediate consequence of the definition of initial tracks. $\Box$

#### 3.1 Structure of Actuation Distributions

The set $\mathcal{P}\left(\{0,1\}^{\mathcal{C}}\right)$ is finite; therefore, there exists a finite partition of the set $\mathbb{N}^{\mathcal{C}}$ such that an actuation distribution $\triangle$ is constant on each atom of the partition. One can extract the coarsest of such partitions.

Hence, if we fix some actuation distribution  $\triangle$  and denote by  $\Pi_{\triangle}$  the coarsest partition of  $\mathbb{N}^{\mathcal{C}}$  such that  $\triangle$  is constant on any atom of the partition then we can assume that each such atom can be represented by a formula of some formal arithmetical system.

If  $\Pi_{\triangle}$  has k atoms and  $\lambda_1(\chi), \ldots, \lambda_k(\chi)$  are formulae, which represent the corresponding atoms, then the following conditions hold

$$\lambda_1(\boldsymbol{\chi}) \vee \cdots \vee \lambda_k(\boldsymbol{\chi}) \equiv \mathfrak{t}, \tag{3}$$

$$\lambda_i(\boldsymbol{\chi}) \wedge \lambda_j(\boldsymbol{\chi}) \equiv \mathfrak{f} \text{ for } i \neq j \text{ and } 1 \leq i, j \leq k, \tag{4}$$

where  $\mathfrak{t}$  is true and  $\mathfrak{f}$  is false.

Further, let us denote by $\triangle_i$ the value of $\triangle(\chi)$ under the condition that $\lambda_i(\chi) = \mathfrak{t}$, where $i = 1, \ldots, k$. Then $\mathbf{0} \in \triangle_i \subset \{0, 1\}^{\mathcal{C}}$, and $\triangle_i$ can be represented by the boolean function $\delta_i(\boldsymbol{\tau})$ over a boolean vector $\boldsymbol{\tau} = \langle \tau_c \mid c \in \mathcal{C} \rangle$ determined by the following condition:

$$\delta_i(\boldsymbol{\tau}) = 1 \text{ if and only if } \boldsymbol{\tau} \in \triangle_i.$$

The condition  $\mathbf{0} \in \triangle_i$  ensures validity of the equation

$$\delta_i(\mathbf{0}) = 1. \tag{5}$$

Hence, the following proposition describes the structure of an actuation distribution.

**Proposition 4.** Each actuation distribution  $\triangle$  can be represented as a set of rules

$$\lambda_i(\boldsymbol{\chi}) \Longrightarrow \delta_i(\boldsymbol{\tau}) \text{ where } i = 1, \dots, k$$

such that formulae  $\lambda_1, \ldots, \lambda_k$  satisfy conditions (3) and (4) and boolean functions  $\delta_1, \ldots, \delta_k$  satisfy condition (5).
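The rule representation of Proposition 4 can be sketched in code as a list of pairs (guard, boolean function). The example below is only an illustration: the helper names `check_rules` and `distribution` and the two-clock rule set are hypothetical, and conditions (3) to (5) are tested on a finite grid of configurations, which is a sanity check rather than a proof.

```python
from itertools import product

def check_rules(rules, n, bound):
    """Test conditions (3)-(5) on the finite grid {0..bound}^n only (not a proof)."""
    for chi in product(range(bound + 1), repeat=n):
        holding = [guard for guard, _ in rules if guard(chi)]
        if len(holding) != 1:       # (3): some guard holds; (4): no two hold together
            return False
    zero = (0,) * n
    return all(delta_i(zero) for _, delta_i in rules)   # (5): delta_i(0) = 1

def distribution(rules, n):
    """Turn a rule list into the map chi -> set of allowed tick vectors."""
    def delta(chi):
        delta_i = next(d for guard, d in rules if guard(chi))
        return {tau for tau in product((0, 1), repeat=n) if delta_i(tau)}
    return delta

# Hypothetical two-clock rule set, chi = (chi_a, chi_b), tau = (tau_a, tau_b):
#   chi_a = chi_b   ==>  not tau_b
#   chi_a != chi_b  ==>  true
rules = [
    (lambda chi: chi[0] == chi[1], lambda tau: not tau[1]),
    (lambda chi: chi[0] != chi[1], lambda tau: True),
]
delta = distribution(rules, 2)
```

Here `delta((1, 1))` allows only the tick vectors in which the second clock stays silent, while `delta((2, 1))` allows all four vectors.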

#### 3.2 Coalgebras for Primitive Clock Constraints

In this subsection coalgebras associated with primitive clock constraints are computed. To do this we use the computational scheme presented in Sec. 2.3.

**Proposition 5 (Clock Relations).** Let $a,b \in \mathcal{C}$ and $Rel$ be a clock relation between clocks $a$ and $b$; then

**case**  $Rel = \{a \subseteq b\}$ : if  $\triangle$  is defined by the following rule

$$\mathfrak{t} \Longrightarrow \tau_a \to \tau_b$$

then  $\langle \chi(t) \mid t \in \mathbb{N} \rangle$  is a track in the coalgebra  $\langle R(\triangle), \alpha_{\triangle} \rangle$  if and only if there exists a schedule  $\sigma$  such that  $\sigma \models Rel$  and  $\chi^{\sigma}(t) = \chi(t)$  for all  $t \in \mathbb{N}$ ;

**case** $Rel = \{a \mathrel{\#} b\}$: if $\triangle$ is defined by the following rule

$$\mathfrak{t} \Longrightarrow \neg \tau_a \vee \neg \tau_b$$

then $\langle \chi(t) \mid t \in \mathbb{N} \rangle$ is a track in the coalgebra $\langle R(\triangle), \alpha_{\triangle} \rangle$ if and only if there exists a schedule $\sigma$ such that $\sigma \models Rel$ and $\chi^{\sigma}(t) = \chi(t)$ for all $t \in \mathbb{N}$;

**case** $Rel = \{a \preccurlyeq b\}$: if $\triangle$ is defined by the following set of rules

$$\{\chi_a = \chi_b \Longrightarrow \tau_b \to \tau_a\}$$

then $\langle \boldsymbol{\chi}(t) \mid t \in \mathbb{N} \rangle$ is a track in the coalgebra $\langle R(\triangle), \alpha_{\triangle} \rangle$ if and only if there exists a schedule $\sigma$ such that $\sigma \models Rel$ and $\boldsymbol{\chi}^{\sigma}(t) = \boldsymbol{\chi}(t)$ for all $t \in \mathbb{N}$;

**case** $Rel = \{a \prec b\}$: if $\triangle$ is defined by the following set of rules

$$\{\chi_a = \chi_b \Longrightarrow \neg \tau_b\}$$

then  $\langle \chi(t) \mid t \in \mathbb{N} \rangle$  is a track in the coalgebra  $\langle R(\triangle), \alpha_{\triangle} \rangle$  if and only if there exists a schedule  $\sigma$  such that  $\sigma \models Rel$  and  $\chi^{\sigma}(t) = \chi(t)$  for all  $t \in \mathbb{N}$ .

*Proof.* Let us define the function  $n: \mathbb{N}^{\mathcal{C}} \to \mathbb{N}$  in the following manner

$$n(\boldsymbol{\chi}) = \min\{m \in \mathbb{N} \mid \boldsymbol{\chi} \in V_m\},\,$$

where $\langle V_n \mid n \in \mathbb{N} \rangle$ is the series of sets defined by the computational scheme from Sec. 2.3. Using Prop. 3 and induction on $n(\chi)$, one can check that

$$R(\triangle) = \begin{cases} \{ \boldsymbol{\chi} \in \mathbb{N}^{\mathcal{C}} \mid \chi_a \leq \chi_b \}, & \text{for the case of subclocking} \\ \mathbb{N}^{\mathcal{C}}, & \text{for the case of exclusion} \\ \{ \boldsymbol{\chi} \in \mathbb{N}^{\mathcal{C}} \mid \chi_a \geq \chi_b \}, & \text{for the case of causality and precedence} \end{cases}$$

Further, checking that any track in the coalgebra  $\langle R(\triangle), \alpha_{\triangle} \rangle$  corresponds to some schedule  $\sigma$  such that  $\sigma \models Rel$  and conversely is an easy exercise.
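The reachable sets stated in the proof can be checked experimentally on a bounded window of configurations. The sketch below is not part of the proof: it restricts the exploration to the box $\{0,\ldots,B\}^{\mathcal{C}}$, which is sound because counters never decrease, and the names `reach_in_box` and `box` are assumptions of this example. Configurations are encoded as pairs $(\chi_a, \chi_b)$ and tick vectors as pairs $(\tau_a, \tau_b)$.

```python
from itertools import product

def reach_in_box(allowed, n, bound):
    """R(delta) restricted to {0..bound}^n; allowed(chi, tau) tells whether tau is in delta(chi)."""
    zero = (0,) * n
    seen, stack = {zero}, [zero]
    while stack:
        chi = stack.pop()
        for tau in product((0, 1), repeat=n):
            succ = tuple(x + t for x, t in zip(chi, tau))
            if allowed(chi, tau) and max(succ) <= bound and succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen

def box(pred, n, bound):
    """All configurations of the box that satisfy `pred`."""
    return {chi for chi in product(range(bound + 1), repeat=n) if pred(chi)}

# Rules of Proposition 5, written as predicates on (chi, tau).
subclock   = lambda chi, tau: (not tau[0]) or tau[1]                  # t => tau_a -> tau_b
exclusion  = lambda chi, tau: not (tau[0] and tau[1])                 # t => not tau_a or not tau_b
causality  = lambda chi, tau: chi[0] != chi[1] or (not tau[1]) or tau[0]
precedence = lambda chi, tau: chi[0] != chi[1] or not tau[1]
```

For instance, `reach_in_box(subclock, 2, 4)` coincides with `box(lambda c: c[0] <= c[1], 2, 4)`, in agreement with the first case of the formula above.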

**Proposition 6 (Delay).** Let $a, b \in \mathcal{C}$ and $Expr = \{b \triangleq a \$ m\}$ for some natural number $m$. If $\triangle$ is defined by the following set of rules

$$\{\chi_a < m \Longrightarrow \neg \tau_b, \ \chi_a \ge m \Longrightarrow \tau_a \leftrightarrow \tau_b\}$$

then  $\langle \boldsymbol{\chi}(t) \mid t \in \mathbb{N} \rangle$  is a track in the coalgebra  $\langle R(\triangle), \alpha_{\triangle} \rangle$  if and only if there exists a schedule  $\sigma$  such that  $\sigma \models Expr$  and  $\boldsymbol{\chi}^{\sigma}(t) = \boldsymbol{\chi}(t)$  for all  $t \in \mathbb{N}$ .

*Proof.* Acting as in the proof of the previous proposition one can establish that

$$R(\Delta) = \{ \boldsymbol{\chi} \in \mathbb{N}^{\mathcal{C}} \mid \chi_a \leq m, \ \chi_b = 0 \} \cup \{ \boldsymbol{\chi} \in \mathbb{N}^{\mathcal{C}} \mid \chi_a > m, \ \chi_b = \chi_a - m \}.$$

The rest of the reasoning is similar to that in the previous proof.
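To make the delay rules concrete, the following sketch replays a given tick sequence of clock $a$ and derives the ticks of $b$ from the two rules of Proposition 6 (which determine $\tau_b$ uniquely once $\tau_a$ and the configuration are known). The function names `delay_track` and `in_R` are hypothetical; `in_R` simply transcribes the set $R(\triangle)$ given in the proof.

```python
def delay_track(a_ticks, m):
    """Configurations (chi_a, chi_b) visited when a follows `a_ticks` and b obeys the delay rules."""
    chi_a = chi_b = 0
    track = [(chi_a, chi_b)]
    for tau_a in a_ticks:
        # chi_a < m  =>  not tau_b ;  chi_a >= m  =>  tau_a <-> tau_b
        tau_b = tau_a if chi_a >= m else 0
        chi_a += tau_a
        chi_b += tau_b
        track.append((chi_a, chi_b))
    return track

def in_R(chi, m):
    """Membership in the reachable set stated in the proof of Proposition 6."""
    a, b = chi
    return (a <= m and b == 0) or (a > m and b == a - m)

track = delay_track([1, 0, 1, 1, 1], 2)
```

With $m = 2$, the track above is `[(0, 0), (1, 0), (1, 0), (2, 0), (3, 1), (4, 2)]`: clock $b$ starts ticking only after $a$ has ticked twice, and every visited configuration lies in $R(\triangle)$.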

**Proposition 7 (Binary Clock Definitions).** Let $a, b, c \in \mathcal{C}$ and $Expr$ be a binary definition of clock $c$ using clocks $a$ and $b$; then

**case** $Expr = \{c \triangleq a + b\}$: if $\triangle$ is defined by the following rule

$$\mathfrak{t} \Longrightarrow \tau_c \leftrightarrow \tau_a \vee \tau_b$$

then $\langle \boldsymbol{\chi}(t) \mid t \in \mathbb{N} \rangle$ is a track in the coalgebra $\langle R(\triangle), \alpha_{\triangle} \rangle$ if and only if there exists a schedule $\sigma$ such that $\sigma \models Expr$ and $\boldsymbol{\chi}^{\sigma}(t) = \boldsymbol{\chi}(t)$ for all $t \in \mathbb{N}$;

**case** $Expr = \{c \triangleq a * b\}$: if $\triangle$ is defined by the following rule

$$\mathfrak{t} \Longrightarrow \tau_c \leftrightarrow \tau_a \wedge \tau_b$$

then $\langle \boldsymbol{\chi}(t) \mid t \in \mathbb{N} \rangle$ is a track in the coalgebra $\langle R(\triangle), \alpha_{\triangle} \rangle$ if and only if there exists a schedule $\sigma$ such that $\sigma \models Expr$ and $\boldsymbol{\chi}^{\sigma}(t) = \boldsymbol{\chi}(t)$ for all $t \in \mathbb{N}$;

**case** $Expr = \{c \triangleq a \wedge b\}$: if $\triangle$ is defined by the following set of rules

$$\{\chi_a < \chi_b \Longrightarrow \tau_c \leftrightarrow \tau_b, \ \chi_a = \chi_b \Longrightarrow \tau_c \leftrightarrow \tau_a \lor \tau_b, \ \chi_a > \chi_b \Longrightarrow \tau_c \leftrightarrow \tau_a\}$$

then $\langle \boldsymbol{\chi}(t) \mid t \in \mathbb{N} \rangle$ is a track in the coalgebra $\langle R(\triangle), \alpha_{\triangle} \rangle$ if and only if there exists a schedule $\sigma$ such that $\sigma \models Expr$ and $\boldsymbol{\chi}^{\sigma}(t) = \boldsymbol{\chi}(t)$ for all $t \in \mathbb{N}$;

**case** $Expr = \{c \triangleq a \vee b\}$: if $\triangle$ is defined by the following set of rules

$$\{\chi_a < \chi_b \Longrightarrow \tau_c \leftrightarrow \tau_a, \ \chi_a = \chi_b \Longrightarrow \tau_c \leftrightarrow \tau_a \land \tau_b, \ \chi_a > \chi_b \Longrightarrow \tau_c \leftrightarrow \tau_b\}$$

then  $\langle \chi(t) | t \in \mathbb{N} \rangle$  is a track in the coalgebra  $\langle R(\triangle), \alpha_{\triangle} \rangle$  if and only if there exists a schedule  $\sigma$  such that  $\sigma \models Expr$  and  $\chi^{\sigma}(t) = \chi(t)$  for all  $t \in \mathbb{N}$ .

*Proof.* Acting as in the proof of Prop. 5 one can establish that

$$R(\triangle) = \begin{cases}
\{ \boldsymbol{\chi} \in \mathbb{N}^{\mathcal{C}} \mid \chi_b \leq \chi_a \leq \chi_c \leq \chi_a + \chi_b \} \cup \{ \boldsymbol{\chi} \in \mathbb{N}^{\mathcal{C}} \mid \chi_a < \chi_b \leq \chi_c \leq \chi_a + \chi_b \}, & \text{in the case of union} \\
\{ \boldsymbol{\chi} \in \mathbb{N}^{\mathcal{C}} \mid \chi_c \leq \chi_a \leq \chi_b \} \cup \{ \boldsymbol{\chi} \in \mathbb{N}^{\mathcal{C}} \mid \chi_c \leq \chi_b \leq \chi_a \}, & \text{in the case of intersection} \\
\{ \boldsymbol{\chi} \in \mathbb{N}^{\mathcal{C}} \mid \chi_a \leq \chi_b, \chi_c = \chi_b \} \cup \{ \boldsymbol{\chi} \in \mathbb{N}^{\mathcal{C}} \mid \chi_a > \chi_b, \chi_c = \chi_a \}, & \text{in the case of infimum} \\
\{ \boldsymbol{\chi} \in \mathbb{N}^{\mathcal{C}} \mid \chi_a \leq \chi_b, \chi_c = \chi_a \} \cup \{ \boldsymbol{\chi} \in \mathbb{N}^{\mathcal{C}} \mid \chi_a > \chi_b, \chi_c = \chi_b \}, & \text{in the case of supremum}
\end{cases}$$

The rest of the reasoning is the same as in the previous propositions.
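The counter relations behind these reachable sets can be observed on randomly generated tracks. In the sketch below, the ticks of $a$ and $b$ are drawn at random and the tick of $c$ is the one forced by the corresponding rule set of Proposition 7. The names `tick_c` and `random_track`, as well as the labels `"third"` and `"fourth"` for the last two rule sets of the proposition, are assumptions of this example.

```python
import random

def tick_c(kind, chi, tau_a, tau_b):
    """Tick of c forced by the rules of Proposition 7 in configuration chi = (a, b, c)."""
    a, b, _ = chi
    if kind == "union":
        return tau_a | tau_b
    if kind == "intersection":
        return tau_a & tau_b
    if kind == "third":    # chi_a < chi_b: follow b; equal: a or b; chi_a > chi_b: follow a
        return tau_b if a < b else tau_a if a > b else tau_a | tau_b
    if kind == "fourth":   # chi_a < chi_b: follow a; equal: a and b; chi_a > chi_b: follow b
        return tau_a if a < b else tau_b if a > b else tau_a & tau_b
    raise ValueError(kind)

def random_track(kind, steps, seed=0):
    """Initial track of (chi_a, chi_b, chi_c) under random ticks of a and b."""
    rng = random.Random(seed)
    chi = (0, 0, 0)
    track = [chi]
    for _ in range(steps):
        tau_a, tau_b = rng.randint(0, 1), rng.randint(0, 1)
        tau_c = tick_c(kind, chi, tau_a, tau_b)
        chi = (chi[0] + tau_a, chi[1] + tau_b, chi[2] + tau_c)
        track.append(chi)
    return track
```

Along such tracks one observes $\max(\chi_a, \chi_b) \le \chi_c \le \chi_a + \chi_b$ for union, $\chi_c \le \min(\chi_a, \chi_b)$ for intersection, $\chi_c = \max(\chi_a, \chi_b)$ for the third rule set, and $\chi_c = \min(\chi_a, \chi_b)$ for the fourth.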

# 4 Stationary Clock Constraints

Actuation distributions of some clock constraints do not depend on the current configuration, as one can see from the previous considerations. This motivates the following definition.

**Definition 8.** An actuation distribution $\triangle : \mathbb{N}^{\mathcal{C}} \to \mathcal{P}(\{0,1\}^{\mathcal{C}})$ is called stationary if the map $\triangle$ is a constant map.

Some primitive clock constraints, such as subclocking, exclusion, union and intersection, represent stationary actuation distributions. Therefore the question whether any stationary actuation distribution is represented by a set of stationary primitive clock constraints is interesting.

The following proposition demonstrates that it is true for 2-clock systems.

**Proposition 8.** Let $\mathcal{C} = \{a, b\}$; then any corresponding stationary actuation distribution can be expressed by a set of subclocking and exclusion relations.

*Proof.* Let us represent a vector $\tau \in \{0,1\}^{\mathcal{C}}$ as $\tau = (\tau_a, \tau_b)$; then all possible stationary actuation distributions for $\mathcal{C}$ can be listed in the following manner.

Hence, we have built all the possible cases.
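The case analysis can be reproduced mechanically. The sketch below enumerates the eight subsets of $\{0,1\}^{\{a,b\}}$ that contain $\mathbf{0}$ and searches, for each of them, a set of subclocking and exclusion relations whose common solutions are exactly that subset. It is an independent check written for this purpose, not the authors' table; the relation labels and function names are hypothetical.

```python
from itertools import combinations, product

VECTORS = list(product((0, 1), repeat=2))          # tau = (tau_a, tau_b)

# Stationary primitive relations on two clocks, as predicates on tau.
RELATIONS = {
    "a sub b": lambda t: (not t[0]) or t[1],
    "b sub a": lambda t: (not t[1]) or t[0],
    "a excl b": lambda t: not (t[0] and t[1]),
}

def denoted(names):
    """Tick vectors allowed by all relations in `names`."""
    return frozenset(t for t in VECTORS if all(RELATIONS[n](t) for n in names))

def stationary_distributions():
    """All subsets of {0,1}^2 that contain the zero vector."""
    rest = [t for t in VECTORS if t != (0, 0)]
    for k in range(len(rest) + 1):
        for extra in combinations(rest, k):
            yield frozenset(((0, 0),) + extra)

def cover():
    """Map each expressible distribution to one set of relations denoting it."""
    table = {}
    names = list(RELATIONS)
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            table.setdefault(denoted(subset), subset)
    return table

table = cover()
```

Every distribution produced by `stationary_distributions()` appears as a key of `table`; for example, $\{(0,0),(1,0),(0,1)\}$ is denoted by the single exclusion relation, and the full set $\{0,1\}^{\mathcal{C}}$ by the empty set of relations.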

However, the two-clock case is, in fact, the only one in which stationary primitive clock constraints are complete for representing stationary actuation distributions.

**Theorem 1.** If $|\mathcal{C}| > 2$ then there exists at least one stationary actuation distribution that cannot be represented as a set of stationary primitive clock constraints.

*Proof.* Analyzing definitions collected in Sec. 3.2 one can see that subclocking relations, exclusion relations, union definitions, and intersection definitions form an exhaustive list of stationary primitive clock constraints.

Further suppose that  $|\mathcal{C}| = n$  and n > 2.

Firstly, note that  $\triangle$  for a stationary primitive clock relation contains at most  $3 \cdot 2^{n-2}$  vectors.

Secondly, let us calculate the number of vectors that can belong to $\triangle$ considering only either union or intersection definitions when $n=3$. To do so, let us denote $\mathcal{C}=\{a,b,c\}$; then actuation vectors $\boldsymbol{\tau}$ can be represented as $\boldsymbol{\tau}=(\tau_a,\tau_b,\tau_c)$ and all possible actuation distributions for stationary definitions are listed as follows

$$c \triangleq a + b \qquad \triangle = \{(0,0,0), (1,0,1), (0,1,1), (1,1,1)\}$$
  
$$c \triangleq a * b \qquad \triangle = \{(0,0,0), (1,0,0), (0,1,0), (1,1,1)\}$$

Using this fact one can claim that for  $n \geq 3$ ,  $\triangle$  for stationary clock definitions contains at most  $4 \cdot 2^{n-3} = 2^{n-1}$  vectors.

Thirdly,  $\triangle$  for any clock constraint represented by a set of stationary clock relations and clock definitions is equal to the intersection of the corresponding  $\triangle$ -s, thus the studied  $\triangle$  contains at most  $3 \cdot 2^{n-2}$  vectors.

Further, $\triangle$ for arbitrary stationary constraints can contain from one to $2^n-1$ vectors. Hence, all $\triangle$-s that contain from $3 \cdot 2^{n-2} + 1$ to $2^n - 1$ vectors cannot be represented by a set of clock relations and/or clock definitions. To demonstrate that such constraints exist, let us calculate the number of elements in the set

$$\{k \in \mathbb{N} \mid 3 \cdot 2^{n-2} + 1 \le k \le 2^n - 1\}.$$

One can easily check that this set has $2^{n-2} - 1$ elements, so it is nonempty for $n \ge 3$; in particular, it always contains $2^n - 1$.

**Example 1.** If $\triangle = \{ \tau \in \{0,1\}^{\mathcal{C}} \mid (\exists c \in \mathcal{C}) \ \tau_c = 0 \}$ then it cannot be represented as a set of stationary clock relations and clock definitions.

Intuitively, this example shows that one cannot explicitly prevent one specific clock from ticking with classical CCSL. This is a pathological example that illustrates the incompleteness of CCSL. By relying on a co-algebra, we extend the family of constraints that can be built and we can also highlight what cannot be done with CCSL.
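For $n = 3$, the counting argument and Example 1 can be confirmed by exhaustive search. The sketch below builds the fifteen stationary primitive constraints over three clocks (six subclockings, three exclusions, three unions, three intersections), intersects every subset of them, and checks that the distribution of Example 1 is never obtained. The encoding of clocks as indices 0, 1, 2 and all function names are assumptions of this example.

```python
from itertools import combinations, permutations, product

N = 3
VECTORS = list(product((0, 1), repeat=N))

def primitives():
    """Stationary primitive constraints on clocks 0..N-1, as predicates on tau."""
    prims = []
    for a, b in permutations(range(N), 2):
        prims.append(lambda t, a=a, b=b: (not t[a]) or t[b])              # a subclock of b
    for a, b in combinations(range(N), 2):
        prims.append(lambda t, a=a, b=b: not (t[a] and t[b]))             # a excludes b
        for c in set(range(N)) - {a, b}:
            prims.append(lambda t, a=a, b=b, c=c: t[c] == (t[a] | t[b]))  # c = a + b
            prims.append(lambda t, a=a, b=b, c=c: t[c] == (t[a] & t[b]))  # c = a * b
    return prims

def expressible():
    """All distributions obtained by intersecting some set of primitives."""
    sets = [frozenset(t for t in VECTORS if p(t)) for p in primitives()]
    found = set()
    for mask in range(1 << len(sets)):
        cur = frozenset(VECTORS)
        for i, s in enumerate(sets):
            if mask >> i & 1:
                cur &= s
        found.add(cur)
    return found

def gap(n):
    """Sizes k with 3 * 2^(n-2) + 1 <= k <= 2^n - 1."""
    return range(3 * 2 ** (n - 2) + 1, 2 ** n)

# Distribution of Example 1: at least one clock does not tick.
TARGET = frozenset(t for t in VECTORS if 0 in t)
```

`TARGET` has seven vectors, whereas each primitive constraint alone allows at most six, so `TARGET` is not in `expressible()`; likewise `gap(3)` contains the single value 7.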

## 5 Related work

CCSL operational semantics is inspired by the approach proposed by G. Plotkin for defining the operational semantics of software systems [14]. In [15] it is proposed to use the concept of "universal coalgebra" for studying Plotkin's semantic model. This is the main inspiration for our work.

Co-inductive structures have already been used in the context of logic programming for modeling complex real-time systems (see for instance, [7]). However, they were mainly used as a way to handle infinite structures, where infinity came from the dense nature of time and of its continuous evolution, such as in timed pushdown automata [5]. We use them here to handle systems that are discrete (but still infinite) in nature and to prove that CCSL is not rich enough to capture all that could be built by the coalgebraic structure. We then propose to build a generalized and complete constraint language as an extension of CCSL.

Transformation-based approaches have been proposed for mapping CCSL, or a subset of it, into different semantic domains such as VHDL, Petri nets, and Promela. André et al. [2] have presented an automatic transformation of a CCSL specification into VHDL code. The proposed transformation assembles instances of pre-built VHDL components while preserving the polychronous semantics of CCSL. The generated code can be integrated in the VHDL design and verification flow. Mallet and André have proposed a formal semantics for a kernel subset of CCSL, and presented an equivalent interpretation of the kernel in two different formal languages, namely Signal and Time Petri nets [10]. In their work, relevant examples have been used to show instances where Petri nets are suitable to express CCSL constraints, as well as instances where synchronous languages are more appropriate. Our contribution is very different in nature, since rather than restricting the scope of CCSL to allow verification, we here attempt to generalize the language and find a wider semantic domain that still brings useful information about the constraint system.

### 6 Conclusion

In this paper we have presented a new semantic domain used to study the expressiveness of the MARTE CCSL constraint language. CCSL constraints are encoded using a clock coalgebra that is later used to identify what can be expressed in CCSL and, more importantly, what cannot be expressed.

In Subsection 3.1, a coalgebraic structure of clocks has been studied. The obtained results show that computability of constraint preconditions is necessary for verifying the validity of a schedule for a constraint. Taking into account that any precondition is a predicate of natural variables the question about a choice of formal arithmetic system for specifying such a precondition arises.

Besides, this new structure defines a class of constraints covering classical CCSL constraints. This result is given in Subsection 3.2. Theorem 1 shows that the newly defined class of clock constraints is strictly larger than the class of CCSL constraints. This semantic domain therefore defines a class of generalized clock constraints. We then study the relationships between this generalized class and the classical class of clock constraints.

Using a coalgebraic structure to capture clock constraints is a first step to allow for the bisimulation of CCSL specifications. This is important since CCSL was meant to provide a reference semantic domain for the MARTE time model. Such MARTE/CCSL models are then bound to be transformed into other formal modeling languages amenable to analysis. Bisimulation would then provide support for verifying the correctness of the transformation.

#### References

- 1. André, C.: Syntax and semantics of the Clock Constraint Specification Language (CCSL). Research Report 6925, INRIA (May 2009), http://hal.inria.fr/inria-00384077/
- 2. André, C., Mallet, F., DeAntoni, J.: VHDL observers for clock constraint checking. In: Industrial Embedded Systems (SIES), 2010 Int. Symp. on. pp. 98–107. IEEE, Trento, Italy (July 2010)
- 3. André, C., Mallet, F., de Simone, R.: Modeling time(s). In: 10th Int. Conf. on Model Driven Engineering Languages and Systems (MODELS '07). pp. 559–573. No. 4735 in LNCS, ACM-IEEE, Springer, Nashville, TN, USA (September 2007)
- 4. Benveniste, A., Caspi, P., Edwards, S.A., Halbwachs, N., Le Guernic, P., de Simone, R.: The synchronous languages 12 years later. Proceedings of the IEEE 91(1), 64–83 (January 2003)
- 5. Dang, Z.: Binary reachability analysis of pushdown timed automata with dense clocks. In: Computer Aided Verification, 13th Int. Conf., CAV 2001. pp. 506–518 (2001)
- 6. DeAntoni, J., Mallet, F.: TimeSquare: Treat your models with logical time. In: Furia, C.A., Nanz, S. (eds.) TOOLS (50). Lecture Notes in Computer Science, vol. 7304, pp. 34–41. Springer (2012)
- 7. Gupta, G., Saeedloei, N., DeVries, B.W., Min, R., Marple, K., Kluzniak, F.: Infinite computation, co-induction and computational logic. In: Algebra and Coalgebra in Computer Science - 4th Int. Conf., CALCO 2011. pp. 40–54 (2011)
- 8. Lamport, L.: Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21(7), 558–565 (1978)
- 9. Mallet, F.: Logical Time @ Work for the Modeling and Analysis of Embedded Systems. LAMBERT Academic Publishing (January 2011), ISBN: 978-3-8433-9388-1
- 10. Mallet, F., André, C.: On the semantics of UML/Marte clock constraints. In: Int. Symp. on Object/component/service-oriented Real-time distributed Computing (ISORC'09). pp. 305–312. IEEE Computer Press, Japan, Tokyo (March 2009)
- 11. Mallet, F., Millo, J.V., de Simone, R.: Safe CCSL specifications and marked graphs. In: 11th ACM/IEEE Int. Conf. on Formal Methods and Models for Codesign. pp. 157–166. IEEE (2013)
- 12. Nielsen, M., Plotkin, G.D., Winskel, G.: Petri nets, event structures and domains. In: Kahn, G. (ed.) Semantics of Concurrent Computation, Proceedings of the International Symposium, Evian, France, July 2-4, 1979. Lecture Notes in Computer Science, vol. 70, pp. 266–284. Springer (1979), http://dx.doi.org/10.1007/BFb0022474
- 13. OMG: UML Profile for MARTE, v1.0. Object Management Group (November 2009), formal/2009-11-02
- 14. Plotkin, G.D.: A structural approach to operational semantics. J. Log. Algebr. Program. 60-61, 17–139 (2004)
- 15. Rutten, J.J.M.M.: Universal coalgebra: a theory of systems. Theor. Comput. Sci. 249(1), 3–80 (2000), http://dx.doi.org/10.1016/S0304-3975(00)00056-6
- 16. Tarski, A.: A lattice-theoretical fixpoint theorem and its applications. Pacific Journal of Mathematics 5(2), 285–309 (1955), http://projecteuclid.org/euclid.pjm/1103044538
- 17. Zholtkevych, G., Mallet, F., Zaretska, I., Zholtkevych, G.: Two semantic models for clock relations in the clock constraint specification language. In: Communications in Computer and Information Science, vol. 412, pp. 190–209. Springer (2013)

# Modeling Hybrid Systems in Hy-tccp

Damián Adalid, María del Mar Gallardo, Laura Titolo

[damian,gallardo,laura.titolo]@lcc.uma.es

Dept. Lenguajes y Ciencias de la Computación, E.T.S.I. Informática, University of Málaga\*

**Abstract.** Concurrent, reactive and hybrid systems require quality modeling languages to be described and analyzed. The Timed Concurrent Constraint Language (tccp) was introduced as a simple but powerful model for reactive systems. In this paper, we present *hybrid tccp* (Hy-tccp), an extension of tccp over continuous time which includes new constructs to model the continuous dynamics of hybrid systems.

#### 1 Introduction

Concurrent, reactive and hybrid systems have had a wide diffusion and they have become essential to an increasingly large number of applications. Often, systems of these kinds are safety critical, i.e., an error in the software can have tragic consequences. In the case of hybrid systems, the modeling and the analysis phases are particularly hard due to the combination of discrete and continuous dynamics and the presence of real variables. Many formalisms have been developed to describe concurrent systems. One of these is the *Concurrent Constraint paradigm* (ccp) [8]. It differs from other paradigms mainly due to the notion of store-as-constraint that replaces the classical store-as-valuation model. In this paradigm, the agents running in parallel communicate by means of a global constraint store. The *Timed Concurrent Constraint Language* [2] (tccp for short) is a concurrent logic language obtained by extending ccp with the notion of time and a suitable mechanism to model time-outs and preemptions.

In this paper, we present Hy-tccp, an extension of tccp over continuous time. The declarative nature of Hy-tccp facilitates a high-level description of hybrid systems in the style of hybrid automata [7]. Furthermore, its logical nature eases the development of semantics-based program manipulation tools for hybrid systems (verifiers, analyzers, debuggers...). Parallel composition of hybrid automata is naturally supported in Hy-tccp due to the existence of a global shared store and to the synchronization mechanism.

The paper is organized as follows. In Section 2, we briefly introduce the language tccp and we show how we have extended it to obtain Hy-tccp. Section 3 contains an example to highlight the expressive power of our language. Finally, Section 4 concludes the paper and presents some related work.

<sup>\*</sup> This work has been supported by the Andalusian Excellence Project P11-TIC7659 and the Spanish Ministry of Economy and Competitiveness project TIN2012-35669

## 2 Hy-tccp: a hybrid extension of tccp

The Timed Concurrent Constraint Language (tccp, [2]) is a time extension of ccp suitable for describing concurrent and reactive systems. The computation in tccp proceeds as the concurrent execution of several agents that can monotonically add information to a global constraint store, or query information from it. tccp is parametric w.r.t. a cylindric constraint system which handles the information on system variables. Briefly, a cylindric constraint system is a structure $\mathbf{C} = \langle \mathcal{C}, \vdash, \land, false, true, Var, \exists \rangle$ composed of a set of constraints $\mathcal{C}$ ordered by the entailment relation $\vdash$ (intuitively, $c \vdash d$ if $c$ contains more information than $d$), where $\land$ is a binary operator that merges the information from two constraints; $false$ and $true$ are, respectively, the greatest and the least element of $\mathcal{C}$; $Var$ is a denumerable set of variables and $\exists$ existentially quantifies variables over constraints. The syntax of agents is given by the grammar:

$$A \coloneqq \mathsf{stop} \mid \mathsf{tell}(c) \mid A \parallel A \mid \exists x \, A \mid \sum_{i=1}^n \mathsf{ask}(c_i) \to A_i \mid \mathsf{now} \ c \ \mathsf{then} \ A \ \mathsf{else} \ A \mid p(\bar{x})$$

where $c, c_1, \ldots, c_n$ are finite constraints in $\mathcal{C}$, $\bar{x} \in Var \times \cdots \times Var$ and $p$ is a predicate symbol. A tccp program is a pair $D.A$, where $A$ is the initial agent and $D$ is a set of process declarations of the form $p(\bar{x}):-A$. The notion of time is introduced by defining a discrete global clock. The operational semantics of tccp [2] is described by a transition system $T = (Conf, \rightarrow)$. Configurations in $Conf$ are pairs $\langle A, c \rangle$ representing the agent $A$ to be executed in the current global store $c$. The transition relation $\rightarrow \subseteq Conf \times Conf$ is the least relation satisfying the rules in Figure 1. As can be seen from the rules, the stop agent represents the successful termination of the computation. The tell($c$) agent adds the constraint $c$ to the current store. The choice agent $\sum_{i=1}^{n} \mathsf{ask}(c_i) \to A_i$ non-deterministically executes one of the agents $A_i$ whose corresponding guard $c_i$ is entailed by the store; otherwise, if no guard is entailed by the store, the agent suspends. The conditional agent now $c$ then $A$ else $B$ behaves like $A$ (respectively $B$) if $c$ is (respectively is not) entailed by the store. $A \parallel B$ models the parallel composition of $A$ and $B$ in terms of maximal parallelism, i.e., all the enabled agents of $A$ and $B$ are executed at the same time. The agent $\exists x A$ makes variable $x$ local to $A$. Finally, the agent $p(\bar{x})$ takes from $D$ a declaration of the form $p(\bar{x}):-A$ and then executes $A$.
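The behavior of the agents described above can be imitated by a very small interpreter. The sketch below is a simplification under several assumptions that are not in the paper: constraints are finite sets of atomic tokens, entailment is set inclusion, conjunction is set union, the choice agent deterministically takes the first enabled branch, and the hiding agent and procedure calls are left out. Agents are encoded as tagged tuples; all names are hypothetical.

```python
def entails(d, c):
    """d |- c for constraints represented as frozensets of atomic tokens."""
    return c <= d

def step(agent, d):
    """One transition <agent, d> -> <agent', d'>, or None if the agent cannot move."""
    kind = agent[0]
    if kind == "stop":
        return None
    if kind == "tell":                       # <tell(c), d> -> <stop, c /\ d>
        return ("stop",), d | agent[1]
    if kind == "choice":                     # first branch whose guard is entailed
        for guard, branch in agent[1]:
            if entails(d, guard):
                return branch, d
        return None                          # no guard entailed: the agent suspends
    if kind == "now":                        # now c then a else b
        _, c, a, b = agent
        chosen = a if entails(d, c) else b
        return step(chosen, d) or (chosen, d)
    if kind == "par":                        # maximal parallelism
        _, a, b = agent
        ra, rb = step(a, d), step(b, d)
        if ra and rb:
            return ("par", ra[0], rb[0]), ra[1] | rb[1]
        if ra:
            return ("par", ra[0], b), ra[1]
        if rb:
            return ("par", a, rb[0]), rb[1]
        return None
    raise ValueError(kind)

def run(agent, d, max_steps=100):
    """Iterate `step` until no transition is possible; returns (agent, store, steps)."""
    steps = 0
    while steps < max_steps:
        result = step(agent, d)
        if result is None:
            break
        agent, d = result
        steps += 1
    return agent, d, steps

# A producer telling x in parallel with a consumer waiting for x before telling y.
program = ("par",
           ("tell", frozenset({"x"})),
           ("choice", [(frozenset({"x"}), ("tell", frozenset({"y"})))]))
final_agent, final_store, n_steps = run(program, frozenset())
```

In this run the consumer is suspended during the first time instant, takes its branch in the second, and tells `y` in the third, so the computation stops after three steps with the store containing both `x` and `y`.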

$$\frac{}{\langle \mathsf{tell}(c), d \rangle \to \langle \mathsf{stop}, c \wedge d \rangle} \qquad \frac{}{\langle \sum_{i=1}^{n} \mathsf{ask}(c_i) \to A_i, d \rangle \to \langle A_j, d \rangle}$$

$$\frac{\langle A, d \rangle \to \langle A', d' \rangle \quad d \vdash c}{\langle \mathsf{now}\ c\ \mathsf{then}\ A\ \mathsf{else}\ B, d \rangle \to \langle A', d' \rangle} \qquad \frac{\langle A, d \rangle \not\to \quad d \vdash c}{\langle \mathsf{now}\ c\ \mathsf{then}\ A\ \mathsf{else}\ B, d \rangle \to \langle A, d \rangle}$$

$$\frac{\langle B, d \rangle \to \langle B', d' \rangle \quad d \not\vdash c}{\langle \mathsf{now}\ c\ \mathsf{then}\ A\ \mathsf{else}\ B, d \rangle \to \langle B', d' \rangle} \qquad \frac{\langle B, d \rangle \not\to \quad d \not\vdash c}{\langle \mathsf{now}\ c\ \mathsf{then}\ A\ \mathsf{else}\ B, d \rangle \to \langle B, d \rangle}$$

$$\frac{\langle A, d \rangle \to \langle A', d' \rangle \quad \langle B, d \rangle \to \langle B', c' \rangle}{\langle A \parallel B, d \rangle \to \langle A' \parallel B', d' \wedge c' \rangle} \qquad \frac{\langle A, d \rangle \to \langle A', d' \rangle \quad \langle B, d \rangle \not\to}{\langle A \parallel B, d \rangle \to \langle A' \parallel B, d' \rangle}$$

$$\frac{\langle A, l \wedge \exists_x d \rangle \to \langle B, l' \rangle}{\langle \exists^l x A, d \rangle \to \langle \exists^{l'} x B, d \wedge \exists_x l' \rangle} \qquad \frac{p(\bar{x}) \mathrel{:-} A \in D}{\langle p(\bar{x}), d \rangle \to \langle A, d \rangle}$$

**Fig. 1.** The transition system for tccp.

<sup>1</sup> The auxiliary agent $\exists^l x A$ makes explicit the local store $l$ of $A$. This auxiliary agent is linked to the principal hiding construct by setting the initial local store to true, thus $\exists x A := \exists^{true} x A$.

We introduce the language Hy-tccp, which subsumes tccp and includes new agents to model hybrid systems in the style of hybrid automata. Hybrid automata [7] are an extension of finite-state automata. Intuitively, their discrete behavior is defined by means of a finite set of discrete states (called locations) and a set of (instantaneous) discrete transitions from one location to another. The continuous behavior of hybrid automata is described at each location by some Ordinary Differential Equations (ODEs) which describe how continuous variables evolve over time (continuous transitions). Each location is associated with an *invariant* predicate which constrains the value of the continuous variables at that location and with an *initial* predicate that establishes their possible initial values. Discrete transitions are associated with a *jump* predicate that may include a *guard* and a *reset* predicate which updates the value and/or the flow of continuous variables.

Hy-tccp uses a tccp monotonic store (called discrete store) to model the information about the current location and the associated invariants. Discrete transitions of hybrid automata are modeled as instantaneous transitions in Hytccp and they are used to synchronize parallel agents. We distinguish the set of discrete variables Var, whose information is accumulated monotonically, and the set of continuous variables  $\widetilde{Var}$ , whose values change continuously over time  $(Var \cap Var = \varnothing)$ . Constraints in  $\mathcal{C}$  are now defined over  $Var \cup Var$ . The tccpstore is extended by adding a component called *continuous store*. The continuous store is not monotonic, instead it records the dynamical evolution of the continuous variables. Thus, a Hy-tccp store is a pair  $\langle c, \tilde{c} \rangle$  where c (discrete store) is a monotonic constraint store as in tccp and  $\tilde{c}$  (continuous store) is a function that associates a continuous variable with its current value and its flow<sup>2</sup>, which indicates how its value changes over time by means of an ODE. Given a continuous store  $\tilde{c}$  and a continuous variable x,  $\tilde{c}(x) = \langle v, f \rangle$  means that x has value v and flow f. Given  $\tau \in \mathbb{R}_{>0}$  and  $inv \in \mathcal{C}$  we denote as  $\langle c, \tilde{c} \rangle \rightsquigarrow_{\tau}^{inv} \langle c, \tilde{c}_{\tau} \rangle$  the projection of the store  $\langle c, \tilde{c} \rangle$  at time  $\tau$  satisfying inv. The value of the variables are updated at time  $\tau$ , while the flows are unchanged. In order to model behaviors typical of hybrid systems we introduce two new agents w.r.t. tccp: change and ask. The agent change updates the value and/or the flow of a given continuous variable (reset predicate of hybrid automata). 
The tccp choice agent is extended by allowing the non-deterministic choice between discrete and continuous transitions in the following way:  $\sum_{i=1}^{n} \mathsf{ask}(c_i) \to A_i + \sum_{j=1}^{m} \widetilde{\mathsf{ask}}(inv_j)$  where

<sup>2</sup> In this paper, we assume that continuous variables evolve independently from each other. Given $x \in \widetilde{Var}$, its flow is defined as a predicate on the set $\{x, \dot{x}\}$ where $\dot{x}$ denotes the first-order derivative of $x$ (e.g. $\dot{x} = 2$ or $\dot{x} = 2x$).

$$\frac{}{\langle \mathsf{change}(x,v,f),\,d,\,\tilde{d}\rangle \to_{\sigma} \langle \mathsf{stop},\,d,\,\tilde{d} \triangleleft (x \mapsto (v,f))\rangle} \tag{R1}$$

$$\frac{\exists \ 1 \le k \le m, \ \tau \in \mathbb{R}^+. \ \langle d, \tilde{d} \rangle \leadsto_{\tau}^{inv_k} \langle d, \tilde{d}_{\tau} \rangle}{\langle \sum_{i=1}^n \operatorname{ask}(c_i) \to A_i + \sum_{j=1}^m \widetilde{\operatorname{ask}}(inv_j), \ d, \tilde{d} \rangle \to_{\tau} \langle \sum_{i=1}^n \operatorname{ask}(c_i) \to A_i + \sum_{j=1}^m \widetilde{\operatorname{ask}}(inv_j), \ d, \tilde{d}_{\tau} \rangle} \tag{R2}$$

$$\frac{\langle A, d, \tilde{d} \rangle \rightarrow_{\tau} \langle A, d, \tilde{d}' \rangle \quad \langle B, d, \tilde{d} \rangle \rightarrow_{\tau} \langle B, d, \tilde{d}' \rangle}{\langle A \parallel B, d, \tilde{d} \rangle \rightarrow_{\tau} \langle A \parallel B, d, \tilde{d}' \rangle} \tag{R3}$$

$$\frac{\langle A, d, \tilde{d} \rangle \rightarrow_{\sigma} \langle A', d', \tilde{d}' \rangle \quad \langle B, d, \tilde{d} \rangle \rightarrow_{\tau} \langle B, d, \tilde{d}'' \rangle}{\langle A \parallel B, d, \tilde{d} \rangle \rightarrow_{\sigma} \langle A' \parallel B, d', \tilde{d}' \rangle} \tag{R4}$$

Fig. 2. The transition system for Hy-tccp.

 $n \ge 0$ and $m \ge 0$. Here, the $\widetilde{\mathsf{ask}}$ branches can be non-deterministically selected in case the invariant $inv_j$ is entailed by the current store. This corresponds to the passage of continuous time in a hybrid automaton location. The continuous variables evolve over *continuous* time while $inv_j$ holds and until another branch is selected. The operational semantics of Hy-tccp is described by a transition system $T = (\widetilde{Conf}, \rightarrow_{\sigma}, \rightarrow_{\tau})$. Configurations in $\widetilde{Conf}$ are triples $\langle A, c, \tilde{c} \rangle$ representing the agent $A$ to be executed in the current extended store $\langle c, \tilde{c} \rangle$. The transition relation $\rightarrow_{\sigma} \subseteq \widetilde{Conf} \times \widetilde{Conf}$ represents a tccp discrete transition whose execution is instantaneous, while $\rightarrow_{\tau} \subseteq \widetilde{Conf} \times \widetilde{Conf}$ models a continuous transition of duration $\tau$. In Figure 2 we describe the rules that we have added to the operational semantics of tccp in order to deal with continuous time and variables. In Rule R1 the agent change uses the operator $\triangleleft$ that, given a continuous store $\tilde{c}$ and a triple $(x, v, f)$, updates $\tilde{c}$ with a new initial value $v$ and a new flow $f$ for the variable $x$. In Rule R2 time passes continuously while one of the $\widetilde{\mathsf{ask}}$ invariants holds in the store, and the values of the continuous variables change over time following their flows. Rule R3 represents the parallel execution of two continuous transitions; note that their durations must coincide. Rule R4 expresses the parallel composition of a discrete and a continuous transition. In this case, the discrete transition is executed before the continuous one.
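As an informal illustration of the extended store, the following Python sketch models a continuous store as a map from variable names to (value, flow) pairs. The function `change` plays the role of the update operator in R1, and `advance` approximates the projection of the store at time $\tau$ under an invariant. The names `change` and `advance`, the explicit Euler integration, and the step count are assumptions made for this sketch and are not part of Hy-tccp; flows are restricted to functions of the variable's own value, in line with the independence assumption.

```python
from typing import Callable, Dict, Optional, Tuple

Flow = Callable[[float], float]          # maps the current value x to dx/dt
ContStore = Dict[str, Tuple[float, Flow]]
Invariant = Callable[[Dict[str, float]], bool]


def change(store: ContStore, x: str, v: Optional[float], f: Flow) -> ContStore:
    """Return a copy of the store in which x has value v and flow f.

    Passing v=None keeps the current value of x and only replaces its flow.
    """
    updated = dict(store)
    value = store[x][0] if v is None else v
    updated[x] = (value, f)
    return updated


def advance(store: ContStore, tau: float, inv: Invariant,
            steps: int = 1000) -> Optional[ContStore]:
    """Approximate the store after tau time units with explicit Euler steps.

    Returns None if the invariant is violated at an intermediate step.
    Only values are updated; flows are left unchanged.
    """
    dt = tau / steps
    current = dict(store)
    for _step in range(steps):
        current = {x: (v + dt * f(v), f) for x, (v, f) in current.items()}
        if not inv({x: v for x, (v, _f) in current.items()}):
            return None
    return current


# A timer T with initial value 0 and flow dT/dt = 1.
store = change({}, "T", 0.0, lambda x: 1.0)
after = advance(store, 3600.0, lambda vals: vals["T"] <= 3600.0 + 1e-6)
```

The small tolerance in the invariant only compensates for floating-point rounding in the numerical integration; an exact solver for the chosen class of ODEs would not need it.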

## 3 Example: a dam management system

In this section we model a dam management system with Hy-tccp (Figure 3). Our experience in this area [4] has shown us that this is a realistic and significant example to demonstrate the expressive power and usability of our language. Due to the monotonicity of the discrete constraint store, streams (written as lists) are used to model imperative-style variables [2]. Our dam controller system is modeled as the parallel composition of a controller, a supplier and two gate processes<sup>3</sup>. Vol represents the total amount of water; it has initial value INITVOL and flow 0 (i.e., its value is constant over time). T represents a timer used by the supplier; it has initial value 0 and flow 1, thus it evolves

<sup>&</sup>lt;sup>3</sup> The code of gate is omitted due to space limitations.

linearly over time. When T reaches the value 3600, the supplier sends the value of the new inflow of water to the controller through the input channel In. At this point, the controller checks to which interval the current volume of water (Vol) belongs. Intervals are defined by using several sub-indexed constants $THRESHOLD_i$. According to the current value of Vol, the controller sends a signal to each gate through the output channels ToG1 and ToG2 in order to set their status. At the same time, the continuous store is updated. We use the symbol $\_$ in the second argument of agent change to indicate that only the flow of Vol is updated, while its value is unchanged. The new flow of Vol depends on the new inflow received from the supplier (NewIn) and on a value representing the water discharged through the gates. This value is computed by the function Out according to the current state of the gates. It is worth noting that the $\widetilde{\mathsf{ask}}$ construct in supplier is used to make time pass. Its invariant ensures that T never exceeds the value 3600.

$$\begin{aligned}
& \text{init} \text{ :- } \exists\, In, ToG1, ToG2, Vol, T \big( \text{change}(T, 0, \dot{T} = 1) \parallel \text{supplier}(T, In) \parallel \text{tell}(Vol \leq THRESHOLD_3) \parallel \\
& \quad \text{change}(Vol, INITVOL, \dot{Vol} = 0) \parallel \text{controller}(Vol, In, ToG1, ToG2) \parallel \text{gate}(ToG1) \parallel \text{gate}(ToG2) \big) \\
& \text{controller}(Vol, In, ToG1, ToG2) \text{ :- } \exists\, NewIn, ToG1', ToG2', In' \Big( \\
& \quad \text{ask}(In = [NewIn|\_]) \to \big( \text{tell}(In = [NewIn|In']) \big) \parallel \\
& \quad \text{ask}(Vol \leq THRESHOLD_1) \to \big( \text{tell}(ToG1 = [close|ToG1']) \parallel \text{tell}(ToG2 = [close|ToG2']) \parallel \\
& \qquad \text{change}(Vol, \_, \dot{Vol} = NewIn - Out(close, close)) \parallel \text{controller}(Vol, In', ToG1', ToG2') \big) \\
& \quad + \text{ask}(Vol > THRESHOLD_1 \wedge Vol \leq THRESHOLD_2) \to \big( \text{tell}(ToG1 = [halfOpen|ToG1']) \parallel \\
& \qquad \text{tell}(ToG2 = [halfOpen|ToG2']) \parallel \text{change}(Vol, \_, \dot{Vol} = NewIn - Out(halfOpen, halfOpen)) \parallel \\
& \qquad \text{controller}(Vol, In', ToG1', ToG2') \big) \\
& \quad + \text{ask}(Vol > THRESHOLD_2 \wedge Vol < THRESHOLD_3) \to \big( \text{tell}(ToG1 = [halfOpen|ToG1']) \parallel \\
& \qquad \text{controller}(Vol, In', ToG1', ToG2') \big) \\
& \quad + \text{ask}(Vol = THRESHOLD_3) \to \big( \text{tell}(ToG1 = [open|ToG1']) \parallel \text{tell}(ToG2 = [open|ToG2']) \parallel \\
& \qquad \text{change}(Vol, \_, \dot{Vol} = -Out(open, open)) \parallel \text{controller}(Vol, In, ToG1', ToG2') \big) \Big) \\
& \text{supplier}(T, In) \text{ :- } \exists\, In' \Big( \widetilde{\text{ask}}(T \leq 3600) \\
& \quad + \text{ask}(T = 3600) \to \big( \text{tell}(In = [Random(0, 350)|In']) \parallel \text{change}(T, 0, \dot{T} = 1) \parallel \text{supplier}(T, In') \big) \Big)
\end{aligned}$$

Fig. 3. Hy-tccp model for a dam management system

## 4 Conclusions and Related Work

In this paper we have presented Hy-tccp, an extension of tccp over continuous time, with the aim of modeling hybrid systems in a simple and declarative way. The language is parametric with respect to both the cylindric constraint system used to manage the discrete behavior and the class of differential equation solvers that models the continuous behavior. Although we are aware that the decidability limits of hybrid systems [7] lie at the class of initialized rectangular systems, in this paper we have only restricted the class of differential equations used by assuming that the dynamics of a continuous variable does not depend on the others. In this way we obtain a more general framework.

In [6], hcc was introduced as the first extension of the concurrent constraint paradigm over continuous time. Although both Hy-tccp and hcc are declarative languages with a logical nature, they have some important differences. Hy-tccp has been defined as a modeling language for hybrid systems in the style of hybrid automata. Unlike hcc, which is deterministic, Hy-tccp provides the non-deterministic choice agent, which allows the transitions of hybrid automata to be expressed as a list of ask and $\widetilde{\mathsf{ask}}$ branches. Furthermore, in hcc, the information on the value and flow of continuous variables is modeled as a constraint of the underlying continuous constraint system. In contrast, in Hy-tccp, there is a clear distinction between discrete and continuous variables. The process algebra Hybrid Chi [1] shares with Hy-tccp the separation between discrete and continuous variables, the synchronous nature and the concept of delayable guard (corresponding to the suspension of the non-deterministic choice). In [3], HyPa is introduced as an extension of the process algebra ACP. It differs from Hybrid Chi mainly in the way time-determinism is treated, and in the modeling of time passing.

In the future we plan to develop a framework for the description and simulation of Hy-tccp programs. We are also interested in defining a system of translation rules from Hy-tccp to hybrid automata and vice versa. Furthermore, we plan to use model checking and abstract interpretation to verify temporal properties of hybrid systems written in Hy-tccp (as done in [5] for SPIN).

#### References

1. van Beek, D.A., Man, K.L., Reniers, M.A., Rooda, J.E., Schiffelers, R.R.H.: Syntax and consistent equation semantics of hybrid chi. Journal of Logic and Algebraic Programming 68(1-2), 129–210 (2006)
2. de Boer, F.S., Gabbrielli, M., Meo, M.C.: A Timed Concurrent Constraint Language. Information and Computation 161(1), 45–83 (2000)
3. Cuijpers, P.J.L., Reniers, M.A.: Hybrid process algebra. Journal of Logic and Algebraic Programming 62(2), 191–245 (2005)
4. Gallardo, M.M., Merino, P., Panizo, L., Linares, A.: A practical use of model checking for synthesis: generating a dam controller for flood management. Software: Practice and Experience 41, 1329–1347 (2011)
5. Gallardo, M.M., Panizo, L.: Extending Model Checkers for Hybrid System Verification: the case study of SPIN. Software Testing, Verification and Reliability (2013)
6. Gupta, V., Jagadeesan, R., Saraswat, V.A., Bobrow, D.: Programming in hybrid constraint languages. In: Antsaklis, P., Kohn, W., Nerode, A., Sastry, S. (eds.) Hybrid Systems II. Lecture Notes in Computer Science, vol. 999, pp. 226–251. Springer (1994)
7. Henzinger, T.A.: The theory of hybrid automata. In: Proceedings of the 11th Annual IEEE Symposium on Logic in Computer Science. pp. 278–292. LICS '96, IEEE Computer Society, Washington, DC, USA (1996)
8. Saraswat, V.A.: Concurrent Constraint Programming Languages. Ph.D. thesis, Pittsburgh, PA, USA (1989)

# Formal Modeling and Verification of Interlocking Systems Featuring Sequential Release

Linh H. Vu<sup>1</sup>, Anne E. Haxthausen<sup>1</sup>, and Jan Peleska<sup>2</sup>

<sup>1</sup> DTU Compute, Technical University of Denmark, Kongens Lyngby, Denmark. {lvho,aeha}@dtu.dk
<sup>2</sup> Department of Mathematics and Computer Science, University of Bremen, Bremen, Germany. jp@informatik.uni-bremen.de

Abstract. In this paper, we present a method and an associated tool suite for formal verification of the new ETCS Level 2 based Danish railway interlocking systems. We have made a generic and reconfigurable model of the system behavior and generic high-level safety properties. This model accommodates sequential release, a feature of the new Danish interlocking systems. The generic model and safety properties can be instantiated with interlocking configuration data, resulting in a concrete model in the form of a Kripke structure, and in high-level safety properties expressed as state invariants. Using SMT-based bounded model checking (BMC) and inductive reasoning, we are able to verify the properties for model instances corresponding to railway networks of industrial size. Experiments also show that BMC is efficient for finding bugs in the railway interlocking designs.

**Keywords:** Railway interlocking systems $\cdot$ Formal verification $\cdot$ Bounded model checking $\cdot$ Inductive reasoning $\cdot$ RobustRailS $\cdot$ Safety-critical systems

# 1 Introduction

An interlocking system is responsible for guiding trains safely through a given railway network. It is a vital part of any railway signaling system and has the highest safety integrity level (SIL4) according to the CENELEC 50128 standard [5]. Conventionally, the development and verification process of interlocking systems is informal and mostly manual, hence time-consuming, costly, and error-prone. Thus, automated verification of interlocking systems is an active research topic, investigated by several research groups, see e.g. [10, 8, 23, 15, 9, 14]. As part of the RobustRailS research project<sup>3</sup>, our work aims at establishing a holistic method supporting the verification of such systems. The method should be formal and facilitate automation in order to provide a better verification process compared to the conventional one. In Denmark, in the period 2009–2021, new interlocking systems that are compatible with the standardized

<sup>3</sup> http://robustrails.man.dtu.dk

European Train Control System (ETCS) Level 2 [4] will be deployed in the entire country within the context of the Danish Signalling Programme<sup>4</sup>. In the context of the RobustRailS project accompanying the signalling programme on a scientific level, the proposed method will be applied to these new systems.

The main contributions presented in this paper are as follows. (1) We present a formal model of the behavior of ETCS Level 2 compatible interlocking systems. (2) The model accommodates sequential release: this is a method for incrementally releasing route portions that have been traversed by the associated train, with the objective of increasing the level of concurrency in route allocation and, consequently, the train throughput. (3) The state space encodings allow for high-level safety properties and state transition relations to be processed in a highly efficient manner by SMT solvers supporting bit vector and integer arithmetic. (4) A verification technique combining induction with bounded model checking (BMC) using novel SMT solvers enables the verification of safety properties for railway network instances of industrial size.

The paper is organized as follows: Section 2 gives a brief introduction to the new Danish route-based interlocking systems. The proposed method is described in Sect. 3. Section 4 presents the formal, generic model in the form of a Kripke structure, while the safety properties are formalized in Sect. 5. Section 6 describes the verification strategy. The experimental results are shown in Sect. 7. Related work and concluding remarks are presented in Sect. 8 and Sect. 9, respectively.

## 2 The new Danish Route-based Interlocking Systems

A railway network in ETCS Level 2 consists of a number of track-side elements of different types<sup>5</sup>: linear sections, points, and marker boards. Figure 1 shows an example layout of a railway network having four linear sections (t10, t12, t14, t20), two points (t11, t13), and eight marker boards (mb10..mb21). A linear section is a section with up to two neighbors: one at the up end, and one at the down end<sup>6</sup>, e.g. the linear section t12 in Fig. 1 has t13 and t11 as neighbors at its up end and down end, respectively. A point can have up to three neighbors: one at the stem, one at the plus end, and one at the minus end, e.g. point t11 in Fig. 1 has t10, t12, and t20 as neighbors at its stem, plus, and minus ends, respectively. Linear sections and points are collectively called detection sections, as they are used by interlocking systems to detect the presence of trains in a railway network. A point can be switched between two positions: PLUS and MINUS. When it is in the PLUS (MINUS) position, traffic can run from its stem to its plus (minus) end and vice versa. A marker board is installed along a section, and it is used as a reference location for the intended travel direction that it is facing, e.g. mb11 in Fig. 1 is installed along section t10, and it is intended

<sup>4</sup> http://www.bane.dk/signalprogrammet

<sup>&</sup>lt;sup>5</sup> Here we only show types that are relevant for the work presented in this paper.

<sup>&</sup>lt;sup>6</sup> In Denmark, *up* and *down* denote the directions in which the distance to a reference location is *increasing* and *decreasing*, respectively. The location is the same for both up and down, e.g. an end of a line.

for travel direction up. Contrary to legacy systems, in ETCS Level 2, there are no physical signals, but *virtual signals* associated with marker boards. A virtual signal can be OPEN or CLOSED, respectively, allowing or disallowing traffic to pass the associated marker board. For simplicity, the terms *virtual signals*, *signals*, and *marker boards* are used interchangeably throughout this paper.



Fig. 1. An example railway network layout

An interlocking system constantly monitors the status of track-side elements, and sets them to appropriate states in order to allow trains to travel safely through the given railway network. The new Danish interlocking systems are route-based. An *interlocking table* specifies the routes in the given network layout and the conditions for setting these routes. A *route* is a path from a *source* signal to a *destination* signal.

In railway signaling terminology, setting a route denotes the process of allocating the resources – i.e. sections, points, signals – for the route, and then locking it exclusively for only one train when the resources are allocated. The specification of a route and conditions for setting and releasing it include the following information: (a) a list of the detection sections in the route's path, (b) a list of the detection sections which are used as overlaps – buffer space in case trains overshoot the route's path, (c) required positions of points<sup>7</sup> used by the route, (d) a set of protecting signals used for flank or front protection [19] for the route, and (e) a set of conflicting routes which must not be set while the current route is set.

Table 1 shows an excerpt of an interlocking table for the network shown in Fig. 1. As can be seen, one of the routes has id 1a, goes from mb11 to mb13 via two sections t11 and t12, and has no overlap. It requires point t11 (on its path) to be in PLUS position and point t13 (outside its path) to be in MINUS position (as a protecting point). The route has mb20 and mb12 as protecting signals, and it is in conflict with routes 1b, 2a, 2b, 3, 4, 5a, 5b, 6b, 7.
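To make the structure of an interlocking table entry concrete, the following Python sketch stores the two routes shown in Table 1 as records. The class name `Route`, the field names, and the helper `conflicts_symmetric` are hypothetical choices for this illustration and do not reflect the DSL or tool chain used in this work. The helper checks a simple sanity condition one would expect of such a table, namely that the conflict relation is symmetric for the routes present.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple


@dataclass
class Route:
    id: str
    source: str                   # source marker board
    destination: str              # destination marker board
    points: Dict[str, str]        # required point positions
    signals: FrozenSet[str]       # protecting signals
    path: Tuple[str, ...]         # detection sections on the path
    conflicts: FrozenSet[str]     # ids of conflicting routes


# The two routes listed in Table 1 (overlaps are empty and therefore omitted).
TABLE = {
    "1a": Route("1a", "mb11", "mb13", {"t11": "PLUS", "t13": "MINUS"},
                frozenset({"mb12", "mb20"}), ("t11", "t12"),
                frozenset({"1b", "2a", "2b", "3", "4", "5a", "5b", "6b", "7"})),
    "7": Route("7", "mb20", "mb10", {"t11": "MINUS"},
               frozenset({"mb11", "mb12"}), ("t11", "t10"),
               frozenset({"1a", "1b", "2a", "2b", "3", "6a"})),
}


def conflicts_symmetric(table: Dict[str, Route]) -> bool:
    """Check that r conflicts with q implies q conflicts with r.

    Routes that are referenced but not present in the table are skipped.
    """
    for route in table.values():
        for other_id in route.conflicts:
            other = table.get(other_id)
            if other is not None and route.id not in other.conflicts:
                return False
    return True


shared_sections = set(TABLE["1a"].path) & set(TABLE["7"].path)
```

Routes 1a and 7 share section t11 and require it in opposite positions, which is consistent with their being listed as mutually conflicting.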

Interlocking Principles. In order to prevent collision and derailment of trains, traditional route-based interlocking systems employ a basic principle: a route is locked exclusively for use of one train at a time. This is obtained by following a strict procedure for setting and releasing routes based on information in their interlocking tables. As an example, let us consider the following procedure for route 1a specified in Table 1:

<sup>&</sup>lt;sup>7</sup> This includes points in the path and overlaps, and points used for flank and front protection. For detail about flank and front protection, see [19].

**Table 1.** Excerpt of the interlocking table for the network layout in Fig. 1. The overlaps column is omitted as it is empty for all of the routes. (p means PLUS, m means MINUS.)

| id | source | destination | points       | signals   | path    | conflicts               |
|----|--------|-------------|--------------|-----------|---------|-------------------------|
| 1a | mb11   | mb13        | t11:p;t13:m  | mb12;mb20 | t11;t12 | 1b;2a;2b;3;4;5a;5b;6b;7 |
|    |        |             |              |           |         |                         |
| 7  | mb20   | mb10        | t11:m        | mb11;mb12 | t11;t10 | 1a;1b;2a;2b;3;6a        |

- (0) Initially the route is free.
- (1) When a request for setting the route is received by the interlocking system, the route is *marked* as requested.
- (2) The interlocking system checks the status of different track-side elements in the system to figure out whether it can start allocating resources for route 1a, e.g. sections t11 and t12 must be vacant, and conflicting routes must not be allocated or locked. If so, the interlocking commands points and signals to their required positions according to the route's specification, e.g. it commands the point t11 to switch to PLUS, t13 to switch to MINUS, and the protecting signals mb12 and mb20 to change to CLOSED.
- (3) The interlocking system constantly monitors the status of the trackside elements. When the signals and points have changed their states as commanded in step (2), the route is *locked* and its source signal mb11 is set to OPEN, allowing a train to enter the route.
- (4) When the locked route is *used*, i.e. a train enters it, the source signal mb11 is set to CLOSED preventing other trains from entering.
- (5) The route is *released* (set back to *free*) when the train has finished using it, i.e. the train has passed mb13, or the train has come to a standstill in front of mb13.

Sequential Release. The new Danish interlocking systems employ sequential release (also known as sectional release) [19]. This new feature results in two major changes:

- (a) With sequential release, the interlocking can release an element in a locked route as soon as the train has passed it, instead of waiting until the train has finished using the route and then releasing the route as a whole. Consequently, the capacity increases.
- (b) As a direct result of (a), a route may be allocated (in step (2) above) while some of its conflicting routes are still in use by trains, instead of waiting for all of its conflicting routes to be released as in traditional route-based interlocking systems. For example, when a train has passed section t11 while going along route 1a, t11 will be released and then route 7 going in the opposite direction (see Table 1) can be allocated (assuming that other conditions for this are fulfilled).
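The effect described in (a) and (b) can be illustrated with a deliberately simplified Python sketch. It assumes that a route may be allocated as soon as every element on its path is in FREE mode, and it ignores vacancy, point positions, protecting signals, and all other allocation conditions of the real system. The function names are hypothetical; the paths are taken from Table 1.

```python
FREE, EXLCK = "FREE", "EXLCK"

# Paths of routes 1a and 7 as listed in Table 1.
PATHS = {"1a": ("t11", "t12"), "7": ("t11", "t10")}


def can_allocate(route, modes):
    """Simplified allocation check: every element on the path must be FREE."""
    return all(modes[element] == FREE for element in PATHS[route])


def lock(route, modes):
    """Lock all elements on the path exclusively for the route."""
    for element in PATHS[route]:
        modes[element] = EXLCK


def release_element(element, modes):
    """Sequential release of a single element once the train has passed it."""
    modes[element] = FREE


modes = {"t10": FREE, "t11": FREE, "t12": FREE}
lock("1a", modes)
blocked = can_allocate("7", modes)   # t11 is still locked for route 1a
release_element("t11", modes)        # the train on route 1a has passed t11
allowed = can_allocate("7", modes)   # t11 and t10 are both FREE
```

Without sequential release, route 7 would have to wait until t12 had also been released as part of releasing route 1a as a whole.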

## 3 Verification Method

The verification process is shown in Fig. 2. It begins with



Fig. 2. Verification process

the configuration data of an interlocking system, consisting of a network layout and an interlocking table. The configuration data is described in a domain-specific language (DSL) [22] having an XML representation<sup>8</sup>. After the data has been parsed into an internal representation, a static checker verifies whether it is well-formed according to the static semantics of the DSL. Optionally, the user may omit the interlocking table and instead use an interlocking table generator (ITG) to create one automatically. Instantiating a generic model of the dynamic behavior of the Danish interlocking systems with the well-formed configuration data results in a model instance in the form of a Kripke structure. Similarly, the concrete safety properties expressed as state invariants are generated from the generic safety properties. The model instance is then checked against the concrete properties using a combination of BMC and inductive reasoning. If the model instance does not satisfy the properties, counter-examples will be generated. An interface for visualizing the counter-examples at the DSL level is under development.

The tool-chain associated with the method has been implemented using the RT-Tester tool-box [17,21]. The bounded model checker in RT-Tester uses the SONOLAR SMT solver [18] to compute counter-examples for induction and base cases. RT-Tester has been selected because (1) it is an integrated model-based testing and BMC tool, and (2) its SMT solver also supports floating point arithmetic. The first property is crucial for us, because our objective is to complement the model verification with HW/SW integration tests. The second capability is vital, because we also plan to extend the model by real-time aspects, such as train velocity and braking curves.

# 4 Kripke Structure Encodings of Interlocking Systems

The dynamic behavior of an interlocking system is formalized as a Kripke structure  $K = (S, s_0, R, L, AP)$  with state space S, initial state  $s_0 \in S$ , transition relation  $R \subseteq S \times S$ , and labeling function  $L : S \to 2^{AP}$ , where AP is the set of atomic propositions and  $2^{AP}$  is the power set of AP. The labeling function L maps a state s to the set L(s) of atomic propositions that hold in s. Due to the

<sup>&</sup>lt;sup>8</sup> A graphical representation and editor is currently under development.

limited space of this paper and the complexity of the Kripke encodings, in the following subsections, we only outline how the state space S and the transition relation R of a Kripke structure are encoded.

### 4.1 State Space

In order to encode the states of an interlocking system, a finite set  $V = \{v_0, \ldots, v_n\}$  of variables is defined to represent the current status of different components in the system such as a track element or a route. Each variable  $v \in V$  has an associated finite domain  $D_v \subset \mathbb{N}_0$ . The state space is the set of all valuation functions  $s: V \to \bigcup_{v \in V} D_v$  for which  $s(v) \in D_v$  for all  $v \in V$ . The initial state  $s_0$  is the (safe) state in which all detection sections are vacant, all signals are closed, all routes are free, and there are no trains in the network. In our encodings,  $s_0$  is the state in which all variables are evaluated to 0. For readability, sometimes we use named constants instead of their corresponding integral values in the subsequent paragraphs.

Vacancy Status. The vacancy status of a section in a given travel direction is encoded using the three least significant bits HTO of a non-negative integer variable as shown in Fig. 3. For example, the variable l.U2D records the vacancy status of a linear section l in the direction from its up end to its down end. A value of 1 for the bits H, T, and O indicates, respectively, that (H) the head of the train is within the section, (T) the tail of the train is within the section, and (O) the section is occupied. This encoding offers two advantages: (a) the encoding can cover the case where a train occupies more than one detection section (e.g., when it is crossing the joint between two sections), and (b) the safety properties can be expressed efficiently using arithmetic operations on integer variables as shown in Sect. 5.



Fig. 3. A variable recording occupancy status of a detection section
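The bit-level encoding can be sketched in Python as follows. The concrete bit positions (H as bit 2, T as bit 1, O as bit 0) are an assumption read off the order of the letters in "HTO"; the helper names `encode` and `decode` are hypothetical.

```python
# Assumed bit positions for the three least significant bits H, T, O.
H_BIT, T_BIT, O_BIT = 0b100, 0b010, 0b001


def encode(head: bool, tail: bool, occupied: bool) -> int:
    """Pack the three flags into a non-negative integer."""
    return ((H_BIT if head else 0)
            | (T_BIT if tail else 0)
            | (O_BIT if occupied else 0))


def decode(status: int):
    """Unpack an integer into the flags (head, tail, occupied)."""
    return bool(status & H_BIT), bool(status & T_BIT), bool(status & O_BIT)


VACANT = encode(False, False, False)         # 0, as in the initial state
whole_train = encode(True, True, True)       # train entirely inside one section
front_section = encode(True, False, True)    # train crossing a joint: head here
rear_section = encode(False, True, True)     # and tail in the section behind
```

Under this assumed layout, a train crossing the joint between two sections is represented by the head bit in one section's variable and the tail bit in the other's, with both sections marked as occupied.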

Lockable Elements. In order to accommodate sequential release into our model, we consider a linear or point section as a lockable element. The status of a lockable element e is encoded by two variables: (1) e.MODE – indicating the mode of the element, and (2) e.PREV – this variable is set to 1 when the previous section in the same route has been released, otherwise e.PREV = 0. An element can be in one of the following modes: FREE (the element is not exclusively locked by a route, or used by any train), EXLCK (the element is exclusively locked for a route), or USED (the element has been used, i.e., occupied, by a train after it was exclusively locked for a route).

Point Positions. The position of a point p is encoded by two variables: (1) p.POS – the actual position of the point, and (2) p.CMD – the point position

commanded by the interlocking. The value of p.POS can be one of the following<sup>9</sup>: PLUS(0), MINUS(1), or INTERMEDIATE(2) (the position where the point is switching from one side to the other). The value of p.CMD can only be PLUS or MINUS (as the interlocking cannot command a point to switch to the INTERMEDIATE position).

Signal Aspects. The aspect of a signal s is encoded by two variables: (1) s.ACT – the actual aspect of the virtual signal, its value can be OPEN or CLOSED, and (2) s.CMD – the aspect as commanded by the interlocking, the possible values of this variable have the same meaning as the ones of s.ACT. The s.ACT variable represents the aspect of the signal as "seen" by the train, while s.CMD is the aspect of the signal as seen by the interlocking. The values of these two variables may be different because of the delay in the communication between the interlocking system and the trains.

Routes. For each route r, a variable r.MODE is used to encode the current mode of that route. A route can be in one of the following modes: FREE, MARKED, ALLOCATING, LOCKED, or USED.

#### 4.2 Transition Relation

The transition relation $R \subseteq S \times S$ can be represented symbolically by a predicate $\Phi$ with free variables in $V \cup V'$, where $V' = \{v' \mid v \in V\}$ is the set of next-state variables. A pair of states $(s,s')$ is in $R$ if and only if $\Phi$ evaluates to true when replacing every $v \in V$ occurring in $\Phi$ with $s(v)$ and every $v' \in V'$ occurring in $\Phi$ with $s'(v)$. In order to specify $\Phi$, we divide the transitions in an interlocking system into the following four types, each of which is represented collectively by a predicate with free variables in $V \cup V'$:

- (0) route dispatching transitions represented collectively by the predicate  $\Phi_d$ ;
- (1) interlocking transitions, e.g., setting the mode of a route, represented by the predicate $\Phi_{\iota}$;
- (2) track element transitions, e.g., switching a point or a signal, represented by the predicate $\Phi_{\epsilon}$; and
- (3) train movement transitions represented by the predicate  $\Phi_{\tau}$ .

Transitions of type (0) are not prioritized, i.e., they can be chosen whenever they are enabled, independently of other transitions. On the other hand, transitions of types (1), (2), and (3) are prioritized in descending order as they appear in the list, i.e., transitions of type (1) have the highest priority and transitions of type (3) have the lowest. Whenever two transitions of different priorities are both enabled, the one with higher priority will be chosen. Transitions with the same priority are chosen non-deterministically if they are enabled at the same time. This priority of transitions is based on the intuition that in practice, the events

<sup>&</sup>lt;sup>9</sup> The notation name(integer-value) means that name is the name of constant having the value integer-value.

in the interlocking control logic occur at significantly higher speed than the ones occurring in a track element. An analogous argument applies to events related to track elements and others related to train movements. With these types of transitions, the transition relation of an interlocking system can be specified as follows:

$$\Phi \equiv \Phi_d \vee ITE(\iota, \Phi_\iota, ITE(\epsilon, \Phi_\epsilon, \Phi_\tau)) \tag{1}$$

where ITE(c,i,e) is the *if-then-else* function: if c holds then the value of the function is i, otherwise it is e; $\iota$ expresses whether an interlocking transition is enabled; and $\epsilon$ expresses whether a track element transition is enabled. The route dispatching transition relation $\Phi_d$ is put outside of the ITE function in (1) in order to allow the routes to be dispatched arbitrarily. If route dispatching transitions were given the same priority as, or a higher priority than, interlocking control logic transitions, all routes which could be dispatched would have to be dispatched before track elements or trains could make any transition. On the other hand, if route dispatching were given lower priority than interlocking control logic transitions, then a route could not be dispatched while another route is being processed by the interlocking.
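The priority scheme in (1) can be mirrored by a small executable sketch. The Python code below builds $\Phi$ from its components exactly as in (1); the toy state (a triple of route mode, point position, and train position) and the toy component relations are hypothetical and serve only to show that a lower-priority transition is excluded while a higher-priority one is enabled, whereas dispatching remains possible independently.

```python
from typing import Tuple

State = Tuple[int, int, int]   # toy state: (route mode, point position, train position)


def ite(c: bool, i: bool, e: bool) -> bool:
    """If-then-else on truth values."""
    return i if c else e


def make_phi(phi_d, phi_i, phi_e, phi_t, iota, eps):
    """Build Phi = Phi_d or ITE(iota, Phi_i, ITE(eps, Phi_e, Phi_t))."""
    def phi(s: State, s2: State) -> bool:
        return phi_d(s, s2) or ite(iota(s), phi_i(s, s2),
                                   ite(eps(s), phi_e(s, s2), phi_t(s, s2)))
    return phi


# Hypothetical toy components.
def phi_d(s, s2):              # dispatch a free route (mode 0 -> 1)
    return s[0] == 0 and s2 == (1, s[1], s[2])


def iota(s):                   # an interlocking transition is enabled
    return s[0] == 1


def phi_i(s, s2):              # interlocking processes the route (mode 1 -> 2)
    return iota(s) and s2 == (2, s[1], s[2])


def eps(s):                    # a track element transition is enabled
    return s[1] == 0


def phi_e(s, s2):              # the point switches (position 0 -> 1)
    return eps(s) and s2 == (s[0], 1, s[2])


def phi_t(s, s2):              # the train moves one step
    return s2 == (s[0], s[1], s[2] + 1)


phi = make_phi(phi_d, phi_i, phi_e, phi_t, iota, eps)
```

For instance, in the toy state `(1, 0, 0)` only the interlocking step to `(2, 0, 0)` is permitted, although the point could switch and the train could move; in `(0, 0, 0)` both dispatching and the point switch are permitted.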

Route Dispatching. A route can be dispatched arbitrarily whenever its mode is FREE. This means that multiple routes can be dispatched at the same time.

Life-cycle of a Route. Figure 4 shows the "life-cycle" of a route, i.e., its different modes and the transitions from one mode to another. This "life-cycle" reflects the procedure for setting and sequentially releasing a route as described in Sect. 2. The transitions labeled (1), (2), (3), (4), and (6) in Fig. 4 correspond to items (1) – (5) in the procedure presented in Sect. 2 for setting and releasing a route. Transition (5) models the sequential release that can take place while the route stays in USED mode: as the train moves along the route, its elements are released sequentially as soon as the train has passed them. Transition (2) is adapted to sequential release: allocation is now also allowed when a conflicting route is in the USED mode, as long as elements shared with the given route have been sequentially released.



Fig. 4. A life-cycle of a route

Life-cycle of a Lockable Element. Figure 5 depicts the "life-cycle" of a lockable element within the network controlled by the interlocking system. Each node in the diagram is labeled with information about the status of the element e: (a) whether the element is vacant, (b) its current mode, and (c) the value of the PREV variable indicating whether the previous element prev(r,e) of e in the route r has been released. An element e is initially in a state in which it is vacant, in FREE mode, and its PREV variable is 0. (1) When the interlocking system is allocating a route r that uses e, it sets the mode of the element to EXLCK, meaning that the element is locked exclusively for r. (2) The element becomes occupied, i.e., not vacant, as a train enters. (3) After that, e's mode is set to USED. (4) When the train leaves the previous element prev(r,e) of e in the route r, prev(r,e) is released, and it informs e by setting the variable e.PREV to 1. (5) When the train leaves e, the latter becomes vacant again, (6) e is released and the next element next(r,e) in the same route is informed by setting next(r,e).PREV to 1.



Fig. 5. "Life-cycle" of a lockable element e. vacant(e) is a formula over variables encoding e's vacancy status shown in Sect. 4.1.
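The six steps of the element life-cycle can be replayed with a small Python sketch. It tracks only the three pieces of information shown on each node (vacancy, mode, PREV) for a single element, so the notification of next(r,e) in step (6) is omitted. Resetting PREV to 0 on release is an assumption made here so that the cycle returns to the initial state; the text does not state it explicitly. The class and step names are illustrative.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Element:
    vacant: bool = True
    mode: str = "FREE"
    prev: int = 0


# Steps (1)-(6) of the life-cycle, applied in order.
STEPS = [
    ("allocate route r",   lambda e: replace(e, mode="EXLCK")),
    ("train enters e",     lambda e: replace(e, vacant=False)),
    ("mark e as used",     lambda e: replace(e, mode="USED")),
    ("prev(r,e) released", lambda e: replace(e, prev=1)),
    ("train leaves e",     lambda e: replace(e, vacant=True)),
    # Assumption: PREV is reset when e is released.
    ("release e",          lambda e: replace(e, mode="FREE", prev=0)),
]


def life_cycle(e: Element = Element()):
    """Return the list of states visited, starting with the initial one."""
    trace = [e]
    for _, step in STEPS:
        e = step(e)
        trace.append(e)
    return trace
```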

Switching Points. A point p can be switched if it is requested to be switched to a position p.CMD that is different from its current position p.POS. The point switching process occurs in two steps:

- (1) the point moves from its current position to the *intermediate* position, i.e.,  $p.POS \neq p.CMD \land p.POS \neq INTERMEDIATE \land p.POS' = INTERMEDIATE,$
- (2) the point is switched from the *intermediate* position to the requested position, i.e.,  $p.POS = INTERMEDIATE \land p.POS' = p.CMD$ .

Switching Signals. Whenever the actual aspect s.ACT of a signal s differs from the commanded aspect s.CMD, the actual aspect of the signal is set to the commanded aspect, i.e.,  $s.ACT \neq s.CMD \land s.ACT' = s.CMD$ .
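The point and signal switching rules above can be read as next-state functions. The following Python sketch is one such reading; the position and aspect names other than INTERMEDIATE are placeholders, and it assumes that a commanded position is never INTERMEDIATE.

```python
INTERMEDIATE = "INTERMEDIATE"


def point_step(pos, cmd):
    """Next value of p.POS, or None if no switching transition is enabled."""
    if pos == INTERMEDIATE:
        return cmd             # step (2): intermediate -> requested position
    if pos != cmd:
        return INTERMEDIATE    # step (1): current position -> intermediate
    return None                # already in the requested position


def signal_step(act, cmd):
    """Next value of s.ACT, or None if the signal already shows s.CMD."""
    return cmd if act != cmd else None
```

A point commanded from one end position to the other therefore needs two transitions, whereas a signal reaches its commanded aspect in one.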

Train Movements. Trains are not explicitly specified in our model, in the sense that there are no explicit train objects. Instead, train movements and other aspects are implicitly modeled via the occupancy status of train detection sections, inspired by the "rubber-band" model described in [1]. This implicit model is advantageous when compared to the explicit one, because it models arbitrary numbers of trains of arbitrary length. In the implicit model of train movements, train length – in terms of numbers of sections that a train occupies – may vary as trains move. This variation reflects the actual view of interlocking systems of the train length: although trains have fixed geometric length, their length – in terms of the number of sections that they occupy – as seen by the interlocking systems is not fixed.

# 5 High-level Safety Properties

Interlocking systems must at least guarantee the high-level safety properties of non-collision and non-derailment. These properties can be expressed as state invariants over the vacancy status variables of linear and point sections in the given network. Basically, an interlocking system is safe if no hazardous situations occur on any linear or point section at any time. Thus, the high-level safety properties can be expressed formally by the following state invariant

$$\phi = \neg (\bigvee_{l:Linear} Hazard_l \lor \bigvee_{p:Point} Hazard_p) \tag{2}$$

where  $Hazard_l$  and  $Hazard_p$  specify conditions for hazards to occur on a linear section l and a point p, respectively. These propositions are conjunctions of sub-propositions expressing hazards of different types on a section, such as: (a) head-to-head collision, (b) collision of trains following each other on a section, or (c) derailment on a point. Some examples of hazards are given in the subsequent paragraphs.

Head-to-head collision on a linear section. A head-to-head collision occurs on a linear section l, when two trains running in opposite directions meet. This situation is expressed by the following formula where l.D2U (l.U2D) is the variable encoding the vacancy status of the section in the travel direction from down (up) to up (down).

$$l.D2U * l.U2D > 0 \tag{3}$$

As l.D2U \* l.U2D > 0 iff l.D2U > 0 and l.U2D > 0, the formula expresses that the section is occupied in both down-to-up (l.D2U > 0) and up-to-down (l.U2D > 0) directions. Collisions of type (b) are formulated in a similar way.
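Formula (3) and the equivalence it relies on can be checked with a few lines of Python. The sketch assumes that the vacancy variables take non-negative integer values (as bit-vector encodings would); the function name is illustrative.

```python
def head_to_head(d2u: int, u2d: int) -> bool:
    """Formula (3): the section is occupied in both travel directions."""
    return d2u * u2d > 0


# For non-negative values the product is positive exactly when both
# factors are positive, so one arithmetic term replaces a conjunction.
equivalent = all(
    head_to_head(a, b) == (a > 0 and b > 0)
    for a in range(8) for b in range(8)
)
```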

Derailment on a point. A derailment occurs when a train traverses a point p which is not locked in the correct position for the travel direction of the train. This situation is expressed by the following formula, where p.POS is the point's actual position; p.S2PM, p.P2S, and p.M2S are variables encoding the vacancy status of the point in the travel direction entering the point from its stem, plus, or minus ends, respectively; and & and  $\gg$  are the bit-wise AND and arithmetic right-shift operators, respectively.

$$p.POS*p.P2S+(1-(p.POS \& 1))*p.M2S+(p.POS \gg 1)*p.S2PM > 0 \tag{4}$$

Formula (4) captures the following cases: (a) a train is entering a point from its plus end (p.P2S > 0) while the point is not in the plus position (p.POS > 0), (b) a train is entering a point from its minus end (p.M2S > 0) while the point is not in the minus position (1 - (p.POS & 1) > 0), and (c) a train is entering a point from its stem end (p.S2PM > 0) while the point is in the intermediate position  $((p.POS \gg 1) > 0)$ .
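The three cases can be made concrete with a short Python sketch of formula (4). The numeric encoding PLUS = 0, MINUS = 1, INTERMEDIATE = 2 is an assumption inferred from the three sub-terms of the formula (it is the only assignment of 0, 1, 2 under which all three read as described); the actual constants are defined elsewhere in the paper.

```python
# Assumed position encoding, inferred from formula (4).
PLUS, MINUS, INTERMEDIATE = 0, 1, 2


def derailment(pos: int, p2s: int, m2s: int, s2pm: int) -> bool:
    """Formula (4): a train enters the point while it is wrongly positioned."""
    return pos * p2s + (1 - (pos & 1)) * m2s + (pos >> 1) * s2pm > 0
```

With this encoding, `derailment(PLUS, 1, 0, 0)` is false (entering from the plus end of a point in the plus position), whereas `derailment(MINUS, 1, 0, 0)` and `derailment(INTERMEDIATE, 0, 0, 1)` are both true.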

# 6 Verification of Safety Properties

When a model K (see Sect. 4) and a proposition  $\phi$  expressing high-level safety properties (see Sect. 5) have been generated, the next task according to our method is to prove the absence of hazardous situations, i.e., to prove that  $\phi$  holds in all reachable states of K. This is written  $K \models G(\phi)$ . The following subsections describe our approach for verifying this.

### 6.1 Verification Strategy

We employ a strategy combining BMC and k-induction techniques similar to the one in [13]. The verification procedure is performed in two steps: (i) base case: prove that  $\phi$  holds for k > 0 consecutive states<sup>10</sup>, starting from the initial state  $s_0$ , and (ii) induction case: prove that if  $\phi$  holds for k > 0 consecutive states, starting from an arbitrary state  $s_n$ , then  $\phi$  will also hold in the  $(k+1)^{th}$  state. Both the base case and the induction case are transformed to problems of finding counter-examples for their negated formulas using an SMT solver. If no counter-examples are found, then the cases have been proved.
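The two proof obligations can be sketched in Python on a tiny explicit-state system. This swaps the SMT-based search for counter-examples used in the paper for plain enumeration of a finite state space, purely to show the shape of the base case and the induction case; the eight-state system, its transition function `step`, and the property `phi` are hypothetical.

```python
def base_case(init, step, phi, k):
    """phi holds in the first k consecutive states from every initial state."""
    frontier = set(init)
    for _ in range(k):
        if not all(phi(s) for s in frontier):
            return False
        frontier = {t for s in frontier for t in step(s)}
    return True


def induction_case(states, step, phi, k):
    """If phi holds in k consecutive states from any state, it holds in the next."""
    paths = [(s,) for s in states if phi(s)]
    for _ in range(k - 1):
        paths = [p + (t,) for p in paths for t in step(p[-1]) if phi(t)]
    return all(phi(t) for p in paths for t in step(p[-1]))


def k_induction(states, init, step, phi, k):
    return base_case(init, step, phi, k) and induction_case(states, step, phi, k)


# Hypothetical toy system: a counter modulo 8 that advances by 2 from 0.
STATES = range(8)
INIT = [0]
def step(s): return [(s + 2) % 8]
def phi(s): return s != 5   # safe, since only even states are reachable
```

Here `k_induction(STATES, INIT, step, phi, 1)` fails because the unreachable state 3 satisfies `phi` but steps to 5, while `k_induction(STATES, INIT, step, phi, 4)` succeeds. In an SMT setting the enumeration over `STATES` is replaced by a satisfiability query on the negated obligation.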

### 6.2 Invariant Strengthening

As pointed out in [3], when  $\phi$  is not strong enough to be inductive, counter-examples are found for the induction case. These counter-examples are often spurious, i.e., they start from an unreachable state and do not correspond to any actual run of the system. In order to make  $\phi$  inductive, it is strengthened with an extra invariant  $\psi$ , i.e., one should prove  $\phi \wedge \psi$  instead of  $\phi$ .  $\psi$  is called the strengthening invariant, which eliminates the spurious counter-examples. An example of such strengthening properties is given in the following.
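Before turning to that example, the effect of strengthening can be seen on a toy system. The Python sketch below is an illustration under assumed names (`STATES`, `step`, `phi`, `psi`), not the railway model: `phi` is true in every reachable state but is not inductive, and conjoining it with `psi` removes the spurious counter-example.

```python
# Hypothetical toy system: a counter modulo 8 that advances by 2 from 0.
STATES = range(8)
INIT = [0]
def step(s): return (s + 2) % 8

def phi(s): return s != 5        # property to prove
def psi(s): return s % 2 == 0    # strengthening invariant


def inductive(inv):
    """Simple induction (k = 1) over the whole state space."""
    base = all(inv(s) for s in INIT)
    ind = all(inv(step(s)) for s in STATES if inv(s))
    return base and ind


def induction_counter_examples(inv):
    """States satisfying inv whose successor violates it."""
    return [s for s in STATES if inv(s) and not inv(step(s))]


def strengthened(s): return phi(s) and psi(s)
```

`induction_counter_examples(phi)` returns `[3]`, a state that can never be reached from 0, whereas `inductive(strengthened)` holds.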

Train Integrity. Some states of the variables expressing the train occupancy status of the track sections (see Sect. 4) are not feasible as they correspond to situations that are not physically possible. An example of an infeasible state is one in which the variables express that a section s is occupied in one direction by a train without the head being on the section, but the next section in that travel direction is unoccupied.

<sup>10</sup> Two states are consecutive if there is a transition from the first to the second according to the model K.

The train integrity conditions can be formalized as a conjunction of formulas over the track vacancy variables. For each travel direction (up and down), there is a conjunct for each section s that has a next section in the given travel direction. The pattern of such a conjunct depends on the other sections the current section is connected to in the given travel direction. For instance, for travel direction up and a linear section s that has a linear section s' as neighbor in travel direction up, the conjunct will take the following form:

$$(s.D2U \& 0b101) = 0b001 \iff (s'.D2U \& 0b011) = 0b001 \tag{5}$$

where & is the bit-wise and operator. This formula expresses that section s is occupied by a train in direction up (the O bit of s.D2U is 1) without the head being on the section (the H bit of s.D2U is 0), if and only if section s' is occupied by a train in direction up (the O bit of s'.D2U is 1) without a tail being on the section (the T bit of s'.D2U is 0). Formula (5) shows the expressiveness of our state encodings allowing properties to be efficiently formulated in compact formulas.
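Formula (5) can be exercised with a short Python sketch. The bit layout (O in bit 0, T in bit 1, H in bit 2) is an assumption read off the two masks in the formula; the function name is illustrative.

```python
# Assumed bit layout of a vacancy variable, inferred from the masks in (5).
O_BIT, T_BIT, H_BIT = 0b001, 0b010, 0b100


def integrity_ok(s_d2u: int, next_d2u: int) -> bool:
    """Formula (5) for section s and its neighbor s' in travel direction up."""
    occupied_without_head = (s_d2u & 0b101) == 0b001     # O = 1, H = 0 on s
    occupied_without_tail = (next_d2u & 0b011) == 0b001  # O = 1, T = 0 on s'
    return occupied_without_head == occupied_without_tail
```

For instance, `integrity_ok(O_BIT | T_BIT, 0)` is false: a train's tail is on s, its head is not, yet the next section is unoccupied, which is the infeasible state described above. `integrity_ok(O_BIT | T_BIT, O_BIT | H_BIT)` is true, since the train simply spans both sections.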

# 7 Experiments

We have used the tool-chain to verify the safety properties for model instances of a number of railway networks, ranging from a trivial tiny toy network to a large station (Køge) extracted from the early deployment line of the new Danish signalling systems.

**Table 2.** Verification results for different networks using simple induction (k=1). Toy, cross, and mini are made-up trivial networks, while Gadstrup-Havdrup (Gt-Hd) and Køge are extracted from the early deployment line in the Danish Signalling Programme. (BR: branching ratio)

| Case  | Linears | Points | Signals | Routes | BR            | Vars | Time (s)  | Memory  |
|-------|---------|--------|---------|--------|---------------|------|-----------|---------|
| Toy   | 3       | 1      | 4       | 6      | 0.33          | 35   | 9         | 51 MB   |
| Cross | 4       | 2      | 8       | 10     | 0.50          | 56   | 64        | 127 MB  |
| Mini  | 4       | 2      | 8       | 12     | 0.50          | 58   | 76        | 128 MB  |
| Gt-Hd | 18      | 5      | 21      | 30     | 0.28          | 179  | 1826      | 1171 MB |
| Køge  | 46      | 23     | 49      | 59     | 0.50          | 502  | 33627     | 4788 MB |

In our first trials of verifying the models, we used simple induction (k-induction with k=1), but we got spurious counter-examples. To avoid that, we tried to increase k and strengthen the invariant to be verified. It turned out that the verification time increased significantly as k increased, making it impossible to verify even the small networks. However, we were able to derive strengthening properties  $\psi$  (see Sect. 6) for which the verification could be done using simple induction alone. (This is not possible for all applications; see, e.g., [13].) Table 2 shows the results of the final verification. Each row of the table lists the size of a network in terms of the number of linear sections, points, signals, and routes in the configuration, and the number of generated variables in the corresponding model instance. The last two columns show the approximate accumulated verification time and memory usage. All experiments were performed on a machine with an Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz, 8GB RAM, Ubuntu 14.04 LTS, and a Linux 3.13.0-27-generic x86\_64 kernel.

The branching ratio of a network (BR in Table 2) is defined as the ratio of the number of points to the number of linear sections in that network. The larger the branching ratio is, the more complex the corresponding network is in terms of branching. The size of the formula  $\Phi$  specifying the transition relation as well as the size of the formulas  $\phi$  and  $\psi$  specifying the state invariants grow as the size of the network grows. Our experiments show that the formulas grow much more when the network's branching ratio also increases, than when the branching ratio is nearly the same (as it is, e.g., the case when chaining multiple simple stations). This is due to the fact that the interdependency between variables in the model also increases when BR increases.
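As a quick check of the definition, the BR column of Table 2 can be recomputed from the Linears and Points columns. The Python sketch below uses the figures from the table; the dictionary and function names are illustrative.

```python
# (linear sections, points) per network, taken from Table 2.
NETWORKS = {
    "Toy": (3, 1),
    "Cross": (4, 2),
    "Mini": (4, 2),
    "Gt-Hd": (18, 5),
    "Køge": (46, 23),
}


def branching_ratio(linears: int, points: int) -> float:
    """Number of points divided by number of linear sections."""
    return points / linears


BR = {name: round(branching_ratio(l, p), 2) for name, (l, p) in NETWORKS.items()}
```

The rounded values agree with the table: 0.33, 0.50, 0.50, 0.28, and 0.50.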

We also injected errors into models. Counter-examples for these were normally found in a relatively short time. This appears to be a general trend when dealing with interlocking systems [16]. In a few cases, it took a long time to find counter-examples. Such examples usually represent very subtle errors in the model or the configuration data, which may be easily overlooked by inspection.

# 8 Related Work

In recent years, the railway domain has become one of the most promising application domains of formal methods. Several research groups have investigated how formal methods could help in efficiently producing more robust railway control systems. An overview of recent trends can be found in [7], and recommendations and best practices for efficient development and verification of safe railway control systems are summarized in [12]. Re-configurable systems and automated verification are among the recommendations that we have followed.

Model checking is a promising technique for verifying safety properties of interlocking systems thanks to its capability to be fully automated. Unfortunately, due to the state explosion problem, the technique is only able to verify applications of small size [8]. Several techniques have been proposed in order to push the applicability bounds toward industrial size. Winter et al. suggest using ordering strategies optimized for interlocking models [23]. A number of high-level abstractions for reducing the complexity of interlocking models are presented in [15]. In [6], Fantechi et al. suggest a distributed interlocking model whose verification can be divided into small tasks and verified in parallel. SAT-based model checking and slicing techniques are used in [16]. In order to remedy the problem with state space explosion in the global model checking approach, we have recently used BMC instead for some other applications [13, 14]. In the current work, a combination of SMT-based BMC with inductive reasoning allowed us to verify safety properties without having to explore the whole state space; hence we were able to push the bounds even further to handle larger networks of industrial size. As an alternative to the model checking approach, theorem proving based techniques have also shown success in the railway domain, see, e.g., [2, 11], but are less automated.

Although sequential release has been used in some interlocking systems, we have not found any published formal models of interlocking systems that integrate this feature. In [20], the conditions for elements to be unlocked and reused in sequential releases are pre-computed and specified in the interlocking tables. In our approach, sequential release is integrated into the behavioral model rather than into the configuration data. This reduces the complexity of the configuration data and makes interlocking configuration data relatively independent from the chosen interlocking approaches.

# 9 Conclusion and Future Work

This paper presented a fully automated, formal method and an associated tool suite for verifying the forthcoming new ETCS Level 2 based Danish railway interlocking systems featuring sequential release. A formal model for these systems was outlined. A novelty in our contribution is that the system is part of an ETCS Level 2 based signalling system in which there are no physical signals along the tracks; instead, movement authorities are communicated via on-board computers. By introducing the concept of virtual signals, we have been able to handle the assignment of movement authorities in a way that is very similar to the situations where conventional signals are used. Another novelty is that the formal model features sequential release. As a consequence, the model is more complex than those supporting route-based release only, because additional variables and transitions are required. Therefore the verification becomes more challenging. In spite of this difficulty, using a combination of SMT-based BMC and inductive reasoning, we were able to successfully verify safety properties for systems controlling large networks of realistic size. This was enabled by encodings of the state space, the transition relation, and of the safety properties that can be efficiently evaluated by SMT solvers supporting bit vector and integer arithmetics.

In order to compare our verification approach to the approaches that use BDD-based symbolic model checking, a translation from our model to NuSMV – a well-known BDD-based symbolic model checker – is currently in progress. For future work, we will benchmark how sequential release affects the complexity, and hence verification challenges, of interlocking models. Furthermore, we will investigate advanced techniques for automating the process of discovering strengthening invariants, or reducing the size of the networks that need to be modeled. For the current model there are potential overlaps between the strengthening invariants, which should be eliminated in order to reduce the size of the formula to be solved by the SMT solver.

Acknowledgments. The authors would like to thank Jan Bertelsen from Thales and Ross Edwin Gammon and Nikhil Mohan Pande from Railnet Denmark for helping us with their expertise about Danish interlocking systems and always being helpful when we had questions; Dr.-Ing. Uwe Schulze and Florian Lapschies from University of Bremen for their help with the implementation in the RT-Tester tool-chain. The first two authors' research has been funded by the RobustRailS project granted by the Danish Council for Strategic Research. The third author's work has been partially funded by ITEA2 project openETCS under grant agreement 11025.

# References

- 1. M. Aanæs and H. P. Thai. Modelling and Verification of Relay Interlocking Systems. Master's thesis, Technical University of Denmark, DTU Informatics, 2012.
- 2. Salimeh Behnia, Amel Mammar, Jean-Marc Mota, Nicolas Breton, Paul Caspi, and Pascal Raymond. Industrialising a Proof-based Verification Approach of Computerised Interlocking Systems. In Eleventh International Conference on Computer System Design and Operation in the Railway and Other Transit Systems (COMPRAIL08). WIT Press, 2008.
- 3. Leonardo De Moura, Harald Rueß, and Maria Sorea. Bounded Model Checking and Induction: From Refutation to Verification. In *Computer Aided Verification*, pages 14–26. Springer, 2003.
- 4. ERTMS. Annex A for ETCS Baseline 3 and GSM-R Baseline 0, April 2012.
- 5. CENELEC European Committee for Electrotechnical Standardization. EN 50128:2011 - Railway applications - Communications, signalling and processing systems - Software for railway control and protection systems. 2011.
- 6. Alessandro Fantechi. Distributing the Challenge of Model Checking Interlocking Control Tables. In Tiziana Margaria and Bernhard Steffen, editors, Leveraging Applications of Formal Methods, Verification and Validation. Applications and Case Studies, volume 7610 of Lecture Notes in Computer Science, pages 276–289. Springer, 2012.
- 7. Alessandro Fantechi. Twenty-Five Years of Formal Methods and Railways: What Next? In Steve Counsell and Manuel Núñez, editors, Software Engineering and Formal Methods, volume 8368 of Lecture Notes in Computer Science, pages 167–183. Springer, 2014.
- 8. Alessio Ferrari, Gianluca Magnani, Daniele Grasso, and Alessandro Fantechi. Model Checking Interlocking Control Tables. In Eckehard Schnieder and Géza Tarnai, editors, FORMS/FORMAT 2010 Formal Methods for Automation and Safety in Railway and Automotive Systems, pages 107–115. Springer, 2010.
- 9. Helle Hvid Hansen, Jeroen Ketema, Bas Luttik, Mohammad Reza Mousavi, Jaco van de Pol, and Osmar Marchi dos Santos. Automated Verification of Executable UML Models. In Bernhard K. Aichernig, Frank S. de Boer, and Marcello M. Bonsangue, editors, *FMCO*, volume 6957 of *Lecture Notes in Computer Science*, pages 225–250. Springer, 2010.
- 10. Anne E. Haxthausen, Marie Le Bliguet, and Andreas A. Kjær. Modelling and Verification of Relay Interlocking Systems. In Christine Choppy and Oleg Sokolsky, editors, 15th Monterey Workshop: Foundations of Computer Software, Future Trends and Techniques for Development, number 6028 in Lecture Notes in Computer Science, pages 141–153. Springer, 2010. Invited paper.

- 11. Anne E. Haxthausen and Jan Peleska. Formal Development and Verification of a Distributed Railway Control System. In *IEEE Transactions on Software Engineering*, volume 26, pages 687–701. IEEE, 2000.
- 12. Anne E. Haxthausen and Jan Peleska. Efficient Development and Verification of Safe Railway Control Software. In *Railways: Types, Design and Safety Issues*, pages 127–148. Nova Science Publishers, Inc., 2013.
- 13. Anne E. Haxthausen, Jan Peleska, and Sebastian Kinder. A Formal Approach for the Construction and Verification of Railway Control Systems. In *Formal Aspects of Computing*, volume 23, pages 191–219. Springer, 2011.
- 14. Anne E. Haxthausen, Jan Peleska, and Ralf Pinger. Applied Bounded Model Checking for Interlocking System Designs. In Steve Counsell and Manuel Núñez, editors, Software Engineering and Formal Methods, volume 8368 of Lecture Notes in Computer Science, pages 205–220. Springer, 2014.
- 15. Philip James, Faron Möller, HoangNga Nguyen, Markus Roggenbach, Steve Schneider, Helen Treharne, Matthew Trumble, and David Williams. Verification of Scheme Plans Using CSP||B. In Steve Counsell and Manuel Núñez, editors, Software Engineering and Formal Methods, volume 8368 of Lecture Notes in Computer Science, pages 189–204. Springer, 2014.
- 16. Phillip James and Markus Roggenbach. Automatically Verifying Railway Interlockings Using SAT-based Model Checking. In *Electronic Communications of the EASST*, volume 35. EASST, 2011.
- 17. Jan Peleska. Industrial-Strength Model-Based Testing State of the Art and Current Challenges. In Alexander K. Petrenko and Holger Schlingloff, editors, Proceedings 8th Workshop on Model-Based Testing, Rome, Italy, volume 111 of *Electronic Proceedings in Theoretical Computer Science*, pages 3–28. Open Publishing Association, 2013.
- 18. Jan Peleska, Elena Vorobev, and Florian Lapschies. Automated Test Case Generation with SMT-Solving and Abstract Interpretation. In Mihaela Gheorghiu Bobaru et al., editor, NASA Formal Methods, volume 6617 of Lecture Notes in Computer Science, pages 298–312. Springer, 2011.
- 19. Gregor Theeg, Sergeĭ Valentinovich Vlasenko, and Enrico Anders. Railway Signalling & Interlocking: International Compendium. Eurailpress, 2009.
- 20. David Tombs, Neil Robinson, and George Nikandros. Signalling Control Table Generation and Verification. In CORE 2002: Cost Efficient Railways through Engineering, page 415. Railway Technical Society of Australasia/Rail Track Association of Australia, 2002.
- 21. Verified Systems International GmbH. RT-Tester Model-Based Test Case and Test Data Generator RTT-MBT User Manual, 2013.
- 22. Linh Hong Vu, Anne E. Haxthausen, and Jan Peleska. A Domain-Specific Language for Railway Interlocking Systems. In Eckehard Schnieder and Géza Tarnai, editors, FORMS/FORMAT 2014 10th Symposium on Formal Methods for Automation and Safety in Railway and Automotive Systems, pages 200–209. Institute for Traffic Safety and Automation Engineering, Technische Universität Braunschweig, 2014.
- 23. Kirsten Winter. Optimising Ordering Strategies for Symbolic Model Checking of Railway Interlockings. In Tiziana Margaria and Bernhard Steffen, editors, Leveraging Applications of Formal Methods, Verification and Validation. Applications and Case Studies, volume 7610 of Lecture Notes in Computer Science, pages 246–260. Springer, 2012.

# Dynamic State Machines for Formalizing Railway Control System Specifications

Ugo Gentile<sup>1</sup>, Roberto Nardone<sup>1</sup>, Adriano Peron<sup>1</sup>, Valeria Vittorini<sup>1</sup>, Stefano Marrone<sup>3</sup>, Renato De Guglielmo<sup>2</sup>, Nicola Mazzocca<sup>1</sup>, and Luigi Velardi<sup>2</sup>

<sup>1</sup> Università di Napoli "Federico II", {ugo.gentile,roberto.nardone,adrperon,valeria.vittorini,nicola.mazzocca}@unina.it

<sup>2</sup> AnsaldoSTS, {renato.deguglielmo,luigi.velardi}@ansaldo-sts.com

<sup>3</sup> Seconda Università di Napoli, stefano.marrone@unina2.it

Abstract. Verification of railway control systems requires a hard testing activity regulated by international standards which explicitly recommend the usage of Finite State Machines (FSMs) to model the specification of the system under test. Despite the great number of works addressing the usage of FSMs and their extensions, actual model-driven verification processes still lack notations that are concise and expressive enough to easily capture characteristic features of specific domains. This paper introduces DSTM4Rail, a hierarchical state machine formalism to be used in verification contexts, whose peculiarity mainly resides in the semantics of fork-and-join, which allows dynamic (bounded) instantiation of machines (processes). The formalism described in this paper is industry driven, as it arises from real industrial needs in the context of a European project. Hence, the proposed semantics is motivated by illustrating concrete issues in modeling specific functionalities of the Radio Block Centre, the vital core of the ERTMS/ETCS Control System.

**Keywords:** State Machine Dynamic Instantiation, Railway Control System, Metamodel, Model Driven, System Testing, CRYSTAL

# 1 Introduction

One of the most critical components installed in modern railways is the signaling system, which aims at guaranteeing the complete control of the railway traffic with a high level of safety, essentially to prevent trains from colliding. These systems shall be validated against system requirements, given by the client and by international standards, such as the CENELEC norms as far as the European standards are concerned (i.e., EN50128 [6] and EN50126 [5]). The first step of the V&V process is to describe the system behaviour and requirements by using a state-based language (highly recommended by the standards during these phases of the life cycle). The work described in this paper is part of a wider research activity carried out within the ongoing ARTEMIS Joint Undertaking project CRYSTAL (CRitical sYSTem engineering Acceleration) [8] with the objective of alleviating the high effort (in terms of costs and time) required by the V&V activities [3]. CRYSTAL is strongly industry-oriented and will provide ready-to-use integrated tool chains having a mature technology-readiness level. To achieve technical innovation, CRYSTAL developed a user-driven approach [20] by applying engineering methods to industrially relevant *Use Cases* from the automotive, aerospace, rail and health sectors, and aims at increasing the maturity of existing concepts developed in previous projects on European and national level (e.g., CESAR [7] and MBAT [17]).

Our work is conducted in the railway domain, according to the needs expressed by Ansaldo STS (ASTS), an international transportation leader in the field of signaling and integrated transport systems for passenger traffic (Railway/Mass Transit) and freight operation. Our ultimate goal in CRYSTAL is to reduce the time needed for the definition of system level tests of railway control systems. To meet this objective, a model-driven approach for the automated generation of test cases is being developed. The starting point is the definition of a domain specific formal state-based language (DSTM4Rail) to be used for modeling the system behavior and formalizing the requirements (from which the test specifications are obtained). This paper specifically introduces DSTM4Rail (Dynamic STate Machine for Railways control systems). The name of the language indicates that its definition was driven by the specific needs expressed by ASTS. At this stage, DSTM4Rail has been developed to model the behavior and the requirements of a railway control system for system testing purposes, but it could be applied to model different critical control systems. The motivation for a new language resides in the requirements expressed by the railway industry: a formal language to be integrated into a model-driven process, as simple as possible and as rich as needed [10]. Consequently, the idea of extending existing languages has been discarded in order to enable the effective usage of the language in the industrial setting. The formalism metamodel we propose allows for modeling the dynamic (bounded) instantiation of machines (processes) as well as communications and timing constraints. The focus of the paper is on the control flow; hence the semantics of fork-and-join is formally defined here in order to model discrete behaviors through finite state-transition systems.

The paper is organized as follows. Section 2 presents the application domain and states the language requirements. Section 3 describes DSTM4Rail through its metamodel and introduces its formal syntax and semantics. Section 4 gives some meaningful examples of the application of the language to real cases. Section 5 provides a discussion about the related work and clarifies the motivation behind the introduction of DSTM4Rail. Finally, Section 6 contains some closing remarks.

# 2 RBC Use Case and Language Requirements

The CRYSTAL Use Case from ASTS is the Radio Block Centre (RBC) system, a computer-based system whose aim is to control the movements of the set of trains on the track area that is under its supervision, in order to guarantee a safe inter-train distance according to the ERTMS/ETCS specifications. ERTMS/ETCS (European Rail Traffic Management System/European Train Control System) is a standard for the interoperability of the European railway signaling systems that ensures both technological compatibility among trans-European railway networks and integration of the new signalling systems with the existing national train interlocking systems. The RBC system is in charge of timely transmitting to the on-board system of each train its up-to-date Movement Authority (MA) and the related speed profile; in addition, it is responsible for the management of the emergency situations within its own sub-track. The industrial needs regarding the specification language can be summarized as follows: a) it must have formal syntax and formal semantics; b) it must be easy to understand and use; c) it must be suitable for modeling domain specific behavior and requirements. In other words, the language must provide *primitives* able to cope in a non-ambiguous and simple way with specific modeling issues, and specifically:

- R1 concurrent execution flows: concurrency shall be allowed as RBC manages concurrent execution flows (e.g., it handles simultaneously several communications with other systems).
- R2 instantiation/termination of machines: fork and join of control flows shall be allowed, as well as the possibility to instantiate new machines synchronously and asynchronously. In addition, the *preemption* property should be considered on joins, in order to force the termination of a machine.
- R3 trigger and condition: a special trigger (say "any") shall be defined in order to help modelers when specifying a transition which is triggered by the occurrence of an event not belonging to a given set; similarly a special "any" condition shall be allowed.
- R4 broadcast communication: broadcasting communication between machines shall be allowed in order to manage situations in which different machines need to be triggered by the same event (e.g. in case of watchdog timers).
- R5 timers: timers shall be considered as well as the trigger corresponding to their expiration; activation/deactivation of timers shall be introduced as actions.
- R6 *variables*: variables are necessary in order to store information and to enable a concise representation of machines.

In order to contextualize DSTM4Rail we provide a general picture of the model-driven test case generation approach we are currently developing within the CRYSTAL project. Fig. 1(b) shows the system level testing process adopted by ASTS (instantiated on the RBC).

A set of Test Specifications is derived from the requirements. Then, Test Cases are obtained from the Test Specifications and translated into executable tests. These activities shall be conducted by the V&V team, which is independent from the development team and must have no knowledge of the development (this is mandatory, since CENELEC standards are applicable). Fig. 1(a) depicts the proposed test case generation approach: it affects the part of the testing process enclosed in the dashed box in Fig. 1(b). Chains of model transformations yield Test Cases by applying model checking techniques to a state-based specification of the system behavior and the Test Specifications [2, 18]. These models should be independent from the specific model checker. Hence, besides the definition of a proper formal state-based language (DSTM4Rail), the approach will require the development of a domain specific modeling language (Intermediate DSML) both as the target language of model transformation engines from the state-based models and as the source language to different model checkers. The Test Specification Patterns will provide general reusable models for recurrent classes of requirements [9].

Fig. 1. Test case generation approach.

# 3 DSTM4Rail

DSTM4Rail extends Hierarchical State Machines [1] by adding concepts of fork, join and recursive execution of machines inside a box, allowing for the *dynamic instantiation* of machines. The main advantages are: (1) each state machine may be parametric over a finite set of dynamically evaluated parameters, and (2) the same machine may be instantiated many times without explicitly replicating its entire structure. From now on we refer to an *executing* state machine as a *process*.

### 3.1 Metamodel

An excerpt of the DSTM4Rail metamodel is shown in Fig. 2. An Ecore diagram is used, according to the technology adopted to generate the model editor and graphical interface [21]. As the focus of the paper is on the representation and evolution of the control flow, the described portion of the metamodel introduces the syntactical elements aiming at covering the requirements R1, R2, R4 and R6 of Section 2. Concepts and relationships pertaining to data flow are just outlined.

The main class is Dynamic State Machine (DSTM), which represents the entire specification model. It is characterized by the attribute max\_proc, which indicates the maximum number of processes active at each instant of time (this bound keeps a DSTM within the class of finite state machines). A DSTM is composed of different Machines, Channels and Variables. For the sake of simplicity, in this paper we consider Channels and Variables with a global scope; they allow for communication between machines. A single Machine is composed of Vertexes, Transitions and Parameters.



Fig. 2. DSTM4Rail metamodel.

The class *Vertex* is abstract since different kinds of vertexes (with different features and constraints) may be present in a machine. The vertex types are:

- Node: node of a machine;
- Entering Node: PseudoNode, entry point of a machine;
- Initial Node: default entering node of a machine (at most one for each machine);
- Exiting Node: exit (or final) point of a machine (more than one if return conditions are required);
- Box: encloses one or more state machines which are concurrently instantiated when the box is entered;
- Fork: PseudoNode, splits an incoming transition into several outgoing transitions; it allows for instantiating one or more processes either synchronously or asynchronously with the currently executing process;
- Join: PseudoNode, merges outgoing transitions from concurrently executing processes; it synchronizes the termination of concurrently executing processes, or forces their termination when a process is able to perform a preemptive exiting transition.

The classes Fork, Join and EnteringNode inherit from the abstract class PseudoNode, which encompasses the different types of transient vertexes in the machine. Entering and exiting nodes define the interface of each machine; in particular, the first node of a process is either the initial node or an entering node (when explicitly specified in the higher level box instantiating the machine). The association between Box and Machine enables concurrent processes to be instantiated when entering the Box. Note that an entering node and/or exiting node and/or values of parameters can be specified only if the box instantiates a single machine; otherwise the default ones are considered. Two associations between Node and Action indicate that entering and exiting actions may be specified for a node, i.e., the sets of behaviours to be performed when entering and exiting the node, respectively.
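To make the vertex hierarchy more concrete, the following is a minimal Python sketch of the classes described above. It is only an illustration, not the Ecore metamodel of Fig. 2: the attribute names (`entry_action`, `exit_action`, `asynchronous`, `machines`, `vertexes`) and the helper `initial_node` are assumptions introduced here, and placing `ExitingNode` under `Node` follows the later formal syntax, where exiting nodes are a subset of the nodes.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Vertex:
    """Base of all vertex kinds (abstract in the metamodel)."""
    name: str


@dataclass
class Node(Vertex):
    # Optional entering/exiting actions, kept as plain strings in this sketch.
    entry_action: Optional[str] = None
    exit_action: Optional[str] = None


@dataclass
class ExitingNode(Node):
    """Exit (final) point of a machine."""


@dataclass
class PseudoNode(Vertex):
    """Transient vertex: a process never rests in a pseudo-node."""


@dataclass
class EnteringNode(PseudoNode):
    """Entry point of a machine."""


@dataclass
class InitialNode(EnteringNode):
    """Default entering node (at most one per machine)."""


@dataclass
class Fork(PseudoNode):
    asynchronous: bool = False  # assumed flag for the asynchronous variant


@dataclass
class Join(PseudoNode):
    """Merges control flows of concurrently executing processes."""


@dataclass
class Box(Vertex):
    # Machines instantiated concurrently when the box is entered.
    machines: list[Machine] = field(default_factory=list)


@dataclass
class Machine:
    name: str
    vertexes: list[Vertex] = field(default_factory=list)

    def initial_node(self) -> Optional[InitialNode]:
        """Return the default entering node, enforcing 'at most one'."""
        initials = [v for v in self.vertexes if isinstance(v, InitialNode)]
        if len(initials) > 1:
            raise ValueError("at most one initial node per machine")
        return initials[0] if initials else None
```

Modeling the vertex kinds as subclasses lets a check such as `isinstance(v, PseudoNode)` express "transient vertex" directly, which is the distinction the constraints on transitions rely on.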

A Transition is associated with a source and a destination vertex, and may specify a Trigger, a Condition and an Action. Without loss of generality, we can assume that a Trigger is a logical expression freely built with the standard logical connectives over a suitable set of elementary triggers (basic events); analogously, a Condition is a logical expression freely built with the standard logical connectives over a suitable set of atomic propositions, which provides fine-grained control over the firing of the transition; finally, the Action is a sequence of elementary actions (operations), induced by the application domain, to be performed when the transition fires. Triggers are related to the reception of messages on channels, hence an association with *Channel* is present. Analogously, conditions may be expressed over a set of variables (an association is present between Condition and Variable). An action may cause the update of a variable or the transmission of messages on channels, hence two associations, with Channel and Variable, have been inserted. A Transition can also specify an entering node of an activated machine (when entering a box), if different from the default one; hence an association with EnteringNode is present. Analogously, it can specify a precise exiting node when returning from a box; hence another association with ExitingNode is also present.

A set of constraints is defined in order to forbid: the definition of transitions between arbitrary kinds of vertexes; the introduction of triggers on transitions exiting *PseudoNodes*; setting the attribute *isPreemptive* (of *Transition*) to true when the destination of a transition is not a *Join*; and the specification of entering or exiting nodes for transitions which do not connect boxes. These constraints are formally defined in the next Subsection.

### 3.2 Formal Syntax

In this Subsection we formally provide the fragment of the abstract syntax of DSTM4Rail needed to represent the control flow (with respect to the metamodel elements previously described).

Let $\mathcal{T}$, $\mathcal{C}$ and $\mathcal{A}$ be the syntactical categories of triggers, conditions and actions, respectively. An element of $\mathcal{T}$ (resp. $\mathcal{C}$, $\mathcal{A}$) is a trigger (resp. condition, action) expression. Assume that the symbol $\overline{\tau} \in \mathcal{T}$ represents the trigger "always\_available" and the symbol $\overline{\alpha} \in \mathcal{A}$ represents the action "no\_action". Let X be a set of variables and let $\Phi$ be a set of channels.

**Definition 1 (Dynamic STate Machine).** A DSTM D over  $\mathcal{T}$ ,  $\mathcal{C}$  and  $\mathcal{A}$  is a tuple  $\langle M_1, \ldots, M_n, X, \Phi, max\_proc, max\_inst \rangle$  where:

- $M_i$  is a tuple  $\langle N_i, En_i, df_i, Ex_i, Bx_i, Y_i, Fk_i, Jn_i, \delta_i \rangle$  for  $1 \leq i \leq n$ , where:
  - $N_i$  is a (finite) set of nodes;
  - $En_i$  is a (finite) set of entering pseudo-nodes;
  - $df_i \in En_i$  is the initial node (default);
  - $Ex_i \subseteq N_i$  is a set of exiting nodes;
  - $Bx_i$  is a (finite) set of boxes;
  - $Y_i: Bx_i \to \{1, \ldots, n\}^*$  assigns to every box a sequence (list) of machine indexes;
  - $Fk_i$  is a (finite) set of fork pseudo-nodes;
  - $Fk_i \times \{\downarrow\}$  is a (finite) set of asynchronous fork pseudo-nodes;
  - $Jn_i$  is a (finite) set of join pseudo-nodes;
  - $Jn_i \times \{ \otimes \}$  is a (finite) set of preemptive join pseudo-nodes;
  - $\delta_i \subseteq Source \times \mathcal{T} \times \mathcal{C} \times \mathcal{A} \times Target$  is the transition relation, where:  $Source = (N_i \setminus Ex_i) \cup En_i \cup Bx_i \cup (Bx_i \times Ex(D)) \cup Fk_i \cup (Fk_i \times \{\downarrow\}) \cup Jn_i;$  $Target = N_i \cup Bx_i \cup (Bx_i \times En(D)) \cup Fk_i \cup Jn_i \cup (Jn_i \times \{\otimes\}).$

  A transition in $\delta_i$ must match one of the following cases (constraints):

  - "implicit transition" whenever the source is in $En_i$ and the decoration is $(\overline{\tau}, true, \alpha)$ with $\alpha \in \mathcal{A}$;
  - "internal transition" whenever both source and target are in $N_i$;
  - "entering fork transition" whenever the source is in $Source \setminus (Fk_i \cup (Fk_i \times \{\downarrow\}))$ and the target is in $Fk_i$;
  - "asynchronous fork" whenever the source is in $Fk_i \times \{\downarrow\}$, the target is in $N_i$ and the decoration is $(\overline{\tau}, true, \alpha)$ with $\alpha \in \mathcal{A}$;
  - "internal join" whenever the source is in $N_i$ and the target is in $Jn_i \cup (Jn_i \times \{\otimes\})$;
  - "exiting join transition" whenever the source is in $Jn_i$, the target is in $Target \setminus (Jn_i \cup (Jn_i \times \{\otimes\}))$ and the decoration is $(\overline{\tau}, true, \alpha)$ with $\alpha \in \mathcal{A}$;
  - "call by default" whenever the target is in $Bx_i$; if the source is in $Fk_i$, the decoration is $(\overline{\tau}, true, \alpha)$ with $\alpha \in \mathcal{A}$;
  - "call by entering" whenever the target has the form $(bx, en)$ where $bx \in Bx_i$ and $en \in En_j$ with $j = Y_i(bx)$; if the source is in $Fk_i$, the decoration is $(\overline{\tau}, true, \alpha)$ with $\alpha \in \mathcal{A}$;
  - "return by default" whenever the source is in $Bx_i$ and the decoration is $(\overline{\tau}, true, \alpha)$ with $\alpha \in \mathcal{A}$;
  - "return by exiting" whenever the source has the form $(bx, ex)$ where $bx \in Bx_i$ and $ex \in Ex_j$ with $j = Y_i(bx)$, and the decoration is $(\overline{\tau}, true, \alpha)$ with $\alpha \in \mathcal{A}$;
  - "return by interrupt" whenever the source is in $Bx_i$ and the decoration is $(\tau, true, \alpha)$ with $\tau \in \mathcal{T} \setminus \{\overline{\tau}\}$ and $\alpha \in \mathcal{A}$;
- X is a (finite) set of variables;
- $\Phi$ is a (finite) set of channels;
- max\_proc is the maximum number of processes concurrently active in each instant of time;
- $max\_inst: \{M_1, \ldots, M_n\} \rightarrow \{1, \ldots, max\_proc\}$  assigns to each machine the maximum number of instantiations.

Note that this definition constrains transitions to belong to a predefined set of kinds, avoiding the possibility to freely connect vertexes of a machine; furthermore the decoration of transitions exiting from pseudo-nodes shall not have a trigger or a condition.
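As an illustration of how the cases of Definition 1 restrict the transition relation, the following Python sketch classifies a single transition of one machine. It is a simplification under stated assumptions: vertexes are plain strings, marked pseudo-nodes are pairs such as `("f", ASYNC)`, the cases "call by entering" and "return by exiting" are omitted because they need the sets of other machines, and membership of the source in $Source$ is only approximated. The names `MachineSets`, `transition_kinds`, `ALWAYS`, `ASYNC` and `PREEMPT` are hypothetical.

```python
from dataclasses import dataclass, field

ALWAYS = "always_available"  # stands for the trigger written as tau-bar
ASYNC = "async"              # stands for the down-arrow marker on forks
PREEMPT = "preempt"          # stands for the otimes marker on joins


@dataclass
class MachineSets:
    """Vertex sets of one machine M_i, named as in Definition 1."""
    N: set = field(default_factory=set)    # nodes
    En: set = field(default_factory=set)   # entering pseudo-nodes
    Ex: set = field(default_factory=set)   # exiting nodes (subset of N)
    Bx: set = field(default_factory=set)   # boxes
    Fk: set = field(default_factory=set)   # fork pseudo-nodes
    Jn: set = field(default_factory=set)   # join pseudo-nodes


def transition_kinds(m, source, trigger, condition, target):
    """Return the names of the cases matched by (source, trigger, condition, target).

    An empty list means the transition is not admitted by this sketch.
    """
    plain = trigger == ALWAYS and condition is True  # (tau-bar, true, alpha)
    src_node = source in m.N and source not in m.Ex  # exiting nodes are no sources
    src_fork = source in m.Fk
    src_async_fork = (isinstance(source, tuple)
                      and source[0] in m.Fk and source[1] == ASYNC)
    tgt_join = target in m.Jn or (isinstance(target, tuple)
                                  and target[0] in m.Jn and target[1] == PREEMPT)
    kinds = []
    if source in m.En and plain:
        kinds.append("implicit transition")
    if src_node and target in m.N:
        kinds.append("internal transition")
    if target in m.Fk and not (src_fork or src_async_fork):
        kinds.append("entering fork transition")
    if src_async_fork and target in m.N and plain:
        kinds.append("asynchronous fork")
    if src_node and tgt_join:
        kinds.append("internal join")
    if source in m.Jn and not tgt_join and plain:
        kinds.append("exiting join transition")
    if target in m.Bx and (plain or not src_fork):
        kinds.append("call by default")
    if source in m.Bx and plain:
        kinds.append("return by default")
    if source in m.Bx and trigger != ALWAYS and condition is True:
        kinds.append("return by interrupt")
    return kinds


# Hypothetical machine with one node, one fork, one join and one box.
m = MachineSets(N={"idle"}, En={"init"}, Fk={"f"}, Jn={"j"}, Bx={"connect"})
print(transition_kinds(m, "idle", "CONN_REQ", "cont < P_MAX_TRAIN", "f"))
print(transition_kinds(m, "f", "CONN_REQ", True, "idle"))
```

The second call returns an empty list: a transition leaving a fork with a trigger matches none of the cases, which is the effect described in the note above.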

### 3.3 Sketch of the Formal Semantics

The evolution of a DSTM is a sequence of instantaneous reactions (*steps*). A step is a maximal set of transitions which are triggered by the current set of available events, under the following constraints:

- 1. a node/box cannot be entered and exited simultaneously in a step (this is instead possible for pseudo-nodes); as a consequence, if a transition  $t_i$  enters into a node n (resp. box b), and a transition  $t_j$  exits from n (resp. b), then  $t_i$  and  $t_j$  cannot fire in the same step;
- 2. the events generated in a step (through exit actions of the exited nodes, actions of the fired transitions and entry actions of the entered nodes) cannot trigger transition firings in the same step, but only in the next one.

As usual, formal semantics can be provided by means of a Labeled Transition System (LTS) as a 4-tuple  $L = \langle S, \Sigma, \Delta, S_0 \rangle$ , where:

- S is a non-empty set of states;
- $\Sigma$ is a non-empty alphabet of labels;
- $\Delta$ is a transition relation, i.e., a subset of $S \times \Sigma \times S$;
- $S_0 \subseteq S$ is a set of initial states.
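For readers who prefer code to tuples, the 4-tuple above can be written down as a small Python structure. This is a generic sketch of an explicit, finite LTS and not the LTS induced by a DSTM; the class name `LTS`, its field names and the `reachable` helper are assumptions made for this example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class LTS:
    states: frozenset   # S
    labels: frozenset   # Sigma
    delta: frozenset    # Delta: a set of (state, label, state) triples
    initial: frozenset  # S_0

    def __post_init__(self):
        # Check the side conditions stated in the definition.
        if not self.states or not self.labels:
            raise ValueError("S and Sigma must be non-empty")
        if not self.initial <= self.states:
            raise ValueError("S_0 must be a subset of S")
        for (s, a, t) in self.delta:
            if s not in self.states or a not in self.labels or t not in self.states:
                raise ValueError(f"({s}, {a}, {t}) is not in S x Sigma x S")

    def reachable(self):
        """Return the states reachable from S_0 (breadth-first search)."""
        seen = set(self.initial)
        queue = deque(self.initial)
        while queue:
            s = queue.popleft()
            for (src, _, dst) in self.delta:
                if src == s and dst not in seen:
                    seen.add(dst)
                    queue.append(dst)
        return seen


lts = LTS(
    states=frozenset({"s0", "s1", "s2"}),
    labels=frozenset({"step"}),
    delta=frozenset({("s0", "step", "s1")}),
    initial=frozenset({"s0"}),
)
print(sorted(lts.reachable()))
```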

With reference to a DSTM D (see *Definition 1*),  $s \in S$  represents the *current* state of D including: (a) control flow, (b) values of variables, (c) state (content) of channels, (d) set of events produced by actions. Formally:

**Definition 2 (DSTM state).** The state of a DSTM is a tuple  $\langle ControlTree, Frontier, \rho, Event_{now}, Event_{next} \rangle$ , where:

- 1. ControlTree describes the control flow;
- 2. Frontier is the subset of nodes in ControlTree that can be source of transitions in the current step;
- 3.  $\rho$  is the current evaluation of the set of variables X;
- 4.  $Event_{now}$  is the set of instantaneously detectable events;
- 5.  $Event_{next}$  is the set of events that will be detectable in the next step.

ControlTree is a tree representing processes, boxes and nodes currently active in the DSTM. Since the maximum number of processes simultaneously active in a DSTM is finite (limited by  $max\_proc$ ), the ControlTree is bounded in width and depth.

The Frontier is the subset of ControlTree nodes which can be updated by a transition firing in the LTS. The Frontier is used to avoid the firing of sequences of transitions in a single step: in order to fire, a transition of the DSTM has to exit only from nodes belonging to the Frontier. When a transition of the DSTM fires, the ControlTree is updated and the nodes exited by taking the transition are removed from the Frontier, thereby keeping track of the portion of the ControlTree that has already been updated. When no more transitions can fire (depending on the values of variables and channels, the control tree and the frontier), the LTS has a transition corresponding to the completion of the current step and the initialization of the next step. In this case the entire ControlTree is considered as the Frontier of the next step, and the $Event_{now}$ set of the next step is updated with the events produced in the current step.
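The role of the Frontier and of the two event sets can be illustrated with a deliberately flattened Python sketch. It assumes a flat set of active nodes with distinct names instead of a ControlTree, ignores variables, channels and conditions, and resolves non-determinism by taking the first enabled transition; `State`, `run_step` and the tuple layout of transitions are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    active: set      # currently active nodes (flat stand-in for the ControlTree)
    frontier: set    # nodes that may still be exited in the current step
    event_now: set   # events detectable in the current step
    event_next: set = field(default_factory=set)  # events for the next step


def run_step(state, transitions):
    """Fire enabled transitions until none is left, then complete the step.

    `transitions` is a list of (source, trigger, target, emitted) tuples;
    a trigger of None is always available, an emitted value of None emits nothing.
    """
    fired = True
    while fired:
        fired = False
        for source, trigger, target, emitted in transitions:
            enabled = (source in state.frontier
                       and (trigger is None or trigger in state.event_now))
            if enabled:
                state.active.remove(source)
                state.active.add(target)
                # The target is not added to the frontier: it cannot be
                # exited again within the same step.
                state.frontier.remove(source)
                if emitted is not None:
                    state.event_next.add(emitted)
                fired = True
                break
    # Step completion: every active node is in the next frontier and the
    # events produced in this step become detectable.
    state.frontier = set(state.active)
    state.event_now, state.event_next = state.event_next, set()
    return state


ts = [("a", "e1", "b", "e2"), ("b", "e2", "c", None)]
s = State(active={"a"}, frontier={"a"}, event_now={"e1"})
run_step(s, ts)   # only a -> b fires; e2 is deferred to the next step
run_step(s, ts)   # now b -> c fires
print(s.active)
```

Two calls are needed to move from `a` to `c`: the event `e2` produced in the first step only becomes detectable in the second one, and `b` is not in the Frontier while the first step is running.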

The ControlTree is formally defined as follows. A node $n_{ct}$ is an element of $\{0, 1, 2, \ldots, max\_proc\}^*$ (i.e., it is a numerical string); a tree $T_{ct}$ is a prefix-closed set of nodes, i.e., if $n \in T_{ct}$ and $n' \prec n$ then $n' \in T_{ct}$ (where $\prec$ is the prefix relation between strings). We denote as $Leaves(T_{ct})$ the set of leaves of $T_{ct}$. The following shorthands are used: $M(D) = \{M_1, \ldots, M_n\}$, $N(D) = \bigcup_{i=1}^{n} N_i$ and $Bx(D) = \bigcup_{i=1}^{n} Bx_i$.

**Definition 3 (ControlTree).** A Control Tree CT over a DSTM D is a pair $\langle T_{ct}, \lambda \rangle$, where

- $T_{ct}$ is a tree;
- $\lambda$ is a labeling function $\lambda: T_{ct} \to M(D) \cup N(D) \cup Bx(D)$

satisfying the following constraints for every $n \in T_{ct}$:

- 1. $n \in Leaves(T_{ct}) \Leftrightarrow \lambda(n) \in N(D)$;
- 2. $n \notin Leaves(T_{ct}) \Leftrightarrow \lambda(n) \in M(D) \cup Bx(D)$;
- 3. if $n = n'.i$ (i.e., $n'$ is the parent of $n$) with $i \in \{0, \ldots, max\_proc\}$:
  - $n \in Leaves(T_{ct}) \Rightarrow \exists j : \lambda(n) \in N_j$ and $\lambda(n') = M_j$;
  - $n \notin Leaves(T_{ct})$ and $\lambda(n) = bx$ with $bx \in Bx_j \Rightarrow \lambda(n') = M_j$;
  - $n \notin Leaves(T_{ct})$ and $\lambda(n) = M_j \Rightarrow \lambda(n') = bx \in Bx_k$ and $j$ occurs in $Y_k(bx)$, for some $k \in \{1, \ldots, n\}$.

This formalization defines the structure of a *ControlTree*: the root represents the highest level process, the leaves represent specific nodes in which each process is waiting, while internal nodes represent callers and called instance of processes. If a node of the *ControlTree* represents a node or a box of a machine inside a DSTM, then its parent shall necessarily represent that machine; if a node of the *ControlTree* represents a machine, then its parent shall necessarily represent the box which has instantiated that machine.
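The constraints of Definition 3 can be checked mechanically. The sketch below is one possible encoding, given as an assumption rather than as the authors' tooling: a tree node is a tuple of integers, a label is `("M", j)`, `("N", j, name)` or `("Bx", j, name)` with `j` a machine index, and `Y` maps `(j, box_name)` to the list of machine indexes of that box. The function names are hypothetical.

```python
def leaves(tree):
    """Nodes of the tree that are not a proper prefix of another node."""
    return {n for n in tree
            if not any(len(m) > len(n) and m[:len(n)] == n for m in tree)}


def is_prefix_closed(tree):
    """Every proper prefix of every node (including the root ()) is in the tree."""
    return all(n[:k] in tree for n in tree for k in range(len(n)))


def check_control_tree(labels, Y):
    """Return True if `labels` (node -> label) satisfies Definition 3."""
    tree = set(labels)
    if not is_prefix_closed(tree):
        return False
    leaf_set = leaves(tree)
    for n, lab in labels.items():
        # Constraints 1 and 2: exactly the leaves are labeled with nodes.
        if (n in leaf_set) != (lab[0] == "N"):
            return False
        if not n:  # the root has no parent, constraint 3 does not apply
            continue
        parent = labels[n[:-1]]
        if lab[0] in ("N", "Bx"):
            # A node or box of M_j must hang under a vertex labeled M_j.
            if parent != ("M", lab[1]):
                return False
        else:
            # A machine M_j must hang under a box whose list contains j.
            if parent[0] != "Bx" or lab[1] not in Y.get((parent[1], parent[2]), []):
                return False
    return True


# Hypothetical tree: M_1 rests in node a and has forked M_2 and M_3.
labels = {
    (): ("M", 1),
    (0,): ("N", 1, "a"),
    (1,): ("Bx", 1, "bx1"), (1, 0): ("M", 2), (1, 0, 0): ("N", 2, "a'"),
    (2,): ("Bx", 1, "bx2"), (2, 0): ("M", 3), (2, 0, 0): ("N", 3, "a''"),
}
Y = {(1, "bx1"): [2], (1, "bx2"): [3]}
print(check_control_tree(labels, Y))
```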

The transition relation $\Delta$ of the LTS is not formally described here for the sake of space; in the following we provide examples representing the evolution of the ControlTree, centred on the firing of a simple transition, of the fork and of the join operators. Fig. 3 exemplifies the evolution of the ControlTree in the simplest case of a transition firing between nodes. Let us suppose that the process, instance of $M_i$ in Fig. 3(a), is in the node a; after the firing of the shown transition, the *ControlTree* evolves as indicated in Fig. 3(b): the root represents $M_i$, and its child is a before the firing and b after the firing. Fig. 4 and Fig. 5 show the evolution of the ControlTree in case of fork and join. The fork adds two further branches to the process $M_i$, carrying the boxes bx1 and bx2, which instantiate respectively the processes $M_j$ (which is in its state a') and $M_k$ (which is in its state a''). After the fork, the process $M_i$ evolves concurrently with its children $M_j$ and $M_k$. Similarly, the join in Fig. 5(a) merges the internal control flow with those coming from the two boxes when the instantiated processes terminate: when processes $M_j$ and $M_k$ reach their exiting nodes (resp. $ex_j$ and $ex_k$) and $M_i$ is in the node a, the ControlTree evolves by removing the branches representing the instantiated processes, and the process $M_i$ resumes from the node b.
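The fork and join evolutions described above can be mimicked with a small Python sketch over nested dictionaries. It assumes, for brevity, that each box instantiates exactly one machine and that the join is non-preemptive; the functions `process`, `fork`, `can_join` and `join`, as well as the node in which the parent rests after the fork, are hypothetical choices for this example.

```python
def process(machine, node):
    """A process: an instance of `machine` resting in `node`, with no branches."""
    return {"machine": machine, "node": node, "boxes": {}}


def fork(proc, next_node, calls):
    """Move `proc` to `next_node` and add one branch per (box, machine, start)."""
    proc["node"] = next_node
    for box, machine, start in calls:
        proc["boxes"][box] = process(machine, start)


def can_join(proc, at_node, exits):
    """True if `proc` rests in `at_node` and each listed box rests in its exit."""
    return (proc["node"] == at_node
            and all(box in proc["boxes"] and proc["boxes"][box]["node"] == ex
                    for box, ex in exits.items()))


def join(proc, at_node, exits, next_node):
    """Remove the joined branches and resume `proc` from `next_node`."""
    if not can_join(proc, at_node, exits):
        return False
    for box in exits:
        del proc["boxes"][box]
    proc["node"] = next_node
    return True


mi = process("M_i", "a")
fork(mi, "a", [("bx1", "M_j", "a'"), ("bx2", "M_k", "a''")])
# The children evolve on their own until they reach their exiting nodes.
mi["boxes"]["bx1"]["node"] = "ex_j"
mi["boxes"]["bx2"]["node"] = "ex_k"
joined = join(mi, "a", {"bx1": "ex_j", "bx2": "ex_k"}, "b")
print(joined, mi)
```

After the join the two branches are gone and the parent process rests in `b`, mirroring the removal of branches from the ControlTree.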



Fig. 3. Transition between nodes.



Fig. 4. fork PseudoNode.



Fig. 5. join PseudoNode.

# 4 Application to the RBC Use Case

This Section describes the application of DSTM4Rail to the modeling of some RBC functionalities, chosen with the aim to highlight how this language solves easily (natively with state machines) some key modeling issues of the railway control systems. Specifically requirements R1 and R2 in Section 2 are covered with these use cases.

Communication establishment. When a train is going to establish a safe connection with an RBC, it sends a proper "initiation of communication session" (CONN\_REQ) message to it. RBC may accept connection requests only from a limited number of trains: this number is defined by the P\_MAX\_TRAIN parameter and depends on physical features. Beyond this value, RBC refuses a new connection by sending the train a proper message (CONN\_REF). This case calls for variables and parameters inside a machine, as well as flexible mechanisms for the dynamic instantiation and join of processes. Hence, we have to model the process which accepts or refuses the requests by checking the number of already accepted connections, stored in the cont variable, as well as the dynamic instantiation of the processes in charge of managing the communication with the specific train. Another difference with existing modeling languages is that the first process remains active to manage other communication requests after the instantiation of a lower level machine. Fig. 7 shows the modeling solution of this problem in DSTM4Rail, where the main process is called COMM\_EST and the lower level processes are called TRAIN\_CONN. A transition exits from the node idle, triggered by the CONN\_REQ reception event; if the number of already accepted connections cont is less than the value of the parameter P\_MAX\_TRAIN, cont is incremented and the fork instantiates a new process entering the box connect. The main process returns to the node idle in order to manage other communication requests. Note that the number of dynamic instantiations of this machine is constrained (by construction) to be at most P\_MAX\_TRAIN. When a process TRAIN\_CONN terminates, the control flow exiting from the box *connect* is joined with the control flow coming from the idle node in order to capture this termination and decrement the counter cont. This modeling solution is not allowed by widely used languages such as UML, which requires that a machine is suspended when lower level machines are activated. In UML [19] this situation can be managed by explicitly creating several replicas of the same machine. Fig. 6 shows the evolution of the ControlTree. When a new communication is established, a new TRAIN\_CONN machine is instantiated by the *connect* box and a new branch is inserted in the ControlTree. Hence, at a certain instant of time a set of parallel TRAIN\_CONN machines can independently evolve. When a process reaches the exiting node, its corresponding branch is removed from the ControlTree and the process is deallocated.
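The counting logic of COMM\_EST can be sketched in a few lines of Python. This is an informal rendering of the behaviour described in the text, not a translation of the model in Fig. 7: the class name `CommEst`, the reply string `"ACCEPTED"` and the use of train identifiers (assumed distinct) are assumptions introduced for the example, while `cont`, `P_MAX_TRAIN` and `CONN_REF` come from the text.

```python
class CommEst:
    """Main process: stays in `idle` and spawns one TRAIN_CONN per accepted train."""

    def __init__(self, p_max_train):
        self.p_max_train = p_max_train  # the P_MAX_TRAIN parameter
        self.cont = 0                   # number of currently accepted connections
        self.train_conns = set()        # stand-in for the branches under box `connect`

    def on_conn_req(self, train_id):
        """Handle CONN_REQ: fork a TRAIN_CONN if below the limit, else refuse.

        Assumes that connected trains have distinct identifiers.
        """
        if self.cont < self.p_max_train:
            self.cont += 1
            self.train_conns.add(train_id)
            return "ACCEPTED"           # hypothetical reply name
        return "CONN_REF"

    def on_train_conn_exit(self, train_id):
        """Join: a TRAIN_CONN reached its exiting node, so its slot is released."""
        if train_id in self.train_conns:
            self.train_conns.remove(train_id)
            self.cont -= 1


rbc = CommEst(p_max_train=2)
print(rbc.on_conn_req("T1"), rbc.on_conn_req("T2"), rbc.on_conn_req("T3"))
rbc.on_train_conn_exit("T1")
print(rbc.on_conn_req("T3"))
```

The main object never blocks while connections are open, which corresponds to the main process returning to `idle` after each fork.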

Fig. 6. Evolution of the ControlTree.

Fig. 7. Train registration management.

Establishment of communication session. Once the CONN\_REQ request is accepted, the communication session between the train and RBC must be established. If the procedure succeeds, RBC authorizes the train to move (Start-of-Mission, SoM). Euroradio defines a protocol for safe communication establishment. In short, RBC sends the SYSTEM\_VERSION message to the train; the train answers with an ACK and a SESSION\_ESTABLISHED message. If the SESSION\_ESTABLISHED message is received by RBC before the ACK, RBC sends the SYSTEM\_VERSION message again. After three attempts, or if other messages are received in the meantime, the procedure is aborted and the SoM procedure cannot be performed. This scenario may be easily modeled by providing a machine with multiple exit points: Fig. 8(a) shows how the SOM\_PROCEDURE machine is instantiated only after the termination of the SESSION\_ESTABLISHMENT process (whose model is depicted in Fig. 8(b)) through the ok exiting node.
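A machine with multiple exit points can be pictured as a function that returns the name of the exit it takes. The Python sketch below follows the handshake as described in the text under several assumptions: an "attempt" is one transmission of SYSTEM\_VERSION, a repeated ACK counts as an "other message", and the exit name `"abort"` and the function `session_establishment` are hypothetical (only the `ok` exit is named in the text).

```python
MAX_ATTEMPTS = 3


def session_establishment(received, send):
    """Run the RBC side of the handshake and return the name of the exit taken.

    received: iterable of message names from the train, in arrival order.
    send:     callable used to transmit a message name to the train.
    Returns "ok", "abort", or None if the input ends while still waiting.
    """
    attempts = 1
    send("SYSTEM_VERSION")
    got_ack = False
    for msg in received:
        if msg == "ACK" and not got_ack:
            got_ack = True
        elif msg == "SESSION_ESTABLISHED" and got_ack:
            return "ok"
        elif msg == "SESSION_ESTABLISHED":
            # Received before the ACK: retry unless attempts are exhausted.
            if attempts == MAX_ATTEMPTS:
                return "abort"
            attempts += 1
            send("SYSTEM_VERSION")
        else:
            return "abort"  # any other message aborts the procedure
    return None


sent = []
exit_taken = session_establishment(["ACK", "SESSION_ESTABLISHED"], sent.append)
if exit_taken == "ok":
    print("instantiate SOM_PROCEDURE")
```

The caller decides what to instantiate next from the returned exit name, which plays the role of the "return by exiting" transitions leaving the box.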

Management of the train movement. During the movement of the train, the RBC periodically sends the Movement Authority (MA) to the train (Section 2); concurrently, RBC has to monitor the commands that come from the Centralized Traffic Control (CTC), where a human operator may raise an alarm which requires the train to brake: in this case an Unconditional Emergency Stop (UES) message is sent to the train. On the other hand, when the train successfully ends its trip, RBC performs the "End of Mission" (EoM) procedure. This scenario requires representing concurrently executing machines, one of which may force the termination of the others. DSTM4Rail models this situation by a preemptive join, as shown in Fig. 9, where the processes CENTRAL\_CONTROL and PERIODIC\_MA are executed concurrently but, when the first machine reaches the *UES* exiting node, the join on the left preemptively forces the process PERIODIC\_MA to terminate. In this case the machine EMERGENCY\_MANAGEMENT is instantiated. Conversely, if the process PERIODIC\_MA terminates in the *EoM* exiting node, the join on the right preemptively forces the CENTRAL\_CONTROL process to terminate, and the END\_OF\_MISSION machine is instantiated.
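The effect of a preemptive join resembles a familiar concurrency pattern: wait for the first of several tasks to finish and cancel the rest. The following Python `asyncio` sketch uses that pattern as an analogy only; it is not the DSTM4Rail semantics, the timing values are arbitrary, and the coroutine names and parameters (`alarm_after`, `trip_ticks`, `period`) are assumptions made for the example.

```python
import asyncio


async def central_control(alarm: asyncio.Event) -> str:
    """Waits for an operator alarm, then leaves through the UES exit."""
    await alarm.wait()
    return "UES"


async def periodic_ma(trip_ticks: int, period: float) -> str:
    """Sends the MA periodically; leaves through the EoM exit when the trip ends."""
    for _ in range(trip_ticks):
        await asyncio.sleep(period)  # stand-in for "send MA to the train"
    return "EoM"


async def train_movement(alarm_after=None, trip_ticks=3, period=0.01) -> str:
    alarm = asyncio.Event()
    tasks = {
        asyncio.create_task(central_control(alarm)),
        asyncio.create_task(periodic_ma(trip_ticks, period)),
    }
    if alarm_after is not None:
        asyncio.get_running_loop().call_later(alarm_after, alarm.set)
    # Preemptive join: the first process to reach its exit wins ...
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    # ... and forces the termination of the other one.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    exit_node = done.pop().result()
    # The exit taken selects the machine to instantiate next.
    return {"UES": "EMERGENCY_MANAGEMENT", "EoM": "END_OF_MISSION"}[exit_node]


print(asyncio.run(train_movement(alarm_after=0.0, period=0.05)))
print(asyncio.run(train_movement()))
```

The sketch assumes the two exits are not reached at the same instant; a tie would need an explicit priority, which the text does not discuss.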





Fig. 8. Establishment of communication session.

Fig. 9. Management of the train movement.

# 5 Discussion and Related Work

A number of formal methods and techniques have been developed by the scientific community in the past decades and applied to the development of critical systems, including railway applications [4]. Though their usage is not yet widespread in industrial settings, Finite State Machines (FSMs) are widely used in modeling systems where control handling aspects are predominant. Statecharts [13] extend FSMs with hierarchy, concurrency and communication among concurrent components. Hierarchy is achieved by injecting FSMs into states of other FSMs. Concurrency is achieved by composing FSMs in parallel and by letting them run synchronously.

Among the different variants of Statecharts, those integrated in UML 2.0 [19] are widely used. UML State Machines admit parallel execution through the usage of composite states and regions. In this formalism, the fork (and join) is used in order to split (and merge) an incoming transition into two or more transitions terminating on orthogonal target vertices (i.e., vertices in different regions of a composite state). Recursive activation and dynamic instantiation are not natively admitted. Communicating Hierarchical Machines (CHMs) are a variant of Statecharts introduced for succinctness reasons. They introduced the idea of having a collection of finite state machines (modules) with nodes and boxes. A transition entering a box represents a call to one or more instances of another module. In a Statechart there is no notion of module and instance: if multiple instances of the same module are required by the specification, each instance has to be explicitly defined. On the other hand, the introduction of modules allows the definition of Recursive State Machines (RSMs), where a module can recursively call itself [1]. Notice that, in the case of Recursive State Machines, we are no longer in the category of Finite State Machines. In [15] CHMs have been extended by introducing Dynamic Hierarchical Machines (DHMs), which allow the dynamic activation of machines: any DHM $M_1$ can send to a concurrent DHM $M_2$ a third DHM $M_3$, which starts running either in parallel with $M_1$ and $M_2$, or inside $M_2$, depending on contextual information.

Among the commercial specification environments based on Statecharts, we considered STATEMATE [14] and Stateflow. STATEMATE is the first specification environment adopting Statecharts, with the original semantics defined for the formalism and revised in [14]. STATEMATE does not allow fork and join PseudoNodes and does not consider dynamic activation of modules. Stateflow is a component of the Simulink graphical language used in Matlab. It allows hierarchical state machines to be combined with flow chart diagrams, and it is generally used to specify the discrete controller in the model of a hybrid system (the continuous dynamics are specified by the capabilities of Simulink). Although Stateflow is syntactically similar to a Statecharts notation, from the semantic viewpoint ([12], [11]) it avoids any form of nondeterminism and it imposes an explicit strict scheduling in presence of concurrency, thus being in truth a graphical notation for a sequential imperative language.

Differently from Statecharts, the proposed formalism supports the dynamic instantiation of modules. The dynamic (possibly recursive) activation of modules is obtained by the structural elements of fork and join PseudoNodes (and not by message passing as in DHMs). Our work starts from the cited languages, mainly allowing for the dynamic instantiation of machines and removing the assumption, implicit in many languages, that control flows exiting from a fork must be merged through a join operator. DSTM4Rail, in fact, allows for recursive activation of the same machine by specifying a novel semantics for fork and join operators. Moreover, the non-finiteness of RSMs and DHMs is cut off by bounding the number of simultaneous possible instances of a module. In doing this we follow the approach adopted in many works; e.g., see [16], where the proposed specification language (Requirements State Machines) is syntactically inspired by Statecharts, but the semantics is revised and adapted to cope with the needs of the specific application domain (avionic systems).

Specifically, with respect to UML 2.0, the syntactical elements are similar, with the exception of the introduction of the box concept, the asynchronous characterization of the fork and the additional notion of preemptive join. Some other concepts have been removed, since they are considered redundant and easily realizable in different ways: for example, regions inside composite states can be obtained with the parallel instantiation of machines. The semantics of DSTM4Rail, instead, is completely different as far as parallel execution is concerned. With respect to Hierarchical Machines, we assume a similar idea of hierarchy between machines, but we enrich this notion with recursive instantiation and parallel execution of machines, through the introduction of the syntactical concepts of fork and join, borrowed from UML. Hence this formalism is substantially different, in syntax and semantics, from Hierarchical Machines (which do not permit recursion), from Recursive Machines (which do not permit parallelism) and from Communicating Machines (which do not permit recursion and dynamic instantiation).

### 6 Conclusions

This paper presented DSTM4Rail, a formal language for the specification of the behavior of critical control systems extending the approach of Hierarchical State Machines. The critical nature of the systems to model and the high level of usability required by the application domain suggested: (1) a strong formalization of the language; (2) the synthesis and the extension of some of the features of existing FSM-based languages, and (3) the capability to be integrated into modern model-driven processes. The language has its main strengths in the extended semantics of fork and join, which allows for the dynamic instantiation and the preemptive termination of machines. The modeling approach has been applied to a modern railway control system in order to demonstrate its potential.

# Acknowledgments

This paper is partially supported by the research project CRYSTAL (Critical System Engineering Acceleration), funded by the ARTEMIS Joint Undertaking under grant agreement n. 332830 and by ARTEMIS member states Austria, Belgium, Czech Republic, France, Germany, Italy, Netherlands, Spain, Sweden, United Kingdom. The work of Dr. Nardone has been supported by MIUR under project SVEVIA (PON02\_00485\_3487758) of the public-private laboratory COSMIC (PON02\_00669).

# References

- Alur, R., Kannan, S., Yannakakis, M.: Communicating Hierarchical State Machines. In: Wiedermann, J., van Emde Boas, P., Nielsen, M. (eds.) ICALP 1999. LNCS, vol. 1644, pp. 169–178. Springer, Heidelberg (1999)
- 2. Ammann, P., Black, P., Majurski, W.: Using model checking to generate tests from specifications. In: Proc. 2nd IEEE Intern. Conf. on Formal Engineering Methods (ICFEM'98), pp. 46-54. IEEE Computer Society (1998)

- Barberio, G., Di Martino, B., Mazzocca, N., Velardi, L. et al.: An Interoperable Testing Environment for ERTMS/ETCS control systems. 1st Int. Work. on Dependable Embedded and Cyber-physical Systems and Systems-of-Systems (DECSoS14), September 9th, 2014, Firenze, Italy.
- 4. Bjorner, Dines: New results and trends in formal techniques and tools for the development of software for transportation systems A review. Symposium on Formal Methods for Railway Operation and Control Systems (FORMS 2003), Budapest/Hungary, LHarmattan Hongrie, May 2003, G. Tarnai and E. Schnieder (Eds.), Germany. (2003)
- CENELEC, EN 50126:2012: Railway applications Demonstration of Reliability, Availability, Maintainability and Safety (RAMS) - Part 1: Generic RAMS process.
- CENELEC, EN 50128:2011: Railway applications Communication, signalling and processing systems - Software for railway control and protection systems.
- CESAR: Cost-Efficient methods and processes for SAfety Relevant embedded systems, http://www.cesarproject.eu/
- CRYSTAL: CRitical sYSTem engineering AcceLeration, http://www.crystal-artemis.eu/
- 9. Gentile, U., Marrone, S., Mele, G., Nardone, R., and Peron, A.: Test Specification Patterns for Automatic Generation of Test Sequences. In: Lang, F. and Flammini, F. (eds.), FMICS 2014, LNCS, vol. 8718, pp. 170–184. Springer, Heidelberg (2014).
- Glinz, M.: Statecharts For Requirements Specification As Simple As Possible, As Rich As Needed. In: International Workshop on Scenarios and State Machines: Models Algorithms and Tools (2002)
- 11. Hamon, G.: A denotational semantics for Stateflow. In: The Fifth ACM International Conference on Embedded Software, pp.164-172. ACM Press (2005)
- Hamon, G., Rushby, J.: An operational semantics for Stateflow. In: Fundamental Approaches to Software Engineering, 7th International Conference. LNCS 2984, pp. 229-243. Springer (2004)
- 13. Harel, D.: Statecharts: A Visual Formalism for Complex Systems. Science of Computer Programming 8, 231–274 (1987)
- 14. Harel, D., Naamad, A.: The STATEMATE Semantics of Statecharts. ACM Trans. on Soft. Engineering and Methodology 5(4), pp.293-333 (1996)
- Lanotte, R., Maggiolo-Schettini, A., Peron, A., Tini, S.: Dynamical Hierachical Machines. Fundamenta Informaticae, 54, pp.237-252 (2003)
- Leveson, N.G., Heimdahl, M.P.E., Hildreth, H., Reese, J.D.: Requirements Specification for Process-Control Systems. IEEE Trans. on Soft. Eng. 20(9), pp. 684–707 (1994)
- 17. MBAT: Combined Model-based Analysis and Testing of Embedded Systems, http://www.mbat-artemis.eu/
- Mohalik, S., Gadkari, A.A., Yeolekar, A., Shashidhar, K.C., Ramesh, S.: Automatic test case generation from Simulink/Stateflow models using model checking. Software Testing Verification and Reliability, 24(2) pp.155-180 (2014)
- 19. OMG. Unified Modeling Language (UML), v2.4.1, Superstructure Specification.
- Pflügl, H., El-Salloum, C., Kundner, I.: CRYSTAL, CRitical sYSTem engineering AcceLeration, a Truly European Dimension. ARTEMIS Magazine 14, 12–15 (2013)
- 21. Steinberg, D., Budinsky, F., Paternostro, M., Merks, E.: EMF: Eclipse Modeling Framework. Addison-Wesley Professional, 2009.

# Modelling and Analysing the European Rail Traffic Management System in Real-Time Maude

Andrew Lawrence\*, Ulrich Berger, Phillip James, Markus Roggenbach, and Monika Seisenberger

Swansea University, UK
ajlawrence@acm.org
{u.berger,p.d.james,m.roggenbach,m.seisenberger}@swansea.ac.uk

**Abstract.** We report on experimental work towards modelling the ERTMS in Real-Time Maude and performing qualitative analysis for safety and quantitative analysis for system performance.

### 1 Introduction

The European Rail Traffic Management System (ERTMS) is a next-generation train control system, loosely specified in [6]. Traditionally, the railway has been constructed from discrete entities such as signals, track circuits for train detection, and interlockings that use propositional equations to guarantee train safety. In contrast, the ERTMS deals with continuous data, which allows for finer-grained control over railway traffic. In ERTMS, a radio block centre (RBC) grants each train a block of track, called a movement authority, in which the train is allowed to move (see Figure 1). ERTMS requires trains to have on-board equipment that ensures that they brake in time. While the correctness of traditional interlocking systems is relatively well understood [1], verifying ERTMS-based systems for safety properties such as collision freedom is the subject of ongoing research, due to the involvement of continuous data. Furthermore, it remains an open question how to substantiate the claim that the ERTMS approach offers higher railway performance than traditional railway control.

### 2 Methodology

We first model the ERTMS as a system of hybrid automata [3] and establish that these automata are non-zeno, i.e., every finite initialised trajectory in the labelled transition system defined by the automaton is a prefix of some (time-)divergent initialised trajectory of the automaton. This property has been proven manually for the automaton representing a train. The automata for the other components, namely the interlocking and the radio block centre, are non-zeno because they do not include timing conditions.

\* Supported by Siemens Rail Automation and EPSRC




Fig. 1: Overview on ERTMS, level 2

Real-Time Maude is an extension of Full Maude which supports the specification of real-time and hybrid systems. Offered techniques include simulation and time-bounded LTL model checking for verification of timed properties. We use the Real-Time Maude system [5] for the specification of ERTMS, and the Maude linear temporal logic (LTL) model checker [2] for verification. In our model, speed, acceleration, braking behaviour and track length are integral parts. This allows us to specify and prove safety properties, such as collision freedom, over the system and to study how the system performs over time via execution of the specification as a simulation.



Fig. 2: Case Studies

### 3 Case Studies

We first formalise a simple example railway in the shape of a pentagon (see Fig. 2 (a)) with two trains, one slower than the other, an interlocking, an RBC and five track segments $\{l_0, \ldots, l_4\}$ as four hybrid automata to capture the behaviour of the system. This example demonstrates how the ERTMS deals with a length of track containing multiple trains that are controlled by an RBC and an interlocking. We then model this example in the Real-Time Maude system as an object-oriented specification capturing the message passing between the different components of the system. This specification is executed to simulate the behaviour of the modelled railway. The faster train can be observed catching up with the slower train, then braking and waiting for authorisation before moving off again. We verify that the movement authorities of the two trains do not overlap, expressed by safety properties of the form "it is globally true that if both trains are behind their movement authorities then either train1's movement authority is behind train2's or vice versa".

As a second case study we model an open system, a simple junction (see Fig. 2 (b)), which contains a single point and two routes, each consisting of five track segments. Trains are inserted onto the railway line according to a schedule by a controller object. Each train is also assigned a route which it must follow. This example demonstrates the behaviour of ERTMS with respect to two further important constructs in the railway domain, namely routes and points. It enables us to analyse the throughput of the ERTMS over a junction where one train must wait for the point to become available. The Maude LTL model checker is then applied to verify that a point does not move in a given movement authority, a safety property that is essential for the prevention of derailment.

### 4 Formalisation

In the following we show some aspects of our formalisation to indicate how we modelled the interplay between trains, the interlocking and the radio block centre in Maude. We presume some familiarity with Maude, as an introduction to its language constructs is beyond the scope of this paper.

In particular, we show the structure of the train class and formulate a simple safety condition to be model checked. Classes for the radio block centre and the interlocking are defined similarly.

Real-Time Maude allows us to compose objects and messages to form a Configuration, which is a subsort of System.

```
mod CONFIGURATION is
   *** basic object system sorts
   sorts Object Msg Configuration .

   *** construction of configurations
   subsort Object Msg < Configuration .
   op none : -> Configuration [ctor] .
   op __ : Configuration Configuration -> Configuration
        [ctor config assoc comm id: none] .
```


The composition operation for configurations is associative and commutative, so the order of objects and messages in a configuration does not matter. Inspired by [4], we use an operator $\delta$ which describes how a distributed system evolves over time. In particular, it ensures that messages (which cannot be part of an object configuration) between the various components are read.

```
op delta : Configuration -> Configuration [frozen] .
var OCREST : ObjectConfiguration .
vars CON1 CON2 : NEConfiguration .
rl [timetrans] : {OCREST} => {delta(OCREST)} in time 1 .
rl [delta1] : delta(CON1 CON2) => delta(CON1) delta(CON2) .
```
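As a rough illustration of the two ideas above, the following Python sketch treats a configuration as a multiset and lets a `delta`-like function distribute over its elements. It is only an analogy under stated assumptions: `compose`, `delta`, `advance` and the `(name, clock)` elements are hypothetical, and the sketch models neither message consumption nor the Real-Time Maude tick rule.

```python
from collections import Counter

def compose(*parts):
    """Multiset union: associative, commutative, with Counter() as identity."""
    total = Counter()
    for part in parts:
        total.update(part)
    return total

def delta(config, step):
    """Apply `step` to every element independently (hypothetical analogue of rule delta1)."""
    return compose(*(Counter({step(elem): count}) for elem, count in config.items()))

def advance(elem):
    """Hypothetical time step: elements are (name, clock) pairs, the clock grows by one."""
    name, clock = elem
    return (name, clock + 1)

left = Counter({("train1", 0): 1})
right = Counter({("train2", 5): 1})

# The order of composition is irrelevant, and delta distributes over composition.
assert compose(left, right) == compose(right, left)
assert delta(compose(left, right), advance) == compose(delta(left, advance), delta(right, advance))
```

Using `Counter` mirrors the `assoc comm id: none` attributes: two configurations are equal whenever they contain the same elements with the same multiplicities.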

We model the train class to have a state that can range over one of four possibilities: constant speed, accelerating, braking, and stopped. We further include current position, speed, acceleration, movement authority, current track segment, and maximal speed, and specify its behaviour with appropriate rules (omitted).
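The attributes listed above can be pictured with the following Python sketch. Since the actual rewrite rules are omitted here, the `tick` function is a purely hypothetical discretisation (one time unit per call, integer values); the names `Mode`, `Train` and `tick`, and the encoding of the movement authority as an end position, are assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from enum import Enum

class Mode(Enum):
    CONSTANT = "constant speed"
    ACCELERATING = "accelerating"
    BRAKING = "braking"
    STOPPED = "stopped"

@dataclass(frozen=True)
class Train:
    mode: Mode
    position: int
    speed: int
    acceleration: int
    movement_authority: int  # end of the granted block (assumed encoding)
    segment: int             # current track segment (not updated by this sketch)
    max_speed: int

def tick(train: Train) -> Train:
    """Advance the train by one time unit (hypothetical dynamics)."""
    # Speed changes by the acceleration, clamped to the interval [0, max_speed].
    speed = min(max(train.speed + train.acceleration, 0), train.max_speed)
    if speed == 0:
        mode = Mode.STOPPED
    elif speed > train.speed:
        mode = Mode.ACCELERATING
    elif speed < train.speed:
        mode = Mode.BRAKING
    else:
        mode = Mode.CONSTANT
    return replace(train, mode=mode, speed=speed, position=train.position + speed)
```

Deriving the mode from the speed change keeps the four states mutually exclusive; a faithful model would instead switch modes through explicit rules reacting to movement authorities.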

To prove safety, we apply the Maude LTL model checker, which performs so-called on-the-fly model checking with respect to an appropriate satisfaction relation. The safety properties we have checked are typically of the form: if train1 is behind its movement authority and train2 is behind its movement authority, then there is no overlap of movement authorities.
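To make the shape of this property concrete, the sketch below checks it over a finite, explicitly given trace. It is not the on-the-fly LTL model checking performed by Maude; it assumes a linear track on which each train occupies the interval from its position to the end of its movement authority, and the names `TrainState`, `behind_ma`, `no_overlap` and `globally_safe` are hypothetical.

```python
from typing import NamedTuple, Sequence, Tuple

class TrainState(NamedTuple):
    position: int  # front of the train on a linear track (assumption)
    ma_end: int    # end of the train's movement authority

def behind_ma(train: TrainState) -> bool:
    """The train has not passed the end of its movement authority."""
    return train.position <= train.ma_end

def no_overlap(t1: TrainState, t2: TrainState) -> bool:
    """One authority ends strictly before the other train's position."""
    return t1.ma_end < t2.position or t2.ma_end < t1.position

def globally_safe(trace: Sequence[Tuple[TrainState, TrainState]]) -> bool:
    """Check 'always ((behind(t1) and behind(t2)) implies no_overlap)' on a finite trace."""
    return all(
        not (behind_ma(t1) and behind_ma(t2)) or no_overlap(t1, t2)
        for t1, t2 in trace
    )
```

For example, `globally_safe([(TrainState(0, 4), TrainState(5, 9))])` holds, whereas a state in which the first authority reaches position 6 while the second train is at position 5 violates the property.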

### 5 Results and Future Work

Though the presented case studies are simple, they demonstrate that our modelling approach works: safety properties can be formulated and verified in reasonable time via model checking, and performance can be studied via simulation.



Fig. 3: Train speed / distance over time – Pentagon example

Fig. 3 shows such a simulation for the Pentagon example with two trains, where a slow train (train1) is followed by a fast train (train2). As one can see from the dashed curves, the slow train forces the fast train to make regular speed adaptations, and in this sense dominates the system. At the end of each track, the slow train requests a new movement authority. Until this is granted, the train has to slow down. The fast train, however, has to slow down more often as it is hindered in its progress by the slow train ahead. The ERTMS was introduced to improve exactly this kind of behaviour by opening up the possibility of adding a management layer that provides a strategy for deciding what kind of movement authority shall be provided to a train. In our example, for instance, the speed profile for the fast train could always be limited to a speed of 4.

It is future work to study larger, realistic examples in order to see if the approach scales. This might involve the development of abstractions, and extensions of our modelling to allow us to establish performance properties as theorems.

Part of this work was presented in a talk at WADT 2014 (http://wadt2014.cs.ovgu.de) and appeared (as an abstract) in the electronic pre-proceedings of this workshop (available on the WADT webpage).

**Acknowledgement.** The authors would like to thank Simon Chadwick from Siemens Rail Automation, Chippenham, UK, for his support and encouraging feedback. A special thanks also goes to Erwin R. Catesbeiana (Jr.) for his appropriately timed feedback on the paper.

### References

- 1. Chadwick, S., James, P., Kanso, K., Lawrence, A., Moller, F., Roggenbach, M., Seisenberger, M. and Setzer, A.: Verification of solid state interlocking programs. In SEFM'13, LNCS 8368, Springer 2014.
- 2. Eker, S., Meseguer, J., and Sridharanarayanan, A.: The Maude LTL Model Checker and its Implementation. In SPIN'03, LNCS 2648, Springer 2003.
- 3. Henzinger, T. A.: The Theory of Hybrid Automata. In Verification of Digital and Hybrid Systems, NATO ASI Series F: Computer and Systems Sciences 170, pages 265-292, Springer, 2000.
- 4. Ölveczky, P. C., Thorvaldsen, S.: Formal Modelling and Analysis of the OGDC Wireless Sensor Network Algorithm in Real-Time Maude. In FMOODS'07, LNCS 4468, pages 122-140. Springer 2007.
- 5. Ölveczky, P. C. and Meseguer, J.: Specification and Analysis of Real-Time Systems Using Real-Time Maude. In FASE'04, LNCS 2984, Springer, 2004.
- 6. International Union of Railways: ETCS System Requirements Specification (SRS) ver. 2.3.0, 2006.

# Expression-based aliasing for OO-languages

Georgiana Caltais<sup>1</sup>

Department of Computer Science, ETH Zürich, Switzerland

**Abstract.** Alias analysis has been an interesting research topic in verification and optimization of programs. The undecidability of determining whether two expressions in a program may reference the same object is the main source of the challenges raised in alias analysis. In this paper we propose an extension of a previously introduced alias calculus based on program expressions, to the setting of unbounded program executions such as infinite loops and recursive calls. Moreover, we devise a corresponding executable specification in the K-framework. An important property of our extension is that, in a non-concurrent setting, the corresponding alias expressions can be over-approximated in terms of a notion of regular expressions. This further enables us to show that the associated K-machinery implements an algorithm that always terminates and provides a sound over-approximation of the "may aliasing" information, where soundness stands for the lack of false negatives. As a case study, we analyze the integration and further applications of the alias calculus in SCOOP. The latter is an object-oriented programming model for concurrency, recently formalized in Maude; K definitions can be compiled into Maude for execution.

### 1 Introduction

A research direction of interest in Computer Science is the application of alias analysis in verification and optimization of programs. One of the challenges along this line of research has been the undecidability of determining whether two expressions in a program may reference the same object. A rich suite of approaches aiming at providing a satisfactory balance between scalability and precision has already been developed in this regard. Examples include: (i) intra-procedural frameworks [17, 16] that handle isolated functions only, and their inter-procedural counterparts [16, 23, 12] that consider the interactions between function calls; (ii) type-based techniques [9]; (iii) flow-based techniques [4, 7] that establish aliases depending on the control-flow information of a procedure; (iv) context-(in)sensitive approaches [10, 30] that depend on whether the calling context of a function is taken into account or not; (v) field-(in)sensitive approaches [21, 1] that depend on whether the individual fields of objects in a program are traced or not. More details on such classifications can be found in [26], for instance. For a comprehensive survey on alias analyses for object-oriented programs, corresponding issues and remaining open problems, we refer the interested reader to the works in [29, 11].

Of particular interest for the work in this paper is the untyped, flow-sensitive, field-sensitive, inter-procedural and context-sensitive calculus for may aliasing, introduced in [15]. The aforementioned calculus covers most of the aspects of a modern object-oriented language, namely: object creation and deletion, conditionals, assignments, loops and (possibly recursive) function calls. The approach in [15] abstracts the aliasing information in terms of explicit access paths [18] referred to as alias expressions. Consider, for example, the code

$$\begin{array}{l} x := y; \\ \mathbf{loop}\ x := x.next\ \mathbf{end} \end{array} \tag{1}$$

The corresponding execution causes x to become aliased to y.next.next..., with a possibly infinite number of occurrences of the field next. The set of associated alias expressions can be equivalently written as:

$$\{[x, y.next^k] \mid k \ge 0\}. \tag{2}$$

The sources of imprecision introduced by the calculus in [15] are limited to ignoring tests in conditionals, and to "cutting at length L" in the case of possibly infinite alias relations as in (2). Intuitively, the cutting technique considers sequences longer than a given length L as aliased to all expressions.

There is a huge literature on heap analysis for aliasing [11], but hardly any paper that presents a calculus as in [15] allowing the derivation of alias relations as the result of applying various instructions of a programming language.

Our focus is twofold. First, we want to extend the framework in [15] to the setting of unbounded program executions such as infinite loops and recursive calls. Accordingly, the goal is to provide a way to shift from "finite" to "infinite behaviours". This can be achieved in a rather straightforward manner, by redefining the construct $\mathbf{loop}\ p\ \mathbf{end}$ in [15] according to the informal semantics: "execute p repeatedly any number of times, including zero". However, developing a corresponding mechanism for reasoning on "may aliasing" in a finite number of steps is not trivial. The key observation that paves the way to a possible (finite state-based) modeling in a non-concurrent setting is that the alias expressions corresponding to loops and recursive calls grow in a regular fashion. Hence, they are finitely representable, as is easy to see in (2), for instance. Such regularities cannot be exploited in concurrent contexts, due to the "non-determinism" of process interaction.

A similar technique exploiting regular behaviour of (non-concurrent) programs, in order to reason on "may aliasing", was previously introduced in [2]. In short, the results in [2] utilize abstract representations of programs in terms of finite pushdown systems, for which infinite execution paths have a regular structure (or are "lasso shaped") [3]. Then, in the style of abstract interpretation [8], the collecting semantics is applied over the (finite state) pushdown systems to obtain the alias analysis itself. The main difference with the results in [2] consists in how the abstract memory addresses corresponding to pointer variables are represented. In [2] these range over a finite set of natural numbers. In this paper we consider alias expressions built according to the calculus in [15].

The work in [2] also proposes an implementation of pushdown systems in the K-framework [27]. The latter is an executable semantic framework based on Rewriting Logic (RL) [19], and has successfully been used for defining programming languages and corresponding formal analysis tools. Moreover, K definitions have a direct implementation in K-Maude [28].

We agree that it could be worth presenting our analysis as an abstract interpretation (AI) [8]. A modelling exploiting the machinery of AI (based on abstract domains, abstraction and concretization functions, Galois connections, fixed-points, etc.) is an interesting, but different research topic per se.

Our second interest w.r.t. may aliasing is its integration in SCOOP [22], a simple object-oriented programming model for concurrency; an operational approach to handling the alias calculus is therefore more appropriate. The basis of an RL-based framework for the design and analysis of the SCOOP model was recently set in [22]. The reference implementation of SCOOP is Eiffel [20]. The integration of alias analysis belongs to a more ambitious goal, namely, the construction of an RL-based toolbox for the analysis of SCOOP programs (examples include a deadlock detector and a type checker).

Our contribution. By drawing inspiration from, and building on top of the results in [15, 2], in this paper we propose:

- an extension of the (finite) alias calculus in [15] to the setting of unbounded program executions, and a sound over-approximation technique based on "regular alias expressions", for non-concurrent settings;
- an RL-based specification of the extended calculus;
- an algorithm that always terminates and provides a sound over-approximation of "may aliasing" by exploiting a notion of regular (finitely representable) aliases, for non-concurrent settings.

Moreover, we analyze the integration, implementation and further applications of the alias calculus in SCOOP.

We refer the interested reader to [5] for the extended version of the current paper including: the full specification of the RL-based machinery, two examples emphasizing the naturalness of applying the executable aliasing framework and a case study exploiting the corresponding implementation in SCOOP, respectively, together with the detailed proofs of the formal results.

Paper structure. The paper is organized as follows. In Section 2 we introduce the extension of the alias calculus in [15] to unbounded executions. In Section 3 we provide the RL-based executable specification of the calculus in the  $\mathbb K$  semantic framework. The implementation in SCOOP, and further applications are discussed in Section 4. In Section 5 we draw the conclusions and provide pointers to future work.

### 2 The alias calculus

In this section we define an extension of the calculus in [15] to unbounded program executions. Moreover, based on the idea behind the *pumping lemma for regular languages* [25], we devise a corresponding sound over-approximation of "may aliasing" in terms of regular expressions, applicable in sequential contexts. This paves the way to developing an algorithm for the aliasing problem, as presented in Section 3, in the formal setting of the $\mathbb{K}$ semantic framework [27].

**Preliminaries.** We proceed by briefly recalling the notion of *alias relation* and a series of associated notations and basic operations, as introduced in [15].

We call an *expression* a (possibly infinite) path of shape x.y.z..., where x is a local variable, class attribute or Current, and y, z, ... are attributes. Here, Current, also known as this or self, stands for the current object. For an arbitrary alias expression e, it holds that e.Current = Current.e = e. Let E represent the set of all expressions of a program. An *alias relation* is a symmetric and irreflexive binary relation on E, i.e., a subset of $E \times E$.

Given an alias relation r and an expression e, we define

$$r/e = \{e\} \cup \{x : E \mid [x, e] \in r\}$$

denoting the set consisting of all elements in r which are aliased to e, plus e itself.

Let x be an expression; we write r - x to represent r without the pairs with one element of shape x.e.
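The two operations just introduced can be sketched in Python as follows. The encoding is an assumption made for illustration: an expression is a tuple of names such as `("x", "next")`, an alias pair is a two-element `frozenset` (which makes it symmetric by construction), and `pair`, `aliases_of` and `minus` are hypothetical helper names. Since e.Current = e, the sketch reads "shape x.e" as "having x as a prefix", so x itself is covered as well.

```python
def pair(a, b):
    """An unordered alias pair [a, b]; a and b are assumed to be distinct."""
    return frozenset((a, b))

def aliases_of(r, e):
    """r/e: the expression e together with every expression aliased to e in r."""
    return {e} | {x for p in r if e in p for x in p if x != e}

def minus(r, x):
    """r - x: r without the pairs in which some element has x as a prefix."""
    def has_prefix(expr):
        return expr[:len(x)] == x
    return {p for p in r if not any(has_prefix(e) for e in p)}

X, Y, Z, W = ("x",), ("y",), ("z",), ("w",)
r = {pair(X, Y), pair(("x", "next"), Z), pair(Y, W)}

assert aliases_of(r, Y) == {Y, X, W}
assert minus(r, X) == {pair(Y, W)}
```

Note that `aliases_of` does not follow chains of pairs, which matches the fact that alias relations are not required to be transitive.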

We say that an alias relation is dot complete whenever for any t, u, v and a it holds that if [t, u] and [t.a, v] are alias pairs, then [u.a, v] is an alias pair and, moreover, if a is in the domain of t, then [t.a, u.a] is an alias pair. By the "domain of t" we refer to a method or a field in the class corresponding to the object referred to by the expression associated with t. For instance, given a class NODE with a field next of type NODE, and a NODE object x, we say that next is in the domain of t = x.next.next. For the sake of brevity, we write dot-complete(r) for the closure under dot-completeness of a relation r.

The notation r[x = u] represents the relation r augmented with pairs [x, y] and made dot complete, where y is an element of u.

#### 2.1 Extension to unbounded executions

We further introduce an extension of the alias calculus in [15] to infinite alias relations corresponding to unbounded executions such as infinite loops or recursive calls. The main difference in our approach is reflected by the definition of loops, which now complies with the usual fixed-point denotational semantics.

The alias calculus is defined by a set of axioms "describing" how the execution of a program affects the aliasing between expressions. As in [15], the calculus ignores tests in conditionals and loops. The *program instructions* are defined as follows:

$$p ::= p\,;\,p \mid \mathbf{then}\ p\ \mathbf{else}\ p\ \mathbf{end} \mid \mathbf{create}\ x \mid \mathbf{forget}\ x \mid t := s \mid \mathbf{loop}\ p\ \mathbf{end} \mid \mathbf{call}\ f(l) \mid x.\mathbf{call}\ f(l). \tag{3}$$

In short, we write $r \gg p$ to represent the alias information obtained by executing p when starting with the initial alias relation r.

The axiom for sequential composition is defined in the obvious way:

$$r \gg (p;q) = (r \gg p) \gg q. \tag{4}$$

Conditionals are handled by considering the union of the alias pairs resulting from the execution of the instructions corresponding to each of the two branches, when starting with the same initial relation:

$$r \gg (\mathbf{then}\ p\ \mathbf{else}\ q\ \mathbf{end}) = (r \gg p) \cup (r \gg q). \tag{5}$$

As previously mentioned, we define $r \gg \mathbf{loop}\ p\ \mathbf{end}$ according to its informal semantics: "execute p repeatedly any number of times, including zero". The corresponding rule is:

$$r \gg (\mathbf{loop}\ p\ \mathbf{end}) = \bigcup_{n \in \mathbb{N}} (r \gg p^n) \tag{6}$$

where  $\cup$  stands for the union of alias relations, as above. This way, our calculus is extended to infinite alias relations. This is the main difference with the approach in [15] that proposes a "cutting" technique restricting the model to a maximum length L. In [15], sequences longer than L are considered as aliased to all expressions. Orthogonally, for sequential settings, we provide finite representations of infinite alias relations based on over-approximating regular expressions, as we shall see in Section 2.2.

Both the creation and the deletion of an object x eliminate from the current alias relation all the pairs having one element prefixed by x:

$$\begin{array}{l} r \gg (\mathbf{create}\ x) = r - x \\ r \gg (\mathbf{forget}\ x) = r - x. \end{array} \tag{7}$$

The (qualified) function calls comply with their initial definitions in [15]:

$$\begin{array}{l} r \gg (\mathbf{call}\ f(l)) = (r[f^{\bullet}:l]) \gg |f| \\ r \gg (x.\mathbf{call}\ f(l)) = x.((x'.r) \gg \mathbf{call}\ f(x'.l)). \end{array} \tag{8}$$

Here  $f^{\bullet}$  and |f| stand for the formal argument list and the body of f, respectively, whereas r[u:v] is the relation r in which every element of the list v is replaced by its counterpart in u. Intuitively, the negative variable x' is meant to transpose the context of the qualified call to the context of the caller. Note that "." (i.e., the constructor for alias expressions) is generalized to distribute over lists and relations:  $x.[a,b,\ldots] = [x.a,x.b,\ldots]$ .

For an example, consider a class C in an OO-language, and an associated procedure f that assigns a local variable y, defined as:  $f(x) \{ y := x \}$ . Then, for instance, the aliasing for a.call f(a) computes as follows:

$$\begin{array}{l} \emptyset \gg a.\mathbf{call}\ f(a) = \\ a.(a'.\emptyset \gg y := a'.a) = \\ a.(\emptyset \gg y := \textit{Current}) = \\ \textit{dot-complete}(\{[a.y, a]\}). \end{array}$$

Recursive function calls can lead to infinite alias relations. In sequential settings, as for the case of loops, the mechanism exploiting sound regular over-approximations in order to derive finite representations of such relations is presented in the subsequent sections.

The axiom for assignment is likewise in accordance with its original counterpart in [15]:

$$r \gg (t := s) = \mathbf{given}\ r_1 = r[ot = t]\ \mathbf{then}\ (r_1 - t)[t = (r_1/s - t)] - ot\ \mathbf{end} \tag{9}$$

where ot is a fresh variable (that stands for "old t"). Intuitively, the aliasing information w.r.t. the initial value of t is "saved" by associating t and ot in r and closing the new relation under dot-completeness, in $r_1$. Then, the initial t is "forgotten" by computing $r_1 - t$ and the new aliasing information is added in a consistent way. Namely, we add all pairs [t, s'], where s' ranges over $r_1/s - t$, representing all expressions already aliased with s in $r_1$, including s itself, but without t. Recall that alias relations are not reflexive; thus, by eliminating t we make sure we do not include pairs of shape [t, t]. Then, we consider again the closure under dot-completeness and forget the aliasing information w.r.t. the initial value of t, by removing ot.

Remark 1. It is worth discussing the reason behind not considering transitive alias relations. Assume the following program:

$$\mathbf{then}\ x := y\ \mathbf{else}\ y := z\ \mathbf{end}$$

Based on the equations (5) and (9) handling conditionals and assignments, respectively, the calculus correctly identifies the alias set:  $\{[x,y],[y,z]\}$ . Including [x,z] would be semantically equivalent to the execution of the two branches in the conditional at the same time, which is not what we want.
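The remark can be replayed with a small interpreter for a restricted fragment of the calculus. The following Python sketch is a simplification under loudly stated assumptions, not the calculus itself: expressions are plain variable names without fields, so dot-completion is vacuous; the `ot` device of (9) is replaced by treating `t := t` as a no-op; calls are not covered; and the tuple-based program encoding and the names `pair`, `aliases_of`, `minus` and `run` are hypothetical.

```python
def pair(a, b):
    """An unordered alias pair [a, b] over plain variable names."""
    return frozenset((a, b))

def aliases_of(r, e):
    """r/e restricted to plain variables."""
    return {e} | {x for p in r if e in p for x in p if x != e}

def minus(r, x):
    """r - x restricted to plain variables."""
    return {p for p in r if x not in p}

def run(r, prog):
    """Compute the alias relation obtained by executing prog from r (fragment of (3))."""
    kind = prog[0]
    if kind == "seq":                      # axiom (4)
        return run(run(r, prog[1]), prog[2])
    if kind == "cond":                     # axiom (5): union of both branches
        return run(r, prog[1]) | run(r, prog[2])
    if kind in ("create", "forget"):       # axiom (7)
        return minus(r, prog[1])
    if kind == "assign":                   # axiom (9), simplified (no fields, no ot)
        t, s = prog[1], prog[2]
        if t == s:
            return set(r)
        sources = aliases_of(r, s) - {t}
        return minus(r, t) | {pair(t, e) for e in sources}
    if kind == "loop":                     # axiom (6): union over all unfoldings
        result, current, seen = set(r), set(r), set()
        # Over finitely many variables the unfoldings eventually repeat.
        while frozenset(current) not in seen:
            seen.add(frozenset(current))
            result |= current
            current = run(current, prog[1])
        return result
    raise ValueError(f"unsupported instruction: {kind}")

remark = ("cond", ("assign", "x", "y"), ("assign", "y", "z"))
result = run(set(), remark)
assert result == {pair("x", "y"), pair("y", "z")}
assert pair("x", "z") not in result
```

Because `aliases_of` never follows chains of pairs, [x, z] is not derived, which is exactly the non-transitivity discussed in the remark.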

#### 2.2 A sound over-approximation

In a sequential setting, the challenge of computing the alias information in the context of (infinite) loops and recursive calls reduces to evaluating their corresponding "unfoldings", captured by expressions of shape

$$r \gg p^{\omega},$$

with  $\omega$  ranging over naturals plus infinity, r an (initial) alias relation  $(r = \emptyset)$ , and p a basic control block defined by:

$$p ::= p\,;\,p \mid \mathbf{then}\ p\ \mathbf{else}\ p\ \mathbf{end} \mid \mathbf{create}\ x \mid \mathbf{forget}\ x \mid t := s. \tag{10}$$

The value  $r \gg p^{\omega}$  refers to the alias relation obtained by recursively executing the control block p, and it is calculated in the expected way:

$$\begin{array}{l} r \gg p^0 = r \\ r \gg p^{k+1} = (r \gg p^k) \gg p. \end{array}$$

Consider again the code in (1):

$$\begin{array}{l} x := y; \\ \mathbf{loop}\ x := x.next\ \mathbf{end}. \end{array}$$

Its execution generates the alias relation

$$((\emptyset \gg (x := y)) \gg (x := x.next)) \gg (x := x.next) \ldots$$

including an infinite number of pairs of shape:

$$[x, y.next],\ [x, y.next.next],\ [x, y.next.next.next], \ldots \tag{11}$$

A similar reasoning does not hold for concurrent applications, where process interaction is not "regular".

In what follows we provide a way to compute finite representations of infinite alias relations in sequential settings. The key observation is that alias expressions corresponding to unbounded program executions grow in a regular fashion. See, for instance, the aliases in (11), which are pairs of type  $[x, y.next^{k\geq 1}]$ .

Regular expressions are defined similarly to the regular languages over an alphabet. We say that an expression is regular if it is a local variable, class attribute or Current. Moreover, the concatenation  $e_1 \cdot e_2$  of two regular expressions  $e_1$  and  $e_2$  is also regular. Given a regular alias expression e, the expression  $e^*$  is also regular; here  $(-)^*$  denotes the Kleene star [14]. We call an alias relation regular if it consists of pairs of regular expressions.

**Lemma 1.** Assume p a program built according to the rules in (3). Then, in a sequential setting, the relation  $\emptyset \gg p$  is regular.

*Proof.* The result follows by induction on the structure of p.

Inspired by the idea behind the pumping lemma for regular languages [25], we define a lasso property for alias relations, which identifies the repetitive patterns within the structure of the corresponding alias expressions. The intuition is that such patterns will occur for an infinite number of times due to the execution of loops or recursive function calls. Then, we supply sound over-approximations of "lasso" relations, based on regular alias expressions.

In the context of alias relations, we say that the lasso property is satisfied by r and r' whenever the following two conditions hold: (1) r behaves like a lasso base of r'. Namely, all the pairs  $[e_1, e_2] \in r$  are used to generate elements  $[e'_1, e'_2] \in r'$ , by repeating tails of prefixes of  $e_1$  and  $e_2$ , respectively, and (2) r' is a lasso extension of r. Namely, all the pairs in r' are generated from elements of r by repeating tails of their prefixes. For example, if  $e_1$  above is an expression of shape x.y.z.w, then  $e'_1$  can be x.y.y.z.w if we consider the tail y of the prefix x.y, or x.y.z.y.z.w if we take the tail y.z of the prefix x.y.z.

Formally, consider r and r' two alias relations, and  $x_i, y_i$  and  $z_i$  a set of (possibly empty) expressions, for  $i \in \{1, 2\}$ . Then:

$$lasso(r, r') = ([x_1y_1z_1, x_2y_2z_2] \in r \text{ iff } [x_1y_1y_1z_1, x_2y_2y_2z_2] \in r').$$
 (12)

For simplicity of notation we sometimes omit the dot-separators between expressions. For instance, we write x y z in lieu of x.y.z.

Assuming a lasso over r and r', we compute a relation consisting of regular expressions over-approximating r and r' as:

$$reg(r, r') = \{ [x_1 y_1^* z_1, x_2 y_2^* z_2] \mid [x_1 y_1 z_1, x_2 y_2 z_2] \in r \land [x_1 y_1 y_1 z_1, x_2 y_2 y_2 z_2] \in r' \}$$
(13)

where  $x_i, y_i$  and  $z_i$  are possibly empty expressions, for  $i \in \{1, 2\}$ . As previously indicated, the over-approximation is sound w.r.t. the repeated application of a basic control block as in (10), in the sense that it does not introduce any false negatives:

**Lemma 2.** Consider r and r' two alias relations, and p a basic control block in a sequential setting. If  $r \gg p = r'$  and lasso(r, r') = true, then the following holds for all n > 1:

$$r \gg p^n \in \operatorname{reg}(r, r').$$

*Proof.* The reasoning is by induction on n. The base case follows immediately, whereas the induction step is proved by "reductio ad absurdum".
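
To make the shape of (13) concrete, the following Python sketch tests whether a concrete alias pair is covered by a regular pair  $[x_1 y_1^* z_1, x_2 y_2^* z_2]$ . Expressions are encoded as tuples of names; the helper names `in_star` and `reg_pair` are hypothetical and chosen only for this illustration.

```python
def in_star(x, y, z, expr):
    """True if expr == x + y*k + z for some k >= 0 (all are tuples of names)."""
    if not y:
        return expr == x + z
    extra = len(expr) - len(x) - len(z)
    if extra < 0 or extra % len(y) != 0:
        return False
    k = extra // len(y)
    return expr == x + y * k + z


def reg_pair(d1, d2):
    """Build a membership test for the regular pair [x1 y1* z1, x2 y2* z2],
    given the decompositions d1 = (x1, y1, z1) and d2 = (x2, y2, z2)."""
    def member(pair):
        e1, e2 = pair
        return in_star(*d1, e1) and in_star(*d2, e2)
    return member


# r contains [x, y.next] and r' contains [x, y.next.next]:
# x1 = x with y1, z1 empty; x2 = y, y2 = next, z2 empty.
covers = reg_pair((("x",), (), ()), (("y",), ("next",), ()))

for k in range(1, 5):
    print(k, covers((("x",), ("y",) + ("next",) * k)))
```

Under this encoding the regular pair also accepts [x, y] (zero repetitions), which is consistent with reg(r, r') being an over-approximation rather than an exact description.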

## 3 A K-machinery for collecting aliases

In this section we provide the specification of an RL-based mechanism collecting the alias information in the K semantic framework [27]. We choose K mainly as a notational convention that enables compact and modular definitions. In practice, the K-rules in this section are implemented in Maude, as rewriting theories, on top of the formalization of SCOOP [22] (we refer to Section 4 for more details on our approach).

In short, our strategy is to start with a program built on top of the control structures in (3), then to apply the corresponding  $\mathbb{K}$ -rules in order to get the "may aliasing" information in a designated  $\mathbb{K}$ -cell ( $\langle - \rangle_{al}$ ). Independently of the setting (sequential or concurrent) one can exploit this approach in order to evaluate the aliases of a given finite length L. We also show that for sequential contexts, the application of the  $\mathbb{K}$ -rules is finite and the aliases in the final configuration soundly over-approximate the (infinite) "may alias" relations of the calculus.

Brief overview of  $\mathbb{K}$ .  $\mathbb{K}$  [27] is an executable semantic framework based on Rewriting Logic [19]. It is suitable for defining (concurrent) languages and corresponding formal analysis tools, with straightforward implementation in  $\mathbb{K}$ -Maude [28].  $\mathbb{K}$ -definitions make use of the so-called *cells*, which are labelled and can be nested, and (rewriting) *rules* describing the intended (operational) semantics.

A cell is denoted by  $\langle - \rangle_{[name]}$ , where [name] stands for the name of the cell. A construction  $\langle . \rangle_n$  stands for an empty cell named n. We use "pattern matching" and write  $\langle c \ldots \rangle_n$  for a cell with content c at the top, followed by an arbitrary content  $(\ldots)$ . Orthogonally, we can utilize cells of shape  $\langle \ldots c \rangle_n$  and  $\langle \ldots c \ldots \rangle_n$ , defined in the obvious way.

Of particular interest is  $\langle - \rangle_k$  – the continuation cell, or the k-cell, holding the stack of program instructions (associated to one processor), in the context of a programming language formalization. We write

$$\langle i_1 \curvearrowright i_2 \ldots \rangle_k$$

for a set of instructions to be "executed", starting with instruction  $i_1$ , followed by  $i_2$ . The associative operation  $\curvearrowright$  is the instruction sequencing.

A K-rewrite rule

$$\langle c \ldots \rangle_{n_1} \langle c' \rangle_{n_2} \Rightarrow \langle c' \ldots \rangle_{n_1} \langle \ldots c' \rangle_{n_3}$$
 (14)

reads as: if cell  $n_1$  has c at the top and cell  $n_2$  contains value c', then c is replaced by c' in  $n_1$  and c' is added at the end of the cell  $n_3$ . The content of  $n_2$  remains unchanged. In short, (14) is written in a  $\mathbb{K}$ -like syntax as:

$$\langle \frac{c}{c'} \dots \rangle_{n_1} \langle c' \rangle_{n_2} \langle \dots \frac{\cdot}{c'} \rangle_{n_3}.$$
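
The reading of rule (14) can be mimicked with a small Python sketch in which a configuration is a dictionary from cell names to lists (index 0 is the top of a cell). The function name `apply_rule_14` and this encoding are assumptions made for illustration; they are not how the rules are implemented in Maude.

```python
def apply_rule_14(config, n1, n2, n3, c, c_new):
    """Apply rule (14) once if it matches, returning a new configuration.

    The rule matches when cell n1 has `c` at the top and cell n2 contains
    exactly `c_new`. Returns None when the rule does not match.
    """
    if config[n1][:1] != [c] or config[n2] != [c_new]:
        return None
    new = {name: list(items) for name, items in config.items()}
    new[n1][0] = c_new          # c is replaced by c' at the top of n1
    new[n3].append(c_new)       # c' is added at the end of n3
    return new                  # n2 is left unchanged


before = {"n1": ["c", "rest"], "n2": ["c'"], "n3": ["old"]}
after = apply_rule_14(before, "n1", "n2", "n3", "c", "c'")
print(after)
```

The input configuration is copied rather than mutated, so the same configuration can be matched against several candidate rules.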

We further provide the details behind the  $\mathbb{K}$ -specification of the alias calculus. As expected, the k-cell retains the instruction stack of the object-oriented program. We utilize cells  $\langle -\rangle_{\rm al}$  to enclose the current alias information, and the so-called back-tracking cells  $\langle -\rangle_{\rm bkt-\ldots}$  enabling the sound computation of aliases for the case of then - else - end and, in non-concurrent contexts, for loops and (possibly recursive) function calls. As a convention, we mark with ( $\clubsuit$ ) the rules that are sound only for non-concurrent applications, based on Lemma 2. Due to space limitations, in what follows we introduce only the  $\mathbb{K}$ -rules for handling assignments and loops. The entire specification can be found in [5].

As expected, the assignment rule simply updates the current alias relation according to its axiom in (9), and removes the assignment instruction from the top of the k-cell:

$$\left\langle \frac{r}{(r_1-t)[t=(r_1/s-t)]-ot} \right\rangle_{\text{al}} \left\langle \frac{t:=s}{\cdot} \dots \right\rangle_{\text{k}} \quad \text{with } r_1=r[ot=t]$$
 (15)

For **loop** p **end**, we utilize a meta-construction  $p\ \boxed{1}\ \textbf{loop}\ p\ \textbf{end}$  simulating the unfolding corresponding to (6), and a back-tracking stack  $\langle - \rangle_{\text{bkt-l}}$  collecting the alias information obtained after each execution of p. Moreover, the K-implementation exploits the result in Lemma 2. Whenever a "lasso" is reached, the infinite rewriting is prevented by summarizing the infinite application of p in terms of a sound over-approximating alias relation. The K-rules are as follows.

First, the aforementioned unfolding is performed, and the alias relation before p is stored in the back-tracking cell as  $\langle r \rangle_{\text{al-o}} \langle p \rangle_{\text{l}}$ :

$$\langle r \rangle_{\rm al} \langle \frac{\text{loop } p \text{ end}}{p \ \boxed{1} \text{ loop } p \text{ end}} \cdots \rangle_{\rm k} \langle \frac{.}{\langle r \rangle_{\rm al-o} \langle p \rangle_{\rm l}} \cdots \rangle_{\rm bkt-l}$$
 (16)

If the alias relation r' obtained after the successful execution of p (marked by  $\boxed{1}$  at the top of the continuation) is not a lasso of the aliasing r before p (previously stored in  $\langle -\rangle_{\text{bkt-l}}$ ) then p is constrained to a new execution by becoming the top of the k-cell, and r' is memorized for back-tracking:

$$\langle r' \rangle_{\text{al}} \langle \frac{\boxed{1} \text{ loop } p \text{ end}}{p \ \boxed{1} \text{ loop } p \text{ end}} \dots \rangle_{\text{k}} \langle \frac{\langle r \rangle_{\text{al-o}} \langle p \rangle_{\text{l}}}{\langle r' \rangle_{\text{al-o}} \langle p \rangle_{\text{l}}} \dots \rangle_{\text{bkt-l}} \text{ if not lasso}(r, r')$$
 ( $\clubsuit$ ) (17)

Last, if a lasso is reached after the execution of p, then the current aliasing is soundly replaced by a "regular" over-approximation  $\operatorname{reg}(r,r')$ , the corresponding back-tracking information is removed from  $\langle - \rangle_{\operatorname{bkt-l}}$  and the **loop** instruction is eliminated from the k-cell:

$$\langle \frac{r'}{\operatorname{reg}(r, r')} \rangle_{\text{al}} \langle \frac{\boxed{1} \text{ loop } p \text{ end}}{\cdot} \dots \rangle_{\text{k}} \langle \frac{\langle r \rangle_{\text{al-o}} \langle p \rangle_{\text{l}}}{\cdot} \dots \rangle_{\text{bkt-l}} \text{ if lasso}(r, r')$$
 ( $\clubsuit$ ) (18)

In a non-concurrent setting, the machinery orchestrating the K-rules introduced in this section implements an algorithm that always terminates and provides a sound over-approximation of "may aliasing".

**Theorem 1.** Consider p a program built on top of the control structures in (3), that executes in a sequential setting. Then, the application of the corresponding  $\mathbb{K}$ -rules when starting with p and an empty alias relation, is a finite rewriting of shape

$$\langle \emptyset \rangle_{\rm al} \langle p \rangle_{\rm k} \stackrel{(*)}{\Longrightarrow} \langle r \rangle_{\rm al} \langle . \rangle_{\rm k},$$

with r a sound over-approximation of the aliasing information corresponding to the execution of p.

*Proof.* The key observation is that, due to the execution of loops and/or recursive calls, expressions can grow infinitely in a regular fashion. Hence, a lasso is always reached. Consequently, the control structure generating the infinite behaviour is removed from the k-cell, according to the associated  $\mathbb{K}$ -specification for loops and/or recursive calls. This guarantees termination. Moreover, recall that the regular expressions replacing the current alias information are a sound over-approximation, according to Lemma 2.

Observe that the RL-based machinery can simulate precisely the "cutting at length L" technique in [15]. It suffices to disable the rules ( $\clubsuit$ ) and stop the rewriting after L steps.
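
The interplay between unfolding, lasso detection and bounded exploration can be sketched in Python as follows. This is a simplified model under several assumptions: expressions are tuples of names, the lasso check matches pairs one by one, and the loop body is supplied as a hand-written function `body` rather than derived from the alias calculus. All function names are hypothetical.

```python
def pump_split(e, e_next):
    """Return (x, y, z) with e == x+y+z and e_next == x+y+y+z, or None."""
    if e == e_next:
        return (e, (), ())            # nothing is repeated
    grow = len(e_next) - len(e)
    if grow <= 0:
        return None
    for end in range(grow, len(e) + 1):
        x, y, z = e[:end - grow], e[end - grow:end], e[end:]
        if x + y + y + z == e_next:
            return (x, y, z)
    return None


def lasso_reg(r, r_next):
    """If r and r_next form a lasso, return reg(r, r_next) as a list of
    decomposed pairs ((x1, y1, z1), (x2, y2, z2)); otherwise return None."""
    reg, covered = [], set()
    for e1, e2 in r:
        matched = False
        for f1, f2 in r_next:
            d1, d2 = pump_split(e1, f1), pump_split(e2, f2)
            if d1 is not None and d2 is not None:
                reg.append((d1, d2))
                covered.add((f1, f2))
                matched = True
        if not matched:
            return None
    return reg if covered == set(r_next) else None


def analyze_loop(step, r, max_unfoldings):
    """Unfold `loop p end` until a lasso is found or the bound is hit."""
    prev = r                          # as in (16): remember the relation before p
    for _ in range(max_unfoldings):
        cur = step(prev)              # one execution of the loop body p
        reg = lasso_reg(prev, cur)
        if reg is not None:           # lasso reached: summarize with reg(prev, cur)
            return ("regular", reg)
        prev = cur                    # as in (17): store r' and unfold again
    return ("cut", prev)              # bound reached without finding a lasso


def body(r):
    # Hand-written effect of x := x.next on pairs of shape [x, y.next^k].
    return frozenset((e1, e2 + ("next",)) for e1, e2 in r)


after_init = frozenset({(("x",), ("y",))})   # aliases after x := y
result = analyze_loop(body, after_init, max_unfoldings=10)
print(result)
```

For the linked-list example, the first unfolding is not a lasso, while the second one is, and the summary corresponds to the regular pair [x, y.next*]. Setting `max_unfoldings` to a small value mimics a bounded exploration in which the lasso-based rules never fire.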

## 4 Integration in SCOOP

In this section we provide a brief overview of the integration and applicability of the alias calculus in SCOOP [22], a simple object-oriented programming model for concurrency. Two main characteristics make SCOOP simple: 1) there is just one keyword that programmers have to learn and use in order to enable concurrent executions, namely, *separate*, and 2) the burden of orchestrating concurrent executions is handled within the model, therefore reducing the risk of correctness issues.

In short, the key idea of SCOOP is to associate to each object a processor, or handler (that can be a CPU, or it can also be implemented in software, as a process or thread). Assume a processor p that performs a call o.f() on an object o. If o is declared as "separate", then p sends a request for executing f() to q — the handler of o (note that p and q can coincide). Meanwhile, p can continue. Processors communicate via channels.

The Maude semantics of SCOOP in [22] is defined over tuples of shape

$$\langle p_1 :: St_1 \mid \ldots \mid p_n :: St_n, \sigma \rangle$$

where,  $p_i$  denotes a processor (for  $i \in \{1, ..., n\}$ ),  $St_i$  is the call stack of  $p_i$  and  $\sigma$  is the *state* of the system. States hold the information about the *heap* (which is a mapping of references to objects) and the *store* (which includes formal arguments, local variables, *etc.*).

The assignment instruction, for instance, is formally specified as the transition rule:

$$\frac{\text{a is fresh}}{\Gamma \vdash \langle p :: t := s; St, \sigma \rangle \rightarrow \langle p :: \text{eval}(a, s); \text{wait}(a); \text{write}(t, a.data); St, \sigma \rangle}$$
 (19)

where, intuitively, "eval(a, s)" evaluates s and puts the result on channel a, "wait(a)" enables processor p to use the evaluation result, "write(t, a.data)" sets the value of t to a.data, St is a call stack, and  $\Gamma$  is a typing environment [24] containing the class hierarchy of a program and all the type definitions.
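
The effect of rule (19) on the call stack of a single processor can be pictured with the following Python sketch. Instructions are encoded as tuples and the fresh channel is drawn from a counter; the function name `step_assignment` and this encoding are assumptions of the illustration, and the state  $\sigma$  and the typing environment  $\Gamma$  are left out.

```python
import itertools

_channel_ids = itertools.count()


def step_assignment(stack):
    """Rewrite the top of a call stack according to the shape of rule (19).

    An assignment ("assign", t, s) at the top is replaced by the sequence
    eval(a, s); wait(a); write(t, a.data), where a is a fresh channel name.
    Returns None if the top of the stack is not an assignment.
    """
    if not stack or stack[0][0] != "assign":
        return None
    _, target, source = stack[0]
    a = f"a{next(_channel_ids)}"      # fresh channel
    expanded = [("eval", a, source), ("wait", a), ("write", target, f"{a}.data")]
    return expanded + list(stack[1:])


new_stack = step_assignment([("assign", "t", "s"), ("call", "f")])
print(new_stack)
```

The rest of the stack is carried over unchanged, mirroring the role of St in the rule.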

At this point it is easy to understand that the K-rule for assignments

$$\left\langle \frac{r}{(r_1 - t)[t = (r_1/s - t)] - ot} \right\rangle_{\text{al}} \left\langle \frac{t := s}{\cdot} \dots \right\rangle_{\text{k}} \quad \text{with } r_1 = r[ot = t] \quad (15)$$

can be straightforwardly integrated in (19) by enriching the state structure with a new field encapsulating the alias information, and considering instead the transition  $\Gamma \vdash \langle p :: t := s; St, \sigma \rangle \rightarrow \langle p :: \operatorname{eval}(a,s); \operatorname{wait}(a); \operatorname{write}(t,a.data); St, \sigma' \rangle$  where

$$\sigma.aliases = r \qquad \sigma'.aliases = (r_1 - t)[t = (r_1/s - t)] - ot$$

with r and  $r_1$  as in (15). The integration of all the K-rules of the alias calculus on top of the Maude formalization of SCOOP can be achieved by following a similar approach.

For instance, consider running the command `maude SCOOP.maude ..\examples\aliasing-linked_list.maude`, corresponding to the code in (1):

$$x := y;\ \textbf{loop}\ x := x.next\ \textbf{end}.$$

The console outputs the aliased expressions for a rewriting of depth 100 which include, as expected, pairs of shape  $[x, y.next^k]$ . (The over-approximating mechanism for sequential settings is still to be implemented.)

As can be observed based on the code in aliasing-linked\_list.maude, in order to implement our applications in Maude, we use intermediate (still intuitive) representations. For instance, the class structure defining a node in a simple linked list, with field *next*, is declared as:

```
class 'NODE
    create {'make}
    ( attribute { 'ANY } 'next : [?, . , 'NODE] ; )
    [...]
end;
```

where 'next: [?, ., 'NODE] stands for an object of type NODE, that is handled by the current processor (.) and that can be Void (?), and 'make plays the role of a constructor. The intermediate representation of the instruction block in (1) is:

```
assign ('x, 'y);
until False loop ( assign ('x, 'x . 'next(nil)) ; ) end ;
```

For a detailed description of SCOOP and its Maude formalization we refer the interested reader to the work in [22].

### 4.1 Further applications of the alias calculus

Apart from providing an alias analysis tool, the alias calculus can be exploited in order to build an abstract semantics of SCOOP. For example, an abstraction of the assignment rule (19) would omit the evaluation of the right-hand side of the assignment t:=s and the associated message passing between channels:

$$\frac{\cdot}{\Gamma \vdash \langle p :: t := s; St, \, \sigma \rangle \rightarrow \langle p :: St, \, \sigma' \rangle}$$

where

$$\sigma.aliases = r \qquad \sigma'.aliases = (r_1 - t)[t = (r_1/s - t)] - ot$$

with r and  $r_1$  as in (15). This way one derives a simplified, reduced semantics of SCOOP, more appropriate for model checking, for instance; the current SCOOP formalization in Maude is often too large for this purpose. A survey on abstracting techniques on top of Maude executable semantics is provided in [19].

Furthermore, the aliasing information could be used for the so-called "dead-locking" problem, where two or more executing threads are each waiting for the other to finish. In the context of SCOOP, this is equivalent to identifying whether a set of processors reserve each other circularly (*i.e.*, there is a Coffman deadlock). This situation might occur, for instance, in a Dining Philosophers scenario, where both philosophers and forks are objects residing on their own processors. The difficulty of identifying such deadlocks stems from the fact that SCOOP processors are known from object references, which may be aliased.
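
Once the processors behind (possibly aliased) references have been resolved, checking for a circular reservation amounts to finding a cycle in a directed graph. The Python sketch below assumes a hypothetical map `waits_for` from each processor to the processors it is waiting on; how such a map would be extracted from the alias information is not addressed here.

```python
def has_circular_wait(waits_for):
    """Return True if the wait-for graph contains a cycle (depth-first search)."""
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on the current path, finished
    color = {}

    def visit(p):
        color[p] = GREY
        for q in waits_for.get(p, ()):
            c = color.get(q, WHITE)
            if c == GREY or (c == WHITE and visit(q)):
                return True           # reached a processor already on the path
        color[p] = BLACK
        return False

    return any(color.get(p, WHITE) == WHITE and visit(p) for p in waits_for)


# Two philosophers, each waiting for a fork currently reserved by the other.
print(has_circular_wait({"phil1": {"phil2"}, "phil2": {"phil1"}}))
print(has_circular_wait({"phil1": {"phil2"}, "phil2": set()}))
```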

## 5 Conclusions

In this paper we provide an extension of the alias calculus in [15] from finite alias relations to infinite ones corresponding to loops and recursive calls. Moreover, we devise an associated executable specification in the K semantic framework [27]. In Theorem 1 we show that the RL-based machinery implements an algorithm that always terminates with a sound over-approximation of "may aliasing", in non-concurrent settings. This is achieved based on the sound (finitely representable) over-approximation of "lasso shaped" alias expressions in terms of regular expressions, as in Lemma 2. We also discuss the integration and applicability of the alias calculus on top of the Maude formalization of SCOOP [22].

An immediate direction for future work is to identify interesting (industrial) case studies to be analyzed using the framework developed in this paper. We are also interested in devising heuristics comparing the efficiency and the precision (e.g., the number of false positives introduced by the alias approximations) between our approach and other aliasing techniques. In this respect, we anticipate that the rewriting modulo associativity, together with the pattern matching capabilities of Maude will accelerate the identification of the "lasso" properties and the corresponding over-approximating regular alias expressions. This could eventually provide an effective reasoning apparatus for the "may aliasing" problem.

Another research direction is to derive alias-based abstractions for analyzing concurrent programs. We foresee possible connections with the work in [13] on concurrent Kleene algebra formalizing choice, iteration, sequential and concurrent composition of programs. The corresponding definitions exploit abstractions of programs in terms of traces of events that can depend on each other. Thus, obvious challenges in this respect include: (i) defining notions of dependence for all the program constructs in this paper, (ii) relating the concurrent Kleene operators to the semantics of the SCOOP concurrency model and (iii) checking whether fixed-points approximating the aliasing information can be identified via fixed-point theorems.

Furthermore, it would be worth investigating whether the graph-based model of alias relations introduced in [15] can be exploited in order to derive finite  $\mathbb{K}$  specifications of the extended alias calculus. In case of a positive answer, the general aim is to study whether this type of representation increases the speed of the reasoning mechanism and, possibly, its accuracy. With the same purpose, we refer to a possible integration with the technique in [6] that handles points-to graphs via a stack-based algorithm for fixed-point computations.

We are also interested to what extent an abstract semantics based on aliases for SCOOP can be exploited for building more efficient analysis tools such as deadlock detectors, for instance. A survey on similar techniques that abstract away from possibly irrelevant information w.r.t. the problem under consideration is provided in [19].

**Acknowledgements.** We are grateful to the anonymous reviewers, Măriuca Asăvoae, Alexander Kogtenkov, José Meseguer, Bertrand Meyer, Benjamin Morandi and Sergey Velder for their valuable comments. The research leading to these results has received funding from the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013) / ERC Grant agreement no. 291389.

### References

- 1. E. Albert, P. Arenas, S. Genaim, and G. Puebla. Field-sensitive value analysis by field-insensitive analysis. In *Proceedings of the 2nd World Congress on Formal Methods*, FM '09, pages 370–386, Berlin, Heidelberg, 2009. Springer-Verlag.
- 2. I. M. Asavoae. Abstract semantics for alias analysis in K. *Electr. Notes Theor. Comput. Sci.*, 304:97–110, 2014.
- 3. A. Bouajjani, J. Esparza, and O. Maler. Reachability analysis of pushdown automata: Application to model-checking. In *CONCUR*, pages 135–150, 1997.
- 4. M. Burke, P. Carini, J.-D. Choi, and M. Hind. Flow-insensitive interprocedural alias analysis in the presence of pointers. In K. Pingali, U. Banerjee, D. Gelernter, A. Nicolau, and D. Padua, editors, *Languages and Compilers for Parallel Computing*, volume 892 of *Lecture Notes in Computer Science*, pages 234–250. Springer Berlin Heidelberg, 1995.
- 5. G. Caltais. Expression-based aliasing for OO-languages. *CoRR*, abs/1409.7509, 2014.
- 6. D. R. Chase, M. N. Wegman, and F. K. Zadeck. Analysis of pointers and structures. In *PLDI*, pages 296–310, 1990.
- 7. J.-D. Choi, M. Burke, and P. Carini. Efficient flow-sensitive interprocedural computation of pointer-induced aliases and side effects. In *Proceedings of the 20th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages*, POPL '93, pages 232–245, New York, NY, USA, 1993. ACM.
- 8. P. Cousot and R. Cousot. Abstract interpretation and application to logic programs. *J. Log. Program.*, 13(2&3):103–179, 1992.
- 9. A. Diwan, K. S. McKinley, and J. E. B. Moss. Type-based alias analysis. *SIGPLAN Not.*, 33(5):106–117, May 1998.
- 10. M. Emami, R. Ghiya, and L. J. Hendren. Context-sensitive interprocedural points-to analysis in the presence of function pointers. In *Proceedings of the ACM SIGPLAN 1994 Conference on Programming Language Design and Implementation*, PLDI '94, pages 242–256, New York, NY, USA, 1994. ACM.
- 11. M. Hind. Pointer analysis: haven't we solved this problem yet? In *PASTE*, pages 54–61, 2001.
- 12. M. Hind, M. Burke, P. Carini, and J.-D. Choi. Interprocedural pointer alias analysis. *ACM Trans. Program. Lang. Syst.*, 21(4):848–894, July 1999.
- 13. C. A. R. Hoare, B. Möller, G. Struth, and I. Wehrman. Concurrent Kleene algebra. In *CONCUR 2009 - Concurrency Theory, 20th International Conference, CONCUR 2009, Bologna, Italy, September 1-4, 2009. Proceedings*, pages 399–414, 2009.
- 14. S. C. Kleene. Representation of events in nerve nets and finite automata. In C. Shannon and J. McCarthy, editors, *Automata Studies*, pages 3–41. Princeton University Press, Princeton, NJ, 1956.
- 15. A. Kogtenkov, B. Meyer, and S. Velder. Alias and change calculi, applied to frame inference. *CoRR*, abs/1307.3189, 2013.
- 16. W. Landi. Undecidability of static analysis. *ACM Lett. Program. Lang. Syst.*, 1(4):323–337, Dec. 1992.
- 17. W. Landi and B. G. Ryder. Pointer-induced aliasing: A problem classification. In *Proceedings of the 18th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages*, POPL '91, pages 93–103, New York, NY, USA, 1991. ACM.
- 18. J. R. Larus and P. N. Hilfinger. Detecting conflicts between structure accesses. In *PLDI*, pages 21–34, 1988.
- 19. J. Meseguer and G. Rosu. The rewriting logic semantics project: A progress report. In *Fundamentals of Computation Theory - 18th International Symposium, FCT 2011, Oslo, Norway, August 22-25, 2011. Proceedings*, pages 1–37, 2011.
- 20. B. Meyer. *Eiffel: The Language*. Prentice-Hall, 1991.
- 21. A. Miné. Field-sensitive value analysis of embedded C programs with union types and pointer arithmetics. In *Proceedings of the 2006 ACM SIGPLAN/SIGBED Conference on Language, Compilers, and Tool Support for Embedded Systems*, LCTES '06, pages 54–63, New York, NY, USA, 2006. ACM.
- 22. B. Morandi, M. Schill, S. Nanz, and B. Meyer. Prototyping a concurrency model. In *ACSD*, pages 170–179, 2013.
- 23. E. M. Myers. A precise inter-procedural data flow algorithm. In *Proceedings of the 8th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages*, POPL '81, pages 219–230, New York, NY, USA, 1981. ACM.
- 24. P. Nienaltowski. *Practical Framework for Contract-based Concurrent Object-oriented Programming*. ETH, 2007.
- 25. M. O. Rabin and D. Scott. Finite automata and their decision problems. *IBM J. Res. Dev.*, 3(2):114–125, Apr. 1959.
- 26. V. Robert and X. Leroy. A formally-verified alias analysis. In *CPP*, pages 11–26, 2012.
- 27. G. Rosu and T. F. Serbanuta. K overview and SIMPLE case study. In *Proceedings of International K Workshop (K'11)*, ENTCS. Elsevier, 2013. To appear.
- 28. T.-F. Serbanuta and G. Rosu. K-Maude: A rewriting based tool for semantics of programming languages. In *WRLA*, pages 104–122, 2010.
- 29. M. Sridharan, S. Chandra, J. Dolby, S. J. Fink, and E. Yahav. Alias analysis for object-oriented programs. In *Aliasing in Object-Oriented Programming*, pages 196–232, 2013.
- 30. R. P. Wilson and M. S. Lam. Efficient context-sensitive pointer analysis for C programs. In *Proceedings of the ACM SIGPLAN 1995 Conference on Programming Language Design and Implementation*, PLDI '95, pages 1–12, New York, NY, USA, 1995. ACM.

# Specifying and Verifying Concurrent C Programs with TLA+

Amira Methni<sup>1,3</sup>, Matthieu Lemerre<sup>1</sup>, Belgacem Ben Hedia<sup>1</sup>, Serge Haddad<sup>2</sup>, Kamel Barkaoui<sup>3</sup>

<sup>1</sup> CEA-List, Saclay, France
<sup>2</sup> LSV, ENS Cachan, CNRS & INRIA, France
<sup>3</sup> CNAM, CEDRIC, France

Abstract. Verifying software systems automatically from their source code rather than modelling them in a dedicated language gives more confidence in establishing their properties. Here we propose a formal specification and verification approach for concurrent C programs directly based on the semantics of C. We define a set of translation rules and implement them in a tool (C2TLA+) that automatically translates C code into a TLA+ specification. The TLC model checker can use this specification to generate a model, which makes it possible to check the absence of runtime errors and dead code in the C program in a given configuration. In addition, we show how translated specifications interact with manually written ones to: check the C code against safety or liveness properties; provide concurrency primitives or model hardware that cannot be expressed in C; and use abstract versions of translated C functions to address the state-explosion problem. All these verifications have been conducted on an industrial case study, which is a part of the microkernel of the PharOS real-time system.

# 1 Introduction

Most software systems like the Linux kernel or the Apache Webserver are implemented in a low-level language such as C, which is one of the most used programming languages in industry. Verifying C code is challenging, in particular due to the presence of pointers and pointer arithmetic.

Moreover, C software systems are often concurrent, and traditional testing techniques are ill-suited to checking the correctness of the implementation. Thus, the use of formal verification techniques is essential. We address these issues in the context of formal verification of operating system microkernels written in C. In this paper, we focus on model checking, a popular technique for the verification of correctness properties of finite-state systems. Given a set of properties expressed in a temporal logic and a model, it automatically analyzes the state space of the model and checks whether the model satisfies the properties [6]. To apply this technique to the verification of C programs, the target modeling language should express all C features, handle concurrency, and allow stating the properties that we want to verify, and its tools should scale up to large systems.

#### A. Methni et al.

Contribution. Our main contribution is a formal specification and verification approach for concurrent C programs, based on both axiomatic (e.g. pre-post conditions) and operational (executable model) specifications of a C implementation. We use TLA+ [17] as the formal specification language. In this approach, we translate C code into an executable TLA+ specification using the C2TLA+ tool that we present in the paper. The generated specifications can be checked for runtime errors in the C code. We show how the specifications thus generated can be complemented with manually written TLA+ specifications: to provide concurrency primitives, to model hardware that cannot be expressed in C, to check the C code against safety or liveness properties, and to provide an abstract operational specification. In the latter case, the operational specification can be used in place of the C code in order to verify the whole system. Preliminary experiments hint that this could considerably lessen the state explosion problem. These examples are presented in a concrete case study, which is part of the microkernel of the real-time operating system PharOS [19].

Outline. The rest of the paper is organized as follows. We discuss related work in Section 2. We give an overview of TLA+ in Section 3. Section 4 presents the global approach and focuses on the translation from C to TLA+. Section 5 presents a concrete application of the approach on the case study. Section 6 concludes and presents future research directions.

# 2 Related Work

There are a variety of formal verification techniques. Among them are deductive verification techniques based on theorem proving, such as VCC [7]. These techniques provide a rigorous approach but usually require a lot of human effort and user expertise. Model checking requires less human effort because it is fully automated once the system and its properties are specified. However, it is restricted to finite-state systems. In what follows we focus on the model checking tools for C programs related to our work.

SLAM [2] was the first model checker for C programs to implement the counter-example-guided predicate abstraction refinement (CEGAR) approach [5]. This approach has been used later in the BLAST [11] toolkit. SLAM and BLAST have been used to check device drivers but they are only used for sequential C programs.

Besides CEGAR-based tools, another approach is to transform the C code into the input language of a model checker. Modex [14] can automatically extract a Promela model from a C code implementation. The generated Promela code is then checked with the SPIN [12] model checker. Promela is a simple language that does not handle pointers and has no procedure calls. Modex handles these missing features by including embedded declarations and statements inside Promela specifications. The embedded code fragments cannot be checked by SPIN and can contain a division-by-zero error or a null pointer dereference. To mitigate this problem, Modex instruments additional checks using assertions. However, not all errors can be anticipated and the model checker can crash [13].

**Fig. 1.** TLA syntax [17]

CBMC [4] is a bounded model checker for ANSI C programs that translates a program into a formula (in Static Single Assignment form) which is then fed to a SAT or SMT solver to check its satisfiability. It can be used to verify array bounds, pointer safety, exceptions and user-specified assertions. However, CBMC explores program behavior exhaustively only up to a given depth, i.e., it is restricted to programs without deep loops [10].

PlusCal [18] is a high-level language for expressing multiprocess algorithms. A PlusCal algorithm can be automatically translated into a TLA+ specification. PlusCal-2 [1] improves Lamport's PlusCal language by adding new constructs like hierarchical processes and by allowing atomicity to be specified for parts of the code. However, it does not support some constructs of imperative programming such as pointer-based structures, and it does not handle function calls. PlusCal is also an algorithm language that can be used to replace pseudo code but cannot be used in the final implementation.

In this work, we use TLA+ as a formal framework that provides enough expressive power to specify the semantics of a programming language. It is supported by the TLC model checker and the TLAPS [8] prover. Moreover, TLA+ is a logic that can reason about concurrent systems and can express both safety and liveness properties, unlike SLAM, BLAST and CBMC, which have limited support for concurrent properties as they only check safety properties. Furthermore, TLA+ provides a mechanism for structuring large specifications using a refinement process between different levels of abstraction, unlike SPIN and CBMC.

# 3 An Overview of TLA+

TLA+ [17] is the specification language of the Temporal Logic of Actions (TLA). TLA is a variant of linear temporal logic introduced by Lamport [16] for specifying and reasoning about concurrent systems. The syntax of TLA is given in Figure 1 (the symbol  $\triangleq$  means equal by definition). Readers interested in a more detailed presentation of TLA+ can refer to Lamport's book [17].

TLA+ specifies a system by describing its possible behaviors. A behavior is an infinite sequence of states. A state is an assignment of values to variables. A state function is a nonboolean expression built from constants, variables and constant operators and it assigns a value to each state. For example, y+2 is a state function that assigns to state s two plus the value that s assigns to the variable y. An action is a boolean expression containing constants, variables and primed variables (adorned with the "'" operator). Unprimed variables refer to variable values in the current state and primed variables refer to their values in the next state. Thus, an action represents a relation between old states and new states. A state predicate (or predicate for short) is an action with no primed variables.
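
These notions can be mirrored in Python as a rough analogy, with states encoded as dictionaries from variable names to values. This encoding and the function names below are assumptions of the sketch; TLA+ itself is a logic, not a programming language.

```python
def y_plus_two(s):
    """The state function y + 2: maps a state to a value."""
    return s["y"] + 2


def increment(s, s_next):
    """The action y' = y + 1: a relation between an old and a new state."""
    return s_next["y"] == s["y"] + 1


def y_is_positive(s):
    """The state predicate y > 0: an action with no primed variables."""
    return s["y"] > 0


old, new = {"y": 3}, {"y": 4}
print(y_plus_two(old))        # value assigned to the state `old`
print(increment(old, new))    # is (old, new) an `increment` step?
print(y_is_positive(old))
```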

TLA+ formulas are built up from actions and predicates using boolean operators ( $\neg$  and  $\land$  and others that can be derived from these two), quantification over logical variables ( $\forall$ ,  $\exists$ ), and the unary temporal operator  $\Box$  (always) of linear temporal logic [20].

The behaviors satisfying this specification are the ones that represent correct behaviors of the system, where a behavior represents a conceivable history of a universe that may contain the system.

The predicate "ENABLED  $\mathcal{A}$ ", where  $\mathcal{A}$  is an action, is defined to be true in a state s iff there exists some state t such that the pair of states  $\langle s,t\rangle$  satisfies  $\mathcal{A}$ . The formula  $[\mathcal{A}]_{vars}$ , where  $\mathcal{A}$  is an action and vars the tuple of all system variables, is equal to  $(\mathcal{A} \vee (vars' = vars))$  where vars' is the expression obtained by priming all variables in vars. It asserts that every step (pair of successive states) is either an  $\mathcal{A}$  step or else leaves the values of all variables vars unchanged. TLA+ defines the abbreviation "UNCHANGED vars" to denote that vars' = vars. While TLA+ permits a variety of specification styles, the specification that we use is defined by:

 $Spec \triangleq Init \wedge \Box [Next]_{vars} \wedge Fairness$  (1)

where:

- Init is a state predicate describing the possible initial states by assigning values to all system variables,
- Next is an action representing the program's next-state relation,
- vars is the tuple of all variables,
- Fairness is an optional formula representing weak or strong assumptions about the execution of actions.

Formula Spec is true of a behavior  $\sigma$  iff Init is true of the first state of  $\sigma$  and every step of  $\sigma$  is either a Next step or a "stuttering step", in which none of the specified variables change their values, and Fairness holds.
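The safety part of formula (1) can be illustrated on a finite prefix of a behavior. The following Python sketch is only an illustration under simplifying assumptions: states are dictionaries, the names `init`, `next_action` and `satisfies_safety_part` are hypothetical, the example action is y' = y + 2, and the Fairness conjunct is ignored because it cannot be judged on a finite prefix.

```python
def init(state):
    # Init: a state predicate (no primed variables).
    return state["y"] == 0

def next_action(s, t):
    # Next: an action, i.e. a relation between state s and next state t (y' = y + 2).
    return t["y"] == s["y"] + 2

def box_action_vars(behavior, action, variables):
    # [A]_vars on every step: an A step or a stuttering step (vars' = vars).
    for s, t in zip(behavior, behavior[1:]):
        stutter = all(t[v] == s[v] for v in variables)
        if not (action(s, t) or stutter):
            return False
    return True

def satisfies_safety_part(behavior, variables=("y",)):
    # Init holds in the first state and every step satisfies [Next]_vars.
    return init(behavior[0]) and box_action_vars(behavior, next_action, variables)

good = [{"y": 0}, {"y": 2}, {"y": 2}, {"y": 4}]   # contains one stuttering step
bad = [{"y": 0}, {"y": 3}]                        # neither a Next step nor stuttering
```

Here `good` is accepted even though its second step leaves y unchanged, which is exactly what the subscript vars permits.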

The TLA+ formula  $Spec \Rightarrow \phi$  is valid when the model represented by Spec satisfies the property  $\phi$ , or implements the model  $\phi$ .

TLA+ has a model checker called TLC that can be used to check the validity of safety and liveness properties. TLC handles specifications that have the standard form of the formula (1). It requires a configuration file which defines the finite-state instance to analyze. TLC begins by generating all states satisfying the initial predicate Init. Then, it generates every possible next-state t such that the pair of states  $\langle s,t\rangle$  satisfies Next and the Fairness constraints, looking for a state where an invariant is violated. Finally, it checks temporal properties over the state space.
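The exploration described above resembles a standard explicit-state reachability search. The sketch below is a generic breadth-first search written for illustration, not TLC's actual implementation; the function name `check_invariant` and the toy counter system are assumptions, and temporal (liveness) checking is left out.

```python
from collections import deque

def check_invariant(init_states, successors, invariant):
    """Explore all reachable states; return (True, []) or (False, error_trace)."""
    parent = {}
    queue = deque()
    for s in init_states:
        if s not in parent:
            parent[s] = None
            queue.append(s)
    while queue:
        s = queue.popleft()
        if not invariant(s):
            # Rebuild the trace that leads to the bad state.
            trace = []
            while s is not None:
                trace.append(s)
                s = parent[s]
            return False, trace[::-1]
        for t in successors(s):
            if t not in parent:
                parent[t] = s
                queue.append(t)
    return True, []

def counter_successors(s):
    # Toy next-state relation: a counter modulo 4.
    return [(s + 1) % 4]

ok, trace = check_invariant([0], counter_successors, lambda s: s != 3)
```

Because the search is breadth-first, the reported trace is a shortest path from an initial state to a violating state.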



Fig. 2. Specification and verification process

# 4 Specification and Verification Process

### 4.1 Proposed Approach

Approach Workflow. The specification and verification process is illustrated in Figure 2. The first step of the process is to translate an implementation, provided as one or more .c files, into a TLA+ specification using our translator C2TLA+. Before translation, the C files are parsed and normalized according to CIL (C Intermediate Language) [21]. Normalization to CIL makes programs more amenable to analysis and transformation. In particular, all expressions containing side-effects are put into separate statements (introducing temporary variables); initializers for local variables are turned into assignments; all forms of loops (while, for and do-while) are normalized as a single while(1) looping construct plus explicit goto statements.

After obtaining the Abstract Syntax Tree (AST) of the C program, C2TLA+ generates the TLA+ specification according to a set of translation rules described in Subsection 4.2. The whole system is composed of TLA+ modules resulting from C translation or manual specification that come from different sources:

- Several standard modules are provided with TLA+. They contain the definitions of basic operators such as Head, Tail, Len (for length),  $\circ$  (for concatenation), and SubSeq (for subsequence), which are defined in the Sequences module.
- The *Runtime* module contains the TLA+ definition of arithmetic, logical and relational operators used by C2TLA+, as well as the definition of *load()* and *store()* for loading/storing an lvalue in the memory.
- Modules resulting from translation. C2TLA+ generates for each .c file a TLA+ module and the *Parameters* module which contains the definition of constants, type sizes, offsets of member fields and variables used by the translation. It also defines the initial predicate *Init*, the action *Next* and the specification formula *Spec*. For simplicity, we assume that the size of an integer or a pointer is 1 (one memory cell).

- Optional manual modules can be specified by the user. They provide concurrency primitives or hardware behavior that cannot be expressed in C, or an abstract model.

The set of properties is manually specified. Then, all the modules are integrated to form the complete specification, which is given to TLC to generate the model and check the properties (or refinements) to be verified. If a property is not satisfied, TLC reports a trace that leads to the bad state. TLC also provides coverage information, i.e., the number of times each action was "executed" to construct a new state. Using this information, we can identify actions that are never "executed" and which might indicate an error in the specification. Both the trace and coverage information can be translated back to C.

The Considered Subset of C. We restrict ourselves to a subset of C resulting from the simplifications done by CIL. Table 1 gives the BNF representation of the AST of CIL for this subset. The considered aspects include basic data-types (int, struct, enum), integer operations, arrays, pointers, pointer arithmetic, all kinds of control flow statements, function calls and recursion. Currently, we do not handle float types, non-portable conversions between objects of different types, dynamic allocation, function calls through pointers, and assignment of structs (not needed by our case study), but the translator could be updated to handle them.

**Table 1.** BNF representation of the AST of CIL for the considered subset of C. The symbols  $+_{pa}/-_{pa}$  denote addition/subtraction between a pointer and an integer,  $-_{pp}$  denotes subtraction between two pointers, and  $\varepsilon$  is a terminal symbol that denotes an empty element.

```
…         ::= <decls> (<fun_def>)*
<decls>   ::= ε_decl | <decl> <decls>
<decl>    ::= <type> VAR_ID ;
<params>  ::= …
<param>   ::= …
<fun_def> ::= <type> FUN_ID (<params>) { <decl> <stmt> }
<type>    ::= … | <type> * | struct { (<type> VAR_ID ;)* }
<stmt>    ::= { (<stmt> ;)* } | while(1) <stmt>
            | if <expr> <stmt> (else <stmt>)?
            | <lval> = <expr> | <lval> = FUN_ID ((<expr> ,)*)
            | LABEL_ID : <stmt> | goto LABEL_ID | break | continue
            | return (<expr>)? | ε_stmt          /* skip instruction */
<expr>    ::= …
<lval>    ::= …
<offs>    ::= …
…         ::= [a-zA-Z][0-9a-zA-Z_]*
CONSTANT  ::= [1-9]([0-9])*
```

### 4.2 Memory Layout of a Concurrent C Program

A concurrent program consists of several interleaved sequences of operations called *processes* (corresponding to threads in C). C2TLA+ assigns a unique identifier to each process, and defines the constant ProcSet to be the set of all process identifiers.

Fig. 3. Example of C code in which one process (with id 0) executes function P0() and the second one executes function P1(). The arrows in the C code indicate which statement the process id is executing. The top of  $stack\_regs[0]$  indicates that process 0 is executing the statement with label 9 of function max().

The memory layout of a C program is organized in C2TLA+ into four regions:

- A region that contains global (and static) variables. This region is represented by an array, called mem, that maps addresses to values. This memory region is shared by all processes.
- A region that contains local variables and function parameters. It is represented by the TLA+ variable stack\_data, a 2-dimensional array: one dimension corresponds to the process id (the stack is not shared between processes); the other to addresses (i.e., offsets in the stack). The stack of each process is divided into stack frames whose boundaries (for each process) are given in another variable, stack\_regs. Each stack frame corresponds to a call to a function which has not yet returned. Note that this representation allows a function to access variables in its callers (through pointers), which is frequent in C.
- A region that stores the program counter of each process, i.e., which statement is being executed. This information needs to be saved and restored on function calls and returns. Rather than saving the program counter together with the data (in the stack\_data variable), we find it simpler to organize the registers of the program as a stack. We define the TLA+ variable  $stack\_regs$, associating to each process a stack of records. Each record contains two fields:
  - pc, the program counter, points to the current statement of the function being executed, represented by a tuple (function name, label);
  - fp, the frame pointer, contains the base offset of the current stack frame.

  Note that we do not need to store the stack pointer, which is already given by " $Len(stack\_data[id])$ ". Each element of the stack of records represents the registers of a function in the call stack; in particular, " $Head(stack\_regs[id])$ " represents the registers of the function currently being executed by the process id.
- A region that contains the values returned by a process. It is modeled using an array called *ret*, indexed by the process identifier.

```
\begin{array}{l} load(id, ptr) \stackrel{\Delta}{=} \text{IF } ptr.loc = \text{"mem"} \text{ THEN } mem[ptr.offs] \\ \qquad \text{ELSE } stack\_data[id][Head(stack\_regs[id]).fp + ptr.offs] \end{array}
```

Fig. 4. Definition of load() operator

C2TLA+ maps each C variable to a unique TLA+ constant modeled by a record with two fields. The first one, loc, determines the memory region where the variable is stored (mem or  $stack\_data$ ). The other one, offs, defines the offset of the data in the memory region. Figure 3 provides a snapshot of the memory for a C code example. The TLA+ expression [ $loc \mapsto "mem"$ ,  $offs \mapsto 0$ ] denotes the record  $Addr\_x$  such that  $Addr\_x.loc$  equals "mem" and  $Addr\_x.offs$  equals 0. For a local variable, offs is relative to the start of the stack frame of the current function, while for a global variable it is the absolute index in mem.

C2TLA+ assigns to global (and static) variables that are not explicitly initialized the value 0 for integers, and [ $loc \mapsto Null$ ,  $offs \mapsto Null$ ] for pointers. For local variables, it assigns the Undef value. Null and Undef are TLA+ "model values", which are unspecified values that TLC considers to be unequal to any value that can be expressed in TLA+.

Loading and Assignment. An lvalue is a kind of expression that evaluates to an address and refers to a region of storage. Accessing the value stored in this region is performed using the load() operator (defined in Figure 4), which uses the TLA+ construct IF/THEN/ELSE.

The position of a parameter or local variable in  $stack\_data[id]$  is relative to the base of the stack frame of the current function, which equals  $Head(stack\_regs[id]).fp$ .

Fig. 5. Definition of store() operator
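The addressing scheme behind load() and store() can be mimicked in a few lines of Python. This is a minimal sketch, not the operators generated by C2TLA+: the class names `Addr` and `Memory` are hypothetical, indices are 0-based (TLA+ sequences are 1-based), the head of `stack_regs[pid]` is list element 0, and `store` is written here simply as the mirror image of load() from Figure 4, which is an assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Addr:
    loc: str    # "mem" or "stack_data"
    offs: int   # absolute index in mem, or offset within the current stack frame

class Memory:
    def __init__(self, mem, stack_data, stack_regs):
        self.mem = list(mem)
        self.stack_data = {pid: list(cells) for pid, cells in stack_data.items()}
        self.stack_regs = {pid: list(regs) for pid, regs in stack_regs.items()}

    def _index(self, pid, ptr):
        # Frame pointer of the function currently executed by pid, plus the offset.
        return self.stack_regs[pid][0]["fp"] + ptr.offs

    def load(self, pid, ptr):
        if ptr.loc == "mem":
            return self.mem[ptr.offs]
        return self.stack_data[pid][self._index(pid, ptr)]

    def store(self, pid, ptr, value):
        if ptr.loc == "mem":
            self.mem[ptr.offs] = value
        else:
            self.stack_data[pid][self._index(pid, ptr)] = value

m = Memory(mem=[7, 8],
           stack_data={0: [10, 20, 30]},
           stack_regs={0: [{"pc": ("max", "lbl_9"), "fp": 1}]})
```

With the frame pointer at 1, the local at offset 1 lives in cell 2 of the stack of process 0, while global addresses index `mem` directly.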

Arrays, Pointer Arithmetic and Structure Member. Accessing an array element in C2TLA+ requires computing the offset using the size of the elements, the index and the base address of the array. For example, accessing z[a] is translated into:

```
load(id, [loc \mapsto Addr\_z.loc, offs \mapsto (Addr\_z.offs + (load(id, Addr\_a) * Size\_of\_int))])
```

The same kind of computation is used to perform pointer arithmetic. Similarly, accessing a structure member is achieved by shifting the base address of the structure by the constant accumulated size of all previous members. For example, accessing point.y is translated into:

```
load(id, [loc \mapsto Addr\_point.loc, offs \mapsto (Addr\_point.offs + Offset\_point\_y)])
```
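The two offset computations above amount to simple arithmetic on address records. The following Python sketch illustrates it with addresses as (loc, offs) pairs; the helper names `element_addr`, `member_addr` and `struct_offsets` are hypothetical, and every member is assumed to occupy one memory cell, as in the simplification stated earlier.

```python
SIZE_OF_INT = 1  # one memory cell per integer or pointer

def element_addr(base, index, elem_size=SIZE_OF_INT):
    # Address of base[index]: same region, offset shifted by index * element size.
    loc, offs = base
    return (loc, offs + index * elem_size)

def struct_offsets(members):
    # Offset of each member = accumulated size of all previous members.
    offsets, total = {}, 0
    for name, size in members:
        offsets[name] = total
        total += size
    return offsets

def member_addr(base, member_offset):
    # Address of base.member: same region, offset shifted by the member offset.
    loc, offs = base
    return (loc, offs + member_offset)

point_offsets = struct_offsets([("x", SIZE_OF_INT), ("y", SIZE_OF_INT)])
```

For instance, with an array based at offset 4 in mem, element 3 is at offset 7, and member y of a two-integer structure sits one cell after its base.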

### 4.3 Intra-procedural Control Flow

Function Definition. Each C function definition is translated into an operator with the process identifier id as argument. The function body is translated into the disjunction of the translations of the statements it contains. A C statement is translated into the conjunction of actions that are done simultaneously. In a given state, one and only one action is true (i.e., feasible). The translation of function dec() of the example is as follows:

```
\begin{array}{l} dec(id) \triangleq \\ \quad \lor \land Head(stack\_regs[id]).pc = \langle \text{"dec"}, \text{"lbl\_19"} \rangle \\ \qquad \land store(id, Addr\_dec\_i, minus(load(id, Addr\_y), load(id, Addr\_dec\_param\_b))) \\ \qquad \land stack\_regs' = [stack\_regs \text{ EXCEPT } ![id] = \\ \qquad \qquad \langle [pc \mapsto \langle \text{"dec"}, \text{"lbl\_20"} \rangle, fp \mapsto Head(stack\_regs[id]).fp] \rangle \circ Tail(stack\_regs[id])] \\ \qquad \land \text{UNCHANGED } \langle ret \rangle \\ \quad \lor \land Head(stack\_regs[id]).pc = \langle \text{"dec"}, \text{"lbl\_20"} \rangle \\ \qquad \land stack\_regs' = \dots \end{array}
```

The translation of each statement s simultaneously asserts that the program counter points to s; performs the action corresponding to that statement; and updates the program counter to point to the next statement to execute.
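The shape of this translation (a guard on the program counter, an effect, and a program counter update, combined by disjunction) can be sketched in Python. The sketch deliberately flattens the state into one dictionary instead of the mem/stack\_data/stack\_regs layout, and the function names are hypothetical; it only illustrates the guarded-action pattern for the two statements of dec().

```python
def dec_lbl_19(state):
    # Guard: the action is enabled only if pc points to this statement.
    if state["pc"] != ("dec", "lbl_19"):
        return None
    new = dict(state)
    new["i"] = state["y"] - state["b"]      # effect: i = y - b
    new["pc"] = ("dec", "lbl_20")           # pc moves to the next statement
    return new

def dec_lbl_20(state):
    if state["pc"] != ("dec", "lbl_20"):
        return None
    new = dict(state)
    new["ret"] = state["i"]                 # effect: return i
    new["pc"] = None                        # the function has returned
    return new

def dec(state):
    # Disjunction of the statement actions: keep the successors of enabled ones.
    results = [action(state) for action in (dec_lbl_19, dec_lbl_20)]
    return [r for r in results if r is not None]

s0 = {"pc": ("dec", "lbl_19"), "y": 5, "b": 2, "i": None, "ret": None}
s1 = dec(s0)[0]
s2 = dec(s1)[0]
```

Since the guards test distinct program counter values, at most one disjunct is enabled in any state, matching the remark that exactly one action is feasible.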

Jump Statements. The translation of goto/break/continue statements consists of updating  $stack\_regs[id]$  to the successor statement. The goto 11 statement in function max() is translated as:

```
\begin{array}{l} \lor \land Head(stack\_regs[id]).pc = \langle \text{"max"}, \text{"lbl\_10"} \rangle \\ \quad \land stack\_regs' = [stack\_regs \text{ EXCEPT } ![id] = \\ \qquad \langle [pc \mapsto \langle \text{"max"}, \text{"lbl\_12"} \rangle, fp \mapsto Head(stack\_regs[id]).fp] \rangle \circ Tail(stack\_regs[id])] \\ \quad \land \text{UNCHANGED } \langle mem, stack\_data, ret \rangle \end{array}
```

Selection Statements. C integer expressions used in an if condition are normalized by C2TLA+. A selection statement causes the program control (i.e.,  $stack\_regs[id]$ ) to be transferred to a specific block depending on whether the guard expression is true or not. The translation of the if statement in function max() is as follows:

```
\begin{array}{l} \lor \land Head(stack\_regs[id]).pc = \langle \text{"max"}, \text{"lbl\_9"} \rangle \\ \quad \land \text{IF } (Gt(load(id, Addr\_max\_param\_u), load(id, Addr\_max\_param\_v)) \neq [val \mapsto 0]) \\ \qquad \text{THEN } stack\_regs' = [stack\_regs \text{ EXCEPT } ![id] = \\ \qquad \qquad \langle [pc \mapsto \langle \text{"max"}, \text{"lbl\_10"} \rangle, fp \mapsto Head(stack\_regs[id]).fp] \rangle \circ Tail(stack\_regs[id])] \\ \qquad \text{ELSE } stack\_regs' = [stack\_regs \text{ EXCEPT } ![id] = \\ \qquad \qquad \langle [pc \mapsto \langle \text{"max"}, \text{"lbl\_11"} \rangle, fp \mapsto Head(stack\_regs[id]).fp] \rangle \circ Tail(stack\_regs[id])] \\ \quad \land \text{UNCHANGED } \langle mem, stack\_data, ret \rangle \end{array}
```

Iteration Statement. All loops in C are normalized by CIL as a single while(1) looping construct (possibly with additional if and break statements), which we translate like other jump statements.

### 4.4 Inter-procedural Control Flow

Function Call. A function call is translated into two actions. Before calling a function f, its stack frame is pushed onto  $stack\_data[id]$, which obeys the LIFO order. The  $stack\_regs[id]$  is updated by changing its head to a record whose pc field points to the action to be performed once the call has finished. A record with pc pointing to the first statement of the called function, and fp to the new stack frame, is then pushed on top of  $stack\_regs[id]$. Once the function returns, the second action copies the return value. For instance, the translation of r1 = dec(2) is as follows:

```
\begin{array}{l} \lor \land Head(stack\_regs[id]).pc = \langle \text{"P1"}, \text{"lbl\_30"} \rangle \\ \quad \land stack\_data' = [stack\_data \text{ EXCEPT } ![id] = stack\_data[id] \circ \langle [val \mapsto 2], [val \mapsto Undef] \rangle] \\ \quad \land stack\_regs' = [stack\_regs \text{ EXCEPT } ![id] = \\ \qquad \langle [pc \mapsto \langle \text{"dec"}, \text{"lbl\_19"} \rangle, fp \mapsto Len(stack\_data[id]) + 1] \rangle \\ \qquad \circ \langle [pc \mapsto \langle \text{"P1"}, \text{"lbl\_30\_1"} \rangle, fp \mapsto Head(stack\_regs[id]).fp] \rangle \circ Tail(stack\_regs[id])] \\ \quad \land \text{UNCHANGED } \langle mem, ret \rangle \\ \lor \land Head(stack\_regs[id]).pc = \langle \text{"P1"}, \text{"lbl\_30\_1"} \rangle \\ \quad \land store(id, Addr\_P1\_r1, ret[id]) \\ \quad \land stack\_regs' = [stack\_regs \text{ EXCEPT } ![id] = \\ \qquad \langle [pc \mapsto \langle \text{"P1"}, \text{"lbl\_31"} \rangle, fp \mapsto Head(stack\_regs[id]).fp] \rangle \circ Tail(stack\_regs[id])] \\ \quad \land \text{UNCHANGED } \langle ret \rangle \end{array}
```

Return Statement. Once the function returns, the top of  $stack\_regs[id]$  is popped and its stack frame is removed from  $stack\_data[id]$  using the SubSeq operator. The returned value is stored in ret[id]. The return i statement of function dec() is translated as follows:

```
\begin{array}{l} \lor \land Head(stack\_regs[id]).pc = \langle \text{"dec"}, \text{"lbl\_20"} \rangle \\ \quad \land stack\_regs' = [stack\_regs \text{ EXCEPT } ![id] = Tail(stack\_regs[id])] \\ \quad \land stack\_data' = [stack\_data \text{ EXCEPT } ![id] = \\ \qquad SubSeq(stack\_data[id], 1, Head(stack\_regs[id]).fp - 1)] \\ \quad \land ret' = [ret \text{ EXCEPT } ![id] = load(id, Addr\_dec\_i)] \\ \quad \land \text{UNCHANGED } \langle mem \rangle \end{array}
```
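The stack discipline of the call and return actions can be sketched for a single process in Python. This is an illustrative model only: the functions `call` and `ret` are hypothetical names, lists are 0-based (so the new frame pointer is `len(stack_data)` rather than Len + 1), `None` stands in for Undef, and the head of the register stack is list element 0.

```python
def call(stack_data, stack_regs, callee_entry, return_pc, args, n_locals):
    # Push the callee's frame: parameters first, then uninitialized locals.
    fp = len(stack_data)
    new_data = stack_data + list(args) + [None] * n_locals
    # The caller's record now points to the action that copies the return value.
    caller = dict(stack_regs[0], pc=return_pc)
    callee = {"pc": callee_entry, "fp": fp}
    return new_data, [callee, caller] + stack_regs[1:]

def ret(stack_data, stack_regs, value):
    # Pop the register record and cut the stack at the callee's frame pointer.
    fp = stack_regs[0]["fp"]
    return stack_data[:fp], stack_regs[1:], value

data0 = [1]
regs0 = [{"pc": ("P1", "lbl_30"), "fp": 0}]
data1, regs1 = call(data0, regs0, ("dec", "lbl_19"), ("P1", "lbl_30_1"), [2], 1)
data2, regs2, returned = ret(data1, regs1, 0)
```

After the return, both stacks are back to their shape before the call, except that the caller's program counter now designates the second action of the call translation.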

### 4.5 Generating the Specification

In addition to generating constant and variable declarations, C2TLA+ also defines the main specification in the *Parameters* module by generating:

- The *Init* predicate that initializes all variables of the system.
- The tuple of all variables  $vars \stackrel{\Delta}{=} \langle mem, stack\_data, stack\_regs, ret \rangle$ .
- process(id), which defines the next-state action of process id. It asserts that one of the functions is being executed until stack\_regs[id] becomes empty (i.e., the process has returned from its entry function). For the C code example, it is defined as:

$$\begin{array}{l} process(id) \stackrel{\Delta}{=} \wedge stack\_regs[id] \neq \langle \rangle \\ \qquad \wedge (max(id) \lor inc(id) \lor dec(id) \lor P0(id) \lor P1(id)) \end{array}$$

- The next-state action *Next* of all processes, which states that one of the processes that have not finished is nondeterministically chosen to execute one step.

$$\begin{array}{l} Next \stackrel{\Delta}{=} \lor \exists id \in ProcSet : process(id) \\ \qquad \lor (\forall id \in ProcSet : (stack\_regs[id] = \langle \rangle) \land (\text{UNCHANGED } vars)) \end{array}$$

- The complete specification  $Spec \triangleq Init \wedge \Box [Next]_{vars} \wedge WF_{vars}(Next)$ . Fairness assumptions must be included if we want to check liveness properties. We assume only weak fairness.

The specification can be checked by TLC without the user having to define anything manually. Errors that occur because TLC could not evaluate an expression correspond to a runtime error in the C code, such as dereferencing a null pointer, and are reported to the user. C2TLA+ also generates the *Termination* property, which asserts that the register stack of every process eventually becomes empty. This property is useful in some test cases.

 $Termination \stackrel{\Delta}{=} \lozenge(\forall \ id \in ProcSet \ : \ stack\_regs[id] = \langle \rangle)$ 
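The interleaving semantics of *Next* and the termination condition can be illustrated with a toy Python model. Everything here is an assumption made for illustration: a global state is a tuple holding one register stack per process, each stack is a tuple of remaining labels, and a process step simply pops the head of its stack instead of executing translated C statements.

```python
def process(state, pid):
    # process(id): enabled only while the register stack of pid is non-empty.
    stack = state[pid]
    if not stack:
        return []
    return [state[:pid] + (stack[1:],) + state[pid + 1:]]

def next_states(state):
    # Next: some unfinished process takes one step (nondeterministic choice) ...
    successors = [t for pid in range(len(state)) for t in process(state, pid)]
    # ... or, when every process has finished, all variables stay unchanged.
    if not successors and all(not stack for stack in state):
        successors = [state]
    return successors

def terminated(state):
    # State predicate under the eventually operator in Termination.
    return all(len(stack) == 0 for stack in state)

s0 = (("a", "b"), ("c",))
done = ((), ())
```

From `s0` there are two successors, one per process that can move, which is the source of the state explosion discussed in the experiments.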

# 5 Implementation and Experiments

C2TLA+ is developed as a Frama-C [9] plugin, implemented in OCaml. Frama-C uses CIL to reorganize and simplify C code, produces an Abstract Syntax Tree (AST) and passes it to the C2TLA+ translator. We have used C2TLA+ in a case study, described in Section 5.1. We use this case study as an example to describe the interactions between generated specifications and manually specified ones.

### 5.1 Case Study Description

We have applied our approach and tools (C2TLA+, TLC) to a critical part of the microkernel of the PharOS [19] real-time operating system (RTOS). This part contains approximately 600 lines of code and consists of a distributed version of the scheduling algorithm of the RTOS tasks. It implements a variant of the EDF (earliest-deadline first) scheduling algorithm. It runs on a dual-core system and consists of two processes: one running on the *control core* and the other on the *executing core*. The two processes share a set of task lists. Concurrent access to shared data is ensured by lock-free synchronization.

Figure 6(a) presents the architecture of the modules of the microkernel that are of interest to us:



Fig. 6. Case study description

**date** provides the current date of the system. The considered implementation uses Lamport's algorithm for concurrent reading and writing of clocks [15]. This allows a concrete clock value to be read even while this value is being concurrently updated.

**spinlock** implements a lock-based concurrency primitive using the "compare-and-swap" primitive.

**tasklist** implements the life-cycle of a task as given in Figure 6(b). Tasks can be in several states; each state corresponds to a data structure listing the tasks in that state. An incoming/outgoing edge denotes an insertion/removal operation. Tasks are characterized by their *start time* and *deadline*.

**scheduler** is at the top level. It performs inter-core notifications to wake up processes when they have work to do. This module is not considered in the translation because we do not support interrupts yet.

### 5.2 TLA+ Modules of the Model

C2TLA+ takes as inputs the C source code of these modules. By applying our approach, we obtain the TLA+ modules of Figure 7.

C2TLA+ generates the Parameters module and a TLA+ module for each C input file. These modules can interact with manually specified TLA+ modules.

**Test Environment.** The test environment represents the entry point of the model. It simulates the main *scheduler* module by calling the *tasklist* API and it is manually specified in the *TestEnvironment* TLA+ module.

**Interacting with Manually Written TLA+ Specifications.**

Specifying Concurrency Primitives. The spinlock module contains the definition of the "acquire" and "release" operations, which use the "compare-and-swap" (CAS) primitive. Figure 8(a) shows the pseudo code of this primitive. As this operation is performed atomically, we cannot translate it with C2TLA+. Such primitives are specified manually, respecting the calling conventions of Subsection 4.4, and are declared in the C code using the \_\_attribute\_\_ annotation mechanism to name the TLA+ module where the primitive is specified. For instance,

```
int CAS (int *, int, int) __attribute__((Atomic_primitives, alias("CAS")));
```

Fig. 7. TLA+ modules of the case study

```
int CAS(int *addr, int old, int new)
 atomic {
  int temp = *addr;
  if (temp == old) {
      *addr = new;
      return 0;
  }
  else return 1;}
```

(a) Pseudo code

```
\begin{array}{l} CAS(id) \stackrel{\Delta}{=} \\ \quad \land Head(stack\_regs[id]).pc = \langle \text{"CAS"}, \text{"lbl\_1"} \rangle \\ \quad \land \text{IF } (load(id, load(id, Addr\_CAS\_param\_addr)) = load(id, Addr\_CAS\_param\_old)) \\ \qquad \text{THEN } \land mem' = [mem \text{ EXCEPT } ![load(id, Addr\_CAS\_param\_addr).offs] = load(id, Addr\_CAS\_param\_new)] \\ \qquad \qquad \land ret' = [ret \text{ EXCEPT } ![id] = [val \mapsto 1]] \\ \qquad \text{ELSE } \land ret' = [ret \text{ EXCEPT } ![id] = [val \mapsto 0]] \\ \qquad \qquad \land \text{UNCHANGED } \langle mem \rangle \\ \quad \land stack\_regs' = [stack\_regs \text{ EXCEPT } ![id] = Tail(stack\_regs[id])] \\ \quad \land stack\_data' = [stack\_data \text{ EXCEPT } ![id] = SubSeq(stack\_data[id], 1, Head(stack\_regs[id]).fp - 1)] \end{array}
```

(b) TLA+ code

Fig. 8. CAS definition

CAS is specified in the *Atomic\_primitives* module as shown in Figure 8(b). Other primitives could be added to *Atomic\_primitives* which could be provided as a standard module.
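The effect of the CAS action on the shared memory can be sketched as a single atomic transition in Python. The sketch is a simplification: the function name `cas` is hypothetical, memory is a plain list of cells, the stack bookkeeping of the TLA+ action is omitted, and the returned flag follows the TLA+ definition of Figure 8(b) (1 when the swap happens, 0 otherwise).

```python
def cas(mem, addr, old, new):
    """One atomic step: returns (new_mem, ret) without modifying mem in place."""
    if mem[addr] == old:
        updated = list(mem)
        updated[addr] = new          # mem' = [mem EXCEPT ![addr] = new]
        return updated, 1
    return list(mem), 0              # UNCHANGED mem

unlocked = [0, 5]
locked, first = cas(unlocked, 0, 0, 1)    # the lock cell goes from 0 to 1
still_locked, second = cas(locked, 0, 0, 1)
```

Because the whole function is one transition of the model, no other process can observe the memory between the comparison and the write, which is what the manual specification captures and a statement-by-statement translation would not.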

Using an Abstract Model. The read and write operations on the clock, in the date module, each take several instructions. The possible interleavings of these instructions multiply the number of states of the model. To cope with this problem, we write a TLA+ version of date, called Date\_abs, which reads and writes the whole date atomically. Using this version considerably decreases the state space (see Table 2). We also verify that Date (the translated module) is a refinement of Date\_abs.

### 5.3 Specifying and Verifying Properties

We verified various properties of the system. Here we provide some examples. We have checked that all spinlocks protect their critical sections, i.e., that the statements of the critical sections of the two processes cannot be executed simultaneously.

```
\begin{array}{l} \textit{Mutex}(\textit{sc1}, \textit{sc2}) \overset{\triangle}{=} \\ \square((\textit{Head}(\textit{stack\_regs}["exec\_core"]).pc = \textit{sc1}) \Rightarrow (\textit{Head}(\textit{stack\_regs}["control\_core"]).pc \neq \textit{sc2})) \end{array}
```

An important invariant of the system is that the tasks in the ready list are sorted by their deadlines; this is necessary to implement the earliest-deadline first algorithm. To state this invariant, we first define a recursive operator *getSeqDeadlines* which maps the C linked list to a more abstract TLA+ sequence. The property is simpler to state on this abstract sequence by defining the *IsSortedSeq()* operator.

The property applied on *ready* list is expressed as follows:

```
\square IsSortedSeq(getSeqDeadlines[load("unused", Addr\_readyList)])
```
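The idea of mapping the linked list to an abstract sequence before stating the sortedness invariant can be sketched in Python. The memory layout used here is purely an assumption made for illustration (each node occupies two consecutive cells, the deadline followed by the address of the next node, with `None` playing the role of Null); the actual PharOS task structure is not shown in the text.

```python
NULL = None

def get_seq_deadlines(mem, head):
    # Walk the linked list stored in mem and collect the deadlines in order.
    seq = []
    node = head
    while node is not NULL:
        seq.append(mem[node])       # assumed layout: deadline at offset 0
        node = mem[node + 1]        # assumed layout: next pointer at offset 1
    return seq

def is_sorted_seq(seq):
    # Every element is less than or equal to its successor.
    return all(a <= b for a, b in zip(seq, seq[1:]))

# Nodes at addresses 0, 4 and 2, linked in that order.
ready_mem = [5, 4, 9, NULL, 7, 2]
```

Stating the invariant on the extracted sequence keeps the property independent of how the list is laid out in memory.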

We have also checked some liveness properties, for instance, that if a thread entered its critical section, it will eventually leave it. This property can be expressed by comparing the program counter of the process to the statement labels of the functions "spinlock\_acquire" and "spinlock\_release". For example, for the executing core, the property is expressed as:

In order to use the abstract model  $Date\_abs$  instead of Date, we have to check that the Date model is a refinement of the  $Date\_abs$  model. For this, we have to map the states of the Date model to those of the  $Date\_abs$  model by substituting the constants and variables used in  $Date\_abs$  with those of Date. The refinement is expressed in TLA+ as a logical implication: verifying it amounts to checking that the specification of Date implies the specification of  $Date\_abs$  under this substitution.

### 5.4 Verification and Discussion

We integrated the modules and performed model checking on two complete specifications. The first specification uses the translated *Date* module and the second one uses the abstract *Date\_abs* module. The experiment was performed on an Intel Core Pentium i7-2760QM machine with 8 cores (2.40 GHz each) and 8 GB of RAM. We model checked the two specifications by considering four possible values of the clock. The executing core updates the *start time* and *deadline* of the task that has run and inserts it into the *unsorted* lists. We consider that this update is performed by assigning two possible values. Table 2 provides the number of generated states and the model checking time according to the number of tasks, for the two considered specifications.

For two tasks, the specification using *Date* module takes more than 3 hours to be model checked. Using an abstract model significantly reduces the size of the state space and the time required for model checking.

**Table 2.** Runtimes of model checking (time in seconds)

| Tasks | *Date*: state space | *Date*: time | *Date\_abs*: state space | *Date\_abs*: time |
|---|---|---|---|---|
| 1 | 5.986.509 | 227 | 718.084 | |
| 2 | >501.876.263 | >10.800 | 5.450.732 | 64 |
| 3 | | | 45.201.603 | 960 |
| 4 | | | 138.679.106 | 2.400 |

We have successfully checked that the correctness properties (defined in Subsection 5.3) are satisfied by the model. One of the motivations for verifying this code was to check that the fine-grained locking constructs were properly used. We checked that changing the locks in the source code leads TLC to find that some invariants are violated. In that case, we obtain the error trace that explains how the error can happen, and TLC reports that the coverage is incomplete.

# 6 Conclusion and Future Work

We have sketched an approach for specifying and verifying C code based on an automated translation from C to TLA+. The main advantage of our approach is the ability to make generated TLA+ specifications from a C implementation interact with more abstract, potentially already existing manually specified TLA+ specifications. We use the TLC model checker to verify a part of the implementation of an RTOS microkernel against safety and liveness properties expressed in TLA+. We also checked that a generated specification was a refinement of an abstract TLA+ specification, and showed that we could successfully use abstraction to reduce the size of the state space.

We plan to extend this work in several directions. We would like to extend the translator to handle a bigger subset of C and to generate TLA+ properties from the ACSL [3] specification language used in Frama-C. We want to update the translator so that the generated TLA+ specification catches all C runtime errors. It would be interesting to exploit the Frama-C analysis of variables shared by several processes to generate TLA+ code with less interleaving between the processes, in order to reduce the state space. We also plan to further study the use of TLA+ modules with different levels of refinement. Finally, we aim to use the TLA+ proof system [8] to prove properties on an abstract specification of PharOS and prove that the specification generated by C2TLA+ is a refinement of this abstract specification.

# References

- 1. Akhtar, S., Merz, S., Quinson, M.: A High-Level Language for Modeling Algorithms and Their Properties. In: SBMF'10. LNCS, vol. 6527, pp. 49–63 (2010)
- 2. Ball, T., Rajamani, S.K.: The SLAM project: Debugging System Software via Static Analysis. SIGPLAN Not. (2002)
- 3. Baudin, P., Filliâtre, J.C., Marché, C., Monate, B., Moy, Y., Prevosto, V.: ACSL: ANSI/ISO C Specification Language, version 1.4 (2009), http://frama-c.cea.fr/acsl.html
- 4. Clarke, E., Kroening, D., Lerda, F.: A Tool for Checking ANSI-C Programs. In: TACAS'04. LNCS, vol. 2988, pp. 168–176. Springer (2004)
- 5. Clarke, E.M., Grumberg, O., Jha, S., Lu, Y., Veith, H.: Counterexample-Guided Abstraction Refinement. In: CAV'00. pp. 154–169. Springer (2000)
- 6. Clarke, Jr., E.M., Grumberg, O., Peled, D.A.: Model checking. MIT Press, Cambridge, MA, USA (1999)
- 7. Cohen, E., Dahlweid, M., Hillebrand, M.A., Leinenbach, D., Moskal, M., Santen, T., Schulte, W., Tobies, S.: VCC: A Practical System for Verifying Concurrent C. In: TPHOLs'09. pp. 23–42. Springer-Verlag (2009)
- 8. Cousineau, D., Doligez, D., Lamport, L., Merz, S., Ricketts, D., Vanzetto, H.: TLA+ Proofs. In: FM'12. LNCS, vol. 7436, pp. 147–154. Springer (2012)
- 9. Cuoq, P., Kirchner, F., Kosmatov, N., Prevosto, V., Signoles, J., Yakobowski, B.: Frama-C: A Software Analysis Perspective. In: SEFM'12. pp. 233–247. Springer-Verlag (2012)
- 10. D'Silva, V., Kroening, D., Weissenbacher, G.: A Survey of Automated Techniques for Formal Software Verification. IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems (TCAD) 27(7), 1165–1178 (2008)
- 11. Henzinger, T.A., Jhala, R., Majumdar, R., Sutre, G.: Software Verification with BLAST. pp. 235–239. Springer (2003)
- 12. Holzmann, G.J.: The Model Checker SPIN. IEEE Trans. Software Eng. 23(5), 279–295 (1997)
- 13. Holzmann, G.J.: Trends in Software Verification. In: FME'03. pp. 40–50. LNCS, Springer (2003)
- 14. Holzmann, G.J., Smith, M.H.: An Automated Verification Method for Distributed Systems Software Based on Model Extraction. IEEE Trans. on Software Engineering 28, 364–377 (2002)
- 15. Lamport, L.: Concurrent Reading and Writing of Clocks. ACM Trans. Comput. Syst. 8(4), 305–310 (1990)
- 16. Lamport, L.: The Temporal Logic of Actions. ACM Trans. Program. Lang. Syst. 16(3), 872–923 (1994)
- 17. Lamport, L.: Specifying Systems, The TLA+ Language and Tools for Hardware and Software Engineers. Addison-Wesley (2002)
- 18. Lamport, L.: The PlusCal Algorithm Language. In: ICTAC'09. pp. 36–60. Springer-Verlag (2009)
- 19. Lemerre, M., Ohayon, E., Chabrol, D., Jan, M., Jacques, M.B.: Method and Tools for Mixed-Criticality Real-Time Applications within PharOS. In: Proceedings of AMICS 2011: 1st International Workshop on Architectures and Applications for Mixed-Criticality Systems (2011)
- 20. Manna, Z., Pnueli, A.: The Temporal Logic of Reactive and Concurrent Systems. Springer-Verlag New York, Inc. (1992)
- 21. Necula, G.C., McPeak, S., Rahul, S.P., Weimer, W.: CIL: Intermediate Language and Tools for Analysis and Transformation of C Programs. In: Int'l Conf. Compiler Construction. pp. 213–228 (2002)

# Key-Secrecy of PACE with OTS/CafeOBJ

#### Dominik Klein

Bundesamt für Sicherheit in der Informationstechnik (BSI)
Dominik.Klein@bsi.bund.de

**Abstract.** The ICAO-standardized Password Authenticated Connection Establishment (PACE) protocol is used all over the world to secure access to electronic passports. Key-secrecy of PACE is proven by first modeling it as an Observational Transition System (OTS) in CafeOBJ, and then proving invariant properties by induction.

### 1 Introduction

Cryptographic primitives, such as encryption mechanisms, hash functions or message authentication codes, undergo the scrutiny of a large community of researchers. While their mathematical foundations might not yet be understood in full detail, there have been few sudden groundbreaking attacks on them. Using these primitives as building blocks to construct security protocols is, however, another difficult challenge. In fact, despite using well-known cryptographic primitives, erroneous protocol specifications and design decisions have often led to attacks. A famous example is [16], and the survey [7] contains an impressive list of failed attempts to design secure protocols. Formally proving properties of a protocol to exclude subtle attacks is one important step in the construction of security protocols.

Password Authenticated Connection Establishment (PACE) [4,13] is a cryptographic protocol used all over the world for electronic passports. PACE establishes a secure communication channel between a terminal (trying to access data stored on the passport's RFID chip) and the passport itself. Ensuring trust in PACE is of utmost importance for several reasons: First, the predecessor of PACE, called Basic Access Control (BAC), is plagued with security concerns due to low-entropy passwords. Second, the contactless RFID interface of electronic passports raises concerns among citizens that passports enable secret tracking or that criminals may remotely read out sensitive biometric information. Third, PACE is used in national id-cards that enable secure authentication for e-commerce.

CafeOBJ is an algebraic specification and programming language [8]. After specifying a formal model, e.g. of a cryptographic protocol such as PACE, CafeOBJ can also be used as an interactive theorem prover to show invariant properties of such a specified model: Mathematical proofs are written as *proof scores*, and a proof can be established by *executing* its proof score. This approach is used in this paper.

The contribution of this paper is threefold. First, key secrecy of PACE itself is shown, strengthening trust in the protocol. Second, while CafeOBJ has a proven track-record in the verification of security protocols [17–21, 23], the proof serves once more as a case study to show that theorem proving in CafeOBJ scales well beyond simple academic problems to real-world scenarios. Third, to the author's best knowledge, this proof is the first to model a protocol based on a Diffie-Hellman key-exchange in such detail in CafeOBJ. This might serve as a foundation for analyzing other DH-based protocols. The source code of the proof is available at https://github.com/d-klein/ots-proof.

The structure of this paper is as follows: In Section 2, the PACE protocol is introduced. A very brief recapitulation of modeling OTSs in CafeOBJ and proving their invariants is given in Section 3. Section 4 provides an abstract version of PACE and shows how to model it as an OTS. The proof of key secrecy of PACE is shown in Section 5. Experiences and lessons learned are summarized in Section 6, and related work is reviewed in Section 7. Finally, concluding remarks are given in Section 8.

# 2 The PACE Key Agreement Protocol

To ensure compatibility with existing document formats and infrastructure, contactless RFID chips were chosen for electronic passports. This introduces two risks that need to be addressed: *Skimming*, i.e. an attacker reading out data from the passport without authorization, and *eavesdropping*, i.e. intercepting communication data during transmission. Note that skimming requires an online connection with the passport, whereas eavesdropped data can be analyzed offline after interception.

To prevent skimming, a terminal accessing data on the passport should prove that it is authorized to access the data. This can be done e.g. by reading information printed on the passport by OCR, and sending this data to the chip. The terminal thus demonstrates that it has physical access to the passport, and a passport holder can control electronic access to his passport by controlling physical access. Printed information on the passport often has low entropy. The machine-readable zone (MRZ), for example, can be read by OCR and has 88 digits, but the vast majority of digits are not unique w.r.t. each passport, or can be easily guessed. Just hashing this printed data to directly derive a session key does not sufficiently protect against offline attacks on eavesdropped transmission data, since the session key is the same for each session and also has low entropy. Instead, a strong session key unique to each session is required to prevent (offline) analysis of eavesdropped transmission data.

The goal of the PACE key agreement protocol is to establish a secure, authenticated connection with a strong session key between the chip inside a passport and a corresponding terminal. PACE uses a pre-shared low entropy password to derive a strong session key by using a Diffie-Hellman key exchange [9]. The protocol is versatile in the sense that it allows the use of either standard multiplicative groups of integers modulo p or groups based on elliptic curves. The latter is important in practice, since RFID chips have limited processing power.

The protocol works as follows: First, it is assumed that a common low entropy password  $\pi$  is known both by the chip and the terminal. Depending on the document type (international travel document, national id-card etc.) and use-case (border control, e-commerce) three solutions exist in practice: 1.) The password is derived from the MRZ, 2.) the password is derived from a Card Access Number (CAN) specifically printed on the document for this purpose or 3.) the password is derived from a secret personal identification number (PIN) known only to the owner of the document. In all cases, the password is stored on the chip in a protected way. To read out data on the chip, the MRZ is optically read by the terminal, or the CAN or the PIN is entered manually.

In the next step, the chip sends both a random nonce s encrypted by a symmetric cipher with the hash  $\mathcal{H}$  of  $\pi$  and the domain parameter  $D_{\mathrm{PICC}}$  for the group operation to the terminal. Using a mapping function and the domain parameter, the nonce s is mapped to some generator g of the group  $\langle g \rangle$ . Both the terminal and the chip choose another nonce x resp. y and compute exponents, i.e. the group operation is applied with the nonce together with the generator to derive  $g^x$  resp.  $g^y$ . These are then shared, and a key  $K = (g^x)^y = (g^y)^x$  as well as MAC and session keys are derived. Knowledge of the sent exponents and the key is verified by exchanging MAC-tokens. See Figure 1 for a brief overview of the protocol. For more detailed specifications, see [4].
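The message flow can be traced with a toy computation. The following Python sketch is an illustration only and rests on assumptions that are not taken from the paper or from [4]: the group is the order-11 subgroup of the integers modulo 23, the hypothetical mapping `map_to_generator` simply raises a fixed base to the nonce, encryption is a XOR with one hash-derived byte, and SHA-256/HMAC stand in for the hash and MAC. None of these choices is secure; the sketch only shows that both sides arrive at the same key and accept each other's tokens.

```python
import hashlib
import hmac
import secrets

# Toy parameters (assumption): P = 2*Q + 1 with Q prime; BASE has order Q modulo P.
P, Q, BASE = 23, 11, 4

def H(data: bytes) -> bytes:
    """Hash function H (SHA-256 is chosen here purely for illustration)."""
    return hashlib.sha256(data).digest()

def enc(key: bytes, s: int) -> int:
    """Toy stand-in for enc: XOR the nonce with one byte derived from the key."""
    return s ^ key[0]

def dec(key: bytes, z: int) -> int:
    """Inverse of the toy enc."""
    return z ^ key[0]

def map_to_generator(s: int) -> int:
    """Hypothetical mapping: BASE^s mod P generates the subgroup for 1 <= s < Q."""
    return pow(BASE, s, P)

def derive(k: int, label: int) -> bytes:
    """K_MAC = H(K || 1) and K_ENC = H(K || 2)."""
    return H(bytes([k, label]))

def mac(key: bytes, h: int) -> bytes:
    return hmac.new(key, bytes([h]), hashlib.sha256).digest()

def pace_run(password: bytes):
    pi_key = H(password)
    # Chip: choose nonce s and send z = enc(H(pi), s).
    s = 1 + secrets.randbelow(Q - 1)
    z = enc(pi_key, s)
    # Terminal: recover s; both sides map it to the same generator g.
    g_chip = map_to_generator(s)
    g_term = map_to_generator(dec(pi_key, z))
    # Both sides choose ephemeral exponents and exchange h1 and h2.
    x = 1 + secrets.randbelow(Q - 1)
    y = 1 + secrets.randbelow(Q - 1)
    h1, h2 = pow(g_chip, x, P), pow(g_term, y, P)
    # The abort check h1 = h2 is omitted: it would fire often in so small a group.
    k_chip, k_term = pow(h2, x, P), pow(h1, y, P)
    # MAC tokens as in Fig. 1: each side authenticates the value it received.
    t_picc = mac(derive(k_chip, 1), h2)
    t_pcd = mac(derive(k_term, 1), h1)
    tokens_ok = (hmac.compare_digest(t_picc, mac(derive(k_term, 1), h2))
                 and hmac.compare_digest(t_pcd, mac(derive(k_chip, 1), h1)))
    return k_chip, k_term, tokens_ok
```

Calling `pace_run(b"123456")` returns the two locally computed keys and a flag telling whether both MAC tokens verified. An eavesdropper who sees only z, h1 and h2 would need either the password or a discrete logarithm to obtain the key, which is trivial in this toy group and is the hard problem in a real one.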

# 3 OTS, CafeOBJ and Invariant-Proving

The PACE protocol is modeled as an Observational Transition System (OTS). For precise definitions and an introduction to OTSs, cf. [19]. Here, only a brief recapitulation on how OTSs are modeled in CafeOBJ is provided in order to give an intuition of the overall proof approach and proof structure. An OTS is a triple of a set of observable values, a set of initial states, and a set of conditional transition rules. A protocol can be modeled as an OTS, where in each state of the protocol, observations on this state can be made. The effect of a state change on the observations is described by transitions. An *invariant* is a property that holds (is observable) in all states reachable from the initial ones.

CafeOBJ is based on equational reasoning. Algebraic data types and operations on them are described by conditional rewrite rules. Rewrite rules are called equations in CafeOBJ, but they are applied in a directed manner, from left to right. An OTS is modeled in CafeOBJ as follows:

- The state space is modeled as a hidden sort H.
- A data type D is described in order-sorted algebra with visible sort V.
- An observation is modeled as a CafeOBJ behavioral operator: bop o : H V1 V2 ... VN  $\rightarrow$  V

V1,...,VN and V are visible sorts corresponding to data types  $D_1, ..., D_n$ , and H is the hidden sort representing the state space. Intuitively, this declaration describes that the observation V can be made in state H, where H is characterized by V1 ... VN.

```
Passport Chip (PICC)                                    Terminal (PCD)

                      shared password π

choose nonce s ← Z_q
static domain parameter D_PICC
z = enc(H(π), s)
                      ------ D_PICC, z ------>
                                                        s = dec(H(π), z)
g = map(D_PICC, s)                                      g = map(D_PICC, s)
choose x ← Z_q*                                         choose y ← Z_q*
h_1 = g^x                                               h_2 = g^y
                      --------- h_1 --------->
                      <-------- h_2 ----------
abort, if h_2 ∉ ⟨g⟩ or h_1 = h_2                        abort, if h_2 ∉ ⟨g⟩ or h_2 = h_1
K = h_2^x = (g^y)^x                                     K = h_1^y = (g^x)^y
K_MAC = H(K || 1)                                       K_MAC = H(K || 1)
K_ENC = H(K || 2)                                       K_ENC = H(K || 2)
T_PICC = mac(K_MAC, h_2)                                T_PCD = mac(K_MAC, h_1)
                      -------- T_PICC ------->
                      <------- T_PCD ---------
abort, if T_PCD ≠ mac(K_MAC, h_1)                       abort, if T_PICC ≠ mac(K_MAC, h_2)
```

Fig. 1. The PACE protocol.

A transition is also modeled as a CafeOBJ behavioral operator:

bop t : H V1 V2 ... VM -> H

The first argument of t refers to the current state. The operator t, identified by the indices V1 ... VM, maps the current state to another state in the state space. How this transition operator affects the state space in particular is defined in CafeOBJ with conditional equations of the form:

```
ceq o(t(X,Y1,...,YM),Z1,...,ZN) = changeval(X,Y1,...,YM,Z1,...,ZN)
    if effective-condition(X,Y1,...,YM,Z1,...,ZN) .
ceq t(X,Y1,...,YM) = X
    if not effective-condition(X,Y1,...,YM,Z1,...,ZN) .
```

Here changeval is the operation that changes the values of the observation to those of the successor state, and effective-condition evaluates whether the condition for applying the transition is met in the current state. If the observed values never change when applying the transition, one can combine the above simply to: eq o(t(X,Y1,...,YM),Z1,...,ZN) = o(X,Z1,...,ZN).
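As a rough executable analogy (it does not reproduce CafeOBJ semantics), the scheme of observers, transitions and effective conditions can be mimicked in Python. The counter system and the names `State`, `observe_count`, `effective_inc` and `inc` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    """Plays the role of the hidden sort H; only observers look inside."""
    count: int

INIT = State(count=0)

def observe_count(state: State) -> int:
    """Observer o : H -> V."""
    return state.count

def effective_inc(state: State, limit: int) -> bool:
    """Effective condition of the transition: the counter is below the limit."""
    return observe_count(state) < limit

def inc(state: State, limit: int) -> State:
    """Transition t : H V -> H.

    If the effective condition holds, the observation changes (the role of
    changeval); otherwise the successor state equals the current state.
    """
    if effective_inc(state, limit):
        return State(count=observe_count(state) + 1)
    return state
```

For instance, `inc(inc(INIT, 1), 1)` leaves the observation at 1, because the second application does not satisfy its effective condition and therefore returns the state unchanged.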

CafeOBJ uses proof scores to prove invariants that hold in a model that is specified as described above. Proof scores define the proof obligations and induction hypotheses needed to prove invariants by induction.

**Proof Scores.** A proof score of an invariant consists of two parts: First, the base case is shown, i.e. that the predicate holds in the initial state. Then the induction step follows. For each invariant  $\operatorname{pred}_i(s, \mathbf{x})$  a corresponding operator and an equation are defined:

```
op invI : H V1 V2 ... VN -> Bool .
eq invI(S,X1,...,XN) = ... .
```

In the definitions of visible sorts in the specification, a constant init is also defined, denoting an arbitrary initial state. Then to prove  $\operatorname{pred}_i(s, \mathbf{x})$ , one fixes arbitrary objects v1,...,vN for the visible sorts V1,...,VN and issues a reduce command w.r.t. the initial state: red invI(init,v1,...,vN).

For the induction step one has to show that if  $\operatorname{pred}_i(s,\mathbf{x})$  holds in state s, then it also holds in any possible next state s'. For each predicate one fixes arbitrary states s and s' by ops s,s': -> H, defines an operator of form op istepI: V1 V2 ... VN -> Bool and an equation for the induction step:

```
eq istepI(X1,X2,...,XN) = invI(s,X1,...,XN) implies invI(s',X1,...,XN) .
```

Then one fixes arbitrary objects v1,...,vN for the visible sorts, defines how s' results from s by a transition t via eq s' = t(s,...), and issues the reduce command red istepI(v1,...,vN). The reduce command uses the equations to obtain the equational normal form of an expression. If rewriting to normal form reaches the constant true both for the initial state and for the induction step, the proof w.r.t. transition t has succeeded. For a full proof, all defined transitions have to be considered.

**Lemmata.** Quite often the induction step cannot be shown directly, since the induction hypothesis is too weak. Then a lemma is needed. Let invJ be a predicate with free variables of visible sorts E1,...,EK, and let e1,...,eK denote either free variables of, or expressions (i.e. terms) of these sorts. One can strengthen the induction hypothesis by augmenting invJ in state s, i.e. by issuing red invJ(s,e1,...,eK) implies istepI(v1,...,vN). One advantage in OTS/CafeOBJ is that one can use invJ to strengthen the induction step in the proof of invI and vice-versa.
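The interplay of base case, induction step and lemma can be illustrated on a small finite system in which every check is done by enumeration rather than by rewriting. The following Python sketch uses a hypothetical system (states 0 to 7, initial state 0, a single transition adding 2 modulo 8); the names `inv_i`, `inv_j` and `istep` merely mirror invI, invJ and istepI and are not taken from the proof of PACE.

```python
STATES = range(8)   # toy state space (assumption)
INIT = 0            # toy initial state (assumption)

def t(s: int) -> int:
    """The only transition: add 2 modulo 8."""
    return (s + 2) % 8

def inv_i(s: int) -> bool:
    """Property to prove: the state is never 5."""
    return s != 5

def inv_j(s: int) -> bool:
    """Lemma: the state is always even."""
    return s % 2 == 0

def implies(a: bool, b: bool) -> bool:
    return (not a) or b

def istep(inv, s: int) -> bool:
    """Induction step for an arbitrary state s: inv(s) implies inv(s')."""
    return implies(inv(s), inv(t(s)))

# Base cases: both predicates hold in the initial state.
base_ok = inv_i(INIT) and inv_j(INIT)

# The plain induction step for inv_i fails for an arbitrary (unreachable) state.
plain_step_ok = all(istep(inv_i, s) for s in STATES)

# Strengthened with the lemma, the step goes through for every state.
strengthened_step_ok = all(implies(inv_j(s), istep(inv_i, s)) for s in STATES)

# The lemma is itself inductive, which justifies using it.
lemma_step_ok = all(istep(inv_j, s) for s in STATES)
```

Here the plain step fails at the arbitrary state 3, which satisfies `inv_i` but steps to 5. State 3 is not reachable from the initial state, and the lemma `inv_j` is exactly what rules it out.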

**Case Analysis.** Another proof technique is case analysis. Suppose for example that v1 is assumed to be of arbitrary form. For a constructor f, we can then distinguish the case that either v1 is constructed by f applied to some arbitrary vC, or that this is not the case. Then the induction step is split: One declares v1 = f(vC), and reduces red istepI(v1,...). Then one does the same again, but declares (v1 = f(vC)) = false before reducing. Clearly all possible cases have been exhaustively considered, since it is always true that:

```
(v1 = f(vC)) or (not (v1 = f(vC)))
```

Of course it is possible to strengthen the induction hypothesis by more than one predicate, and to combine lemma application with case analysis.

# 4 Modeling PACE in CafeOBJ

The system is modeled in a way such that an unbounded number of principals interact with each other by sending messages. Honest principals behave according to protocol. Malicious ones can fake and forge messages. The malicious principals are modeled as the most general intruder according to the Dolev-Yao intruder model [10]. Moreover, the following assumptions are made:

- 1. Cryptographic primitives are sound. Random nonces are unique and cannot be guessed, encrypted messages can only be decoded by knowing the correct key, hashes are one-way and there are no collisions, and two message authentication codes are the same only if generated from the same message with the same key.
- 2. The intruder can glean any public information (i.e. messages, ciphers etc.) that is sent in the network.
- 3. The intruder can send two kinds of messages: He can use ciphers based on cryptographic primitives from existing messages as black boxes to send new fake messages, or he can use eavesdropped information to generate new messages from scratch. But, as noted above, he cannot eavesdrop information from ciphers based on cryptographic primitives without knowing the corresponding keys or passwords.

### 4.1 An abstract version of PACE

To abstract away from implementation-dependent information and from details that cannot be captured in the Dolev-Yao model anyway, the following abstract version of the PACE protocol is used.

```
Message 1: p -> q: enc_π(n_s, D)
Message 2: p -> q: *(n_a, G)
Message 3: q -> p: *(n_b, G)
Message 4: p -> q: mac(H(*(n_a, *(n_b, G))), *(n_b, G), D)
Message 5: q -> p: mac(H(*(n_b, *(n_a, G))), *(n_a, G), D)
```

It is assumed that a run of PACE is conducted by exchanging five messages. In the first step, a message is sent from a principal p to another one q. The message encrypts a random nonce  $n_s$  with the shared password  $\pi$ , with attached static domain parameters D. Next, p maps the nonce  $n_s$  from the first message with the domain parameters to a group generator G. Then p chooses a random nonce  $n_a$ , applies the operator \* to both  $n_a$  and G and sends the result  $*(n_a, G)$  to q. In a similar manner, q chooses a random nonce  $n_b$  and sends  $*(n_b, G)$  to p. Next, p computes the key  $\mathcal{H}(*(n_a, *(n_b, G)))$ . He then sends a message authentication code — encoded with that key — with the received exponent  $*(n_b, G)$  and domain parameters D to q, in order to verify knowledge of both the received exponent and the generated key. Principal q does the same in reverse, and the common key  $\mathcal{H}(*(n_a, *(n_b, G)))$  is used from now on to exchange encrypted messages.

### 4.2 Basic Data Types

The following algebraic data types, i.e. visible sorts and corresponding constructors are used:

- Principal denotes both honest and malicious principals in the network.
- Random denotes random nonces. Random nonces are supposed to be unique and unguessable.
- Dompar denotes the static domain parameters of PACE. Used domain parameters are not secret and known to every principal.
- Mappoint denotes a group generator. The constructor maptopoint of data type Mappoint takes as input a random nonce and static domain parameters and returns a group generator. It is supposed that maptopoint is a one-way function.
- Expo denotes an exponent of the form  $g^x$ , where the group generator g is generated by maptopoint using a random nonce and domain parameters as input.
- Hash denotes keys; it is supposed that hashing is the key derivation function. The constructor hash takes as input a random nonce and an exponent and returns a key.
- Cipher1 denotes the cipher resulting from a symmetric encryption. Its constructor enc takes as input a random nonce and static domain parameters. It is implicitly assumed that a Cipher1 is encoded with the shared password  $\pi$  in the following way: Given a Cipher1, every principal is able to reconstruct the static domain parameters. But only if he knows the shared password  $\pi$  is he able to decode the random nonce.
- Cipher3 denotes message authentication codes. The constructor mac takes as input a hash, an exponent and domain parameters.

Three sorts and data types are defined for the messages in Section 4.1: Message 1 of Section 4.1 is of type Message1, messages 2 and 3 are of type Message2, and messages 4 and 5 are of type Message3. Here, Message1 is a Cipher1 with attached meta-information describing the creator, the (seeming) sender, and the receiver of a message. For example

me1(intruder,p,q,c)

denotes a Message1 where c is a Cipher1, and the message is (seemingly) sent from principal p to q, but was actually created by the intruder, i.e. faked and injected into the network. Similarly, a Message3 is a Cipher3 with corresponding meta-information attached. The data type Message2 is constructed by attaching meta-information to an exponent. Moreover, two design decisions in the definition of the data structures (cf. also Section 6) should be noted:

**Modeling of the shared password  $\pi$ .** PACE assumes a fixed shared password  $\pi$  known among honest principals. Knowledge of the password is modeled by a predicate knowspi, where knowspi(intruder) = false is set. No specific data type is introduced for decryption of messages of type 1; instead, one merely distinguishes between messages created by an honest principal, who knows  $\pi$ , and messages created by the intruder, who does not.

**Equality of hashes.** The equality operator \_=\_ for hashes is defined as

i.e. that  $\mathcal{H}(*(n_a,*(n_b,G_1))) = \mathcal{H}(*(n_c,*(n_d,G_2)))$  if  $G_1 = G_2$ ,  $n_a = n_c$ ,  $n_b = n_d$ , or  $G_1 = G_2$ ,  $n_a = n_d$ ,  $n_b = n_c$ . This captures the equality of the keys generated during the key exchange.
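The stated equality can be paraphrased in executable form. The Python sketch below is an assumption-laden paraphrase and not the CafeOBJ definition: the classes `Expo` and `Hash` and their field names are hypothetical, and a key is modeled as hash(nonce, expo) following the constructor described in Section 4.2.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Expo:
    """Symbolic exponent *(nonce, generator)."""
    nonce: str
    generator: str

@dataclass(frozen=True, eq=False)
class Hash:
    """Symbolic key H(*(nonce, expo)), i.e. H(*(n_a, *(n_b, G)))."""
    nonce: str
    expo: Expo

    def _key(self):
        # Both nonces enter symmetrically, mirroring (g^x)^y = (g^y)^x.
        nonces = tuple(sorted((self.nonce, self.expo.nonce)))
        return (self.expo.generator, nonces)

    def __eq__(self, other):
        if not isinstance(other, Hash):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())
```

With this definition, `Hash("na", Expo("nb", "G"))` and `Hash("nb", Expo("na", "G"))` compare equal, whereas changing the generator or either nonce yields a different key.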

### 4.3 Protocol Modeling

In order to collect all sent messages, all generated random nonces, and other information, the following definition of a *multiset* on an abstract level from [19] is reused. This definition is then later used as a parametrized module to define multisets containing the data-types defined in the previous section.

```
mod* SOUP (D :: EQTRIV) principal-sort Soup {
   [Elt.D < Soup]
   op empty : -> Soup {constr}
   op _ _ : Soup Soup -> Soup {constr assoc comm id: empty}
   op _\in_ : Elt.D Soup -> Bool
   var S : Soup
   vars E1 E2 : Elt.D
   eq E1 \in empty = false .
   eq E1 \in (E2 S) = (E1 = E2) or E1 \in S .
}
```

The operator \in defines membership in the multiset, and a space defines insertion. To collect all random nonces for example, one can define an observation bop rands: System -> RandSoup that takes as input a state, and returns as the observation a soup of random nonces. Given a random nonce r and a state s, one can test membership by r \in rands(s), and — for example describing the effects of a transition — insert r in the multiset by r rands(s). Observations and transitions are defined as follows:
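For readers less familiar with CafeOBJ, the behavior of the SOUP module can be approximated in Python. This is only a sketch: it uses `collections.Counter` as the multiset, and the names `EMPTY`, `insert` and `member` are hypothetical stand-ins for the constant empty, juxtaposition, and the membership operator.

```python
from collections import Counter

EMPTY = Counter()   # plays the role of the constant `empty`

def insert(elem, soup: Counter) -> Counter:
    """Juxtaposition `E S`: return a new soup containing one more copy of elem."""
    result = soup.copy()
    result[elem] += 1
    return result

def member(elem, soup: Counter) -> bool:
    """Membership test: true iff elem occurs at least once in the soup."""
    return soup[elem] > 0
```

Because `Counter` equality ignores insertion order, `insert("r1", insert("r2", EMPTY))` equals `insert("r2", insert("r1", EMPTY))`, which corresponds to the assoc and comm attributes of the soup constructor.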

```
-- observations
bop network  : System -> Network
bop rands    : System -> RandSoup
bop hashes   : System -> HashSoup
bop randsi   : System -> RandSoup
bop expos    : System -> ExpoSoup
bop cipher1s : System -> Cipher1Soup
bop cipher3s : System -> Cipher3Soup
-- transitions
bop sdm1 : System Principal Principal Random Dompar -> System
bop sdm2 : System Principal Principal Random Message1 -> System
bop sdm3 : System Principal Principal Message1 Message2 Message2 -> System
-- faking and forging messages based on the gleaned info
bop fkm11 : System Principal Principal Cipher1 -> System
bop fkm12 : System Principal Principal Random Dompar -> System
bop fkm21 : System Principal Principal Expo -> System
bop fkm22 : System Principal Principal Random Random Dompar -> System
bop fkm31 : System Principal Principal Cipher3 -> System
bop fkm32 : System Principal Principal Random Expo Expo Dompar -> System
```

Seven observers are used to collect information:

- network returns a multiset of all messages that have been sent so far.
- rands returns a multiset containing *all* random nonces that have been generated so far.
- hashes returns all keys resulting from the PACE protocol that have been gleaned or self-generated by the intruder. The name stems from the fact that one considers hash to be the key derivation function.
- randsi contains all random nonces gleaned or self-generated by the intruder.
- expos contains all exponents that have been inserted in the network and
- cipher1s and cipher3s collect *all* ciphertexts of messages of type 1 and messages of type 3 (i.e. mac-tokens).

The transitions sdm1, sdm2, and sdm3 describe state transitions and their effects on observations when an honest principal sends a message of type 1, 2 or 3. Therefore the conditions under which these transitions are effective capture precisely the behavior of an honest principal. For example, sdm1 is defined as:

```
eq c-sdm1(S,P,Q,R,D) = not(R \in rands(S)) .
ceq network(sdm1(S,P,Q,R,D)) = me1(P,P,Q,enc(R,D)) network(S)
    if c-sdm1(S,P,Q,R,D) .
```

Thus an honest principal p can add a message me1(P,P,Q,enc(R,D)) to the network in state S (in message protocol notation,  $p \to q : enc_{\pi}(R,D)$ ) only if the nonce R is fresh. Freshness means that R is not contained in the set of all nonces that have been generated before reaching state S. This freshness condition is modeled by the first equation.

The transitions fkmXY describe state transitions and their effects on observations when the intruder generates messages. Here one distinguishes two cases: 1.) the intruder fakes an existing message by changing its source and destination (fkmX1) and 2.) the intruder injects a new message in the network using information available to him (fkmX2). Therefore the effective conditions for these transitions are usually more lax than the ones for sdmX. For example the condition to fake a message of type 1

```
eq c-fkm11(S,P,Q,C1) = C1 \in cipher1s(S) .
```

is just that a cipher1 exists in the network. The intruder can then inject the message me1(intruder,P,Q,C1) with arbitrary source P and destination Q. Note that the meta information denoting the creator of the message cannot be altered by the intruder.

An example for the second case is the condition to construct an arbitrary new message of type 1

```
eq c-fkm12(S,P,Q,R,D) = (not (R \in rands(S))) or (R \in randsi(S)) .
```

Here the intruder can choose to either use a fresh random nonce, or one that he has gleaned or generated in an earlier state. He then injects the message me1(intruder,P,Q,enc(R,D)) into the network.
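The three effective conditions quoted in this section can be compared side by side in an executable paraphrase. The Python sketch below is not a translation of the CafeOBJ specification: it uses sets instead of soups, tuples instead of constructors, and the class `System` is hypothetical. Only the effective conditions and the network updates follow the text; the updates of rands, randsi and cipher1s are assumptions inferred from the informal description of the observers.

```python
from dataclasses import dataclass, replace

INTRUDER = "intruder"

@dataclass(frozen=True)
class System:
    """Observable part of a state: each field holds one observer's value."""
    network: frozenset = frozenset()
    rands: frozenset = frozenset()
    randsi: frozenset = frozenset()
    cipher1s: frozenset = frozenset()

def c_sdm1(s: System, r) -> bool:
    """Honest sender: the nonce must be fresh."""
    return r not in s.rands

def c_fkm11(s: System, c1) -> bool:
    """Intruder re-sends a cipher that already exists in the network."""
    return c1 in s.cipher1s

def c_fkm12(s: System, r) -> bool:
    """Intruder uses a fresh nonce or one he has gleaned or generated."""
    return (r not in s.rands) or (r in s.randsi)

def sdm1(s: System, p, q, r, d) -> System:
    if not c_sdm1(s, r):
        return s
    cipher = ("enc", r, d)
    msg = ("me1", p, p, q, cipher)   # creator, seeming sender, receiver, cipher
    # Assumption: the nonce and the cipher are recorded by rands and cipher1s.
    return replace(s, network=s.network | {msg}, rands=s.rands | {r},
                   cipher1s=s.cipher1s | {cipher})

def fkm11(s: System, p, q, c1) -> System:
    if not c_fkm11(s, c1):
        return s
    return replace(s, network=s.network | {("me1", INTRUDER, p, q, c1)})

def fkm12(s: System, p, q, r, d) -> System:
    if not c_fkm12(s, r):
        return s
    cipher = ("enc", r, d)
    msg = ("me1", INTRUDER, p, q, cipher)
    # Assumption: the intruder's nonce is recorded in both rands and randsi.
    return replace(s, network=s.network | {msg}, rands=s.rands | {r},
                   randsi=s.randsi | {r}, cipher1s=s.cipher1s | {cipher})
```

In this sketch, a nonce used by an honest principal ends up in `rands` but not in `randsi`, so `c_fkm12` rejects it. The intruder can still replay the corresponding cipher as a black box through `fkm11`.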

# 5 Proving Key-Secrecy

Key secrecy is shown in the following sense: Suppose that one takes the perspective of an honest principal, i.e. one is either the passport or the terminal, and one behaves according to protocol. In particular it is assumed that

- 1. one has either sent a Message1 with a nonce encrypted with the shared password  $\pi$  and domain parameters (passport) or one has received a Message1 from a principal who knows  $\pi$  and decrypted it (terminal) and
- 2. one constructed a generator of the group with the nonce and the domain parameters from the above message, used the generator together with a fresh nonce to create an exponent, and sent it to the other party and
- 3. one *seemingly* (it is unknown who created the message) received an exponent back from that other party and
- 4. one *seemingly* received a MAC-token that, using one's secret nonce together with the received exponent as a key, validates that the other party knows one's sent exponent and the domain parameters.

Then the resulting key must never be known to the intruder. This can be almost verbatim translated into the next main theorem:

```
and hash(cipher3(M3)) = hash(rand(expo(M21)),expo(M22)))
implies
  not (hash(cipher3(M3)) \in hashes(S)) .
```

Application of Lemmata and Case Analysis. To prove key secrecy one needs additional invariants. Central to strengthening the induction hypothesis for istep900 is the invariant that the assumptions of inv900 imply that both principals have implicitly agreed upon the same generator g, which itself depends on the nonce exchanged in the first message. For brevity suppose that assump(S,M1,M21,M22,P,Q) is a predicate that denotes truth of the assumptions of invariant inv900 above. The invariant can then be expressed as:

```
eq inv800(S,M1,M21,M22,M3,P,Q) = assump(S,M1,M21,M22,P,Q)
implies rand(point(expo(M22))) = rand(point(expo(M21)))
```

How such a lemma is used in the proof together with case analysis is illustrated next, albeit for a simpler invariant. The following invariant is used frequently as a lemma for others. It states that if a message M1 of type Message1 is in the network in a state S, then the random nonce of M1 has been used and is thus included in the collection of all random nonces rands(S).

inv300 is proven inductively on the number of transitions. In the case of transition fkm11 one performs case analysis w.r.t. its effective condition:

```
(c-fkm11(s,p10,q10,c11) = false) or (c-fkm11(s,p10,q10,c11) = true)
```

Here p10 and q10 denote arbitrary principals, and c11 denotes an arbitrary cipher1. For the first case, the proof directly succeeds:

```
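open ISTEP
  -- arbitrary principals, message and cipher
  ops p10 q10 : -> Principal .
  op m10 : -> Message1 .
  op c11 : -> Cipher1 .
  -- case 1: the effective condition of fkm11 does not hold
  eq c-fkm11(s,p10,q10,c11) = false .
  eq s' = fkm11(s,p10,q10,c11) .
  red istep300(m10) .
close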
```

For the second case c-fkm11(s,p10,q10,c11) = true, one replaces the term with its definition c11 \in cipher1s(s) = true and performs another case analysis w.r.t. the equality m10 = me1(intruder,p10,q10,c11).

```
open ISTEP
  ops p10 q10 : -> Principal .
  op m10 : -> Message1 .
  op c11 : -> Cipher1 .
  eq c11 \in cipher1s(s) = true .
  eq m10 = me1(intruder,p10,q10,c11) .
  eq s' = fkm11(s,p10,q10,c11) .
  ***
close
```

If one directly tries to prove the induction step by inserting red istep300(m10) at \*\*\*, CafeOBJ outputs

```
rand(c11) \in rands(s) xor
  me1(intruder,p10,q10,c11) \in network(s) xor ...
```

This indicates that if me1(intruder,p10,q10,c11) is not already included in the network, and is thus inserted as a result of the transition fkm11, then rand(c11) \in rands(s) must be true for the induction step to hold. Therefore the induction hypothesis needs to be strengthened. One does so by introducing yet another invariant inv150, which states that if a cipher1 is in the network, then its random nonce is included in the set of all used random nonces.

```
eq inv150(S,C1) = C1 \in cipher1s(S) implies rand(C1) \in rands(S) .
```

And indeed, applying inv150 as a lemma at \*\*\* by inserting

```
red inv150(s,c11) implies istep300(m10)
```

successfully finishes the induction step. Therefore it has been verified that if a Cipher1 exists, i.e. is included in the set of collected ciphers observable in state S in the network, then the random nonce of that cipher must be included in the set of collected nonces observable in that state.

# 6 Experience and Lessons Learned

From the experience of applying OTS/CafeOBJ to a rather large real-world example, three guidelines are formulated:

1. Refine your specification. When stuck in a proof attempt, it is worthwhile to reconsider the specification. Take for example the definition of equality of hashes. Initially, equality was defined intuitively for two cipher3's C1 and C2 as

```
eq (C1 = C2) = (hash(C1) = hash(C2)) and expo(C1) = expo(C2)
```

This has the awkward consequence that messages can no longer uniquely be identified: When a principal sends a message of type 3, implicitly two messages are added to the network, one w.r.t. each case of equality of the hash of the cipher. Then for example an invariant like

```
m3 \in network(s) implies cipher3(m3) \in cipher3s(s)
```

does not hold if we have cipher3(m3) = mac(hash(r2,expo(r1,...),...) and mac(hash(r1,expo(r2,...),...) \in cipher3s(s). This makes reasoning during the induction steps quite unintuitive and led to defining equality of cipher3's as syntactic equality of normal forms, and to formulating theorems accordingly when referring to multiple cipher3's with the same hash.

2. Simplify your specification. Trying to specify every detail naturally gives a proof that is most faithful to the real protocol. However, it also leads to more involved proofs and case analyses. For example, we purposely decided not to fully model the symmetric cipher used to encrypt the shared password  $\pi$ , but rather to model knowledge of  $\pi$  with a predicate.
3. A deductive proof approach. It is very simple in CafeOBJ to quickly add a lemma without proving it. Some invariants, like inv900 in the current case, are quite involved, and it is likely that one encounters problems with the specification during the proof, and refines or simplifies the specification thereafter. This often also affects helper lemmata. It is thus very useful to focus first on the proof of a complex invariant, using several simpler, unproven lemmata, and only afterwards on the proofs of the latter.

The main hindrance when conducting the proof is related to performance. Suppose one is proving an invariant of the form  $a_1 \wedge a_2 \dots \wedge a_n \implies b$ , such as inv900. A direct proof attempt often does not terminate, due to the amount of branching. To get a terminating result, one can make a trivial case analysis w.r.t.  $a_i$ , e.g. distinguish the case for  $\neg a_1$ , for  $a_1 \wedge \neg a_2$ , and so on, to finally reach the case for  $a_1 \wedge \dots \wedge a_n$ . Even then a proof attempt sometimes does not terminate, so additional assumptions and corresponding cases have to be added. Almost all cases are trivial (it is obvious that the above invariant holds in the case with the assumption  $\neg a_1$ ), but they lead to a blow-up of the size of the proofs. For example, our proof score consists of 38427 lines, of which the vast majority are for such trivial cases. Fortunately, the majority of these cases could be generated automatically by scripts. Nevertheless, tools that integrate more directly with CafeOBJ, or come distributed with it, would certainly be helpful for an easier workflow and increased productivity.
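The trivial case analysis described above lends itself to scripting. The following Python sketch is only an illustration of the idea, not the scripts used for the PACE proof: the function names `case_split`, `describe`, `matches` and `is_exhaustive_and_disjoint` are hypothetical, and the emitted strings are plain text rather than CafeOBJ syntax. It enumerates the n+1 cases and checks that every truth assignment of the assumptions falls into exactly one of them.

```python
from itertools import product


def case_split(n):
    """Return the n+1 cases: not a1; a1 and not a2; ...; a1 and ... and an.

    Each case is a partial assignment mapping an assumption index
    (0-based) to the truth value it is fixed to.
    """
    cases = []
    for i in range(n):
        case = {j: True for j in range(i)}  # a1 ... a_i hold
        case[i] = False                     # a_{i+1} fails
        cases.append(case)
    cases.append({j: True for j in range(n)})  # all assumptions hold
    return cases


def describe(case):
    """Render a case as readable text (illustrative, not CafeOBJ syntax)."""
    return " and ".join(
        ("" if value else "not ") + f"a{j + 1}"
        for j, value in sorted(case.items())
    )


def matches(case, assignment):
    """True if a full truth assignment is covered by the given case."""
    return all(assignment[j] == value for j, value in case.items())


def is_exhaustive_and_disjoint(n):
    """Check that every assignment of a1..an is covered by exactly one case."""
    cases = case_split(n)
    return all(
        sum(matches(c, assignment) for c in cases) == 1
        for assignment in product([False, True], repeat=n)
    )
```

Only the last case requires real proof effort; the first n cases falsify the premise of the implication, which is why they are trivial but numerous.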

All in all, 40 invariants of the formalization of PACE were proven. The verification of all invariants together takes approximately two hours and eight minutes on an Intel Core i7-3520M @ 2.9 GHz.

# 7 Related Work

Security Analysis of PACE. An inductive verification [5] of the PACE protocol has been conducted in the verification support environment (VSE) [14]. VSE was developed in the 1990s by a consortium of German universities and industry to provide a tool that meets industry needs for the development of highly trustworthy systems. Since the proof source is not publicly available and the VSE tool and documentation are not available for download, a comparison is difficult. An independent verification of the proof is, however, important to ensure trust in the protocol, not only for users, but also for work in international standardization bodies. A pen-and-paper proof for security in the sense of Abdalla, Fouque and Pointcheval [1] has been given in [2]. In [6] attempts are made to merge the pen-and-paper proof with the VSE-proof.

Formal Analysis of Security Protocols. According to [4], the execution of the protocol, and thus the state space, is not bounded. An approach based on model-checking therefore does not seem appropriate. Other than (classical) model-checking, a plethora of tools and approaches exist to formally analyze security protocols, and the reader is referred to [3] for a comprehensive overview. Compared to other tools, the choice of CafeOBJ was motivated rather from the perspective of a practitioner, and not necessarily due to other tools lacking features. In particular, the OTS/CafeOBJ approach is well documented and has a proven track record w.r.t. security protocol verification [17–21, 23], the CafeOBJ platform is very stable, and modeling of protocols is straightforward. Also, it is not difficult to start with an abstract specification, and then add details and extend proofs later on.

The lack of automation in OTS/CafeOBJ is a double-edged sword. On the one hand, no hidden limitations exist, whereas most tools that aim for full automation make some assumptions, e.g. to reduce the state space. It is sometimes not easy to anticipate which of these limitations apply to the protocol one intends to prove. Moreover, the manual approach forces one to reflect on the formalization and on how appropriately it captures the protocol in question. On the other hand, the lack of automation is sometimes not time-effective and somewhat tedious. Constructing tools that not only offer a high level of automation, but also fully axiomatize Abelian group theory to account for more in-depth algebraic attacks is an ongoing research topic, with several tools available, e.g. the Tamarin tool [24], which is based on multiset rewriting, or an extended version of ProVerif [15]. Maude, another member of the OBJ family, has been used for formal analysis of security protocols [22], and in particular the Maude-NPA [11] tool offers a narrowing-based approach for Diffie-Hellman. Last, automation of the OTS/CafeOBJ approach itself has also recently been increased significantly [12].

All these approaches are natural candidates when extending the proof, e.g. by adding detail to the specification w.r.t. mapping a point and domain parameters to a group generator, or when extending the protocol sequence to the full protocol sequence for extended access control.

# 8 Conclusion and Future Work

Key secrecy has been successfully verified in CafeOBJ. This not only facilitates trust in the PACE protocol, but also represents one more case study showing that the OTS/CafeOBJ approach scales well beyond toy examples like NS(L)PK to real-world scenarios. Also, the PACE proof can serve as a guide on how to model a DH key exchange in CafeOBJ. Key secrecy, however, is only one important property of PACE. Future directions include extending the proof to mutual authentication, perfect forward secrecy, and the full EAC2 protocol stack, possibly with the help of the automated tools mentioned in Section 7.

#### References

1. Abdalla, M., Fouque, P.A., Pointcheval, D.: Password-based authenticated key exchange in the three-party setting. In: Proc. 8th PKC. LNCS, vol. 3386 (2005)
2. Bender, J., Fischlin, M., Kügler, D.: Security analysis of the PACE key-agreement protocol. In: ISC. LNCS, vol. 5735 (2009)
3. Blanchet, B.: Security protocol verification: Symbolic and computational models. In: Proc. 1st POST. LNCS, vol. 7215 (2012)
4. BSI: Advanced security mechanisms for machine readable travel documents (2012)
5. Cheikhrouhou, L., Stephan, W.: Meilensteinreport: Inductive verification of PACE. Tech. rep., DFKI GmbH (2010)
6. Cheikhrouhou, L., Stephan, W., Dagdelen, Ö., Fischlin, M., Ullmann, M.: Merging the cryptographic security analysis and the algebraic-logic security proof of PACE. In: Sicherheit. LNI, vol. 195 (2012)
7. Clark, J., Jacob, J.: A survey of authentication protocol literature (1997)
8. Diaconescu, R., Futatsugi, K.: CafeOBJ Report: The language, proof techniques and methodologies for object-oriented algebraic specification, AMAST Series in Computing, vol. 6. World Scientific (1998)
9. Diffie, W., Hellman, M.E.: New directions in cryptography. IEEE Transactions on Information Theory 22(6) (1976)
10. Dolev, D., Yao, A.C.C.: On the security of public key protocols. IEEE Transactions on Information Theory 29(2) (1983)
11. Escobar, S., Meadows, C., Meseguer, J.: Maude-NPA: Cryptographic protocol analysis modulo equational properties. In: FOSAD. LNCS, vol. 5705 (2007)
12. Găină, D., Zhang, M., Chiba, Y., Arimoto, Y.: Constructor-based inductive theorem prover. In: 5th CALCO. LNCS, vol. 8089 (2013)
13. ICAO: Doc 9303 Machine readable travel documents
14. Koch, F.A., Ullmann, M., Wittmann, S.: Verification support environment. In: Proc. 8th CAV. LNCS, vol. 1102 (1996)
15. Küsters, R., Truderung, T.: Using ProVerif to analyze protocols with Diffie-Hellman exponentiation. In: Proc. 22nd CSF (2009)
16. Lowe, G.: An attack on the Needham-Schroeder public-key authentication protocol. Inf. Process. Lett. 56(3) (1995)
17. Ogata, K., Futatsugi, K.: Rewriting-based verification of authentication protocols. Electr. Notes Theor. Comput. Sci. 71 (2002)
18. Ogata, K., Futatsugi, K.: Flaw and modification of the iKP electronic payment protocols. Inf. Process. Lett. 86(2) (2003)
19. Ogata, K., Futatsugi, K.: Proof scores in the OTS/CafeOBJ method. In: Proc. 6th FMOODS 2003. LNCS, vol. 2884 (2003)
20. Ogata, K., Futatsugi, K.: Equational approach to formal analysis of TLS. In: Proc. 25th ICDCS. IEEE Computer Society (2005)
21. Ogata, K., Futatsugi, K.: Proof score approach to analysis of electronic commerce protocols. International Journal of Software Engineering and Knowledge Engineering 20(2) (2010)
22. Ölveczky, P.C., Grimeland, M.: Formal analysis of time-dependent cryptographic protocols in Real-Time Maude. In: Proc. 21st IPDPS (2007)
23. Ouranos, I., Ogata, K., Stefaneas, P.S.: Formal analysis of TESLA protocol in the timed OTS/CafeOBJ method. In: Proc. 5th ISoLA. LNCS, vol. 7610 (2012)
24. Schmidt, B., Meier, S., Cremers, C.J.F., Basin, D.A.: Automated analysis of Diffie-Hellman protocols and advanced security properties. In: Proc. 25th CSF (2012)

# A Normalized Form for FIFO Protocols Traces, Application to the Replay of Mode-based Protocols

Mamoun Filali, Meriem Ouederni, Jean-Baptiste Raclet

IRIT CNRS Université de Toulouse

**Abstract.** The traditional concern of runtime verification is the ability to detect an incorrect system behavior and maybe to act on such systems whenever incorrect behavior of a software system is detected [16]. In this paper, our concern is to provide a system observation through which the system behavior could for instance be diagnosed, *e.g.* to resolve unexpected bugs. Such a system observation is elaborated from a partial observation. Our work is at the protocol level: given a distributed application relying on a FIFO protocol for message passing, our concern is to reconstruct a full execution given by its observable send events.

#### 1 Introduction

The availability of a web-based infrastructure, e.g., the Internet, has popularized worldwide distributed applications. Nowadays, commercial transactions and several administrative procedures, e.g., eGovernment, are often executed in a distributed setting. The interpretation of data produced by such applications is very useful and of utmost importance. For instance, it enables program diagnosis in order to understand the executed traces. In addition, if an unexpected bug occurs at execution time, the interpretation allows us to replay, i.e., reconstruct, the traces leading to the fault. Thus, one may be able to debug the program and find the root cause of the bug. In this paper, we address the following problem: how can we faithfully make an interpretation of data available at runtime? Runtime verification [16] can be identified as one domain addressing such a topic. Actually, the traditional concern of runtime verification is the ability to detect an incorrect system behavior and maybe to act on such systems whenever incorrect behavior of a software system is detected. In this paper, our aim is to provide a system observation through which the system behavior could for instance be diagnosed. Such a system observation is elaborated from a partial observation. Our work is at the protocol level: given a distributed application relying on a FIFO protocol for message passing, our concern is to reconstruct a full execution given by its observable send events. We motivate the use of send events as follows: first, the decision to receive is usually considered as an internal or private decision; moreover, sends are seen over the network while the receipts are not.

The rest of the paper is organized as follows. After the definition of basic semantic notions in Section 2, we illustrate and motivate the studied model in Section 3. Section 4 introduces a normal form of executions. A replay algorithm based upon this normal form is studied in Section 5. Before concluding, we review some related work in Section 6.

# 2 FIFO protocols semantics

In this section, we present the semantics of the studied model: FIFO protocols. After presenting the notations used throughout the paper, we recall the basic semantic notions used in the sequel. Transition systems [2], together with runs and traces, the basic notions for observing a transition system, are defined first. After a syntactic definition of send-receive protocol systems, their semantics is given as transition systems.

#### 2.1 Notations

Finite Sequences (lists). Let S be a set;  $S^*$  is the set of finite sequences over S. An element  $x_1 \ldots x_n$  of  $S^*$  is also denoted  $x_{(i)}$ . [] is the empty sequence and [e] is the one-element list containing e. Given a non-empty sequence l:  $x_1 \ldots x_n$ , hd(l) is the first element of l:  $x_1$ ; tl(l) is the sequence resulting from the suppression of  $x_1$ :  $x_2 \ldots x_n$ ; last(l) is the last element of l:  $x_n$ ; butlast(l) is the sequence resulting from the suppression of  $x_n$ :  $x_1 \ldots x_{n-1}$ . set(l) is the set of the elements of the list l:  $\{x_1 \ldots x_n\}$ . By abuse of notation, we write  $\forall e \in l$ . ... instead of  $\forall e \in \text{set}(l)$ . ... . Given a set H,  $l \setminus H$  is the sequence resulting from the suppression of the elements of H within l. The concatenation of two lists l, l' is denoted l@l'. Given a sequence L of sequences over S, concat(L) is the sequence resulting from the concatenation of the elements of L.

Updates. Given a structured datatype, e.g., an array, a record, ..., := denotes its update. For instance, given an array A where s is a valid index, A[s:=v] is the array resulting from the update by v of the element of A at index s, given a record R with a field named f, R[f:=v] is the record resulting from the update by v of the field named f.
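For readers who prefer executable notation, the list and update operations above can be mirrored in Python. This is only an illustrative sketch: lists stand for finite sequences, dictionaries stand for arrays and records, and the function names `remove` and `update` are hypothetical stand-ins for the suppression and `:=` notations.

```python
def hd(l):
    """First element of a non-empty sequence."""
    return l[0]


def tl(l):
    """Sequence without its first element."""
    return l[1:]


def last(l):
    """Last element of a non-empty sequence."""
    return l[-1]


def butlast(l):
    """Sequence without its last element."""
    return l[:-1]


def remove(l, H):
    """Suppress the elements of the set H within l, keeping the order."""
    return [x for x in l if x not in H]


def concat(L):
    """Concatenate a sequence of sequences."""
    return [x for l in L for x in l]


def update(A, s, v):
    """A[s := v]: a copy of the array or record A with entry s set to v."""
    new = dict(A)
    new[s] = v
    return new
```

Note that `update` returns a new structure and leaves its argument unchanged, which matches the functional reading of A[s:=v].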

#### 2.2 Transition Systems

**Definition 1 (Labelled Transition Systems).** A labelled transition system defined over a set of states S and a set of labels  $\Sigma$  is a pair  $Sys = (I, \rightarrow)$  where I, the set of initial states, is a subset of S and  $\rightarrow$ , the labelled transition relation, is a subset of  $S \times \Sigma \times S$ .

In the following, given a transition tr = (s, l, s'), its label l is denoted Lab(tr), its source state s is denoted Src(tr), and its destination state s' is denoted Dst(tr).

**Definition 2 (Runs).** Given a labelled transition system  $Sys = (I, \rightarrow)$  over S and  $\Sigma$ , a run is an element of  $(S \times \Sigma \times S)^* : s_0 l_0 s'_0 \dots s_n l_n s'_n$  such that  $n \in \mathbb{N}$ ,  $s_0 \in I$  and  $\forall i \leq n$ .  $(s_i, l_i, s'_i) \in \rightarrow \land \forall i < n$ .  $s'_i = s_{i+1}$ . Its set of runs is denoted  $\mathcal{R}_{Sys}$ .

A run is a sequence of interleaved states and labels obtained through the execution of the algorithm or protocol modelled as a transition system.

**Notations.** Given a state s,  $\mathcal{R}(s)$  is the set of runs starting at s. Given a non-empty run  $r = s_0 l_0 s'_0 \dots s_n l_n s'_n$ , ends(r) denotes the pair  $(s_0, s'_n)$ .

**Definition 3 (Traces).** Given a labelled transition system  $Sys = (I, \rightarrow)$  over S and  $\Sigma$ , and a set  $\mathcal{E} \subseteq \Sigma$  called the epsilon set, a trace is an element of  $\Sigma^*$  obtained as a projection of a run where letters of the epsilon set  $\mathcal{E}$  have been suppressed.

$$\operatorname{Traces}(\mathit{Sys},\mathcal{E}) = \bigcup_{s_{(i)},s'_{(i)}} \{l_0 \dots l_n \setminus \mathcal{E}. \ (s_0, l_0, s'_0) \dots (s_n, l_n, s'_n) \in \mathcal{R}_{\mathit{Sys}}\}$$

Intuitively, a trace is a sequence of labels that can be observed through the execution of the algorithm or protocol modelled as a transition system:  $\operatorname{Trace}((s_0, l_0, s'_0) \dots (s_n, l_n, s'_n), \mathcal{E}) = l_0 \dots l_n \setminus \mathcal{E}$ <sup>1</sup>.
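Definitions 2 and 3 can be made concrete with a small Python sketch. It assumes a finite transition relation given as a set of (source, label, destination) triples; the function names `is_run` and `trace` are hypothetical, and treating the empty sequence as a run is a convenience of the sketch.

```python
def is_run(run, initial, transitions):
    """Check the conditions of a run: it starts in an initial state,
    every element is a transition, and consecutive transitions are
    adjacent (the destination of one is the source of the next)."""
    if not run:
        return True  # the empty sequence is accepted for convenience
    if run[0][0] not in initial:
        return False
    if any(step not in transitions for step in run):
        return False
    return all(run[i][2] == run[i + 1][0] for i in range(len(run) - 1))


def trace(run, epsilon):
    """Project a run on its labels and suppress the letters of the
    epsilon set."""
    return [label for (_, label, _) in run if label not in epsilon]
```

Several runs may share the same trace, since the suppressed letters can be interleaved in different ways.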

#### 2.3 FIFO Protocols Systems

**Definition 4 (Send-receive protocols).** A send-receive protocol is defined as a tuple  $(St, \delta)$  over a set  $\mathcal{P}$  of peers, a set  $\mathcal{L}$  of locations<sup>2</sup> and a set  $\mathcal{M}$  of messages. To each peer p is assigned an automaton  $(St_p, \delta_p)$  where  $St_p$ , a location of  $\mathcal{L}$ , is the initial state of the peer p and  $\delta_p$  is the set of transitions of the peer p. A transition is one of:

- a send transition denoted (st, q!m, st'): in state st, the peer p sends the message m to the peer q through the input queue of peer  $q \in \mathcal{P}$  and moves to state st'.
- a receive transition denoted (st,?m,st'): in state st, the peer p receives (or consumes) the message m through its own input queue and moves to state st'.
- an epsilon (internal) transition denoted  $(st, \epsilon, st')$ .

In the following, we consider deterministic send-receive protocols, that is, we omit  $\epsilon$  transitions and consider only deterministic transitions; we then use  $\delta$  transitions as partial functions: the state resulting when moving from state st through a  $\delta$  transition is denoted  $\delta(st)$ .

**Definition 5 (FIFO Send-Receive Systems).** Given a send-receive protocol  $(St, \delta)$  defined over  $(\mathcal{P}, \mathcal{L}, \mathcal{M})$ , we define its FIFO labelled transition system FIFO\_LTS $((\mathcal{P}, \mathcal{L}, \mathcal{M}, St, \delta))$  as the following tuple:

S the set of global states is a mapping giving for each peer its local state and its FIFO queue:  $\mathcal{P} \to \mathcal{L} \times \mathcal{M}^*$ . We represent a global state as an array of records indexed by the set of peers:

array  $\mathcal{P}$  of record state :  $\mathcal{L}$ , queue : list of  $\mathcal{M}$  end

 $\Sigma$  its set of labels as the (disjoint) union of:

<sup>1</sup> We write  $\operatorname{Trace}(r)$  when  $\mathcal{E}$  is clear from the context.

<sup>2</sup> The location has to be understood here as the local, or internal state of a peer.

- its send labels:  $\mathcal{P} \times \mathcal{M} \times \mathcal{P}$ , where an element is denoted  ${}_sS_d^m$  where s is the source peer, d the destination peer and m the message sent from s to d.
- its receive labels:  $\mathcal{P} \times \mathcal{M}$ , where an element is denoted  $R_p^m$  where p is the receiving peer and m the received message.
- The set of initial states is the singleton  $\{p \mapsto [\text{state} := St_p,\ \text{queue} := []]\}$ .
- $\rightarrow$  its transition relation as the union  $\mathcal{S} \cup \mathcal{R}$  where:
  - $\mathcal{S}$  is the set of send transitions<sup>3</sup>:

$$\begin{split} \mathcal{S} &= \bigcup_{s} \mathcal{S}_{s} \\ \mathcal{S}_{s} &= \bigcup_{r,m} \{ (St, {}_{s}S_{r}^{m}, St'). \quad s \neq r \wedge (St[s], r!m, St'[s]) \in \delta_{s} \\ & \wedge St' = St[s := St[s][\text{state} := St'[s].\text{state}], \\ & \qquad\quad r := St[r][\text{queue} := St[r].\text{queue}@[m]]] \} \end{split}$$

  - $\mathcal{R}$  is the set of receive transitions:

$$\begin{split} \mathcal{R} &= \bigcup_r \mathcal{R}_r \\ \mathcal{R}_r &= \bigcup_m \{ (St, R_r^m, St'). \quad (St[r], ?m, St'[r]) \in \delta_r \\ & \wedge St[r].\text{queue} \neq [] \wedge m = hd(St[r].\text{queue}) \\ & \wedge St' = St[r := St[r][\text{state} := St'[r].\text{state},\ \text{queue} := tl(St[r].\text{queue})]] \} \end{split}$$

Intuitively, each peer p has a FIFO queue that contains the sequence of data items that have been sent to peer p but not yet received by p. Each receive transition removes a data item from the queue of the receiving peer and each send transition adds a data item to the queue of the destination peer. Moreover, we note that FIFO queues are unbounded. It follows that our transition systems are infinite.
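The send and receive transitions of the FIFO semantics can be sketched as two Python functions over global states. The encoding is an assumption made for illustration only: a global state is a dictionary mapping each peer to a pair (location, queue), and `delta[p]` maps a pair (location, action) to the next location, where an action is `("!", q, m)` for a send or `("?", m)` for a receive. Each function returns the successor state, or `None` when the transition is not enabled.

```python
def send(delta, St, s, r, m):
    """Send transition: peer s sends m to peer r. The location of s
    changes and m is appended to the queue of r."""
    loc, queue = St[s]
    nxt = delta[s].get((loc, ("!", r, m)))
    if s == r or nxt is None:
        return None
    new = dict(St)
    new[s] = (nxt, queue)
    new[r] = (St[r][0], St[r][1] + (m,))
    return new


def receive(delta, St, r, m):
    """Receive transition: peer r consumes m, which must be at the
    head of its own queue."""
    loc, queue = St[r]
    nxt = delta[r].get((loc, ("?", m)))
    if nxt is None or not queue or queue[0] != m:
        return None
    new = dict(St)
    new[r] = (nxt, queue[1:])
    return new
```

Queues are represented as tuples so that global states are never mutated in place; since no bound is put on their length, the reachable state space may be infinite, as noted above.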

**Notations.** Given a label l (an element of  $\Sigma$ ):

- the predicates isSend and isReceive indicate whether l is a send label or a receive label, respectively.
- the function On denotes its peer:  $\operatorname{On}({}_sS^m_d)=s$ ,  $\operatorname{On}(R^m_r)=r$ .

In the following, we consider a fixed FIFO protocol, so we omit the tuple  $(\mathcal{P},\mathcal{L},\mathcal{M},St,\delta)$ .

**Definition 6 (FIFO Protocols Traces).** The traces of a FIFO protocol system are defined as the projection of the set of its runs over Send labels.

$$\text{FIFO\_Traces}(\mathcal{P}, \mathcal{L}, \mathcal{M}, St, \delta) = \text{Traces}(\text{FIFO\_LTS}(\mathcal{P}, \mathcal{L}, \mathcal{M}, St, \delta), \bigcup_{r,m} \{R_r^m\})$$

Remark. FIFO automata differ from what could be called "FIFO channel automata" [17], where peers communicate through FIFO communication channels: FIFO automata define for every peer a unique FIFO on which the peer receives all the messages the other peers have sent to it. The FIFO automata considered in this paper define the usual send-receive order over the basic send-receive events of a distributed system computation, but they also introduce a compatible total order over the sends to a given peer. In the domain of embedded systems, one can usually rely on such a semantics. Actually it is weaker than the instantaneous broadcast which is common in synchronous languages.

<sup>3</sup> Note that although the definition is circular, the state field of the structure is constrained through  $\delta_s$ .

## 3 An illustrative Example

As an illustrative example, we consider one of the folklore algorithms for the distributed spanning tree construction [10]. Given a network of peers PEERS knowing initially their respective local neighborhood as a set of peers, we have to construct a spanning tree where the root ROOT is a priori designated, e.g., peer 0. At the end, each peer (but peer 0) has to know its father and its sons with respect to the constructed spanning tree. As a working example, we shall consider the network topology illustrated in figure 1. Figure 2 suggests one spanning tree that would be possible to construct given the supposed underlying topology (fig. 1).





Fig. 1. Underlying network topology

Fig. 2. Spanning tree

# 3.1 A Distributed Spanning Tree Construction Algorithm

A so-called diffusing computation [8] can be used for building a spanning tree. Peer 0 initiates the computation by sending the message BeMySon to each of its neighbors. When a peer receives its first BeMySon message, it takes the sender as its father and sends a BeMySon message to each neighbor peer<sup>4</sup> except its father. It then waits for an acknowledgement for each of the sent messages. Its set of sons consists of the peers that have acknowledged positively by an OK message. When a peer (other than the root) has received all the acknowledgments, it in turn acknowledges its father by an OK message. While waiting for an acknowledgment, a peer can receive a BeMySon message; in that case, it acknowledges it negatively by a NO message. The computation is terminated once the root has received all its awaited acknowledgments.

<sup>4</sup> We have supposed that each peer knows initially its local neighborhood.

In the following, we suppose that communication is FIFO: each peer has a queue where it receives the messages sent to it. Figures 3 and 4 illustrate respectively the automaton of the root peer and of a non-root peer<sup>5</sup>. We have modeled the algorithm in the language PlusCal [14] (based on the semantic formalism TLA+ [13]). This model is given in appendix ??.
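Independently of the PlusCal model, the algorithm can be simulated with a few lines of Python. The sketch below is a simplification made for illustration and not the authors' model: each peer sends all its BeMySon messages atomically when it is first contacted, the scheduler always serves the lowest-numbered peer with a non-empty queue, and the name `spanning_tree` and the dictionary-based topology encoding are hypothetical.

```python
from collections import deque


def spanning_tree(neighbours, root=0):
    """Simulate the diffusing computation with one FIFO queue per peer.

    neighbours maps each peer to the set of its neighbours. Returns the
    father of each peer, the sons of each peer, and the list of send
    events (source, destination, data) in the order they occurred.
    """
    queue = {p: deque() for p in neighbours}
    father = {p: None for p in neighbours}
    sons = {p: set() for p in neighbours}
    pending = {p: 0 for p in neighbours}  # acknowledgements still awaited
    started = {root}
    trace = []

    def send(src, dst, data):
        queue[dst].append((data, src))
        trace.append((src, dst, data))

    # The root initiates the computation.
    for q in sorted(neighbours[root]):
        send(root, q, "BeMySon")
    pending[root] = len(neighbours[root])

    while any(queue.values()):
        # Arbitrary scheduling choice: lowest peer with a pending message.
        p = next(x for x in sorted(queue) if queue[x])
        data, sender = queue[p].popleft()
        if data == "BeMySon":
            if p in started:
                send(p, sender, "NO")  # already has a father (or is root)
            else:
                started.add(p)
                father[p] = sender
                for q in sorted(neighbours[p] - {sender}):
                    send(p, q, "BeMySon")
                pending[p] = len(neighbours[p]) - 1
                if pending[p] == 0:
                    send(p, sender, "OK")  # leaf: acknowledge at once
        else:  # "OK" or "NO" acknowledgement
            if data == "OK":
                sons[p].add(sender)
            pending[p] -= 1
            if pending[p] == 0 and p != root:
                send(p, father[p], "OK")
    return father, sons, trace
```

The returned `trace` contains only send events, which is the kind of observation that the replay discussed next starts from. A different scheduling choice may yield a different spanning tree for the same topology.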



Fig. 3. Root peer automaton

## 3.2 Replaying a Trace

Replay consists in synthesizing a run where epsilon transitions, i.e., receive transitions, have been guessed and interleaved with the given send transitions of an actual trace (see Figure 6). For that purpose, we suppose given the *modes* of the protocol: modes give for each peer and state whether it is ready to receive or ready to send. The considered algorithm assumes that a peer cannot be both ready to receive and ready to send. With respect to our example, modes are given by the tables of Figure 5.

# 4 A Normal Form for FIFO Protocols Traces

In this section, we propose a normal form for FIFO protocol traces. Given a trace, a normal form is a run which can be seen as the representative of all the runs with the same trace. In fact, given a trace, we do not reconstruct the actual run which led to the trace in question but a run with the same trace. Such a run is called the normalized run. Should such a run be replayed, it would produce the same trace as the trace in question. In the following, we are concerned with the correctness of the normalized run (re)construction.

<sup>5</sup> These automata differ with respect to the initial transition (from the idle state) and the terminal transition (to the term state).

Fig. 4. Not Root peer automaton

| State (root peer)                  | Mode           |
|------------------------------------|----------------|
| $ask \land rem \neq \emptyset$     | ReadyToSend    |
| $collect \wedge rem \neq \emptyset$ | ReadyToReceive |
| ref                                | ReadyToSend    |

| State (not root peer)              | Mode           |
|------------------------------------|----------------|
| idle                               | ReadyToReceive |
| $ask \land rem \neq \emptyset$     | ReadyToSend    |
| $collect \wedge rem \neq \emptyset$ | ReadyToReceive |
| ref                                | ReadyToSend    |
| $collect \wedge rem = \emptyset$   | ReadyToSend    |

Fig. 5. Protocol modes

# 4.1 Basic Operations

We introduce two basic operations that will be used for normalization. These operations preserve runs and their semantics, i.e., traces.

Unrolling consists in pushing to the front of a run, a given transition.
 Intuitively, we unroll the effect of such a transition from the end until

| Trace                                                     | Synthesized interleaved receives for the replay |
|-----------------------------------------------------------|-------------------------------------------------|
| $\frac{11acc}{{}_{0}S_{3}^{[data:="BeMySon",sender:=0]}}$ |                                                 |
|                                                           | $R_3^{[data:="BeMySon",sender:=0]}$             |
| $_{3}S_{2}^{[data:="BeMySon",sender:=3]}$                 |                                                 |
| ${}_{0}S_{1}^{[data:="BeMySon",sender:=0]}$               |                                                 |
|                                                           | $R_2^{[data:="BeMySon",sender:=3]}$             |
| $_{2}S_{1}^{[data:="BeMySon",sender:=2]}$                 | 2                                               |
|                                                           | $R_1^{[data:="BeMySon",sender:=0]}$             |
| $_{1}S_{2}^{[data:="BeMySon",sender:=1]}$                 | 1                                               |
|                                                           | $R_2^{[data:="BeMySon",sender:=1]}$             |
| $_{2}S_{1}^{[data:="NO",sender:=2]}$                      | -2                                              |
| 2~1                                                       | $R_1^{[data:="BeMySon",sender:=2]}$             |
| $_{1}S_{2}^{[data:="NO",sender:=1]}$                      | 1                                               |
| 1~2                                                       | $R_{2}^{[data:="NO",sender:=1]}$                |
| $_2S_3^{[data:="OK",sender:=2]}$                          | 102                                             |
| 2~3                                                       | $R_3^{[data:="OK",sender:=2]}$                  |
| $_{3}S_{0}^{[data:="OK",sender:=3]}$                      | 1 2 3                                           |
| 3~ <sub>0</sub>                                           | $R_1^{[data:="BeMySon",sender:=0]}$             |
| $_1S_0^{[data:="OK",sender:=1]}$                          |                                                 |

Fig. 6. Replay of the Protocol from an Actual Trace

the front is reached. Intuitively, unroll expresses that a transition at the end of a run can also occur at the beginning of such a run.

- Partitioning consists of splitting a run into two runs such that the first contains only transitions of a given peer and the second contains the remaining transitions, issued by the other peers.

**Unrolling** The parameters of the unroll operation are as follows: the pair of swap functions $(sw_1, sw_2)$ maps a pair of transitions $(t_1, t_2)$ to the pair $(sw_1(t_1, t_2), sw_2(t_1, t_2))$; r is the run over which the unrolling occurs; and e is the transition to be unrolled.

unroll is defined recursively as follows:

$$\begin{array}{ll} \operatorname{unroll}_{sw_1,sw_2}(r,e) \triangleq & \text{if } r = [] \text{ then } [e] \\ & \text{else } \operatorname{unroll}_{sw_1,sw_2}(\operatorname{butlast}(r),\ sw_1(\operatorname{last}(r),e))\ @\ [sw_2(\operatorname{last}(r),e)] \end{array}$$
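The recursion can be illustrated by a small sketch. The Python code below is only an illustration under simplifying assumptions, not the formalization used in the paper: transitions are opaque values, and the swap functions `move_front`, `stay_behind`, `keep_first` and `keep_second` are hypothetical placeholders that either exchange the two elements or leave them in place.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")
Swap = Callable[[T, T], T]


def unroll(sw1: Swap, sw2: Swap, r: List[T], e: T) -> List[T]:
    """Push e from the end of run r towards its front, one pair at a time."""
    if not r:
        return [e]
    last = r[-1]
    # sw1 yields the element that keeps travelling towards the front,
    # sw2 yields the element that is left behind at the current position.
    return unroll(sw1, sw2, r[:-1], sw1(last, e)) + [sw2(last, e)]


# Hypothetical swap that always exchanges the pair (t1, t2) into (t2, t1).
def move_front(t1, t2):
    return t2


def stay_behind(t1, t2):
    return t1


# Hypothetical swap that never exchanges the pair.
def keep_first(t1, t2):
    return t1


def keep_second(t1, t2):
    return t2


swapped = unroll(move_front, stay_behind, ["a", "b", "c"], "e")
unchanged = unroll(keep_first, keep_second, ["a", "b", "c"], "e")
```

With the exchanging swap, the unrolled element travels to the front of the run; with the identity swap, the run is returned unchanged. The swaps of the paper additionally recompute the intermediate global states, which this sketch leaves out.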

Subsequently, we apply the unroll operation over runs. In order to preserve run properties, we consider the following local properties<sup>6</sup> of the swap operation:

- a swap preserves the adjacency of (run) elements.

$$\operatorname{Dst}(tr) = \operatorname{Src}(tr') \Rightarrow \operatorname{Dst}(sw_1(tr, tr')) = \operatorname{Src}(sw_2(tr, tr'))$$

- a swap preserves the ends of adjacent (run) elements.

$$\operatorname{Dst}(tr) = \operatorname{Src}(tr') \Rightarrow \operatorname{Src}(sw_1(tr, tr')) = \operatorname{Src}(tr) \wedge \operatorname{Dst}(sw_2(tr, tr')) = \operatorname{Dst}(tr')$$

- a swap preserves the trace of runs of two elements.

$$\operatorname{Dst}(tr) = \operatorname{Src}(tr') \Rightarrow \operatorname{Trace}([sw_1(tr, tr')]@[sw_2(tr, tr')]) = \operatorname{Trace}([tr]@[tr'])$$

<sup>6</sup> They are local since they apply to two elements, tr and tr', and not globally, e.g., to a list of elements.

**Theorem 1 (Unrolling a receive over receives).** Unrolling a receive over a receive run yields a new receive run with the same ends.

$$\begin{array}{l} \forall \ r \ e. \ r@[e] \in \mathcal{R}(i) \land (\forall \ tr \in r@[e]. \ \text{isReceive}(\text{Lab}(tr))) \\ \Rightarrow \quad \text{unroll}_{sw_{R_1},sw_{R_2}}(r,e) \in \mathcal{R}(i) \\ \land \ \text{Trace}(\text{unroll}_{sw_{R_1},sw_{R_2}}(r,e)) = \text{Trace}(r@[e]) \\ \land \ \text{ends}(\text{unroll}_{sw_{R_1},sw_{R_2}}(r,e)) = \text{ends}(r@[e]) \end{array}$$

where  $sw_{R_1}$  and  $sw_{R_2}$  instantiate the generic swap:

$$\begin{array}{l}
sw_{R_1}((S_1,R_{p'}^{m'},S_1'),(S_2,R_p^m,S_2')) = \\
\quad \textbf{if}\ p = p'\ \textbf{then}\ (S_1,R_{p'}^{m'},S_1') \\
\quad \textbf{else}\ (S_1,\ R_p^m,\ S_1[p := [\text{state} := S_2'[p].\text{state},\ \text{queue} := tl(S_1[p].\text{queue})]]) \\
\mid\ sw_{R_1}(e_1,e_2) = e_1 \\[1ex]
sw_{R_2}((S_1,R_{p'}^{m'},S_1'),(S_2,R_p^m,S_2')) = \\
\quad \textbf{if}\ p = p'\ \textbf{then}\ (S_2,R_p^m,S_2') \\
\quad \textbf{else}\ (S_1[p := [\text{state} := S_2'[p].\text{state},\ \text{queue} := tl(S_1[p].\text{queue})]],\ R_{p'}^{m'},\ S_2') \\
\mid\ sw_{R_2}(e_1,e_2) = e_2
\end{array}$$

Remark. Intuitively, such a global swap expresses that receives of different peers are independent: they can be swapped.

**Theorem 2 (Unrolling a send over receives).** Unrolling a send issued by a node s over a receive run not issued by s yields a new run with the same initial and end state.

$$\begin{array}{l}
\forall\ r\ e.\ r@[e] \in \mathcal{R}(i) \land (\forall\ tr \in r.\ \text{isReceive}(\text{Lab}(tr)) \land \text{On}(\text{Lab}(tr)) \neq \text{On}(\text{Lab}(e))) \land \text{isSend}(\text{Lab}(e)) \\
\Rightarrow \quad \text{unroll}_{sw_{S_1},sw_{S_2}}(r,e) \in \mathcal{R}(i) \\
\quad \land\ \text{Trace}(\text{unroll}_{sw_{S_1},sw_{S_2}}(r,e)) = \text{Trace}(r@[e]) \\
\quad \land\ \text{ends}(\text{unroll}_{sw_{S_1},sw_{S_2}}(r,e)) = \text{ends}(r@[e]) \\
\quad \land\ \text{isSend}(\text{Lab}(hd(\text{unroll}_{sw_{S_1},sw_{S_2}}(r,e)))) \\
\quad \land\ \forall\ tr \in tl(\text{unroll}_{sw_{S_1},sw_{S_2}}(r,e)).\ \text{isReceive}(\text{Lab}(tr))
\end{array}$$

where  $sw_{S_1}$  ,  $sw_{S_2}$  instantiate the generic swap:

$$\begin{array}{l}
sw_{S_1}((S_1,R_{r_1}^{m_1},S_1'),(S_2,{}_{s_2}S_{r_2}^{m_2},S_2')) = \\
\quad \text{if } s_2 \neq r_1 \text{ then } (S_1,\ {}_{s_2}S_{r_2}^{m_2},\ S_1(s_2 := ((S_1' s_2)\{state := state(S_2'(s_2))\}),\ r_2 := ((S_1' r_2)\{queue := ((queue(S_1 r_2))@[m_2])\}))) \\
\quad \text{else } (S_1,R_{r_1}^{m_1},S_1') \\
\mid\ sw_{S_1}(e_1,e_2) = e_1
\end{array}$$

$$\begin{array}{l}
sw_{S_2}((S_1,R_{r_1}^{m_1},S_1'),(S_2,{}_{s_2}S_{r_2}^{m_2},S_2')) = \\
\quad \text{if } s_2 \neq r_1 \text{ then } (S_1(s_2 := ((S_1' s_2)\{state := state(S_2'(s_2))\}),\ r_2 := ((S_1' r_2)\{queue := ((queue(S_1 r_2))@[m_2])\})),\ R_{r_1}^{m_1},\ S_2') \\
\quad \text{else } (S_2,{}_{s_2}S_{r_2}^{m_2},S_2') \\
\mid\ sw_{S_2}(e_1,e_2) = e_2
\end{array}$$

Remark. Intuitively, if a send of a node s occurs in a run after a receive of node s' ( $s' \neq s$ ), it can also occur before. Such sends can be swapped.

**Partitioning** The partition operation has two parameters: a peer p and a run r. Partitioning divides a run into two complementary runs: the first contains only transitions of peer p and the second contains the transitions that do not belong to p. In order to obtain runs, partitioning is done through the unroll operation: the transitions of p are successively unrolled, which pushes them to the front and separates them from the transitions of the other peers.

The partition operation is defined recursively as follows:

$$\begin{array}{ll}
\operatorname{partition}(p,r) \triangleq & \text{if } r = [] \text{ then } ([],[]) \\
& \text{else let } (o,n) = \operatorname{partition}(p,\operatorname{butlast}(r)) \text{ in} \\
& \quad \text{if } \operatorname{On}(\operatorname{Lab}(\operatorname{last}(r))) = p \text{ then} \\
& \qquad \text{let } a' = \operatorname{unroll}_{sw_{R_1},sw_{R_2}}(n,\operatorname{last}(r)) \text{ in } (o@[hd(a')],tl(a')) \\
& \quad \text{else } (o,n@[\operatorname{last}(r)]) \\
\operatorname{on}(p,r) \triangleq & \operatorname{fst}(\operatorname{partition}(p,r)) \\
\operatorname{not\_on}(p,r) \triangleq & \operatorname{snd}(\operatorname{partition}(p,r))
\end{array}$$
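A label-level sketch may help to see how partitioning reuses unrolling. The Python code below is an illustration under stated assumptions rather than the paper's definition: a receive transition is reduced to a hypothetical pair (receiving peer, message), global states are ignored so that a swap merely exchanges two receives of different peers, unroll is taken to receive the run as its first argument (as in its definition), and the unrolled head is appended to the part that collects the transitions of p.

```python
from typing import List, Tuple

Label = Tuple[int, str]  # hypothetical receive label: (receiving peer, message)


def sw1(t1: Label, t2: Label) -> Label:
    # Same peer: keep the order. Different peers: t2 moves towards the front.
    return t1 if t1[0] == t2[0] else t2


def sw2(t1: Label, t2: Label) -> Label:
    return t2 if t1[0] == t2[0] else t1


def unroll(r: List[Label], e: Label) -> List[Label]:
    if not r:
        return [e]
    return unroll(r[:-1], sw1(r[-1], e)) + [sw2(r[-1], e)]


def partition(p: int, r: List[Label]) -> Tuple[List[Label], List[Label]]:
    """Split r into the receives of peer p and the receives of the other peers."""
    if not r:
        return [], []
    on, not_on = partition(p, r[:-1])
    last = r[-1]
    if last[0] == p:
        # Push the receive of p in front of the receives of the other peers.
        a = unroll(not_on, last)
        return on + [a[0]], a[1:]
    return on, not_on + [last]


run = [(1, "a"), (2, "b"), (1, "c"), (3, "d"), (2, "e")]
on_2, not_on_2 = partition(2, run)
```

In this example the two receives of peer 2 end up in the first component, in their original relative order, and the receives of peers 1 and 3 are kept, also in order, in the second one.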

**Theorem 3 (Partitioning receives).** For any peer p, a run over receive transitions is partitioned into two adjacent runs.

$$\begin{array}{l}
\forall\ r.\ r \in \mathcal{R}(i) \land (\forall\ tr \in r.\ \text{isReceive}(\text{Lab}(tr))) \\
\Rightarrow \quad \text{on}(p,r)@\text{not\_on}(p,r) \in \mathcal{R}(i) \\
\quad \land\ \text{Trace}(\text{on}(p,r)@\text{not\_on}(p,r)) = \text{Trace}(r) \\
\quad \land\ \text{ends}(\text{on}(p,r)@\text{not\_on}(p,r)) = \text{ends}(r)
\end{array}$$

#### 4.2 Normalization



Fig. 7. Normalized Run Structure

Normalization splits a run into a sequence of *packed* runs and a trail run. Each packed run consists of one send transition preceded by receive transitions of the same peer. The trail part contains all the receive transitions that have not yet been followed by a send of the same peer. Figure 7 illustrates normalization. More precisely:

- the empty run is split into an empty sequence and an empty trail.
- Given a run split into a sequence of packed runs and a trail run, adding a transition to this run yields the following: a receive transition ($R_1^a$ in Fig. 8) is added to the trail part, leaving the packed runs unchanged. For a send transition (${}_2S_3^a$ in Fig. 9) issued by a peer s, we partition the trail part into the on s part and the not\_on s part. The new packed part is the concatenation of the on part and the send transition, and the new trail part is the remaining not\_on part.

Fig. 8. Normalized Run Structure (after a Receive wrt. Fig. 7)

Fig. 9. Normalized run structure (after a Send wrt. Fig. 7)

Normalization is formalized as follows:

$$\begin{array}{ll}
\operatorname{normalize}(r) \triangleq & \text{if } r = [] \text{ then } ([],[]) \\
& \text{else let } (p,t) = \operatorname{normalize}(\operatorname{butlast}(r)) \text{ in} \\
& \quad \text{if } \operatorname{isReceive}(\operatorname{Lab}(\operatorname{last}(r))) \text{ then } (p,t@[\operatorname{last}(r)]) \\
& \quad \text{else let } s = \operatorname{On}(\operatorname{Lab}(\operatorname{last}(r))) \text{ in} \\
& \qquad \text{let } u = \operatorname{unroll}_{sw_{S_1},sw_{S_2}}(\operatorname{not\_on}(s,t),\operatorname{last}(r)) \text{ in} \\
& \qquad (p@\operatorname{on}(s,t)@[\operatorname{hd}(u)],\operatorname{tl}(u)) \\
\operatorname{packed}(r) \triangleq & \operatorname{fst}(\operatorname{normalize}(r)) \\
\operatorname{trail}(r) \triangleq & \operatorname{snd}(\operatorname{normalize}(r))
\end{array}$$
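The structure of normalization can be sketched at the level of labels. The Python code below is an illustration under simplifying assumptions, not the formal definition: labels are hypothetical tuples `("R", peer, msg)` and `("S", sender, receiver, msg)`, global states are dropped, and a single label-level swap stands in for both the receive/receive and the receive/send swaps.

```python
from typing import List, Tuple

Label = Tuple  # ("R", peer, msg) or ("S", sender, receiver, msg); hypothetical encoding


def on(label: Label) -> int:
    """Peer on which the transition occurs (receiver of an R, sender of an S)."""
    return label[1]


def is_receive(label: Label) -> bool:
    return label[0] == "R"


def unroll(r: List[Label], e: Label) -> List[Label]:
    if not r:
        return [e]
    last = r[-1]
    # Transitions of different peers are exchanged; same-peer pairs stay in order.
    front, back = (last, e) if on(last) == on(e) else (e, last)
    return unroll(r[:-1], front) + [back]


def partition(p: int, r: List[Label]) -> Tuple[List[Label], List[Label]]:
    if not r:
        return [], []
    o, n = partition(p, r[:-1])
    if on(r[-1]) == p:
        a = unroll(n, r[-1])
        return o + [a[0]], a[1:]
    return o, n + [r[-1]]


def normalize(r: List[Label]) -> Tuple[List[Label], List[Label]]:
    """Return (packed, trail) for the run r."""
    if not r:
        return [], []
    packed, trail = normalize(r[:-1])
    last = r[-1]
    if is_receive(last):
        return packed, trail + [last]
    s = on(last)
    on_s, not_on_s = partition(s, trail)
    u = unroll(not_on_s, last)  # the send moves in front of the other peers' receives
    return packed + on_s + [u[0]], u[1:]


run = [("S", 0, 1, "a"), ("R", 1, "a"), ("S", 0, 2, "b"),
       ("R", 2, "b"), ("S", 1, 2, "c")]
packed, trail = normalize(run)
```

On this example the sends appear in the packed part in their original order, each preceded by the pending receives of its own sender, and the only receive not yet followed by a send of its peer remains in the trail.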

**Theorem 4 (Normalization).** Normalization splits a run into two adjacent runs called the packed run and the trail run. A run and its packed run have the same trace.

$$\begin{array}{ll}
\forall\ r.\ r \in \mathcal{R}(i) \Rightarrow & \operatorname{packed}(r) @ \operatorname{trail}(r) \in \mathcal{R}(i) \\
& \wedge\ \operatorname{Trace}(\operatorname{packed}(r)) = \operatorname{Trace}(r) \\
& \wedge\ \forall\ tr \in \operatorname{trail}(r).\ \operatorname{isReceive}(\operatorname{Lab}(tr))
\end{array}$$

Remark. We have  $packed(r) \in \mathcal{R}(i)$ . Moreover, since the trail contains only receive transitions, its trace is empty.

## 5 Application to Trace-based Replay

In this section, we give the formal specification of a replay from a given observation. Note that we are not concerned with the construction of the observation itself but with the reconstruction of an execution from the observation of a run, and with its correctness.

#### 5.1 Replay definition

Intuitively, a replay function reconstructs a run that has a given trace, assumed to have been obtained from an actual execution.

**Definition 7 (replay).** Given a transition system  $Sys = (I, \rightarrow)$ , a replay function rf takes as parameters an initial state  $i \in I$  and the trace tr of a run starting at i, and returns a run of Sys starting at i with the same trace.

$$\begin{split} \operatorname{replay}((I, \to), rf) &\triangleq \forall i \in I. \ \forall r \in \mathcal{R}_{(I, \to)}(i). \quad rf(i, \operatorname{Trace}(r)) \in \mathcal{R}_{(I, \to)}(i) \\ &\wedge \operatorname{Trace}(rf(i, \operatorname{Trace}(r))) = \operatorname{Trace}(r) \end{split}$$

This definition deserves some comments:

- we do not reconstruct the actual execution that led to the given trace, but an execution with the same trace.
- As said in Section 2, the construction of the observation relies on a total order on the sends to a given peer. Such a total order can be established thanks to elaborate ordering mechanisms [12] or to highly precise network time protocols [15].
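Definition 7 can be checked mechanically on a finite set of runs. The Python sketch below is an illustration under stated assumptions: the toy system, its label strings (`"S:a"`, `"R:a"`) and the candidate function `rf` are hypothetical, and the trace of a run is taken to be its sequence of send labels.

```python
from typing import Callable, Dict, List, Tuple

Transition = Tuple[int, str, int]  # (source state, label, destination state)
Run = List[Transition]


def trace(run: Run) -> List[str]:
    """Observable part of a run: here, the send labels only."""
    return [label for (_, label, _) in run if label.startswith("S:")]


def is_replay(runs_from: Dict[int, List[Run]],
              rf: Callable[[int, List[str]], Run]) -> bool:
    """Check the replay condition for every listed initial state and run."""
    for i, runs in runs_from.items():
        for r in runs:
            rebuilt = rf(i, trace(r))
            if rebuilt not in runs or trace(rebuilt) != trace(r):
                return False
    return True


# Hypothetical toy system: from state 0, send "a", then receive it.
SEND: Transition = (0, "S:a", 1)
RECV: Transition = (1, "R:a", 2)
RUNS = {0: [[], [SEND], [SEND, RECV]]}


def rf(i: int, tr: List[str]) -> Run:
    # Rebuilds the shortest run with the requested trace.
    return [SEND] if tr == ["S:a"] else []


ok = is_replay(RUNS, rf)
bad = is_replay(RUNS, lambda i, tr: [])
```

The run `[SEND, RECV]` is replayed as `[SEND]`, which is accepted: the rebuilt run has the same trace, although it is not the execution that actually took place.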

An interesting property of a replay function is to be incremental: runs can be reconstructed through trace suffixes and corresponding starting states. Such a property is formalized as follows<sup>7</sup>:

Property 1 (incremental replay).

$$\operatorname{replay}((I, \to), rf) \Rightarrow \forall i \in I. \ \forall p \ s. \ p@s \in \mathcal{R}_{(I, \to)}(i) \Rightarrow \operatorname{replay}((p(I), \to), rf)$$

The incremental replay property is interesting with respect to space saving: provided that we do not have to reconstruct a run from the initial state, we can store only a suffix of the trace and its corresponding starting state. Such a property is also interesting with respect to the so-called "right to be forgotten": it allows us to forget a prefix of a run and to have the replay start after that prefix. Of course, it does not preclude memoization. Unsurprisingly, this shows that the "right to be forgotten" and the "right to store" are linked.

<sup>7</sup> We use the functional overloading seen in Section 2: p(I) is the state reached after the execution of the run prefix p.

#### 5.2 Mode based FIFO protocols

In a mode-based protocol, a peer that is ready to interact with its environment is either in the receiving mode (it is ready to receive messages) or in the sending mode (it is ready to send messages). Such a mode can be represented by a function defined over the product of peers and states as follows:

$$\text{mode} : \text{peer} \times \text{state} \rightarrow \{\text{Ready2Receive}, \text{Ready2Send}\}$$

Table 5 details such a function for our illustrative example.

#### 5.3 A replay algorithm

The considered replay algorithm takes as input:

- a FIFO protocol recorded trace,
- the mode of the states of each peer through the function mode,
- and the automata of the peers through their respective  $\delta$  transition function.

It reconstructs a run following the structure of a normalized run (see Fig. 7). More precisely, it recursively builds a sequence of packed runs. The sequence is built by the  ${\tt reconstruct\_run}$  function, which proceeds as follows: for each label  ${}_sS_r^m$  of the recorded trace, a packed run is reconstructed recursively by the  ${\tt reconstruct\_pack}$  function.

$$\begin{array}{ll}
\operatorname{reconstruct\_pack}(St, {}_sS_r^m) \triangleq & \text{if } \operatorname{mode}(s, St[s]) = \text{Ready2Send} \text{ then} \\
& \quad \text{let } St' = St[s := St[s][\text{state} := \delta_s(St[s])],\ r := St[r][\text{queue} := St[r].\text{queue}@[m]]] \\
& \quad \text{in } [(St, {}_sS_r^m, St')] \\
& \text{else if } St[s].\text{queue} \neq [] \text{ then} \\
& \quad \text{let } m' = hd(St[s].\text{queue}) \text{ in} \\
& \quad \text{let } St' = St[s := [\text{state} := \delta_s(St[s]),\ \text{queue} := tl(St[s].\text{queue})]] \\
& \quad \text{in } [(St, R_s^{m'}, St')]\ @\ \operatorname{reconstruct\_pack}(St', {}_sS_r^m) \\
& \text{else } []
\end{array}$$

$$\begin{array}{ll}
\operatorname{reconstruct\_run}(St,tr) \triangleq & \text{if } tr = [\,] \text{ then } [\,] \\
& \text{else let } p = \operatorname{reconstruct\_pack}(St,hd(tr)) \text{ in} \\
& \quad \text{if } p \neq [\,] \text{ then} \\
& \qquad \text{let } (\_,\_,St') = \operatorname{last}(p) \text{ in } p@\operatorname{reconstruct\_run}(St',tl(tr)) \\
& \quad \text{else } [\,]
\end{array}$$
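The two functions can be exercised on a small example. The Python sketch below follows their structure under stated assumptions: the two-peer ping/pong protocol, its state names, and the tables `MODE` and `DELTA` are hypothetical and are not the illustrative example of the paper; a global state maps each peer to a pair (control state, queue), and the local transition function is assumed to depend on the control state only.

```python
from typing import Dict, List, Tuple

Local = Tuple[str, Tuple[str, ...]]  # (control state, FIFO queue of messages)
Global = Dict[int, Local]
Send = Tuple[int, int, str]          # recorded label: (sender, receiver, message)

# Hypothetical protocol: peer 0 sends "ping", peer 1 answers "pong".
MODE = {(0, "start"): "Ready2Send", (0, "wait"): "Ready2Receive",
        (0, "done"): "Ready2Receive", (1, "idle"): "Ready2Receive",
        (1, "reply"): "Ready2Send", (1, "end"): "Ready2Receive"}
DELTA = {(0, "start"): "wait", (0, "wait"): "done",
         (1, "idle"): "reply", (1, "reply"): "end"}


def reconstruct_pack(st: Global, send: Send) -> List[tuple]:
    s, r, m = send
    state, queue = st[s]
    if MODE[(s, state)] == "Ready2Send":
        nxt = dict(st)
        nxt[s] = (DELTA[(s, state)], queue)
        r_state, r_queue = nxt[r]
        nxt[r] = (r_state, r_queue + (m,))       # enqueue m at the receiver
        return [(st, ("S", s, r, m), nxt)]
    if queue:
        nxt = dict(st)
        nxt[s] = (DELTA[(s, state)], queue[1:])  # consume the head of the queue
        return [(st, ("R", s, queue[0]), nxt)] + reconstruct_pack(nxt, send)
    return []


def reconstruct_run(st: Global, trace: List[Send]) -> List[tuple]:
    if not trace:
        return []
    pack = reconstruct_pack(st, trace[0])
    if not pack:
        return []
    return pack + reconstruct_run(pack[-1][2], trace[1:])


initial: Global = {0: ("start", ()), 1: ("idle", ())}
run = reconstruct_run(initial, [(0, 1, "ping"), (1, 0, "pong")])
labels = [label for (_, label, _) in run]
```

The receive of "ping" by peer 1 does not appear in the recorded trace; it is inserted because peer 1 must leave its receiving mode before it can send "pong". The final receive of "pong" by peer 0 is not reconstructed, since it would belong to the trail of the run.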

**Lemma 1 (reconstruct\_run).**

$$\forall\ r.\ r \in \mathcal{R}(i) \Rightarrow \text{reconstruct\_run}(i, \text{Trace}(r)) = \text{packed}(r)$$

From this lemma and Theorem 4, we deduce:

**Theorem 5 (replay algorithm).** The function reconstruct\_run defines a replay algorithm.

$$\text{replay}(\text{FIFO\_LTS}, \text{reconstruct\_run})$$

## 6 Related work

The seminal work related to our concerns is probably that of Mazurkiewicz [7]. Let us recall that basic trace theory is defined over an alphabet of actions  $\Sigma$  and a symmetric and irreflexive relation I over  $\Sigma$ , called the independence relation. Two elements of  $\Sigma^*$  are said to be equivalent if one can be obtained from the other by commuting two adjacent independent actions:

$$x \simeq y \equiv \exists u \ v. \ \exists (a, b) \in I. \ x = uabv \land y = ubav$$

In fact, a trace is an equivalence class with respect to the reflexive and transitive closure of the previous relation ( $\simeq$ ). With respect to our model, such a commutation exists for receive transitions occurring on different nodes; our normalization relies on this property. In addition, we also use semi-commutations [6,4], in which the independence relation is not symmetric: when a receive is followed by a send on a different node, we can swap these actions (but not the reverse, in general). This is the second property on which our normalization relies. To the best of our knowledge, the use of such tools for the reconstruction problem is new.
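The difference between commutation and semi-commutation can be illustrated on words. The Python sketch below is an illustration under stated assumptions: actions are single characters, and the relations `INDEP` and `SEMI` are hypothetical. With a symmetric relation, reachability by adjacent swaps coincides with trace equivalence; with a non-symmetric relation, it only describes one-way rewriting.

```python
from collections import deque
from typing import Iterator, Set, Tuple

Relation = Set[Tuple[str, str]]


def neighbours(word: str, rel: Relation) -> Iterator[str]:
    """Words obtained by rewriting one adjacent pair ab into ba, for (a, b) in rel."""
    for k in range(len(word) - 1):
        a, b = word[k], word[k + 1]
        if (a, b) in rel:
            yield word[:k] + b + a + word[k + 2:]


def reachable(x: str, y: str, rel: Relation) -> bool:
    """Breadth-first search over successive adjacent swaps."""
    if sorted(x) != sorted(y):
        return False
    seen, todo = {x}, deque([x])
    while todo:
        w = todo.popleft()
        if w == y:
            return True
        for v in neighbours(w, rel):
            if v not in seen:
                seen.add(v)
                todo.append(v)
    return False


# Hypothetical independence relation: a and b commute, c commutes with nothing.
INDEP: Relation = {("a", "b"), ("b", "a")}
# Hypothetical semi-commutation: a receive r followed by a send s may be swapped.
SEMI: Relation = {("r", "s")}

same_trace = reachable("abc", "bac", INDEP)
different_trace = reachable("abc", "acb", INDEP)
forward = reachable("rs", "sr", SEMI)
backward = reachable("sr", "rs", SEMI)
```

Here `forward` holds while `backward` does not, which mirrors the remark that a receive followed by a send on a different node can be swapped, but not the reverse in general.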

The basic references for FIFO protocols are [18,5]. As remarked in Section 2.3, the model studied in this paper does not consider channels. Preliminary investigations show that our algorithm can be extended to some variants of the basic model; we are currently studying such extensions. Finally, we mention the work of [3], which deals with realizability and synchronizability properties. Indeed, their work has been the starting point of our study: we reuse the asynchronous communication model with FIFO buffers as the semantic model. They show that such a model can be used for important application classes [9, 1].

## 7 Conclusion

In this paper, we have studied how to rebuild a full distributed computation from its partial observation. The study has been done at the semantic model level: first we have formalized the underlying distributed system protocol as a transition system, then we have proposed an algorithm for reconstructing a run given its trace. The correctness of the algorithm has been established with respect to a given definition of replay. Concerning the actual implementation context of the algorithm, we have suggested some basic ideas.

For our future work, we envision two directions. First, it would be interesting to study the replay problem for other models of distributed systems; we are especially interested in real-time distributed systems. Another direction that seems promising is to better understand the needs of high-level applications relying on run-time data [11], in order to provide them with appropriate knowledge.

#### References

1. J. Armstrong. Getting Erlang to talk to the outside world. In Proceedings of the 2002 ACM SIGPLAN Workshop on Erlang, ERLANG '02, pages 64–72, New York, NY, USA, 2002. ACM.
2. A. Arnold. Finite transition systems semantics of communicating systems. Prentice Hall international series in computer science. Prentice Hall, 1994.
3. S. Basu, T. Bultan, and M. Ouederni. Deciding choreography realizability. In J. Field and M. Hicks, editors, *POPL*, pages 191–202. ACM, 2012.
4. A. Bouajjani, A. Muscholl, and T. Touili. Permutation rewriting and algorithmic verification. *Information and Computation*, 205(2):199–224, 2007.
5. D. Brand and P. Zafiropulo. On communicating finite-state machines. J. ACM, 30(2):323–342, 1983.
6. M. Clerbout and M. Latteux. Semi-commutations. Inf. Comput., 73(1):59–74, 1987.
7. V. Diekert and G. Rozenberg, editors. The Book of Traces. World Scientific, Singapore, 1995.
8. E. Dijkstra and C. Scholten. Termination detection for diffusing computations. Information Processing Letters, 11(1):1–4, 1980.
9. M. Fähndrich, M. Aiken, C. Hawblitzel, O. Hodson, G. Hunt, J. R. Larus, and S. Levi. Language support for fast and reliable message-based communication in Singularity OS. SIGOPS Oper. Syst. Rev., 40(4):177–190, Apr. 2006.
10. W. Fokkink. Distributed Algorithms: An Intuitive Approach. MIT Press, 2013.
11. G. Gößler, D. L. Métayer, and J.-B. Raclet. Causality analysis in contract violation. In H. Barringer, Y. Falcone, B. Finkbeiner, K. Havelund, I. Lee, G. J. Pace, G. Rosu, O. Sokolsky, and N. Tillmann, editors, RV, volume 6418 of Lecture Notes in Computer Science, pages 270–284. Springer, 2010.
12. L. Lamport. Time, clocks and the ordering of events in a distributed system. CACM, 21(7):558–565, 1978.
13. L. Lamport. Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2002.
14. L. Lamport. Euclid writes an algorithm: A fairytale. *Int. J. Software and Informatics*, 5(1-2):7–20, 2011.
15. E. A. Lee and Y. Zhao. Reinventing computing for real time. In Proceedings of the 12th Monterey Conference on Reliable Systems on Unreliable Networked Platforms, pages 1–25, Berlin, Heidelberg, 2007. Springer-Verlag.
16. M. Leucker and C. Schallhart. A brief account of runtime verification. J. Log. Algebr. Program., 78(5):293–303, 2009.
17. N. A. Lynch. Distributed Algorithms. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1996.
18. G. von Bochmann. Finite state description of communication protocols. *Computer Networks*, 2:361–372, 1978.

# A Formal Model of SysML Blocks using CSP for Assured Systems Engineering

Jaco Jacobs and Andrew Simpson

Department of Computer Science, University of Oxford Wolfson Building, Parks Road Oxford OX1 3QD {jaco.jacobs, andrew.simpson}@cs.ox.ac.uk

Abstract. The Systems Modeling Language (SysML) is a semi-formal, visual modelling language used in the specification and design of systems. In this paper, we describe how Communicating Sequential Processes (CSP) and its associated refinement checker, Failures Divergences Refinement (FDR), give rise to an approach that facilitates the refinement checking of the behavioural consistency of SysML diagrams. We formalise the conjoined behaviour of key behavioural constructs (state machines and activities) within the context of SysML. Furthermore, blocks, the fundamental modelling construct of the SysML language, can be combined in a compositional approach to system specification. The use of a process-algebraic formalism enables us to explore the behaviour of the resulting composition more rigorously. We demonstrate how CSP, in conjunction with SysML, can be used in a formal top-down approach to systems engineering. A small case study validates the contribution.

## 1 Introduction

Accidents associated with complex systems are frequently the result of unforeseen interactions amongst components that all satisfy their individual requirements [1]. These component interaction accidents are increasingly common: state-of-the-art systems increasingly depend on other technologically advanced systems and interact in ways not foreseen or intended by the original designer. The Mars Polar Lander accident is one example of such a failure: both the landing legs and the control software of the descent engines functioned as specified by their respective behavioural specifications. The systems engineers, however, did not consider all the potential interactions between the landing legs and the control software of the descent engines [1].

The OMG's Systems Modeling Language (SysML) [2] is a graphical modelling notation used in the specification and integration of complex, large-scale systems. A keystone of this activity is ensuring that requirements, as imposed by the various stakeholders, are adequately captured and subsequently addressed when specifying a potential solution. The intention of SysML, thus, is to accurately specify intended component behaviour with the expectation of minimising interaction accidents. However, SysML is a semi-formal notation. If we are to carry out an extensive analysis of component interactions, more mathematical rigour is indispensable.

Reasoning about behaviour — in particular, the myriad of interactions between components — is a rather cumbersome activity for the human mind. In addition, our cognitive ability to cope with multiple, separate descriptions of behaviour, and ultimately fuse these into a unified interpretation, is rather limited. We need to augment our faculties with appropriate notations in order to effectively reason about such behaviours. Moreover, if we are going to utilise these notations in a meaningful fashion, we require mechanised tool support. Communicating Sequential Processes (CSP) [3] is one such notation, backed up by Failures Divergences Refinement (FDR) in the form of a refinement checker.

Activities and state machines are the core behavioural constructs used to ascribe behaviour to SysML blocks. The aforementioned constructs are frequently used in combination: activities are used to assign behavioural features that ought to execute in a particular state, or on a given transition [2]. In this paper, we provide a behavioural semantics for the conjoined behaviour of state machines and activities. In the past, there have been several contributions where the sole focus lay either with the formalisation of state machines or with that of activities. To the best of our knowledge, this paper is the first contribution that aims to provide a behavioural semantics encompassing both these formalisms.

At the structural level, SysML takes a compositional stance with regard to systems specification: a block can be composed of other blocks, which, in turn, might themselves consist of blocks. However, for the approach to be effective and useful, the behavioural conduct of these blocks needs to be specified in a consistent manner. Moreover, the approach needs to enable the modeller to sufficiently abstract away details irrelevant to a particular level of abstraction.

This paper is a companion of sorts to the work presented in [4]: it extends the formalisation of state machines to encompass entry, exit, and do behaviours modelled via activities. In doing so, a formal behavioural semantics is provided for activities, in terms of CSP.

The structure of the remainder of this paper is as follows. In Section 2, we provide a brief introduction to SysML. Section 3 outlines our process-algebraic approach to formalise SysML activities, state machines, and blocks. We show how CSP can be employed to analyse expositions composed of multiple, communicating state machine and activity constructs. In Section 4, we employ a small case study to illuminate and validate the contribution. Section 5 summarises the contributions of this paper, and places it in context with respect to other research.

## 2 Background

In this section, we give a necessarily brief introduction to SysML. We assume familiarity with CSP.

**Blocks** Blocks are the fundamental modelling constructs of SysML and provide the context in which behaviours execute. A *block* is often composed of other blocks, termed *parts*, each of which has its own associated behaviour. The classifier behaviour of a block can serve as an abstraction of the behaviours of its parts. Thus, the abstraction serves as a specification that the parts must realise: the parts must interact in such a way that their combined behaviour conforms to the abstraction. This interpretation also sits well with the concept of refinement and abstraction in CSP.

The classifier behaviour is the main behaviour of a block, and executes from the instant the instance is created until the point of destruction. The modelling construct most frequently used to represent the classifier behaviour is a state machine. In most systems engineering methodologies, activities are typically used as a complementary modelling notation to state machines: it is the behavioural formalism normally associated with the effect component of a transition; alternatively, it is used to model behaviours related to a particular state.

Typically, two block instances communicate using signal events. The initiating block sends a signal event to a target block. This signal event is defined as part of the supplementary behaviours — described using activities — associated with the initiating state machine: the entry or exit behaviours of the active state; or the effect component of the enabled transition. The receipt of the signal event in the target block may subsequently trigger a transition in its state machine. The approach described above is popular when modelling event-based systems.

A signal is a classifier that types the asynchronous messages that are communicated between blocks. Each signal optionally has an associated set of attributes which correspond to the parameters that make up the content of the message. A connector connects two or more parts or references. The connection formally allows the connected components to interact, although the connector does not characterise the nature of the interaction. Instead, the interaction is stipulated by the behaviours of the connected blocks.

**Activities** Activities allow the modeller to describe complex routes along which actions execute. These routes are termed *flows*. In SysML activities there are two types of flows: control flows and object flows.

Actions are the fundamental building blocks of activities and always execute within the context of an activity. An action accepts inputs and produces outputs. The flow of input and output items between actions is described using object flows. Control flows, on the other hand, impose additional constraints on the execution of actions. When a control flow connects one action to another, the target action cannot start until the source action has completed. Control nodes are used in the specification of control flow: they are used to impose control logic on the execution of actions. The control nodes are the fork, join, decision, merge, initial and final nodes.

Several types of actions exist: the *send signal event action* sends a signal event; the *receive signal event action* waits on the receipt of a particular signal event; and the *value specification action* allows the specification of a particular value to an input of an action. *Opaque actions* allow the specification of actions in a language external to SysML.

**State machines** State machines graphically depict state-dependent behaviour in terms of nodes and labelled edges: nodes represent states, whereas the edges correspond to transitions between states.

In SysML, a *state* is an abstraction of the mode that the owning block finds itself in. A change of state is effected by the arrival of a triggering event, causing an appropriate transition to fire. A *transition* consists of a trigger, a guard and an effect. The *trigger* denotes the event that serves as stimulus for the transition to fire; the *guard* is a conditional expression used to decide whether the transition is to fire at all; and the *effect* is a supplementary behaviour that executes on the transition.
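The interplay of trigger, guard and effect can be sketched as follows. The Python code is an illustration under simplifying assumptions, not SysML's normative semantics: the `Transition` record, the state and event names, and the rule that an event enabling no transition is simply discarded are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Context = Dict[str, int]


@dataclass
class Transition:
    source: str
    target: str
    trigger: str                        # event that stimulates the transition
    guard: Callable[[Context], bool]    # condition deciding whether it fires
    effect: Callable[[Context], None]   # behaviour executed on the transition


def fire(state: str, event: str, ctx: Context,
         transitions: List[Transition]) -> str:
    """Return the state reached after offering `event` in `state`."""
    for t in transitions:
        if t.source == state and t.trigger == event and t.guard(ctx):
            t.effect(ctx)
            return t.target
    return state  # assumption: an event enabling no transition is discarded


def count_start(ctx: Context) -> None:
    ctx["starts"] += 1


transitions = [
    Transition("Idle", "Running", "start", lambda c: c["fuel"] > 0, count_start),
]
ctx = {"fuel": 1, "starts": 0}
after_start = fire("Idle", "start", ctx, transitions)
after_stop = fire("Idle", "stop", ctx, transitions)
blocked = fire("Idle", "start", {"fuel": 0, "starts": 0}, transitions)
```

The transition fires only when its trigger arrives and its guard holds; in the last call the guard fails, so the block stays in its current state and the effect does not run.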

## 3 A CSP view of SysML blocks

This section outlines an approach to integrate the semi-formal SysML notation with the process algebra CSP. In order to define a formal semantics for blocks, parts and state machines, we need a precise description of their syntax. To this end, we define simple mathematical constructs that are closely related to the syntactical structure of their corresponding SysML counterparts.

**Activities** Broadly speaking, our approach maps every node and every edge in an activity diagram to a CSP process. We restrict actions to have either a single outgoing control flow or a single outgoing object flow, but not both; our semantics allows for simple forks and joins in the sense that a fork node splits control into multiple flows that eventually all end in a corresponding join node. We present the formalisation as it relates to a single activity A;  $\mathcal{A}$  denotes the set containing all activities in our universe of discourse.

An activity  $A \in \mathcal{A}$  consists of a finite collection of nodes, denoted  $N_A$ , and edges between those nodes, denoted  $E_A$ . We partition  $N_A$  such that  $N_A^I$  represents the set of initial nodes,  $N_A^F$  the set of final nodes,  $N_A^{FK}$  the set of fork nodes,  $N_A^{JN}$  the set of join nodes,  $N_A^{SS}$  the send signal event actions,  $N_A^{RS}$  the receive signal event actions,  $N_A^O$  the opaque actions, and  $N_A^{PN}$  the set of activity parameter nodes. The edges are partitioned such that  $E_A^{OF}$  represents the object flows, and  $E_A^{CF}$  represents the set of control flows.

We define the following functions, to return for a particular flow  $f \in E_A$ : the source node,  $source: E_A \to N_A$ ; and the target node,  $target: E_A \to N_A$ . Additionally, we define functions to return for a particular node  $n \in N_A$ : the set of outgoing control flows,  $outgoing_{cf}: N_A \to \mathbb{P} E_A^{CF}$ ; and the outgoing object flow,  $outgoing_{of}: N_A \to E_A^{OF}$ . Assume that the construction name(n) returns the name of the send or receive signal event, or opaque action, for  $n \in N_A^{SS} \cup N_A^{RS} \cup N_A^O$ .

The formalisation makes use of a mapping function  $\mathcal{F}$ . In particular,  $\mathcal{F}(A,c)$  is the process modelling the construct c, either an edge or a node, of activity A.

Activity parameter node. An activity parameter node  $n \in N_A^{PN}$ , models a parameter, p, that can be used within the context of the activity. In CSP, the node is modelled as an argument to the process modelling the activity. Diagrammatically, an object flow  $of \in E_A^{OF}$  connects the parameter node with other nodes that use this as a parameter. For the purpose of this paper we assume that a single argument is represented by each activity parameter node that serve as input to the activity. The activity's behaviour starts as the process modelling the initial node  $n_0 \in N_A^I$ 

$$A(p) = \mathbf{let}\ \mathcal{F}(A, n_0) = \dots\ \mathbf{within}\ \mathcal{F}(A, n_0)$$

An activity without a parameter is modelled similarly, but the process parameter p is elided.

Control flow edge. A control flow  $cf \in E_A^{CF}$  can be thought of as a CSP process. The behaviour of this process is dependent on the target node of the control flow, given by target(cf). If the target is not a join node, i.e.  $target(cf) \notin N_A^{JN}$ , the process simply designates its behaviour to be that of the target node.

$$\mathcal{F}(A,cf) = \begin{cases} \mathcal{F}(A,target(cf)) & \text{if } target(cf) \notin N_A^{JN} \\ Join(cf) & \text{otherwise} \end{cases}$$

In the case where  $target(cf) \in N_A^{JN}$ , there will be, based on our assumption of activities above, k-1 other control flows which terminate in the same join node. Let the control flows be  $cf_0 ... cf_{k-1}$ . Exactly one of the control flows,  $cf_0$ , will exhibit the behaviour of the join node.

$$Join(e) = \begin{cases} join \rightarrow Skip & \text{if } e \neq cf_0 \\ join \rightarrow \mathcal{F}(A, target(e)) & \text{otherwise} \end{cases}$$

The above construction ensures that exactly one of the previously forked flows continues after the join. Many interpretations of activity diagrams assume control flows to have associated guards, typically expressed in natural language. Natural-language guards are not suitable for a precise behavioural semantics and are thus excluded.

Object flow edge. An object flow  $of \in E_A^{OF}$  is used to model the passing of parameters<sup>1</sup> between activity parameter nodes, call behaviour actions or send and receive signal events. The behaviour of an object flow edge is a parametrised process that takes as input the value of the argument, say p, passed along the object flow. Throughout, process arguments are placed within square brackets to denote them as such.

<sup>1</sup> We restrict ourselves to signal parameters here, although in SysML these can be any classifier that can serve as an input to an activity.

$$\mathcal{F}(A, of)[p] = \mathcal{F}(A, target(of))[p]$$

Initial node. An initial node  $n \in N_A^I$  has a single outgoing edge, a control flow  $cf \in outgoing_{cf}(n)$ . The process behaves like the control flow edge emanating from the initial node.

$$\mathcal{F}(A, n) = \mathcal{F}(A, cf)$$

Send signal event action. A send signal event action  $n_1 \in N_A^{SS}$  has a single outgoing control flow  $cf \in outgoing_{cf}(n_1)$ .

$$\mathcal{F}(A, n_1) = name(n_1) \to \mathcal{F}(A, cf)$$

Optionally, an incoming object flow $of$ is possible, which serves as input to the send signal event action, and models the parameters sent as part of the send signal event. In our semantics, the object flow $of$, if present, emanates from an activity parameter node  $n_2 \in N_A^{PN}$  and terminates on send signal event<sup>2</sup> node  $n_1$<sup>3</sup>. The construction  $par(n_2)$  is the parameter available within the context of the owning activity (defined within the let-within construct).

$$\mathcal{F}(A, n_1) = name(n_1).par(n_2) \to \mathcal{F}(A, cf)$$

Alternatively, the send signal event has a single incoming object flow, but no incoming control flow. In this case the process modelling the send signal event action would have an input argument, p, passed from the process modelling the object flow. The outgoing control flow is given by  $cf \in outgoing_{cf}(n_1)$ . The formalisation follows.

$$\mathcal{F}(A, n_1)[p] = name(n_1).p \to \mathcal{F}(A, cf)$$

The above models the case where the parameter comes from: an object flow emanating from a value specification action; the output of an opaque action; or the output of a receive signal event action.

Receive signal event action. A receive signal event action  $n \in N_A^{RS}$  has a single outgoing control flow  $cf \in outgoing_{cf}(n)$ . Note that it is not possible to have an outgoing object flow if an outgoing control flow is present.

$$\mathcal{F}(A, n) = name(n) \to \mathcal{F}(A, cf)$$

Alternatively, the receive signal event may be passed a parameter as part of the event. In this case it is conceivable that an object flow will exit the action. The formalisation follows.

$$\mathcal{F}(A, n) = name(n)?p \rightarrow \mathcal{F}(A, outgoing_{of}(n))[p]$$

The input p on the CSP channel corresponds to the parameter passed as part of the receive signal event.

<sup>2</sup> A value specification action, rather than an activity parameter node, connected via an object flow, can be used for constants.

<sup>3</sup> Note that an incoming control flow is still present and also terminates on  $n_1$.

Final node. A final node  $n \in N_A^F$  has no outgoing edges. It is trivially modelled as the CSP Skip process.

$$\mathcal{F}(A, n) = Skip$$

Fork node. A fork node  $n \in N_A^{FK}$  splits the control flow in k parallel flows  $cf_0 \dots cf_{k-1}$ .

$$\mathcal{F}(A, n) = [|join|] j : outgoing_{cf}(n) \bullet \mathcal{F}(A, j)$$

The above alphabetised indexed parallel construction ensures that all the different threads of control only synchronise on the *join* event; all other events are interleaved.

Join node. A join node  $n \in N_A^{JN}$  synchronises k parallel control flows and has a single outgoing control flow  $cf \in outgoing_{cf}(n)$ .

$$\mathcal{F}(A, n) = \mathcal{F}(A, cf)$$
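As an illustration only, the control-flow part of the mapping $\mathcal{F}$ can be sketched in Python as a function that produces machine-readable CSP text. The class `Activity`, the functions `translate_node` and `translate_flow`, the choice of the first listed incoming flow as $cf_0$, and the example node names are all assumptions made for this sketch; object flows and parameters are not covered.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """A tiny activity restricted to control flows (object flows are omitted)."""
    kinds: dict   # node id -> "initial" | "send" | "receive" | "opaque" | "fork" | "join" | "final"
    names: dict   # node id -> event name, i.e. name(n)
    flows: list   # control flows as (source, target) pairs

    def outgoing_cf(self, node):
        return [cf for cf in self.flows if cf[0] == node]


def translate_node(act, n):
    """Sketch of F(A, n) for nodes, returned as CSP text."""
    kind = act.kinds[n]
    if kind == "final":
        return "SKIP"
    out = act.outgoing_cf(n)
    if kind == "fork":
        # All branches synchronise on the join event only.
        branches = [translate_flow(act, cf) for cf in out]
        return "(" + " [|{join}|] ".join(f"({b})" for b in branches) + ")"
    (cf,) = out  # every other node kind has exactly one outgoing control flow
    if kind in ("initial", "join"):
        return translate_flow(act, cf)
    return f"{act.names[n]} -> {translate_flow(act, cf)}"


def translate_flow(act, cf):
    """Sketch of F(A, cf), including the Join(e) case."""
    target = cf[1]
    if act.kinds[target] != "join":
        return translate_node(act, target)
    incoming = [f for f in act.flows if f[1] == target]
    if cf == incoming[0]:  # assumption: cf_0 is the first listed incoming flow
        return f"join -> {translate_node(act, target)}"
    return "join -> SKIP"


# Hypothetical activity: initial -> fork -> (a | b) -> join -> c -> final
example = Activity(
    kinds={"i": "initial", "fk": "fork", "a": "opaque", "b": "opaque",
           "jn": "join", "c": "opaque", "f": "final"},
    names={"a": "a", "b": "b", "c": "c"},
    flows=[("i", "fk"), ("fk", "a"), ("fk", "b"),
           ("a", "jn"), ("b", "jn"), ("jn", "c"), ("c", "f")],
)
csp_text = translate_node(example, "i")
```

For the hypothetical activity above, only the branch ending in the first incoming flow of the join node continues with `c`; the other branch terminates after the `join` event, mirroring the $Join(e)$ construction.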

**State machines** This paper is a companion of sorts to the work presented in [4]: it extends the formalisation of state machines to encompass entry, exit, and do behaviours modelled via activities. This hybrid approach is typical of most systems engineering methodologies used in practice today. In addition, as the activities execute within the context of an owning state machine, the run-to-completion execution semantics of state machines are applicable. We briefly reprise the necessary mathematical structures and CSP descriptions of [4] to ensure this paper is self-contained. We restrict ourselves to non-hierarchical state machines and ignore guard conditions on transitions in order to simplify the presentation here. The interested reader can refer to [4] for an account of more complex state machines.

A state machine  $M \in \mathcal{M}$  consists of a finite set of states, denoted  $S_M$ , and transitions between those states, denoted  $T_M$ . We partition  $S_M$  such that  $S_M^I$  represents the set of initial states,  $S_M^F$  the set of final states, and  $S_M^S$  the set of simple states. A function $outgoing: S_M \to \mathbb{P}\, T_M$ returns the set of outgoing transitions for a given state.

We define the following functions, to return for a transition  $t \in T_M$ : the source state, source :  $T_M \to S_M$ ; the target state, target :  $T_M \to S_M$ ; the trigger, trigger :  $T_M \to \mathcal{S}$ ; and the effect, given by effect :  $T_M \to \mathcal{A}$ .  $\mathcal{S}$  is the set of signals.

The entry and exit behaviours of a particular state are given by the following functions:  $entry: S_M \to \mathcal{A}$ ; and  $exit: S_M \to \mathcal{A}$ . In each case, an activity modelling the behaviour is returned.

A mapping function  $\mathcal{F}$  is used to formalise the behaviour;  $\mathcal{F}(M, s)$  is a process that describes the behaviour of M in state s.

Initial state. An initial state  $s \in S_M^I$  has a single outgoing transition t that defines its unique starting point. Optionally, an effect component can be specified for the transition using an activity  $A \in \mathcal{A}$ . In the following: effect(t) returns a behaviour specified via an activity; similarly, entry(target(t)) returns the entry behaviour of the target state specified via an activity.

$$\mathcal{F}(M,s) = effect(t) \mathbin{;} entry(target(t)) \mathbin{;} \mathcal{F}(M,target(t))$$

Simple state. The CSP channel local is used for communicating with the event queue of the state machine M. The arrival of a SysML signal event serves as the trigger; consequently this is made available as a CSP event. If the signal signature has a data component associated with it, this is made available as an input along with the channel modelling the event<sup>4</sup>.

We need to consider the eventuality where the state machine receives a signal event not expected in the current state s. Here, the state machine discards the unexpected event. In the following, assume that unexpected(s) returns the set of unexpected events for state s (receive signal events that are valid in other states of  $S_M$  but not in s). The components proc and disc denote the event being processed and discarded, respectively. In both cases, it is removed from the event queue.

$$\begin{array}{l} \mathcal{F}(M,s) = \\ \quad \Box\ t : outgoing(s) \bullet local.proc.trigger(t) \rightarrow \\ \qquad exit(s) \mathbin{;} effect(t) \mathbin{;} entry(target(t)) \mathbin{;} \mathcal{F}(M,target(t)) \\ \quad \Box \\ \quad \Box\ t : unexpected(s) \bullet local.disc.trigger(t) \rightarrow \mathcal{F}(M,s) \end{array}$$

Final state. Consider a final state  $s \in S_M^F$ . A final state has no outgoing transitions and is trivially modelled as the deadlocked process.

$$\mathcal{F}(M,s) = Stop$$

Event queue. The state machine as a whole is modelled with a single process that contains all the localised process descriptions defined above. The overall structure is similar to that given by Davies and Crichton [5]. The state machine receives all communications through an event queue, modelled as a CSP buffer of size 1. It communicates with this buffer on a CSP channel, *local*. Each of the localised processes has access to this channel in order to receive communications from the event queue. The overall process M(queue, local) initially behaves as the process associated with the initial state  $\mathcal{F}(M, s_0)$ . Throughout, the state machine behaves like the various processes until it possibly reaches a final state, after which it behaves as  $\mathcal{F}(M, s_f)$ . The local process EQ models the event queue. Here, we assume a queue with a maximum capacity of 1; the queue blocks when full. The datatype Dispatched, communicated along with the event on channel *local*, models the dispatching of an event: an event can either be processed, *proc*, or, if the state machine is in a state where the dispatched event is not expected, discarded, *disc*.

<sup>4</sup> Next, the guard (if it exists) is evaluated and, if false, the event is discarded without effect. Conversely, if the guard evaluates to true, the behavioural constructs specified for the effect are executed before behaving as the process associated with the destination state. Guards are omitted in this paper due to space restrictions.

$$\begin{array}{l} M(queue, local) = \\ \quad \mathbf{let} \\ \qquad \mathcal{F}(M, s_0) = \dots \\ \qquad \dots \\ \qquad \mathcal{F}(M, s_f) = Stop \\ \qquad EQ = queue?e \rightarrow local?p!e \rightarrow EQ \\ \quad \mathbf{within} \\ \qquad \mathcal{F}(M, s_0)\ [|\, \{|\, local \,|\} \,|]\ EQ \end{array}$$
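The interplay between the event queue and the localised state processes can be illustrated with a small Python simulation. This is only a sketch under simplifying assumptions: the function `run_state_machine`, the transition table and the state names are hypothetical, exit, effect and entry behaviours are ignored, and any event without a matching transition in the current state is treated as unexpected.

```python
def run_state_machine(transitions, initial, finals, events):
    """Dispatch events one at a time, as a queue of capacity 1 would.

    transitions maps (state, trigger) to the target state. The returned log
    records each dispatch as local.proc.<event> or local.disc.<event>.
    """
    state, log = initial, []
    for event in events:
        if state in finals:
            break  # a final state behaves as Stop: nothing more is dispatched
        if (state, event) in transitions:
            log.append(f"local.proc.{event}")   # event is processed
            state = transitions[(state, event)]
        else:
            log.append(f"local.disc.{event}")   # unexpected event is discarded
    return state, log


# Hypothetical two-state machine used only to exercise the sketch.
toggle = {("off", "On"): "on", ("on", "Off"): "off"}
final_state, dispatch_log = run_state_machine(toggle, "off", set(), ["On", "On", "Off"])
```

In this run the second `On` arrives while the machine is already in state `on`, so it is removed from the queue with the tag `disc` and the state is unchanged.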

The state machine of a block  $B_i$  only receives (through its event queue) the provided receptions. The required features are communicated across the connectors linking parts. In our formalisation, the name of the part is used as the channel name.

**Blocks** The formalisation above additionally allows us to showcase how CSP can be used in a compositional approach to specification and refinement within the context of systems engineering.

Assume a block  $B_i \in \mathcal{B}$  composed of K constituent blocks  $B_0 \dots B_{K-1}$ , where  $i \geq K$ . We know that the aggregate behaviour exhibited by blocks  $B_0 \dots B_{K-1}$  must adhere to that of the composite block  $B_i$ ;  $B_i$  is an abstract specification block that the more concrete implementation blocks  $B_0 \dots B_{K-1}$  must implement. Stated in terms of CSP: the characteristic process of  $B_i$  serves as the specification process and  $B_0 \dots B_{K-1}$ , suitably combined using parallel composition, form the implementation process.

Assume that classifier(B) represents the classifier behaviour of a SysML block. Using CSP the conformance of the implementation process to that of the specification can be stated thus.

$$classifier(B_i) \sqsubseteq\ \parallel P : \{B_0 \dots B_{K-1}\} \bullet classifier(P)$$

Events introduced at the lower level of implementation are excluded from the above observation; the hiding operator of CSP can be used to conceal such events.
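To make the idea of refinement with hiding concrete, the following Python sketch checks bounded trace refinement between two small labelled transition systems. It is a deliberately simplified stand-in: FDR also supports richer semantic models (failures and failures-divergences), whereas this sketch compares traces only, up to a fixed number of transitions. The functions `traces` and `trace_refines` and all state and event names are hypothetical.

```python
def traces(lts, start, depth, hidden=frozenset()):
    """Visible traces of every path with at most `depth` transitions.

    lts maps a state to a list of (event, target) pairs; events in `hidden`
    are performed but do not appear in the resulting traces.
    """
    result = {()}
    frontier = {(start, ())}
    for _ in range(depth):
        next_frontier = set()
        for state, trace in frontier:
            for event, target in lts.get(state, []):
                new_trace = trace if event in hidden else trace + (event,)
                next_frontier.add((target, new_trace))
                result.add(new_trace)
        frontier = next_frontier
    return result


def trace_refines(spec, spec_start, impl, impl_start, depth, hidden=frozenset()):
    """True if every bounded visible trace of impl is also a trace of spec."""
    return traces(impl, impl_start, depth, hidden) <= traces(spec, spec_start, depth)


# Hypothetical abstract block and a more detailed implementation of it.
spec = {"S0": [("PickUp", "S1")], "S1": [("Done", "S0")]}
impl = {"I0": [("PickUp", "I1")], "I1": [("motorOn", "I2")],
        "I2": [("motorOff", "I3")], "I3": [("Done", "I0")]}
holds = trace_refines(spec, "S0", impl, "I0", 8, hidden={"motorOn", "motorOff"})
```

Once the implementation-level events `motorOn` and `motorOff` are hidden, the implementation exhibits only traces that the abstract specification allows.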

Using this approach, and assuming the refinement holds,  $B_i$  can be safely substituted for the concrete composition  $B_0 \dots B_{K-1}$ . This stepwise, compositional approach to systems specification and design sits well with CSP's approach to refinement. This statement is not necessarily true for conventional model checkers that rely on temporal logics to assert safety or liveness properties. In a system of systems,  $B_i$ , previously our system of interest, is now just a component block representing one of the subsystems.

## 4 A robotic arm

In this section we apply the concepts central to our methodology to an illustrative case study. We study a single component, a robotic arm, of a fully fledged case study that is well known in the formal methods community. The production cell is an industrial installation of a metal processing plant located in Karlsruhe, Germany [6]. However, in the interest of brevity and clarity, we consider the arm as our system of interest. The arm is one subsystem of the travelling crane, which is yet another component of the much bigger system, the production cell.

Fig. 1. The block definition and internal block diagrams of the arm system.

Fig. 2. The state machine diagrams of the arm system.

A bidirectional motor can operate in two opposing directions. An electromagnet can activate or deactivate a magnetic field using an electric current. A potentiometer provides a value within certain limits so as to indicate the range of extension.

The arm is equipped with a bidirectional motor responsible for vertical extension. An electromagnet is placed at the front of the arm for handling metal objects; a potentiometer is present to indicate the range of extension of the arm.

Refer to Figure 1. The structural aspects of the system are modelled using blocks for the controller, bidirectional motor, electromagnet, and the potentiometer; signals and enumeration definitions further illuminate the design by introducing the messages and associated parameters communicated between state machines and activities.

Figures 2 and 3 show the state machines and activities of the arm system.

The channels used by the state machine of the bidirectional motor can be defined thus. The Direction enumeration of Figure 1 can be represented with a CSP datatype. Channel and datatype definitions for other state machines are similar.

```
datatype Dispatched = proc | disc
datatype Direction = fwd | rev
datatype BDMotorSignal = BDMotorOn.Direction | BDMotorOff
channel bdmotor : BDMotorSignal
channel bdmotorlocal : Dispatched.BDMotorSignal
```

In the above, the channel *bdmotor* is used by other state machines to communicate with the state machine of the bidirectional motor via its associated event queue; the channel *bdmotorlocal* is used by the event queue of the bidirectional motor to dispatch events (to the bidirectional motor's state machine) for processing.
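The set of events induced by these declarations can be enumerated mechanically. The Python sketch below is only an illustration; the helper `productions` is a hypothetical name for a function computing the events of a channel, written $\{|\,c\,|\}$ in CSP.

```python
from itertools import product

# The CSP datatypes from the text, written as plain Python tuples and lists.
DISPATCHED = ("proc", "disc")
DIRECTION = ("fwd", "rev")
# BDMotorSignal = BDMotorOn.Direction | BDMotorOff
BD_MOTOR_SIGNAL = [f"BDMotorOn.{d}" for d in DIRECTION] + ["BDMotorOff"]


def productions(channel, *components):
    """Return every event the channel can carry, as dotted strings."""
    return {".".join((channel, *parts)) for parts in product(*components)}


bdmotor_events = productions("bdmotor", BD_MOTOR_SIGNAL)
bdmotorlocal_events = productions("bdmotorlocal", DISPATCHED, BD_MOTOR_SIGNAL)
```

The channel *bdmotor* carries three distinct events, and *bdmotorlocal* carries six, because each signal can be dispatched either as *proc* or as *disc*.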

The CSP process modelling the characteristic behaviour of the Controller follows. The activity Extend is associated with the effect component of the transitions emanating from the idle state; the activity Magnetise represents the entry behaviour of the grasp state. CSP datatype definitions are used to type the provided receptions of the Controller block; these serve as triggers for the classifying state machine. The name of the instance is used as the channel name when communicating with a state machine; a channel with the same name and the suffix local is used to model the internal event queue of the corresponding state machine.

```
Controller(queue, local) =
  let
    I_0 = IDLE
    IDLE =
      local.proc.Grasp?e →
        Extend(local, e) ; Magnetise ; GRASP
      □
      local.proc.Drop?e →
        Extend(local, e) ; Demagnetise ; DROP
      □
      local.disc?e : {|OnPD|} → IDLE
    GRASP =
      Retract(local) ; IDLE
      □
      local.disc?e : {|Grasp, Drop, OnPD|} → GRASP
    DROP = …
    EQ = queue?e → local?p!e → EQ
  within
    I_0 [|{|local|}|] EQ

CONTROLLER = Controller(controller, controllerlocal)

αCONTROLLER =
  Union({{|controller, controllerlocal|},
         αMagnetise, αDemagnetise, αExtend, αRetract})
```

The processes *Magnetise* and *Extend*, modelling the activities used in the *CONTROLLER* process, follow. The event queue is passed in as the activity executes within the context of its owning state machine.

```
Magnetise =
  let
    I_0 = SS_0
    SS_0 = magnet.MagnetOn → F_0
    F_0 = Skip
  within
    I_0

αMagnetise = {| magnet.MagnetOn |}

Extend(local, pd) =
  let
    I_0 = VS_0
    VS_0 = SS_0(fwd)
    SS_0(o) = bdmotor.BDMotorOn.o → SS_1
    SS_1 = pdmeter.NotifyPD.pd → RS_0
    RS_0 =
      local.proc.OnPD → SS_2
      □
      local.disc?ev : {|Grasp, Drop|} → RS_0
    SS_2 = bdmotor.BDMotorOff → F_0
    F_0 = Skip
  within
    I_0

αExtend =
  {| bdmotor.BDMotorOn.fwd, bdmotor.BDMotorOff,
     pdmeter.NotifyPD |}
```

The processes, along with their respective alphabets, denoting concrete parts for the magnet, bidirectional motor and potentiometer can be similarly defined, but are excluded here due to space constraints. Activities and alphabets used within these state machines can also be similarly defined.

```
MAGNET = Magnet(magnet, magnetlocal)
BDMOTOR = BDMotor(bdmotor, bdmotorlocal)
PDMETER = PDMeter(pdmeter, pdmeterlocal)
```

The definition of the process ARM, modelling the abstract block that serves as the specification that the parts must realise, follows.

```
Arm(queue, local) =
  let
    I_0 = READY
    READY = …
    BUSY = …
      SetReady ; READY
      □
      local.disc?e : {|PickUp, PutDown|} → BUSY
    EQ = queue?e → local?p!e → EQ
  within
    I_0 [|{|local|}|] EQ

ARM = Arm(arm, armlocal)

αARM = …
  Union({{|arm, armlocal|}, αSetReady})
```

Assuming that  $P = \{CONTROLLER, MAGNET, BDMOTOR, PDMETER\}$  we then have  $CONCRETE = \|\, p : P \bullet [\alpha p]\, p$ . Here,  $\alpha p$  denotes the set of events communicable by $p$. The set of processes $P$ represents the concrete implementation blocks whose conjoined behaviour must be that of the block arm that serves as its specification. The similarity with CSP here is striking: refinement in CSP is expressed between specification and implementation processes.

 $CONCRETE^R$  is the process with events suitably renamed to ensure compatible alphabets.

$$\begin{array}{l} CONCRETE^R = \\ \quad CONCRETE[\ controller.Grasp.pd_0 \leftarrow arm.PickUp.pd_0, \\ \qquad controller.Drop.pd_0 \leftarrow arm.PutDown.pd_0, \\ \qquad controller.Grasp.pd_1 \leftarrow arm.PickUp.pd_1 \ldots] \end{array}$$

The set *Hidden* contains those events not present in the alphabet of the abstract specification process ARM;  $\Sigma$  denotes the set of all CSP events within the context of the specification. Thus

$$\begin{array}{l} Hidden = \Sigma \setminus \{|\ arm.PickUp,\ arm.PutDown, \\ \qquad armlocal.proc.PickUp,\ armlocal.proc.PutDown, \\ \qquad armlocal.disc.PickUp,\ armlocal.disc.PutDown, \\ \qquad client\ |\} \end{array}$$

FDR verifies the assertion

$$ARM \sqsubseteq CONCRETE^R \setminus Hidden \qquad [\sqsubseteq \text{ holds}]$$

Given that the refinement holds, ARM can be substituted for its parts in the complete system: the behaviour of the concrete implementation processes, denoted by CONCRETE, can neither refuse nor accept an event unless ARM can do so as well. Stated another way, the characteristic behaviour of CONCRETE is completely contained within that of ARM. The compositional approach presented above is effective in alleviating the state space explosion problem: subsystems can be developed and formally verified in isolation and subsequently combined to form an integrated system description.



Fig. 3. The activity diagrams of the arm system.

#### 5 Conclusions

There is a wealth of literature on the formalisation of activity and state machine diagrams, primarily within the context of UML. In order to limit the scope we only report on approaches that utilise CSP.

Ng and Butler [7] proposed the formalisation of UML state machine diagrams using CSP as the semantic domain. They define the translation in terms of a mapping function from structural diagrammatic constructs to their CSP counterparts. The work of Yeung and colleagues [8] built on that of Ng and Butler by generalising inter-level transitions.

Xu et al. [9] formalised activity diagrams in CSP. A transformation function is defined that maps the mathematical representation of an activity to the semantic domain of CSP. The focus in [9] is on providing a formal semantics for activities in terms of CSP, rather than checking behavioural conformance. Only a limited number of diagrammatic constructs are considered and object flows are omitted. Constructs such as send and receive event actions are not addressed.

Our work differs from the aforementioned contributions in a number of ways. This paper presents a compositional approach to refinement and specification, evaluated within the context of SysML. In addition, we consider the behaviour of several interacting state machines, supplemented with behaviours described via activities. In contrast, previous approaches placed emphasis on the formalisation of a single state machine (or activity); considering the execution semantics in terms of interaction with other state machines (or activities) was not their primary focus.

The choice of CSP is due to a number of factors. The behavioural aspects of SysML can be modelled naturally by a process-algebraic formalism such as CSP, resulting in a formal framework where assertions about requirements can be proved or refuted with relative ease [4]. CSP's approach to process composition, combined with the fact that refinement is preserved within context, would allow us to decompose a complex design of a system (or system of systems) in such a way that the automated analysis is computationally feasible. In particular, the decompositional approach to specification, as illuminated by the case study in Section 4, allows us to substitute a collection of blocks with a single block that depicts the intended behaviour of the whole. Furthermore, CSP's approach to establishing refinement, by comparing the behaviour of a characteristic specification process to that of a concrete implementation process, coincides with SysML's compositional outlook to specification and the notion that a block can act as a specification of constituent blocks. In contrast, in conventional model checking approaches where there is no concept of refinement, this distinction is less clear. The above approach is mechanisable via a model-to-model (SysML meta-model to CSP meta-model) and subsequent model-to-text (machine-readable CSP) transformation. Details of an implementation have been omitted due to space constraints.

#### References

- 1. Leveson, N.G.: Engineering a Safer World: Systems Thinking Applied to Safety. MIT Press (2012)
- 2. Object Management Group: Systems Modeling Language Specification, version 1.3. (2012) Available at: http://www.omg.org/spec/SysML/1.3, [2014, March].
- 3. Hoare, C.A.R.: Communicating Sequential Processes. Prentice Hall (1985)
- 4. Jacobs, J., Simpson, A.: Towards a process algebra framework for supporting behavioural consistency and requirements traceability in SysML. In: Proceedings of the 15th International Conference on Formal Engineering Methods (ICFEM 2013). Volume 8144 of Lecture Notes in Computer Science. Springer (2013) 266–281
- 5. Davies, J.W.M., Crichton, C.R.: Concurrency and refinement in the Unified Modeling Language. Electronic Notes in Theoretical Computer Science 70(3) (2002) 217–243
- 6. Lewerentz, C., Lindner, T.: Case study Production Cell. In: Formal Development of Reactive Systems. Volume 891 of Lecture Notes in Computer Science. Springer (1995)
- 7. Ng, M.Y., Butler, M.: Towards formalizing UML state diagrams in CSP. In: Proceedings of the 1st International Conference on Software Engineering and Formal Methods (SEFM 2003), IEEE (2003) 138–147
- 8. Yeung, W.L., Leung, K.R.P.H., Dong, W., Wang, J.: Improvements towards formalizing UML state diagrams in CSP. In: Proceedings of the 12th Asia-Pacific Software Engineering Conference (APSEC 2005), IEEE (2005) 176–182
- 9. Xu, D., Philbert, N., Liu, Z., Liu, W.: Towards formalizing UML activity diagrams in CSP. In: Proceedings of the 2008 International Symposium on Computer Science and Computational Technology (ISCSCT 2008), IEEE (2008) 450–453

# Checking Integral Real-time Automata for Extended Linear Duration Invariants

Changil Choe<sup>1</sup>, Univan Ahn<sup>2</sup> and Song Han<sup>3</sup>

- <sup>1</sup> Faculty of Mathematics, Kim Il Sung University, D.P.R.K mathcci@yahoo.com
- <sup>2</sup> Faculty of Physics, Kim Il Sung University, D.P.R.K univan.ahn@gmail.com
- <sup>3</sup> Faculty of Mathematics, Kim Il Sung University, D.P.R.K mathsonghan@yahoo.com

Abstract. Linear duration invariants are important safety properties of real-time systems. They are represented as linear inequalities of integrated durations of system states, which form a decidable subclass of Duration Calculus formulas. The model checking for linear duration invariants was first considered on the real-time automata and later it was extended to the timed automata. The problem of whether a real-time automaton satisfies a linear duration invariant was transformed into a finite number of linear programming problems. In this paper, extended linear duration invariants which are linear inequalities of integrals of physical quantities that characterize real-time systems are introduced. The semantics of extended linear duration invariants is defined by introducing integral real-time automata whose states are labeled with a finite number of integrable functions. The problem of checking an integral real-time automaton for an extended linear duration invariant is transformed into a finite number of nonlinear programming problems which can be solved easily. A case study of a reaction tank is discussed to demonstrate the effectiveness of the technique introduced in the paper.

**Keywords:** real-time system, real-time automaton, linear duration invariant, integral real-time automaton, extended linear duration invariant

#### 1 Introduction

Duration Calculus (abbreviated to DC) represents a logical approach to the formal design of real-time systems [1]. DC uses durations of states over time intervals to specify and reason about the real-time behavior of software embedded systems. The duration of a state in a time interval is the total presence time of the state in the interval. Linear constraints on the durations of system states form a class of important properties of real-time systems. This class, named linear duration invariants, was first introduced in [2].

A linear duration invariant is a DC formula of the form

$$c_{min} \le \ell \le c_{max} \to \sum_{i=1}^{n} c_i \int s_i \le C,$$

where  $c_{min}$ ,  $c_{max}$ ,  $c_i$   $(1 \le i \le n)$  and C are real numbers, and  $s_i$   $(1 \le i \le n)$  are states of the system. A linear duration invariant means that for any observation time interval, if the length  $\ell$  of the interval satisfies the constraint  $c_{min} \le \ell \le c_{max}$  then the durations of the system states over that interval should satisfy the constraint  $\sum_{i=1}^{n} c_i \int s_i \le C$ . Many desired properties of real-time systems are represented as linear duration invariants.
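The meaning of a linear duration invariant on a single observation interval can be illustrated with a short Python sketch. It evaluates the formula for one concrete behaviour and one interval only; it is not the checking technique of [2], which covers all behaviours of an automaton. The functions `state_durations` and `ldi_holds`, the state names and the numeric values are assumptions chosen for illustration.

```python
import math


def state_durations(behaviour, b, e):
    """Total presence time of each state within the observation interval [b, e].

    behaviour is a list of (state, stay) pairs starting at time 0.
    """
    totals, clock = {}, 0.0
    for state, stay in behaviour:
        overlap = min(e, clock + stay) - max(b, clock)
        if overlap > 0:
            totals[state] = totals.get(state, 0.0) + overlap
        clock += stay
    return totals


def ldi_holds(behaviour, b, e, c_min, c_max, coeffs, bound):
    """Evaluate c_min <= l <= c_max -> sum(c_i * duration(s_i)) <= bound on [b, e]."""
    length = e - b
    if not (c_min <= length <= c_max):
        return True  # the premise is false, so the implication holds
    durations = state_durations(behaviour, b, e)
    return sum(c * durations.get(s, 0.0) for s, c in coeffs.items()) <= bound


# Illustrative invariant: for intervals of at least 60 time units,
# 19 * duration(Leak) - duration(Nonleak) <= 0.
coeffs = {"Leak": 19, "Nonleak": -1}
good = [("Leak", 1), ("Nonleak", 30), ("Leak", 1), ("Nonleak", 30)]
bad = [("Leak", 10), ("Nonleak", 50)]
good_ok = ldi_holds(good, 0, 60, 60, math.inf, coeffs, 0)
bad_ok = ldi_holds(bad, 0, 60, 60, math.inf, coeffs, 0)
```

In the first behaviour the interval $[0, 60]$ contains 2 time units of `Leak` and 58 of `Nonleak`, so the weighted sum is $38 - 58 \le 0$; in the second it is $190 - 50 > 0$ and the invariant is violated.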

In [2], the authors also defined the satisfaction problem of a linear duration invariant for a *real-time automaton* and proved that the problem of checking satisfaction of a linear duration invariant by a real-time automaton can be transformed into a finite number of linear programming problems. After the publication of [2], many works were devoted to extending this satisfaction problem to timed automata and checking timed automata for linear duration invariants, e.g. [3–7].

In this paper, we introduce *extended linear duration invariants* that are represented as linear constraints on the accumulated physical quantities in the observation intervals. An extended linear duration invariant has the form

$$c_{min} \le \ell \le c_{max} \to \sum_{i=1}^{m} c_i \int f_i \le C.$$

Here,  $c_{min}$ ,  $c_{max}$ ,  $c_i$   $(1 \le i \le m)$  and C are real numbers, and  $f_i$   $(1 \le i \le m)$  are integrable functions that label the states of the system.  $\int f_i$   $(1 \le i \le m)$  stand for the integrals of  $f_i$   $(1 \le i \le m)$  in the observation intervals. The formal definition of  $\int f$  is given in Section 2.

To define the semantics of extended linear duration invariants, we introduce a variant of real-time automata called *integral real-time automata*. An integral real-time automaton is a real-time automaton whose states are labeled with a finite number of integrable functions. The semantics of extended linear duration invariants with respect to integral real-time automata is defined as a conservative extension of the semantics of linear duration invariants with respect to real-time automata.

Then we prove that the problem of checking an integral real-time automaton for an extended linear duration invariant is reduced to a finite number of nonlinear programming problems. (In [2], the problem of checking a real-time automaton for a linear duration invariant was reduced to a finite number of linear programming problems.) The constraints of these nonlinear programming problems constitute convex polyhedra in Euclidean spaces and the objective functions are separable. Nonlinear programming problems of this type have been well studied, and any algorithm for solving these problems can be used for deciding our satisfaction problem.

As a case study, we represent a reaction tank as an integral real-time automaton, specify its safety requirement using an extended linear duration invariant, and prove that the model satisfies the specification. The reaction tank is a very small two-state system that appears to be considered for the first time in this paper. It motivated us to extend the real-time automata approach for systems verification to nonlinear programming. A verification technique for dealing with nonlinear accumulations of physical quantities using DC is useful, because rough linearization of nonlinear behaviors may weaken the confidence in the verification.

The paper is organized as follows. In the next section we introduce integral real-time automata and extended linear duration invariants. In Section 3 we define the semantics of extended linear duration invariants with respect to integral real-time automata. In Section 4 we present a technique for checking integral real-time automata against extended linear duration invariants. The reaction tank is discussed throughout the paper to validate the technique introduced in the paper.

# 2 Integral Real-time Automata and Extended Linear Duration Invariants

In this section, we introduce integral real-time automata and extended linear duration invariants, and define the semantics of extended linear duration invariants with respect to integral real-time automata.

#### 2.1 Integral Real-time Automata

Before introducing integral real-time automata, we recall the definition of real-time automata and consider a real-time automaton model of a gas burner [8, 9, 2].

**Definition 1.** A real-time automaton A is a tuple  $\langle S, T, low, up \rangle$  which satisfies the following conditions [2]:

- S is a finite set of states  $\{s_1, s_2, \ldots, s_n\}$ .
- $T \subseteq S \times S$  is a finite set of transitions.
- The functions low:  $T \to R$  and  $up: T \to (R \cup \{\infty\})$  denote the lower- and upper-bound timing constraints on the transitions, where  $0 \le low(\rho) \le up(\rho)$  and  $low(\rho) = 0 \to up(\rho) > 0$  for any  $\rho \in T$ .

Every state of a real-time automaton is both an initial and an accepting state. The set of real-time automata is a subclass of the timed automata of [9], where each automaton has one clock that is reset after every transition.

Let us consider an example of a real-time system that is represented as a real-time automaton: a gas burner, first investigated in [11]. A gas burner works by repeating heating and idling. When it moves from idling to heating, gas flows for a short time before it is ignited. When it fails to ignite the gas, gas still flows until the flame failure is detected and the gas valve is closed. To prevent a dangerous accumulation of gas, the time intervals where gas is leaking should not become too long.

The real-time requirement of the gas burner is that in any observation interval not smaller than one minute, the proportion of total time of gas leaks should not be more than one-twentieth of the interval. This requirement can be refined into the following two design decisions.

#### 4 Changil Choe, Univan Ahn and Song Han

Des1: Any gas leak should be stoppable within one second.

Des2: The time distance between two gas leaks should not be less than thirty seconds.

Des1 and Des2 can be represented by the real-time automaton in Fig. 1, which has two states of Leak and Nonleak.



Fig. 1. Left: A real-time automaton for the gas burner. Right: An integral real-time automaton for the reaction tank.

The timing constraint on transition (Leak, Nonleak) is a bounded and closed interval [0,1]. This denotes that the automaton can stay in the Leak state for at most one time unit before a transition to the Nonleak state takes place. The timing constraint on transition (Nonleak, Leak) is a left closed, unbounded interval  $[30,\infty)$ . This denotes that the automaton must stay in the Nonleak state for at least 30 time units before a transition to the Leak state can take place, and it can even stay in the Nonleak state forever.
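The timing constraints above can be written down directly as data. The following Python sketch is one possible encoding of Definition 1 and is not taken from the paper: the class name `RealTimeAutomaton`, the method `allows` and the dictionary layout are assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class RealTimeAutomaton:
    """Illustrative encoding of Definition 1 (all names are assumptions)."""
    states: set
    low: dict  # transition (s, s') -> lower bound
    up: dict   # transition (s, s') -> upper bound, math.inf if unbounded

    def __post_init__(self):
        if set(self.low) != set(self.up):
            raise ValueError("low and up must cover the same transitions")
        for (src, dst), lo in self.low.items():
            hi = self.up[(src, dst)]
            if src not in self.states or dst not in self.states:
                raise ValueError("transition uses an unknown state")
            # Definition 1: 0 <= low <= up, and low = 0 implies up > 0.
            if not 0 <= lo <= hi or (lo == 0 and hi <= 0):
                raise ValueError("invalid timing constraint")

    def allows(self, transition, t):
        """True if staying t time units before `transition` is permitted."""
        return self.low[transition] <= t <= self.up[transition]


gas_burner = RealTimeAutomaton(
    states={"Leak", "Nonleak"},
    low={("Leak", "Nonleak"): 0, ("Nonleak", "Leak"): 30},
    up={("Leak", "Nonleak"): 1, ("Nonleak", "Leak"): math.inf},
)

print(gas_burner.allows(("Leak", "Nonleak"), 0.5))   # a leak of half a time unit
print(gas_burner.allows(("Nonleak", "Leak"), 29.0))  # too early for a new leak
```

Representing the unbounded interval with `math.inf` keeps the check for $[30,\infty)$ identical to the check for a bounded interval.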

Now we introduce integral real-time automata and consider an integral real-time automaton model of a safety-critical system, the reaction tank.  $\mathbb{R}^+$  denotes the set of nonnegative real numbers.  $Intg(\mathbb{R}^+)$  denotes the set of integrable functions over  $\mathbb{R}^+$ .

**Definition 2.** An integral real-time automaton D is a tuple  $\langle S, T, low, up, L \rangle$  which satisfies the following conditions:

- $\langle S, T, low, up \rangle$  is a real-time automaton.
- $L: S \to 2^{Intg(\mathbb{R}^+)}$  labels each state  $s \in S$  with a finite set of integrable functions.

A function labeled to a state represents the generation process of a certain physical quantity, which progresses during the continuous presence of that state. What we are interested in in this paper is an inequality over the integrals of such generation processes. Hence, no condition other than integrability is imposed on the functions labeled to the states. Integrability is needed to make the model checking decidable. Every state of an integral real-time automaton is both an initial and an accepting state.

The reason for defining integral real-time automata might be questioned, because integral real-time automata can possibly be transformed into the hybrid automata of [12]. This question is answered once we have introduced extended linear duration invariants and defined their semantics using integral real-time automata in Section 3: for the model checking of an extended linear duration invariant, it is enough to represent the system as a simple integral real-time automaton. If necessary, we can consider the model checking of hybrid automata for extended linear duration invariants.

Let us consider an example of safety critical systems which can be represented as an integral real-time automaton.

**Reaction tank.** A chemical reaction which involves a harmful gas release is repeated indefinitely inside the reaction tank. Each reaction cycle takes from 3 to 4 hours. After the reaction cycle is finished, the products are taken out of the tank and the waste liquid is sent to the next process. The reaction can be repeated 2 hours after the end of the preceding reaction at the earliest, and it may never be resumed. An air cleaner is installed and works constantly to neutralize the released harmful gas.

The left function of Fig. 2, denoted by  $f_{tox}$ , shows the variation of the amount of harmful gas which is released during the reaction. The right function of Fig. 2, denoted by  $f_{detox}$ , shows the neutralization ability of the air cleaner.



Fig. 2. Left: The harmful gas release characteristic during the reaction. Right: The neutralization ability of the air cleaner.

Analytic expressions of  $f_{tox}$  and  $f_{detox}$  are as follows.

$$f_{tox}(x) = 0.35x^3 - 3.23x^2 + 7.53x \qquad 0 \le x \le 4$$

$$f_{detox}(x) = 2.4 \qquad x \ge 0$$

Air pollution of the working environment is caused if the air cleaner fails to neutralize the harmful gas in real time. This imposes a safety requirement on the reaction tank: in any observation interval not smaller than 12 hours, the air cleaner should be completely capable of neutralizing the released harmful gas.

Fig. 1 shows an integral real-time automaton model of the reaction tank. The automaton has two states of reaction and idling. Two functions  $f_{tox}$  and  $f_{detox}$  are labeled to the reaction state. This denotes that harmful gas release and its neutralization occur together in the reaction state. A function  $f_{detox}$  is labeled to the idling state. This denotes that already released harmful gas is neutralized in the idling state without extra release of harmful gas.

A real-time automaton can be considered as an integral real-time automaton whose states are labeled with one constant function f(x) = 1. In that sense, the notion of integral real-time automata is an extension of the notion of real-time automata. It is possible to represent an integral real-time automaton as a weighted timed automaton, if the functions labeled to the states of the integral real-time automaton are constant or piecewise constant.

Remark 1. We assumed that the domains of the functions which are labeled to the states are  $\mathbb{R}^+$  when we defined the integral real-time automata. However, readers may notice that the domain of the function  $f_{tox}$  which is labeled to the reaction state in Fig. 1 is [0,4]. The reason we confine the domain of  $f_{tox}$  to [0,4] is that the system can stay in the reaction state at most 4 hours. We can easily extend the domain of  $f_{tox}$  to  $\mathbb{R}^+$  by assigning 0 to every x greater than 4.
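As a minimal sketch of the labeling function $L$ for the reaction tank, the two functions and the zero extension of Remark 1 can be written as follows. The names `f_tox`, `f_detox`, `LABELS` and `TIMING` are illustrative choices, and the timing bounds are read off the textual description of the reaction tank rather than from Fig. 1 itself.

```python
import math


def f_tox(x):
    """Harmful gas release rate; extended by 0 beyond x = 4 as in Remark 1."""
    if 0 <= x <= 4:
        return 0.35 * x**3 - 3.23 * x**2 + 7.53 * x
    return 0.0


def f_detox(x):
    """Constant neutralization ability of the air cleaner."""
    return 2.4


# L: state -> functions labeled to that state (keys are illustrative names).
LABELS = {
    "reaction": {"f_tox": f_tox, "f_detox": f_detox},
    "idling": {"f_detox": f_detox},
}

# Timing constraints (low, up): a reaction takes 3 to 4 hours,
# idling takes at least 2 hours and may last forever.
TIMING = {
    ("reaction", "idling"): (3, 4),
    ("idling", "reaction"): (2, math.inf),
}

print(sorted(LABELS["reaction"]), sorted(LABELS["idling"]))
```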

#### 2.2 Extended Linear Duration Invariants

Before defining extended linear duration invariants, we recall the definition of linear duration invariants and consider an example of linear duration invariant specifications of real-time requirements. Linear duration invariants form a decidable subclass of DC. The syntax, semantics and proof system of DC were summarized in the monograph [13].

DC is very effective in expressing various patterns of real-time requirements, but its formulas are highly undecidable. Only a very small class of chop-free formulas, including linear duration invariants, is decidable [14]. Because linear duration invariants are important properties of real-time systems, model checking of linear duration invariants has attracted a great deal of attention since the introduction of DC.

**Definition 3.** A linear duration invariant for the real-time automaton A is a DC formula of the form

$$c_{min} \le \ell \le c_{max} \to \sum_{i=1}^{n} c_i \int s_i \le C.$$

Here,  $c_{min}$ ,  $c_{max}$ ,  $c_i$   $(1 \le i \le n)$  and C are real numbers, and  $s_i$   $(1 \le i \le n)$  are states of A.  $\ell$  is a term which takes the length of interval for each observation interval.  $\int s$  is a term which takes the integrated duration of s for each observation interval. The real-time requirement of the gas burner mentioned in Section 2.1 is represented as  $\ell \ge 60 \to (19 \int Leak - \int Nonleak) \le 0$  [11]. If it is obvious from the context, we call the linear duration invariants for a real-time automaton simply the linear duration invariants. LDI will be used to denote a linear duration invariant.

Now we define extended linear duration invariants and consider an example of extended linear duration invariant specifications.

**Definition 4.** An extended linear duration invariant for the integral real-time automaton D is a formula of the form

$$c_{min} \le \ell \le c_{max} \to \sum_{i=1}^{m} c_i \int f_i \le C.$$

Here,  $c_{min}$ ,  $c_{max}$ ,  $c_i$  ( $1 \le i \le m$ ) and C are real numbers, and  $f_i$  ( $1 \le i \le m$ ) are functions labeled to the states of D.  $\ell$  is a term which takes the length of interval for each observation interval.  $\int f_i$  is a term which takes the integral of  $f_i$  for each observation interval. If it is obvious from the context, we call the extended linear duration invariants for an integral real-time automaton simply the extended linear duration invariants. ELDI will be used to denote an extended linear duration invariant.

An essential difference between the linear duration invariants and the extended linear duration invariants comes from the difference of the calculations of  $\int s$  and  $\int f$ . We use an example to show it.

(Figure: left, a behavior of a real-time automaton over the states  $s_1$  and  $s_2$ ; right, a behavior of an integral real-time automaton with the labeled functions  $f$  and  $g$ .)

The left side of the above figure is a behavior of a real-time automaton. For this behavior,  $\int s_1$  and  $\int s_2$  are calculated as  $\int s_1 = 3+1 = 4$  and  $\int s_2 = 2.5$ . The right side of the above figure is a behavior of an integral real-time automaton. For this behavior,  $\int f$  and  $\int g$  are calculated as  $\int f = \int_0^3 f(x) dx + \int_0^1 f(x) dx$  and  $\int g = \int_0^3 g(x) dx + \int_0^{2.5} g(x) dx + \int_0^1 g(x) dx$ .
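To make the contrast concrete, suppose, purely for illustration (the figure does not fix $f$), that $f(x) = 2x$ is labeled to the state that is visited first for 3 and later for 1 time units. The integration restarts at 0 on every visit, so

$$\int f = \int_0^3 2x\,dx + \int_0^1 2x\,dx = 9 + 1 = 10,$$

whereas the duration of that state is $3 + 1 = 4$, and a single integral over the total duration would give $\int_0^4 2x\,dx = 16$. The duration and the integral term coincide when the labeled function is the constant 1.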

The linear duration invariants and the extended linear duration invariants also have a difference in the structures of their linear terms. The linear term  $\sum_{i=1}^{n} c_i \int s_i$  of a linear duration invariant for the real-time automaton A consists of subterms  $c_1 \int s_1, \ldots, c_{n-1} \int s_{n-1}$  and  $c_n \int s_n$ , whose number is equal to the number of states of A. (Note that some  $c_i$  could be 0.) But the linear term  $\sum_{i=1}^{m} c_i \int f_i$  of an extended linear duration invariant for the integral real-time automaton D consists of subterms  $c_1 \int f_1, \ldots, c_{m-1} \int f_{m-1}$  and  $c_m \int f_m$ , whose number is equal to the number of different functions labeled to the states of D. (Note that some  $c_i$  could also be 0.)

For example, the number of different functions labeled to the states of the integral real-time automaton in Fig. 1 is 2. Hence, the linear term of any extended linear duration invariant for this integral real-time automaton consists of two subterms  $c_1 \int f_{tox}$  and  $c_2 \int f_{detox}$ .

Readers who are familiar with DC can easily see that extended linear duration invariants are not formulas of DC, because there is no term  $\int f$  in the syntax of DC. To define extended linear duration invariants strictly, we should first extend DC by adding the term  $\int f$  to its syntax and then define extended linear duration invariants as formulas of the extended DC. In this paper, we do not consider the extension of DC and instead define extended linear duration invariants by directly extending linear duration invariants. That is why we called extended linear duration invariants in Definition 4 simply formulas rather than DC formulas.

It is possible to extend the syntax of DC so that it includes the real-valued term  $\int f$ . An early paper of Zhou Chaochen et al. [15] would be helpful for doing this work: there, the authors extended DC by introducing the real-valued term dt to capture properties of piecewise continuous states. However, the semantics of the term dt in [15] is different from that of the term  $\int f$  in this paper, and the main concern of the authors in [15] was to introduce a proof theory of the extended DC rather than to develop a model checking technique. Nevertheless, [15] provides a good approach for extending DC by introducing the real-valued term  $\int f$ .

Returning to the reaction tank, the safety requirement which was already considered in Section 2.1 can be specified as an extended linear duration invariant as

$$12 \le \ell \to \int f_{tox} - \int f_{detox} \le 0.$$

Here,  $\int f_{tox}$  represents the total amount of gas released in each observation interval and  $\int f_{detox}$  represents the total amount of gas which can be neutralized in that interval. It is impossible to specify this real-time requirement as a linear duration invariant.

The above discussion explains the motivation for introducing the integral real-time automata and the extended linear duration invariants. In the next section, we define the semantics of the extended linear duration invariants with respect to the integral real-time automata.

# 3 Semantics of Extended Linear Duration Invariants

We define the semantics of extended linear duration invariants with respect to the integral real-time automata by conservatively extending the semantics of linear duration invariants with respect to the real-time automata.

Given an integral real-time automaton  $D = \langle S, T, low, up, L \rangle$ ,  $\rho$  is used to denote a transition of D, i.e. an element of T. For a transition  $\rho = (s, s')$ , the notations  $\overleftarrow{\rho} = s$  and  $\overrightarrow{\rho} = s'$  are used.

 $\rho_1\rho_2\ldots\rho_n$  is called a behavior if  $\overrightarrow{\rho_i}=\overleftarrow{\rho_{i+1}}$  for every  $i\ (1\leq i\leq n-1)$ . Beh is used to denote a behavior.  $TBeh=(\rho_1,t_1)(\rho_2,t_2)\ldots(\rho_n,t_n)$  is called a time-stamped behavior obtained from  $Beh=\rho_1\rho_2\ldots\rho_n$ , where  $low(\rho_i)\leq t_i\leq up(\rho_i)$  for every  $i\ (1\leq i\leq n)$ . For example,  $Beh=\rho_1\rho_2\rho_1$  is a behavior of the integral real-time automaton in Fig. 1, where  $\rho_1=(reaction,idling)$  and  $\rho_2=(idling,reaction)$ .  $TBeh=(\rho_1,3.5)(\rho_2,3)(\rho_1,4)$  is a time-stamped behavior obtained from  $Beh=\rho_1\rho_2\rho_1$ .

 Any  $\rho_1 \rho_2 \dots \rho_n$  with  $\rho_i \in T$  is called a sequence. A sequence may violate transition consecutivity of the automaton. Seq is used to denote a sequence. Given a sequence  $Seq = \rho_1 \rho_2 \dots \rho_n$ ,  $TSeq = (\rho_1, t_1)(\rho_2, t_2) \dots (\rho_n, t_n)$  is called a time-stamped sequence obtained from Seq, where  $low(\rho_i) \le t_i \le up(\rho_i)$  for every  $i \ (1 \le i \le n)$ .

For example,  $Seq = \rho_1 \rho_1 \rho_2$  is a sequence of the integral real-time automaton in Fig. 1.  $TSeq = (\rho_1, 3.5)(\rho_1, 4)(\rho_2, 3)$  is a time-stamped sequence obtained from  $Seq = \rho_1 \rho_1 \rho_2$ .  $\rho_1 \rho_1 \rho_2$  is a sequence, but it is not a behavior. A behavior is a sequence and a time-stamped behavior is a time-stamped sequence.
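The difference between a behavior and a sequence is only the consecutivity condition, which can be checked mechanically. The following sketch assumes that a transition is encoded as a (source, target) pair; the function name `is_behavior` is an illustrative choice.

```python
def is_behavior(seq):
    """True if the target of each transition equals the source of the next one."""
    return all(a[1] == b[0] for a, b in zip(seq, seq[1:]))


rho1 = ("reaction", "idling")
rho2 = ("idling", "reaction")

print(is_behavior([rho1, rho2, rho1]))  # the transitions chain correctly
print(is_behavior([rho1, rho1, rho2]))  # rho1 cannot directly follow rho1
```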

 $L_D$  denotes the set of behaviors of the integral real-time automaton D.  $L_D$  is a regular language over the alphabet T, as it is accepted by a finite automaton where every state is both an initial and an accepting state.

For a time-stamped sequence  $TSeq = (\rho_1, t_1)(\rho_2, t_2) \dots (\rho_n, t_n)$  of  $D, \ell(TSeq)$  is defined as

$$\ell(TSeq) = \sum_{i=1}^{n} t_i.$$

For example, the value of  $\ell$  for  $TSeq = (\rho_1, 3.5)(\rho_1, 4)(\rho_2, 3)$  is  $\ell(TSeq) = 3.5 + 4 + 3 = 10.5$ .

Let f be a function labeled to a state of D. For the time-stamped sequence  $TSeq = (\rho_1, t_1)(\rho_2, t_2) \dots (\rho_n, t_n)$  of D,  $\int f(TSeq)$  is defined as

$$\int f(TSeq) = \sum_{i=1}^{n} \begin{cases} \int_{0}^{t_{i}} f(x)\,dx & \text{if } f \in L(\overleftarrow{\rho_{i}}) \\ 0 & \text{otherwise} \end{cases}.$$

For example, the value of  $\int f_{tox}$  for the time-stamped sequence  $TSeq = (\rho_1, 3.5)(\rho_2, 3)(\rho_1, 4)$  in Fig. 1 is  $\int f_{tox}(TSeq) = \int_0^{3.5} f_{tox}(x)dx + \int_0^4 f_{tox}(x)dx$ , since  $f_{tox}$  is only labeled to  $\overleftarrow{\rho_1}$ .

We denote the linear term  $\sum_{i=1}^{m} c_i \int f_i$  of ELDI by LF. For a time-stamped sequence  $TSeq = (\rho_1, t_1)(\rho_2, t_2) \dots (\rho_n, t_n)$  of D, LF(TSeq) is defined as

$$LF(TSeq) = \sum_{i=1}^{m} c_i \int f_i(TSeq).$$
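The three definitions above can be evaluated numerically for the reaction tank. The sketch below is an illustration under stated assumptions: the helper names (`simpson`, `length`, `integral`, `lf`, `LABELS`) are not from the paper, and the composite Simpson rule is a swapped-in numerical integrator, chosen because it is exact for cubic polynomials up to floating-point error.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def f_tox(x):
    return 0.35 * x**3 - 3.23 * x**2 + 7.53 * x


def f_detox(x):
    return 2.4


# Functions labeled to each state (illustrative encoding of L).
LABELS = {"reaction": {"f_tox": f_tox, "f_detox": f_detox},
          "idling": {"f_detox": f_detox}}


def length(tseq):
    """ell(TSeq): the sum of all time stamps."""
    return sum(t for _, t in tseq)


def integral(name, tseq):
    """Integral term of the function `name`: integrate from 0 to t_i whenever
    the function is labeled to the source state of rho_i."""
    total = 0.0
    for (src, _dst), t in tseq:
        if name in LABELS[src]:
            total += simpson(LABELS[src][name], 0.0, t)
    return total


def lf(coeffs, tseq):
    """LF(TSeq): sum of c_i times the integral term of f_i."""
    return sum(c * integral(name, tseq) for name, c in coeffs.items())


rho1, rho2 = ("reaction", "idling"), ("idling", "reaction")
tseq = [(rho1, 3.5), (rho2, 3.0), (rho1, 4.0)]
print(length(tseq), integral("f_tox", tseq), lf({"f_tox": 1, "f_detox": -1}, tseq))
```

Note that `integral` always integrates from 0, which reflects that the labeled function restarts on every visit to a state.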

**Lemma 1.**  $LF(TSeq_1TSeq_2) = LF(TSeq_2TSeq_1) = LF(TSeq_1) + LF(TSeq_2)$  for any time-stamped sequences  $TSeq_1$  and  $TSeq_2$ .

Here,  $TSeq_1TSeq_2$  is the concatenation of  $TSeq_1$  and  $TSeq_2$ . The proof of the lemma is straightforward from the definition of LF(TSeq).
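A short derivation makes the lemma explicit. The abbreviation $I_f(\rho, t)$ for the contribution of a single pair is introduced only for this sketch and is not notation from the paper:

$$I_f(\rho, t) = \begin{cases} \int_0^t f(x)\,dx & \text{if } f \in L(\overleftarrow{\rho}) \\ 0 & \text{otherwise} \end{cases}$$

With $TSeq_1 = (\rho_1, t_1) \dots (\rho_p, t_p)$ and $TSeq_2 = (\rho_{p+1}, t_{p+1}) \dots (\rho_{p+q}, t_{p+q})$,

$$\int f(TSeq_1 TSeq_2) = \sum_{i=1}^{p+q} I_f(\rho_i, t_i) = \sum_{i=1}^{p} I_f(\rho_i, t_i) + \sum_{i=p+1}^{p+q} I_f(\rho_i, t_i) = \int f(TSeq_1) + \int f(TSeq_2).$$

Multiplying by $c_j$ and summing over the functions $f_j$ gives $LF(TSeq_1 TSeq_2) = LF(TSeq_1) + LF(TSeq_2)$. The equality with $LF(TSeq_2 TSeq_1)$ follows because addition is commutative.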

Now we can move to the semantics of extended linear duration invariants using the functions defined above.

**Definition 5.** The satisfaction of ELDI by D is defined as follows.

- ELDI is satisfied by a time-stamped sequence TSeq of D iff  $c_{min} \leq \ell(TSeq) \leq c_{max}$  implies  $LF(TSeq) \leq C$ . Otherwise ELDI is said to be violated by TSeq.

- ELDI is satisfied by a sequence Seq, denoted by  $Seq \models ELDI$ , iff ELDI is satisfied by every time-stamped sequence obtained from Seq. Otherwise ELDI is said to be violated by Seq.
- Let L be a language over T. ELDI is satisfied by L, denoted by  $L \models ELDI$ , iff  $Seq \models ELDI$  for every  $Seq \in L$ . Otherwise ELDI is said to be violated by L.
- ELDI is satisfied by D, denoted by  $D \models ELDI$ , iff  $L_D \models ELDI$ . Otherwise ELDI is said to be violated by D.
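The first item of Definition 5 is a plain implication, which the following sketch spells out. The function name `satisfied_by_tseq` and its parameters are assumptions, and the LF values passed in are illustrative inputs rather than results from the paper.

```python
import math


def satisfied_by_tseq(length, lf_value, c_min, C, c_max=math.inf):
    """c_min <= ell(TSeq) <= c_max implies LF(TSeq) <= C."""
    premise = c_min <= length <= c_max
    return (not premise) or lf_value <= C


# Reaction tank requirement: 12 <= ell implies LF <= 0.
print(satisfied_by_tseq(length=10.5, lf_value=1.6, c_min=12, C=0))   # premise is false
print(satisfied_by_tseq(length=12.5, lf_value=-0.7, c_min=12, C=0))  # premise and conclusion hold
print(satisfied_by_tseq(length=12.5, lf_value=0.3, c_min=12, C=0))   # violation
```

Satisfaction by a sequence quantifies over all admissible time stamps, so it cannot be decided by testing individual time-stamped sequences in this way. This is why Section 4 turns the question into an optimization problem.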

Remark 2. In [2], the meaning of the term  $\int s$  of a linear duration invariant was defined as

$$\int s(TSeq) = \sum_{i=1}^{n} \begin{cases} t_i & \text{if } \overleftarrow{\rho_i} = s \\ 0 & \text{otherwise} \end{cases}.$$

That is,  $\int s(TSeq)$  calculates the total duration of the state s. However,  $\int f(TSeq)$  calculates the total accumulation of the physical quantity f for the duration of the states labeled with f.

# 4 Checking Algorithm

In this section, we present an algorithm for checking an integral real-time automaton for an extended linear duration invariant, which is an extension of the technique developed in [2] to check if a real-time automaton satisfies a linear duration invariant. We first show the main idea of the algorithm through an example of checking the reaction tank for its safety requirement. We then formalize our algorithm which reduces the checking task to a finite set of nonlinear programming problems which can be easily solved.

#### 4.1 Verification of the Reaction Tank: Main Idea of the Checking Algorithm

We denote the integral real-time automaton of the reaction tank (Fig. 1) by D and its real-time requirement  $12 \leq \ell \rightarrow \int f_{tox} - \int f_{detox} \leq 0$  by  $ELDI_D$ . The problem  $L_D \models ELDI_D$  must be solved for the verification of the reaction tank.

D has only two transitions  $\rho_1 = (reaction, idling)$  and  $\rho_2 = (idling, reaction)$ , but it can produce infinitely many behaviors. They (namely  $L_D$ ) can be expressed in terms of regular language as

$$(\rho_1\rho_2)^* \cup (\rho_1\rho_2)^*\rho_1 \cup (\rho_2\rho_1)^* \cup (\rho_2\rho_1)^*\rho_2$$

where \* stands for the repetition and  $\cup$  for the union. Note that both *reaction* and *idling* are initial states of D. Then the problem  $L_D \models ELDI_D$  can be divided into four problems  $(\rho_1\rho_2)^* \models ELDI_D$ ,  $(\rho_1\rho_2)^*\rho_1 \models ELDI_D$ ,  $(\rho_2\rho_1)^* \models ELDI_D$  and  $(\rho_2\rho_1)^*\rho_2 \models ELDI_D$  by considering  $(\rho_1\rho_2)^*$ ,  $(\rho_1\rho_2)^*\rho_1$ ,  $(\rho_2\rho_1)^*$  and  $(\rho_2\rho_1)^*\rho_2$  individually.
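The decomposition of $L_D$ into four regular expressions can be cross-checked by brute force for short behaviors. This is a sanity check added for illustration, not part of the paper's argument. It assumes the encoding "a" for $\rho_1$ and "b" for $\rho_2$ and considers only nonempty words.

```python
import re
from itertools import product

# rho1 = (reaction, idling) is written "a", rho2 = (idling, reaction) is written "b".
SOURCE = {"a": "reaction", "b": "idling"}
TARGET = {"a": "idling", "b": "reaction"}

# The four-part regular expression for L_D from the text.
L_D = re.compile(r"(ab)*|(ab)*a|(ba)*|(ba)*b")


def words_up_to(n):
    """All nonempty words over {a, b} with at most n letters."""
    return ("".join(w) for k in range(1, n + 1) for w in product("ab", repeat=k))


def is_behavior(word):
    """Consecutive transitions must be connected."""
    return all(TARGET[x] == SOURCE[y] for x, y in zip(word, word[1:]))


def behaviors_up_to(n):
    return {w for w in words_up_to(n) if is_behavior(w)}


def matches_up_to(n):
    return {w for w in words_up_to(n) if L_D.fullmatch(w)}


print(behaviors_up_to(6) == matches_up_to(6))
```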

Recalling the definition of the extended linear duration invariants, we can easily deduce that the problems  $(\rho_1\rho_2)^* \models ELDI_D$  and  $(\rho_2\rho_1)^* \models ELDI_D$  are equivalent with respect to satisfaction. Thus, three problems  $(\rho_1\rho_2)^* \models ELDI_D$ ,  $(\rho_1\rho_2)^*\rho_1 \models ELDI_D$  and  $(\rho_2\rho_1)^*\rho_2 \models ELDI_D$  must be solved for the verification of the reaction tank.

Let us first consider the problem  $(\rho_1\rho_2)^* \models ELDI_D$ . For any time-stamped sequence  $TSeq_1 = (\rho_1, t_1)(\rho_2, t_2)$  obtained from  $Seq_1 = (\rho_1\rho_2)^1$ ,  $LF(TSeq_1)$  is calculated as follows.

$$\begin{aligned} LF(TSeq_1) &= \int_0^{t_1} f_{tox}(x)\,dx - \int_0^{t_1} f_{detox}(x)\,dx - \int_0^{t_2} f_{detox}(x)\,dx \\ &= \int_0^{t_1} (0.35x^3 - 3.23x^2 + 7.53x)\,dx - \int_0^{t_1} 2.4\,dx - \int_0^{t_2} 2.4\,dx \\ &= 0.09t_1^4 - 1.08t_1^3 + 3.76t_1^2 - 2.4t_1 - 2.4t_2. \end{aligned}$$

The value of  $0.09t_1^4 - 1.08t_1^3 + 3.76t_1^2 - 2.4t_1 - 2.4t_2$  is smaller than 0 for each  $(t_1, t_2)$  which ranges over  $[3, 4] \times [2, \infty)$ . From this fact and Lemma 1, the value of  $LF(TSeq_k)$  is smaller than 0 for each time-stamped sequence  $TSeq_k$  obtained from  $Seq_k = (\rho_1\rho_2)^k = \rho_1\rho_2 \dots \rho_1\rho_2$ . Therefore,  $ELDI_D$  is satisfied by  $(\rho_1\rho_2)^*$  from the definition of the satisfaction.
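The sign claim can be sanity-checked numerically. The sketch below is a check added for illustration and is not part of the paper's proof. It evaluates $LF(TSeq_1)$ on a grid and, as an assumption, uses the unrounded antiderivative coefficients 0.0875, 3.23/3 and 3.765 instead of their two-decimal roundings, because the margin is small enough for the rounding to matter. Since LF decreases in $t_2$, only the smallest admissible value $t_2 = 2$ needs to be examined.

```python
def lf_one_cycle(t1, t2):
    """LF for (rho1, t1)(rho2, t2) with the unrounded antiderivative of f_tox."""
    released = 0.0875 * t1**4 - (3.23 / 3) * t1**3 + 3.765 * t1**2
    neutralized = 2.4 * t1 + 2.4 * t2
    return released - neutralized


def max_over_grid(steps=1000):
    """Largest LF over t1 in [3, 4] on a uniform grid, with t2 fixed at 2."""
    return max(lf_one_cycle(3 + k / steps, 2.0) for k in range(steps + 1))


worst = max_over_grid()
print(worst < 0)
```

A grid evaluation is only evidence, not a proof; a rigorous argument would bound the maximum of the polynomial on $[3, 4]$ analytically.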

Let us next consider the problem  $(\rho_1\rho_2)^*\rho_1 \models ELDI_D$ . We can unfold  $(\rho_1\rho_2)^*\rho_1$  as  $\rho_1 \cup (\rho_1\rho_2)\rho_1 \cup (\rho_1\rho_2)^2\rho_1 \cup \ldots$  Then, the problem  $(\rho_1\rho_2)^*\rho_1 \models ELDI_D$  is divided into an infinite number of problems:  $(\rho_1\rho_2)^k\rho_1 \models ELDI_D$   $(k \geq 0)$ . For every time-stamped sequence TSeq obtained from  $(\rho_1\rho_2)^k\rho_1$   $(k \geq 2)$ , the value of  $\ell(TSeq)$  is greater than 12 and the value of LF(TSeq) is smaller than 0. This implies that  $ELDI_D$  is satisfied by  $(\rho_1\rho_2)^k\rho_1$   $(k \geq 2)$ . Thus, it is enough to solve the problems  $\rho_1 \models ELDI_D$  and  $\rho_1\rho_2\rho_1 \models ELDI_D$  to decide the problem  $(\rho_1\rho_2)^*\rho_1 \models ELDI_D$ .

It is obvious that  $ELDI_D$  is satisfied by  $\rho_1$  because  $\ell(TSeq)$  is smaller than 12 for every time-stamped sequence obtained from  $\rho_1$ . The problem  $\rho_1\rho_2\rho_1 \models ELDI_D$  is transformed into a nonlinear programming problem as follows.

Every time-stamped sequence  $(\rho_1, t_1)(\rho_2, t_2)(\rho_1, t_3)$  which is obtained from  $\rho_1\rho_2\rho_1$  must satisfy the constraints  $3 \le t_1 \le 4$ ,  $t_2 \ge 2$  and  $3 \le t_3 \le 4$ . For this time-stamped sequence, the value of  $\ell$  is  $t_1 + t_2 + t_3$  and the value of  $LF$  is  $\int_0^{t_1} f_{tox}(x) dx + \int_0^{t_3} f_{tox}(x) dx - \int_0^{t_1} f_{detox}(x) dx - \int_0^{t_2} f_{detox}(x) dx - \int_0^{t_3} f_{detox}(x) dx = 0.09t_1^4 - 1.08t_1^3 + 3.76t_1^2 - 2.4t_1 - 2.4t_2 + 0.09t_3^4 - 1.08t_3^3 + 3.76t_3^2 - 2.4t_3$ .

Then, we can decide the problem  $\rho_1\rho_2\rho_1 \models ELDI_D$  by solving the following nonlinear programming problem.

Constraints:

$$3 \le t_1 \le 4, \quad t_2 \ge 2, \quad 3 \le t_3 \le 4 \quad \text{and} \quad t_1 + t_2 + t_3 \ge 12.$$

Objective function:

$$0.09t_1^4 - 1.08t_1^3 + 3.76t_1^2 - 2.4t_1 - 2.4t_2 + 0.09t_3^4 - 1.08t_3^3 + 3.76t_3^2 - 2.4t_3.$$

If the maximal value of the objective function is less than or equal to 0, then  $ELDI_D$  is satisfied by  $\rho_1\rho_2\rho_1$ . And if the maximal value of the objective function is greater than 0, then  $ELDI_D$  is violated by  $\rho_1\rho_2\rho_1$ . By solving the problem, we find that the maximal value of this objective function is -0.64. Therefore,  $ELDI_D$  is satisfied by  $\rho_1\rho_2\rho_1$ .
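The stated maximum can be reproduced with a small search. The sketch below is not the solver used by the authors: it replaces a nonlinear programming algorithm by a plain grid search, uses the objective function exactly as printed (with its two-decimal coefficients), and relies on the observation that the objective decreases in $t_2$, so $t_2$ can be set to its smallest feasible value. The names `g`, `objective` and `maximize` are illustrative.

```python
def g(t):
    """Per-visit term of the objective for the reaction state, as printed."""
    return 0.09 * t**4 - 1.08 * t**3 + 3.76 * t**2 - 2.4 * t


def objective(t1, t2, t3):
    return g(t1) - 2.4 * t2 + g(t3)


def maximize(steps=100, c_min=12.0):
    """Grid search over t1, t3 in [3, 4]; t2 takes its smallest feasible value
    under t2 >= 2 and t1 + t2 + t3 >= c_min."""
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1):
            t1, t3 = 3 + i / steps, 3 + j / steps
            t2 = max(2.0, c_min - t1 - t3)
            value = objective(t1, t2, t3)
            if best is None or value > best[0]:
                best = (value, t1, t2, t3)
    return best


value, t1, t2, t3 = maximize()
print(round(value, 2), (t1, t2, t3))  # -0.64 (4.0, 4.0, 4.0)
```

With unrounded antiderivative coefficients the numerical value of the maximum changes, but it stays negative, so the conclusion is the same.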

Lastly, the third problem  $(\rho_2\rho_1)^*\rho_2 \models ELDI_D$  can also be solved in the way used above and we skip it.

#### 4.2 Algorithm

In this subsection, we formalize the technique of the previous subsection. We confine ourselves to the subclass of integral real-time automata which satisfy  $low(\rho) > 0$  for each transition  $\rho$ . Let  $D = \langle S, T, low, up, L \rangle$  be an integral real-time automaton which satisfies the above constraint.

At first we consider the extended linear duration invariants which have the form

$$c_{min} \le \ell \to \sum_{i=1}^{m} c_i \int f_i \le C.$$

For convenience, we denote an extended linear duration invariant of this form by ELDI.

**Definition 6.** Two languages  $L_1$  and  $L_2$  over T are called equivalent with respect to ELDI, denoted by  $L_1 \equiv L_2$ , if  $L_1 \models ELDI$  iff  $L_2 \models ELDI$ .

The equivalence of two languages in this paper is slightly different from the one in [2], but the two equivalences have the same meaning. We identify a regular expression with the language it denotes.

 $L_D$  can be transformed into an equivalent finite union of regular expressions of the form  $\rho_1 \dots \rho_m Seq_1^* \dots Seq_k^*$ . The procedure which was used in [2] to transform  $L_A$  (where A is a real-time automaton) into an equivalent finite union of regular expressions of the form  $\rho_1 \dots \rho_m Seq_1^* \dots Seq_k^*$  can also be used in our case without any modification. We don't repeat it in this paper.

Thus, we can decide the problem  $L_D \models ELDI$  if we have a technique for solving the problem  $\rho_1 \dots \rho_m Seq_1^* \dots Seq_k^* \models ELDI$ .

Given a sequence  $Seq = \rho_1 \rho_2 \dots \rho_m$ , we let  $\ell_{min} = \sum_{i=1}^m low(\rho_i)$  and  $\rho_1 \rho_2 \dots \rho_m^{min} = (\rho_1, low(\rho_1))(\rho_2, low(\rho_2)) \dots (\rho_m, low(\rho_m))$ . And by solving the following nonlinear programming problem, we can obtain the time-stamped sequence  $TSeq^{max} = (\rho_1, t_1^0)(\rho_2, t_2^0) \dots (\rho_m, t_m^0)$  which has the maximal value of LF among all time-stamped sequences obtained from Seq.

Constraints:

 $low(\rho_1) \le t_1 \le up(\rho_1), \dots, low(\rho_m) \le t_m \le up(\rho_m)$  and  $t_1 + \dots + t_m \ge 12$ .

Objective function:

LF(TSeq).

We formulate the following three theorems and give sketches of the constructive proofs in order to reduce the problem  $\rho_1 \dots \rho_m Seq_1^* \dots Seq_k^* \models ELDI$  to a finite set of nonlinear programming problems.

**Theorem 1.** ELDI is violated by  $\rho_1 \dots \rho_m Seq_1^* \dots Seq_k^*$  if  $LF(TSeq_i^{max}) > 0$  for some  $i (1 \le i \le k)$ .

*Proof.* Let  $t^0 = t_1^0 + t_2^0 + \dots + t_m^0$ . We set  $n_1 = \lceil \frac{c_{min} - \ell_{min}}{t^0} \rceil$ ,  $n_2 = \lceil \frac{C - LF(\rho_1 \rho_2 \dots \rho_m^{min})}{LF(TSeq_i^{max})} \rceil$  and  $n = max\{n_1, n_2\}$ . Then, ELDI is violated by  $\rho_1 \dots \rho_m^{min}(TSeq_i^{max})^n$  which is an element of  $\rho_1 \dots \rho_m Seq_1^* \dots Seq_k^*$ .

**Theorem 2.**  $\rho_1 \dots \rho_m Seq_1^* \dots Seq_k^*$  is equivalent to a finite union of sequences if  $LF(TSeq_i^{max}) \le 0$  for every  $i$   $(1 \le i \le k)$ .

*Proof.*  $Seq^*$  is equivalent to  $\bigcup_{i=1}^n (Seq)^i$  for some n (> 0) if  $LF(TSeq^{max}) \le 0$ . To prove it, we consider the case  $C < LF(TSeq^{max}) < 0$ . Other cases can be considered in a similar way. We set  $n_1 = \lceil \frac{c_{min}}{\ell_{min}} \rceil$ ,  $n_2 = \lceil \frac{C}{LF(TSeq^{max})} \rceil$  and  $n = max\{n_1, n_2\}$ . For this n,  $Seq^* \equiv \bigcup_{i=1}^n (Seq)^i$ .

Then,  $\rho_1 \dots \rho_m Seq_1^* \dots Seq_k^* \equiv \rho_1 \dots \rho_m \bigcup_{i=1}^{n_1} (Seq_1)^i \dots \bigcup_{i=1}^{n_k} (Seq_k)^i$ , where  $n_j$  is the bound obtained as above for  $Seq_j$ . Therefore,  $\rho_1 \dots \rho_m Seq_1^* \dots Seq_k^*$  is equivalent to a finite union of sequences by the distributive law of concatenation over union.

**Theorem 3.** For any  $Seq = \rho_1 \dots \rho_m$ , the problem  $Seq \models ELDI$  is solvable using nonlinear programming.

*Proof.* Consider the time-stamped sequence  $TSeq = (\rho_1, t_1)(\rho_2, t_2) \dots (\rho_m, t_m)$ . The constraints of the nonlinear programming problem are obtained from the timing constraints of D and from the left-hand side of the implication in the definition of ELDI. That is,

$$low(\rho_1) \le t_1 \le up(\rho_1), \dots, low(\rho_m) \le t_m \le up(\rho_m) \quad \text{and} \quad c_{min} \le \ell(TSeq).$$

The objective function of the nonlinear programming problem is

$$LF(TSeq) \ (= \sum_{i=1}^{m} c_i \int f_i(TSeq)).$$

If the maximal value of the objective function is smaller than or equal to C, ELDI is satisfied by Seq. Otherwise, ELDI is violated by Seq.

The algorithm presented above can be easily generalized to a decision procedure for the satisfaction problem of extended linear duration invariants of the form  $c_{min} \leq \ell \leq c_{max} \rightarrow \sum_{i=1}^{m} c_i \int f_i \leq C$ . We do not consider this generalization in this paper.

In this subsection, we've confined ourselves to the integral real-time automata whose lower-bound timing constraints on the transitions are not 0. The algorithm, however, can also be used in other cases of integral real-time automata by replacing each 0 with a "sufficiently small positive number". For better understanding of the algorithm presented in this subsection, readers are referred to [2].

#### 4.3 Solvability of the Nonlinear Programming Problems

Fortunately, the nonlinear programming problems generated by the checking algorithm are solvable. We briefly discuss this in this subsection.

Let P be the nonlinear programming problem generated from the problem  $Seq \models ELDI$ , where  $Seq = \rho_1 \rho_2 \dots \rho_m$ .

The feasible set of P is a simple convex polyhedron of  $\mathbb{R}^m$ . (See Theorem 3.) The objective function of P has the structure

$$F(t_1) + F(t_2) + \ldots + F(t_m).$$

Here  $t_i \in [low(\rho_i), up(\rho_i)]$  and  $F(t_i) = \pm \int_0^{t_i} f_{i_1}(x) dx \pm \int_0^{t_i} f_{i_2}(x) dx \pm \dots \pm \int_0^{t_i} f_{i_{m_i}}(x) dx$ , where  $f_{i_1}, f_{i_2}, \dots, f_{i_{m_i}}$  are the functions labeled to  $\overleftarrow{\rho_i}$  and  $\pm$  denotes + or -. That is, the objective function of P is a separable function. Methods for solving nonlinear programming problems of this type have already been well studied, e.g. [16].
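For instance, the objective function of the problem  $\rho_1\rho_2\rho_1 \models ELDI_D$  from Section 4.1 already has this shape. Writing $F_1, F_2, F_3$ for its three summands (the subscripts are added here only for readability),

$$F_1(t_1) = 0.09t_1^4 - 1.08t_1^3 + 3.76t_1^2 - 2.4t_1, \quad F_2(t_2) = -2.4t_2, \quad F_3(t_3) = 0.09t_3^4 - 1.08t_3^3 + 3.76t_3^2 - 2.4t_3,$$

so each summand depends on exactly one variable, and the only coupling between the variables is the linear constraint $t_1 + t_2 + t_3 \ge 12$.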

# 5 Conclusion

The case study of the reaction tank demonstrates that the extended linear duration invariants introduced in the paper represent a practical pattern of safety properties of real-time systems. The semantics of extended linear duration invariants was defined using integral real-time automata. By introducing extended linear duration invariants and integral real-time automata, we could extend the real-time automata approach to nonlinear programming.

#### References

- 1. Zhou Chaochen, C.A.R. Hoare, Anders P. Ravn. A Calculus of Durations. Information Processing Letters, Volume 40, Issue 5, 1991, pp 269-276.
- 2. Zhou Chaochen, Zhang Jingzhong, Yang Lu, Li Xiaoshan. Linear Duration Invariants. Formal Techniques in Real-Time and Fault-Tolerant Systems, Lecture Notes in Computer Science, Volume 863, 1994, pp 86-109.
- 3. Y. Kesten, A. Pnueli, J. Sifakis, S. Yovine. Integration Graphs: A Class of Decidable Hybrid Systems. Lecture Notes in Computer Science, Volume 736, Springer-Verlag, 1994, pp 179-208.
- 4. Victor A. Braberman, Dang Van Hung. On Checking Timed Automata for Linear Duration Invariants. 19th IEEE Real-Time Systems Symposium (RTSS98), Madrid, Spain, IEEE Computer Society Press, 1998, pp 264-273.
- 5. Miaomiao Zhang, Dang Van Hung, Zhiming Liu. Verification of Linear Duration Invariants by Model Checking CTL Properties. Theoretical Aspects of Computing - ICTAC 2008, Lecture Notes in Computer Science, Volume 5160, 2008, pp 395-409.
- 6. Miaomiao Zhang, Zhiming Liu, Naijun Zhan. Model Checking Linear Duration Invariants of Networks of Automata. Fundamentals of Software Engineering, Lecture Notes in Computer Science, Volume 5961, 2010, pp 244-259.
- 7. Pham Hong Thai, Dang Van Hung. Verifying Linear Duration Constraints of Timed Automata. Lecture Notes in Computer Science, Volume 3407, Springer-Verlag, 2005, pp 295-309.
- 8. A.P. Ravn, H. Rischel, K.M. Hansen. Specifying and Verifying Requirements of Real-Time Systems. IEEE Trans. Software Eng., Vol 19, No 1, January 1993, pp 41-55.
- 9. Zhou Chaochen, Li Xiaoshan. A Mean-Value Duration Calculus. In A Classical Mind, Essays in Honour of C. A. R. Hoare, A. W. Roscoe (ed.), Prentice Hall International, 1994, pp 431-451.
- 10. Rajeev Alur, David L. Dill. A Theory of Timed Automata. Theoretical Computer Science, Volume 126, 1994, pp 45-73.
- 11. E.V. Sørensen, A.P. Ravn, H. Rischel. Control Program for a Gas Burner: Part 1: Informal Requirements, ProCoS Case Study 1. ProCoS I, ESPRIT BRA 3104, Report No. ID/DTH EVS2, Department of Computer Science, Technical University of Denmark, 1990.
- 12. Rajeev Alur, Costas Courcoubetis, Thomas A. Henzinger, Pei-Hsin Ho. Hybrid Automata: An Algorithmic Approach to the Specification and Verification of Hybrid Systems. Hybrid Systems, Lecture Notes in Computer Science, Volume 736, 1993, pp 209-229.
- 13. Zhou Chaochen, Michael R. Hansen. Duration Calculus: A Formal Approach to Real-Time Systems. Monographs in Theoretical Computer Science, An EATCS Series. Springer-Verlag, 2004, 250p.
- 14. Zhou Chaochen, Michael R. Hansen, Peter Sestoft. Decidability and Undecidability Results for Duration Calculus. STACS 93, Lecture Notes in Computer Science, Volume 665, 1993, pp 58-68.
- 15. Zhou Chaochen, Anders P. Ravn, Michael R. Hansen. An Extended Duration Calculus for Hybrid Real-Time Systems. Lecture Notes in Computer Science, Volume 736, 1993, pp 36-59.
- 16. Hao Zhang, Shuning Wang. Global Optimization of Separable Objective Functions on Convex Polyhedra via Piecewise-Linear Approximation. Journal of Computational and Applied Mathematics, Vol 197, 2006, pp 212-217.

# A Spin-based Approach for Checking OSEK/VDX Applications

Haitao Zhang, Toshiaki Aoki, Yuki Chiba

Japan Advanced Institute of Science and Technology {zhanghaitao, toshiaki, chiba}@jaist.ac.jp

Abstract. OSEK/VDX, a standard for automotive operating systems, has been widely adopted by manufacturers to design and develop vehicle-mounted OSs. With the increasing functionality of vehicles, more and more applications are developed on top of the OSEK/VDX OS. However, techniques for verifying these applications are still at a preliminary stage. In our previous work, we proposed a bounded model checking approach to verify OSEK/VDX applications. In this paper, we describe and develop an alternative approach based on the Spin model checker. The paper has two goals: to show how Spin can be used to verify OSEK/VDX applications, and to compare the effectiveness of our bounded model checking approach and the Spin-based approach experimentally.

Keywords: OSEK/VDX applications, Scheduler, Spin model checker

#### 1 Introduction

OSEK/VDX [1][2], a standard for automotive operating systems, was proposed by German and French automobile manufacturers in 1994. The original motivation of the OSEK/VDX standard was to cope with the increasing software content in automobiles and to deliver high-quality products. The standard has since been widely adopted by automobile manufacturers such as BMW, Opel, and Volkswagen to design and develop vehicle-mounted OSs. To enhance driving pleasure and safety, more and more applications are developed on top of the OSEK/VDX OS. However, with the increasing complexity of development, exhaustively checking the developed OSEK/VDX applications is becoming a challenge for developers.

To check OSEK/VDX applications exhaustively, model checking [3][4] can be applied. Many model checking methods have been applied to verify sequential software [5] and multi-threaded software [6][7]. However, it is difficult to use these existing methods directly on OSEK/VDX applications, since their execution characteristics differ from those of sequential software and general multi-threaded software. When an application runs on the OSEK/VDX OS, (i) the tasks within the application execute concurrently, and the running task is explicitly determined by a scheduler according to task priorities and configuration data; (ii) tasks can invoke service APIs to interact with the OSEK/VDX OS in order to change task states, set a synchronization event, or access a shared resource; and (iii) the invoked service APIs may lead to a context switch between tasks. Checking an OSEK/VDX application therefore differs from checking either kind of software: unlike sequential software, the application consists of concurrently executing tasks, and unlike general multi-threaded software, it interacts with the OSEK/VDX OS via service APIs and its execution is directed by an explicit scheduler<sup>1</sup>.

In our previous work [8][9], we proposed a technique named execution path generator (EPG) to verify the design model of OSEK/VDX applications based on bounded model checking (BMC) [10]. In order to construct the transition system of an OSEK/VDX application accurately without including the behaviors of the OS model in that transition system, an OS model corresponding to the OSEK/VDX specification is embedded in the EPG to respond to the invoked service APIs and compute the running task. Our experiments with the EPG technique show that, although it can handle complex applications with many tasks and APIs, it spends much time on applications that contain many loops. Furthermore, the EPG technique currently cannot check applications that contain interrupts. Therefore, in this paper we develop an alternative approach to check OSEK/VDX applications based on the Spin model checker [11], and we compare the effectiveness of the Spin-based approach and the EPG technique experimentally.

To check an OSEK/VDX application accurately with the Spin model checker, our Spin-based approach uses a synchronization model (SynM) to simulate the executions of the target application. In the SynM, every task and interrupt service routine (ISR) of the target application is regarded as a process, and the OSEK/VDX OS model is a special process that responds to the invoked service APIs and directs the executions of the tasks. Promela channels are used to implement the interaction between the application model and the OS model via service APIs.

We have implemented our SynM in the Spin model checker and conducted experiments on several OSEK/VDX applications. The results show that our Spin-based approach can check safety properties related to variables, service APIs, OS data, and mutual exclusion, and that it can also check applications that contain ISRs. In addition, we compared the effectiveness of the Spin-based approach and the EPG technique. The comparison shows that (i) for simple applications that contain few tasks (fewer than 15) but many loops, the Spin-based approach is faster than the EPG technique, whereas (ii) for complex applications that contain many tasks and APIs, the EPG technique is more efficient than the Spin-based approach.

<sup>1</sup> In general multi-threaded software such as SystemC programs, the executions of threads are directed by a non-deterministic scheduler. To check such software exhaustively, all possible interleavings of threads are taken into account in the checking process.

The rest of the paper is structured as follows. The preliminaries on the OSEK/VDX OS and its applications are presented in section 2. Based on the discussion of the execution characteristics of OSEK/VDX applications, the Spin-based approach is presented in section 3. Experiments that evaluate our approach are reported in section 4. Related work is discussed in section 5. Conclusions and future work are given in the last section.

# 2 Preliminaries

#### 2.1 OSEK/VDX OS

A general OSEK/VDX OS consists of a scheduler module, an event process module, a resource process module, an alarm process module, and an interruption process module. Based on these system modules, the OSEK/VDX OS provides standardized application programming interfaces (APIs) with which users develop customized applications. In our research, we focus on applications that communicate with the scheduler module, the event process module, the resource process module, and the interruption process module. The structure of the OSEK/VDX OS with an application is shown in Fig.1.

Scheduler module: The OSEK/VDX OS can process two types of tasks, basic tasks and extended tasks. A basic task is in one of three states: running, suspended, or ready. In contrast to a basic task, an extended task can hold synchronization events and has an additional state, the waiting state. The scheduler uses a static priority scheduling policy with non-preemptive and full-preemptive strategies to direct the executions of tasks, and it manages a ready queue that determines the execution order of tasks. In addition, the scheduler responds to four service APIs (TerminateTask, ActivateTask, ChainTask, and Schedule) that tasks can invoke to switch task states. For instance, if the running task invokes ActivateTask(tk1) while task tk1 is in the suspended state, the scheduler moves tk1 from the suspended state to the ready state.
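The state transitions described above can be sketched in a few lines of Python. This is only an illustrative model under simplifying assumptions: the class `Scheduler`, its method names, and the numeric priorities are hypothetical, only the full-preemptive strategy is modelled, and activation requests for tasks that are not suspended are ignored.

```python
from collections import deque

SUSPENDED, READY, RUNNING = "suspended", "ready", "running"


class Scheduler:
    """Toy static-priority, full-preemptive scheduler (hypothetical names)."""

    def __init__(self, priorities):
        # priorities: task name -> static priority (larger means more urgent)
        self.priorities = dict(priorities)
        self.state = {name: SUSPENDED for name in priorities}
        # one FIFO queue per priority level, since tasks may share a priority
        self.ready = {p: deque() for p in set(priorities.values())}
        self.running = None

    def _dispatch(self):
        # run the head of the highest non-empty priority queue, if any
        for prio in sorted(self.ready, reverse=True):
            if self.ready[prio]:
                self.running = self.ready[prio].popleft()
                self.state[self.running] = RUNNING
                return
        self.running = None

    def activate_task(self, name):
        if self.state[name] != SUSPENDED:
            return  # simplification: only suspended tasks are activated
        self.state[name] = READY
        self.ready[self.priorities[name]].append(name)
        current = self.running
        if current is None:
            self._dispatch()
        elif self.priorities[name] > self.priorities[current]:
            # preemption: the running task goes back to the front of its queue
            self.state[current] = READY
            self.ready[self.priorities[current]].appendleft(current)
            self._dispatch()

    def terminate_task(self):
        # the running task becomes suspended and the next ready task is run
        if self.running is not None:
            self.state[self.running] = SUSPENDED
        self._dispatch()


sched = Scheduler({"tk1": 2, "tk2": 1})
sched.activate_task("tk2")   # nothing else is ready, so tk2 starts running
sched.activate_task("tk1")   # tk1 has the higher priority and preempts tk2
print(sched.running, sched.state)
```

Keeping one queue per priority level mirrors the ready queue of the text; a non-preemptive task would simply skip the preemption branch in `activate_task`.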

Event process module: In the event process module, the OSEK/VDX OS provides a synchronization mechanism for implementing synchronous executions between tasks. Only extended tasks can hold events, each a finite number of them, and events are the criteria for switching a task from the running state to the waiting state or from the waiting state to the ready state. The event process module responds to three service APIs (SetEvent, WaitEvent, and ClearEvent), which tasks invoke to implement synchronous executions. E.g., when the running task tk1 waits for the event evt1 using the service API WaitEvent(evt1), tk1 cannot continue until evt1 is set by another task (basic or extended) using the service API SetEvent(tk1, evt1).

Fig. 1. The structure of OSEK/VDX OS with an application.
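The waiting and wake-up conditions of the event mechanism can be illustrated with a small Python sketch. The class `EventModel` and its method names are hypothetical; the sketch only tracks which events are set for each extended task and whether a call would block or wake a task, and it leaves the actual state change to a scheduler.

```python
class EventModel:
    """Toy event module for extended tasks (hypothetical names)."""

    def __init__(self, extended_tasks):
        # events currently set for each extended task
        self.pending = {t: set() for t in extended_tasks}
        # events each task is waiting for (empty set: not waiting)
        self.waiting_for = {t: set() for t in extended_tasks}

    def wait_event(self, task, events):
        """Return True if the task must move to the waiting state."""
        events = set(events)
        if self.pending[task] & events:
            return False  # an awaited event is already set: keep running
        self.waiting_for[task] = events
        return True

    def set_event(self, task, events):
        """Set events for `task`; return True if this wakes it up."""
        self.pending[task] |= set(events)
        if self.waiting_for[task] & self.pending[task]:
            self.waiting_for[task] = set()
            return True  # waiting -> ready
        return False

    def clear_event(self, task, events):
        """Clear the given events of `task`."""
        self.pending[task] -= set(events)


ev = EventModel(["tk1"])
print(ev.wait_event("tk1", {"evt1"}))  # tk1 has to wait for evt1
print(ev.set_event("tk1", {"evt1"}))   # another task sets evt1 and wakes tk1
```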

Resource process module: Priority inversion and deadlock are two typical problems of common synchronization mechanisms when several tasks with different priorities access the same shared resource. To avoid these two problems, the OSEK/VDX OS adopts the Priority Ceiling Protocol [12] to coordinate accesses to shared resources in the resource process module. The module supports two service APIs (GetResource and ReleaseResource) that tasks invoke to access a shared resource according to the ceiling priority of that resource. For example, if the running task invokes GetResource(res1) and its priority is lower than the ceiling priority of res1, the priority of the task is raised to the ceiling priority of res1; when the task invokes ReleaseResource(res1), its priority is reset to the priority it had before acquiring res1. Note that the ceiling priority of a shared resource is at least the highest priority of all tasks that access the resource, and it is lower than the lowest priority of all tasks that do not access the resource and whose priority is higher than that of every accessing task.
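The priority adjustment performed by GetResource and ReleaseResource can be sketched as follows. The class `ResourceModel`, its method names, and the numeric priorities are assumptions made for illustration; the sketch also assumes that resources are released in the reverse order of acquisition.

```python
class ResourceModel:
    """Toy priority-ceiling bookkeeping (hypothetical names)."""

    def __init__(self, task_priority, ceiling):
        self.priority = dict(task_priority)  # current priority of each task
        self.ceiling = dict(ceiling)         # resource -> ceiling priority
        # per task: stack of (resource, priority before acquiring it)
        self.saved = {t: [] for t in task_priority}

    def get_resource(self, task, res):
        self.saved[task].append((res, self.priority[task]))
        # raise the task to the ceiling priority if it is currently lower
        if self.priority[task] < self.ceiling[res]:
            self.priority[task] = self.ceiling[res]

    def release_resource(self, task, res):
        held, old_priority = self.saved[task].pop()
        if held != res:
            raise ValueError("resources must be released in LIFO order")
        # restore the priority the task had before acquiring the resource
        self.priority[task] = old_priority


rm = ResourceModel({"tk1": 1, "tk2": 3}, {"res1": 3})
rm.get_resource("tk1", "res1")
print(rm.priority["tk1"])      # raised to the ceiling priority of res1
rm.release_resource("tk1", "res1")
print(rm.priority["tk1"])      # back to the original priority
```

While tk1 holds res1 it runs at the ceiling priority, so the other task that uses res1 cannot preempt it inside the critical section.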

Interruption process module: Interrupt service routines (ISRs) play an important role in OSEK/VDX applications, for example by responding to an external event or receiving data from a sensor. In the OSEK/VDX OS, the interruption process module supports two categories of ISRs (ISR categories 1 and 2). The features of ISRs within the OSEK/VDX OS are as follows. (i) ISRs are triggered by external asynchronous events. (ii) ISRs can interrupt both non-preemptive and full-preemptive tasks, and a lower priority ISR can be interrupted by a higher priority ISR. (iii) In contrast to a category 1 ISR, a category 2 ISR can invoke service APIs to activate a task, set an event for an extended task, and access a shared resource. (iv) Rescheduling takes place when a category 2 ISR has terminated and no other ISR is active.



Fig. 2. The simple application.



Fig. 3. The execution sequences of the application shown in Fig. 2.

# 2.2 OSEK/VDX Application and Execution Characteristics

An application developed for the OSEK/VDX OS consists of two files: a source file and a configuration file. The source file, which is written in C, describes the concrete behavior of the application. The configuration file defines the tasks, events, resources, and ISRs. A simple OSEK/VDX application without ISRs is shown in Fig.2.

To illustrate the execution characteristics of OSEK/VDX applications, we discuss an example. In the simple application shown in Fig.2, only the attribute AUTOSTART<sup>2</sup> of conTask is set to TRUE, so conTask is first moved to the running state by the scheduler and then executed. As shown in Fig.3, when conTask invokes the service API ActivateTask(plusTask), the scheduler is invoked to respond to the API (an activated task is moved from the suspended state to the ready state by the scheduler). At this moment, the running task conTask is preempted by plusTask, since the priority of plusTask is higher than that of conTask and the attribute SCHEDULE<sup>3</sup> of conTask is set to FULL. plusTask then obtains the run-unit, runs, and goes to the suspended state when it invokes the service API TerminateTask() (TerminateTask() terminates the execution of the calling task, which the scheduler moves from the running state to the suspended state; the scheduler then dispatches the task at the head of the ready queue). When plusTask has terminated, conTask is moved to the running state again and continues its execution from the point of preemption. Then conTask activates minusTask, which runs only after conTask invokes TerminateTask() (minusTask cannot preempt conTask, since the priority of minusTask is lower than that of conTask).

<sup>2</sup> AUTOSTART: if the attribute AUTOSTART of a task is set to TRUE, the task is in the ready state initially. Otherwise, it is in the suspended state initially.

The example exhibits the following execution characteristics: (i) the scheduler determines which task of the application runs, according to the ready queue and the configuration file of the application; (ii) task states can be changed by the invoked service APIs; and (iii) the invoked service APIs may lead to a context switch between tasks. These characteristics distinguish OSEK/VDX applications from sequential software and from general multi-threaded software. To verify OSEK/VDX applications accurately with the Spin model checker, two challenges must be addressed: (i) how to implement the scheduling behaviors, and (ii) how to implement the interaction between tasks and the scheduler via service APIs. To overcome these two challenges, we develop a synchronization model that simulates the executions of OSEK/VDX applications, which is presented in the next section.

# 3 The Spin-based Checking Approach

# 3.1 The Synchronization Model

To check OSEK/VDX applications accurately with the Spin model checker, the key issue is how to construct the checking model. The example shown in Fig.2 illustrates that the running task within the application is determined by the scheduler according to the ready queue and the task configuration data. Thus, to simulate the executions of tasks accurately, we include an OS model (in particular a scheduler model) in the checking model to direct the executions of tasks.

In our approach, a synchronization model (SynM), shown in Fig.4, is constructed to simulate the executions of the target application. The SynM combines an OS model $\mathcal{OS}$ and an application model App. The OS model corresponds to the OSEK/VDX specification and is employed to direct the executions of the application and respond to the invoked service APIs. The application model $App=\{\Delta, T, I\}$ is a set of components, where $\Delta$ is the configuration file of the application, $T=\{t_1, t_2, \cdots\}$ is the finite set of tasks defined in the application, and $I=\{isr_1, isr_2, \cdots\}$ is the finite set of interrupt service routines (ISRs) defined in the application (note that in the SynM the tasks, the ISRs, and the OS model are all regarded as processes). The application model and the OS model execute synchronously via service APIs. The execution characteristics of the SynM are as follows.

<sup>3</sup> SCHEDULE: if the attribute SCHEDULE of a task is set to FULL, the task can be preempted by higher priority tasks. Otherwise, the task does not leave the *running* state until it invokes the service API *TerminateTask*, *ChainTask* or *Schedule*, or waits for an event.

Fig. 4. The synchronization model (SynM).

When an application runs on the OSEK/VDX OS, the task at the head of the ready queue is dispatched if the run-unit is idle. The other tasks in the ready, suspended, and waiting states do not run until the scheduler selects them. Thus, the first execution characteristic of the SynM is as follows.

- A task $t \in T$ can run iff its ID equals the running task ID computed by the OS model; the remaining tasks $A' = T \setminus \{t\}$ are blocked.

Once the application invokes a service API, the OS is loaded to respond to it (the execution of the application is preempted by the OS). When the OS has completed its execution, the run-unit is released and the application continues. Accordingly, the following three execution characteristics are built into the SynM to simulate the interaction between the OS model and the application model.

- The application model App and the OS model $\mathcal{OS}$ execute synchronously via service APIs.
- When a service API is invoked by the running task t or by an isr, that task or ISR stops its execution and waits for the OS model to execute.
- Once the OS model receives an invoked service API from App, it executes in order to respond to the service API and to compute the running task ID. When the OS model has completed its execution, the application model App resumes from the point where it stopped, and the OS model waits for the next service API from the application model.
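The four characteristics above can be imitated outside Spin with Python generators, as a rough sketch rather than the authors' model. Each task is a generator that yields a service call and is resumed only when the OS part has selected it as the running task. The priority order plusTask > conTask > minusTask follows the example of Fig.2 as described in the text; the numeric values, the function names, and the shared `buffer` dictionary are assumptions, and only ActivateTask and TerminateTask are handled.

```python
def con_task(state):
    while True:                                  # mirrors "goto start"
        yield ("ActivateTask", "plusTask")
        yield ("ActivateTask", "minusTask")
        yield ("TerminateTask", None)


def plus_task(state):
    while True:
        state["buffer"] += 1
        yield ("TerminateTask", None)


def minus_task(state):
    while True:
        state["buffer"] -= 1
        yield ("TerminateTask", None)


def run_synm(bodies, priority, autostart, max_steps=20):
    """Alternate between the running task and a tiny OS model."""
    state = {"buffer": 0}
    procs = {name: body(state) for name, body in bodies.items()}
    active = {autostart}          # tasks that are ready or running
    run_task = autostart
    trace = []
    for _ in range(max_steps):
        if run_task is None:      # no task left to run
            break
        # only the task selected by the OS model may take a step
        api, arg = next(procs[run_task])
        trace.append((run_task, api, arg))
        # OS model: respond to the received service call
        if api == "ActivateTask":
            active.add(arg)
        elif api == "TerminateTask":
            active.discard(run_task)
        # OS model: recompute the running task (full-preemptive)
        run_task = max(active, key=priority.get) if active else None
    return trace, state["buffer"]


BODIES = {"conTask": con_task, "plusTask": plus_task, "minusTask": minus_task}
PRIORITY = {"plusTask": 3, "conTask": 2, "minusTask": 1}   # assumed values
trace, buffer = run_synm(BODIES, PRIORITY, autostart="conTask")
for step in trace:
    print(step)
```

The trace reproduces the order of service calls described for Fig.3: conTask is preempted by plusTask, resumes, and minusTask runs only after conTask terminates. Unlike Spin, this sketch follows a single execution and does not explore a state space.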



Fig. 5. OS model.

To simulate ISRs completely, our SynM considers all possible interleavings between the running task and the ISRs, because the occurrence time and the execution order of ISRs are non-deterministic. In the checking process, the ISRs and the currently running task can thus be treated as a concurrent program. In the SynM, we stipulate that the ISRs in I and the running task t execute concurrently, and that any two ISRs $isr_i, isr_j \in I$ also execute concurrently. In addition, to prevent the execution of an ISR from being interleaved with tasks, the behavior of each ISR is specified as an atomic sequence. This introduces a limitation: because of the atomic sequence, our approach does not allow a higher priority ISR to interrupt a lower priority ISR. Based on the above analysis, the following execution characteristic is added to the SynM.

- Every $isr \in I$ and the running task $t \in T$ execute concurrently, and any two $isr_i, isr_j \in I$ execute concurrently in the SynM. Note that the behavior of each ISR is specified as an atomic sequence.
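To make the notion of "all possible interleavings with atomic ISRs" concrete, the following Python sketch enumerates the schedules of one task and a set of ISRs. It assumes, for illustration only, that each ISR fires exactly once; the function name and the step labels are hypothetical.

```python
def interleavings(task_steps, isrs):
    """Yield every schedule of one task and a set of one-shot atomic ISRs.

    task_steps: list of task step labels, executed in the given order.
    isrs: dict mapping an ISR name to its steps, run as one atomic block.
    """
    if not isrs:
        yield list(task_steps)
        return
    # choice 1: the task takes its next step
    if task_steps:
        for rest in interleavings(task_steps[1:], isrs):
            yield [task_steps[0]] + rest
    # choice 2: a pending ISR fires now and runs to completion
    for name in sorted(isrs):
        remaining = {k: v for k, v in isrs.items() if k != name}
        for rest in interleavings(task_steps, remaining):
            yield list(isrs[name]) + rest


for schedule in interleavings(["t1", "t2"], {"isr1": ["i1a", "i1b"]}):
    print(schedule)
```

Because an ISR block is never split, no schedule places a task step or another ISR between `i1a` and `i1b`, which is exactly the limitation noted above: nested interruption of ISRs is not represented.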

Furthermore, to support checking properties related to task states, event states, and shared resource states, the SynM provides an interface to the OS data.

Fig. 6. The synchronization functions specified in Promela.

```
/* Block the calling task until the OS model selects it as the running task. */
inline waitForRun(_tid){
  (runTask == _tid);
}
```

Fig. 7. The inline function waitForRun(\_tid) specified in Promela.

#### 3.2 Implementation in Spin

Based on the SynM, the executions of an OSEK/VDX application can be simulated. To make it convenient to check OSEK/VDX applications with Spin, we have constructed an OS model in Promela according to the OSEK/VDX specification. The OS model supports three interface functions with which the application model of a given application can be constructed easily. The OS model and the interface functions are described below.

OS model: The OS model is based on our previous work [13], which has been adopted by Japanese automobile manufacturers to test developed OSEK/VDX OSs. As shown in Fig.5, the OS model is a tuple $\mathcal{OS}=(S, s_0, D, F, \Sigma)$ that combines a scheduler model, an event process model, and a resource process model. Here, S is the finite set of states, $s_0 \in S$ is the initial state, $D=\{runTask, readyQueue, suspendList, waitList, evtBitArray, resAccessList\}$ is the set of data structures, F is the set of functions, and $\Sigma \subseteq S \times F \times S$ is the transition relation.

In the OS model, D is the interface to the OS data shown in the SynM. The variable runTask stores the tid of the running task ($tid \in \mathbb{N}$ is the task identifier). Since several tasks can share the same priority in the OSEK/VDX OS, readyQueue is composed of one queue per priority and stores the tids of the ready tasks. The data structures suspendList and waitList store the tids of the tasks in the suspended state and the waiting state, respectively. The matrix evtBitArray stores the event states of extended tasks ($eid \in \mathbb{N}$ is the event identifier). resAccessList, which is composed of lists, indicates the state of the resources accessed by tasks ($rid \in \mathbb{N}$ is the resource identifier). In the function set F, API?APIName(para1, para2) and notifyAPP!finishMessage are the synchronization functions; their Promela implementations are shown in Fig.6. Here, API?APIName(para1, para2) receives the invoked service API from the application model (APIName is the name of the invoked service API, and para1 and para2 are its parameters), and notifyAPP!finishMessage notifies the application model that the OS model has completed its execution. In addition, the assertion assert(runTask != -1) terminates the checking process if there is no running task ("-1" represents that no task is running). The other functions in F, such as ChainTask(tid) and TerminateTask(), are the standardized functions defined in the OSEK/VDX specification and operate on the system data D according to the invoked service APIs.

```
inline taskAPI(_tid,_APIName,_para1,_para2){
  atomic{
  /* Record resource ownership so that ISRs can be held back. */
  if
    ::_APIName == GetResource ->
        resAccessList[_para1].lock=true;
    ::_APIName == ReleaseResource ->
        resAccessList[_para1].lock=false;
    ::else -> skip;   /* other service APIs leave the lock unchanged */
  fi;
  API ! _APIName(_para1,_para2);   /* hand the service call to the OS model */
  notifyAPP ? finishMessage;       /* wait until the OS model has finished */
  (_tid == runTask);               /* block until this task is running again */
  }
}
```

Fig. 8. The inline function taskAPI specified in Promela.

Fig. 9. The inline function ISRAPI specified in Promela.

Interface functions: The first interface function waitForRun(), shown in Fig.7, blocks every task whose tid is not equal to runTask.

The second interface function taskAPI(), shown in Fig.8, can be invoked by tasks and simulates the behavior of service APIs. In it, API!\_APIName(\_para1, \_para2) and notifyAPP?finishMessage implement the interaction between the OS model and tasks, and (\_tid == runTask) simulates the context switch caused by the invoked service API (the parameter \_tid is the ID of the calling task). In addition, since category 2 ISRs can invoke service APIs to access a shared resource, a mutual exclusion problem may arise when the running task and an ISR want to access the same shared resource. Therefore, the if branches change the state of the shared resource in order to block ISRs while the running task holds the resource; the variable lock records the resource state.

```
#include "OSmodel.h"

/* conTask activates the two worker tasks and then terminates. */
proctype conTask() {
start:
  waitForRun(conTask.tid);
  taskAPI(conTask.tid, ActivateTask, plusTask.tid, -1);
  taskAPI(conTask.tid, ActivateTask, minusTask.tid, -1);
  taskAPI(conTask.tid, TerminateTask, -1, -1);
  goto start;
}

proctype plusTask() {
start:
  waitForRun(plusTask.tid);
  buffer = buffer + 1;
  taskAPI(plusTask.tid, TerminateTask, -1, -1);
  goto start;
}

proctype minusTask() {
start:
  waitForRun(minusTask.tid);
  buffer = buffer - 1;
  taskAPI(minusTask.tid, TerminateTask, -1, -1);
  goto start;
}

/* The run statements are assumed to belong to an init process. */
init {
  run OSModel(); run conTask(); run plusTask(); run minusTask();
}
```

Fig. 10. The application model of the example shown in Fig.2.

As for tasks, we provide an interface function for category 2 ISRs that implements the interaction between the OS model and ISRs; it is shown in Fig.9. In this function, the variable *lock* is used to implement mutual exclusion between ISRs and tasks. Note that the functions shown in Fig.8 and Fig.9 only block ISRs when the running task and an ISR want to access the same resource; accesses to shared resources among tasks and among ISRs are coordinated by the resource process model and by the atomic sequences, respectively.

Checking application using Spin: Based on the OS model and the supported interface functions, we can construct an application model for a given application and check it with the Spin model checker. E.g., the application model of the example shown in Fig.2 is presented in Fig.10. For the configuration file, our OS model also provides an interface function for inputting the configuration data. The OS model is available at the osek-spin homepage<sup>4</sup>. In addition, since category 2 ISRs can invoke service APIs to set an event for an extended task or to activate a task in suspendList, they may introduce a rescheduling point. Therefore, when checking an application that contains a category 2 ISR, the function waitForRun(\_tid) should be inserted into each transition of the tasks to simulate the context switch of tasks.

<sup>4</sup> http://www.jaist.ac.jp/~s1220209/osek-spin.htm

#### 3.3 Given Property

Based on the OS model and the supported interface functions, we can check an OSEK/VDX application accurately with the Spin model checker. In this section, we discuss which kinds of properties can be checked by our approach in practice.

Variable property: In practice, we sometimes want to check whether the execution of the target application has reached a specified state by asserting the values of variables declared in the application. Since the SynM lets Spin explore all executions of the target application, our approach can check variable properties using assertion statements.

LTL property: In addition to assertions, properties with temporal operators are frequently used to check an application. For instance, we may want to check whether the value of a variable will eventually become zero. Since the Spin model checker accepts properties specified in Linear Temporal Logic (LTL), our approach can check LTL properties.

Service API property: Service APIs are also an interesting checking point for OSEK/VDX applications, since they play an important part in the interaction between the application and the OSEK/VDX OS. In the checking process, we usually want to check whether a service API will be invoked by tasks. In our approach, a service API is represented as a set {APIName,para1,para2} of variables in Promela. Therefore, our approach can check service API properties.

OS data property: When an application runs on the OSEK/VDX OS, it is difficult to judge its execution status, since the executions of OSEK/VDX applications are directed by the scheduler, and tasks within the application can invoke service APIs to synchronize and to access shared resources. To observe the execution status of an application clearly, the states of tasks, events, and shared resources are often used as checking points. To check this type of property (which we call an OS data property in this paper), the OS model in our approach provides an interface to OS data such as the data in the ready queue. E.g., we can use the LTL property shown in formula (1) to check whether the task tid will be run after ActivateTask(tid) is invoked.
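A property of this kind follows a response pattern: whenever ActivateTask(tid) is invoked, the task tid eventually becomes the running task. As a rough illustration only, the Python sketch below evaluates that pattern on a single finite trace of observations. The trace format (dictionaries with the keys `runTask` and `api`) is an assumption made for this example, and a finite-trace check is much weaker than the exhaustive LTL check that Spin performs.

```python
def responds(trace, tid):
    """Finite-trace check of: every ActivateTask(tid) is followed,
    in the same or a later step, by runTask == tid."""
    pending = False
    for step in trace:
        if step.get("api") == ("ActivateTask", tid):
            pending = True       # a request is waiting to be served
        if step["runTask"] == tid:
            pending = False      # the activated task is now running
    return not pending


served = [
    {"runTask": 1, "api": ("ActivateTask", 2)},
    {"runTask": 1, "api": ("TerminateTask", None)},
    {"runTask": 2, "api": None},
]
starved = [
    {"runTask": 1, "api": ("ActivateTask", 2)},
    {"runTask": 1, "api": None},
]
print(responds(served, 2), responds(starved, 2))
```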

Mutual exclusion property: Mutual exclusion is also checked in practice, since tasks and ISRs within an application can enter a critical section to access a shared resource using the service APIs GetResource(rid) and ReleaseResource(rid). Informally, mutual exclusion comprises two properties, exclusiveness and liveness. In our approach, the tid of each task that accesses a shared resource is recorded in resAccessList of the OS model. Thus, our approach can check both properties. For instance, we can use the LTL properties shown in formulas (2) and (3) to check the exclusiveness property and the liveness property, respectively, where we suppose that task tk1 and task tk2 access the same shared resource rid, IN denotes that a task tid occurs in a list, and n is the number of tasks defined in the application.

$$\Box\,!\,(tk1.tid \text{ IN } resAccessList[rid].list[0:n] \ \&\&\ tk2.tid \text{ IN } resAccessList[rid].list[0:n]) \qquad (2)$$

$$\Diamond\,(tk1.tid \text{ IN } resAccessList[rid].list[0:n]) \qquad (3)$$
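The exclusiveness and liveness properties can be read as simple predicates over the recorded resource accesses. The Python sketch below evaluates them on one finite trace of snapshots; the snapshot format (a dictionary from resource id to the set of task ids currently recorded for it) and the function names are assumptions for illustration, whereas Spin checks the LTL formulas over all executions of the model.

```python
def never_both(trace, rid, tid1, tid2):
    """Exclusiveness: tid1 and tid2 are never recorded for rid together."""
    return not any(tid1 in snap[rid] and tid2 in snap[rid] for snap in trace)


def eventually_holds(trace, rid, tid):
    """Liveness: tid is recorded for rid in at least one snapshot."""
    return any(tid in snap[rid] for snap in trace)


# resource 0 is held first by task 1 and later by task 2
trace = [{0: set()}, {0: {1}}, {0: set()}, {0: {2}}]
print(never_both(trace, 0, 1, 2))      # the two tasks never overlap
print(eventually_holds(trace, 0, 1))   # task 1 does obtain the resource
```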

# 4 Experiment and Discussion

To show the practicality of our approach, we carried out a number of experiments. To investigate the effectiveness of our approach comprehensively, we selected applications with different numbers of tasks, APIs, and loops as benchmarks. Moreover, we compared the Spin-based approach with `osek-bmc`<sup>5</sup>, an implementation of our EPG technique. The experiments investigate four aspects: task number, API number, loop number, and ISR number. jSpin is used as the experiment platform; its "C compiler" option is set to "-DVECTORSZ=16384 -DBITSTATE" and the max depth is set to "20,000,000". For the EPG technique, the max depth is set to "20,000,000" and the loop bound is set to 40. All experiment results are listed in Table 1. In the table, #t is the number of tasks, #l is the number of loops, #s is the number of explored states, "Mb" is the memory consumption in megabytes, and "time" is the time consumption in seconds. The benchmarks used in the experiments are available at http://www.jaist.ac.jp/~s1220209/osek-spin.htm.

# 4.1 Experiment results

Table 1 shows several noticeable results. In every experiment of lines 1-18 for which both tools report a state count, the Spin-based approach explores more states than <code>osek-bmc</code>. Moreover, as the numbers of tasks and APIs increase, Spin runs out of memory or time (e.g., lines 4 and 10). In contrast, <code>osek-bmc</code> checks these examples successfully while exploring few states, at a lower cost (time and memory) than Spin. The reason is as follows. In the EPG technique, the OS model is embedded in the checking algorithm, so the transitions of the OS model are not themselves verified during checking. In the Spin-based approach, however, the OS model is part of the constructed checking model: Spin checks not only the behaviors of the tasks but also those of the OS model, and all states of both the tasks and the OS model are stored in memory during checking. Therefore, on these applications the Spin-based approach spends more time and memory than the EPG technique.

<sup>5</sup> osek-bmc is available at http://www.jaist.ac.jp/~s1220209/Index.htm

Table 1. Comparison between the Spin-based approach and the EPG technique (M.O.: out of memory; T.O.: timeout)

| benchmark        | #t   | #l | loop bound | #API | Spin: #s            | Spin: Mb | Spin: time(s) | Spin: result | osek-bmc/EPG: #s | osek-bmc/EPG: Mb | osek-bmc/EPG: time(s) | osek-bmc/EPG: result |
|------------------|------|----|------------|------|---------------------|------|---------|--------|--------------|------|---------|--------|
| 1 passCnt1       | 4    | 0  | -          | 4    | 480                 | 755  | 0.19    | sat    | 18           | 2.13 | 0.093   | sat    |
| 2 passCnt2       | 10   | 0  | -          | 10   | 137225              | 768  | 3.76    | sat    | 46           | 2.13 | 0.097   | sat    |
| 3 passCnt3_bug   | 15   | 0  | -          | 15   | 670176              | 798  | 17.6    | unsat  | 29           | 2.14 | 0.231   | unsat  |
| 4 passCnt4_bug   | 20   | 0  | -          | 20   | -                   | M.O. | T.O.    | -      | 41           | 2.15 | 0.301   | unsat  |
| 6 msgp4_bug      | 18   | 0  | -          | 35   | -                   | -    | T.O.    | -      | 145          | 2.22 | 0.571   | unsat  |
| 7 increAPI1_bug  | 10   | 1  | 10         | 200  | 2955686             | 832  | 59.9    | unsat  | 333          | 2.23 | 2.923   | unsat  |
| 8 increAPI2_bug  | 10   | 1  | 20         | 400  | 5905975             | 891  | 116     | unsat  | 663          | 2.24 | 6.130   | unsat  |
| 9 increAPI3_bug  | 10   | 1  | 30         | 600  | 8897424             | 937  | 174     | unsat  | 993          | 2.27 | 10.24   | unsat  |
| 10 increAPI4_bug | 10   | 1  | 40         | 800  | -                   | M.O. | -       | -      | 1323         | 2.31 | 15.23   | unsat  |
| 11 token2_bug    | 6    | 6  | 40         | 161  | 34371               | 765  | 1.12    | unsat  | 2283         | 2.23 | 139     | unsat  |
| 12 token3_safe   | 9    | 9  | 40         | 161  | 46990               | 769  | 1.26    | sat    | 6417         | 2.41 | 192     | sat    |
| 13 cyclic1       | 6    | 16 | 5          | 86   | 4025                | 757  | 0.26    | sat    | 992          | 2.41 | 10.76   | sat    |
| 14 cyclic2       | 9    | 28 | 10         | 289  | 21803               | 761  | 1.31    | sat    | 3276         | 2.61 | 60.59   | sat    |
| 15 cyclic3       | 12   | 40 | 10         | 412  | 116432              | 768  | 3.97    | sat    | 4680         | 2.80 | 94.94   | sat    |
| 16 cyclic4       | 15   | 56 | 10         | 575  | 1110057             | 799  | 29.4    | sat    | 6552         | 3.06 | 198.4   | sat    |
| 17 acc_res1_safe | 2    | 3  | 10         | 4    | 13907               | 759  | 0.46    | sat    | 3491         | 2.32 | 88.8    | sat    |
| 18 acc_res2_safe | 13   | 13 | 10         | 480  | 483126              | 762  | 12.2    | sat    | -            | -    | T.O.    | -      |
| 19 passCnt1_1ISR | 4    | 0  | -          | 4    | 500                 | 761  | 0.18    | unsat  |              |      |         |        |
| 20 passCnt1_2ISR | 4    | 0  | -          | 4    | 724                 | 762  | 0.19    | unsat  |              |      |         |        |
| 21 passCnt1_3ISR | 4    | 0  | -          | 4    | 9656                | 763  | 0.43    | unsat  |              |      |         |        |
| 22 passCnt2_1ISR | 10   | 0  | -          | 10   | 131037              | 769  | 3.56    | unsat  |              |      |         |        |
| 23 passCnt2_2ISR | 10   | 0  | -          | 10   | 146770              | 772  | 3.93    | unsat  |              |      |         |        |
| 24 passCnt2_3ISR | 10   | 0  | -          | 10   | 666960              | 775  | 17.2    | unsat  |              |      |         |        |

However, if the target application contains few tasks but many loops (lines 11-18), the Spin-based approach outperforms <code>osek-bmc</code> in time consumption. The reason is that, in the EPG technique, different APIs in different branches lead to different task execution sequences, so the transition system of the target application is constructed from its execution paths. When the target application contains many loops, <code>osek-bmc</code> therefore checks a large number of execution paths, and many identical sub-paths are verified repeatedly, which degrades the performance of <code>osek-bmc</code>. In the Spin-based approach, by contrast, loops are not unfolded during checking, and no appropriate loop bound has to be chosen. These characteristics make the Spin-based approach more efficient than the EPG technique on such applications. Furthermore, the experiments in lines 19-24 show that the Spin-based approach can check applications that contain ISRs (the EPG technique currently cannot check ISRs).

#### 4.2 Discussion

The experiments yield several findings that are relevant to the practical verification of OSEK/VDX applications. (i) Simple applications that contain few tasks (fewer than 15) but many loops can be checked with the Spin-based approach. (ii) Complex applications that contain many tasks and APIs should be verified with osek-bmc. (iii) If the target application contains many tasks, APIs and loops, we can first check it with the Spin-based approach until Spin runs out of memory, and then continue the checking process with osek-bmc.

#### 5 Related work

With the development of the OSEK/VDX OS standard, OSEK/VDX has been widely applied in the development of vehicle-mounted operating systems. As development complexity continuously increases, ensuring the reliability of a developed OSEK/VDX OS and of its applications is becoming a challenge for developers. For checking a developed OSEK/VDX OS, several valuable methods exist. Jiang Chen and Toshiaki Aoki proposed a method [13] that uses the Spin model checker to generate highly reliable test cases for checking whether a developed OS conforms to the OSEK/VDX OS standard. To provide an environment of the OSEK/VDX OS for model checking, a UML-based method for producing Promela scripts of the OSEK/VDX OS is proposed in [14]. In addition, for Trampoline [15], an open-source RTOS developed according to the OSEK/VDX OS standard, a method [16] has been proposed that converts the Trampoline kernel into formal models and applies an incremental verification approach. Furthermore, a CSP-based approach for checking a code-level OSEK/VDX OS is presented in [17].

For developed applications, the paper [18] proposed a method to check timing properties based on UPPAAL. However, to the best of our knowledge, no work other than our previous works considers a formal method for checking the safety properties of OSEK/VDX applications. The main contribution of this paper is that we successfully apply Spin to check OSEK/VDX applications based on our SynM. The advantages of our approach are as follows. (i) Our approach can check OSEK/VDX applications accurately, since the OS model is used as a special process that responds to service APIs and computes the running task during checking. (ii) Our approach takes the checking of ISRs into account.

#### 6 Conclusion and Future Work

In this paper, we presented an approach to check OSEK/VDX applications based on the Spin model checker. To check OSEK/VDX applications accurately with Spin, our approach employs a synchronization model that simulates the executions of OSEK/VDX applications. We have implemented our approach in the Spin model checker and conducted a number of experiments; the results show that our Spin-based approach is capable of checking the safety properties of OSEK/VDX applications. We have also compared the effectiveness of the Spin-based approach and the EPG technique in these experiments. The results show that (i) for a simple application that contains few tasks (fewer than 15) but many loops, the Spin-based approach is faster than the EPG technique, whereas (ii) for a complex application that contains many tasks and APIs, the EPG technique is more efficient than the Spin-based approach.

In future work, we will address a drawback of our approach. The experiments show that, unlike in the EPG technique, the OS model is involved in the verification in our Spin-based approach. Therefore, to check OSEK/VDX applications efficiently using Spin, we will translate the behaviors of OSEK/VDX applications into a sequential C program based on the EPG technique.

#### References

- 1. J. Lemieux, *Programming in the OSEK/VDX Environment*. Suite 200 Lawrence, KS 66046, USA: CMP, 2001.
- 2. OSEK/VDX Group, "OSEK/VDX operating system specification 2.2.3," http://portal.osek-vdx.org/.
- 3. Edmund M. Clarke, Orna Grumberg and David E. Long, "Model Checking and Abstraction," *ACM Trans. Program. Lang. Syst.*, vol. 16, no. 5, pp. 1512–1542, Sept. 1994.
- 4. Edmund M. Clarke, E. Allen Emerson, et al., "Model Checking: Algorithmic Verification and Debugging," *Commun. ACM*, vol. 52, no. 11, pp. 74–84, Nov. 2009.
- 5. Zijiang Yang, Chao Wang, Aarti Gupta, et al., "Model Checking Sequential Software Programs Via Mixed Symbolic Analysis," *ACM Transactions on Design Automation of Electronic Systems*, vol. 14, no. 1, pp. 1–26, Jan. 2009.
- 6. Shaz Qadeer and Jakob Rehof, "Context-bounded model checking of concurrent software," TACAS 2005, LNCS 3440, pp. 93–107, 2005.
- 7. Scott D. Stoller, "Model-Checking Multi-threaded Distributed Java Programs," 7th International SPIN Workshop, pp. 224–244, Sep. 2000.
- 8. Haitao Zhang, Toshiaki Aoki, Hsin-Hung Lin, et al., "SMT-based Bounded Model Checking for OSEK/VDX Applications," 20th APSEC, vol. 2, no. 4, pp. 307–314, Dec. 2013.
- 9. Haitao Zhang, Toshiaki Aoki, et al., "An Approach for Checking OSEK/VDX Applications," 13th QSIC, pp. 113–116, 2013.
- 10. Armin Biere, Edmund M. Clarke and Yunshan Zhu, "Bounded Model Checking," *Advances in Computers*, vol. 58, no. 11, pp. 117–148, 2003.
- 11. Gerard J. Holzmann, *The Spin Model Checker: Primer and Reference Manual.* Boston, USA: Lucent Technologies Inc., Bell Laboratories, Sep. 2003.
- 12. Alan Burns and Andy Wellings, *Real-Time Systems and Programming Languages* (4th Edition). New York, NY, USA: Addison Wesley Longman, Aug. 2009.
- 13. Jiang Chen, Toshiaki Aoki, "Conformance Testing for OSEK/VDX Operating System Using Model Checking," 18th Asia Pacific, pp. 274–281, Dec. 2011.
- 14. Kenro Yatake, Toshiaki Aoki, "Automatic Generation of Model Checking Scripts Based on Environment Modeling," 17th International SPIN Conference on Model Checking Software, pp. 58–75, 2010.
- 15. "Trampoline," http://trampoline.rts-software.org/.
- 16. Yunja Choi, "Safety Analysis of Trampoline OS Using Model Checking: An Experience Report," Software Reliability Engineering (ISSRE), pp. 200–209, Nov. 2011.
- 17. Yanhong Huang, Yongxin Zhao, et al., "Modeling and Verifying the Code-Level OSEK/VDX Operating System with CSP," 5th Theoretical Aspects of Software Engineering (TASE), pp. 142–149, Aug. 2011.
- 18. Libor Waszniowski, Zdeněk Hanzálek, "Formal verification of multitasking applications based on timed automata model," *Real-Time Systems*, vol. 38, no. 1, pp. 39–65, Jan. 2008.

# Checking the Conformance of a Promela Design to Its Formal Specification in Event-B

Dieu-Huong Vu, Yuki Chiba, Kenro Yatake, Toshiaki Aoki

School of Information Science,
Japan Advanced Institute of Science and Technology
{huongvd, chiba, k-yatake, toshiaki}@jaist.ac.jp

Abstract. Verification of a design with respect to its requirement specification is important to prevent errors before constructing an actual implementation. Existing works focus on verification tasks where specifications are described using temporal logics or using the same language as the one used to describe the designs. In this paper, we consider cases where specifications and designs are described using different languages. To verify such cases, we propose a framework that checks whether a design conforms to its specification based on their simulation relation. Specifically, we define the semantics of both specifications and designs as labelled transition systems (LTSs), and check whether a design conforms to its specification based on the simulation relation between their LTSs. We present our framework for the verification of reactive systems, for the case where the specifications and the designs are described in Event-B and Promela/Spin, respectively. As a case study, we report an experiment in which we apply our framework to the conformance check of the specification and the design of OSEK/VDX OS.

**Keywords:** Formal Verification, Model Checking, Formal Specification, Design, Simulation Relation

#### 1 Introduction

A software development process begins with informal requirements which the target software is expected to meet. The informal requirements are translated into formal specifications to ensure their consistency. Then, system designs are developed as models for implementation. Finally, the implementation is done according to the designs using programming languages. In this development process, we should verify that the designs satisfy the requirements described by the formal specifications, since incorrect designs are likely to incur significant costs caused by backtracking over development steps.

We focus on the development of reactive systems. Most of them are considered safety-critical because their failure may result in loss of lives and assets (e.g., operating systems for mobile vehicles). Reactive systems do not execute by themselves but in combination with their environments. Environments are the external systems which invoke the services of the target systems, e.g., software applications running on the operating systems. The specification of such a reactive system represents its externally visible behavior. That is, the specification represents what the system does in response to the invocations of its environments. Formal specification languages such as VDM [14], Z [16] and Event-B [1] allow us to formally describe the specification. For example, it is straightforward to describe the effect of adding an item into a container using notions such as sets, relations and functions in the formal specification languages. Generally, important properties of reactive systems, e.g., properties regarding pre-conditions and post-conditions of the system services, can be described straightforwardly in the formal specification languages. On the other hand, the design represents the collaboration of internal components that realizes the observable behaviors described in the specification. It usually contains implementable data structures such as record types, flags, and hash tables. We consider imperative specification languages like Promela/Spin appropriate for describing the design, since the data structures and the behaviors based on them can be described straightforwardly. For example, an algorithm to search and retrieve a certain item from the container can be described in Promela using various control structures over data structures such as arrays, record types, or hash tables. The problem is how to verify the designs with respect to their specifications when they are described in different specification languages.

To verify the designs with respect to their specifications, existing works focus on cases where the specifications are described using temporal logics [6,7] or using the same language as the one used to describe the designs [5]. This paper proposes a method to verify the designs against their formal specifications where the specifications and the designs are described in different specification languages. We adopt Event-B for the specification and Promela/Spin for the design. One may object that some formal specification languages provide refinement and automatic code generation: we can describe the specification in an appropriate specification language and then derive the behaviors of the design from the higher-level specification. However, deriving highly optimized behaviors of the design from a highly abstract specification is generally very hard. Therefore, this approach is not appropriate for verifying systems with complex data structures and highly optimized behaviors, such as operating systems. Our idea is to describe the design in a specification language in which the design is easy to represent, and then to verify the design against the specification. Our approach provides another way to ensure that the design is consistent with the specification. Another question may arise here: the specification could be described in temporal logic if we describe the design in Promela/Spin. However, it is well known that correctly describing properties in temporal logic is difficult [8]. In contrast, by using the rich notions (e.g., sets and relations) of formal specification languages like Event-B, one can easily describe the properties to be checked against the design. In addition, the tools of the formal specification languages provide functions to verify the consistency and the correctness of the properties. Thus, we think that handling the specification and the design in different specification languages is appropriate for systems in which there is a big gap between the specification and the design, such as operating systems.

Our approach to checking the design against the specification is based on a simulation relation [11, 13, 18] between them. First, we formally describe the specification in Event-B [20] to remove the ambiguity and inconsistency of the specification written in natural language. Then, we generate an LTS from this formal specification, and from each of its states we generate verification conditions which must be met by the corresponding state of the design. Finally, we apply model checking [3] to the design to check the verification conditions. In this way, we can check the correspondence of state transitions, i.e., the simulation relation, between the specification and the design. This ensures that the design conforms to the specification.

This paper presents a framework for the verification of reactive systems. We present the formal definition of our framework and, as a case study, report an experiment in which we apply our framework to the conformance check of the specification and the design of OSEK/VDX OS [17] (OSEK OS, for short). Verification of OSEK OS is important because it is widely used in automotive control software, where bugs may have devastating effects on human life. The paper is organized as follows: In Sections 2 and 3, we present the definitions of specifications and designs, respectively. In Section 4, we present the definition of our verification framework. In Sections 5 and 6, we present the case study with the results of several experiments and discuss the effectiveness of our framework. In Section 7, we discuss related work. In Section 8, we conclude this paper.

# 2 Specifications

In this section, we present the Event-B notation used in the specification and the formal model of the specification.

**Specification in Event-B.** A reactive system is a system that operates by reacting to stimuli from its environment. Typically, operating systems are reactive, because they react to the invocations from the software applications. A reactive system is captured as a collection of services, which are triggered by the invocations from the environment. We regard the specifications of reactive systems in Event-B as highly abstract descriptions: data structures are represented using the notions of sets, relations and functions, and system services are represented in terms of events with guards and substitutions. When the guard of an event is true, the event is fired and its substitution is executed atomically. Figure 1 shows the specification of OSEK OS in Event-B. The VARIABLES clause enumerates the state variables; for example, tasks and res represent all the created tasks and the managed hardware resources. The INVARIANTS clause defines constraints on the values of the state variables: it defines data types, e.g., TASK is an abstract data structure and tasks is a subset of TASK, and conditions for the correctness of the behaviors, e.g., at any time at most one task is in the running state. The EVENTS clause describes the system services, e.g., ActivateTask activates a task. The events modify the values of the variables and make the corresponding state transitions. The events must preserve the invariants to guarantee the consistency of the specification.

```
VARIABLES tasks, res, inr, evt, tstate, rdyQu, pri
INVARIANTS
  tasks ⊆ TASK
  ∀ta,tb · ta∈tasks ∧ tb∈tasks ∧ tstate(ta)=run ∧ tstate(tb)=run ⇒ ta=tb
EVENTS
  ActivateTask =
    any t
    where grd1: t ∈ tasks, grd2: tstate(t)=sus
    then act1: tstate(t):=rdy, act2: rdyQu:=rdyQu∪{t}
  ChainTask =
    any t1,t2
    where grd1: t1,t2 ∈ tasks, grd2: tstate(t1)=run, grd3: tstate(t2)=sus
    then act1: tstate(t1):=sus, act2: tstate(t2):=rdy,
         act3: rdyQu:=rdyQu∪{t2}
```

Fig. 1: Specification of OSEK OS in Event-B

**Formal Semantics.** $\mathcal{V}$ is the set of variables. $\mathcal{D}$ is the domain, which is the set of values. Exp is the set of expressions in the specifications. An expression may contain variables in $\mathcal{V}$, values in $\mathcal{D}$, arithmetic operators, logical operators, and set operators. BExp is the set of boolean expressions (BExp $\subset$ Exp). A substitution $a:\mathcal{V}\to$ Exp is a mapping from $\mathcal{V}$ to Exp. We note that value assignments are also substitutions because $\mathcal{D}\subseteq$ Exp. ACT is the set of substitutions for specifications. A guard is a boolean expression. GRD is the set of guards. An event is a pair $\langle g,a\rangle$ of a guard $g$ and a substitution $a$. $\mathcal{E}$ is the set of events. If $e=\langle g,a\rangle$ then we write $grd(e)=g$ and $act(e)=a$. A state is a value assignment. $[exp]_{\sigma}$ denotes the value of an expression $exp$ in a state $\sigma$. We say a guard $g$ holds in a state $\sigma$ iff $[g]_{\sigma}=tt$. Init is the set of special initialization events that have no guard. We write $\sigma\stackrel{e}{\longrightarrow}\sigma'$ for an event $e=\langle g,a\rangle$ and states $\sigma$ and $\sigma'$ if $[g]_{\sigma}=tt$ and $\sigma'=\{v\mapsto [a(v)]_{\sigma}\mid v\in \mathcal{V}\}$.

**Definition 1.** (Specification models). A specification model is a tuple  $S = \langle \mathcal{V}_S, \mathcal{D}_S, \mathcal{E}_S, \operatorname{Init}_S, \operatorname{Inv} \rangle$  where  $\mathcal{V}_S \subseteq \mathcal{V}$  is the set of variables used in S,  $\mathcal{D}_S \subseteq \mathcal{D}$  is the domain,  $\mathcal{E}_S \subseteq \mathcal{E}$  is the set of events,  $\operatorname{Init}_S \in \operatorname{Init}$  is the initialization of S, and  $\operatorname{Inv} \in \operatorname{BExp}$  is the invariant of S. An LTS derived from the specification model S is defined as  $M_S = \langle Q_S, \mathcal{E}_S, \delta_S, I_S \rangle$  where  $Q_S = \{\sigma \mid \sigma : \mathcal{V}_S \to \mathcal{D}_S\}$  is a non-empty set of states,  $\delta_S = \{\sigma \xrightarrow{e} \sigma' \mid \sigma, \sigma' \in Q_S, \ e \in \mathcal{E}_S\}$  is a transition relation, and  $I_S = \{act(e) \mid e \in \operatorname{Init}_S\}$  is a set of initial states.

In Event-B, a substitution can be deterministic or non-deterministic. We regard a non-deterministic substitution as multiple deterministic substitutions. Therefore, we assume that the LTS is deterministic.
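To make the event semantics concrete, the following Python sketch enumerates the transitions $\sigma \xrightarrow{e} \sigma'$ of a tiny specification that contains only the ActivateTask event of Fig. 1. It is an illustration under assumptions, not the tooling used in this paper: the state encoding, the helper names (`make_state`, `activate_task`, `explore`) and the two task names are hypothetical, the event parameter `t` is fixed in advance, and only the states reachable from the initial state are built rather than the full set $Q_S$ of Definition 1.

```python
from typing import Callable, List, Tuple

# A state is a hashable value assignment for the two variables tstate and rdyQu.
State = Tuple[tuple, frozenset]
# An event is a name together with a guard grd(e) and a substitution act(e).
Event = Tuple[str, Callable[[State], bool], Callable[[State], State]]


def make_state(tstate: dict, rdy_qu: set) -> State:
    """Freeze the assignment so that states can be stored in a set."""
    return (tuple(sorted(tstate.items())), frozenset(rdy_qu))


def activate_task(t: str) -> Event:
    """ActivateTask of Fig. 1 with its parameter t instantiated."""
    def guard(s: State) -> bool:
        # grd1 (t is a known task) and grd2 (tstate(t) = sus)
        return dict(s[0]).get(t) == "sus"

    def action(s: State) -> State:
        tstate = dict(s[0])
        tstate[t] = "rdy"                           # act1
        return make_state(tstate, set(s[1]) | {t})  # act2

    return (f"ActivateTask({t})", guard, action)


def explore(initial: List[State], events: List[Event]):
    """Collect the reachable states and the transitions between them."""
    seen = set(initial)
    frontier = list(initial)
    delta = []
    while frontier:
        s = frontier.pop()
        for name, guard, action in events:
            if guard(s):  # the event fires only if its guard holds in s
                s2 = action(s)
                delta.append((s, name, s2))
                if s2 not in seen:
                    seen.add(s2)
                    frontier.append(s2)
    return seen, delta


init = make_state({"t1": "sus", "t2": "sus"}, set())
states, delta = explore([init], [activate_task("t1"), activate_task("t2")])
print(len(states), len(delta))
```

Instantiating the event parameter in advance mirrors the treatment of a non-deterministic substitution as several deterministic ones: each instance contributes its own labelled transition.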

# 3 Designs and Environments of the Target System

In this section, we present the design model of reactive systems described in Promela. We assume that the design only defines a set of service functions and therefore cannot operate by itself. To operate it, we need an environment which calls the functions of the reactive system. Therefore, the design needs to be verified in combination with its environment. We also present the environment model and the combination model.

**Design in Promela.** Promela allows us to describe a design with highly optimized behaviors in an imperative manner. The abstract data structures of Event-B are replaced by implementable data structures. Design decisions that realize the external behaviors are described explicitly using various control structures. Service functions of reactive systems can be described using inline functions. Figure 2 (left) illustrates a design of OSEK OS. We call this model a *design model*. It is described in about 2800 lines of Promela code, following the approach in [2]. It first defines data structures such as tsk and ready, which represent an array of tasks and the ready queues, respectively. They replace the abstract data structures tasks and rdyQu in Event-B. Following these data structures, a set of functions is defined. For example, \_ActivateTask and \_TerminateTask are the functions that perform activation and termination of tasks, respectively. A function signature contains a function name and some parameters (function arguments). The functions are called from the environment. When a function is invoked, its parameters are instantiated with values specified by the environment. The body of the function consists of substitutions.

**Environment of target system.** Figure 2 (right) shows an example of an environment for the OSEK OS. We call this model an *environment model*. It first defines entities in the environment such as tasks and resources. Then, it defines sequences of function calls to the OSEK OS. By combining the design and the environment, we can make a closed system which can operate by itself. We call this a *combination model*. In terms of Promela, a combination model can be obtained by including the Promela code of the design into that of the environment model. As we explain later, an environment model is constructed from the specification model, and input to Spin to check the simulation relation.

```
typedef TCB {int id, pr, dpr, ...}
typedef RCB {int id, pr, tid, ...}
TCB tsk[5];
RCB res[5];
int ready[25];
inline _schedule() {...}
inline _DeclareTask(tid, pr) {...}
inline _ChainTask(tid) {...}
inline _TerminateTask(tid) {...}
inline _GetTaskState(tid) {...}
inline _SetTaskState(tid) {...}
...
```

Fig. 2: Design model and Environment model in Promela

**Formal Semantics.** $\mathcal{P}$ is the set of parameters (function arguments). In the design, an expression may contain constants, variables, parameters and arithmetic operators; we therefore call it a parameterized expression. The set of parameterized expressions is denoted as PExp. A function body is defined as a substitution, which may contain parameterized expressions. We use the term p-substitution for a substitution in the design. A p-substitution is a mapping from $\mathcal{V}$ to PExp. The set of p-substitutions is denoted as PSubst. Id is the set of *identifiers* (used as function names). For simplicity, we assume that functions have only one parameter. The design also includes an initialization function which assigns the initial values to the variables. Design models are defined as follows.

**Definition 2.** (Design model). A design model is a tuple  $D = \langle \mathcal{V}_D, \mathcal{D}_D, \mathcal{P}_D, F, \Sigma_D, I_D \rangle$  where  $\mathcal{V}_D \subseteq \mathcal{V}$  is the set of variables used in D,  $\mathcal{D}_D \subseteq \mathcal{D}$  is the domain of D,  $\mathcal{P}_D \subseteq \mathcal{P}$  is a finite set of parameters for D, F is a set of function signatures defined as  $F = \{id(p) \mid id \in \mathrm{Id}, p \in \mathcal{P}_D\}$ ,  $\Sigma_D$  is a relation such that  $\Sigma_D \subseteq F \times \mathrm{PSubst}$ , and  $I_D$  is a set of value assignments of the initialization function such that  $I_D \subseteq \{\sigma \mid \sigma : \mathcal{V}_D \to \mathcal{D}_D\}$ .

We assume that the functions in the design are deterministic, so that each current state and each called function have a unique successor state. This assumption is realistic for implementations of reactive systems such as automotive operating systems. On the other hand, the choice of which function is called in each state is generally non-deterministic. This choice is described in environment models. Environment models are defined as follows.

**Definition 3.** (Environment model). An environment model for a design model D is a tuple  $E = \langle \mathcal{V}_E, \mathcal{D}_E, \Sigma_E, I_E \rangle$  where  $\mathcal{V}_E \subseteq \mathcal{V}$  is a set of variables used in E,  $\mathcal{D}_E = \mathcal{D}_D$  is the domain of E,  $\Sigma_E$  is a set of invocations to D such that  $\Sigma_E \subseteq \{id(v) \mid id \in \mathrm{Id}, v \in \mathcal{V}_E\}$ , and  $I_E$  is a set of value assignments from  $\mathcal{V}_E$  to  $\mathcal{D}_D$ .

A combination of a design and an environment describes the execution of the design according to the environment. An expression in the combination contains constants from  $\mathcal{D}$ , variables in  $\mathcal{V}$ , and arithmetic operators. The set of expressions in combinations is denoted as Exp'. A substitution for combinations is a mapping from  $\mathcal{V}$  to Exp'. The set of substitutions for combinations is denoted as SubstDE. For a mapping  $\pi$  from  $\mathcal{P}$  to  $\mathcal{V}$  and a parameterized expression  $pexp \in PExp$ ,  $pexp_{\pi}$  is the result of replacing each parameter p appearing in pexp by  $\pi(p)$ . In other words, if a(v) is an expression in D then  $a(v)_{\pi}$  is an expression in the combination obtained by replacing each parameter p appearing in a(v) by  $\pi(p)$ . Combination models are defined as LTSs as follows.

**Definition 4.** (Combination model). Let  $D = \langle \mathcal{V}_D, \mathcal{D}_D, \mathcal{P}_D, F, \Sigma_D, I_D \rangle$  be a design model and  $E = \langle \mathcal{V}_E, \mathcal{D}_E, \Sigma_E, I_E \rangle$  an environment model.

- 1. We denote  $\sigma \xrightarrow{id(v)} \sigma'$  for an invocation  $id(v) \in \Sigma_E$  and states  $\sigma$  and  $\sigma'$  if there exist  $(id(p), a) \in \Sigma_D$  and a mapping  $\pi : \mathcal{P}_D \to \mathcal{V}_E$  such that  $\pi(p) = v$  and  $\sigma' = \{v \mapsto [a(v)_{\pi}]_{\sigma} \mid v \in \mathcal{V}_D \cup \mathcal{V}_E\}.$
- 2. The combination model of D and E (denoted as  $D \cdot E$ ) is an LTS  $\langle Q_{D \cdot E}, \Sigma_{D \cdot E}, \delta_{D \cdot E}, I_{D \cdot E} \rangle$  where  $Q_{D \cdot E} = \{ \sigma \mid \sigma : \mathcal{V}_D \cup \mathcal{V}_E \to \mathcal{D}_D \}$  is a set of states,  $\Sigma_{D \cdot E} = \Sigma_E$ ,  $\delta_{D \cdot E} = \{ \sigma \xrightarrow{id(v)} \sigma' \mid \sigma, \sigma' \in Q_{D \cdot E}, id(v) \in \Sigma_E \}$  is a transition relation, and  $I_{D \cdot E} = I_D \cup I_E$  is a set of initial states of D and E.
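The transition rule in item 1 can be illustrated with a small Python sketch. It is only an illustration under simplifying assumptions: states are dictionaries, each design function has a single parameter, and the names `step`, `design`, `add`, `count`, and `n` are hypothetical and do not come from the OSEK OS models.

```python
def step(sigma, invocation, design):
    """Return the unique successor of state `sigma` under `invocation`.

    sigma      -- dict: variable name -> value (a value assignment)
    invocation -- pair (function id, environment variable v)
    design     -- dict: function id -> (parameter p, action), where the action
                  maps each updated variable to an expression over (sigma, pi)
    """
    func_id, v = invocation
    param, action = design[func_id]
    pi = {param: v}                       # the mapping pi with pi(p) = v
    successor = dict(sigma)               # untouched variables keep their value
    for var, expr in action.items():
        successor[var] = expr(sigma, pi)  # evaluate the instantiated expression in sigma
    return successor


# Hypothetical design function add(p): count := count + p
design = {"add": ("p", {"count": lambda s, pi: s["count"] + s[pi["p"]]})}

sigma0 = {"count": 0, "n": 3}             # "count" is a design variable, "n" an environment variable
sigma1 = step(sigma0, ("add", "n"), design)
print(sigma1)
```

Here the invocation add(n) binds the parameter p to the environment variable n, so the successor state updates `count` and leaves `n` unchanged. Because every function is deterministic, `step` returns exactly one successor.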

# 4 Checking the Design against its Formal Specification

In this section, we present a framework for checking a design against its formal specification based on a simulation relation. We first give an overview and then present the formal definitions.

# 4.1 Overview

Suppose that M1 and M2 are two LTSs. We define what it means for M2 to simulate M1 based on the semantics of LTSs, by extending a given relation on states. The states are value assignments, that is, mappings from variables to values. Therefore, the relation between states of M1 and states of M2 is established from mappings R and C, where R maps variables of M1 to variables of M2 and C maps values in M1 to values in M2. Figure 3 (left) shows a relation between state p of M1 and state q of M2. State p relates to q based on R and C because u = sus in state p corresponds to v = 1 in state q under the mappings R(u) = v and C(sus) = 1. M2 simulates M1 if, for each transition in M1 from state p to state p' where p relates to a state q of M2, there exist a state q' and a corresponding transition in M2 from q to q' such that p' relates to q'. In Figure 3 (right), a solid arrow connecting p to p' represents a one-step transition from p to p', and a dashed arrow connecting q to q' represents an n-step transition from q to q'. To check whether M2 simulates M1, we check whether there exists a state q' reachable from q in which v = 2, which corresponds to u = rdy in p' under the mappings R(u) = v and C(rdy) = 2.



Fig. 3: Simulation Relation

Figure 4 shows the steps to verify the simulation between a specification and a design using the Spin model checker. Firstly, bounds for the verification are given and an LTS is generated from the Event-B specification within the bounds. Next, the LTS is in turn used to generate the environment, which exercises service functions described in the design. The verification then amounts to checking the validity of certain relations between variables of the Promela design and variables of the Event-B specification in every reachable state. This is done using Spin assertions which are generated from states of the LTS and the given relations represented as mappings. In the end, the verification of the assertions ensures that the design conforms to the specification.

Giving Bounds. As specified in Event-B, the target system may have infinitely many states and transitions because variables in Event-B take values in unbounded domains. Model checking performs an exhaustive check of the system, so it needs a representation of the system as a finite set of all possible states. Therefore, abstract types in Event-B must be replaced by concrete types, e.g.,  $tasks \subseteq TASK$  where  $TASK = \{a, b, c, d\}$ . Also, types with infinite ranges of values, such as Int and Nat, must be restricted to finite ranges by giving a minimum value and a maximum value for each range. With these restrictions, the state space and the set of transitions explored from the Event-B specification become finite sets, which makes the LTS explored from the specification finite. We call such restrictions the bounds of the verification.

Fig. 4: Checking simulation relation of the design and its formal specification (steps)

Generating an LTS from the specification. In order to generate the LTS from the specification and the bounds, the LTS Generator computes all possible transitions and reachable states. Every value used in the computation must be within the bounds. Starting at the initialization, the generator enumerates all possible values for the constants and variables of the specification that satisfy the initialization and the invariant, which gives the set of initial states. To compute all possible transitions from a state, the generator enumerates all possible values for the parameters of each event and evaluates the guard of that event. If the guard holds in the given state, the generator computes the effect of the event based on the substitution of that event. This process is repeated for each newly generated state until no new state is generated.
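The exploration described above is a standard worklist search, which can be sketched in Python as follows. This is a minimal sketch, not the authors' LTS Generator: the function `generate_lts`, the representation of events as guard/effect pairs, and the two toy events are assumptions made for illustration, and the toy guards are far simpler than the OSEK OS specification.

```python
from collections import deque


def generate_lts(initial_states, events, param_values):
    """Enumerate reachable states and transitions by breadth-first search.

    initial_states -- iterable of hashable states
    events         -- dict: event name -> (guard, effect), both taking (state, arg)
    param_values   -- finite collection of parameter values (part of the bounds)
    """
    states = set(initial_states)
    transitions = set()
    queue = deque(states)
    while queue:
        s = queue.popleft()
        for name, (guard, effect) in events.items():
            for arg in param_values:
                if not guard(s, arg):        # event not enabled for this parameter
                    continue
                t = effect(s, arg)
                transitions.add((s, (name, arg), t))
                if t not in states:          # explore each new state exactly once
                    states.add(t)
                    queue.append(t)
    return states, transitions


# Toy events over two tasks: AT activates a suspended task, TT terminates a ready one.
def set_task(state, i, value):
    return state[:i] + (value,) + state[i + 1:]

events = {
    "AT": (lambda s, i: s[i] == "sus", lambda s, i: set_task(s, i, "rdy")),
    "TT": (lambda s, i: s[i] == "rdy", lambda s, i: set_task(s, i, "sus")),
}
states, transitions = generate_lts([("sus", "sus")], events, (0, 1))
print(len(states), len(transitions))
```

The search terminates because the bounds make both the set of parameter values and the set of states finite.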

Generating the Environment. In order to verify that designs satisfy their formal specifications, environments of the target systems are constructed and combined with the designs. Environments trigger specific behaviors of the designs by calling their functions; we construct environments that are comprehensive in the sense that they represent all possible behaviors described in the specifications. In the previous step, we generated the LTS of the specification. In this step, we generate the environment by translating the LTS into Promela such that the enabled events in the LTS are translated to the corresponding function calls in Promela. This is performed by the Promela Code Generator.

Figure 5(a) shows an LTS generated from the specification of OSEK OS. The LTS represents the possible sequences of state transitions within the bounds. Here, the rectangles represent the states and the labeled arrows represent the events that are enabled in each state. For example, two events AT(t1), AT(t2) are enabled in state s0, and two events TT(t1), AT(t2) are enabled in state s1. In our framework, the states are defined as value assignments; however, we show them here as values, e.g., (sus, sus, sus), for readability. The LTS is translated into Promela to generate the environment, e.g., from (a) to (b) of Figure 5. For this generation, we give a mapping from the events in the LTS to the function calls in the environment. It can be a one-to-one or a one-to-many mapping. Figure 5 shows a sample case of a one-to-one mapping. Here, event AT(t1) in the LTS is mapped to function call \_ActivateTask(task1.tid) in the environment; also, event TT(t1) is mapped to function call \_TerminateTask(task1.tid). The states and transitions in the LTS are represented by labels and if-statements in the environment. There may be more than one function call applicable in each state. For example, \_ActivateTask(task2.tid) and \_TerminateTask(task1.tid) are applicable in state s1; which function call is actually applied is non-deterministic. By combining the design model and the environment model, we obtain the combination model, which is input to the model checker in the last step of the framework.

Fig. 5: Generation of environment from LTS
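The translation from LTS to environment can be sketched as a small text generator. The Python below is a hypothetical illustration, not the authors' Promela Code Generator: the function `emit_environment`, the two-state LTS, and the exact layout of the emitted text are assumptions, and the output has not been checked with Spin. It only shows the idea of one label and one if-statement per state, with one option per enabled event.

```python
def emit_environment(transitions, call_of, initial="s0"):
    """Render an LTS as Promela-like text: one label and one if-block per state.

    transitions -- list of (source state, event, target state)
    call_of     -- dict: event -> function-call text (a one-to-one mapping)
    """
    by_source = {}
    for src, event, dst in transitions:
        by_source.setdefault(src, []).append((event, dst))

    lines = [f"goto {initial};"]
    for src in sorted(by_source):
        lines.append(f"{src}:")
        lines.append("  if")
        for event, dst in by_source[src]:
            # each option calls the mapped function, then jumps to the target state
            lines.append(f"  :: {call_of[event]}; goto {dst}")
        lines.append("  fi;")
    return "\n".join(lines)


# Hypothetical two-state fragment; the event-to-call mapping follows the text above.
transitions = [("s0", "AT(t1)", "s1"), ("s1", "TT(t1)", "s0")]
call_of = {
    "AT(t1)": "_ActivateTask(task1.tid)",
    "TT(t1)": "_TerminateTask(task1.tid)",
}
text = emit_environment(transitions, call_of)
print(text)
```

When a state has several enabled events, the if-block has several options, which is how the non-deterministic choice among applicable function calls would be expressed.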

Generating the Assertions. Verification conditions, which represent constraints on the simulation relation between the specification and the design, are encoded as assertions to be checked by Spin. From each reachable state of the LTS, we generate an assertion that must be met by the corresponding state of the design. This generation is based on the mappings R and C from the variables and values of the specification to those of the design. It is also performed by the Promela Code Generator. In the sample case of Figure 3 (right), for example, from state p' where u = rdy at the top, with mappings R(u) = v and C(rdy) = 2, the generator outputs an assertion v = 2 to check whether there exists a corresponding state q' at the bottom.
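The assertion generation for a single specification state can be sketched as follows. The helper `make_assertion` is hypothetical and the `assert(...)` text is only meant to resemble a Promela assertion; the mappings R(u) = v, C(sus) = 1, and C(rdy) = 2 are the ones used in the example above.

```python
def make_assertion(spec_state, R, C):
    """Build assertion text from one specification state.

    spec_state -- dict: specification variable -> specification value
    R          -- dict: specification variable -> design variable
    C          -- dict: specification value -> design value
    """
    conjuncts = [
        f"{R[x]} == {C[value]}"
        for x, value in sorted(spec_state.items())
        if x in R                      # only variables related by R are constrained
    ]
    return "assert(" + (" && ".join(conjuncts) or "true") + ")"


R = {"u": "v"}
C = {"sus": 1, "rdy": 2}
assertion = make_assertion({"u": "rdy"}, R, C)
print(assertion)
```

For the state p' with u = rdy, this yields the condition v == 2 from the example.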

#### 4.2 Formal Definitions

We now give formal definitions of the relation between states, the bounds, the simulation relation of two LTSs within the bounds, and steps in the framework.

**Definition 5.** (Relation between states). Let  $S = \langle \mathcal{V}_S, \mathcal{D}_S, \Sigma_S, \operatorname{Init}_S, Inv \rangle$  be a specification model,  $M_S = \langle Q_S, \Sigma_S, \delta_S, I_S \rangle$  the LTS derived from S,  $D = \langle \mathcal{V}_D, \mathcal{D}_D, \mathcal{P}_D, F, \Sigma_D, I_D \rangle$  a design model,  $E = \langle \mathcal{V}_E, \mathcal{D}_E, \Sigma_E, I_E \rangle$  an environment model for D, and  $D \cdot E = \langle Q_{D \cdot E}, \Sigma_{D \cdot E}, \delta_{D \cdot E}, I_{D \cdot E} \rangle$  the combination model of D and E. We say a state  $\sigma_{D \cdot E} \in Q_{D \cdot E}$  relates to a state  $\sigma_S \in Q_S$  based on mappings  $R : \mathcal{V}_S \to \mathcal{V}_D$  and  $C : \mathcal{D}_S \to \mathcal{D}_D$  (denoted  $\sigma_S \preceq_{R,C} \sigma_{D \cdot E}$ ), if for any  $x \in \mathcal{V}_S$  and  $y \in \mathcal{V}_D$ , R(x) = y implies  $C(\sigma_S(x)) = \sigma_{D \cdot E}(y)$ .

We omit R, C from  $\preceq_{R,C}$  if they are clear from the context.

As mentioned earlier, the bounds are introduced to obtain a finite LTS from the Event-B specification. A finite LTS is obtained from an infinite LTS when we restrict the state space and the set of actions that trigger the state transitions. The bounds are defined as follows:

**Definition 6.** (Bounds). Bounds for LTS  $\langle Q, \Sigma, \delta, I \rangle$  are defined as a pair  $B = \langle G, H \rangle$  of mappings G and H where  $G: 2^Q \to 2^Q$ ,  $G(Q) \subseteq Q$ , and  $Q' \subseteq Q''$  implies  $G(Q') \subseteq G(Q'')$  and  $H: Q \times \Sigma \to \{tt, ff\}$  and for any state  $p \in Q$ , there exist finitely many actions  $a \in \Sigma$  such that H(p, a) = tt.

**Definition 7.** (Bounded LTS). An LTS obtained by restricting an LTS  $M = \langle Q, \Sigma, \delta, I \rangle$  within bounds  $B = \langle G, H \rangle$  is defined as  $M \downarrow_B = \langle \widehat{Q}, \widehat{\Sigma}, \widehat{\delta}, \widehat{I} \rangle$ , where  $\widehat{Q} = G(Q)$ ,  $\widehat{\Sigma} = \{a \mid \forall p \in Q, a \in \Sigma, H(p, a) = tt\}$ ,  $\widehat{\delta} = \{p \xrightarrow{a} p' \in \delta \mid H(p, a) = tt\}$ , and  $\widehat{I} = G(I)$ .

To implement the bounds for the LTS associated with the Event-B specification, we restrict the range of the variable values. Once the range of every variable has been restricted, the state space and the set of actions of the LTS become finite sets. We give a mapping X that implements such bounds for generating the LTS. X is a mapping from variables to the finite sets of values that the variables may take. We use  $\mathrm{ES}_X(\sigma)$  to denote the set of all events that are applicable in state  $\sigma$  and satisfy the restrictions defined by X.

Let  $S = \langle \mathcal{V}_S, \mathcal{D}_S, \Sigma_S, \operatorname{Init}_S, Inv \rangle$  be a specification model and  $\langle Q_S, \Sigma_S, \delta_S, I_S \rangle$  an LTS derived from S. With the mapping X, we define mappings G and H as follows:  $G(Q_S) = \{ \sigma \in Q_S \mid \forall v \in \mathcal{V}_S.\ \sigma(v) \in X(v) \}$ ,  $G(I_S) \subset G(Q_S)$ , and  $H(\sigma, e) = tt$  iff  $e \in \operatorname{ES}_X(\sigma)$ .
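The restriction of an LTS by a mapping X can be sketched in Python as below. This is an illustration under assumptions rather than the authors' implementation: states are tuples of (variable, value) pairs, the helper names `restrict` and `st` are hypothetical, and the reading of  $\mathrm{ES}_X$  used for H (the source state is in range and the event can reach an in-range state) is one plausible interpretation that the text does not spell out.

```python
def restrict(states, delta, initial, X):
    """Restrict an LTS to the value ranges X, in the spirit of Definitions 6 and 7.

    states, initial -- sets of states; a state is a tuple of (variable, value) pairs
    delta           -- set of transitions (source, action, target)
    X               -- dict: variable -> finite set of allowed values
    """
    def in_range(state):
        return all(value in X[var] for var, value in state)

    def G(qs):
        # keep exactly the states whose variables all lie in their ranges
        return {s for s in qs if in_range(s)}

    def H(p, a):
        # Assumed reading of ES_X: p is in range and a leads from p to an in-range state.
        return in_range(p) and any(
            in_range(q) for (p0, a0, q) in delta if (p0, a0) == (p, a)
        )

    bounded_delta = {(p, a, q) for (p, a, q) in delta if H(p, a)}
    actions = {a for (_, a, _) in bounded_delta}   # actions still enabled somewhere
    return G(states), actions, bounded_delta, G(initial)


def st(n):
    """Hypothetical one-variable state with n as the value of variable 'n'."""
    return (("n", n),)

states = {st(k) for k in range(4)}
delta = {(st(k), "inc", st(k + 1)) for k in range(3)}
bounded = restrict(states, delta, {st(0)}, {"n": {0, 1}})
print(bounded)
```

In this toy counter, restricting n to {0, 1} keeps two states and the single transition between them; the transition from n = 1 to n = 2 is dropped because its target leaves the range.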

We now define a simulation relation between two LTSs. In general, a one-step transition in the specification is followed by an n-step transition in the design. In the definition,  $\Sigma^+$  denotes the set of non-empty strings of  $\Sigma$ ,  $\delta^+$  denotes an n-step transition relation, and  $p \xrightarrow{a_1 a_2 \dots a_n} p' \in \delta^+$  denotes an n-step transition from state p to state p'.

**Definition 8.** (Simulation relation). Let  $M_1 = \langle Q_1, \Sigma_1, \delta_1, I_1 \rangle$  and  $M_2 = \langle Q_2, \Sigma_2, \delta_2, I_2 \rangle$  be LTSs, and  $f : \Sigma_1 \to \Sigma_2^+$  a function from  $\Sigma_1$  to  $\Sigma_2^+$ . Suppose a relation  $\preceq \subseteq Q_1 \times Q_2$  is given.  $M_2$  simulates  $M_1$  with respect to  $\preceq$  if for all  $q_1, q_1' \in Q_1$ ,  $q_2 \in Q_2$ ,  $a \in \Sigma_1$  such that  $q_1 \preceq q_2$  and  $q_1 \xrightarrow{a} q_1' \in \delta_1$ , there exists  $q_2' \in Q_2$  such that  $q_1' \preceq q_2'$  and  $q_2 \xrightarrow{f(a)} q_2' \in \delta_2^+$ . If  $M_2$  simulates  $M_1$  with respect to  $\preceq$ , we write  $M_1 \preceq M_2$ .
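For finite LTSs, Definition 8 can be checked directly by enumeration. The following Python sketch does this naively; it is not how the framework itself works (the framework encodes the check as Spin assertions), and the names `run` and `simulates` as well as the three-state example are hypothetical. The relation between states follows Definition 5 with the mappings R(u) = v, C(sus) = 1, and C(rdy) = 2 from the overview, where each state is represented by the value of its single variable.

```python
def run(delta2, q2, word):
    """States reachable from q2 by following the action sequence `word`."""
    current = {q2}
    for action in word:
        current = {t for (s, a, t) in delta2 if s in current and a == action}
    return current


def simulates(Q2, delta1, delta2, f, rel):
    """Check the condition of Definition 8 for every transition of M1."""
    for (q1, a, q1_next) in delta1:
        for q2 in Q2:
            if not rel(q1, q2):
                continue
            # some state reached via f(a) must be related to the successor q1'
            if not any(rel(q1_next, t) for t in run(delta2, q2, f[a])):
                return False
    return True


C = {"sus": 1, "rdy": 2}
rel = lambda q1, q2: C[q1] == q2          # state relation over the single pair R(u) = v

delta1 = {("sus", "a", "rdy")}            # one step in the specification
Q2 = {1, 0, 2}                            # 0 is an intermediate value related to nothing
delta2 = {(1, "x", 0), (0, "y", 2)}       # two steps in the design

ok = simulates(Q2, delta1, delta2, {"a": ("x", "y")}, rel)
broken = simulates(Q2, delta1, delta2, {"a": ("x",)}, rel)
print(ok, broken)
```

With f(a) = xy the design reaches v = 2 and the check succeeds; with f(a) = x it stops in the intermediate state and the check fails.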

**Definition 9.** (Simulation relation of two LTSs within bounds). Let  $M_1$  and  $M_2$  be two LTSs, and B be bounds. The simulation relation of  $M_1$  and  $M_2$  within bounds B is defined as  $M_1 \preceq_B M_2$  if  $M_1 \downarrow_B \preceq M_2$ . If  $M_1 \preceq_B M_2$  holds, we say  $M_2$  simulates  $M_1$  within B.

If an error is found when applying our framework to verify the design against the bounded specification, then there exists a state transition in the bounded specification that is not followed by the design. This state transition is also included in the original specification, because bounding only removes states and transitions; thus, the design does not conform to the original specification. Formally,  $M_1 \npreceq_B M_2 \Rightarrow M_1 \npreceq M_2$ .

Generating the Environments. An environment is generated from the LTS of the specification model. Let  $S = \langle \mathcal{V}_S, \mathcal{D}_S, \Sigma_S, \operatorname{Init}_S, Inv \rangle$  be a specification model and  $M_S = \langle Q_S, \Sigma_S, \delta_S, I_S \rangle$  be the LTS derived from S. Based on the given mapping  $f: \Sigma_S \to \Sigma_{D\cdot E}^+$  from the events in the LTS to the function calls in the environment, mapping  $R': \mathcal{V}_S \to \mathcal{V}_E$  and mapping  $C: \mathcal{D}_S \to \mathcal{D}_D$ , the environment model  $E = \langle \mathcal{V}_E, \mathcal{D}_E, \Sigma_E, I_E \rangle$  with  $\mathcal{D}_E = \mathcal{D}_D$  is generated such that  $\Sigma_E = \{f(e) \mid e \in \Sigma_S\}$  and  $I_E = \{f(e) \mid e \in I_S\}$ .

Generating the Assertions. The relation on states of the specification and the combination is given based on the mappings  $R: \mathcal{V}_S \to \mathcal{V}_D$  and  $C: \mathcal{D}_S \to \mathcal{D}_D$ ; verification conditions are generated as follows:

- For the initial state, to check whether  $\sigma_S^0 \preceq \sigma_{D \cdot E}^0$ , an assertion is generated:  $\bigwedge_{x \in \mathcal{V}_S, y \in \mathcal{V}_D, y = R(x)} (\sigma_{D \cdot E}^0(y) = C(\sigma_S^0(x)))$ .
- For all (reachable) states  $\sigma_S, \sigma_S' \in Q_S$  and  $\sigma_{D \cdot E} \in Q_{D \cdot E}$  such that  $\sigma_S \xrightarrow{e} \sigma_S' \in \delta_{S\downarrow_B}$  and  $\sigma_S \preceq \sigma_{D \cdot E}$ , in order to verify whether there exist a state  $\sigma_{D \cdot E}' \in Q_{D \cdot E}$  and a transition  $\sigma_{D \cdot E} \xrightarrow{f(e)} \sigma_{D \cdot E}' \in \delta_{D \cdot E}^+$  such that  $\sigma_S' \preceq \sigma_{D \cdot E}'$ , an assertion is generated:  $\bigwedge_{x \in \mathcal{V}_S, y \in \mathcal{V}_D, y = R(x)} (\sigma_{D \cdot E}'(y) = C(\sigma_S'(x)))$ .

In the last step, we input the combination model and the assertions to Spin to check the simulation relation between the specification and the design. The assertions are verified in every reachable state of the combination. This ensures that for each state transition in the specification, there exists a corresponding transition in the combination. This kind of correspondence shows the consistency of the functions in the design with the events in the specification. It is useful for checking properties relevant to the pre-conditions and post-conditions of the service functions of reactive systems. Typical bugs caused by the computational statements of the functions can be found by checking the relations between data elements of the design and the specification in every reachable state. At this point, the verification of the simulation between the design and the specification has been completed within the bounds.

#### 5 Case study

We implemented our framework as a generator that produces: the LTS of the bounded specification; the environment in Promela; and the assertions. As an application of our framework to a practical system, we conducted several experiments to verify that a design of OSEK OS in Promela conforms to its formal specification in Event-B. These two models are partially illustrated in Figures 1 and 2.

In this framework, bounds are set for the verification to make sure that every variable in the Event-B specification takes values in finite ranges. As shown in Figure 1, variables tasks, res, evt, and inr define entities managed by OSEK OS such as tasks, resources, events, and interrupt routines; variable pri defines the priority assigned to tasks, resources, and interrupt routines; and variable tstate defines the task state. The finite ranges of values for them must be introduced in the experiments as bounds for the verification. By using various bounds, we can separate the cases that deal with distinct groups of system services from those that check the relation between different groups. This helps us to avoid state explosion while keeping, in each case, the important behaviors of the target system that we want to verify.

All experiments were conducted on an Intel(R) Core(TM) i7 processor at 2.67GHz running Linux. The verification results reported by Spin are shown in Table 1. Here, the first column ("No.") gives the experiment number. The next column gives the sizes of the ranges for variables tasks, pri, res, evt, and inr. Values in this column express the bounds of the verification. Column "LTS Generation" shows statistics of the LTS generator: columns "#State" and "#Trans" give the number of distinct states and the number of transitions appearing in the LTS, and column "Time" gives the time taken (s) for the generation. Column "Model Checking" gives statistics of the model checker, including the total actual memory usage, the time taken (s), and the verification result, in which " $\sqrt{}$ " indicates that the verification has been completed. The groups of system services of OSEK OS consist of task management, resource management, event mechanism, and interruption management. In the table, experiments No.1-No.9 check the task management independently of the other groups of system services. In these cases, we show ranges for tasks and pri. Experiments No.10-No.14 check the relation between task management, resource management, event mechanism, and interruption management; therefore, we show ranges for tasks, pri, res, evt, and inr.

From the experimental results, we can see that the time taken and the total actual memory usage for generating the LTS from the Event-B specification and for verifying the simulation relation are reasonable: LTS generation takes at most 3.0 s, and the largest case (No.9) needs 362.1 s and 430.8 Mb for model checking. As for the model checking results, no errors were reported in any of the experiments. This is expected, because the design of OSEK OS has already been reviewed carefully by many researchers and engineers. Several safety properties of OSEK OS have been confirmed by these experiments, such as "tasks and interrupt routines shall not terminate while occupying resources" and "high-priority tasks such as life saving units must always be executed before all low priority tasks". Overall, this result gives confidence in the conformance of the OSEK OS design to its specification within the given bounds.
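The growth visible in experiments No.1-No.9 of Table 1 can be made explicit with a few lines of Python. The numbers are copied from the table; the script itself and the variable names are only an illustrative sketch. It checks that the number of LTS states doubles with each added task and computes how the model checking time grows from one experiment to the next.

```python
# (number of tasks, #State, #Trans, model checking time in s) for experiments No.1-No.9
rows = [
    (1, 2, 2, 3.0), (2, 4, 10, 3.5), (3, 8, 36, 3.5),
    (4, 16, 112, 4.2), (5, 32, 320, 4.9), (6, 64, 864, 10.3),
    (7, 128, 2240, 26.1), (8, 256, 5632, 99.2), (9, 512, 13824, 362.1),
]

# Does the state count equal 2^tasks in every row?
states_double = all(states == 2 ** tasks for tasks, states, _, _ in rows)

# Ratio of model checking times between consecutive experiments.
time_ratios = [later[3] / earlier[3] for earlier, later in zip(rows, rows[1:])]

print(states_double)
print([round(r, 2) for r in time_ratios])
```

In these rows the state count is exactly 2 to the power of the number of tasks, and over the last four steps each added task multiplies the model checking time by a factor between 2 and 4. This is consistent with the motivation for choosing bounds per property rather than simply enlarging all ranges.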

## 6 Discussion

Generality of the Framework. OSEK OS is an operating system that is widely used in automotive systems. Our framework is applied to verify the design of a practical system, namely the OSEK OS design. The framework directly checks the design against its formal specification. Although our experiments apply the framework to an operating system, it is not limited to this application. In the framework, the simulation relation is defined based on the semantics of LTSs. In the models, the states are interpreted as value assignments. The design is described as a collection of functions which update the value assignments. The environment is described as a collection of invocations. This style of models is adopted not only for operating systems but also for other reactive systems.

Table 1: Experiment Outputs

| No. | Size of Ranges |     |     |     |     | LTS Generation |        |         | Model Checking |         |        |
|-----|----------------|-----|-----|-----|-----|----------------|--------|---------|----------------|---------|--------|
|     | tasks          | pri | res | evt | inr | #State         | #Trans | Time(s) | Memory(Mb)     | Time(s) | Result |
| 1   | 1              | 1   | 0   | 0   | 0   | 2              | 2      | 1.0     | 129.2          | 3       |        |
| 2   | 2              | 2   | 0   | 0   | 0   | 4              | 10     | 1.0     | 129.2          | 3.5     |        |
| 3   | 3              | 3   | 0   | 0   | 0   | 8              | 36     | 1.0     | 129.2          | 3.5     |        |
| 4   | 4              | 3   | 0   | 0   | 0   | 16             | 112    | 1.2     | 129.2          | 4.2     |        |
| 5   | 5              | 3   | 0   | 0   | 0   | 32             | 320    | 1.2     | 130.6          | 4.9     |        |
| 6   | 6              | 3   | 0   | 0   | 0   | 64             | 864    | 1.3     | 132.6          | 10.3    |        |
| 7   | 7              | 3   | 0   | 0   | 0   | 128            | 2240   | 1.3     | 324.5          | 26.1    |        |
| 8   | 8              | 3   | 0   | 0   | 0   | 256            | 5632   | 2.1     | 382.8          | 99.2    |        |
| 9   | 9              | 3   | 0   | 0   | 0   | 512            | 13824  | 3.0     | 430.8          | 362.1   |        |
| 10  | 5              | 7   | 0   | 0   | 2   | 128            | 1536   | 2.0     | 133.1          | 17.5    |        |
| 11  | 2              | 1   | 1   | 0   | 0   | 8              | 22     | 1.1     | 130.1          | 7.6     |        |
| 12  | 2              | 1   | 0   | 1   | 0   | 10             | 27     | 1.1     | 129.2          | 4.7     |        |
| 13  | 3              | 6   | 1   | 0   | 2   | 80             | 520    | 1.2     | 129.2          | 8.3     |        |
| 14  | 3              | 6   | 1   | 1   | 2   | 152            | 1036   | 2.0     | 132.3          | 14.1    |        |

In our case study, Promela is used as the language to describe the design and the environment; however, our framework can also be applied to designs described in other languages, as long as they can express a collection of functions for the design and sequences of invocations for the environment.

**Notion of Bounds.** We introduce a formalization of the bounds for verifying the simulation relation between the design and its formal specification in Event-B. The bounds are used to obtain a finite LTS associated with the Event-B model. This notion of bounds can be applied generally to any design and its formal specification as long as the formal models of the inputs are defined as LTSs. In Section 4, we present the interpretation of the bounds in a concrete model, that is, an Event-B model. As the first step of interpreting the bounds in the specification, we introduce finite ranges of variable values in the specification. Next, we consider the typical bugs that can be found in verification with a large value domain. To find such bugs of the target system, in addition to restricting the range of values, one can restrict the system services of the target system. The intention of this additional restriction is to exclude transitions not relevant to the bugs and to reduce the model to a size for which model checking is feasible. It is important to give appropriate restrictions, that is, proper bounds, for the model. We could do this by studying behavior scenarios for each property to be checked. Based on the behavior scenarios, we could estimate the appropriate range of values for the variables and determine which system services must be included in the bounded model. Also, we could make sure that the critical scenarios are actually contained in the bounded model by traversing the execution sequences of the LTS according to the scenarios. Consequently, the bounds need to be decided depending on the properties to be checked.

Comprehensiveness of Environment. The behaviors of the target systems depend on the patterns of function calls from their environments. For the comprehensive verification of reactive systems, we need to use environments that cover all possible patterns of invocations. Accordingly, an advantage of our framework is that it is able to systematically generate all possible patterns of invocations from the LTS of the specification in Event-B. This is essential for making the verification comprehensive with respect to the specification.

### 7 Related Works

Verification of systems using model checking. [6] presents a case study on checking operating systems compliant with OSEK/VDX. The authors describe the specification as temporal logic formulas. In contrast, we describe the specification in Event-B. This improves the consistency of the properties extracted from the specification and provides general environments for comprehensive verification.

Verification of systems based on simulation relations. FDR [5] is a refinement checker for the process algebra CSP. The inputs of FDR are specifications and implementations written in the same language, whereas our framework accepts inputs written in different languages. [19] and [9] present approaches to verifying OS kernels based on theorem proving. Theorem proving can be used to verify infinite systems; however, it generally requires many interactive proofs. In our framework, we use model checking combined with the prover tools of Event-B. Although the ranges are bounded due to the limitations of model checking, we are able to improve the quality of the properties checked and obtain fully automatic verification. Therefore, we have a high degree of confidence in the verification results.

Generation of LTS from Event-B model. [10] presents the ProB tool, which supports interactive animation of B models. Using ProB, users can see the current state and set an upper limit on the number of ways that the same operation can be executed. In our work, we first set finite ranges for the types in the Event-B specification and then explore all possible sequences of state transitions within the defined ranges. [4] defines the semantics of Event-B models as labeled transition systems in order to reason about behavioral aspects of specifications in Event-B. We formally define the framework from scratch. We precisely define finite ranges of variable values in the Event-B specification as the bounds of our verification; then, we generate all possible behaviors from the Event-B specification within the defined ranges.

Construction of the environment of the operating system. In previous work, we verified OSEK OS by constructing a general model of the environment from scratch [21]: it includes a class diagram and state diagrams of objects in the environment. These diagrams are composed to generate the environment scripts. In the current work, the environment is generated from the Event-B specification. Hence, by construction, it is comprehensive with respect to the specification. The environment is used to exercise the design and check the given relation between variables of the Promela design and variables of the Event-B specification in every reachable state. This guarantees that the design conforms to the specification. Also, since the correctness of the specification is guaranteed by the tools of Event-B, the quality of the environment is improved.

Combination of Event-B model and model checking. For the combination of Event-B and model checking, tools like ProB [10] and Eboc [12] work as model checkers for Event-B. As another approach, [15] translates an Event-B model into a Promela model and uses Spin to check the model. We consider that we could obtain a skeleton of the design in Promela if we applied the mappings to the Event-B specification; however, we would still need to add design decisions to the target model. We used Promela to describe the design. Our work does not directly translate Event-B code into Promela but translates the LTS of the Event-B specification and the assertions into Promela. Then, we use Spin to check the simulation relation between the design model and the LTS of the specification in Promela.

### 8 Conclusion

We proposed an approach to verify designs against their formal specifications, where the specification and the design are described in different specification languages. A primary achievement of the approach is that it makes it possible to describe the specification and the design in languages appropriate for verifying the design. Formal specification languages are intended to facilitate describing specifications. Promela is intended for analyzing designs. Our approach follows these intentions faithfully. In fact, as mentioned in Section 1, it is natural for reactive systems like operating systems to describe the designs in imperative specification languages. On the other hand, describing their detailed properties in temporal logic is generally hard. It is easy to imagine that temporal logic formulas representing the specification shown in the case study would become very complex and prone to mistakes. Instead of temporal logic, we provide a way to represent the specification in the formal specification language Event-B and check the design against it with the Spin model checker. Event-B is appropriate for representing the specification because it has rich notions such as sets and relations. In addition, Event-B allows us to ensure the consistency and the correctness of the specification through its verification facilities, such as discharging proof obligations and refinement. That is, we can check the design against such a consistent and correct specification. This would drastically improve the reliability of the model checking results because the specification is reliable. Our approach may be applicable not only to Event-B and Promela but also to other specification languages. We plan to extend the verification framework to accept additional choices of specification languages.

#### References

1. Abrial, J.R.: Modeling in Event-B: system and software engineering. Cambridge University Press (2010)
2. Aoki, T.: Model checking multi-task software on real-time operating systems. In: The 11th IEEE International Symposium on Object Oriented Real-Time Distributed Computing. pp. 551–555 (2008)
3. Baier, C., Katoen, J.P.: Principles of Model Checking (Representation and Mind Series). The MIT Press (2008)
4. Bert, D., Potet, M.L., Stouls, N.: Genesyst: a tool to reason about behavioral aspects of B event specifications. Application to security properties (2010)
5. Broadfoot, P.J., Roscoe, A.W.: Tutorial on FDR and its applications. In: Proceedings of the 7th International SPIN Workshop. pp. 322–322 (2000)
6. Choi, Y.: Model checking Trampoline OS: a case study on safety analysis for automotive software. Softw. Test., Verif. Reliab. 24(1), 38–60 (2014)
7. Clarke, E.M., Grumberg, O., Long, D.E.: Model checking and abstraction. ACM Trans. Program. Lang. Syst. 16(5), 1512–1542 (Sep 1994)
8. Dwyer, M.B., Avrunin, G.S., Corbett, J.C.: Patterns in property specifications for finite-state verification. In: Proceedings of the 21st International Conference on Software Engineering. pp. 411–420. ICSE '99, ACM, New York, NY, USA (1999)
9. Klein, G., Andronick, J., Elphinstone, K., Heiser, G., Cock, D., Derrin, P., Elkaduwe, D., Engelhardt, K., Kolanski, R., Norrish, M., Sewell, T., Tuch, H., Winwood, S.: seL4: Formal verification of an operating-system kernel. Communications of the ACM 53(6), 107–115 (2010)
10. Leuschel, M., Butler, M.: ProB: An automated analysis toolset for the B method. International Journal on Software Tools for Technology Transfer 10(2), 185–203 (2008)
11. Lynch, N., Vaandrager, F.: Forward and backward simulations: I. Untimed systems. Inf. Comput. 121(2), 214–233 (Sep 1995)
12. Matos, P., Fischer, B., Marques-Silva, J.: A lazy unbounded model checker for Event-B. In: Formal Methods and Software Engineering, vol. 5885, pp. 485–503 (2009)
13. Milner, R.: Communication and concurrency. PHI Series in computer science, Prentice Hall (1989)
14. Muller, A.: VDM the Vienna development method (2009)
15. Muller, T.: Formal methods, model-checking and Rodin plugin development to link Event-B and Spin (2009)
16. O'Regan, G.: Z formal specification language. In: Mathematics in Computing, pp. 109–122. Springer London (2013)
17. OSEK/VDX Group: OSEK/VDX operating system specification 2.2.3, http://portal.osek-vdx.org/
18. Reeves, S., Streader, D.: Guarded operations, refinement and simulation. Electron. Notes Theor. Comput. Sci. 259, 177–191 (2009)
19. In der Rieden, T., Knapp, S.: An approach to the pervasive formal specification and verification of an automotive system. In: Proceedings of the 10th International Workshop on Formal Methods for Industrial Critical Systems. pp. 115–124 (2005)
20. Vu, D.H., Aoki, T.: Faithfully formalizing OSEK/VDX operating system specification. In: Proceedings of the 3rd Symposium on Information and Communication Technology. pp. 13–20 (2012)
21. Yatake, K., Aoki, T.: Model checking of OSEK/VDX OS design model based on environment modeling. In: Proceedings of the 9th International Colloquium on Theoretical Aspects of Computing (ICTAC '12). pp. 183–197 (2012)

## Analyzing Industrial Architectural Models by Simulation and Model-Checking

Raluca Marinescu<sup>1</sup>, Henrik Kaijser<sup>2</sup>, Marius Mikučionis<sup>3</sup>, Cristina Seceleanu<sup>1</sup>, Henrik Lönn<sup>2</sup>, and Alexandre David<sup>3</sup>

<sup>1</sup> Mälardalen University, Västerås, Sweden raluca.marinescu@mdh.se, cristina.seceleanu@mdh.se
<sup>2</sup> Volvo Group Trucks Technology, Göteborg, Sweden henrik.kaijser@volvo.com, henrik.lonn@volvo.com
<sup>3</sup> Aalborg University, Aalborg, Denmark marius@cs.aau.dk, adavid@cs.aau.dk

**Abstract.** The software architecture of any automotive system has to be decided well in advance of production, so it is very desirable to assess its quality in order to obtain quick indications of errors at early design phases. In this paper, we present a constellation of analysis techniques for architectural models described in EAST-ADL. The methods are complementary in terms of covering EAST-ADL model analysis against a rich set of requirements, and in terms of the varying degree of confidence in the provided guarantees. Based on the needs of the current model-driven development in a chosen automotive context, we propose three analysis techniques for EAST-ADL architectural models, in an attempt to tackle some of the exposed design needs: simulation of EAST-ADL functions in Simulink, model-checking of EAST-ADL models with timed automata semantics, and statistical model-checking in UPPAAL, applied to an automatically generated network of timed automata. An industrial Brake-by-Wire prototype is the case study on which we show the potential of simulating EAST-ADL models in Simulink, model-checking downscaled EAST-ADL models, as well as statistical model-checking of full model versions, in order to tame verification scalability problems.

## 1 Introduction

Mechanical and hydraulic systems in current vehicles are being replaced by electrical/electronic systems that can implement highly complex functions like cruise control and automatic braking. In order to deal with this complexity, the automotive industry has moved towards a model-based development process, during which high-level system models are designed and analyzed against requirements. Since many automotive systems are safety-critical, new standards such as ISO26262 place requirements on the quality of software. Consequently, companies that wish to adopt such standards will need to use methods and tools fit for guaranteeing such quality on each level of design abstraction.

Simulink [2], a model-based tool for design, simulation, and code generation of embedded systems, is already well established in the automotive domain. Simulink is typically used to define and assess system behavior in an early phase, or to create a detailed behavioral definition of the system in order to automatically generate the corresponding code. Architectural description languages, on the other hand, can be introduced earlier in the development, to provide models that can handle the complex software architecture of automotive systems. Compared to the current state-of-practice, architectural models offer a well-defined and standardized structure that deals with all the related information (e.g. functions, timing, triggering) of safety-critical systems [8]. A candidate for this task is EAST-ADL [7], an architectural description language dedicated to the modeling and development of automotive embedded systems. The use of such modeling notations enables the application of verification techniques early in the industrial development process, in an attempt to gain early-phase indications of possible functional and timing errors.

In this paper, we propose a constellation of complementary verification techniques that can be applied on EAST-ADL models to deliver various types of model correctness assurance. We start by briefly presenting the EAST-ADL architectural language and the tools involved in the verification process (see Section 2), and we discuss the current state-of-practice in the development of automotive systems as used nowadays by the automotive industry (see Section 3). Next, we present our simulation and model-checking methodology (see Section 4), and we show the verification techniques based on: (i) simulation of EAST-ADL models from a set of predefined verification cases with Simulink (see Section 6), (ii) symbolic simulation and formal verification of EAST-ADL with UPPAAL, and (iii) statistical model-checking of the architectural model with UPPAAL SMC (see Section 8). In order to enable the verification of architectural models in EAST-ADL, we also contribute a timed automata (TA) semantics that we propose for the EAST-ADL components (see Section 7). We show how the formal techniques underlying the tools complement each other, by applying the EAST-ADL to Simulink and the EAST-ADL to UPPAAL TA transformations to analyze the Brake-by-Wire (BBW) industrial system (see Section 5). Such an endeavor also exposes the advantages and limitations of each framework when used on an industrial system model, which can serve as a guiding result, especially if safety standards such as ISO 26262 are to be adopted. We end this paper by discussing related work (see Section 9), and by presenting our conclusions (see Section 10). The actual contribution of this paper consists of introducing two new transformations, one from EAST-ADL models to Simulink models, and one from EAST-ADL models to UPPAAL models, together with the application of simulation, model-checking and statistical model-checking on an industrial architectural model.

## 2 Brief Overview of the EAST-ADL Language

EAST-ADL [7] is an AUTOSAR [4] compatible architectural description language for automotive electronic systems. The functionality of the system is defined at four levels of abstraction, as follows. The *Vehicle Level* is the highest level of abstraction and describes the electronic features as they are perceived externally. Next, the *Analysis Level* allows an abstract functional representation of the architecture without prescribing a specific hardware topology. The *Design Level* presents a detailed functional representation of the architecture, plus the allocation of these elements onto the hardware platform. Last, the *Implementation Level* describes the implementation of the system using AUTOSAR elements. At each abstraction level, the system model relies on the definition of a set of FunctionTypes representing components that describe the functional structure of the system. Each of these FunctionTypes has: (i) a set of Flow Ports that provide and receive data, (ii) a FunctionTrigger that can be either time-based or event-based, and (iii) a FunctionBehavior. The system is modeled as a set of interconnected FunctionPrototypes, where each FunctionPrototype is an instantiation of the corresponding FunctionType. The execution of each FunctionPrototype is based on the "read-execute-write" semantics, which enables semantically sound analysis and behavioral composition, and makes the function execution independent of the notation used when defining its internal behavior. The FunctionBehavior is defined using different notations and tools, e.g., Simulink or UPPAAL PORT timed automata (TA) [13]. At each level of abstraction, the above structural elements of the system can be extended with annotations for orthogonal aspects like requirements, timing properties, generic constraints, etc. EAST-ADL also provides means to describe different validation and verification activities as VVCases for different levels of abstraction.

In the following section, we present a typical automotive development process and we try to identify different needs and gaps that need to be addressed.

## 3 The Current Development Process in an Automotive Context

We have identified four main groups of actors who are involved in a typical automotive development process: the *Client*, the *System Engineers*, the *Software Developers*, and the *Verification Engineers*.



Fig. 1: A typical automotive development process.

As depicted in Figure 1, the *Client* compiles a set of informal, natural language requirements describing the new system that needs to be implemented. The *System Engineers* break down these requirements in incremental steps, passing the current requirement set from one engineer to the other for further decomposition. The *Software Developers* further decompose these requirements while considering implementation elements like the system architecture. This new set of requirements, consisting of one requirement document per system component, is divided among the *Software Developers*, who create a model-based implementation of the components in the system. The components may be modeled using the Simulink tool, and the code is automatically generated based on these models. This code is integrated as the behavior of an AUTOSAR software component and, where necessary, adjusted by the *Software Developers*. In order to ensure correct behavior, model-in-the-loop and software-in-the-loop analyses are used. Once a software component has been implemented, it can be deployed on an electronic control unit (ECU) for component testing. Finally, the *Verification Engineers* perform testing at the system level directly on the platform, using manually written tests. Any bugs discovered in the implementation or any problems in the requirements are reported back to the person responsible for the implementation or requirement, respectively. Since models are starting to be included in the industrial development process, there is also an increased need for stronger evidence of model correctness with respect to functional or timing requirements.

For the development process described above, different state-of-the-art techniques could facilitate model integration and verification, as follows:

- Introducing architectural languages (like EAST-ADL) will keep track of requirements, features, functions, and hardware topology in an integrated model, making the design decisions consistent and traceable.
- Providing the behavior for architectural components based on formal definitions like TA, together with typical Simulink definitions, will enable alternative representations of the same function, hence providing a more comprehensive assessment of the system.
- Applying formal verification techniques, like model-checking, on the system's formalized structural and behavioral model will provide correctness assurances regarding important properties.

In order to adopt these steps, an integrated system model is needed, such that different verification techniques can be applied consistently, on the same system description, at various levels of abstraction.

## 4 Our Methodology for Analyzing Architectural Models

In this section, we propose a methodology for simulation and model-checking of EAST-ADL models, which is depicted in Figure 2. Our verification methodology consists of the following steps:

- Create the EAST-ADL model and provide the behavior of each FunctionPrototype as an FMU<sup>1</sup> [3] or a Simulink model;

<sup>1</sup> The Functional Mock-up Interface (FMI) is a tool-independent standard to support behavior models using a combination of XML files and compiled C code. The standard defines the concept of a Functional Mock-up Unit (FMU) as a software component that implements the FMI standard.



Fig. 2: Our simulation and model-checking methodology.

- Select the verification method:
  - 1. Simulation: by implementing an automatic transformation from the architectural model to a Simulink model and calling the Simulink tool, we can provide verification through simulation;
  - 2. Model-checking: by implementing an automatic transformation from the architectural model to a network of TA, we can use the UPPAAL or UPPAAL SMC model-checker to formally verify the system;
- Return the verification results back to the EAST-ADL model for possible improvements of the design.

There are several differences between the two frameworks. The simulation method requires the EAST-ADL model to be extended with verification and validation elements as VVCases, which describe the part of the model to be analyzed, together with the definition of monitor FunctionTypes, stimuli data, and the requirements to be verified. The behavioral model of the monitor is provided as an FMU or a Simulink model. The transformation to the network of TA provides formal semantics for the architectural model in terms of timed transition systems [5]. In order to preserve the informal semantics of the architectural language, the transformation produces a network of two synchronized TA for each EAST-ADL FunctionPrototype: an *Interface* TA with the elements provided in the architectural model and a *Behavior* TA.

The parts represented with a dotted line in Figure 2 have not been implemented in the current version of the transformation. By extending our methodology to include an automatic transformation from the Simulink component model to the corresponding *Behavior* TA, the two models would be consistent and the verification results of both frameworks would truly complement each other. However, information would be lost in such a transformation and the TA model would require manual refinements, such that the TA could represent the key behavior of the component in a way that is largely consistent with the corresponding Simulink model.

## 5 An Example from Industry: Brake-by-Wire Case Study



Fig. 3: The EAST-ADL model of the BBW system at Design Level.

In this section, we introduce the Brake-by-Wire (BBW) system that will be used throughout the paper as the running example to illustrate our techniques. The BBW system is a braking system equipped with an ABS function, and without any mechanical connectors between the brake pedal and the brake actuators. A sensor attached to the brake pedal reads its position, which is used to compute the desired global brake torque. For vehicles with stability control, the torque is influenced by the wheel speed and the desired torque for each wheel is calculated based on the following equation:

$$torque = (pos/100) \times maxBrakeTorque \times distribution \tag{1}$$

where pos is the pedal position with values  $\in [0,100]$ , maxBrakeTorque is the maximum global brake torque, and distribution is the static distribution factor. The ABS algorithm computes the slip rate s based on the following equation:

$$s = (v - w \times R)/v \tag{2}$$

where v is the speed of the vehicle, w is the speed of the wheel, and R is the radius of the wheel. The friction coefficient has a nonlinear relationship with the slip rate: when s starts increasing, the friction coefficient also increases, and its value reaches the peak when s is around 0.2. After that, further increase in s reduces the friction coefficient of the wheel. For this reason, if s is greater than 0.2 the brake actuator is released and no brake is applied, otherwise the requested brake torque is used.
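Equations (1) and (2), together with the release rule for slip rates above 0.2, can be written down as a minimal Python sketch. The function names and the numeric values for `max_brake_torque` and `distribution` in the usage lines are illustrative assumptions, not values from the BBW model.

```python
def wheel_torque(pos: float, max_brake_torque: float, distribution: float) -> float:
    """Equation (1): requested torque for one wheel; pos is in [0, 100]."""
    if not 0.0 <= pos <= 100.0:
        raise ValueError("pedal position must lie in [0, 100]")
    return (pos / 100.0) * max_brake_torque * distribution


def slip_rate(v: float, w: float, radius: float) -> float:
    """Equation (2): s = (v - w * R) / v, defined only for v > 0."""
    if v <= 0.0:
        raise ValueError("slip rate is undefined for non-positive vehicle speed")
    return (v - w * radius) / v


def abs_torque(requested: float, v: float, w: float, radius: float,
               slip_threshold: float = 0.2) -> float:
    """Release the brake (0 torque) when the slip rate exceeds the threshold."""
    return 0.0 if slip_rate(v, w, radius) > slip_threshold else requested


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    t = wheel_torque(pos=50.0, max_brake_torque=3000.0, distribution=0.25)
    print(t)                                          # torque per Equation (1)
    print(abs_torque(t, v=20.0, w=60.0, radius=0.3))  # slip 0.1: torque kept
    print(abs_torque(t, v=20.0, w=40.0, radius=0.3))  # slip 0.4: brake released
```

The guard on `v` makes explicit that Equation (2) is undefined for a vehicle at standstill, a case the text above does not discuss.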

Figure 3 presents the EAST-ADL model of the BBW system at the Design Level, and a set of requirements has been provided (to describe the functionality of this system at this level), as follows:

- $\mathbf{D}_1$  The torque on the wheel shall be defined as:  $(pos/100) \times maxBrakeTorque \times distribution$ .
- $\mathbf{D}_2$  If VehicleSpeedIn > ABSVehicleSpeedThreshhold and s > ABSSlipRateThreshhold, then ABSBrakeTorqueOut shall be set to 0 Nm.
- $\mathbf{D}_3$  If s <= ABSSlipRateThreshhold or VehicleSpeedIn <= ABSVehicleSpeedThreshhold, then ABSBrakeTorqueOut shall be set to RequestedTorqueIn.
- $\mathbf{D}_4$  Investigate the latency between the wheel sensor and the brake pedal actuator.

The goal of this work is to show how one can verify the above requirements on the EAST-ADL description, using various verification techniques that we present in the following.

## 6 Simulation of EAST-ADL Functional Architecture in Simulink

In this section we describe the simulation method proposed in Section 4, which has been implemented as an EATOP [1] plug-in called FMUSim that synthesizes a Simulink model and configures it according to the properties in the EAST-ADL model. The model transformation preserves the compositional hierarchy of the EAST-ADL model in EATOP, and is implemented as a one-to-one mapping between EAST-ADL elements and Simulink elements, as depicted in Table 1.

In order to simulate a time-triggered EAST-ADL function, the FMU block needs to be sampled once per period. However, the FMU blocks provided by the FMI Toolbox are continuous and cannot be sampled directly. As depicted in Figure 4, the solution chosen in this implementation is to add a pulse generator and a subsystem InputData that acts as a flip-flop clocked on the positive flank of the pulse. Since the execution of a Simulink block is instantaneous, another flip-flop OutputData is added, which is clocked on the negative flank of the pulse, such that the execution time of the FMU becomes equal to the pulse width. Similarly, in order to simulate an event-triggered EAST-ADL function, we reuse the negative flank of the trigger pulse from another time-triggered function that acts as the event source. The negative flank of EventTriggerIn is used to clock a flip-flop InputData to control execution start, as depicted in Figure 5. The execution period of the function is then simulated by adding a flip-flop OutputData, which is clocked on a step down that is generated at a time equal to the worst-case execution time (WCET) after the function starts executing. The clock signal is exported as EventTriggerOut for the pattern to be repeatable. This means that it is possible to simulate a chain of event-triggered functions with the pattern.

Table 1: Mapping rules for the EAST-ADL to Simulink transformation.

| EAST-ADL element                                     | Simulink element(s)                                                                                           |
|------------------------------------------------------|---------------------------------------------------------------------------------------------------------------|
| composed FunctionType                                | Subsystem                                                                                                     |
| Function Connector                                   | Line                                                                                                          |
| non-top-level Function Flow Port In                  | Inport                                                                                                        |
| non-top-level Function Flow Port Out                 | Outport                                                                                                       |
| top-level Function Flow Port In                      | Repeating Sequence Interpolated                                                                               |
| top-level Function Flow Port Out                     | Scope                                                                                                         |
| time-triggered leaf Function Type with FMU behavior  | Pattern with several elements                                                                                 |
| event-triggered leaf Function Type with FMU behavior | Pattern with several elements                                                                                 |
| continuous leaf Function Type with FMU behavior      | FMU Block                                                                                                     |
| leaf Function Type with Simulink behavior            | Same pattern as in the FMU cases above, but a copy of the behavior model is inserted instead of the FMU block |
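The time-triggered pattern with the two flip-flops can be illustrated by a small discrete-time sketch in Python. This is not the generated Simulink model: the integer time steps, the function name `simulate_time_triggered`, and the `fmu` callable standing in for the FMU block are assumptions made for illustration.

```python
from typing import Callable, List


def simulate_time_triggered(inputs: List[float], period: int, width: int,
                            fmu: Callable[[float], float],
                            initial_output: float = 0.0) -> List[float]:
    """Return the visible output at every time step.

    inputs[t] is the value on the input port at step t. The pulse is high
    for `width` steps at the start of every `period` steps.
    """
    if not 0 < width < period:
        raise ValueError("need 0 < width < period")
    latched_in = 0.0               # state of the InputData flip-flop
    latched_out = initial_output   # state of the OutputData flip-flop
    prev_pulse = False
    outputs = []
    for t, value in enumerate(inputs):
        pulse = (t % period) < width
        if pulse and not prev_pulse:    # positive flank: sample the input
            latched_in = value
        if prev_pulse and not pulse:    # negative flank: publish the result
            latched_out = fmu(latched_in)
        outputs.append(latched_out)
        prev_pulse = pulse
    return outputs
```

With `period=4` and `width=2`, an input sampled at step 0 becomes visible on the output at step 2, so the apparent execution time equals the pulse width, as described above.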



Fig. 4: Simulink pattern for modeling time-triggered execution of an EAST-ADL function with execution time. The block pLDM\_Brake\_FL represents the FMU.

In this transformation, we have not addressed the nondeterminism or the possible interleavings of the *FunctionPrototypes*' execution. Since we are performing simulations on the transformed model, the current execution pattern is one of infinitely many interleavings and event sequences, which means that some errors may be overlooked. To represent deviating clock speeds and arbitrary start-up times, an arbitrary component could be added by the transformation to the offset and period times, and a deterministic pseudo-random sequence would secure repeatability of the simulation runs. Multiple runs with randomized parametrization would increase confidence through the extended state space covered. However, these extensions to the method are not in the scope of this paper.

Fig. 5: Simulink pattern for modeling event-triggered execution of an EAST-ADL function with execution time. The block FMU\_function\_F represents the FMU.
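The randomization mentioned above is outside the scope of the paper, but the idea of repeatable perturbations can be sketched in Python with a seeded generator. The function name, the relative jitter model, and the uniform distributions are assumptions chosen for illustration only.

```python
import random
from typing import List, Tuple


def randomized_schedules(nominal: List[Tuple[float, float]], jitter: float,
                         seed: int) -> List[Tuple[float, float]]:
    """Perturb (offset, period) pairs of time-triggered functions.

    Each period is scaled by a relative jitter in [-jitter, +jitter] and
    each offset is shifted by a start-up delay of at most one period.
    """
    rng = random.Random(seed)   # private generator: same seed, same sequence
    result = []
    for offset, period in nominal:
        new_period = period * (1.0 + rng.uniform(-jitter, jitter))
        new_offset = offset + rng.uniform(0.0, new_period)  # arbitrary start-up time
        result.append((new_offset, new_period))
    return result
```

Recording the seed with each simulation run would make any failing run reproducible, while sweeping the seed over several runs varies the interleavings that are exercised.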

**Application on the BBW case study.** We have applied the transformation described above on the BBW case study. The resulting model contains one FMU for each leaf EAST-ADL FunctionPrototype, plus the required monitors for the VVCase specified in the EAST-ADL model.



Fig. 6: Implementation of the pBrakeTorqueRRMonitor. The lower half of the figure shows the contents of the block named for each subsystem in the upper half.



Fig. 7: Simulation results provided by the pBrakeTorqueRRMonitor.

As depicted in Figure 6, pBrakeTorqueRRMonitor is a complex monitor despite the fact that it verifies a simple linear function like requirement  $D_1$  for the rear right wheel. The time until a new pedal position has propagated through the system and has given rise to a new torque value GBC\_TorqueReq\_RR varies between delay\_min and delay\_max [ms]. As shown in Figure 7, the torque requested by the brake controller on the rear right wheel is a linear scaling of the pedal position delayed by the propagation time. The boolean monitor function "looks back" in time according to the delay interval, and is able to find a pedal position corresponding to the requested torque at all evaluated time points. The result shows that requirement  $D_1$  is satisfied to the extent guaranteed by the simulation technique.
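The "look back" idea of the monitor can be sketched in Python over sampled signals. This is not the Simulink implementation of Figure 6: the function name, the integer-step delays, and the single `gain` factor (standing for maxBrakeTorque × distribution / 100) are assumptions for illustration.

```python
from typing import Sequence


def lookback_monitor(pedal: Sequence[float], torque: Sequence[float],
                     gain: float, delay_min: int, delay_max: int,
                     tol: float = 1e-9) -> list:
    """For each step t, report whether torque[t] equals gain * pedal[t - d]
    for some delay d with delay_min <= d <= delay_max.

    Steps earlier than delay_max are reported as True because the
    look-back window is not yet fully available.
    """
    verdicts = []
    for t in range(len(torque)):
        if t < delay_max:
            verdicts.append(True)
            continue
        window = pedal[t - delay_max : t - delay_min + 1]
        verdicts.append(any(abs(torque[t] - gain * p) <= tol for p in window))
    return verdicts
```

A requirement such as $D_1$ would then be considered satisfied on a trace when every verdict is true; as with any simulation-based check, this says nothing about traces that were not simulated.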

## 7 Formal Semantics of EAST-ADL as a network of Timed Automata

To formally verify that the architectural model meets its requirements, we need to exhaustively explore all the function blocks in the model. In this context, we need to represent the execution semantics of the EAST-ADL function blocks using a network of TA (see Figure 2), which has a well-defined formal semantics in terms of timed transition systems [5]. We have developed an automatic transformation, considering a subset of the EAST-ADL elements, which we define as a tuple:

$$\text{EAST-ADL}_{DesignLevel} \triangleq \langle F_P, Con, DP, Trigg, TC \rangle,$$

where  $F_P$  is the set of FunctionPrototypes, Con is the set of connectors between the  $F_P$ , DP is the set of data ports, defined as the union of input ports and output ports, Trigg is the set of triggering elements, defined as the union of events and periodic triggers, and TC the set of the model's timing constraints. In a similar manner, the TA is defined as a tuple:

$$TA \triangleq \langle L, l_0, C, A, E, I \rangle,$$

where L is a finite set of locations,  $l_0 \in L$  is the initial location, C is a set of clocks, A is a set of possible actions, E is a set of edges between two locations, and I is a set of invariants attached to the locations.

The transformation is a one-to-one function  $\pi: \text{EAST-ADL}_{DesignLevel} \to TA$ , which maps each element in the EAST-ADL<sub>DesignLevel</sub> to a TA element. The mapping rules are:

- Each function  $F_P$  is defined in terms of a network of two TA, as shown in Figure 8. To preserve the "read-execute-write" semantics of EAST-ADL, the Interface TA (see Figure 8a) has four locations: (i) Idle, (ii) a Read location that allows the update of the variables according to the values on the input ports, independent of other computations, (iii) an Exec location that triggers the Behavior TA (see Figure 8b) that models the desired behavior of  $F_P$ , and (iv) a Write location that allows the update of the output ports according to the values of the computed internal variables, respectively, independent of other computations.
- Each input and output port DP is mapped to a global variable in the TA network, respectively.
- Each connector Con from output port  $Port_{out1}$  of  $F_{P1}$  to input port  $Port_{in2}$  of  $F_{P2}$  is transformed into an assignment  $Port_{in2} := Port_{out1}$ , along the edge from Idle to Read.
- The triggering of each Interface TA is based on the triggering Trigg associated to the EAST-ADL  $F_P$ . Concretely, this creates two possible instantiations of the Interface TA: (i) for time-triggered  $F_P$  the transformation produces a local clock, plus invariants and guards on TA (see Figure 9a), and (ii) for event-triggered  $F_P$  the transformation produces a set of dedicated variables that need to be constantly updated and reset, respectively (see Figure 10a).
- Other timing annotations TC, e.g., the execution time, can be included in the timing behavior of the TA model.
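The mapping rules above can be made concrete with an untimed Python sketch of one Interface TA cycling through its four locations. It deliberately ignores clocks, triggering, and synchronization channels; the class name, the dictionary of global port variables, and the `behavior` callable standing in for the Behavior TA are assumptions for illustration, not the generated UPPAAL templates.

```python
from enum import Enum
from typing import Callable, Dict


class Loc(Enum):
    IDLE = "Idle"
    READ = "Read"
    EXEC = "Exec"
    WRITE = "Write"


class InterfaceSketch:
    """One FunctionPrototype stepping through Idle -> Read -> Exec -> Write."""

    def __init__(self, connectors: Dict[str, str], outputs: Dict[str, str],
                 behavior: Callable[[Dict[str, float]], Dict[str, float]]):
        self.connectors = connectors  # input port -> connected output port
        self.outputs = outputs        # internal variable -> output port
        self.behavior = behavior      # stands in for the Behavior TA
        self.loc = Loc.IDLE
        self.local_in: Dict[str, float] = {}
        self.local_out: Dict[str, float] = {}

    def step(self, ports: Dict[str, float]) -> None:
        """Take one edge; `ports` plays the role of the global variables."""
        if self.loc is Loc.IDLE:
            # Connector rule: Port_in := Port_out on the edge from Idle to Read.
            for port_in, port_out in self.connectors.items():
                ports[port_in] = ports[port_out]
            self.loc = Loc.READ
        elif self.loc is Loc.READ:
            # Copy input ports to internal variables before executing.
            self.local_in = {p: ports[p] for p in self.connectors}
            self.loc = Loc.EXEC
        elif self.loc is Loc.EXEC:
            self.local_out = self.behavior(self.local_in)
            self.loc = Loc.WRITE
        else:  # Loc.WRITE: publish the computed values on the output ports
            for var, port in self.outputs.items():
                ports[port] = self.local_out[var]
            self.loc = Loc.IDLE
```

Because inputs are copied before execution and outputs are published only in the last step, a change on a connected output port during Exec does not affect the running computation, which is the point of the "read-execute-write" semantics.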



(a) Interface template. (b) Behavior template.

Fig. 8: The generic TA semantics of an EAST-ADL  $F_P$ .

Once we obtain the network of TA corresponding to the EAST-ADL model, one manually edits the *Behavior* TA to match the desired behavior of the corresponding *FunctionPrototype*. Formal analysis techniques like model-checking and statistical model-checking are then applied to verify the resulting model. In the next section we apply this transformation on the BBW EAST-ADL model, to enable the latter's verification.

## 8 Analysis of EAST-ADL Models Using Model-Checking and Statistical Model Checking

We have applied our method on the BBW architecture, and generated a network of 50 TA, by transforming each of the 25  $F_P$  of Figure 3 into a network of two synchronized TA, respectively. In Figures 9 and 10, we exemplify the transformation of two  $F_P$  as follows: Figure 9a presents the interface of the time-triggered pABS\_FL  $F_P$ , automatically generated from the EAST-ADL model, Figure 9b presents the behavior of the pABS\_FL  $F_P$  obtained after manually editing the dedicated TA template (see Figure 8b); Figure 10a shows the interface of the event-triggered pVehicleSpeedEstimator  $F_P$ , whereas Figure 10b shows the behavior of the pVehicleSpeedEstimator  $F_P$ , after manually editing the dedicated TA template.



Fig. 9: The TA model for the pABS\_FL EAST-ADL  $F_P$ .



Fig. 10: The TA model for the pVehicleSpeedEstimator EAST-ADL  $F_P$ .

On this formal model, we have applied model-checking and statistical model-checking techniques to validate the original EAST-ADL model against the requirements introduced in Section 5.

**Model-checking with UPPAAL.** With UPPAAL, we have simulated and attempted to verify the previously described network of TA. However, the size of the model has led to a state-space explosion. On a computer with a 1.8 GHz Intel processor and 8 GB of memory, the verifier could explore only 10 962 377 states before it ran out of memory. This is not surprising, since the BBW system is subject to an enormous state-space explosion due to the large number of TA in the network, each with its clock and its set of variables created based on the ports of the corresponding FunctionPrototype.

Consequently, we have used UPPAAL to verify a simplified version of the BBW system with one wheel only. Properties  $\mathbf{D}_2$  and  $\mathbf{D}_3$  are formalized as TCTL properties [5], as follows:

- $\mathbf{D}_2$  A[] pABS\_FL\_VehicleSpeedIn > speed\_thrshld and pABS\_FL\_s == true imply pABS\_FL\_ABSBrakeTorqueOut == 0
- $\mathbf{D}_3$  A[] pABS\_FL\_VehicleSpeedIn <= speed\_thrshld or pABS\_FL\_s == false imply pABS\_FL\_ABSBrakeTorqueOut == pABS\_FL\_RequestedTorqueIn

Both properties have been verified and hold on the model. For property  $\mathbf{D}_2$  the verification took 13.7 seconds and used 26 900 KB of memory. For property  $\mathbf{D}_3$  the verification took 9.1 seconds and used 26 916 KB of memory.
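To illustrate what checking an invariant of the form A[] φ means, the following Python sketch explores all reachable states of a tiny, untimed toy model with breadth-first search. It is an explicit-state search, not UPPAAL's symbolic algorithm, and the toy state space, the value of `SPEED_THRESHOLD`, and all function names are assumptions for illustration.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, Optional

SPEED_THRESHOLD = 5  # hypothetical value standing in for speed_thrshld


def check_invariant(initial: Hashable,
                    successors: Callable[[Hashable], Iterable[Hashable]],
                    invariant: Callable[[Hashable], bool]) -> Optional[Hashable]:
    """Breadth-first search; return a reachable violating state or None."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            return state
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None


def abs_out(speed: int, slipping: bool, requested: int) -> int:
    """Toy ABS behavior: release the brake when fast and slipping."""
    return 0 if (speed > SPEED_THRESHOLD and slipping) else requested


def successors(state):
    """The environment may choose any inputs; the output is then recomputed."""
    for speed in (0, 5, 10):
        for slipping in (False, True):
            for requested in (0, 100):
                yield (speed, slipping, requested,
                       abs_out(speed, slipping, requested))


def d2(state) -> bool:
    speed, slipping, _, out = state
    return not (speed > SPEED_THRESHOLD and slipping) or out == 0


def d3(state) -> bool:
    speed, slipping, requested, out = state
    return not (speed <= SPEED_THRESHOLD or not slipping) or out == requested
```

Calling `check_invariant((0, False, 0, 0), successors, d2)` returns `None` when no reachable state violates the property. The `seen` set grows with the number of reachable states, which is the resource that is exhausted when a model with many parallel components is explored.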

**Statistical model-checking with UPPAAL SMC.** TA is a suitable formalism for analyzing architectural models like EAST-ADL, and enables symbolic model-checking techniques to provide a rigorous proof or refutation of a TCTL property. However, such techniques suffer from state-space explosion in terms of the number of parallel components in the model, which is the case with complex, industrial systems. One possible solution is the use of a statistical model-checking engine to generate stochastic simulations and employ statistical methods to estimate probabilities and probability distributions over time with given confidence levels. The UPPAAL modeling language has been extended with probabilistic and dynamical constructs, giving a stochastic semantics to timed automata networks [9], and the tool has been equipped with statistical model-checking (SMC) algorithms [10] to decide qualitative properties in terms of probabilities and cost. The symbolic and statistical techniques complement each other: SMC can show results only up to a specified level of confidence and never for certain like symbolic techniques, but it is a cheap way to generate and confirm safety counter-examples where symbolic techniques may employ expensive over-approximation [11]. Here, we attempt to analyze requirement  $\mathbf{D}_4$ .

Since UPPAAL SMC works on stochastic models, we have manually added probabilistic extensions to the four-wheel BBW model that contains the timed behavior. Figures 11a and 11b show exponential rates added to locations Idle and Exec of one Encoder component of Figure 3. The rate of 1 means that the component may potentially stay in the location forever, but it will stay there for 1 time unit on average, which is consistent with the timed behavior. Further, we are interested in the latency between pressing the pedal and applying the brakes, hence we added a monitoring stop-watch automaton shown in Figure 11c. The monitoring automaton has a stop-watch L that is initially stopped in location Wait by specifying that the derivative is zero: L'==0. The stop-watch is started when synchronization pBrakePedalSensor\_beh\_start? is received (the derivative L'==1 is implicit in timed automata). The stop-watch is stopped again when any of the wheels receives a braking signal by synchronization pHW\_Brake\_FL\_beh\_start?, pHW\_Brake\_FR\_beh\_start?, pHW\_Brake\_RL\_beh\_start? or pHW\_Brake\_RR\_beh\_start? (the synchronizations are then on different edges that are drawn on top of each other to minimize cluttering). The latency can be estimated by the following query: Pr[bm.L<=1000](<>bm.Done), which asks for the probability that the brake monitor process bm will end up in location Done, in terms of the value of the stop-watch L.
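The principle behind such a probability estimate can be sketched in Python as a plain Monte Carlo experiment: run many stochastic simulations and report the fraction that reach the goal within the bound. This is not UPPAAL SMC's algorithm or the BBW model; the chain of `stages` exponentially delayed locations, the function names, and the use of a Chernoff-Hoeffding bound to size the experiment are assumptions for illustration.

```python
import math
import random


def required_runs(epsilon: float, delta: float) -> int:
    """Chernoff-Hoeffding bound: number of runs so that the estimate is within
    epsilon of the true probability with confidence at least 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def sample_latency(rng: random.Random, stages: int, rate: float) -> float:
    """One run: the signal crosses `stages` locations with exponential delays."""
    return sum(rng.expovariate(rate) for _ in range(stages))


def estimate_probability(bound: float, stages: int, rate: float,
                         epsilon: float, delta: float, seed: int) -> float:
    """Estimate Pr[latency <= bound] as the fraction of satisfying runs."""
    rng = random.Random(seed)
    runs = required_runs(epsilon, delta)
    hits = sum(sample_latency(rng, stages, rate) <= bound for _ in range(runs))
    return hits / runs
```

For example, `required_runs(0.05, 0.05)` gives 738 runs. The cost of the experiment depends on the requested precision and confidence rather than on the size of the state space, which is why this style of analysis scales to the full model, at the price of a statistical rather than an exhaustive guarantee.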

The result is shown in Figure 11d. The average latency is 5 time units, but it tends to be high even though our added stochastic delay assumptions are decreasing towards infinity, which is a worrying behavior. The good news is that the latency seems to be strictly limited by 6 time units: no simulation run with a latency greater than or equal to 6 time units has been observed. This is, on the other hand, surprising, as the model contains components with unlimited delays.



Fig. 11: The components decorated with stochastic extensions and estimated latency between pressing the pedal and applying brakes.

#### 9 Related Work

Several researchers have looked into the formal analysis and verification of EAST-ADL models. Kang et al. [13] propose a component-based analysis framework for EAST-ADL models extended with TA semantics, based on the UPPAAL PORT model-checker. Mallet et al. [14] describe the use of the UML MARTE profile for the timing analysis of EAST-ADL. In addition, Feng et al. [12] propose a translation of EAST-ADL activity diagrams into the input language of SPIN for formal verification. More recently, Qureshi et al. [15] describe a model-to-model transformation from EAST-ADL to timed automata towards formal verification based on timing constraints using UPPAAL. Closely related to our work, in the context of model-driven development, Biehl et al. [6] propose a modular approach for data integration, together with their experiences from applying this approach for the verification of EAST-ADL models. The latter is focused on introducing a systematic solution for model-based tool integration, whereas our work is focused on the analysis of industrial systems through complementary methodologies that provide various degrees of assurance.

#### 10 Conclusions and Discussion

In this paper, we have presented a set of analysis techniques dedicated to the simulation and verification of automotive embedded systems specified in the EAST-ADL architectural language. In order to provide different correctness guarantees, we present three techniques that enable the transformation of EAST-ADL models into, and their analysis with: (i) Simulink, a design and simulation tool used extensively in industry, (ii) UPPAAL for model-checking purposes, and (iii) UPPAAL SMC, a new extension of UPPAAL with statistical model-checking capabilities. We report our analysis results from applying all these frameworks to the industrial BBW case study. As future work, we intend to investigate the possible integration and application of these frameworks into the large-vehicle industrial development process.

**Limitations.** Our current transformation to Simulink does not support jittering of the execution start time and period times. The coverage of the state space in terms of different function execution orders and phasings is thus very low, but sufficient to detect the fundamental problems.

The model transformations from EAST-ADL to the network of TA and to the Simulink model rely on the execution semantics of EAST-ADL. However, the TA used to define *FunctionBehavior* is difficult to make fully consistent with the richer representation of the Simulink model or the FMU that is used by the FMUSim tool. The verifications are thus complementary, and will not in general verify the same properties.

**Lessons Learned.** Both transformations presented in the paper are conceptually simple, making them easy to implement and fast to execute. The two model transformations preserve the structure of the architecture, which simplifies the understanding and the debugging of the model. In our transformation to Simulink, it is possible to define useful transformation patterns for time- and event-triggered functions based on the FMI Toolbox and legacy Simulink blocks only, so additional commercial toolboxes are not required. EAST-ADL models with feedback loops require the loops to be broken before they can be simulated in Simulink. This can be achieved either by adding a memory block somewhere in each loop or by latching the subsystem ports of at least one subsystem in each loop. Moreover, the network of TA can easily be used for statistical model-checking with UPPAAL SMC, so that the model can still be analyzed, up to a given level of confidence, even if the analysis with UPPAAL leads to a state-space explosion.
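The effect of breaking a feedback loop with a memory block can be sketched in plain Python. This is only an illustrative analogue, not Simulink or the transformation described here: the class `Memory`, the function `simulate`, and the parameters `setpoint` and `gain` are hypothetical. The memory element outputs the value stored in the previous step, so each step can be evaluated without a circular dependency.

```python
class Memory:
    """Unit delay: outputs the value that was stored in the previous step."""

    def __init__(self, initial: float = 0.0):
        self._state = initial

    def output(self) -> float:
        return self._state

    def update(self, value: float) -> None:
        self._state = value


def simulate(steps: int, setpoint: float = 1.0, gain: float = 0.5):
    """Simulate a simple feedback loop whose feedback path contains a Memory."""
    mem = Memory(0.0)
    trace = []
    for _ in range(steps):
        feedback = mem.output()        # previous-step value: the loop is broken here
        error = setpoint - feedback    # comparison against the reference value
        y = feedback + gain * error    # output computed from already known values
        mem.update(y)                  # stored for use in the next step
        trace.append(y)
    return trace


if __name__ == "__main__":
    print(simulate(3))
```

Without the memory element, `y` would depend on `feedback` and `feedback` on `y` within the same step, which is the kind of circular dependency that has to be removed before simulation.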

**Acknowledgment:** The research leading to these results has received funding from the ARTEMIS Joint Undertaking under grant agreement number 269335, and from VINNOVA, the Swedish Governmental Agency for Innovation Systems, within the MBAT project.

#### References

- 1. Eclipse. The EAST-ADL Tool Platform (EATOP) Editor Tool. Available from http://www.eclipse.org/proposals/modeling.eatop/, (2014).
- 2. MathWorks. The MATLAB Simulink Design Tool. Available from http://www.mathworks.se/products/simulink/, (2014).
- 3. Modelica Association Project. The Functional Mock-up Interface (FMI) Standard. Available from http://www.fmi-standard.org/, (2014).
- 4. The AUTomotive Open System ARchitecture (AUTOSAR). Available from http://www.autosar.org/, (2014).
- 5. Rajeev Alur. Timed Automata. In *Computer Aided Verification*, pages 8–22. Springer, (1999).
- 6. Matthias Biehl, Carl-Johan Sjöstedt, and Martin Törngren. A Modular Tool Integration Approach: Experiences From Two Case Studies. In *3rd Workshop on Model-Driven Tool & Process Integration at the European Conference on Modelling Foundations and Applications*, (2010).
- 7. Hans Blom, Henrik Lönn, Frank Hagl, Yiannis Papadopoulos, Mark-Oliver Reiser, Carl-Johan Sjöstedt, De-Jiu Chen, Fulvio Tagliabò, Sandra Torchiaro, and Sara Tucci. EAST-ADL: An Architecture Description Language for Automotive Software-Intensive Systems. EAST-ADL WhitePaper, Volume 1, (2013).
- 8. Philippe Cuenot, DeJiu Chen, Sebastien Gerard, Henrik Lönn, M-O Reiser, David Servat, C-J Sjöstedt, Ramin Tavakoli Kolagari, Martin Törngren, and Matthias Weber. Managing Complexity of Automotive Electronics using the EAST-ADL. In *12th IEEE International Conference on Engineering Complex Computer Systems*, pages 353–358. IEEE, (2007).
- 9. Alexandre David, Kim G. Larsen, Axel Legay, Marius Mikucionis, Danny Bøgsted Poulsen, Jonas van Vliet, and Zheng Wang. Statistical Model Checking for Networks of Priced Timed Automata. In *International Conference on Formal Modeling and Analysis of Timed Systems (FORMATS)*, pages 80–96, (2011).
- 10. Alexandre David, Kim G. Larsen, Axel Legay, Marius Mikucionis, and Zheng Wang. Time for Statistical Model Checking of Real-Time Systems. In *Computer-Aided Verification*, pages 349–355, (2011).
- 11. Alexandre David, Kim Guldstrand Larsen, Axel Legay, and Marius Mikucionis. Schedulability of Herschel-Planck Revisited Using Statistical Model Checking. In *International Symposium On Leveraging Applications of Formal Methods, Verification and Validation*, pages 293–307, (2012).
- 12. Lei Feng, DeJiu Chen, Henrik Lönn, and Martin Törngren. Verifying System Behaviors in EAST-ADL2 with the SPIN Model Checker. In *International Conference on Mechatronics and Automation*, pages 144–149, (2010).
- 13. Eun-Young Kang, Eduard Paul Enoiu, Raluca Marinescu, Cristina Seceleanu, Pierre-Yves Schobbens, and Paul Pettersson. A Methodology for Formal Analysis and Verification of EAST-ADL Models. *Reliability Engineering & System Safety International Journal*, (2013).
- 14. Frédéric Mallet, Marie-Agnès Peraldi-Frati, and Charles André. Marte CCSL to Execute EAST-ADL Timing Requirements. In *International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing*, pages 249–253. IEEE, (2009).
- 15. Tahir Naseer Qureshi, De-Jiu Chen, Magnus Persson, and Martin Törngren. On Integrating EAST-ADL and UPPAAL for Embedded System Architecture Verification. In *Embedded Systems Development*, volume 20, pages 85–99. Springer, (2014).