1. Research & development
  2. > Blog >
  3. Formally Verifying a Critical Smart Contract

Formally Verifying a Critical Smart Contract

20 January 2020
Nomadic Labs

One of the main goals of Nomadic Labs is the development and applications of formal methods in the domain of distributed software, blockchains and smart contracts. In particular for the Tezos blockchain, for which we also develop the Cortez smartphone wallet (Android, iPhone). This wallet helps Tezos users manage their account and funds in a safe and secure manner.

How can the user be assured that this critical software has been correctly implemented? In this blog post, we detail how we have applied formal methods to verify the safety of a core component of the Cortez safety layer: the Spending Limit Contract smart contract.

We first detail the contract and its specification. Even though the deployed contract is hand-written in Michelson, we here give a more readable pseudocode in CameLIGO. We then overview common approaches to formal methods, before discussing the tool we have used: Mi-Cho-Coq. Finally, we discuss the verification effort itself.

The Spending Limit Contract

The Spending Limit Contract (SLC) is a critical component of the safety layer of the Cortez wallet, as discussed earlier on this blog. This contract ensures that only a limited amount of funds can be spent within a given time window. This adds a layer of security to the user’s account in the unfortunate event that their device is stolen. The contract accounts user spending in a queue of past transfers. This queue is implemented using a pair of lists, a technique for optimizing gas consumption3. However, this implementation increases the complexity of the contract’s logic significantly.

The Spending Limit Contract specifies two roles: an administrator and a user. They are identified by their respective key_hash, and their calls to the contract must be authenticated through a cryptographic signature using this key_hash.

Historically, the contract used authentication by caller. That is, calls to protected entrypoint would be rejected if the caller’s address did not correspond to that of the appropriate role. However, this approach required the revelation of two public keys when deploying the contract, as well as another revelation if either of the key should be changed. As revelations are costly, the current version of SLC has passed to authentication by signature. In this mode, the keys for signing are not required to be originated. Additionally, the same address can be used to call both administrator and user entrypoints, even if their signing keys change.

The administrator has full control over the contract through the administer entrypoint. First, they can reset its storage. Second, they can pass it a lambda that produces a list of operations when executed, that the contract then returns. For example, the first feature can be used to change the identity of the user or the administrator, or to modify the duration of the time window and the spending limit. The second feature can be used to remove an arbitrary amount of tokens from the account.

In earlier version of the contract, the administrator was limited to resetting the storage. Consequently, to perform a transfer larger than the limit, the administrator would have to perform three calls: one to raise the limit, a second to perform the transfer, and a third to restore the limit. By letting the administrator pass a lambda, the contract is both more flexible and less costly to use.

On the other hand, the manipulations of the user, passing through the transfer entrypoint, is subject to the restrictions of the time window and the spending limit. The user supplies a list of transfers. The smart contract verifies that the total sum of these transfers, added to any previous transfers in the time window, does not exceed the limit, in which case they are executed. Historically, the contract allowed only one transfer per call to transfer. To amend this limitation, the current version allows the user to pass a list of transfers, thus reducing the cost of performing multiple transfers.

The contract is composed of the administer and transfer entrypoints, along with a simple unauthenticated entrypoint receive used to receive funds. The actual contract is hand-written in Michelson, but for a legible illustration, we give its pseudocode in CameLIGO. Consequently, the main function of the contract dispatches between the entrypoints with their respective parameters:

type param =
  Receive of unit
| Administer of administer_param
| Transfer of transfer_param

let main (param, s : param * storage) =
  match param with
  | Receive p -> receive (p, s)
  | Administer p -> administer (p, s)
  | Transfer p -> transfer (p, s)

The full CameLIGO version, including the definition of administrate_param and transfer_param, can be found here. We now give an informal specification of the desired behavior of each entrypoint.


let change_storage (csp , sto : change_storage_param * storage) : operation list * storage =
  let key = csp.0 in
  let sign = csp.1 in
  let new_sto = csp.2 in
  if Crypto.hash_key key = sto.key_administrator
     && Crypto.check key sign (Bytes.pack (new_sto,sto.salt_administrator))
    (([] : operation list), new_sto)
    (failwith "Bad signature" : operation list * storage)

let run_function (rfp, sto : run_function_param * storage ) : operation list * storage =
  let key = rfp.0 in
  let sign = rfp.1 in
  let admin_function = rfp.2 in
  let kh = rfp.3 in
  if Crypto.hash_key key = sto.key_administrator &&
       Crypto.check key sign (Bytes.pack (admin_function,sto.salt_administrator)) then
    (admin_function (),
     { key_user = sto.key_user ;
       key_administrator = kh ;
       current_threshold = sto.current_threshold;
       time_window  = sto.time_window;
       queue =  sto.queue ;
       salt_user = sto.salt_user ;
       salt_administrator = sto.salt_administrator + 1n})
    (failwith "Bad signature" : operation list * storage)

let administer (p, s : administer_param * storage) =
  match p with
  | Change_storage cs -> change_storage (cs, s)
  | Run_function mt -> run_function (mt, s)

The administer entrypoint, illustrated by the above CameLIGO pseudocode, receives a public key, a payload and a signature of this payload. It verifies that the hash of this key equals the stored key hash of the administrator. It also verifies that the signature of the payload is valid and created by the administrator. If either of these conditions are unfulfilled, the contract fails.

Then, depending on the payload, it either updates the storage of the context, or executes the received lambda and returns the obtained list of operations.


let reverse (l : (timestamp * tez) list) : (timestamp * tez) list =
    (fun (l, x : (timestamp * tez) list * (timestamp * tez)) -> x :: l)
    l ([] : (timestamp * tez) list)

let get_time (n : int) : timestamp = Current.time

let prune (threshold, queue : tez * queue_t) : bool * tez * queue_t =
  match queue.left with
    [] -> if List.size queue.right = 0n
          then stop (threshold, {left = empty_queue_comp; right = empty_queue_comp})
          else continue (threshold, {left = reverse queue.right; right = empty_queue_comp})
  | hd::left_tl -> let ts = hd.0 in
                   let amount = hd.1 in
                   if ts > Current.time
                   then stop (threshold, queue)
                   else continue (threshold + amount, {left = left_tl; right = queue.right})

let transfer (trp, sto : transfer_param * storage) : operation list * storage =
  let key = trp.0 in
  let sign = trp.1 in
  let transfers = trp.2 in
  let kh = trp.3 in
  if Crypto.hash_key key = sto.key_user && Crypto.check key sign (Bytes.pack (transfers, sto.salt_user)) then
    let current_threshold, queue = Loop.fold_while prune (sto.current_threshold, sto.queue) in
    let total_amount = List.fold (fun (acc, amount_dest : tez * (tez * unit contract))  -> amount_dest.0 + acc) transfers 0mutez in
    let ops = List.map
                (fun (amount, destination : tez * unit contract) ->
                  Operation.transaction unit amount destination)
                transfers in
     { key_user = kh;
       key_administrator = sto.key_administrator ;
       current_threshold = current_threshold - total_amount;
       time_window = sto.time_window;
       queue = {left = queue.left; right = (Current.time, total_amount) :: queue.right};
       salt_user = sto.salt_user + 1n;
       salt_administrator = sto.salt_administrator})
  else (failwith "bad_signature" : operation list * storage)

The transfer entrypoint, illustrated by the CameLIGO pseudocode above, receives a public key, a list of new transfer orders and a signature of this list. The contract verifies that the hash of this key equals the stored key hash of the user. It also verifies that the signature of the payload is valid and created by the user. If either of these conditions are unfulfilled, the contract fails.

The contract maintains a queue of all previous transfers that fell within the time window at the previous execution of the contract. In the transfer entrypoint, this list is traversed, and all members of this queue that are now outside the time window are removed.

The contract also maintains the current limit, i.e., the amount of funds that can be spent at this moment. The current limit is now updated by adding to it the sum of the funds in the removed elements in the previous step.

If the user attempts to overspend, i.e., if the total of the funds in the new list of transfers exceeds the current limit, then the contract will fail at this point. Otherwise, meaning that the list of ordered transfers are allowed by the time window and spending limit, then a TRANSFER_TOKEN operation is emitted for each ordered transfer with the specified amount and destination.

Example. To illustrate the behavior of the transfer entrypoint, consider a contract that is configured so that no more than 30 ꜩ can be spent every 40 minutes. In other words, the time limit is set to 40 minutes and the initial limit to 30 ꜩ. Assume the user has transferred 3 ꜩ from the contract at timestamp 12:20 (transfer t1), and 10 ꜩ at timestamp 12:40 (transfer t2). After performing the last transfer, the current limit is set to 30 - 10 - 3 = 17 ꜩ. We can illustrate the state of the contract by a timeline, as below. The timeline is decorated with all past transfers, and the part of the timeline that falls in the time window is highlighted in blue.

Transfer entrypoint, initial state

Now, say the user attempts to transfer 30 ꜩ at 13:10 (transfer t3). The contract traverses the queue of past transfers. It finds that transfer t1 falls outside the time window, prunes it, and so increments the current threshold by 3 from 17 to 20 ꜩ. Nevertheless, the amount of the transfer (30 ꜩ) exceeds the current limit (20 ꜩ), and hence the request is rejected. The contract execution is cancelled, its state is unchanged and no operation is emitted.

Transfer entrypoint, invalid transfer

The user must opt for a more tempered spending: this time, at timestamp 13:20, they request a transfer of 15 ꜩ (transfer t4). Again, the history is traversed and as before the current limit is incremented to 20 ꜩ. Now, the amount of the transfer 15 ꜩ falls snugly under the current limit of 20 ꜩ. Consequently the transfer is accepted, a TRANSFER_TOKENS operation is issued, and the queue and current limit are updated:

Transfer entrypoint, valid transfer


The third entrypoint, receive is used to transfer funds to the contract without provoking any other changes in its state. This entrypoint emits no operations and does not modify the storage. This is succinctly expressed in CameLIGO pseudocode:

let receive (p, s : unit * storage) : operation list * storage = ([] : operation list), s

Approaches to Formal Verification

The criticality of the Cortez wallet, the security motivation of the spending limit contract, and the difficulties of modifying already-deployed smart contracts make SLC a prime target for formal verification. In addition, the optimized representation of the queue induces a gap between the intuitive behavior of the contract and its implementation, which makes verification non-trivial.

While code review and testing are capable of detecting the presence of bugs, they can never completely guarantee the absence of undesired behaviors. Making sure programs always behave as expected requires a more rigorous and systematic approach. This is the mainspring of formal methods. At the heart of any formal method is the specification of a system: the definition of its desired behavior in a rigorous mathematics-based – formal – flavor.

If a program is seen as a sentence in a programming language, then the semantics of this programming language is the rules that give a sense to this program, i.e. its behaviors. Hence, verifying a program is to verify that its semantics complies with its specification. We now briefly discuss five common formal methods: model checking, abstract interpretation, mechanized verification, correctness-by-construction, and deductive verification.

Model checking is a model-based formal method. Instead of reasoning on the program, verification is applied to a model: a transition state machine that enables the inspection of every possible execution path of the program. The specification, typically written in temporal logics, can then either been verified on this model or a counter-example is exhibited. The drawback of model checking is the combinatorial explosion of paths that must be explored, whose number is proportional to the size of the model.

Abstract interpretation proposes to reason on an abstract semantics instead of reasoning on the concrete execution semantics. The purpose of abstract semantics is to provide dedicated constructions to the specification to be proven. Consequently, the abstract interpretation eases the verification. The difficulty here resides in designing an appropriate and sound approximation, a complex task that is also very dependent on the properties to prove.

Mechanized verification. It is possible to use a proof assistant in order to mechanically assist the construction of a proof establishing the correctness of a program. In this case, the semantics of the programming language used, and the specification of the program is written in the proof assistant’s logic and the proof of correctness is constructed using the proof assistant’s tactics language.

Correctness-by-construction ensures compliance of the program with its specification using a top-down approach. The implementation is done in a theorem prover that (i) enables extraction of an executable program from the specification and (ii) enables the mechanized verification of the program. The program is then formally certified: it comes with its replayable formal proof.

Deductive verification enables verifying the compliance of the program with its specification directly on the behavior of the program thanks to a Hoare logics4 that enables reasoning on pre- and post-state of a program along with its semantics. The specification of the program is then expressed through properties that holds on the pre-state of the program execution, called precondition and properties that holds on the post-state of its execution, called postcondition. Together, the precondition, program and postcondition form a Hoare triplet. While Hoare logics is a relation that might not terminate, weakest precondition calculus is a function that computes the weakest precondition for a program and a postcondition in order to form a valid Hoare triplet. During the weakest precondition calculus proof obligations are produced. In mainstream tools, they are dispatched to automatic provers that either validate them, refute them or cannot decide. While easing the proof engineer’s work, using automatic provers presents two limits when used in conjunction with interactive proving. First, automatic provers typically do not adhere to a common proof format. Consequently, the proof assistants must implement a wide variety of formats, or trust the automatic prover blindly. Second, when the automatic prover can’t decide, the proof engineer has to provide a counter-example or handle the proof manually in a proof assistant. In that case, the specification that is originally designed to suit an automatic prover may be inconvenient for the proof engineer working in a proof assistant.

We have formally specified the Spending Limit Contract, in other words, described precisely and unambiguously its desired behavior. Using the weakest precondition calculus of Mi-Cho-Coq, we have then proved that the SLC contract fulfills this specification. Not only do we ensure that SLC “does what it is supposed to do”, that is, throttle spending. We also ensure that its execution does not cause any unintended run-time errors1.


Mi-Cho-Coq is a formalization of the semantics of the Tezos smart contract language Michelson in Coq. Michelson is a strongly typed, pure functional and stack-based programming language. Each Michelson instruction can be seen as a function from an input stack of working data to an output stack, in a blockchain environment. The blockchain environment gives current state of the blockchain, containing information such as the current timestamp and the balance of the executing contract.

Whereas Michelson has always had a pen-and-paper semantics, encoding it in a proof assistant such as Coq helps to ensure that all corner cases are unambiguously specified. Mi-Cho-Coq also enables deductive verification of Michelson programs thanks to its weakest precondition calculus that enables using the high-level expressiveness of Coq. This requires that all proof obligations are discharged by the proof engineer in Coq.

In Mi-Cho-Coq, the semantics of Michelson programs is given by an evaluation function eval that is similar to the interpreter embedded in the tezos-node:

  Fixpoint eval (* omitted parameters *)
           (env : @proto_env param_ty)
           (i : instruction param_ty tff0 A B)
           (fuel : Datatypes.nat)
           (SA : stack A)
           {struct fuel} : M (stack B) :=
    match fuel with
    | O => Failed _ Out_of_fuel
    | S n =>
      match i, SA, env with
      (* ... *)
      (* Semantics of the NOOP instruction *)
      | NOOP, SA, _ => Return SA
      (* Semantics of the SEQ instruction *)
      | SEQ i1 i2, SA, env =>
        let! r := eval env i1 n SA in
        eval env i2 n r
      (* ... *)

The semantics of a Michelson instruction i is given by the recursive function eval. This is a function that takes as input a blockchain state env, an input stack SA and a fuel argument, and returns an output stack. To maintain soundness of proofs expressed in Coq, all functions must be total and terminating. To fulfill the first requirement, the result of the eval is wrapped in the error monad: M (stack B). This simply means that eval either returns an output stack of type stack B or fails, which happens in the case of run-time errors or lack of fuel. The fuel (not to be confused with gas) argument ensures that the eval function is terminating, by bounding the maximum number of recursive steps that the function is allowed to take.

The snippet above demonstrates the semantics of the instruction NOOP, that simply returns the input stack SA unchanged. The semantics of the sequencing instruction SEQ i1 i2 consists of first executing i1. If successful, a new stack r is obtained on which i2 is executed.

Verifying Spending Limit in Mi-Cho-Coq

As the interpreter of Mi-Cho-Coq looks similar to the one already implemented in the tezos-node, one might wonder why go through greats lengths to rewrite the same thing? The answer is that thanks to the rich type-system of Coq, we can now state and prove properties on the behavior of eval. For instance, we can formalize the informal specification of Spending Limit Contract sketched out above in Coq, and then assert that each execution of the contract should adhere to the specification. This assertion translates into the following theorem, as expressed in Coq:

Theorem slc_correct input output :
  forall fuel,
    fuel >= slc_fuel_bound input ->
    eval env slc fuel input = Return output <->
    slc_spec fuel input output.

We let slc_fuel be a function giving a bound on the number of steps necessary to execute the contract, that depends notably on the length of the queue of past transfers. The script of the SLC smart contract is slc and its specification is given by slc_spec. This theorem states that when given a sufficient amount of fuel, the spending limit contract when executed on the input stack will return the stack output if, and only if (denoted <-> in Coq), this input and this output are related by the slc_spec relation.

The semantics of a Michelson contract is a function from an input (pre-) stack to an output (post-)stack, and the contract is a sequence of instructions. Hence, it is adequate to use the weakest precondition calculus of Mi-Cho-Coq. This calculus translates the statement eval env slc fuel input = Return output into a logical relation between the input and output stacks. For the details on how we translate the informal specification given above, into its formalized version slc_spec as well as the actual proof of the theorem using the weakest precondition calculus, we refer to the ongoing Mi-Cho-Coq merge request containing the full development.

To summarize, we have elaborated an informal specification of the Spending Limit Contract, that we have then formalized in Coq. We have then verified that the Spending Limit Contract conforms to this specification using Mi-Cho-Coq. In addition to showing that SLC behaves as expected, this also proves that the Spending Limit Contract cannot provoke any unforeseen run-time errors. All-in-all, this work required roughly three weeks of work, and the full development consists of 1377 lines of Coq code. Of these lines, 269 consist of auxiliary lemmas that we hope will be helpful for proving future contracts. Additionally, 518 lines concern the specification and verification of the code manipulating the queue of past transfers. As this data structure, and its optimized representation used in SLC, is standard, we can reasonably assume the reusability of these proofs.

Next Steps

While this development allows us to increase trust in the Spending Limit Contract, and by extension, the Cortez wallet, we have still only solved one half of the verification puzzle. The specification we have proved only concerns the functional correctness of SLC, that is, its input-output behavior. However, there is another set of properties called temporal correctness that concerns the behavior of a program over a sequence of calls.

For instance, in the case of the Spending Limit Contract, we would like to be able to specify and prove that the funds in the contract can eventually be used. This type of property cannot be assured by simply establishing a relation between the contract’s input and its computed output. Instead, the behavior of the contract must be investigated over all possible sequences of calls. This kind of reasoning is typically the domain of tools called model checkers. Case in point, the temporal properties of an auction contract for Tezos have been verified2 using the Cubicle model checker. However, temporal properties can also be expressed and proved deductively in Coq, as have been done for Scilla, another smart-contract language. The next step in our plan to assure the trust in Cortez is to complete the verification puzzle for SLC by specifying and proving its desired temporal properties, either through model checking or by deductive proof.

Another important missing link is that between these formal proofs, currently stored in a GitLab repository, and the actual smart contracts that live on the block chain. Consider a user that wants to ensure the validity of a purported correctness proof of a deployed contract. They must first install Coq and Mi-Cho-Coq and have Coq verify the correctness proof of the contract. Then, they must download the contract and verify that it is indeed exactly the same as the one proved in the Mi-Cho-Coq distribution.

We envision a smart contract explorer that automatically associates contract hashes with properties, such as correctness specifications and their associated proofs. Users can add new proofs and the explorer verifies their validity. Additionally, the explorer provides instructions allowing other users to replicate the proofs. This kind of explorer would be important to illustrate the nature and importance of formal methods to end-users, and would increase overall confidence in smart-contract based systems.

  1. A run-time error may result from a programming error but can also be caused intentionally to halt an execution, for instance when erroneous parameters have been supplied. This is used in SLC in the transfer entrypoint. When the user requests a transfer, then its sum is subtracted from current threshold. Unless the threshold is superior or equal to the sum, this subtraction results in a tez overflow that halts execution. This implementation effectively disallows overspending. 

  2. Sylvain Conchon, Alexandrina Korneva and Fatiha Zaidi. Verifying Smart Contracts with Cubicle. Presented at FMBC‘2019, publication forthcoming. 

  3. Chris Okasaki, Purely Functional Data Structures, p. 42. 

  4. C. A. R. Hoare. 1969. An axiomatic basis for computer programming. Commun. ACM 12, 10 (October 1969), 576-580. DOI=http://dx.doi.org/10.1145/363235.363259