How Does The Blockchain Know A Football Match's Result?

The Ethereum blockchain allows so-called "smart contracts" to be executed on the blockchain itself. Bitcoin was the first and is still the biggest cryptocurrency; Ethereum has taken the second spot with its focus on smart contracts. But there is an obstacle: smart contracts on the blockchain cannot access data from outside the blockchain. However, real-world data is most often the basis for the processes in a smart contract. How does a smart contract get information from the real world if it cannot fetch that information by itself?

Smart contracts on Ethereum are small programs, usually written in Solidity. These programs execute on the blockchain itself and can act upon data that is available on the blockchain. The Ethereum blockchain focuses on these smart contracts rather than on payment functions alone. Smart contracts mimic "normal", paper-based contracts in that they encode various clauses and conditions that have been agreed on between two or more parties. In addition to just storing these conditions, a smart contract also executes them, just like a "normal" computer program would. A smart contract is thus an entity that combines the functions of a traditional, paper-based contract and a computer program.

Because the blockchain is an "append-only" system, a contract on the blockchain cannot be changed under normal circumstances (there have been known deviations from this rule). For "normal" operations the blockchain can be considered immutable. A smart contract will execute the encoded clauses automatically. Because the blockchain is an open system, everyone can look at a smart contract and verify that it acts as advertised.

A Bet As a Smart Contract

A (simple) example of a smart contract is a bet: The smart contract acts as a platform for betting. Users can place bets by paying some money to the contract (i.e. sending some money to the contract via a transaction). As soon as the event that has been bet on has been resolved, the smart contract will determine the winners and losers of the bet and pay the winnings to the winners automatically.
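To make the idea concrete, here is a minimal sketch of such a betting contract. It is not the contract discussed later in this post: the contract name, the two-outcome model and the resolver account that reports the result are all assumptions made for illustration, and the sketch targets a current Solidity compiler (0.8) rather than the older version used in the listings below.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical two-outcome bet; all names are illustrative.
contract SimpleBet {
    address public immutable resolver; // account allowed to report the outcome
    bool public resolved;
    uint8 public winningOutcome;       // 0 or 1, meaningful once resolved

    mapping(uint8 => uint256) public pool;                      // total stake per outcome
    mapping(uint8 => mapping(address => uint256)) public stake; // stake per outcome and bettor

    constructor(address resolver_) {
        resolver = resolver_;
    }

    // Place a bet by sending ether together with the chosen outcome.
    function placeBet(uint8 outcome) external payable {
        require(!resolved, "bet already resolved");
        require(outcome < 2, "unknown outcome");
        require(msg.value > 0, "no stake sent");
        stake[outcome][msg.sender] += msg.value;
        pool[outcome] += msg.value;
    }

    // Record the outcome once the event is over.
    function resolve(uint8 outcome) external {
        require(msg.sender == resolver, "not the resolver");
        require(!resolved, "already resolved");
        require(outcome < 2, "unknown outcome");
        winningOutcome = outcome;
        resolved = true;
    }

    // Winners withdraw a share of the whole pot, proportional to their stake.
    function claim() external {
        require(resolved, "not resolved yet");
        uint256 userStake = stake[winningOutcome][msg.sender];
        require(userStake > 0, "nothing to claim");
        stake[winningOutcome][msg.sender] = 0; // clear before sending to block re-entrancy
        uint256 total = pool[0] + pool[1];
        uint256 payout = (total * userStake) / pool[winningOutcome];
        (bool ok, ) = msg.sender.call{value: payout}("");
        require(ok, "transfer failed");
    }
}
```

The sketch lets winners withdraw their share themselves instead of pushing payments to them, which is a common simplification. It also ignores edge cases such as nobody having bet on the winning outcome. The open question is who calls resolve, and with which data.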

Because of the transparent nature of the contract, a user does not need to trust that the contract behaves as expected. By studying the contract's source code, the user can verify that the contract does indeed pay out the winnings instead of running away with them.

Such a verification works well if all data that is needed to determine the outcome of a bet is available on chain: The smart contract can access such data and process it. Difficulty arises if the contract relies on data that is external to the blockchain to determine the winner. A real-world event does not have any connection to the blockchain by itself and cannot be used within the contract. Even though all conditions and clauses are part of the contract and can be used, the actual data that determines the result of the bet is not. An example is a bet on a football match: The result of the match cannot be read on chain, and thus the contract cannot determine a winner.

To allow for smart contracts that make use of real world data, the data has to be fed into the blockchain via transaction. After that it is stored on the chain and can be used by smart contracts.

In the simplest case, smart contracts would rely on services that are blockchain aware. Such services would feed data into the blockchain and make it available for processing. However, external services aren't usually set up to interact with a blockchain. Mostly they provide data via an API in JSON format, which is useful for automatic processing. Therefore, we need an intermediary that will call a service via its REST API and feed the data into the blockchain for the smart contract at a specific point in time.

Oracles To The Rescue

The role of the intermediary is taken on by so-called oracles. They provide a smart contract on the blockchain that can be called by other smart contracts. An external service, run by the owners of the oracle contract, will monitor the oracle contract and feed data into the blockchain as needed. The oracle contract will then call the contract that ordered the external data. It is the calling contract's responsibility to interpret the response. In that way oracles act just like the ancient oracle of Delphi: A caller can ask a question and will receive an answer. The interpretation of the answer lies with the caller. In particular, a caller does not know (and cannot see) how the answer came to be. It must trust the oracle to provide correct data and act truthfully.
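The request and callback flow described above can be sketched in a few lines. This is a generic illustration and not the code of any real oracle service: the names SimpleOracle, IOracleConsumer, query, fulfill and oracleCallback are made up for this example, and the sketch targets a current Solidity compiler (0.8).

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Interface a consumer contract exposes so that the oracle can deliver answers.
interface IOracleConsumer {
    function oracleCallback(bytes32 queryId, string calldata result) external;
}

// Hypothetical oracle contract; the operator is the oracle's off-chain service.
contract SimpleOracle {
    address public immutable operator;
    uint256 private nonce;
    mapping(bytes32 => address) public requester; // query id => contract that asked

    // The off-chain service watches for this event to learn about new queries.
    event QueryRequested(bytes32 indexed queryId, address indexed consumer, string url);

    constructor() {
        operator = msg.sender;
    }

    // Called by a consumer contract: records the question and returns its id.
    function query(string calldata url) external returns (bytes32 queryId) {
        queryId = keccak256(abi.encodePacked(msg.sender, nonce++));
        requester[queryId] = msg.sender;
        emit QueryRequested(queryId, msg.sender, url);
    }

    // Called by the off-chain service in a transaction that carries the answer.
    function fulfill(bytes32 queryId, string calldata result) external {
        require(msg.sender == operator, "only the oracle operator");
        address consumer = requester[queryId];
        require(consumer != address(0), "unknown or already answered query");
        delete requester[queryId];
        IOracleConsumer(consumer).oracleCallback(queryId, result);
    }
}
```

Note that the consumer only ever sees the result string. Nothing in this flow tells it how the operator obtained the answer, which is exactly the trust problem discussed further down.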

Currently, one of the most used oracle services on the Ethereum blockchain is oraclize.it. The service provides an oracle on Ethereum and (with limits) on other blockchains such as Bitcoin. To use it, a smart contract calls the oracle contract with the REST call that it wants the answer to. The oracle will get the answer via its external service and then call the original caller through a callback function. The callback receives the response to the REST call.

For our football bet, we rely on the data from [http://api.football-data.org/index]. It provides the results of various football leagues in a machine-readable format. The full code for the example contract is available on GitHub. The actual code (slightly shortened here for readability) is pretty straightforward:

contract FootballBet is usingOraclize {

    struct Game {
        string gameId;
        string date;
        string status;
        string homeTeam;
        string awayTeam;
        uint homeTeamGoals;
        uint awayTeamGoals;
        Result result;
        uint receivedGoalMsgs;
    }

    struct Request {
        bool initialized;
        bool processed;
        string key;
    }

    Game game;
    mapping (bytes32 => Request) requests;

    function queryFootballData(string gameId, string key, uint gas) public {
        if (oraclize_getPrice('URL') > this.balance) {
            Info('Oraclize query was NOT sent, please add some ETH to cover for the query fee');
        } else {
            string memory url = generateUrl('https://api.football-data.org/v1/fixtures/', gameId, '?head2head=0', key);
            bytes32 requestId = oraclize_query('URL', url, gas);
            requests[requestId] = Request(true, false, key);
            Info('Oraclize query was sent, standing by for the answer..');
        }
    }

    function __callback(bytes32 myid, string result) public {
        Request memory r = requests[myid];

        if (r.initialized && !r.processed) {
            // new response
            if (r.key.toSlice().equals(JSON_FIXTURE.toSlice())) {
                var (success, tokens, numberTokens) = JsmnSol.parse(result, 45);
                if (success) {
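                    // stop one token early: every matched key reads its value from token k + 1
                    for (uint k = 0; k + 1 < numberTokens; k++) {
                        // get token contents; only string tokens can be keys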
                        if (tokens[k].jsmnType == JsmnSol.JsmnType.STRING) {
                            string memory key = JsmnSol.getBytes(result, tokens[k]);
                            if (key.toSlice().equals(JSON_STATUS.toSlice())) {
                                game.status = JsmnSol.getBytes(result, tokens[++k]);
                            } else if (key.toSlice().equals(JSON_HOME_TEAM.toSlice())) {
                                game.homeTeam = JsmnSol.getBytes(result, tokens[++k]);
                            } else if (key.toSlice().equals(JSON_AWAY_TEAM.toSlice())) {
                                game.awayTeam = JsmnSol.getBytes(result, tokens[++k]);
                            }
                        }
                    }
                }
            }
            Info(gameToString());
            requests[myid].processed = true;
        }
    }    
}

The contract inherits from usingOraclize, which provides the functions to interact with the oracle contract. In particular, the two functions oraclize_query(string data_source, string url, uint gas) and __callback(bytes32 myid, string result) are interesting, as they are the two main interactions with the oracle.

The betting contract can ask for external data by calling oraclize_query. Its three parameters define the request to an external service:

data_source should be the constant 'URL' for calls to REST APIs.
url is the actual URL that should be called on the REST API.
gas is the maximum amount of gas that the callback function (see below) is allowed to use.

In our example the url is json(https://api.football-data.org/v1/fixtures/{gameId}?head2head=0){jsonPath}. It specifies the actual URL with a variable gameId and a variable jsonPath. The JSONPath can be used to filter the JSON response. Normally the smart contract does not need the entire JSON response, but only parts of it. String processing is relatively expensive in Solidity. Thus, it is useful to hand only the minimally required JSON to the contract and process only that bit. The response of the call to football-data.org is this (slightly shortened) JSON:

{
  "fixture": {
    "date": "2016-08-27T13:30:00Z",
    "status": "FINISHED",
    "matchday": 1,
    "homeTeamName": "FC Augsburg",
    "awayTeamName": "VfL Wolfsburg",
    "result": {
      "goalsHomeTeam": 0,
      "goalsAwayTeam": 2
    }
  }
}

The JSONPath $.fixture.result will further filter the JSON to { "goalsHomeTeam": 0, "goalsAwayTeam": 2 }. This part is small enough to be parsed on chain.
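As an illustration, the query string described above could be assembled with a small helper like the following. The contract's own generateUrl function is not shown in this post, so this is a stand-in and not the original implementation. The library name QueryUrl is made up, and the exact path syntax that the oracle expects after the closing parenthesis is an assumption that should be checked against the oracle's documentation. The sketch targets a current Solidity compiler (0.8).

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper; a stand-in for the generateUrl function used in this post.
library QueryUrl {
    // Wraps the fixture URL in json(...) and appends the path of the part to keep.
    function build(string memory gameId, string memory jsonPath)
        internal
        pure
        returns (string memory)
    {
        return string(
            abi.encodePacked(
                "json(https://api.football-data.org/v1/fixtures/",
                gameId,
                "?head2head=0)",
                jsonPath
            )
        );
    }
}
```

With the placeholder game id "12345" and the path ".fixture.result", QueryUrl.build returns json(https://api.football-data.org/v1/fixtures/12345?head2head=0).fixture.result. Doing the filtering on the oracle's side keeps the expensive string handling on chain limited to this one concatenation and the small filtered response.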

oraclize_query returns a unique ID for the call to the oracle. This ID can be used in the callback to identify the original request to the API. To match a callback to a specific query call, each query is saved as a Request object. The requests are stored in a mapping so that the callback can retrieve the Request by its ID. The attribute Request.processed is a boolean value that records whether the request has already been processed in a callback. This mechanism shields the contract against replay attacks.

Furthermore, the contract will only process responses that have originally been requested by the contract itself: Only responses for requests with Request.initialized == true are processed by the callback. This step is needed because Solidity has the (slightly odd) behavior of initializing anything with its default value (0 for uint, false for bool etc.). Together with the (again slightly odd) fact that mappings are initialized for any conceivable key, a smart contract has to guard against callbacks that have not been requested by the contract itself.
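The following stand-alone sketch isolates this bookkeeping. It mirrors the Request struct from the contract above but leaves out the key field and everything related to the oracle; the contract name RequestGuard and the functions register and accept are made up for illustration, and the sketch targets a current Solidity compiler (0.8).

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical stand-alone version of the Request bookkeeping.
contract RequestGuard {
    struct Request {
        bool initialized; // set only when this contract created the request
        bool processed;   // set once a callback has been handled
    }

    mapping(bytes32 => Request) private requests;

    // Remember a request id, as queryFootballData does after oraclize_query.
    function register(bytes32 id) external {
        requests[id] = Request(true, false);
    }

    // Returns true only for the first callback of a known request.
    function accept(bytes32 id) external returns (bool) {
        Request storage r = requests[id]; // unknown ids yield (false, false)
        if (!r.initialized || r.processed) {
            return false; // never requested, or already answered
        }
        r.processed = true;
        return true;
    }
}
```

Calling accept with an id that was never registered does not fail on the lookup. The mapping simply returns a Request with both flags set to false, and only the explicit check on initialized rejects it.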

There Is No Free Lunch

In addition to the actual on-chain oracle contract, oracle services have to run external servers, which trigger the actual transactions that store data on the blockchain. Therefore, they charge a fee for every call. The fee is charged to the contract that calls oraclize_query. The contract must have enough ether in its balance to pay the fees to the oracle. queryFootballData checks that the contract does indeed have enough money to pay the oracle before calling it.

Furthermore, the calling contract has to supply the transaction fees that the oracle has to pay to call the callback function. Depending on the callback's complexity, the default of 200,000 gas may not be enough to pay for the callback. In our example the callback parses the response JSON. String processing is rather expensive in Ethereum. The callback is thus one of the more costly functions in the contract and will not execute on 200,000 gas. The third parameter, gas, can be used to provide a custom amount of gas to the oracle to cover the callback's fees.

The contract has to cover two kinds of fees: The first one is the call to the oracle itself, the second one is the fee for the callback transaction. Both are deducted from the contract's balance.
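As a rough sketch of the arithmetic, the balance a contract needs before sending a query is the sum of both fees. The library and function names below are made up for illustration, and the gas price the oracle applies to the callback is an input here because it depends on the oracle's configuration. The sketch targets a current Solidity compiler (0.8).

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper that adds up the two kinds of fees.
library OracleFees {
    // queryFee:    fee for the oracle call itself, in wei
    // callbackGas: gas limit requested for the callback
    // gasPrice:    gas price applied to the callback, in wei per gas (assumed known)
    function requiredBalance(uint256 queryFee, uint256 callbackGas, uint256 gasPrice)
        internal
        pure
        returns (uint256)
    {
        return queryFee + callbackGas * gasPrice;
    }
}
```

For example, at an assumed gas price of 20 gwei, a callback limit of 200,000 gas alone costs 200,000 × 20 gwei = 0.004 ether, on top of the query fee. Raising the gas limit for an expensive callback raises this part of the bill proportionally.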

Truthful Behavior Can Be Verified

Trust in the blockchain is based on the possibility to verify any transaction that has happened on chain. Oracles provide a means to push external, "real world" data onto the blockchain. Any transaction that relies on such data is, by definition, at most as trustworthy as the original data source. A contract for bets on football matches will only find users if the match results show up truthfully and unmodified on the blockchain.

A first trust anchor is the data source that provides the match's results. In our example this is football-data.org. For this blog post's purpose, we assume the data source to be trustworthy and reliable. If responses from the data source end up unmodified on the blockchain, a user can trust the smart contract as much as he would trust the data source.

However, with an oracle, a third party enters the transaction and has to be included in the trust chain. For our contract that is undesirable, because a user has to trust an additional party. A smart contract should use such a service only if the service can somehow show that it does not alter the data in transit. It should show that the data provided to the blockchain is the same data that it received from the original data source.

Oraclize.it supports notarization of the oracle's response with TLSNotary. If a user requests a TLSNotary proof, oraclize will provide a proof that the response sent to the callback is the same, unaltered response that the oracle has received from the external service. Thus, the trust chain is extended: A user still has to trust the original service. However, by means of the TLSNotary proof the user needs to trust neither the oracle nor the actual smart contract.

In our smart contract we need to set the "Proof-Type" before querying the oracle:

function queryFootballData(string gameId, string key, uint gas) public {
    if (oraclize_getPrice('URL') > this.balance) {
        Info('Oraclize query was NOT sent, please add some ETH to cover for the query fee');
    } else {
        string memory url = generateUrl('https://api.football-data.org/v1/fixtures/', gameId, '?head2head=0', key);
        oraclize_setProof(proofType_TLSNotary | proofStorage_IPFS);
        bytes32 requestId = oraclize_query('URL', url, gas);
        requests[requestId] = Request(true, false, key);
        Info('Oraclize query was sent, standing by for the answer..');
    }
}

The call to oraclize_setProof(proofType_TLSNotary | proofStorage_IPFS) asks the oracle to notarize the external server call and store the proof on the Interplanetary Filesystem (IPFS). TLSNotary allows for notarization of connections that are secured by TLS 1.0 or TLS 1.1. For notarization we want the connection to be both secured and verifiable by a (named) third party. This goes beyond what TLS itself offers: TLS secures a connection but provides no way to audit the connection later on.

If a user asks for a proof, the oracle will call a callback with the signature function __callback(bytes32 myid, string result, bytes proof). This callback has an additional third parameter, the proof. The parameter holds the IPFS multihash that can be used to retrieve the actual proof from IPFS. Since the proof itself is too large to be stored on the blockchain, the oracle will store it on IPFS and only put its multihash on the chain.
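A callback with this three-parameter signature might look like the following sketch. It is a stand-alone illustration that does not inherit from usingOraclize: the contract name, the storage layout, the event and the fixed oracle address that is allowed to call the callback are assumptions made for this example. The sketch targets a current Solidity compiler (0.8), so the parameters carry explicit memory locations.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical receiver for answers that come with a proof.
contract ProofAwareCallback {
    address public immutable oracle; // assumed: the only address allowed to deliver answers

    mapping(bytes32 => string) public results; // answer per request id
    mapping(bytes32 => bytes) public proofs;   // IPFS multihash per request id

    event ProofStored(bytes32 indexed myid, bytes proof);

    constructor(address oracle_) {
        oracle = oracle_;
    }

    // Same shape as the three-parameter callback described above.
    function __callback(bytes32 myid, string memory result, bytes memory proof) public {
        require(msg.sender == oracle, "caller is not the oracle");
        results[myid] = result;
        proofs[myid] = proof; // only the multihash; the proof itself lives on IPFS
        emit ProofStored(myid, proof);
    }
}
```

The contract does not verify the proof. It only keeps the multihash so that anyone can later fetch the proof from IPFS and check it off chain.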

For our oracle use case, oraclize.it acts as "Auditee", i.e. as the party that wants to prove that it has received the server's response "as is" and has not modified it in transit. The counterpart, the "Auditor" is the party that wants to know that the response has been received by the auditee and – in the case of the oracle – that it has been relayed to the smart contract unmodified.

The TLSNotary mechanism relies on parts of the communication during the TLS handshake being withheld until a later point in the connection. Because a smart contract by definition cannot hold secrets, it cannot act as auditor for the TLSNotary proof. Since TLSNotary is an intricately clever mechanism, this result is unfortunate: We cannot use the mechanism directly to let the oracle prove that it acts truthfully.

To work around this limitation, oraclize.it chose to modify the process a bit: Instead of the smart contract acting as an auditor, another (fourth) party is involved. The task of auditing the communication between oraclize and the server is delegated to a specially secured VM instance hosted at Amazon. For proof verification, a user can check that the communication has been audited by this AWS VM. If someone managed to hack this VM, the attacker would gain access to the secret. If the attacker managed to hack oraclize's servers too, they would be able to fake proofs. The TLSNotary proof can thus be faked in theory, but the need to gain access to two different servers (the AWS VM and the oraclize servers) makes it improbable in practice.

Different Keys To Truth

Oraclize.it is not the only oracle service for Ethereum. Reality Keys uses a conceptually different approach. It works using these steps:

1. A user registers an event with Reality Keys' API. In our case the event is a football match at a certain time and place.
2. Reality Keys' service creates two so-called «reality keys» (RK): one for a positive outcome of the event (e.g. BVB wins) and another for the negative outcome (e.g. the other team wins or the match ends in a draw).
3. The service publishes both public keys for the RKs. Additionally, it publishes a so-called fact on the Ethereum blockchain, which will contain the result once it has been determined.
4. After the event has resolved (i.e. the match has ended and the result has been determined conclusively), the service will publish the private key of the winning RK. The private key of the losing RK will be destroyed. Furthermore, the fact on the Ethereum blockchain will be amended with the final result.
5. A user can rely on the fact and the published result on the Ethereum blockchain by verifying that the fact has been signed by Reality Keys with its private key.
6. The result determination by the service is subject to a defined appeal period: If a user (contract or human) disagrees with the result, they can appeal to Reality Keys and request a manual result determination by a human. This request has a fee that has to be paid by the user. A result determined by human resolution is binding in the context of the service.
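The signature check in step 5 can in principle be done on chain. The following sketch shows the general technique with Solidity's built-in ecrecover. It is not Reality Keys' actual contract or data format: how the service encodes and hashes a fact is not covered in this post, so the 32-byte factHash, the contract name and the stored signer address are assumptions. The sketch targets a current Solidity compiler (0.8).

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical check that a fact hash was signed by a known key.
contract SignedFactCheck {
    address public immutable signer; // assumed: address derived from the oracle's signing key

    constructor(address signer_) {
        signer = signer_;
    }

    // factHash is the 32-byte hash that was signed; (v, r, s) is the ECDSA signature.
    function isSignedBySigner(bytes32 factHash, uint8 v, bytes32 r, bytes32 s)
        public
        view
        returns (bool)
    {
        address recovered = ecrecover(factHash, v, r, s);
        // ecrecover returns the zero address for an invalid signature
        return recovered != address(0) && recovered == signer;
    }
}
```

A betting contract could call such a check before it accepts a reported result. The check only shows who signed the fact, not that the signed result is true.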

The trust chain when using Reality Keys as oracle is similar to the one with oraclize.it: The user has to trust the server to create a truthful response and has to trust Reality Keys to publish the correct result on chain. As a means of appeal, a user can request that an event resolution be verified by a human. However, Reality Keys has no built-in mechanism to verify that a response from a server has been correctly relayed to the blockchain by the service. Thus, the service builds on reputation.

Conclusion

Smart contracts allow for complex computation to happen on the blockchain. Often they rely on data that originates in the real world, i.e. outside of the blockchain. Oracle services have emerged that push data from the real world into the blockchain via transactions. Oraclize.it is the most commonly used service; smart contracts can integrate its services relatively easily.

Using an oracle breaks the original trust chain: Data in the blockchain is transparent and verifiable, external data is not. Thus, a user has to trust the original server that provides the data. Oraclize.it provides a means to prove that the service relays the data as is and does not tamper with it. For technical reasons, this TLSNotary proof is produced with the help of a specially secured AWS VM. As long as the AWS instance can be trusted, the service cannot fake proofs.

Other oracles use a different concept. Reality Keys doesn't integrate as smoothly and does not provide proofs for its server communications. However, for simple binary decisions in particular it might be a better choice, as much of the logic to determine the positive or negative outcome can be delegated to the oracle. Thus, a smart contract does not have to compute the outcome from the data itself but can take the outcome directly from the oracle. The contract could be cheaper to use.

Oracles provide a means to process external events on the blockchain as long as real world services are not blockchain aware by themselves. If they were, trust would have to be placed in the services only. The need for rather complex proofs (such as TLSNotary) would not arise. Until then – probably for quite some time – oracles bridge the gap between real world and blockchain.