Predictions submitted by users steer Numerai's hedge fund together.
Numeraire: A Cryptographic Token for Coordinating Machine
Intelligence and Preventing Overfitting
Richard Craib, Geoffrey Bradway, Xander Dunn
with Joey Krug
https://numer.ai
February 20, 2017
Abstract
Machine learning competitions are susceptible to intentional overfitting. Numerai
proposes Numeraire, a new cryptographic token that can be used in a novel auction
mechanism to make overfitting economically irrational. The auction mechanism
leads to equilibrium bidding behavior that reveals rational data scientists’ confidence
in their models’ ability to perform well on new data. The auction mechanism
also yields natural arguments for the economic value of a Numeraire token.
1 Motivation
A common approach to verify accuracy in machine learning is to break the dataset into train and test sets.
A trained model can be tested for accuracy on the test set, which it has never seen. However, to maintain
statistical validity, this test set should only be used once. When a data scientist accesses the test set multiple
times and uses that score as feedback for model selection, there’s a risk of training a model that overfits the
test set. This hurts the model’s ability to perform well on new data.
Figure 1: An overfitting curve where the test error continues to decrease with more submissions from data scientists, but the error on new data increases. [2]
This overfitting problem is called adaptive data analysis [3]. Models resulting from adaptive data analysis
range from slightly degraded to completely useless [4]. For Numerai, adaptive data analysis occurs when
data scientists’ models have overfit historical data, at the cost of live performance. In a machine learning
competition there is incentive to overfit to the historical data because performance on that data dictates
winnings. Overfitting becomes intentional. What Numerai really needs is not a collection of great backtests
that work well on historical data, but a collection of great models that work well on new data.
Currently, the state of the art solution to holdout reuse is to limit the amount of information exposed
when using the holdout set [1]. While sufficient for scientific discovery, this solution heavily degrades user
experience and rankings in machine learning tournaments.
We propose a new system for data scientists to communicate their beliefs about the quality of their models.
Data scientists will compete in the new tournament by staking a new crypto-token, Numeraire (NMR), on
their predictions. The auction mechanism for resolving these stakes will reward correct predictions of a
model’s ability to perform well on new data. With Numeraire, data scientists will now be able to express
their confidence in their models’ live performance. Their expressions of confidence help us to emphasize the
right models and improve the performance of our hedge fund.
2 Cryptographic Tokens
Numeraire is an ERC20 Ethereum token [6]. Ethereum tokens are represented as smart contracts that are
executed on the Ethereum blockchain. The source code to Numeraire’s smart contract is publicly available [footnote 1].
All minted Numeraire are sent to Numerai. The Ethereum smart contract dictates there will never be
more than 21 million Numeraire minted. Numerai will send 1 million Numeraire to data scientists based on
their historical ranking on Numerai’s leaderboard. After the initial distribution, the smart contract will mint
a fixed number of Numeraire each week until the maximum is reached. By performing well in Numerai’s
machine learning competition, data scientists will earn Numeraire on an ongoing basis.
When data scientists are confident of the predictions they have made, they send Numeraire to the
Numeraire Ethereum smart contract. The receiving contract will hold the data scientists’ Numeraire for some
holding period t, with t sufficiently large to judge performance on new data. After t has passed, Numerai will
send a message to the contract with information on which data scientists’ predictions performed well on new
data. Those data scientists whose predictions performed well earn dollars based on the auction mechanism,
and their Numeraire are returned. Those data scientists whose predictions did not perform well on new
data risk having their Numeraire destroyed. The irreversible destruction of these Numeraire will be publicly
verifiable on the Ethereum blockchain.
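The stake lifecycle described above (lock for a holding period t, then return or destroy) can be pictured with a toy in-memory model. This is only a sketch in Python, not the Solidity contract: the class `StakeEscrow`, its method names, and the integer clock `now` are all hypothetical, and the decision to destroy a stake is passed in as a flag rather than computed here.

```python
from dataclasses import dataclass, field


@dataclass
class StakeEscrow:
    """Toy stand-in for the staking contract (hypothetical, not the real code)."""
    holding_period: int                          # t, in arbitrary time units
    stakes: dict = field(default_factory=dict)   # staker -> (amount, time staked)
    destroyed: float = 0.0                       # running total of destroyed NMR

    def stake(self, staker: str, amount: float, now: int) -> None:
        """Lock `amount` Numeraire for `staker` starting at time `now`."""
        if amount <= 0:
            raise ValueError("stake must be positive")
        if staker in self.stakes:
            raise ValueError("staker already has an open stake")
        self.stakes[staker] = (amount, now)

    def resolve(self, staker: str, destroy: bool, now: int) -> float:
        """After t has passed, return the stake or destroy it.

        Returns the amount of Numeraire handed back to the staker.
        """
        amount, staked_at = self.stakes[staker]
        if now - staked_at < self.holding_period:
            raise RuntimeError("holding period t has not passed")
        del self.stakes[staker]
        if destroy:
            self.destroyed += amount   # irreversible in the real system
            return 0.0
        return amount
```

In this sketch nobody, including the operator, can resolve a stake before t has elapsed, which mirrors the lock described in the text.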
Footnote 1: https://github.com/numerai/contract
3 Auction
3.1 Overview
Every tournament has a staking prize pool, which is some fixed number of dollars. The auction mechanism
allocates the prize pool among data scientists. Data scientists can submit bids to the auction. Bids are
tuples (c, s) where c is confidence defined as the number of Numeraire the data scientist is willing to stake to
win 1 dollar, and s is the amount of Numeraire being staked. For some time t, s is locked in the Ethereum
contract, inaccessible to anyone, including Numerai. After t has passed, a variant on the multiunit Dutch
auction is used to determine the payouts.
3.2 Auction Mechanism
The auction mechanism is a multiunit Dutch auction with some additional rules. Performance is evaluated
after time t. The performance evaluation metric is logloss [footnote 2], a suitable metric for binary classification problems
like Numerai’s machine learning competition. A model is considered to have performed well if
logloss < −ln(0.5), and badly if logloss ≥ −ln(0.5). The data scientists are ranked in descending order of confidence
c. In that order, until the prize pool is depleted, data scientists are awarded s/c dollars
if their models performed well, or lose their stake s if their models performed badly. Once the prize pool is depleted,
data scientists no longer earn dollars or lose their stakes.
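The threshold −ln(0.5) = ln 2 ≈ 0.693 is the logloss of a model that always predicts probability 0.5, so "performing well" means beating an uninformed guess. A minimal sketch of the check, assuming the standard mean binary logloss with labels in {0, 1}; the function names and the clipping constant `eps` are illustrative choices, not taken from Numerai's code:

```python
import math

THRESHOLD = -math.log(0.5)  # ln 2, roughly 0.693


def logloss(y_true, y_pred, eps=1e-15):
    """Mean binary logloss; predictions are clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def performed_well(y_true, y_pred):
    """True when the model beats the constant-0.5 baseline."""
    return logloss(y_true, y_pred) < THRESHOLD


labels = [1, 0, 1, 1]
print(performed_well(labels, [0.7, 0.2, 0.8, 0.6]))  # informative predictions
print(performed_well(labels, [0.1, 0.9, 0.2, 0.3]))  # confidently wrong predictions
```

Because the comparison is strict, a model whose logloss is exactly at the baseline counts as having performed badly.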
3.3 Example
Assume a prize pool of 3000 dollars, and that time t has elapsed. Assume the staking auction ended as
follows:
Confidence c   Stake s   s/c    Logloss < −ln(0.5)   Data Scientist
5              10000     2000   NO                   WSW
4              2000      500    YES                  XIRAX
1.5            3000      2000   YES                  PHIL CULLITON
1              5000      5000   NO                   DAENRIS
0.5            300       600    YES                  ABRIOSI
WSW didn’t achieve logloss < −ln(0.5), so his 10,000 Numeraire are destroyed. XIRAX receives $500 and
his Numeraire are returned, leaving $2,500 in the pool. PHIL CULLITON receives $2,000 and his Numeraire are returned, leaving $500. DAENRIS’
Numeraire are destroyed. ABRIOSI receives $500, $100 less than his bid, because the prize pool is exhausted.
Everyone below ABRIOSI will have their Numeraire returned and receive zero dollars.
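The payout rules can be replayed in a few lines of Python. This is a sketch of the mechanism as described in Section 3.2, not Numerai's implementation: the `Bid` class and `resolve_auction` function are hypothetical names, and ties in confidence are broken by input order, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    name: str
    confidence: float     # c: Numeraire staked per dollar sought
    stake: float          # s: Numeraire staked
    performed_well: bool  # logloss < -ln(0.5) after time t


def resolve_auction(bids, prize_pool):
    """Return (name, dollars won, fate of stake) in descending order of c."""
    remaining = prize_pool
    results = []
    for bid in sorted(bids, key=lambda b: b.confidence, reverse=True):
        if remaining <= 0:
            # Pool depleted: no dollars are earned and no stakes are lost.
            results.append((bid.name, 0.0, "returned"))
        elif bid.performed_well:
            payout = min(bid.stake / bid.confidence, remaining)
            remaining -= payout
            results.append((bid.name, payout, "returned"))
        else:
            results.append((bid.name, 0.0, "destroyed"))
    return results


bids = [
    Bid("WSW", 5, 10000, False),
    Bid("XIRAX", 4, 2000, True),
    Bid("PHIL CULLITON", 1.5, 3000, True),
    Bid("DAENRIS", 1, 5000, False),
    Bid("ABRIOSI", 0.5, 300, True),
]
for name, dollars, fate in resolve_auction(bids, prize_pool=3000):
    print(f"{name}: ${dollars:.0f}, stake {fate}")
```

Run on the table above, this reproduces the outcome in the text: XIRAX wins $500, PHIL CULLITON $2000, ABRIOSI the remaining $500, and the stakes of WSW and DAENRIS are destroyed.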
Footnote 2: https://www.kaggle.com/wiki/LogarithmicLoss
4 Analysis of Staking
Let p be the probability that the model achieves logloss < −ln(0.5) on new, unseen data. A low p would
imply a high probability that a model is overfit. Let s be a data scientist’s total Numeraire staked. Let e be
the exchange rate of Numeraire per dollar. c is the confidence. A data scientist will stake Numeraire if the
expected value of staking Numeraire is non-negative. If a data scientist stakes s and achieves logloss ≥ −ln(0.5),
the data scientist loses s/e dollars. If a data scientist stakes s and achieves logloss < −ln(0.5), the data
scientist wins s/c dollars. Therefore, the expected value in dollars of staking s with confidence c is
\[ E(c, s) = p \frac{s}{c} - (1 - p) \frac{s}{e} \]
A data scientist will stake if
\[ E(c, s) \geq 0 \]
\[ p \frac{s}{c} - (1 - p) \frac{s}{e} \geq 0 \]
Dividing by s and multiplying by ce (all positive) gives pe ≥ (1 − p)c, which implies
\[ p \geq \frac{c}{c + e} \]
This results in self-revelation: Data scientists are moved to reveal their true inner values. Solely in the
interest of maximizing winnings, data scientists reveal their knowledge of their models’ abilities to generalize
to new, unseen data. As we let these tournaments repeat, we expect to see bidding behaviors that accurately
reflect p, since overbidding and underbidding are both nonoptimal behaviors and the accuracy of estimating
p increases with time.
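The staking condition is easy to check numerically. The sketch below evaluates E(c, s) and the break-even probability c/(c + e) from Section 4; the function names are illustrative, and `max_rational_confidence` is simply the same inequality rearranged for c (c ≤ pe/(1 − p)), a step the paper does not state explicitly.

```python
import math


def expected_value(p, c, s, e):
    """Expected dollars from staking s NMR at confidence c.

    p: probability of logloss < -ln(0.5) on new data
    e: exchange rate in NMR per dollar
    """
    return p * s / c - (1 - p) * s / e


def break_even_probability(c, e):
    """Smallest p at which staking has non-negative expected value."""
    return c / (c + e)


def max_rational_confidence(p, e):
    """Largest c with E(c, s) >= 0, from rearranging p >= c / (c + e)."""
    if p >= 1:
        return math.inf
    return p * e / (1 - p)


# Assumed numbers for illustration: e = 2 NMR per dollar, s = 100 NMR, c = 2.
print(break_even_probability(c=2, e=2))          # 0.5
print(expected_value(p=0.75, c=2, s=100, e=2))   # 25.0
print(expected_value(p=0.25, c=2, s=100, e=2))   # -25.0
print(max_rational_confidence(p=0.75, e=2))      # 6.0
```

With these assumed numbers, a data scientist who believes p = 0.75 gains in expectation at c = 2 and could rationally bid up to c = 6, while one with p = 0.25 loses in expectation at c = 2 and should not stake at that confidence.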
Since having a higher confidence produces greater incentive to participate in an auction, we can make
the following observations:
• The higher p, the higher c a data scientist will submit, and the more dollars the data
scientist can win from the auction.
• For a fixed p, a confidence that is too high produces E(c, s) < 0, which will deter this
strategy.
• Models that perform well on historical data but fail to generalize (low p) will either
have logloss < −ln(0.5) or have E(c, s) < 0.
• Because Numeraire can be used by data scientists to earn dollars, the exchange rate
e > 0.
• Numeraire is worth more to data scientists with large p because they can use it to earn
dollars with higher confidence.
• A data scientist with p = 1 has an expected value in dollars E(c, s) = s/c. To this data
scientist, the value of all Numeraire is the net present value of all future stake payouts
by Numerai.
The purpose of this auction is to get accurate probability estimates, not to maximize Numeraire staked.
The auction need not be revenue maximizing, but self-revelation is important. While a weakly dominant
strategy in second-price auctions is to bid truthfully, second-price auctions are more susceptible to collusion,
and first-price auctions are more robust to it [5]. For this reason, and for simplicity, we use a Dutch
auction (first-price) rather than an Ausubel auction.
References
[1] Dwork, Feldman, Hardt, Pitassi, Reingold, Roth. Generalization in Adaptive Data Analysis and Holdout
Reuse. http://papers.nips.cc/paper/5993-generalization-in-adaptive-data-analysis-and-holdout-reuse.pdf.
[2] Gringer. Distributed under a CC BY 3.0 License. https://creativecommons.org/licenses/by/3.0/deed.en.
[3] Hardt. Adaptive data analysis. http://blog.mrtz.org/2015/12/14/adaptive-data-analysis.html.
[4] Hardt. Competing in a data science contest without reading the data.
http://blog.mrtz.org/2015/03/09/competition.html.
[5] Krishna. Auction Theory. Elsevier, Massachusetts, 2010.
[6] Wood. Ethereum: A Secure Decentralized Generalised Transaction Ledger.
http://gavwood.com/paper.pdf.
A weekly prize pool is paid out to the top performing users who stake.
Interested? Here is how to get started.
7,500 FACELESS CODERS PAID IN BITCOIN BUILT A HEDGE FUND'S BRAIN
RICHARD CRAIB IS a 29-year-old South African who runs a hedge fund in San Francisco. Or rather, he doesn't run it. He leaves that to an artificially intelligent system built by several thousand data scientists whose names he doesn't know.
Under the banner of a startup called Numerai, Craib and his team have built technology that masks the fund's trading data before sharing it with a vast community of anonymous data scientists. Using a method similar to homomorphic encryption, this tech works to ensure that the scientists can't see the details of the company's proprietary trades, but also organizes the data so that these scientists can build machine learning models that analyze it and, in theory, learn better ways of trading financial securities.
"We give away all our data," says Craib, who studied mathematics at Cornell University in New York before going to work for an asset management firm in South Africa. "But we convert it into this abstract form where people can build machine learning models for the data without really knowing what they're doing."
He doesn't know these data scientists because he recruits them online and pays them for their trouble in a digital currency that can preserve anonymity. "Anyone can submit predictions back to us," he says. "If they work, we pay them in bitcoin."
So, to sum up: They aren't privy to his data. He isn't privy to them. And because they work from encrypted data, they can't use their machine learning models on other data—and neither can he. But Craib believes the blind can lead the blind to a better hedge fund.
Numerai's fund has been trading stocks for a year. Though he declines to say just how successful it has been, due to government regulations around the release of such information, he does say it's making money. And an increasingly large number of big-name investors have pumped money into the company, including the founder of Renaissance Technologies, an enormously successful "quant" hedge fund driven by data analysis. Craib and company have just completed their first round of venture funding, led by the New York venture capital firm Union Square Ventures. Union Square has invested $3 million in the round, with an additional $3 million coming from others.
Hedge funds have been exploring the use of machine learning algorithms for a while now, including established Wall Street names like Renaissance and Bridgewater Associates as well as tech startups like Sentient Technologies and Aidyia. But Craib's venture represents new efforts to crowdsource the creation of these algorithms. Others are working on similar projects, including Two Sigma, a second data-centric New York hedge fund. But Numerai is attempting something far more extreme.
The company comes across as some sort of Silicon Valley gag: a tiny startup that seeks to reinvent the financial industry through artificial intelligence, encryption, crowdsourcing, and bitcoin. All that's missing is the virtual reality. And to be sure, it's still very early for Numerai. Even one of its investors, Union Square partner Andy Weissman, calls it an "experiment."
But others are working on similar technology that can help build machine learning models more generally from encrypted data, including researchers at Microsoft. This can help companies like Microsoft better protect all the personal information they gather from customers. Oren Etzioni, the CEO of the Allen Institute for AI, says the approach could be particularly useful for Apple, which is pushing into machine learning while taking a hardline stance on data privacy. But such tech can also lead to the kind of AI crowdsourcing that Craib espouses.
On the Edge
Craib dreamed up the idea while working for that financial firm in South Africa. He declines to name the firm, but says it runs an asset management fund spanning $15 billion in assets. He helped build machine learning algorithms that could help run this fund, but these weren't all that complex. At one point, he wanted to share the company's data with a friend who was doing more advanced machine learning work with neural networks, and the company forbade him. But its stance gave him an idea. "That's when I started looking into these new ways of encrypting data—looking for a way of sharing the data with him without him being able to steal it and start his own hedge fund," he says.
The result was Numerai. Craib put a million dollars of his own money in the fund, and in April, the company announced $1.5 million in funding from a group that included Howard Morgan, one of the founders of Renaissance Technologies. Morgan has invested again in the Series A round alongside Union Square and First Round Capital.
https://numer.ai/homepage
In February, Numerai announced Numeraire, a cryptographic token to incentivize data scientists around the world to contribute artificial intelligence to our hedge fund (see Forbes, Wired, Smith+Crown). Earlier today, the Numeraire smart contract was deployed to Ethereum, and over 1.2 million tokens were sent to 19,000 data scientists around the world.
A Protocol For AI
Numerai is building the protocol to connect machine intelligence to the stock market, and we want you to build on top of it.
Numerai has made over $200,000 in payments to our users, all of them in bitcoin. The problem with bitcoin is that it exists on a different blockchain from the Numeraire token. This drastically limits the extent to which decentralized applications based on Numerai can be automated and unstoppable: these applications cannot receive payment in bitcoin; they can only receive and use ether.
If Numerai made payments in ether, then a decentralized application on Ethereum could automatically use that ether to fund its operations (for example, its gas costs). Bitcoin payments make sense for people, not for decentralized autonomous organizations (DAOs). We want to move more of Numerai onto Ethereum to accommodate DAOs. Making payments in ether will have large cascading effects for the kinds of applications that can interface with Numerai.
Today we are announcing that we are abandoning bitcoin. It will be phased out of Numerai by September 30th. From that point on, all payments will be made in ether and Numeraire.
Numeraire Live On Ethereum
Starting today, data scientists can withdraw Numeraire tokens to any Ethereum address, and interact with the smart contract. Data scientists can also use Numeraire to earn more money by staking it on their predictions. If their predictions perform well, they earn more money. If their predictions perform badly, their Numeraire is destroyed on the blockchain.
The staking mechanism creates a powerful new incentive to build the best machine learning model on Numerai. For thousands of people, staking Numeraire will be the first time in their lives they have interacted with an Ethereum smart contract. And they can do it all from Numerai’s website without needing to manage keys or use an Ethereum client. This is not speculative; you can stake Numeraire right now, and the Ethereum transaction will influence the course of Numerai’s hedge fund.
INVESTORS & ADVISORS
Howard Morgan
Co-Founder of Renaissance Technologies and First Round Capital
Fred Ehrsam
Co-Founder of Coinbase
Juan Benet
Founder of Protocol Labs (IPFS and Filecoin)
Ash Fontana
Board Member at Kaggle
Joey Krug
Thiel Fellow and co-founder of Augur
Peter Diamandis
Founder of Singularity University and the IBM Watson AI XPRIZE
Olaf Carlson-Wee
Founder of Polychain Capital
Union Square Ventures
Founded by Fred Wilson
Numeraire: "... a new cryptographic token that can be used in a novel auction mechanism to make overfitting economically irrational."
https://numer.ai/static/media/whitepaper.29bf5a91.pdf
Read the whitepaper