AI (artificial intelligence) and neural networks are all over the news these days. Zuck and Elon are fighting over whether machines are going to kill us, Google's AI is beating world champions in the game of Go, and groundbreaking scientific work is now being done in collaboration with AI. In this series of blog posts we'll show you how to use AI to trade cryptocurrencies for fun and profit.
But first, let's take a step back and talk about what AI is. What's called AI today is a collection of statistical algorithms that can come up with good solutions to problems posed to them without requiring a human to instruct the machine on what exactly it should do. AI algorithms can be broken down into three categories:
- Supervised learning. Supervised learning means you have a certain number of features - think daily trade volume, current price, etc., and a label for every data point - think tomorrow's price. Provided you have good features that carry information about tomorrow's price and enough labeled data, you can use supervised learning to build a model that can predict price movement.
- Unsupervised learning. Unsupervised learning is useful when you don't really know what you want to get out of your data, but you are sure that there are patterns in it that you want to investigate further. These techniques can be combined with supervised learning, for instance to create new features. For example, you can represent bitcoin addresses as a graph and apply unsupervised learning techniques to cluster addresses into meaningful groups, like "addresses that belong to the same user", or "merchant/shopper", or "trader/holder". These techniques are also used to detect anomalies, such as money laundering.
- Reinforcement learning. Reinforcement learning (RL) studies problems and techniques in which a model improves by feeding the results of its own actions back into itself. To do this, an RL system needs to be able to “sense” signals, automatically decide on an action, and then compare the outcome against a “reward” definition. Reinforcement learning tries to figure out WHAT to do to maximize these rewards, and it does this by itself, with no direct instructions.
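To make the supervised learning setup from the list above concrete, here is a minimal sketch of how features and labels could be paired up. The helper name `make_supervised_pairs` and the price and volume numbers are made up for illustration; they are not real market data and not part of our actual pipeline.

```python
def make_supervised_pairs(prices, volumes):
    """Pair each day's features with the NEXT day's price as the label."""
    pairs = []
    # The last day has no "tomorrow", so it cannot be labeled.
    for t in range(len(prices) - 1):
        features = (prices[t], volumes[t])  # what we know today
        label = prices[t + 1]               # what we want to predict
        pairs.append((features, label))
    return pairs


# Hypothetical daily data, invented for this example.
prices = [100.0, 102.0, 101.0, 105.0]
volumes = [10.0, 12.0, 9.0, 15.0]

pairs = make_supervised_pairs(prices, volumes)
for features, label in pairs:
    print(features, "->", label)
```

A supervised model would then be fitted to map each feature tuple to its label.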
Advanced AI systems often combine all three approaches, and some have achieved superhuman results in specific tasks such as Go. Today we're going to talk about the most exciting of the three, reinforcement learning.
Our task is to train an AI on bitcoin price data to buy, sell or hold in a way that maximizes returns. We'll be using Google's awesome TensorFlow library to do the mathematical heavy lifting, and we'll apply a technique called deep Q-learning, pioneered by DeepMind. This model operates on the following concepts:
- State S, this is a representation of the current world as the algorithm sees it
- State S’, a new state one time step later than S
- Action A, one of the possible actions that can be taken in state S
- Q, a function that approximates the expected reward for taking action A in state S. Can be written as Q(s,a). In our case Q is a neural network
- Reward R, the actual reward received on reaching state S’ after taking action A
Our goal is to learn the Q function from historical data. The self-learning comes from looping through many different states and actions many times, updating the Q function a little bit each time. With each loop the Q function knows a little more about the world around it and should be able to approximate the real reward a little better for each possible action.
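Here is a minimal sketch of what "updating Q a little bit" can look like. It uses the textbook Q-learning update with a plain lookup table standing in for the neural network, so it is an illustration of the idea rather than our actual model. The state name `"rising"`, the learning rate `alpha` and the discount factor `gamma` are assumptions chosen for the example.

```python
from collections import defaultdict

ACTIONS = ("buy", "sell", "hold")


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One Q-learning step: move Q(s, a) toward reward + gamma * max Q(s', a')."""
    # Best value we currently believe is reachable from the next state.
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    # Move the old estimate a fraction alpha of the way toward the target.
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# The table starts at 0.0 for every (state, action) pair.
q = defaultdict(float)

# Repeating the same rewarded experience raises Q("rising", "buy") step by step.
for _ in range(3):
    print(q_update(q, state="rising", action="buy", reward=1.0, next_state="rising"))
```

In deep Q-learning the table is replaced by a network that is trained toward the same kind of target.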
Before we proceed, let's validate that our AI works on trivial examples. Let's start with learning how to trade a currency whose price chart looks like a straight line. Let's see what happens if we only let the model train for one iteration.
Clearly, our AI has no idea about what's going on. Let's let it train for 100 iterations instead.
Great! Our AI has learned what a straight line looks like. Once the model has seen the first two price points, it makes only long trades.
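If you want to try a sanity check like this yourself, here is a small self-contained sketch. It is not our TensorFlow model: it uses a lookup table instead of a neural network, describes the state only as the direction of the last price move, and rewards a position by the next price change. All of those choices, along with the parameter values, are assumptions made to keep the example short.

```python
import random

ACTIONS = (-1, 0, 1)  # short, flat, long


def train(prices, iterations, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning over a price series; returns the learned Q table."""
    rng = random.Random(seed)
    q = {}

    def value(state, action):
        return q.get((state, action), 0.0)

    def state_at(t):
        # Assumed state: direction of the most recent price move.
        change = prices[t] - prices[t - 1]
        return "up" if change > 0 else "down" if change < 0 else "flat"

    for _ in range(iterations):
        for t in range(1, len(prices) - 1):
            s = state_at(t)
            # Epsilon-greedy: mostly exploit, sometimes explore.
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: value(s, x))
            # Assumed reward: position times the next price change.
            reward = a * (prices[t + 1] - prices[t])
            s_next = state_at(t + 1)
            best_next = max(value(s_next, x) for x in ACTIONS)
            # Simplification: the series is treated as continuing, so we always bootstrap.
            q[(s, a)] = value(s, a) + alpha * (reward + gamma * best_next - value(s, a))
    return q


def greedy_action(q, state):
    """Action with the highest learned value in the given state."""
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


line = [100.0 + t for t in range(21)]  # straight rising line
q = train(line, iterations=100)
print("Preferred action after an up move:", greedy_action(q, "up"))
```

On a steadily rising line, going long is the only action that earns a positive reward, so that is the action the table should end up preferring.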
Let's turn it up a notch and see if our AI can learn how to behave in more complex scenarios. We'll replace the straight line with a sine curve and see what strategy our AI proposes.
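A synthetic sine-shaped price series like this is easy to generate. The sketch below shows one way to do it; the function name `sine_prices` and the base price, amplitude and period values are illustrative assumptions, not the exact settings we used.

```python
import math


def sine_prices(n_points=200, base=100.0, amplitude=10.0, period=50):
    """Price series that oscillates around `base` as a sine wave."""
    return [
        base + amplitude * math.sin(2 * math.pi * t / period)
        for t in range(n_points)
    ]


def price_changes(prices):
    """Step-to-step price differences, i.e. what a long position would earn."""
    return [nxt - cur for cur, nxt in zip(prices, prices[1:])]


prices = sine_prices()
changes = price_changes(prices)
print("lowest price:", min(prices))
print("highest price:", max(prices))
print("rising steps:", sum(1 for c in changes if c > 0))
print("falling steps:", sum(1 for c in changes if c < 0))
```

Unlike the straight line, this series has both rising and falling stretches, so a strategy that is always long no longer works.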
Again, first let's see how an undertrained model performs, and then see how performance improves if we let the model train to its full potential.
Training for 10 iterations:
Again, no clue. 1000 iterations:
Fantastic! We've trained our AI to understand what a sine wave looks like and how to trade in a more complex scenario. It trades with confidence during strong ups and downs, and cautiously tries to anticipate sharp changes when it senses an upcoming trend change.
Now that I've convinced you that AI can learn to trade on theoretical examples, let's try it out on real data.
In the second part, we will take BTC/USD price, augment it with some technical indicators, feed that into our model, and see what strategy our AI would pick.
If you enjoyed reading, please subscribe to our blog so that you don't miss our announcements and new posts, where we will explore other machine learning techniques and how they can be applied to the world of cryptocurrency. Also, make sure you read through our whitepaper, and feel free to ask any questions you might have or suggest topics for a new blog post in the comments section below.
I'm very excited to share the second part soon, but I'm still waiting for the results of training the neural network, which takes considerable time even on a Tesla K80 GPU. Stay tuned for the update!
Nice blog. I haven't done anything with Deep Q Learning, so I'm looking forward to seeing what you guys can do with it :)
Great stuff. Will be keen to see the result.
It's coming up shortly. I'm running a couple more experiments to see how different history impacts the AI's behavior. It is very exciting to launch AI training and see what it comes up with. Kinda feels like being a parent
I think you know, I am your follower :) That's because I like your posts :) @ronaldmcatee
Doesn't look like you are to me https://steemit.com/@rados/followers :)
Unless you mean you are our spiritual follower, in which case we won't let you down and will keep coming up with more posts like this one.
And if you like what we do and see potential in it, consider participating in our token presale at https://www.rados.io
Detailed instructions will be published within 24 hours
Great post! Look forward to more.
Hello rados, could you please add references about what you are doing? I will be very grateful.
Thanks for the article!
Hey piedra, thank you very much for your support! What do you mean by references? Maybe show an example that we can draw inspiration from for future posts?
Great post @rados, you might like this Why Everyone Is Talking About Artificial Intelligence