AI (artificial intelligence) and neural networks are all over the news these days. Zuck and Elon are fighting over whether machines are going to kill us, Google's AI is beating world champions in the game of Go, and groundbreaking scientific work is now being done in collaboration with AI. In this series of blog posts we'll show you how to use AI to trade cryptocurrencies for fun and profit.
But first, let's take a step back and talk about what AI is. What's called AI today is a collection of statistical algorithms that can come up with good solutions to problems posed to them without requiring a human to instruct the machine on what exactly it should do. AI algorithms can be broken down into three categories:
Supervised learning. Supervised learning means you have a certain number of features (think daily trade volume, current price, etc.) and a label for every data point (think tomorrow's price). Provided you have good features that carry information about tomorrow's price and enough labeled data, you can use supervised learning to build a model that can predict price movement.
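To make "features" and "labels" concrete, here is a minimal sketch of how a price history could be turned into labeled data points. The numbers are made up for illustration, and the helper name make_supervised_pairs is our own invention, not part of any library.

```python
# Hypothetical toy data: these numbers are invented for illustration only.
prices = [100.0, 102.0, 101.0, 105.0, 107.0]
volumes = [10.0, 12.0, 9.0, 15.0, 14.0]


def make_supervised_pairs(prices, volumes):
    """Pair each day's features with the next day's price as the label."""
    pairs = []
    # The last day has no "tomorrow", so it cannot be labeled.
    for day in range(len(prices) - 1):
        features = (volumes[day], prices[day])  # daily volume, current price
        label = prices[day + 1]                 # tomorrow's price
        pairs.append((features, label))
    return pairs


pairs = make_supervised_pairs(prices, volumes)
for features, label in pairs:
    print(features, "->", label)
```

Any supervised model would then be trained to map the features on the left to the label on the right.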
Unsupervised learning. Unsupervised learning is useful when you don't really know what you want to get out of your data, but you are sure that there are patterns in it that you want to investigate further. These techniques can be useful when combined with supervised learning, for example to create new features. You can also visualize bitcoin addresses as a graph and apply unsupervised learning techniques to cluster addresses into meaningful groups, like "addresses that belong to the same user", or "merchant/shopper", or "trader/holder". These techniques are also used to detect anomalies, such as possible money laundering.
Reinforcement learning. Reinforcement learning studies problems where a model improves by feeding the outcomes of its own actions back into itself. To accomplish this, an RL system needs to be able to “sense” signals, automatically decide on an action, and then compare the outcome against a “reward” definition. It tries to figure out what to do to maximize these rewards, and it does so by itself, without direct instructions.
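The sense, decide, reward cycle can be sketched in a few lines. This is only an illustration under our own assumptions: the action names, the reward definition (one-step profit of the chosen position) and the helper names reward_for and run_episode are hypothetical, and the "agent" here is just a plain function that does not learn yet.

```python
import random

ACTIONS = ("buy", "sell", "hold")  # hypothetical action names


def reward_for(action, price_now, price_next):
    """Assumed reward: profit from holding the chosen position for one step."""
    change = price_next - price_now
    if action == "buy":
        return change
    if action == "sell":
        return -change
    return 0.0  # holding earns nothing and loses nothing


def run_episode(prices, policy):
    """Sense the state, pick an action, collect the reward, repeat."""
    total = 0.0
    for t in range(len(prices) - 1):
        state = prices[t]        # what the agent "senses"
        action = policy(state)   # the agent decides on its own
        total += reward_for(action, prices[t], prices[t + 1])
    return total


def always_buy(state):
    return "buy"


def random_policy(state):
    return random.choice(ACTIONS)


rising = [1.0, 2.0, 3.0, 4.0, 5.0]
print(run_episode(rising, always_buy))     # 4.0
print(run_episode(rising, random_policy))  # varies from run to run
```

A reinforcement learning algorithm replaces the fixed policy with one that adjusts itself based on the rewards it collects.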
Advanced AI systems often combine all three approaches, and in some tasks this has produced superhuman results. Today we're going to talk about the most exciting of them, reinforcement learning.
Our task is to train an AI on bitcoin price data to buy, sell or hold in a way that maximizes returns. We'll be using Google's awesome TensorFlow library to do the mathematical heavy lifting, and we'll apply a technique called deep Q-learning pioneered by DeepMind. This model operates on the following concepts:
State S, this is a representation of the current world as the algorithm sees it.
State S’, a new state one time step later than S.
Action A, one of the possible actions that can be taken in state S.
Function Q, which approximates the reward (including expected future reward) for taking action A in state S. It can be written as Q(s,a). In our case Q is a neural network.
Reward R, the actual reward received on reaching state S’ after taking action A.
Our goal is to learn the Q function from historical data. The self-learning comes from looping through many different states and actions many times, updating the Q function a little bit each time. With every loop the Q function knows a little more about the world around it and should be able to approximate the real reward a little better for each possible action.
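To show what "update the Q function a little bit" can look like, here is a minimal sketch of the standard Q-learning update. It uses a plain lookup table as a stand-in for the neural network, and the learning rate, discount factor and toy experience are all assumptions made for illustration, not the values used in our model.

```python
from collections import defaultdict

ALPHA = 0.1  # learning rate: how far each update moves Q (assumed value)
GAMMA = 0.9  # discount applied to estimated future reward (assumed value)
ACTIONS = ("buy", "sell", "hold")


def q_update(q, s, a, r, s_next):
    """Nudge Q(s, a) toward the reward plus the best estimated future value."""
    best_next = max(q[(s_next, a2)] for a2 in ACTIONS)
    target = r + GAMMA * best_next
    q[(s, a)] += ALPHA * (target - q[(s, a)])


q = defaultdict(float)  # lookup-table stand-in for the network; starts at 0.0

# Hypothetical experience: in state "up", buying earns +1, selling earns -1,
# holding earns 0, and the next state is "up" again.
for _ in range(500):
    q_update(q, "up", "buy", 1.0, "up")
    q_update(q, "up", "sell", -1.0, "up")
    q_update(q, "up", "hold", 0.0, "up")

best = max(ACTIONS, key=lambda a: q[("up", a)])
print(best)  # buy
```

Each call changes the estimate only slightly, but after many loops the table ranks buying above holding above selling in a rising market.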
First things first: before we proceed, let's validate that our AI works on trivial examples. Let's start with learning how to trade a currency whose price chart looks like a straight line. Let's see what happens if we only let the model train for one iteration.
Clearly, our AI has no idea about what's going on. Let's let it train for 100 iterations instead.
Great! Our AI has learned what a straight line looks like. We only observe long trades after the model has seen the first two price points.
Let's turn it up a notch and see if our AI can learn how to behave in more complex scenarios. We'll replace the straight line with a sine curve and see what strategy our AI proposes.
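For readers who want to reproduce this kind of test, here is a rough sketch of how a sine-shaped price series could be generated, together with the actions a trader with perfect foresight would take. The function names and all default parameters (period, base price, amplitude) are our own assumptions, not the exact setup behind the charts in this post.

```python
import math


def sine_prices(n_steps, period=50, base=100.0, amplitude=10.0):
    """Synthetic price series: a sine wave around a base price."""
    return [
        base + amplitude * math.sin(2 * math.pi * t / period)
        for t in range(n_steps)
    ]


def ideal_actions(prices):
    """With perfect foresight: buy before a rise, sell before a fall."""
    actions = []
    for now, nxt in zip(prices, prices[1:]):
        if nxt > now:
            actions.append("buy")
        elif nxt < now:
            actions.append("sell")
        else:
            actions.append("hold")
    return actions


prices = sine_prices(100)
actions = ideal_actions(prices)
print(actions[0], actions[20], actions[40])
```

The perfect-foresight actions give a reference to compare a learned strategy against: the closer the model's trades are to them, the better it has understood the curve.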
Again, first let's see how an undertrained model performs, and then see how performance improves if we let the model train to its full potential.
Training for 10 iterations:
Again, no clue. 1000 iterations:
Fantastic! We've trained our AI to understand what a sine wave looks like and how to trade in a more complex scenario. It trades with confidence during strong ups and downs, and cautiously tries to anticipate sharp changes when it senses an upcoming trend change.
Now that we've convinced you that AI can learn to trade on theoretical examples, let's try it out on real data.
In the second part, we will take BTC/USD price, augment it with some technical indicators, feed that into our model, and see what strategy our AI would pick.