-------------------------------------------------------------------------------------------
This is a framework to go from any trading idea (or hypothesis) to a trading system. It is (by definition) a work in progress and open to feedback and constructive criticism, since this is the only way to make it stronger.
This framework is built around the idea of building algos based on a hypothesis about the market and how it behaves under a given set of conditions. We will call this set of conditions an event.
The scope of this framework covers the process from the conception of the idea to the point where it is trading live.
The goal of each step in the framework is to prove the hypothesis wrong, or falsify it. Once it is falsified, we go back to the top, adjust the hypothesis and start over. The idea behind this is that it is cheaper to discard a wrong idea earlier in the process.
The goal of this process of iterations is to make the hypothesis stronger based on the observations in the data.
The market hypothesis needs to have 2 components:
- An event, which is a series of market conditions that are described by the hypothesis.
- A set of rules to follow in order to profit from that event.
Overview:
- We will first formulate the hypothesis on paper
- Build the system that detects when there is a high probability of occurrence of the event.
- Build the rules to act in a manner that allows us to profit from the event, if it were to occur, and minimize risk if the event doesn't materialize.
- Formulation
=======================
During this stage we formulate a hypothesis that describes an event in the market.
- Formulate the hypothesis in written words
----------------------------------------------------------
During this step, we write out the hypothesis in words.
Describe the market conditions, or theories, that provide a foundation for the idea.
The hypothesis should be stated as a description of a specific market event.
What are the conditions that describe the event?
What happens before, during and after?
Describe the event or the outcome of the event.
- Formulate the hypothesis in terms of data
----------------------------------------------------------
The goal of this step is to define the conditions in such a way that you can automatically label instances of the event in the data.
In this step you can formulate calculated data indicators that may help identify the event, such as statistical measures on the data, rolling window calculations, etc.
During this stage we must also define which AI, if any, is a good fit for the problem.
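As an illustration of a rolling window indicator, here is a minimal sketch that uses only the Python standard library. The function names (`simple_returns`, `rolling_zscore`), the window length and the price series are assumptions made for this example, not part of the framework. Each score is computed from past values only, so the same indicator could be produced in real time.

```python
from statistics import mean, pstdev

def simple_returns(prices):
    """Period-over-period returns; one element shorter than `prices`."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]

def rolling_zscore(values, window):
    """Z-score of each value relative to the `window` values before it.

    Positions without enough history get None. Only past values are used,
    so no future information leaks into the indicator.
    """
    scores = []
    for i, value in enumerate(values):
        if i < window:
            scores.append(None)
            continue
        history = values[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        scores.append(None if sigma == 0 else (value - mu) / sigma)
    return scores

# Invented prices: four quiet bars followed by a large jump.
prices = [100.0, 101.0, 100.0, 101.0, 100.0, 110.0]
returns = simple_returns(prices)
z = rolling_zscore(returns, window=4)
print(z)
```

The last return stands out with a large positive z-score, which is the kind of condition a hypothesis could use to describe an event.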
- Detecting the Event
========================
The next step is to build a signal based on the hypothesis.
For the purposes of this document, we will assume that the signal is not trivial and is generated by an AI.
If the signal generation is trivial, we can simply skip this step, generate the signals and move to the next phase.
AI training
-----------------------
During this stage we will train an AI to identify the event described by the hypothesis on historical data.
To prevent hindsight bias, the data must be separated by dates, so that there are no overlapping timestamps across data groups. This holds true for every data partition that we make during the experiments. We use three groups:
- training data
- quiz data
- test data
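A date-based split can be sketched as follows. This is one possible way to do it, assuming the data is a list of `(date, value)` rows. The function name `split_by_date` and the two boundary dates are hypothetical.

```python
from datetime import date

def split_by_date(rows, train_end, quiz_end):
    """Split (date, value) rows into training, quiz and test groups.

    training: date <= train_end
    quiz:     train_end < date <= quiz_end
    test:     date > quiz_end
    """
    if train_end >= quiz_end:
        raise ValueError("train_end must be earlier than quiz_end")
    train, quiz, test = [], [], []
    for row in sorted(rows, key=lambda r: r[0]):
        day = row[0]
        if day <= train_end:
            train.append(row)
        elif day <= quiz_end:
            quiz.append(row)
        else:
            test.append(row)
    return train, quiz, test

# Ten invented daily rows, split 6 / 2 / 2 by date.
rows = [(date(2020, 1, d), float(d)) for d in range(1, 11)]
train, quiz, test = split_by_date(rows, date(2020, 1, 6), date(2020, 1, 8))
print(len(train), len(quiz), len(test))
```

Because the boundaries are dates rather than random row indexes, every quiz row is later than every training row, and every test row is later than every quiz row.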
Label the data:
Find instances of the event, using hindsight
-- e.g. if the event is a 10% market move, then find the instances where such a move happened and label the event right before the move.
We will use the labels to train the AI to identify the conditions prior to the event.
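The 10% example can be sketched as a labeling function. The assumptions here are ours: the move is upward, it is measured from the current bar's price, and it must happen within a fixed number of bars (`horizon`). The function name `label_events` and the prices are invented for illustration.

```python
def label_events(prices, horizon, threshold=0.10):
    """Label bar i with 1 if the price rises by at least `threshold`
    at some point within the next `horizon` bars, otherwise 0.

    Future prices are used on purpose (hindsight). This is only valid
    for building training labels, never as an input to the AI.
    """
    labels = []
    for i, price in enumerate(prices):
        future = prices[i + 1:i + 1 + horizon]
        hit = any(p / price - 1.0 >= threshold for p in future)
        labels.append(1 if hit else 0)
    return labels

prices = [100.0, 101.0, 99.0, 100.0, 112.0, 111.0]
labels = label_events(prices, horizon=2)
print(labels)  # [0, 0, 1, 1, 0, 0]
```

The bars labeled 1 are the ones right before the move. Bars near the end of the series have fewer than `horizon` future bars, so their labels are less reliable and may be dropped.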
Training data:
Have the AI look at the training data, make predictions, measure the accuracy of the prediction against the labels and adjust the parameters.
Repeat.
Quiz data:
Every few hundred iterations on the training data, run the predictions on the quiz data and measure the accuracy. Do not adjust the parameters based on the quiz set!
Test data:
This data is used only once at the end of the training and the results are reported.
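The loop over the three data groups can be sketched as below. The framework does not prescribe a model, so this example swaps in a tiny logistic regression trained by gradient descent as a stand-in for "the AI". The function names, the learning rate, the iteration counts and the toy rows are all assumptions.

```python
import math

def predict(weights, features):
    """Logistic model: estimated probability that the event follows."""
    score = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return 1.0 / (1.0 + math.exp(-score))

def accuracy(weights, dataset):
    """Fraction of rows where the thresholded prediction matches the label."""
    hits = sum((predict(weights, x) >= 0.5) == bool(y) for x, y in dataset)
    return hits / len(dataset)

def train(train_set, quiz_set, n_features, iterations=1000, quiz_every=200, lr=0.5):
    weights = [0.0] * (n_features + 1)
    quiz_log = []
    for step in range(1, iterations + 1):
        # Gradient of the log loss, computed on the training data only.
        grad = [0.0] * len(weights)
        for x, y in train_set:
            error = predict(weights, x) - y
            grad[0] += error
            for j, xj in enumerate(x):
                grad[j + 1] += error * xj
        weights = [w - lr * g / len(train_set) for w, g in zip(weights, grad)]
        # The quiz set is only measured; it never feeds the update.
        if step % quiz_every == 0:
            quiz_log.append((step, accuracy(weights, quiz_set)))
    return weights, quiz_log

# Invented rows of ([feature], label), already separated by date.
train_set = [([0.2], 0), ([1.8], 1), ([0.5], 0), ([2.2], 1), ([-0.3], 0), ([1.5], 1)]
quiz_set = [([0.1], 0), ([2.0], 1)]
test_set = [([0.4], 0), ([1.9], 1)]

weights, quiz_log = train(train_set, quiz_set, n_features=1)
test_accuracy = accuracy(weights, test_set)  # evaluated once, at the end
print(quiz_log, test_accuracy)
```

The quiz log shows whether accuracy on unseen data keeps improving or starts to fall while training accuracy rises, which is a sign of overfitting. The test set is touched a single time, after training is finished.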
Acting on the event
========================
During this stage we observe the statistical data produced by the event, and we generate a trading plan around these observations.
Profile the event
-----------------------------------
In the reports for the experiment, include event profiles of the events identified by the AI on each of the 3 datasets.
The event profiler will allow us to make forecasts and predictions on what we can expect of the event in terms of risk/reward.
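One simple form of event profile is the average cumulative return at each bar after the event. The sketch below assumes that definition. The function name `event_profile`, the `lookahead` parameter and the prices are hypothetical, and a fuller profiler would also report dispersion and the bars before the event.

```python
def event_profile(prices, event_indexes, lookahead):
    """Average cumulative return at each offset 1..lookahead after an event.

    Events too close to the end of the series to have a full window are
    skipped. Returns (profile, number_of_events_used).
    """
    paths = []
    for i in event_indexes:
        if i + lookahead >= len(prices):
            continue
        base = prices[i]
        paths.append([prices[i + k] / base - 1.0 for k in range(1, lookahead + 1)])
    if not paths:
        return [], 0
    profile = [sum(column) / len(paths) for column in zip(*paths)]
    return profile, len(paths)

# Invented prices with events flagged at bars 0, 3 and 5.
prices = [100.0, 110.0, 120.0, 200.0, 210.0, 240.0, 250.0]
profile, used = event_profile(prices, event_indexes=[0, 3, 5], lookahead=2)
print(profile, used)
```

Here the event at bar 5 is skipped because it has only one bar after it, so the profile averages the two remaining events: about +7.5% after one bar and +20% after two.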
- Generate trading rules
----------------------------------------
Up to this point we've worked only on the description of the data.
From what we learned by describing the data we formulate a trading plan.
What is the profit target?
What is the maximum loss we will tolerate?
What is the probability of the trade being a profit?
Should the entry be made in 1 shot, or in steps?
The trade plan is a set of rules that will be applied when an event signal is generated by the signal engine.
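The questions above map naturally onto a small data structure. The following is a sketch under a simplifying assumption: every trade ends either at the profit target or at the maximum loss. The class name `TradePlan`, its fields and the numbers are invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TradePlan:
    profit_target: float    # exit with a gain at this return, e.g. 0.04 = +4%
    max_loss: float         # exit with a loss at this return, e.g. 0.02 = -2%
    win_probability: float  # estimated from the event profile, not guaranteed
    entry_steps: int = 1    # 1 = single entry, more than 1 = enter in steps

    def expected_return(self):
        """Expected return per trade if every trade ends at target or stop."""
        p = self.win_probability
        return p * self.profit_target - (1.0 - p) * self.max_loss

    def breakeven_probability(self):
        """Win probability at which the expected return is zero."""
        return self.max_loss / (self.profit_target + self.max_loss)

plan = TradePlan(profit_target=0.04, max_loss=0.02, win_probability=0.45)
print(plan.expected_return(), plan.breakeven_probability())
```

With these numbers the plan wins less than half the time but still has a positive expected return of about 0.7% per trade, because the break-even win probability is only about 33%.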
Trading test
=========================
Back test
------------------------------
With the trade plan we go back to the lab.
This time we simulate the execution of the trade plan on the backtesting data.
For the backtester we have the AI identify events on the backtesting data, and we simulate trades based on the trade plan rules.
The backtest data must be divided in the same manner as the AI data.
On the training data, we iterate, adjusting the trade rules to optimize the results. We run the quiz every few hundred iterations, and run on the test data once we consider that we have a strong set of rules.
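A single backtest run can be sketched as below. This is a deliberately simplified simulation: entries and exits happen at bar prices with no slippage or fees, each signal is traded independently, and a trade is closed after `max_hold` bars if neither exit level is reached. The function names, parameters and prices are assumptions.

```python
def simulate_trade(prices, entry_index, profit_target, max_loss, max_hold):
    """Enter at prices[entry_index]; exit at the first bar that reaches the
    profit target or the maximum loss, or after max_hold bars.
    Returns the trade return."""
    entry = prices[entry_index]
    last = min(entry_index + max_hold, len(prices) - 1)
    for i in range(entry_index + 1, last + 1):
        ret = prices[i] / entry - 1.0
        if ret >= profit_target or ret <= -max_loss:
            return ret
    return prices[last] / entry - 1.0

def backtest(prices, signal_indexes, profit_target, max_loss, max_hold):
    """Apply the trade rules to every signal and summarize the results."""
    returns = [simulate_trade(prices, i, profit_target, max_loss, max_hold)
               for i in signal_indexes if i < len(prices) - 1]
    wins = sum(r > 0 for r in returns)
    return {
        "trades": len(returns),
        "win_rate": wins / len(returns) if returns else 0.0,
        "mean_return": sum(returns) / len(returns) if returns else 0.0,
    }

# Invented prices with AI signals at bars 0 and 3.
prices = [100.0, 101.0, 105.0, 103.0, 100.0, 97.0, 99.0]
report = backtest(prices, signal_indexes=[0, 3],
                  profit_target=0.04, max_loss=0.02, max_hold=3)
print(report)
```

The first trade exits at +5% on the target and the second exits at about -2.9% on the stop. Adjusting `profit_target`, `max_loss` and `max_hold` on the training data, and only measuring on the quiz data, is the iteration described above.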
- Walk-forward test
-----------------------------------------
During this test we use the set of rules obtained during the back test and simulate executions against live market data.
The AI identifies the trading event, and we enter/exit based on the rules.
- Live test
-----------------------------
Test the algo in live markets and grow its volume slowly to measure the effect of slippage and control its risk.
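One way to measure the effect of slippage as volume grows is to compare the price at signal time with the actual fill price, grouped by order size. The sketch below assumes slippage is expressed in basis points of the signal price. The function names and the fills are invented for illustration.

```python
def slippage_bps(side, signal_price, fill_price):
    """Slippage in basis points; positive means the fill was worse than expected."""
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")
    direction = 1.0 if side == "buy" else -1.0
    return direction * (fill_price - signal_price) / signal_price * 10_000

def mean_slippage_by_size(fills):
    """Average slippage for each order size, from (size, side, signal, fill) rows."""
    buckets = {}
    for size, side, signal_price, fill_price in fills:
        buckets.setdefault(size, []).append(
            slippage_bps(side, signal_price, fill_price))
    return {size: sum(v) / len(v) for size, v in sorted(buckets.items())}

# Invented fills: two small orders and one larger order.
fills = [
    (100, "buy", 50.00, 50.01),
    (100, "sell", 50.00, 49.99),
    (500, "buy", 50.00, 50.04),
]
by_size = mean_slippage_by_size(fills)
print(by_size)
```

If the average slippage rises faster than the order size, as in this toy data, the volume at which the plan stops being profitable can be estimated before it is reached.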
