Interesting ... what platform are you using?
This is software I wrote that runs on Windows 10 with C++, OpenCL, and Cygwin.
Perl and shell scripts gather daily price and reference data for various assets (e.g., stock indexes and ETFs), retrieving it with curl and headless Chrome.
Perl and shell scripts preprocess the data to do scaling and calculate indicators.
The genetic programming C++ executable (Windows console application) uses OpenCL to process the preprocessed data and create rules. OpenCL lets some calculations run on a GPU, which gets the results much faster. An example of a rule is:
The top line of the rule has a name of something the rule is trying to predict. In this example, rule 1 of model 06 is for signaling a short trade on the S&P 500 at the next bar's close with an exit at the close 10 bars in the future.
This rule looked at 7245 trading days of preprocessed data, would have been hit 2874 times (39.6687 percent of the time) and would have had a positive outcome 2042 times (71.0508 percent of the hits) with a mean gain of 1.69423 percent.
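The percentages above follow directly from the counts. A tiny sketch of the arithmetic (the function name `percent` is made up for illustration and is not from the actual software):

```cpp
#include <cassert>

// Percentage of `part` within `whole`; `whole` must be positive.
double percent(double part, double whole) {
    assert(whole > 0.0);
    return 100.0 * part / whole;
}
```

With the numbers from the example, `percent(2874, 7245)` is the hit rate (about 39.6687) and `percent(2042, 2874)` is the share of hits with a positive outcome (about 71.0508).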
The body of the rule can be thought of as a high-level assembly language. Each instruction has an operation (e.g., +), one or two operands (e.g., 0.186615 or indTypeA015), and may put the result in a register (e.g., R0) which will be used in later instructions. Indentation shows instructions that run only when the preceding if statement evaluates true. Operands are floating point constants, indicators, or registers. Indicators have types, and an instruction that operates on two different types of indicators results in NAN (not a number), or in false for an if statement. Missing indicator values have a value of NAN, and operations involving NAN result in NAN. The rule fires when it returns a value greater than zero.
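To make the type and NAN behavior concrete, here is a minimal sketch of how such instructions could be evaluated. It is not the actual interpreter: the names (`Typed`, `apply`, `greaterThan`, `store`, `fires`) are hypothetical, and it assumes that constants are untyped (type 0) and that a result inherits the indicator type of its typed operand.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

const double kNaN = std::numeric_limits<double>::quiet_NaN();

// A value tagged with an indicator type. Type 0 means "untyped"
// (a constant); this tagging scheme is an assumption.
struct Typed {
    double value;
    int type;
};

// True when both operands are typed indicators of different types.
bool typesClash(Typed a, Typed b) {
    return a.type != 0 && b.type != 0 && a.type != b.type;
}

// Arithmetic instruction: a type clash yields NaN, and NaN inputs
// propagate through the floating point arithmetic on their own.
Typed apply(char op, Typed a, Typed b) {
    if (typesClash(a, b)) return {kNaN, 0};
    int type = (a.type != 0) ? a.type : b.type;
    switch (op) {
        case '+': return {a.value + b.value, type};
        case '-': return {a.value - b.value, type};
        case '*': return {a.value * b.value, type};
        case '/': return {a.value / b.value, type};
        default:  return {kNaN, 0};  // unknown operation
    }
}

// Condition of an if statement: false on a type clash, and false when
// either side is NaN (any ordered comparison with NaN is false).
bool greaterThan(Typed a, Typed b) {
    return !typesClash(a, b) && a.value > b.value;
}

// Put a result in a register (e.g. R0) for later instructions.
void store(std::vector<Typed>& regs, std::size_t r, Typed v) {
    assert(r < regs.size());
    regs[r] = v;
}

// The rule fires when its final value is greater than zero;
// a NaN result never fires.
bool fires(Typed result) { return result.value > 0.0; }
```

For example, `apply('+', {0.186615, 0}, {1.0, 15})` adds a constant to an indicator of (made-up) type 15 and keeps that type, while `apply('-', {1.0, 15}, {2.0, 16})` mixes two indicator types and gives NaN.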
The genetic part consists of initialization followed by multiple sequences (generations) of selection, crossover, mutation, evaluation, and survival.
Initialization creates random rules and calculates a fitness measure for each rule. Fitness is based on a risk-adjusted return for a simulated trade.
Selection picks pairs of father and mother rules for crossover and mutation.
For crossover, the father rule gets copied to a son rule, and the mother rule gets copied to a daughter rule. Then a random part of the father's rule gets overlaid at a random location in the daughter's rule, and a random part of the mother's rule gets overlaid at a random location in the son's rule.
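A rough sketch of that crossover step, under some assumptions that are not from the actual code: a rule is modeled as a plain vector with one int per instruction, "overlaid" is taken to mean overwriting in place, and the copied segment is clipped at the end of the target so rule lengths do not change. The names `Rule`, `overlay`, and `crossover` are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

using Rule = std::vector<int>;  // stand-in: one int per instruction

// Copy a random segment of `donor` over a random position in `target`,
// clipping at the end of `target` so its length stays the same.
void overlay(const Rule& donor, Rule& target, std::mt19937& rng) {
    assert(!donor.empty() && !target.empty());
    std::uniform_int_distribution<std::size_t> startDist(0, donor.size() - 1);
    std::size_t start = startDist(rng);
    std::uniform_int_distribution<std::size_t> lenDist(1, donor.size() - start);
    std::size_t len = lenDist(rng);
    std::uniform_int_distribution<std::size_t> posDist(0, target.size() - 1);
    std::size_t pos = posDist(rng);
    len = std::min(len, target.size() - pos);  // at least 1 after clipping
    std::copy_n(donor.begin() + start, len, target.begin() + pos);
}

// Returns {son, daughter}: the son starts as a copy of the father and
// receives part of the mother; the daughter starts as a copy of the
// mother and receives part of the father.
std::pair<Rule, Rule> crossover(const Rule& father, const Rule& mother,
                                std::mt19937& rng) {
    Rule son = father;
    Rule daughter = mother;
    overlay(father, daughter, rng);
    overlay(mother, son, rng);
    return {son, daughter};
}
```

Real rules with nested if blocks would need extra care to keep the indentation structure valid after an overlay; this sketch ignores that.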
For mutation, the fitter of the father and mother rules gets copied to a mutant rule. Then a random number of instructions in the mutant rule are changed to new random instructions.
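Mutation can be sketched the same way. Again this is only an illustration with hypothetical names (`Rule`, `Scored`, `mutate`): a rule is a vector with one int per instruction, and a "new random instruction" is stood in for by a random int from 0 to 999.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Rule = std::vector<int>;  // stand-in: one int per instruction

struct Scored {
    Rule rule;
    double fitness;  // higher is better
};

// Copy the fitter parent, then overwrite a random number of randomly
// chosen positions with new random instructions. The same position may
// be picked more than once.
Rule mutate(const Scored& father, const Scored& mother, std::mt19937& rng) {
    Rule mutant = (father.fitness >= mother.fitness) ? father.rule : mother.rule;
    assert(!mutant.empty());
    std::uniform_int_distribution<std::size_t> countDist(1, mutant.size());
    std::uniform_int_distribution<std::size_t> posDist(0, mutant.size() - 1);
    std::uniform_int_distribution<int> instrDist(0, 999);  // placeholder instructions
    std::size_t count = countDist(rng);
    for (std::size_t i = 0; i < count; ++i) {
        mutant[posDist(rng)] = instrDist(rng);
    }
    return mutant;
}
```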
Evaluation calculates fitness for each son, daughter, and mutant.
Survival picks which of the fathers, mothers, sons, daughters, and mutants are kept for the next generation.
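One simple way to do survival is to pool all candidates and keep the fittest ones. The sketch below shows that scheme purely as an assumption; it is not necessarily how the software picks survivors, and the names `Candidate` and `survive` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Candidate {
    int id;          // which father, mother, son, daughter, or mutant
    double fitness;  // higher is better; assumed not NaN here
};

// Keep the `keep` fittest candidates from the pooled population,
// ordered from best to worst.
std::vector<Candidate> survive(std::vector<Candidate> pool, std::size_t keep) {
    assert(keep <= pool.size());
    std::partial_sort(pool.begin(), pool.begin() + keep, pool.end(),
                      [](const Candidate& a, const Candidate& b) {
                          return a.fitness > b.fitness;
                      });
    pool.resize(keep);
    return pool;
}
```

Pure truncation like this tends to reduce diversity quickly, so other schemes (e.g., tournaments between a child and its parent) are common alternatives.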
Perl and shell scripts use the executable to interpret the best rules from multiple models for long and short directions to form a consensus to go long, short, or not trade.
The k-nearest neighbor C++ executable (Windows console application) processes preprocessed data for the model and evaluation. Model data represents price charts for different assets at different times with future results. Evaluation data represents price charts for different assets at a single time (usually the most recent time).
For each instance of evaluation data (i.e., a single chart), the software compares the evaluation data with each instance of the model data to find which model instances have similar charts. The comparison is by a weighted, Euclidean-type distance. More recent times get higher weights in the calculated distance. The results of the closest "k" model instances are combined to form a risk-adjusted result as a prediction.
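Here is a minimal sketch of that comparison, with several assumptions that are not from the actual code: charts are equal-length vectors with the oldest bar first, the recency weights grow linearly, and the "risk-adjusted result" is taken to be the mean outcome of the k nearest instances divided by their standard deviation. All names (`ModelInstance`, `recencyWeights`, `weightedDistance`, `predict`) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

struct ModelInstance {
    std::vector<double> chart;  // preprocessed bars, oldest first
    double outcome;             // known future result for this chart
};

// Linearly increasing weights so the most recent bar counts most.
std::vector<double> recencyWeights(std::size_t n) {
    std::vector<double> w(n);
    for (std::size_t i = 0; i < n; ++i) {
        w[i] = static_cast<double>(i + 1) / static_cast<double>(n);
    }
    return w;
}

// Weighted Euclidean distance: sqrt(sum of w[i] * (a[i] - b[i])^2).
double weightedDistance(const std::vector<double>& a,
                        const std::vector<double>& b,
                        const std::vector<double>& w) {
    assert(a.size() == b.size() && a.size() == w.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        double d = a[i] - b[i];
        sum += w[i] * d * d;
    }
    return std::sqrt(sum);
}

// Mean outcome of the k nearest model instances divided by their
// standard deviation (plain mean if the outcomes are all equal).
double predict(const std::vector<ModelInstance>& model,
               const std::vector<double>& query, std::size_t k) {
    assert(k >= 1 && k <= model.size());
    std::vector<double> w = recencyWeights(query.size());
    std::vector<std::pair<double, double>> scored;  // {distance, outcome}
    for (const ModelInstance& m : model) {
        scored.emplace_back(weightedDistance(m.chart, query, w), m.outcome);
    }
    std::partial_sort(scored.begin(), scored.begin() + k, scored.end());
    double mean = 0.0;
    for (std::size_t i = 0; i < k; ++i) mean += scored[i].second;
    mean /= static_cast<double>(k);
    double var = 0.0;
    for (std::size_t i = 0; i < k; ++i) {
        double d = scored[i].second - mean;
        var += d * d;
    }
    double sd = std::sqrt(var / static_cast<double>(k));
    return sd > 0.0 ? mean / sd : mean;
}
```

With two-bar charts, a difference of 2 on the older bar gives a distance of sqrt(2), while the same difference on the newer bar gives 2, so mismatches in recent bars push a model instance further away.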
Perl and shell scripts use the executable's output to rank the evaluation assets into something like:
The count column is the "k" which for this example is one percent of the model instances, and the score column has the risk-adjusted prediction (higher values are better). The prediction is for going long at the close of the next bar and exiting at the close 21 bars in the future.
I don't know if either of these methods will work in the future, of course, but they were certainly interesting to develop.