Started a massive refactoring by removing the SubState system completely, along with all code linked to it.
Created a new class, ParameterSet, which stores the parameters and the results obtained with them. The class can also mutate itself to create children.
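
A rough Python sketch of what such a class could look like. Only the name ParameterSet and the idea of self-mutation come from the notes above; the field names, the `record` and `average` helpers, and the use of Gaussian noise for mutation are assumptions made for illustration.

```python
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class ParameterSet:
    """Illustrative only: parameter values plus the results seen with them."""
    values: List[float]
    results: List[float] = field(default_factory=list)

    def record(self, result: float) -> None:
        # Store one test result (e.g. a score) for this parameter set.
        self.results.append(result)

    def average(self) -> float:
        # Mean of recorded results; 0.0 if nothing has been recorded yet.
        return sum(self.results) / len(self.results) if self.results else 0.0

    def mutate(self, rng: random.Random, scale: float = 0.1) -> "ParameterSet":
        # Assumption: a child is the parent plus small Gaussian noise.
        # The child starts with no results of its own.
        return ParameterSet([v + rng.gauss(0.0, scale) for v in self.values])
```

Usage would be along the lines of `child = parent.mutate(random.Random())`, leaving the parent unchanged.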
Laid down the framework for the tests and filled in some of the basic test methods.
As a result of the new system, the method of learning has changed slightly too. A parameter is chosen and tested on x pieces, then a new parameter is chosen which is a mutation of the original.
From there, each new parameter is either an offshoot (mutation) of the best parameter or a child of the two best parameters.
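
The selection step described above might be sketched as follows. This is not the project's code: the function names, the 50/50 split between mutation and crossover, the uniform crossover, the Gaussian mutation, and the convention that a higher score is better are all assumptions.

```python
import random
from typing import List, Tuple

Params = List[float]


def mutate(params: Params, rng: random.Random, scale: float = 0.1) -> Params:
    # Assumed mutation: add small Gaussian noise to every value.
    return [v + rng.gauss(0.0, scale) for v in params]


def crossover(a: Params, b: Params, rng: random.Random) -> Params:
    # Assumed crossover: take each value from one parent at random.
    return [rng.choice(pair) for pair in zip(a, b)]


def next_candidate(tested: List[Tuple[Params, float]],
                   rng: random.Random,
                   crossover_prob: float = 0.5) -> Params:
    """Return an offshoot of the best parameters or a child of the two best."""
    if not tested:
        raise ValueError("need at least one tested parameter set")
    # Rank by score, best first (higher score assumed better).
    ranked = sorted(tested, key=lambda t: t[1], reverse=True)
    best = ranked[0][0]
    if len(ranked) >= 2 and rng.random() < crossover_prob:
        return crossover(best, ranked[1][0], rng)
    return mutate(best, rng)
```

Each entry in `tested` pairs a parameter list with the score it earned over its x pieces.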
When choosing a style of play, the agent picks either exploratory or greedy and sticks with that style for x pieces, after which a new style is chosen. As in the old system, the agent chooses greedily more and more often over time.
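
One way the style selection could work, as a sketch only. The class name StyleSchedule, the starting probability, and the linear increase in greediness are assumptions; the notes do not say how the shift towards greedy play is actually scheduled.

```python
import random
from typing import Optional


class StyleSchedule:
    """Picks a style of play and keeps it for a fixed number of pieces."""

    def __init__(self, block_size: int, greedy_prob: float = 0.5,
                 step: float = 0.05, rng: Optional[random.Random] = None):
        if block_size < 1:
            raise ValueError("block_size must be at least 1")
        self.block_size = block_size    # the "x pieces" from the notes
        self.greedy_prob = greedy_prob  # chance of picking the greedy style
        self.step = step                # assumed linear increase per block
        self.rng = rng or random.Random()
        self.style = "exploratory"
        self._left = 0                  # pieces left in the current block

    def style_for_next_piece(self) -> str:
        if self._left == 0:
            # Start a new block: choose a style, then become a bit greedier.
            greedy = self.rng.random() < self.greedy_prob
            self.style = "greedy" if greedy else "exploratory"
            self.greedy_prob = min(1.0, self.greedy_prob + self.step)
            self._left = self.block_size
        self._left -= 1
        return self.style
```

Calling `style_for_next_piece()` once per piece returns the same style x times in a row before a new one is drawn.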