I have had the idea of the agent inferring its own background knowledge by observing which conditions hold throughout the state, and I always figured this sort of learning already existed somewhere in machine learning.
I have found that it does in fact exist, and it is known as association mining. The question is whether association mining can handle streams, a variable number of attributes, and generality measures. I somehow doubt the standard association miners in Weka can, so I should instead ask Bernhard which one best suits my purposes and use it as a baseline.
It can tackle generality (with some appropriate fiddling). That form of learning/mining is known as multi-level association mining.
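To make the idea concrete, here is a minimal sketch of association mining over observed state conditions, with a multi-level twist: generalising conditions through a toy taxonomy raises their support. Everything here is an assumption for illustration: the predicate names, the taxonomy, and the naive (unpruned) itemset enumeration are all made up, and a real miner (e.g. Apriori in Weka) would prune candidates properly.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions containing every item in the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def frequent_itemsets(transactions, min_support, max_size=2):
    """Naive frequent-itemset enumeration (no Apriori pruning, fine for toys)."""
    items = set().union(*transactions)
    frequent = {}
    for size in range(1, max_size + 1):
        for combo in combinations(sorted(items), size):
            s = support(frozenset(combo), transactions)
            if s >= min_support:
                frequent[frozenset(combo)] = s
    return frequent

# Hypothetical Pacman-style observations: each "transaction" is the set of
# conditions that held in one observed state.
transactions = [
    frozenset({"near(ghost)", "edible(ghost)", "near(dot)"}),
    frozenset({"near(ghost)", "near(dot)"}),
    frozenset({"near(powerDot)", "near(dot)"}),
    frozenset({"near(ghost)", "edible(ghost)"}),
]

# Multi-level part: map specific conditions to a parent concept in a toy
# taxonomy, then re-mine. Supports rise at the more general level, which is
# exactly the generality knob I want.
taxonomy = {"near(dot)": "near(thing)",
            "near(powerDot)": "near(thing)",
            "near(ghost)": "near(thing)"}

general = [frozenset(taxonomy.get(i, i) for i in t) for t in transactions]

specific_fs = frequent_itemsets(transactions, min_support=0.5)
general_fs = frequent_itemsets(general, min_support=0.5)
print(specific_fs)
print(general_fs)
```

At the specific level, `near(powerDot)` is too rare to be frequent; after raising it to `near(thing)`, that concept holds in every state. Streams and a variable number of attributes would still need an incremental miner on top of this.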
This is just a reminder post for myself, should I forget it. I am currently refactoring the state-definition code to use a StringFact object, which contains a flexible definition of the fact types found in the state. The reason is that raw Strings are inflexible and the replacement procedure seems rather slow. I am also refactoring (a silly idea, really) the module arguments, because they are causing problems with the current map look-up system. Once this is all done, I need to ensure all the tests pass and the modules are working properly.
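The structured-fact idea above can be sketched as follows. This is not the actual StringFact class (which is Java and part of my codebase); it is a hypothetical Python stand-in showing why a predicate-plus-arguments object beats raw strings: substitution becomes a dictionary lookup rather than string surgery, and value-based hashing makes facts safe as map keys, which is the look-up problem the module arguments ran into.

```python
class Fact:
    """Hypothetical stand-in for a StringFact-style object: a predicate
    name plus a tuple of arguments, instead of a string like
    "near(?X, player)"."""
    __slots__ = ("predicate", "args")

    def __init__(self, predicate, *args):
        self.predicate = predicate
        self.args = tuple(args)

    def replace(self, mapping):
        """Substitute arguments via a dict -- no string parsing needed."""
        return Fact(self.predicate, *(mapping.get(a, a) for a in self.args))

    # Equal and hashable by value, so structurally identical facts
    # collide correctly in map look-ups.
    def __eq__(self, other):
        return (isinstance(other, Fact)
                and self.predicate == other.predicate
                and self.args == other.args)

    def __hash__(self):
        return hash((self.predicate, self.args))

    def __repr__(self):
        return f"{self.predicate}({', '.join(self.args)})"

f = Fact("near", "?X", "player")
g = f.replace({"?X": "ghost"})
print(g)  # near(ghost, player)
```

The same replacement on a plain string would need a parse-substitute-rebuild cycle every time, which is the slowness mentioned above.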
The experiment is nearly complete for Pacman with Population Constant 10, though the other two are still a long way from completion. Each experiment has finished at least two runs, and the Pop Const 10 has finished eight.
Judging by the speed of each experiment, the Pacman Pop Const 30 takes roughly as long as the Pop Const 50, which is interesting given that they produce the same results. The question is which approach is best. The Pop Const 10 does eventually match the other two's performance and will likely level out just as they did, and it can reach the goal in less time, though it needs more learning iterations. In the end, I suppose everything is judged by time, since the number of iterations could be made arbitrarily large. So perhaps Pop 10 works, but it should be given a larger number of iterations to learn over.
Note that these results use pre-goal specialisation only, with no general rule specialisation. That still needs some work before I am ready to launch an experiment with it.