I can now properly compare the performance of several agents against each other on the graph. An agent’s performance can be saved and loaded later, so it can be shown alongside the current agent’s performance.
That’s really all that’s new, I suppose. There may be more, but I’ve forgotten any other updates.
Anyway, I still need to get an experiment mode in, which allows one or more agents to be loaded in and run on several domains, with the results averaged and stored in a single file (per agent). This way, an agent’s average performance can be compared against other agents’ to see which one performs best.
This mode can be used for a single MDP with many seeds or many MDPs.
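Roughly, the experiment mode could look something like the sketch below. It is only an illustration, not the actual implementation: `run_episode`, `run_experiment`, the JSON layout, and the agent and MDP names are all hypothetical, and the score is a made-up number standing in for a real agent run.

```python
import json
import random
import statistics
import tempfile
from pathlib import Path


def run_episode(agent_name, mdp_name, seed):
    """Hypothetical stand-in for running one agent on one MDP with one seed.

    Returns a made-up score in [0, 1); a real version would run the agent.
    """
    rng = random.Random(f"{agent_name}|{mdp_name}|{seed}")
    return rng.uniform(0.0, 1.0)


def run_experiment(agent_name, mdps, seeds, out_dir):
    """Run one agent on every (MDP, seed) pair and store the averages in one file."""
    seeds = list(seeds)  # allow any iterable, since it is reused for each MDP
    per_mdp_mean = {}
    for mdp in mdps:
        scores = [run_episode(agent_name, mdp, seed) for seed in seeds]
        per_mdp_mean[mdp] = statistics.mean(scores)

    result = {
        "agent": agent_name,
        "seeds": seeds,
        "per_mdp_mean": per_mdp_mean,
        # Average of the per-MDP averages (assumed aggregation rule).
        "overall_mean": statistics.mean(list(per_mdp_mean.values())),
    }
    # One result file per agent, named after the agent.
    out_path = Path(out_dir) / f"{agent_name}.json"
    out_path.write_text(json.dumps(result, indent=2))
    return out_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Many MDPs, a few seeds each.
        path = run_experiment("agent_a", ["mdp_1", "mdp_2"], range(3), tmp)
        print(path.read_text())
        # A single MDP with many seeds uses the same call.
        run_experiment("agent_b", ["mdp_1"], range(20), tmp)
```

Because each agent ends up with its own file, comparing agents later is just a matter of loading the files and looking at `overall_mean` (or the per-MDP means) side by side.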