I have no results for this post; the experiments are still in progress. At the moment, I am simply running 4 experiments with the same parameters (basic random rules) to look at the variance between runs.
Anyway, while watching the Pacman game at home to check that it still worked after extensive refactoring, I thought of another test. In videogames, a human player knows instinctively to try to preserve their lives. After all, once you have no lives left, you cannot gain any more points. The agent sort of realises this by utilising the FROM_GHOST rule, but I feel it may be able to do better. If I applied this to Infinite Mario, would the agent learn to stay alive? I figure it would, but it might pay off to test how it performs with an explicit negative reward whenever it loses a life.
I figure a reward of -10000 for each life lost would be appropriate for Pacman. Whether it makes a difference remains to be seen in the results.
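To make the idea concrete, here is a minimal sketch of how the penalty could be folded into the reward signal. It is not the actual code from my Pacman environment; the function name `shaped_reward` and its parameters are hypothetical, and it assumes the environment exposes the score change and the number of lives before and after each step.

```python
# Hypothetical sketch: add an explicit penalty whenever a life is lost.
# The -10000 value is the one proposed above for Pacman; everything else
# (names, signature) is an assumption made for illustration.
LIFE_LOST_PENALTY = -10000


def shaped_reward(score_delta, lives_before, lives_after,
                  penalty=LIFE_LOST_PENALTY):
    """Return the score change plus a penalty for each life lost this step."""
    # Only count lost lives; gaining an extra life gives no bonus here.
    lives_lost = max(0, lives_before - lives_after)
    return score_delta + penalty * lives_lost
```

With this shape, a step that scores 50 points but costs a life would yield 50 - 10000 = -9950, so the penalty dominates any normal in-game score. Whether that scale is too harsh or too weak is exactly what the experiment would have to show.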