Tried out an idea to stop the agent from destroying itself at the last few lines, but it didn’t seem to have much effect, and possibly made things worse.
So, another avenue of exploration is to inspect the data and see if there is a pattern to it. The plan is to record various parameters of the field (rows, columns, piece distribution) as input attributes, treat the output parameters as the classes, and then use WEKA to try to learn any patterns in the data.
The parameters are captured 10 times for each MDP (hopefully this is enough), resulting in 200 lines of data for WEKA to do its thing with. This will take a while to capture, so I’ll come back tomorrow and check out the results. Regardless of the results, I need to start my proving run tomorrow because this competition is closing soon.
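
For reference, here is a rough sketch of how the captured lines could be written out in WEKA's ARFF format. It is only an illustration under assumptions: the attribute names (`rows`, `columns`, `piece_bias`, `best_setting`), the class labels and the sample values are all hypothetical placeholders, not the actual captured data, and `to_arff` is a helper made up for this sketch.

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text.

    attributes: list of (name, kind) pairs, where kind is either the string
    "NUMERIC" or a list of nominal labels (used here for the class attribute).
    """
    lines = [f"@RELATION {relation}", ""]
    for name, kind in attributes:
        if isinstance(kind, (list, tuple)):
            # Nominal attributes are declared as {label1,label2,...}
            kind = "{" + ",".join(kind) + "}"
        lines.append(f"@ATTRIBUTE {name} {kind}")
    lines += ["", "@DATA"]
    for row in rows:
        if len(row) != len(attributes):
            raise ValueError("row length does not match attribute count")
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Hypothetical attributes: field parameters as inputs, one output parameter as the class.
ATTRIBUTES = [
    ("rows", "NUMERIC"),
    ("columns", "NUMERIC"),
    ("piece_bias", "NUMERIC"),          # assumed one-number summary of the piece distribution
    ("best_setting", ["low", "high"]),  # assumed class labels
]

# Placeholder samples only; the real file would hold 10 lines per MDP.
SAMPLES = [
    (20, 10, 0.14, "low"),
    (24, 8, 0.30, "high"),
]

arff_text = to_arff("tetris_fields", ATTRIBUTES, SAMPLES)
print(arff_text)
```

Saving `arff_text` to a `.arff` file should give something WEKA can open directly, with the last attribute available as the class.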