I seem to be reaching the end of the line in terms of improvement, as there doesn’t appear to be much left to improve. The Checklist has little left on it, and I can’t do a proving run until I get the issue sorted.
There is one more major thing, though: the emergency play mode. It is worth thinking about, as it could keep the agent playing for that extra bit longer. Bernhard and I discussed in the meeting where it would be active and how it would work.
The field could be subdivided into two parts, normal play and emergency play. Each could have its own policy of parameter sets, which is learned only while the field is in the corresponding state. This reflects the fact that the agent needs to play differently in different field situations.
This can be easily implemented and should be done by the week’s end.
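As a rough illustration of the normal/emergency split, here is a minimal Python sketch. It is not the agent's actual code: the `danger` score, the `EMERGENCY_THRESHOLD` value, and the `Policy`, `play_mode` and `ModePolicies` names are all assumptions made for the example, and `learn` only stores a parameter set as a stand-in for whatever learning step the agent really uses.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical cut-off: the point at which emergency play becomes active
# is not fixed in these notes, so this value is only a placeholder.
EMERGENCY_THRESHOLD = 0.75


@dataclass
class Policy:
    """One policy, holding the parameter sets learned in one play mode."""
    name: str
    parameter_sets: List[Dict[str, float]] = field(default_factory=list)


def play_mode(danger: float, threshold: float = EMERGENCY_THRESHOLD) -> str:
    """Map an assumed danger score in [0, 1] to a play mode."""
    return "emergency" if danger >= threshold else "normal"


class ModePolicies:
    """Keeps a separate policy for each play mode."""

    def __init__(self) -> None:
        self._policies = {
            "normal": Policy("normal"),
            "emergency": Policy("emergency"),
        }

    def active(self, danger: float) -> Policy:
        """Return the policy that matches the current field state."""
        return self._policies[play_mode(danger)]

    def learn(self, danger: float, parameter_set: Dict[str, float]) -> None:
        """Placeholder for learning: store the set in the active policy only."""
        self.active(danger).parameter_sets.append(dict(parameter_set))
```

With this layout, a parameter set tried while the field is in the emergency state never ends up in the normal policy, which is the separation described above.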
Something else worth adding: when changing parameter sets, take note of the field value before (evaluationField with no piece) and after. Add the difference of the two (before minus after) to the parameter set. Note that these are just raw values (no parameter multipliers).
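The bookkeeping for this could look something like the sketch below. It rests on one reading of the note: "before" is taken to be the field value when a parameter set becomes active, and "after" the value when it is swapped out. The `ParameterSetRecord` and `ParameterSetTracker` names are hypothetical, and `evaluate_field` is a stand-in for evaluationField called with no piece and without parameter multipliers.

```python
from typing import Callable, List, Optional, Sequence


class ParameterSetRecord:
    """A parameter set plus the raw field-value differences credited to it."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.deltas: List[float] = []


class ParameterSetTracker:
    """Records (before - after) each time the active parameter set changes."""

    def __init__(self, evaluate_field: Callable[[Sequence[float]], float]) -> None:
        # evaluate_field is assumed to return the raw field value
        # (no piece on the field, no parameter multipliers applied).
        self._evaluate = evaluate_field
        self._active: Optional[ParameterSetRecord] = None
        self._value_before = 0.0

    def switch_to(self, new_set: ParameterSetRecord, field: Sequence[float]) -> None:
        current = self._evaluate(field)
        if self._active is not None:
            # Credit the outgoing set with before - after.
            self._active.deltas.append(self._value_before - current)
        self._active = new_set
        self._value_before = current
```

For example, with a toy evaluation that simply sums the field values:

```python
tracker = ParameterSetTracker(lambda f: float(sum(f)))
first, second = ParameterSetRecord("first"), ParameterSetRecord("second")
tracker.switch_to(first, [1, 2])   # before = 3.0
tracker.switch_to(second, [5])     # after = 5.0, so first gets 3.0 - 5.0
```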