Later today, I added some code for displaying debugging output in the console. Just basic stuff covered by the last post, such as the state of the field and the current piece position as the agent sees it. It revealed some problems:
– One I’m aware of is an I-piece wanting to rotate right away. It only appears to be a problem with the I-piece though, so that makes things easier.
– When the agent chooses greedily, it often messes things up. This is likely because only a few state-tetromino pairings will have rewards associated with them. Eligibility traces should help out here.
– findPiece still messes up and a quick fix didn’t work. I’m gonna have to look closer at this.
– After state culling, often all that is left are special states. These are useless for any piece other than I-pieces, so those other pieces end up placed at random. To fix this, instead of culling states, bias placement towards lower states. This is (theoretically) already done, but obviously not working well. Also bias towards walls. These points were brought up long ago.
So, the new flow of piece placement would be more like this:
– Search for lowest + best looking position
– If there are 2 or more such positions, choose one against a wall over the others. Failing this, choose the wider one to cover more ground.
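
As a rough sketch of that flow (not the agent's actual code), here is how the selection could look in Python. Everything named here is made up for illustration: the Placement fields, touches_wall and choose_placement are all hypothetical. It also assumes "lowest + best looking" means lowest landing height first, then the agent's value estimate, which is only one way to combine the two.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Hypothetical candidate position for the current piece."""
    column: int          # leftmost column the piece would occupy
    landing_height: int  # height of the piece's lowest cell; lower is better
    value: float         # agent's estimate for this position; higher is better
    width: int           # number of columns the piece covers in this rotation


def touches_wall(p: Placement, field_width: int) -> bool:
    """True if the piece sits against the left or right wall."""
    return p.column == 0 or p.column + p.width == field_width


def choose_placement(candidates: list[Placement], field_width: int) -> Placement:
    """Pick the lowest, best-looking position, then break ties by wall, then width."""
    if not candidates:
        raise ValueError("no candidate placements")

    # Step 1: lowest first, then best value (assumed ordering, see above).
    def rank(p: Placement) -> tuple[int, float]:
        return (p.landing_height, -p.value)

    best = min(rank(p) for p in candidates)
    tied = [p for p in candidates if rank(p) == best]

    # Step 2: among ties, prefer a wall position, then the wider one.
    return max(tied, key=lambda p: (touches_wall(p, field_width), p.width))


if __name__ == "__main__":
    options = [
        Placement(column=3, landing_height=0, value=1.0, width=2),
        Placement(column=8, landing_height=0, value=1.0, width=2),
        Placement(column=5, landing_height=2, value=9.0, width=4),
    ]
    print(choose_placement(options, field_width=10))
```

Ranking first and only then applying the wall and width preferences means those two act purely as tie-breakers, so they can never pull a piece into a higher or worse-looking spot.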