Progress: Policy Updating Note

Been a little while since the last update. Been sick and lazy. Back to it though. Should be about a week or two before I can finally run some runs and check the results.

Policy updating note
I didn’t know where to put this in the code, so here are some things I noticed while writing JUnit tests.
The variables goalState_ and goalPosition_ in SmartAgent are used in the policy. The policy takes a SubState(goalState_), a Tetromino and a reward. The reward is given when the Tetromino lands where it does.

One problem is that the Tetromino position (its centre) is given relative to the field, where the bottom-left corner is (0,0) and the centre of each square is offset by +0.5. This position needs to be converted so that it is relative to the SubState, which is easy enough given the SubState’s x position. This allows proper policy storage of Tetromino positions. Note that the Tetromino’s height is stored relative to the first column of the SubState.
– A side thought: The Tetromino’s height position need not be stored. When moving a piece to a goal, all that needs to be known is that the rotation and x position are correct. However, it should be stored anyway (relative to the SubState, where 0 height is ), as it can’t be discarded.
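A minimal sketch of the field-to-SubState conversion described above. The class and method names are made up for illustration, and it assumes the SubState's x position is the index of its leftmost column and that the height of its first column is available as an int:

```java
// Hypothetical helper, not part of the actual SmartAgent code.
final class RelativePosition {
    private RelativePosition() {}

    /**
     * Converts a field-relative x (piece centre, so usually n + 0.5)
     * into an x measured from the SubState's leftmost column.
     * Assumption: subStateX is the field column where the SubState starts.
     */
    static double toSubStateX(double fieldX, int subStateX) {
        return fieldX - subStateX;
    }

    /**
     * Converts a field-relative height into a height measured from the
     * top of the SubState's first column.
     * Assumption: firstColumnHeight is that column's height in the field.
     */
    static double toSubStateY(double fieldY, int firstColumnHeight) {
        return fieldY - firstColumnHeight;
    }
}
```

For example, a piece centred at field x = 4.5 over a SubState that starts at column 3 would be stored at x = 1.5.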

Multiple SubState storage. When a piece lands, it may not land in the goal position. Even if it does, there is an opportunity to store the reward in multiple SubStates, since a piece is often encompassed by more than one SubState. This should make learning faster and therefore produce a better agent.
– This royally messes things up. Dunno if it matters though… The problem is this: finding the y-location of the Tetromino, relative to the SubState, once it has landed on the SubState. Finding it rather hard.

UPDATE: Damn server. Having a hissyfit and making me lose some progress. What I remember and what has been done:
Tetrominos do not need a y-coord. The game of Tetris can largely be visualised in 1D if you don’t worry about hitting ‘spires’ and can place all the blocks with ‘DROP’. Will need looking at later. Well, the y-coord is needed in a single case: when doing a contour scan, the agent needs to ignore the falling piece, so the piece’s position must be known in 2D.
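To illustrate the 1D view, here is a rough sketch that reduces a run of column heights to a contour of height differences. This is only an assumption about what a contour could look like, not the actual SubState representation, and the names are hypothetical:

```java
// Hypothetical illustration of a 1D contour; not the real SubState code.
final class ContourSketch {
    private ContourSketch() {}

    /**
     * Returns the height difference between each column and its left
     * neighbour. The heights are assumed to count settled blocks only,
     * so the falling piece is already excluded.
     */
    static int[] contour(int[] columnHeights) {
        if (columnHeights.length == 0) {
            return new int[0];
        }
        int[] diffs = new int[columnHeights.length - 1];
        for (int i = 0; i < diffs.length; i++) {
            diffs[i] = columnHeights[i + 1] - columnHeights[i];
        }
        return diffs;
    }
}
```

Column heights of 3, 3, 5, 4 would give a contour of 0, 2, -1, which carries no absolute y information at all.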

Secondly, there is no longer a goalState_ due to multiple SubState policy updating. It was never really needed except for policy updating. Now, when a piece lands, the agent finds all the SubStates that encompass it and updates them in the policy. The problem is finding which SubStates encompass the piece. Using the piece definition and the xPos of the SubState, this should be achievable, though rotation adds another variable to the already tricky equation. No matter. It shall be completed.
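One possible way to find the encompassing SubStates, sketched under some assumptions: every SubState is a fixed-width window of columns starting at its xPos, and the landed piece is described by its leftmost column and its width in its current rotation. All names and parameters here are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real piece and SubState definitions may differ.
final class EncompassingSubStates {
    private EncompassingSubStates() {}

    /**
     * Returns the xPos of every SubState window that fully contains the piece.
     * Rotation is handled only through pieceWidth, which is assumed to be
     * the piece's width in columns after rotation.
     */
    static List<Integer> encompassing(int pieceLeft, int pieceWidth,
                                      int subStateWidth, int fieldWidth) {
        List<Integer> xPositions = new ArrayList<>();
        int pieceRight = pieceLeft + pieceWidth - 1;
        // Leftmost window that still reaches the piece's right edge.
        int first = Math.max(0, pieceRight - subStateWidth + 1);
        // Rightmost window that still starts at or before the piece's left edge.
        int last = Math.min(pieceLeft, fieldWidth - subStateWidth);
        for (int x = first; x <= last; x++) {
            xPositions.add(x);
        }
        return xPositions;
    }
}
```

With a 10-column field and 4-column SubStates, a 2-wide piece whose left edge is in column 3 is encompassed by the SubStates at xPos 1, 2 and 3, so the reward would be stored three times.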

Due to the above two things, the agent need only pick the x position it is going for and the rotation, using the SubStates as a guide for good positions.

What has been done:
At home – not in the lab. Finished writing all the test cases. They are in no way fully complete. I’d probably need to Jumble them for full completion, but I can only do that once I have code. The basic tests are up to safeguard the code anyway. Almost all of the SmartAgent tests are failing/erroring but everything else passes. Note that I have not written test cases for the agent_blah methods. Everything should be abstracted from them anyway.
