Blog – Page 65 – Sam Sarjant

11.01.08 / 26.09.11

Progress: Line Completion Representation

Rather than storing within every SubState object whether a column will finish a line, why not store a single line-completion boolean array for the entire field? This reduces the line-completion data from seven 4-element arrays to one 10-element array.
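As a rough sketch of what a field-wide line-completion array could look like (Python is used here only for illustration): the function name, the list-of-rows field layout with row 0 at the top, and the rule "a column finishes a line if filling the cell just above its stack leaves no gaps in that row" are all my assumptions, not necessarily what the project ends up using.

```python
def line_completion(field):
    """Return one boolean per column for the whole field.

    Assumptions (hypothetical, for illustration only):
    - field is a list of rows, row 0 at the top, cells are 0 (empty) or 1 (filled)
    - a column "finishes a line" if filling the cell directly above its
      stack would leave no empty cell in that row
    """
    height = len(field)
    width = len(field[0])
    flags = [False] * width
    for c in range(width):
        # Row index of the topmost filled cell in this column (height if empty).
        top = next((r for r in range(height) if field[r][c]), height)
        landing = top - 1  # the cell a block would occupy on top of the stack
        if landing < 0:
            continue  # column is full to the ceiling
        row = field[landing]
        flags[c] = all(row[k] for k in range(width) if k != c)
    return flags


field = [
    [0, 0, 0],
    [1, 0, 1],
    [1, 1, 1],
]
print(line_completion(field))
```

With one array like this per field, each substate can look up its own columns by index instead of carrying a private copy.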

A Problem
After analysing the observation numbers, I have figured out what everything means:
The first 250 numbers (well, the first X * Y numbers) represent the field, including the current falling piece.
The next 7 numbers represent what piece is falling at the moment.
The last 2 represent the Y and X size of the field respectively.
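The layout above could be unpacked along these lines. This is only a sketch: the function name is made up, and I am assuming the field cells are stored row by row and that the seven piece numbers form a one-hot flag vector, neither of which has been confirmed yet.

```python
def parse_observation(obs):
    """Split a flat observation list into field, piece index and field size.

    Assumptions (not confirmed): cells are stored row by row, and exactly
    one of the 7 piece numbers is set to 1 to mark the falling piece.
    """
    height, width = obs[-2], obs[-1]          # last two numbers: Y size, X size
    cell_count = width * height
    cells = obs[:cell_count]                  # field, including the falling piece
    piece_flags = obs[cell_count:cell_count + 7]
    field = [cells[r * width:(r + 1) * width] for r in range(height)]
    piece = piece_flags.index(1) if 1 in piece_flags else None
    return field, piece, (height, width)


# Tiny 2 x 3 example rather than the real field.
obs = [0, 0, 0, 1, 1, 0] + [0, 0, 1, 0, 0, 0, 0] + [2, 3]
field, piece, size = parse_observation(obs)
print(field, piece, size)
```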

As for the actions:
0: Move Left
1: Move Right
2: Rotate Left
3: Rotate Right
4: Idle
5: Drop
These don’t really need to be known, but could come in handy for debugging purposes.
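For debugging, the numbers could be given names with something like the following. The enum and its member names are my own labels for the actions listed above, not identifiers from the competition software.

```python
from enum import IntEnum


class TetrisAction(IntEnum):
    """Hypothetical names for the action numbers listed above."""
    MOVE_LEFT = 0
    MOVE_RIGHT = 1
    ROTATE_LEFT = 2
    ROTATE_RIGHT = 3
    IDLE = 4
    DROP = 5


def describe(action_number):
    """Return a readable name for an action number, for log output."""
    return TetrisAction(action_number).name


print(describe(5))
```

Because it is an IntEnum, the members still compare equal to the plain integers, so they can be used wherever the raw action number is expected.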

The problem I’m talking about is the fact that the falling piece is shown as part of the Tetris field in the observation. If I could separate it out, the original strategy would work (well, theoretically). Two things should make eliminating the falling piece easier:
a) I know what the piece is and its bit vector pattern.
b) In Tetris, every block is connected either to the floor, or another piece. So there can’t be any ‘floating’ pieces.
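Point b) suggests one possible way of separating the piece out: keep only the filled cells that connect to the floor. The sketch below does that with a flood fill. It assumes a list-of-rows field with the last row as the floor, the function name is hypothetical, and it will not work once the falling piece is touching the stack, so it is a starting point rather than a full solution.

```python
from collections import deque


def strip_falling_piece(field):
    """Return a copy of field keeping only cells connected to the floor.

    Assumes field is a list of rows with the last row as the floor.
    Any filled cell not reachable from the floor through neighbouring
    filled cells is treated as part of the falling piece and dropped.
    """
    height = len(field)
    width = len(field[0])
    settled = [[0] * width for _ in range(height)]
    # Seed the search with every filled cell on the floor row.
    queue = deque((height - 1, c) for c in range(width) if field[height - 1][c])
    for r, c in queue:
        settled[r][c] = 1
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < height and 0 <= nc < width
                    and field[nr][nc] and not settled[nr][nc]):
                settled[nr][nc] = 1
                queue.append((nr, nc))
    return settled


field = [
    [0, 1, 1, 0],   # floating cells: the falling piece
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 1],
]
for row in strip_falling_piece(field):
    print(row)
```

Point a) could then be used as a cross-check, by comparing the removed cells against the known bit pattern of the current piece.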

I’ll need to suss out with the RL guys whether the action numbers stay consistent and how the pieces behave.

What was accomplished today:

Created an accurate (I think) contour mapping of the state.
Found a problem with the falling piece being part of the observation and messing up the contour mapping.
As seen above, made line completion independent of the substates, leading to fewer states.
Parsed the Observation array on paper.

10.01.08 / 03.07.12

Progress: Different Visualisation of Sub States

While working away at the project today, I noticed something. The visual representation I gave in Progress: Better State Definition isn’t exactly correct. The contours can be as much as 6 units high (9 if using {-3,…,3}). They should look more like this:

IMAGE LOST

Edit: This picture shows an un-truncated version of the contours. This is what will be obtained for the whole field map and then truncated when it is split into substates.
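To make the whole-field-then-truncate idea concrete, here is a minimal sketch. It assumes the contour is the list of height differences between adjacent columns, that substates are 4 columns wide, and that differences are clamped to {-2,…,2} when a substate is cut out. The function names are invented for the example, and the field is a list of rows with row 0 at the top.

```python
def column_heights(field):
    """Height of the stack in each column (0 for an empty column)."""
    height = len(field)
    width = len(field[0])
    heights = []
    for c in range(width):
        # First filled cell from the top decides the column height.
        top = next((r for r in range(height) if field[r][c]), height)
        heights.append(height - top)
    return heights


def contour(heights):
    """Un-truncated contour: height difference between adjacent columns."""
    return [b - a for a, b in zip(heights, heights[1:])]


def substates(diffs, width=4, limit=2):
    """Cut the contour into overlapping substates and clamp each difference.

    A substate of `width` columns covers width - 1 differences.
    """
    span = width - 1
    return [
        tuple(max(-limit, min(limit, d)) for d in diffs[i:i + span])
        for i in range(len(diffs) - span + 1)
    ]


heights = [0, 3, 3, 1]
print(contour(heights))             # un-truncated differences
print(substates(contour(heights)))  # clamped to {-2,...,2}
```

On a 10-column field this gives 9 differences and 7 overlapping substates. With three differences of at most 2 each, a substate can span up to 6 units, or 9 with a limit of 3.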

Because of this, I realised that the “last hole in the line” check will need to be modified to fit the column, not the row.

What was accomplished today:

Started (and hopefully finished) the state splitting method, complete with substate height culling. Untested (needs to be remedied).
Gained a further understanding of how the process works. I had been overlooking the parameter objects passed into agent_start and agent_step, which make things much easier to look at.

10.01.08 / 26.09.11

Handy Links

Handy links for reference.

RL Competition site
RL-Glue main page
RL-Glue lecture slides
Reinforcement Learning: An Introduction
A Final Report on Reinforcement Learning
RL Reduced Tetris
Tetris AI Computer plays Tetris
Applying Reinforcement Learning to Tetris
Tetris is Hard, Even to Approximate