Today I addressed the todo of storing a piece's reward using only the minimal state that encompasses the piece, that is, the first todo paragraph of the previous entry. When a piece falls and lands, only the contours on which the piece rests are sent to updatePolicy, which updates that state and all of its superstates.
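As a rough illustration of the idea, here is a minimal sketch in Python (not the project's actual code). It assumes the field is a list of column heights, that a state is the tuple of height differences across a run of adjacent columns, and that a superstate is any wider window containing the piece's columns. The names minimal_state, superstates and update_policy, the max_width limit and the additive update are all hypothetical stand-ins for whatever updatePolicy really does.

```python
def minimal_state(heights, left, width):
    """Contour (height differences) across columns left .. left+width-1."""
    cols = heights[left:left + width]
    return tuple(b - a for a, b in zip(cols, cols[1:]))


def superstates(heights, left, width, max_width):
    """Yield the minimal state and every wider window that contains it.

    Assumption: a superstate is any contiguous window of up to max_width
    columns that fully covers the columns the piece rests on.
    """
    n = len(heights)
    for w in range(width, max_width + 1):
        first = max(0, left + width - w)   # window must reach the piece's last column
        last = min(left, n - w)            # window must start at or before the piece
        for start in range(first, last + 1):
            yield minimal_state(heights, start, w)


def update_policy(values, heights, left, width, reward, max_width=4):
    """Hypothetical update: add the reward to the minimal state and its superstates."""
    for state in superstates(heights, left, width, max_width):
        values[state] = values.get(state, 0.0) + reward
    return values


# A two-wide piece landing on columns 1 and 2 of a five-column field.
values = update_policy({}, [0, 2, 2, 1, 3], left=1, width=2, reward=1.0, max_width=3)
```

With max_width=3, the flat contour under the piece and the two three-column windows around it each receive the reward; nothing else in the field is touched.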
Dealing with the vertical I-piece problem led me to use special storage for when it is in a special state and ‘paired’ storage everywhere else. That is, the I-piece has a size of 1, so it can sit anywhere, but the update is given to a state of size 2 consisting of the piece's column and the highest neighbouring column. This makes the agent want to put I-pieces in low points as much as possible when no special state is present. If the neighbouring columns are of equal height, it chooses the one to the left, I believe. It shouldn’t really matter too much.
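The pairing rule can be sketched like this in Python. It is only an illustration under assumptions: heights are the column heights before the I-piece lands, the paired state is the height difference across the two columns read left to right, and ties go to the left neighbour (which I have not double-checked in the real code). The name paired_state is hypothetical.

```python
def paired_state(heights, col):
    """Return (start_column, contour) for a vertical I-piece dropped in col.

    The piece's column is paired with its highest neighbouring column.
    Assumption: on equal heights the left neighbour is chosen.
    """
    if len(heights) < 2:
        raise ValueError("need at least two columns to form a pair")
    has_left = col > 0
    has_right = col < len(heights) - 1
    if has_left and has_right:
        # Strictly greater, so a tie falls through to the left neighbour.
        neighbour = col + 1 if heights[col + 1] > heights[col - 1] else col - 1
    elif has_left:
        neighbour = col - 1
    else:
        neighbour = col + 1
    lo, hi = sorted((col, neighbour))
    return lo, (heights[hi] - heights[lo],)
```

For example, paired_state([3, 0, 5], 1) pairs the well with the taller right-hand column, so a deep drop gets recorded against a large height difference, which is what pushes the agent towards low points.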
Due to this update, paved states had to be taken away: finding where the piece lands uses the raw contour data rather than the list of substates, so it ignores paved states. This could be fixed easily enough, but their removal lets me address, at least in part, the paved problem that I have envisioned. There is still the problem of making special states ‘bigger’ to other pieces.
I have decided to pass on the lab test as I believe it will net nothing.
Tests passing: 21/22 (testUpdateIndividualPolicy hasn’t been implemented).