Yesterday I left an agent running on the default settings (20 trials, 0.99 cooling rate) to try to find the ideal agent for playing standard Tetris (parameter 0). Here are my results:

`Using best parameter: Mults: {1, 1, 2, 38, 8}, worth 0.53121482649264, over 1091020 steps.`

Policy:

Mults: {1, 1, 2, 38, 8}, worth 0.53121482649264, over 1091020 steps.

Mults: {1, 1, 2, 38, 5}, worth 0.5309739319333816, over 55240 steps.

Mults: {1, 1, 2, 38, 12}, worth 0.5307251908396946, over 5240 steps.

Mults: {1, 2, 2, 38, 16}, worth 0.53, over 6300 steps.

Mults: {1, 1, 2, 38, 2}, worth 0.53, over 100 steps.

Mults: {1, 1, 2, 38, 4}, worth 0.5295063145809414, over 17420 steps.

Mults: {1, 2, 2, 38, 8}, worth 0.5290322580645161, over 620 steps.

Mults: {1, 1, 3, 57, 8}, worth 0.5287671232876713, over 1460 steps.

Mults: {1, 1, 2, 38, 3}, worth 0.5287234042553192, over 940 steps.

Mults: {2, 1, 2, 38, 8}, worth 0.528683241252302, over 21720 steps.

Mults: {1, 1, 2, 76, 8}, worth 0.5283422459893048, over 3740 steps.

Mults: {1, 1, 2, 38, 16}, worth 0.5282101167315175, over 10280 steps.

Mults: {1, 1, 2, 36, 8}, worth 0.5279202961672473, over 459200 steps.

Mults: {2, 1, 4, 76, 8}, worth 0.5277227722772277, over 2020 steps.

Mults: {1, 1, 2, 38, 6}, worth 0.5273214285714286, over 5600 steps.

Mults: {1, 1, 2, 37, 8}, worth 0.5272357723577236, over 4920 steps.

Mults: {1, 2, 4, 76, 8}, worth 0.52625, over 800 steps.

Mults: {1, 2, 4, 76, 16}, worth 0.525, over 160 steps.

Mults: {1, 1, 1, 38, 4}, worth 0.525, over 40 steps.

Mults: {2, 1, 4, 76, 16}, worth 0.4166666666666667, over 60 steps.

As you can see, the best parameter sets are all fairly similar, following a {1, 1, 2, 38, 8} sort of pattern. Taking each multiplier as a share of their sum (50), avoiding holes is by far the biggest focus in standard Tetris (76%), and making lines is the next (16%). Height comes after that (4%), while bumpiness and chasms are only small goals (2% each). Note that these aren't linear multipliers, so the percentages don't translate directly into influence: height, lines and chasms are O(x^2), while the others are O(x).
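For anyone who wants to check the percentages, here is a minimal Python sketch of the arithmetic. The feature order in `FEATURES` is my assumption, inferred from the percentages quoted above (the first two slots, bumpiness and chasms, could be the other way round). `shares` and `weighted_terms` are hypothetical helper names, not functions from the agent itself, and the sketch only shows the size of each term, not whether it counts as a reward or a penalty.

```python
# Assumed feature order, inferred from the quoted percentages; the agent's
# real ordering may differ (bumpiness and chasms could be swapped).
FEATURES = ("bumpiness", "chasms", "height", "holes", "lines")

# Features described in the post as O(x^2); the rest are treated as O(x).
QUADRATIC = {"height", "lines", "chasms"}


def shares(mults):
    """Return each multiplier as a percentage of the sum of all multipliers."""
    total = sum(mults)
    return {name: 100.0 * m / total for name, m in zip(FEATURES, mults)}


def weighted_terms(mults, values):
    """Return the magnitude of each term: mult * x, or mult * x**2 if quadratic."""
    return {
        name: m * (x ** 2 if name in QUADRATIC else x)
        for name, m, x in zip(FEATURES, mults, values)
    }


best = (1, 1, 2, 38, 8)
print(shares(best))

# Made-up feature values for a single board, purely for illustration.
print(weighted_terms(best, (3, 1, 4, 2, 1)))
```

With the made-up board in the second call, height contributes 2 * 4^2 = 32 even though its multiplier is only 4% of the total, which is why the percentages shouldn't be read as straight measures of influence.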