Progress: Panic Mode Implemented

Finished implementing panic mode today, or rather, fixed a bug or two that had stopped it from working. I’m currently running the console trainer to test how play with panic mode compares against ordinary play.

The default panic mode puts some emphasis on keeping the height down and a much larger emphasis on making lines. These weights may mutate as the run progresses, but I am unsure whether they will.
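To make the idea of swapping multipliers concrete, here is a minimal Python sketch. It is not the trainer's actual code: the weighted-sum scoring, the height-based trigger, the `panic_height` threshold and the meaning of each feature slot are all assumptions made for illustration. Only the two multiplier sets are taken from this post (the default panic parameter and the top regular policy from the first run below).

```python
# Multiplier sets quoted in this post.
DEFAULT_PANIC_MULTS = (1, 1, 32, 1, 256)
REGULAR_MULTS = (1, 2, 4, 171, 24)  # top regular policy from the first run


def placement_score(features, mults):
    """Weighted sum of feature values (assumed: higher is better)."""
    if len(features) != len(mults):
        raise ValueError("features and mults must have the same length")
    return sum(m * f for m, f in zip(mults, features))


def active_mults(stack_height, panic_height, regular, panic):
    """Assumed trigger: use the panic set once the stack is high enough."""
    return panic if stack_height >= panic_height else regular


def best_placement(candidates, mults):
    """Pick the candidate placement with the highest weighted score."""
    return max(candidates, key=lambda feats: placement_score(feats, mults))


# Two made-up candidate placements. The feature order is hypothetical;
# the last slot stands in for "lines made".
tidy_move = (0, 0, 0, 1, 0)   # scores only on the fourth feature
line_clear = (0, 0, 0, 0, 1)  # scores only on the line feature

calm = active_mults(stack_height=4, panic_height=15,
                    regular=REGULAR_MULTS, panic=DEFAULT_PANIC_MULTS)
panicking = active_mults(stack_height=17, panic_height=15,
                         regular=REGULAR_MULTS, panic=DEFAULT_PANIC_MULTS)

print(best_placement([tidy_move, line_clear], calm))
print(best_placement([tidy_move, line_clear], panicking))
```

Under these assumptions the same two candidates are ranked differently depending on which multiplier set is active, which is the behaviour panic mode is meant to produce.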

A small note: the first parameter wasn’t being evaluated on field performance because it starts in the best possible scenario. It is possible to return to that original scenario, but not probable. I changed this so that it gets a fair go at becoming a leading parameter.

Run complete. Play doesn’t appear to be all that different from the standard mode, but I am running another now with a slower cooling rate.
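For readers unfamiliar with the term, here is a rough Python sketch of what a slower cooling rate does. The trainer's real schedule isn't described in this post, so the geometric decay below, along with the names `temperature`, `steps_until`, `start` and `cooling_rate`, is an assumption for illustration only.

```python
import math


def temperature(step, start=1.0, cooling_rate=0.99):
    """Assumed geometric schedule: multiply by cooling_rate every step."""
    return start * cooling_rate ** step


def steps_until(threshold, start, cooling_rate):
    """Number of steps before the temperature falls to the threshold."""
    return math.ceil(math.log(threshold / start) / math.log(cooling_rate))


# A rate closer to 1 cools more slowly, so exploration lasts longer.
fast = steps_until(0.01, start=1.0, cooling_rate=0.99)
slow = steps_until(0.01, start=1.0, cooling_rate=0.999)
print(fast, slow)
```

In a schedule like this, slower cooling gives mutated parameters more steps in which to be tried before the search settles.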

Here are the results of the initial run (default panic parameter {1, 1, 32, 1, 256}):
Top three regular policies:
Mults: {1, 2, 4, 171, 24}, worth 0.48651257308343354, over 23622 steps.
Mults: {1, 1, 2, 344, 20}, worth 0.48595252779913894, over 22666 steps.
Mults: {1, 1, 2, 344, 28}, worth 0.48584568835196357, over 25766 steps.
Top three emergency policies:
Mults: {1, 4, 128, 4, 1024}, worth 0.7585903930396557, over 15820 steps.
Mults: {1, 8, 128, 4, 1024}, worth 0.6960829504452268, over 217 steps.
Mults: {1, 4, 128, 2, 1024}, worth 0.5, over 20 steps.
Total lines: 65886
Total reward: 89303.0
totalSteps is: 892301

Here’s the other run with a slower cooling rate. It looks like another fluke run, so I’m going to run it again.
Top three regular policies:
Mults: {1, 1, 2, 16, 1}, worth 0.5185111528520325, over 44249 steps.
Mults: {1, 1, 16, 400, 18}, worth 0.5176797114646612, over 7206 steps.
Mults: {1, 1, 6, 314, 13}, worth 0.5174437300796199, over 1244 steps.
Top three emergency policies:
Mults: {2, 1, 270, 12, 2160}, worth 0.7890020380008522, over 3437 steps.
Mults: {1, 1, 90, 5, 1440}, worth 0.7483031673028189, over 442 steps.
Mults: {1, 1, 90, 5, 720}, worth 0.7467117998900336, over 5170 steps.
Total lines: 86861
Total reward: 114535.0
totalSteps is: 1165914

Here’s the next run (same parameters):
Top three regular policies:
Mults: {32827, 1, 5437, 361920, 108}, worth 0.5577929466199957, over 879 steps.
Mults: {32827, 1, 16, 361920, 108}, worth 0.55, over 20 steps.
Mults: {128, 1, 16, 361920, 108}, worth 0.5462591406413051, over 11214 steps.
Top three emergency policies:
Mults: {2, 1, 64, 12, 512}, worth 0.6752100842261389, over 476 steps.
Mults: {1, 1, 32, 7, 256}, worth 0.6743177562743147, over 2675 steps.
Mults: {1, 1, 32, 6, 256}, worth 0.6741266059529322, over 3349 steps.
Total lines: 63252
Total reward: 84802.0
totalSteps is: 853749

Judging from these runs, panic mode makes no noticeable difference to the play style.
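One way to put numbers on that impression is to normalise each run's totals by its step count, since the runs differ in length. The Python snippet below does only that arithmetic, using the totals printed above; the run labels are mine. Note that it compares the three panic-enabled runs with each other, because no standard-mode totals are listed in this post.

```python
# Totals copied from the three runs above.
runs = {
    "initial": {"lines": 65886, "reward": 89303.0, "steps": 892301},
    "slower cooling": {"lines": 86861, "reward": 114535.0, "steps": 1165914},
    "slower cooling, rerun": {"lines": 63252, "reward": 84802.0, "steps": 853749},
}


def per_step(run):
    """Return (lines per step, reward per step) for one run."""
    return run["lines"] / run["steps"], run["reward"] / run["steps"]


for name, run in runs.items():
    lines_rate, reward_rate = per_step(run)
    print(f"{name}: {lines_rate:.4f} lines/step, {reward_rate:.4f} reward/step")
```

All three runs come out at roughly 0.074 lines per step and roughly 0.10 reward per step, so the per-step results barely move between runs.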
