PhD Progress: Valid Actions Problem

Like the FOXCS paper, this algorithm will use an observation that contains the valid actions for the current state. This is fine in Blocks World, where the set of valid actions is small, but in larger worlds like StarCraft or PacMan it may be problematic. The algorithm would be required to compute every possible action for each state, and in the StarCraft domain that number is practically unbounded. Perhaps the action set needs to be bounded somehow.
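To make the problem concrete, here is a rough Python sketch of what enumerating valid actions looks like in Blocks World, plus one naive way of bounding the set. None of this is from FOXCS or from my actual code: the state encoding (a dict mapping each block to what it sits on), the `FLOOR` marker, and the function names `clear_blocks`, `valid_actions` and `bounded_valid_actions` are all assumptions made up for illustration.

```python
from itertools import islice

FLOOR = "floor"  # hypothetical marker for the table surface


def clear_blocks(state):
    """Return the blocks that have nothing stacked on top of them."""
    covered = set(state.values())
    return [block for block in state if block not in covered]


def valid_actions(state):
    """Lazily yield every legal ("move", block, destination) action."""
    clear = clear_blocks(state)
    for block in clear:
        for dest in clear + [FLOOR]:
            # Skip moving a block onto itself or onto where it already sits.
            if dest != block and state[block] != dest:
                yield ("move", block, dest)


def bounded_valid_actions(state, limit):
    """Return at most `limit` actions without enumerating the rest."""
    return list(islice(valid_actions(state), limit))


# a sits on b; b and c sit on the floor.
state = {"a": "b", "b": FLOOR, "c": FLOOR}
all_actions = list(valid_actions(state))
capped = bounded_valid_actions(state, 2)
```

With n clear blocks the full enumeration is on the order of n squared moves, which is manageable here but hints at why a domain with many units and many action types gets out of hand. Taking the first `limit` actions is only a placeholder bound: it is biased by enumeration order, so a real solution would presumably need some notion of which actions are worth offering.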

I feel like I’ve touched on this problem before, but I cannot recall where or when. Ergh, it’s too hot for thinking today…