PhD Progress: Statistical Pre-Goal Watching

One thing that may help the rule creation process, and should be relatively easy to implement, is a form of state observation that takes note of the before and after states for each action within a good solution to a problem. This includes the optimal agent’s behaviour too. However, for this to be effective, the unification algorithm must be able to deal with probabilities, because a given relation may not be present in every observed state.
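As a rough sketch of what probabilistic unification could look like, the snippet below counts how often each relation appears in the states seen just before a given action and turns the counts into frequencies. The function name `unify_pre_states`, the tuple encoding of relations, and the example states are all assumptions made for illustration; none of them come from an existing implementation.

```python
from collections import Counter


def unify_pre_states(pre_states):
    """Map each relation to the fraction of observed pre-states containing it.

    pre_states is a list of sets of relation tuples (a hypothetical encoding).
    """
    if not pre_states:
        return {}
    counts = Counter()
    for state in pre_states:
        # set() guards against a relation being counted twice in one state.
        counts.update(set(state))
    total = len(pre_states)
    return {relation: n / total for relation, n in counts.items()}


# Hypothetical states observed just before the agent moved a onto b.
observed = [
    {("clear", "a"), ("clear", "b"), ("on", "c", "e")},
    {("clear", "a"), ("clear", "b")},
    {("clear", "a"), ("clear", "b"), ("on", "a", "d")},
    {("clear", "a"), ("clear", "b")},
]

prob_state = unify_pre_states(observed)
```

In this made-up data, the relations present every time end up with probability 1.0, while incidental ones such as (on c e) are kept with a lower probability rather than being dropped.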

These probabilities could also drive rule mutation: a set of mutations can be created simply by sampling a probabilistic state.
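One possible reading of sampling a probabilistic state is to include each relation independently with its own probability, so that every sample is a candidate set of rule conditions. The sketch below assumes that reading; `sample_mutations`, the independence assumption, and the example probabilities are hypothetical and only meant to show the idea.

```python
import random


def sample_mutations(prob_state, n_samples, seed=None):
    """Draw candidate rule conditions from a probabilistic state.

    Each relation is included independently with its own probability
    (an assumption; relations could well be correlated in practice).
    Duplicate samples collapse, so at most n_samples sets are returned.
    """
    rng = random.Random(seed)
    mutations = set()
    for _ in range(n_samples):
        conditions = frozenset(
            relation for relation, p in prob_state.items() if rng.random() < p
        )
        mutations.add(conditions)
    return mutations


# Hypothetical probabilistic state for a "move a onto b" action.
example_state = {
    ("clear", "a"): 1.0,
    ("clear", "b"): 1.0,
    ("on", "c", "e"): 0.25,
    ("on", "a", "d"): 0.25,
}

mutations = sample_mutations(example_state, n_samples=20, seed=0)
```

Relations with probability 1.0 appear in every sampled mutation, while low-probability relations appear only occasionally, which gives a cheap source of rule variants.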

Of course, it wouldn’t be this simple… For the rule mutations in the simple Blocks World domain, the rules created are useful because they pertain to the goal. If an algorithm simply took note of which relations were present each time the agent chose to move X to Y, a lot of junk could be included. Another problem is the level at which the unification takes place. If relations are counted probabilistically, does that mean that relations will never be removed from a unified state, only reduced in probability? For the moveAB goal, would the unified state still contain junk like (on c e), albeit with a small probability?

While there are potential problems with this approach, it may warrant closer inspection.