Finished running a clear(a) experiment. The final policy came out as the single rule above(X,a) -> moveFloor(X). This will work sometimes, but not optimally, presumably because nothing in the rule requires X to be clear. I suppose I had better check that the above predicate is functioning as it should.
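A minimal sketch of how the above predicate could be sanity-checked outside the learner. Everything here is an assumption for illustration, not the actual implementation: the state is a hypothetical dict mapping each block to whatever it rests on, and the `above` and `clear` functions are hypothetical stand-ins for the real predicates.

```python
def above(state, x, y):
    """True if block x sits somewhere above block y in the same stack."""
    # Hypothetical state encoding: block -> what it rests on ("floor" or a block).
    below = state[x]
    while below != "floor":
        if below == y:
            return True
        below = state[below]
    return False


def clear(state, x):
    """True if no block rests directly on x."""
    return all(support != x for support in state.values())


# Stack c on b on a, plus a lone block d on the floor.
state = {"a": "floor", "b": "a", "c": "b", "d": "floor"}

# Blocks the rule above(X,a) -> moveFloor(X) would match.
candidates = [x for x in state if above(state, x, "a")]
# Of those, only the clear ones can actually be moved.
movable = [x for x in candidates if clear(state, x)]

print(candidates)  # ['b', 'c']
print(movable)     # ['c']
```

In this toy state the rule matches both b and c, but only c is clear, which would explain why the rule works only sometimes.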
Anyway, it appears that rule regeneration is on, but I should check that too. In any case, the accuracy-based method Bernhard and I talked about (which is similar to FOXCS) needs to be implemented. Hopefully it will use guided mutation to find a more specific version of that rule, one in which X is also clear. I also need to implement the reward scheme, where an optimal play strategy receives 0 but a suboptimal one receives -X. This in turn requires me to implement human-readable rule bases.
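A rough sketch of the pieces mentioned above, under loudly stated assumptions. The `Rule` class, its `specialise` method and the `episode_reward` function are hypothetical names, not existing code. The sketch shows only a single specialisation step (adding one condition), not a full guided-mutation search, and it assumes that the penalty in -X means the number of steps taken beyond the optimal count, which the notes do not actually specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    conditions: tuple  # e.g. ("above(X,a)",)
    action: str        # e.g. "moveFloor(X)"

    def specialise(self, extra_condition):
        """Return a more specific rule with one extra condition added."""
        if extra_condition in self.conditions:
            return self
        return Rule(self.conditions + (extra_condition,), self.action)

    def __str__(self):
        # Human-readable form: cond1 & cond2 -> action
        return " & ".join(self.conditions) + " -> " + self.action


def episode_reward(steps_taken, optimal_steps):
    """0 for optimal play, otherwise minus the wasted steps (assumed meaning of -X)."""
    return -max(0, steps_taken - optimal_steps)


general = Rule(("above(X,a)",), "moveFloor(X)")
specific = general.specialise("clear(X)")

print(specific)                                    # above(X,a) & clear(X) -> moveFloor(X)
print(episode_reward(2, 2), episode_reward(5, 2))  # 0 -3
```

Keeping the rule as plain condition strings makes the printed rule base readable at a glance, though a real implementation would likely need structured terms for matching.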