I think I figured out what Bernhard was getting at with the modular rules last meeting. I stated that there is no need to optimise the placement of the modular rules, as they will simply be loaded whenever needed. While this is mostly true, there is a chance that many of the rules are not needed and in fact just slow things down.
So, what needs to be done is to somehow optimise the modular rules, perhaps by fixing them in place when needed. They could be added to the triggered rules for the policy; then, when updating, any modular rule that is triggered every time gets fixed as part of the solution. In this way, modular rules which never fire may be removed from the policy. Then again, this is only an aesthetic thing: the agent doesn’t care if the policy contains 20 useless rules.
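As a rough sketch of that bookkeeping (not the actual agent code): assume each update has a record of which modular rules fired in each episode. Rules that fired every time get fixed, rules that never fired get dropped, and the rest stay as loadable modules. The function name, the rule names and the per-episode sets are all made up for illustration.

```python
from collections import Counter


def update_modular_rules(modular_rules, fired_per_episode):
    """Split modular rules into (fixed, kept, pruned) lists.

    modular_rules: list of rule names (hypothetical identifiers).
    fired_per_episode: list of sets, one per episode, holding the
        names of the rules that fired in that episode.
    """
    known = set(modular_rules)
    counts = Counter()
    for fired in fired_per_episode:
        # Count each rule at most once per episode.
        counts.update(set(fired) & known)

    n = len(fired_per_episode)
    # With no episodes observed, nothing is fixed or pruned.
    fixed = [r for r in modular_rules if n > 0 and counts[r] == n]
    pruned = [r for r in modular_rules if n > 0 and counts[r] == 0]
    kept = [r for r in modular_rules if r not in fixed and r not in pruned]
    return fixed, kept, pruned


# Hypothetical firing record over three episodes.
episodes = [{"steer", "accelerate"}, {"steer"}, {"steer", "accelerate"}]
fixed, kept, pruned = update_modular_rules(
    ["steer", "accelerate", "shift_gear"], episodes
)
print(fixed, kept, pruned)
```

Here "steer" would be fixed into the policy, "accelerate" stays modular, and "shift_gear" is a candidate for removal. Whether "every time" should be a strict test or a threshold is an open choice; the strict version is just the simplest to write down.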
Well, something to keep in mind. To truly allow transfer using modules, I may need to relax the rules somewhat. For instance, a module for driving an automatic car will work OK for driving a manual (in terms of steering and acceleration), but the act of shifting gears with a stick is completely new. Then again, this could just be a new action which the agent learns…
Hmm. I think I need more experiments on complex domains to fully find the answers to these hovering questions.