Another idea struck me while reading.
When determining subgoals in a problem (assuming the agent uses subgoals), they could be states that allow access to the agent's 'other' functions. For example, say the main goal was to move to the table and pick up a cup. A subgoal would be to first move to the table. Once the agent is within reach of the cup, it has access to its 'pickup' action.
These extra features could also be agents in themselves. An 'agent' would consist of a manager, which oversees the large tasks, and many 'workers', each specialised in its own domain. Each worker would gather experience as it performs its task. So a robot might become quite good at walking but not so good at talking, if it was trained on problems that required silent travel.
The manager would apportion reward among the workers and would also oversee when to activate a worker. In the cup problem above, it would first activate the 'walker worker'; then, when it senses that the 'pickup thing worker' can be activated, it activates that, recording how well each performs so it can apportion reward later.
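A minimal sketch of this manager/worker idea, to make the activation and reward-apportionment loop concrete. All names here (Worker, Manager, applicable, apportion) are hypothetical, invented for illustration, and the equal-split credit scheme is a deliberately naive placeholder:

```python
class Worker:
    """A specialised sub-agent that handles one kind of task."""

    def __init__(self, name, applicable):
        self.name = name
        self.applicable = applicable  # predicate: can this worker act in this state?
        self.reward = 0.0             # running credit apportioned by the manager

    def act(self, state):
        # Placeholder policy: a real worker would learn from experience here.
        return f"{self.name} acting in {state}"


class Manager:
    """Oversees the large task: chooses which worker to activate and
    records activations so reward can be apportioned afterwards."""

    def __init__(self, workers):
        self.workers = workers
        self.log = []  # names of activated workers, for later credit assignment

    def step(self, state):
        # Activate the first worker whose precondition holds in this state.
        for w in self.workers:
            if w.applicable(state):
                self.log.append(w.name)
                return w.act(state)
        return None

    def apportion(self, total_reward):
        # Naive scheme: split the episode's reward equally across
        # activations. A learned scheme would weight each worker by
        # its measured contribution.
        if not self.log:
            return
        share = total_reward / len(self.log)
        by_name = {w.name: w for w in self.workers}
        for name in self.log:
            by_name[name].reward += share


# The cup example: walk until within reach, then pick up.
walker = Worker("walker", applicable=lambda s: s["distance"] > 0)
picker = Worker("picker", applicable=lambda s: s["distance"] == 0)
manager = Manager([walker, picker])

for state in [{"distance": 2}, {"distance": 1}, {"distance": 0}]:
    manager.step(state)
manager.apportion(3.0)  # 3 activations: walker credited twice, picker once
```

After the episode, the walker holds two shares of reward and the picker one, which is the record the manager would use to decide which workers are pulling their weight.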