I’m still not happy with how Ms. PacMan chooses an action, so I think I’ll try another new selection system. This one is more balanced, so I don’t have to worry about alternative weights screwing the system up. The alternative weight I mention belongs to the toJunction action, which moves to the safest junction. Unlike every other action, it doesn’t use distance, so its action weight has to depend on the safety of the junction. Problem is, I have to find the right coefficient to balance it against the other action weights.
Anyway, noting down my idea for the new system:
– For every action, normalise the weights after adding all actions at that level. For instance, the fromPowerDot action might end up with 0.003 LEFT, 0.001 RIGHT, 0.002 UP, which is pointless if the next action has weights like 0.4, 0.6, etc. While the tiny weights do take distance into account, they are a pain to balance against larger ones. Hmm, the new system may have trouble there, since normalising discards that distance information… Well, I’ll find out when I get there. Anyway, these weights become normalised such that they add to 1: 0.5 LEFT, 0.17 RIGHT, 0.33 UP.
– Continue to use prioritised ranking of actions, so that the first action in the policy has a coefficient of 1, while following actions have decreasing weights, based on their index.
It may be a simple system, but it should alleviate any problems I have with action weighting. I can always expand it if necessary.
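Here is a minimal Python sketch of the two rules above, written for illustration and not taken from the agent's actual code. The names `normalise`, `priority_coefficient` and `combine` are hypothetical. The 1/(index + 1) falloff is an assumption, since the notes only say the coefficients decrease with index. The second action's weights are made-up values in the spirit of the "0.4, 0.6" example.

```python
from typing import Dict, List

Weights = Dict[str, float]  # direction name -> weight


def normalise(weights: Weights) -> Weights:
    """Scale one action's direction weights so they sum to 1."""
    total = sum(weights.values())
    if total == 0:
        return dict(weights)  # all-zero weights: nothing to scale
    return {direction: w / total for direction, w in weights.items()}


def priority_coefficient(index: int) -> float:
    """ASSUMPTION: 1 for the first action, then 1/2, 1/3, ..."""
    return 1.0 / (index + 1)


def combine(policy: List[Weights]) -> Weights:
    """Sum the normalised, priority-scaled weights of each action in policy order."""
    totals: Weights = {}
    for index, weights in enumerate(policy):
        coeff = priority_coefficient(index)
        for direction, w in normalise(weights).items():
            totals[direction] = totals.get(direction, 0.0) + coeff * w
    return totals


# fromPowerDot weights from the example above; the second action is hypothetical.
from_power_dot = {"LEFT": 0.003, "RIGHT": 0.001, "UP": 0.002}
next_action = {"LEFT": 0.4, "RIGHT": 0.6}

combined = combine([from_power_dot, next_action])
best = max(combined, key=combined.get)  # direction with the largest combined weight
```

With this falloff, the first action's normalised weights count in full and the second action's count for half, so the tiny fromPowerDot values are no longer swamped by the larger ones.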
EDIT: Implemented it, and it seems to be a crap system. But perhaps it is no worse than the current one. Adding in weights based on closest distance might make it better. Of course, this simply creates the junction problem again…
The proposed addition simply multiplies all normalised values by 1/closest distance. So if a dot is 1 unit away (disregarding other dots), the coefficient is 1; at 4 units it would be 0.25. This keeps the urgency of closeness.
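As a rough Python sketch of that scaling step (again illustrative, not the agent's real code): `scale_by_closeness` is a hypothetical name, and clamping distances below 1 is my own assumption to avoid dividing by zero, which the notes do not cover.

```python
from typing import Dict


def scale_by_closeness(normalised: Dict[str, float],
                       closest_distance: float) -> Dict[str, float]:
    """Multiply every normalised weight by 1 / closest distance."""
    if closest_distance < 1:
        closest_distance = 1.0  # ASSUMPTION: clamp so the coefficient never exceeds 1
    factor = 1.0 / closest_distance
    return {direction: w * factor for direction, w in normalised.items()}


weights = {"LEFT": 0.5, "RIGHT": 0.17, "UP": 0.33}
near = scale_by_closeness(weights, 1)   # coefficient 1: weights unchanged
far = scale_by_closeness(weights, 10)   # coefficient 0.1: same ranking, less urgency
```

The ranking of directions within an action stays the same. Only the action's overall pull changes, which is what lets a nearby dot outweigh a distant one.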
Ergh. Maybe I’ll just skip the junction thing and stick with the current system, which seems to work well enough. I could work on ghost centre instead.