Progress: Presentation

Presentation coming up. Need to plan how it’ll go.

Introduction + Contents page
– Explain a bit about myself, the title of the project, and the outline of the presentation

RL Competition 2008
– Explaining the RLComp 2008. Detail the domains, scope of competitors and timeline.

Tetris summary
– Perhaps animated Tetris. Explain Tetris to those who don’t know (no-one!).

Reinforcement Learning explanation
– Explain reinforcement learning using BlockWorld example. Could span 2 slides.

Other approaches to the AI Tetris problem
– Fixed Strategy approaches (1 slide)
– Learning Strategy approaches (1 slide)

After each explanation, start up the version and let it run long enough to see what it is (or isn’t) doing. Also be aware of the first-piece bug present in several versions.

Version 1.0
– Basic substate strategy that doesn’t do much until some rewards are earned. Each state is size 4.

Version 1.1
– Expanded strategy. Uses variable-sized substates to avoid towers.

Version 1.2
– Uses semi-guided placement of pieces to try to create rewards faster.

Version 1.3
– Introduced eligibility trace for multiple reward updates. Use eligibility trace diagram here.
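
To go with the eligibility trace diagram, here is a minimal sketch of how a trace spreads one reward back over earlier states. It is a generic tabular TD(lambda) update with accumulating traces, not the agent's actual code; the state names, the step sizes (alpha, gamma, lam) and the three-step episode are all assumptions made up for illustration.

```python
from collections import defaultdict


def td_lambda_update(values, traces, state, next_state, reward,
                     alpha=0.1, gamma=0.9, lam=0.8):
    """One tabular TD(lambda) step with accumulating eligibility traces."""
    # TD error for the transition that was just observed.
    delta = reward + gamma * values[next_state] - values[state]
    # Mark the current state as freshly visited.
    traces[state] += 1.0
    # Every recently visited state shares in the update, then its trace decays.
    for s in list(traces):
        values[s] += alpha * delta * traces[s]
        traces[s] *= gamma * lam
    return delta


values = defaultdict(float)
traces = defaultdict(float)

# Hypothetical episode: three placements, reward only on the last one.
episode = [("s0", "s1", 0.0), ("s1", "s2", 0.0), ("s2", "end", 1.0)]
for state, next_state, reward in episode:
    td_lambda_update(values, traces, state, next_state, reward)

for name in ("s0", "s1", "s2"):
    print(name, round(values[name], 5))
```

The single reward at the end updates all three states at once, with the most recent state getting the largest share. Without the trace, only the last state would have been updated.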

Version 1.4
– Short-sightedness of substates. Not excellent results. Semi-guided action not guaranteed to be best position.
– Introducing field evaluations! Performs much better. Note the O(x) of each parameter and explain.
– May need an extra slide to explain the parameters used.
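
For the parameters slide, a rough sketch of what a field evaluation can look like: score a field as a weighted sum of a few features. The three features here (maximum height, holes, maximum well depth), their definitions and the weights are assumptions for illustration only, not the agent's actual parameter set. The field is assumed to be a list of rows, top row first, with 1 for a filled cell, and to be at least two columns wide.

```python
def column_heights(field):
    """Height of each column, from the floor up to its topmost filled cell."""
    rows = len(field)
    heights = []
    for col in range(len(field[0])):
        height = 0
        for row in range(rows):
            if field[row][col]:
                height = rows - row
                break
        heights.append(height)
    return heights


def count_holes(field, heights):
    """Empty cells that sit below the top of their column."""
    rows = len(field)
    holes = 0
    for col, height in enumerate(heights):
        for row in range(rows - height, rows):
            if not field[row][col]:
                holes += 1
    return holes


def max_well_depth(heights):
    """Largest drop of a column below its lower neighbour (walls count as tall)."""
    deepest = 0
    for col, height in enumerate(heights):
        left = heights[col - 1] if col > 0 else float("inf")
        right = heights[col + 1] if col < len(heights) - 1 else float("inf")
        deepest = max(deepest, min(left, right) - height)
    return deepest


def evaluate_field(field, weights):
    """Higher score means a better field; every feature is a penalty."""
    heights = column_heights(field)
    features = {
        "max_height": max(heights),
        "holes": count_holes(field, heights),
        "max_well_depth": max_well_depth(heights),
    }
    score = -sum(weights[name] * value for name, value in features.items())
    return score, features


# Hypothetical 4x4 field and hypothetical weights.
field = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 1, 1],
]
weights = {"max_height": 1.0, "holes": 4.0, "max_well_depth": 0.5}
score, features = evaluate_field(field, weights)
print(features, score)
```

Each feature needs at most one pass over the field, which may be a useful point when explaining the cost of each parameter.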

Version 1.5
– 1.4 is a very fixed strategy, but the name of the game is RL.
– Plays genetically with N trial pieces (N = 20 found to be a good value) and worth policy.
– Cannot use a full genetic approach, as the emphasis is on online learning (learn as you go). Genetic takes too long.

Version 1.6
– Trialled emergency play, but it proved to be useless.
– Modified the height parameter and changed chasms to maxWellDepth, based on the genetic paper.

Perhaps test the agent on multiple parameters. Figure out good ones. 2 and 11 are good.

Conclusion
– Other AI algorithms may perform better on a single domain, but this agent has the ability to evolve for a particular domain.
– Questions!