| |
| | Temporal Difference Learning and TD-Gammon |
 | | One of these is the development of a wide variety of novel nonlinear function approximation schemes, such as decision trees, localized basis functions, spline-fitting schemes and multilayer perceptrons, which appear to be capable of learning complex nonlinear functions of their inputs. |
 | | In either case, building human expertise into an evaluation function, whether by knowledge engineering or by supervised training, has been found to be an extraordinarily difficult undertaking, fraught with many potential pitfalls. |
 | | Finally, non-deterministic games have the advantage that the target function one is trying to learn, the true expected outcome of a position given perfect play on both sides, is a real-valued function with a great deal of smoothness and continuity, that is, small changes in the position produce small changes in the probability of winning. |
| www.bkgm.com/articles/tesauro/tdl.html (7470 words) |
|