Reinforcement Learning with History Lists
Stephan Timmer
Broschiertes Buch

Reinforcement Learning with History Lists

Solving Partially Observable Decision Processes by Using Short Term Memory

Versandkostenfrei!
Versandfertig in 6-10 Tagen
69,90 €
inkl. MwSt.
PAYBACK Punkte
0 °P sammeln!
A very general framework for modeling uncertainty in learning environments is given by Partially observable Markov Decision Processes (POMDPs). In a POMDP setting, the learning agent infers a policy for acting optimally in all possible states of the environment, while receiving only observations of these states. The basic idea for coping with partial observability is to include memory into the representation of the policy. Perfect memory is provided by the belief space, i.e. the space of probability distributions over environmental states. However, computing policies defined on the belief spac...