Qualitative Spatial Abstraction in Reinforcement Learning (eBook) - Lutz Frommberger

Lutz Frommberger 

Qualitative Spatial Abstraction in Reinforcement Learning (eBook)

eBook
 
Format: PDF
Free shipping within Germany
Collect 86 ebmiles
EUR 85,55
Available immediately for download
All prices include VAT.



Reinforcement learning has developed into a successful learning approach for domains that are not fully understood and that are too complex to be described in closed form. However, reinforcement learning does not scale well to large and continuous problems. Furthermore, acquired knowledge is specific to the learned task, so transferring knowledge to new tasks is crucial.

In this book the author investigates whether these deficiencies of reinforcement learning can be overcome by suitable abstraction methods. He discusses various forms of spatial abstraction, in particular qualitative abstraction, a form of representing knowledge that has been thoroughly investigated and successfully applied in spatial cognition research. His approach exploits spatial structures and structural similarity to support the learning process by abstracting from less important features and stressing the essential ones. The author demonstrates his learning approach and the transferability of knowledge by having his system learn in a virtual robot simulation and subsequently transferring the acquired knowledge to a physical robot. The approach is influenced by findings from cognitive science.

The book is suitable for researchers working in artificial intelligence, in particular knowledge representation, learning, spatial cognition, and robotics.


Product information

  • ISBN-13: 9783642165900
  • ISBN-10: 3642165907
  • Order no.: 33332266
Dr. Frommberger is a researcher in the Cognitive Systems Research Group (SFB/TR 8 Spatial Cognition) of Universität Bremen; his special areas of expertise are spatial abstraction techniques, efficient reinforcement learning, cognitive logistics and qualitative representations of space.



Table of contents

Foreword ... 4
Preface ... 6
Contents ... 9
Symbols ... 13
Acronyms ... 15
1 Introduction ... 16
1.1 Learning Machines ... 16
1.1.1 An Agent Control Task ... 17
1.1.2 Structure of a State Space ... 19
1.1.3 Abstraction ... 19
1.1.4 Knowledge Reuse ... 20
1.2 Thesis and Contributions ... 21
1.3 Outline of the Thesis ... 22
2 Foundations of Reinforcement Learning ... 24
2.1 Machine Learning ... 24
2.2 The Reinforcement Learning Model ... 25
2.3 Markov Decision Processes ... 26
2.3.1 Definition of a Markov Decision Process ... 27
2.3.2 Solving a Markov Decision Process ... 28
2.3.3 Partially Observable Markov Decision Processes ... 30
2.4 Exploration ... 31
2.4.1 ε-Greedy Action Selection ... 32
2.4.2 Other Exploration Methods ... 32
2.5 Temporal Difference Learning ... 32
2.5.1 TD(0) ... 33
2.5.2 Eligibility Traces/TD(λ) ... 33
2.5.3 Q-Learning ... 34
2.6 Performance Measures ... 35
3 Abstraction and Knowledge Transfer in Reinforcement Learning ... 37
3.1 Challenges in Reinforcement Learning ... 37
3.1.1 Reinforcement Learning in Complex State Spaces ... 38
3.1.2 Use and Reuse of Knowledge Gained by Reinforcement Learning ... 38
3.2 Value Function Approximation ... 40
3.2.1 Value Function Approximation Methods ... 41
3.2.2 Function Approximation and Optimality ... 44
3.3 Temporal Abstraction ... 44
3.3.1 Semi-Markov Decision Processes ... 45
3.3.2 Options ... 45
3.3.3 MAXQ ... 46
3.3.4 Skills ... 46
3.3.5 Further Approaches and Limitations ... 47
3.4 Spatial Abstraction ... 47
3.4.1 Adaptive State Space Partitions ... 48
3.4.2 Knowledge Reuse Based on Domain Knowledge ... 50
3.4.3 Combining Spatial and Temporal Abstraction ... 51
3.4.4 Further Task-Specific Abstractions ... 51
3.5 Transfer Learning ... 51
3.5.1 The DARPA Transfer Learning Program ... 52
3.5.2 Intra-domain Transfer Methods ... 53
3.5.3 Cross-domain Transfer Methods ... 53
3.6 Summary and Discussion ... 55
4 Qualitative State Space Abstraction ... 56
4.1 Abstraction of the State Space ... 56
4.2 A Formal Framework of Abstraction ... 57
4.2.1 Definition of Abstraction ... 58
4.2.2 Aspectualization ... 59
4.2.3 Coarsening ... 61
4.2.4 Conceptual Classification ... 62
4.2.5 Related Work on Abstraction ... 63
4.3 Abstraction and Representation ... 64
4.4 Abstraction in Agent Control Processes ... 67
4.4.1 An Action-Centered View on Abstraction ... 67
4.4.2 Preserving the Optimal Policy ... 68
4.4.3 Accessibility of the Representation ... 69
4.5 Spatial Abstraction in Reinforcement Learning ... 70
4.5.1 An Architecture for Spatial Abstraction in Reinforcement Learning ... 70
4.5.2 From MDPs to POMDPs ... 72
4.5.3 Temporally Extended Actions ... 73
4.5.4 Criteria for Efficient Abstraction ... 73
4.5.5 The Role of Domain Knowledge ... 74
4.6 A Qualitative Approach to Spatial Abstraction ... 75
4.6.1 Qualitative Spatial Representations ... 75
4.6.2 Qualitative State Space Abstraction in Agent Control Tasks ... 76
4.6.3 Qualitative Representations and Aspectualization ... 77
4.7 Summary ... 77
5 Generalization and Transfer Learning with Qualitative Spatial Abstraction ... 79
5.1 Reusing Knowledge in Learning Tasks ... 79
5.1.1 Structural Similarity ... 80
5.1.2 Structural Similarity and Knowledge Transfer ... 80
5.2 Aspectualizable State Spaces ... 81
5.2.1 A Distinction Between Different Aspects of Problems ... 82
5.2.2 Using Goal-Directed and Generally Sensible Behavior for Knowledge Transfer ... 82
5.2.3 Structure Space and Task Space ... 83
5.3 Value-Function-Approximation-Based Task Space Generalization ... 86
5.3.1 Maintaining Structure Space Knowledge ... 86
5.3.2 An Introduction to Tile Coding ... 87
5.3.3 Task Space Tile Coding ... 90
5.3.4 Ad Hoc Transfer of Policies Learned with Task Space Tile Coding ... 93
5.3.5 Discussion of Task Space Tile Coding ... 94
5.4 A Posteriori Structure Space Transfer ... 94
5.4.1 Q-Value Averaging over Task Space ... 95
5.4.2 Avoiding Task Space Bias ... 95
5.4.3 Measuring Confidence of Generalized Policies ... 97
5.5 Discussion of the Transfer Methods ... 98
5.5.1 Comparison of the Transfer Methods ... 98
5.5.2 Outlook: Hierarchical Learning of Task and Structure Space Policies ... 99
5.6 Structure-Induced Task Space Aspectualization ... 100
5.6.1 Decision and Non-decision States ... 101
5.6.2 Identifying Non-decision Structures ... 101
5.6.3 SITSA: Abstraction in Non-decision States ... 102
5.6.4 Discussion of SITSA ... 102
5.7 Summary ... 103
6 RLPR -- An Aspectualizable State Space Representation ... 105
6.1 Building a Task-Specific Spatial Representation ... 105
6.1.1 A Goal-Directed Robot Navigation Task ... 106
6.1.2 Identifying Task and Structure Space ... 107
6.1.3 Representation and Frame of Reference ... 107
6.2 Representing Task Space ... 108
6.2.1 Usage of Landmarks ... 108
6.2.2 Landmarks and Ordering Information ... 109
6.2.3 Representing Singular Landmarks ... 110
6.2.4 Views as Landmark Information ... 115
6.2.5 Navigation Based on Landmark Information Only ... 118
6.3 Representing Structure Space ... 119
6.3.1 Relative Line Position Representation (RLPR) ... 120
6.3.2 Building an RLPR Feature Vector ... 126
6.3.3 Variants of RLPR ... 126
6.3.4 Abstraction Effects in RLPR ... 127
6.3.5 RLPR and Collision Avoidance ... 128
6.4 Landmark-Enriched RLPR ... 129
6.4.1 Properties of le-RLPR ... 129
6.5 Robustness of le-RLPR ... 130
6.5.1 Robustness of Task Space Representation ... 131
6.5.2 Robustness of Structure Space Representation ... 132
6.6 Summary ... 134
7 Empirical Evaluation ... 135
7.1 Evaluation Setup ... 135
7.1.1 The Testbed ... 135
7.1.2 The Motion Noise Model ... 136
7.1.3 The le-RLPR Representation ... 137
7.1.4 Learning Algorithm, Rewards, and Cross-validation ... 137
7.2 Learning Performance ... 138
7.2.1 Performance of le-RLPR-Based Representations ... 139
7.2.2 le-RLPR Compared to the Original MDP ... 141
7.2.3 Quality of le-RLPR-Based Solutions ... 142
7.2.4 Effect of Task Space Tile Coding ... 143
7.2.5 Task Space Information Only ... 144
7.2.6 Learning Navigation with Point-Based Landmarks ... 146
7.2.7 Evaluation of SITSA ... 147
7.3 Behavior Under Noise ... 148
7.3.1 Robustness Under Motion Noise ... 149
7.3.2 Robustness Under Distorted Perception ... 150
7.4 Generalization and Transfer Learning ... 153
7.4.1 le-RLPR and Modified Environments ... 154
7.4.2 Policy Transfer to New Environments ... 155
7.5 RLPR-Based Navigation in Real-World Environments ... 158
7.5.1 Properties of a Real Office Environment ... 158
7.5.2 Differences of the Real Robot ... 159
7.5.3 Operation on Identical Observations ... 161
7.5.4 Training and Transfer ... 161
7.5.5 Behavior of the Real Robot ... 162
7.6 Summary ... 163
8 Summary and Outlook ... 167
8.1 Summary of the Results ... 167
8.2 Future Work ... 170
References ... 172
Index ... 182

Contents overview

Introduction; Foundations of Reinforcement Learning; Abstraction and Knowledge Transfer in Reinforcement Learning; Qualitative State Space Abstraction; Generalization and Transfer Learning with Qualitative Spatial Abstraction; RLPR -- An Aspectualizable State Space Representation; Empirical Evaluation; Summary and Outlook; References; Index