Regret Analysis of Stochastic and Nonstochastic Multi-Armed Bandit Problems

A multi-armed bandit problem (or, simply, a bandit problem) is a sequential allocation problem defined by a set of actions. At each time step, a unit resource is allocated to an action and some observable payoff is obtained. The goal is to maximize the total payoff obtained over a sequence of allocations. The name "bandit" refers to the colloquial term for a slot machine (a "one-armed bandit" in American slang). In a casino, a sequential allocation problem arises when the player faces many slot machines at once (a "multi-armed bandit") and must repeatedly choose where to insert the nex...