Search for dissertations about: "Multi-armed bandits"

Showing results 1 - 5 of 14 Swedish dissertations containing the words Multi-armed bandits.

  1. Structured Stochastic Bandits

    Author : Stefan Magureanu; Alexandre Proutiere; Emilie Kaufmann; KTH
    Keywords : ENGINEERING AND TECHNOLOGY; Multi-armed bandits; Learning to rank; reinforcement learning; Lipschitz Bandits; Electrical Engineering;

    Abstract : In this thesis we address the multi-armed bandit (MAB) problem with stochastic rewards and correlated arms. In particular, we investigate the case where the expected rewards are a Lipschitz function of the arm, as well as the learning-to-rank problem viewed from a MAB perspective.

  2. Minimizing Regret in Combinatorial Bandits and Reinforcement Learning

    Author : Mohammad Sadegh Talebi Mazraeh Shahi; Alexandre Proutiere; Mikael Johansson; Ronald Ortner; KTH
    Keywords : Multi-armed Bandits; Reinforcement Learning; Regret Minimization; Statistics; Electrical Engineering;

    Abstract : This thesis investigates sequential decision making tasks that fall in the framework of reinforcement learning (RL). These tasks involve a decision maker who repeatedly interacts with an environment, modeled by an unknown finite Markov decision process (MDP), and who wishes to maximize a notion of reward accumulated during her experience.

  3. Reinforcement Learning and Dynamical Systems

    Author : Björn Lindenberg; Karl-Olof Lindahl; Marc G. Bellemare; Linnéuniversitetet
    Keywords : NATURAL SCIENCES; artificial intelligence; distributional reinforcement learning; Markov decision processes; Bellman operators; deep learning; multi-armed bandits; Bayesian bandits; conjugate priors; Thompson sampling; linear finite dynamical systems; cycle orbits; fixed-point systems; Mathematics; Computer Science;

    Abstract : This thesis concerns reinforcement learning and dynamical systems in finite discrete problem domains. The study of artificial intelligence through reinforcement learning involves developing models and algorithms for scenarios in which an agent interacts with an environment.

  4. Online Learning for Energy Efficient Navigation in Stochastic Transport Networks

    Author : Niklas Åkerblom; Chalmers tekniska högskola
    Keywords : NATURAL SCIENCES; Thompson Sampling; Online Minimax Path Problem; Multi-Armed Bandits; Online Learning; Online Shortest Path Problem; Machine Learning; Combinatorial Semi-Bandits; Energy Efficient Navigation;

    Abstract : Reducing the dependence on fossil fuels in the transport sector is crucial to have a realistic chance of halting climate change. The automotive industry is, therefore, transitioning towards an electrified future at an unprecedented pace.

  5. Efficient Online Learning under Bandit Feedback

    Author : Stefan Magureanu; Alexandre Proutiere; Odalric-Ambrym Maillard; KTH
    Keywords : ENGINEERING AND TECHNOLOGY; multi-armed bandits; reinforcement learning; learning to rank; Electrical Engineering;

    Abstract : In this thesis we address the multi-armed bandit (MAB) problem with stochastic rewards and correlated arms. In particular, we investigate the case where the expected rewards are a Lipschitz function of the arm, and extend these results to bandits with arbitrary structure known to the decision maker.