Author

Benjamin Van Roy

Books and works in one place: 2 books, published 2018-2023, with Reinforcement Learning, Bit by Bit among the most popular. Compare prices and check availability at Finnish bookstores.

2 books

Publication range: 2018-2023.

Reinforcement Learning, Bit by Bit

Xiuyuan Lu; Benjamin Van Roy; Vikranth Dwaracherla; Morteza Ibrahimi; Ian Osband; Zheng Wen

Now Publishers Inc

2023

Paperback

Compare prices

Reinforcement learning agents have demonstrated remarkable achievements in simulated environments. Limited data efficiency, however, significantly impedes carrying this success over to real environments. The design of data-efficient agents that address this problem calls for a deeper understanding of information acquisition and representation. This tutorial offers a framework that can guide the associated agent design decisions. The framework is inspired in part by concepts from information theory, which has grappled with data efficiency for many years in the design of communication systems.

In this tutorial, the authors shed light on questions of what information to seek, how to seek that information, and what information to retain. To illustrate the concepts, they design simple agents that build on them and present computational results that highlight data efficiency.

This book will be of interest to students and researchers working in reinforcement learning, and to information theorists wishing to apply their knowledge in a practical way to reinforcement learning problems.


A Tutorial on Thompson Sampling

Daniel J. Russo; Benjamin Van Roy; Abbas Kazerouni; Ian Osband; Zheng Wen

Now Publishers Inc

2018

Paperback

Compare prices

Thompson sampling is an algorithm for online decision problems in which actions are taken sequentially and must balance exploiting what is known, to maximize immediate performance, against investing in new information that may improve future performance. The algorithm addresses a broad range of problems in a computationally efficient manner and is therefore enjoying wide use.

A Tutorial on Thompson Sampling covers the algorithm and its application, illustrating concepts through a range of examples, including Bernoulli bandit problems, shortest path problems, product recommendation, assortment, active learning with neural networks, and reinforcement learning in Markov decision processes. Most of these problems involve complex information structures, where information revealed by taking an action informs beliefs about other actions. The tutorial also discusses when and why Thompson sampling is or is not effective, and its relations to alternative algorithms.
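For readers unfamiliar with the algorithm, the exploit-versus-explore balance described above can be illustrated with a minimal sketch for a simulated Bernoulli bandit. This is not code from the book: the function name `thompson_sampling`, the uniform Beta(1, 1) prior, and the simulated success probabilities are all assumptions chosen for illustration.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Illustrative Beta-Bernoulli Thompson sampling on a simulated bandit.

    true_probs is a hypothetical list of per-arm success probabilities,
    used only to simulate rewards; the agent never reads it directly.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Assumed prior: Beta(1, 1), i.e. uniform, on each arm's success rate.
    alpha = [1] * n_arms  # 1 + observed successes
    beta = [1] * n_arms   # 1 + observed failures
    pulls = [0] * n_arms
    total_reward = 0

    for _ in range(n_rounds):
        # Draw one plausible success rate per arm from its current posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled rates. Uncertain arms
        # sometimes draw high samples, which is what drives exploration.
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Simulate a Bernoulli reward for the chosen arm.
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate update of the chosen arm's Beta posterior.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward


if __name__ == "__main__":
    pulls, total = thompson_sampling([0.1, 0.9], n_rounds=2000)
    print("pulls per arm:", pulls, "total reward:", total)
```

In a run like this, the arm with the higher simulated success probability typically ends up with most of the pulls, because its posterior concentrates on high values while the weaker arm is sampled less and less often.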
