Ian Osband

Citations

	All	Since 2019
Citations	7641	6808
h-index	25	25
i10-index	29	28

Citations per year

Year	Citations
2015	27
2016	74
2017	222
2018	467
2019	751
2020	1153
2021	1361
2022	1470
2023	1543
2024	528

Co-authors

Name	Affiliation	Verified email
Benjamin Van Roy	Stanford University	stanford.edu
Zheng Wen	Google DeepMind	google.com
Vikranth Dwaracherla	DeepMind	google.com
Xiuyuan Lu	Google DeepMind	google.com
Daniel Russo	Columbia University	gsb.columbia.edu
Morteza Ibrahimi	Stanford University	stanford.edu
Brendan O'Donoghue	Stanford University, Google DeepMind	alumni.stanford.edu
Mohammad Gheshlaghi Azar	Cohere AI	google.com
Todd Hester	Waymo	waymo.com
Bilal Piot	Google DeepMind	google.com
Olivier Pietquin	Cohere | ex Google DeepMind (On leave - Professor at University of Lille)	univ-lille.fr
Tom Schaul	Senior Staff Scientist, DeepMind	nyu.edu
Rémi Munos	DeepMind	inria.fr
Alexander Pritzel	DeepMind	google.com
Marc Lanctot	Research Scientist, Google DeepMind	google.com

Ian Osband

OpenAI

Verified email at openai.com

Reinforcement Learning


Title	Authors	Venue	Citations	Year
Deep exploration via bootstrapped DQN	I Osband, C Blundell, A Pritzel, B Van Roy	Advances in neural information processing systems 29, 2016	1399	2016
Deep q-learning from demonstrations	T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, D Horgan, ...	Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018	1164	2018
A tutorial on thompson sampling	DJ Russo, B Van Roy, A Kazerouni, I Osband, Z Wen	Foundations and Trends® in Machine Learning 11 (1), 1-96, 2018	1050	2018
Minimax regret bounds for reinforcement learning	MG Azar, I Osband, R Munos	International conference on machine learning, 263-272, 2017	778	2017
Randomized prior functions for deep reinforcement learning	I Osband, J Aslanides, A Cassirer	Advances in Neural Information Processing Systems 31, 2018	395	2018
Deep Exploration via Randomized Value Functions	I Osband	https://searchworks.stanford.edu/view/11891201, 2016	320	2016
Generalization and exploration via randomized value functions	I Osband, B Van Roy, Z Wen	International Conference on Machine Learning, 2377-2386, 2016	319	2016
Why is posterior sampling better than optimism for reinforcement learning?	I Osband, B Van Roy	International conference on machine learning, 2701-2710, 2017	255	2017
The uncertainty bellman equation and exploration	B O’Donoghue, I Osband, R Munos, V Mnih	International conference on machine learning, 3836-3845, 2018	207	2018
Model-based reinforcement learning and the eluder dimension	I Osband, B Van Roy	Advances in Neural Information Processing Systems 27, 2014	182	2014
Learning from demonstrations for real world reinforcement learning	T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, A Sendonaris, ...	arXiv preprint arXiv:1704.03732, 2017	175	2017
Behaviour suite for reinforcement learning	I Osband, Y Doron, M Hessel, J Aslanides, E Sezener, A Saraiva, ...	arXiv preprint arXiv:1908.03568, 2019	174	2019
Risk versus Uncertainty in Deep Learning: Bayes, Bootstrap and the Dangers of Dropout	I Osband	http://bayesiandeeplearning.org/papers/BDL_4.pdf	163*	
Deep learning for time series modeling	E Busseti, I Osband, S Wong	Technical report, Stanford University, 1-5, 2012	136	2012
Near-optimal reinforcement learning in factored mdps	I Osband, B Van Roy	Advances in Neural Information Processing Systems 27, 2014	122	2014
On lower bounds for regret in reinforcement learning	I Osband, B Van Roy	arXiv preprint arXiv:1608.02732, 2016	107	2016
Bootstrapped thompson sampling and deep exploration	I Osband, B Van Roy	arXiv preprint arXiv:1507.00300, 2015	99	2015
(More) efficient reinforcement learning via posterior sampling	I Osband, D Russo, B Van Roy	Advances in Neural Information Processing Systems 26, 2013	94	2013
Meta-learning of sequential strategies	PA Ortega, JX Wang, M Rowland, T Genewein, Z Kurth-Nelson, ...	arXiv preprint arXiv:1905.03030, 2019	82	2019
Epistemic neural networks	I Osband, Z Wen, SM Asghari, V Dwaracherla, M Ibrahimi, X Lu, ...	Advances in Neural Information Processing Systems 36, 2024	79	2024
