Benjamin Van Roy
Verified email at stanford.edu
Title
Cited by
Year
Analysis of temporal-difference learning with function approximation
J Tsitsiklis, B Van Roy
Advances in neural information processing systems 9, 1996
Cited by: 2246; Year: 1996
Deep exploration via bootstrapped DQN
I Osband, C Blundell, A Pritzel, B Van Roy
Advances in neural information processing systems 29, 2016
Cited by: 1505; Year: 2016
A tutorial on Thompson sampling
D Russo, B Van Roy, A Kazerouni, I Osband, Z Wen
Foundations and Trends in Machine Learning 11 (1), pp. 1-96, 2018
Cited by: 1198; Year: 2018
The linear programming approach to approximate dynamic programming
DP De Farias, B Van Roy
Operations research 51 (6), 850-865, 2003
Cited by: 977; Year: 2003
Regression methods for pricing complex American-style options
JN Tsitsiklis, B Van Roy
IEEE Transactions on Neural Networks 12 (4), 694-703, 2001
Cited by: 864; Year: 2001
Learning to optimize via posterior sampling
D Russo, B Van Roy
Mathematics of Operations Research 39 (4), 1221-1243, 2014
Cited by: 766; Year: 2014
Feature-based methods for large scale dynamic programming
JN Tsitsiklis, B Van Roy
Machine Learning 22 (1), 59-94, 1996
Cited by: 724; Year: 1996
Markov perfect industry dynamics with many firms
G Weintraub, CL Benkard, B Van Roy
Econometrica 76 (6), 1375-1411, 2008
Cited by: 571; Year: 2008
On constraint sampling in the linear programming approach to approximate dynamic programming
DP De Farias, B Van Roy
Mathematics of operations research 29 (3), 462-478, 2004
Cited by: 497; Year: 2004
Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives
JN Tsitsiklis, B Van Roy
IEEE Transactions on Automatic Control 44 (10), 1840-1851, 1999
Cited by: 491; Year: 1999
An information-theoretic analysis of Thompson sampling
D Russo, B Van Roy
Journal of Machine Learning Research 17 (68), 1-30, 2016
Cited by: 434; Year: 2016
Deep Exploration via Randomized Value Functions
I Osband, B Van Roy, DJ Russo, Z Wen
The Journal of Machine Learning Research 20 (124), 1-62, 2019
Cited by: 342; Year: 2019
Generalization and exploration via randomized value functions
I Osband, B Van Roy, Z Wen
International Conference on Machine Learning, 2377-2386, 2016
Cited by: 338; Year: 2016
Consensus propagation
CC Moallemi, B Van Roy
IEEE Transactions on Information Theory 52 (11), 4753-4766, 2006
Cited by: 307; Year: 2006
Eluder dimension and the sample complexity of optimistic exploration
D Russo, B Van Roy
Advances in Neural Information Processing Systems 26, 2013
Cited by: 274; Year: 2013
Why is posterior sampling better than optimism for reinforcement learning?
I Osband, B Van Roy
International conference on machine learning, 2701-2710, 2017
Cited by: 272; Year: 2017
Dynamic pricing with a prior on market response
VF Farias, B Van Roy
Operations Research 58 (1), 16-29, 2010
Cited by: 271; Year: 2010
Solving data mining problems through pattern recognition
RL Kennedy, Y Lee, B Van Roy, CD Reed, RP Lippman
Upper Saddle River, NJ: Prentice Hall PTR, 2011
Cited by: 270*; Year: 2011
Learning to optimize via information-directed sampling
D Russo, B Van Roy
Advances in neural information processing systems 27, 2014
Cited by: 244; Year: 2014
Average cost temporal-difference learning
JN Tsitsiklis, B Van Roy
Automatica 35, 319-349, 1999
Cited by: 241; Year: 1999
Articles 1–20