Dhruva Tirumala

Cited by

	All	Since 2019
Citations	4292	3874
h-index	16	15
i10-index	18	17

Citations per year
Year	Citations
2017	75
2018	312
2019	512
2020	640
2021	785
2022	767
2023	751
2024	417

DeepMind

Verified email at google.com


Title	Authors	Venue	Cited by	Year
Emergence of locomotion behaviours in rich environments	N Heess, D Tb, S Sriram, J Lemmon, J Merel, G Wayne, Y Tassa, T Erez, ...	arXiv preprint arXiv:1707.02286, 2017	1103	2017
Learning to reinforcement learn	JX Wang, Z Kurth-Nelson, D Tirumala, H Soyer, JZ Leibo, R Munos, ...	arXiv preprint arXiv:1611.05763, 2016	1038	2016
Prefrontal cortex as a meta-reinforcement learning system	JX Wang, Z Kurth-Nelson, D Kumaran, D Tirumala, H Soyer, JZ Leibo, ...	Nature neuroscience 21 (6), 860-868, 2018	634	2018
Distributed distributional deterministic policy gradients	G Barth-Maron, MW Hoffman, D Budden, W Dabney, D Horgan, D Tb, ...	arXiv preprint arXiv:1804.08617, 2018	626	2018
Learning human behaviors from motion capture by adversarial imitation	J Merel, Y Tassa, D TB, S Srinivasan, J Lemmon, Z Wang, G Wayne, ...	arXiv preprint arXiv:1707.02201, 2017	230	2017
V-MPO: On-policy maximum a posteriori policy optimization for discrete and continuous control	HF Song, A Abdolmaleki, JT Springenberg, A Clark, H Soyer, JW Rae, ...	arXiv preprint arXiv:1909.12238, 2019	112	2019
Hierarchical visuomotor control of humanoids	J Merel, A Ahuja, V Pham, S Tunyasuvunakool, S Liu, D Tirumala, ...	arXiv preprint arXiv:1811.09656, 2018	109	2018
Information asymmetry in KL-regularized RL	A Galashov, SM Jayakumar, L Hasenclever, D Tirumala, J Schwarz, ...	arXiv preprint arXiv:1905.01240, 2019	104	2019
Learning agile soccer skills for a bipedal robot with deep reinforcement learning	T Haarnoja, B Moran, G Lever, SH Huang, D Tirumala, J Humplik, ...	Science Robotics 9 (89), eadi8022, 2024	58	2024
Data-efficient hindsight off-policy option learning	M Wulfmeier, D Rao, R Hafner, T Lampe, A Abdolmaleki, T Hertweck, ...	International Conference on Machine Learning, 11340-11350, 2021	46	2021
Exploiting hierarchy for learning and transfer in KL-regularized RL	D Tirumala, H Noh, A Galashov, L Hasenclever, A Ahuja, G Wayne, ...	arXiv preprint arXiv:1903.07438, 2019	44	2019
Behavior priors for efficient reinforcement learning	D Tirumala, A Galashov, H Noh, L Hasenclever, R Pascanu, J Schwarz, ...	Journal of Machine Learning Research 23 (221), 1-68, 2022	33	2022
Probing physics knowledge using tools from developmental psychology	L Piloto, A Weinstein, D TB, A Ahuja, M Mirza, G Wayne, D Amos, C Hung, ...	arXiv preprint arXiv:1804.01128, 2018	32	2018
Learning to reinforcement learn. ArXiv 1611.05763	JX Wang, Z Kurth-Nelson, D Tirumala, H Soyer, JZ Leibo, R Munos, ...		28	2017
Pick your battles: Interaction graphs as population-level objectives for strategic diversity	M Garnelo, WM Czarnecki, S Liu, D Tirumala, J Oh, G Gidel, ...	arXiv preprint arXiv:2110.04041, 2021	25	2021
Learning transferable motor skills with hierarchical latent mixture policies	D Rao, F Sadeghi, L Hasenclever, M Wulfmeier, M Zambelli, G Vezzani, ...	arXiv preprint arXiv:2112.05062, 2021	23	2021
MO2: Model-based offline options	S Salter, M Wulfmeier, D Tirumala, N Heess, M Riedmiller, R Hadsell, ...	Conference on Lifelong Learning Agents, 902-919, 2022	12	2022
Learning to reinforcement learn (2016)	JX Wang, Z Kurth-Nelson, D Tirumala, H Soyer, JZ Leibo, R Munos, ...	arXiv preprint arXiv:1611.05763, 2016	12	2016
Skills: Adaptive skill sequencing for efficient temporally-extended exploration	G Vezzani, D Tirumala, M Wulfmeier, D Rao, A Abdolmaleki, B Moran, ...	arXiv preprint arXiv:2211.13743, 2022	5	2022
On multi-objective policy optimization as a tool for reinforcement learning: Case studies in offline RL and finetuning	A Abdolmaleki, SH Huang, G Vezzani, B Shahriari, JT Springenberg, ...	arXiv preprint arXiv:2106.08199, 2021	4	2021
