Mastering the game of Go without human knowledge D Silver, J Schrittwieser, K Simonyan, I Antonoglou, A Huang, A Guez, ... Nature 550 (7676), 354-359, 2017 | 11375 | 2017 |
A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play D Silver, T Hubert, J Schrittwieser, I Antonoglou, M Lai, A Guez, M Lanctot, ... Science 362 (6419), 1140-1144, 2018 | 4749 | 2018 |
Mastering Atari, Go, chess and shogi by planning with a learned model J Schrittwieser, I Antonoglou, T Hubert, K Simonyan, L Sifre, S Schmitt, ... Nature 588 (7839), 604-609, 2020 | 2486 | 2020 |
Mastering chess and shogi by self-play with a general reinforcement learning algorithm D Silver, T Hubert, J Schrittwieser, I Antonoglou, M Lai, A Guez, M Lanctot, ... arXiv preprint arXiv:1712.01815, 2017 | 2324 | 2017 |
Competition-level code generation with AlphaCode Y Li, D Choi, J Chung, N Kushman, J Schrittwieser, R Leblond, T Eccles, ... Science 378 (6624), 1092-1097, 2022 | 891 | 2022 |
Discovering faster matrix multiplication algorithms with reinforcement learning A Fawzi, M Balog, A Huang, T Hubert, B Romera-Paredes, M Barekatain, ... Nature 610 (7930), 47-53, 2022 | 566 | 2022 |
Cyprien de Masson d’Autume, Igor Babuschkin, Xinyun Chen, Po-Sen Huang, Johannes Welbl, Sven Gowal, Alexey Cherepanov, James Molloy, Daniel J Y Li, D Choi, J Chung, N Kushman, J Schrittwieser, R Leblond, T Eccles, ... Science 378 (6624), 1092-1097, 2022 | 275 | 2022 |
Mastering chess and shogi by self-play with a general reinforcement learning algorithm D Silver, T Hubert, J Schrittwieser, I Antonoglou, M Lai, A Guez, M Lanctot, ... arXiv preprint arXiv:1712.01815, 2017 | 160 | 2017 |
Faster sorting algorithms discovered using deep reinforcement learning DJ Mankowitz, A Michi, A Zhernov, M Gelmi, M Selvi, C Paduraru, ... Nature 618 (7964), 257-263, 2023 | 142 | 2023 |
Online and offline reinforcement learning by planning with a learned model J Schrittwieser, T Hubert, A Mandhane, M Barekatain, I Antonoglou, ... Advances in Neural Information Processing Systems 34, 27580-27591, 2021 | 117 | 2021 |
Learning and planning in complex action spaces T Hubert, J Schrittwieser, I Antonoglou, M Barekatain, S Schmitt, D Silver International Conference on Machine Learning, 4476-4486, 2021 | 80 | 2021 |
Monte-Carlo tree search as regularized policy optimization JB Grill, F Altché, Y Tang, T Hubert, M Valko, I Antonoglou, R Munos International Conference on Machine Learning, 3769-3778, 2020 | 77 | 2020 |
Planning in stochastic environments with a learned model I Antonoglou, J Schrittwieser, S Ozair, TK Hubert, D Silver International Conference on Learning Representations, 2021 | 61 | 2021 |
Approximate exploitability: Learning a best response in large games F Timbers, N Bard, E Lockhart, M Lanctot, M Schmid, N Burch, ... arXiv preprint arXiv:2004.09677, 2020 | 47 | 2020 |
MuZero with self-competition for rate control in VP9 video compression A Mandhane, A Zhernov, M Rauh, C Gu, M Wang, F Xue, W Shang, ... arXiv preprint arXiv:2202.06626, 2022 | 44 | 2022 |
Lai D Silver, J Schrittwieser, K Simonyan, I Antonoglou, A Huang, A Guez, ... M., Bolton, A., Chen, Y., Lillicrap, T., Hui, F., Sifre, L., van den, 2018 | 2 | 2018 |
Adrian Bolton and others D Silver, J Schrittwieser, K Simonyan, I Antonoglou, A Huang, A Guez, ... Mastering the game of Go without human knowledge. Nature 550 (7676), 354-359, 2017 | 2 | 2017 |
Optimizing Memory Mapping Using Deep Reinforcement Learning P Wang, M Sazanovich, B Ilbeyi, PM Phothilimthana, M Purohit, HY Tay, ... arXiv preprint arXiv:2305.07440, 2023 | 1 | 2023 |
Planning for agent control using learned hidden states J Schrittwieser, I Antonoglou, TK Hubert US Patent App. 17/794,797, 2023 | 1 | 2023 |
Training rate control neural networks through reinforcement learning A Zhernov, C Gu, DJ Mankowitz, J Schrittwieser, AB Mandhane, ME Rauh, ... US Patent App. 18/565,008, 2024 | | 2024 |