Overcoming catastrophic forgetting in neural networks J Kirkpatrick, R Pascanu, N Rabinowitz, J Veness, G Desjardins, AA Rusu, ... Proceedings of the National Academy of Sciences 114 (13), 3521-3526, 2017 | 7730 | 2017 |
Prioritized experience replay T Schaul, J Quan, I Antonoglou, D Silver arXiv preprint arXiv:1511.05952, 2015 | 5133 | 2015 |
Deep Q-learning from demonstrations T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, D Horgan, ... Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018 | 1281 | 2018 |
StarCraft II: A new challenge for reinforcement learning O Vinyals, T Ewalds, S Bartunov, P Georgiev, AS Vezhnevets, M Yeo, ... arXiv preprint arXiv:1708.04782, 2017 | 1088 | 2017 |
Distributed prioritized experience replay D Horgan, J Quan, D Budden, G Barth-Maron, M Hessel, H Van Hasselt, ... arXiv preprint arXiv:1803.00933, 2018 | 905 | 2018 |
Distral: Robust multitask reinforcement learning Y Teh, V Bapst, WM Czarnecki, J Quan, J Kirkpatrick, R Hadsell, N Heess, ... Advances in Neural Information Processing Systems 30, 2017 | 619 | 2017 |
Recurrent experience replay in distributed reinforcement learning S Kapturowski, G Ostrovski, J Quan, R Munos, W Dabney International Conference on Learning Representations, 2018 | 556 | 2018 |
Transfer in deep reinforcement learning using successor features and generalised policy improvement A Barreto, D Borsa, J Quan, T Schaul, D Silver, M Hessel, D Mankowitz, ... International Conference on Machine Learning, 501-510, 2018 | 201 | 2018 |
The DeepMind JAX Ecosystem I Babuschkin, K Baumli, A Bell, S Bhupatiraju, J Bruce, P Buchlovsky, ... URL http://github.com/google-deepmind, 2020 | 188* | 2020 |
Observe and look further: Achieving consistent performance on Atari T Pohlen, B Piot, T Hester, MG Azar, D Horgan, D Budden, G Barth-Maron, ... arXiv preprint arXiv:1805.11593, 2018 | 140 | 2018 |
Universal successor features approximators D Borsa, A Barreto, J Quan, D Mankowitz, R Munos, H Van Hasselt, ... arXiv preprint arXiv:1812.07626, 2018 | 133 | 2018 |
The value-improvement path: Towards better representations for reinforcement learning W Dabney, A Barreto, M Rowland, R Dadashi, J Quan, MG Bellemare, ... Proceedings of the AAAI Conference on Artificial Intelligence 35 (8), 7160-7168, 2021 | 71 | 2021 |
Unicorn: Continual learning with a universal, off-policy agent DJ Mankowitz, A Žídek, A Barreto, D Horgan, M Hessel, J Quan, J Oh, ... arXiv preprint arXiv:1802.08294, 2018 | 49 | 2018 |
Training neural networks using a prioritized experience memory T Schaul, J Quan, D Silver US Patent 10,650,310, 2020 | 26 | 2020 |
Podracer architectures for scalable reinforcement learning M Hessel, M Kroiss, A Clark, I Kemaev, J Quan, T Keck, F Viola, ... arXiv preprint arXiv:2104.06272, 2021 | 22 | 2021 |
DQN Zoo: Reference implementations of DQN-based agents J Quan, G Ostrovski URL http://github.com/google-deepmind/dqn_zoo, 2020 | 21* | 2020 |
Reply to Huszár: The elastic weight consolidation penalty is empirically valid J Kirkpatrick, R Pascanu, N Rabinowitz, J Veness, G Desjardins, AA Rusu, ... Proceedings of the National Academy of Sciences 115 (11), E2498-E2498, 2018 | 21 | 2018 |
The phenomenon of policy churn T Schaul, A Barreto, J Quan, G Ostrovski Advances in Neural Information Processing Systems 35, 2537-2549, 2022 | 19 | 2022 |
General non-linear Bellman equations H van Hasselt, J Quan, M Hessel, Z Xu, D Borsa, A Barreto arXiv preprint arXiv:1907.03687, 2019 | 12 | 2019 |
Reinforcement learning using distributed prioritized replay D Budden, G Barth-Maron, J Quan, DG Horgan US Patent 11,625,604, 2023 | 11 | 2023 |