William Fedus
PaLM: Scaling language modeling with pathways
A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ...
Journal of Machine Learning Research 24 (240), 1-113, 2023
GPT-4 technical report
J Achiam, S Adler, S Agarwal, L Ahmad, I Akkaya, FL Aleman, D Almeida, ...
arXiv preprint arXiv:2303.08774, 2023
Scaling instruction-finetuned language models
HW Chung, L Hou, S Longpre, B Zoph, Y Tay, W Fedus, Y Li, X Wang, ...
Journal of Machine Learning Research 25 (70), 1-53, 2024
Deep graph infomax
P Veličković, W Fedus, WL Hamilton, P Liò, Y Bengio, RD Hjelm
ICLR (Poster) 2 (3), 4, 2019
Emergent abilities of large language models
J Wei, Y Tay, R Bommasani, C Raffel, B Zoph, S Borgeaud, D Yogatama, ...
arXiv preprint arXiv:2206.07682, 2022
Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity
W Fedus, B Zoph, N Shazeer
Journal of Machine Learning Research 23 (120), 1-39, 2022
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models
A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ...
arXiv preprint arXiv:2206.04615, 2022
Deep graph infomax
P Veličković, W Fedus, WL Hamilton, P Liò, Y Bengio, RD Hjelm
arXiv preprint arXiv:1809.10341, 2018
MaskGAN: Better Text Generation via Filling in the ______
W Fedus, I Goodfellow, AM Dai
International Conference on Learning Representations (ICLR 2018), 2018
In silico labeling: Predicting fluorescent labels in unlabeled images
E Christiansen, SJ Yang, DM Ando, A Javaherian, ...
Cell, 2018
GLaM: Efficient scaling of language models with mixture-of-experts
N Du, Y Huang, AM Dai, S Tong, D Lepikhin, Y Xu, M Krikun, Y Zhou, ...
International Conference on Machine Learning, 5547-5569, 2022
Revisiting ResNets: Improved training and scaling strategies
I Bello, W Fedus, X Du, ED Cubuk, A Srinivas, TY Lin, J Shlens, B Zoph
Advances in Neural Information Processing Systems 34, 22614-22627, 2021
Revisiting fundamentals of experience replay
W Fedus, P Ramachandran, R Agarwal, Y Bengio, H Larochelle, ...
International Conference on Machine Learning, 3061-3071, 2020
Many Paths to Equilibrium: GANs Do Not Need to Decrease a Divergence At Every Step
W Fedus, M Rosca, B Lakshminarayanan, AM Dai, S Mohamed, ...
International Conference on Learning Representations (ICLR 2018), 2017
The case for a directional dark matter detector and the status of current experimental efforts
S Ahlen, N Afshordi, JBR Battat, J Billard, N Bozorgnia, S Burgos, ...
International Journal of Modern Physics A 25 (01), 1-51, 2010
Language GANs Falling Short
M Caccia, L Caccia, W Fedus, H Larochelle, J Pineau, L Charlin
International Conference on Learning Representations (ICLR 2020), 2018
ChatGPT: Optimizing language models for dialogue
J Schulman, B Zoph, C Kim, J Hilton, J Menick, J Weng, JFC Uribe, ...
OpenAI blog 2 (4), 2022
Do transformer modifications transfer across implementations and applications?
S Narang, HW Chung, Y Tay, W Fedus, T Fevry, M Matena, K Malkan, ...
arXiv preprint arXiv:2102.11972, 2021
Hyperbolic discounting and learning over multiple horizons
W Fedus, C Gelada, Y Bengio, MG Bellemare, H Larochelle
Reinforcement Learning and Decision Making (RLDM 2019), 2019