Xiaohua Zhai
Research Scientist, Google DeepMind
Verified email at google.com
Title
Cited by
Year
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
A Dosovitskiy*, L Beyer*, A Kolesnikov*, D Weissenborn*, X Zhai*, ...
International Conference on Learning Representations (ICLR), 2021
Cited by: 50155; Year: 2021
MLP-Mixer: An all-MLP Architecture for Vision
I Tolstikhin*, N Houlsby*, A Kolesnikov*, L Beyer*, X Zhai, T Unterthiner, ...
Advances in Neural Information Processing Systems (NeurIPS), 2021
Cited by: 2864; Year: 2021
Big Transfer (BiT): General Visual Representation Learning
A Kolesnikov*, L Beyer*, X Zhai*, J Puigcerver, J Yung, S Gelly, ...
European Conference on Computer Vision (ECCV), 2020
Cited by: 1460; Year: 2020
Scaling Vision Transformers
X Zhai*, A Kolesnikov*, N Houlsby, L Beyer*, *equal contribution
Computer Vision and Pattern Recognition (CVPR), 2022
Cited by: 1199; Year: 2022
S4L: Self-supervised semi-supervised learning
X Zhai*, A Oliver*, A Kolesnikov*, L Beyer*, *equal contribution
International Conference on Computer Vision (ICCV), 1476-1485, 2019
Cited by: 1066; Year: 2019
Revisiting Self-Supervised Visual Representation Learning
A Kolesnikov*, X Zhai*, L Beyer*, *equal contribution
Computer Vision and Pattern Recognition (CVPR), 2019
Cited by: 897; Year: 2019
Underspecification Presents Challenges for Credibility in Modern Machine Learning
A D'Amour*, K Heller*, D Moldovan*, B Adlam, B Alipanahi, A Beutel, ...
Journal of Machine Learning Research (JMLR), 2020
Cited by: 795; Year: 2020
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
A Steiner*, A Kolesnikov*, X Zhai*, R Wightman, J Uszkoreit, L Beyer*, ...
Transactions on Machine Learning Research (TMLR), 2022
Cited by: 675; Year: 2022
PaLI: A jointly-scaled multilingual language-image model
X Chen, X Wang, S Changpinyo, AJ Piergiovanni, P Padlewski, D Salz, ...
International Conference on Learning Representations (ICLR), 2022
Cited by: 619; Year: 2022
LiT: Zero-Shot Transfer with Locked-image Text Tuning
X Zhai*, X Wang*, B Mustafa*, A Steiner*, D Keysers, A Kolesnikov, ...
Computer Vision and Pattern Recognition (CVPR), 2022
Cited by: 543; Year: 2022
Scaling vision transformers to 22 billion parameters
M Dehghani, J Djolonga, B Mustafa, P Padlewski, J Heek, J Gilmer, ...
International Conference on Machine Learning (ICML), 7480-7512, 2023
Cited by: 483; Year: 2023
Sigmoid loss for language image pre-training
X Zhai*, B Mustafa, A Kolesnikov, L Beyer*, *equal contribution
International Conference on Computer Vision (ICCV), 2023
Cited by: 463; Year: 2023
Simple Open-Vocabulary Object Detection with Vision Transformers
M Minderer, A Gritsenko, A Stone, M Neumann, D Weissenborn, ...
European Conference on Computer Vision (ECCV), 2022
Cited by: 457*; Year: 2022
A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark
X Zhai*, J Puigcerver*, A Kolesnikov*, P Ruyssen, C Riquelme, M Lucic, ...
arXiv preprint arXiv:1910.04867, 2019
Cited by: 434*; Year: 2019
An image is worth 16x16 words: Transformers for image recognition at scale
A Dosovitskiy, L Beyer, A Kolesnikov, D Weissenborn, X Zhai, ...
arXiv preprint arXiv:2010.11929, 2020
Cited by: 430; Year: 2020
Are we done with ImageNet?
L Beyer*, OJ Hénaff*, A Kolesnikov*, X Zhai*, A Oord*, *equal contribution
arXiv preprint arXiv:2006.07159, 2020
Cited by: 408; Year: 2020
Self-Supervised GANs via Auxiliary Rotation Loss
T Chen, X Zhai, M Ritter, M Lucic, N Houlsby
Computer Vision and Pattern Recognition (CVPR), 12154-12163, 2019
Cited by: 372; Year: 2019
Revisiting the Calibration of Modern Neural Networks
M Minderer, J Djolonga, R Romijnders, F Hubis, X Zhai, N Houlsby, ...
Advances in Neural Information Processing Systems (NeurIPS), 2021
Cited by: 347; Year: 2021
Knowledge distillation: A good teacher is patient and consistent
L Beyer*, X Zhai*, A Royer*, L Markeeva*, R Anil, A Kolesnikov*, ...
Computer Vision and Pattern Recognition (CVPR), 2022
Cited by: 319; Year: 2022
Learning cross-media joint representation with sparse and semisupervised regularization
X Zhai, Y Peng, J Xiao
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) 24 (6 …, 2013
Cited by: 306; Year: 2013