Tri Dao

Cited by

	All	Since 2019
Citations	4985	4981
h-index	26	26
i10-index	35	35

Citations per year
Year	Citations
2019	46
2020	84
2021	148
2022	268
2023	1311
2024	3117

Public access

22 articles available
0 articles not available

Based on funding mandates

Co-authors

Christopher Ré	Computer Science, Stanford University	Verified email at cs.stanford.edu
Albert Gu	Carnegie Mellon University	Verified email at andrew.cmu.edu
Atri Rudra	Katherine Johnson Chair in AI, Professor, CSE, University at Buffalo	Verified email at buffalo.edu
Stefano Ermon	Stanford University	Verified email at cs.stanford.edu
Beidi Chen	Carnegie Mellon University	Verified email at andrew.cmu.edu
Daniel Y Fu	Graduate Student, Stanford University	Verified email at cs.stanford.edu
Zhao Song	Adobe Research	Verified email at ias.edu
Khaled Kamal Saab	Google, Stanford University	Verified email at google.com
Michael Poli	Stanford University	Verified email at stanford.edu
Karan Goel	Stanford University	Verified email at stanford.edu
Eric Nguyen	Stanford University	Verified email at stanford.edu
Ce Zhang	Together AI; University of Chicago	Verified email at together.xyz
Binhang Yuan（袁彬航）	Hong Kong University of Science and Technology	Verified email at ust.hk
Stephen Baccus	Professor of Neurobiology, Stanford University	Verified email at stanford.edu
Armin W. Thomas	Liquid AI	Verified email at liquid.ai
Christopher De Sa	Assistant Professor of Computer Science, Cornell University	Verified email at cs.cornell.edu
Jue Wang	Together AI; ZJU	Verified email at zju.edu.cn
Yongjun He	Systems Group @ ETH Zurich	Verified email at inf.ethz.ch
Zichang Liu	Rice University	Verified email at rice.edu
Stefano Massaroli	RIKEN	Verified email at riken.jp

Tri Dao

Princeton University, Together AI

Verified email at princeton.edu

Machine learning, Systems


Title	Authors	Venue	Cited by	Year
Flashattention: Fast and memory-efficient exact attention with IO-awareness	T Dao, D Fu, S Ermon, A Rudra, C Ré	Advances in Neural Information Processing Systems 35, 16344-16359, 2022	1000	2022
Mamba: Linear-time sequence modeling with selective state spaces	A Gu, T Dao	Conference on Language Modeling (COLM), 2023	602	2023
Starcoder: may the source be with you!	R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ...	Transactions on Machine Learning Research (TMLR), 2023	576*	2023
Flashattention-2: Faster attention with better parallelism and work partitioning	T Dao	International Conference on Learning Representations, 2023	324	2023
Hippo: Recurrent memory with optimal polynomial projections	A Gu, T Dao, S Ermon, A Rudra, C Ré	Advances in Neural Information Processing Systems 33, 1474-1487, 2020	273	2020
Combining recurrent, convolutional, and continuous-time models with linear state space layers	A Gu, I Johnson, K Goel, K Saab, T Dao, A Rudra, C Ré	Advances in Neural Information Processing Systems 34, 572-585, 2021	259	2021
Hungry Hungry Hippos: Towards Language Modeling with State Space Models	DY Fu, T Dao, KK Saab, AW Thomas, A Rudra, C Ré	The Eleventh International Conference on Learning Representations, 2023	239	2023
A kernel theory of modern data augmentation	T Dao, A Gu, A Ratner, V Smith, CD Sa, C Ré	Proceedings of the 36th International Conference on Machine Learning, ICML, 9-15, 2019	207	2019
Hyena Hierarchy: Towards Larger Convolutional Language Models	M Poli, S Massaroli, E Nguyen, DY Fu, T Dao, S Baccus, Y Bengio, ...	International Conference on Machine Learning, 2023	177	2023
Deja vu: Contextual sparsity for efficient LLMs at inference time	Z Liu, J Wang, T Dao, T Zhou, B Yuan, Z Song, A Shrivastava, C Zhang, ...	International Conference on Machine Learning, 22137-22176, 2023	127	2023
S4nd: Modeling images and videos as multidimensional signals with state spaces	E Nguyen, K Goel, A Gu, G Downs, P Shah, T Dao, S Baccus, C Ré	Advances in Neural Information Processing Systems 35, 2846-2861, 2022	115	2022
Learning fast algorithms for linear transforms using butterfly factorizations	T Dao, A Gu, M Eichhorn, A Rudra, C Ré	International Conference on Machine Learning, 1517-1527, 2019	107	2019
Scatterbrain: Unifying sparse and low-rank attention	B Chen, T Dao, E Winsor, Z Song, A Rudra, C Ré	Advances in Neural Information Processing Systems 34, 17413-17426, 2021	103	2021
Monarch: Expressive structured matrices for efficient and accurate training	T Dao, B Chen, NS Sohoni, A Desai, M Poli, J Grogan, A Liu, A Rao, ...	International Conference on Machine Learning, 4690-4721, 2022	73	2022
Mongoose: A learnable LSH framework for efficient neural network training	B Chen, Z Liu, B Peng, Z Xu, JL Li, T Dao, Z Song, A Shrivastava, C Ré	International Conference on Learning Representations, 2020	71	2020
Pixelated butterfly: Simple and efficient sparse training for neural network models	T Dao, B Chen, K Liang, J Yang, Z Song, A Rudra, C Ré	International Conference on Learning Representations, 2021	67	2021
Decentralized training of foundation models in heterogeneous environments	B Yuan, Y He, J Davis, T Zhang, T Dao, B Chen, PS Liang, C Ré, C Zhang	Advances in Neural Information Processing Systems 35, 25464-25477, 2022	63	2022
Gaussian quadrature for kernel features	T Dao, CM De Sa, C Ré	Advances in Neural Information Processing Systems 30, 2017	60	2017
Starcoder 2 and the stack v2: The next generation	A Lozhkov, R Li, LB Allal, F Cassano, J Lamy-Poirier, N Tazi, A Tang, ...	arXiv preprint arXiv:2402.19173, 2024	59	2024
Medusa: Simple LLM inference acceleration framework with multiple decoding heads	T Cai, Y Li, Z Geng, H Peng, JD Lee, D Chen, T Dao	International Conference on Machine Learning (ICML), 2024	58	2024
