Fine-tuning language models with just forward passes S Malladi, T Gao, E Nichani, A Damian, JD Lee, D Chen, S Arora Advances in Neural Information Processing Systems 36, 53038-53075, 2023 | 104 | 2023 |
A mathematical exploration of why language models help solve downstream tasks N Saunshi, S Malladi, S Arora International Conference on Learning Representations, 2021 | 80 | 2021 |
EMDomics: a robust and powerful method for the identification of genes differentially expressed between heterogeneous classes S Nabavi, D Schmolze, M Maitituoheti, S Malladi, AH Beck Bioinformatics 32 (4), 533-541, 2016 | 73 | 2016 |
On the validity of modeling SGD with stochastic differential equations (SDEs) Z Li, S Malladi, S Arora Advances in Neural Information Processing Systems 34, 12712-12725, 2021 | 71 | 2021 |
A kernel-based view of language model fine-tuning S Malladi, A Wettig, D Yu, D Chen, S Arora International Conference on Machine Learning, 23610-23641, 2023 | 46 | 2023 |
LESS: Selecting influential data for targeted instruction tuning M Xia, S Malladi, S Gururangan, S Arora, D Chen arXiv preprint arXiv:2402.04333, 2024 | 44 | 2024 |
On the SDEs and scaling rules for adaptive gradient algorithms S Malladi, K Lyu, A Panigrahi, S Arora Advances in Neural Information Processing Systems 35, 7697-7711, 2022 | 27 | 2022 |
Systematic analysis of sex-linked molecular alterations and therapies in cancer J Ma, S Malladi, AH Beck Scientific reports 6 (1), 19119, 2016 | 25 | 2016 |
Assessing treatment response in triple-negative breast cancer from quantitative image analysis in perfusion magnetic resonance imaging I Banerjee, S Malladi, D Lee, A Depeursinge, M Telli, J Lipson, D Golden, ... Journal of medical imaging 5 (1), 011008-011008, 2018 | 20 | 2018 |
Trainable transformer in transformer A Panigrahi, S Malladi, M Xia, S Arora arXiv preprint arXiv:2307.01189, 2023 | 18 | 2023 |
FastNorm: improving numerical stability of deep network training with efficient normalization S Malladi, I Sharapov | 11 | 2018 |
On the validity of modeling SGD with stochastic differential equations Z Li, S Malladi, S Arora arXiv 2102, 2021 | 7 | 2021 |
The marginal value of momentum for small learning rate SGD R Wang, S Malladi, T Wang, K Lyu, Z Li arXiv preprint arXiv:2307.15196, 2023 | 6 | 2023 |
MUSE: Machine unlearning six-way evaluation for language models W Shi, J Lee, Y Huang, S Malladi, J Zhao, A Holtzman, D Liu, ... arXiv preprint arXiv:2407.06460, 2024 | 5 | 2024 |
CharXiv: Charting gaps in realistic chart understanding in multimodal LLMs Z Wang, M Xia, L He, H Chen, Y Liu, R Zhu, K Liang, X Wu, H Liu, ... arXiv preprint arXiv:2406.18521, 2024 | 5 | 2024 |
Preference Learning Algorithms Do Not Learn Preference Rankings A Chen, S Malladi, LH Zhang, X Chen, Q Zhang, R Ranganath, K Cho arXiv preprint arXiv:2405.19534, 2024 | 2 | 2024 |
Systematic identification of sex-linked molecular alterations and therapeutic strategies in cancer J Ma, S Malladi, AH Beck Cancer Research 75 (15_Supplement), 2985-2985, 2015 | | 2015 |
Progressive distillation improves feature learning via implicit curriculum A Panigrahi, B Liu, S Malladi, A Risteski, S Goel ICML 2024 Workshop on Mechanistic Interpretability | | |
2nd Workshop on Mathematical and Empirical Understanding of Foundation Models SM Xie, A Kumar, S Min, S Malladi, LM Dery, A Raghunathan, T Ma, ... ICLR 2024 Workshops | | |