Weizhu Chen
Verified email at microsoft.com
Title
Cited by
Year
LoRA: Low-rank adaptation of large language models
EJ Hu, Y Shen, P Wallis, Z Allen-Zhu, Y Li, S Wang, L Wang, W Chen
arXiv preprint arXiv:2106.09685, 2021
Cited by: 3520 | Year: 2021
On the variance of the adaptive learning rate and beyond
L Liu, H Jiang, P He, W Chen, X Liu, J Gao, J Han
arXiv preprint arXiv:1908.03265, 2019
Cited by: 2013 | Year: 2019
DeBERTa: Decoding-enhanced BERT with disentangled attention
P He, X Liu, J Gao, W Chen
arXiv preprint arXiv:2006.03654, 2020
Cited by: 1921 | Year: 2020
Multi-task deep neural networks for natural language understanding
X Liu, P He, W Chen, J Gao
arXiv preprint arXiv:1901.11504, 2019
Cited by: 1329 | Year: 2019
What Makes Good In-Context Examples for GPT-3?
J Liu, D Shen, Y Zhang, B Dolan, L Carin, W Chen
arXiv preprint arXiv:2101.06804, 2021
Cited by: 786 | Year: 2021
DeBERTaV3: Improving DeBERTa using ELECTRA-style pre-training with gradient-disentangled embedding sharing
P He, J Gao, W Chen
arXiv preprint arXiv:2111.09543, 2021
Cited by: 566 | Year: 2021
SMART: Robust and efficient fine-tuning for pre-trained natural language models through principled regularized optimization
H Jiang, P He, W Chen, X Liu, J Gao, T Zhao
arXiv preprint arXiv:1911.03437, 2019
Cited by: 410 | Year: 2019
ReasoNet: Learning to stop reading in machine comprehension
Y Shen, PS Huang, J Gao, W Chen
Proceedings of the 23rd ACM SIGKDD international conference on knowledge …, 2017
Cited by: 331 | Year: 2017
Short text conceptualization using a probabilistic knowledgebase
Y Song, H Wang, Z Wang, H Li, W Chen
Proceedings of the twenty-second international joint conference on …, 2011
Cited by: 291 | Year: 2011
Understanding the difficulty of training transformers
L Liu, X Liu, J Gao, W Chen, J Han
arXiv preprint arXiv:2004.08249, 2020
Cited by: 240 | Year: 2020
Check your facts and try again: Improving large language models with external knowledge and automated feedback
B Peng, M Galley, P He, H Cheng, Y Xie, Y Hu, Q Huang, L Liden, Z Yu, ...
arXiv preprint arXiv:2302.12813, 2023
Cited by: 237 | Year: 2023
FusionNet: Fusing via fully-aware attention with application to machine comprehension
HY Huang, C Zhu, Y Shen, W Chen
arXiv preprint arXiv:1711.07341, 2017
Cited by: 203 | Year: 2017
Improving multi-task deep neural networks via knowledge distillation for natural language understanding
X Liu, P He, W Chen, J Gao
arXiv preprint arXiv:1904.09482, 2019
Cited by: 194 | Year: 2019
Document transformation for multi-label feature selection in text categorization
W Chen, J Yan, B Zhang, Z Chen, Q Yang
Seventh IEEE International Conference on Data Mining (ICDM 2007), 451-456, 2007
Cited by: 179 | Year: 2007
On the advance of making language models better reasoners
Y Li, Z Lin, S Zhang, Q Fu, B Chen, JG Lou, W Chen
arXiv preprint arXiv:2206.02336, 2022
Cited by: 169* | Year: 2022
AGIEval: A human-centric benchmark for evaluating foundation models
W Zhong, R Cui, Y Guo, Y Liang, S Lu, Y Wang, A Saied, W Chen, ...
arXiv preprint arXiv:2304.06364, 2023
Cited by: 161 | Year: 2023
Adversarial training for large neural language models
X Liu, H Cheng, P He, W Chen, Y Wang, H Poon, J Gao
arXiv preprint arXiv:2004.08994, 2020
Cited by: 160 | Year: 2020
TAPEX: Table pre-training via learning a neural SQL executor
Q Liu, B Chen, J Guo, M Ziyadi, Z Lin, W Chen, JG Lou
arXiv preprint arXiv:2107.07653, 2021
Cited by: 159 | Year: 2021
Generation-augmented retrieval for open-domain question answering
Y Mao, P He, X Liu, Y Shen, J Gao, J Han, W Chen
arXiv preprint arXiv:2009.08553, 2020
Cited by: 158 | Year: 2020
Few-shot named entity recognition: A comprehensive study
J Huang, C Li, K Subudhi, D Jose, S Balakrishnan, W Chen, B Peng, ...
arXiv preprint arXiv:2012.14978, 2020
Cited by: 151* | Year: 2020