Ronghang Hu
Research Scientist, AI at Meta
Verified email at meta.com
Title
Cited by
Year
Learning to reason: End-to-end module networks for visual question answering
R Hu, J Andreas, M Rohrbach, T Darrell, K Saenko
Proceedings of the IEEE international conference on computer vision, 804-813, 2017
Cited by: 709; Year: 2017
ConvNeXt V2: Co-designing and scaling ConvNets with masked autoencoders
S Woo, S Debnath, R Hu, X Chen, Z Liu, IS Kweon, S Xie
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 683*; Year: 2023
FLAVA: A foundational language and vision alignment model
A Singh, R Hu, V Goswami, G Couairon, W Galuba, M Rohrbach, D Kiela
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 650; Year: 2022
Natural language object retrieval
R Hu, H Xu, M Rohrbach, J Feng, K Saenko, T Darrell
Proceedings of the IEEE conference on computer vision and pattern …, 2016
Cited by: 646; Year: 2016
Grounding of textual phrases in images by reconstruction
A Rohrbach, M Rohrbach, R Hu, T Darrell, B Schiele
Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The …, 2016
Cited by: 550; Year: 2016
Speaker-follower models for vision-and-language navigation
D Fried, R Hu, V Cirik, A Rohrbach, J Andreas, LP Morency, ...
Advances in neural information processing systems 31, 2018
Cited by: 521; Year: 2018
Segmentation from natural language expressions
R Hu, M Rohrbach, T Darrell
Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The …, 2016
Cited by: 453; Year: 2016
Modeling relationships in referential expressions with compositional modular networks
R Hu, M Rohrbach, J Andreas, T Darrell, K Saenko
Proceedings of the IEEE conference on computer vision and pattern …, 2017
Cited by: 428; Year: 2017
LSDA: Large scale detection through adaptation
J Hoffman, S Guadarrama, ES Tzeng, R Hu, J Donahue, R Girshick, ...
Advances in neural information processing systems 27, 2014
Cited by: 385; Year: 2014
UniT: Multimodal Multitask Learning with a Unified Transformer
R Hu, A Singh
arXiv preprint arXiv:2102.10772, 2021
Cited by: 375; Year: 2021
Learning to segment every thing
R Hu, P Dollár, K He, T Darrell, R Girshick
Proceedings of the IEEE conference on computer vision and pattern …, 2018
Cited by: 361; Year: 2018
TextCaps: a dataset for image captioning with reading comprehension
O Sidorov, R Hu, M Rohrbach, A Singh
Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020
Cited by: 348; Year: 2020
Scaling language-image pre-training via masking
Y Li, H Fan, R Hu, C Feichtenhofer, K He
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 272; Year: 2023
Grounding visual explanations
LA Hendricks, R Hu, T Darrell, Z Akata
Proceedings of the European Conference on Computer Vision (ECCV), 264-279, 2018
Cited by: 238; Year: 2018
Iterative answer prediction with pointer-augmented multimodal transformers for TextVQA
R Hu, A Singh, T Darrell, M Rohrbach
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020
Cited by: 231; Year: 2020
Explainable neural computation via stack neural module networks
R Hu, J Andreas, T Darrell, K Saenko
Proceedings of the European conference on computer vision (ECCV), 53-69, 2018
Cited by: 228; Year: 2018
Language-conditioned graph networks for relational reasoning
R Hu, A Rohrbach, T Darrell, K Saenko
Proceedings of the IEEE/CVF international conference on computer vision …, 2019
Cited by: 195; Year: 2019
Generating counterfactual explanations with natural language
LA Hendricks, R Hu, T Darrell, Z Akata
arXiv preprint arXiv:1806.09809, 2018
Cited by: 116; Year: 2018
SAM 2: Segment anything in images and videos
N Ravi, V Gabeur, YT Hu, R Hu, C Ryali, T Ma, H Khedr, R Rädle, ...
arXiv preprint arXiv:2408.00714, 2024
Cited by: 115; Year: 2024
Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation
R Hu, D Fried, A Rohrbach, D Klein, T Darrell, K Saenko
arXiv preprint arXiv:1906.00347, 2019
Cited by: 94; Year: 2019