NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models G Zhou, Y Hong, Q Wu Proceedings of the AAAI Conference on Artificial Intelligence 38 (7), 7641-7649, 2024 | 117 | 2024 |
NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation J Zhang, K Wang, R Xu, G Zhou, Y Hong, X Fang, Q Wu, Z Zhang, W He Proceedings of Robotics: Science and Systems (RSS), 2024 | 26 | 2024 |
WebVLN: Vision-and-Language Navigation on Websites Q Chen, D Pitawela, C Zhao, G Zhou, HT Chen, Q Wu Proceedings of the AAAI Conference on Artificial Intelligence 38 (2), 1165-1173, 2024 | 15 | 2024 |
NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models G Zhou, Y Hong, Z Wang, XE Wang, Q Wu Proceedings of the European Conference on Computer Vision (ECCV), 2024 | 12 | 2024 |
SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts G Zhou, Y Hong, Z Wang, C Zhao, M Bansal, Q Wu arXiv preprint arXiv:2412.05552, 2024 | 1 | 2024 |