Referring expression comprehension: A survey of methods and datasets Y Qiao, C Deng, Q Wu IEEE Transactions on Multimedia 23, 4426-4440, 2020 | 98 | 2020 |
HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation Y Qiao, Y Qi, Y Hong, Z Yu, P Wang, Q Wu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 88 | 2022 |
VL-Mamba: Exploring State Space Models for Multimodal Learning Y Qiao, Z Yu, Z Zhao, S Chen, M Sun, L Guo, Q Wu, J Liu NeurIPS Workshop on Efficient Natural Language and Speech Processing, 2024 | 54 | 2024 |
HOP+: History-enhanced and order-aware pre-training for vision-and-language navigation Y Qiao, Y Qi, Y Hong, Z Yu, P Wang, Q Wu IEEE Transactions on Pattern Analysis and Machine Intelligence 45 (7), 8524-8537, 2023 | 53 | 2023 |
Improving visual question answering using dropout and enhanced question encoder Z Fang, J Liu, Y Li, Y Qiao, H Lu Pattern Recognition 90, 404-414, 2019 | 34 | 2019 |
March in Chat: Interactive Prompting for Remote Embodied Referring Expression Y Qiao, Y Qi, Z Yu, J Liu, Q Wu Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 33 | 2023 |
R-GAN: Exploring Human-like Way for Reasonable Text-to-Image Synthesis via Generative Adversarial Networks Y Qiao, Q Chen, C Deng, N Ding, Y Qi, M Tan, X Ren, Q Wu Proceedings of the 29th ACM International Conference on Multimedia, 2085-2093, 2021 | 20 | 2021 |
VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation Y Qiao, Z Yu, Q Wu Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 14 | 2023 |
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models Y Zhang, Z Ma, J Li, Y Qiao, Z Wang, J Chai, Q Wu, M Bansal, ... Transactions on Machine Learning Research (TMLR), 2024 | 13 | 2024 |
RankVQA: Answer re-ranking for visual question answering Y Qiao, Z Yu, J Liu 2020 IEEE International Conference on Multimedia and Expo (ICME), 1-6, 2020 | 13 | 2020 |
VC-VQA: visual calibration mechanism for visual question answering Y Qiao, Z Yu, J Liu 2020 IEEE International Conference on Image Processing (ICIP), 1481-1485, 2020 | 9 | 2020 |
Enhancing visual question answering using dropout Z Fang, J Liu, Y Qiao, Q Tang, Y Li, H Lu Proceedings of the 26th ACM international conference on Multimedia, 1002-1010, 2018 | 5 | 2018 |
LLM as Copilot for Coarse-Grained Vision-and-Language Navigation Y Qiao, Q Liu, J Liu, J Liu, Q Wu European Conference on Computer Vision ECCV 2024 15063, 459-476, 2024 | 3 | 2024 |
Multi-modal Adapter for Medical Vision-and-Language Learning Z Yu, Y Qiao, Y Xie, Q Wu International Workshop on Machine Learning in Medical Imaging, 393-402, 2023 | 3 | 2023 |
Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs Y Qiao, W Lyu, H Wang, Z Wang, Z Li, Y Zhang, M Tan, Q Wu International Conference on Robotics and Automation (ICRA), 2025 | 1 | 2025 |
MM-LDM: Multi-Modal Latent Diffusion Model for Sounding Video Generation M Sun, W Wang, Y Qiao, J Sun, Z Qin, L Guo, X Zhu, J Liu ACM Multimedia 2024, 2024 | 1 | 2024 |
Improving Online Source-free Domain Adaptation for Object Detection by Unsupervised Data Acquisition X Shi, Y Qiao, Q Wu, L Liu, F Dayoub ECCV Workshop on ROAM 2024, 2023 | 1 | 2023 |
General Scene Adaptation for Vision-and-Language Navigation H Hong, Y Qiao, S Wang, J Liu, Q Wu International Conference on Learning Representations (ICLR), 2025 | | 2025 |
Effective Tuning Strategies for Generalist Robot Manipulation Policies W Zhang, Y Li, Y Qiao, S Huang, J Liu, F Dayoub, X Ma, L Liu International Conference on Robotics and Automation (ICRA), 2025 | | 2025 |
MiniVLN: Efficient Vision-and-Language Navigation by Progressive Knowledge Distillation J Zhu, Y Qiao, S Zhang, X He, Q Wu, J Liu International Conference on Robotics and Automation (ICRA), 2025 | | 2025 |