Towards general computer control: A multimodal agent for Red Dead Redemption II as a case study W Tan, Z Ding, W Zhang, B Li, B Zhou, J Yue, H Xia, J Jiang, L Zheng, ... ICLR 2024 Workshop on Large Language Model (LLM) Agents, 2024 | 42 | 2024 |
Gaze target estimation inspired by interactive attention Z Hu, K Zhao, B Zhou, H Guo, S Wu, Y Yang, J Liu IEEE Transactions on Circuits and Systems for Video Technology 32 (12), 8524 …, 2022 | 27 | 2022 |
Learning from visual observation via offline pretrained state-to-go transformer B Zhou, K Li, J Jiang, Z Lu Advances in Neural Information Processing Systems 36, 59585-59605, 2023 | 11 | 2023 |
UniCode: Learning a unified codebook for multimodal large language models S Zheng, B Zhou, Y Feng, Y Wang, Z Lu European Conference on Computer Vision, 426-443, 2024 | 10 | 2024 |
GFIE: A dataset and baseline for gaze-following from 2D to 3D in indoor environments Z Hu, Y Yang, X Zhai, D Yang, B Zhou, J Liu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 10 | 2023 |
Learning diverse bimanual dexterous manipulation skills from human demonstrations B Zhou, H Yuan, Y Fu, Z Lu arXiv preprint arXiv:2410.02477, 2024 | 4 | 2024 |
Cross-embodiment dexterous grasping with reinforcement learning H Yuan, B Zhou, Y Fu, Z Lu arXiv preprint arXiv:2410.02479, 2024 | 2 | 2024 |
Pre-trained Visual Dynamics Representations for Efficient Policy Learning H Luo, B Zhou, Z Lu European Conference on Computer Vision, 249-267, 2024 | 2 | 2024 |
NOLO: Navigate Only Look Once B Zhou, Z Zhang, J Wang, Z Lu arXiv preprint arXiv:2408.01384, 2024 | | 2024 |