Yongfei Liu
Bytedance
Verified email at bytedance.com - Homepage
Title · Cited by · Year
Part-aware Prototype Network for Few-shot Semantic Segmentation
Y Liu, X Zhang, S Zhang, X He
ECCV-2020, 2020
Cited by 369 · 2020
Pose-aware multi-level feature network for human object interaction detection
B Wan, D Zhou, Y Liu, R Li, X He
ICCV2019, 9469-9478, 2019
Cited by 260 · 2019
Learning Cross-modal Context Graph for Visual Grounding
Y Liu, B Wan, X Zhu, X He
AAAI-2020, 2020
Cited by 88 · 2020
Relation-aware Instance Refinement for Weakly Supervised Visual Grounding
Y Liu, B Wan, L Ma, X He
CVPR2021, 2021
Cited by 60 · 2021
Vl-interpret: An interactive visualization tool for interpreting vision-language transformers
E Aflalo, M Du, SY Tseng, Y Liu, C Wu, N Duan, V Lal
Proceedings of the IEEE/CVF Conference on computer vision and pattern …, 2022
Cited by 50 · 2022
Kd-vlp: Improving end-to-end vision-and-language pretraining with object knowledge distillation
Y Liu, C Wu, S Tseng, V Lal, X He, N Duan
arXiv preprint arXiv:2109.10504, 2021
Cited by 25 · 2021
Exploring the reasoning abilities of multimodal large language models (mllms): A comprehensive survey on emerging trends in multimodal reasoning
Y Wang, W Chen, X Han, X Lin, H Zhao, Y Liu, B Zhai, J Yuan, Q You, ...
arXiv preprint arXiv:2401.06805, 2024
Cited by 23 · 2024
GEM: A General Evaluation Benchmark for Multimodal Tasks
L Su, N Duan, E Cui, L Ji, C Wu, H Luo, Y Liu, M Zhong, T Bharti, ...
ACL2021 Findings, 2021
Cited by 16 · 2021
Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning
B Wan*, Y Liu*, D Zhou, T Tuytelaars, X He
ICLR 2023, 2023
Cited by 14 · 2023
Intention-aware feature propagation network for interactive segmentation
C Zhang, C Hu, Y Liu, X He
BMVC2023, 2022
Cited by 9* · 2022
Reason out your layout: Evoking the layout master from large language models for text-to-image synthesis
X Chen, Y Liu, Y Yang, J Yuan, Q You, LP Liu, H Yang
arXiv preprint arXiv:2311.17126, 2023
Cited by 8 · 2023
Grounded Image Text Matching with Mismatched Relation Reasoning
Y Wu, Y Wei, H Wang, Y Liu, S Yang, X He
ICCV2023, 2023
Cited by 6 · 2023
ViTAR: Vision Transformer with Any Resolution
Q Fan, Q You, X Han, Y Liu, Y Tao, H Huang, R He, H Yang
arXiv preprint arXiv:2403.18361, 2024
Cited by 3 · 2024
Improving in-context learning in diffusion models with visual context-modulated prompts
T Chen, Y Liu, Z Wang, J Yuan, Q You, H Yang, M Zhou
arXiv preprint arXiv:2312.01408, 2023
Cited by 3 · 2023
Visual Anchors Are Strong Information Aggregators For Multimodal Large Language Model
H Liu, Q You, X Han, Y Liu, H Huang, R He, H Yang
arXiv preprint arXiv:2405.17815, 2024
Cited by 1 · 2024
InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding
H Liu, Q You, X Han, Y Wang, B Zhai, Y Liu, Y Tao, H Huang, R He, ...
arXiv preprint arXiv:2403.01487, 2024
2024
CORE-MM: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models
X Han, Q You, Y Liu, W Chen, H Zheng, K Mrini, X Lin, Y Wang, B Zhai, ...
arXiv preprint arXiv:2311.11567, 2023
2023
HOICLIP: Efficient Knowledge Transfer for HOI Detection with Visual Linguistic Model
S Ning, L Qiu, Y Liu, X He
CVPR2023, 2023
2023