Pan Zhang
Shanghai AI Laboratory
Verified email at mail.ustc.edu.cn - Homepage
Title
Cited by
Year
Prototypical pseudo label denoising and target structure learning for domain adaptive semantic segmentation
P Zhang, B Zhang, T Zhang, D Chen, Y Wang, F Wen
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021
Cited by: 575, Year: 2021
Cross-domain correspondence learning for exemplar-based image translation
P Zhang, B Zhang, D Chen, L Yuan, F Wen
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020
Cited by: 446, Year: 2020
Sharegpt4v: Improving large multi-modal models with better captions
L Chen, J Li, X Dong, P Zhang, C He, J Wang, F Zhao, D Lin
European Conference on Computer Vision, 370-387, 2025
Cited by: 362, Year: 2025
Cocosnet v2: Full-resolution correspondence learning for image translation
X Zhou, B Zhang, T Zhang, P Zhang, J Bao, D Chen, Z Zhang, F Wen
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021
Cited by: 301, Year: 2021
Bringing old photos back to life
Z Wan, B Zhang, D Chen, P Zhang, D Chen, J Liao, F Wen
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020
Cited by: 245, Year: 2020
Internlm: A multilingual language model with progressively enhanced capabilities
ILM Team
2023-01-06)[2023-09-27]. https://github.com/InternLM/InternLM, 2023
Cited by: 177, Year: 2023
VLMEvalKit: An open-source toolkit for evaluating large multi-modality models
H Duan, J Yang, Y Qiao, X Fang, L Chen, Y Liu, X Dong, Y Zang, P Zhang, ...
Proceedings of the 32nd ACM International Conference on Multimedia, 11198-11201, 2024
Cited by: 174*, Year: 2024
Internlm-xcomposer2: Mastering free-form text-image composition and comprehension in vision-language large model
X Dong, P Zhang, Y Zang, Y Cao, B Wang, L Ouyang, X Wei, S Zhang, ...
arXiv preprint arXiv:2401.16420, 2024
Cited by: 173, Year: 2024
Internlm-xcomposer: A vision-language large model for advanced text-image comprehension and composition
P Zhang, X Dong, B Wang, Y Cao, C Xu, L Ouyang, Z Zhao, H Duan, ...
arXiv preprint arXiv:2309.15112, 2023
Cited by: 163, Year: 2023
Internlm2 technical report
Z Cai, M Cao, H Chen, K Chen, K Chen, X Chen, X Chen, Z Chen, Z Chen, ...
arXiv preprint arXiv:2403.17297, 2024
Cited by: 159, Year: 2024
Are We on the Right Way for Evaluating Large Vision-Language Models?
L Chen, J Li, X Dong, P Zhang, Y Zang, Z Chen, H Duan, J Wang, Y Qiao, ...
arXiv preprint arXiv:2403.20330, 2024
Cited by: 105, Year: 2024
Opera: Alleviating hallucination in multi-modal large language models via over-trust penalty and retrospection-allocation
Q Huang, X Dong, P Zhang, B Wang, C He, J Wang, D Lin, W Zhang, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by: 101, Year: 2024
Internlm-xcomposer2-4khd: A pioneering large vision-language model handling resolutions from 336 pixels to 4k hd
X Dong, P Zhang, Y Zang, Y Cao, B Wang, L Ouyang, S Zhang, H Duan, ...
arXiv preprint arXiv:2404.06512, 2024
Cited by: 80, Year: 2024
Old photo restoration via deep latent space translation
Z Wan, B Zhang, D Chen, P Zhang, F Wen, J Liao
IEEE Transactions on Pattern Analysis and Machine Intelligence 45 (2), 2071-2087, 2022
Cited by: 73, Year: 2022
Vigc: Visual instruction generation and correction
B Wang, F Wu, X Han, J Peng, H Zhong, P Zhang, X Dong, W Li, W Li, ...
Proceedings of the AAAI Conference on Artificial Intelligence 38 (6), 5309-5317, 2024
Cited by: 61, Year: 2024
Long-clip: Unlocking the long-text capability of clip
B Zhang, P Zhang, X Dong, Y Zang, J Wang
European Conference on Computer Vision, 310-325, 2025
Cited by: 56, Year: 2025
Sharegpt4video: Improving video understanding and generation with better captions
L Chen, X Wei, J Li, X Dong, P Zhang, Y Zang, Z Chen, H Duan, B Lin, ...
arXiv preprint arXiv:2406.04325, 2024
Cited by: 55, Year: 2024
Metaportrait: Identity-preserving talking head generation with fast personalized adaptation
B Zhang, C Qi, P Zhang, B Zhang, HT Wu, D Chen, Q Chen, Y Wang, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 48, Year: 2023
Alpha-clip: A clip model focusing on wherever you want
Z Sun, Y Fang, T Wu, P Zhang, Y Zang, S Kong, Y Xiong, D Lin, J Wang
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by: 43, Year: 2024
V3det: Vast vocabulary visual detection dataset
J Wang, P Zhang, T Chu, Y Cao, Y Zhou, T Wu, B Wang, C He, D Lin
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 43, Year: 2023
Articles 1–20