Hao Tan

Cited by

	All	Since 2019
Citations	4487	4426
h-index	19	19
i10-index	26	25

Citations per year

Year	Citations
2018	45
2019	81
2020	329
2021	659
2022	1201
2023	1512
2024	637

Public access

6 articles available, 0 articles not available
Based on funding mandates

Co-authors

Mohit Bansal - Parker Distinguished Professor, Computer Science, UNC Chapel Hill - Verified email at cs.unc.edu
Trung H. Bui - Senior Research Scientist & Research Manager, Adobe Research - Verified email at adobe.com
Sheng Shen - UC Berkeley - Verified email at berkeley.edu
Licheng Yu 虞立成 - Research Scientist and Manager, Facebook AI - Verified email at fb.com
Jie Lei 雷杰 - Research Scientist, Meta AI - Verified email at fb.com
Jaemin Cho - PhD Student at UNC Chapel Hill - Verified email at cs.unc.edu
Zhewei Yao - Snowflake - Verified email at snowflake.com
Yicong Hong - Adobe Research - Verified email at anu.edu.au
Jialu Li - UNC Chapel Hill - Verified email at cs.unc.edu
Liunian Harold Li - University of California, Los Angeles - Verified email at cs.ucla.edu
Franck Dernoncourt - NLP/ML Researcher. MIT PhD. - Verified email at adobe.com
Hyounghun Kim - Ulsan National Institute of Science and Technology (UNIST) - Verified email at unist.ac.kr
Zhe L. Lin - Senior Principal Scientist, Adobe Research - Verified email at adobe.com
Yixin Nie - Meta, UNC Chapel Hill - Verified email at meta.com

Hao Tan

Adobe Research

Verified email at adobe.com

Vision and Language, 3D, Multimodal


Title	Cited by	Year
LXMERT: Learning cross-modality encoder representations from transformers. H Tan, M Bansal. Proceedings of the 2019 Conference on Empirical Methods in Natural Language …, 2019	2340	2019
Unifying vision-and-language tasks via text generation. J Cho, J Lei, H Tan, M Bansal. International Conference on Machine Learning, 1931-1942, 2021	440	2021
How much can CLIP benefit vision-and-language tasks? S Shen, LH Li, H Tan, M Bansal, A Rohrbach, KW Chang, Z Yao, ... arXiv preprint arXiv:2107.06383, 2021	354	2021
Learning to navigate unseen environments: Back translation with environmental dropout. H Tan, L Yu, M Bansal. arXiv preprint arXiv:1904.04195, 2019	295	2019
A joint speaker-listener-reinforcer model for referring expressions. L Yu, H Tan, M Bansal, TL Berg. Proceedings of the IEEE conference on computer vision and pattern …, 2017	287	2017
Vokenization: Improving language understanding with contextualized, visual-grounded supervision. H Tan, M Bansal. arXiv preprint arXiv:2010.06775, 2020	118	2020
VIMPAC: Video pre-training via masked token prediction and contrastive learning. H Tan, J Lei, T Wolf, M Bansal. arXiv preprint arXiv:2106.11250, 2021	60	2021
EnvEdit: Environment editing for vision-and-language navigation. J Li, H Tan, M Bansal. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022	57	2022
LRM: Large reconstruction model for single image to 3D. Y Hong, K Zhang, J Gu, S Bi, Y Zhou, D Liu, F Liu, K Sunkavalli, T Bui, ... arXiv preprint arXiv:2311.04400, 2023	54	2023
Enabling robots to understand incomplete natural language instructions using commonsense reasoning. H Chen, H Tan, A Kuntz, M Bansal, R Alterovitz. 2020 IEEE International Conference on Robotics and Automation (ICRA), 1963-1969, 2020	51	2020
Diagnosing the environment bias in vision-and-language navigation. Y Zhang, H Tan, M Bansal. arXiv preprint arXiv:2005.03086, 2020	50	2020
Instant3D: Fast text-to-3D with sparse-view generation and large reconstruction model. J Li, H Tan, K Zhang, Z Xu, F Luan, Y Xu, Y Hong, K Sunkavalli, ... arXiv preprint arXiv:2311.06214, 2023	42	2023
Expressing visual relationships via language. H Tan, F Dernoncourt, Z Lin, T Bui, M Bansal. arXiv preprint arXiv:1906.07689, 2019	39	2019
The curse of performance instability in analysis datasets: Consequences, source, and suggestions. X Zhou, Y Nie, H Tan, M Bansal. arXiv preprint arXiv:2004.13606, 2020	38	2020
An Effective Framework for Weakly-Supervised Phrase Grounding. Q Wang, H Tan, S Shen, M Mahoney, Z Yao. Proceedings of the 2020 Conference on Empirical Methods in Natural Language …, 2020	34*	2020
Improving cross-modal alignment in vision language navigation via syntactic information. J Li, H Tan, M Bansal. arXiv preprint arXiv:2104.09580, 2021	32	2021
DMV3D: Denoising multi-view diffusion using 3D large reconstruction model. Y Xu, H Tan, F Luan, S Bi, P Wang, J Li, Z Shi, K Sunkavalli, G Wetzstein, ... arXiv preprint arXiv:2311.09217, 2023	27	2023
VidLanKD: Improving language understanding via video-distilled knowledge transfer. Z Tang, J Cho, H Tan, M Bansal. Advances in Neural Information Processing Systems 34, 24468-24481, 2021	26	2021
Modality-balanced models for visual dialogue. H Kim, H Tan, M Bansal. Proceedings of the AAAI Conference on Artificial Intelligence 34 (05), 8091-8098, 2020	22	2020
DocumentCLIP: Linking figures and main body text in reflowed documents. F Liu, H Tan, C Tensmeyer. arXiv preprint arXiv:2306.06306, 2023	19	2023
