LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis Z Shen, R Zhang, M Dell, BCG Lee, J Carlson, W Li Document Analysis and Recognition–ICDAR 2021: 16th International Conference …, 2021 | 155 | 2021 |
Dolma: An Open Corpus of Three Trillion Tokens for Language Model Pretraining Research L Soldaini, R Kinney, A Bhagia, D Schwenk, D Atkinson, R Authur, ... ACL 2024, 2024 | 115 | 2024 |
The Semantic Scholar open data platform R Kinney, C Anastasiades, R Authur, I Beltagy, J Bragg, A Buraczynski, ... arXiv preprint arXiv:2301.10140, 2023 | 111 | 2023 |
A large dataset of historical Japanese documents with complex layouts Z Shen, K Zhang, M Dell Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2020 | 56 | 2020 |
Deep learning based framework for automatic damage detection in aircraft engine borescope inspection Z Shen, X Wan, F Ye, X Guan, S Liu 2019 International Conference on Computing, Networking and Communications …, 2019 | 51 | 2019 |
Multi-LexSum: Real-world summaries of civil rights lawsuits at multiple granularities Z Shen, K Lo, L Yu, N Dahlberg, M Schlanger, D Downey Advances in Neural Information Processing Systems 35, 13158-13173, 2022 | 50 | 2022 |
VILA: Improving structured content extraction from scientific PDFs using visual layout groups Z Shen, K Lo, LL Wang, B Kuehl, DS Weld, D Downey Transactions of the Association for Computational Linguistics 10, 376-392, 2022 | 44 | 2022 |
A Design Space for Intelligent and Interactive Writing Assistants M Lee, KI Gero, JJY Chung, SB Shum, V Raheja, H Shen, S Venugopalan, ... CHI 2024, 1-35, 2024 | 31 | 2024 |
Don't Say What You Don't Know: Improving the Consistency of Abstractive Summarization by Constraining Beam Search D King*, Z Shen*, N Subramani, DS Weld, I Beltagy, D Downey arXiv preprint arXiv:2203.08436, 2022 | 31 | 2022 |
Learning to Decode Collaboratively with Multiple Language Models SZ Shen, H Lang, B Wang, Y Kim, D Sontag ACL 2024, 2024 | 24 | 2024 |
American Stories: A large-scale structured text dataset of historical US newspapers M Dell, J Carlson, T Bryan, E Silcock, A Arora, Z Shen, L D'Amico-Wong, ... Advances in Neural Information Processing Systems 36, 2024 | 23 | 2024 |
PAWLS: PDF annotation with labels and structure M Neumann, Z Shen, S Skjonsberg arXiv preprint arXiv:2101.10281, 2021 | 21 | 2021 |
The Semantic Reader Project K Lo, JC Chang, A Head, J Bragg, AX Zhang, C Trier, C Anastasiades, ... Communications of the ACM, 2024 | 20* | 2024 |
PaperMage: A Unified Toolkit for Processing, Representing, and Manipulating Visually-Rich Scientific Documents K Lo, Z Shen, B Newman, JZ Chang, R Authur, E Bransom, S Candra, ... EMNLP 2023: System Demonstrations (🏆 Best Paper Demo Award 🏆), 495-507, 2023 | 18 | 2023 |
Beyond summarization: Designing AI support for real-world expository writing tasks Z Shen, T August, P Siangliulue, K Lo, J Bragg, J Hammerbacher, ... arXiv preprint arXiv:2304.02623, 2023 | 18 | 2023 |
OLALA: Object-level active learning for efficient document layout annotation Z Shen, J Zhao, M Dell, Y Yu, W Li arXiv preprint arXiv:2010.01762, 2020 | 17* | 2020 |
A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models S Hegselmann, SZ Shen, F Gierse, M Agrawal, D Sontag, X Jiang CHIL 2024, 2024 | 9* | 2024 |
Towards Verifiable Text Generation with Symbolic References LT Hennigen*, S Shen*, A Nrusimha, B Gapp, D Sontag, Y Kim COLM 2024, 2023 | 8 | 2023 |
Conceptualizing machine learning for dynamic information retrieval of electronic health record notes S Jiang, S Shen, M Agrawal, B Lam, N Kurtzman, S Horng, DR Karger, ... Machine Learning for Healthcare Conference, 343-359, 2023 | 7 | 2023 |
Information Extraction from Text Regions with Complex Tabular Structure. K Zhang, Z Shen, J Zhou, M Dell Conference on Neural Information Processing Systems, 2019 | 6 | 2019 |