Papers
Viewing 1-10 of 926 papers
Self-Refine: Iterative Refinement with Self-Feedback
Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, K. Hermann, S. Welleck, A. Yazdanbakhsh, Peter Clark • NeurIPS • 2023
Like humans, large language models (LLMs) do not always generate the best output on their first try. Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback…
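
The excerpt describes a generate-critique-revise loop in which the same model produces feedback on its own output and then rewrites it. Below is a minimal sketch of that loop; the `llm` helper, the prompts, and the stopping check are illustrative assumptions, not the paper's actual implementation.

```python
def llm(prompt: str) -> str:
    """Hypothetical helper that sends a prompt to a language model and returns its reply."""
    raise NotImplementedError

def self_refine(task: str, max_iters: int = 3) -> str:
    # Initial attempt from the model.
    output = llm(f"Task: {task}\nWrite your best answer.")
    for _ in range(max_iters):
        # The same model critiques its own answer.
        feedback = llm(f"Task: {task}\nAnswer: {output}\nGive concrete feedback on how to improve this answer.")
        # Simplified stopping check (assumption): stop when the model sees nothing left to fix.
        if "no further changes" in feedback.lower():
            break
        # The model revises its answer using its own feedback.
        output = llm(
            f"Task: {task}\nAnswer: {output}\nFeedback: {feedback}\n"
            "Rewrite the answer so that it addresses the feedback."
        )
    return output
```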

SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding
Favyen Bastani, Piper Wolters, Ritwik Gupta, Joe Ferdinando, Aniruddha Kembhavi • ICCV • 2023
Remote sensing images are useful for a wide variety of planet monitoring applications, from tracking deforestation to tackling illegal fishing. The Earth is extremely diverse -- the amount of potential tasks in remote sensing images is massive, and the sizes…

A machine learning parameterization of clouds in a coarse-resolution climate model for unbiased radiation
Brian Henn, Y. R. Jauregui, Spencer K. Clark, Noah Brenowitz, J. McGibbon, Oliver Watt‐Meyer, Andrew G. Pauling, C. Bretherton • ESSOAr • 2023
Coarse-grid weather and climate models rely particularly on parameterizations of cloud fields, and coarse-grained cloud fields from a fine-grid reference model are a natural target for a machine-learned parameterization. We machine-learn the coarsened-fine…
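
The excerpt frames the method as learning a mapping from the coarse-grid model state to cloud fields coarse-grained from a fine-grid reference run. The sketch below shows only that general setup; the input and output variables, array shapes, and the choice of a random-forest regressor are assumptions for illustration, not the paper's configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Illustrative stand-in data: each row is one coarse-grid column.
coarse_state = rng.random((10_000, 8))       # e.g. temperature and humidity profiles on the coarse grid
coarsened_clouds = rng.random((10_000, 4))   # cloud fields from the fine-grid reference, averaged to the coarse grid

# Fit the machine-learned parameterization: coarse-grid state -> coarsened reference clouds.
parameterization = RandomForestRegressor(n_estimators=100)
parameterization.fit(coarse_state, coarsened_clouds)

# At run time, the climate model would query this in place of a conventional cloud scheme.
predicted_clouds = parameterization.predict(coarse_state[:1])
```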

PromptCap: Prompt-Guided Task-Aware Image Captioning
Yushi Hu, Hang Hua, Zhengyuan Yang, Weijia Shi, Noah A. Smith, Jiebo Luo • ICCV • Proceedings • 2023
Knowledge-based visual question answering (VQA) involves questions that require world knowledge beyond the image to yield the correct answer. Large language models (LMs) like GPT-3 are particularly helpful for this task because of their strong knowledge…
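
The title and excerpt together suggest a pipeline in which a question-aware caption carries the visual details to a text-only LLM. The sketch below is one plausible arrangement of that idea; `caption_model` and `llm` are hypothetical helpers, and the prompt format is an assumption.

```python
def caption_model(image_path: str, question: str) -> str:
    """Hypothetical captioner that describes the image with the question in mind."""
    raise NotImplementedError

def llm(prompt: str) -> str:
    """Hypothetical call to a large language model (e.g. a GPT-3-style API)."""
    raise NotImplementedError

def knowledge_vqa(image_path: str, question: str) -> str:
    # A caption tailored to the question conveys the relevant visual details to the LLM.
    caption = caption_model(image_path, question)
    # The LLM combines the caption with its world knowledge to produce an answer.
    return llm(f"Image description: {caption}\nQuestion: {question}\nAnswer:")
```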

TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
Yushi Hu, Benlin Liu, Jungo Kasai, Yizhong Wang, Mari Ostendorf, Ranjay Krishna, Noah A. Smith • ICCV • Proceedings • 2023
Despite thousands of researchers, engineers, and artists actively working on improving text-to-image generation models, systems often fail to produce images that accurately align with the text inputs. We introduce TIFA (Text-to-Image Faithfulness evaluation…
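
As the title indicates, faithfulness is evaluated with question answering over the generated image. A minimal sketch of that evaluation pattern follows; `generate_questions` and `vqa` are hypothetical helpers, and scoring by exact-match accuracy over the question set is a simplification.

```python
def generate_questions(text_prompt: str) -> list[tuple[str, str]]:
    """Hypothetical helper: derive (question, expected answer) pairs from the text prompt."""
    raise NotImplementedError

def vqa(image_path: str, question: str) -> str:
    """Hypothetical visual question answering model run on the generated image."""
    raise NotImplementedError

def faithfulness_score(image_path: str, text_prompt: str) -> float:
    # Ask about each element the prompt mentions and check the image's answers.
    qa_pairs = generate_questions(text_prompt)
    correct = sum(
        vqa(image_path, question).strip().lower() == expected.strip().lower()
        for question, expected in qa_pairs
    )
    return correct / len(qa_pairs)
```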

Exploiting Generalization in Offline Reinforcement Learning via Unseen State Augmentations
Nirbhay Modhe, Qiaozi Gao, A. Kalyan, Dhruv Batra, G. Thattai, G. Sukhatme • arXiv.org • 2023
Offline reinforcement learning (RL) methods strike a balance between exploration and exploitation by conservative value estimation -- penalizing values of unseen states and actions. Model-free methods penalize values at all unseen actions, while model-based…
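
The excerpt explains conservative value estimation as penalizing the values of state-action pairs not covered by the offline dataset. The toy function below only illustrates that idea; the penalty form, its coefficient, and the novelty measure are assumptions rather than the paper's formulation.

```python
def conservative_value(q_estimate: float, novelty: float, penalty_coef: float = 1.0) -> float:
    # Discount the value of state-action pairs the offline data does not support:
    # the more novel (less covered) the pair, the lower its usable value.
    return q_estimate - penalty_coef * novelty

# A well-covered pair keeps most of its value; an unseen pair is heavily discounted.
print(conservative_value(q_estimate=5.0, novelty=0.1))  # 4.9
print(conservative_value(q_estimate=5.0, novelty=2.0))  # 3.0
```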

The Bias Amplification Paradox in Text-to-Image Generation
P. Seshadri, Sameer Singh, Yanai Elazar • arXiv • 2023
Bias amplification is a phenomenon in which models increase imbalances present in the training data. In this paper, we study bias amplification in the text-to-image domain using Stable Diffusion by comparing gender ratios in training vs. generated images. We…
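
The comparison the excerpt describes, training versus generated gender ratios, reduces to simple counting. The snippet below shows that comparison with made-up labels; the label set and the difference-of-ratios amplification measure are illustrative assumptions, not the paper's exact metric.

```python
from collections import Counter

def female_ratio(labels: list[str]) -> float:
    """Fraction of images labeled 'female' in a set of gender labels."""
    counts = Counter(labels)
    return counts["female"] / max(len(labels), 1)

# Made-up labels for images associated with a single occupation prompt.
training_labels = ["female"] * 30 + ["male"] * 70
generated_labels = ["female"] * 10 + ["male"] * 90

# Amplification here: how far the generated ratio drifts from the training ratio.
amplification = female_ratio(generated_labels) - female_ratio(training_labels)
print(f"training={female_ratio(training_labels):.2f} "
      f"generated={female_ratio(generated_labels):.2f} "
      f"amplification={amplification:+.2f}")
```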

Bound by the Bounty: Collaboratively Shaping Evaluation Processes for Queer AI Harms
Organizer of Queer In AI, Nathaniel Dennler, Anaelia Ovalle, Ashwin Singh, Luca Soldaini, Arjun Subramonian, Huy Tu, William Agnew, Avijit Ghosh, Kyra Yee, Irene Font Peradejordi, Zeerak Talat, Mayra Russo, Jessica de Jesus de Pinho Pinhal • AIES • 2023
Bias evaluation benchmarks and dataset and model documentation have emerged as central processes for assessing the biases and harms of artificial intelligence (AI) systems. However, these auditing processes have been criticized for their failure to integrate…

LEXPLAIN: Improving Model Explanations via Lexicon Supervision
Orevaoghene Ahia, Hila Gonen, Vidhisha Balachandran, Yulia Tsvetkov, Noah A. Smith • *SEM • Proceedings • 2023
Model explanations that shed light on the model’s predictions are becoming a desired additional output of NLP models, alongside their predictions. Challenges in creating these explanations include making them trustworthy and faithful to the model’s…

When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
Alex Mallen, Akari Asai, Victor Zhong, R. Das, Daniel Khashabi, Hannaneh Hajishirzi • Annual Meeting of the Association for Computational Linguistics • 2023
Despite their impressive performance on diverse tasks, large language models (LMs) still struggle with tasks requiring rich world knowledge, implying the difficulty of encoding a wealth of world knowledge in their parameters. This paper aims to understand LMs…