Award Winning Papers

Learn more about AI2's Lasting Impact Award
Viewing 11-20 of 43 papers
  • Queer In AI: A Case Study in Community-Led Participatory AI

    Organizers Of Queer in AI, Anaelia Ovalle, Arjun Subramonian, Ashwin Singh, Claas Voelcker, Danica J. Sutherland, Davide Locatelli, Eva Breznik, Filip Klubička, Hang Yuan, Hetvi J, Huan Zhang, Jaidev Shriram, Kruno Lehman, Luca Soldaini, Maarten Sap, Marc Peter Deisenroth, Maria Leonor Pacheco, Maria Ryskina, Martin Mundt, Melvin Selim Atay, Milind Agarwal, Nyx McLean, Pan Xu, A Pranav, Raj Korpan, Ruchira Ray, Sarah Mathew, Sarthak Arora, St John, Tanvi Anand, Vishakha Agrawal, William Agnew, Yanan Long, Zijie J. Wang, Zeerak Talat, Avijit Ghosh, Nathaniel Dennler, Michael Noseworthy, Sharvani Jha, Emi Baylor, Aditya Joshi, Natalia Y. Bilenko, Andrew McNamara, Raphael Gontijo-Lopes, Alex Markham, Evyn Dǒng, Jackie Kay, Manu Saraswat, Nikhil Vytla, Luke StarkFAccT2023 We present Queer in AI as a case study for community-led participatory design in AI. We examine how participatory design and intersectional tenets started and shaped this community's programs over the years. We discuss different challenges that emerged in the…
  • Abstract Visual Reasoning with Tangram Shapes

    Anya Ji, Noriyuki Kojima, N. Rush, Alane Suhr, Wai Keen Vong, Robert D. Hawkins, Yoav ArtziEMNLP2022
    Best Long Paper Award
    We introduce KiloGram, a resource for studying abstract visual reasoning in humans and machines. Drawing on the history of tangram puzzles as stimuli in cognitive science, we build a richly annotated dataset that, with > 1k distinct stimuli, is orders of…
  • CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about Negation

    Abhilasha Ravichander, Matt Gardner, Ana MarasovićEMNLP2022 The full power of human language-based communication cannot be realized without negation. All human languages have some form of negation. Despite this, negation remains a challenging phenomenon for current natural language understanding systems. To facilitate…
  • ProcTHOR: Large-Scale Embodied AI Using Procedural Generation

    Matt Deitke, Eli VanderBilt, Alvaro Herrasti, Luca Weihs, Jordi Salvador, Kiana Ehsani, Winson Han, Eric Kolve, Ali Farhadi, Aniruddha Kembhavi, Roozbeh MottaghiNeurIPS2022 Massive datasets and high-capacity models have driven many recent advancements in computer vision and natural language understanding. This work presents a platform to enable similar success stories in Embodied AI. We propose ProcTHOR, a framework for…
  • Robust fine-tuning of zero-shot models

    Mitchell Wortsman, Gabriel Ilharco, Mike Li, Jong Wook Kim, Hannaneh Hajishirzi, Ali Farhadi, Hongseok Namkoong, Ludwig SchmidtCVPR2022
    Best Paper Finalist
    Large pre-trained models such as CLIP or ALIGN offer consistent accuracy across a range of data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific dataset). Although existing fine-tuning methods substantially improve…
  • NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics

    Ximing Lu, S. Welleck, Peter West, Liwei Jiang, Jungo Kasai, Daniel Khashabi, Ronan Le Bras, Lianhui Qin, Youngjae Yu, Rowan Zellers, Noah A. Smith, Yejin ChoiNAACL2022
    Best Paper Award
    The dominant paradigm for neural text generation is left-to-right decoding from autoregressive language models. Constrained or controllable generation under complex lexical constraints, however, requires foresight to plan ahead feasible future paths. Drawing…
  • Understanding Dataset Difficulty with 𝒱-Usable Information

    Kawin Ethayarajh, Yejin Choi, and Swabha SwayamdiptaICML2022 Estimating the difficulty of a dataset typically involves comparing state-of-the-art models to humans; the bigger the performance gap, the harder the dataset is said to be. However, this comparison provides little understanding of how difficult each instance…
  • Hallett‐Mossop Rime Splintering Dims Cumulus Clouds Over the Southern Ocean: New Insight From Nudged Global Storm‐Resolving Simulations

    R. Atlas, C. Bretherton, M. Khairoutdinov, P. BlosseyAGU Advances2022 In clouds containing both liquid and ice with temperatures between −3°C and −8°C, liquid droplets collide with large ice crystals, freeze, and shatter, producing a plethora of small ice splinters. This process, known as Hallett‐Mossop rime splintering, and…
  • Correcting Coarse-Grid Weather and Climate Models by Machine Learning From Global Storm-Resolving Simulations

    Bretherton, C. S., B. Henn, A. Kwa, N. D. Brenowitz, O. Watt-Meyer, J. McGibbon, W. A. Perkins, S. K. Clark, and L. HarrisJournal of Advances in Modeling Earth Systems2022 Global atmospheric `storm-resolving' models with horizontal grid spacing of less than 5~km resolve deep cumulus convection and flow in complex terrain. They promise to be reference models that could be used to improve computationally affordable coarse-grid…
  • MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers

    Krishna Pillutla, Swabha Swayamdipta, Rowan Zellers, John Thickstun, S. Welleck, Yejin Choi, Z. HarchaouiNeurIPS2021 As major progress is made in open-ended text generation, measuring how close machine-generated text is to human language remains a critical open problem. We introduce MAUVE , a comparison measure for open-ended text generation, which directly compares the…