Papers
QuASE: Question-Answer Driven Sentence Encoding.
Hangfeng He, Qiang Ning, Dan Roth
ACL • 2020
Question-answering (QA) data often encodes essential information in many facets. This paper studies a natural question: Can we get supervision from QA data for other tasks (typically, non-QA ones)? For example, can we use QAMR (Michael et al., 2017) to…

Recollection versus Imagination: Exploring Human Memory and Cognition via Neural Language Models
Maarten Sap, Eric Horvitz, Yejin Choi, Noah A. Smith, James W. Pennebaker
ACL • 2020
We investigate the use of NLP as a measure of the cognitive processes involved in storytelling, contrasting imagination and recollection of events. To facilitate this, we collect and release HIPPOCORPUS, a dataset of 7,000 stories about imagined and recalled…

Social Bias Frames: Reasoning about Social and Power Implications of Language
Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A. Smith, Yejin Choi
ACL • 2020
Language has the power to reinforce stereotypes and project social biases onto others. At the core of the challenge is that it is rarely what is stated explicitly, but all the implied meanings that frame people's judgements about others. For example, given a…

WeCNLP Best Paper
The Right Tool for the Job: Matching Model and Instance Complexities
Roy Schwartz, Gabi Stanovsky, Swabha Swayamdipta, Jesse Dodge, Noah A. Smith
ACL • 2020
As NLP models become larger, executing a trained model requires significant computational resources, incurring monetary and environmental costs. To better respect a given inference budget, we propose a modification to contextual representation fine-tuning…

Latent Compositional Representations Improve Systematic Generalization in Grounded Question Answering
Ben Bogin, Sanjay Subramanian, Matt Gardner, Jonathan Berant
TACL • 2020
Answering questions that involve multi-step reasoning requires decomposing them and using the answers of intermediate steps to reach the final answer. However, state-of-the-art models in grounded question answering often do not explicitly perform decomposition…

Contextual Word Representations: Putting Words into Computers
Noah A. Smith
CACM • 2020
This article aims to tell the story of how we put words into computers. It is part of the story of the field of natural language processing (NLP), a branch of artificial intelligence. It targets a wide audience with a basic understanding of computer…

On Consequentialism and Fairness
Dallas Card, Noah A. Smith
Frontiers in AI Journal • 2020
Recent work on fairness in machine learning has primarily emphasized how to define, quantify, and encourage "fair" outcomes. Less attention has been paid, however, to the ethical foundations which underlie such efforts. Among the ethical perspectives that…

Explain like I am a Scientist: The Linguistic Barriers of Entry to r/science
Tal August, Dallas Card, Gary Hsieh, Noah A. Smith, Katharina Reinecke
CHI • 2020
As an online community for discussing research findings, r/science has the potential to contribute to science outreach and communication with a broad audience. Yet previous work suggests that most of the active contributors on r/science are science-educated…

Longformer: The Long-Document Transformer
Iz Beltagy, Matthew E. Peters, Arman Cohan
arXiv • 2020
Transformer-based models are unable to process long sequences due to their self-attention operation, which scales quadratically with the sequence length. To address this limitation, we introduce the Longformer, with an attention mechanism that scales linearly…

Evaluating NLP Models via Contrast Sets
M. Gardner, Y. Artzi, V. Basmova, J. Berant, B. Bogin, S. Chen, P. Dasigi, D. Dua, Y. Elazar, A. Gottumukkala, N. Gupta, H. Hajishirzi, G. Ilharco, D. Khashabi, K. Lin, J. Liu, N. Liu, P. Mulcaire, Q. Ning, S. Singh, N. Smith, S. Subramanian, R. Tsarfaty, E. Wallace, et al.
arXiv • 2020
Standard test sets for supervised learning evaluate in-distribution generalization. Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on…