Multihop Questions via Single-hop Question Composition

Aristo • 2022
MuSiQue is a multihop reading comprehension dataset with 2-4 hop questions, built by composing seed questions from 5 existing single-hop datasets. The dataset is constructed bottom-up by systematically selecting composable pairs of single-hop questions that are connected, i.e., where one reasoning step requires information from the other. This approach gives greater control over the properties of the resulting k-hop questions, allowing us to create a dataset that is substantially less cheatable (e.g., by shortcut-based or single-hop reasoning) and more challenging than prior similar datasets. MuSiQue comes in two variations: MuSiQue-Answerable, which contains only answerable questions, and MuSiQue-Full, which contains both answerable and unanswerable questions. In the latter, each answerable question from MuSiQue-Answerable is paired with a closely similar unanswerable question. In MuSiQue-Answerable, the task is to identify the answer and the supporting paragraphs, given a question and a context of up to 20 paragraphs. In MuSiQue-Full, the task is to first determine whether the question is answerable from the given context and, if it is, to identify the answer and the supporting paragraphs.
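The two task variants can be pictured as an input/output contract. The following is a minimal sketch of that contract only. All class, field, and function names (`Instance`, `Prediction`, `is_well_formed`, `supporting`, and so on) are hypothetical illustrations, not the released data schema or the official evaluation code. Representing an unanswerable prediction with no answer and no supporting paragraphs is likewise an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_PARAGRAPHS = 20  # the description states a context of up to 20 paragraphs


@dataclass
class Instance:
    """One task input (hypothetical field names, not the released schema)."""
    question: str
    paragraphs: List[str]

    def __post_init__(self) -> None:
        if len(self.paragraphs) > MAX_PARAGRAPHS:
            raise ValueError("context exceeds 20 paragraphs")


@dataclass
class Prediction:
    """One task output (hypothetical field names)."""
    answerable: bool = True
    answer: Optional[str] = None
    # Indices into Instance.paragraphs for the predicted supporting paragraphs.
    supporting: List[int] = field(default_factory=list)


def is_well_formed(inst: Instance, pred: Prediction, full: bool) -> bool:
    """Check that a prediction has the shape the task asks for.

    full=False models MuSiQue-Answerable, where every question is answerable.
    full=True models MuSiQue-Full, where answerability is decided first.
    """
    if not full and not pred.answerable:
        return False
    if not pred.answerable:
        # Assumption of this sketch: an unanswerable prediction carries
        # neither an answer nor supporting paragraphs.
        return pred.answer is None and not pred.supporting
    in_range = all(0 <= i < len(inst.paragraphs) for i in pred.supporting)
    return pred.answer is not None and bool(pred.supporting) and in_range


# Usage with placeholder text (not real dataset content).
example = Instance("placeholder question", ["para 0", "para 1", "para 2"])
print(is_well_formed(example, Prediction(True, "some answer", [0, 2]), full=False))
print(is_well_formed(example, Prediction(answerable=False), full=True))
```

The check only validates the shape of a prediction. It says nothing about correctness or about how the leaderboard metric is computed.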
License: CC BY

Leaderboard

Top Public Submissions
Rank  Details                                         Created   Support+Sufficiency F1
1     Select+Answer (SA) Model                        7/4/2022  42%
      Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
2     Step Execution by End2End (EX(EE)) Model        5/6/2022  44%
      Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
2     Step Execution by Select+Answer (EX(SA)) Model  7/4/2022  44%
      Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
4     End2End (EE) Model                              5/6/2022  26%
      Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal

Authors

Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal