Typically, machine learning systems solve new tasks by training on thousands of examples. In contrast, humans can solve new tasks by reading some instructions, with perhaps an example or two. To take a step toward closing this gap, we introduce a framework and benchmark dataset, ZEST, for building NLP systems that solve new tasks after reading their descriptions. See our EMNLP 2020 paper “Learning from task descriptions” for a description of the framework, evaluation metric, and baseline model results.
ZEST contains task descriptions (formatted as questions) for 1,251 different NLP tasks. Each task has 20 different (context, answer) annotations. This page provides download links (see above) for the dataset. Check out the GitHub repository for a detailed description of the data, the evaluation code, and information about submitting to the leaderboard to evaluate on the test set.
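
The layout described above (one question per task, each paired with 20 (context, answer) annotations) can be pictured with a minimal Python sketch. This is an illustration only, not the dataset's actual file format: the class names, field names, example question, and placeholder annotations are all hypothetical, and the GitHub repository remains the reference for the real schema.

```python
from dataclasses import dataclass, field
from typing import List

# Number of (context, answer) annotations per task, as stated in the description.
ANNOTATIONS_PER_TASK = 20


@dataclass
class Annotation:
    """One (context, answer) pair. Field names are hypothetical."""
    context: str
    answer: str


@dataclass
class Task:
    """A task description phrased as a question, plus its annotations."""
    question: str
    annotations: List[Annotation] = field(default_factory=list)

    def is_complete(self) -> bool:
        # A task is complete once it carries the expected number of annotations.
        return len(self.annotations) == ANNOTATIONS_PER_TASK


def build_example_task() -> Task:
    """Build a toy task with placeholder data (invented, not taken from ZEST)."""
    task = Task(question="Does this national park have a campground?")
    for i in range(ANNOTATIONS_PER_TASK):
        task.annotations.append(
            Annotation(context=f"placeholder passage {i}", answer="n/a")
        )
    return task


if __name__ == "__main__":
    example = build_example_task()
    print(example.question, len(example.annotations), example.is_complete())
```

Under this framing, a model receives the question and a single context at evaluation time and has to produce the matching answer without having trained on examples of that task.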
Rank | Model | Submitted by | Created | Output Structure Mean |
---|---|---|---|---|
1 | T5-11B MTL baseline | Orion Weller, Nicholas Lourie, Matt Gardner, Matt Peters | 10/16/2020 | 47% |
2 | T5-11B baseline | Orion Weller, Nicholas Lourie, Matt Gardner, Matt Peters | 10/16/2020 | 47% |
3 | Hypter (BART-Large) | INK Lab @ USC | 12/26/2020 | 23% |
4 | BART-Large baseline | Orion Weller, Nicholas Lourie, Matt Gardner, Matt Peters | 10/16/2020 | 19% |
Orion Weller, Nicholas Lourie, Matt Gardner, Matthew E. Peters