Featured Demos
- Delphi, a research prototype designed to model people’s moral judgments on a variety of everyday situations | Mosaic, Research Visualization
Delphi is intended to demonstrate what state-of-the-art models can accomplish today on machine ethics as well as to highlight their limitations.
Try the demo
- A new general-purpose model with unprecedented breadth, Unified-IO can perform a wide array of visual and linguistic tasks. | PRIOR, Research Visualization
Unified-IO is the first neural model to perform a large, diverse set of AI tasks from computer vision to natural language processing.
Try the demo
- A QA model that outperforms other popular language models while being an order of magnitude smaller | Aristo, Research Visualization
Macaw is a high-performance question-answering (QA) model capable of outperforming other popular current language models, all while being an order of magnitude smaller. This demo allows you to explore Macaw's answers and compare them to those of the popular GPT-3 language model on a benchmark set of questions.
Try the demo
- Generating Implications, Proofs, and Abductive Statements over Natural Language | Aristo
Like RuleTaker, ProofWriter determines whether statements are True or False based on rules given in natural language, but it also generates proofs of its answers.
Try the demo
- Scientific Paper PDF to HTML Converter | Semantic Scholar
Paper to HTML Converter
This is an experimental prototype that aims to render scientific papers in HTML so they can be more easily read by screen readers or on mobile devices.
Try the prototype
- ModularQA answers questions by breaking them down into a series of smaller, more specific ones. This produces answers in a human-like way that's more explainable than black-box systems. | Aristo
ModularQA is a neuro-symbolic question-answering system that answers complex questions by asking a series of sub-questions to existing simpler QA systems or symbolic modules. It explains each of its reasoning steps in language, in terms of a simple question and its answer as produced by a simpler model or a math…
Try the demo
- Uncovering stereotypical biases via underspecified questions | Aristo
This work focuses specifically on identifying biases in question answering (QA) models. If these models are blindly deployed in real-life settings, the biases within them could cause real harm, which raises the question: how extensive are social stereotypes in question-answering models?
Try the demo
- Evaluating neural toxic degeneration in language models | Mosaic, Research Visualization
In new joint work at AI2 and UW, we study how often popular NLP components produce problematic content, what might trigger this neural toxic degeneration from a given system, and whether or not it can be successfully avoided. We also study how much toxicity is present in the web text that these systems learned…
Try the demo
- Find out whether scientific research supports or refutes a given claim | Semantic Scholar
Our fact verification demo was built using the SciFact dataset, a collection of 1.4K expert-written scientific claims paired with evidence-containing abstracts, and annotated with labels and rationales.
Try the demo
- Crossing format boundaries with a single QA system | Aristo
UnifiedQA is a single pre-trained QA model that performs surprisingly well across 17 QA datasets spanning 4 diverse formats. Fine-tuning UnifiedQA into specialized models results in a new state-of-the-art on 6 datasets, establishing this model as a strong starting point for building QA systems.
Try the demo