Maluuba Releases World's Largest Human-Created Question Answering Dataset

by Angela Guess

According to a recent press release, “MALUUBA, a Canadian deep-learning company helping machines think, reason and communicate with human-like intelligence, today announced the public release of two sophisticated natural language understanding datasets. In making these resources available, the company seeks to further advance and facilitate breakthrough innovation in artificial intelligence research. Created by a team of humans, rather than synthetically, Maluuba’s new datasets explore fundamental aspects of human capabilities in literacy and conversation. These datasets exhibit complexity and have been developed for machine reading comprehension, goal-oriented dialogue systems and conversational interface research.”

The release goes on, “Maluuba’s first dataset, NewsQA, was developed to train algorithms capable of answering complex questions that require human-level comprehension and reasoning skills. Leveraging CNN articles from the DeepMind Q&A Dataset, Maluuba prepared a crowd-sourced machine reading corpus of 120,000 question-answer pairs. The collection methodology was based on incomplete information and fostered curiosity. The questions require reasoning to answer, such as synthesis, inference and handling ambiguity, unlike other datasets that have focused on larger volumes yet simpler questions. The result is a robust dataset that will further drive natural language research. ‘The efforts put into developing this dataset will help drive progress in machine reading comprehension,’ said Dr. Aaron Courville, Assistant Professor in the Department of Computer Science and Operations Research (DIRO) at the Université de Montréal.”


