allenai / open_information_extraction / 0.1.0


This algorithm provides state-of-the-art ability to answer a question based on a piece of text. It takes in a passage of text and a question based on that passage, then returns a substring of the passage that is guessed to be the correct answer.

It is a wrapper around the Machine Comprehension model put out by the AllenNLP team, which is itself a re-implementation of the BiDAF Model (2017).

Applicable Scenarios and Problems

This algorithm is useful for building natural-language interfaces that extract information from text documents. For example, it could form part of a chatbot's backend, or support customer service by answering from a user manual.

It can also be used to extract structured data from textual documents. For example, a collection of doctors' reports could be turned into a table that says (for every report) what was hurting, what the patient should do, and when they should schedule a follow-up.


By default, the algorithm returns only the actual answer to the question. However, if you run it in debug mode, it returns the entire output of AllenNLP's model.


The input JSON blob should have the following fields:

  • sentence: the sentence to extract predicates from

Any additional fields will be passed through into the AllenNLP model.
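As a rough illustration, the input blob could be assembled in Python along these lines. The helper name build_input and the example_option field are hypothetical, used only to show how extra fields sit alongside sentence; how the blob is sent to the algorithm is not covered here.

```python
import json


def build_input(sentence, **extra_fields):
    """Return a JSON string containing the required 'sentence' field.

    Keyword arguments become additional top-level fields. Their names
    are placeholders; this document does not list specific ones.
    """
    if not isinstance(sentence, str) or not sentence.strip():
        raise ValueError("sentence must be a non-empty string")
    payload = {"sentence": sentence}
    payload.update(extra_fields)
    return json.dumps(payload)


# Minimal input: only the required field.
print(build_input("John decided to run for office next month."))

# Input with a hypothetical extra field that would be passed through.
print(build_input("John decided to run for office next month.", example_option=True))
```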


The following output fields will always be present:

  • words: the parsed tokens of the sentence
  • verbs: each verb in the sentence along with a description of how each part of the sentence relates to that verb
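A small sanity check on the output might look like the sketch below. It assumes that each entry in verbs carries a tags list with one tag per entry in words, which is what the example in this document suggests but is not stated as a guarantee. The function name check_alignment is hypothetical, and the sample is the first verb entry from Example 1.

```python
def check_alignment(response):
    """Return True if every verb entry has exactly one tag per word.

    Assumes the response is a dict with 'words' and 'verbs' keys and
    that each verb entry has a 'tags' list.
    """
    n_words = len(response["words"])
    return all(len(entry["tags"]) == n_words for entry in response["verbs"])


# Output from Example 1, trimmed to the first verb.
sample = {
    "words": ["John", "decided", "to", "run", "for", "office", "next", "month", "."],
    "verbs": [
        {
            "verb": "decided",
            "description": "[ARG0: John] [V: decided] [ARG1: to run for office next month] .",
            "tags": ["B-ARG0", "B-V", "B-ARG1", "I-ARG1", "I-ARG1",
                     "I-ARG1", "I-ARG1", "I-ARG1", "O"],
        }
    ],
}

print(check_alignment(sample))  # True
```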


Example 1: Default Behavior


{
  "sentence": "John decided to run for office next month."
}


{
  "verbs": [
    {
      "verb": "decided",
      "description": "[ARG0: John] [V: decided] [ARG1: to run for office next month] .",
      "tags": ["B-ARG0", "B-V", "B-ARG1", "I-ARG1", "I-ARG1", "I-ARG1", "I-ARG1", "I-ARG1", "O"]
    },
    {
      "verb": "run",
      "description": "[ARG0: John] [BV: decided to] [V: run] [ARG1: for office] [ARG2: next month] .",
      "tags": ["B-ARG0", "B-BV", "I-BV", "B-V", "B-ARG1", "I-ARG1", "B-ARG2", "I-ARG2", "O"]
    }
  ],
  "words": ["John", "decided", "to", "run", "for", "office", "next", "month", "."]
}
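The tags in this example appear to follow a B-/I-/O scheme: B- starts a labeled span, I- continues it, and O marks a token outside any span. Under that assumption, the tags can be grouped back into labeled phrases with a sketch like the one below. The function name tags_to_spans is hypothetical and is not part of the algorithm's API.

```python
def tags_to_spans(words, tags):
    """Group B-/I-/O tags into (label, text) pairs, skipping 'O' tokens."""
    spans = []
    label, tokens = None, []

    def flush():
        # Store the span collected so far, if any.
        if label is not None:
            spans.append((label, " ".join(tokens)))

    for word, tag in zip(words, tags):
        if tag.startswith("B-") or (tag.startswith("I-") and tag[2:] != label):
            # A new span starts (a stray I- tag is treated as a start).
            flush()
            label, tokens = tag[2:], [word]
        elif tag.startswith("I-"):
            # Continuation of the current span.
            tokens.append(word)
        else:
            # 'O': token belongs to no span.
            flush()
            label, tokens = None, []
    flush()
    return spans


words = ["John", "decided", "to", "run", "for", "office", "next", "month", "."]
run_tags = ["B-ARG0", "B-BV", "I-BV", "B-V", "B-ARG1", "I-ARG1", "B-ARG2", "I-ARG2", "O"]

for span_label, text in tags_to_spans(words, run_tags):
    print(span_label, "->", text)
# ARG0 -> John
# BV -> decided to
# V -> run
# ARG1 -> for office
# ARG2 -> next month
```

The result lines up with the bracketed description string for the verb "run", so either representation can be used depending on whether free text or structured pairs are more convenient.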

See Also