StanfordNLP

StanfordNLP / NamedEntityRecognition / 0.2.0

Introduction

This algorithm retrieves recognized entities from a body of text using the Stanford CoreNLP library. It currently identifies named entity types (PERSON, LOCATION, ORGANIZATION, MISC) and numerical or temporal types (MONEY, NUMBER, DATE, TIME, DURATION, SET).

Note: the previous string version of this algorithm is now deprecated, which means it's still functional but no longer documented.

I/O

Input

{
    "document": String
}
  • document - (required) a text document of arbitrary length.
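
As a minimal sketch of the input format above, the following Python builds the request body from a string. The helper name `build_request` is hypothetical, and rejecting empty text is an assumption rather than documented behaviour. How the payload is sent to the algorithm is not covered here.

```python
import json


def build_request(text: str) -> str:
    """Serialize text into the {"document": ...} input object (hypothetical helper)."""
    # Assumption: an empty document is treated as a caller error.
    if not isinstance(text, str) or not text.strip():
        raise ValueError("document is required and must be non-empty text")
    return json.dumps({"document": text})


if __name__ == "__main__":
    print(build_request("Jim went to Stanford University."))
```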

Output

{
  "sentences": List[
    {
      "detectedEntities": List[
        {
            "word": String,
            "entity": String
        }
      ]
    }
  ]
}
  • sentences - a list of the sentences found in the input document
  • detectedEntities - a list of the entities detected in that particular sentence
  • word - the word in the input document that refers to a specific entity
  • entity - the entity type assigned to the word (e.g. PERSON, ORGANIZATION, NUMBER)
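
The nested output can be awkward to work with directly. One possible way to consume it, sketched below in Python, is to flatten it into (sentence index, word, entity) rows. The function name `flatten_entities` and the sample response are illustrative assumptions, not part of the algorithm.

```python
from typing import Dict, List, Tuple


def flatten_entities(response: Dict) -> List[Tuple[int, str, str]]:
    """Turn the nested response into (sentence_index, word, entity) rows."""
    rows = []
    for index, sentence in enumerate(response.get("sentences", [])):
        for item in sentence.get("detectedEntities", []):
            rows.append((index, item["word"], item["entity"]))
    return rows


if __name__ == "__main__":
    # Hand-written sample in the documented output shape.
    sample = {
        "sentences": [
            {"detectedEntities": [{"word": "Jim", "entity": "PERSON"}]},
            {"detectedEntities": [{"word": "Microsoft", "entity": "ORGANIZATION"}]},
        ]
    }
    for row in flatten_entities(sample):
        print(row)
```

Sentences with no detected entities simply contribute no rows, and the sentence index is kept so entities can still be traced back to their sentence.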

Examples

Input

{
   "document":"Jim went to Stanford University, Tom went to the University of Washington. They both work for Microsoft."
}

Output

{
  "sentences": [
    {
      "detectedEntities": [
        {"word": "Jim", "entity": "PERSON"},
        {"word": "Stanford", "entity": "ORGANIZATION"},
        {"word": "University", "entity": "ORGANIZATION"},
        {"word": "Tom", "entity": "PERSON"},
        {"word": "University", "entity": "ORGANIZATION"},
        {"word": "of", "entity": "ORGANIZATION"},
        {"word": "Washington", "entity": "ORGANIZATION"}
      ]
    },
    {
      "detectedEntities": [
        {"word": "Microsoft", "entity": "ORGANIZATION"}
      ]
    }
  ]
}
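
In this example, multi-word names such as "University of Washington" come back one word per entry. A rough way to rebuild phrases, sketched below in Python, is to join consecutive entries that share the same entity type. This is a heuristic and not something the algorithm does itself: the output carries no token positions, so two separate names of the same type that follow each other in the list would be merged incorrectly. The name `merge_runs` is hypothetical.

```python
from itertools import groupby
from typing import Dict, List, Tuple


def merge_runs(detected: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Join consecutive entries with the same entity type into one phrase."""
    phrases = []
    # groupby only groups adjacent items, which is what this heuristic relies on.
    for label, run in groupby(detected, key=lambda e: e["entity"]):
        phrases.append((" ".join(e["word"] for e in run), label))
    return phrases


if __name__ == "__main__":
    first_sentence = [
        {"word": "Jim", "entity": "PERSON"},
        {"word": "Stanford", "entity": "ORGANIZATION"},
        {"word": "University", "entity": "ORGANIZATION"},
        {"word": "Tom", "entity": "PERSON"},
        {"word": "University", "entity": "ORGANIZATION"},
        {"word": "of", "entity": "ORGANIZATION"},
        {"word": "Washington", "entity": "ORGANIZATION"},
    ]
    for phrase, label in merge_runs(first_sentence):
        print(label, phrase)
```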

Credits

For more information, please refer to Stanford CoreNLP or to: Manning, Christopher D., Surdeanu, Mihai, Bauer, John, Finkel, Jenny, Bethard, Steven J., and McClosky, David. 2014. The Stanford CoreNLP Natural Language Processing Toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55-60.