ApacheOpenNLP / TokenizeBySentence / 2.0.0

Takes a string as input, splits it into sentences, and tokenizes each sentence into a String array. The arrays are returned as a list, one per sentence. By default the text is treated as English: sentences are split with the model en-sent.bin and tokenized with en-token.bin (both located in data://ApacheOpenNLP/models/).

To use other sentence detection and tokenizer models, pass an array of the following form as input:
[<input text>,"data://ApacheOpenNLP/models/<sentence detection model>","data://ApacheOpenNLP/models/<tokenizer model>"]
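
As a rough illustration of the two input shapes described above, the following Python sketch builds either the plain-string input or the three-element array. The helper name build_input is hypothetical and not part of the algorithm, and the rule that both models must be supplied together is an assumption based on the array form shown above. The example spells out the default English models by name; substitute other model files as needed.

```python
import json

# Model directory named in this README.
MODEL_DIR = "data://ApacheOpenNLP/models/"


def build_input(text, sentence_model=None, tokenizer_model=None):
    """Hypothetical helper: return the input for TokenizeBySentence.

    With no models given, the input is the bare string (default English models).
    With both models given, the input is the three-element array form.
    """
    if sentence_model is None and tokenizer_model is None:
        return text
    if sentence_model is None or tokenizer_model is None:
        # Assumption: the array form always carries both model paths.
        raise ValueError("provide both a sentence model and a tokenizer model, or neither")
    return [text, MODEL_DIR + sentence_model, MODEL_DIR + tokenizer_model]


# Explicitly select the default English models named above.
payload = build_input("Hello there. How are you?", "en-sent.bin", "en-token.bin")
print(json.dumps(payload))
```

The serialized array can then be sent as the algorithm input through whichever client you normally use to call it.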

For more information visit http://opennlp.apache.org.