util / ExtractText / 0.1.1

Extracts text from a given file. Supported formats include Office documents (Word, PowerPoint, and others), HTML, XML, PDF, and RTF. For more information on supported formats, see Apache Tika's documentation page.

This algorithm accepts any URL, whether it is a Data API URL or the URL of a file on the internet.
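
As a rough illustration of passing a URL to the algorithm, the following sketch builds an HTTP request using only the Python standard library. Everything in it beyond the algorithm path util/ExtractText/0.1.1 is an assumption made for the example: the endpoint layout, the "Simple" authorization scheme, the input being a bare JSON string containing the URL, and the extracted text arriving in a "result" field of the response. The helper names build_request and extract_text are hypothetical, as are the sample URL and API key.

```python
import json
import urllib.request

# Assumed REST endpoint for this algorithm version; check it against your platform.
ALGO_ENDPOINT = "https://api.algorithmia.com/v1/algo/util/ExtractText/0.1.1"


def build_request(file_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request whose body is the file URL.

    Assumes the algorithm takes a single JSON string as input: either a
    Data API URL or an ordinary web URL.
    """
    body = json.dumps(file_url).encode("utf-8")
    return urllib.request.Request(
        ALGO_ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumed authorization scheme.
            "Authorization": f"Simple {api_key}",
        },
        method="POST",
    )


def extract_text(file_url: str, api_key: str) -> str:
    """Send the request and return the extracted text.

    Assumes the response is a JSON object with the text under "result".
    """
    with urllib.request.urlopen(build_request(file_url, api_key)) as resp:
        payload = json.load(resp)
    return payload["result"]


if __name__ == "__main__":
    # Hypothetical inputs: replace with a real URL and a real API key.
    print(extract_text("https://example.com/report.pdf", "YOUR_API_KEY"))
```

Keeping request construction separate from the network call makes it easy to inspect what would be sent before making a real call.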