ANaimi / PDFToText / 0.1.4

PDFToText

Royalty Free
Get the text out of a PDF document
Language
Java
Metrics
API Calls - 4,475
Avg call duration - 1.30 sec
Permissions
The Algorithm Platform License is the set of terms that are stated in the Software License section of the Algorithmia Application Developer and API License Agreement. It is intended to allow users to reserve as many rights as possible without limiting Algorithmia's ability to run it as a service.
Internet access: this is necessary for algorithms that rely on external services; however, it also implies that this algorithm is able to send your input data outside of the Algorithmia platform.

Run an Example

Input
Output
[
  "Sources: Lipsum, http://www.lipsum.com, 1500 PDF to Text \nSample Input \n \nThis is a sample test input for the PDF to Text submission for Algorithmia. \n \nThis coordinates bounding this paragraph is (88, 174) to (500, 262). Lorem \nipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor \nincididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis \nnostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. \nDuis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu \nfugiat nulla pariatur.  \n \nTest of Table \nHeader #0  Header #1  Header #2 \nValue 0,0  Value 0,1  Value 0,2 \nValue 1,0 Bounds: (228, 335) to (370, 355) \nValue 1,2 \nValue 2,0  Value 2,1  Value 2,2 \n \n \nTest of Random Boxes/Shapes \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \nConclusion \nLorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor \nincididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis \nnostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. \nDuis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu \nfugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in \nculpa qui officia deserunt mollit anim id est laborum. \n \n This is text in a step \nbox that is in the \nmiddle of a page. \n \nBounds: \n(90, 460) to (215, 605) Step #1 \nThis is text in a step \nbox that is in the \nmiddle of a page. Step #2 \nThis is text in a step \nbox that is in the \nmiddle of a page. Step #3 ",
  "\nSources: Lipsum, http://www.lipsum.com, 1500 Furthermore \n \nThis is to illustrate grabbing text from a second page. \n \nAldebaran is an orange giant star located about 65 light years away in the zodiac \nconstellation of Taurus. With an average apparent magnitude of 0.87 it is the \nbrightest star in the constellation and is one of the brightest stars in the \nnighttime sky. \n \nThe name Aldebaran is Arabic (نﻥاﺍرﺭبﺏدﺩلﻝاﺍ al-­‐dabarān) and translates literally as \n\"the follower\", presumably because this bright star appears to follow the \nPleiades, or \"Seven Sisters\" star cluster in the night sky. \n \nMore: http://en.wikipedia.org/wiki/Aldebaran \n \n "
]
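
The sample output above is a JSON array with one string per page, where lines are separated by "\n" and often carry trailing spaces or are blank. As a minimal sketch of how a caller might tidy one page after decoding the JSON, the helper below splits a page into trimmed, non-empty lines. The class name PageTextCleaner and the method cleanLines are hypothetical and are not part of PDFToText.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFToText: tidies one page string
// from the array returned by the algorithm.
final class PageTextCleaner {

    // Splits a page of extracted text on newlines, trims each line,
    // and drops lines that are empty after trimming.
    static List<String> cleanLines(String pageText) {
        List<String> lines = new ArrayList<>();
        for (String line : pageText.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Shortened excerpt in the style of the first page of the sample output.
        String page = "Test of Table \nHeader #0  Header #1  Header #2 \n \n \nConclusion \n";
        for (String line : cleanLines(page)) {
            System.out.println(line);
        }
    }
}
```

Spacing inside a line is left untouched, so table rows such as "Header #0  Header #1  Header #2" keep their double-space column gaps.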

Install and Use

Install

Install the Algorithmia CLI client by running:

curl -sSLf https://algorithmia.com/install.sh | sh

Then authenticate by running:

algo auth
# When prompted for the API endpoint, press Enter
# When prompted for the API key, enter your key: YOUR_API_KEY

Use



algo run ANaimi/PDFToText/0.1.4 -d '["data://ANaimi/PDFtoText/sample.pdf", 0]' --timeout 300
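
The same call can probably be made over HTTP from Java, the language this algorithm is listed under. The sketch below uses only the JDK's java.net.http client (Java 11 or later). It assumes the Algorithmia REST conventions of an endpoint under https://api.algorithmia.com/v1/algo/, an "Authorization: Simple YOUR_API_KEY" header, and a timeout query parameter in seconds; none of these appear on this page, so check them against the API documentation. The class name PdfToTextRequest and the environment variable ALGORITHMIA_API_KEY are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical wrapper; the endpoint shape and auth header are assumptions
// about the Algorithmia REST API, not taken from this page.
final class PdfToTextRequest {

    // Assumed REST equivalent of: algo run ANaimi/PDFToText/0.1.4 ... --timeout 300
    static final String ENDPOINT =
            "https://api.algorithmia.com/v1/algo/ANaimi/PDFToText/0.1.4?timeout=300";

    // Builds (but does not send) a POST request carrying the JSON input.
    static HttpRequest build(String apiKey, String jsonInput) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .timeout(Duration.ofSeconds(300))
                .header("Authorization", "Simple " + apiKey)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonInput))
                .build();
    }

    public static void main(String[] args) throws Exception {
        String apiKey = System.getenv("ALGORITHMIA_API_KEY"); // hypothetical variable name
        if (apiKey == null) {
            System.err.println("Set ALGORITHMIA_API_KEY before running.");
            return;
        }
        // Same input as the CLI example; the second element is passed through unchanged.
        String input = "[\"data://ANaimi/PDFtoText/sample.pdf\", 0]";
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(build(apiKey, input), HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```

Building the request in a separate method keeps the network call out of the way, so the URL and headers can be inspected without contacting the service.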
  