gethtml - Get the HTML of a JavaScript-based web page

gethtml loads the given web page, scrolls down a specified number of times, waits for the page's JavaScript to finish loading, and then saves the resulting HTML.


Algorithmia Platform License
· Internet Access

This algorithm has Internet access.

This is necessary for algorithms that rely on external services; however, it also means that this algorithm can send your input data outside of the Algorithmia platform.

Run an Example

{
  "isError": false,
  "errorMessage": "Getting HTML is completed successfully!",
  "errorCode": "Success",
  "results": {
    "outputFile": "data://.algo/magicanded/gethtml/temp/eb613b46426c4e529e1f89c26b5dc35f.html",
    "injectJsFileUrl": "https://algorithmia.com/v1/data/magicanded%2Fpublic%2Fgethtml-inject.js",
    "errorStream": "",
    "outputFileUrl": "https://algorithmia.com/v1/data/.algo%2Fmagicanded%2Fgethtml%2Ftemp%2Feb613b46426c4e529e1f89c26b5dc35f.html",
    "scroll": 5,
    "injectJsFile": "data://magicanded/public/gethtml-inject.js",
    "outputStream": "Failure upon successfully waiting for global dependencies.\n\n ReferenceError: Can't find variable: jQuery\n\n  https://www.huffingtonpost.com/entry/apple-new-iphone-x_us_59b809f9e4b027c149e2dbe0:107\nFailure upon successfully waiting for global dependencies.\n\n ReferenceError: Can't find variable: jQuery\n\n  https://www.huffingtonpost.com/entry/apple-new-iphone-x_us_59b809f9e4b027c149e2dbe0:107\nFailure upon successfully waiting for global dependencies.\n\n ReferenceError: Can't find variable: jQuery\n\n  https://www.huffingtonpost.com/entry/apple-new-iphone-x_us_59b809f9e4b027c149e2dbe0:107\n",
    "cookiesFile": "data://.algo/magicanded/gethtml/temp/cf98f88de4e04cb08c1bfc30375de688.cookies",
    "cookiesFileUrl": "https://algorithmia.com/v1/data/.algo%2Fmagicanded%2Fgethtml%2Ftemp%2Fcf98f88de4e04cb08c1bfc30375de688.cookies"
  }
}

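The example output above can be post-processed from the shell to find where the saved HTML lives. The following is a minimal sketch, assuming a response that carries an outputFileUrl field as in the example. The name extract_output_url is a hypothetical helper, and the sed/grep approach is a rough text match rather than a real JSON parser.

```shell
# extract_output_url is a hypothetical helper, not part of gethtml itself.
# It reads a response on stdin and prints the first outputFileUrl value.
extract_output_url() {
  # Turn \" into " so a string-encoded response matches the same pattern,
  # then pick the value out of the "outputFileUrl": "..." pair.
  sed 's/\\"/"/g' |
    grep -o '"outputFileUrl" *: *"[^"]*"' |
    head -n 1 |
    cut -d '"' -f 4
}
```

For instance, running extract_output_url < response.json (where response.json is a saved response) should print the URL of the stored HTML file. Downloading that URL presumably requires the same API key as the algorithm call, which is an assumption not confirmed by this page.
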
Install & Use

Use

curl -X POST -d '{
  "url": "https://www.huffingtonpost.com/entry/apple-new-iphone-x_us_59b809f9e4b027c149e2dbe0",
  "scroll": 5,
  "injectJsFile": "data://magicanded/public/gethtml-inject.js"
}' -H 'Content-Type: application/json' -H 'Authorization: Simple YOUR_API_KEY' https://api.algorithmia.com/v1/algo/magicanded/gethtml/0.2.18
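
The curl call above can be wrapped in small shell functions so that the URL, scroll count, and inject script become parameters. This is a sketch under stated assumptions: the function names gethtml_payload and gethtml_call and the ALGORITHMIA_API_KEY environment variable are hypothetical, while the endpoint, headers, and field names are taken from the command above.

```shell
# Hypothetical helpers; only the endpoint, headers, and JSON field names
# come from the curl example in this page.

# Build the JSON request body from a page URL, a scroll count, and an
# inject-script data URI. Values are inserted verbatim, so they must not
# contain double quotes or backslashes.
gethtml_payload() {
  printf '{"url": "%s", "scroll": %d, "injectJsFile": "%s"}' "$1" "$2" "$3"
}

# POST the payload to the algorithm endpoint, reading the body from stdin.
# Assumes the API key is exported as ALGORITHMIA_API_KEY (an assumed name).
gethtml_call() {
  gethtml_payload "$1" "$2" "$3" |
    curl -sS -X POST -d @- \
      -H 'Content-Type: application/json' \
      -H "Authorization: Simple ${ALGORITHMIA_API_KEY:?set ALGORITHMIA_API_KEY first}" \
      https://api.algorithmia.com/v1/algo/magicanded/gethtml/0.2.18
}
```

A call such as gethtml_call "https://www.huffingtonpost.com/entry/apple-new-iphone-x_us_59b809f9e4b027c149e2dbe0" 5 "data://magicanded/public/gethtml-inject.js" would send the same request as the original command. Keeping the key in an environment variable avoids pasting it into shell history.
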