
Implementation of techniques from Kusner et al., 2015: From Word Embeddings to Document Distances. Each document is represented as a list of word vectors, using word2vec embeddings for the words in the document. The algorithm implements the heuristic distance metrics described in the paper, word centroid distance (WCD) and relaxed word mover's distance (rWMD), to compute distances between documents for k-nearest neighbor (kNN) classification. (The original paper showed that rWMD outperformed a wide variety of other ways of representing text data on a number of benchmark document classification tasks.)

This algorithm has a preprocess mode. It takes in a text file where each line is a label followed by the text of a document, with the two separated by a tab. The user must also pass in the path to an ID map file in a data collection. This file holds a map from human-readable document IDs (e.g. "sports", "politics") to numeric, algorithm-readable labels (e.g. 0, 1). If a file exists at this path, the labels are preprocessed using that ID map, so that test data has its IDs converted the same way as the training data. If no file exists at this path, an ID map is built while preprocessing the data and written to that path.

Currently, the preprocess mode assumes that test data also has a tab-separated label before the document text on each line, though the labels need not be used when testing (unless the user wants to compute accuracy on the test data). If the test data is unlabeled, random labels may be added to each line of the data file to satisfy the required format. The preprocessed data is written to the same data collection as the original data, and a message to this effect is returned.

The algorithm may also be run in performance mode, where it is used for kNN classification. Here the user must pass in the training data (a pickle file as returned by the preprocess mode of this algorithm), the training labels, the test data, and the number of neighbors to use for kNN classification. Optionally, the user may pass in the test labels, in which case the program returns the accuracy. Otherwise, it returns a message informing the user that the predicted labels for the test data have been written to the data collection. In either case, the predicted labels are written to the data collection.
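
To make the two heuristic distances and the kNN step concrete, here is a minimal NumPy sketch. It is not the code behind this algorithm: the function names, the (word vectors, weights) document representation, and the majority-vote rule are assumptions for illustration. The formulas follow the definitions in Kusner et al.: WCD is the distance between weighted centroids, and rWMD takes the larger of the two one-sided relaxations in which each word moves all of its weight to its nearest word in the other document.

```python
import numpy as np
from collections import Counter

# Assumed representation: a document is a pair (X, w), where X is an
# (n_words, dim) array of word vectors and w is a length-n_words array of
# normalized word frequencies that sums to 1.

def pairwise_dist(A, B):
    """Euclidean distances between every row of A and every row of B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def wcd(X1, w1, X2, w2):
    """Word centroid distance: distance between weighted mean vectors."""
    return float(np.linalg.norm(w1 @ X1 - w2 @ X2))

def rwmd(X1, w1, X2, w2):
    """Relaxed word mover's distance: max of the two one-sided bounds."""
    C = pairwise_dist(X1, X2)
    bound_12 = float(w1 @ C.min(axis=1))  # doc 1 words go to nearest in doc 2
    bound_21 = float(w2 @ C.min(axis=0))  # doc 2 words go to nearest in doc 1
    return max(bound_12, bound_21)

def knn_predict(train_docs, train_labels, test_doc, k, dist=rwmd):
    """Majority vote over the k training documents closest to test_doc."""
    dists = [dist(*test_doc, *doc) for doc in train_docs]
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

In this sketch the distance function is a parameter, so the cheaper WCD can be swapped in with dist=wcd when speed matters more than tightness of the bound.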

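The input format for the preprocess mode can be illustrated with a short sketch. This is an assumption-laden illustration rather than the algorithm's actual preprocessing code: the function name parse_labeled_lines is hypothetical, tokenization is plain whitespace splitting, and raising an error for a label missing from an existing ID map is a choice made here, since the description does not say how that case is handled.

```python
def parse_labeled_lines(lines, id_map=None):
    """Parse 'label<TAB>text' lines into numeric labels and token lists.

    If id_map is None, a new map from label string to integer is built.
    If id_map is given, it is reused so that test data is encoded the
    same way as training data (unknown labels raise ValueError here,
    which is an assumption of this sketch).
    """
    build_map = id_map is None
    id_map = {} if build_map else dict(id_map)
    labels, docs = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        label, text = line.split("\t", 1)
        if label not in id_map:
            if not build_map:
                raise ValueError(f"label {label!r} not in existing ID map")
            id_map[label] = len(id_map)  # next unused integer
        labels.append(id_map[label])
        docs.append(text.split())  # whitespace tokenization (assumed)
    return labels, docs, id_map
```

Calling it once on the training lines and then passing the returned map when parsing the test lines mirrors the "reuse the ID map file if it exists" behavior described above.
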

Cost Breakdown

0 cr royalty per call
1 cr usage per second

For additional details on how pricing works, see Algorithmia pricing.

No permissions required

This algorithm does not require any special permissions.

To understand more about how algorithm permissions work, see the permissions documentation.

Use this algorithm

cURL:
curl -X POST -d '{{input | formatInput:"curl"}}' -H 'Content-Type: application/json' -H 'Authorization: Simple YOUR_API_KEY' https://api.algorithmia.com/v1/algo/mheimann/WMD/0.1.0
CLI:
algo auth
algo run algo://mheimann/WMD/0.1.0 -d '{{input | formatInput:"cli"}}'
Java:
import com.algorithmia.*;
import com.algorithmia.algo.*;

String input = "{{input | formatInput:"java"}}";
AlgorithmiaClient client = Algorithmia.client("YOUR_API_KEY");
Algorithm algo = client.algo("algo://mheimann/WMD/0.1.0");
AlgoResponse result = algo.pipeJson(input);
Scala:
import com.algorithmia._
import com.algorithmia.algo._

val input = {{input | formatInput:"scala"}}
val client = Algorithmia.client("YOUR_API_KEY")
val algo = client.algo("algo://mheimann/WMD/0.1.0")
val result = algo.pipeJson(input)
JavaScript:
var input = {{input | formatInput:"javascript"}};
Algorithmia.client("YOUR_API_KEY")
           .algo("algo://mheimann/WMD/0.1.0").pipe(input)
           .then(function(output) { console.log(output); });
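NodeJS:
var input = {{input | formatInput:"javascript"}};
Algorithmia.client("YOUR_API_KEY")
           .algo("algo://mheimann/WMD/0.1.0").pipe(input)
           .then(function(response) { console.log(response.get()); });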
Python:
import Algorithmia

input = {{input | formatInput:"python"}}
client = Algorithmia.client('YOUR_API_KEY')
algo = client.algo('mheimann/WMD/0.1.0')
print(algo.pipe(input))

R:
library(algorithmia)

input <- {{input | formatInput:"r"}}
client <- getAlgorithmiaClient("YOUR_API_KEY")
algo <- client$algo("mheimann/WMD/0.1.0")
result <- algo$pipe(input)$result
Ruby:
require 'algorithmia'

input = {{input | formatInput:"ruby"}}
client = Algorithmia.client('YOUR_API_KEY')
algo = client.algo('mheimann/WMD/0.1.0')
puts algo.pipe(input).result
Rust:
use algorithmia::*;

let input = {{input | formatInput:"rust"}};
let client = Algorithmia::client("YOUR_API_KEY");
let algo = client.algo("mheimann/WMD/0.1.0");
let response = algo.pipe(input);