Word2Vec


Table of Contents

- Introduction
- How Word2Vec Works
- Examples
- Credits

Introduction

Word2vec is a group of related models used to produce so-called word embeddings. These models are shallow, two-layer neural networks trained to reconstruct the linguistic contexts of words. After training, a word2vec model maps each word to a vector of typically several hundred elements, which represents that word's relation to other words. This vector is the neural network's hidden layer. Word2vec relies on either skip-grams or continuous bag of words (CBOW) to create the embeddings. It was created by a team of researchers led by Tomas Mikolov at Google and has subsequently been analysed and explained by other researchers.

The algorithm exposes six operations (a sketch of rough local equivalents follows Example 1):

Input 1 (required): A list of word(s). (key = "getVecFromWord")
Output 1: A 300-dimensional vector representation of each given word.

Input 2 (required): A list of 300-dimensional vector(s). (key = "getWordFromVec")
Output 2: The top 10 words closest to each given vector in vector space.

Input 3 (required): A list of two words. (key = "similarityBetweenWords")
Output 3: The similarity score between the two given words.

Input 4 (required): A list of words. (key = "doesntMatch")
Output 4: The word that doesn't match the other words in the list.

Input 5 (required): Vector arithmetic using the algorithm proposed in the original word2vec paper. (key = "vectorArithmetic")
- At least one required: a list of words that will be positive in the arithmetic (key = "positive") and/or a list of words that will be negative (key = "negative").
- Optional: the number of results to return; the default is 10. (key = "numResults")
Output 5: The top N (default 10) words closest to the vector produced by the arithmetic.

Input 6 (required): Vector arithmetic using the multiplicative "3CosMul" objective of Levy and Goldberg (see Credits) instead of the additive objective. (key = "vectorArithmeticCosmul")
- At least one required: a list of words that will be positive in the arithmetic (key = "positive") and/or a list of words that will be negative (key = "negative").
- Optional: the number of results to return; the default is 10. (key = "numResults")
Output 6: The top N (default 10) words closest to the vector produced by the arithmetic.

How Word2Vec Works

At a high level, Word2Vec converts words into vectors, which lets us derive insight from the words by operating on the vectors. For example, by measuring the similarity between two vectors we can tell how similar the two words are (refer to Example 3; a sketch of the underlying cosine-similarity computation follows it).

These vectors also allow more elaborate operations, such as deriving a "feminine" vector by subtracting one word vector from another. For example, if we subtract the word vector for "boy" from the word vector for "girl", we get a feminine vector:

Vec(girl) - Vec(boy) = FeminineVec

If we add this feminine vector to the vector for "nephew", the closest word vector to the result is "niece" (refer to Example 6; a numpy sketch of this arithmetic follows it):

FeminineVec + Vec(nephew) ≈ Vec(niece)

You can also combine several word vectors to derive a new one. For example, providing the word vectors mythical_creature, horse, and magical yields words such as mythical_winged, unicorn, and mythological_creature (refer to Example 9).

Examples

Example 1.

Parameter 1: A word to get its vector representation.

{
 "getVecFromWord": ["intelligence"]
}

Output:

{
 "vecFromWord": [
 {
 "vector": [
 0.016053304076194767,
 -0.008756347931921482,
 -0.007181741297245027
 ...
 -0.07773178815841678,
 0.016514163464307785,
 0.06236977502703666
 ],
 "word": "intelligence"
 }
 ]
}
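
The Credits below cite the gensim framework paper, and the six operations above correspond closely to gensim's KeyedVectors API. The following is an illustrative local sketch, not necessarily this algorithm's actual implementation; it assumes you have a local copy of the pretrained 300-dimensional Google News vectors:

from gensim.models import KeyedVectors

# Assumption: pretrained 300-d Google News vectors downloaded locally.
kv = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True)

vec = kv["intelligence"]                     # ~ getVecFromWord
near = kv.similar_by_vector(vec, topn=10)    # ~ getWordFromVec
score = kv.similarity("apple", "orange")     # ~ similarityBetweenWords
odd = kv.doesnt_match(["apple", "orange", "soda"])                    # ~ doesntMatch
add = kv.most_similar(positive=["girl", "nephew"], negative=["boy"])  # ~ vectorArithmetic
mul = kv.most_similar_cosmul(positive=["mythical_creature", "horse", "magical"])  # ~ vectorArithmeticCosmul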
 "getWordFromVec": [
 [
 -0.03515789285302162,
 -0.034899380058050156,
 ...
 -0.0075938464142382145,
 0.13029101490974426,
 0.046015478670597076
 ]
 ]
}

Output:

{
 "wordFromVec": [
 {
 "vector": [
 -0.03515788912773132,
 -0.03489937633275987,
 0.0770371407270432,
 ...
 -0.007593845948576927,
 0.13029100000858307,
 0.04601547494530679
 ],
 "word": [
 ["paprika",1],
 ["smoked_paprika",0.7143110036849977],
 ["cumin",0.6887742280960084],
 ["sweet_paprika",0.6839773058891295],
 ["harissa",0.6795212030410769],
 ["Hungarian_paprika",0.6731281280517578],
 ["coriander",0.662696361541748],
 ["crystallized_ginger",0.6589986681938174],
 ["pepper_flakes",0.6587378382682803],
 ["dried_thyme",0.6585609912872313]]}
 ]
}

Example 3.

Parameter 1: A list of two words to get the similarity between them.

{
 "similarityBetweenWords": [
 "apple",
 "orange"
 ]
}

Output:

{
 "similarityResult": 0.3920346133342321
}
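
The similarityResult in Example 3 above is, in the standard word2vec setup, the cosine similarity between the two word vectors. A minimal sketch of that computation, assuming v1 and v2 are the 300-dimensional vectors returned by getVecFromWord:

import numpy as np

def cosine_similarity(v1, v2):
    # cosine of the angle between the two embedding vectors, in [-1, 1]
    v1, v2 = np.asarray(v1), np.asarray(v2)
    return float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))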
 "doesntMatch": [
 "apple",
 "orange",
 "soda",
 "pineapple",
 "strawberry"
 ]
}

Output:

{
 "doesntMatchResult": "soda"
}
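
A common way to implement doesntMatch (this is how gensim's doesnt_match works, which is plausibly the backend here) is to average the unit-normalized word vectors and return the word least similar to that mean. A sketch, where vec() is a hypothetical helper mapping a word to its 300-d vector:

import numpy as np

def doesnt_match(words, vec):
    # unit-normalize each word's embedding vector
    units = {w: np.asarray(vec(w)) / np.linalg.norm(vec(w)) for w in words}
    mean = np.mean(list(units.values()), axis=0)
    # the outlier is the word least aligned with the group's mean direction
    return min(words, key=lambda w: np.dot(units[w], mean))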
 "vectorArithmetic": {
 "positive": ["flying", "swim"],
 "negative": ["fly"]
 }
}

Output:

{
 "arithmeticResult": [
 ["swimming",0.7306299805641177],
 ["swims",0.6148028373718264],
 ["paddling",0.60206538438797],
 ["swimmers",0.5975159406661986],
 ["swam",0.5887854695320128],
 ["rowing",0.5569698810577393],
 ["Swimming",0.5504106879234314],
 ["swum",0.5483155250549314],
 ["Swim",0.5433419942855837],
 ["swimmer",0.5395475625991822]
 ]
}

Example 6.

Parameter 1: A list of words that will be positive in vector arithmetic.
Parameter 2: A word that will be negative in vector arithmetic.

{
 "vectorArithmetic": {
 "positive": ["girl", "nephew"],
 "negative": ["boy"]
 }
}

Output:

{
 "arithmeticResult": [
 ["niece",0.8366490602493287],
 ["daughter",0.7679876685142516],
 ["cousin",0.738209366798401],
 ["granddaughter",0.7378045320510866],
 ["uncle",0.7184493541717529],
 ["son",0.7181788682937623],
 ["sister",0.701685130596161],
 ["aunt",0.6983187198638916],
 ["mother",0.6890008449554443],
 ["brother",0.6875420808792113]
 ]
}
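
For vectorArithmetic, the additive objective from the original paper combines the unit vectors of the inputs (weight +1 for positive words, -1 for negative), renormalizes, and ranks the vocabulary by cosine similarity to the result, excluding the input words themselves. A sketch, where vocab is a hypothetical dict mapping each word to its unit-normalized 300-d vector:

import numpy as np

def vector_arithmetic(positive, negative, vocab, num_results=10):
    # combine unit vectors: +1 for positive words, -1 for negative words
    query = sum(vocab[w] for w in positive) - sum(vocab[w] for w in negative)
    query = query / np.linalg.norm(query)
    # rank the rest of the vocabulary by cosine similarity to the query
    scores = {w: float(np.dot(v, query)) for w, v in vocab.items()
              if w not in positive and w not in negative}
    return sorted(scores.items(), key=lambda kv: -kv[1])[:num_results]

With positive=["girl", "nephew"] and negative=["boy"], the nearest remaining word is "niece", as in Example 6 above.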
 "vectorArithmetic": {
 "positive": ["hot", "winter"],
 "negative": ["summer"],
 "numResults": 5
 }
}

Output:

{
 "arithmeticResult": [
 ["cold",0.5819779634475708],
 ["Hot",0.5251858234405519],
 ["toasty",0.4921303391456604],
 ["chilly",0.48790377378463756],
 ["warm",0.4819275140762329]
 ]
}

Example 8.

Parameter 1: A list of words that will be positive in vector arithmetic.
Parameter 2: A list of words that will be negative in vector arithmetic.

{
 "vectorArithmetic": {
 "positive": ["Spanish", "French", "German", "Italy"],
 "negative": ["Spain", "France", "Germany"]
 }
}

Output:

{
 "arithmeticResult": [
 ["Italian",0.6815115213394165],
 ["Sicilian",0.5179594755172731],
 ["Romanian",0.4798910915851593],
 ["Polish",0.47541826963424694],
 ["English",0.4746833443641663],
 ["Hungarian",0.4611275792121888],
 ["Russian",0.45219406485557556],
 ["Greek",0.451249361038208],
 ["Neapolitan",0.4504689574241638],
 ["Mexican",0.4454891681671143]
 ]
}
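
Example 9 below uses vectorArithmeticCosmul, which follows the multiplicative 3CosMul objective of Levy and Goldberg (cited in the Credits): each candidate word is scored by the product of its shifted cosine similarities to the positive words divided by the product of its shifted similarities to the negative words. A sketch under the same hypothetical vocab assumption as above:

import numpy as np

def vector_arithmetic_cosmul(positive, negative, vocab, num_results=10, eps=1e-6):
    def shifted_cos(w, q):
        # shift cosine from [-1, 1] into [0, 1] so the products stay well-behaved
        return (1.0 + float(np.dot(vocab[w], vocab[q]))) / 2.0
    scores = {}
    for w in vocab:
        if w in positive or w in negative:
            continue
        num = np.prod([shifted_cos(w, p) for p in positive])
        den = np.prod([shifted_cos(w, n) for n in negative]) + eps
        scores[w] = num / den
    return sorted(scores.items(), key=lambda kv: -kv[1])[:num_results]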
 "vectorArithmeticCosmul": {
 "positive": ["mythical_creature", "horse", "magical"]
 }
}

Output:

{
 "arithmeticResult": [
 ["mythical_winged",0.35558521747589117],
 ["unicorn",0.34973922371864324],
 ["mythical",0.34622275829315186],
 ["mythological_creature",0.33106532692909246],
 ["mystical_magical",0.32782301306724565],
 ["noble_steed",0.32707193493843084],
 ["fairy",0.3262554407119751],
 ["mystical",0.3259653449058534],
 ["hippogriff",0.3259580433368684],
 ["Chincoteague_pony",0.3258856534957887]
 ]
}

Credits

For more information, please refer to:

Mikolov, Tomas, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. "Distributed representations of words and phrases and their compositionality." In Advances in Neural Information Processing Systems, pp. 3111-3119. 2013.

Levy, Omer, and Yoav Goldberg. "Linguistic Regularities in Sparse and Explicit Word Representations." In CoNLL, pp. 171-180. 2014.

Řehůřek, Radim, and Petr Sojka. "Software framework for topic modelling with large corpora." In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks. 2010.


Cost Breakdown

150 cr royalty per call
1 cr usage per second

For additional details on how pricing works, see Algorithmia pricing.

No permissions required

This algorithm does not require any special permissions.


To understand more about how algorithm permissions work, see the permissions documentation.

Use this algorithm

The snippets below use the similarityBetweenWords input from Example 3 as a sample payload; substitute any of the input formats shown above.

curl -X POST -d '{"similarityBetweenWords": ["apple", "orange"]}' -H 'Content-Type: application/json' -H 'Authorization: Simple YOUR_API_KEY' https://api.algorithmia.com/v1/algo/nlp/Word2Vec/0.1.0
View cURL Docs
algo auth
# Enter API Key: YOUR_API_KEY
algo run algo://nlp/Word2Vec/0.1.0 -d '{"similarityBetweenWords": ["apple", "orange"]}'
View CLI Docs
import com.algorithmia.*;
import com.algorithmia.algo.*;

String input = "{\"similarityBetweenWords\": [\"apple\", \"orange\"]}";
AlgorithmiaClient client = Algorithmia.client("YOUR_API_KEY");
Algorithm algo = client.algo("algo://nlp/Word2Vec/0.1.0");
AlgoResponse result = algo.pipeJson(input);
System.out.println(result.asJsonString());
View Java Docs
import com.algorithmia._
import com.algorithmia.algo._

val input = """{"similarityBetweenWords": ["apple", "orange"]}"""
val client = Algorithmia.client("YOUR_API_KEY")
val algo = client.algo("algo://nlp/Word2Vec/0.1.0")
val result = algo.pipeJson(input)
System.out.println(result.asJsonString)
View Scala Docs
var input = {"similarityBetweenWords": ["apple", "orange"]};
Algorithmia.client("YOUR_API_KEY")
           .algo("algo://nlp/Word2Vec/0.1.0")
           .pipe(input)
           .then(function(output) {
             console.log(output);
           });
View Javascript Docs
var input = {"similarityBetweenWords": ["apple", "orange"]};
Algorithmia.client("YOUR_API_KEY")
           .algo("algo://nlp/Word2Vec/0.1.0")
           .pipe(input)
           .then(function(response) {
             console.log(response.get());
           });
View NodeJS Docs
import Algorithmia

input = {"similarityBetweenWords": ["apple", "orange"]}
client = Algorithmia.client('YOUR_API_KEY')
algo = client.algo('nlp/Word2Vec/0.1.0')
print(algo.pipe(input).result)
View Python Docs
library(algorithmia)

input <- list(similarityBetweenWords = list("apple", "orange"))
client <- getAlgorithmiaClient("YOUR_API_KEY")
algo <- client$algo("nlp/Word2Vec/0.1.0")
result <- algo$pipe(input)$result
print(result)
View R Docs
require 'algorithmia'

input = {"similarityBetweenWords" => ["apple", "orange"]}
client = Algorithmia.client('YOUR_API_KEY')
algo = client.algo('nlp/Word2Vec/0.1.0')
puts algo.pipe(input).result
View Ruby Docs
use algorithmia::*;

let input = r#"{"similarityBetweenWords": ["apple", "orange"]}"#;
let client = Algorithmia::client("YOUR_API_KEY");
let algo = client.algo("nlp/Word2Vec/0.1.0");
let response = algo.pipe(input);
View Rust Docs