Doc2Vec

Vectorize documents of arbitrary length

Algorithmia Platform License (apl)
· Internet Access

This algorithm has Internet access.

This is necessary for algorithms that rely on external services. However, it also means that this algorithm can send your input data outside of the Algorithmia platform.
· Calls Other Algorithms

This algorithm has permission to call other algorithms.

This allows an algorithm to compose sophisticated functionality using other algorithms as building blocks. However, any algorithm that it calls may incur additional royalty and usage costs.

Run an Example

{
  "vectors": [
    {
      "doc": "This algorithm creates a vector representation of an input text of arbitrary length (a document) by using LDA to detect topic keywords and Word2Vec to generate word vectors, and finally concatenating the word vectors together to form a document vector. The output document vector is within the same vector space as each Word2Vec word (300 dimensions, with a range of -1, +1).",
      "vector": [
        0.06030532649407784,
        -0.01792409811168909,
        0.025814860686659814,
        0.02193225547671318,
        -0.0654110675988098,
        0.043562420768042405,
        -0.0057307022778938215,
        -0.0337689479192098,
        0.05881221691767375,
        -0.00969575463483731,
        0.011704274639487266,
        -0.04036233387887478,
        -0.024430124461650847,
        -0.002753358706831932,
        -0.02537864043066899,
        0.027641560261448223,
        -0.030880420903364816,
        0.03487328123301268,
        -0.0361946323265632,
        -0.07743266969919205,
        -0.020094247752179703,
        0.07324582589790225,
        0.02355421433846156,
        -0.013223966956138611,
        -0.049222082893053694,
        -0.019827942674358685,
        -0.009662755772781868,
        0.03762243986129761,
        0.04625193811953067,
        -0.016408607177436352,
        -0.029363690813382467,
        -0.05005241980155309,
        -0.023602936727305252,
        0.02325182467078169,
        -0.029863992938771844,
        0.04989503088096778,
        0.01614228159499665,
        0.03986085553963979,
        0.03719622089217107,
        -0.02285262942314148,
        0.02858690299714605,
        0.028394673826793827,
        0.02597948188583056,
        0.08750108703970909,
        -0.03829903205235799,
        -0.02534536172946294,
        0.023228102674086887,
        -0.013892161846160888,
        -0.04018166642636061,
        0.07351528803507487,
        -0.022103263872365157,
        -0.012082815915346146,
        -0.016305413097143173,
        -0.0609675129254659,
        0.014569002638260524,
        0.016713072825223208,
        -0.00268705493460099,
        -0.09104960107554992,
        0.020431167135636013,
        0.037206114487101635,
        0.03411986790597439,
        0.018428854706386725,
        -0.028830416748921076,
        0.031150148808956148,
        -0.014385334278146426,
        -0.04789853834857543,
        0.020120789234836896,
        0.057797762006521224,
        -0.05637224726378918,
        0.02152114696800709,
        -0.022622099012005497,
        0.0051085672341287134,
        0.02676316723227501,
        -0.028377737229069073,
        -0.007544481350729863,
        -0.0017457878217101105,
        0.041848054404060045,
        -0.04506914077016214,
        0.03204323537647724,
        0.02865609414875507,
        0.008253747907777628,
        0.01569549962878227,
        0.05570024612049262,
        0.006172493752092123,
        0.07914483537897468,
        0.06054334156215191,
        -0.05349297896027565,
        0.01662580417857195,
        0.05798225545634826,
        -0.0041223170856634775,
        0.03822450069710613,
        -0.03805435250202815,
        -0.049636424899411696,
        -0.009898614138364794,
        -0.010921155909697212,
        -0.02131686955690384,
        -0.03146557820339998,
        0.015582675707992166,
        0.061606848364075026,
        -0.0001932280759016673,
        -0.06980998700795074,
        -0.0256193200747172,
        -0.04927860822839041,
        -0.013561723528740306,
        -0.008119181667764982,
        0.04499770278731982,
        -0.04284812211990356,
        -0.006652349481980004,
        0.01978212284545104,
        -0.0514970392609636,
        -0.10608614726612964,
        -0.016344997531268744,
        0.000549684464931488,
        0.030857874701420467,
        0.0449828361471494,
        0.07003050027415156,
        0.01347468855480353,
        -0.03344209082424641,
        0.05281756293649475,
        -0.031690448087950546,
        -0.050928963844974834,
        -0.05843263020118077,
        -0.06655740533024072,
        0.016803887533023955,
        0.07638155321280161,
        0.019217370140055815,
        -0.028514182412376007,
        -0.01564730331301689,
        0.04042327019075553,
        -0.019800341377655663,
        0.014845540933310983,
        -0.041085658470789585,
        -0.017582006690402826,
        0.009197719767689704,
        -0.001815024514993032,
        -0.022931747262676556,
        0.010504862914482752,
        0.040839523915201424,
        0.03819356585542361,
        0.04634464276023209,
        0.05260767589012782,
        -0.011120116710662842,
        -0.008590816147625446,
        0.002757261693477631,
        -0.03261583995384475,
        -0.019170820608269423,
        0.01148971604804198,
        0.024935893714427948,
        -0.06880530919879675,
        -0.032202964772780736,
        0.05954377229015032,
        0.03165062516927719,
        -0.035578621675570805,
        -0.01164435495932897,
        0.003944965886572996,
        -0.009813077996174495,
        -0.07187015886108081,
        0.015315083786845207,
        -0.03237609791879852,
        0.0045871206966694444,
        -0.024614220834337175,
        0.0527975969016552,
        -0.0036365740622083336,
        -0.006118370406329632,
        -0.0011755858858426413,
        0.022896766786774,
        -0.043635585159063336,
        -0.008819182645917559,
        -0.06766440346837044,
        -0.013977280026301742,
        0.02493462804704905,
        0.026639175694435836,
        -0.09685891959816216,
        -0.035028043140967684,
        0.09088488581279912,
        0.00921480351438125,
        0.041079485292236005,
        -0.09248825684189796,
        -0.00932889332373937,
        -0.03447736812134584,
        -0.01932740757862727,
        -0.031227258127182723,
        0.021336401005585988,
        -0.04125193208456039,
        0.04102617421497902,
        0.007428573320309321,
        -0.0020782779281338037,
        0.011275188128153485,
        -0.025270077834526696,
        0.05841796138168623,
        0.019051045179367065,
        -0.016832979520161946,
        -0.03644019930313031,
        -0.0008683372288942337,
        -0.05932681430131197,
        -0.006406188073257605,
        -0.012452445303400358,
        -0.002344335926075777,
        -0.00016828042765458425,
        -0.0049996480345726015,
        0.03562680948525667,
        0.019958720045785108,
        -0.04201726609220107,
        0.015969649826486906,
        -0.03401932244499525,
        0.020928135700523855,
        -0.005919299398859342,
        0.019530660410722097,
        0.013609476387500766,
        -0.04102814264285068,
        0.000029645611842473348,
        -0.002732205390930176,
        -0.05317607646187147,
        0.029651959178348383,
        -0.0164335656290253,
        0.009846555317441623,
        0.03865352869033813,
        0.0430039651071032,
        0.002331578064089022,
        0.019666125408063333,
        0.017569420486688615,
        0.013236404435398676,
        0.00816000858321786,
        -0.03390077340106169,
        0.02741190097294748,
        -0.03319050871456663,
        0.01455792747437954,
        0.09774977415800094,
        0.06572081645329793,
        0.0058535793175299965,
        -0.03463372041781743,
        0.014992453282078108,
        0.008196703003098568,
        -0.010309802864988645,
        -0.031580050786336265,
        -0.006313801038534928,
        0.019680303335189823,
        -0.01864248520384232,
        0.11895029128839572,
        -0.00004684925079345703,
        0.009104157666054864,
        -0.01342459327230851,
        0.017158462634931006,
        -0.030355424682299296,
        0.03905810341238976,
        -0.017513367782036462,
        0.014951996729359962,
        0.0386977414910992,
        0.04236471205949784,
        0.04271812339623769,
        -0.026504703300694623,
        0.010876317570606867,
        -0.016061696286002795,
        0.010708247125148774,
        0.02705327169969678,
        -0.060949162331720194,
        -0.048739799143125616,
        0.038383877075587707,
        -0.002890423444720606,
        -0.05454139704622018,
        0.001268620075037082,
        0.00380124698082606,
        -0.05452518812380731,
        -0.07045154124498368,
        0.01618116026123365,
        -0.01078727909674247,
        0.02426539547741413,
        -0.031819724043210346,
        -0.059301220998167994,
        -0.04456954821944237,
        -0.004506841053565347,
        0.03344298824667931,
        0.004167743896444639,
        -0.015674303223689397,
        0.024451941500107446,
        0.02094793762080371,
        -0.017026034882292155,
        -0.02349477137128512,
        0.02233829473455747,
        0.053834784775972366,
        -0.02893731606503328,
        0.059744891027609506,
        0.0134747593353192,
        0.00033598033090432113,
        0.031437889114022254,
        -0.005439385709663232,
        0.03916348020235697,
        0.06308421964446703,
        -0.0008762144483625889,
        0.02252236853043238,
        -0.0448238211683929,
        -0.004856507480144501,
        -0.004822879036267599,
        -0.024616226693615316,
        0.023055320233106615,
        -0.02056920264537136,
        -0.03518804834845165,
        -0.05258100920667251,
        -0.025027464143931866,
        0.03580452594906092
      ]
    }
  ]
}
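
A result shaped like the JSON above can be consumed with a few lines of Python. The following is a minimal sketch, assuming the response body has exactly the `vectors` / `doc` / `vector` layout shown in the example. The helper names `extract_vectors` and `cosine_similarity` are hypothetical, and the three-element vectors in `sample` are toy values used for illustration, not real algorithm output (real vectors have 300 elements).

```python
import json
import math


def extract_vectors(response_text):
    """Return (doc, vector) pairs from a JSON string shaped like the example output."""
    payload = json.loads(response_text)
    return [(item["doc"], item["vector"]) for item in payload["vectors"]]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy stand-in for a response body (assumed shape, made-up numbers).
sample = json.dumps({
    "vectors": [
        {"doc": "first", "vector": [0.06, -0.02, 0.03]},
        {"doc": "second", "vector": [0.05, -0.01, 0.02]},
    ]
})

pairs = extract_vectors(sample)
similarity = cosine_similarity(pairs[0][1], pairs[1][1])
print(len(pairs), "documents; similarity between the first two:", similarity)
```

Because the document vectors share one vector space, cosine similarity is a common way to compare two of them, though the listing itself does not prescribe a particular metric.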

Install & Use

Use

curl -X POST -d '{
	"docs": "This algorithm creates a vector representation of an input text of arbitrary length (a document) by using LDA to detect topic keywords and Word2Vec to generate word vectors, and finally concatenating the word vectors together to form a document vector. The output document vector is within the same vector space as each Word2Vec word (300 dimensions, with a range of -1, +1)."
}' -H 'Content-Type: application/json' -H 'Authorization: Simple YOUR_API_KEY' https://api.algorithmia.com/v1/algo/nlp/Doc2Vec/0.5.1
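
The same request can be built from Python using only the standard library. This is a sketch under the assumption that the endpoint, headers, and `docs` input key are exactly those of the curl command above. `build_request` and `vectorize` are hypothetical helper names, and `YOUR_API_KEY` is a placeholder. The example constructs the request without sending it; calling `vectorize` requires network access and a valid key.

```python
import json
import urllib.request

# Endpoint copied from the curl command above.
API_URL = "https://api.algorithmia.com/v1/algo/nlp/Doc2Vec/0.5.1"


def build_request(text, api_key):
    """Build (but do not send) a POST request mirroring the curl command."""
    body = json.dumps({"docs": text}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Simple " + api_key,
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def vectorize(text, api_key):
    """Send the request and return the decoded JSON response (needs network access)."""
    with urllib.request.urlopen(build_request(text, api_key)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Inspect the request locally; nothing is sent here.
request = build_request("A short example document.", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Separating request construction from sending makes it possible to check the payload and headers before any data leaves your machine.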