Doc2Vec

Vectorize documents of arbitrary length

Algorithmia Platform License · Internet Access · Calls Other Algorithms

Try the API

{
  "vectors": [
    {
      "doc": "This algorithm creates a vector representation of an input text of arbitrary length (a document) by using LDA to detect topic keywords and Word2Vec to generate word vectors, and finally concatenating the word vectors together to form a document vector. The output document vector is within the same vector space as each Word2Vec word (300 dimensions, with a range of -1, +1).",
      "vector": [
        0.027143991901539266,
        0.01981183455791325,
        0.025684720138087872,
        0.008205269230529666,
        -0.05381948390277103,
        0.013424265664070845,
        0.006839196765213275,
        -0.01621866307687014,
        0.034775817854097106,
        -0.0032083491678349687,
        0.015601685037836436,
        -0.0414551820140332,
        -0.019185552271665077,
        -0.015653730137273676,
        -0.01248707470949739,
        -0.0013141741510480697,
        -0.022354501881636683,
        0.0343677290948108,
        -0.002263055881485343,
        -0.05065101897343994,
        -0.023214336484670643,
        0.0012138421880081311,
        -0.012819819123251367,
        0.012173224065918477,
        -0.021504097850993276,
        0.0000010207295417759624,
        -0.030349470485816703,
        0.02046672615688294,
        0.01864283817121759,
        -0.041347857710206895,
        -0.057572116144001505,
        -0.03543975821230561,
        -0.0017694695852696891,
        0.03232873362867395,
        -0.024808663089061152,
        0.000157032831339163,
        0.012329394798143767,
        0.028061978751793507,
        0.01870037830667571,
        -0.017004989495035264,
        0.016422725806478407,
        0.03310684754978867,
        0.004671946982853119,
        0.07057368708774449,
        -0.012893462553620337,
        -0.023272546008229256,
        0.010891805402934558,
        0.0008913860656321043,
        0.008923016779590395,
        0.020464614965021617,
        -0.014518211712129416,
        -0.034020036924630404,
        -0.03526322450488806,
        -0.041435129474848516,
        -0.0048934412188828035,
        0.0032880413345992574,
        0.015304653556086128,
        -0.061472992296330624,
        0.03438575309701263,
        0.014316927001345908,
        0.01224871585145593,
        0.006440616096369921,
        -0.004458118928596378,
        -0.0076283931266516525,
        -0.010171758243814109,
        -0.01261389238061383,
        0.006355071789585061,
        0.05128005123697223,
        -0.03772720484994352,
        0.03891348175238818,
        -0.0185069807068885,
        -0.003875146911013871,
        0.05210681026801468,
        -0.030921049183234572,
        -0.03241922444431111,
        -0.017926732194609944,
        0.013137706730049103,
        0.00021363209816628072,
        0.006145675899460913,
        0.027640249463729553,
        0.00972532242303714,
        0.01740694860927761,
        0.027610974153503783,
        0.00701133086113259,
        0.06621045019710435,
        0.023296907776966684,
        -0.019762522832024853,
        0.019673622515256287,
        0.043171904573682696,
        0.016267745057120926,
        0.035519487631972886,
        -0.04909510491415859,
        -0.04843792023893913,
        0.00199563754722476,
        0.013846672605723146,
        -0.030007521156221628,
        -0.02239044569432736,
        0.009077158545551354,
        0.033187770983204246,
        -0.0008221528260037267,
        -0.026279328929376795,
        -0.02363579347729683,
        -0.040706131985643886,
        -0.015450497026904484,
        0.011819290230050685,
        0.028742745053023107,
        -0.02519238751847297,
        0.01834643865004182,
        -0.015737715642899285,
        -0.019256000523455438,
        -0.056906251818872995,
        -0.007065853256790432,
        0.00820177549030632,
        0.02486405387753621,
        0.037452953634783626,
        0.04870824370300397,
        -0.0006703615654259939,
        -0.026535548968240626,
        0.054268803854938596,
        -0.013684971316251906,
        -0.06195446266792714,
        -0.05494800023734571,
        -0.03096719237510115,
        0.005828135675983501,
        0.026471517980098745,
        0.012299214839003982,
        -0.034326615452300764,
        0.023619152139872313,
        0.02931217142031528,
        -0.0069854992907494315,
        0.03821599821094425,
        -0.005396009888499972,
        -0.011285436397884045,
        0.005246675340458747,
        0.004851227276958533,
        -0.001854060101322827,
        0.01777325483271852,
        0.005817468685563658,
        0.02807280342676676,
        0.02147928708291147,
        0.025186905171722177,
        -0.005959279835224152,
        -0.010654685031113336,
        0.018851587286917493,
        -0.018071090831654153,
        -0.03950329258077546,
        0.002729603322222826,
        0.009343775454908613,
        -0.05563973507378251,
        0.0018385930452495837,
        0.03537160623818637,
        0.01652481243945658,
        -0.03880744683556259,
        -0.010576896136626605,
        0.02004406217020005,
        -0.013632920687086884,
        -0.05343500105664134,
        -0.013025731226662176,
        -0.0343831111531472,
        -0.01708124763899832,
        -0.006083214815589604,
        0.03613067703554409,
        0.01404912176076323,
        -0.009491792996414006,
        -0.006293900543823837,
        0.025957562669646002,
        -0.008384123211726551,
        0.020504974341747587,
        -0.06291137682273984,
        -0.00559214994427748,
        -0.015890032635070377,
        0.016843025630805638,
        -0.052454726421274245,
        -0.01115344127174466,
        0.0769261515233666,
        0.008147272863425311,
        0.00414406321942806,
        -0.07706352928653364,
        -0.03958630608394743,
        -0.01525399391539395,
        -0.05067671620054172,
        -0.009120141679886732,
        -0.0062525907487724925,
        -0.020707708550617102,
        0.025007415912114094,
        -0.02370599564164878,
        0.022737967898137867,
        0.022484672910650264,
        0.018773174058878798,
        0.030447723656834572,
        0.021969076944515116,
        -0.020686852280050523,
        -0.011349958018399775,
        -0.00746811367571354,
        -0.06805495463777332,
        0.008945465320721269,
        0.003486611647531383,
        -0.0016885352961253392,
        -0.004710285284090791,
        -0.015469745019800035,
        0.02375476283486932,
        0.012637738487683233,
        -0.04124952957499773,
        0.01123042765539139,
        -0.0006011995428707466,
        0.01356089429464191,
        0.004200032853987068,
        -0.00469896057620644,
        -0.0021694404713343807,
        -0.006164566817460586,
        -0.014151540439343078,
        0.00239170715212822,
        -0.04781669611111283,
        0.044464499340392635,
        -0.03993357717990876,
        0.011970339983236048,
        0.034746183722745634,
        0.0365600127261132,
        -0.01864897811901756,
        0.023301091976463795,
        0.010385342873632913,
        0.01554389334341977,
        -0.00841591431526467,
        -0.018186578439781442,
        -0.005373611376853658,
        -0.009000137506518516,
        0.0056689348421059575,
        0.024431283585727225,
        0.043530105846002705,
        -0.00974515324924141,
        -0.022901828866451986,
        -0.0046353013021871396,
        0.02347961891791784,
        -0.0200192712363787,
        -0.005750744370743631,
        0.0021276776328704714,
        -0.0012158300960436377,
        -0.0012599412584677304,
        0.05963090539444239,
        -0.022363600786775344,
        0.016453404132334985,
        0.026103979093022648,
        0.004002321569714696,
        -0.033269394189119346,
        0.019549426797311753,
        0.013203495647758245,
        -0.019100415115644875,
        0.025317525374703116,
        0.02710415236651897,
        0.011248429771512749,
        -0.005182000808417799,
        0.01682353124488145,
        -0.00202000729041174,
        0.010117677738890048,
        0.03571231086971238,
        -0.054988211835734546,
        -0.01165253279032186,
        0.016782900784164674,
        -0.012584103838889863,
        -0.023830399988582936,
        0.011868096160469573,
        -0.016746202483773225,
        -0.04352510480384809,
        -0.03935135394567624,
        0.032501887297257795,
        0.022795865021180355,
        0.021650580456480373,
        -0.01933683839160949,
        -0.0538557937834412,
        -0.024914431753131797,
        -0.02836218615993858,
        0.018960845074616373,
        0.013395131740253422,
        -0.0037552139838226127,
        0.02298437344143168,
        0.01547621996724047,
        -0.022638741967966787,
        -0.018363050578045666,
        0.012190043897135175,
        0.023041971726343036,
        -0.02356912870891392,
        0.05301039176993075,
        0.015248840441927314,
        0.0016618213849142174,
        0.01921335980296136,
        0.010006775264628228,
        0.03212420269846917,
        0.01704227679874748,
        -0.00718907653936185,
        0.035260906035546206,
        -0.0422189606470056,
        0.006693855626508591,
        -0.02193701290525497,
        -0.027333557925885547,
        0.012483243364840742,
        -0.0030233282013796274,
        0.00449293394922279,
        -0.031209250737447288,
        -0.031159733771346503,
        0.009491354343481365
      ]
    },
    {
      "doc": "The JSON Formatter was created to help with debugging. As JSON data is often output without line breaks to save space, it is extremely difficult to actually read and make sense of it. This little tool hoped to solve the problem by formatting the JSON data so that it is easy to read and debug by human beings.",
      "vector": [
        0.034090492757968605,
        0.0010834950953721937,
        -0.017244145623408258,
        0.02305529569275678,
        -0.06192096314043738,
        0.01593235426116735,
        0.019022508524358286,
        0.03136622009333223,
        0.012490407563745975,
        0.0010783682228066025,
        -0.0004497806949075305,
        -0.02921708684880286,
        -0.03393419669009745,
        0.0020786905661225276,
        -0.01867914362810552,
        0.028408226324245344,
        0.02594140771543607,
        0.029567147605121143,
        -0.0032608155743218963,
        -0.055602009408175966,
        -0.01358182565309108,
        -0.009133846229815399,
        0.00103790552384453,
        0.03236651280894877,
        0.0070756055938545614,
        -0.025922712229657915,
        -0.0189166759082582,
        0.028279428450787243,
        -0.008130768779665232,
        -0.05616056284634398,
        -0.020332756263087504,
        -0.009740616515045986,
        -0.01831854022748304,
        -0.01158252111054026,
        -0.045928120845928795,
        0.006713915354339405,
        0.004513605963438752,
        -0.023012300720438358,
        0.0350230342010036,
        0.0059250624617561715,
        0.03872301033698023,
        0.028844655142165724,
        0.07238367991521959,
        0.05591295502381401,
        -0.007509719929657877,
        -0.011227542068809269,
        -0.012611961690708991,
        0.021072213770821687,
        -0.03235692565795035,
        0.020633633510442444,
        -0.024133387720212337,
        -0.048147106892429285,
        -0.0077192056050989786,
        -0.03268865472637117,
        -0.03133368189446631,
        0.02156554654357024,
        0.002293075631314432,
        -0.045012263406533755,
        0.020384037517942495,
        0.025810088147409257,
        0.01929671230027452,
        0.0037914126005489427,
        -0.02754924166947603,
        0.006753519279300238,
        -0.02178754517808557,
        0.009036644012667239,
        -0.015991100110113624,
        0.050799304153770215,
        -0.016444364096969373,
        0.005445576753118071,
        0.04539831611327828,
        -0.0015969882078934498,
        0.0188321984896902,
        -0.07031910121440886,
        -0.02385020244400948,
        -0.06063573760911823,
        0.04354556079488248,
        -0.002383684477536011,
        0.0346835432574153,
        0.023877250845544037,
        -0.012007712153717879,
        0.03082251468731556,
        0.01070953335147351,
        -0.009460617147851737,
        0.0532237766929029,
        -0.02145270281471313,
        -0.0020753575809067154,
        0.028912114212289453,
        0.00464617056422867,
        0.017310851195361472,
        -0.008582581533119079,
        -0.03062087728176265,
        -0.010966247413307432,
        -0.008616600418463351,
        -0.005307347164489324,
        -0.012798283598385751,
        0.009941902830178154,
        -0.026922748365905143,
        0.04793178361433093,
        -0.01920798653736711,
        0.005211693001911044,
        0.003835095791146158,
        0.013243782741483303,
        -0.009659318951889874,
        -0.011145472410134975,
        0.032191992446314543,
        -0.03991863975534217,
        0.04506960895378143,
        0.018918160028988492,
        -0.0008073417120613209,
        -0.046340323518961675,
        -0.01299751835176721,
        0.0018837555544450918,
        0.010138093726709485,
        0.05360131154156992,
        0.026496836799196906,
        0.0006660945559815442,
        -0.022635430563241247,
        0.06732965118862923,
        -0.00223434297367931,
        -0.05608053685864432,
        -0.013796955929137768,
        -0.02685387618839741,
        0.015889765112660825,
        0.0034602356026880413,
        -0.004306862858356912,
        -0.0630030184984207,
        0.021132806781679392,
        0.018328283098526302,
        0.0011346010142005967,
        -0.04898870550096034,
        -0.02195804828079418,
        -0.03175889246631414,
        0.05440504685975612,
        0.012569100566906856,
        -0.03381696995347739,
        0.012179838144220412,
        0.0019177122157998383,
        0.017974765214603398,
        -0.0018299972543900393,
        0.043493244098499424,
        -0.03424397314665839,
        -0.029655259917490198,
        0.000382092082872985,
        -0.015456752735190094,
        0.014792145366300247,
        0.029223995283246047,
        0.019662127713672817,
        0.003878323710523547,
        -0.0472728741588071,
        0.019834619713947177,
        -0.003044557175599041,
        -0.019601315550971783,
        0.02470114271272906,
        0.009098809969145805,
        -0.05927246098872276,
        -0.045708148740232,
        0.024318398558534696,
        -0.030860833532642566,
        -0.0120482326601632,
        -0.024996204301714894,
        0.03180969413369894,
        -0.012795272283256056,
        0.01319671235978604,
        -0.008510993240633981,
        -0.026435876847244806,
        -0.006378626101650295,
        0.015468096680706367,
        -0.022016665257979184,
        -0.0025083533255383433,
        -0.021605991176329557,
        -0.0041976340289693334,
        -0.032392700552009046,
        0.016225136583670974,
        0.033339605899527676,
        0.001235019881278285,
        -0.0036305943503975994,
        -0.07867040392011407,
        -0.007448477335856297,
        -0.043002335587516434,
        -0.034422607219312354,
        -0.029404499800875783,
        0.008586820797063417,
        -0.033388243755325675,
        0.01154739211779088,
        0.03315321524860339,
        -0.07297470327466729,
        0.006788936676457526,
        0.02699406380997971,
        0.010193250898737455,
        -0.04115266422741116,
        -0.03166488208808004,
        -0.05802696628961712,
        -0.012255074107088156,
        -0.014124382403679187,
        0.028225866321008656,
        0.02145227511937265,
        -0.05762301210779699,
        -0.029952146825962703,
        -0.056015682959696285,
        0.017751881678123034,
        0.015953860507579524,
        -0.049774236162193106,
        0.017662454978562895,
        -0.01076526375254616,
        -0.013196345360483981,
        -0.019675523159094165,
        -0.04231464414624498,
        -0.011149484198540451,
        -0.034188306482974426,
        -0.043197560589760556,
        0.046617715352113016,
        -0.01408576790709048,
        0.009200358879752462,
        -0.03953008144162595,
        0.008256967703346165,
        0.05368797795381397,
        0.04544029384851456,
        -0.038709804764948806,
        -0.035323140327818685,
        -0.004602729342877865,
        0.01627262169495225,
        -0.011317657539620994,
        -0.03402534022461624,
        -0.032677487353794284,
        -0.014552727981936195,
        0.017876438272651285,
        -0.01850837469100953,
        0.026569760870188475,
        0.006948347669094801,
        0.01865362143144012,
        -0.01906173175666482,
        -0.0041431773352087475,
        -0.019150217063724995,
        -0.00728384721151087,
        -0.05239952146075668,
        -0.0033879143884405485,
        0.01823507761582733,
        0.05550864565884696,
        -0.012133012642152613,
        0.028365848120301965,
        -0.011948723491514102,
        -0.008402765728533268,
        -0.051083874219330035,
        0.002048128517344597,
        -0.009627700317651028,
        -0.012307372388022491,
        0.08049018937163058,
        -0.02844645897857845,
        -0.029450809652189498,
        -0.017531223769765344,
        0.02544510379084386,
        -0.007587570697069164,
        0.023468778352253156,
        -0.006005103816278281,
        -0.07352659595198935,
        -0.012008616351522505,
        -0.014745774911716584,
        0.008044083835557108,
        -0.025255183049011972,
        0.0294316530926153,
        -0.03415891363556513,
        -0.0062195451755542325,
        -0.03616148835862986,
        -0.012246440397575501,
        0.027205208025407046,
        0.025230002578609852,
        -0.03460944387188647,
        -0.04428348076180554,
        -0.020522962091490623,
        0.00920754903927446,
        0.03457920404616744,
        -0.007821167469955976,
        0.023169592022895816,
        0.019637110177427537,
        0.004672455077525228,
        -0.006772235181415457,
        -0.01863440289162099,
        0.004268339376721998,
        0.01862286566756666,
        -0.014994692988693721,
        0.045257222722284496,
        0.011542857624590399,
        -0.006626796093769372,
        0.021463283948833126,
        -0.00220383080886677,
        0.017388614320225333,
        -0.004200383089482785,
        -0.00850788620300591,
        0.05408464418724179,
        -0.01748840426444077,
        0.02004028484225273,
        -0.029898532608058307,
        0.024512667965609587,
        -0.029175986095651755,
        0.015900821512332186,
        -0.010429314308566973,
        -0.043275312287732966,
        -0.024509177892468873,
        0.009128060191869732
      ]
    }
  ]
}
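
The response above carries one 300-value vector per input document. A common use for such vectors is comparing documents by cosine similarity, although the page itself does not prescribe this. The following is a minimal sketch that assumes the response shape shown above (`vectors`, each entry with `doc` and `vector`). The function names `cosine_similarity` and `pairwise_similarity` are illustrative, and the 3-value vectors are toy stand-ins, not real output.

```python
import json
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pairwise_similarity(response_text):
    """Parse a response shaped like the sample above and compare its first two documents."""
    vectors = json.loads(response_text)["vectors"]
    return cosine_similarity(vectors[0]["vector"], vectors[1]["vector"])


# Toy 3-dimensional stand-in (assumption for illustration only);
# a real response holds 300 values per document.
sample = json.dumps({
    "vectors": [
        {"doc": "first document", "vector": [1.0, 0.0, 0.0]},
        {"doc": "second document", "vector": [1.0, 1.0, 0.0]},
    ]
})

similarity = pairwise_similarity(sample)
print(round(similarity, 4))  # 0.7071
```

Values near 1 indicate vectors pointing in nearly the same direction, and values near 0 indicate little overlap. How well this tracks topical similarity for this particular algorithm is not stated on the page.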

Install & Use

curl -X POST -d '{
  "docs": [
    "This algorithm creates a vector representation of an input text of arbitrary length (a document) by using LDA to detect topic keywords and Word2Vec to generate word vectors, and finally concatenating the word vectors together to form a document vector. The output document vector is within the same vector space as each Word2Vec word (300 dimensions, with a range of -1, +1).",
    "The JSON Formatter was created to help with debugging. As JSON data is often output without line breaks to save space, it is extremely difficult to actually read and make sense of it. This little tool hoped to solve the problem by formatting the JSON data so that it is easy to read and debug by human beings."
  ]
}' -H 'Content-Type: application/json' -H 'Authorization: Simple YOUR_API_KEY' https://api.algorithmia.com/v1/algo/nlp/Doc2Vec/0.2.13
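
The same request can be assembled from Python using only the standard library. This is a sketch, not an official client: the URL, the `docs` payload key, and the `Authorization: Simple ...` header are taken from the curl command above, while the helper name `build_request` and the two short example documents are assumptions made for illustration. The network call is left commented out because it needs a valid API key.

```python
import json
import urllib.request

# Endpoint copied from the curl command above.
API_URL = "https://api.algorithmia.com/v1/algo/nlp/Doc2Vec/0.2.13"


def build_request(docs, api_key):
    """Build (but do not send) a POST request carrying the documents as JSON."""
    payload = json.dumps({"docs": docs}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Simple " + api_key,
        },
        method="POST",
    )


# Placeholder documents and key (hypothetical values for illustration).
request = build_request(["First document.", "Second document."], "YOUR_API_KEY")

# Sending requires network access and a real key:
# with urllib.request.urlopen(request) as response:
#     body = json.load(response)
```

Building the request separately from sending it makes the payload and headers easy to inspect before any call is made.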