spotify / Annoy / 0.1.1

Introduction


Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given query point. It also creates large read-only file-based data structures that are mapped into memory so that many processes may share the same data.

Input:

  • (Required): input: A list of key/vector pairs, or the location of an .ann file.
  • (Required): dimension: Number of dimensions in each input vector.
  • (Optional): n_trees: Number of trees to build; more trees give higher accuracy. (Default = 10)
  • (Optional): n_results: Number of nearest neighbors to return. (Default = 10)
  • (Optional): search_k: Number of nodes inspected during a search. (Default = n_trees x n_results)
  • (Optional): save: Location at which to save the given key/vector pairs as an .ann file.
  • (Optional): get_nns_by_key: Key whose n_results nearest neighbors are returned.
  • (Optional): get_nns_by_vector: Vector whose n_results nearest neighbors are returned.
  • (Optional): get_distance: A pair of keys; the distance between them is returned.

Output:

  • nearest_neighbors: keys returned by get_nns_by_key or get_nns_by_vector.
  • save_location: location where the .ann file was saved.
  • distance: distance between the two keys given in get_distance.
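
The fields above can be assembled into a request body programmatically. The following is a minimal sketch, not part of the algorithm itself: the helper name `build_request` and its validation rules are assumptions made for illustration, while the field names and defaults are taken from the lists above.

```python
import json

# Action fields listed in the Input section above.
ACTIONS = {"save", "get_nns_by_key", "get_nns_by_vector", "get_distance"}


def build_request(source, dimension, n_trees=10, n_results=10,
                  search_k=None, **actions):
    """Assemble a request dict (hypothetical helper, for illustration only).

    `source` is either the location of an .ann file (a string) or an
    iterable of (key, vector) pairs.
    """
    unknown = set(actions) - ACTIONS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    if isinstance(source, str):
        input_field = source
    else:
        input_field = []
        for key, vector in source:
            # Every vector must match the declared dimension.
            if len(vector) != dimension:
                raise ValueError(
                    f"key {key}: expected {dimension} values, got {len(vector)}"
                )
            input_field.append({"key": key, "location": list(vector)})

    # Documented default: search_k = n_trees x n_results.
    if search_k is None:
        search_k = n_trees * n_results

    request = {
        "input": input_field,
        "dimension": dimension,
        "n_trees": n_trees,
        "n_results": n_results,
        "search_k": search_k,
    }
    request.update(actions)
    return request


if __name__ == "__main__":
    pairs = [(0, [0.1, 0.2]), (1, [0.3, 0.4])]
    print(json.dumps(build_request(pairs, 2, get_nns_by_key=0), indent=2))
```

Spelling out the optional fields makes the defaults visible in the request; omitting them should be equivalent if the defaults listed above apply.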

Examples

Example 1. Get the nearest neighbors of a key.

Input:
{
  "input": [
    {
      "key": 0,
      "location": [0.8767420894817113, -0.6700509183240524, 0.732325128862077, 2.3729040061632567, 0.3842172982561813, -0.5279857053123791, 0.3388777924688914, -0.8781371163765036, -0.15627714161231646, 1.2515703476491336]
    },
    {
      "key": 1,
      "location": [1.4233847889860505, 0.2802480981418487, 1.4523497001882992, 0.11019781892896119, -0.42626470349870627, 1.0859352470124153, -0.23808706492853393, -1.707333580508946, 0.22098479590677178, 0.06284074552955128]
    },
    {
      "key": 2,
      "location": [-1.119090401394029, -0.6007507480831501, -0.4441231187171228, 0.42985505709741534, 1.9394898471475137, 0.040620296914915095, -0.3051864310077192, 0.11910210883424684, -1.867057819379724, -0.4251404170729482]
    },
    {
      "key": 3,
      "location": [0.13236980085321293, 0.7537902335375711, -0.38742235598683566, 0.543663611109595, 1.3455497411061155, 1.768064027247205, -2.443991621181277, 1.479240975423127, -1.6418795265629658, 0.7087113400634844]
    },
    {
      "key": 4,
      "location": [0.966474074933394, 1.8249101017965141, -0.8623592618011836, -0.49037931925043954, 0.5348517185845649, 0.2358706793487077, -0.7044861571359646, 0.4003730942568148, 0.12068993974655497, -0.4408227639207874]
    }
  ],
  "dimension": 10,
  "get_nns_by_key": 2
}

Output:

{
  "nearest_neighbors": [2, 3, 0, 4, 1]
}
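
Because Annoy returns approximate results, it can be useful to compare them against an exact ranking on small inputs. The sketch below is a brute-force reference written for this README, not part of Annoy. It assumes Euclidean distance, which may differ from the metric this algorithm uses, and the function names are hypothetical.

```python
import math


def exact_neighbors(items, query, n_results=10):
    """Rank keys by exact Euclidean distance to `query`, nearest first.

    `items` uses the same shape as the "input" field above:
    a list of {"key": ..., "location": [...]} dicts.
    """
    ranked = sorted(items, key=lambda item: math.dist(item["location"], query))
    return [item["key"] for item in ranked[:n_results]]


def exact_neighbors_by_key(items, key, n_results=10):
    """Same ranking, using the vector stored under `key` as the query."""
    by_key = {item["key"]: item["location"] for item in items}
    return exact_neighbors(items, by_key[key], n_results)
```

As in the output above, a query by key places the queried key itself first, since its distance to itself is zero. This scan costs time linear in the number of items per query, so it is only practical as a check on small data sets.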

Example 2. Get the nearest neighbors of a vector.

Input:
{
  "input": [
    {
      "key": 0,
      "location": [0.8767420894817113, -0.6700509183240524, 0.732325128862077, 2.3729040061632567, 0.3842172982561813, -0.5279857053123791, 0.3388777924688914, -0.8781371163765036, -0.15627714161231646, 1.2515703476491336]
    },
    {
      "key": 1,
      "location": [1.4233847889860505, 0.2802480981418487, 1.4523497001882992, 0.11019781892896119, -0.42626470349870627, 1.0859352470124153, -0.23808706492853393, -1.707333580508946, 0.22098479590677178, 0.06284074552955128]
    },
    {
      "key": 2,
      "location": [-1.119090401394029, -0.6007507480831501, -0.4441231187171228, 0.42985505709741534, 1.9394898471475137, 0.040620296914915095, -0.3051864310077192, 0.11910210883424684, -1.867057819379724, -0.4251404170729482]
    },
    {
      "key": 3,
      "location": [0.13236980085321293, 0.7537902335375711, -0.38742235598683566, 0.543663611109595, 1.3455497411061155, 1.768064027247205, -2.443991621181277, 1.479240975423127, -1.6418795265629658, 0.7087113400634844]
    },
    {
      "key": 4,
      "location": [0.966474074933394, 1.8249101017965141, -0.8623592618011836, -0.49037931925043954, 0.5348517185845649, 0.2358706793487077, -0.7044861571359646, 0.4003730942568148, 0.12068993974655497, -0.4408227639207874]
    }
  ],
  "dimension": 10,
  "get_nns_by_vector": [1.0252480010304292, 1.3375028302621688, 0.25437080301180215, -0.4103567198423881, -0.3397271790005119, 0.5719740729482755, 0.10264453976978072, 0.8816450308892094, 0.205802476232614, 0.9364697693932269]
}

Output:

{
  "nearest_neighbors": [4, 3, 1, 0, 2]
}

Example 3. Save the input to an .ann file.

Input:
{
  "input": [
    {
      "key": 0,
      "location": [0.8767420894817113, -0.6700509183240524, 0.732325128862077, 2.3729040061632567, 0.3842172982561813, -0.5279857053123791, 0.3388777924688914, -0.8781371163765036, -0.15627714161231646, 1.2515703476491336]
    },
    {
      "key": 1,
      "location": [1.4233847889860505, 0.2802480981418487, 1.4523497001882992, 0.11019781892896119, -0.42626470349870627, 1.0859352470124153, -0.23808706492853393, -1.707333580508946, 0.22098479590677178, 0.06284074552955128]
    },
    {
      "key": 2,
      "location": [-1.119090401394029, -0.6007507480831501, -0.4441231187171228, 0.42985505709741534, 1.9394898471475137, 0.040620296914915095, -0.3051864310077192, 0.11910210883424684, -1.867057819379724, -0.4251404170729482]
    },
    {
      "key": 3,
      "location": [0.13236980085321293, 0.7537902335375711, -0.38742235598683566, 0.543663611109595, 1.3455497411061155, 1.768064027247205, -2.443991621181277, 1.479240975423127, -1.6418795265629658, 0.7087113400634844]
    },
    {
      "key": 4,
      "location": [0.966474074933394, 1.8249101017965141, -0.8623592618011836, -0.49037931925043954, 0.5348517185845649, 0.2358706793487077, -0.7044861571359646, 0.4003730942568148, 0.12068993974655497, -0.4408227639207874]
    }
  ],
  "dimension": 10,
  "save": "data://.algo/temp/output.ann"
}

Output:

{
  "save_location":"data://.algo/temp/output.ann"
}

Example 4. Get the distance between two keys in a saved .ann file.

Input:
{
  "input": "data://spotify/Annoy/test.ann",
  "dimension": 10,
  "get_distance": [55, 125]
}

Output:

{
  "distance": 2.6344156265258785
}
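
This README does not state which distance metric the algorithm uses. For orientation, the sketch below implements two metrics that Annoy supports: Euclidean distance, and Annoy's angular distance, defined as the Euclidean distance between the normalized vectors, sqrt(2(1 - cos(u, v))). The function names are chosen for illustration and are not Annoy's API.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def angular_distance(u, v):
    """sqrt(2 * (1 - cos(u, v))); assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = dot / (norm_u * norm_v)
    # Clamp at zero to absorb floating-point rounding when cosine is near 1.
    return math.sqrt(max(0.0, 2.0 * (1.0 - cosine)))
```

The angular formula cannot exceed 2, so the distance shown in the output above was not produced by that metric.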

Example 5. Query a saved .ann file by vector, returning 30 results.

Input:
{
  "input": "data://spotify/Annoy/test.ann",
  "dimension": 10,
  "n_results": 30,
  "get_nns_by_vector": [1.0252480010304292, 1.3375028302621688, 0.25437080301180215, -0.4103567198423881, -0.3397271790005119, 0.5719740729482755, 0.10264453976978072, 0.8816450308892094, 0.205802476232614, 0.9364697693932269]
}

Output:

{
  "nearest_neighbors": [417, 816, 449, 250, 433, 786, 81, 31, 174, 758, 769, 888, 884, 241, 605, 870, 734, 903, 971, 949, 911, 618, 328, 125, 824, 693, 236, 426, 766, 679]
}

Credits

For more information, please refer to: https://github.com/spotify/annoy/