nlp / LDAMapper / 0.1.1

Introduction

LDAMapper is a simple topic mapper for the output of the nlp/LDA algorithm. Given the topics produced by nlp/LDA and a list of documents, it returns the topic distribution of each document.

Input:

  • topics (required): Topics, as output by nlp/LDA. Each topic is an object mapping words to weights.
  • docsList (required): List of documents, each given as a string.
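
Both inputs travel together in one JSON object; in Example 1 its keys are "topics" and "docsList". A minimal Python sketch of assembling that object follows. The helper name build_payload and its type checks are illustrative assumptions, not part of the algorithm.

```python
import json


def build_payload(topics, docs):
    """Assemble the input object (hypothetical helper; keys follow Example 1)."""
    if not all(isinstance(t, dict) for t in topics):
        raise TypeError("each topic must be a dict of word -> weight")
    if not all(isinstance(d, str) for d in docs):
        raise TypeError("each document must be a string")
    return {"topics": list(topics), "docsList": list(docs)}


payload = build_payload(
    [{"science": 1, "machine": 1}, {"future": 1}],
    ["machine intelligence is the future"],
)
# Serialize to the JSON text that would be sent as input.
print(json.dumps(payload, indent=2))
```

Checking the shapes before sending keeps malformed topics or non-string documents from reaching the algorithm.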

Output:

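  • topic_distribution: One entry per document, holding the document text (doc) and the fraction assigned to each topic (freq), keyed by topic index.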
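
This README does not spell out how the distribution is computed. The figures in Example 1 are consistent with a simple rule: count the topic words that occur in the document as substrings, then divide each topic's count by the total. The Python sketch below implements that rule. It is an inference from the example, not the confirmed implementation; the name map_topics is hypothetical, and the word weights are ignored because every weight in the example is 1.

```python
def map_topics(topics, docs):
    """Return per-document topic shares (hypothetical re-implementation)."""
    result = []
    for doc in docs:
        # Assumption: a topic word counts once if it occurs anywhere in the
        # document as a substring, so "machine" also matches "machines".
        hits = [sum(1 for word in topic if word in doc) for topic in topics]
        total = sum(hits)
        # Normalize so the shares sum to 1; all zeros if nothing matched.
        freq = {str(i): (h / total if total else 0) for i, h in enumerate(hits)}
        result.append({"doc": doc, "freq": freq})
    return {"topic_distribution": result}


topics = [
    {"science": 1, "machine": 1},
    {"future": 1},
    {"taking": 1, "demand": 1, "human": 1, "students": 1,
     "takeover": 1, "names": 1, "computer": 1, "superintelligent": 1},
    {"rise": 1, "machines": 1, "overlords": 1},
]
docs = [
    "the machines are taking over, and they've even got human names",
    "superintelligent AI will takeover and rise",
]
for entry in map_topics(topics, docs)["topic_distribution"]:
    print(entry["freq"])
# first line printed: {'0': 0.2, '1': 0.0, '2': 0.6, '3': 0.2}
```

Substring matching is what explains the 0.2 share of topic 0 for the "machines" document: "machine" is found inside "machines", giving five hits in total.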

Examples

Example 1.

  • Parameter 1 (topics): Topics. (Output of nlp/LDA)
  • Parameter 2 (docsList): Document list.

Input:

{
  "topics": [
    {"science": 1, "machine": 1},
    {"future": 1},
    {
      "taking": 1,
      "demand": 1,
      "human": 1,
      "students": 1,
      "takeover": 1,
      "names": 1,
      "computer": 1,
      "superintelligent": 1
    },
    {"rise": 1, "machines": 1, "overlords": 1}
  ],
  "docsList": [
    "machine intelligence is the future",
    "computer science students are in demand and they know it",
    "I for one welcome our new machine overlords",
    "the machines are taking over, and they’ve even got human names",
    "superintelligent AI will takeover and rise"
  ]
}

Output:

{
  "topic_distribution":[
    {
      "doc": "machine intelligence is the future",
      "freq": {
        "0": 0.5,
        "1": 0.5,
        "2": 0,
        "3": 0
      }
    },
    {
      "doc": "computer science students are in demand and they know it",
      "freq": {
        "0": 0.25,
        "1": 0,
        "2": 0.75,
        "3": 0
      }
    },
    {
      "doc": "I for one welcome our new machine overlords",
      "freq": {
        "0": 0.5,
        "1": 0,
        "2": 0,
        "3": 0.5
      }
    },
    {
      "doc": "the machines are taking over, and they’ve even got human names",
      "freq": {
        "0": 0.2,
        "1": 0,
        "2": 0.6,
        "3": 0.2
      }
    },
    {
      "doc": "superintelligent AI will takeover and rise",
      "freq": {
        "0": 0,
        "1": 0,
        "2": 0.6666666666666666,
        "3": 0.3333333333333333
      }
    }
  ]
}
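
A common next step is to reduce each distribution to a single label. The Python sketch below picks the topic with the largest share for every document in a response shaped like the output above. The helper name dominant_topics is hypothetical, and the sample response copies two entries from Example 1.

```python
def dominant_topics(response):
    """Map each document to the topic index with the largest share."""
    best = {}
    for entry in response["topic_distribution"]:
        freq = entry["freq"]
        # max() keeps the first key that reaches the highest value, so a tie
        # resolves to whichever tied topic comes first in the dict.
        best[entry["doc"]] = max(freq, key=freq.get)
    return best


response = {
    "topic_distribution": [
        {"doc": "machine intelligence is the future",
         "freq": {"0": 0.5, "1": 0.5, "2": 0, "3": 0}},
        {"doc": "computer science students are in demand and they know it",
         "freq": {"0": 0.25, "1": 0, "2": 0.75, "3": 0}},
    ]
}
for doc, topic in dominant_topics(response).items():
    print(topic, doc)
```

Ties, such as the 0.5 / 0.5 split of the first document, are broken arbitrarily in favor of the earlier key; callers that care about ties should inspect freq directly.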