amitkumargaur / TextSimilarityMeasurement / 0.15.0

Given two documents or texts (strings), this algorithm returns a similarity value between 0 and 1: 1 for texts that are identical and 0 for texts that are completely unrelated. It transforms each text into a vector in a k-dimensional vector space model, then computes the cosine similarity between them (i.e. the dot product of the vectors divided by the product of their magnitudes).
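
The idea can be sketched in a few lines of Python. This is only an illustration and not the algorithm's actual implementation: it assumes plain whitespace tokenization and raw term counts as the vector components, and the function names text_to_vector and cosine_similarity are hypothetical. The real algorithm may weight terms differently (for example with TF-IDF), so its scores can differ from this sketch.

```python
import math
from collections import Counter


def text_to_vector(text):
    # Assumption: lowercase + whitespace split, raw term counts as weights.
    return Counter(text.lower().split())


def cosine_similarity(text_a, text_b):
    vec_a = text_to_vector(text_a)
    vec_b = text_to_vector(text_b)

    # Dot product over the terms the two texts share.
    shared_terms = vec_a.keys() & vec_b.keys()
    dot = sum(vec_a[term] * vec_b[term] for term in shared_terms)

    # Euclidean length (magnitude) of each vector.
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))

    # An empty text has no direction; treat it as unrelated.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Because term counts are never negative, the result of this sketch always falls between 0 and 1, matching the range described above.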

This algorithm is useful in content-based recommendation engines in the e-commerce domain, for recommending products that have similar attributes such as title, material, fabric, color, care tips, and pattern.

Suppose we have two products (taken from fashion sites) with the following titles/descriptions:

["olive green cotton kurta",  "green cotton kurta"]

Similarity Index:-


So, for a particular item, one can recommend similar or related items from a large dataset.
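
A recommendation step of this kind might look like the sketch below. It is an assumption-laden illustration, not the algorithm's own code: the similarity helper uses raw term counts with whitespace tokenization, and the names similarity, recommend, and catalogue, as well as the catalogue entries other than the two titles above, are hypothetical.

```python
import math
from collections import Counter


def similarity(text_a, text_b):
    # Assumption: bag-of-words term counts, compared by cosine similarity.
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm = math.sqrt(sum(c * c for c in vec_a.values())) * math.sqrt(
        sum(c * c for c in vec_b.values())
    )
    return dot / norm if norm else 0.0


def recommend(query, catalogue, top_n=3):
    # Score every other item against the query and keep the best matches.
    scored = [(similarity(query, title), title) for title in catalogue if title != query]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# Hypothetical catalogue of product titles.
catalogue = [
    "olive green cotton kurta",
    "green cotton kurta",
    "blue denim jacket",
    "green silk saree",
]

for score, title in recommend("olive green cotton kurta", catalogue, top_n=2):
    print(f"{score:.3f}  {title}")
```

For a large dataset, scoring every pair this way becomes slow; precomputing the vectors once and comparing only against them would be a natural next step.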