mahout / Recommendation / 0.2.1

Overview


Eventually we will expand this recommender to allow different parameters and options for the neighborhood and similarity functions.

A convenient dataset for trying this out is the MovieLens 100k dataset, available at http://www.grouplens.org/system/files/ml-100k.zip. More detailed documentation can be found at https://mahout.apache.org/users/recommender/recommender-documentation.html.


Sample Input

The sample input used below (movies.csv) has one rating per line, in the format userId,itemId,reviewPoints:

196,242,3
186,302,3
22,377,1
244,51,2
...
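
As a rough illustration of the format above, the rows can be read into a nested dictionary keyed by user and then by item. This is a minimal sketch in Python, not part of this package: the function name `load_ratings` and the choice of data structure are assumptions made for the example.

```python
import csv
import io
from collections import defaultdict

def load_ratings(lines):
    """Parse 'userId,itemId,reviewPoints' rows into {user: {item: rating}}."""
    ratings = defaultdict(dict)
    for row in csv.reader(lines):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        user_id, item_id, points = row
        ratings[int(user_id)][int(item_id)] = float(points)
    return dict(ratings)

# The four sample rows shown above, supplied inline so the sketch is self-contained.
sample = io.StringIO("196,242,3\n186,302,3\n22,377,1\n244,51,2\n")
ratings = load_ratings(sample)
```

To read a file instead, an open file handle such as `open("movies.csv", newline="")` can be passed in place of the `StringIO` object.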

Information about Modes


User-based recommendation

In this mode, the recommender returns a set of item recommendations for each user, along with the predicted rating for each item. The recommendations are generated from the ratings of similar users.
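
The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code this package runs: it assumes cosine similarity over co-rated items and a similarity-weighted average as the predicted rating, and the function names and toy ratings are made up for the example.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity of two {item: rating} dicts over their co-rated items."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend_for_user(ratings, user, top_n=3):
    """Return [(item, predicted_rating)] for items the user has not rated."""
    weighted, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with no overlap
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only predict unseen items
            weighted[item] = weighted.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    predictions = [(item, weighted[item] / weights[item]) for item in weighted]
    return sorted(predictions, key=lambda p: p[1], reverse=True)[:top_n]

# Hypothetical toy data: {user: {item: rating}}
toy_ratings = {
    1: {10: 5.0, 20: 3.0},
    2: {10: 5.0, 20: 3.0, 30: 4.0},
    3: {10: 1.0, 40: 2.0},
}
recommendations = recommend_for_user(toy_ratings, 1)
```

Here user 1 has rated neither item 30 nor item 40, so both come back as recommendations, ordered by predicted rating.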

Item-based recommendation

In this mode, the recommender returns, for each item, a list of similar items.
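
A minimal Python sketch of this output shape follows. It assumes cosine similarity between item rating vectors (with unrated entries treated as zero), which may differ from the similarity function actually used here; `similar_items` and the toy ratings are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def similar_items(ratings, top_n=2):
    """Map each item to its top_n most similar items by cosine similarity."""
    # Invert {user: {item: rating}} into {item: {user: rating}}.
    by_item = defaultdict(dict)
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            by_item[item][user] = rating

    def cosine(a, b):
        dot = sum(a[u] * b[u] for u in set(a) & set(b))
        norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    result = {}
    for item, vec in by_item.items():
        scored = [(other, cosine(vec, by_item[other])) for other in by_item if other != item]
        scored = [(other, score) for other, score in scored if score > 0]
        result[item] = sorted(scored, key=lambda p: p[1], reverse=True)[:top_n]
    return result

# Hypothetical toy data: {user: {item: rating}}
toy_ratings = {
    1: {10: 5.0, 20: 4.0},
    2: {10: 4.0, 20: 5.0, 30: 1.0},
    3: {30: 5.0, 40: 4.0},
}
neighbors = similar_items(toy_ratings)
```

In this toy data, items 10 and 20 are rated by the same two users with similar scores, so item 20 is the closest neighbor of item 10.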

Matrix Factorization recommendation (with Alternating Least Squares)

Matrix factorization is an alternative, often more effective approach to recommendation that can also uncover latent explanatory factors. We plan to expose more of this soon; in the meantime, this mode acts as a user-based recommender.
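
The alternating scheme itself can be sketched in plain Python. This is a minimal illustration of explicit-feedback ALS with L2 regularization, not Mahout's implementation: the function names, the `rank`, `reg`, and `iterations` parameters, and the toy ratings are all assumptions made for the example.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def update_factors(observed, fixed, rank, reg):
    """Ridge-regress each row's factors against the factors of the fixed side."""
    updated = {}
    for key, row in observed.items():
        # Normal equations: (sum f f^T + reg * I) x = sum rating * f
        a = [[reg if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for other, rating in row.items():
            f = fixed[other]
            for i in range(rank):
                b[i] += rating * f[i]
                for j in range(rank):
                    a[i][j] += f[i] * f[j]
        updated[key] = solve(a, b)
    return updated

def als(ratings, rank=2, reg=0.1, iterations=20, seed=0):
    """Alternate between solving for user factors and item factors."""
    rng = random.Random(seed)
    by_item = {}
    for user, row in ratings.items():
        for item, rating in row.items():
            by_item.setdefault(item, {})[user] = rating
    item_f = {item: [rng.random() for _ in range(rank)] for item in by_item}
    user_f = {}
    for _ in range(iterations):
        user_f = update_factors(ratings, item_f, rank, reg)   # items held fixed
        item_f = update_factors(by_item, user_f, rank, reg)   # users held fixed
    return user_f, item_f

def predict(user_f, item_f, user, item):
    """Predicted rating is the dot product of the user and item factors."""
    return sum(u * v for u, v in zip(user_f[user], item_f[item]))

# Hypothetical toy data: {user: {item: rating}}
toy_ratings = {1: {10: 5.0, 20: 3.0}, 2: {10: 4.0, 20: 2.0, 30: 1.0}, 3: {20: 3.0, 30: 5.0}}
user_f, item_f = als(toy_ratings)
```

Calling `predict(user_f, item_f, 1, 30)` then gives an estimate for an item user 1 has not rated, which is how a factorization model can be used to produce per-user recommendations.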

Upcoming Features

We plan to add ALS on implicit feedback and weighted matrix factorization soon.