Java implementation of the Frequent Pattern Growth (FP-Growth) algorithm, a scalable method for finding frequent patterns in large datasets. For example, it can be used to mine association rules and to build collaborative-filtering features such as "Other people also bought".
The algorithm takes four arguments, the last of which is optional:
- Dataset: path to a local (data://...) dataset, where each line represents a single transaction and the items on a line are separated by whitespace (see the example file).
- Support: the minimum frequency a pattern must reach to be reported. In most cases you'll want to raise this number to reduce the size of the output.
- MinItems: the minimum number of items per association rule. Setting this to 1 returns each unique item in the dataset along with the number of times it appeared. Most applications need a value higher than 1.
- Output: optional local (data://...) location to which the output JSON is written. This is required for result sets larger than 10 MB.
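To make the Dataset and Support arguments more concrete, here is a minimal sketch of the item-counting pass that FP-Growth performs before it builds its tree. It is not this algorithm's actual source: the class and method names are hypothetical, and it assumes that Support is an absolute count and that an item is counted once per transaction in which it appears. The result roughly corresponds to what you would expect with MinItems set to 1.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of the published algorithm.
public class SupportCountSketch {

    // Counts the transactions containing each item, then drops items
    // that occur in fewer than minSupport transactions.
    static Map<String, Integer> frequentItems(List<String> lines, int minSupport) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            // Items are whitespace-separated; distinct() makes a repeated
            // item within one transaction count only once (an assumption).
            Arrays.stream(trimmed.split("\\s+"))
                  .distinct()
                  .forEach(item -> counts.merge(item, 1, Integer::sum));
        }
        counts.values().removeIf(count -> count < minSupport);
        return counts;
    }

    public static void main(String[] args) {
        // Each string stands in for one line of the dataset file.
        List<String> transactions = Arrays.asList(
            "bread milk",
            "bread butter milk",
            "butter jam",
            "bread milk jam");
        System.out.println(frequentItems(transactions, 3)); // prints {bread=3, milk=3}
    }
}
```

With a support of 3, butter and jam (two transactions each) are filtered out, which shows why raising Support shrinks the output.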
This algorithm was featured in the Algorithmia Blog Post: "Mining Product Hunt, Part 2: Building a Recommendation Engine".