mahout / RandomForestTrain / 0.1.3

README.md
Trains a Mahout random forest classifier. The input consists of a string array of training data, a destination URL for the model, a string data descriptor, and the number of trees. The output is the Data API URL of the trained model, which can be applied to test data using ApplyRandomForest.

We assume that the first entry of every instance is the label, although Mahout itself supports other placements. The descriptor must be of the form "L X X X ...", where L designates the label and each X designates the type of its respective field: I (ignored), N (numerical), or C (categorical). Think of the descriptor as a header for the data. For example, a dataset with four attributes (beyond the label), where the first two are categorical, the third is numerical, and the last is ignored, would have the header "L C C N I".
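
The descriptor rules above can be sketched as a small Python helper that builds and checks a descriptor string before it is sent to the algorithm. This is only an illustration: the names `build_descriptor` and `is_valid_descriptor` are hypothetical and not part of RandomForestTrain or Mahout, and requiring at least one attribute after the label is an assumption made here.

```python
# Field type codes described above: I (ignored), N (numerical), C (categorical).
FIELD_TYPES = {"I", "N", "C"}


def build_descriptor(attribute_types):
    """Build a descriptor such as "L C C N I" from a list of attribute types.

    Hypothetical helper: the label marker "L" is always placed first,
    matching the assumption that the label is the first entry of an instance.
    """
    for field_type in attribute_types:
        if field_type not in FIELD_TYPES:
            raise ValueError(
                f"unknown field type {field_type!r}; expected I, N, or C"
            )
    return " ".join(["L", *attribute_types])


def is_valid_descriptor(descriptor):
    """Check that a descriptor starts with "L" and uses only I, N, or C after it.

    Assumption: at least one attribute must follow the label.
    """
    tokens = descriptor.split()
    return (
        len(tokens) >= 2
        and tokens[0] == "L"
        and all(token in FIELD_TYPES for token in tokens[1:])
    )


# The four-attribute example from the text: two categorical, one numerical, one ignored.
example = build_descriptor(["C", "C", "N", "I"])
print(example)  # prints "L C C N I"
```

Checking the descriptor on the client side, and comparing its length against the number of fields per instance, may catch mismatches before a training call is made.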