This algorithm computes a collusion ratio for a given set of posts and user votes. It could be used, for example, to flag suspicious voting activity on social collaboration websites such as Product Hunt, Hacker News, or Reddit. The ratio is the number of common votes (among a group of users) divided by the sum of uncommon votes and total votes.
The input must be a CSV file of comma-delimited numbers with no header row. Each line represents the votes a single post received. The first number on each line is the post_id, and the remaining numbers are the ids of the users who voted for that post/item.
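As a rough illustration of the input format described above, the following Python sketch reads CSV content into a mapping from post id to voter ids. The function name `parse_votes` and the sample data are made up for this example and are not part of the algorithm's API.

```python
import csv
import io


def parse_votes(csv_text):
    """Map each post id to the list of user ids that voted for it.

    Hypothetical helper: assumes every non-empty line is
    post_id,user_id,user_id,... with integer values and no header.
    """
    votes = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        # First number is the post id, the rest are voter ids.
        post_id, *user_ids = (int(cell) for cell in row if cell.strip())
        votes[post_id] = user_ids
    return votes


# Made-up sample: three posts and their voters.
sample = "1,10,11,12\n2,10,11\n3,12\n"
votes = parse_votes(sample)
```

Here post 1 received votes from users 10, 11, and 12, post 2 from users 10 and 11, and post 3 from user 12 only.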
The input string can be a path to a local CSV file (data://), a remote file (http://), or the raw CSV content itself.
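One way a caller might tell these three input forms apart is by prefix. This is only a sketch: the function name `classify_input` and the returned labels are hypothetical, and accepting `https://` alongside `http://` is an assumption that the text above does not confirm.

```python
def classify_input(source):
    """Guess which of the three input forms a string is (hypothetical helper)."""
    if source.startswith("data://"):
        return "data_uri"
    # Assumption: https:// is treated like http://.
    if source.startswith(("http://", "https://")):
        return "remote"
    # Anything else is taken to be raw CSV content.
    return "inline_csv"


kinds = [
    classify_input("data://user/collection/votes.csv"),
    classify_input("http://example.com/votes.csv"),
    classify_input("1,10,11\n2,10\n"),
]
```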
The second parameter is the LIMIT, a double value between 0 and 1. The algorithm returns only ratios that are larger than this limit.
The output is an array of CollusionRatio elements. Each CollusionRatio element has the following properties:
- Ratio: CommonVotes divided by the sum of OutsideVotes and AllVotes (see computation above)
- CommonVotes: # of votes in common among given users, a.k.a. Votes(U,P)
- OutsideVotes: # of votes not in common among given users, a.k.a. Votes(U, -P)
- AllVotes: # of votes by all users made on any post, a.k.a. Votes(U, *)
- UserCount: # of users that are part of this computation
- Users: array with ids of users involved in the computation
- PostId: the post id that was used to identify the group of users
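The properties above, together with the ratio and limit described earlier, can be sketched in Python as follows. This is an illustration under stated assumptions, not the algorithm's actual implementation: the snake_case field names, the helpers `make_ratio` and `filter_by_limit`, and the sample counts are all invented here, and the vote counts are passed in directly because the text does not spell out how they are derived from the input.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CollusionRatio:
    ratio: float          # Ratio
    common_votes: int     # CommonVotes, Votes(U, P)
    outside_votes: int    # OutsideVotes, Votes(U, -P)
    all_votes: int        # AllVotes, Votes(U, *)
    user_count: int       # UserCount
    users: List[int]      # Users
    post_id: int          # PostId


def make_ratio(post_id, users, common_votes, outside_votes, all_votes):
    """Build a result using ratio = common / (outside + all)."""
    denominator = outside_votes + all_votes
    # Assumption: report 0.0 when there are no votes to divide by.
    ratio = common_votes / denominator if denominator else 0.0
    return CollusionRatio(ratio, common_votes, outside_votes, all_votes,
                          len(users), list(users), post_id)


def filter_by_limit(results, limit):
    """Keep only results whose ratio is strictly larger than the limit."""
    return [r for r in results if r.ratio > limit]


# Made-up counts for two groups of users.
results = [
    make_ratio(post_id=1, users=[10, 11], common_votes=6, outside_votes=2, all_votes=8),
    make_ratio(post_id=2, users=[12, 13], common_votes=1, outside_votes=4, all_votes=5),
]
flagged = filter_by_limit(results, limit=0.5)
```

With these sample counts, the first group has a ratio of 6 / (2 + 8) and the second 1 / (4 + 5), so only the first exceeds a limit of 0.5.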