Fuse NGrams

This routine prepares text for tagging with an algorithm such as https://algorithmia.com/algorithms/kenny/LDA or https://algorithmia.com/algorithms/nlp/KeywordsForDocumentSet. It turns important multi-word terms (n-grams) into single words by replacing spaces with underscores, so "machine learning" becomes "machine_learning". Tagging routines then recognize the term as a unit instead of treating it as a collection of unrelated words. The algorithm can fuse a designated set of terms, or it can automatically detect which n-grams are likely to be important and fuse those.

If you already know the terms you want to fuse, supply the text to process together with a list of those terms. For example, the input ["Machine learning is the future of technology!",["machine learning","future of technology"]] returns "machine_learning is the future_of_technology!"

If you want to automatically discover terms that should be fused, input:

  • The input text, either a String or a String[].
  • n for the desired type of n-gram. 2 is the most common; for example, "machine learning" and "big data" are important 2-grams (bigrams).
  • A List<String> of known n-grams. If you don't know what these should be, use an empty list.
  • An int for the maximum number of n-grams to consider (default is 5).
  • An int for the minimum frequency of the n-gram in the text, that is, the number of times it appears (default is 5). If the first argument is a String, frequency is the total number of times the n-gram appears in it; if it is a String[], counts are taken across all entries.

The output is the input with all relevant n-grams fused. All strings are converted to lower case for both processing and output.
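The fusing of known terms described above can be illustrated locally. The following is a minimal Python sketch, not the algorithm's actual implementation: the function name fuse_known_terms is hypothetical, and matching terms on word boundaries with a regular expression is an assumption about how the replacement behaves.

```python
import re

def fuse_known_terms(text, terms):
    """Lower-case `text` and join each listed multi-word term with underscores.

    Hypothetical helper for illustration; the hosted algorithm may differ.
    """
    fused = text.lower()
    # Handle longer terms first so a shorter term cannot split a longer one.
    for term in sorted(terms, key=len, reverse=True):
        words = term.lower().split()
        if len(words) < 2:
            continue  # nothing to fuse in a single word
        # Assumption: words are separated by whitespace and matched on word boundaries.
        pattern = r"\b" + r"\s+".join(re.escape(w) for w in words) + r"\b"
        fused = re.sub(pattern, lambda m, ws=words: "_".join(ws), fused)
    return fused

print(fuse_known_terms(
    "Machine learning is the future of technology!",
    ["machine learning", "future of technology"],
))
# machine_learning is the future_of_technology!
```

The printed result matches the example output given in the description, including the conversion to lower case.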
If you input just a String or String[], the defaults apply: the algorithm searches for at most 5 bigrams, counting only those that meet the default minimum frequency of 5, and returns, respectively, a String or String[] with all eligible bigrams fused.
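To make the automatic mode more concrete, here is a rough Python sketch of frequency-based n-gram discovery. It rests on several assumptions: the names discover_ngrams and fuse_discovered are hypothetical, n-grams are ranked by raw count only (the page does not say how the algorithm judges importance), and the minimum frequency is treated as inclusive.

```python
import re
from collections import Counter

def discover_ngrams(texts, n=2, max_ngrams=5, min_freq=5):
    """Return up to `max_ngrams` n-grams seen at least `min_freq` times in total."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z0-9']+", text.lower())
        counts.update(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    # Assumption: rank by raw frequency and treat min_freq as inclusive.
    frequent = [gram for gram, count in counts.most_common() if count >= min_freq]
    return frequent[:max_ngrams]

def fuse_discovered(text_or_list, n=2, max_ngrams=5, min_freq=5):
    """Fuse discovered n-grams; a str in gives a str out, a list gives a list."""
    single = isinstance(text_or_list, str)
    texts = [text_or_list] if single else list(text_or_list)
    grams = discover_ngrams(texts, n, max_ngrams, min_freq)
    fused = []
    for text in texts:
        out = text.lower()
        for gram in grams:
            pattern = r"\b" + r"\s+".join(map(re.escape, gram.split())) + r"\b"
            out = re.sub(pattern, lambda m, g=gram: g.replace(" ", "_"), out)
        fused.append(out)
    return fused[0] if single else fused

docs = ["Big data tools", "big data jobs", "Big data wins"]
print(fuse_discovered(docs, min_freq=3))
# ['big_data tools', 'big_data jobs', 'big_data wins']
```

Counts are accumulated across all entries of the list, as the description states for String[] input. With the default minimum frequency of 5, a short text like the one above would come back unchanged apart from lower-casing.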

Cost Breakdown

Royalty per call: 0 cr
Usage per second: 1 cr

This algorithm has permission to call other algorithms which may incur separate royalty and usage costs.
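
As a rough illustration of how these rates combine, the sketch below estimates a bill in credits. The formula (calls multiplied by the per-call royalty plus per-second usage times duration) is an assumption based on the two rates listed here; the function name is hypothetical, and the estimate excludes any costs from other algorithms this one calls as well as any volume discounts.

```python
ROYALTY_PER_CALL_CR = 0  # "0 cr royalty per call" from the cost breakdown
USAGE_PER_SECOND_CR = 1  # "1 cr usage per second" from the cost breakdown

def estimate_cost_cr(calls, seconds_per_call):
    """Estimate total credits, assuming cost = calls * (royalty + usage * duration)."""
    return calls * (ROYALTY_PER_CALL_CR + USAGE_PER_SECOND_CR * seconds_per_call)

print(estimate_cost_cr(1000, 2))  # 2000
```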

Internet access

This algorithm has Internet access. This is necessary for algorithms that rely on external services; however, it also means that this algorithm is able to send your input data outside of the Algorithmia platform.

Calls other algorithms

This algorithm has permission to call other algorithms. This allows an algorithm to compose sophisticated functionality using other algorithms as building blocks; however, it may also incur additional royalty and usage costs from any algorithm that it calls.

To understand more about how algorithm permissions work, see the permissions documentation.

Use this algorithm

cURL:
curl -X POST -d '["Machine learning is the future of technology!",["machine learning","future of technology"]]' -H 'Content-Type: application/json' -H 'Authorization: Simple YOUR_API_KEY' https://api.algorithmia.com/v1/algo/nlp/FuseNGrams/1.0.0

CLI:
algo auth
algo run algo://nlp/FuseNGrams/1.0.0 -d '["Machine learning is the future of technology!",["machine learning","future of technology"]]'

Java:
import com.algorithmia.*;
import com.algorithmia.algo.*;

// JSON input: the text to process, followed by the list of terms to fuse
String input = "[\"Machine learning is the future of technology!\",[\"machine learning\",\"future of technology\"]]";
AlgorithmiaClient client = Algorithmia.client("YOUR_API_KEY");
Algorithm algo = client.algo("algo://nlp/FuseNGrams/1.0.0");
AlgoResponse result = algo.pipeJson(input);
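
Scala:
import com.algorithmia._
import com.algorithmia.algo._

// JSON input: the text to process, followed by the list of terms to fuse
val input = """["Machine learning is the future of technology!",["machine learning","future of technology"]]"""
val client = Algorithmia.client("YOUR_API_KEY")
val algo = client.algo("algo://nlp/FuseNGrams/1.0.0")
val result = algo.pipeJson(input)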
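
Javascript:
var input = ["Machine learning is the future of technology!", ["machine learning", "future of technology"]];
Algorithmia.client("YOUR_API_KEY").algo("algo://nlp/FuseNGrams/1.0.0").pipe(input)
           .then(function(output) { console.log(output); });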
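
NodeJS:
var Algorithmia = require("algorithmia");
var input = ["Machine learning is the future of technology!", ["machine learning", "future of technology"]];
Algorithmia.client("YOUR_API_KEY").algo("algo://nlp/FuseNGrams/1.0.0").pipe(input)
           .then(function(response) { console.log(response.get()); });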
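
Python:
import Algorithmia

# Input: the text to process, followed by the list of terms to fuse
input = ["Machine learning is the future of technology!", ["machine learning", "future of technology"]]
client = Algorithmia.client('YOUR_API_KEY')
algo = client.algo('nlp/FuseNGrams/1.0.0')
print(algo.pipe(input).result)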

R:
library(algorithmia)

input <- list("Machine learning is the future of technology!", list("machine learning", "future of technology"))
client <- getAlgorithmiaClient("YOUR_API_KEY")
algo <- client$algo("nlp/FuseNGrams/1.0.0")
result <- algo$pipe(input)$result

Ruby:
require 'algorithmia'

input = ["Machine learning is the future of technology!", ["machine learning", "future of technology"]]
client = Algorithmia.client('YOUR_API_KEY')
algo = client.algo('nlp/FuseNGrams/1.0.0')
puts algo.pipe(input).result

Rust:
use algorithmia::*;

// Text-only input, so the default discovery settings apply
let input = "Machine learning is the future of technology!";
let client = Algorithmia::client("YOUR_API_KEY");
let algo = client.algo("nlp/FuseNGrams/1.0.0");
let response = algo.pipe(input);