11 August 2020
5 min read
Semi-supervised learning is a type of machine learning that uses a small amount of labeled data together with a large amount of unlabeled data to train models. This approach is a combination of supervised machine learning, which uses labeled training data, and unsupervised...
6 August 2020
5 min read
Computing has the power to do some of the things that the human brain can do, thanks to advances in artificial intelligence. One of those advances is text processing, which also relates to natural language processing. This article is a deep dive into what text processing is and how it can generate value...
4 August 2020
5 min read
A data lake is a centralized repository of all an organization’s data stored in its raw format. This allows enterprises to store all their data, in its natural or raw state, in one location. This includes structured, relational data with rows and columns, semi-structured data such as CSV or XML files,...
29 July 2020
9 min read
The field of natural language processing (NLP) is concerned with the creation of machine learning methods for understanding written and verbal data. And as in any subfield of machine learning, it’s necessary to devise a technique for creating numerical representations of that data so it can be acted...
28 July 2020
6 min read
Applied machine learning is the application of machine learning to a specific data-related problem. This machine learning can involve either supervised models, meaning that there is an algorithm that improves itself on the basis of labeled training data, or unsupervised models, in which the inferences...
24 July 2020
6 min read
Data democratization, the process of allowing as many people as possible to have access to data without any bottlenecks or gatekeepers, can happen both within and between organizations. Within an organization, data democratization might mean that the IT department makes data easily and readily accessible...
21 July 2020
6 min read
One of the final (and arguably most important) steps in developing a machine learning model is evaluating its accuracy. You can’t trust a model to make good predictions about new and unknown data if it’s struggling with training data. For regression models, evaluating accuracy usually means calculating...
16 July 2020
5 min read
Time series are one of the most important forms of data in machine learning, and time series decomposition breaks a series of events over time down into analyzable components. Examples of data that might form a time series include the prices of stocks at various times, the number of passengers flying on an airline...
14 July 2020
5 min read
There are many metrics for measuring the performance of a model. One possible measure is the mean absolute percent error (MAPE). It is calculated by taking the absolute value of each actual value minus its prediction, dividing by the actual value, and averaging the results. Another measure of performance is the...
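As a rough illustration of the mean absolute percent error described in this excerpt, here is a minimal Python sketch. The function name `mean_absolute_percent_error` and the sample numbers are assumptions made for illustration and do not come from the article; the sketch also assumes that no actual value is zero, since the ratio is undefined in that case.

```python
def mean_absolute_percent_error(actual, predicted):
    """Return the mean of |actual - predicted| / |actual| as a fraction.

    Hypothetical helper for illustration; multiply by 100 for a percentage.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    if any(a == 0 for a in actual):
        # The metric divides by each actual value, so zeros are not allowed.
        raise ValueError("actual values must be non-zero")
    # Average the absolute relative error of each prediction.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return total / len(actual)


# Made-up sample data: relative errors are 10%, 10%, and 0%.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
mape = mean_absolute_percent_error(actual, predicted)
print(f"{mape:.2%}")  # prints 6.67%
```

Returning a fraction rather than a percentage is a design choice in this sketch; some libraries follow the same convention, but it is worth checking which one a given tool uses before comparing numbers.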
10 July 2020
5 min read
In machine learning, a parametric model is any model that captures all the information about its predictions within a finite set of parameters. Sometimes the model must be trained to select its parameters, as in the case of neural networks. Sometimes the parameters are selected by hand or through a simple...