Algorithmia Blog - Deploying AI at scale

Time series data analysis advances DevOps

Time series data anomalies found by ML models aid in DevOps capacity planning

Time series data are data points with an associated timestamp, which allows them to be indexed in time order. Such workloads are in most cases INSERT-intensive, which favors specialized time series databases over traditional relational (SQL) databases.

Prior to advancements in machine learning, much of the time series data analysis completed by DevOps engineers was limited to simple averages of key metrics with associated timestamps. By setting thresholds on those metrics in conjunction with timestamps, simple alert systems were born. Now, DevOps engineers are using time series data in ways that benefit from enhancements in the field of artificial intelligence.
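The threshold-based alerting described above can be sketched in a few lines of Python. This is only an illustration: the function name threshold_alerts, the CPU values, the 80% threshold, and the three-sample averaging window are all assumptions made for the example, not part of any particular monitoring tool.

```python
from datetime import datetime, timedelta

def threshold_alerts(samples, threshold, window=3):
    """Return the timestamps at which the mean of the last `window`
    values exceeds `threshold`. `samples` is a list of (timestamp, value)."""
    alerts = []
    for i in range(window - 1, len(samples)):
        recent = [value for _, value in samples[i - window + 1 : i + 1]]
        if sum(recent) / window > threshold:
            alerts.append(samples[i][0])
    return alerts

# Hypothetical CPU utilization readings, one per minute.
start = datetime(2000, 10, 10, 13, 0)
cpu_percent = [40, 45, 50, 85, 95, 97, 60]
samples = [(start + timedelta(minutes=i), v) for i, v in enumerate(cpu_percent)]

alerts = threshold_alerts(samples, threshold=80.0)
for ts in alerts:
    print("ALERT: average CPU above 80% at", ts.isoformat())
```

An alert of this kind only fires once the averaged metric has already crossed the line, which is why it is described as a simple approach.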

DevOps strives for 100% uptime using historical time series data

While standard alerts are useful for determining if a service or system is close to failure, DevOps now has the ability to see valuable trends in time series data. Rather than being reactive, engineers are adding methods to their tool belts to prevent system outages and prepare for events based on historical data.

This proactive approach is one of the key tenets of today’s ML DevOps methodology. Rather than focusing on fixed thresholds, DevOps can act on anomalies in time series data that machine learning models detect.
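To make the contrast with fixed thresholds concrete, here is a minimal sketch of anomaly detection. It uses a rolling z-score, a simple statistical baseline, rather than a trained machine learning model, so it stands in for the idea and not for any specific model. The function name zscore_anomalies, the latency values, the window size, and the z-score limit are assumptions chosen for illustration.

```python
from statistics import mean, stdev

def zscore_anomalies(values, window=5, z_limit=3.0):
    """Return the indexes of values that deviate from the mean of the
    preceding `window` values by more than `z_limit` standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window : i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            continue  # flat history: a z-score is undefined, so skip this point
        if abs(values[i] - mu) / sigma > z_limit:
            anomalies.append(i)
    return anomalies

# Hypothetical response times in milliseconds, with one obvious spike.
latency_ms = [100, 102, 98, 101, 99, 100, 103, 250, 101, 99]
anomalies = zscore_anomalies(latency_ms)
print("Anomalous sample indexes:", anomalies)
```

Unlike a fixed threshold, the limit here adapts to recent behavior: a value counts as unusual relative to its own history, not relative to a number chosen in advance.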

Time series in action

Let’s look at an example of some time series data that DevOps engineers are already familiar with.

127.0.0.1 user-identifier frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326

Above, we see an HTTP log entry with a number of data points. It records the client IP address, user information, the request and its result (status code and response size), and, above all, a timestamp.

DevOps can use this information to identify application failures or broken links to various assets. Engineers can act on these insights and resolve issues they encounter.
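The log entry shown earlier can be split into fields with a short parser. The sketch below assumes the entry uses plain ASCII double quotes around the request and that month abbreviations are in English; the pattern and the function name parse_log_line are illustrative, not taken from any specific log-processing library.

```python
import re
from datetime import datetime

# Fields: host, identity, user, [timestamp], "request", status, size.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_log_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # %b relies on English month abbreviations in the current locale.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

line = ('127.0.0.1 user-identifier frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326')
entry = parse_log_line(line)
print(entry["time"].isoformat(), entry["status"], entry["request"])
```

Once the timestamp is a real datetime value, entries can be ordered, bucketed by hour, or filtered by status code to find failures and broken links.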

DevOps analytics and time series data

Alternatively, engineers can identify trends in this time series data that allow for proactive capacity planning. If the data shows an increase in the number of requests during a holiday or other event, DevOps engineers can scale production resources ahead of time to ensure a good user experience. Using this type of data for capacity planning minimizes outages caused by a lack of resources, which in turn saves time and avoids the revenue that a preventable outage could cost.
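As a rough sketch of this kind of capacity planning, the example below counts requests per hour from a list of timestamps and estimates how many instances the busiest hour would need. The per-instance capacity, the 25% headroom, the sample timestamps, and both function names are assumptions invented for illustration; real figures would come from load testing and historical data.

```python
import math
from collections import Counter
from datetime import datetime

def requests_per_hour(timestamps):
    """Count requests in each hourly bucket."""
    return Counter(ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps)

def instances_needed(peak_requests, capacity_per_instance, headroom=0.25):
    """Instances required to serve the peak hour with spare headroom."""
    return math.ceil(peak_requests * (1 + headroom) / capacity_per_instance)

# Hypothetical request timestamps around a holiday morning.
timestamps = (
    [datetime(2000, 12, 24, 9, m) for m in range(0, 60, 10)]    # 6 requests
    + [datetime(2000, 12, 24, 10, m) for m in range(0, 60, 3)]  # 20 requests
)

hourly = requests_per_hour(timestamps)
peak = max(hourly.values())
needed = instances_needed(peak, capacity_per_instance=8)
print("Peak requests per hour:", peak, "-> instances needed:", needed)
```

Running the same calculation over last year's holiday traffic gives a starting point for how far to scale before the next one.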

Artificial intelligence identifies trends in economic time series data

Just as DevOps engineers are able to take advantage of advancements in machine learning, economics is also benefiting from the new technology. A good example is the use of time series data to identify trends in the stock market. AI is creating new ways to conduct risk analysis so that investors have a clearer picture of historical trends for individual companies as well as the market as a whole.

Time series data can also support more in-depth cost-benefit analysis, including forecasts from ML models trained on historical data. Ultimately, this gives insight into additional scenarios, with data to support them when presenting to stakeholders. This type of insight was rarely available prior to the introduction of artificial intelligence, making it quite valuable.

What makes today’s big data challenges more complicated, however, is the need for data scientists to have access to large datasets alongside the models they use. When working with data that involves transactions, it is critical to have an appropriate layer of security as well. 

Algorithmia allows a team’s DevOps engineers to implement a solution based on proven DevOps processes. At the same time, Algorithmia allows data scientists to branch out and innovate with their own machine learning models, or those already deployed in the Algorithmia platform.

Time series data benefits from specialized database formats 

Due to the nature of time series datasets, the database chosen must be scalable and highly available. Typical databases often cannot meet the throughput or storage needs of the large volumes of data surrounding ecommerce.

Specialized database formats are available that take advantage of advancements in software engineering, making them perfect for the types of intense analysis needed to make sense of large amounts of data. By hosting time series data in appropriate formats, data scientists and DevOps engineers also benefit from a usability standpoint.

Many functions for data retention, aggregation based on time elements, and common query tasks are built in, eliminating the need for additional DevOps processes around maintenance. The result of having the right data in the right place is an increase in efficiency across the board.
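To show what retention and time-based aggregation mean in practice, here is a plain Python sketch of both operations. A time series database would typically provide these as built-in features; the function names, the one-day retention period, and the sample points below are assumptions made for the example and do not reflect any particular database's API.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def apply_retention(points, now, keep):
    """Drop points older than the retention period `keep`."""
    cutoff = now - keep
    return [(ts, value) for ts, value in points if ts >= cutoff]

def hourly_average(points):
    """Aggregate (timestamp, value) points into per-hour averages."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts.replace(minute=0, second=0, microsecond=0)].append(value)
    return {hour: sum(vs) / len(vs) for hour, vs in sorted(buckets.items())}

now = datetime(2000, 10, 10, 14, 0)
points = [
    (datetime(2000, 10, 8, 9, 0), 10.0),    # older than the retention period
    (datetime(2000, 10, 10, 12, 15), 20.0),
    (datetime(2000, 10, 10, 12, 45), 30.0),
    (datetime(2000, 10, 10, 13, 30), 50.0),
]

recent = apply_retention(points, now, keep=timedelta(days=1))
averages = hourly_average(recent)
for hour, avg in averages.items():
    print(hour.isoformat(), avg)
```

Having the database perform these steps removes the need to schedule and maintain scripts like this one.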

Algorithmia provides a full solution for time series data analysis

Algorithmia recognizes the need for storing specialized datasets. Additionally, time series data is often stored with major cloud providers, as dictated by business needs. You can seamlessly connect many major cloud-platform storage accounts for use in ML models, while Algorithmia provides a single point of integration that handles security and scalability.

Algorithmia’s public instance includes ML models that fully utilize the way today’s time series data is stored. Using these models in combination with your own and those of the Algorithmia community fosters the innovation needed to advance AI for today’s big data needs. Data scientists can focus on their jobs, and DevOps can ensure capacity and uptime remain at appropriate service levels.