Algorithmia Blog - Deploying AI at scale

Accelerate MLOps: using CI/CD with machine learning models

Continuous Integration and Continuous Deployment (CI/CD) are key components of any mature software development environment. During CI, newly added code is merged into the codebase, kicking off builds and automated testing. If all tests succeed, then the CD phase begins, deploying the changes automatically to production. In this way, developers can immediately release changes to production by simply committing to, or merging into, the proper branch in their version control system.

Developers have a great deal of flexibility as to how they build this pipeline, due to the wide variety of open and interoperable platforms for version control and CI/CD. This is not, however, always true in the world of machine learning: it can be difficult to properly version, track, and productionize machine learning models.

Challenges of CI/CD in machine learning

Some existing services provide this functionality effectively, but lock data scientists into a black-box silo where their models must be trained, tracked, and deployed on a closed technology stack. Even when open systems are available, they do not always interoperate cleanly, forcing data scientists to undergo a steep learning curve or bring in specialists to build novel deployment tools.

At Algorithmia, we believe ML teams should have the freedom to use any training platform, language, and framework they prefer, then easily and quickly deploy their models to production. To enable this, we provide CI/CD workflows in Jenkins and GitHub Actions, which work out of the box, but can be easily modified to work with just about any CI/CD tool to continuously deploy your ML models into our scalable model-hosting platform, running in our public marketplace or in your own private-cloud or on-prem cluster.

This is made possible by our Algorithm Management API, which provides a code-only solution for deploying your model as a scalable, versioned API endpoint. Let’s take a look at a typical CI/CD pipeline for model deployment:

CI/CD workflow

The process kicks off when you train your initial model or modify the prediction code, or when an automatic process retrains the model as new data becomes available. The files are saved to network-available storage and/or a version control system such as Git. This triggers any tests that must be run, then kicks off the deployment process. In an ideal world, your new model will be live and running on your production servers within seconds or minutes as a readily available API endpoint that apps and services can call. In addition, your endpoint should support versioning, so dependent apps and services can access older versions of your model as easily as the latest copy.

Algorithmia’s CI/CD tools provide the latter stages of that workflow: detecting the change in your saved model or code, and deploying your new model to a scalable, versioned Algorithmia API endpoint (an “algorithm” in our terminology). These are drop-in configurations: the only changes you need to make are to the single Python script, which specifies the settings to use (e.g. endpoint name and execution language) and which files to deploy.
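To make the "detecting the change" stage concrete, here is a minimal sketch of one way a pipeline step could decide whether a redeploy is needed: fingerprint the model and code files, then compare against the fingerprint recorded at the last deploy. The function names and the temporary demo file are our own illustrations, not part of Algorithmia's CI/CD configurations.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint(paths):
    """Return one SHA-256 digest covering the given files' names and contents."""
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in paths):
        digest.update(path.name.encode("utf-8"))
        digest.update(path.read_bytes())
    return digest.hexdigest()


def needs_deploy(paths, last_deployed_digest):
    """True when the files differ from what was last deployed."""
    return fingerprint(paths) != last_deployed_digest


# Demo with a throwaway file standing in for a saved model.
with tempfile.TemporaryDirectory() as workdir:
    model_file = Path(workdir) / "model.pkl"
    model_file.write_bytes(b"weights-v1")
    deployed = fingerprint([model_file])                # recorded at deploy time
    unchanged = needs_deploy([model_file], deployed)    # nothing has changed yet
    model_file.write_bytes(b"weights-v2")               # simulate a retrained model
    changed = needs_deploy([model_file], deployed)
```

In practice a CI/CD tool usually gets this signal from the version control system itself; hashing is mainly useful for model files kept in network storage outside of Git.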

If you’re using Jenkins or GitHub Actions, simply clone the appropriate configuration and adjust it to your project. If you prefer a simple, notebook-driven deploy, check out our Jupyter Notebook example. If you’re using another tool, it should be fairly simple to customize the examples, or you can contact us to request new ones!

The state of machine learning in financial services

Machine learning for finance

The financial services industry has often been at the forefront of using new technology to solve business problems. It’s no surprise that many firms in this sector are embracing machine learning, especially now that increased compute power, network connectivity, and cloud infrastructure are cheaper and more accessible. 

This post will detail five important machine learning use cases that are currently providing value within financial services organizations. 

Fraud detection 

The cost of financial fraud for a financial services company jumped 9 percent between 2017 and 2018, resulting in a cost of $2.92 for every dollar of fraud. We have previously discussed machine learning applications in fraud detection in detail, but it’s worth mentioning some additional reasons why this is one of the most important applications for machine learning in this sector. 

Most fraud prevention models are based on a set of human-created rules that result in a binary classification of “fraud” or “not fraud.” The problem with these models is that they can create a high number of false positives. It’s not good for business when customers receive an abnormally high number of unnecessary fraud notifications. Trust is lost, and actual fraud may continue to go on undetected. 

Machine learning clustering and classification algorithms can help reduce the problem of false positives. They continually modify the profile of a customer whenever they take a new action. With these multiple points of data, the machine can take a nuanced approach to determine what is normal and abnormal behavior. 
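As a rough illustration of a profile that updates with every customer action, the sketch below keeps a running mean and variance of transaction amounts and flags amounts far outside that customer's norm. This is a deliberately simple statistical stand-in, not a clustering or classification model; the class name, the threshold of 3 standard deviations, and the sample amounts are all assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class SpendingProfile:
    """Running mean/variance of one customer's amounts (Welford's method)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, amount: float) -> None:
        """Fold one new transaction into the profile."""
        self.count += 1
        delta = amount - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (amount - self.mean)

    def z_score(self, amount: float) -> float:
        """Distance of `amount` from the customer's norm, in standard deviations."""
        if self.count < 2:
            return 0.0
        std = math.sqrt(self.m2 / (self.count - 1))
        return 0.0 if std == 0 else abs(amount - self.mean) / std


def is_unusual(profile: SpendingProfile, amount: float, threshold: float = 3.0) -> bool:
    return profile.z_score(amount) > threshold


profile = SpendingProfile()
for past_amount in (20.0, 25.0, 22.0, 18.0, 24.0, 21.0):
    profile.update(past_amount)

typical = is_unusual(profile, 23.0)    # close to this customer's usual spend
outlier = is_unusual(profile, 500.0)   # far outside it
```

Because the profile is per customer, the same 500.00 purchase could be normal for one account and suspicious for another, which is the nuance that fixed rules struggle to capture.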


Creditworthiness

Creditworthiness is a natural and obvious use of machine learning. For decades, banks have used very rudimentary logistic regression models with inputs like income and 30-60-90-day payment histories to determine likelihood of default, or the payment and interest terms of a loan.
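For readers who have not seen one, a logistic regression score is just a weighted sum of inputs passed through a sigmoid. The sketch below shows the arithmetic; the coefficients and applicant values are invented for illustration and do not come from any real lender.

```python
import math

# Hypothetical coefficients, chosen only for illustration. A real model
# would estimate them from historical loan outcomes.
INTERCEPT = -1.0
WEIGHTS = {
    "income_thousands": -0.03,  # higher income lowers the estimated risk
    "late_30": 0.4,             # number of payments 30+ days late
    "late_60": 0.8,             # number of payments 60+ days late
    "late_90": 1.5,             # number of payments 90+ days late
}


def default_probability(applicant):
    """Sigmoid of the weighted sum of the applicant's features."""
    score = INTERCEPT + sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-score))


steady_payer = {"income_thousands": 80, "late_30": 0, "late_60": 0, "late_90": 0}
late_payer = {"income_thousands": 40, "late_30": 2, "late_60": 1, "late_90": 1}

p_steady = default_probability(steady_payer)
p_late = default_probability(late_payer)
```

Note that an applicant with no recorded payment history at all has nothing to feed into these inputs, which is one reason such models can work against people with thin credit files.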

The logistic model can be problematic as it can penalize individuals with shorter credit histories or those who work outside of traditional banking systems. Banks also miss out on additional sources of revenue from rejected borrowers who would likely be able to pay.

With the growing number of alternative data points about individuals related to their financial histories (e.g., rent and utility bill payments or social media actions), lenders are able to use more advanced models to make more personalized decisions about creditworthiness. For example, a 2018 study suggests that a neural network machine learning model may be more accurate at predicting likelihood of default as compared to logistic regression or decision-tree modeling. 

Despite the optimism around increased equitability for customers and a larger client base for banks, there is still some trepidation around using black box algorithms for making lending decisions. Regulations, including the Fair Credit Reporting Act, require creditors to give individuals specific reasons for an outcome. This has been a challenge for engineers working with neural networks. 

Credit bureau Equifax suggests that it has found a solution to this problem, releasing a “regulatory-compliant machine learning credit scoring system” in 2018. 

Algorithmic trading

Simply defined, algorithmic trading is automated trading using a defined set of rules. A basic example would be a trader setting up automatic buy and sell rules when a stock falls below or rises above a particular price point. More sophisticated algorithms exploit arbitrage opportunities or predict stock price fluctuations based on real-world events like mergers or regulatory approvals. 
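The basic buy/sell rule described above fits in a few lines. This is a minimal sketch; the function name and the price thresholds are illustrative only.

```python
def decide(price, buy_below, sell_above):
    """Return "buy", "sell", or "hold" for one observed price."""
    if buy_below >= sell_above:
        raise ValueError("buy_below must be lower than sell_above")
    if price < buy_below:
        return "buy"
    if price > sell_above:
        return "sell"
    return "hold"


# Three observed prices checked against a 95 / 105 band.
signals = [decide(p, buy_below=95.0, sell_above=105.0) for p in (100.0, 94.5, 106.2)]
```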

The previously mentioned models require thousands of lines of human-written code and have become increasingly unwieldy. Relying on machine learning makes trading more efficient and less prone to mistakes. It is particularly beneficial in high frequency trading, when large volumes of orders need to be made as quickly as possible. 

Automated trading has been around since the 1970s, but only recently have companies had access to the technological capabilities able to handle advanced algorithms. Many banks are investing heavily in machine learning-based trading. JPMorgan Chase recently launched a foreign exchange trading tool that bundles various algorithms including time-weighted average price and volume-weighted average price along with general market conditions to make predictions on currency values.
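The two benchmark prices named above are simple to compute. The sketch below uses their standard definitions and assumes equally spaced price observations for the time-weighted average; it says nothing about how any particular bank's tool combines them.

```python
def twap(prices):
    """Time-weighted average price over equally spaced observations."""
    if not prices:
        raise ValueError("need at least one price")
    return sum(prices) / len(prices)


def vwap(prices, volumes):
    """Volume-weighted average price: sum(price * volume) / sum(volume)."""
    if len(prices) != len(volumes):
        raise ValueError("prices and volumes must have the same length")
    total_volume = sum(volumes)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(p * v for p, v in zip(prices, volumes)) / total_volume


prices = [10.0, 11.0, 12.0]
volumes = [100, 100, 300]   # most of the trading happened at the highest price

time_weighted = twap(prices)
volume_weighted = vwap(prices, volumes)
```

Here the volume-weighted figure is pulled above the time-weighted one because the heaviest trading occurred at the highest price.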


Robo-advisors

Robo-advisors have made investing and financial decision-making more accessible to the average person. Their investment strategies are derived from an algorithm based on a customer’s age, income, planned retirement date, financial goals, and risk tolerance. They typically follow traditional investment strategies and asset allocation based on that information. Because robo-advisors automate processes, they also remove the conflict of interest that arises when a human financial advisor does not work in a client’s best interest.

While robo-advisors are still a small portion of assets under management by financial services firms ($426 billion in 2018), this value is expected to more than triple by 2023. Customers are enticed by lower account minimums (sometimes $0), and wealth management companies save on the costs of employing human financial advisors. 

Cybersecurity and threat detection 

Although not unique to the financial services industry, robust cybersecurity protocols are absolutely necessary to demonstrate asset safety to customers. This is also a good use case to demonstrate how machine learning can play a role in assisting humans rather than attempting to replace them. Specific examples of how machine learning is used in cybersecurity include: 

Malware detection: Algorithms can detect malicious files by flagging never-before-seen software attempting to run as unsafe. 

Insider attacks: Algorithms can monitor network traffic throughout an organization, looking for anomalies like repeated attempts to access unauthorized applications or unusual keystroke behavior.

In both cases, the tedious task of constant monitoring is taken out of the hands of an employee and given to the computer. Analysts can then devote their time to conducting thorough investigations and determining the legitimacy of the threats.
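The "flag anything never seen before, then hand it to an analyst" workflow can be sketched without any machine learning at all. The example below is a simplification under our own assumptions: it identifies software by the SHA-256 digest of its bytes, and the class and method names are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class ExecutionMonitor:
    """Flags binaries whose digest has never been seen before."""

    def __init__(self, known_digests=()):
        self.known = set(known_digests)
        self.review_queue = []  # (name, digest) pairs awaiting an analyst

    def check(self, name: str, data: bytes) -> bool:
        """Return True if the binary is already known; otherwise queue it."""
        digest = sha256_hex(data)
        if digest in self.known:
            return True
        self.review_queue.append((name, digest))
        return False

    def approve(self, digest: str) -> None:
        """Called by an analyst once a queued binary is judged legitimate."""
        self.known.add(digest)


monitor = ExecutionMonitor(known_digests=[sha256_hex(b"trusted editor build")])
editor_ok = monitor.check("editor.exe", b"trusted editor build")
unknown_ok = monitor.check("invoice.exe", b"something new")

# The analyst investigates and approves; the same file now passes.
monitor.approve(monitor.review_queue[0][1])
after_review_ok = monitor.check("invoice.exe", b"something new")
```

The computer handles the constant monitoring, while the decision about whether a flagged file is a real threat stays with a person.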

It will be important to watch the financial sector closely because its use of machine learning and other nascent applications will play a large role in determining those technologies’ use and regulation across countless other industries.

Chaining machine learning models in production with Algorithmia

Workflow showing the tools needed at each stage

In software development, it makes sense to create reusable, portable, self-contained modules that can seamlessly plug into any application. As the old adages insist: rely on modular design, don’t repeat yourself (DRY), and write once, run anywhere. The rise of API-first design, containerization, and serverless functions has taken these lessons even further—allowing individual modules to be developed in separate languages but executed from anywhere in any context.

To reach its full potential, machine learning must follow the same principles. It’s important to create reusable abstractions around your models, keep them in a well-documented and searchable catalog, and encourage model reuse across your organization.

During model training, techniques such as transfer learning begin to address this need; but how can we benefit from reuse of shared models and utilities once they are already in production?

Architectural principles

Design with abstraction in mind: while you may be building a model for a specific, constrained business purpose, consider how it might be used in other contexts. If it only takes singular inputs, for instance, could you provide a simple wrapper to allow batches of inputs to be passed in as a list? If it expects a filename to be passed in, should you also allow for URLs or base64-encoded input?
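A batch wrapper of the kind suggested above can be very small. In this sketch, predict_one is a stand-in for a real model (it just counts words), and both function names are our own.

```python
def predict_one(text):
    """Stand-in for a real model: returns the input's word count."""
    return len(text.split())


def predict(data):
    """Accept either a single input or a list of inputs."""
    if isinstance(data, list):
        return [predict_one(item) for item in data]
    return predict_one(data)


single = predict("a quick test")
batch = predict(["one", "two words", "three short words"])
```

Callers that already pass a single value keep working unchanged, while new callers can send a whole batch in one request.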

Document and centrally catalog your models: once you’ve put in the hours necessary to conceive and train a model, move it off your laptop and into a central repository where others can discover it. Provide simple, clean documentation, which describes the purpose, limitations, inputs, and outputs of the model.

Host your models in a scalable, serverless environment: downloading is tedious, limiting, and resource-wasteful. Instead, allow other developers to access your model directly via an API. This way, they’ll only need to add a few simple lines of code to their application, instead of duplicating your entire model and associated dependencies. Host that API endpoint in a serverless environment so it can scale indefinitely and satisfy calls from any number of applications.

Search for existing endpoints before creating your own: there’s no need to build your own code from scratch, or even add another large dependency to your project. If the functionality is already provided by an API you can call, using existing resources is preferred. By thinking API-first, you’ll decrease your own module’s weight while minimizing technical debt and maintenance effort.


Model reuse and scaling

Algorithmia’s public model marketplace and Enterprise AI Layer have been designed with these principles in mind. Every model is indexed in a central, searchable catalog (with the option for individual or team-level privacy) with documentation and live sample execution, so developers can understand and even live-test the model before integrating it into their codebase.

Every model is run in Algorithmia’s scalable serverless environment and automatically wrapped by a common API, with cut-and-paste code samples provided in any language. There is no need to dig through sprawling API documentation or change patterns based on which model is called: integrating a Java deep-learning model into a Python server feels as seamless as calling a local method. Running an R package from frontend JavaScript is just a simple function call.

Screenshot of an R packageset running as a function call.

Chaining models

The benefits of Algorithmia’s design extend beyond executing models from end-user applications: it is equally simple to call one model from another model, a process known as model chaining or production model pipelining (not to be confused with training pipelines).

The core of this is the .pipe() call. UNIX users will already be familiar with the pipe “|” syntax, which sends input from one application to another; on Algorithmia, .pipe() sends input into an algorithm (a model hosted on Algorithmia), and can be used to send the output of one model directly into another model, or into a hosted utility function. For example, if we have a model called “ObjectDetection” for recognizing objects in a photo, and a utility function called “SearchTweets” for searching Twitter by keyword, and another model called “GetSentiment” which uses NLP to analyze the sentiment of text, we can write a line of code very similar to:

GetSentiment.pipe( SearchTweets.pipe( ObjectDetection.pipe(image).result ).result )

This runs an image through ObjectDetection, then sends the names of detected objects into SearchTweets, then gets the sentiment scores for the matching tweets.

Let’s implement this as an actual model pipeline, using the Algorithmia algorithms ObjectDetectionCOCO, AnalyzeTweets, UploadFileToCloudinary, and GetCloudinaryUrl. We’ll extend it a bit by picking one of the top sentiment-ranked tweets, overlaying the text on top of the image, and sending that image over to Cloudinary’s CDN for image hosting. Our full code looks something like this:

Object detection and tweet analysis chain code snippet

Line-by-line, here are the steps:

  1. Create a client for communicating with the Algorithmia service
  2. Send an image URL into ObjectDetectionCOCO v. 0.2.1, and extract all the labels found
  3. Search Twitter for tweets containing those labels via AnalyzeTweets v. 0.1.3, which also provides sentiment scores
  4. Sort the tweets based on sentiment score
  5. Upload the original image to Cloudinary
  6. Overlay the top-ranked tweet’s text on top of the image in Cloudinary’s CDN
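The original snippet was shown as an image, so here is a hedged sketch that follows the six steps above. The client.algo(...).pipe(...).result pattern is the one used by Algorithmia's Python client, but everything else is an assumption for illustration: the owner prefixes on the algorithm names, the OWNER placeholder for the Cloudinary algorithms, and every payload and result field. A small FakeClient stands in for the real service so the sketch runs without an API key.

```python
# Algorithm paths: names and versions for the first two come from the steps
# above; the owner prefixes and the Cloudinary paths are assumptions.
OBJECT_DETECTION = "deeplearning/ObjectDetectionCOCO/0.2.1"
ANALYZE_TWEETS = "nlp/AnalyzeTweets/0.1.3"
UPLOAD = "OWNER/UploadFileToCloudinary"
GET_URL = "OWNER/GetCloudinaryUrl"


def caption_image(client, image_url):
    """Chain four hosted algorithms; payload shapes are illustrative."""
    # Step 2: detect objects and keep their labels.
    detection = client.algo(OBJECT_DETECTION).pipe(image_url).result
    labels = [box["label"] for box in detection["boxes"]]
    # Step 3: search for tweets about those labels, with sentiment scores.
    tweets = client.algo(ANALYZE_TWEETS).pipe({"query": " ".join(labels)}).result
    # Step 4: sort the tweets, most positive first.
    ranked = sorted(tweets, key=lambda t: t["sentiment"], reverse=True)
    # Step 5: upload the original image.
    upload = client.algo(UPLOAD).pipe({"src": image_url}).result
    # Step 6: overlay the top-ranked tweet's text on the hosted image.
    overlay = {"public_id": upload["public_id"], "overlay_text": ranked[0]["text"]}
    return client.algo(GET_URL).pipe(overlay).result


class FakeResponse:
    def __init__(self, result):
        self.result = result


class FakeAlgo:
    def __init__(self, handler):
        self.handler = handler

    def pipe(self, payload):
        return FakeResponse(self.handler(payload))


class FakeClient:
    """Local stand-in exposing the same algo().pipe().result shape."""

    def __init__(self, handlers):
        self.handlers = handlers

    def algo(self, name):
        return FakeAlgo(self.handlers[name])


# Step 1: create a client (here a fake one with canned responses).
client = FakeClient({
    OBJECT_DETECTION: lambda url: {"boxes": [{"label": "dog"}, {"label": "frisbee"}]},
    ANALYZE_TWEETS: lambda q: [
        {"text": "meh " + q["query"], "sentiment": 0.1},
        {"text": "love " + q["query"], "sentiment": 0.9},
    ],
    UPLOAD: lambda p: {"public_id": "img123"},
    GET_URL: lambda p: "https://cdn.example/" + p["public_id"] + "?text=" + p["overlay_text"],
})

captioned_url = caption_image(client, "https://example.com/photo.jpg")
```

With the real Python client, the fake would be replaced by a client created from your API key, and the payloads would need to match each algorithm's documented input format.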

Now, with just 6 lines of code, we’ve chained together two ML models and two external services to create a fun toy app! But let’s go further, making this an API of its own, so other developers can make use of the entire pipeline in a single call. We head to Algorithmia (or to our own privately-hosted Enterprise instance) and click Create New Algorithm. Then we place the same code into the algorithm body:

Screenshot of creating a new algorithm

After we publish this, any other user will be able to make use of this pipeline by making a single function call, from any language:

making pipelines available for use by others

You can try this out yourself, and even inspect the source code (enhanced with some overlay formatting and random top-N tweet selection)!

Going further

This toy example was fun to develop, but every industry has its own specific needs and workflows that can be improved with model chaining.

To go deeper, explore our Model Pipelining whitepaper, which addresses the business-level benefits of model pipelining within your enterprise.

Thanks for taking the time to explore with Algorithmia; we look forward to seeing what great model pipelines you dream up!

Building an ML-enabled fullstack application with Vue, Flask, Mongo, and Algorithmia

Full stack enabled application with Algorithmia

Are you an experienced fullstack developer looking to bring machine learning to your apps? Or are you an ML expert who wants to build a website to show off your models? Either way, the process of bringing AI to applications can be laborious and confusing, but it doesn’t have to be!

Algorithmia has created a complete end-to-end tutorial to demonstrate how you can quickly build a modern ML-enabled web application using the following popular technologies: Vue, Flask, Mongo, and Algorithmia.

Course Specs

In our walkthrough example, we start from ground zero, showing you how to install and connect each of these technologies. From there, we build up each layer of the application, writing our backend logic, building out the presentation layer, and connecting to powerful serverless ML algorithms. 

By the end of the walkthrough, you’ll have an app skeleton for managing user profiles, enhanced by nudity-detection algorithms and auto-cropping models to create safe, automatic profile images. Use this as the basis for your next AI-powered app or take your newly acquired expertise and apply it to your own project with your favorite tech stack.

screenshot from within Algorithmia application

You can continue building out your app by adding any of Algorithmia’s 9,000+ serverless functions, or build your own ML pipelines right on the Algorithmia platform, connecting multiple powerful components together to create complex workflows callable by your app or service. 

Detect objects in an image, then search Twitter for relevant quotes, ranking them by sentiment score. Build a roommate-finder or dating tool that ensures stable matchups. Automatically detect age, gender, and even emotion in user profiles. These are only a few examples. You can also build your own machine learning model to work standalone or in combination with any algorithm on the platform.

Visit Algorithmia’s Learning Center

Ready to jump in? Start the free, interactive course today: Building a Fullstack App with Algorithmia. It is just one of the course offerings in Algorithmia’s new Learning Center.

Check back often as the Learning Center is always growing. Explore dozens of free courses and acquire skills to improve your dev capabilities. Right now you can learn how to add serverless ML to your applications, manage data, deploy your own ML models with hands-on Scikit-learn and TensorFlow walkthroughs, and a lot more!

The Learning Center is housed within Algorithmia and offers training on using Algorithmia’s AI Layer, a machine learning model deployment and management platform. The AI Layer makes it easy to deploy models as scalable microservices, regardless of framework, language, or data source.

Connectivity in Machine Learning Infrastructure 

ML Life Cycle | Connect, deploy, scale, and manage

As companies begin developing use cases for machine learning, the infrastructure to support their plans must be able to adapt as data scientists experiment with new and better processes and solutions. Concurrently, organizations must connect a variety of systems into a platform that delivers consistent results.

Machine learning architecture consists of four main groups:

  • Data and Data Management Systems
  • Training Platforms and Frameworks
  • Serving and Life Cycle Management
  • External Systems 

ML-focused projects generate value only after these functional areas connect into a workflow.

In part 3 of our Machine Learning Infrastructure whitepaper series, “Connectivity,” we discuss how those functional areas fit together to power the ML life cycle. 

It all starts with data

Most data management systems include built-in authentication, role access controls, and data views. In more advanced cases, an organization will have a data-as-a-service engine that allows for querying data through a unified interface. 

Even in the simplest cases, ML projects likely rely on a variety of data formats—different types of data stores from many different vendors. For example, one model might train on images from a cloud-based Amazon S3 bucket, while another pulls rows from on-premises PostgreSQL and SQL Server databases, while a third interprets streaming transactional data from a Kafka pipeline.  

machine learning architecture

Select a training platform

Training platforms and frameworks comprise a wide variety of tools used for model building and training. Different training platforms offer unique features. Libraries like TensorFlow, Caffe, and PyTorch offer toolsets to train models. 

The freedom of choice is paramount, as each tool specializes in certain tasks. Models can be trained locally on a GPU and then deployed, or they can be trained directly in the cloud using Dataiku, Amazon SageMaker, Azure ML Studio, or other platforms or processors.

Life cycle management systems

Model serving encompasses all the services that allow data scientists to deliver trained models into production and maintain them. Such services include the abilities to ingest models, catalog them, integrate them into DevOps workflows, and manage the ML life cycle. 

Fortunately, each ML architecture component is fairly self-contained, and the interactions between those components are fairly consistent:

  • Data informs all systems through queries.
  • Training systems export model files and dependencies.
  • Serving and life cycle management systems return inferences to applications and model pipelines, and export logs to systems of record.
  • External systems call models, trigger events, and capture and modify data.

It becomes easy to take in data and deploy ML models when these functions are grouped together. 

External systems

External systems can consume model output and integrate it in other places. Based on the type of deployment, we can create different user interfaces. For example, the model output can integrate into a REST API or another web application. RESTful APIs let us call a model from any language and integrate its output into new or existing projects.

Connectivity and machine learning sophistication

Data has made the jobs of business decision makers easier. But data is only useful after models interpret it, and model inference only generates value when external apps can integrate and consume it. That journey toward integration has two routes: horizontal integration, and a loosely coupled, tightly integrated approach.

The quickest way to develop a functioning ML platform is to support only a subset of solutions from each of the functional groups and integrate them into a single horizontal platform. Doing so requires no additional workforce training and adds speed to workflows already in place.

Unfortunately, horizontal integration commits an organization to full-time software development rather than building and training models to add business value. An architecture that allows each system to evolve independently, however, can help organizations choose the right components for today without sacrificing the flexibility to rethink those choices tomorrow. 

To enable a loosely coupled, tightly integrated approach, a deployment platform must support three kinds of connectivity: 

  • Publish/Subscribe 
  • Data Connectors
  • RESTful APIs


Publish/subscribe

Publish/subscribe (pub/sub) is an asynchronous, message-oriented notification pattern. In such a model, one system acts as a publisher, sending events to a message broker. Subscriber systems explicitly enroll in a channel through the broker, and the broker forwards publisher notifications and verifies their delivery. Subscribers can then use those notifications as event triggers.

Algorithmia’s AI Layer has configurable event listeners that allow users to trigger actions based on input from pub/sub systems. 
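To show the shape of the pattern, here is a minimal in-process sketch of a broker with channels, subscribers, and delivery counts. It is synchronous and in-memory for simplicity, whereas a real message broker delivers asynchronously across systems. The Broker class, the channel name, and the event fields are all hypothetical and are not the AI Layer's event listener API.

```python
from collections import defaultdict


class Broker:
    """Toy message broker: routes each published event to channel subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        """Enroll a subscriber callback in a channel."""
        self._subscribers[channel].append(callback)

    def publish(self, channel, event):
        """Forward the event to every subscriber; return how many received it."""
        delivered = 0
        for callback in self._subscribers[channel]:
            callback(event)
            delivered += 1
        return delivered


triggered_runs = []


def run_model_on_upload(event):
    # Stand-in for an event listener that would invoke a hosted model.
    triggered_runs.append("score:" + event["file"])


broker = Broker()
broker.subscribe("new-data", run_model_on_upload)

delivered = broker.publish("new-data", {"file": "transactions.csv"})
ignored = broker.publish("other-channel", {"file": "unrelated.csv"})
```

The publisher never references the subscriber directly, which is what keeps the two systems loosely coupled.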

Pub/sub approach

Data connectors

While the model is the engine of any machine learning system, data is both the fuel and the driver. Data feeds the model during training, influences the model in production, then retrains the model in response to drift. 

As data changes, so does its interaction with the model, and to support that iterative process, an ML deployment and management system must integrate with every relevant data connector.
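One way to picture a data connector is as a small, uniform interface that hides where the rows come from. The sketch below is an illustration under our own assumptions: the DataConnector protocol and both connector classes are hypothetical, and in-memory rows and CSV text stand in for real stores such as object storage or a database.

```python
import csv
import io
from typing import Dict, Iterable, Protocol


class DataConnector(Protocol):
    """Anything that can hand back rows as dictionaries."""

    def read(self) -> Iterable[Dict[str, str]]: ...


class InMemoryConnector:
    def __init__(self, rows):
        self.rows = list(rows)

    def read(self):
        return iter(self.rows)


class CsvTextConnector:
    def __init__(self, text):
        self.text = text

    def read(self):
        # DictReader yields one mapping per CSV line, keyed by the header row.
        return csv.DictReader(io.StringIO(self.text))


def load_all(connectors):
    """Collect rows from every source through the same interface."""
    rows = []
    for connector in connectors:
        rows.extend(dict(row) for row in connector.read())
    return rows


sources = [
    InMemoryConnector([{"id": "1", "amount": "20.00"}]),
    CsvTextConnector("id,amount\n2,35.50\n"),
]
all_rows = load_all(sources)
```

Adding a new data store then means writing one more connector class, without touching the code that trains or serves the model.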


RESTful APIs

Because requesting platforms vary widely and behave unpredictably, loose coupling is, again, the best answer, and RESTful APIs are its most elegant implementation, thanks to the constraints that REST requires:

  • Uniform interface: requests adhere to a standard format
  • Client-server: the server only interacts with the client through requests
  • Stateless: all necessary information must be included within a request
  • Layered system: a request may pass through intermediate layers between the client and the server
  • Cacheable: clients can store and reuse responses that are marked as cacheable
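The stateless and uniform-interface constraints are easiest to see in a single request that carries everything the server needs. This sketch only builds the request and never sends it; the URL, the Bearer authorization scheme, and the payload are placeholders rather than a documented endpoint.

```python
import json
import urllib.request


def build_request(url, api_key, payload):
    """Build (but do not send) a self-contained JSON POST request."""
    return urllib.request.Request(
        url=url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Credentials travel with every request; no server-side session.
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


request = build_request(
    "https://api.example.com/v1/models/sentiment",  # placeholder URL
    "YOUR_API_KEY",
    {"text": "great service"},
)
sent_body = json.loads(request.data)
```

Because the request is complete on its own, any server instance behind a load balancer can answer it, which is what lets a serverless endpoint scale out.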

To learn more about how connectivity feeds into the machine learning life cycle, download the full whitepaper.

And visit our website to read parts 1 and 2 of the Machine Learning Infrastructure whitepaper series.