Algorithmia Blog - Deploying AI at scale

From Crawling to Sprinting: Advances in Natural Language Processing 

Gif highlighting different Natural Language Processing branches e.g. prediction, translation

Natural language processing (NLP) is one of the fastest evolving branches in machine learning and among the most fundamental. It has applications in diplomacy, aviation, big data sentiment analysis, language translation, customer service, healthcare, policing and criminal justice, and countless other industries.

NLP is the reason we’ve been able to move from CTRL-F searches for single words or phrases to conversational interactions about the contents and meanings of long documents. We can now ask computers questions and have them answer. 

Algorithmia hosts more than 8,000 individual models, many of which are NLP models and complete tasks such as sentence parsing, text extraction and classification, as well as translation and language identification. 

Allen Institute for AI NLP Models on Algorithmia 

The Allen Institute for Artificial Intelligence (Ai2), is a non-profit created by Microsoft co-founder Paul Allen. Since its founding in 2013, Ai2 has worked to advance the state of AI research, especially in natural language applications. We are pleased to announce that we have worked with the producers of AllenNLP—one of the leading NLP libraries—to make their state-of-the-art models available with a simple API call in the Algorithmia AI Layer.

Among the algorithms new to the platform are:

  • Machine Comprehension: Input a body of text and a question based on it and get back the answer (strictly a substring of the original body of text).
  • Textual Entailment: Determine whether one statement follows logically from another
  • Semantic role labeling: Determine “who” did “what” to “whom” in a body of text

These and other algorithms are based on a collection of pre-trained models that are published on the AllenNLP website.  

Algorithmia provides an easy-to-use interface for getting answers out of these models. The underlying AllenNLP models provide a more verbose output, which is aimed at researchers who need to understand the models and debug their performance—this additional information is returned if you simply set debug=True.

The Ins and Outs of the AllenNLP Models 

Machine Comprehension: Create natural-language interfaces to extract information from text documents. 

This algorithm provides the state-of-the-art ability to answer a question based on a piece of text. It takes in a passage of text and a question based on that passage, and returns a substring of the passage that is guessed to be the correct answer.

This model could feature into the backend of a chatbot or provide customer support based on a user’s manual. It could also be used to extract structured data from textual documents, such as a collection of doctors’ reports could be turned into a table that says (for every report) the patient’s concern, what the patient should do, and when they should schedule a follow-up appointment.

Machine Comprehension screenshot

Screenshot from Machine Comprehension model on Algorithmia.

Entailment: This algorithm provides state-of-the-art natural language reasoning. It takes in a premise, expressed in natural language, and a hypothesis that may or may not follow up from. It determines whether the hypothesis follows from the premise, contradicts the premise, or is unrelated. The following is an example:

The input JSON blob should have the following fields:

  • premise: a descriptive piece of text
  • hypothesis: a statement that may or may not follow from the premise of the text

Any additional fields will pass through into the AllenNLP model.

The following output field will always be present:

  • contradiction: Probability that the hypothesis contradicts the premise
  • entailment: Probability that the hypothesis follows from the premise
  • neutral: Probability that the hypothesis is independent from the premise

Screenshot from Entailment model on Algorithmia.

Semantic role labeling: This algorithm provides state-of-the-art natural language reasoning—decomposing a sentence into a structured representation of the relationships it describes.

The concept of this algorithm is considering a verb and the entities involved in it as its arguments (like logical predicates). The arguments describe who or what does the action of this verb, to whom or what it is done, etc.

Semantic role labeling

Screenshot from Semantic role labeling model on Algorithmia.

NLP Moving Forward

NLP applications are rife in everyday life, and applications will only continue to expand and improve because the possibilities of a computer understanding written and spoken human language and executing on it are endless. 

CTA for browsing NLP algorithms

Use Event-Driven Model Serving on Algorithmia

“Our machine learning infrastructure is a great big Frankenstein of one-offs,” said one data scientist at our Seattle Roadshow. Heads nodded. Every time his data-driven organization needs to integrate with a new system, software development teams hardcode dependencies and schedule jobs inside their model deployment system, creating a collection of ad-hoc integrations that runs a very real risk of breaking if either connected system changes.

Event-driven model serving

Hardcoded connectors to multiple applications.

Machine learning workflows are complex, but connecting all the pieces doesn’t need to be. Data generated by many sources feed many different models. These, in turn, are used by a variety of applications (or even by other models in a pipeline). That’s a lot of moving pieces for any one data scientist to consider. This article will discuss the strategies needed to implement an event-driven architecture.

What Is An Event-Driven Model?

Event-driven programming means that the program execution flow is determined by user-action event processing. Whether it’s a key press, mouse click, or any other user input, the program will react according to the action. For example, an application could be programmed to send event notifications every time a user clicks a certain button. An event-driven approach can be valuable, but it can also be tedious. Next, we’ll discuss some of the options for event-driven architecture.

Loosely Coupled, Event-Driven: The Pub/Sub Pattern

Hardcoding every interaction between a model serving platform and its connected systems is far more work than data scientists should have to handle. Loosely coupled, event-driven architecture, such as the publish/subscribe (pub/sub) pattern, are asynchronous, message-oriented notification patterns commonly found in traditional enterprise software. It’s time they became more common in machine learning.

Event driven model serving | the pub/sub model

The pub/sub model

In a pub/sub model, one system acts as a publisher, sending messages to a message broker, such as Amazon SNS. Through that message broker, subscriber systems explicitly subscribe to a channel, and the broker forwards and verifies delivery of publisher messages, which can then be used by subscribers as event triggers.

Traditional vs. Pub/Sub Implementation

The following is a simple user story describing a routine process an organization might want to follow as it ingests scanned documents:

As a records manager, when new documents are added to an S3 bucket, I want to run them through an OCR model.

A traditional implementation might look like this:

  1. Create a document insertion trigger in the database.
  2. Create a custom, hardcoded function to invoke a model run via API, including delivery verification, failure handling, and retry logic.
  3. Maintain that function as end-user requirements change.
  4. Deprecate that function when it is no longer needed.

Over time, as other business apps want to build onto the document insertion event, a fifth step will be added:

  1. Repeat steps 2 through 4 for every other application of the trigger.

Pros of traditional implementation:

  • Near real-time communication
  • No scheduling or communication burden on ML deployment system

Cons of traditional implementation:

  • Database development team becomes a service organization.
  • Database development team must have knowledge of all downstream use cases involving document ingestion.
  • Architectural changes on either side of the exchange could break existing integrations.
Event Driven Model: A pub/sub implementation using Algorithmia's Event Listeners and an SQS queue

A pub/sub implementation using the Enterprise AI Layer’s Event Listeners and an SQS queue

The pub/sub implementation might look like this:

  1. Create a document insertion trigger in the database.
  2. When the trigger fires, send an event notification to a message broker.

At that point, the owners of the model deployment system would do the following:

  1. Subscribe to the published event feed.
  2. Build a function to invoke a model run via API.

Any other systems wishing to use that event as a trigger would follow the same process independently.

Pros of pub/sub implementation:

  • Near real-time communication
  • No scheduling or communication burden on ML deployment systems
  • Database development team can focus on database development
  • Downstream development teams can build without dependencies
  • Architectural changes far less likely to break existing integrations

The pub/sub pattern’s loose coupling is highly scalable, centralizing the burden of communication and removing it from dependent apps. It is also extremely flexible. By abstracting communications between publishers and subscribers, each side operates independently. This reduces the overhead of integrating any number of systems and allows publishers and subscribers to be swapped at any time, with no impact on performance.

Pub/sub advantages:

  • Flexible, durable integration: Upgrades to component systems won’t break communications workflows. As long as a published system continues to send messages to the appropriate queue, the receiving systems shouldn’t care how it generates those messages.
  • Developer independence: Decoupling publishers and subscribers means teams can iterate at their own paces and ship updates to components without introducing breaking changes.
  • Increased performance: Removing messaging from the application allows Ops to dedicate infrastructure to queueing, removing the processing burden from system components.
  • Modularity: Because components integrate through an intermediary message queue, any component can be added or swapped at any time without impacting the system.
  • Scalability: Since any number of applications can subscribe to a publisher’s feed, multiple systems can react to the same events.

Using Pub/Sub With Algorithmia’s Event Listeners

Screenshot from Algorithmia platform of Event Listeners

Screenshot from Algorithmia’s Enterprise AI Layer platform.

The Enterprise AI Layer provides configurable event listeners, also called an event handler, so users can trigger actions based on input from pub/sub event-driven database systems. In concert with the AI Layer’s automatic API generation, integrated versioning, and multi-language SDKs, this means your ML infrastructure is able to grow. Systems can trigger any version of any model, written in any programming language, trained with any framework. That’s critical as an ML program grows in size and scope and touches every part of a business.

The first of our supported event sources is Amazon SQS, with more to come. Read the Event Listeners tutorial in our Developer Center, give it a try, and let us know what you think!

From Jupyter Notebook to Fully Scalable Model—Now a Reality

We spend a lot of time focused on giving data scientists the best experience for deploying their machine learning models. We think they should not only use the best tools for the job, they should also be able to integrate their work easily with other tools. Today we’ll highlight one such integration: Jupyter Notebook.

When we work in Jupyter Notebook—an open-source project tool used for creating and sharing documents that contain live code snippets, visualizations, and markdown text—we are reminded how easy it is to use our API to deploy a machine learning model from within a notebook.

About Jupyter Notebook

These notebooks are popular among data scientists and are used both internally and externally to share information about data exploration, model training, and model deployment. Jupyter Notebook supports running both Python and R, two of the most common languages used by data scientists today.

An example Jupyter Notebook

Jupyter Notebook.

How We Use It

We don’t think you should be limited to creating algorithms/models solely through our UI. Instead, to give data scientists a better and more comprehensive experience, we built an API endpoint that gives more control over your algorithms. You can create, update, and publish algorithms using our Python client, which lets data scientists deploy their models directly from their existing Jupyter Notebook workflows.

We put together the following walkthrough to help guide users through the process to deploy from within a notebook. The first few steps are shown below:

In this example, after setting up credentials, we download and prep a dataset and build and deploy a text classifier model. You can see the full example notebook here. And for more information about our API, please visit our guide.

More of a Visual Learner?

Watch this short demo that goes through the full workflow.

Announcement: Today we celebrate an important milestone.

series b announcement

We are extremely thankful for our customers and their trust in us, and we want to share this news with them first and foremost. This funding enables us to double-down on developing the infrastructure to scale and accelerate their machine learning efforts. 

We are also thrilled to welcome Norwest Venture Partners (NVP) to Algorithmia and are honored for the continued support of Madrona Venture Group, Gradient Ventures, Work-Bench, Osage University Partners, and Rakuten Capital. Rama Sekhar, a partner at NVP, will be joining Algorithmia’s Board of Directors.

Their support means two things:

    1. Our customers have communicated to our investors that they are getting the personalized support they need to manage their machine learning life cycles. They feel confident not only in the product decisions we’ve made to date but also in the delivery of future product features.
    2. This funding allows us to continue maintaining our market leadership position.

We have proven as a team that we can enable humanitarian efforts of the United Nations, individual developers, and the largest companies in the world to adopt machine learning.

More importantly, our customers have given us the opportunity to truly deliver on our potential—and for that, we are immensely thankful. If you use Algorithmia, we are especially thankful for your support.

We are laser-focused on our customers

As we pioneer the field of machine learning operationalization, we know that our users are also pioneering their fields—we empathize with the challenges they face and are laser-focused on their success.

Our customers choose us because we listen to and build features they need to solve real problems and we offer responsive onboarding and support. Our customers are key to our success—we will now push even harder to help, listen to, and learn from them.

Moving forward, we will accelerate our engineering efforts and maintain our lead as the best platform for managing the machine learning life cycle.

Finally, funding is never a goal—it’s simply fuel to ensure we deliver on our mission to provide everyone the tools they need to use machine learning at any scale. If the purpose of tools is to extend human potential, then machine learning is poised to become one of the most powerful tools ever created. The AI Layer is a tool that makes machine learning available to everyone.  

Feeling thankful and energized,

Diego Oppenheimer
CEO, Algorithmia

P.S. A small ask—please share the news with your network on LinkedIn to help us spread the word.

Data Scientists Should be Able to Deploy and Iterate Their Own Models

Every quarter, we talk with more than a hundred companies working to productionize their machine learning. We see a clear path out of the stage many companies cannot get past: operationalizing the model deployment process.

The problem

Massive effort goes into any well-trained model—research shows that less than 25 percent of a data scientist’s time is spent on training models; the rest is spent collecting and engineering data and conducting DIY DevOps.

Collecting and cleaning data is a fairly easy-to-understand problem with a maturing ecosystem of best practices, products, and training. As such, let’s focus on the problem of DIY DevOps.

DIY DevOps massively increases the risk of product failure.

We often hear from data scientists that they tried to deploy a model themselves, got stuck as they attempted to fit it into existing architecture, and found out that building machine learning infrastructure is a much harder endeavor than they imagined.

A common story arc we hear:
  • A project starts with a clear problem to solve but maybe not a clear path forward
  • Collecting and cleaning the data is difficult but doable
  • Everyone realizes that deploying and iterating will yield real data and improve the model
  • Data scientists are tasked with deploying models for the project themselves
  • Data scientists get spread thin as they work on tasks outside their areas of strength
  • Months later it becomes clear that DevOps needs to be called in
  • Deploying a model gets tasked to an engineer
  • The engineer realizes that the model was created in Python when the application that needs to use it is in Java
  • The engineer heads over to Stack Overflow and decides the best path forward is to rewrite the model in Java
  • The model gets deployed weeks or months later
  • Down the line, a new version of the original Python model is trained

The solution

Companies that are serious about machine learning need an abstraction layer in their machine learning operationalization infrastructure that allows data scientists to deploy, serve, and iterate their own models in any language, using any framework.

We think about this abstraction layer as being an operating system for machine learning. But, what does it look like to help data scientists deploy and iterate their own models? It looks like using whatever tools are best for a given project. 

What do you need in your infrastructure?

Model versioning
Models aren’t static. As data science practices mature, there is a constant need to monitor model drift and retrain models as more data is made available. You’ll need a versioning system that allows you to continuously iterate on models without breaking or disrupting applications that are using older versions.

ML for DevOps
Traditional software development has a well-understood DevOps process: developers commit code to a versioned repository, then a Continuous Integration (CI) process builds, tests, and validates the most recent master branch. If everything meets deployment criteria, a Continuous Delivery or Continuous Deployment (CD) pipeline releases the latest valid version of the product to customers. As a result, developers can focus on writing code, release engineers can build and govern the software delivery process, and end users can receive the most functional code available, all within a system that can be audited and rolled back at any time.

Software development life cycle

The machine learning life cycle can, and should, borrow heavily from the traditional CI/CD model, but it’s important to account for several key differences, including:

Workflow patterns
Enterprise and commercial software development is far more likely to have many developers iterating concurrently on a single piece of code. In ML development, there may be far fewer data scientists iterating on individual models, but those models are likely to be pipelined together in myriad ways.

Dependency management
Traditional application development often leverages a single framework and one, or possibly two, languages. For example, many enterprise applications are written in C#, using the .NET framework, with small bits of highly performant code written in C++. ML models, on the other hand, are typically written using a wide range of languages and frameworks (R, Scala, Python, Java, etc.), so each model can have a unique set of dependencies.

Containerizing these models with all necessary dependencies while still optimizing start times and performance is a far more complex task than compiling a monolithic app built in a single framework.

ML development life cycle

The infrastructure for deploying and managing a machine learning portfolio is crucial for data scientists to do their best work. Though it’s a relatively new field, good practices exist for moving data science forward efficiently and effectively.