
AI and the Cloud: Cloud Machine Learning

If you’ve been keeping up with what’s happening in the AI and machine learning world, you’ve probably heard a lot of talk about this nebulous thing called the cloud. While the cloud is often used to describe a variety of offerings for decentralized computing, there’s an underlying similarity among all such services.

Use cases for cloud machine learning

Simply put, the cloud consists of collections of remote servers housed by tech companies in server farms, and the use cases for the cloud are nearly endless. These servers do everything from running the latest machine learning algorithms on your data, to hosting your website, to storing your photography collection.

Using the cloud is a vital component of most tech businesses in this new AI age, and whoever ends up dominating the market will stand to become entrenched for years to come.

Costs and benefits of a cloud AI platform

For AI and machine learning practitioners, the key benefit of the cloud is simple: for most people, setting up and hosting their own machine learning infrastructure is prohibitively expensive. Entry-level GPU cards for training machine learning models run close to $1,000, and the best cards run two to four times that. Of course, for many models you achieve greater training speeds by running cards in parallel, but doing so requires purchasing multiple cards and networking them together, which is no easy feat.

On top of this, you need to house the cards in a workstation with cooling powerful enough to prevent overheating. Then factor in the cost of supplying power to the system, since training machine learning models is incredibly resource-intensive. All told, an elite machine learning hardware setup can carry startup costs of over $10,000, and that doesn’t even account for more specialized hardware such as TPUs or FPGAs.
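As a rough back-of-envelope illustration, the sketch below tallies these costs in a few lines of Python. Every line item is an illustrative assumption, anchored only to the figures above.

```python
# Back-of-envelope tally of on-prem ML hardware startup costs.
# All line items are illustrative assumptions based on the rough figures above.
gpu_cost = 2_500                  # between an entry card (~$1,000) and a high-end one (~$4,000)
num_gpus = 3                      # several cards networked together for parallel training
workstation_and_cooling = 2_000   # chassis, power supply, cooling (assumed)
power_and_misc = 500              # electricity, interconnects, cabling (assumed)

total = gpu_cost * num_gpus + workstation_and_cooling + power_and_misc
print(f"Estimated startup cost: ${total:,}")  # roughly $10,000
```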

Serverless ML architectures offer potentially infinite scalability when run on cloud services, and their real-time scaling produces minimal waste, consuming only the resources needed to respond to demand. For these reasons, serverless is a natural choice for cloud-based machine learning. Without careful configuration, however, organizations run the risk of underprovisioning resources in their quest for efficiency.
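To make this concrete, here is a minimal sketch of a serverless inference function, assuming an AWS Lambda-style handler; the model file name is a hypothetical placeholder. Caching the model in a module-level variable lets warm containers answer instantly while the platform scales the number of containers with demand.

```python
import json
import joblib

_model = None  # cached across warm invocations of the same container

def handler(event, context):
    """Lambda-style entry point that serves one prediction per request (sketch)."""
    global _model
    if _model is None:
        # "model.pkl" is a hypothetical artifact bundled with the function package
        _model = joblib.load("model.pkl")
    features = json.loads(event["body"])["features"]
    prediction = _model.predict([features]).tolist()
    return {"statusCode": 200, "body": json.dumps({"prediction": prediction})}
```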

Training models in the cloud

Getting started with training models in the cloud is remarkably simple. Using a cloud provider, you choose a machine with compute power sufficient for your task, spin up an instance, load your libraries and code, and you’re off to the races.
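For instance, here is a minimal sketch of launching a GPU training instance with boto3, AWS’s Python SDK; the AMI ID and key pair name are hypothetical placeholders you would swap for your own.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch a single GPU instance for training (IDs and names are hypothetical)
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # e.g., a Deep Learning AMI
    InstanceType="p3.2xlarge",        # one NVIDIA V100 GPU
    KeyName="my-keypair",             # SSH key pair for loading code and libraries
    MinCount=1,
    MaxCount=1,
)
instance_id = response["Instances"][0]["InstanceId"]
print(f"Training instance launched: {instance_id}")

# When training finishes, terminate the instance so the meter stops running:
# ec2.terminate_instances(InstanceIds=[instance_id])
```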

Instances cost anywhere from a few cents to a few dollars per hour, and you only pay for the time you use. You can shut the machine off whenever you like, and of course you avoid all the costs that come with hardware setup, failure, and maintenance.

Hardware for cloud AI platforms 

Certain cloud providers also give access to niche hardware that’s not available anywhere else. For example, using GCP you can train your machine learning models on TPUs, specialized processors designed to handle complex tensor arithmetic. Other platforms offer access to FPGAs. 
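As a sketch of what that looks like in practice, TensorFlow can target a Cloud TPU through a distribution strategy; the TPU name below is a hypothetical placeholder for a provisioned TPU node.

```python
import tensorflow as tf

# Connect to a provisioned Cloud TPU ("my-tpu" is a hypothetical resource name)
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="my-tpu")
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

# Anything built inside the strategy scope is replicated across the TPU cores
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```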

For most people and most workloads, it’s hard to beat the diversity of hardware options and the affordable pay-as-you-go model that the cloud provides. That’s not to say that running applications in the cloud is always inexpensive: training its latest NLP language model, GPT-2, reportedly costs OpenAI over $250 per hour.

Hosting models in the cloud

The cloud isn’t just for training models—it’s used for hosting them too. Data scientists and developers can package their trained models as services and then deploy them to generate online predictions. Cloud services can also provide useful analytics to hosts about server load and how many times their model was queried.
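As a minimal sketch of what packaging a model as a service can mean, the Flask app below wraps a serialized scikit-learn model behind a prediction endpoint and tracks the kind of query count a cloud host might surface in its analytics; the model file name is a hypothetical placeholder.

```python
from flask import Flask, jsonify, request
import joblib

app = Flask(__name__)
model = joblib.load("model.pkl")  # hypothetical serialized scikit-learn model
query_count = 0  # the kind of usage metric a hosting platform can report

@app.route("/predict", methods=["POST"])
def predict():
    global query_count
    query_count += 1
    features = request.get_json()["features"]
    prediction = model.predict([features]).tolist()
    return jsonify({"prediction": prediction, "queries_served": query_count})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```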

Avoiding lock-in

For enterprises, choosing a cloud service is an important step in establishing a tech stack because switching providers downstream can be difficult. Once an organization couples its code, developer team, and infrastructure to a specific framework or service, those choices are hard to undo, simply because development is so layered.

Code is built atop code, and changing core libraries often means rewriting and reworking a sizable portion of the code base. What’s more, many services tie specific frameworks and AI platforms to their usage: AWS offers SageMaker, and GCP is optimized for use with TensorFlow. GCP also provides a service called Cloud AutoML, which automates the process of training a machine learning model for you.

Algorithmia’s AI Layer supports any cloud-based model deployment and serving need so users can avoid vendor lock-in. We have built-in tools for versioning, serving, deployment, pipelining, and integrating with your current workflows. 

The AI Layer integrates with any data connectors your organization is currently using to make machine learning easier, taking you from data collection to model deployment and serving much faster. 
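As a sketch, calling a model hosted on the AI Layer looks like this with the Algorithmia Python client; the API key, data path, and algorithm path are all hypothetical placeholders.

```python
import Algorithmia

client = Algorithmia.client("YOUR_API_KEY")  # hypothetical API key

# Read input through a data connector (data:// URIs cover hosted collections
# and configured sources such as S3)
input_text = client.file("data://.my/collection/input.txt").getString()

# Pipe the input through a versioned, hosted model (hypothetical path)
algo = client.algo("your_org/your_model/1.0.0")
print(algo.pipe(input_text).result)
```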

As AI research progresses and becomes more accessible, the only thing that’s clear is that the cloud is a key component of the evolving AI landscape and will continue to be for the foreseeable future. 

Interested in learning more about the AI Layer? Get a demo to see if the AI Layer is the right solution for your organization.