Use new Kafka event-driven algorithm workflows to automate models in production and maximize their impact
Event-driven jobs are an essential component of a fully automated machine learning pipeline, and Algorithmia has now added the popular open-source message broker Apache Kafka to our Algorithmia Event Flows offerings, which already support Azure Service Bus and Amazon Simple Queue Service (SQS). With the addition of Kafka event-driven algorithm workflows, it’s easier than ever to build flexible, automated machine learning workflows and maximize their business impact.
Mature machine learning workflows don’t consist of manually running models and then delivering data insights in a stakeholder presentation. Rather, mature data and machine learning pipelines are event driven—different steps of the process, including model inference, are triggered as data flows through the system. After a model scores or classifies new data, stakeholders such as quality control teams or the data scientists themselves need inference metrics stored for analysis, retraining, and audits, and made available for downstream models to consume. Complicated event-driven pipelines benefit greatly from a streamlined, simple user interface (UI) where architects can configure event logic. The streamlined workflow enabled by the Algorithmia Event Flows UI helps you manage your machine learning pipeline in an automated way, allowing you to rapidly retrain and redeploy models to maximize their performance and uptime.
Event-driven workflows also require stronger security than other pipelines, because they often connect multiple third-party systems to your training and deployment pipelines. Role-based access for connecting your deployment infrastructure to externally hosted message brokers allows for quicker audits, easier debugging, and improved security.
Message brokers such as Apache Kafka excel at event-driven workflows, and using Algorithmia’s intuitive UI, you can now easily add publish and subscribe events to your algorithm workflows. Combine that with the assurance that only individuals with the proper authority can connect to your externally hosted Kafka cluster, and you can be confident that your organization is quickly and securely moving toward automated, flexible machine learning systems that fit all of your business use cases and maximize their impact.
Maximize model performance
Not only have we made event-driven machine learning workflows easier to implement and more secure, we’ve also improved observability by enabling easy monitoring of your Kafka integrations. Organization members can examine event processing logs while cluster administrators can view and troubleshoot connection issues in the admin panel. These monitoring and observability capabilities help you maximize the performance of your models in production by helping you quickly detect issues with degrading performance.
This type of workflow also allows you to develop rich data-driven applications that rely on new data being processed as it becomes available, and that kick off model retraining jobs based on monitoring and alerting events. You can quickly spot any changes in the performance of your models, rapidly retrain them, and deploy them back to production with minimal downtime.
These features expand on the capabilities of Algorithmia Insights, our flexible integration solution for model performance monitoring. You can use Algorithmia Insights to export operational and inference-related metrics from your algorithms and machine learning models to the Kafka topic of your choice, and then run those metrics through a monitoring or alerting tool such as Datadog or InfluxDB to detect model drift or anomalies in the performance of your algorithms in production. Stay tuned for posts on how to use Kafka event-driven algorithm workflows with Algorithmia Insights.
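As a rough illustration of what a downstream consumer of such metrics might do, the sketch below parses a JSON metrics payload read from a Kafka topic and flags slow executions. The field names ("algorithm_name", "duration_milliseconds") and the latency threshold are illustrative assumptions, not the exact Insights schema.

```python
import json

def check_latency(message_value: str, threshold_ms: float = 500.0) -> dict:
    """Flag an inference whose reported latency exceeds the threshold.

    message_value is the raw JSON string consumed from the Kafka topic.
    The payload field names used here are assumptions for illustration.
    """
    metrics = json.loads(message_value)
    duration = float(metrics.get("duration_milliseconds", 0.0))
    return {
        "algorithm": metrics.get("algorithm_name", "unknown"),
        "duration_ms": duration,
        "alert": duration > threshold_ms,
    }

# Example payload, as it might arrive on the metrics topic
sample = json.dumps(
    {"algorithm_name": "fraud_classifier", "duration_milliseconds": 742.3}
)
print(check_latency(sample))
```

A real deployment would run logic like this inside a Kafka consumer loop or hand the metrics directly to a tool such as Datadog or InfluxDB, which handle thresholds and alert routing for you.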
How to enable Algorithmia Event Flows using Kafka for your event-driven inference pipelines
It’s easy to get started creating Kafka event-driven machine learning pipelines on Algorithmia.
Once your cluster administrator connects your Algorithmia cluster to a Kafka broker that’s owned and managed by your company, they can choose from the organizations on your cluster and allow any selected organization-owned algorithm to subscribe or publish to a Kafka topic on that broker. Multiple organization-owned algorithms can write data to or read data from the same Kafka topic, but to ensure a stable pipeline, the same algorithm cannot both subscribe and publish to the same topic.
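The rule above can be sketched as a small validation check. This is purely illustrative (not Algorithmia’s actual implementation): many algorithms may share one topic, but no single algorithm may both subscribe and publish to the same topic.

```python
def validate_event_flows(flows):
    """Check a list of (algorithm, topic, action) tuples, where action is
    "subscribe" or "publish". Raises ValueError if any single algorithm
    is configured to both subscribe and publish to the same topic."""
    seen = {}
    for algorithm, topic, action in flows:
        actions = seen.setdefault((algorithm, topic), set())
        actions.add(action)
        if {"subscribe", "publish"} <= actions:
            raise ValueError(
                f"{algorithm} cannot both subscribe and publish to {topic}"
            )
    return True

# Valid: two different algorithms sharing one topic
validate_event_flows([
    ("data_processor", "NewDataTopic", "subscribe"),
    ("data_ingestor", "NewDataTopic", "publish"),
])
```

Passing the same algorithm name with both actions on one topic would raise the ValueError, mirroring the configuration the platform disallows.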
To enable you as a data engineer, data scientist, or application developer to seamlessly create Event Flows with Kafka, we’ve made it easy to review the available topics on the associated brokers and to view the publish and subscribe capabilities your cluster administrator has enabled for a given algorithm. Then, all you have to do is specify your algorithm version and whether you want the algorithm to subscribe or publish to a specific topic.
For example, if your cluster administrator enables your data processing algorithm to subscribe to a Kafka topic called “NewDataTopic”, all you need to do is specify the algorithm version and enable that algorithm event. Then, every time a new message is published to “NewDataTopic”, that event will kick off an algorithm execution, using the new message as the algorithm’s payload. If your algorithm also has permission to publish the output of a successful execution to a different topic, “ProcessedDataTopic”, you can similarly add your algorithm version and enable that event.
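The flow above can be sketched with a hypothetical algorithm entry point: Event Flows would invoke it with each new “NewDataTopic” message as the payload, and, when publishing is enabled, forward its return value to “ProcessedDataTopic”. The payload fields and the processing step here are illustrative assumptions, not a prescribed schema.

```python
import json

def apply(payload):
    """Algorithmia-style entry point: payload is the Kafka message value,
    delivered either as a JSON string or an already-parsed dict."""
    record = json.loads(payload) if isinstance(payload, str) else payload
    # Simple illustrative processing step: normalize a numeric feature
    value = float(record.get("value", 0.0))
    record["value_normalized"] = value / 100.0
    # Return value would be published to "ProcessedDataTopic" when enabled
    return record

# Simulating one message arriving on "NewDataTopic"
print(apply('{"id": 1, "value": 42.0}'))
```

In production you never call this yourself; the subscription event triggers each execution as messages arrive, and the publish event handles delivering the result downstream.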
Get started with Kafka event-driven algorithm workflows
Kafka event-driven algorithm workflows are available now for all Algorithmia Enterprise customers. As the enterprise machine learning operations (MLOps) platform, Algorithmia manages all stages of the production machine learning lifecycle within existing operational processes so you can put models into production quickly, securely, and cost-effectively.