Arize AI is an ML observability platform that provides real-time monitoring and explainability to help you understand how your models are performing in production. The platform is built around an evaluation store: you can upload offline training and validation baselines alongside online production data in order to connect drift to changes in model performance, conduct root cause analyses of model failures and performance degradation, and analyze model bias, among other capabilities.
In this guide, we’ll show you how to integrate Arize with Algorithmia so that you can bring Arize’s real-time monitoring capabilities to your algorithms. To make use of this integration, you’ll need to have an Arize account configured.
The following code is intended to be executed in a Jupyter notebook or on a training platform external to Algorithmia.
Training and saving your model
To demonstrate the end-to-end workflow, you’ll first walk through training a simple scikit-learn model, and then you’ll see how to deploy that model on Algorithmia and send metrics to Arize from within your algorithm.
As with any Algorithmia algorithm, you can use the platform and tools of your choice for training your model. The code below represents one possible training workflow in a Jupyter notebook; navigate to the notebook to work with this code and to see example output.
In your training environment, you’ll first need to install the following third-party libraries using pip or the tool of your choice:
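For example, for the workflow shown in this guide, you might install the following packages (this package list is an assumption based on the steps below; adjust it to match your own training code):

```shell
pip install algorithmia arize joblib pandas scikit-learn shap
```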
Train your model, generate some predictions, and then serialize the trained model:
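As a sketch of this step (the dataset, model type, and the `model.joblib` filename here are illustrative assumptions, not requirements):

```python
import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load a sample dataset and split it into train/validation sets.
data = load_breast_cancer()
X_train, X_val, y_train, y_val = train_test_split(
    data.data, data.target, random_state=42
)

# Train a simple classifier.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Generate predictions to verify the model works end to end.
preds = model.predict(X_val)

# Serialize the trained model so it can be uploaded to Algorithmia.
joblib.dump(model, "model.joblib")
```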
Uploading your trained model to Algorithmia
Use the following code to upload your model to a hosted data collection on Algorithmia, without ever leaving your training environment. Note that if you’re running Algorithmia Enterprise, you’ll need to specify the API endpoint CLUSTER_DOMAIN when you create the Algorithmia client object; if not, delete the references to this variable.
You’ll need to replace the COLLECTION_OWNER string with the name of the user or org account that owns the data collection. You’ll upload your model to that data collection, and in your algorithm source code you’ll replace the COLLECTION_NAME string with the name of that data collection. The Algorithmia API key you’re using must have write access to this data collection. See our Hosted Data docs for information about how to use hosted data collections.

Finally, this code assumes that you’ve set the ALGORITHMIA_API_KEY environment variable to the value of your Algorithmia API key:
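A minimal sketch of the upload step, assuming the model was saved locally as model.joblib and using placeholder values for the cluster domain, collection owner, and collection name:

```python
import os

import Algorithmia

# Placeholder values; replace with your own (and delete the CLUSTER_DOMAIN
# references if you aren't running Algorithmia Enterprise).
CLUSTER_DOMAIN = "CLUSTER_DOMAIN"
COLLECTION_OWNER = "COLLECTION_OWNER"
COLLECTION_NAME = "COLLECTION_NAME"

# The client reads your API key from the ALGORITHMIA_API_KEY environment variable.
client = Algorithmia.client(
    os.environ["ALGORITHMIA_API_KEY"], "https://" + CLUSTER_DOMAIN
)

# Create the hosted data collection if it doesn't already exist,
# then upload the serialized model file into it.
collection = client.dir("data://" + COLLECTION_OWNER + "/" + COLLECTION_NAME)
if not collection.exists():
    collection.create()
collection.file("model.joblib").putFile("model.joblib")
```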
Generating explainability values using SHAP
SHAP (SHapley Additive exPlanations) is a game-theoretic approach to explaining the output of any ML model. For in-depth details on how to use the shap library, visit SHAP Core Explainers. The code below creates a visual to verify that SHAP values are being properly generated for explainability (to generate the plot, you’ll also need to install the matplotlib library):
The following represents the algorithm code that you would deploy on Algorithmia, not the training platform used above.
Setting up your Algorithmia environment
To begin, on Algorithmia you’ll need to create an algorithm using an environment with Python 3.6 or later. In your algorithm’s requirements.txt file, add the arize Python library to enable the monitoring capabilities provided by Arize, as well as the shap library and the standard ML dependencies:
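For example, a requirements.txt along these lines (this exact list is an assumption; include whatever your model actually depends on, ideally pinned to the versions you trained with):

```
algorithmia
arize
joblib
numpy
pandas
scikit-learn
shap
```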
Deploying your model on Algorithmia
Set the ARIZE_API_KEY and ARIZE_ORG_KEY environment variables with your Arize keys; these secrets are accessible through the Arize Settings page.
Remember from above that you must also set the ALGORITHMIA_API_KEY environment variable with the value of your Algorithmia API key if you’re running the algorithm from outside of the Algorithmia Web IDE; this API key only needs to have read access. You must also replace the COLLECTION_OWNER and COLLECTION_NAME strings with the account name and collection name where the model is stored.
The algorithm establishes a connection with Arize using the Arize Client, and then uses methods such as log_bulk_shap_values() to send Arize the predictions and SHAP values for monitoring:
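A sketch of what such an algorithm might look like. The model ID, input format, and the exact Client and log_bulk_* signatures here are assumptions based on older versions of the arize SDK; check the Arize API reference for the version pinned in your requirements.txt:

```python
import os
import uuid

import Algorithmia
import joblib
import pandas as pd
import shap
from arize.api import Client

# Placeholder values; replace with your own (and delete the CLUSTER_DOMAIN
# references if you aren't running Algorithmia Enterprise).
CLUSTER_DOMAIN = "CLUSTER_DOMAIN"
COLLECTION_OWNER = "COLLECTION_OWNER"
COLLECTION_NAME = "COLLECTION_NAME"

# Download and load the model once, when the algorithm container starts.
client = Algorithmia.client(api_address="https://" + CLUSTER_DOMAIN)
model_file = client.file(
    "data://" + COLLECTION_OWNER + "/" + COLLECTION_NAME + "/model.joblib"
).getFile().name
model = joblib.load(model_file)
explainer = shap.TreeExplainer(model)

# Connect to Arize using the keys stored as environment variables.
arize_client = Client(
    organization_key=os.environ["ARIZE_ORG_KEY"],
    api_key=os.environ["ARIZE_API_KEY"],
)

def apply(input):
    # Expect a list of feature rows; adapt this to your model's input schema.
    features = pd.DataFrame(input)
    predictions = pd.Series(model.predict(features))
    shap_values = explainer.shap_values(features)

    # One unique ID per prediction lets Arize join predictions to SHAP values.
    ids = pd.Series([str(uuid.uuid4()) for _ in range(len(predictions))])

    # Send predictions and SHAP values to Arize for monitoring.
    arize_client.log_bulk_predictions(
        model_id="algorithmia-demo-model",
        model_version="1.0",
        prediction_ids=ids,
        prediction_labels=predictions,
        features=features,
    )
    arize_client.log_bulk_shap_values(
        model_id="algorithmia-demo-model",
        prediction_ids=ids,
        # Class-1 SHAP values; older shap returns one array per class.
        shap_values=pd.DataFrame(shap_values[1]),
    )
    return predictions.tolist()
```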
The following code is intended to be executed back in the same external environment (Jupyter notebook or external training platform) that you used above to train your model, once you’ve built the algorithm on Algorithmia.
Once you’ve built your algorithm, you can call it using its hash version to test it out; this will be a value like f35025657bdc37eb0d6ffeed62b0539ee21c8b4e. If you build your algorithm in the Algorithmia Web IDE, this hash is displayed in the test console output upon successful build completion, but it’s also available in the “Builds” tab on the algorithm’s homepage. You can also publish the algorithm, in which case you’ll be able to call it using a semantic version such as 1.0.0.
In the code below, substitute the appropriate strings for ALGO_OWNER (the user or org account under which the algorithm was created), ALGO_NAME (the name of the algorithm), and ALGO_VERSION (the hash version or semantic version described above). As in the code above where you originally uploaded your model to Algorithmia, the CLUSTER_DOMAIN variable should be deleted if you aren’t using an Enterprise cluster. The optional timeout parameter can be used to specify the timeout for the call, in seconds.
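Putting that together, a call from your notebook might look like the following sketch (the sample input here is a placeholder; shape it to match your algorithm’s expected input schema):

```python
import os

import Algorithmia

# Placeholder values; replace with your own (and delete the CLUSTER_DOMAIN
# references if you aren't using an Enterprise cluster).
CLUSTER_DOMAIN = "CLUSTER_DOMAIN"
ALGO_OWNER = "ALGO_OWNER"      # user or org account that owns the algorithm
ALGO_NAME = "ALGO_NAME"        # name of the algorithm
ALGO_VERSION = "ALGO_VERSION"  # hash version or published semantic version

client = Algorithmia.client(
    os.environ["ALGORITHMIA_API_KEY"], "https://" + CLUSTER_DOMAIN
)

algo = client.algo(ALGO_OWNER + "/" + ALGO_NAME + "/" + ALGO_VERSION)
algo.set_options(timeout=60)  # optional timeout for the call, in seconds

# A single feature row as sample input; each call also logs data to Arize.
sample_input = [[0.0] * 30]
result = algo.pipe(sample_input).result
print(result)
```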
Once you’ve incorporated these Arize logging methods and published your algorithm, every execution of your algorithm will send data to Arize.
In addition to this integration with Arize, we integrate with other platforms, including training platforms and other monitoring and observability platforms; see our Integrations page for information.
If you’re using Algorithmia Enterprise, you have access to an admin panel where you can view usage metrics at the cluster, user account, and algorithm level. See the Platform Usage Reporting page for more information. You can also opt in to our Insights feature in your algorithms, which enables you to publish your inference data to a Kafka topic which you can then subscribe to from external observability platforms. See Algorithmia Insights for more information.
If you’re new to Algorithmia and would like to learn more about our product and model monitoring capabilities, please contact our sales team. We’d love to hear from you!