There are security risks in software development of any kind, including machine learning. These risks must be taken into account during every phase of the machine learning life cycle in order to create a secure machine learning system.
What is machine learning security?
Machine learning security is software security for machine learning systems. Like other types of software, machine learning software is at risk of security breaches and cyber attacks. Although machine learning research has been around for decades, its security risks were until recently among the least understood. In recent years, security researchers have worked to catalog the attacks an ML system could fall victim to, so that engineers know what risks to plan for and cover in their machine learning security plan.
Why is security important in machine learning?
Security is important in machine learning because ML systems often contain confidential information or provide a competitive advantage that the organization would not want competitors to access. Some companies also use machine learning to detect security breaches in other systems; the security of such a model is itself critical, because a compromised detector cannot be trusted to protect anything else.
Machine learning security risks
To understand the importance of machine learning security and to know what precautions should be taken to avoid security issues, here are a few of the possible machine learning security risks.
Extracting confidential data
Protecting confidential data is difficult even when it is not part of a machine learning system. ML adds further challenges, because sensitive data becomes embedded in the model itself through training. Effective but subtle attacks exist that can extract that data from a trained model. To protect your system from this type of attack, security protocols must be built into the model from the earliest stages of the ML lifecycle.
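One well-known extraction technique is membership inference: the attacker queries the model and uses its confidence scores to guess whether a specific record was in the training set. The sketch below is a deliberately toy illustration of the principle; the "model", its confidence values, and the threshold are all illustrative assumptions, not a real attack implementation.

```python
# Toy sketch of a membership-inference attack: an overfit model tends to
# be more confident on records it was trained on, and an attacker can
# exploit that confidence gap. All names and values here are illustrative.

def train_toy_model(training_data):
    """An overfit 'model' that is extra confident on exact training points."""
    memorized = set(training_data)
    def predict_confidence(record):
        # Overfit models typically report higher confidence on seen data.
        return 0.99 if record in memorized else 0.60
    return predict_confidence

def membership_inference(model, record, threshold=0.9):
    """Attacker side: infer training-set membership from confidence alone."""
    return model(record) > threshold

model = train_toy_model([("alice", 42), ("bob", 17)])
assert membership_inference(model, ("alice", 42)) is True   # record leaked
assert membership_inference(model, ("carol", 99)) is False
```

Real attacks are far more sophisticated, but the defense principle is the same: limit how much a model memorizes (for example through regularization or differentially private training) and restrict how its raw confidence scores are exposed.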
Online system manipulation
When a machine learning system continues learning and modifying its behavior while in operational use, it is said to be "online." Experienced attackers can subtly push an online system in the wrong direction by feeding it inputs that retrain it to give the wrong outputs. This type of attack is relatively easy to carry out and subtle enough to go unnoticed. Because it is so hard to detect, ML engineers must consider algorithm choice, data provenance, and ML operations in order to properly protect the system.
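The drift described above can be shown with a deliberately simple online model. Assume a toy spam filter that keeps two running means (typical spam score, typical ham score) and classifies by the midpoint; an attacker who repeatedly mislabels spammy messages as legitimate drags the boundary until spam slips through. The class and numbers are illustrative assumptions.

```python
# Toy sketch of poisoning an online model: repeated mislabeled feedback
# shifts the learned decision boundary. Scores and update rule are
# illustrative, not a real spam-filter design.

class OnlineSpamFilter:
    def __init__(self):
        # Running means of scores seen for each label, seeded with priors.
        self.spam_mean, self.ham_mean = 0.9, 0.1
        self.n_spam, self.n_ham = 1, 1

    def update(self, score, is_spam):
        # Incremental running-mean update for the reported label.
        if is_spam:
            self.n_spam += 1
            self.spam_mean += (score - self.spam_mean) / self.n_spam
        else:
            self.n_ham += 1
            self.ham_mean += (score - self.ham_mean) / self.n_ham

    def is_spam(self, score):
        return score > (self.spam_mean + self.ham_mean) / 2

f = OnlineSpamFilter()
assert f.is_spam(0.8) is True            # initially caught as spam
for _ in range(200):
    f.update(0.8, is_spam=False)         # attacker: mislabel spam as ham
assert f.is_spam(0.8) is False           # boundary drifted; spam slips through
```

Mitigations follow directly from the failure mode: rate-limit and audit the feedback that reaches retraining, track data provenance, and monitor the model's decision boundary for unexplained drift.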
Adversarial examples
One of the most widely discussed attacks on machine learning systems is the adversarial example. The idea is to fool the model with maliciously perturbed input: small, deliberate nudges that cause it to make false predictions or classifications. Adversarial examples are a very real threat and need to be planned for in the ML security plan.
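The mechanics are easiest to see on a linear classifier. In the style of the fast gradient sign method, each feature is nudged a tiny step in the direction that most reduces the model's score, flipping the prediction while barely changing the input. The weights, input, and step size below are illustrative assumptions chosen for a hand-checkable sketch.

```python
# Toy FGSM-style adversarial example on a hand-set linear classifier:
# a small, targeted perturbation flips the prediction. All numbers are
# illustrative.

def predict(w, b, x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else 0

def sign(v):
    return (v > 0) - (v < 0)

def adversarial_nudge(w, x, eps):
    # Move each feature eps in the direction that lowers the score,
    # i.e. against the sign of the corresponding weight.
    return [xi - eps * sign(wi) for wi, xi in zip(w, x)]

w, b = [2.0, -1.0, 0.5], -0.5
x = [0.6, 0.2, 0.4]
assert predict(w, b, x) == 1                               # original class

x_adv = adversarial_nudge(w, x, eps=0.2)
assert max(abs(a - o) for a, o in zip(x_adv, x)) <= 0.2    # tiny change
assert predict(w, b, x_adv) == 0                           # prediction flipped
```

Against deep networks the same idea uses the gradient of the loss with respect to the input, and defenses such as adversarial training and input validation exist precisely because these perturbations can be imperceptible to humans.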
Transfer learning attack
A transfer learning attack is a risk when an ML system is built by fine-tuning a pretrained model that is widely available. An attacker could use the public model as a cover for malicious ML behavior. If you build on a pretrained model, its documentation should describe in detail exactly what the model does and what controls the creator has put in place to mitigate the risks it carries.
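One basic control for this risk is to verify that downloaded pretrained weights exactly match a checksum the model's creator published through a trusted channel, before the weights are ever loaded. The helper names and byte content below are illustrative; only the `hashlib` API is standard.

```python
# Verify pretrained weights against a published SHA-256 checksum before
# loading them, so a tampered or swapped weight file is rejected.
import hashlib

def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def verify_weights(weight_bytes: bytes, expected_sha256: str) -> bool:
    actual = sha256_of(weight_bytes)
    if actual != expected_sha256:
        # Refuse to load: the file is not the one the creator published.
        raise ValueError(f"weight file hash mismatch: {actual}")
    return True

weights = b"...pretrained weight bytes..."
expected = sha256_of(weights)   # in practice, published out of band by the creator
assert verify_weights(weights, expected) is True
```

Checksums do not prove the original model is benign, of course; they only guarantee you are running the artifact the creator actually published, which is why the creator's own documentation of the model's behavior still matters.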
Data poisoning
Since data plays such a huge role in machine learning, an attacker who can purposely manipulate the data used by an ML system can compromise the entire system. ML engineers should consider what training data an attacker could potentially control, and to what extent, in order to give special attention to preventing data poisoning.
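A minimal defensive step is to sanitize training data before it reaches the model, for example by dropping values that sit implausibly far from the rest of the sample. The crude standard-deviation filter below is an illustrative sketch, not a complete defense; real pipelines combine statistical checks with provenance tracking.

```python
# Crude data-sanitization sketch: drop training values far from the
# sample mean, a basic guard against gross poisoning. The threshold k
# and the data are illustrative.
from statistics import mean, stdev

def filter_outliers(values, k=2.0):
    m, s = mean(values), stdev(values)
    # Keep only values within k standard deviations of the mean.
    return [v for v in values if abs(v - m) <= k * s]

clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]
poisoned = clean + [500.0]               # an injected extreme value
assert filter_outliers(poisoned) == clean
```

Filters like this only catch blatant injections; subtle poisoning that stays inside the normal data distribution is exactly why provenance and access control over training data matter as much as statistical checks.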
Securing your machine learning systems
In order to secure a machine learning system, security must be built in from the very beginning of the machine learning lifecycle. From the start of development through production, and for the entire time the model is in use, the system must be actively secured.
Algorithmia provides organizations with a machine learning framework that secures models and data through every stage of the ML lifecycle. A flexible ML platform connects all necessary data sources in one secure, central location for repeatable, reusable, and collaborative model management. Access controls and governance features let you secure and audit your ML models in production, reduce data vulnerabilities, and govern your machine learning operations across a healthy ML lifecycle.
Watch a demo of Algorithmia to see how it secures the entire machine learning lifecycle.