Governance required for MLOps (Image Credit: Brannon Naito on Unsplash)

Being a smart business or technical leader is like playing chess. You have to think several moves ahead and understand the possibilities in order to win. As a leader responsible for AI initiatives, you are probably focused on deploying your AI projects. After all, deployment is the first step toward getting a model into production. But to succeed, you must ask yourself: what are you missing?

Putting software into production is not new. Introducing new code into production environments requires processes and procedures to ensure the performance, scalability, reliability, and security of those production systems. DevOps, for example, has already established continuous integration and continuous deployment (CI/CD) practices to ensure that code is production-ready and won’t disrupt production applications.

Tight access controls on production systems are required to ensure that only people who know how to update these systems have access. Companies also train staff on the correct testing and update procedures. Why should it be any different for your production machine learning and AI projects?

Deployment of AI can have far-reaching consequences

The deployment and modification of production machine learning and AI models can have a far-reaching impact on customers, employees, partners, and investors. Introducing hundreds or even thousands of machine learning models into your business requires clear practices for their management. This ensures consistency and minimizes risk.

Production model governance sets the rules and controls for machine learning models running in production. It must include access control, testing, validation, change logs, access logs, and traceability of model results. With production model governance in place, organizations can scale machine learning investments and provide legal and compliance reports on the use of machine learning models.

Access controls are a key element for production systems

Tight access control for system updates is critical for any production system. When data science teams deploy machine learning projects outside of IT and then use the predictions from those projects to support business decisions, they create a weak link in the chain, one that will eventually break.

Establish tight control over who has access to production machine learning and AI projects and environments: define clear roles and responsibilities in your directory service, and put the right people in those roles. Using role-based access control (RBAC) makes it easy to add and remove people from the production process. For example, when a data scientist leaves the company, removing them from the directory immediately revokes their access to production systems.
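To make this concrete, here is a minimal Python sketch of a role-based permission check. The roles, actions, and permission map are hypothetical placeholders for illustration; in practice they would come from your directory service (for example, Active Directory groups).

```python
from enum import Enum

# Hypothetical roles for a production ML environment; map these to directory groups.
class Role(Enum):
    DATA_SCIENTIST = "data_scientist"   # can build and register models
    MLOPS_ENGINEER = "mlops_engineer"   # can deploy and roll back models
    AUDITOR = "auditor"                 # read-only access to logs and reports

# Which actions each role may perform in production (illustrative only).
PERMISSIONS = {
    Role.DATA_SCIENTIST: {"register_model"},
    Role.MLOPS_ENGINEER: {"register_model", "deploy_model", "rollback_model"},
    Role.AUDITOR: {"view_logs"},
}

def is_allowed(user_roles, action: str) -> bool:
    """Return True if any of the user's roles grants the requested action."""
    return any(action in PERMISSIONS[role] for role in user_roles)

# A data scientist cannot deploy; an MLOps engineer can. Someone who has left
# the company simply loses their roles in the directory, so every check fails.
print(is_allowed({Role.DATA_SCIENTIST}, "deploy_model"))  # False
print(is_allowed({Role.MLOPS_ENGINEER}, "deploy_model"))  # True
```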

Logging is a critical element

Keeping track of changes in the production environment is critical for a variety of reasons. When something goes wrong — and it will — production teams will need to be able to troubleshoot. When a legal issue comes up, logs make it easy to show which model produced a given result and explain what data created that model.

If you operate in a regulated industry, such as financial services, healthcare, or pharma, there is also the burden of regulatory compliance. Companies have to provide validation and compliance reports to regulators to show they followed the rules. All of this requires tracking who accessed the system, who made changes, and what they changed, as well as preserving the models and the data sets that produced them.

Any complex system needs more than just a user audit trail to make sense of what is going on. Automation means that software agents in the MLOps system also make scheduled or event-triggered changes. For troubleshooting and reporting, those events must be captured alongside human interactions.

In addition, think about how humans and systems provide additional information about why they took action. For example, “I observed that the model data in production had drifted significantly from the training data and decided to train and deploy a new model version to replace the current version. This new model version is performing well in testing and warm-up, and I have promoted the new version to challenger testing status.”

Think of how helpful this note would be to someone who is trying to understand what happened at this point and why the user introduced this new model version.
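As an illustration, here is one way such an audit record could be captured in Python. The schema, the model identifier, and the append-only JSON Lines store are assumptions for the sketch, not a prescribed format; the point is that automated agents and humans write to the same log, including a free-text reason for the action.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class AuditEvent:
    """One entry in a production model governance audit log (illustrative schema)."""
    timestamp: str    # when the action happened (UTC)
    actor: str        # human user or automated agent that acted
    actor_type: str   # "human" or "system"
    action: str       # e.g. "deploy_model", "promote_challenger"
    model_id: str     # which model or model version was affected
    reason: str       # free-text rationale, like the note quoted above

def log_event(event: AuditEvent, path: str = "audit_log.jsonl") -> None:
    # Append-only JSON Lines file; a real system would use a tamper-evident store.
    with open(path, "a") as f:
        f.write(json.dumps(asdict(event)) + "\n")

# Example: a drift-monitoring agent (not a person) promotes a new version.
log_event(AuditEvent(
    timestamp=datetime.now(timezone.utc).isoformat(),
    actor="drift-monitor",
    actor_type="system",
    action="promote_challenger",
    model_id="churn-model:v7",
    reason="Production data drifted from training data; new version passed warm-up.",
))
```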

Rigorous model and code testing is required to keep models trustworthy

Production models are going to be updated continuously. What happens if bad code is introduced into your production application? The answer is, bad stuff happens. Machine learning applications may still respond quickly and in the right format, but the results will not be accurate. These errors can destroy the trust you have built with your business partners.

Ensuring the quality of production models requires rigorous model and code testing and validation processes. When production code is updated, controls need to be in place. Every time a new model or model version is promoted, it needs to be warmed up in the production environment so you can see its outputs before it replaces the current version. Failure to test new code, including models, can have catastrophic results.
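As a rough illustration of the warm-up idea, the Python sketch below scores a recent production batch with both the current champion and the new challenger, and only clears the challenger for promotion if the two largely agree. The disagreement threshold and the toy models are assumptions; real promotion criteria would also weigh accuracy against ground truth, latency, and business metrics.

```python
import numpy as np

def warm_up_challenger(champion_predict, challenger_predict, production_batch,
                       max_disagreement: float = 0.05) -> bool:
    """Score a recent production batch with both models and compare their outputs.

    The challenger becomes eligible for promotion only if it disagrees with the
    current champion on less than `max_disagreement` of the records.
    """
    champion_out = champion_predict(production_batch)
    challenger_out = challenger_predict(production_batch)
    disagreement = np.mean(champion_out != challenger_out)
    return disagreement < max_disagreement

# Toy example: the champion classifies with a 0.5 threshold,
# the challenger with a slightly different one.
batch = np.random.rand(1000)
champion = lambda x: (x > 0.5).astype(int)
challenger = lambda x: (x > 0.52).astype(int)
print("Promote challenger?", warm_up_challenger(champion, challenger, batch))
```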

Production model governance is required to beat competitors

Thinking several moves ahead in your AI chess match, you now realize that in order to beat your competitors, you need to think about risk. Scaling AI across the organization while reducing the risk of human error or a bad code update requires production model governance. Ensuring compliance with legal and regulatory requirements also means taking a governance-first approach to machine learning and AI.

Having strong governance in place frees you to make the moves that will make your business AI-driven and realize the value of AI by integrating machine learning and AI applications across your business.


