This article discusses Machine Learning Model Operationalization Management (MLOps) tools for developing and deploying a food recipe application. It lists tools that are framework-, platform-, and infrastructure-agnostic and that support development in Python. To find the ideal tool for each MLOps task, we apply a decision-making and analysis process. The goal is to design an end-to-end food recipe application using MLOps. By the end of this article, we will be able to:

The end goal is to build an end-to-end pipeline that runs from gathering the data through training, evaluation, and inference. The final step is deploying the model.
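The stages named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the application's actual pipeline: the toy recipe records, the cuisine labels, and the function names `gather_data`, `train`, `predict`, and `evaluate` are all hypothetical, and the "model" is a simple ingredient counter rather than a real learner. Deployment is left out of the sketch.

```python
from collections import Counter

# Hypothetical toy data: (ingredients, cuisine) pairs standing in for real recipes.
RECIPES = [
    ({"tomato", "basil", "pasta"}, "italian"),
    ({"tomato", "mozzarella", "basil"}, "italian"),
    ({"rice", "soy sauce", "ginger"}, "asian"),
    ({"noodles", "soy sauce", "garlic"}, "asian"),
    ({"pasta", "basil", "olive oil"}, "italian"),
    ({"rice", "ginger", "garlic"}, "asian"),
]


def gather_data():
    """Data stage: split the toy records into training and evaluation sets."""
    return RECIPES[:4], RECIPES[4:]


def train(records):
    """Training stage: count how often each ingredient appears per cuisine."""
    model = {}
    for ingredients, cuisine in records:
        model.setdefault(cuisine, Counter()).update(ingredients)
    return model


def predict(model, ingredients):
    """Inference stage: pick the cuisine whose counts best match the ingredients."""
    scores = {
        cuisine: sum(counts[item] for item in ingredients)
        for cuisine, counts in model.items()
    }
    return max(scores, key=scores.get)


def evaluate(model, records):
    """Evaluation stage: fraction of held-out recipes classified correctly."""
    hits = sum(predict(model, ingredients) == cuisine for ingredients, cuisine in records)
    return hits / len(records)


train_set, eval_set = gather_data()
model = train(train_set)
accuracy = evaluate(model, eval_set)
print(f"held-out accuracy: {accuracy:.2f}")
```

Each stage is a separate function so that it can later be swapped for a real implementation, for example training with an ML library, without changing the overall flow.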

Business Use-Case

The Business Model Canvas (BMC) provides an overview of the project. It is a strategic management tool for defining and visualizing the project, including the key business activities and their relationship to our value proposition.

Business Model Canvas for Food Recipe Application

Benefits of MLOps

MLOps combines Machine Learning (ML), DevOps, and Data Engineering into one discipline. It aims to standardize and streamline the continuous delivery of high-performing models by deploying and maintaining machine learning systems in production. It helps to:

End-to-End MLOps Workflow

Before beginning any project, we usually evaluate the scope, data needs, modeling approach, and deployment strategy. We then create a framework for developing models, with a basic model as the starting point. This baseline also helps to address the concept of drift and provides a prototype of how to develop, deploy, and continually improve models. The following diagram and text detail an ML pipeline that uses continuous integration and continuous delivery. We are going to use this approach to develop and deploy our food recipe application.

The pipeline consists of the following stages:

MLOps Tools in One Picture

A wide array of tools is available for managing the MLOps process. The current landscape falls into three categories: data, training and evaluation, and deployment.

Considerations for Choosing MLOps Tools

Because so many tools are available, it is impractical to evaluate each one without considering the task at hand. Each tool also takes a different approach. When choosing a tool, take only the relevant tasks into account. A critical criterion when we selected the tools for the food recipe application was that they should be multi-platform, cross-framework, and infrastructure-agnostic, and that they should work with the Python programming language.

One of the best ways to select MLOps tools is to base the choice on the tasks that need to be accomplished. The broad categories are as follows:

| Task | Criteria of Selection | Tool |
| --- | --- | --- |
| Data and pipeline versioning | 1. Enables a greater focus on data quality. 2. Maintain data organization and versioning. 3. Keep all data organized and accessible. | DVC is designed to make machine learning models shareable and reproducible and works with large files, data sets, machine learning models, and metrics as well as code. |
| Data analysis and model development | 1. ML experimentation platform for group collaboration. 2. Access data sources securely. Smooth transition from ML training to deployment. 3. Validate the model’s integrity whenever it is retrained. Debug machine learning models and make informed decisions about their improvement. | Jupyter Notebooks: we can create and share documents with live code, equations, graphs, and narrative text. |
| Experiment tracking and organization | 1. Managing the whole machine learning lifecycle, including experimentation, reproducibility, deployment, and a central model registry. 2. Ability to go back in time to examine everything relating to a specific model. Manage and deploy models from various machine learning libraries to a range of model serving and inference platforms. 3. A breakdown of deployed and decommissioned ML models, their history, and the stage of deployment of each model should be available. 4. Training related to a given model should only be accessible to certain people in the model registry. | MLflow helps modelers experiment, reproduce, deploy, and keep track of models. |
| Hyperparameter tuning | 1. Build and run distributed applications using simple parameters. 2. Enable end users to parallelize single-machine code with little or no code changes. 3. Enable complex applications through a large ecosystem of applications, libraries, and tools. | Optuna can be used to automate hyperparameter search using open source components. Several hyperparameter search algorithms and frameworks are integrated with Ray Tune, including Optuna. |
| Model deployment and serving | 1. Deploy with the least amount of engineering effort. 2. Models are trained on historical data, but they predict using current data. 3. Models remain constant until they are re-trained and deployed in the production system. | FastAPI is a modern web framework designed for API development that is quick and high-performing. |
| Continuous integration and continuous delivery pipelines | 1. Delivering a new version of a model is performed by following the CI/CD pipeline. 2. Merge changes back into the main branch as often as possible. 3. Integrate changes early instead of waiting for release day. 4. Tests are run against the build to verify changes. 5. Deploys changes to a testing and/or production environment after the build stage. | CircleCI speeds up builds, eliminates feedback loops, and simplifies pipeline maintenance. |
Table of MLOps Tools
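
To make the hyperparameter tuning task in the table more concrete, here is a minimal sketch of what a hyperparameter search does. It uses plain random search with only the Python standard library; it is not Optuna's or Ray Tune's API, and those tools add smarter sampling strategies and parallel execution on top of this basic loop. The objective `validation_loss`, the parameter names, and the search ranges are hypothetical stand-ins for training and scoring a real model.

```python
import random


def validation_loss(learning_rate, regularization):
    """Hypothetical stand-in for training a model and measuring its validation loss."""
    return (learning_rate - 0.1) ** 2 + (regularization - 0.01) ** 2


def random_search(objective, n_trials, seed=0):
    """Sample hyperparameters uniformly at random and keep the best trial."""
    rng = random.Random(seed)  # fixed seed so the search is reproducible
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        # Assumed search ranges, chosen only for illustration.
        params = {
            "learning_rate": rng.uniform(0.0, 1.0),
            "regularization": rng.uniform(0.0, 0.1),
        }
        loss = objective(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


best_params, best_loss = random_search(validation_loss, n_trials=200)
print(best_params, best_loss)
```

The search only needs a callable that maps hyperparameters to a score, so the same structure carries over when the stand-in objective is replaced by a real training run.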

Final Words

We now have a shortlist of MLOps tools that are framework-, platform-, and infrastructure-agnostic. We will put these tools into action with Python in a series of posts, starting with how to save a trained machine learning model.
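
As a small preview of saving a trained model, the sketch below serializes an object with Python's built-in `pickle` module and loads it back. It is one possible approach under stated assumptions, not necessarily the method the later posts use: the `model` dictionary is a hypothetical placeholder for a real trained model, and the file name `model.pkl` is arbitrary.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical "trained model": a plain dictionary of learned parameters.
model = {"weights": [0.4, -1.2, 0.7], "bias": 0.1}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.pkl"

    # Serialize the model to disk.
    with model_path.open("wb") as f:
        pickle.dump(model, f)

    # Load it back, as an inference service would at start-up.
    with model_path.open("rb") as f:
        restored = pickle.load(f)

print(restored == model)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.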

