
The Best MLOps Tools You Need to Know as a Data Scientist

In one of our articles—The Best Tools, Libraries, Frameworks and Methodologies that Machine Learning Teams Actually Use – Things We Learned from 41 ML Startups—Jean-Christophe Petkovich, CTO at Acerta, explained how their ML team approaches MLOps.

According to him, there are several ingredients for a complete MLOps system:

  • You need to be able to build model artifacts that contain all the information needed to preprocess your data and generate a result. 
  • Once you can build model artifacts, you have to be able to track the code that builds them, and the data they were trained and tested on. 
  • You need to keep track of how all three of these things, the models, their code, and their data, are related. 
  • Once you can track all these things, you can also mark them ready for staging, and production, and run them through a CI/CD process. 
  • Finally, to actually deploy them at the end of that process, you need some way to spin up a service based on that model artifact. 
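
To make the first three ingredients concrete, here is a minimal sketch of a model artifact that carries its own preprocessing state together with hashes of the code and data that produced it. The class, the field names, and the toy "model" (a single coefficient) are all hypothetical and chosen only for illustration; this is not how Acerta or any particular tool implements it.

```python
import hashlib
import json
from dataclasses import dataclass, replace


def sha256_of(payload: bytes) -> str:
    """Content hash that identifies one exact version of code or data."""
    return hashlib.sha256(payload).hexdigest()


@dataclass(frozen=True)
class ModelArtifact:
    """Everything needed to go from raw input to a result, plus lineage."""
    mean: float           # preprocessing state learned from the training data
    scale: float
    weight: float         # the toy "model": one linear coefficient
    code_hash: str        # which training code built this artifact
    data_hash: str        # which dataset it was trained on
    stage: str = "none"   # hypothetical stages: none -> staging -> production

    def predict(self, x: float) -> float:
        # Preprocessing travels with the model, so callers pass raw values.
        return self.weight * (x - self.mean) / self.scale


def build_artifact(train_data, train_code: str) -> ModelArtifact:
    mean = sum(train_data) / len(train_data)
    scale = (max(train_data) - min(train_data)) or 1.0
    return ModelArtifact(
        mean=mean,
        scale=scale,
        weight=2.0,  # stand-in for a real training step
        code_hash=sha256_of(train_code.encode()),
        data_hash=sha256_of(json.dumps(train_data).encode()),
    )


artifact = build_artifact([1.0, 2.0, 3.0], "def train(): ...")
# Marking the artifact ready for staging creates a new immutable record.
staged = replace(artifact, stage="staging")
```

Because the hashes live inside the artifact, the question "which code and data produced the model that is in staging?" can be answered from the artifact alone.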

It’s a great high-level summary of how to successfully implement MLOps in a company. But understanding what is needed at a high level is just one part of the puzzle. The other is adopting or creating the proper tooling to get things done.

That’s why we’ve compiled a list of the best MLOps tools. We’ve divided them into six categories so you can choose the right tools for your team and for your business. Let’s dig in!

Data and pipeline versioning

1. DVC


DVC, or Data Version Control, is an open-source version control system for machine learning projects. It’s an experimentation tool that helps you define your pipeline regardless of the language you use.

When you find a problem in a previous version of your ML model, DVC saves time by letting you go back to the exact code and data version that produced it and reproduce the result. You can also train your model and share it with your teammates via DVC pipelines.

DVC can handle versioning and organization of large amounts of data and store them in a well-organized, accessible way. It focuses on data and pipeline versioning and management but also has some (limited) experiment tracking functionalities.

DVC – summary:

  • Storage agnostic: works with different types of storage
  • Full code and data provenance help to track the complete evolution of every ML model
  • Reproducibility by consistently maintaining a combination of input data, configuration, and the code that was initially used to run an experiment
  • Tracking metrics
  • A built-in way to connect ML steps into a DAG and run the full pipeline end-to-end
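
The idea behind connecting steps into a DAG and rerunning only what changed can be sketched in plain Python. This is a simplified illustration under stated assumptions, not DVC’s implementation or API: the stage names and functions are hypothetical, and the cache key only covers a stage’s name and inputs, whereas a real tool would also account for the stage’s code and parameters.

```python
import hashlib
from graphlib import TopologicalSorter

# Hypothetical stages: name -> (dependencies, function of the dependency outputs)
STAGES = {
    "prepare": ((), lambda: "raw rows, cleaned"),
    "featurize": (("prepare",), lambda prepared: prepared.upper()),
    "train": (("featurize",), lambda features: f"model({len(features)})"),
}


def run_pipeline(stages, cache):
    """Run stages in dependency order, skipping any whose inputs are unchanged."""
    graph = {name: deps for name, (deps, _) in stages.items()}
    outputs, executed = {}, []
    for name in TopologicalSorter(graph).static_order():
        deps, fn = stages[name]
        inputs = [outputs[d] for d in deps]
        # The cache key identifies "this stage applied to exactly these inputs".
        key = hashlib.sha256(repr((name, inputs)).encode()).hexdigest()
        if key not in cache:
            cache[key] = fn(*inputs)
            executed.append(name)
        outputs[name] = cache[key]
    return outputs, executed


cache = {}
first_outputs, first_executed = run_pipeline(STAGES, cache)
second_outputs, second_executed = run_pipeline(STAGES, cache)
```

On the first call every stage runs; on the second call nothing has changed, so every stage is served from the cache.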

👉 Check out DVC & Neptune comparison

2. Pachyderm


Pachyderm is a platform that combines data lineage with end-to-end pipelines on Kubernetes.

It’s available in three versions: Community Edition (open-source, can be used anywhere), Enterprise Edition (a complete version-controlled platform), and Hub Edition (still in beta, it combines characteristics of the two previous versions).

You need to integrate Pachyderm with your infrastructure/private cloud.

Since this section is about data and pipeline versioning, we’ll focus on those two areas, but there is more to Pachyderm than that (check out the website for more info).

When it comes to data versioning, Pachyderm’s versioning system is built on the following main concepts:

  • Repository – a Pachyderm repository is the highest level data object. Typically, each dataset in Pachyderm is its own repository
  • Commit – an immutable snapshot of a repo at a particular point in time
  • Branch – an alias to a specific commit, or a pointer, that automatically moves as new data is submitted
  • File – files and directories are the actual data in your repository. Pachyderm supports files of any type, size, and number
  • Provenance – expresses the relationship between various commits, branches, and repositories. It helps you to track the origin of each commit
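
The concepts above fit together roughly as in the toy model below. It is a sketch written for this article, not Pachyderm’s API or storage format: the class names, the `commit` method, and the way commit IDs are derived are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Commit:
    id: str
    files: tuple             # immutable snapshot: ((path, content), ...)
    parent: Optional[str]    # previous commit on the same branch
    provenance: tuple = ()   # ids of upstream commits this one was derived from


@dataclass
class Repo:
    name: str
    commits: dict = field(default_factory=dict)    # commit id -> Commit
    branches: dict = field(default_factory=dict)   # branch name -> commit id

    def commit(self, branch, files, provenance=()):
        parent = self.branches.get(branch)
        snapshot = tuple(sorted(files.items()))
        digest = hashlib.sha256(repr((self.name, parent, snapshot)).encode())
        commit_id = digest.hexdigest()[:12]
        self.commits[commit_id] = Commit(commit_id, snapshot, parent, tuple(provenance))
        # A branch is just a pointer that moves to the newest commit.
        self.branches[branch] = commit_id
        return commit_id


images = Repo("images")
c1 = images.commit("master", {"a.png": "v1"})
c2 = images.commit("master", {"a.png": "v2"})

# A downstream repo records which input commit its data was derived from.
models = Repo("models")
m1 = models.commit("master", {"model.bin": "weights"}, provenance=[c2])
```

Following `provenance` from `m1` leads back to `c2`, which is what lets you trace a model to the exact snapshot of the data it came from.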

👉 Check out Pachyderm & Neptune comparison

3. Kubeflow


Kubeflow is the ML toolkit for Kubernetes. It helps in maintaining machine learning systems by packaging and managing Docker containers. It facilitates the scaling of machine learning models by making run orchestration and deployment of machine learning workflows easier.

It’s an open-source project that contains a curated set of compatible tools and frameworks specific for various ML tasks.

Kubeflow – summary:

  • A user interface (UI) for managing and tracking experiments, jobs, and runs
  • Notebooks for interacting with the system using the SDK
  • Re-use components and pipelines to quickly create end-to-end solutions without having to rebuild each time
  • Kubeflow Pipelines is available as a core component of Kubeflow or as a standalone installation

Run orchestration

1. Kubeflow

As you’ve noticed, we’ve already mentioned Kubeflow in data and pipeline versioning, but the tool can also be helpful in other areas, including orchestration.

You can use Kubeflow Pipelines to deal with long ML training jobs, manual experimentation, reproducibility problems, and DevOps obstacles.

With Kubeflow’s tools and frameworks, it’s easier to orchestrate your experiments.

👉 Check out Kubeflow & Neptune comparison

2. Polyaxon


Polyaxon is a platform for reproducing and managing the whole life cycle of machine learning projects as well as deep learning applications.

The tool can be deployed into any data center or cloud provider, or it can be hosted and managed by Polyaxon. It supports all the major deep learning frameworks, e.g., Torch, TensorFlow, MXNet.

When it comes to orchestration, Polyaxon lets you maximize the usage of your cluster by scheduling jobs and experiments via their CLI, dashboard, SDKs, or REST API.

Polyaxon – summary:

  • Supports the entire lifecycle including run orchestration but can do way more than that
  • Has an open-source version that you can use right away but also provides options for enterprise
  • Very well documented platform, with technical reference docs, getting started guides, learning resources, guides, tutorials, changelogs, and more
  • Lets you monitor, track, and analyze every optimization experiment with the experiment insights dashboard

👉 Check out the comparison between Polyaxon & Neptune

3. Airflow

Airflow is an open-source platform that allows you to monitor, schedule, and manage your workflows using the web application. It provides an insight into the status of completed and ongoing tasks along with an insight into the logs.

Airflow uses directed acyclic graphs (DAGs) to manage workflow orchestration. The tool is written in Python, but you can use it with any other language.

Airflow – summary:

  • Easy to use with your current infrastructure—integrates with Google Cloud Platform, Amazon Web Services, Microsoft Azure and many other services
  • You can visualize pipelines running in production
  • It can help you manage different dependencies between tasks

Experiment tracking and organization

1. Neptune

Neptune is a metadata store that was built for research and production teams that run many experiments. It’s composed of three major components:

  • Data versioning  
  • Experiment tracking
  • Model registry

These components allow Neptune to serve as a connector between different parts of the MLOps workflow. The main purpose is to create a centralized place for all machine learning lifecycle metadata and make it easy for teams to store, organize, display, share, and compare all metadata generated during model development, and to track its lineage.

Furthermore, Neptune is very flexible, works with many other frameworks, and its user interface stays stable at scale (up to millions of runs).

Finally, as a robust piece of software, Neptune facilitates efficient team collaboration and project supervision, and lets you store, retrieve, and analyze large amounts of data.

Neptune – summary:

  • Provides user and organization management with different organizations, projects, and user roles
  • Fast and beautiful UI with a lot of capabilities to organize runs in groups, save custom dashboard views and share them with the team
  • You can use a hosted app to avoid all the hassle with maintaining yet another tool (or have it deployed on your on-prem infrastructure)
  • Your team can track experiments that are executed in scripts (Python, R, other), notebooks (local, Google Colab, AWS SageMaker), and do that on any infrastructure (cloud, laptop, cluster)
  • Extensive experiment tracking and visualization capabilities (resource consumption, scrolling through lists of images)
  • Provides individuals and teams with notebook checkpointing and model registry to track model version and lineage. 

2. MLflow


MLflow is an open-source platform that helps manage the whole machine learning lifecycle, including experimentation, reproducibility, deployment, and a central model registry.

MLflow is suitable for individuals and for teams of any size. 

The tool is library-agnostic. You can use it with any machine learning library and in any programming language.

MLflow comprises four main components that help to track and organize experiments:

  1. MLflow Tracking – an API and UI for logging parameters, code versions, metrics, and artifacts when running machine learning code and for later visualizing and comparing the results
  2. MLflow Projects – packaging ML code in a reusable, reproducible form to share with other data scientists or transfer to production
  3. MLflow Models – managing and deploying models from different ML libraries to a variety of model serving and inference platforms
  4. MLflow Model Registry – a central model store to collaboratively manage the full lifecycle of an MLflow Model, including model versioning, stage transitions, and annotations
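
To show what "tracking" and "registry" mean in practice, here is a toy version of both ideas in plain Python. It is deliberately not MLflow’s API: the `Run` and `ModelRegistry` classes are hypothetical, and the stage names are assumptions used only to illustrate stage transitions.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """One experiment run: fixed parameters plus metric histories."""
    params: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)  # metric name -> list of values

    def log_param(self, key, value):
        self.params[key] = value

    def log_metric(self, key, value):
        # Metrics keep their full history so runs can be compared over time.
        self.metrics.setdefault(key, []).append(value)


class ModelRegistry:
    STAGES = ("None", "Staging", "Production", "Archived")  # assumed stage names

    def __init__(self):
        self.versions = {}  # (model name, version) -> {"run": Run, "stage": str}

    def register(self, name, run):
        version = 1 + sum(1 for (n, _v) in self.versions if n == name)
        self.versions[(name, version)] = {"run": run, "stage": "None"}
        return version

    def transition(self, name, version, stage):
        if stage not in self.STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self.versions[(name, version)]["stage"] = stage


run = Run()
run.log_param("learning_rate", 0.01)
for loss in (0.9, 0.5, 0.3):
    run.log_metric("loss", loss)

registry = ModelRegistry()
version = registry.register("churn-model", run)
registry.transition("churn-model", version, "Staging")
```

The key design point is that a registered version keeps a reference to the run that produced it, so a model in staging can always be traced back to its parameters and metrics.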

👉 Check out the comparison between MLflow & Neptune

👉 And read about MLflow + Neptune integration

3. Comet


Comet is a meta machine learning platform for tracking, comparing, explaining, and optimizing experiments and models. It allows you to view and compare all of your experiments in one place. It works wherever you run your code with any machine learning library, and for any machine learning task.

Comet is suitable for teams, individuals, academics, organizations, and anyone who wants to run experiments, visualize them easily, and streamline their work.

Some of Comet’s most notable features include:

  • Multiple features for sharing work in a team
  • Works well with existing ML libraries
  • Deals with user management
  • Lets you compare experiments: code, hyperparameters, metrics, predictions, dependencies, system metrics, and more
  • Allows you to visualize samples with dedicated modules for vision, audio, text and tabular data
  • Has a bunch of integrations to connect it to other tools easily

Hyperparameter tuning

1. Optuna


Optuna is an automatic hyperparameter optimization framework that can be used both for machine learning/deep learning and in other domains. It has a suite of state-of-the-art algorithms that you can choose from (or connect to), makes it easy to distribute training across multiple machines, and lets you visualize your results.

It integrates with popular machine learning libraries such as PyTorch, TensorFlow, Keras, FastAI, scikit-learn, LightGBM, and XGBoost.

Optuna – summary:

  • Supports distributed training both on one machine (multi-process) and on a cluster (multi-node)
  • Supports various pruning strategies to converge faster (and use less compute)
  • Has a suite of powerful visualizations like parallel coordinates, contour plot, or slice plot
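
Pruning is what makes the "converge faster and use less compute" point work, so here is a small standard-library sketch of the idea. It uses plain random search with a simplified median rule and a made-up loss curve; it does not use Optuna’s API and is not a reimplementation of its samplers or pruners.

```python
import random
import statistics


def train_step_loss(lr, step):
    """Hypothetical training curve: loss falls with steps, best near lr = 0.1."""
    return (lr - 0.1) ** 2 + 1.0 / (step + 1)


def search(n_trials=20, n_steps=10, seed=0):
    rng = random.Random(seed)
    history = {}   # step -> intermediate losses kept from earlier trials
    best = None    # (final_loss, lr) of the best completed trial
    pruned = 0
    for _ in range(n_trials):
        lr = rng.uniform(0.0, 1.0)
        for step in range(n_steps):
            loss = train_step_loss(lr, step)
            seen = history.setdefault(step, [])
            # Simplified median rule: stop a trial that is doing worse than
            # the median of earlier trials at the same step.
            if len(seen) >= 3 and loss > statistics.median(seen):
                pruned += 1
                break
            seen.append(loss)
        else:
            # The trial ran all steps without being pruned.
            if best is None or loss < best[0]:
                best = (loss, lr)
    return best, pruned


best, pruned = search()
```

Every pruned trial stops before its last step, which is where the compute saving comes from; the trade-off is that an aggressive rule can discard a trial that would have recovered later.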

👉 Check Neptune’s integration with Optuna

2. SigOpt


SigOpt aims to accelerate and amplify the impact of machine learning, deep learning, and simulation models. It helps to save time by automating processes which makes it a suitable tool for hyperparameter tuning.

You can integrate SigOpt seamlessly into any model, framework, or platform without worrying about your data, model, and infrastructure – everything’s secure.

The tool also lets you monitor, track, and analyze your optimization experiments as well as visualize them.

SigOpt – summary:

  • Multimetric Optimization facilitates the exploration of two distinct metrics simultaneously
  • Conditional Parameters let you define and tune architecture parameters and automate model selection
  • High Parallelism enables you to fully leverage large-scale computer infrastructure and run optimization experiments across up to one hundred workers
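
When you optimize two metrics at once there is usually no single best configuration, only a set of trade-offs. The sketch below shows how such a set (a Pareto front) can be computed. It is a generic illustration with made-up numbers, not SigOpt’s API or its optimization algorithm.

```python
def pareto_front(points):
    """Return the points not dominated by any other point.

    Both metrics are assumed to be "higher is better". A point is dominated
    when another point is at least as good on both metrics and is not the
    same point.
    """
    front = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical (accuracy, throughput) results from five configurations
results = [(0.90, 100), (0.92, 80), (0.88, 120), (0.91, 70), (0.85, 110)]
front = pareto_front(results)
```

Here `(0.91, 70)` drops out because `(0.92, 80)` beats it on both metrics, and `(0.85, 110)` drops out because of `(0.88, 120)`. The three remaining points are the real choices between accuracy and throughput.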

Model serving

1. Kubeflow

Kubeflow appears several times in our article and that’s because its components allow you to manage almost every aspect of your ML experiments.

It’s quite a flexible solution that gives you room to manipulate your data and serve models the way you need to.

2. Cortex

Cortex is an open-source alternative to serving models with SageMaker or building your own model deployment platform on top of AWS services like Elastic Kubernetes Service (EKS), Lambda, or Fargate and open source projects like Docker, Kubernetes, TensorFlow Serving, and TorchServe.

It’s a multi-framework tool that lets you deploy all types of models.

Cortex – summary:

  • Automatically scale APIs to handle production workloads
  • Run inference on any AWS instance type
  • Deploy multiple models in a single API and update deployed APIs without downtime
  • Monitor API performance and prediction results

3. Seldon


Seldon is an open-source platform that allows you to deploy machine learning models on Kubernetes. It’s available in the cloud and on-premise.

Seldon – summary:

  • Simplify model deployment with various options like canary deployment
  • Monitor models in production with the alerting system when things go wrong
  • Use model explainers to understand why certain predictions were made. Seldon also open-sourced a model explainer package alibi
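
A canary deployment sends a small share of traffic to a new model version while the rest stays on the stable one. The sketch below shows only that routing idea in plain Python. In Seldon the split is configured on the deployment rather than written by hand, so treat the function and parameter names here as hypothetical.

```python
import random


def make_canary_router(stable, canary, canary_fraction, seed=None):
    """Return a function that routes each request to the stable or canary model."""
    rng = random.Random(seed)

    def route(request):
        # Each request independently goes to the canary with the given probability.
        model = canary if rng.random() < canary_fraction else stable
        return model(request)

    return route


def stable_model(request):
    return ("v1", request)


def canary_model(request):
    return ("v2", request)


route = make_canary_router(stable_model, canary_model, canary_fraction=0.1, seed=42)
responses = [route(i) for i in range(10_000)]
canary_share = sum(1 for version, _ in responses if version == "v2") / len(responses)
```

With a fraction of 0.1, roughly one request in ten reaches the new version. If its error rate or latency looks fine, the fraction can be raised step by step until the canary takes all traffic.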

Production model monitoring

1. Amazon SageMaker Model Monitor 

Amazon SageMaker Model Monitor is part of the Amazon SageMaker platform that enables data scientists to build, train, and deploy machine learning models.

Model Monitor lets you automatically monitor machine learning models in production and alerts you whenever data quality issues appear.

The tool helps to save time and resources so you and your team can focus on the results.

Amazon SageMaker Model Monitor – summary:

  • Use the tool on any endpoint, whether the model was trained with a built-in algorithm, a built-in framework, or your own container
  • With the SageMaker SDK, you can capture predictions or a configurable fraction of the data sent to the endpoint and store it in one of your Amazon Simple Storage Service (S3) buckets. Captured data is enriched with metadata, and you can secure and access it just like any S3 object.
  • Launch a monitoring schedule and receive reports that contain statistics and schema information on the data received during the latest time frame, and any violation that was detected
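
The capture-then-check workflow described above can be sketched without any AWS dependency. The code below samples a fraction of incoming requests, learns a simple baseline (type and range per feature) from training data, and reports violations. It is an illustration of the general idea only: the function names, the baseline format, and the violation labels are assumptions, not the SageMaker SDK or its report schema.

```python
import random


def capture(requests, sampling_fraction, seed=0):
    """Keep a configurable fraction of the requests sent to an endpoint."""
    rng = random.Random(seed)
    return [r for r in requests if rng.random() < sampling_fraction]


def build_baseline(training_rows):
    """Learn a per-feature type and value range from the training data."""
    baseline = {}
    for name in training_rows[0]:
        values = [row[name] for row in training_rows]
        baseline[name] = {"type": type(values[0]), "min": min(values), "max": max(values)}
    return baseline


def find_violations(captured_rows, baseline):
    """Compare captured rows to the baseline and list every mismatch."""
    violations = []
    for index, row in enumerate(captured_rows):
        for name, rule in baseline.items():
            if name not in row:
                violations.append((index, name, "missing"))
            elif not isinstance(row[name], rule["type"]):
                violations.append((index, name, "type"))
            elif not rule["min"] <= row[name] <= rule["max"]:
                violations.append((index, name, "out_of_range"))
    return violations


baseline = build_baseline([
    {"age": 20, "income": 1000.0},
    {"age": 60, "income": 5000.0},
])
captured = [
    {"age": 30, "income": 2000.0},   # fine
    {"age": "x", "income": 2000.0},  # wrong type
    {"age": 99},                     # out of range and missing a feature
]
violations = find_violations(captured, baseline)
```

Running the check on a schedule against the most recent captured data is what turns this from a one-off validation into monitoring.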

2. Hydrosphere


Hydrosphere is an open-source platform for managing ML models. Hydrosphere Monitoring is its module that allows you to monitor your production machine learning models in real time.

It uses different statistical and machine learning methods to check whether your production distribution matches the training one. It supports external infrastructure by allowing you to connect models hosted outside Hydrosphere to Hydrosphere Monitoring to monitor their quality.

Hydrosphere Monitoring – summary:

  • Monitor how various statistics change in your data over time with Statistical Drift Detection
  • Complex data drift can be detected with Hydrosphere multivariate data monitoring
  • Monitor anomalies with a custom KNN metric or a custom Isolation Forest metric
  • It supports tabular, image, and text data
  • When your metrics change, you get notified so you can quickly respond
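
One common statistical way to check whether a production distribution still matches the training one is the two-sample Kolmogorov–Smirnov statistic: the largest gap between the two empirical cumulative distribution functions. The sketch below is a generic standard-library version for a single numeric feature. It is not Hydrosphere’s implementation, and the alert threshold is an arbitrary value picked for the example.

```python
import bisect


def ks_statistic(reference, production):
    """Largest gap between the empirical CDFs of two samples (0.0 to 1.0)."""
    ref, prod = sorted(reference), sorted(production)
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking every
    # observed value is enough to find the maximum gap.
    for x in ref + prod:
        cdf_ref = bisect.bisect_right(ref, x) / len(ref)
        cdf_prod = bisect.bisect_right(prod, x) / len(prod)
        gap = max(gap, abs(cdf_ref - cdf_prod))
    return gap


DRIFT_THRESHOLD = 0.3  # hypothetical alert level, tune per feature

training_feature = [1, 2, 3, 4]
production_feature = [3, 4, 5, 6]
drift = ks_statistic(training_feature, production_feature)
drift_detected = drift > DRIFT_THRESHOLD
```

A value of 0 means the two samples have identical empirical distributions and 1 means they do not overlap at all. In practice the statistic is computed per feature over a sliding window of recent production data.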

3. Cortex

We’ve already mentioned Cortex in the Model Serving section, but since it’s a multi-framework tool, you can flexibly use it for other purposes, including monitoring your model.

And together with the model serving feature, it gives you full control over your models.

To wrap it up

Now that you have the list of the best tools, combine your favorites with the right approach. There’s nothing better than the mix of a good approach and superb software.

Happy experimenting!

