Why is MLOps a necessity in my life?

I am in the fortunate position of having worked on more than just experimental machine learning models: my models often went into production, and the company began to profit from them. For example, they set custom prices for users’ custom subscription packages, predicted customer churn, made recommendations to users, and interpreted and filled out legal documents.

Unfortunately, by an often-quoted estimate, 90% of machine learning models never make it to production. Yet the owners of a company certainly want the money and time invested in developing a machine learning model to pay off as soon as possible. Of course, a data scientist may not be working toward production at all; the goal may simply be to gain some learning or insight.
However, very often even machine learning models designed for production never get there. Why? Because of a lack of MLOps knowledge, among other reasons (see later). The process from business need to user experience is simply not well thought through.

When my machine learning model finally gets to the point where I can put it into production, I face many challenges. I need a version of the model that preprocesses never-before-seen data in exactly the same way as the data was preprocessed during training. Then I need a version that the company’s software can communicate with, so that things happen in the backend and frontend based on the predictions. For example, the machine learning model recommends what should be displayed to the user on the website. That is already MLOps in a big way.
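
To make the "same preprocessing at training and serving time" point concrete, here is a minimal sketch using only the Python standard library. The function names, the feature values, and the choice of standardization are my own illustrative assumptions, not a prescribed method: the idea is only that the parameters are learned once on the training data, saved with the model, and reused unchanged on new data.

```python
import json
import statistics


def fit_scaler(values):
    """Learn standardization parameters from the training data only."""
    return {"mean": statistics.fmean(values), "std": statistics.pstdev(values)}


def transform(values, params):
    """Apply the stored parameters; never recompute them on new data."""
    std = params["std"] or 1.0  # guard against a constant feature
    return [(v - params["mean"]) / std for v in values]


# Training time: fit on the training data and persist the parameters.
train_prices = [10.0, 12.0, 14.0, 16.0]  # made-up values for illustration
params = fit_scaler(train_prices)
serialized = json.dumps(params)  # in practice, stored next to the model artifact

# Serving time: reload the same parameters and transform unseen data.
serving_params = json.loads(serialized)
scaled = transform([13.0], serving_params)
```

If the serving code recomputed the mean and standard deviation from the incoming request instead of loading `serving_params`, the model would receive inputs on a different scale than it was trained on.
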
The system must also be monitored. Is it working properly? What happens if the server stops or our system runs into an error? I need to know which version of the ML (machine learning) model is running, what its results (accuracy, AUC, etc.) were on the test set, and what its results are in production. An alert is needed if the gap between the results obtained on the test set and the production results is too large. Explicit and implicit feedback from users can be processed if our MLOps system is good.
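
The alert on the gap between test-set and production results could look something like the sketch below. The function name, the `max_gap` threshold of 0.05, the version label, and the AUC values are all assumptions for illustration; I also assume that only a drop in production (not an improvement) should raise the alert.

```python
def metric_gap_alert(test_metric, production_metric, max_gap=0.05):
    """Return True when production falls behind the test-set result by more than max_gap."""
    return (test_metric - production_metric) > max_gap


# Hypothetical record of what is currently running and how it scored on the test set.
model_info = {"version": "hypothetical-v3", "test_auc": 0.91}
production_auc = 0.82  # made-up value measured on recent production data

if metric_gap_alert(model_info["test_auc"], production_auc):
    print(f"ALERT: model {model_info['version']} dropped from "
          f"{model_info['test_auc']} to {production_auc} AUC in production")
```

In a real system the print would be replaced by whatever alerting channel the team already uses.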

At certain intervals, our system needs to be retrained on the new data we have collected. Then the model version changes, the accuracy on the test set changes, etc.

It often happens that the data changes so much in a few years, a few months, or even a few days that our trained model can no longer predict accurately. This is the topic of data drift and concept drift. Data drift is when the input data itself changes, for example when its distribution shifts. Concept drift is when the relationship between the inputs and what we want to predict changes, for example because user habits, the economic environment, or a company’s strategy changes. Covid-19, for example, caused concept drift. I have to ask myself: is my machine learning model prepared for these? Have I even built a monitoring system that lets me detect the problem quickly through an alert? The problem is not that data drift or concept drift occurs. The biggest problem is when our model no longer predicts well and its accuracy in production has deteriorated drastically, but we and our company don’t even know about it. This can result in a drastic drop in revenue or a drastic increase in costs, so the profit decreases or disappears.
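
One simple way to put a number on data drift for a single numeric feature is the Population Stability Index (PSI), which compares how a reference sample and a new sample are spread across the same bins. The sketch below is a stdlib-only illustration under my own assumptions (equal-width bins taken from the reference sample, a small floor for empty bins, made-up data); it is not the only way to detect drift, and it says nothing about concept drift.

```python
import math


def psi(expected, actual, bins=4):
    """Population Stability Index between a reference sample and a new sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature

    def shares(sample):
        counts = [0] * bins
        for x in sample:
            # Clamp values outside the reference range into the edge bins.
            idx = min(max(int((x - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(sample), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = list(range(100))             # made-up training-time feature values
shifted = [x + 50 for x in range(100)]   # the same feature after a shift

print(psi(reference, reference))  # identical samples give 0
print(psi(reference, shifted))    # a clearly shifted sample gives a large value
```

A common rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the threshold that should trigger an alert depends on the feature and the business, so treat any fixed number as an assumption to validate.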

Has our system been thought through from an ethical, social, and legal point of view? Could it, for example, generate hate speech, that is, incite against a minority group?

If we want to develop the system further, or need to hand it over to someone else to develop, is there production-ready code to be found? Is it documented and understandable? Are our experiments tracked? Can one see how far we have come during the development of the model?

If a new model version comes out, is it easy to switch to it? Do we have test cases with which we can check whether it meets the requirements of production? Is it possible to trace back which model was in production when, and with what results?
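
Tracing back which model was in production at a given time does not require heavy tooling to start with. Below is a minimal sketch of a deployment history; the `Deployment` class, the `live_at` helper, the version labels, and the AUC values are hypothetical names and numbers I made up for illustration, and a real setup would more likely use a model registry.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Deployment:
    version: str           # hypothetical version label
    test_auc: float        # result on the test set at release time
    deployed_at: datetime  # when this version went live


def live_at(history: List[Deployment], moment: datetime) -> Optional[Deployment]:
    """Return the deployment that was in production at the given moment."""
    earlier = [d for d in history if d.deployed_at <= moment]
    return max(earlier, key=lambda d: d.deployed_at) if earlier else None


history = [
    Deployment("v1", 0.88, datetime(2022, 6, 1)),    # made-up entries
    Deployment("v2", 0.91, datetime(2022, 11, 15)),
]

print(live_at(history, datetime(2022, 9, 1)).version)  # prints "v1"
```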

Why isn’t MLOps easy? There are several reasons:

On the one hand, this is a fairly new field. The concept appeared only a few years ago. Of course, machine learning models were running in production before that, but it was not called MLOps; no suitable books or courses existed and it was not taught at universities, so it was underrepresented and its importance was underestimated.
On the other hand, MLOps is a multi-disciplinary field. This means that you also need some knowledge of data science, machine learning, backend engineering, data engineering, DevOps engineering, and software engineering.
A third problem is that the business side likes to think of a machine learning project as finished when the optimized model is ready. But the truth is that the business side also needs to be prepared, and the data professionals need to prepare them for the fact that when a model is ready, we are only halfway there. After that, there is still a lot of work to be done if we don’t want our machine learning model to end up among the 90% that are postponed or never put into production, or among the models that don’t perform well there.

In summary, if we want to have successful machine learning projects behind us, we need to learn the science of MLOps. To do this, we need to acquire a number of skills that are outside of what we have been working with so far. Finally, when communicating with business managers, we should make the point that once the machine learning model is complete, the project is only halfway done and additional resources are needed.

If you like my content, please upvote this, follow me, send me a message, and check out my pages:

Website: https://www.datascienceeurope.ai
Linkedin: https://www.linkedin.com/in/gerzson-boros/
Medium: https://medium.com/@gerzson.boros