Real-World ML #018: How to become a TOP ML engineer

Think of products 📦, NOT models

Apr 01, 2023

Read time: 5 minutes

Most Machine Learning courses give you:

a dataset with features (aka model inputs) and a target (aka model output)
an evaluation metric to optimize (e.g. accuracy)

and they show you how to build the best possible model, that is, the mapping from features to targets that optimizes that metric. This step is essential, but it is JUST ONE of the steps necessary to build real-world ML products.

When you exclusively focus on building ML models, you adopt the so-called model-first mindset.

This is a perfectly valid approach if you work in academia, or as an ML Research Engineer in a big lab (e.g. DeepMind) with a huge engineering team behind you that provides the tooling and platforms to quickly move prototypes to production.

However, most companies in the world are NOT research labs.

This means that most ML engineers in the world need to adopt a different mindset, focused on building ML products that bring business value from day 0.

This is what I call a product-first mindset.

Product-first mindset 🧠

Unless you are a researcher in academia, and your goal is to publish a paper, you cannot just focus on the ML mapping between features and targets. You need to think further down the line, about the business problem you are trying to solve. When you do that, you adopt a product-first mindset.

Real-world ML products are more than just ML models. There are 2 essential skills you need to master over time, and you won't learn them in any Kaggle competition:

Problem framing, which you do before you train your ML model.
Model operationalization (aka MLOps), which you do after you train your ML model.

From model-first mindset to product-first mindset.

Skill #1. Problem framing

Machine Learning is not the end, but the means through which you try to solve ONE SPECIFIC BUSINESS PROBLEM. Hence, at the start of any ML project, you need to step back a bit so that you don’t miss the forest for the trees 🌲🌳🌲🌳🌲.

Here is a list of things I suggest you do at the beginning of every project:

Understand the underlying business problem, the business metric you wanna impact (e.g. marketing profit) and the proxy metric your ML model will try to predict (e.g. Ad Click-Through-Rates).
Talk to stakeholders and end-users, especially if the output of your ML model will be used by a human operator to assist in her decision process.
Estimate the current value (aka baseline) of your business and proxy metrics. These are the numbers you need to improve to bring value to the table. Also think of easy-to-implement non-ML solutions that may work just fine to start with.
Talk to the data engineers to understand data availability (quantity and quality), so you detect project blockers as soon as possible.
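
To make the baseline step concrete, here is a minimal sketch in plain Python. It assumes the proxy metric is Ad Click-Through-Rate, as in the example above. The click log, the category names and the function names are all made up for illustration; in a real project this data would come from your data engineers.

```python
from collections import defaultdict

# Hypothetical click log: (ad_category, clicked) pairs.
# Made-up data, for illustration only.
impressions = [
    ("sports", 1), ("sports", 0), ("sports", 0), ("sports", 0),
    ("travel", 1), ("travel", 1), ("travel", 0), ("travel", 0),
]


def overall_ctr(rows):
    """Current value of the proxy metric: clicks / impressions."""
    return sum(clicked for _, clicked in rows) / len(rows)


def ctr_by_category(rows):
    """A simple non-ML baseline: predict each category's historical CTR."""
    clicks = defaultdict(int)
    shown = defaultdict(int)
    for category, clicked in rows:
        clicks[category] += clicked
        shown[category] += 1
    return {category: clicks[category] / shown[category] for category in shown}


baseline_ctr = overall_ctr(impressions)
category_baseline = ctr_by_category(impressions)

print(f"Overall CTR: {baseline_ctr:.3f}")
print(f"CTR by category: {category_baseline}")
```

Any ML model you train later has to beat numbers like these to justify its extra complexity.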

Skill #2. ML model operationalization (MLOps)

ML model prototypes have 0 value until you put them to work. For that, you need to build a minimal system that

Ingests data and generates features → feature pipeline
Re-trains the model → training pipeline
Generates and serves predictions → inference pipeline
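
The three pipelines above can be sketched as three independent functions that only communicate through shared storage. This is a toy illustration, not a production design: the in-memory `FEATURE_STORE` and `MODEL_REGISTRY`, the `hour` feature and the "mean click rate per hour" model are all assumptions made up for this example.

```python
from statistics import mean

# In-memory stand-ins (hypothetical) for a real feature store and model registry.
FEATURE_STORE = []
MODEL_REGISTRY = {}


def feature_pipeline(raw_events):
    """Ingest raw data and store (features, target) rows."""
    for event in raw_events:
        features = {"hour": event["hour"]}
        FEATURE_STORE.append((features, event["clicked"]))


def training_pipeline():
    """Re-train on everything currently in the feature store.

    The 'model' is just the mean target per hour, with the global
    mean as a fallback for hours it has never seen.
    """
    targets_by_hour = {}
    for features, target in FEATURE_STORE:
        targets_by_hour.setdefault(features["hour"], []).append(target)
    MODEL_REGISTRY["latest"] = {
        "per_hour": {hour: mean(ts) for hour, ts in targets_by_hour.items()},
        "default": mean(target for _, target in FEATURE_STORE),
    }


def inference_pipeline(features):
    """Load the latest model and serve a prediction."""
    model = MODEL_REGISTRY["latest"]
    return model["per_hour"].get(features["hour"], model["default"])


# Made-up raw events, for illustration only.
raw_events = [
    {"hour": 9, "clicked": 1},
    {"hour": 9, "clicked": 0},
    {"hour": 20, "clicked": 1},
]

feature_pipeline(raw_events)
training_pipeline()
print(inference_pipeline({"hour": 9}))
```

Because each function only reads from and writes to the shared storage, you can schedule, test and replace each pipeline on its own.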

💡 If you wanna get into more details I recommend you read my previous article on the 3-pipeline design

MLOps is a set of best practices that help you build a fully functional MVP and improve it over time.

This is what has business value, and what companies are looking for.

My advice 💡

If you are new to Machine Learning, go to Kaggle, pick a dataset and problem you are interested in, and try to build a good predictive model. Get comfortable building models.

Once you feel comfortable building models, take the next step, and start thinking of building actual ML products from these models.

This is, in my opinion, the greatest differentiator in the job market these days: those who build complete ML products vs. those who just build ML prototypes.

If you need help building ML products, I have a hands-on tutorial that will help you beat the impostor syndrome, and learn all the technical tools you need to build your products.

👉🏽 Click here to read all the details

Keep on learning!

Peace, love, and laugh

Pau

Real-World Machine Learning
