How I would Engineer Machine Learning - Redux
If you were to start preparing for ML roles today, what would your step-by-step approach look like, which concepts would you focus on, and what strategies would you use to learn better?
I would do the following, roughly a couple of these at a time:
- ★ Practice LeetCode questions, know the patterns of problems
- ★ Practice the basics of ML. A good benchmark is you should know everything that Josh Starmer’s StatQuest talks about, I think he has a couple of playlists.
- But know the basics thoroughly. For example, pick one algorithm from each class of problems and know EVERYTHING about it. Linear Regression, Logistic Regression, Decision Tree, SVM, K Means Clustering. Deep enough to answer meta questions like:
- Why sigmoid and not any other function in Logistic Regression?
- Why use log in Logistic Regression?
- How can you predict continuous values in a DT?
- Apart from models, also know ML basics - things like Bias, Variance, Overfitting, Regularization, Why layers in NN, How overfitting manifests in DT, NN and LR, Accuracy, Precision, Recall and F1 scores, ROC curves, How to identify a good model, How to identify good data, Feature Engineering, Pruning and when to stop training
- ★ Learn basic probability and stats thoroughly. Anything in a Probability 101 course is fair game. You should be intuitively familiar with things like Distributions (Normal, Binomial, Poisson, Bernoulli), Expectations, Variance, Conditional Probability, Independent Events, Hypothesis Testing, Graphs, Mean-Median-Mode and when to use what, Arithmetic and Geometric means
- Try going deeper into fancy things like Transformers, LLMs, etc. Once you know the foundation it will be very easy to understand what is happening elsewhere.
- ★ If you are able to pick any topic and hold a lesson on it, and go as deep as 5-whys into anything above without stumbling, you are awesome now
- ★ Know everything about any ML project in your resume. You should be able to explain it at various levels of abstraction - from a new software engineer to a seasoned professional in that exact field.
- ★ Look up ML System Design problems. Break each one down and build the entire system. For example, if you had to build the search bar on Amazon's website, how would you fill in the suggestions below it? Remember the constraints (millions of products, different semantics - apple could mean the fruit or the company, super fast predictions - where to store the model)
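To make the search-bar example a bit more concrete, here is a minimal sketch of one common baseline: precompute, for every prefix, the queries that start with it, and rank them by how often they were searched. The class name `PrefixIndex`, its methods, and the sample queries are all made up for illustration. A real system would also have to deal with the semantics and scale constraints mentioned above, which this toy version ignores.

```python
from collections import defaultdict
import heapq


class PrefixIndex:
    """Toy autocomplete index (hypothetical): prefix -> most-searched queries."""

    def __init__(self, k=3):
        self.k = k                          # how many suggestions to return
        self.counts = defaultdict(int)      # query -> number of times searched
        self.by_prefix = defaultdict(set)   # prefix -> queries starting with it

    def record(self, query):
        """Register one search for `query` and index all of its prefixes."""
        query = query.lower().strip()
        self.counts[query] += 1
        for end in range(1, len(query) + 1):
            self.by_prefix[query[:end]].add(query)

    def suggest(self, prefix):
        """Return up to k queries starting with `prefix`, most searched first."""
        candidates = self.by_prefix.get(prefix.lower().strip(), ())
        # Rank by count (descending), breaking ties alphabetically.
        return heapq.nsmallest(
            self.k, candidates, key=lambda q: (-self.counts[q], q)
        )


index = PrefixIndex(k=2)
for q in ["apple watch", "apple watch", "apple juice", "apple pie",
          "apple juice", "apple watch", "apricot jam"]:
    index.record(q)

suggestions = index.suggest("app")
```

In an interview, the interesting part is what comes after this: storing every prefix costs memory, the counts go stale, and plain popularity cannot tell the fruit from the company. Those trade-offs are where the discussion usually goes.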
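For the Logistic Regression meta questions in the list, here is a small sketch of one standard line of reasoning, not the only valid answer. Logistic regression models the log-odds as a linear function of the inputs, and the sigmoid is exactly the inverse of the log-odds, which is one reason it shows up. The log in the loss comes from taking the negative log-likelihood of a Bernoulli label, and it punishes confident wrong predictions heavily. The function names below are my own.

```python
import math


def sigmoid(z):
    """Map a real-valued score (the log-odds) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log-odds of a probability p; the inverse of sigmoid."""
    return math.log(p / (1.0 - p))


def log_loss(y_true, p):
    """Negative log-likelihood of one Bernoulli label y_true in {0, 1}."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))


# sigmoid undoes logit, so a linear model on log-odds yields probabilities.
round_trip = sigmoid(logit(0.8))

# The true label is 1 in all three cases; only the predicted probability changes.
loss_confident_right = log_loss(1, 0.99)
loss_unsure = log_loss(1, 0.40)
loss_confident_wrong = log_loss(1, 0.01)
```

If you can explain why `loss_confident_wrong` is so much larger than `loss_unsure`, and what that does to the gradients during training, you are most of the way to answering "why log".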
There are a lot of resources online. I wouldn’t stress too much on the “best” way to learn as long as you are hitting the above points. Focus more on the basics. Some suggestions (again, not the best, but a good place to start and refer):
When you have MORE time and want to go DEEPER:
All the best :)
★ marks the essentials for a fast ramp-up for ML interviews