Hiring Playbook
Most ML engineer hires go sideways because the screen tests what the candidate knows, not what they have done. Here are the questions that actually predict production performance.
A machine learning engineer hire goes one of two ways. Either the candidate ramps up, ships an improvement to an existing ML system in the first quarter, and starts owning a slice of the stack by month six. Or the candidate spends four months in notebooks, never ships, and quietly transitions into a data analyst role while the team hopes the next hire is better.
The difference between those outcomes is almost always set at the screening stage. Most interview loops test ML theory: tree depth, regularization, gradient descent. Those questions do not predict who will ship in production. This page lays out the questions that do.
Shipping ML is a distributed systems job with a model attached. The model is 20% of the work. The other 80% is data pipelines, feature stores, serving infrastructure, monitoring, versioning, rollback, and on-call. An engineer who has only trained models in a notebook is missing most of the actual job.
Before you design the interview, be honest with yourself about which 80% the role owns. Is this a modeling-heavy seat on an existing platform, or does the candidate need to own the whole stack? If it is the latter, the candidate pool is five to ten times smaller, and the screen needs to reflect that.
The following questions are the core of the screen we run on every senior ML search at Engineers in AI. Tony Kochhar, a 20-year engineering veteran, runs these conversations personally on senior roles. Each question is designed to be hard to fake.
A Kaggle grandmaster title does not predict production ML performance. Neither does a PhD. Neither does a FAANG resume, on its own. The signals that actually matter are operational.
The most common mistake is using the same interview loop for every ML hire, regardless of whether the role is platform, applied, or research. A second common mistake is letting one strong paper carry a candidate through the loop when the rest of the signal is weak. A third is chasing a name-brand resume at the expense of someone quieter who has shipped more real systems.
The way to avoid all three is to write down what the hire will ship in their first 90 days, and interview backwards from that list. If the hire will own a recommendation service, the loop should include a realistic recommendation system design question, not a generic ML theory quiz.
An interview loop that actually predicts ML engineer performance has three stages: a 45-to-60-minute conversation about a system the candidate shipped, with specific follow-ups on monitoring, retraining, and failure modes; a scoped ML system design exercise, not a leetcode puzzle; and a practical coding exercise focused on data handling and pipeline logic, not on exotic algorithms the candidate will never reach for in production.
The loop you want to avoid is the one that tests theory in three separate stages. If your interview is mostly a machine learning final exam, you will end up with candidates who can explain algorithms and stall the moment they have to debug a feature pipeline that silently dropped 3% of rows last Tuesday.
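The silent row-drop scenario above is exactly the kind of thing a practical coding exercise can probe. Here is a minimal, hypothetical sketch (the function and field names are invented for illustration, not taken from any real screen) of the guard a production-minded candidate adds to a feature join so that dropped rows fail loudly instead of shipping quietly:

```python
def build_features(events, profiles, max_drop_rate=0.01):
    """Join event rows to user profiles. Illustrative names only.

    An inner join silently discards events whose user_id has no
    profile; this guard turns that silent loss into a loud failure.
    """
    merged = [
        {**event, **profiles[event["user_id"]]}
        for event in events
        if event["user_id"] in profiles
    ]
    # The production instinct being tested: measure what the join
    # dropped, and refuse to proceed past a small tolerance.
    drop_rate = 1 - len(merged) / len(events)
    if drop_rate > max_drop_rate:
        raise ValueError(f"join dropped {drop_rate:.1%} of rows")
    return merged
```

A candidate who reaches for a check like this unprompted, or who can explain where it belongs in a real pipeline (at the join, before the model ever sees the data), is showing the operational signal the theory quiz misses.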
Use an ML specialist recruiter when your role is senior, the screening bar is high, and your internal team cannot filter on production signal. At a fully loaded ML engineer compensation of $350K to $500K, a bad hire is an expensive mistake. A flat 20% placement fee pays for itself the first time a bad candidate is filtered out before your team spends a week on them.
Engineers in AI has closed over 1,000 technical placements in 20 years, including ML hires at Agoda, Hearst, Con Edison, and Trilogy. No retainer, no exclusivity, 90-day replacement guarantee. If you are hiring an ML engineer and want an engineering-led read on your role, book a hiring call.
Screening built around shipped systems, not theory. Flat 20% fee. 90-day replacement.