RAG Engineering
Retrieval-augmented generation is a specific discipline. Embeddings, chunking, reranking, evals, latency. Engineers in AI screens for the craft, not the library list.
Retrieval-augmented generation is often described as "just plug in a vector database and an LLM." That description has launched a lot of demos and almost no shipped systems. The teams that actually ship production RAG know that the hard work sits between those two boxes, and a RAG engineer hire who cannot reason about that middle is not going to ship anything that survives real traffic.
Engineers in AI is a RAG engineer recruiter with a retrieval-specific screen. We do not accept "I built a RAG system with LangChain and Pinecone" as evidence. We ask what was in the system, how it was evaluated, and what broke when real users hit it.
A production RAG system has six or seven moving parts, and a real RAG engineer can reason about all of them.
A candidate strong in three of those areas can be hired for a team where other engineers own the rest. A candidate strong in none of them should not be hired as a RAG engineer. The screen below is about finding that difference.
Tony Kochhar runs the first technical conversation on every senior RAG engineer search. The screen below is written to avoid the LangChain-tutorial trap: candidates who have wired libraries together but cannot reason about the tradeoffs.
Strong candidates answer in specific numbers and specific tradeoffs. Weak candidates answer in library names.
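To make "specific numbers" concrete: one common way to put a number on retrieval quality is recall@k and mean reciprocal rank (MRR) over a labeled query set. The sketch below is a minimal illustration in Python, not a description of the screen itself or of any candidate's system. The function names, document IDs, and toy data are all hypothetical.

```python
from typing import Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def reciprocal_rank(retrieved: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(
    runs: Dict[str, List[str]],
    labels: Dict[str, Set[str]],
    k: int = 5,
) -> Dict[str, float]:
    """Average recall@k and MRR over every labeled query."""
    n = len(labels)
    if n == 0:
        return {"recall_at_k": 0.0, "mrr": 0.0}
    recalls = [recall_at_k(runs.get(q, []), rel, k) for q, rel in labels.items()]
    rrs = [reciprocal_rank(runs.get(q, []), rel) for q, rel in labels.items()]
    return {"recall_at_k": sum(recalls) / n, "mrr": sum(rrs) / n}


# Hypothetical toy data: ranked retriever output and hand-labeled relevant docs.
labels = {"q1": {"d1", "d4"}, "q2": {"d7"}}
runs = {"q1": ["d4", "d2", "d1"], "q2": ["d3", "d7", "d9"]}

metrics = evaluate(runs, labels, k=2)
print(metrics)
```

A candidate who has run something like this can usually say what the number was before and after a chunking or reranking change, and what it cost in latency.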
A RAG engineer at an enterprise search company owns reranking and eval on a corpus measured in hundreds of millions of documents. A RAG engineer at a Series A AI startup often owns the full stack, from parsing to serving, on a corpus measured in tens of thousands. A RAG engineer at a regulated company spends half their time on source attribution and guardrails because the cost of a wrong citation is a compliance incident. Those are three different hires.
Before the first submittal, we push to pin down which variant your team actually needs. A candidate who flourishes in one of those environments can stall in another. A specialist recruiter's job is to know that, and to calibrate the outreach before the first candidate ever lands in your inbox.
The first call is 45 minutes on the role. We pull apart what the hire will actually ship in 90 days: is this a retrieval-specific seat on an existing LLM team, or is this the first RAG engineer shaping the whole stack? Those are different candidate profiles. We target accordingly, and our first submittal is usually in your inbox within two weeks.
Engineers in AI is a boutique NYC recruiting firm with a flat 20% placement fee, no retainer, no exclusivity, and a 90-day replacement guarantee. Over 1,000 placements in 20 years, including engagements with Agoda, Hearst, Con Edison, and Trilogy. We take the searches we can deliver on, and we are honest on the first call about the ones where we cannot.
If you are hiring a retrieval-augmented generation engineer and you want a recruiter who can screen on retrieval quality, not just library familiarity, book a hiring call. You get 45 minutes on the role, a real read on the market, and an honest answer on whether a boutique firm is the right fit for your hire.
Specialist search for retrieval-augmented generation. Flat 20% fee. 90-day replacement. No retainer.