Is There a Role for AI Quality Engineers?

The AI Engineer job spec covers everything from fine-tuning to RAG pipelines to prompt design. As AI infrastructure scales, specialism is inevitable, and quality engineering is where the biggest gap sits.

Nov 2025 · 7 min read

AI is becoming increasingly integrated into the fabric of software applications. Some applications will let users interface with the model directly. Some will use AI to make orchestration decisions. Some will use AI to augment existing human processes. The shape varies. The fact that AI is in the stack is increasingly fixed.

Which raises the question: how will this affect the existing software world, and what types of engineering roles might be needed to support these systems? In the specialised AI world, work that used to be the domain of the Data Scientist and Machine Learning Engineer is already expanding to take in Backend Engineers, Data Engineers, and Quality Engineers. New roles have already been defined and adopted, the most visible of which is the AI Engineer, a role specialised in building and deploying AI-based products.

This article looks at one such role, asks whether it is sustainable as a single job, and considers where Quality Engineering fits in the answer.

An example AI Engineer job

The job description below is taken from a recent listing. Specific company details have been removed, and it is used here as an example only.

Responsibilities:

Breaking it down

The spec covers a lot of different specialisms, and it is much broader than a traditional software engineering role. If the traditional roles that map onto each responsibility were added to the profile, it would look something like this:

That is a lot of generalism in one job. It also bundles distinct specialisms that engineers and scientists spend whole careers developing. Code quality alone is a discipline. Context retrieval is an expanding and increasingly complex field of its own. Asking one role to cover all of it is asking for someone who is competent everywhere, expert nowhere, and accountable for results in domains that take years to learn properly.

The rise of hybrid specialists

The reasonable inference is that one of two things will happen. Either AI Engineers will develop personal specialisms within the broad role, or, more likely, the industry will define official specialisms as hybrids of existing roles. Something along these lines:

New role types are also likely to emerge that have no obvious traditional counterpart. Conversation Engineer. AI Evaluation Engineer. Prompt Architect. The names will settle over time, but the specialisms underneath them are already real.

Defining the AI Quality Engineer

Since the focus of this site is testing and quality, let’s sketch out what an AI Quality Engineer role might actually look like and what the responsibilities could be.

That list is recognisable as quality engineering, applied to a different kind of system. The skills transfer. The mindset transfers. What changes is the system being tested, which is non-deterministic and statistical, and which rarely fails in ways that produce a clean stack trace. That changes the techniques without changing the discipline.
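To make the change in technique concrete, here is a minimal sketch of what testing a non-deterministic system can look like: the assertion is made on a pass rate across many samples, not on a single exact output. Everything in it is an assumption for illustration. `make_fake_model` is a hypothetical stand-in for a real model call, `mentions_paris` is a deliberately simple check, and the 0.8 threshold is arbitrary.

```python
import random
from typing import Callable


def pass_rate(generate: Callable[[str], str],
              check: Callable[[str], bool],
              prompt: str,
              samples: int = 50) -> float:
    """Call the system repeatedly and return the fraction of outputs that pass."""
    if samples <= 0:
        raise ValueError("samples must be positive")
    passed = sum(1 for _ in range(samples) if check(generate(prompt)))
    return passed / samples


def make_fake_model(seed: int, failure_rate: float) -> Callable[[str], str]:
    """Hypothetical stand-in for a model call: usually right, sometimes not."""
    rng = random.Random(seed)

    def generate(prompt: str) -> str:
        # Simulate an occasional unhelpful answer.
        if rng.random() < failure_rate:
            return "I am not sure."
        return "The capital of France is Paris."

    return generate


def mentions_paris(output: str) -> bool:
    """Illustrative check: tolerant of wording, strict on the fact."""
    return "paris" in output.lower()


if __name__ == "__main__":
    model = make_fake_model(seed=0, failure_rate=0.1)
    rate = pass_rate(model, mentions_paris,
                     "What is the capital of France?", samples=200)
    # The test passes or fails on the rate, not on any one response.
    assert rate >= 0.8, f"pass rate {rate:.2f} is below the threshold"
    print(f"pass rate: {rate:.2f}")
```

The structure is still a familiar test: arrange, act, assert. The difference is that the check tolerates variation in wording and the verdict is statistical, which is the kind of judgement about thresholds and sample sizes that a quality specialist would be expected to own.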

Generalist or specialist

The honest counter-argument is that a generalist may be good enough. Will engineers be so heavily augmented by AI tooling that a generalist can apply specialist techniques under AI guidance? It is a fair question.

The answer probably depends on the size of the project. Small, simple projects are often built and run by generalist developers, and that has always worked. As with anything in engineering, when the design scales and the complexity grows, the need for specialism grows alongside it. Performance is easy to test on a system with one service doing one task. It is far harder when there are a hundred services working together with large volumes of data.

As the infrastructure supporting AI grows in scale and complexity, the specialisms have to grow with it. The job spec at the top of this article is what it looks like when an industry is still pretending that one person can cover the whole stack. The hybrid roles are what emerge once that pretence stops being practical.

That is where the AI Quality Engineer role comes in. The discipline already exists. The system being tested is new. Someone has to own the question of whether it works, and right now that question is sitting at the bottom of a job description that already asks for too much.