AI is becoming increasingly integrated into the fabric of software applications, whether through an application that lets the user interact with a model directly, an AI making orchestration decisions, or AIs augmenting human processes.
This raises a question: how will this affect the existing software world, and what types of engineering roles might be required to support these systems? In the specialised AI world, work that was once the domain of the Data Scientist and Machine Learning Engineer is already expanding to encompass Backend Engineers, Data Engineers and Quality Engineers. New roles have been defined and adopted, most notably the AI Engineer, who specialises in working with and deploying AI-based products.
An Example AI Engineer Job
As an example, below is a recent job description for an AI Engineer. Details that identify the company have been removed, and it is used for illustration only.
Responsibilities:
- Develop prompt templates and apply retrieval strategies to enhance answer precision and factual accuracy.
- Engage in error analysis, iterative model refinement, and performance optimization.
- Responsible for building, fine-tuning, and evaluating models powered by LLMs, using frameworks such as AWS Bedrock, HuggingFace, or the OpenAI API.
- Work closely with senior engineers and product managers to translate user needs into technical specifications and features.
- Contribute to the creation of data pipelines for training and testing, including necessary annotation and evaluation tools.
- Maintain clear documentation for code and workflows, adhering to best practices for code quality and reproducibility.
Let’s break it down:
This appears to cover many different specialities and might be far too broad compared with traditional software development roles. If we annotate each responsibility with the traditional roles that would normally own it, the profile looks something like this:
- Build, fine-tune and evaluate models (Machine Learning Engineer)
- Create and optimise prompts (Data Scientist, Machine Learning Engineer, Linguist)
- Work with RAG (context retrieval strategies) (Data Scientist, Data Engineer)
- Create data pipelines (Data Engineer)
- Create technical specifications for features (Systems Engineer)
- Analyse errors and optimise to refine the model and its performance (Quality Engineer, Backend Developer, Data Scientist, Machine Learning Engineer)
- Write and maintain good code and documentation (Backend Developer, Quality Engineer)
Looking at it objectively, this role demands a great deal of generalism, but it also includes specialisms that engineers and scientists could spend their whole careers developing. Code quality alone is a huge area of expertise, and context retrieval is an ever-expanding and complex field.
The Rise of Hybrid-Specialists
From this we can infer that either individual AI engineers will develop their own specialisms or, more likely, the industry will formalise specialisms that are hybrids of previous roles. For example:
- AI Backend Engineer, responsible for code quality and complex backend problems on AI systems.
- AI Data Engineer, responsible for data pipelines, databases and context retrieval.
- AI Model Engineer, responsible for fine-tuning and model optimisation.
- AI Quality Engineer, responsible for evaluation and performance of AI systems.
- …or even new role types like Conversation Engineer or AI Evaluation Engineer.
Defining the AI Quality Engineer
Since we are interested in testing and quality engineering, let’s try to define what an AI Quality Engineer role might look like and the types of responsibilities there might be.
- Responsible for the testing strategy for evaluating LLM-driven applications, using or building evaluation frameworks.
- Responsible for the identification and management of test data sets to evaluate changes in models.
- Contribute to critical analysis of design choices, prompting and context retrieval.
- Design and implement test techniques for exploratory and regression testing of LLM-based systems.
- Carry out root cause analysis and risk analysis on identified issues.
- Build benchmarks and evals.
- Lead human-in-the-loop campaigns for labelling and root cause analysis tasks.
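To make the regression testing and eval responsibilities above a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the names EvalCase, score_case, run_eval and check_regression are hypothetical, fake_model stands in for a real LLM call, and matching on required terms is a deliberately crude scoring rule. A real evaluation framework would likely use richer metrics and a much larger test set.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    """One entry in a hypothetical regression test set."""
    prompt: str
    required_terms: List[str]  # terms the answer must mention to pass


def score_case(answer: str, case: EvalCase) -> bool:
    """Pass only if every required term appears in the answer (case-insensitive)."""
    lowered = answer.lower()
    return all(term.lower() in lowered for term in case.required_terms)


def run_eval(model: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Return the fraction of cases the model passes."""
    if not cases:
        raise ValueError("the eval set is empty")
    passed = sum(score_case(model(case.prompt), case) for case in cases)
    return passed / len(cases)


def check_regression(baseline: float, candidate: float, tolerance: float = 0.0) -> bool:
    """True if the candidate pass rate is no more than `tolerance` below the baseline."""
    return candidate >= baseline - tolerance


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call (assumption: a real system would call an LLM here)."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "Who wrote Hamlet?": "I am not sure.",
    }
    return canned.get(prompt, "")


CASES = [
    EvalCase("What is the capital of France?", ["Paris"]),
    EvalCase("Who wrote Hamlet?", ["Shakespeare"]),
]

pass_rate = run_eval(fake_model, CASES)
print(f"pass rate: {pass_rate:.2f}")
```

The design choice worth noting is that the test set and the pass-rate threshold are kept separate from the model under test, so the same cases can be rerun whenever a prompt, retrieval strategy or model version changes, and the result compared against a stored baseline with check_regression.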
Generalist vs. Specialist
In the end, we might ask: is a generalist good enough? Will engineers be so heavily augmented by AI that generalists can apply specialist techniques under AI guidance? The answer likely depends on the size of the project.
Small, simple projects are often built and run by generalist developers. As with anything in engineering, when a design scales and its complexity grows, the need for specialism grows with it. Performance is easy to test on a system with one service that carries out one task. It’s far more complex when there are 100 services working together simultaneously on large amounts of data.
Therefore, as AI-supporting infrastructure grows in scale and complexity, we have to assume the specialisms will too.