Master's thesis: On the evaluation of AI-driven code and test generation agents

Background
The field of software development is being transformed by the rapid rise of AI agents. These systems are increasingly automating tasks that were traditionally performed by developers. While this automation improves efficiency, it also introduces challenges around oversight, validation, and ensuring the safety and reliability of the generated outputs. This thesis explores these challenges by focusing on the development and evaluation of AI coding agents.

Description
Coding agents powered by large language models (LLMs) have recently reached a level of maturity that enables their use in production environments. However, running large models is computationally expensive and can be impractical in many settings. A promising alternative is the use of smaller, specialized language models trained for specific tasks such as code and test generation.

This thesis will investigate the effectiveness of such small-scale models in generating both code and tests. In particular, it will evaluate how these models affect the safety, security, and reliability of generated code, and explore methodologies for assessing their performance in real-world scenarios. The thesis is planned as part of the European project HIVEMIND (Human-centred collaboratIVE MultI-ageNt framework for accelerating software Development and maintenance).

Key Responsibilities
The following tasks will be undertaken within this project:

1. Conduct a literature review of existing methods for evaluating AI-driven code/test generation.

2. Based on the literature review, select an appropriate method, or combine several methods into one, for evaluating the code/test-generating agent.

3. Build a prototype code-and-test-generating agent using small yet specialized models.

4. Evaluate the quality, safety, and security of the developed agent using the chosen methodology.

5. Document the findings and lessons learned from using the created agent, in addition to the regular thesis presentation and thesis report.

Qualifications
Candidates are expected to be enrolled in a master's program in a field related to computer science and engineering, control and mechatronics, or complex systems. Completed AI-related coursework is an advantage.

Terms
As a master's thesis candidate in this project, you will work with researchers from Dependable Transport Systems at RISE. We will provide you with the infrastructure and support to perform your thesis work. The thesis is located in Borås, and physical presence is expected to some degree. The start date is the beginning of 2026. We encourage applications from two students who want to work together on the project.

Compensation: 1,000 SEK per credit if more than one student works on the project, or 1,333 SEK per credit if one student does. In both cases, compensation is paid after project completion and approval.

We look forward to your application!
Last day of application: January 15, 2026
Contact: Behrooz Sangshoolie, +46 10 516 61 89
