The impact of artificial intelligence on software testing
Daghigh, Sajad (2025)
All rights reserved. This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2025061021891
Abstract
Artificial intelligence is growing rapidly and influencing all aspects of human life, and software development is no exception. A variety of AI-based tools have been developed for the software testing process, and investigating and understanding them is vital for optimizing their application in real-world development environments.
A comprehensive descriptive literature survey is conducted to establish the current landscape of AI applications in software testing. Principles and strategies of software testing are examined, and software testing methods are classified from different perspectives, such as testing level, the degree of knowledge about the software under test, and the method of executing the tests. Moreover, test quality evaluation methods, including various code coverage metrics and mutation analysis, are also examined.
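To make the evaluation criteria concrete, the sketch below illustrates the principle behind mutation analysis; the class, the mutated operator, and the test are hypothetical examples, not material from the thesis case study.

```java
// Minimal sketch of mutation analysis (hypothetical example).
// A mutation tool seeds small faults ("mutants") into the code; a test
// suite is considered strong if its tests fail on the mutants.

class Discount {
    // Original implementation under test.
    static double apply(double price, double rate) {
        return price - price * rate;
    }

    // Example mutant a tool might generate: '-' replaced with '+'.
    static double applyMutant(double price, double rate) {
        return price + price * rate;
    }
}

class DiscountTest {
    // This assertion passes on the original (90.0) but would fail on the
    // mutant (110.0), so the mutant is "killed" and counts toward the
    // mutation score: killed mutants / total mutants.
    @org.junit.jupiter.api.Test
    void killsOperatorMutant() {
        org.junit.jupiter.api.Assertions.assertEquals(
            90.0, Discount.apply(100.0, 0.1), 1e-9);
    }
}
```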
Various AI-based tools that are currently used in software testing are introduced and evaluated. These tools are analyzed in terms of their practical application, particularly in the context of automated test generation and evaluation. The tools are categorized into prompt-driven systems, such as those powered by large language models including GPT-4o, Claude Sonnet 3.5, and Gemini 2.5, and non-prompt-driven systems tailored for Java programming. In addition, an experimental evaluation is conducted, using a predefined Java-based software project as a case study, to assess the tools against standard testing metrics and measure their effectiveness and reliability.
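For illustration only, the fragment below shows the kind of unit test a prompt-driven tool might produce for a simple Java method when given the source code and asked to cover normal and boundary cases; the method under test and the test names are hypothetical and do not come from the case-study project.

```java
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.*;

// Hypothetical method under test, used only to illustrate
// prompt-driven test generation; not from the case-study project.
class StringUtil {
    static boolean isPalindrome(String s) {
        if (s == null) return false;
        String t = s.toLowerCase();
        return new StringBuilder(t).reverse().toString().equals(t);
    }
}

// Tests in the style an LLM-based tool typically generates when
// prompted to cover normal and boundary cases.
class StringUtilTest {
    @Test
    void acceptsSimplePalindrome() {
        assertTrue(StringUtil.isPalindrome("Level"));
    }

    @Test
    void rejectsNonPalindrome() {
        assertFalse(StringUtil.isPalindrome("Java"));
    }

    @Test
    void handlesNullAndEmptyInput() {
        assertFalse(StringUtil.isPalindrome(null));
        assertTrue(StringUtil.isPalindrome("")); // empty string reads the same reversed
    }
}
```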
The results indicate that several factors influence the performance of AI-based testing tools: the type and architecture of the AI model, contextual knowledge of the software project, and the specificity and quality of the input prompt. Key limitations and challenges are also presented, such as the need for human supervision, difficulties in the semantic comprehension of code, and restrictions in handling complex software scenarios.
By investigating these dimensions, the study offers a realistic view of the strengths and weaknesses of existing AI technologies for software testing. It also identifies opportunities for future enhancement and provides a baseline for ongoing research on smarter, more autonomous approaches to testing.