We Designed a Study to See If AI Can Imitate Real Software Bugs | HackerNoon
Briefly

The article examines how effectively large language models (LLMs) generate software mutations, particularly for Java programs. It is structured around five research questions that evaluate LLM performance in terms of cost, usability, and behavioral similarity to real bugs. The study also investigates how different prompts and LLM architectures influence these outcomes, and it identifies root causes of errors in mutation generation. This analysis aims to deepen the understanding of LLM capabilities in software-testing contexts.
Our study investigates the capabilities of existing LLMs in mutation generation, focusing on performance evaluation, prompt engineering strategies, and root cause analysis of underperforming aspects.
We design research questions that target LLM performance in mutation generation with respect to cost, usability, and behavioral similarity to real bugs, as well as the impact of different prompts and models.
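For context, a mutation in mutation testing is a small syntactic change to a program that is meant to mimic a real fault; a test suite is judged by how many such mutants it can "kill" (detect). A minimal Java sketch, with illustrative names not taken from the study:

```java
public class MutationDemo {
    // Original method under test.
    static int add(int a, int b) {
        return a + b;
    }

    // Mutant: arithmetic operator replacement (+ becomes -),
    // a classic mutation operator mimicking a typo-style bug.
    static int addMutant(int a, int b) {
        return a - b;
    }

    public static void main(String[] args) {
        // This input "kills" the mutant: original and mutant disagree.
        System.out.println(add(2, 3) == addMutant(2, 3));
        // This input fails to kill it: both return 2, so a test suite
        // using only such inputs would miss the injected fault.
        System.out.println(add(2, 0) == addMutant(2, 0));
    }
}
```

LLM-based mutation generation, as studied in the article, asks a model to propose such edits directly, rather than applying a fixed set of syntactic operators.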
Read at Hackernoon