Empirical Software Engineering
This paper presents the first systematic evaluation of ArkTS code generation with large language models. The evaluation uses 300 prompts across three difficulty levels and measures Pass@1, compilation rate, and generation time in milliseconds. It maps compiler messages into syntax, type, undefined-reference, and other failures, and adds an independent LLM judge with a fixed scoring rule. We e…
The increasing reliance on Open Source Software (OSS) in organizations’ software supply chains necessitates robust mechanisms in the intake process to ensure sourced components’ long-term viability and maintenance. Assessing OSS project health in the intake process is complex due to the wide range of socio-technical factors involved. This study aims to explore how the health of OSS projects may b…
Unit testing is essential for software reliability, yet manual test creation is time-consuming and often neglected. Although search-based software testing improves efficiency, it produces tests with poor readability and maintainability. LLMs show promise for test generation, but existing research lacks comprehensive evaluation across execution-driven assessment, reasoning-based prompting, an…
Boundary artefacts are shared artefacts that support collaboration by allowing different groups to interpret the same information in different ways. Software development activities benefit from them, as a single artefact can support stakeholders across different organisational boundaries. When these artefacts contain inconsistencies, such as incorrect information, practitioners’ trust in them may…
Software modelling is a creative yet challenging task. Modellers often find themselves lost in the process, from understanding the modelling problem to solving it with proper modelling strategies and modelling tools. Students learning modelling often get overwhelmed with the notations and tools. To teach students systematic modelling, we must investigate students’ practical modelling knowledge an…
In software development, domain models are conceptual blueprints that capture the structure, relationships, and key entities of a problem domain. Automated techniques can support analysts and developers by extracting such models from existing artifacts. However, this is a non-trivial task, especially when the input consists of informal artifacts such as user stories. This paper investigates how p…
Programmers rely on code documentation and comments to understand source code, with program comprehension tasks consuming a significant portion of development time. Despite their importance, the impact of comments on program comprehension remains debated. Our study addresses this gap by investigating the influence of comments on program comprehension. Employing a mixed-methods approach, …
While several studies have examined the security of code generated by GPT and other Large Language Models (LLMs), most have relied on controlled experiments rather than real developer interactions. This paper investigates the security of GPT-generated code extracted from the DevGPT dataset and evaluates the ability of current LLMs to detect and repair vulnerabilities in this real-world c…
Understanding how software systems evolve over time is a key challenge in software engineering. While traditional tools like GitHub offer detailed access to project history, their interface can be fragmented and cognitively demanding. Immersive environments, by contrast, offer new opportunities to visualize software evolution in ways that leverage spatial reasoning and embodied interacti…