AI for Software Engineering

My research investigates how artificial intelligence, and especially Large Language Models, can be applied to core software engineering tasks: code generation, translation, testing, verification, and trustworthiness assessment. A central thread is rigorous benchmarking: building the evaluation frameworks needed to measure what AI tools can and cannot do reliably in real engineering contexts. This work is part of a broader agenda to establish Benchmark Engineering as a formal discipline. See the dedicated Benchmark Engineering page for the full research agenda. I also write opinion and perspective pieces on what the AI revolution means for software quality, human expertise, and professional responsibility (see my blog Echoes of Saudade).

Code Generation · Code Translation · Testing & Robustness · Vulnerability Detection · Vulnerability Triage · Benchmarking · Agentic SE

Opinion & Perspectives

Thoughts, analyses, and reflections on the impact of artificial intelligence on the software engineering profession. More of my writing is available on Echoes of Saudade.

Perspective · IEEE Computer · 2025

Why We Should Trust Systems, Not Just Their AI/ML Components

Marco Vieira · IEEE Computer, Vol. 58, No. 11, pp. 84–94, 2025

The hype around trustworthy AI primarily emphasizes fairness, robustness, and explainability of models but overlooks a well-known reality: AI does not run in isolation! We call for a holistic perspective on trustworthy systems that considers the infrastructure, systemic interactions, governance, and humans in the loop.

Read at IEEE Computer

Perspective · IEEE Computer · 2025

Leveraging LLMs for Trustworthy Software Engineering: Insights and Challenges

Marco Vieira · IEEE Computer, Vol. 58, No. 7, pp. 79–90, 2025

Large language models (LLMs) are transforming software engineering by accelerating development, reducing complexity, and cutting costs. If fully integrated into the software lifecycle, they have the potential to drive design, development, and deployment. However, LLM-driven trustworthy software engineering requires addressing multiple challenges.

Read at IEEE Computer

Courses, Keynotes & Tutorials

University courses, presentations, and lectures on the intersection of generative AI and software engineering.

Research Report · Jun 2026

Identity, Ethics, and Cooperation in the Age of AI: The Adlerian Software Engineer and the Adlerian Classroom

90th Meeting of the IFIP WG10.4 · Charlotte, NC, USA

View details & presentation

Keynote · Jun 2026

Trusting the System, Not Just the Model: A Perspective on AI-Enabled Autonomous Systems

DSAS+DT4DRS @ DSN 2026 · Charlotte, NC, USA

View details & presentation

Course · Spring 2026

AI-Driven Trustworthy Software Development

University of North Carolina at Charlotte · MSc, PhD

View Course Details

Keynote · Nov 2025

Benchmarking GenAI for Software Engineering: Challenges and Insights

AISM @ ASE 2025 · Seoul, South Korea

View details & presentation

Keynote · Nov 2024

LLMs for Trustworthy Software Engineering: Insights and Challenges

LADC 2024 · Recife, PE, Brazil

View details & presentation

Research Papers & Assets

Selected works on AI and ML applied to software engineering tasks. Full list available on the publications page.

LLMKernelBench: Benchmarking Large Language Models on Software Vulnerability Detection in Linux Kernel

Arastoo Zibaeirad, Rodrigo Pato Nogueira, Marco Vieira

IEEE Trans. Reliability 2026

LLM-Based Robustness Testing of Microservice Applications: An Empirical Study

Hrushitha Tigulla, Marco Vieira

SRDS 2026

Calibration Without Comprehension: Diagnosing the Limits of Fine-Tuning LLMs for Vulnerability Detection in Systems Software

Arastoo Zibaeirad, Marco Vieira

ISSRE 2026

Unreliable in Practice? A Comprehensive Study of Errors in LLM-Generated Code

Rodrigo Nogueira, Marco Vieira, João Campos

ISSRE 2026

AutoTrace: From Patches to Triggers via Agentic Interprocedural Exploration

Arastoo Zibaeirad, Marco Vieira, Thomas Zimmermann

ISSRE 2026

LLM-Assisted CVE Triage for IoT Gateways: A Human-in-the-Loop Empirical Study

Diego Gomes, Fernando Aires, Marco Vieira

ISSRE 2026

PROBE: Benchmarking Code Generation in Large Language Models

Rodrigo Nogueira, Marco Vieira, João Campos

EMSE 2026

TestForge: Benchmarking LLM-Based Test Case Generation

Marco Vieira, Bhavain Shah, Priyam Shah

SANER 2026

Dashboard · Assets

SE Perspective on LLMs: Biases in Code Generation, Code Interpretability, and Code Security Risks

Rrezarta Krasniqi, Depeng Xu, Marco Vieira

ACM Computing Surveys 2026

Polyglot: An Extensible Framework to Benchmark Code Translation with LLMs

Marco Vieira, Bhavain Shah, Rrezarta Krasniqi

ASE 2025

Dashboard · Assets

Beyond Functional Correctness: An Empirical Evaluation of Large Language Models for Text-to-Code Generation

Rodrigo Nogueira, Marco Vieira, João Campos

ISSRE 2025

An Empirical Study of Large Language Models as Experts in Software Trustworthiness Assessment

Saeed Jananloo, José D'Abruzzo Pereira, Marco Vieira

LADC 2025
