Benchmark Engineering
Benchmarking is among the most consequential activities in computer science, yet it has never matured into a discipline of its own. For over two decades, my research has examined how benchmarks are designed, what makes them fail, and what principled foundations would look like across domains including security, dependability, and AI. This page brings that work together under a single agenda: Benchmark Engineering.
Opinion & Perspectives
Provocation, argument, and reflection on why computer science needs benchmarking as a first-class discipline.
TO BEnchmark OR NOT TO BEnchmark
From Performance to Dependability Benchmarking: A Mandatory Path
Keynotes & Tutorials
Invited keynotes and tutorials on benchmark engineering across security, dependability, and AI.
Benchmarking GenAI for Software Engineering: Challenges and Insights
Perspectives on Dependability and Security Benchmarking: TO BEnchmark OR NOT TO BEnchmark
Trustworthiness Benchmarking of (Safety) Critical Systems
Benchmarking the Security of Software Systems OR TO BEnchmark or NOT TO Benchmark
From Software Security Assessment to Security Benchmark
Benchmarking the Dependability of Computer Systems
Dependability Benchmarking of Computer Systems
Benchmarking Machine Learning-based Online Failure Prediction Models
Benchmarking the Security of Software Systems
Benchmarking the Security of Software Systems
On the Metrics for Benchmarking Vulnerability Detection Tools
Foundational Research
Over two decades of publications forming the empirical foundation of the Benchmark Engineering agenda, spanning AI evaluation, security, and dependability. Full list available on the publications page.