Testing and Continuous Delivery

Bringing automatic detection of backdoors to the CI pipeline

<p><strong>Software backdoors aren’t a myth: they’re a recurring nightmare</strong>. Time and again, we’ve watched malicious code slip into open-source ecosystems. The notorious xz compromise grabbed headlines, but it wasn’t the first act in this drama. Earlier incidents include the PHP incident in 2021, as well as backdoors in vsFTPd (CVE-2011-2523) and ProFTPD (CVE-2010-20103). And here’s the unsettling truth: these examples likely just scratch the surface. <strong>Why does it matter?</strong> Because a single backdoor in a widely used project turns into a hacker’s dream buffet, with millions of machines served up for exploitation.</p> <p><strong>Tracking down and eliminating backdoors isn’t a quick win</strong>: it means diving headfirst into sprawling code jungles. Sounds epic? In reality, even for a veteran armed with reverse-engineering gear, it’s a grueling slog. So grueling that most people simply don’t bother. The good news? <strong>New tools such as ROSA (<a href="https://github.com/binsec/rosa">https://github.com/binsec/rosa</a>) prove that large-scale backdoor detection can be automated</strong>, at least to a significant extent. Here’s the twist: traditional fuzzers like AFL++ (<a href="https://github.com/AFLplusplus/AFLplusplus">https://github.com/AFLplusplus/AFLplusplus</a>) feed programs endless input variations in order to trigger crashes. It’s brute force, but brilliant for uncovering memory-safety flaws. Backdoors, however, play by different rules: they don’t crash; they lurk behind hidden triggers and produce behavior that looks perfectly valid. ROSA changes the game by training fuzzers to tell “normal” execution apart from “backdoored” behavior.</p> <p>But there’s a catch: <strong>ROSA’s current use case is after-the-fact analysis, helping security experts vet full software releases</strong> (including binaries).
Following the <a href="https://en.wikipedia.org/wiki/Shift-left_testing">shift-left paradigm</a>, <strong>our goal is to bring this detection magic into the CI pipeline</strong>, so we can stop backdoors before they ever land. Sounds great, but reality bites: ROSA produces false alarms and can require a significant test budget to find backdoors, and both are a nightmare in CI. <strong>In this talk</strong>, we would like to explore the methodological and technical upgrades needed to build a ROSA-based backdoor detection prototype that thrives in CI environments. Think reduced resource usage and minimal noise, all within the tight time windows CI jobs demand.</p>
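<p>To make the contrast with crash-based fuzzing concrete, here is a toy sketch of a behavior-based check running under a fixed time budget. It is not ROSA's algorithm or API: <code>toy_service</code>, <code>learn_baseline</code>, <code>scan</code>, the "effects" it records, and the credentials are all hypothetical names invented for illustration. The assumption is simply that a run is suspicious when it produces an effect never seen on trusted seed inputs.</p>

```python
import itertools
import time


def toy_service(user: str, password: str) -> frozenset:
    """Hypothetical program under test; returns the effects one run produced."""
    effects = {"read_config"}
    if user == "root" and password == "letmein!":
        # Planted backdoor (illustrative): no crash, just unusual behavior.
        effects.add("spawn_shell")
    elif user == "alice" and password == "correct-horse":
        effects.add("open_session")
    else:
        effects.add("log_failure")
    return frozenset(effects)


def learn_baseline(seeds):
    """Collect every effect observed while running trusted seed inputs."""
    baseline = set()
    for user, password in seeds:
        baseline |= toy_service(user, password)
    return baseline


def scan(candidates, baseline, budget_seconds):
    """Run candidates until the budget expires; report runs with unseen effects."""
    deadline = time.monotonic() + budget_seconds
    findings = []
    for user, password in candidates:
        if time.monotonic() >= deadline:
            break  # CI budget exhausted: stop rather than block the pipeline
        unseen = toy_service(user, password) - baseline
        if unseen:
            findings.append(((user, password), sorted(unseen)))
    return findings


# Assumed-benign seeds covering both a successful and a failed login.
SEEDS = [("alice", "correct-horse"), ("alice", "wrong"), ("bob", "guess")]
USERS = ["alice", "bob", "root"]
PASSWORDS = ["wrong", "correct-horse", "letmein!", ""]

baseline = learn_baseline(SEEDS)
findings = scan(itertools.product(USERS, PASSWORDS), baseline, budget_seconds=5.0)
for inputs, unseen in findings:
    print(f"suspicious input {inputs}: unseen effects {unseen}")
```

<p>Both CI pain points show up even in this toy. If the seeds never exercise a successful login, the legitimate <code>open_session</code> effect is reported as a false alarm; if the budget is too small, the loop stops before reaching the trigger. A CI job built along these lines would presumably fail the build whenever <code>findings</code> is non-empty.</p>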

Additional information

Live Stream https://live.fosdem.org/watch/h2213
Type devroom
Language English
