GCC (GNU Toolchain)

Libgomp Optimizations for Scheduler Guided OpenMP Execution in Cloud VMs

UD6.215
Himadri CHHAYA-SHAILESH
<p>OpenMP is a widely used framework for parallelizing applications, enabling thread-level parallelism via simple source-code annotations. It follows the fork-join model and relies heavily on barrier synchronization among worker threads. Running OpenMP-enabled applications in the cloud is increasingly popular due to elasticity, fast startup, and pay-as-you-go pricing.</p>
<p>In cloud-based execution, worker threads run inside a virtual machine (VM) and are subject to two levels of scheduling: the guest scheduler places threads on virtual CPUs (vCPUs), while the host scheduler places vCPUs, which run as ordinary tasks on the host, on physical CPUs (pCPUs). Because these schedulers act independently, a semantic gap emerges that can undermine application performance. Barrier synchronization, whose efficiency depends on timely scheduling decisions, is vulnerable to this semantic gap, an issue that remains under-explored.</p>
<p>This talk presents my PhD thesis project, supervised by Julia Lawall and Jean-Pierre Lozi at Inria Paris. The thesis introduces the term Phantom vCPUs for guest vCPUs that have been preempted at the host level and remain queued on busy pCPUs, stalling progress. We show that OpenMP performance can be substantially improved inside oversubscribed cloud VMs by (1) dynamically adapting the degree of parallelism (DoP) at the start of each parallel region and (2) dynamically choosing between spinning and blocking at barriers on a per-thread, per-barrier basis. We propose paravirtualized, scheduler-informed techniques that accurately guide these decisions and demonstrate their effectiveness in realistic deployments.</p>
<p>The first contribution of this thesis is Phantom Tracker, an algorithmic solution implemented in the Linux kernel that leverages paravirtualized task scheduling to detect and quantify Phantom vCPUs accurately. The second contribution is pv-barrier-sync, a dynamic barrier synchronization mechanism driven by the scheduler insights produced by Phantom Tracker. The third and final contribution is Juunansei, an OpenMP runtime extension that demonstrates the practical utility of Phantom Tracker and pv-barrier-sync with additional optimizations.</p>
<p>The talk discusses the context and motivation of this work, followed by a brief introduction to Phantom Tracker, and then takes a deep dive into the libgomp implementation of pv-barrier-sync and Juunansei.</p>
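
The two runtime decisions named in the abstract (adapting the DoP at the start of a parallel region, and choosing between spinning and blocking at a barrier) can be illustrated with a minimal C sketch. Everything here is an assumption made for illustration: the function names `choose_dop` and `choose_wait`, their parameters, and the simple policies are hypothetical and are not taken from Phantom Tracker, pv-barrier-sync, Juunansei, or libgomp. The sketch assumes the guest somehow learns how many of its vCPUs are currently Phantom vCPUs.

```c
#include <assert.h>

/* Hypothetical wait policies for a thread arriving at a barrier. */
enum wait_policy { WAIT_SPIN, WAIT_BLOCK };

/*
 * Illustrative DoP policy (an assumption, not the thesis algorithm):
 * use only the vCPUs expected to make progress, never fewer than one
 * thread and never more than the application requested.
 */
int choose_dop(int online_vcpus, int phantom_vcpus, int requested_threads)
{
    assert(online_vcpus >= 1 && requested_threads >= 1);
    assert(phantom_vcpus >= 0 && phantom_vcpus <= online_vcpus);

    int usable = online_vcpus - phantom_vcpus;
    if (usable < 1)
        usable = 1;
    return requested_threads < usable ? requested_threads : usable;
}

/*
 * Illustrative barrier policy (also an assumption): spinning only pays
 * off if the threads we wait for are actually running, so block as soon
 * as any peer sits on a Phantom vCPU.
 */
enum wait_policy choose_wait(int phantom_peers)
{
    assert(phantom_peers >= 0);
    return phantom_peers > 0 ? WAIT_BLOCK : WAIT_SPIN;
}
```

With 8 online vCPUs of which 3 are Phantom vCPUs, `choose_dop(8, 3, 8)` yields 5 threads. In an application-level experiment, such a value could be passed to the standard OpenMP call `omp_set_num_threads()` before a parallel region, whereas the talk describes making these decisions inside the runtime itself.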

More info

Live Stream https://live.fosdem.org/watch/ud6215
Format devroom
Language English

More sessions

31.01.26
GCC (GNU Toolchain)
UD6.215
<p>Welcome to the GCC (GNU Toolchain) devroom from the organizers.</p>
31.01.26
GCC (GNU Toolchain)
Afonso Oliveira
UD6.215
<p>RISC-V now spans 100+ extensions and over a thousand instructions. Binutils, QEMU, and other projects maintain separate instruction definitions, leading to duplication, mismatches, and slower support of new features.</p> <p>UDB provides a machine-readable, validated source of truth covering most of the ISA. Our generator currently produces Binutils and QEMU definitions directly from UDB, cutting the effort for standard and custom extension bring-up. And with automated CI checks against ...
31.01.26
GCC (GNU Toolchain)
Lancelot SIX
UD6.215
<p>Version 6 of the DWARF debugging information format is still a work in progress, with many changes already accepted. This talk will focus on one fundamental change that has been accepted recently: "<a href="https://dwarfstd.org/issues/230524.1.html">Issue 230524.1</a>", also known as "Location Descriptions on the DWARF Stack".</p> <p>The compiler can emit small programs in a bytecode known as DWARF expressions that a consumer (usually a debugger) can evaluate in order to compute an object's ...
31.01.26
GCC (GNU Toolchain)
Baris Aktemur
UD6.215
<p>We present a <a href="https://github.com/intel/dwarf-evaluator">DWARF-6 expression evaluator</a> implemented in OCaml. The evaluator is concise and lightweight. It aims to help tool developers learn and understand DWARF by examining the precise definitions of DWARF operators and by running examples. We believe this will be useful in particular with the "locations on the stack" change that's coming in DWARF-6.</p> <p>The evaluator comes with test cases, which can gradually turn into a ...
31.01.26
GCC (GNU Toolchain)
Daan De Meyer
UD6.215
<p>Concurrency in PID 1 and systemd in general is a touchy subject. systemd is very trigger-happy when it comes to forking, and when combined with multithreading this causes all sorts of issues, so there's an unwritten policy to not use threads in systemd. This has led to (in my opinion) a sprawling callback hell in every daemon and CLI in the project that performs concurrent operations.</p> <p>In this presentation I'll present my view on the issues with using threads in systemd and why ...
31.01.26
GCC (GNU Toolchain)
James Lowden
UD6.215
<p>Last year the GCC COBOL runtime library added libxml2 as a dependency because COBOL defines XML parsing and generation as part of the language. Thus was born an engineering challenge and controversy. Should libxml2 become part of GCC? Should it be linked statically or dynamically? Who will be responsible for CVE reports and security updates? Who, indeed, will maintain libxml2, now that the maintainer has stepped down?</p> <p>Just what every compiler project wants on their plate on a Monday ...
31.01.26
GCC (GNU Toolchain)
Mohammad-Reza Nabipoor
UD6.215
<p>A brief introduction to the GNU Algol 68 programming language through showcasing a real-world baremetal project. We'll cover: - How to set up the GNU Algol 68 toolchain for baremetal platforms (Arm and RISC-V microcontrollers). - How to call C code to access the machine's capabilities.</p>