Software Performance

Towards unified full-stack performance analysis and automated computer system design at CERN with Adaptyst

H.1301 (Cornil)
Maks Graczyk
<p>Slow performance is often a major blocker of new visionary applications in scientific computing and related fields, whether in embedded or distributed computing. The issue is becoming ever harder to tackle because it is no longer enough to apply only algorithmic, only hardware, or only (operating) system optimisations: all of them need to be considered together.</p> <p>Architecting full-stack computer systems customised for a use case comes to the rescue, namely software-system-hardware co-design. However, doing this manually for every use case is cumbersome: the search space of possible solutions is vast, the number of programming models is substantial, and experts from several disciplines need to be involved. Moreover, the performance analysis tools often used here are fragmented, and state-of-the-art programs tend to be proprietary and incompatible with each other.</p> <p>This is why automated full-stack system design is promising, but existing solutions are few and far between and do not scale. <strong>Adaptyst is an open-source project at CERN (the world-leading particle physics laboratory) that aims to solve this problem.
It is meant to be a comprehensive, architecture-agnostic tool which:</strong></p> <ul> <li>unifies performance analysis across the entire software-hardware stack by calling state-of-the-art software and APIs under the hood, with any remaining gaps bridged by Adaptyst itself (so that performance can be inspected both macro- and microscopically regardless of workflow and platform type)</li> <li>automatically suggests the best solutions to workflow performance bottlenecks in terms of one or more of: software optimisations, hardware choices and/or customisations, and (operating) system design</li> <li>scales easily from embedded to high-performance/distributed computing, and lets anyone add support for new software/system/hardware components seamlessly thanks to its modular design</li> </ul> <p>The tool is in an early phase of development with a small workforce and is concentrating on profiling at the moment. Given Adaptyst's broad application potential and our wish for it to benefit everyone, we are building an open-source community around the project.</p> <p>This talk is an invitation to join us: we will explain the performance problems we face at CERN, describe in detail what Adaptyst is and how you can get involved, and demonstrate the current version of the project on CPU and CUDA examples.</p> <p>Project website: https://adaptyst.web.cern.ch</p>

Additional information

Live Stream https://live.fosdem.org/watch/h1301
Type devroom
Language English

More sessions

2/1/26
Software Performance
Alexander Zaitsev
H.1301 (Cornil)
<p>Nowadays, in the software industry, we already have many ways to improve the performance of our applications: compilers become better at optimisation each year, and we have plenty of tools like Linux perf and Intel VTune to analyse performance. Even algorithms are still improving in various domains! But how many of these improvements are actually adopted in the industry, and how difficult is it to adopt them in reality? That's an interesting question!</p> <p>In this talk, I ...
2/1/26
Software Performance
Yash Panchal
H.1301 (Cornil)
<p>Relying only on nvidia-smi is like measuring highway usage by checking if any car is present, not how many lanes are full.</p> <p>This talk reveals the metrics nvidia-smi doesn't show and introduces open source tools that expose actual GPU efficiency metrics.</p> <p>We'll cover:</p> <ol> <li>Why GPU utilization is not the same as GPU efficiency.</li> <li>Deep dive into relevant key metrics: SM metrics, Tensor Core metrics, and memory metrics explained.</li> <li>Practical GPU profiling and monitoring ...
2/1/26
Software Performance
Kenneth Hoste
H.1301 (Cornil)
<p>In scientific computing on supercomputers, performance should be king. Today’s rapidly diversifying High-Performance Computing (HPC) landscape makes this increasingly difficult to achieve, however...</p> <p>Modern supercomputers rely heavily on open source software, from a Linux-based operating system to scientific applications and their vast dependency stacks. A decade ago, HPC systems were relatively homogeneous: Intel CPUs, a fast interconnect like InfiniBand, and a shared filesystem. ...
2/1/26
Software Performance
H.1301 (Cornil)
<p>Reliable performance measurement remains an unsolved problem across most open source projects. Benchmarks are often an afterthought, and when they aren't, they can be noisy, non-repeatable, and hard to act on.</p> <p>This talk shares lessons learned from building a large-scale benchmarking system at Datadog and shows how small fixes can make a big difference: controlling environmental noise, designing benchmarks, interpreting results with sound statistical methods, and more.</p> <p>Attendees ...
2/1/26
Software Performance
H.1301 (Cornil)
<p><a href="https://www.mercurial-scm.org/">Mercurial</a> is a distributed version control system whose codebase combines Python, C and Rust. Over its twenty years of development, significant effort has been put into its scaling and overall performance.</p> <p>In the recent 7.2 version, the performance of exchanging data between repositories (e.g. <code>push</code> and <code>pull</code>) has been significantly improved, with some of our most complicated benchmark cases moving from almost four ...
2/1/26
Software Performance
Gábor Szárnyas
H.1301 (Cornil)
<p>Database vendors often engage in fierce competition on system performance – in the 1980s, they even had their "benchmark wars". The creation of the TPC, a non-profit organization that defines standard benchmarks and supervises their use through rigorous audits, spelled the end of the benchmark wars and helped drive innovation on performance in relational database management systems.</p> <p>TPC served as a model for defining database benchmarks, including the Linked Data Benchmark Council ...
2/1/26
Software Performance
Henrik Ingo
H.1301 (Cornil)
<p>In the past 30 years we've moved from manual QA testing of release candidates to Continuous Integration and even Continuous Deployment. But while most software projects excel at testing correctness, the level of automation of performance testing is still near zero. And while it's a given that each developer writes tests for their own code, Performance Engineering remains the domain of individual experts or separate teams, who benchmark the product with custom tools developed in house, often ...