Miscellaneous

SaBRe: Load-time selective binary rewriting

K.1.105 (La Fontaine)
Paul-Antoine Arras
Abstract
Binary rewriting is a technique that consists of disassembling a program to modify its instructions, with many applications, e.g. monitoring, debugging, reverse engineering and reliability. However, existing solutions suffer from well-known shortcomings in terms of soundness, performance and usability. We present SaBRe, a novel load-time framework for selective binary rewriting. SaBRe rewrites specific constructs of interest (mainly system calls and function prologues) when the program is loaded into memory. This enables users to intercept those constructs at runtime via a modular architecture allowing custom plugins to be linked with SaBRe using a simple and flexible API. We also discuss the theoretical underpinnings of disassembling and rewriting, including conditions for coverage, accuracy, and correctness, and how they affect SaBRe. We developed two backends for SaBRe, one for x86_64 and one for RISC-V, which were in turn used to implement two open-source plugins: a fast system call tracer and a fault injector. Our evaluation shows that SaBRe imposes little performance overhead, between 0.2% and 4.3% on average. In addition to explaining the architecture of SaBRe and demonstrating its performance, we also show on a concrete example how easy it is to create a new plugin for SaBRe. SaBRe is free open-source software released under the GPLv3 license and originally developed as part of the Software Reliability Group at Imperial College London.
Introduction
The goal of binary rewriting is to add, delete and replace instructions in binary code. There are two main types of binary rewriting techniques: static and dynamic. In static binary rewriting, the binary file is statically rewritten on disk, while in dynamic binary rewriting it is rewritten in memory, as the program executes.

Static binary rewriting has the advantage that the rewriting process does not incur any overhead during execution, as it is performed before the program starts running. However, static binary rewriting is hard to get right: creating a valid modified executable on disk is challenging, and correctly identifying all the code in the program is error-prone in the presence of variable-length instructions and indirect jumps.

By contrast, dynamic binary rewriting modifies the code in memory, during program execution. This is typically accomplished by translating one basic block at a time and caching the results, with branch instructions modified to point to already translated code. Since translation is done at runtime, when the instructions are issued and the targets of indirect branches are already resolved, dynamic binary rewriting does not encounter the challenges discussed above for static binary rewriting. However, the translation is heavyweight and incurs a large runtime overhead.

In this presentation, we introduce SaBRe, a system that implements a novel design point for binary rewriting. Unlike prior techniques, SaBRe operates at load-time, after the program is loaded into memory, but before it starts execution. Like static binary rewriting techniques, SaBRe rewrites the code in-place, but the translation is done in memory, as for dynamic binary rewriting. To achieve a high level of both performance and reliability, SaBRe relies by default on trampolines, which are extremely efficient and can be used more than 99.99% of the time, and only falls back on illegal instructions triggering a signal handler for pathological cases.
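The choice between a trampoline and the illegal-instruction fallback can be illustrated with a toy model. The sketch below is not SaBRe's algorithm: the `Insn` type, the `relocatable` flag and `plan_rewrite` are hypothetical names, and the rule of relocating only the instructions that follow the rewritten one is a simplifying assumption. It relies only on two x86-64 encoding facts: a `jmp rel32` occupies 5 bytes, while `syscall` (0F 05) and `UD2` (0F 0B) occupy 2 bytes each, so a jump to a trampoline fits only if neighbouring instructions can be moved out of the way.

```python
from dataclasses import dataclass
from typing import List, Tuple

JMP_REL32_LEN = 5      # x86-64 `jmp rel32`: opcode E9 plus a 32-bit displacement
SYSCALL = b"\x0f\x05"  # x86-64 `syscall`, 2 bytes
UD2 = b"\x0f\x0b"      # x86-64 `ud2`, 2 bytes, always raises an invalid-opcode fault


@dataclass
class Insn:
    raw: bytes         # encoded bytes of one already-decoded instruction
    relocatable: bool  # hypothetical flag: safe to move into a trampoline


def plan_rewrite(insns: List[Insn], idx: int) -> Tuple[str, int]:
    """Decide how to patch insns[idx] in place (toy model, assumed policy).

    Returns ("trampoline", n) when the target plus n following relocatable
    instructions free at least 5 bytes for a jump, or ("illegal", 0) when
    only a 2-byte illegal instruction fits and a signal handler must take over.
    """
    covered = len(insns[idx].raw)
    moved = 0
    nxt = idx + 1
    # Absorb following instructions until a 5-byte jump fits.
    while covered < JMP_REL32_LEN and nxt < len(insns) and insns[nxt].relocatable:
        covered += len(insns[nxt].raw)
        moved += 1
        nxt += 1
    if covered >= JMP_REL32_LEN:
        return ("trampoline", moved)
    if len(insns[idx].raw) >= len(UD2):
        return ("illegal", 0)
    raise ValueError("instruction too short to patch in this toy model")


# mov eax, 0x27 ; syscall ; mov rbx, rax ; ret
common_case = [
    Insn(b"\xb8\x27\x00\x00\x00", True),
    Insn(SYSCALL, True),
    Insn(b"\x48\x89\xc3", True),
    Insn(b"\xc3", True),
]
# Same syscall, but the next instruction must stay where it is
# (for example because it is a branch target).
pathological_case = [Insn(SYSCALL, True), Insn(b"\x48\x89\xc3", False)]
```

In the first sequence the `syscall` and the 3-byte `mov` after it together free exactly 5 bytes, so `plan_rewrite(common_case, 1)` selects a trampoline; in the second, nothing can be moved and the model falls back to the illegal instruction.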
The main limitation of SaBRe is that it is designed to rewrite only certain types of constructs, namely system calls (including vDSO), function prologues and some architecture-specific instructions (e.g. RDTSC in x86). However, as we illustrate later on, this is enough to support a variety of tasks, with much lower overhead than with dynamic binary rewriting and without incurring the precision limitations of static binary rewriting.

We implemented two binary rewriters based on this design: one for x86_64 and one for RISC-V code. Both rewriters feature a flexible API, which we used to implement three different plugins: a fast system call tracer, a multi-version execution system (not open-sourced yet) and a fault injector.

In summary, our main contributions are:
1. A new design point for selective binary rewriting which translates code in memory in-place at load time, before the program starts execution.
2. An implementation of this approach for two architectures, one for x86_64 and the other for RISC-V.
3. A comprehensive evaluation using two open-source plugins: a fast strace-like system call tracer and a fault injector.
4. An extremely simple API that can be leveraged by users to implement and integrate their own plugins.
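To give a feel for what a plugin built on such an interception API does, here is a minimal sketch in Python. It does not reproduce SaBRe's actual API, which is not shown in this abstract: `Tracer`, `FaultInjector`, `on_syscall` and `intercept` are hypothetical names, and `fake_write` stands in for a real system call. The only convention borrowed from Linux is that a raw system call reports failure as a negative errno value.

```python
import errno


class Tracer:
    """Hypothetical strace-like plugin: forwards each call and records it."""

    def __init__(self):
        self.log = []

    def on_syscall(self, name, args, forward):
        result = forward(*args)
        self.log.append((name, args, result))
        return result


class FaultInjector:
    """Hypothetical plugin: makes selected calls fail without executing them."""

    def __init__(self, failing, err=errno.EIO):
        self.failing = set(failing)
        self.err = err

    def on_syscall(self, name, args, forward):
        if name in self.failing:
            return -self.err  # negative errno, as a raw Linux syscall would return
        return forward(*args)


def intercept(plugin, name, real_impl):
    """Stand-in for a rewritten call site: route every call through the plugin."""
    def wrapper(*args):
        return plugin.on_syscall(name, args, real_impl)
    return wrapper


def fake_write(fd, data):
    """Stand-in for write(2): pretend every byte was written."""
    return len(data)


tracer = Tracer()
traced_write = intercept(tracer, "write", fake_write)
faulty_write = intercept(FaultInjector({"write"}), "write", fake_write)
```

Calling `traced_write(1, b"hi")` returns 2 and appends one entry to `tracer.log`, while `faulty_write(1, b"hi")` returns `-errno.EIO` without ever reaching `fake_write`. Both plugins share the same single entry point, which is the property the abstract attributes to SaBRe's plugin architecture.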

Additional information

Type maintrack
