Containers

Linux memory management at scale

Building the future of kernel resource management
UD2.208 (Decroly)
Chris Down
Memory management is an extraordinarily complex and widely misunderstood topic. It is also one of the most fundamental concepts to understand in order to produce coherent, stable, and efficient systems and containers, especially at scale. In this talk, we will go over how to compose reliable memory heavy, multi container systems that can withstand production incidents, and go over examples of how Facebook is achieving this in production at the cutting edge. We'll also go over the open-source technologies we're building to make this work at scale in a density that has never been achieved before. We will go over widely-misunderstood Linux memory management concepts which are important to site reliability and container management with an engineer who works on the Linux kernel's memory subsystem, busting commonly held misconceptions about things like swap and memory constraints, and giving advice on key and bleeding-edge kernel concepts like PSI, cgroup v2, memory protection, and other important container-related topics along the way.

Additional information

Type devroom

More sessions

2/1/20
Containers
Sascha Grunert
UD2.208 (Decroly)
Podman is the container management tool of your choice when it comes to boosting day-to-day development tasks around containers. The journey of Podman started as a drop-in replacement for docker, but nowadays it’s even more than just that. For example, Podman is capable of managing pods, running containers without being root and supports fine granular configuration possibilities.
2/1/20
Containers
Akihiro Suda
UD2.208 (Decroly)
The biggest problem of the OCI Image Spec is that a container cannot be started until all the tarball layers are downloaded, even though more than 90% of the tarball contents are often unneeded for the actual workload. This session will show state-of-the-art alternative image formats, which allow runtime implementations to start a container without waiting for all its image contents to be locally available. Especially, this session will put focus on CRFS/stargz and its implementation status in ...
2/1/20
Containers
Daniel Borkmann
UD2.208 (Decroly)
BPF as a foundational technology in the Linux kernel provides a powerful tool for systems developers and users to dynamically reprogram and customize the kernel to meet their needs in order to solve real-world problems and without having to be a kernel expert. Thanks to BPF we have come to the point to overcome having to carry legacy accumulated over decades of development grounded in a more traditional networking environment that is typically far more static than your average Kubernetes ...
2/1/20
Containers
Ralf Haferkamp
UD2.208 (Decroly)
Kata Containers provide a secure container runtime offering an experience close to that of native containers, while providing stronger workload isolation and host infrastructure security by using hardware virtualization technology. This is particularly useful when containers are used to host and run third-party applications. In this presentation, after a short intro to Kata, we will demonstrate how easy it is to install and use on openSUSE. We will show it in action both as part of a podman ...
2/1/20
Containers
Laurent Bernaille
UD2.208 (Decroly)
Kube-proxy enables access to Kubernetes services (virtual IPs backed by pods) by configuring client-side load-balancing on nodes. The first implementation relied on a userspace proxy which was not very performant. The second implementation used iptables and is still the one used in most Kubernetes clusters. Recently, the community introduced an alternative based on IPVS. This talk will start with a description of the different modes and how they work. It will then focus on the IPVS ...
2/1/20
Containers
Adrian Reber
UD2.208 (Decroly)
The difficult task to checkpoint and restore a process is used in many container runtimes to implement container live migration. This talk will give details how CRIU is able to checkpoint and restore processes, how it is integrated in different container runtimes and which optimizations CRIU offers to decrease the downtime during container migration. In this talk I want to provide details how CRIU checkpoints and restores a process. Starting from ptrace() to pause the process, how parasite code ...
2/1/20
Containers
Christian Brauner
UD2.208 (Decroly)
Recently the kernel landed seccomp support for SECCOMPRETUSER_NOTIF which enables a process (supervisee) to retrieve a fd for its seccomp filter. This fd can then be handed to another (usually more privileged) process (supervisor). The supervisor will then be able to receive seccomp messages about the syscalls having been performed by the supervisee. We have integrated this feature into userspace and currently make heavy use of this to intercept mknod(), mount(), and other syscalls in user ...