Packet processing in software is currently undergoing a huge paradigm shift. Connection speeds of 10 Gbit/s and beyond created new problems and operating systems couldn't keep keep up. Hence, there has been a rise of frameworks and libraries working around the kernel, sometimes referred to as kernel bypass or zero copy (the latter is a misnomer). Examples are DPDK, Snabb, netmap, XDP, pf_ring, and pfq. The first part of the talk looks at the background and performance of the kernel network stack and what changes with these new frameworks. They break with all traditional APIs and present new paradigms. For example, they usually provide an application exclusive access to a network interface and exchange raw packets with the app. There are no sockets, they don't even offer a protocol stack. Hence, they are mostly used for low-level packet processing apps: routers, (virtual) switches, firewalls, and annoying middleboxes "optimizing" your connection.
It's now feasible to write quick prototypes of packet processing and forwarding apps that were restricted to dedicated hardware in the past, enabling everyone to build and test high-speed networking equipment with a low budget. These concepts are slowly creeping into operating systems and software routers/switches: FreeBSD ships with netmap today, XDP is coming to Linux, Open vSwitch can be compiled with a DPDK backend, pfSense is adopting DPDK as well, ... We need to look at the architecture of these frameworks to better understand what is coming for us. Most of these frameworks build on the original drivers that have been growing in complexity: a typical driver for a 10 or 40 Gbit/s NIC is in the order of 50,000 lines of code nowadays.
Hundreds of thousands of lines of code are involved when handling a packet in a typical operating system, and tens of thousands when using one of these new frameworks. Reading and understanding so much code is quite tedious, so the obvious question is: How hard can it be to implement a driver for a modern 10 Gbit/s NIC from scratch while ignoring all of the existing software layers? Turns out that it's not very hard: I've written ixy, a user space driver for 10 Gbit/s NICs from the Intel 82599 family (X520, X540, X550) from scratch in about 1000 lines of C code. The second part of the talk focuses on user space drivers and the Intel 82599 architecture as it is easy to understand, has a great datasheet, and the core functionality is in the driver as opposed to a magic black-box firmware.
ixy is a full user space driver: you get your raw packets delivered directly into your application and the operating system doesn't even know the NIC exists. User space drivers are also very hackable, you get direct access to the full hardware in your application in user space making it really easy to test out new features, no pesky kernel code needed. This is why it's important to have a simple driver like ixy: for hacking and educational purposes. Core functionality of the driver such as handling DMA buffers is never far away when writing an ixy app: you typically only need to look beneath one layer to see the guts of the driver. For example, when you send out a packet you call a transmit function that directly modifies a ring buffer of DMA descriptors.
Check out the code of ixy on GitHub!