Session
Schedule FOSDEM 2020
RISC-V

ERASER: Early-stage Reliability And Security Estimation for RISC-V

An open source framework for resilience/security evaluation and validation in RISC-V processors
K.3.401
Karthik Swaminathan
RISC-V processors have gained acceptance across a wide range of computing domains, from IoT to embedded/mobile class and even in server-class processing systems. In processing systems ranging from connected cars and autonomous vehicles, to those on-board satellites and spacecrafts, these processors are targeted to function in safety-critical systems, where Reliability, Availability and Serviceability (RAS)-based considerations are of paramount importance. Along with potential system vulnerabilities caused primarily due to random errors, these processors may also be sensitive to targeted errors, possibly from malicious entities, which raises serious concerns regarding the security and safety of the processing system. Consequently, such systems necessitate the incorporation of RAS-based considerations right from an early stage of processor design. While the hardware and software ecosystem around RISC-V has been steadily maturing, there have, however, been limited developments in early stage reliability-aware design and verification. The Early-stage Reliability And Security Estimation for RISC-V (ERASER) tool attempts to address this shortcoming. It consists of an open source framework aimed at providing directions to incorporate such reliability and security features at an early, pre-silicon stage of design. These features may include what kind of protection to be applied and which components within the processor should they be applied to. The proposed infrastructure comprises of an open source toolchain for early stage modeling of latch vulnerability in a RISC-V core (SERMiner [1]), a tool for automated generation of stress marks that maximize the likelihood of a transient-failure induced error (Microprobe (RISC-V) [2]), and verification by means of statistical and/or targeted fault injection (Chiffre [3]). While the infrastructure is targeted towards any core that uses the RISC-V ISA, the repository provides an end-to-end flow for the Rocket core [4]. ERASER thus evaluates “RAS-readiness”, or the effectiveness of protection techniques in processor design such that processor vulnerability in terms of Failures-In-time (FIT) rate is minimized, for a specified power/performance overhead. FIT rate is defined as the number of failures in one billion hours of operation and is a standard vulnerability metric used in industry. ERASER is an open source tool available for download at https://github.com/IBM/eraser. The tool currently supports analysis of all latches in the design across a single Rocket core and the generation of stressmarks that can be used to evaluate the vulnerability of these latches. In addition to radiation-induced soft errors, we plan to extend ERASER to also model errors due to voltage noise, thermal and aging-induced failures, both in memory and logic, and generate representative stressmarks. ERASER is an initial effort to devise a comprehensive methodology for RAS analysis, particularly for open-source hardware, with the hope that it spurs further research and development into reliability-aware design both in industry and academia. References: K. Swaminathan, R. Bertran, H. Jacobson, P. Kudva, P. Bose, ‘Generation of Stressmarks for Early-stage Soft-error Modeling’, International Conference on Dependable Systems and Networks (DSN) 2019 S. Eldridge R. Bertran, A. Buyuktosunoglu, P. Bose, ‘MicroProbe: An Open Source Microbenchmark Generator, ported to the RISC-V ISA, the 7th RISC-V workshop, 2017 S. Eldridge, A. Buyuktosunoglu and P. Bose, ‘Chiffre A Configurable Hardware Fault Injection Framework for RISC-V Systems’ 2nd Workshop on Computer Architecture Research with RISC-V (CARRV), 2018 Krste Asanović, Rimas Avižienis, Jonathan Bachrach, Scott Beamer, David Biancolin, Christopher Celio, Henry Cook, Palmer Dabbelt, John Hauser, Adam Izraelevitz, Sagar Karandikar, Benjamin Keller, Donggyu Kim, John Koenig, Yunsup Lee, Eric Love, Martin Maas, Albert Magyar, Howard Mao, Miquel Moreto, Albert Ou, David Patterson, Brian Richards, Colin Schmidt, Stephen Twigg, Huy Vo, and Andrew Waterman, The Rocket Chip Generator, Technical Report UCB/EECS-2016-17, EECS Department, University of California, Berkeley, April 2016

The attached figure shows a representative flow for the RAS estimation methodology. An initial characterization of all instructions in the RISC-V ISA is carried out via RTL simulation using an existing core model (eg. the Rocket core). The simulation is configured to generate VCD (Value- Change Dump) files for every single instruction testcase. The SERMiner tool parses these VCD files to determine latch activities across the core, aggregated at a macro (or RTL module) level. Based on these per-instruction latch activities, SERMiner outputs an instruction sequence, which forms the basis of the SER stressmark to be generated by Microprobe (RISC-V). Microprobe (RISC-V) is a microbenchmark generation tool that is capable of generating microbenchmarks geared towards specific architecture and micro-architecture level characterization. One of its key applications is in the generation of stressmarks, or viruses, that target various worst-case corners of processor operation. These stressmarks may be targeted at maximizing power, voltage noise, temperature, or soft-error vulnerability as in case of this tool. The generated stressmark is then used to generate a list of latches that show a high residency and hence a high SER vulnerability. These latches are the focus of fault injection-based validation experiments using the Chiffre tool. Chiffre provides a framework for automatically instrumenting a hardware design with run-time configurable fault injectors. The vulnerable latches obtained from running the generated stressmarks through the Rocket core model, and then through SERMiner, are earmarked for targeted fault injection experiments using Chiffre. The objective of these experiments is to further prune the list of vulnerable latches by eliminating those that are derated, that is, they do not affect the overall output even when a fault is injected in them. Focusing any and all protection strategies on this final list of latches would maximize RAS coverage across the entire core.

Ongoing and future work:

ERASER currently only supports analysis of all latches in the design across a single Rocket core and the generated stressmarks can be used to evaluate the vulnerability of these latches. Most on-chip memory structures such as register files and caches, are equipped with parity/ECC protection and are as such protected against most radiation-induced soft errors. However, they are still vulnerable to supply voltage noise, thermal and aging-induced failures, and other hard or permanent errors. We plan to extend ERASER to model such errors, both in memory and logic, and generate stressmarks representative of worst-case thermal emergencies and voltage noise, in addition to soft errors.

Additional information

Type devroom

More sessions

2/1/20
RISC-V
Greg Chadwick
K.3.401
Ibex implements RISC-V 32-bit I/E MC M-Mode, U-Mode and PMP. It uses an in order 2 stage pipe and is best suited for area and power sensitive rather than high performance applications. However there is scope for meaningful performance gains without major impact to power or area. This talk describes work done at lowRISC to analyse and improve the performance of Ibex. The RTL of an Ibex system is simulated using Verilator to run CoreMark and Embench and the traces analysed to identify the major ...
2/1/20
RISC-V
Dan Petrisko
K.3.401
BlackParrot is a Linux-capable, cache-coherent RISC-V multicore, designed for efficiency and ease of use. In this talk, we will provide an architectural overview of BlackParrot, focusing on the design principles and development process as well as the software and hardware ecosystems surrounding the core. We will also discuss the project roadmap and our plans to engage the open-source community. Last, we will demonstrate a multithreaded RISC-V program running on top of Linux on a multicore ...
2/1/20
RISC-V
K.3.401
HammerBlade is an open source RISC-V manycore that has been under development since 2015 and has already been silicon validated with a 511-core chip in 16nm TSMC. It features extensions to the RISC-V ISA that target GPU-competitive performance for parallel programs (i.e. GPGPU) including graphs and ML workloads. In this talk we will overview the components of the HW and software ecosystem in the latest version, and show you how to get up and running as an open source user or contributor in ...
2/1/20
RISC-V
K.3.401
ESP is an open-source research platform for RISC-V systems-on-chip that integrate many hardware accelerators. ESP provides a vertically integrated design flow from software development and hardware integration to full-system prototyping on FPGA. For application developers, ESP offers domain-specific automated solutions to synthesize new accelerators for their software and map it onto the heterogeneous SoC architecture. For hardware engineers, ESP offers automated solutions to integrate their ...
2/1/20
RISC-V
Schuyler Eldridge
K.3.401
The burgeoning RISC-V hardware ecosystem includes a number of microprocessor implementations [1, 3] and SoC generation frameworks [1, 2, 7]. However, while accelerator “sockets” have been defined and used (e.g., Rocket Chip’s custom coprocessor/RoCC), accelerators require additional collateral to be generated like structured metadata descriptions, hardware wrappers, and device drivers. Requiring manual effort to generate this collateral proves both time consuming and error prone and is at ...
2/1/20
RISC-V
K.3.401
RISC-V application, OS, and firmware development has been slowed by the lack of "real hardware" available for developers to work with. With the rise of FPGAs in the cloud and the recent release of the OpenPiton+Ariane manycore platform on Amazon's F1 cloud FPGA platform, we propose using 1-12 core OpenPiton+Ariane processors emulated on F1 to develop RISC-V software and firmware. In this talk, we will give an accelerated tutorial on how to get started with OpenPiton+Ariane, the spec-compliant ...
2/1/20
RISC-V
Ofer Shinaar
K.3.401
We would like to present and overlay technique for RISCV, develop by WD and open sourced. This FW feature acts as a software “paging” manager. It is to be threaded with the Real-Time code and to the toolchain. Cacheable Overlay Manager RISC-V (ComRV), a technique which fits limited memory embedded devices (as IoT’s), and does not need any HW support.