HPC, Big Data and Data Science
EESSI: One Scientific Software Stack to Rule Them All
The European Environment for Scientific Software Installations (EESSI, pronounced “easy”) is a collaboration between HPC sites and industry partners. Its common goal is to set up a shared repository of scientific software installations that can be used on a variety of systems, regardless of the flavor or version of the Linux distribution, the processor architecture, or whether the system is a full-size HPC cluster, a cloud environment, or a personal workstation.
The EESSI codebase (https://github.com/eessi) is open source and relies heavily on other open-source software, including Ansible, archspec, CernVM-FS, Cluster-in-the-Cloud, EasyBuild, Gentoo Prefix, Lmod, ReFrame, Singularity, and Terraform.
The concept of the EESSI project was inspired by the Compute Canada software stack, and consists of three main layers:
- a filesystem layer leveraging the established CernVM-FS technology, to globally distribute the EESSI software stack;
- a compatibility layer using Gentoo Prefix, to ensure compatibility with different client operating systems (different Linux distributions, macOS, Windows Subsystem for Linux);
- a software layer, hosting optimized installations of scientific software along with required dependencies, which were built for different processor architectures, and where archspec, EasyBuild and Lmod are leveraged.
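As a rough illustration of how the three layers fit together, the sketch below composes the directories a client might look in. Every name here (the repository name, the version string, the directory layout, and both helper functions) is a hypothetical placeholder chosen for illustration and not the project's actual layout; only the `/cvmfs` mount prefix is standard for CernVM-FS.

```python
from pathlib import PurePosixPath

# Hypothetical placeholders: the repository name, version and directory layout
# below are assumptions made for this sketch only.
REPO_ROOT = PurePosixPath("/cvmfs/example.repo.org")  # filesystem layer (CernVM-FS mount)
STACK_VERSION = "pilot"


def compat_layer_dir(os_family, cpu_family):
    """Compatibility layer: one tree per OS family and CPU family."""
    return REPO_ROOT / STACK_VERSION / "compat" / os_family / cpu_family


def software_layer_dir(os_family, cpu_family, microarch):
    """Software layer: one tree per CPU microarchitecture the stack was built for."""
    return REPO_ROOT / STACK_VERSION / "software" / os_family / cpu_family / microarch


print(compat_layer_dir("linux", "x86_64"))
print(software_layer_dir("linux", "x86_64", "haswell"))
```

The point of the split is that the compatibility layer varies only with the coarse CPU family, while the software layer varies with the specific microarchitecture it was optimized for.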
We use Ansible to automate the deployment of the EESSI software stack. Terraform creates the cloud instances that we use for development, building software, and testing. We also employ ReFrame to test the different layers of the EESSI project and the scientific software installations it provides. Finally, we use Singularity containers to obtain clean software build environments and to provide easy access to our software stack, for instance on machines without a native CernVM-FS client.
In this talk, we will present how the EESSI project grew out of a need for more collaboration to tackle the challenges posed by the changing landscape of scientific software and HPC system architectures. We will explain the project structure in more detail, covering the motivation for the layered approach, the choice of tools, and the lessons learned from the work done by Compute Canada. We will also outline our goals and how we plan to achieve them.
Finally, we will demonstrate the current pilot version of the project and give you a feel for its potential impact.
Here we give a more extensive overview of the free and open-source software that EESSI depends on, and how each tool is used in the project.
Ansible (https://www.ansible.com/) is a tool for automation and configuration management. We use Ansible for automating the deployment of the EESSI software stack. This includes, for instance, the installation and configuration of all CernVM-FS components, installing Gentoo Prefix on different CPU architectures, and adding our packages and customizations to the Gentoo Prefix installation.
Archspec (https://github.com/archspec/archspec) is a Python library for detecting, querying, and comparing the architecture of a system. In EESSI it is used to find the CPU type of the host system and the software stack in the repository that best matches the host CPU microarchitecture. In the future, we will also use the library to do the same for GPUs.
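The matching step can be sketched as follows. This is a minimal, standard-library-only illustration and does not call archspec itself: the `PARENT` table is a hypothetical, heavily simplified stand-in for the microarchitecture ancestry data that archspec provides, and `lineage` and `best_match` are names invented for this sketch. The idea is to walk from the host's microarchitecture towards more generic ancestors and pick the first one for which a software stack exists.

```python
# Hypothetical, simplified ancestry table (child -> parent). The real data
# comes from archspec; these few entries are assumptions for illustration.
PARENT = {
    "x86_64": None,
    "haswell": "x86_64",
    "skylake_avx512": "haswell",
    "zen2": "x86_64",
    "aarch64": None,
    "graviton2": "aarch64",
}


def lineage(name):
    """Return the microarchitecture followed by its ancestors, most specific first."""
    chain = []
    while name is not None:
        chain.append(name)
        name = PARENT[name]
    return chain


def best_match(host, available):
    """Pick the most specific available stack that the host CPU can run.

    Returns None if the host is unknown or no compatible stack exists.
    """
    if host not in PARENT:
        return None
    for candidate in lineage(host):
        if candidate in available:
            return candidate
    return None


stacks = {"x86_64", "haswell", "zen2"}
print(best_match("skylake_avx512", stacks))
print(best_match("graviton2", stacks))
```

A host with no exact match falls back to the closest ancestor that has a build, so a generic build for each CPU family acts as a safety net for CPUs without a dedicated optimized stack.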
CernVM-FS (https://cernvm.cern.ch/fs/) is a software distribution service that provides a scalable, read-only, globally distributed filesystem. Clients can mount this filesystem over HTTP. We use CernVM-FS to make the scientific software stack available to any client around the world.
Cluster-in-the-Cloud (https://cluster-in-the-cloud.readthedocs.io/) is a tool that allows you to easily set up a scalable and heterogeneous cluster in the cloud. We leverage this tool to automate software builds on specific architectures, and to test the software installations.
EasyBuild (https://easybuilders.github.io/easybuild/) is an installation tool for scientific software, currently supporting over 2,000 packages. By default, EasyBuild optimizes the software for the build host system. We use EasyBuild to install all the different scientific software that we want to include in our stack, and for all the different architectures that we want to support.
Gentoo Prefix (https://wiki.gentoo.org/wiki/Project:Prefix) is a Linux distribution that is built from source and can be installed in a given path (the “prefix”). It supports many different architectures, including x86_64, Arm64, and POWER, and can be used on both Linux and macOS systems.
Lmod (https://www.tacc.utexas.edu/research-development/tacc-projects/lmod) is an environment modules tool written in Lua, which is used on many different HPC systems to give users intuitive access to software installations. It also allows multiple versions of the same software to be installed side by side. The EESSI software stack includes an installation of Lmod and environment module files for each scientific software application and its dependencies, giving end users easy access to those installations.
ReFrame (https://reframe-hpc.readthedocs.io/) is a high-level regression testing framework for HPC systems. EESSI will be using ReFrame to implement tests (written in Python) for verifying the correctness of the different layers of our project, and doing performance checks of software installations.
Singularity (https://sylabs.io/singularity/) is a container platform that was created to run complex applications on HPC systems. We use it to set up isolated build environments on different kinds of systems, without requiring root privileges. Furthermore, we use it to provide clients with a way to easily access our repository, without having to install a CernVM-FS client.
Terraform (https://www.terraform.io/) is a tool that enables you to easily set up cloud instances on demand. We use it for exactly that, for instance to create build machines.