Infra Management

AMENDMENT Designing for Failure

Fault Injection, Circuit Breakers and Fast Recovery
UD2.120 (Chavanne)
Walter Heck
While we all work very hard to build highly available, fault-tolerant and resilient applications and infrastructures, the end goal these days is often something along the lines of a loosely coupled, microservices-style architecture designed with zero downtime in mind. Upgrades are tied to CI/CD pipelines and we should be sipping piña coladas on the beach. Time to unleash the Chaos Monkey, because that is what Netflix does, so we should try it as well.

Now the backend DB has failed. The middleware application is returning errors, and your frontend is showing a fancy 5xx. While each layer is able to scale independently or fail over to another region, even a simple timeout at the DB can cause a cascading failure. The application is designed to work, not designed to recover from failure.

Designing for failure applies to both software development and infrastructure architecture, and I'd like to talk about a couple of points to highlight this paradigm.

Please note that this talk replaces one entitled "Introduction to Metal³" that was due to have been given by Stephen Benjamin, who has sent his apologies but is now unable to attend.
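To make the cascading-failure scenario in the abstract concrete, here is a minimal sketch of a circuit breaker in Python. It is not taken from the talk; the class name `CircuitBreaker`, the exception `CircuitOpenError`, and the parameters `max_failures` and `reset_after` are all hypothetical names chosen for illustration. The idea is that after a few consecutive failures the caller stops waiting on the broken dependency and fails fast, then lets a single trial call through once a cool-down period has passed.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without being tried."""


class CircuitBreaker:
    """Illustrative circuit breaker (hypothetical API, not from any library)."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds before a trial call
        self.clock = clock                # injectable clock, handy for testing
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Still cooling down: fail fast instead of waiting on a timeout.
                raise CircuitOpenError("dependency marked unhealthy; failing fast")
            # Cool-down elapsed: fall through and allow one trial call (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A middleware layer could wrap its database calls as `breaker.call(run_query, sql)`, where `run_query` is a stand-in for whatever function talks to the DB, and catch `CircuitOpenError` to return a cached or degraded response rather than passing a slow timeout up to the frontend.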

Additional information

Type devroom

More sessions

2/1/20
Infra Management
UD2.120 (Chavanne)
Introducing Tanka, a scalable Jsonnet based tool for deploying and managing Kubernetes Infrastructure
2/1/20
Infra Management
Dennis Kliban
UD2.120 (Chavanne)
Pulp (https://pulpproject.org) enables users to organize and distribute software. Now that Pulp 3.0 is generally available, it’s time to integrate it into your software delivery workflows. While the REST API is the primary integration point, it is the OpenAPI schema definition of that API that enables users to build client libraries in various languages. These clients simplify the integration with Pulp 3. This talk will provide a brief introduction to OpenAPI. This will be followed by a ...
2/1/20
Infra Management
David Heijkamp
UD2.120 (Chavanne)
It may be hard to imagine, but some sysadmins do not operate in ideal, tightly controlled circumstances. Apparently, not every developer, application or organization is ready for Kubernetes… In this presentation we will share a real world use case: deploying and configuring a brand new natural history museum. We’ll show how we built the museum with open source software and config management tools, dealing with a broad set of technologies, a tight schedule, a sector dominated by traditional ...
2/1/20
Infra Management
Amit Upadhye
UD2.120 (Chavanne)
Managing compliance of a large IT environment is a complex and challenging task. Today's hybrid cloud environments have different life cycles, and with many short-lived systems such as cloud instances it is difficult to manage compliance on the go. This talk focuses on how OpenSCAP policies, tools and Ansible can be used to gain greater control over the compliance of large environments.
2/1/20
Infra Management
UD2.120 (Chavanne)
The talk will give an introduction to Ansible collections, covering collection structure and how to deliver Ansible content with collections.
2/1/20
Infra Management
Jeff Knurek
UD2.120 (Chavanne)
A key aspect of a microservice architecture is to make sure individual services work in isolation. But as a developer it's also important to make sure the service works in the full system. Providing developers with a way to run pre-production code in a multi-service environment is challenging. Making use of existing Helm charts and defaulting to production configuration does part of the work. Also important is being able to build upon tools like Telepresence/Ksync for debugging in k8s. But while ...
2/1/20
Infra Management
Michael Hrivnak
UD2.120 (Chavanne)
Join us to learn why Operators are the leading and default approach for managing workloads on Kubernetes. We will pull back the curtain to show you exactly what an Operator is, how to make one, and what it means to be “Kubernetes Native”.