Monitoring and Observability

PostgreSQL Network Filter for EnvoyProxy

February 7, 2021
11:30 AM – 12:05 PM

D.monitoring

How do you monitor Postgres? What information can you get out of it, and to what degree does this information help to troubleshoot operational issues? What if you want/need to log all the queries? That may bring heavy trafficked databases down. At OnGres we’re obsessed with improving PostgreSQL’s observability. So we worked together with Tetrate folks on an Envoy’s Network Filter extension for PostgreSQL, to provide and extend observability of the traffic inout a cluster infrastructure. This extension is public and open source. You can use it anywhere you use Envoy. It allows you to capture automated metrics and to debug network traffic. This talk will be a technical deep-dive into PostgreSQL’s protocol decoding, Envoy proxy filters and will cover all the capabilities of the tool and its usage and deployment in any environment.

Envoy [1] is a high performance C++ distributed proxy designed for single services and applications, as well as a communication bus and “universal data plane” designed for large microservice “service mesh” architectures. Built on the learnings of solutions such as NGINX, HAProxy, hardware load balancers, and cloud load balancers, Envoy runs alongside every application and abstracts the network by providing common features in a platform-agnostic manner. When all service traffic in an infrastructure flows via an Envoy mesh, it becomes easy to visualize problem areas via consistent observability, tune overall performance, and add substrate features in a single place. Envoy can be used to proxy connections to PostgreSQL instances and in this talk we’ll see how we improve PostgreSQL observability without impacting the performance of the database and without needing to install and/or configure a bunch of things like logs, pgstatstatements, etc, using a Network Filter [2] for PostgreSQL we developed that decodes frontend and backend protocol to get transparently some metrics and metadata about it operation. Roadmap: - [WIP] SSL termination and monitoring [3] [4] - Integrate Postgres parser to improve dynamic metadata and per-query tracking - Individual (per-query) tracking of query performance - Traffic mirroring for Postgres major upgrade testing and validations [1] https://www.envoyproxy.io/ [2] https://www.envoyproxy.io/docs/envoy/latest/intro/archoverview/otherprotocols/postgres#arch-overview-postgres [3] https://github.com/envoyproxy/envoy/issues/10942 [4] https://github.com/envoyproxy/envoy/issues/9577

Additional information

Type	devroom

More sessions

2/7/21	Monitoring & Observability intro Monitoring and Observability Richard Hartmann D.monitoring Our customary welcome.
2/7/21	Observability for beginners Monitoring and Observability Atibhi Agrawal D.monitoring Observability is not a new idea, it first originated in control theory. In control theory observability is defined as "A measure of how well internal states of a system can be inferred from knowledge of its external outputs" We software folks borrowed the term and now define it as the property of any system that allows us to understand what is going on with them, monitor what they are doing and get the information we need to operate & troubleshoot. In this talk, I am going to give an ...
2/7/21	A Google Monitoring System, Monarch… in Open Source? Monitoring and Observability D.monitoring Recently Google published a paper on their monitoring system Monarch, which happened to have similar design choices to the existing CNCF Incubated project: Thanos! During this talk, two of Thanos maintainers will explain why Thanos could be claimed as an unintentional open source evolution of Google Monitoring Systems like Monarch.
2/7/21	Getting Started with Grafana Tempo Monitoring and Observability Joe Elliott D.monitoring Grafana Tempo is a new high volume distributed tracing backend whose only dependency is object storage. Unlike other tracing backends Tempo can hit massive scale without a massive and difficult to manage Elasticsearch or Cassandra cluster. The current trade off for using object storage is that Tempo supports search by trace id only. However, we will see how this trade off can be overcome using the other pillars of observability. In this session we will use an OpenTelemetry instrumented ...
2/7/21	Proper Monitoring Monitoring and Observability Jason Yee D.monitoring Good monitoring allows us to quickly troubleshoot problems and ensure that they remain minor blips rather than escalate into hours or days of downtime. But what is “good”? Just like good code, good monitoring should include tests and documentation to ensure that it’s always valid and easily used by everyone. In this lightning talk, I’ll share best practices for validating and documenting your monitoring.
2/7/21	Monitoring MariaDB Server with bpftrace on Linux Monitoring and Observability Valerii Kravchuk D.monitoring Bpftrace is a relatively new open source tracer for modern Linux (kernels 5.x.y) for analyzing production performance problems and troubleshooting software. Basic usage of the tool, as well as bpftrace-based one liners and small scripts useful for MariaDB DBAs (and even developers) are presented. Problems of MariaDB Server dynamic tracing with bpftrace are discussed.
2/7/21	Performance Analysis and Troubleshooting Methodologies for Databases Monitoring and Observability Peter Zaitsev D.monitoring Have you heard about the USE Method (Utilization - Saturation - Errors), RED (Rate - Errors - Duration) or Golden Signals (Latency - Traffic - Errors - Saturations)? In this presentation, we will talk briefly about these different, but similar “focuses” and discuss how we can apply them to the data infrastructure performance analysis troubleshooting and monitoring.

FOSDEM 2021

2/6/21 – 2/7/21

Event

Hackerkonferenzen

Created by @CCC 58 Follower

Event Calendar