CloudOpen Europe 2013: Efficient and Large-scale Infrastructure Monitoring with Tracing

Tracing is a powerful tool for solving problems in high-performance multi-threaded applications. There are success stories of custom application tracers deployed in large distributed environments, but we almost never see a low-level system tracer deployed in such environments. With the features introduced in LTTng over the past year, we can now efficiently extract relevant information about running production servers, remotely and in real time. We will demonstrate how LTTng can be deployed in a cloud infrastructure (OpenStack) to extract high-precision metrics remotely, how to enable and disable kernel and user-space events dynamically, and how to extract traces on crashes. This presentation will give system administrators a new perspective on how to monitor and debug production servers in large-scale data centers.

ACM Queue article: Proving the Correctness of Nonblocking Data Structures

Desnoyers, Mathieu, Proving the Correctness of Nonblocking Data Structures. ACM Queue, 11 (5): 30-43 (2013).

Collaboration Summit 2013: LTTng-UST: Efficient System-Wide User-Space Tracing

In the past, much effort has been invested in high-performance kernel tracing tools, but the focus of the tracing community now seems to be shifting to efficient user-space application tracing. By providing joint kernel and user-space tracing, developers can gain deeper insight into their applications' latencies. This presentation covers the ongoing efforts within the LTTng project to enhance system-wide tracing at the user-space level. It discusses instrumentation sources such as Tracepoints, Uprobes, and SystemTap SDT providers, along with their integration with LTTng. A brief overview of the latest and upcoming features of the user-space tracer is presented. It also discusses ongoing efforts in the area of trace format and control protocol standardisation. Finally, our presentation covers challenging glibc-related issues encountered during LTTng-UST development, opening the discussion on how to improve and collaborate on user-space instrumentation.

The target audience is user-space and kernel developers, as well as those interested in tracing infrastructure, shared system libraries, and application instrumentation.

Linux Plumbers Conference Tracing Summit 2012

The Tracing Summit 2012 was held in San Diego, on August 30th, 2012, as part of the Linux Plumbers Conference 2012. It was organized by Mathieu Desnoyers (EfficiOS) and Dominique Toupin (Ericsson).

The summit focuses on tracing, gathering developers and end users of tracing tools and trace analysis tools. The main goal of this one-day conference is to provide room for discussion between people in the various areas that benefit from tracing, namely parallel, distributed and/or real-time systems, as well as kernel development.

Tracing Summit 2012: LTTngTop: Human Readable Trace Viewer

LTTngTop is a top-like kernel trace viewer. It uses the LTTng trace format (CTF) as input and benefits from the whole kernel tracing infrastructure (tracepoints, kprobes, kretprobes and perf PMU counters). The tool uses the tracing information to represent the state of the kernel at any point in time. It currently displays CPU and I/O usage, as well as the evolution of the performance (PMU) counters associated with each process. It also provides a detailed view for each process, including its open file descriptors and the amount of data read and written for each. The project is still under active development: live tracing is being added, and more views are being added progressively to suit the needs of developers and system administrators. After presenting the tool and its potential, the discussion will cover the future views and analyses that Linux Plumbers attendees would be most interested in.