Having just an RTOS is not enough for a real-time system. The hardware must be deterministic as well as the applications that run on the system. When you are missing deadlines, the first thing that must be done is to find what is the source of the latency that caused the issue. It could be the hardware, the operating system or the application, or even a combination of the above. This talk will discuss how to determine where the latency is using tools that come with the Linux Kernel, and will explain a few cases that caused issues.
The webinar discussed accelerating P4 and eBPF programs on Netronome SmartNIC hardware. It covered the Linux kernel infrastructure like TC and XDP that supports offloading eBPF programs. It also explained how the NFP architecture is optimized for network flow processing with its multi-core design and memory hierarchy. The webinar demonstrated how eBPF programs can be translated to run efficiently on the NFP hardware by handling maps and applying optimizations.
The document provides an overview of the initialization process in the Linux kernel from start_kernel to rest_init. It lists the functions called during this process organized by category including functions for initialization of multiprocessor support (SMP), memory management (MM), scheduling, timers, interrupts, and architecture specific setup. The setup_arch section focuses on x86 architecture specific initialization functions such as reserving memory regions, parsing boot parameters, initializing memory mapping and MTRRs.
The document discusses Linux networking architecture and covers several key topics in 3 paragraphs or less:
It first describes the basic structure and layers of the Linux networking stack including the network device interface, network layer protocols like IP, transport layer, and sockets. It then discusses how network packets are managed in Linux through the use of socket buffers and associated functions. The document also provides an overview of the data link layer and protocols like Ethernet, PPP, and how they are implemented in Linux.
This document summarizes a presentation on static partitioning virtualization for RISC-V. It discusses the motivation for embedded virtualization, an overview of static partitioning hypervisors like Jailhouse and Xen, and the Bao hypervisor. It then provides an overview of the RISC-V hypervisor specification and extensions, including implemented features. It evaluates the performance overhead and interrupt latency of a prototype RISC-V hypervisor implementation with and without interference mitigations like cache partitioning.
Using the new extended Berkley Packet Filter capabilities in Linux to the improve performance of auditing security relevant kernel events around network, file and process actions.
Java Performance Analysis on Linux with Flame Graphs
This document discusses using Linux perf_events (perf) profiling tools to analyze Java performance on Linux. It describes how perf can provide complete visibility into Java, JVM, GC and system code but that Java profilers have limitations. It presents the solution of using perf to collect mixed-mode flame graphs that include Java method names and symbols. It also discusses fixing issues with broken Java stacks and missing symbols on x86 architectures in perf profiles.
The arrival of flash storage introduced a radical change in performance profiles of direct attached devices. At the time, it was obvious that Linux I/O stack needed to be redesigned in order to support devices capable of millions of IOPs, and with extremely low latency.
In this talk we revisit the changes the Linux block layer in the
last decade or so, that made it what it is today - a performant, scalable, robust and NUMA-aware subsystem. In addition, we cover the new NVMe over Fabrics support in Linux.
Sagi Grimberg
Sagi is Principal Architect and co-founder at LightBits Labs.
Memory Mapping Implementation (mmap) in Linux Kernel
Note: When you view the the slide deck via web browser, the screenshots may be blurred. You can download and view them offline (Screenshots are clear).
Video: https://www.youtube.com/watch?v=FJW8nGV4jxY and https://www.youtube.com/watch?v=zrr2nUln9Kk . Tutorial slides for O'Reilly Velocity SC 2015, by Brendan Gregg.
There are many performance tools nowadays for Linux, but how do they all fit together, and when do we use them? This tutorial explains methodologies for using these tools, and provides a tour of four tool types: observability, benchmarking, tuning, and static tuning. Many tools will be discussed, including top, iostat, tcpdump, sar, perf_events, ftrace, SystemTap, sysdig, and others, as well observability frameworks in the Linux kernel: PMCs, tracepoints, kprobes, and uprobes.
This tutorial is updated and extended on an earlier talk that summarizes the Linux performance tool landscape. The value of this tutorial is not just learning that these tools exist and what they do, but hearing when and how they are used by a performance engineer to solve real world problems — important context that is typically not included in the standard documentation.
Talk for USENIX LISA 2016 by Brendan Gregg.
"Linux 4.x Tracing Tools: Using BPF Superpowers
The Linux 4.x series heralds a new era of Linux performance analysis, with the long-awaited integration of a programmable tracer: Enhanced BPF (eBPF). Formally the Berkeley Packet Filter, BPF has been enhanced in Linux to provide system tracing capabilities, and integrates with dynamic tracing (kprobes and uprobes) and static tracing (tracepoints and USDT). This has allowed dozens of new observability tools to be developed so far: for example, measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more. These lead to performance wins large and small, especially when instrumenting areas that previously had zero visibility. Tracing superpowers have finally arrived.
In this talk I'll show you how to use BPF in the Linux 4.x series, and I'll summarize the different tools and front ends available, with a focus on iovisor bcc. bcc is an open source project to provide a Python front end for BPF, and comes with dozens of new observability tools (many of which I developed). These tools include new BPF versions of old classics, and many new tools, including: execsnoop, opensnoop, funccount, trace, biosnoop, bitesize, ext4slower, ext4dist, tcpconnect, tcpretrans, runqlat, offcputime, offwaketime, and many more. I'll also summarize use cases and some long-standing issues that can now be solved, and how we are using these capabilities at Netflix."
This talk discusses Linux profiling using perf_events (also called "perf") based on Netflix's use of it. It covers how to use perf to get CPU profiling working and overcome common issues. The speaker will give a tour of perf_events features and show how Netflix uses it to analyze performance across their massive Amazon EC2 Linux cloud. They rely on tools like perf for customer satisfaction, cost optimization, and developing open source tools like NetflixOSS. Key aspects covered include why profiling is needed, a crash course on perf, CPU profiling workflows, and common "gotchas" to address like missing stacks, symbols, or profiling certain languages and events.
USENIX LISA2021 talk by Brendan Gregg (https://www.youtube.com/watch?v=_5Z2AU7QTH4). This talk is a deep dive that describes how BPF (eBPF) works internally on Linux, and dissects some modern performance observability tools. Details covered include the kernel BPF implementation: the verifier, JIT compilation, and the BPF execution environment; the BPF instruction set; different event sources; and how BPF is used by user space, using bpftrace programs as an example. This includes showing how bpftrace is compiled to LLVM IR and then BPF bytecode, and how per-event data and aggregated map data are fetched from the kernel.
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM Systems
The motivation of hypervisor based CPUFreq is to enable the one of the main PM use-cases (Dynamic voltage and frequency scaling) in virtualized system powered by Xen hypervisor. Rationale behind this activity is that CPU virtualization is done by hypervisor and the guest OS doesn't actually know anything about physical CPUs because it is running on virtual CPUs.
In this talk Oleksandr will briefly describe the possible approach of generic CPUFreq in Xen on ARM, the advantages and disadvantages of having DVFS support on ARM boards powered by Xen hypervisor and share results of his CPUFreq PoC which implies power consumption measurements with and without CPUFreq enabled on R-Car Gen3 based board as an example.
Video: https://www.youtube.com/watch?v=JRFNIKUROPE . Talk for linux.conf.au 2017 (LCA2017) by Brendan Gregg, about Linux enhanced BPF (eBPF). Abstract:
A world of new capabilities is emerging for the Linux 4.x series, thanks to enhancements that have been included in Linux for to Berkeley Packet Filter (BPF): an in-kernel virtual machine that can execute user space-defined programs. It is finding uses for security auditing and enforcement, enhancing networking (including eXpress Data Path), and performance observability and troubleshooting. Many new open source tools that have been written in the past 12 months for performance analysis that use BPF. Tracing superpowers have finally arrived for Linux!
For its use with tracing, BPF provides the programmable capabilities to the existing tracing frameworks: kprobes, uprobes, and tracepoints. In particular, BPF allows timestamps to be recorded and compared from custom events, allowing latency to be studied in many new places: kernel and application internals. It also allows data to be efficiently summarized in-kernel, including as histograms. This has allowed dozens of new observability tools to be developed so far, including measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more.
This talk will summarize BPF capabilities and use cases so far, and then focus on its use to enhance Linux tracing, especially with the open source bcc collection. bcc includes BPF versions of old classics, and many new tools, including execsnoop, opensnoop, funcccount, ext4slower, and more (many of which I developed). Perhaps you'd like to develop new tools, or use the existing tools to find performance wins large and small, especially when instrumenting areas that previously had zero visibility. I'll also summarize how we intend to use these new capabilities to enhance systems analysis at Netflix.
This document provides an overview of the data structures and functions used to implement ethernet drivers in the Linux kernel. It discusses the net_device and sk_buff structures that represent network interfaces and packets. It also describes how the driver interacts with the kernel via polling, interrupts, and NAPI to handle reception and transmission of frames. Finally, it provides an example of the key components needed for a simple ethernet driver, including initialization, setup, open/close, transmission, and reception functions.
Note: When you view the the slide deck via web browser, the screenshots may be blurred. You can download and view them offline (Screenshots are clear).
The document describes a biolatency tool that traces block device I/O latency using eBPF. It discusses how the tool was originally written in the bcc framework using C/BPF, but has since been rewritten in the bpftrace framework using a simpler one-liner script. It provides examples of the bcc and bpftrace implementations of biolatency.
Using Libtracecmd to Analyze Your Latency and Performance Troubles
Trying to figure out why your application is responding late can be difficult, especially if it is because of interference from the operating system. This talk will briefly go over how to write a C program that can analyze what in the Linux system is interfering with your application. It will use trace-cmd to enable kernel trace events as well as tracing lock functions, and it will then go over a quick tutorial on how to use libtracecmd to read the created trace.dat file to uncover what is the cause of interference to you application.
In this PowerPoint, learn how a security policy can be your first line of defense. Servers running AIX and other operating systems are frequent targets of cyberattacks, according to the Data Breach Investigations Report. From DoS attacks to malware, attackers have a variety of strategies at their disposal. Having a security policy in place makes it easier to ensure you have appropriate controls in place to protect mission-critical data.
"Session ID: BUD17-309
Session Name: IRQ prediction - BUD17-309
Speaker: Daniel Lezcano
Track: Power Management
★ Session Summary ★
The CPUidle is one component of the power management framework. It behaves in an opportunistic way when there is nothing to do on the system, by trying to predict the next CPU wake up and select an idle state. But the current design has some weaknesses as it mixes the different sources of wakeup, resulting in an already non-deterministic situation getting worse. This presentation will describe the issues faced with the current approach and will show another approach to predict the next wake up event with a better accuracy, leading, under some circumstances, to 100% right predictions.
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/bud17/bud17-309/
Presentation: https://www.slideshare.net/linaroorg/bud17309-irq-prediction
Video: https://youtu.be/krFAyi1ietI
---------------------------------------------------
★ Event Details ★
Linaro Connect Budapest 2017 (BUD17)
6-10 March 2017
Corinthia Hotel, Budapest,
Erzsébet krt. 43-49,
1073 Hungary
---------------------------------------------------
Keyword: IRQ, power-management, CPUidle
http://www.linaro.org
http://connect.linaro.org
---------------------------------------------------
Follow us on Social Media
https://www.facebook.com/LinaroOrg
https://twitter.com/linaroorg
https://www.youtube.com/user/linaroorg?sub_confirmation=1
https://www.linkedin.com/company/1026961"
In the context of high-performance computing (HPC), the Operating System Noise (osnoise) refers to the interference experienced by an application due to activities inside the operating system. In the context of Linux, NMIs, IRQs, softirqs, and any other system thread can cause noise to the application. Moreover, hardware-related jobs can also cause noise, for example, via SMIs.
HPC users and developers that care about every microsecond stolen by the OS need not only a precise way to measure the osnoise but mainly to figure out who is stealing cpu time so that they can pursue the perfect tune of the system. These users and developers are the inspiration of Linux's osnoise tracer.
The osnoise tracer runs an in-kernel loop measuring how much time is available. It does it with preemption, softirq and IRQs enabled, thus allowing all the sources of osnoise during its execution. The osnoise tracer takes note of the entry and exit point of any source of interferences. When the noise happens without any interference from the operating system level, the tracer can safely point to a hardware-related noise. In this way, osnoise can account for any source of interference. The osnoise tracer also adds new kernel tracepoints that auxiliaries the user to point to the culprits of the noise in a precise and intuitive way.
At the end of a period, the osnoise tracer prints the sum of all noise, the max single noise, the percentage of CPU available for the thread, and the counters for the noise sources, serving as a benchmark tool.
A short, introductory talk to the world of debuggers. During the talk, we write a simple debugger application in Rust.
Video at: https://www.youtube.com/watch?v=qS51kIHWARM
This document discusses interrupts, exceptions, and system calls in operating systems. It begins by explaining that the OS is event-driven and only executes when there is an interrupt, trap, or system call. It then describes the different event types - interrupts, exceptions, and how hardware interrupts are handled using interrupt controllers like the 8259 PIC and APIC. It discusses interrupt vectors, interrupt descriptor tables, and how the CPU switches to the interrupt handler. Finally, it covers system calls and exceptions as a type of software interrupt, providing an example of the write system call.
This document provides an overview of servers, processes, and system administration. It discusses servers as machines made up of components like RAM, CPU, and I/O. It then covers these components and their capacities, as well as processes and how they interact with servers through system calls. Hands-on examples are provided to demonstrate monitoring servers and investigating processes using tools like top, lsof, strace, and vmstat.
Modern operating systems are complex beasts, responsible for sharing hardware resources between many competing programs. For low-latency systems, sometimes it's necessary to subvert the OS to grab back the resources your program needs. In this talk, we will explore what is actually going on when you run a program, how much time it actually gets on CPU, and strategies to help make your code run as fast as possible. By the end of this talk, you will know how to tune your software and the linux kernel to get the most from your hardware, and more importantly, how to validate that your changes have worked.
While probably the most prominent, Docker is not the only tool for building and managing containers. Originally meant to be a "chroot on steroids" to help debug systemd, systemd-nspawn provides a fairly uncomplicated approach to work with containers. Being part of systemd, it is available on most recent distributions out-of-the-box and requires no additional dependencies.
This deck will introduce a few concepts involved in containers and will guide you through the steps of building a container from scratch. The payload will be a simple service, which will be automatically activated by systemd when the first request arrives.
This document discusses time measurement in embedded systems. It explains that most embedded applications need to measure time for functions like auto power off, playing media, and measuring elapsed time. It also discusses real-time systems and the difference between hard and soft real-time constraints. Examples of needing real-time measurement include missile control and autonomous vehicles. The document then talks about different ways to measure time on embedded platforms like Arduino and ESP32 using millis(), micros(), and interrupts. It provides examples of using delays or interrupts to trigger a task repeatedly every second. Finally, it provides homework questions involving measuring elapsed time and triggering tasks using delays or interrupts on Arduino or C/C++.
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Ftrace is the official tracer of the Linux kernel. It has been apart of Linux since 2.6.31, and has grown tremendously ever since. Ftrace’s name comes from its most powerful feature: function tracing. But the ftrace infrastructure is much more than that. It also encompasses the trace events that are used by perf, as well as kprobes that can dynamically add trace events that the user defines.
This talk will focus on learning how the kernel works by using the ftrace infrastructure. It will show how to see what happens within the kernel during a system call; learn how interrupts work; see how ones processes are being scheduled, and more. A quick introduction to some tools like trace-cmd and KernelShark will also be demonstrated.
Steven Rostedt, VMware
Talk for YOW! by Brendan Gregg. "Systems performance studies the performance of computing systems, including all physical components and the full software stack to help you find performance wins for your application and kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (ftrace, bcc/BPF, and bpftrace/BPF), advice about what is and isn't important to learn, and case studies to see how it is applied. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud.
"
Kernel Recipes 2016 - Understanding a Real-Time System (more than just a kernel)
The PREEMPT_RT patch turns Linux into a hard Real-Time designed operating system. But it takes more than just a kernel to make sure you can meet all your requirements. This talk explains all aspects of the system that is being used for a mission critical project that must be considered. Creating a Real-Time environment is difficult and there is no simple solution to make sure that your system is capable to fulfill its needs. One must be vigilant with all aspects of the system to make sure there are no surprises. This talk will discuss most of the “gotchas” that come with putting together a Real-Time system.
You don’t need to be a developer to enjoy this talk. If you are curious to know how your computer is an unpredictable mess you should definitely come to this talk.
Steven Rostedt - Red Hat
Complexity is increasing. Trust eroding. In the wake of Spectre and Meltdown, when it seems that things cannot get any darker for processor security, the last light goes out. This talk will demonstrate what everyone has long feared but never proven: there are hardware backdoors in x86 processors, and they’re buried deeper than we ever imagined possible.
In this talk, we walk through how we discovered a privilege escalation backdoor in a family of x86 CPUs, that allows an unprivileged user, on an unmodified system, to circumvent all processor security checks and escalate from ring 3 to ring 0 – permitting an unprivileged, arbitrary userland program to directly modify and execute code inside of the kernel, regardless of the operating system, security patches, antivirus, firmware, etc.
Speakers:
Christopher Domas, Cyber Security Researcher
The document discusses analyzing database systems using a 3D method for performance analysis. It introduces the 3D method, which looks at performance from the perspectives of the operating system (OS), Oracle database, and applications. The 3D method provides a holistic view of the system that can help identify issues and direct solutions. It also covers topics like time-based analysis in Oracle, how wait events are classified, and having a diagnostic framework for quick troubleshooting using tools like the Automatic Workload Repository report.
This document discusses concurrency in SystemC simulations. It explains that SystemC uses events and processes to model concurrent systems. There are two main types of processes: threads and methods. Threads can wait for events using wait() and methods use next_trigger() to establish dynamic sensitivity. Events have no duration and are used to trigger processes. Notifying an event using notify() moves waiting processes to the ready queue. The SystemC kernel is event-driven and executes ready processes in non-deterministic order.
The document discusses process management in operating systems. It covers control blocks, interrupts, process states, scheduling algorithms like FIFO, SJF, SRTF, Round Robin and priority scheduling. It also discusses queuing, multiprogramming vs time sharing and scheduling criteria like CPU utilization, throughput, turnaround time and waiting time. Scheduling can be long, medium or short term and algorithms include priority queues and multilevel feedback queues.
Training Slides: Intermediate 204: Identifying and Resolving Issues with Tung...
In this intermediate training session we look at resolving issues that may occur with replication (Tungsten Replicator) and clustering (Tungsten Clustering). This training is for all engineers using Continuent Tungsten solutions, as the tools we use will apply to any of our products and topologies. Some MySQL replication knowledge is assumed.
AGENDA
- How to identify an issue when scanning logs
- Tools to help identify issues
- Identifying replication lag / diagnosing replication latency
- Resolving common replication issues
- Resolving common clustering issues
This presentation covers the general concepts about real-time systems, how Linux kernel works for preemption, the latency in Linux, rt-preempt, and Xenomai, the real-time extension as the dual kernel approach.
Kernel Recipes 2019 - Driving the industry toward upstream first
This document discusses the benefits of contributing device drivers and other code to the Linux kernel upstream first instead of maintaining private branches. It outlines Chrome OS's strategy of picking long-term stable kernels and contributing hardware support to upstream. While challenging, working upstream provides many benefits like free code reviews, bugfixes, and easier updates. The document shows that Chrome OS contributions have helped increase upstream support for vendors' devices and that there is still work remaining to upstream GPU and other drivers. It demos mainlining the Samsung Chromebook Plus by using the open source Panfrost GPU driver instead of a vendor driver.
Kernel Recipes 2019 - No NMI? No Problem! – Implementing Arm64 Pseudo-NMI
As the name would suggest, a Non-Maskable Interrupt (NMI) is an interrupt-like feature that is unaffected by the disabling of classic interrupts. In Linux, NMIs are involved in some features such as performance event monitoring, hard-lockup detector, on demand state dumping, etc… Their potential to fire when least expected can fill the most seasoned kernel hackers with dread.
AArch64 (aka arm64 in the Linux tree) does not provide architected NMIs, a consequence being that features benefiting from NMIs see their use limited on AArch64. However, the Arm Generic Interrupt Controller (GIC) supports interrupt prioritization and masking, which, among other things, provides a way to control whether or not a set of interrupts can be signaled to a CPU.
This talk will cover how, using the GIC interrupt priorities, we provide a way to configure some interrupts to behave in an NMI-like manner on AArch64. We’ll discuss the implementation, some of the complications that ensued and also some of the benefits obtained from it.
Julien Thierry
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
At a rate of almost 9 changes per hour (24/7), the Linux kernel is definitely a scary beast. Bugs are introduced on a daily basis and, through the use of multiple code analyzers, *some* of them are detected and fixed before they hit mainline. Over the course of the last few years, Gustavo has been fixing such bugs and many different issues in every corner of the Linux kernel. Recently, he was in charge of leading the efforts to globally enable -Wimplicit-fallthrough; which appears by default in Linux v5.3. This presentation is a report on all the stuff Gustavo has found and fixed in the kernel with the support of the Core Infrastructure Initiative.
Gustavo A.R. Silva
In I.T. we all use all kinds of metrics. Operations teams rely heavily on these, especially when things go south. These metrics are sometimes overrated. Let’s dive into a few real life stories together.
Aurélien Rougemont
Kernel Recipes 2019 - Kernel documentation: past, present, and future
This document discusses the current state and future plans for Linux kernel documentation. It notes that documentation has transitioned from DocBook to Sphinx/RST, improving formatting and integration. There are now over 3,000 documentation files and many kerneldoc comments. Future plans include converting remaining text files, updating ancient documents, improving organization by topic, integrating documentation better, and enhancing the documentation toolchain. The goal is to improve documentation to better help kernel developers, users and the community.
- The document discusses Linux network stack monitoring and configuration. It begins with definitions of key concepts like RSS, RPS, RFS, LRO, GRO, DCA, XDP and BPF.
- It then provides an overview of how the network stack works from the hardware interrupts and driver level up through routing, TCP/IP and to the socket level.
- Monitoring tools like ethtool, ftrace and /proc/interrupts are described for viewing hardware statistics, software stack traces and interrupt information.
The document discusses Linux memory management, describing how physical memory is divided into page frames and virtual memory allows processes to have a virtual view of memory mapped to physical memory using page tables, and covers topics like memory overcommit, page cache, swap space, and tools for monitoring memory usage.
The document discusses the bootloaders for the BeagleBone Black system. It describes the memory organization and booting process, including the roles of the X-Loader and U-Boot bootloaders. The X-Loader is described as the first stage bootloader that is derived from U-Boot and runs in internal SRAM. It loads the second stage U-Boot bootloader. U-Boot is then described as the universal bootloader that can be ported to different boards with minimal changes and is responsible for loading the Linux kernel from external DDR memory.
The webinar discussed accelerating P4 and eBPF programs on Netronome SmartNIC hardware. It covered the Linux kernel infrastructure like TC and XDP that supports offloading eBPF programs. It also explained how the NFP architecture is optimized for network flow processing with its multi-core design and memory hierarchy. The webinar demonstrated how eBPF programs can be translated to run efficiently on the NFP hardware by handling maps and applying optimizations.
The document provides an overview of the initialization process in the Linux kernel from start_kernel to rest_init. It lists the functions called during this process organized by category including functions for initialization of multiprocessor support (SMP), memory management (MM), scheduling, timers, interrupts, and architecture specific setup. The setup_arch section focuses on x86 architecture specific initialization functions such as reserving memory regions, parsing boot parameters, initializing memory mapping and MTRRs.
The document discusses Linux networking architecture and covers several key topics in 3 paragraphs or less:
It first describes the basic structure and layers of the Linux networking stack including the network device interface, network layer protocols like IP, transport layer, and sockets. It then discusses how network packets are managed in Linux through the use of socket buffers and associated functions. The document also provides an overview of the data link layer and protocols like Ethernet, PPP, and how they are implemented in Linux.
This document summarizes a presentation on static partitioning virtualization for RISC-V. It discusses the motivation for embedded virtualization, an overview of static partitioning hypervisors like Jailhouse and Xen, and the Bao hypervisor. It then provides an overview of the RISC-V hypervisor specification and extensions, including implemented features. It evaluates the performance overhead and interrupt latency of a prototype RISC-V hypervisor implementation with and without interference mitigations like cache partitioning.
Using the new extended Berkley Packet Filter capabilities in Linux to the improve performance of auditing security relevant kernel events around network, file and process actions.
Java Performance Analysis on Linux with Flame GraphsBrendan Gregg
This document discusses using Linux perf_events (perf) profiling tools to analyze Java performance on Linux. It describes how perf can provide complete visibility into Java, JVM, GC and system code but that Java profilers have limitations. It presents the solution of using perf to collect mixed-mode flame graphs that include Java method names and symbols. It also discusses fixing issues with broken Java stacks and missing symbols on x86 architectures in perf profiles.
The Linux Block Layer - Built for Fast StorageKernel TLV
The arrival of flash storage introduced a radical change in performance profiles of direct attached devices. At the time, it was obvious that Linux I/O stack needed to be redesigned in order to support devices capable of millions of IOPs, and with extremely low latency.
In this talk we revisit the changes the Linux block layer in the
last decade or so, that made it what it is today - a performant, scalable, robust and NUMA-aware subsystem. In addition, we cover the new NVMe over Fabrics support in Linux.
Sagi Grimberg
Sagi is Principal Architect and co-founder at LightBits Labs.
Memory Mapping Implementation (mmap) in Linux KernelAdrian Huang
Note: When you view the the slide deck via web browser, the screenshots may be blurred. You can download and view them offline (Screenshots are clear).
Video: https://www.youtube.com/watch?v=FJW8nGV4jxY and https://www.youtube.com/watch?v=zrr2nUln9Kk . Tutorial slides for O'Reilly Velocity SC 2015, by Brendan Gregg.
There are many performance tools nowadays for Linux, but how do they all fit together, and when do we use them? This tutorial explains methodologies for using these tools, and provides a tour of four tool types: observability, benchmarking, tuning, and static tuning. Many tools will be discussed, including top, iostat, tcpdump, sar, perf_events, ftrace, SystemTap, sysdig, and others, as well observability frameworks in the Linux kernel: PMCs, tracepoints, kprobes, and uprobes.
This tutorial is updated and extended on an earlier talk that summarizes the Linux performance tool landscape. The value of this tutorial is not just learning that these tools exist and what they do, but hearing when and how they are used by a performance engineer to solve real world problems — important context that is typically not included in the standard documentation.
Linux 4.x Tracing Tools: Using BPF SuperpowersBrendan Gregg
Talk for USENIX LISA 2016 by Brendan Gregg.
"Linux 4.x Tracing Tools: Using BPF Superpowers
The Linux 4.x series heralds a new era of Linux performance analysis, with the long-awaited integration of a programmable tracer: Enhanced BPF (eBPF). Formally the Berkeley Packet Filter, BPF has been enhanced in Linux to provide system tracing capabilities, and integrates with dynamic tracing (kprobes and uprobes) and static tracing (tracepoints and USDT). This has allowed dozens of new observability tools to be developed so far: for example, measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more. These lead to performance wins large and small, especially when instrumenting areas that previously had zero visibility. Tracing superpowers have finally arrived.
In this talk I'll show you how to use BPF in the Linux 4.x series, and I'll summarize the different tools and front ends available, with a focus on iovisor bcc. bcc is an open source project to provide a Python front end for BPF, and comes with dozens of new observability tools (many of which I developed). These tools include new BPF versions of old classics, and many new tools, including: execsnoop, opensnoop, funccount, trace, biosnoop, bitesize, ext4slower, ext4dist, tcpconnect, tcpretrans, runqlat, offcputime, offwaketime, and many more. I'll also summarize use cases and some long-standing issues that can now be solved, and how we are using these capabilities at Netflix."
This talk discusses Linux profiling using perf_events (also called "perf") based on Netflix's use of it. It covers how to use perf to get CPU profiling working and overcome common issues. The speaker will give a tour of perf_events features and show how Netflix uses it to analyze performance across their massive Amazon EC2 Linux cloud. They rely on tools like perf for customer satisfaction, cost optimization, and developing open source tools like NetflixOSS. Key aspects covered include why profiling is needed, a crash course on perf, CPU profiling workflows, and common "gotchas" to address like missing stacks, symbols, or profiling certain languages and events.
USENIX LISA2021 talk by Brendan Gregg (https://www.youtube.com/watch?v=_5Z2AU7QTH4). This talk is a deep dive that describes how BPF (eBPF) works internally on Linux, and dissects some modern performance observability tools. Details covered include the kernel BPF implementation: the verifier, JIT compilation, and the BPF execution environment; the BPF instruction set; different event sources; and how BPF is used by user space, using bpftrace programs as an example. This includes showing how bpftrace is compiled to LLVM IR and then BPF bytecode, and how per-event data and aggregated map data are fetched from the kernel.
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM SystemsThe Linux Foundation
The motivation of hypervisor based CPUFreq is to enable the one of the main PM use-cases (Dynamic voltage and frequency scaling) in virtualized system powered by Xen hypervisor. Rationale behind this activity is that CPU virtualization is done by hypervisor and the guest OS doesn't actually know anything about physical CPUs because it is running on virtual CPUs.
In this talk Oleksandr will briefly describe the possible approach of generic CPUFreq in Xen on ARM, the advantages and disadvantages of having DVFS support on ARM boards powered by Xen hypervisor and share results of his CPUFreq PoC which implies power consumption measurements with and without CPUFreq enabled on R-Car Gen3 based board as an example.
Video: https://www.youtube.com/watch?v=JRFNIKUROPE . Talk for linux.conf.au 2017 (LCA2017) by Brendan Gregg, about Linux enhanced BPF (eBPF). Abstract:
A world of new capabilities is emerging for the Linux 4.x series, thanks to enhancements that have been included in Linux for to Berkeley Packet Filter (BPF): an in-kernel virtual machine that can execute user space-defined programs. It is finding uses for security auditing and enforcement, enhancing networking (including eXpress Data Path), and performance observability and troubleshooting. Many new open source tools that have been written in the past 12 months for performance analysis that use BPF. Tracing superpowers have finally arrived for Linux!
For its use with tracing, BPF provides the programmable capabilities to the existing tracing frameworks: kprobes, uprobes, and tracepoints. In particular, BPF allows timestamps to be recorded and compared from custom events, allowing latency to be studied in many new places: kernel and application internals. It also allows data to be efficiently summarized in-kernel, including as histograms. This has allowed dozens of new observability tools to be developed so far, including measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more.
This talk will summarize BPF capabilities and use cases so far, and then focus on its use to enhance Linux tracing, especially with the open source bcc collection. bcc includes BPF versions of old classics, and many new tools, including execsnoop, opensnoop, funcccount, ext4slower, and more (many of which I developed). Perhaps you'd like to develop new tools, or use the existing tools to find performance wins large and small, especially when instrumenting areas that previously had zero visibility. I'll also summarize how we intend to use these new capabilities to enhance systems analysis at Netflix.
This document provides an overview of the data structures and functions used to implement ethernet drivers in the Linux kernel. It discusses the net_device and sk_buff structures that represent network interfaces and packets. It also describes how the driver interacts with the kernel via polling, interrupts, and NAPI to handle reception and transmission of frames. Finally, it provides an example of the key components needed for a simple ethernet driver, including initialization, setup, open/close, transmission, and reception functions.
Note: When you view the the slide deck via web browser, the screenshots may be blurred. You can download and view them offline (Screenshots are clear).
The document describes a biolatency tool that traces block device I/O latency using eBPF. It discusses how the tool was originally written in the bcc framework using C/BPF, but has since been rewritten in the bpftrace framework using a simpler one-liner script. It provides examples of the bcc and bpftrace implementations of biolatency.
Using Libtracecmd to Analyze Your Latency and Performance TroublesScyllaDB
Trying to figure out why your application is responding late can be difficult, especially if it is because of interference from the operating system. This talk will briefly go over how to write a C program that can analyze what in the Linux system is interfering with your application. It will use trace-cmd to enable kernel trace events as well as tracing lock functions, and it will then go over a quick tutorial on how to use libtracecmd to read the created trace.dat file to uncover what is the cause of interference to you application.
In this PowerPoint, learn how a security policy can be your first line of defense. Servers running AIX and other operating systems are frequent targets of cyberattacks, according to the Data Breach Investigations Report. From DoS attacks to malware, attackers have a variety of strategies at their disposal. Having a security policy in place makes it easier to ensure you have appropriate controls in place to protect mission-critical data.
"Session ID: BUD17-309
Session Name: IRQ prediction - BUD17-309
Speaker: Daniel Lezcano
Track: Power Management
★ Session Summary ★
The CPUidle is one component of the power management framework. It behaves in an opportunistic way when there is nothing to do on the system, by trying to predict the next CPU wake up and select an idle state. But the current design has some weaknesses as it mixes the different sources of wakeup, resulting in an already non-deterministic situation getting worse. This presentation will describe the issues faced with the current approach and will show another approach to predict the next wake up event with a better accuracy, leading, under some circumstances, to 100% right predictions.
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/bud17/bud17-309/
Presentation: https://www.slideshare.net/linaroorg/bud17309-irq-prediction
Video: https://youtu.be/krFAyi1ietI
---------------------------------------------------
★ Event Details ★
Linaro Connect Budapest 2017 (BUD17)
6-10 March 2017
Corinthia Hotel, Budapest,
Erzsébet krt. 43-49,
1073 Hungary
---------------------------------------------------
Keyword: IRQ, power-management, CPUidle
http://www.linaro.org
http://connect.linaro.org
---------------------------------------------------
Follow us on Social Media
https://www.facebook.com/LinaroOrg
https://twitter.com/linaroorg
https://www.youtube.com/user/linaroorg?sub_confirmation=1
https://www.linkedin.com/company/1026961"
OSNoise Tracer: Who Is Stealing My CPU Time?ScyllaDB
In the context of high-performance computing (HPC), the Operating System Noise (osnoise) refers to the interference experienced by an application due to activities inside the operating system. In the context of Linux, NMIs, IRQs, softirqs, and any other system thread can cause noise to the application. Moreover, hardware-related jobs can also cause noise, for example, via SMIs.
HPC users and developers that care about every microsecond stolen by the OS need not only a precise way to measure the osnoise but mainly to figure out who is stealing cpu time so that they can pursue the perfect tune of the system. These users and developers are the inspiration of Linux's osnoise tracer.
The osnoise tracer runs an in-kernel loop measuring how much time is available. It does it with preemption, softirq and IRQs enabled, thus allowing all the sources of osnoise during its execution. The osnoise tracer takes note of the entry and exit point of any source of interferences. When the noise happens without any interference from the operating system level, the tracer can safely point to a hardware-related noise. In this way, osnoise can account for any source of interference. The osnoise tracer also adds new kernel tracepoints that auxiliaries the user to point to the culprits of the noise in a precise and intuitive way.
At the end of a period, the osnoise tracer prints the sum of all noise, the max single noise, the percentage of CPU available for the thread, and the counters for the noise sources, serving as a benchmark tool.
A short, introductory talk to the world of debuggers. During the talk, we write a simple debugger application in Rust.
Video at: https://www.youtube.com/watch?v=qS51kIHWARM
This document discusses interrupts, exceptions, and system calls in operating systems. It begins by explaining that the OS is event-driven and only executes when there is an interrupt, trap, or system call. It then describes the different event types - interrupts, exceptions, and how hardware interrupts are handled using interrupt controllers like the 8259 PIC and APIC. It discusses interrupt vectors, interrupt descriptor tables, and how the CPU switches to the interrupt handler. Finally, it covers system calls and exceptions as a type of software interrupt, providing an example of the write system call.
Servers and Processes: Behavior and Analysisdreamwidth
This document provides an overview of servers, processes, and system administration. It discusses servers as machines made up of components like RAM, CPU, and I/O. It then covers these components and their capacities, as well as processes and how they interact with servers through system calls. Hands-on examples are provided to demonstrate monitoring servers and investigating processes using tools like top, lsof, strace, and vmstat.
Modern operating systems are complex beasts, responsible for sharing hardware resources between many competing programs. For low-latency systems, sometimes it's necessary to subvert the OS to grab back the resources your program needs. In this talk, we will explore what is actually going on when you run a program, how much time it actually gets on CPU, and strategies to help make your code run as fast as possible. By the end of this talk, you will know how to tune your software and the linux kernel to get the most from your hardware, and more importantly, how to validate that your changes have worked.
While probably the most prominent, Docker is not the only tool for building and managing containers. Originally meant to be a "chroot on steroids" to help debug systemd, systemd-nspawn provides a fairly uncomplicated approach to work with containers. Being part of systemd, it is available on most recent distributions out-of-the-box and requires no additional dependencies.
This deck will introduce a few concepts involved in containers and will guide you through the steps of building a container from scratch. The payload will be a simple service, which will be automatically activated by systemd when the first request arrives.
This document discusses time measurement in embedded systems. It explains that most embedded applications need to measure time for functions like auto power off, playing media, and measuring elapsed time. It also discusses real-time systems and the difference between hard and soft real-time constraints. Examples of needing real-time measurement include missile control and autonomous vehicles. The document then talks about different ways to measure time on embedded platforms like Arduino and ESP32 using millis(), micros(), and interrupts. It provides examples of using delays or interrupts to trigger a task repeatedly every second. Finally, it provides homework questions involving measuring elapsed time and triggering tasks using delays or interrupts on Arduino or C/C++.
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtAnne Nicolas
Ftrace is the official tracer of the Linux kernel. It has been apart of Linux since 2.6.31, and has grown tremendously ever since. Ftrace’s name comes from its most powerful feature: function tracing. But the ftrace infrastructure is much more than that. It also encompasses the trace events that are used by perf, as well as kprobes that can dynamically add trace events that the user defines.
This talk will focus on learning how the kernel works by using the ftrace infrastructure. It will show how to see what happens within the kernel during a system call; learn how interrupts work; see how ones processes are being scheduled, and more. A quick introduction to some tools like trace-cmd and KernelShark will also be demonstrated.
Steven Rostedt, VMware
Talk for YOW! by Brendan Gregg. "Systems performance studies the performance of computing systems, including all physical components and the full software stack to help you find performance wins for your application and kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes the topic for everyone, touring six important areas: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events) and tracing (ftrace, bcc/BPF, and bpftrace/BPF), advice about what is and isn't important to learn, and case studies to see how it is applied. This talk is aimed at everyone: developers, operations, sysadmins, etc, and in any environment running Linux, bare metal or the cloud.
"
Kernel Recipes 2016 - Understanding a Real-Time System (more than just a kernel)Anne Nicolas
The PREEMPT_RT patch turns Linux into a hard Real-Time designed operating system. But it takes more than just a kernel to make sure you can meet all your requirements. This talk explains all aspects of the system that is being used for a mission critical project that must be considered. Creating a Real-Time environment is difficult and there is no simple solution to make sure that your system is capable to fulfill its needs. One must be vigilant with all aspects of the system to make sure there are no surprises. This talk will discuss most of the “gotchas” that come with putting together a Real-Time system.
You don’t need to be a developer to enjoy this talk. If you are curious to know how your computer is an unpredictable mess you should definitely come to this talk.
Steven Rostedt - Red Hat
GOD MODE Unlocked: Hardware backdoors in x86 CPUsPriyanka Aash
Complexity is increasing. Trust eroding. In the wake of Spectre and Meltdown, when it seems that things cannot get any darker for processor security, the last light goes out. This talk will demonstrate what everyone has long feared but never proven: there are hardware backdoors in x86 processors, and they’re buried deeper than we ever imagined possible.
In this talk, we walk through how we discovered a privilege escalation backdoor in a family of x86 CPUs, that allows an unprivileged user, on an unmodified system, to circumvent all processor security checks and escalate from ring 3 to ring 0 – permitting an unprivileged, arbitrary userland program to directly modify and execute code inside of the kernel, regardless of the operating system, security patches, antivirus, firmware, etc.
Speakers:
Christopher Domas, Cyber Security Researcher
The document discusses analyzing database systems using a 3D method for performance analysis. It introduces the 3D method, which looks at performance from the perspectives of the operating system (OS), Oracle database, and applications. The 3D method provides a holistic view of the system that can help identify issues and direct solutions. It also covers topics like time-based analysis in Oracle, how wait events are classified, and having a diagnostic framework for quick troubleshooting using tools like the Automatic Workload Repository report.
This document discusses concurrency in SystemC simulations. It explains that SystemC uses events and processes to model concurrent systems. There are two main types of processes: threads and methods. Threads can wait for events using wait() and methods use next_trigger() to establish dynamic sensitivity. Events have no duration and are used to trigger processes. Notifying an event using notify() moves waiting processes to the ready queue. The SystemC kernel is event-driven and executes ready processes in non-deterministic order.
The document discusses process management in operating systems. It covers control blocks, interrupts, process states, scheduling algorithms like FIFO, SJF, SRTF, Round Robin and priority scheduling. It also discusses queuing, multiprogramming vs time sharing and scheduling criteria like CPU utilization, throughput, turnaround time and waiting time. Scheduling can be long, medium or short term and algorithms include priority queues and multilevel feedback queues.
Training Slides: Intermediate 204: Identifying and Resolving Issues with Tung...Continuent
In this intermediate training session we look at resolving issues that may occur with replication (Tungsten Replicator) and clustering (Tungsten Clustering). This training is for all engineers using Continuent Tungsten solutions, as the tools we use will apply to any of our products and topologies. Some MySQL replication knowledge is assumed.
AGENDA
- How to identify an issue when scanning logs
- Tools to help identify issues
- Identifying replication lag / diagnosing replication latency
- Resolving common replication issues
- Resolving common clustering issues
This presentation covers the general concepts about real-time systems, how Linux kernel works for preemption, the latency in Linux, rt-preempt, and Xenomai, the real-time extension as the dual kernel approach.
Similar to Embedded Recipes 2018 - Finding sources of Latency In your system - Steven Rostedt (20)
Kernel Recipes 2019 - Driving the industry toward upstream firstAnne Nicolas
This document discusses the benefits of contributing device drivers and other code to the Linux kernel upstream first instead of maintaining private branches. It outlines Chrome OS's strategy of picking long-term stable kernels and contributing hardware support to upstream. While challenging, working upstream provides many benefits like free code reviews, bugfixes, and easier updates. The document shows that Chrome OS contributions have helped increase upstream support for vendors' devices and that there is still work remaining to upstream GPU and other drivers. It demos mainlining the Samsung Chromebook Plus by using the open source Panfrost GPU driver instead of a vendor driver.
Kernel Recipes 2019 - No NMI? No Problem! – Implementing Arm64 Pseudo-NMIAnne Nicolas
As the name would suggest, a Non-Maskable Interrupt (NMI) is an interrupt-like feature that is unaffected by the disabling of classic interrupts. In Linux, NMIs are involved in some features such as performance event monitoring, hard-lockup detector, on demand state dumping, etc… Their potential to fire when least expected can fill the most seasoned kernel hackers with dread.
AArch64 (aka arm64 in the Linux tree) does not provide architected NMIs, a consequence being that features benefiting from NMIs see their use limited on AArch64. However, the Arm Generic Interrupt Controller (GIC) supports interrupt prioritization and masking, which, among other things, provides a way to control whether or not a set of interrupts can be signaled to a CPU.
This talk will cover how, using the GIC interrupt priorities, we provide a way to configure some interrupts to behave in an NMI-like manner on AArch64. We’ll discuss the implementation, some of the complications that ensued and also some of the benefits obtained from it.
Julien Thierry
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernelAnne Nicolas
At a rate of almost 9 changes per hour (24/7), the Linux kernel is definitely a scary beast. Bugs are introduced on a daily basis and, through the use of multiple code analyzers, *some* of them are detected and fixed before they hit mainline. Over the course of the last few years, Gustavo has been fixing such bugs and many different issues in every corner of the Linux kernel. Recently, he was in charge of leading the efforts to globally enable -Wimplicit-fallthrough; which appears by default in Linux v5.3. This presentation is a report on all the stuff Gustavo has found and fixed in the kernel with the support of the Core Infrastructure Initiative.
Gustavo A.R. Silva
Kernel Recipes 2019 - Metrics are moneyAnne Nicolas
In I.T. we all use all kinds of metrics. Operations teams rely heavily on these, especially when things go south. These metrics are sometimes overrated. Let’s dive into a few real life stories together.
Aurélien Rougemont
Kernel Recipes 2019 - Kernel documentation: past, present, and futureAnne Nicolas
This document discusses the current state and future plans for Linux kernel documentation. It notes that documentation has transitioned from DocBook to Sphinx/RST, improving formatting and integration. There are now over 3,000 documentation files and many kerneldoc comments. Future plans include converting remaining text files, updating ancient documents, improving organization by topic, integrating documentation better, and enhancing the documentation toolchain. The goal is to improve documentation to better help kernel developers, users and the community.
Embedded Recipes 2019 - Knowing your ARM from your ARSE: wading through the t...Anne Nicolas
Modern SoC designs incorporate technologies from numerous vendors, each with their own inconsistent, confusing, undocumented and even contradictory terminology. The result is a mess of acronyms and product names which have a surprising impact on the ability to develop reusable, modular code thanks to the nature of the underlying IP being obscured.
This presentation will dive into some of the misnomers plaguing the Arm ecosystem, with the aim of explaining why things are like they are, how they fit together under the architectural umbrella and how you, as a developer, can decipher the baffling ingredients list of your next SoC design!
Will Deacon
Kernel Recipes 2019 - GNU poke, an extensible editor for structured binary dataAnne Nicolas
GNU poke is a new interactive editor for binary data. Not limited to editing basic ntities such as bits and bytes, it provides a full-fledged procedural, interactive programming language designed to describe data structures and to operate on them. Once a user has defined a structure for binary data (usually matching some file format) she can search, inspect, create, shuffle and modify abstract entities such as ELF relocations, MP3 tags, DWARF expressions, partition table entries, and so on, with primitives resembling simple editing of bits and bytes. The program comes with a library of already written descriptions (or “pickles” in poke parlance) for many binary formats.
GNU poke is useful in many domains. It is very well suited to aid in the development of programs that operate on binary files, such as assemblers and linkers. This was in fact the primary inspiration that brought me to write it: easily injecting flaws into ELF files in order to reproduce toolchain bugs. Also, due to its flexibility, poke is also very useful for reverse engineering, where the real structure of the data being edited is discovered by experiment, interactively. It is also good for the fast development of prototypes for programs like linkers, compressors or filters, and it provides a convenient foundation to write other utilities such as diff and patch tools for binary files.
This talk (unlike Gaul) is divided into four parts. First I will introduce the program and show what it does: from simple bits/bytes editing to user-defined structures. Then I will show some of the internals, and how poke is implemented. The third block will cover the way of using Poke to describe user data, which is to say the art of writing “pickles”. The presentation ends with a status of the project, a call for hackers, and a hint at future works.
Jose E. Marchesi
Kernel Recipes 2019 - Analyzing changes to the binary interface exposed by th...Anne Nicolas
Operating system distributors often face challenges that are somewhat different from that of upstream kernel developers. For instance, some kernel updates often need to stay at least binary compatible with modules that might be “out of tree” for some time.
In that context, being able to automatically detect and analyze changes to the binary interface exposed by the kernel to its module does have some noticeable value.
The Libabigail framework is capable of analyzing ELF binaries along with their accompanying debug info in the DWARF format, detect and report changes in types, functions, variables and ELF symbols. It has historically supported that for user space shared libraries and application so we worked to make it understand the Linux kernel
binaries.
In this presentation, we are going to present the current support of ABI analysis for Linux Kernel binaries, the challenges we face, how we address them and the plans we have for the future.
Dodji Seketeli, Jessica Yu, Matthias Männich
Embedded Recipes 2019 - Remote update adventures with RAUC, Yocto and BareboxAnne Nicolas
Different upgrade and update strategies exist when it comes to embedded Linux system. If at development time none of these strategies have been chosen, adding them afterwards can be tedious task.
Even harder it gets when the system is already deployed in the field and only accessible via a 3G connection.
This talk is a developer experience of putting in place exactly that. Giving a return of experience on one way of doing it on a system running Barebox and a Yocto-based distribution.
Patrick Boettcher
Embedded Recipes 2019 - Making embedded graphics less specialAnne Nicolas
Open-source graphics drivers are helping to make embedded systems less specialized by standardizing interfaces like KMS, DRI, and Mesa. Mesa provides implementations of graphics APIs like OpenGL and Vulkan via drivers for different GPUs. Combined with Wayland, this allows applications to run on embedded systems like a modern Linux desktop without platform-specific code. Current open-source drivers like Freedreno, Etnaviv, and Panfrost support GPUs from vendors like Qualcomm and ARM and are under active development by their respective communities.
Embedded Recipes 2019 - Linux on Open Source Hardware and Libre SiliconAnne Nicolas
This talk will explore Open Source Hardware projects relevant to Linux, including boards like BeagleBone, Olimex OLinuXino, Giant board and more. Looking at the benefits and challenges of designing Open Source Hardware for a Linux system, along with BeagleBoard.org’s experience of working with community, manufacturers, and distributors to create an Open Source Hardware platform. In closing also looking at the future, Libre Silicon like RISC-V designs, and where this might take Linux.
Drew Fustini
Embedded Recipes 2019 - From maintaining I2C to the big (embedded) pictureAnne Nicolas
The I2C subsystem is not the shiniest part of the Linux Kernel. For embedded devices, though, it is one of the many puzzle pieces which just have to work. Wolfram Sang has the experience of maintaining this subsystem for nearly 7 years now. This talk gives a short overview of how maintaining works in general and specifically in this subsystem. But mainly, it will highlight noteworthy points in the timeline and lessons learnt from that. It will present trends, not so much regarding I2C but more the Linux Kernel and the embedded ecosystem in general. And of course, there will be plenty of anecdotes and bits from behind the scenes for your entertainment.
Wolfram Sang
Embedded Recipes 2019 - Testing firmware the devops wayAnne Nicolas
ITRenew is selling recertified OCP servers under the Sesame brand, those servers come either with their original UEFI BIOS or with LinuxBoot. The LinuxBoot project is pushing the Linux kernel inside bios flash and using userland programs as bootloader.
To achieve quality on our software stack, as any project, we need to test it. Traditional BIOS are tested by hand, this is 2019 we need to do it automatically! We already presented the hardware setup behind the LinuxBoot CI, this talk will focus on the software.
We use u-root for our userland bootloader; this software is written in Go so we naturally choose to use Go for our testing too. We will present how we are using and extending the Go native test framework `go test` for testing embedded systems (serial console) and improving the report format for integration to a CI.
Julien Viard de Galbert
Embedded Recipes 2019 - Herd your socs become a matchmakerAnne Nicolas
This document discusses device tree matching for embedded systems. It covers:
- The history of device matching in Linux, from architecture-specific to discoverable buses to the modern device framework.
- How device trees improved the situation by providing a standardized hardware description and stable ABI, separating this from OS configuration.
- Best practices for designing compatible value schemes and supporting a variety of similar devices across SoC families in device trees and drivers.
- Different techniques for matching devices and drivers like compatible values, soc_device_match(), and quirk handling classes.
- Examples of implementing matching for Renesas SoCs and addressing differences across revisions.
Embedded Recipes 2019 - LLVM / Clang integrationAnne Nicolas
This document discusses the integration of LLVM and Clang into Buildroot. It provides an overview of LLVM and Clang, the objectives of integrating them into Buildroot, and the initial integration work done. It then covers cross-compiling Linux with Clang and some conclusions. Some key points include:
- LLVM is an open source compiler infrastructure that includes the Clang C/C++ compiler. It uses an intermediate representation and focuses on compile time and code generation performance.
- The initial integration into Buildroot included LLVM, Clang, and Mesa 3D driver support. It aimed to enable OpenCL and add new packages.
- Cross-compiling Linux and userspace applications with Clang
Embedded Recipes 2019 - Introduction to JTAG debuggingAnne Nicolas
This talk introduces JTAG debugging capabilities, both for debugging hardware and software. Marek first explains what the JTAG stands for and explains the operation of the JTAG state machine. This is followed by an introduction to free software JTAG tools, OpenOCD and urJTAG. Marek shortly explains how to debug software using those tools and how that ties into the JTAG state machine. However, JTAG was designed for testing hardware. Marek explains what boundary scan testing (BST) is, what are BSDL files and their format, and practically demonstrates how to blink an LED using BST and only free software tools.
Marek Vasut
Embedded Recipes 2019 - Pipewire a new foundation for embedded multimediaAnne Nicolas
PipeWire is an open source project that aims to greatly improve audio and video handling under Linux. Utilising a fresh design, it bridges use cases that have been previously addressed by different tools – or not addressed at all -, providing ground for building complex, yet secure and efficient, multimedia systems.
In this talk, Julien is going to present the PipeWire project and the concepts that make up its design. In addition, he is going to give an update of the current and future work going on around PipeWire, both upstream and in Automotive Grade Linux, an early adopter that Julien is actively working on.
Julian Bouzas
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all startedAnne Nicolas
The document describes the ftrace function tracing tool in Linux kernels. It allows attaching to functions in the kernel to trace function calls. It works by having the GCC compiler insert indirect function entry calls. These calls are recorded during linking and replaced with nops at boot time for efficiency. This allows function tracing with low overhead by tracing the indirect function entry calls.
Kernel Recipes 2019 - Suricata and XDPAnne Nicolas
Suricata is a network threat detection engine using network packets capture to reconstruct the traffic till the application layer and find threats on the network using rules that define behavior to detect. This task is really CPU intensive and discarding non interesting traffic is a solution to enable a scaling of Suricata to 40gbps and other.
This talk will present the latest evolution of Suricata that knows uses eBPF and XDP to bypass traffic. Suricata 5.0 is supporting the hardware XDP to provide ypass with network card such as Netronome. It also takes advantage of pinned maps to get persistance of the bypassed flows. This talk will cover the different usage of XDP and eBPF in Suricata and shows how it impact performance and usability. If development time permit, the talk will also cover AF_XDP and the impact on this new capture method on Suricata.
Eric Leblond
Kernel Recipes 2019 - Marvels of Memory Auto-configuration (SPD)Anne Nicolas
System memory configuration is a transparent operation nowadays, something that we all came to expect to just work out of the box. Still, it does happen behind the scenes every single time we boot our computers. This requires the cooperation of hardware components on the mainboard and on memory modules themselves, as well as firmware code to drive these. While it is possible to just let it happen, having a deeper understanding of how it works makes it possible to access valuable information from the operating system at run-time.
I will take you through the history of system memory configuration from the mid 70s to now. We will explore the different types of memory modules, how their configuration data is stored and how the firmware can access them. We will see which problems had to be solved along the way and how they were solved. Lastly we will see how Linux supports reading the memory configuration information and what you can do with that information.
Jean Delvare
Cultural Shifts: Embracing DevOps for Organizational TransformationMindfire Solution
Mindfire Solutions specializes in DevOps services, facilitating digital transformation through streamlined software development and operational efficiency. Their expertise enhances collaboration, accelerates delivery cycles, and ensures scalability using cloud-native technologies. Mindfire Solutions empowers businesses to innovate rapidly and maintain competitive advantage in dynamic market landscapes.
A Comparative Analysis of Functional and Non-Functional Testing.pdfkalichargn70th171
A robust software testing strategy encompassing functional and non-functional testing is fundamental for development teams. These twin pillars are essential for ensuring the success of your applications. But why are they so critical?
Functional testing rigorously examines the application's processes against predefined requirements, ensuring they align seamlessly. Conversely, non-functional testing evaluates performance and reliability under load, enhancing the end-user experience.
Are you wondering how to migrate to the Cloud? At the ITB session, we addressed the challenge of managing multiple ColdFusion licenses and AWS EC2 instances. Discover how you can consolidate with just one EC2 instance capable of running over 50 apps using CommandBox ColdFusion. This solution supports both ColdFusion flavors and includes cb-websites, a GoLang binary for managing CommandBox websites.
Responsibilities of Fleet Managers and How TrackoBit Can Assist.pdfTrackobit
What do fleet managers do? What are their duties, responsibilities, and challenges? And what makes a fleet manager effective and successful? This blog answers all these questions.
An MVP (Minimum Viable Product) mobile application is a streamlined version of a mobile app that includes only the core features necessary to address the primary needs of its users. The purpose of an MVP is to validate the app concept with minimal resources, gather user feedback, and identify any areas for improvement before investing in a full-scale development. This approach allows businesses to quickly launch their app, test its market viability, and make data-driven decisions for future enhancements, ensuring a higher likelihood of success and user satisfaction.
Seamless PostgreSQL to Snowflake Data Transfer in 8 Simple StepsEstuary Flow
Unlock the full potential of your data by effortlessly migrating from PostgreSQL to Snowflake, the leading cloud data warehouse. This comprehensive guide presents an easy-to-follow 8-step process using Estuary Flow, an open-source data operations platform designed to simplify data pipelines.
Discover how to seamlessly transfer your PostgreSQL data to Snowflake, leveraging Estuary Flow's intuitive interface and powerful real-time replication capabilities. Harness the power of both platforms to create a robust data ecosystem that drives business intelligence, analytics, and data-driven decision-making.
Key Takeaways:
1. Effortless Migration: Learn how to migrate your PostgreSQL data to Snowflake in 8 simple steps, even with limited technical expertise.
2. Real-Time Insights: Achieve near-instantaneous data syncing for up-to-the-minute analytics and reporting.
3. Cost-Effective Solution: Lower your total cost of ownership (TCO) with Estuary Flow's efficient and scalable architecture.
4. Seamless Integration: Combine the strengths of PostgreSQL's transactional power with Snowflake's cloud-native scalability and data warehousing features.
Don't miss out on this opportunity to unlock the full potential of your data. Read & Download this comprehensive guide now and embark on a seamless data journey from PostgreSQL to Snowflake with Estuary Flow!
Try it Free: https://dashboard.estuary.dev/register
Sami provided a beginner-friendly introduction to Amazon Web Services (AWS), covering essential terms, products, and services for cloud deployment. Participants explored AWS' latest Gen AI offerings, making it accessible for those starting their cloud journey or integrating AI into coding practices.
React and Next.js are complementary tools in web development. React, a JavaScript library, specializes in building user interfaces with its component-based architecture and efficient state management. Next.js extends React by providing server-side rendering, routing, and other utilities, making it ideal for building SEO-friendly, high-performance web applications.
React Native vs Flutter - SSTech SystemSSTech System
Your project needs and long-term objectives will ultimately choose which of React Native and Flutter to use. For applications using JavaScript and current web technologies in particular, React Native is a mature and trustworthy choice. For projects that value performance and customizability across many platforms, Flutter, on the other hand, provides outstanding performance and a unified UI development experience.
COMPSAC 2024 D&I Panel: Charting a Course for Equity: Strategies for Overcomi...Hironori Washizaki
Hironori Washizaki, "Charting a Course for Equity: Strategies for Overcoming Challenges and Promoting Inclusion in the Metaverse", IEEE COMPSAC 2024 D&I Panel, 2024.
Discover the Power of ONEMONITAR: The Ultimate Mobile Spy App for Android Dev...onemonitarsoftware
Unlock the full potential of mobile monitoring with ONEMONITAR. Our advanced and discreet app offers a comprehensive suite of features, including hidden call recording, real-time GPS tracking, message monitoring, and much more.
Perfect for parents, employers, and anyone needing a reliable solution, ONEMONITAR ensures you stay informed and in control. Explore the key features of ONEMONITAR and see why it’s the trusted choice for Android device monitoring.
Share this infographic to spread the word about the ultimate mobile spy app!
3. What is Latency?
3
“Latency is a time interval between the stimulation and
response, or, from a more general point of view, a time
delay between the cause and the effect of some physical
change in the system being observed.” - Wikipedia
4. What is Latency?
●
The time between
– When something needs to be done
– When that something actually gets done
4
“Latency is a time interval between the stimulation and
response, or, from a more general point of view, a time
delay between the cause and the effect of some physical
change in the system being observed.” - Wikipedia
5. What is Latency?
●
The time between
– When something needs to be done
– When that something actually gets done
●
The time between
– When something happens
– When it is seen
5
“Latency is a time interval between the stimulation and
response, or, from a more general point of view, a time
delay between the cause and the effect of some physical
change in the system being observed.” - Wikipedia
6. There is ALWAYS Latency
●
Nothing happens instantaneously
●
There’s always going to be some “lag”
●
Real Time systems avoid “Unbounded Latency”
– When all latency is bound to a worse case scenario
– Any latency that is indeterminate can cause problems
●
Non Real-Time systems still care about latency
6
7. Why do we care about Latency?
●
For Real Time tasks, it is critical
– Need to know worse case latency
– Guarantee that it can achieve its objective
●
Don’t want airplane flaps to not react quick enough!
8. Why do we care about Latency?
●
Practically everyone/everything cares about Latency
– How long would you wait for ‘a’ to show up after clicking ‘a’?
– A search result that takes 5 minutes to answer!
– All systems are therefore “real time”!
●
Needs to be bounded to some arbitrary number
●
Latency adds up
– A system’s latency is the combination of its sub-components
9. Where is that Latency?
●
Latency adds up
– A system’s latency is the combination of its sub-components
●
It can be hard finding where your latency is
●
What causes it
– The application
– The libraries
– The operating system
– The hardware
10. Where is that Latency?
HARDWARE
Kernel
Library
Application
BIOS
11. Hardware Latency
●
Can’t get better than what the hardware gives you
●
Sources of HW latency
– System Management Interrupts (SMI)
●
The BIOS takes over the machine
– Clock frequency
●
The system slows down
– Cache line bouncing
●
Sharing the same variable among CPUs
12. Cyclictest
●
A tool to measure latency
●
Runs a simple loop (in a high priority task)
– Sleep for a specified time
– Get timestamp when wakes up
– Compare the difference
●
Best to use nanosleep
– Can also use signals, but that has high latency!
●
Favorite application of the Linux RT folks
13. Cyclictest
start = gettimeofday()
Sleep 250us
user-space
kernel
Put task to sleep
Set interrupt timer for 250us
Timer Interrupt goes off
Wake up Task
Schedule Taskend = gettimeofday()
jitter = (end - start) - 250us
250us
14. Cyclictest
start = gettimeofday()
Sleep 250us
user-space
kernel
Put task to sleep
Set interrupt timer for 250us
Timer Interrupt goes off
Wake up Task
Schedule Taskend = gettimeofday()
jitter = (end - start) - 250us
250us
latency (jitter)
15. Cyclictest
start = gettimeofday()
Sleep 250us
user-space
kernel
Put task to sleep
Set interrupt timer for 250us
Timer Interrupt goes off
Wake up Task
Schedule Taskend = gettimeofday()
jitter = (end - start) - 250us
250us
latency (jitter)
interrupt latency
16. Cyclictest
start = gettimeofday()
Sleep 250us
user-space
kernel
Put task to sleep
Set interrupt timer for 250us
Timer Interrupt goes off
Wake up Task
Schedule Taskend = gettimeofday()
jitter = (end - start) - 250us
250us
latency (jitter)
interrupt latency
wakeuplatency
17. Cyclictest
●
The options I am using here
– “-p80” : set the starting priority to 80
– “-i250” : set the starting interval to 250 microseconds
●
Sets nanosleep to the next 250 microsend interval
– “-n” : use nanosleep and not POSIX timers
●
POSIX timers will use signals, they have horrible latency
– “-a” : Set affinity of a task to each CPU
– “-t” : one thread per available processor (or logical CPU)
– “-q” : keep quiet while running
– “-d0” : Keep the interval the same for all threads (all at 250 microseconds)
18. The tools
●
Measuring latency
– cyclictest
●
Using Tracing tools can help find what’s happening
– Ftrace
●
hwlat
●
function and function graph tracer
●
event tracer
●
trace-cmd / KernelShark
– Lockdep (lock contention)
– Perf
●
profiling
●
statistics
– Many others (but I’m focusing on the Ftrace and and a bit of Lockdep)
19. Hardware Latency
●
Can’t get better than what the hardware gives you
●
Sources of HW latency
– System Management Interrupts (SMI)
●
The BIOS takes over the machine
– Clock frequency
●
The system slows down
– Cache line bouncing
●
Sharing the same variable among CPUs
20. Hardware Latency Detector
●
Part of Ftrace
– CONFIG_HWLAT_TRACER (in kernel .config build file)
●
cd /sys/kernel/tracing (tracefs directory)
– mount -t tracefs nodev /sys/kernel/tracing
●
echo hwlat > current_tracer
●
cat trace
24. Hardware Latency Algorithm
●
Runs a kernel thread
●
Does a tight loop for a period of time
●
Interrupts are disabled (during the loop)
●
Moves around all CPUs
●
Records when the timestamp happened
●
Denotes if it happened inside or outside the double timestamp
26. Hardware Latency Detector
# cd /sys/kernel/tracing
# cat hwlat_detector/width
500000
# cat hwlat_detector/window
1000000
# cat tracing_thresh
10
●
Spin for “width” microseconds
●
Every “window” microseconds
●
Record a trace if the diff is greater than “tracing_thresh” microseconds
27. Hardware Latency Detector (with NMIs)
●
It checks if an NMI triggered
●
NMIs are not hardware, but software controlled (in most cases)
●
nmi-total - The total time in NMI during the loop
●
nmi-count - The number of NMIs that were triggered
36. Wakeup Latency
●
The time a task wakes up, to the time it is scheduled
●
“wakeup” tracer
– Traces the time of the highest priority task (RT or Not)
– Interesting, but not very useful (hides RT task latency)
●
“wakeup_rt” tracer
– Only monitors RT tasks
●
trace-cmd record -p wakeup_rt -d -e all
– “-d” - disable function tracing
– “-e all” enable all events
36
42. IRQ and Preemption Latency
●
When interrupts or preemption is disabled
– Other tasks can not be scheduled in
●
Tracers:
– irqsoff
●
Interesting info
– preemptoff
●
interesting info
– preemptirqsoff
●
Useful info
●
The only one I use
42
45. IRQ and Preemption Latency
●
When interrupts or preemption is disabled
– Other tasks can not be scheduled in
●
Tracers:
– irqsoff
●
Interesting info
– preemptoff
●
interesting info
– preemptirqsoff
●
Useful info
●
The only one I use
45
46. The “trace_marker” and “tracing_on” files
●
/sys/kernel/tracing/trace_marker
– Let’s userspace write into the kernel ring buffer
– Application can write into it to tag where it discovered an issue
●
/sys/kernel/tracing/tracing_on
– Can enable or disable writing to the ring buffer
– Note, when off, it also prevents new “max latency” from being recorded
●
by the wakeup, and preempt/irqs off tracers
46
47. Using trace_marker and tracing_on in C code
static int marker_fd = -1;
static int tracing_fd = -1;
static __thread char buff[BUFSIZ+1];
static void write_marker(const char *fmt, ...)
{
va_list ap;
int n;
if (marker_fd < 0)
return;
va_start(ap, fmt);
n = vsnprintf(buff, BUFSIZ, fmt, ap);
va_end(ap);
write(marker_fd, buff, n);
}
static void set_tracing (int on)
{
char val[1];
if (tracing_fd < 0)
return;
val[0] = ‘0’ + on;
write(tracing_fd, val, 1);
}
[..]
// On error
write_marker(“Detected a latency of %d microsecondsn”, latency);
set_tracing(0);
48. Having cyclictest stop on max threshold
●
cyclictest is aware of the tracing infrastructure
– “-b 500” : break on if latency is greater than 500 microseconds
●
More options that we are not using but could have (All require -b option)
– “-E” enable all events
– “-f” enable function tracing
– “-I” enable irqsoff tracer
– “-P” enable preemptoff tracer
– “-w” enable wakeup tracer
– “-W” enable wakeup_rt tracer
48
50. The issue has been record (we hope)
●
The trace buffer has the issue (with a marker tag)
●
Save the text files (just to keep them around)
50
# cp /sys/kernel/tracing/trace /some/path/save_trace_dir/trace
# for file in /sys/kernel/tracing/per_cpu/cpu*/trace; do
cpu=${file/%/trace}
cp $file /some/path/save_trace_dir/`basename $cpu`
done
# ls /some/path/save_trace_dir
cpu0 cpu1 cpu2 cpu3 trace
51. For full power of trace-cmd
●
trace-cmd extract retrieves the kernel buffer into a trace.dat file
●
Can now do all sorts of filtering
51
# trace-cmd extract
# trace-cmd report 2>/dev/null |grep print
cyclictest-15129 [003] 23346.433979: print: tracing_mark_write: hit latency threshold (560 > 500)
52. For full power of trace-cmd
●
trace-cmd extract retrieves the kernel buffer into a trace.dat file
●
Can now do all sorts of filtering
52
# trace-cmd extract
# trace-cmd report 2>/dev/null |grep print
cyclictest-15129 [003] 23346.433979: print: tracing_mark_write: hit latency threshold (560 > 500)
54. Lock Stats (based off of Lockdep)
●
lockdep - checks correctness of locking
– Avoids various deadlock scenarios
●
lockstat - built off of lockdep
– Records when a task waits on a lock
– Records how much time it waited
– Records contention between different CPUs
– Records acquisitions too
●
See Documentation/locking/lockstats.txt
54
55. The /proc/lockstat file
●
The headers
– con-bounces : How many times contended across CPUs
– contentions : Number of locks acquisitions that had to wait
– wait time
●
min: Shortest (non-zero) wait time
●
max: Longest wait time
●
total: Total amount of time tasks had to wait
●
avg: The average time a task waited
– acq-bounces : Number of acquisitions that crossed CPUs
– acquisitions: The number of times the lock was acquired
– hold time - min: max: total: avg: Same as wait time but for time lock was held
55
56. trylock(lock) record lock contetion(&lock)
record lock acquisition(&lock)
lock(&lock)
How lockstats work