This document discusses various techniques for optimizing KVM performance on Linux systems. It covers CPU and memory optimization through techniques like vCPU pinning, NUMA affinity, transparent huge pages, KSM, and virtio_balloon. For networking, it discusses vhost-net, interrupt handling using MSI/MSI-X, and NAPI. It also covers block device optimization through I/O scheduling, cache mode, and asynchronous I/O. The goal is to provide guidance on configuring these techniques for workloads running in KVM virtual machines.
2. Index
● CPU & Memory
○ vCPU pinning
○ NUMA affinity
○ THP (Transparent Huge Page)
○ KSM (Kernel SamePage Merging) & virtio_balloon
● Networking
○ vhost_net
○ Interrupt handling
○ Large Segment Offload
● Block Device
○ I/O Scheduler
○ VM Cache mode
○ Asynchronous I/O
3. ● CPU & Memory
Modern CPU cache architecture and latency (Intel Sandy Bridge)
[Diagram: two CPU sockets connected via QPI; each core has private L1i/L1d and L2 caches, and all cores on a socket share the L3/LLC]
- L1i / L1d (per core, 32K each): 4 cycles, 0.5 ns
- L2 or MLC (per core, unified, 256K): 11 cycles, 7 ns
- L3 or LLC (shared): 14 ~ 38 cycles, 45 ns
- Local memory: 75 ~ 100 ns
- Remote memory (across QPI): 120 ~ 160 ns
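The cache and topology figures for a specific host can be inspected from the shell before deciding on a pinning strategy. A minimal sketch; lstopo-no-graphics assumes the hwloc package is installed:
# Per-level cache sizes as seen by the kernel
lscpu | grep -i cache
# Full topology view (sockets, cores, caches, NUMA nodes); from the hwloc package
lstopo-no-graphics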
4. ● CPU & Memory - vCPU pinning
[Diagram: KVM guests pinned to cores on two NUMA nodes (Node0 / Node1), each node with its own LLC and node memory]
* Pin a specific vCPU to a specific physical CPU
* vCPU pinning increases the CPU cache hit ratio
1. Discover the CPU topology.
- virsh capabilities
2. Pin a vCPU to a specific core.
- virsh vcpupin <domain> <vcpu-num> <cpu-num>
3. Print vCPU information.
- virsh vcpuinfo <domain>
* What about a multi-node virtual machine?
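As a concrete illustration of the steps above, pinning can also be made persistent in the libvirt domain XML. A minimal sketch, assuming a 4-vCPU guest named vm-test whose vCPUs should land on cores 0-3 of one node (names and numbers are illustrative):
# One-off pinning of vCPU 0 of guest "vm-test" to physical CPU 2
virsh vcpupin vm-test 0 2
# Persistent equivalent inside the domain XML (virsh edit vm-test):
#   <cputune>
#     <vcpupin vcpu='0' cpuset='0'/>
#     <vcpupin vcpu='1' cpuset='1'/>
#     <vcpupin vcpu='2' cpuset='2'/>
#     <vcpupin vcpu='3' cpuset='3'/>
#   </cputune>
# Verify the placement
virsh vcpuinfo vm-test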
5. ● CPU & Memory - vCPU pinning
[Benchmark chart: pinned vs. unpinned guest]
* Two times faster memory access than without pinning
- Shorter is better on the scheduling and mutex tests.
- Longer is better on the memory test.
6. ● CPU & Memory - NUMA affinity
CPU architecture (Intel Sandy Bridge)
[Diagram: four CPU sockets (CPU 0 ~ CPU 3) connected via QPI; each socket has its own cores, LLC, memory controller with locally attached memory, and I/O controller with its own PCI-E lanes]
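To see how this topology looks on a given host (which cores and how much memory belong to each node), the numactl tool can be used. A minimal sketch, assuming the numactl package is installed:
# List NUMA nodes, their CPUs, memory sizes and inter-node distances
numactl --hardware
# Show the NUMA policy of the current shell
numactl --show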
8. ● CPU & Memory - NUMA affinity
1. Determine where the pages of a VM are allocated.
- cat /proc/<PID>/numa_maps
- cat /sys/fs/cgroup/memory/sysdefault/libvirt/qemu/<KVM name>/memory.numa_stat
total=244973 N0=118375 N1=126598
file=81 N0=24 N1=57
anon=244892 N0=118351 N1=126541
unevictable=0 N0=0 N1=0
2. Change the memory policy mode.
- cgset -r cpuset.mems=<Node> sysdefault/libvirt/qemu/<KVM name>/emulator/
3. Migrate pages into a specific node.
- migratepages <PID> from-node to-node
- cat /proc/<PID>/status
* Memory policy modes
1) interleave : Memory will be allocated using round robin on nodes. When memory cannot be allocated on the current interleave target, fall back to other nodes.
2) bind : Only allocate memory from nodes. Allocation will fail when there is not enough memory available on these nodes.
3) preferred : Preferably allocate memory on node, but if memory cannot be allocated there, fall back to other nodes.
* “preferred” memory policy mode is not currently supported on cgroup
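A minimal sketch of the inspection/migration workflow above, assuming a guest QEMU process with PID 12345 and libvirt's default cgroup layout (both are illustrative; the cgroup path varies by distro and libvirt version):
PID=12345   # QEMU process of the guest (illustrative)
# Per-node page counts for the process (recent numactl versions)
numastat -p $PID
# Per-node breakdown from the libvirt cgroup
cat /sys/fs/cgroup/memory/sysdefault/libvirt/qemu/<KVM name>/memory.numa_stat
# Restrict future allocations to node 0 (cgroup-tools package)
cgset -r cpuset.mems=0 sysdefault/libvirt/qemu/<KVM name>/emulator/
# Move pages that are already allocated from node 1 to node 0 (numactl package)
migratepages $PID 1 0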
9. ● CPU & Memory - NUMA affinity
- NUMA reclaim
[Diagram: two NUMA nodes (Node0 / Node1), each holding free memory, mapped page cache, unmapped page cache and anonymous pages; with local reclaim, unmapped page cache on the local node is reclaimed instead of allocating from the remote node]
1. Check if zone reclaim is enabled.
- cat /proc/sys/vm/zone_reclaim_mode
0 (default) : the Linux kernel allocates the memory to a remote NUMA node where free memory is available.
1 : the Linux kernel reclaims unmapped page caches for the local NUMA node rather than immediately allocating the memory to a remote NUMA node.
It is known that a virtual machine causes zone reclaim to occur when KSM (Kernel Same-page Merging) is enabled or Hugepages are enabled on the virtual machine side.
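A minimal sketch of checking and, when remote allocations are preferable to reclaiming the local page cache, explicitly disabling zone reclaim on the host:
# 0 = allocate from remote nodes instead of reclaiming locally (default)
cat /proc/sys/vm/zone_reclaim_mode
# Disable zone reclaim explicitly (persist via /etc/sysctl.conf if desired)
sysctl -w vm.zone_reclaim_mode=0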
11. ● CPU & Memory - THP (Transparent Huge Page)
- Memory address translation in 64-bit Linux
[Diagram: page-table walk starting at cr3, through the Page Global Directory, Page Upper Directory, Page Middle Directory and Page Table to the physical page]
The 48-bit linear (virtual) address is split into:
- Global Dir index (9 bit)
- Upper Dir index (9 bit)
- Middle Dir index (9 bit)
- Table index (9 bit for 4KB pages, 0 bit for 2MB pages)
- Offset (12 bit for 4KB pages, 21 bit for 2MB pages)
With a 2MB page the Page Table level is skipped, so the walk is reduced by one step!
12. ● CPU & Memory - THP (Transparent Huge Page)
Paging hardware with TLB (Translation Lookaside Buffer)
* Translation Lookaside Buffer (TLB): a cache that memory management hardware uses to improve virtual address translation speed.
* The TLB is also a kind of cache memory in the CPU.
* Then, how can we increase the TLB hit ratio?
- The TLB can hold only 8 ~ 1024 entries.
- Decrease the number of pages, i.e. on a 32GB memory system:
8,388,608 pages with a 4KB page block
16,384 pages with a 2MB page block
13. ● CPU & Memory - THP (Transparent Huge Page)
THP performance benchmark - MySQL 5.5 OLTP testing
* A guest machine also has to use HugePage for the best effect
14. ● CPU & Memory - THP (Transparent Huge Page)
1. Check the current THP configuration
- cat /sys/kernel/mm/transparent_hugepage/enabled
2. Configure the THP mode
- echo mode > /sys/kernel/mm/transparent_hugepage/enabled
3. Monitor Huge page usage
- cat /proc/meminfo | grep Huge
janghoon@machine-04:~$ grep Huge /proc/meminfo
AnonHugePages:    462848 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
- grep Huge /proc/<PID>/smaps
4. Adjust parameters under /sys/kernel/mm/transparent_hugepage/khugepaged
- grep thp /proc/vmstat
* THP modes
1) always : use HugePages always
2) madvise : use HugePages only in specific regions, madvise(MADV_HUGEPAGE). Default on Ubuntu precise
3) never : not use HugePages
* Currently it only works for anonymous memory mappings, but in the future it can expand over the pagecache layer starting with tmpfs. - Linux Kernel Documentation/vm/transhuge.txt
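For completeness, a minimal sketch of switching the THP mode on the host and of backing a guest with hugepages through libvirt (the guest name vm-test and the reservation size are illustrative):
# Host: let khugepaged collapse anonymous memory into 2MB pages
echo always > /sys/kernel/mm/transparent_hugepage/enabled
# Alternatively, reserve static hugepages (e.g. 2048 x 2MB = 4GB)
echo 2048 > /proc/sys/vm/nr_hugepages
# Guest definition (virsh edit vm-test): back guest RAM with hugepages
#   <memoryBacking>
#     <hugepages/>
#   </memoryBacking>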
15. ● CPU & Memory - KSM & virtio_balloon
- KSM (Kernel SamePage Merging)
[Diagram: pages marked MADV_MERGEABLE in Guest 0, Guest 1 and Guest 2 are merged by KSM]
1. A kernel feature in KVM that shares memory pages between various processes, over-committing the memory.
2. Only merges anonymous (private) pages.
3. Enable KSM
- echo "1" > /sys/kernel/mm/ksm/run
4. Monitor KSM
- Files under /sys/kernel/mm/ksm/
5. For NUMA (in Linux 3.9)
- /sys/kernel/mm/ksm/merge_across_nodes
0 : merge only pages in the memory area of the same NUMA node
1 : merge pages across nodes
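A minimal sketch of monitoring how effective KSM is on a host, using the standard KSM counters in sysfs:
# Enable KSM scanning
echo 1 > /sys/kernel/mm/ksm/run
# How many pages are shared, and how many sharers point at them
cat /sys/kernel/mm/ksm/pages_shared
cat /sys/kernel/mm/ksm/pages_sharing
# pages_sharing / pages_shared gives a rough merge ratio; full_scans counts completed passes
cat /sys/kernel/mm/ksm/full_scans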
16. ● CPU & Memory - KSM & virtio_balloon
- Virtio_balloon
[Diagram: VM 0 and VM 1 on a host, before and after ballooning; as a balloon inflates, the guest's usable memory (currentMemory) shrinks and the host reclaims the difference, up to the configured Memory maximum]
1. The hypervisor sends a request to the guest operating system to return some amount of memory back to the hypervisor.
2. The virtio_balloon driver in the guest operating system receives the request from the hypervisor.
3. The virtio_balloon driver inflates a balloon of memory inside the guest operating system.
4. The guest operating system returns the balloon of memory back to the hypervisor.
5. The hypervisor allocates the memory from the balloon elsewhere as needed.
6. If the memory from the balloon later becomes available, the hypervisor can return the memory to the guest operating system.
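A minimal sketch of driving the balloon from the host with libvirt (the guest name vm-test and the sizes are illustrative; <memory> in the domain XML is the ceiling, <currentMemory> the ballooned target):
# Balloon statistics reported by the guest driver
virsh dommemstat vm-test
# Shrink the guest to 2 GiB at runtime (inflates the balloon inside the guest; size in KiB)
virsh setmem vm-test 2097152 --live
# Grow it back toward the <memory> maximum defined in the domain XML
virsh setmem vm-test 4194304 --live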
21. ● Networking - Interrupt handling
2. MSI (Message Signaled Interrupt), MSI-X
- Make sure MSI-X is enabled
janghoon@machine-04:~$ sudo lspci -vs 01:00.1 | grep MSI
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Capabilities: [70] MSI-X: Enable+ Count=10 Masked-
Capabilities: [a0] Express Endpoint, MSI 00
3. NAPI (New API)
- a feature in the Linux kernel aiming to improve the performance of high-speed networking and avoid interrupt storms.
- Ask your NIC vendor whether it's enabled by default.
Most modern NICs support multi-queue, MSI-X and NAPI.
However, you may need to make sure these features are configured correctly and working properly.
http://en.wikipedia.org/wiki/Message_Signaled_Interrupts
http://en.wikipedia.org/wiki/New_API
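To see whether the NIC's queues actually use separate MSI-X vectors, and which CPUs are servicing them, the interrupt table can be checked. A minimal sketch, assuming the interface is eth0 (illustrative):
# One line per MSI-X vector/queue, with per-CPU interrupt counts
grep eth0 /proc/interrupts
# Number of RX/TX queues the driver exposes (driver support permitting)
ethtool -l eth0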
22. ● Networking - Interrupt handling
4. IRQ Affinity (Pin Interrupts to the local node)
[Diagram: two CPU sockets (CPU 0 / CPU 1) connected via QPI, each with its own cores, memory controller with local memory, and I/O controller with its own PCI-E lanes]
Each node has PCI-E devices connected directly (on Intel Sandy Bridge).
Where is my 10G NIC connected to?
23. ● Networking - Interrupt handling
4. IRQ Affinity (Pin Interrupts to the local node)
1) stop irqbalance service
2) determine node that a NIC is connected to
- lspci -tv
- lspci -vs <PCI device bus address>
- dmidecode -t slot
- cat /sys/devices/pci*/<bus address>/numa_node (-1 : not detected)
- cat /proc/irq/<Interrupt number>/node
3) Pin interrupts from the NIC to specific node
- echo f > /proc/irq/<Interrupt number>/smp_affinity
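Tying the steps above together, a minimal sketch that pins all interrupts of a NIC to CPUs on its local node (eth0, the IRQ numbers and the 0xf CPU mask are illustrative and depend on the host):
# 1) Stop the irqbalance service so it does not override manual affinity
service irqbalance stop
# 2) Find the node the NIC is attached to (-1 means not reported)
cat /sys/class/net/eth0/device/numa_node
# 3) Pin each of the NIC's IRQs to CPUs on that node
for irq in $(grep eth0 /proc/interrupts | awk -F: '{print $1}'); do
    echo f > /proc/irq/$irq/smp_affinity   # mask 0xf = CPUs 0-3; adjust to the local node
done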
24. ● Networking - Large Segment Offload
[Diagram: without LSO, the kernel segments the application data (e.g. 4K) and attaches a header to each segment before handing them to the NIC; with LSO, the kernel passes the data plus header metadata in one piece and the NIC performs the segmentation]
janghoon@ubuntu-precise:~$ sudo ethtool -k eth0
Offload parameters for eth0:
tcp-segmentation-offload: on
udp-fragmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
25. ● Networking - Large Segment Offload
1. 100% throughput performance improvement
2. GSO/GRO, TSO and UFO are enabled by default on Ubuntu precise 12.04 LTS
3. Jumbo frames are not especially effective
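A minimal sketch of checking and toggling these offloads with ethtool (eth0 is illustrative); note that lower-case -k shows the settings and upper-case -K changes them:
# Show current offload settings
ethtool -k eth0
# Enable TCP segmentation offload, generic segmentation offload and generic receive offload
ethtool -K eth0 tso on gso on gro on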
26. ● Block device - I/O Scheduler
1. Noop : a basic FIFO queue.
2. Deadline : I/O requests are placed in a priority queue and are guaranteed to be run within a certain time; low latency.
3. CFQ (Completely Fair Queueing) : I/O requests are distributed to a number of per-process queues; the default I/O scheduler on Ubuntu.
janghoon@ubuntu-precise:~$ sudo cat /sys/block/sdb/queue/scheduler
noop deadline [cfq]
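A minimal sketch of switching the scheduler for one disk at runtime (sdb is illustrative; inside a guest whose I/O is already scheduled by the host, noop or deadline is a common choice):
# Show available schedulers; the active one is in brackets
cat /sys/block/sdb/queue/scheduler
# Switch this disk to the deadline scheduler (not persistent across reboots)
echo deadline > /sys/block/sdb/queue/scheduler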
27. ● Block device - VM Cache mode
Disable host page cache
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2' cache='none'/>
<source file='/mnt/VM_IMAGES/VM-test.img'/>
<target dev='vda' bus='virtio'/>
<alias name='virtio-disk0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>
[Diagram: guest read/write paths through the host page cache and the physical disk's write cache for the four cache modes]
- writeback : reads and writes use the host page cache; writes are reported complete once they reach the host page cache
- writethrough : reads use the host page cache, but writes are reported complete only after reaching the physical disk
- none : the host page cache is bypassed (O_DIRECT); the disk write cache is still used
- directsync : the host page cache is bypassed and writes are reported complete only after reaching the physical disk