The document discusses just-in-time (JIT) compilers in the Java Virtual Machine (JVM). It describes how JIT compilers work by compiling bytecode to native machine code during execution based on profiling information. This allows for optimizations like inlining, devirtualization, loop unrolling and eliding unnecessary synchronization that improve performance. The JIT compiler uses feedback from profiling to enable more aggressive optimizations like these.
Java on Kubernetes may seem complicated, but after a bit of YAML and Dockerfiles, you will wonder what all the fuss was about. But then the performance of your app on 1 CPU and 1 GB of RAM makes you wonder. Learn how JVM ergonomics, CPU throttling, and GCs can help increase performance while reducing costs.
Jemalloc can help debug memory leaks in ATS plugins. It provides memory profiling by sampling memory allocations and dumping profiles to files. These profiles can then be rendered as call-graph images for analysis. The author provides two case studies where jemalloc helped identify leaks: a months-long leak in ATS fronting APIs, and a 12-hour leak from a bug in their Brotli plugin. Jemalloc also improved ATS scalability by addressing issues where memory operations and plugins stressed the CPU to higher utilization.
This document discusses making Linux capable of hard real-time performance. It begins by defining hard and soft real-time systems and explaining that real-time does not necessarily mean fast but rather determinism. It then covers general concepts around real-time performance in Linux like preemption, interrupts, context switching, and scheduling. Specific features in Linux like RT-Preempt, priority inheritance, and threaded interrupts that improve real-time capabilities are also summarized.
GDB can debug programs by running them under its control. It allows inspecting and modifying program state through breakpoints, watchpoints, and examining variables and memory. GDB supports debugging optimized code, multi-threaded programs, and performing tasks like stepping, continuing, and backtracing through the call stack. It can also automate debugging through commands, scripts, and breakpoint actions.
Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...
Learn what you need to know to experience nirvana in the evaluation of G1 GC, even if you are migrating from Parallel GC to G1 GC or from CMS GC to G1 GC
You also get a walk-through of some case study data
G1 GC
Java Performance Analysis on Linux with Flame Graphs
This document discusses using Linux perf_events (perf) profiling tools to analyze Java performance on Linux. It describes how perf can provide complete visibility into Java, JVM, GC and system code but that Java profilers have limitations. It presents the solution of using perf to collect mixed-mode flame graphs that include Java method names and symbols. It also discusses fixing issues with broken Java stacks and missing symbols on x86 architectures in perf profiles.
Broken benchmarks, misleading metrics, and terrible tools. This talk will help you navigate the treacherous waters of Linux performance tools, touring common problems with system tools, metrics, statistics, visualizations, measurement overhead, and benchmarks. You might discover that tools you have been using for years are, in fact, misleading, dangerous, or broken.
The speaker, Brendan Gregg, has given many talks on tools that work, including giving the Linux Performance Tools talk originally at SCALE. This is an anti-version of that talk, focusing on broken tools and metrics instead of the working ones. Metrics can be misleading, and counters can be counter-intuitive! This talk will include advice for verifying new performance tools, understanding how they work, and using them successfully.
Video: https://www.youtube.com/watch?v=FJW8nGV4jxY and https://www.youtube.com/watch?v=zrr2nUln9Kk . Tutorial slides for O'Reilly Velocity SC 2015, by Brendan Gregg.
There are many performance tools nowadays for Linux, but how do they all fit together, and when do we use them? This tutorial explains methodologies for using these tools, and provides a tour of four tool types: observability, benchmarking, tuning, and static tuning. Many tools will be discussed, including top, iostat, tcpdump, sar, perf_events, ftrace, SystemTap, sysdig, and others, as well as observability frameworks in the Linux kernel: PMCs, tracepoints, kprobes, and uprobes.
This tutorial updates and extends an earlier talk that summarizes the Linux performance tool landscape. The value of this tutorial is not just learning that these tools exist and what they do, but hearing when and how they are used by a performance engineer to solve real world problems — important context that is typically not included in the standard documentation.
JDK Flight Recorder was introduced in OpenJDK 11.
It offers low-overhead profiling and can be used in production environments.
A high-performance recording engine is embedded in the HotSpot VM.
Linux uses /proc/iomem as a "Rosetta Stone" to establish relationships between software and hardware. /proc/iomem maps physical memory addresses to devices, similar to how the Rosetta Stone helped map Egyptian hieroglyphs to Greek and decode ancient Egyptian texts. This virtual file allows the kernel to interface with devices by providing address translations between physical and virtual memory spaces.
This is the presentation file used by Jim Huang (jserv) at OSDC.tw 2009. New compiler technologies are invisible yet deeply integrated into the world around us, and we can enrich the experience by building on LLVM.
This document is a term paper on Just-In-Time compilers (JIT). It begins with an acknowledgements section thanking the teacher for guidance. It then provides an introduction defining JIT as improving runtime performance of bytecode programs by compiling to machine code during execution. The paper discusses time-space tradeoffs of JIT and how JIT functions by compiling sections of bytecode to native code prior to execution. It also classifies JIT compilers based on invocation, executability, and concurrency. The conclusion restates that the paper provided an overview of JIT compilers.
This document provides an overview of JVM JIT compilers, specifically focusing on the HotSpot JVM compiler. It discusses the differences between static and dynamic compilation, how just-in-time compilation works in the JVM, profiling and optimizations performed by JIT compilers like inlining and devirtualization, and how to monitor the JIT compiler through options like -XX:+PrintCompilation and -XX:+PrintInlining.
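As a small illustration of those monitoring flags (a minimal sketch: the class and method names here are made up, and note that on HotSpot -XX:+PrintInlining is a diagnostic option that also requires -XX:+UnlockDiagnosticVMOptions):

    // Hypothetical demo; run with:
    //   java -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining JitDemo
    // -XX:+PrintCompilation logs each method as the JIT compiles it;
    // -XX:+PrintInlining additionally reports inlining decisions.
    public class JitDemo {
        // Small method that becomes hot after many calls and is a likely inlining candidate.
        private static int square(int x) {
            return x * x;
        }

        public static void main(String[] args) {
            long sum = 0;
            // Enough iterations to cross HotSpot's invocation thresholds and trigger compilation.
            for (int i = 0; i < 1_000_000; i++) {
                sum += square(i);
            }
            System.out.println(sum);
        }
    }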
This document discusses Java compilers and their impact on performance. It explains that Java uses a two-step compilation process to achieve both portability and speed. The first step compiles Java code to bytecode, while the second step just-in-time compiles the bytecode to native machine code. It describes how client-side compilers focus on fast startup times while server-side compilers emphasize long-term optimizations. Tiered compilation combines aspects of both. The document also introduces hotspot compilation, which optimizes frequently executed code sections.
Presto is Uber's distributed SQL query engine for their Hadoop data warehouse. Some key points:
- Presto allows interactive SQL queries directly on Uber's petabyte-scale Hadoop data lake without needing to first load the data into another database.
- It provides fast performance at scale by leveraging columnar data formats like Parquet and optimizing for distributed execution across many nodes.
- Uber deployed a 200 node Presto cluster that handles 30,000 queries per day, serving both ad hoc queries and real-time applications accessing data in Hadoop and improving on the performance of alternative solutions like Hive.
JIT compilation in the Java Virtual Machine (HighLoad++ 2013)
Delivering decent performance for a high-level, dynamically typed language is no easy task. Just-in-time (JIT) compilation, the dynamic generation of machine code using information collected while the application runs, is a key element of virtual machine performance (be it Java, .NET, or even JavaScript). The JIT compiler, in turn, needs an impressive arsenal of tricks and optimizations to compensate for the "dynamism" of the language.
The talk covers the achievements of modern JIT compilation in general and takes a closer look at the specifics of the HotSpot JVM (Oracle's free JVM).
This document provides an overview of just-in-time (JIT) compilers in the HotSpot Java Virtual Machine (JVM). It discusses the differences between static and dynamic compilation, how modern JVMs use dynamic compilers and profiling data to perform aggressive optimizations, and some of the specific optimizations used in the HotSpot JVM like inlining, devirtualization, and on-stack replacement.
This document summarizes the services and operations of a software development company with offices in Gdynia and Warsaw, Poland. The company has grown from 8 to 96 employees in 2 years. They offer dedicated software solutions, IT outsourcing, expert services, and software products. Their main technical skills include Java, JavaScript, PL/SQL, Android, C#, and C++ development. They emphasize quality assurance through practices like agile development, test automation, and transparency. The company recruits candidates through various sources and has deep engagement with the academic community through student projects, internships, and university partnerships.
The document discusses tuning garbage collection in the Java Virtual Machine. It provides recommendations for sizing generations based on an application's object longevity and size to reduce premature promotions which are a major cause of garbage collection pauses. Maintaining a low allocation rate and promotion rate can also help reduce garbage collection frequency. Plotting metrics like allocation rates, promotion rates, and heap occupancy over time can help analyze garbage collection performance.
The Performance Engineer's Guide To (OpenJDK) HotSpot Garbage Collection - Th...
This document provides an overview of garbage collection in the Java Virtual Machine. It discusses key concepts like generational collection, parallel and concurrent marking, and tuning garbage collection for throughput versus latency. Specific collectors like Parallel GC, CMS GC, and G1 GC are explained in terms of their marking and compaction algorithms. Memory tuning recommendations and analyzing garbage collection logs and heap dumps are also covered. The document concludes with a high-level explanation of the Garbage First garbage collector and how it uses region-based heap management.
JavaOne 2016 presentation slides on the Testarossa Just In Time compiler technology from the IBM J9 Java Virtual Machine, which IBM is contributing to open source (800KLOC to date on github at the Eclipse OMR project). This talk covers both the overall structure of the compiler and provides some details on the dynamic AOT technology available in Testarossa since 2006.
At the JavaOne keynote this year, Mark Reinhold talked about how Java 9 was much bigger than Jigsaw. To put that in numbers: 80+ JEPs bigger! Yes, we see more presentations on Jigsaw since it brings modularity to the once monolithic JDK. But what about those other JEPs?! One of those "other" JEPs is JEP 143, 'Improve Contended Locking'. Monica will apply her performance engineering approach and talk about JEP 143 and Oracle's Studio Analyzer Performance Tool. The crux of the presentation will entail comparing performance of contended locks in JDK 9 to JDK 8.
Managed runtime performance expert, Monica Beckwith will divulge her survival guide which is essential for any application performance engineer. Following simple rules and performance engineering patterns will make you and your stakeholders happy.
The document discusses LLVM and its use in building programming language compilers and runtimes. It provides an overview of LLVM, including its core components like its intermediate representation (IR), optimizations, and code generation capabilities. It also discusses how LLVM is used in various applications like Android, browsers, and graphics processing. Examples are given of using Clang and LLVM to compile and run a simple C program.
The document provides an overview of implementing a high-performance JavaScript engine. It discusses the key components including the parser, runtime, execution engine, garbage collector, and foreign function interface. It also covers various implementation strategies and tradeoffs for aspects like value representation, object models, execution engines, and garbage collection. The document emphasizes learning from Self VM and using techniques like hidden classes, inline caching, and tiered compilation and optimization.
This document provides an overview of just-in-time (JIT) and lean operations. It defines JIT and discusses its goals of eliminating waste and achieving smooth, rapid material flow. Key aspects covered include JIT building blocks like product design, process design and personnel elements. Benefits include reduced inventory, flexibility and increased productivity. The document also compares JIT to traditional systems and outlines steps to transition to JIT.
GNU Toolchain is the de facto standard of the IT industry and has been improved by comprehensive open source contributions. This session covers the mechanics of the compiler driver, system interaction (taking GNU/Linux as an example), the linker, the C runtime library, and the related dynamic linker. Instead of analyzing the system design, the session is use-case driven and illustrated progressively.
In Java 9, Garbage First Garbage Collector (G1 GC) will be the default GC. This presentation makes an effort to help Hotspot VM users to understand the concept of G1 GC as well as provides some tuning advice.
Introduce Brainf*ck, another Turing complete programming language. Then, try to implement the following from scratch: Interpreter, Compiler [x86_64 and ARM], and JIT Compiler.
In this presentation we will discuss the concept of the just in time (JIT) production philosophy, types and concepts of JIT, objectives of JIT manufacturing, a comparison between an ideal production system and JIT production, characteristics of a JIT system, and JIT manufacturing vs. JIT purchasing. We will also discuss the major tools and techniques of JIT manufacturing, the JIT implementation approach, problems regarding implementation of JIT, planning of a successful JIT system, obstacles faced in JIT conversion, and the operational benefits of JIT systems.
To know more about Welingkar School’s Distance Learning Program and courses offered, visit: http://www.welingkaronline.org/distance-learning/online-mba.html
Java Jit. Compilation and optimization by Andrey Kovalenko
This document discusses Java Just-In-Time (JIT) compilation. It describes JIT as compiling Java bytecode to native machine code during program execution rather than prior to execution. It outlines the main types of JIT compilers in HotSpot (client, server, tiered) and the key optimizations they perform like inlining, escape analysis, on-stack replacement, and tiered compilation. The document provides details on JIT tuning flags and how to get more profiling information from the JIT compiler logs. It emphasizes that letting the JIT do its work through warmup and avoiding microbenchmarks is important to achieving full performance.
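Since that summary stresses warmup and the pitfalls of hand-rolled microbenchmarks, here is a minimal JMH-style sketch (assuming the JMH library and its annotation processor are on the classpath; the class name and workload are made-up examples) in which the harness gives the JIT warmup iterations before anything is measured:

    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.BenchmarkMode;
    import org.openjdk.jmh.annotations.Fork;
    import org.openjdk.jmh.annotations.Measurement;
    import org.openjdk.jmh.annotations.Mode;
    import org.openjdk.jmh.annotations.Warmup;

    @BenchmarkMode(Mode.AverageTime)
    @Warmup(iterations = 5)      // let the JIT profile and compile the hot path first
    @Measurement(iterations = 5) // only these iterations count towards the reported score
    @Fork(1)                     // fresh JVM so earlier experiments don't pollute the profile
    public class ConcatBenchmark {
        @Benchmark
        public String concat() {
            // Returning the result keeps the work from being dead-code eliminated.
            return "id-" + System.nanoTime();
        }
    }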
A Java Implementer's Guide to Boosting Apache Spark Performance by Tim Ellison.
Apache Spark has rocked the big data landscape, quickly becoming the largest open source big data community with over 750 contributors from more than 200 organizations. Spark's core tenets of speed, ease of use, and its unified programming model fit neatly with the high performance, scalable, and manageable characteristics of modern Java runtimes. In this talk we introduce the Spark programming model, and describe some unique Java runtime capabilities in the JIT, fast networking, serialization techniques, and GPU off-loading that deliver the ultimate big data platform for solving business problems. We will show how solutions, previously infeasible with regular Java programming, become possible with a high performance Spark core runtime, enabling you to solve problems smarter and faster.
Apache Spark has rocked the big data landscape, becoming the largest open source big data community with over 750 contributors from more than 200 organizations. Spark's core tenets of speed, ease of use, and its unified programming model fit neatly with the high performance, scalable, and manageable characteristics of modern Java runtimes. In this talk Tim Ellison, a JVM developer at IBM, shows some of the unique Java 8 capabilities in the JIT compiler, fast networking, serialization techniques, and GPU off-loading that deliver the ultimate big data platform for solving business problems. Tim will demonstrate how solutions, previously infeasible with regular Java programming, become possible with this high performance Spark core runtime, enabling you to solve problems smarter and faster.
A Java Implementer's Guide to Better Apache Spark Performance
This document discusses techniques for improving the performance of Apache Spark applications. It describes optimizing the Java virtual machine by enhancing the just-in-time compiler, improving the object serializer, enabling faster I/O using technologies like RDMA networking and CAPI flash storage, and offloading tasks to graphics processors. The document provides examples of code style guidelines and specific Spark optimizations that further improve performance, such as leveraging hardware accelerators and tuning JVM heuristics.
Five cool ways the JVM can run Apache Spark faster
The IBM JVM runs Apache Spark fast! This talk explains some of the findings and optimizations from our experience of running Spark workloads.
The talk was originally presented at the SparkEU Summit 2015 in Amsterdam.
The document discusses the benefits of moving JIT compilation out of individual JVMs and into a shared, cloud-based compiler service. This "JIT-as-a-Service" approach improves efficiency by allowing optimization resources to be shared and elastic across JVMs. It also enables optimized code to be reused for applications that execute on multiple devices. Moving JIT compilation to the cloud reduces warmup time, memory footprint, and CPU usage for JVMs while improving the level of optimizations that can be performed.
Code examples are available here: https://github.com/ivailo-pashov/jvmmagic
How well do you know what is going inside the JVM? How about its secret backdoors and nasty hacks? Initially they appear as magic tricks but being aware what is going on behind the scenes will save your time when real issues arise.
At a time when Herb Sutter announced to everyone that the free lunch is over ("The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software"), concurrency became part of our everyday life. A big change is coming to Java: Project Loom, and with it new terms such as "virtual threads", "continuations", and "structured concurrency". If you've been wondering what they will change in our daily work,
whether it's worth rewriting your Tomcat-based application to super-efficient reactive Netty, or whether to wait for Project Loom, this presentation is for you.
I will talk about Project Loom and the new possibilities related to virtual threads and "structured concurrency". I will explain how it works, what can be achieved, and the impact on performance.
The document discusses various topics related to tuning the Java Virtual Machine (JVM) for performance, including:
1. Hotspot compiler options like method inlining that can improve performance.
2. Threading models on Solaris like M:N and 1:1 and how tuning thread-related JVM options can significantly impact throughput.
3. Memory and garbage collection tuning like selecting the right GC algorithm, tuning heap sizes, and analyzing GC logs to identify bottlenecks and optimize full GC frequency and duration.
Delivered as plenary at USENIX LISA 2013. video here: https://www.youtube.com/watch?v=nZfNehCzGdw and https://www.usenix.org/conference/lisa13/technical-sessions/plenary/gregg . "How did we ever analyze performance before Flame Graphs?" This new visualization invented by Brendan can help you quickly understand application and kernel performance, especially CPU usage, where stacks (call graphs) can be sampled and then visualized as an interactive flame graph. Flame Graphs are now used for a growing variety of targets: for applications and kernels on Linux, SmartOS, Mac OS X, and Windows; for languages including C, C++, node.js, ruby, and Lua; and in WebKit Web Inspector. This talk will explain them and provide use cases and new visualizations for other event types, including I/O, memory usage, and latency.
A technical presentation on how Zing changes parts of the JVM to eliminate GC pauses, generate more heavily optimised code from the JIT and reduce the warm up time.
The Performance Engineer's Guide To HotSpot Just-in-Time Compilation, by Monica Beckwith
Adaptive compilation and runtime in the OpenJDK Hotspot VM offers significant performance enhancements for our tools and applications in Java and other JVM languages. Understanding how it works provides developers with critical information on the Java HotSpot JIT compilation and runtime techniques such as vectorization, compressed OOPs etc., to assist in understanding performance for both client and server applications. We will focus on the internals of OpenJDK 8, the reference implementation for Java SE 8.
At Criteo we use both the .NET CLR runtime and the JVM. At first glance they seem very similar: a bytecode, a JIT, a GC, … but in fact there are differences in the implementation and in the vision of the targeted applications and their requirements.
Let's dig into those differences, with their pros & cons.
The talk was given at a seminar of the InfinIT interest group on high-level languages for embedded systems, held on 18 June 2014. Read more about the interest group here: http://infinit.dk/dk/interessegrupper/hoejniveau_sprog_til_indlejrede_systemer/hoejniveau_sprog_til_indlejrede_systemer.htm
The document discusses developing Groovy scripts securely and productively in the cloud for Oracle Application Developer Framework (ADF). It outlines using Groovy AST transformations to add debugging capabilities and runtime security checks when executing scripts in the cloud. Caching is also discussed to improve performance of compiling thousands of scripts across many applications. The implementation transforms the AST to wrap method calls and inject breakpoints while limiting access to restricted APIs.
JavaOne 2010: Top 10 Causes for Java Issues in Production and What to Do When..., by SriSatish Ambati
Top 10 Causes for Java Issues in Production and What to Do When Things Go Wrong
JavaOne 2010.
Abstract: It's Friday evening and you hear the first rumble . . . one Java node has become slightly unresponsive. You look up the process, get a thread dump, and for good measure restart it at 8 p.m. Saturday afternoon is when you realize that other nodes have caught the flu and you get the ugly call from the customer. In a matter of hours, you're on that conference bridge with support groups of different packages and Java vendors and one of your uberarchitects. Yes, production instances are up and down, and restarting like there's no tomorrow. Here's an accumulated compendium of the top 10 things that can cause Java production heartburn and what to do when your Java production is on fire. And yes, please have your tool belt on.
Speaker(s):
Cliff Click, Azul Systems, Distinguished Engineer
SriSatish Ambati, Azul Systems, Performance Engineer
FOSDEM 2017 - Open J9: The Next Free Java VM, by Charlie Gracie
I will discuss the J9 VM technology and our plans for open sourcing it. My team has already open sourced a lot of the underlying technology as part of the Eclipse OMR project, and now we are working on open sourcing the rest of the technology.
This document discusses the Java Memory Model (JMM). It begins by introducing the goals of familiarizing the attendee with the JMM, how processors work, and how the Java compiler and JVM work. It then covers key topics like data races, synchronization, atomicity, and examples. The document provides examples of correctly synchronized programs versus programs with data races. It explains concepts like happens-before ordering, volatile variables, and atomic operations. It also discusses weaknesses in some common multi-threading constructs like double-checked locking and discusses how constructs like final fields can enable safe publication of shared objects. The document concludes by mentioning planned improvements to the JMM in JEP 188.
Similar to JVM JIT-compiler overview @ JavaOne Moscow 2013 (20)
"What's New in HotSpot JVM 8" @ JPoint 2014, Moscow, Russia Vladimir Ivanov
This document summarizes new features and improvements in HotSpot JVM 8. It discusses support for Project Lambda (lambda expressions and default methods), the Nashorn JavaScript engine, removal of PermGen space, and various performance enhancements. Notable changes include storing class metadata in native memory instead of PermGen, support for type annotations, and intrinsics for exact math operations to improve performance of dynamic languages like JavaScript.
"Formal Verification in Java" by Shura Iline, Vladimir Ivanov @ JEEConf 2013,...Vladimir Ivanov
This document discusses formal verification and testing methods for software. It defines formal verification as mathematically proving software correctness against a specification, while testing involves running software with different inputs to check for errors. The document outlines some key differences: formal verification provides a lower bound on quality by guaranteeing absence of certain failures, while testing only provides an upper bound. It also discusses techniques like deductive verification using theorem proving. Later sections cover topics like type systems, annotations, and pluggable type checking tools like the Checker Framework.
2. 2
Agenda
§ about compilers in general
– … and JIT-compilers in particular
§ about JIT-compilers in HotSpot JVM
§ monitoring JIT-compilers in HotSpot JVM
4. 4
Dynamic and Static Compilation Differences
§ Static compilation
– “ahead-of-time”(AOT) compilation
– Source code → Native executable
– Most of the compilation work happens before execution
§ Modern Java VMs use dynamic compilers (JIT)
– “just-in-time” (JIT) compilation
– Source code → Bytecode → Interpreter + JITted executable
– Most of the compilation work happens during execution
5. 5
Dynamic and Static Compilation Differences
§ Static compilation (AOT)
– can utilize complex and heavy analyses and optimizations
– … but static information sometimes isn’t enough
– … and it’s hard to rely on profiling info, if any
– moreover, how to utilize specific platform features (like SSE 4.2)?
6. 6
Dynamic and Static Compilation Differences
§ Modern Java VMs use dynamic compilers (JIT)
– aggressive optimistic optimizations
§ through extensive usage of profiling info
– … but budget is limited and shared with an application
– startup speed suffers
– peak performance may suffer as well (not necessarily)
9. 9
JVM
§ Runtime
– class loading, bytecode verification, synchronization
§ JIT
– profiling, compilation plans, OSR
– aggressive optimizations
§ GC
– different algorithms: throughput vs. response time
10. 10
JVM: Makes Bytecodes Fast
§ JVMs eventually JIT bytecodes
– To make them fast
– Some JITs are high quality optimizing compilers
§ But cannot use existing static compilers directly:
– Tracking OOPs (ptrs) for GC
– Java Memory Model (volatile reordering & fences)
– New code patterns to optimize
– Time & resource constraints (CPU, memory)
11. 11
JVM: Makes Bytecodes Fast
§ JIT'ing requires Profiling
– Because you don't want to JIT everything
§ Profiling allows focused code-gen
§ Profiling allows better code-gen
– Inline what’s hot
– Loop unrolling, range-check elimination, etc
– Branch prediction, spill-code-gen, scheduling
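A minimal sketch (hypothetical method, not from the original deck) of the kind of hot loop where this profiling-driven code generation pays off:

// Once profiling marks this loop as hot, the JIT can unroll it and,
// because i is provably within 0..data.length, drop the per-access
// array bounds check (range-check elimination).
static int sum(int[] data) {
    int total = 0;
    for (int i = 0; i < data.length; i++) {
        total += data[i];
    }
    return total;
}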
12. 12
Dynamic Compilation (JIT)
§ Knows about
– loaded classes, methods the program has executed
§ Makes optimization decisions based on code paths executed
– Code generation depends on what is observed:
§ loaded classes, code paths executed, branches taken
§ May re-optimize if assumption was wrong, or alternative code paths taken
– Instruction path length may change between invocations of methods as a result of de-optimization / re-compilation
13. 13
Dynamic Compilation (JIT)
§ Can do non-conservative optimizations dynamically
§ Separates optimization from product delivery cycle
– Update JVM, run the same application, realize improved performance!
– Can be "tuned" to the target platform
14. 14
Profiling
§ Gathers data about code during execution
– invariants
§ types, constants (e.g. null pointers)
– statistics
§ branches, calls
§ Gathered data is used during optimization
– Educated guess
– Guess can be wrong
15. 15
Profile-guided optimization (PGO)
§ Use profile for more efficient optimization
§ PGO in JVMs
– Always have it, turned on by default
– Developers (usually) not interested or concerned about it
– Profile is always consistent with the actual execution scenario
16. 16
Optimistic Compilers
§ Assume profile is accurate
– Aggressively optimize based on profile
– Bail out if we’re wrong
§ ...and hope that we’re usually right
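As a hedged illustration (hypothetical types, not from the deck), this is roughly what "aggressively optimize, bail out if wrong" looks like for untaken branch pruning:

interface Listener { void onEvent(); }

class Notifier {
    // If the profile says 'callback' has never been null here, an optimistic
    // compiler can drop the else-branch from the compiled code and replace it
    // with an "uncommon trap": if a null ever does arrive, execution bails out
    // to the interpreter (deoptimization) and the method may be recompiled.
    void fire(Listener callback) {
        if (callback != null) {                  // profile: always taken
            callback.onEvent();
        } else {
            System.out.println("no listener");   // pruned optimistically
        }
    }
}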
17. 17
Dynamic Compilation (JIT)
§ Is dynamic compilation overhead essential?
– The longer your application runs, the less the overhead
§ Trading off compilation time, not application time
– Steal some cycles very early in execution
– Done automagically and transparently to application
§ Most of the “perceived” overhead is the compiler waiting for more data
– ...thus running semi-optimal code for the time being
Overhead
21. 21
Deoptimization
§ Bail out of running native code
– stop executing native (JIT-generated) code
– start interpreting bytecode
§ It’s a complicated operation at runtime…
22. 22
OSR: On-Stack Replacement
§ Running method never exits?
§ But it’s getting really hot?
§ Generally means loops, back-branching
§ Compile and replace while running
§ Not typically useful in large systems
§ Looks great on benchmarks!
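A small, self-contained sketch of the situation OSR addresses:

// The loop below gets hot inside a single invocation of main(), which never
// exits. Compiling at the next method entry would never help, so the JIT
// compiles the loop and swaps the interpreted frame for a compiled one at a
// backward branch: on-stack replacement.
public class OsrDemo {
    public static void main(String[] args) {
        long sum = 0;
        for (long i = 0; i < 1_000_000_000L; i++) {
            sum += i;
        }
        System.out.println(sum);
    }
}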
24. 24
Optimizations in HotSpot JVM
§ compiler tactics
delayed compilation
tiered compilation
on-stack replacement
delayed reoptimization
program dependence graph rep.
static single assignment rep.
§ proof-based techniques
exact type inference
memory value inference
memory value tracking
constant folding
reassociation
operator strength reduction
null check elimination
type test strength reduction
type test elimination
algebraic simplification
common subexpression elimination
integer range typing
§ flow-sensitive rewrites
conditional constant propagation
dominating test detection
flow-carried type narrowing
dead code elimination
§ language-specific techniques
class hierarchy analysis
devirtualization
symbolic constant propagation
autobox elimination
escape analysis
lock elision
lock fusion
de-reflection
§ speculative (profile-based) techniques
optimistic nullness assertions
optimistic type assertions
optimistic type strengthening
optimistic array length strengthening
untaken branch pruning
optimistic N-morphic inlining
branch frequency prediction
call frequency prediction
§ memory and placement transformation
expression hoisting
expression sinking
redundant store elimination
adjacent store fusion
card-mark elimination
merge-point splitting
§ loop transformations
loop unrolling
loop peeling
safepoint elimination
iteration range splitting
range check elimination
loop vectorization
§ global code shaping
inlining (graph integration)
global code motion
heat-based code layout
switch balancing
throw inlining
§ control flow graph transformation
local code scheduling
local code bundling
delay slot filling
graph-coloring register allocation
linear scan register allocation
live range splitting
copy coalescing
constant splitting
copy removal
address mode matching
instruction peepholing
DFA-based code generator
25. 25
JVM: Makes Virtual Calls Fast
§ C++ avoids virtual calls – because they are slow
§ Java embraces them – and makes them fast
– Well, mostly fast – JITs do Class Hierarchy Analysis (CHA)
– CHA turns most virtual calls into static calls
– JVM detects new classes loaded, adjusts CHA
§ May need to re-JIT
– When CHA fails to make the call static, inline caches (ICs) are used
– When ICs fail, virtual calls are back to being slow
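A hedged sketch (hypothetical classes) of how CHA interacts with class loading:

interface Codec { byte[] encode(String s); }

class Utf8Codec implements Codec {
    public byte[] encode(String s) {
        return s.getBytes(java.nio.charset.StandardCharsets.UTF_8);
    }
}

class Pipeline {
    // While Utf8Codec is the only loaded implementation, CHA lets the JIT
    // treat codec.encode(...) as a static call and inline it. If another
    // Codec implementation is loaded later, that assumption is invalidated
    // and the compiled method must be deoptimized and re-JITted.
    static byte[] run(Codec codec, String payload) {
        return codec.encode(payload);
    }
}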
26. 26
Call Site
§ The place where you make a call
§ Monomorphic (“one shape”)
– Single target class
§ Bimorphic (“two shapes”)
§ Polymorphic (“many shapes”)
§ Megamorphic
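A minimal illustration (hypothetical types) that a call site's "shape" is about the receivers actually observed there, not the declared type:

interface Shape { double area(); }
class Circle implements Shape { public double area() { return Math.PI; } }
class Square implements Shape { public double area() { return 1.0; } }

class Report {
    // The single call site below is monomorphic if only Circle instances ever
    // reach it, bimorphic with Circle + Square, and megamorphic once many
    // distinct receiver classes have been observed at this exact location.
    static double totalArea(Shape[] shapes) {
        double sum = 0;
        for (Shape s : shapes) {
            sum += s.area();    // one call site, one receiver-type profile
        }
        return sum;
    }
}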
27. 27
Inlining
§ Combine caller and callee into one unit
– e.g. based on the profile
– … or prove smth using CHA (Class Hierarchy Analysis)
– Perhaps with a guard/test
§ Optimize as a whole
– More code means better visibility
28. 28
Inlining
int addAll(int max) {
    int accum = 0;
    for (int i = 0; i < max; i++) {
        accum = add(accum, i);
    }
    return accum;
}

int add(int a, int b) { return a + b; }
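Conceptually, after inlining add() into the loop the JIT compiles something closer to the following (a sketch of the intermediate form, not literal compiler output):

int addAllInlined(int max) {
    int accum = 0;
    for (int i = 0; i < max; i++) {
        accum = accum + i;   // call to add() replaced by its body
    }
    return accum;
}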
30. 30
Inlining and devirtualization
§ Inlining is the most profitable compiler optimization
– Rather straightforward to implement
– Huge benefits: expands the scope for other optimizations
§ OOP needs polymorphism, that implies virtual calls
– Prevents naïve inlining
– Devirtualization is required
– (This does not mean you should not write OOP code)
31. 31
JVM Devirtualization
§ Developers shouldn't care
§ Analyze hierarchy of currently loaded classes
§ Efficiently devirtualize all monomorphic calls
§ Able to devirtualize polymorphic calls
§ JVM may inline dynamic methods
– Reflection calls
– Runtime-synthesized methods
– JSR 292
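A small, self-contained JSR 292 sketch (hypothetical demo class) of the kind of dynamic call the JIT can still inline:

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class Jsr292Demo {
    // Held in a static final field so the JIT can treat the handle as a
    // constant, which is what makes inlining through invokeExact possible.
    static final MethodHandle CONCAT;
    static {
        try {
            CONCAT = MethodHandles.lookup().findVirtual(
                String.class, "concat",
                MethodType.methodType(String.class, String.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public static void main(String[] args) throws Throwable {
        System.out.println((String) CONCAT.invokeExact("JIT ", "compilers"));
    }
}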
32. 32
Feedback multiplies optimizations
§ On-line profiling and CHA produces information
– ...which lets the JIT ignore unused paths
– ...and helps the JIT sharpen types on hot paths
– ...which allows calls to be devirtualized
– ...allowing them to be inlined
– ...expanding an ever-widening optimization horizon
§ Result:
Large native methods containing tightly optimized machine code for
hundreds of inlined calls!
41. 41
Escape Analysis
public int m1() {
    Pair p = new Pair(1, 2);
    return m2(p);
}

public int m2(Pair p) {
    return p.first + m3(p);
}

public int m3(Pair p) { return p.second; }
Initial version
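Assuming m2() and m3() get inlined into m1(), escape analysis can prove that p never leaves m1(); a hedged sketch of the effective result after scalar replacement:

// After inlining, the Pair allocation is not visible outside m1(), so the
// JIT can replace the object with its two int fields (scalar replacement)
// and then constant-fold them away entirely.
public int m1() {
    int first = 1;           // p.first, no heap allocation
    int second = 2;          // p.second
    return first + second;   // effectively "return 3;"
}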
53. 53
Monitoring JIT-Compiler
§ how to print info about compiled methods?
– -XX:+PrintCompilation
§ how to print info about inlining decisions
– -XX:+PrintInlining
§ how to control compilation policy?
– -XX:CompileCommand=…
§ how to print assembly code?
– -XX:+PrintAssembly
– -XX:+PrintOptoAssembly (C2-only)
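A typical way to combine these (class and method names are placeholders; -XX:+PrintInlining and -XX:+PrintAssembly are diagnostic flags, so they require -XX:+UnlockDiagnosticVMOptions):

java -XX:+PrintCompilation \
     -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining \
     -XX:CompileCommand=dontinline,com.example.HotClass::helper \
     com.example.Main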
56. 56
Print Compilation
§ 2043 470 % ! jdk.nashorn.internal.ir.FunctionNode::accept @ 136 (265 bytes)
% == OSR compilation
! == has exception handlers (may be expensive)
s == synchronized method
§ 2028 466 n java.lang.Class::isArray (native)
n == native method
Other useful info
57. 57
Print Compilation
§ 621 160 java.lang.Object::equals (11 bytes) made not entrant
– don't allow any new calls into this compiled version
§ 1807 160 java.lang.Object::equals (11 bytes) made zombie
– can safely throw away compiled version
Not just compilation notifications
58. 58
No JIT At All?
§ Code is too large
§ Code isn’t too «hot»
– not executed often enough
61. 61
Inlining Tuning
§ -XX:MaxInlineSize=35
– Largest inlinable method (bytecode)
§ -XX:InlineSmallCode=#
– Largest inlinable compiled method
§ -XX:FreqInlineSize=#
– Largest frequently-called method…
§ -XX:MaxInlineLevel=9
– How deep does the rabbit hole go?
§ -XX:MaxRecursiveInlineLevel=#
– recursive inlining
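For example, one might experiment (cautiously; these are tuning knobs with illustrative values, not guaranteed wins) with something like:

java -XX:MaxInlineSize=50 -XX:FreqInlineSize=400 -XX:MaxInlineLevel=15 \
     -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining \
     com.example.Main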
62. 62
Machine Code
§ -XX:+PrintAssembly
§ http://wikis.sun.com/display/HotSpotInternals/PrintAssembly
§ Knowing code compiles is good
§ Knowing code inlines is better
§ Seeing the actual assembly is best!
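Note that PrintAssembly is a diagnostic flag and relies on the hsdis disassembler plugin being on the JVM's library path; a hedged example invocation (placeholder class name):

java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly com.example.Main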