Getting Space Pirate Trainer* to Perform on Intel® Graphics
- 1. Audience
Dirk Van Welden – Founder of I-Illusions
Cristiano Ferreira – Intel Developer Relations Engineer
Seth Schneider – Intel® GPA Product Owner
- 2. Agenda
▪ Getting cozy with Space Pirate Trainer*
▪ Why target mainstream VR?
▪ Dive into optimizations found with Intel® GPA
▪ Questions?
- 5. A Few Facts
● Launch title for HTC* Vive, Oculus* Touch, and Microsoft* Mixed Reality
● Early access title since April 2016
● 1.0 since October 2017
● 150,000+ units sold
● Used worldwide in VR arcades and as a demo experience
- 6. “I’m never going to create a VR game”
● Rift backer
● Valve / Vive kit
● Here’s a VR demo
- 8. Why Mainstream?
● Initial Microsoft* Mixed Reality (MSMR) port
○ Together with Microsoft*
○ Up and running in 3 days
○ Same minimum spec (GTX 970-class GPU)
● Mainstream audience demo feedback
○ “Cool, can I run it on my ultrabook?”
○ Affordable HMD vs. expensive PC/laptop
● Huge market opportunity, but at the time, limited quality content
- 9. Let’s do it! … But
● Initial mainstream version (Intel® Core™ i5 processor family, Intel® HD Graphics 620, Intel® NUC)
○ 12 FPS
● No point lights, no post effects
○ Almost 30 FPS, but ugly
● “I seriously doubt this, but let’s try”
- 14. What’s Inside Intel® Graphics Performance Analyzers (Intel® GPA)?
● System Analyzer / HUD: in-game analysis
● Graphics Frame Analyzer: single-frame analysis
● Graphics Trace Analyzer: timeline analysis
● Graphics Monitor: launch and configuration tool
- 15. Performance Feature Focused
● Hotspot Analysis: identifies the most expensive sets of events, grouped by state and/or bottleneck
● Metrics Analysis: identifies the exact hardware bottleneck
● Playback Experiments: test performance optimizations and quantify improvements
- 17. Shaders – Floor
Before (Standard Shader): 267 instructions, ~1.5 ms GPU duration
After (Lambert Shader): 47 instructions, ~0.3 ms GPU duration
(5x performance improvement)
- 19. Shaders – Microsoft* Windows Mixed Reality Toolkit
▪ Microsoft* has recreated all of Unity’s* built-in shaders to contain significantly fewer math ops
▪ Quick action:
– Download the kit: https://github.com/Microsoft/MixedRealityToolkit-Unity
– When mainstream WinMR hardware is detected, swap materials to the ones contained in the kit
- 21. Shaders - Material Batching With Unlit Shaders
No batching or instancing: 1,300 draws, 1.5M vertices, 3.5 ms GPU duration
Batching and instancing: 8 draws, 2M vertices, 1.7 ms GPU duration
(2x performance improvement)
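The drop in draw count comes from submitting objects that share a material together. The Python sketch below is a conceptual illustration only, not Unity's batching implementation: it groups a hypothetical scene list by material and counts one draw per group. The object names, material names, and counts are made up for the example.

```python
from collections import defaultdict

def count_draws(objects, batching=True):
    """Return the number of draw calls for (object_name, material) pairs.

    Without batching, every object is its own draw. With batching, objects
    that share a material are assumed to merge into one draw (a simplification).
    """
    if not batching:
        return len(objects)
    groups = defaultdict(list)
    for name, material in objects:
        groups[material].append(name)
    return len(groups)

# Hypothetical scene: many props that all use one of two unlit materials.
scene = [(f"crate_{i}", "unlit_props") for i in range(100)]
scene += [(f"panel_{i}", "unlit_walls") for i in range(50)]

draws_unbatched = count_draws(scene, batching=False)
draws_batched = count_draws(scene, batching=True)
print(draws_unbatched, draws_batched)
```

The fewer distinct materials a scene uses, the fewer groups remain, which is why consolidating onto a small set of unlit materials helps.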
- 24. Remove Dynamic Lights
• Dynamic lights need to render an additional pass for each light, contributing 5 ms of frame time!
• Events 83 and 84 are the base pass; events 123-130 are the additional passes for each dynamic light
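To see why dynamic lights are expensive in forward rendering, it helps to count passes: each lit object is drawn once for the base pass and, roughly, once more for every additional per-pixel light that touches it. The Python sketch below assumes the worst case where every light affects every object. The split into two objects and four lights is an assumption, chosen only so the totals line up with the two base events and eight additional events listed above.

```python
def forward_pass_count(num_objects, num_dynamic_lights):
    """Worst-case draw passes when every dynamic light touches every object."""
    base_passes = num_objects
    additional_passes = num_objects * num_dynamic_lights
    return base_passes + additional_passes

# Assumed scene: 2 lit objects, with and without 4 dynamic lights.
passes_no_lights = forward_pass_count(num_objects=2, num_dynamic_lights=0)
passes_four_lights = forward_pass_count(num_objects=2, num_dynamic_lights=4)
print(passes_no_lights, passes_four_lights)
```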
- 26. Post Processing Stacks - Bloom
Low Settings
• Ended up using the Mobile Bloom PFX Stack
• Consolidated all PFX into one pass
• GPU Duration: 0.6 ms (4x performance improvement)
High Settings
• Initially reduced number of passes to 14
• GPU Duration: 2.6 ms
- 27. Post Processing - HDR – Vertical Flip
▪ A vertical flip step is required when using HDR effects
▪ To avoid the 0.3 ms per-frame penalty, uncheck all HDR boxes on scene cameras and remove image effects that may require HDR (usually tonemapping, bloom, etc.)
▪ If using post effects, use the post-processing stack rather than image effects
▪ Removed any effects with a depth pass (fog, etc.)
- 29. Post Processing - TSCMAA
▪ Temporally stable post-effect anti-aliasing techniques like CMAA can provide equivalent functionality at half the cost
▪ If necessary, use Temporally Stable CMAA (TSCMAA); good if rendering at less than 1280x1280 and upscaling
▪ Performance: 1.5x improvement with TSCMAA over 4x MSAA
[Image comparison: 4x MSAA vs. TSCMAA]
- 35. RenderQueue Order for WinMR
1. Draw VR hands and any interactables (weapons, etc.)
2. Draw scene dressings and dynamic / small static objects
3. Draw large set pieces (buildings, ship, etc.)
4. Draw the floor
5. Draw the skybox (usually already done last if using the built-in Unity skybox)
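One way to express this ordering is to give each category a queue value and sort draws by it, lowest first, which is how Unity treats a material's render queue. The sketch below is plain Python rather than Unity code. The category names and per-category offsets are illustrative assumptions, not values from the talk; 2000 is Unity's default Geometry queue. Drawing near objects first generally lets the depth test reject hidden pixels of later draws before they are shaded.

```python
GEOMETRY_QUEUE = 2000  # Unity's default opaque queue; offsets below are assumed

QUEUE_BY_CATEGORY = {
    "hands_and_interactables": GEOMETRY_QUEUE + 0,
    "scene_dressing": GEOMETRY_QUEUE + 10,
    "large_set_pieces": GEOMETRY_QUEUE + 20,
    "floor": GEOMETRY_QUEUE + 30,
    "skybox": GEOMETRY_QUEUE + 40,
}

def sort_draws(draws):
    """Sort (name, category) pairs so lower queue values are drawn first."""
    return sorted(draws, key=lambda d: QUEUE_BY_CATEGORY[d[1]])

# Hypothetical draw list in arbitrary submission order.
draws = [
    ("sky", "skybox"),
    ("ground", "floor"),
    ("pistol", "hands_and_interactables"),
    ("tower", "large_set_pieces"),
    ("crate", "scene_dressing"),
]
ordered = [name for name, _ in sort_draws(draws)]
print(ordered)
```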
- 39. Skybox Compression
Low Settings
• 1K texture
• GPU Duration: 0.2 ms (5x performance improvement)
High Settings
• 2K texture (originally uncompressed at 4K)
• GPU Duration: 1.1 ms
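A back-of-the-envelope calculation shows how much texture data each skybox option represents. The sketch assumes a single square 2D texture without mipmaps, 4 bytes per pixel for the uncompressed case (RGBA8), and 0.5 bytes per pixel for the compressed cases (a BC1/DXT1-style 4-bit format). The talk does not state the actual formats or layout, so treat all of these as assumptions.

```python
MIB = 1024 * 1024

def texture_bytes(side_pixels, bytes_per_pixel):
    """Size in bytes of a square texture with no mipmaps."""
    return int(side_pixels * side_pixels * bytes_per_pixel)

# Assumed formats: RGBA8 (4 B/px) uncompressed, 4-bit block compression (0.5 B/px).
original_4k_mib = texture_bytes(4096, 4.0) / MIB
high_2k_mib = texture_bytes(2048, 0.5) / MIB
low_1k_mib = texture_bytes(1024, 0.5) / MIB
print(original_4k_mib, high_2k_mib, low_1k_mib)
```

Under these assumptions, halving the resolution cuts the data to a quarter, and compression cuts it by a further factor of eight.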
- 42. Results – 4x Faster!
12 FPS on default settings
About 30 FPS without lights and PFX
60 FPS (Low settings) and 35 FPS (High settings)
Better performance on all platforms!
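Frame rate and frame time are reciprocals, which makes it easier to relate these FPS figures to the per-effect millisecond savings reported on the earlier slides. A small Python helper, using only the FPS numbers from this slide (the labels are paraphrased):

```python
def frame_time_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps

results_fps = {
    "default settings": 12,
    "no lights or PFX (approx.)": 30,
    "optimized, High": 35,
    "optimized, Low": 60,
}
frame_times = {
    label: round(frame_time_ms(fps), 1) for label, fps in results_fps.items()
}
for label, ms in frame_times.items():
    print(f"{label}: {ms} ms per frame")
```

Seen this way, a saving of a millisecond or two per effect is a meaningful share of the frame budget at the higher frame rates.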
- 44. Getting Started with WinMR Optimization
Helpful Links:
- Perf recommendations for immersive headset apps: https://goo.gl/4V4kpr
- Porting Guides: https://goo.gl/QTbWYp
- Enthusiast’s Guide: https://goo.gl/gKZE2w
- Development Hardware: https://goo.gl/gNG5oK
Tools:
- Download Intel® GPA for free at software.intel.com/gpa
Tech:
- TSCMAA Article and Sample: https://goo.gl/6FFnKp
- 45. Legal Disclaimers and Optimization Notices
Intel, Core and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others
© Intel Corporation.