The document discusses a presentation given by Seth Schneider from Intel and Russ Glaeser from Cascade Game Foundry. It introduces Intel's Graphics Performance Analyzers (GPA) tool and demonstrates how it was used to optimize the game Infinite Scuba developed by Cascade Game Foundry. The presentation covered an overview of GPA, details about Infinite Scuba, and a live demo of using GPA to analyze and improve performance of the game.
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...
Review state-of-the-art techniques that use neural networks to synthesize motion, such as mode-adaptive neural networks and phase-functioned neural networks. See how next-generation CPUs with reinforcement learning can offer better performance.
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
Software AI Accelerators deliver orders of magnitude performance gain for AI across deep learning, classical machine learning, and graph analytics and are key to enabling AI Everywhere. Get started on your AI Developer Journey @ software.intel.com/ai.
The document describes Intel Graphics Performance Analyzers (Intel GPA), a free tool that allows users to optimize game performance on Windows, Android, and Ubuntu systems. Intel GPA includes tools like the System Analyzer for real-time in-game performance analysis, the Frame Analyzer for detailed frame-level analysis, and the Platform Analyzer to visualize CPU and GPU activity. It also allows experiments like changing graphics settings without code modifications to help identify performance bottlenecks.
Advanced Single Instruction Multiple Data (SIMD) Programming with Intel® Impl...
Explore practical elements, such as performance profiling, debugging, and porting advice. Get an overview of advanced programming topics, like common design patterns, SIMD lane interoperability, data conversions, and more.
AIDC NY: Applications of Intel AI by QuEST Global - 09.19.2019
QuEST Global is a global engineering company that provides AI and digital transformation services using technologies like computer vision, machine learning, and deep learning. It has developed several AI solutions using Intel technologies like OpenVINO that provide accelerated inferencing on Intel CPUs. Some examples include a lung nodule detection solution to help detect early-stage lung cancer from CT scans and a vision analytics platform used for applications in retail, banking, and surveillance. The company leverages Intel's AI Builder program and ecosystem to develop, integrate, and deploy AI solutions globally.
More explosions, more chaos, and definitely more blowing stuff up
This document discusses optimizations and new DirectX features for Intel graphics hardware. It begins with an introduction of Avalanche Studios, the developer of the game Just Cause 3. It then discusses the use of Intel's Graphics Performance Analyzers tools to analyze Just Cause 3 and identify optimization opportunities. The document outlines several low-level shader optimizations performed, including reworking math operations, rearranging variables, and reusing intermediate values. It also discusses leveraging new DirectX features pioneered by Intel. The goal of these optimizations is to improve performance for the large install base of gamers using Intel graphics.
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
oneDNN Graph API extends oneDNN with a graph interface which reduces deep learning integration costs and maximizes compute efficiency across a variety of AI hardware including AI accelerators. Get started on your AI Developer Journey @ software.intel.com/ai.
Embree Ray Tracing Kernels | Overview and New Features | SIGGRAPH 2018 Tech S...
Overview of the new Embree 3 ray tracing framework, including how to use the new API, supported geometry types, and ray intersection methods. Includes a look at new features like normal oriented curves, vertex grids, etc.
oneAPI: Industry Initiative & Intel Product
With the growth of AI, machine learning, and data-centric applications, the industry needs a programming model that allows developers to take advantage of rapid innovation in processor architectures. TensorFlow supports the oneAPI industry initiative and its standards-based open specification.
oneAPI complements TensorFlow’s modular design and provides increased choice of hardware vendor and processor architecture, and faster support of next-generation accelerators. TensorFlow uses oneAPI today on Xeon processors and we look forward to using oneAPI to run on future Intel architectures.
This document discusses Bodo Inc.'s product that aims to simplify and accelerate data science workflows. It highlights common problems in data science like complex and slow analytics, segregated development and production environments, and unused data. Bodo provides a unified development and production environment where the same code can run at any scale with automatic parallelization. It integrates an analytics engine and HPC architecture to optimize Python code for performance. Bodo is presented as offering more productive, accurate and cost-effective data science compared to traditional approaches.
Jeff Rous from Intel and Niklas Smedberg from Epic Games discussed optimizing the Unreal Engine 4 (UE4) game engine for Intel processors. They described measuring performance using Intel's Graphics Performance Analyzers, common pain points like memory bandwidth and dense geometry on Intel graphics, and shader optimizations. The presentation also covered optimizing UE4 for DirectX 12, adding support for Android x86/x64, and announcing fast ASTC texture compression support in UE4.
Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...
Integrated into Intel® Advisor, Cache-aware Roofline Modeling (CARM) provides insight into how an application behaves by helping to determine a) how optimally it works on a given hardware, b) the main factors that limit performance, c) if the workload is memory or compute-bound, and d) the right strategy to improve application performance.
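As a back-of-the-envelope illustration of what CARM computes, the roofline model caps attainable performance at the lesser of peak compute and arithmetic intensity times peak memory bandwidth. The sketch below uses made-up peak numbers for illustration, not measured hardware figures:

```python
def roofline(flops, bytes_moved, peak_gflops, peak_bw_gbs):
    """Attainable GFLOP/s under the cache-aware roofline model."""
    ai = flops / bytes_moved                       # arithmetic intensity, FLOP/byte
    attainable = min(peak_gflops, ai * peak_bw_gbs)
    bound = "compute" if ai * peak_bw_gbs >= peak_gflops else "memory"
    return ai, attainable, bound

# A triad-like kernel a[i] = b[i] + s * c[i]: 2 FLOPs per 24 bytes of doubles moved.
# With 100 GFLOP/s peak compute and 20 GB/s bandwidth, it sits far below the
# compute roof, so the right optimization strategy targets memory traffic.
ai, perf, bound = roofline(flops=2, bytes_moved=24, peak_gflops=100.0, peak_bw_gbs=20.0)
```

The returned `bound` answers question (c) above directly: whether the kernel's roof is the bandwidth slope or the flat compute ceiling.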
Intel® Open Image Denoise: Optimized CPU Denoising | SIGGRAPH 2019 Technical ...
Open Image Denoise is an open source library for denoising images rendered with ray tracing. It provides a deep learning based denoising filter that can run on any modern Intel CPU. The filter uses a convolutional neural network architecture and has been shown to improve image quality over other filters while maintaining interactive performance. The API is designed to be simple and easy to integrate into rendering applications. Future versions will include additional features like temporal coherence and support for more input buffers.
RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...
This talk focuses on the newest release in RenderMan* 22.5 and its adoption at Pixar Animation Studios* for rendering future movies. With native support for Intel® Advanced Vector Extensions, Intel® Advanced Vector Extensions 2, and Intel® Advanced Vector Extensions 512, it includes enhanced library features, debugging support, and an extensive test framework.
This session was held by Vladimir Brenner, Partner Account Manager, Disruptors & AI, Intel AI at the Dive into H2O: London training on June 17, 2019.
Please find the recording here: https://youtu.be/60o3eyG5OLM
In this talk, Tong will start with the current landscape and typical use cases of Artificial Intelligence applications in the Telco domain. Then, she will introduce Intel’s strategy and products for Network AI, including our focus areas, our hardware portfolio, software stacks, roadmaps and some case studies.
Speaker: Tong Zhang, Principal Engineer and Chief Architect for AI and Analytics of the Network Platforms Group, Intel
TDC2018SP | Trilha IA - Inteligencia Artificial na Arquitetura Intel
This document contains several legal notices and disclaimers from Intel regarding their products. No license is granted to any intellectual property and Intel assumes no liability relating to the sale and use of their products. Intel products are not intended for medical or life critical applications. Specifications and descriptions are subject to change without notice.
Streamline End-to-End AI Pipelines with Intel, Databricks, and OmniSci
Preprocess, visualize, and build AI faster at scale on Intel architecture. Develop end-to-end AI inference pipelines, including data ingestion, preprocessing, and model inferencing with tabular, NLP, RecSys, video, and image data, using the Intel oneAPI AI Analytics Toolkit and other optimized libraries. Build performant pipelines at scale with Databricks and end-to-end Xeon optimizations. Learn how to visualize with the OmniSci Immerse Platform and experience a live demonstration of the Intel Distribution of Modin and OmniSci.
Accelerate Machine Learning Software on Intel Architecture
This session presents performance data for deep learning training for image recognition that achieves a greater than 24x speedup with a single Intel® Xeon Phi™ processor 7250 compared to Caffe*. In addition, we present performance data showing that training time is reduced by a further 40x speedup on a 128-node Intel® Xeon Phi™ processor cluster connected over Intel® Omni-Path Architecture (Intel® OPA).
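Assuming the 40x cluster figure is measured relative to a single node (the abstract's baseline is not fully specified), the implied scaling efficiency on 128 nodes can be estimated with a one-line calculation:

```python
def parallel_efficiency(speedup, nodes):
    # fraction of ideal linear scaling actually achieved
    return speedup / nodes

# 40x on 128 nodes would be roughly 31% of perfect linear scaling,
# a typical figure once communication over the fabric starts to dominate.
eff = parallel_efficiency(speedup=40, nodes=128)
```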
Python* Scalability in Production Environments
This document discusses scaling Python performance in production environments. It introduces the Intel Distribution for Python, which provides optimized versions of NumPy, SciPy, and Scikit-Learn using Intel MKL to accelerate linear algebra and machine learning algorithms. It also supports parallelism through MPI, TBB for multithreading, and integration with big data frameworks. Profiling tools like Intel VTune Amplifier help optimize mixed-language Python applications for Intel architectures. The goal is to make Python usable for high performance computing and big data workloads while maintaining its ease of use.
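The kind of task parallelism TBB provides can be sketched with the standard library alone; this is an illustrative stand-in, not Intel's TBB API, which runs the same divide-and-conquer pattern at native-code speed:

```python
from concurrent.futures import ThreadPoolExecutor

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def parallel_dot(a, b, workers=4):
    # split the vectors into chunks and sum partial dot products in parallel
    step = max(1, len(a) // workers)
    chunks = [(a[i:i + step], b[i:i + step]) for i in range(0, len(a), step)]
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return sum(ex.map(lambda ab: dot(*ab), chunks))
```

In CPython the GIL limits what plain threads gain on pure-Python arithmetic; MKL-backed NumPy and TBB sidestep that by parallelizing inside native code, which is the point the distribution makes.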
Open Source Interactive CPU Preview Rendering with Pixar's Universal Scene De...
Universal Scene Description* (USD) is an open source initiative developed by Pixar for fast, large scale, and universal asset management across multiple programs including Maya, Houdini, and others.
1. The document introduces the Intel Xeon Scalable platform, which provides the foundation for data center innovation with a 1.65x average performance boost over previous generations.
2. It highlights key advantages of the platform including scalable performance, agility in rapid service delivery, and hardware-enhanced security with near-zero performance overhead.
3. Various workload-optimized solutions are discussed that leverage the platform's performance to accelerate insights from analytics, deploy cloud infrastructure more quickly, and transform networks.
Accelerate Your Apache Spark with Intel Optane DC Persistent Memory
Data volumes in big data environments are growing rapidly, and more and more memory is consumed either by computation or by holding intermediate data for analytics jobs. For these memory-intensive workloads, users must either scale out the compute cluster or extend memory with storage such as HDDs or SSDs to meet the demands of their computing tasks. When scaling out a cluster, the extra cost of cluster management, operation, and maintenance raises the total cost if the additional CPU resources are not fully utilized. To address this shortcoming, Intel Optane DC persistent memory (Optane DCPM) breaks the traditional memory/storage hierarchy and scales up the compute server with higher-capacity persistent memory, delivering higher bandwidth and lower latency than SSD or HDD storage. Apache Spark is widely used for analytics such as SQL and machine learning in cloud environments, where slow remote data access is a typical bottleneck, especially for I/O-intensive queries; machine learning workloads are iterative, so I/O bandwidth is key to end-to-end performance. In this talk, we introduce how to accelerate Spark SQL with OAP (https://github.com/Intel-bigdata/OAP) to achieve an 8x SQL performance gain in the cloud, and how an RDD cache on Intel Optane DCPM improves K-means performance by 2.5x. We also take a deep dive into how Optane DCPM delivers these performance gains.
Speakers: Cheng Xu, Piotr Balcer
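The iterative structure of K-means, which makes it sensitive to cache and memory bandwidth, is visible even in a minimal pure-Python sketch (illustrative only, not the OAP or Spark MLlib implementation):

```python
def kmeans_1d(points, centers, iters=10):
    """Lloyd's algorithm on scalars. Every iteration rescans the entire
    dataset, which is why holding it in a fast RDD cache (e.g. on Optane
    DCPM) rather than on HDD/SSD shortens end-to-end training time."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:                      # full pass over the data
            nearest = min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers
```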
Apache CarbonData & Spark meetup
"QATCodec: Past, Present and Future" is from Intel.
Apache Spark™ is a unified analytics engine for large-scale data processing.
CarbonData is a high-performance data solution that supports various data analytics scenarios, including BI analysis, ad-hoc SQL queries, fast filter lookup on detail records, streaming analytics, and more. CarbonData has been deployed in many enterprise production environments; in one of the largest deployments it supports queries on a single table holding 3 PB of data (more than 5 trillion records) with response times under 3 seconds.
Tuning For Deep Learning Inference with Intel® Processor Graphics | SIGGRAPH ...
This document discusses optimizing deep learning inference on Intel processor graphics using the OpenVINO™ toolkit. Some key points include:
- Running inference on client devices offers advantages over the cloud, such as privacy, bandwidth savings, and responsiveness.
- The OpenVINO™ toolkit provides tools to optimize models for Intel hardware, achieving 5-10x speedups on Intel GPUs compared to CPU baselines.
- A case study demonstrates optimizing a deep image matting model, reducing inference time from 2.35 seconds to 291 milliseconds on an Intel GPU using the OpenVINO™ toolkit.
- Emerging technologies such as federated learning, which could improve privacy for on-device inference, are also discussed.
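As a quick sanity check, the matting case study's numbers imply a speedup at the top of the 5-10x range cited:

```python
before_s = 2.35                  # inference time before optimization, seconds
after_s = 0.291                  # inference time on the Intel GPU with OpenVINO
speedup = before_s / after_s     # roughly 8.1x
```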
Whether you are an AI, HPC, IoT, Graphics, Networking or Media developer, visit the Intel Developer Zone today to access the latest software products, resources, training, and support. Test-drive the latest Intel hardware and software products on DevCloud, our online development sandbox, and use DevMesh, our online collaboration portal, to meet and work with other innovators and product leaders. Get started by joining the Intel Developer Community @ software.intel.com.
For the full video of this presentation, please visit:
https://www.edge-ai-vision.com/2020/11/acceleration-of-deep-learning-using-openvino-3d-seismic-case-study-a-presentation-from-intel/
For more information about edge AI and computer vision, please visit:
https://www.edge-ai-vision.com
Manas Pathak, Global AI Lead for Oil and Gas at Intel, presents the “Acceleration of Deep Learning Using OpenVINO: 3D Seismic Case Study” tutorial at the September 2020 Embedded Vision Summit.
The use of deep learning for automatic seismic data interpretation is gaining the attention of many researchers across the oil and gas industry. The integration of high-performance computing (HPC) AI workflows in seismic data interpretation brings the challenge of moving and processing large amounts of data from HPC to AI computing solutions and vice-versa.
In this presentation, Pathak illustrates this challenge via a case study using a public deep learning model for salt identification applied on a 3D seismic survey from the F3 Dutch block in the North Sea. He presents a workflow to address this challenge and perform accelerated AI on seismic data. The Intel Distribution of OpenVINO toolkit was used to increase the inference performance of a pre-trained model on an Intel CPU. OpenVINO allows CPU users to get significant improvement in AI inference performance for high memory capacity deep learning models used on large datasets without any significant loss in accuracy.
Intel Xeon Processor E5 Family: Making the Business Case | Intel IT Center
This presentation highlights cloud computing advantages of the Intel® Xeon® processor E5 family and helps you make the business case for investing. Includes access to an ROI calculator.
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Architecture | HPC DAY
HPC DAY 2017 - http://www.hpcday.eu/
Atanas Atanasov | HPC solution architect, EMEA region at Intel
Introduction to container networking in K8s - SDN/NFV London meetup | Haidee McMahon
This document discusses Intel's work on container networking technologies for network functions virtualization (NFV). It outlines three deployment models for containers in NFV environments - bare metal, unified infrastructure, and hybrid. It also addresses key challenges for using containers in bare metal environments, such as providing multiple network interfaces and high-performance data planes. Intel is working to help solve these challenges through open source solutions and experience kits that provide best practices.
Software Development Tools for Intel® IoT Platforms | Intel® Software
This talk familiarizes participants with the benefits of using the Intel® software development tools and libraries for developing end-to-end IoT solutions.
Driving Industrial Innovation on the Path to Exascale | Intel IT Center
This document discusses driving industrial innovation through high performance computing (HPC). It summarizes Intel's progress in HPC technologies including processors, coprocessors, fabrics, and software. Examples are given of how HPC is transforming industries like automotive design at Audi. The top supercomputer is highlighted as using Intel Xeon and Xeon Phi processors. The document envisions continuing innovation to achieve exascale computing and connect more people through technology.
Spring Hill (NNP-I 1000): Intel's Data Center Inference Chip | inside-BigData.com
- Spring Hill (NNP-I 1000) is Intel's new data center inference chip, offering best-in-class performance per watt for major data center inference workloads.
- It delivers 4.8 TOPs/watt and scales across a 10-50 watt power envelope to boost performance.
- The chip features 12 inference compute engines, 24 MB of shared cache, and Intel architecture cores to drive AI innovation while maintaining high performance and efficiency.
Cloud Technology: Now Entering the Business Process Phasefinteligent
Cloud technology is moving into its next phase of business use. [1] Cloud models are entering the "business process" phase of delivering services. [2] Cloud technologies can now generate higher returns for businesses. [3] Consulting with Intel can help optimize cloud solutions.
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Learning
2. Notices and Disclaimers

Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at www.intel.com.

Performance results are based on testing as of Aug. 20, 2017 and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure.

Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.

Any forecasts of goods and services needed for Intel's operations are provided for discussion purposes only. Intel will have no liability to make any purchase in connection with forecasts published in this document.

ARDUINO 101 and the ARDUINO infinity logo are trademarks or registered trademarks of Arduino, LLC.

Altera, Arria, the Arria logo, Intel, the Intel logo, Intel Atom, Intel Core, Intel Nervana, Intel Saffron, Iris, Movidius, OpenVINO, Stratix and Xeon are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others.

Copyright 2019 Intel Corporation.
5. HPC != Big Data Analytics != Artificial Intelligence?
https://www.intelnervana.com/framework-optimizations/
*Other brands and names are the property of their respective owners.

Comparing the two stacks, layer by layer:

Programming model: HPC runs FORTRAN / C++ applications over MPI (high performance); Big Data Analytics runs Java, Python, Go, etc.* applications over Hadoop* (simple to use).
Resource manager: SLURM (supports large-scale startup) vs. YARN* (more resilient to hardware failures).
File system: Lustre* (remote storage) vs. HDFS*, Spark* (local storage).
Hardware: compute- and memory-focused, high-performance components vs. storage-focused, standard server components.
Infrastructure: server storage on SSDs with a fabric switch vs. server storage on HDDs with an Ethernet switch.
6. Trends in HPC + Big Data Analytics
https://www.intelnervana.com/framework-optimizations/

Performance: code modernization (vector instructions), many-core, FPGAs, ASICs.
Business viability: faster time-to-market, lower costs (HPC in the cloud?), better products, HW & SW that are easy to maintain.
Standards: portability, open, common environments.
Usability: integrated solutions (storage + network + processing + memory), public investments.
7. Varied Resource Needs
https://www.intelnervana.com/framework-optimizations/

Typical HPC and typical Big Data workloads in production environments sit at different points on a data-vs.-compute spectrum:
• Small data + small compute: e.g. data analysis.
• Big data + small compute: e.g. search, streaming, data preconditioning.
• Small data + big compute: e.g. mechanical design, multi-physics.

Workloads such as high-frequency trading, numerical weather simulation, oil & gas seismic processing, video surveillance / traffic monitoring, and personal digital health each balance system cost differently across processor, memory, interconnect, and storage.
9. Intel® AI Tools
Portfolio of software tools to expedite and enrich AI development

† Formerly the Intel® Computer Vision SDK
*Other names and brands may be claimed as the property of others.
Developer personas shown represent the primary user base for each layer, but are not mutually exclusive.
All products, computer systems, dates, and figures are preliminary based on current expectations, and are subject to change without notice.

TOOLKITS (application developers):
• Deep learning deployment: OpenVINO™† (Open Visual Inference & Neural Network Optimization toolkit for inference deployment on CPU/GPU/FPGA/VPU using TensorFlow*, Caffe* & MXNet*) and Intel® Movidius™ SDK (optimized inference deployment for all Intel® Movidius™ VPUs using TensorFlow & Caffe).
• Deep learning: Intel® Deep Learning Studio‡, an open-source tool to compress the deep learning development cycle.
• Deep learning frameworks: TensorFlow, MXNet, Caffe, and BigDL* (Spark) are now optimized for CPU; optimizations for Caffe2, PyTorch, CNTK, and PaddlePaddle are in progress.

LIBRARIES (data scientists):
• Machine learning libraries: Python (scikit-learn, pandas, NumPy), R (CART, Random Forest, e1071), distributed (MLlib on Spark, Mahout).

FOUNDATION (library developers):
• Analytics, machine & deep learning primitives: Python* (Intel distribution optimized for machine learning), Intel® Data Analytics Acceleration Library (DAAL, including machine learning), and MKL-DNN / clDNN (open-source deep neural network functions for CPU / integrated graphics).
• Deep learning graph compiler: Intel® nGraph™ Compiler (Alpha), an open-sourced compiler for deep learning model computations optimized for multiple devices from multiple frameworks.
11. What's Inside Intel® Distribution of OpenVINO™ toolkit

OpenVX and the OpenVX logo are trademarks of the Khronos Group Inc.
OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.

• Intel® Deep Learning Deployment Toolkit: Model Optimizer (convert & optimize) and Inference Engine (optimized inference over IR files; IR = Intermediate Representation file). For Intel® CPU & GPU/Intel® Processor Graphics.
• Traditional computer vision: OpenCV* and OpenVX* optimized libraries & code samples.
• Increase media/video/graphics performance: Intel® Media SDK (open-source version) and OpenCL™ drivers & runtimes, for GPU/Intel® Processor Graphics.
• Optimize Intel® FPGA (Linux* only): FPGA runtime environment (from Intel® FPGA SDK for OpenCL™), bitstreams, samples.
• Tools & libraries, 30+ pre-trained models, computer vision algorithms, and samples.
• Intel® Architecture-based platform support, including Intel® Vision Accelerator Design Products and AI in production/developer kits.

OS support: CentOS* 7.4 (64-bit), Ubuntu* 16.04.3 LTS (64-bit), Microsoft Windows* 10 (64-bit), Yocto Project* version Poky Jethro v2.0.3 (64-bit).

An open-source version is available at 01.org/openvinotoolkit (some deep learning functions support Intel CPU/GPU only).
12. Intel® Deep Learning Deployment Toolkit for Deep Learning Inference

Workflow: trained models from Caffe*, TensorFlow*, MXNet*, Kaldi*, and ONNX* are converted and optimized by the Model Optimizer into IR files (IR = Intermediate Representation format), which the Inference Engine then loads and infers through a common API (C++ / Python) backed by hardware plugins (CPU, GPU, FPGA, NCS, GNA). Extendibility: C++ for the CPU plugin, OpenCL™ for the GPU and FPGA plugins.

Model Optimizer
▪ What it is: A Python-based tool to import trained models and convert them to Intermediate Representation.
▪ Why important: Optimizes for performance/space with conservative topology transformations; the biggest boost comes from conversion to data types matching the hardware.

Inference Engine
▪ What it is: A high-level inference API delivering optimized cross-platform inference.
▪ Why important: The interface is implemented as dynamically loaded plugins for each hardware type. It delivers the best performance for each type without requiring users to implement and maintain multiple code paths.

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.
GPU = Intel CPU with integrated graphics processing unit/Intel® Processor Graphics
13. Intel® Vision Products

Write once, deploy across Intel architecture, leverage common algorithms. The hardware portfolio spans Intel® CPUs (Atom®, Core™, Xeon®), Intel® CPUs with integrated graphics, and, through Intel® Vision Accelerator Design Products, Intel® Movidius™ VPUs & Intel® FPGAs, plus future accelerators (Keem Bay, etc.). These can be added to existing Intel® architectures for accelerated DL inference capabilities.

1. Intel® Distribution of OpenVINO™ toolkit: computer vision & deep learning inference tool with a common API.
2. Portfolio of hardware for computer vision & deep learning inference, device to cloud.
3. Ecosystem to cover the breadth of IoT vision systems.
15. BigDL
Bringing Deep Learning to Big Data
github.com/intel-analytics/BigDL

▪ Open-sourced deep learning library for Apache Spark*, making deep learning more accessible to big data users and data scientists.
▪ Feature parity with popular DL frameworks such as Caffe, Torch, TensorFlow, etc.
▪ Easy customer and developer experience: run deep learning applications as standard Spark programs, on top of existing Spark/Hadoop clusters (no cluster change).
▪ High performance powered by Intel MKL and multi-threaded programming; efficient scale-out leveraging the Spark architecture.

In the Spark stack, BigDL sits on Spark Core alongside SQL, SparkR, Streaming, MLlib, GraphX, ML Pipeline, and DataFrame.

For developers looking to run deep learning on Hadoop/Spark due to familiarity or analytics use.
16. Intel® nGraph™ Compiler

*Other names and brands may be claimed as the property of others.
All products, computer systems, dates, and figures are preliminary based on current expectations, and are subject to change without notice.

nGraph™ is an open-source deep learning compiler enabling the flexibility to run models across a variety of frameworks and hardware, including GPUs, future hardware, and future frameworks.
18. Integer Matrix Multiply Performance on Intel® Xeon® Platinum 8180 Processor
Configuration details on slide 13.

Enhanced matrix multiply performance on the Intel® Xeon® Scalable Processor through lower-precision integer ops.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance. Source: Intel, measured as of June 2017. Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

PUBLIC. Performance estimates were obtained prior to implementation of recent software patches and firmware updates intended to address exploits referred to as "Spectre" and "Meltdown." Implementation of these updates may make these results inapplicable to your device or system.
20. Parallel Programming Applied to AI
HPC techniques applied to AI: split training into four jobs (Job 0 through Job 3), each running 12 threads, pinned to cores with libnumactl and KMP_AFFINITY.
https://software.intel.com/en-us/articles/boosting-deep-learning-training-inference-performance-on-xeon-and-xeon-phi
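The core-pinning idea on this slide can be sketched with the Python standard library alone. This is only an illustration: `job_core_set` and `pin_current_process` are hypothetical helpers, and `os.sched_setaffinity` (Linux-only) covers only the CPU-affinity part of what `numactl` does, not its NUMA memory policy.

```python
import os

def job_core_set(job_id, threads_per_job=12):
    """Logical CPUs for one job: job 0 -> 0-11, job 1 -> 12-23, ...
    (the four-jobs-of-12-threads split shown on the slide)."""
    start = job_id * threads_per_job
    return set(range(start, start + threads_per_job))

def pin_current_process(cores):
    """Restrict the calling process (one training job) to the given CPU set.
    Linux-only; analogous to launching the job under `numactl -C`."""
    os.sched_setaffinity(0, cores)  # 0 = the current process

if __name__ == "__main__":
    # Each job would call pin_current_process(job_core_set(my_job_id))
    # before starting training; printing here instead of pinning keeps
    # the sketch runnable on machines with few CPUs.
    print(sorted(job_core_set(1)))
```

In practice each of the four jobs would pin itself (or be launched under `numactl`) so their thread pools never migrate across each other's cores.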
21. Igor Freitas
Intel AI Centers of Excellence: success stories

"Cognitive Traffic-Violation Validator"
✓ 22.5x faster performance on Intel® Xeon® Scalable Processors
"...fine processing that used to take 45 hours can now be done in under 2 hours." (Thiago Oliveira, Infrastructure Engineering Superintendent at SERPRO)
✓ Development of the mathematical model
"With that, we achieved 90% accuracy in the system, in addition to automating the entire project," said Gustavo Rocha, division head at SERPRO.
22. TensorFlow for CPU
Applying the process-affinity (NUMA-aware) technique in TensorFlow

intra_op_parallelism_threads: nodes that can use multiple threads to parallelize their execution will schedule the individual pieces into this pool.
inter_op_parallelism_threads: all ready nodes are scheduled in this pool.

config = tf.ConfigProto()
config.intra_op_parallelism_threads = 44
config.inter_op_parallelism_threads = 44
tf.Session(config=config)

Source:
https://www.tensorflow.org/guide/performance/overview#optimizing_for_cpu
31. 31
Parallel Programming Applied to AI
numactl -C 0-15,16-31 python MNIST.py
• More cores does not mean more performance
• 48 threads performed the same as 64 threads (102 s)
• Best time with 32 threads (83 s): a 1.22x speedup
[Chart: NUMACTL — training time in seconds vs. number of threads]
Threads → seconds: 4 → 271, 8 → 140, 16 → 112, 32 → 83, 48 → 102, 64 → 105
Baseline: 64 cores in "default" mode; time for 64 threads "default": 102 seconds
32. 32
Parallel Programming Applied to AI
export KMP_BLOCKTIME=0
numactl -C 0-15,16-31 python MNIST.py
• KMP_BLOCKTIME: how long, in milliseconds, a thread waits after finishing its task before going to sleep
• 2.68x speedup
• Best time with 16 threads
• Best performance-versus-cost trade-off with 2 threads
[Chart: NUMACTL with KMP_BLOCKTIME=0 — training time in seconds vs. number of threads]
KMP_BLOCKTIME=0: 1 → 67, 2 → 43, 4 → 41, 8 → 40, 16 → 39, 32 → 41, 48 → 50, 64 → 46
KMP_BLOCKTIME=default: 4 → 271, 8 → 140, 16 → 112, 32 → 83, 48 → 102, 64 → 105
Baseline: 64 cores in "default" mode
33. 33
Parallel Programming Applied to AI
export KMP_BLOCKTIME=0
export KMP_AFFINITY=granularity=fine,verbose,compact,1,0
numactl -C 0-15 python MNIST.py
• 16 threads: 4.86x speedup!
• Lower infrastructure cost
• More training jobs running at the same time
• Larger models
• No code changes
[Chart: training time in seconds vs. number of threads for NUMACTL, NUMACTL + KMP_BLOCKTIME=0, and NUMACTL + KMP_BLOCKTIME=0 + AFFINITY; baseline: 64 cores in "default" mode]
34. 34
Parallel Programming Applied to AI
KMP_AFFINITY=granularity=fine,verbose,compact,1,0
• Controls how threads are distributed across cores and sockets
• Impacts memory bandwidth
• compact:
• Places threads close to one another
• Faster data exchange between threads
• Data fits in cache
• Little data traffic between CPU and DRAM
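Putting the three techniques together, a launch script for one pinned training job might look like the sketch below. The core range (0-15), thread count, and script name follow the slides but are not prescriptive; tune them for your own core count.

```shell
#!/bin/sh
# Sketch: combine NUMA pinning with Intel OpenMP runtime settings
# for a single training job, as in the slides above.
export OMP_NUM_THREADS=16
export KMP_BLOCKTIME=0                            # go to sleep right after each task
export KMP_AFFINITY=granularity=fine,compact,1,0  # keep threads close together

# Pin both CPU and memory allocations to the cores/NUMA node in use:
# numactl -C 0-15 -m 0 python MNIST.py
```

Adding `-m 0` to numactl (an assumption beyond the slide, which pins CPUs only) also binds memory allocations to the same NUMA node, avoiding remote-DRAM traffic.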
37. 37
▪ Extends neural network support to include LSTM (long short-term memory) from ONNX*, TensorFlow*& MXNet*
frameworks, & 3D convolutional-based networks in preview mode (CPU-only) for non-vision use cases.
▪ Introduces Neural Network Builder API (preview), providing flexibility to create a graph from simple API calls and
directly deploy via the Inference Engine.
▪ Improves Performance - Delivers significant CPU performance boost on multicore systems through new
parallelization techniques via streams. Optimizes performance on Intel® Xeon®, Core™ & Atom processors through
INT8-based primitives for Intel® Advanced Vector Extensions (Intel® AVX-512), Intel® AVX2 & SSE4.2.
▪ Supports Raspberry Pi* hardware as a host for the Intel® Neural Compute Stick 2 (preview). Offload your deep
learning workloads to this low-cost, low-power USB device.
▪ Adds 3 new optimized pretrained models (for a total of 30+): Text detection of indoor/outdoor scenes, and 2
single-image super resolution networks that enhance image resolution by a factor of 3 or 4.
What’s New in Intel® Distribution of OpenVINO™ toolkit
2018 R5
See product site & release notes for more details about 2018 R5.
OpenVX and the OpenVX logo are trademarks of the Khronos Group Inc.
39. BigDL Configuration Details
Benchmark Segment AI/ML/DL
Benchmark type Training
Benchmark Metric Training Throughput (images/sec)
Framework BigDL master trunk with Spark 2.1.1
Topology Inception V1, VGG, ResNet-50, ResNet-152
# of Nodes 8, 16 (multiple configurations)
Platform Purley
Sockets 2S
Processor
Intel ® Xeon ® Scalable Platinum 8180 Processor (Skylake): 28-core @ 2.5
GHz (base), 3.8 GHz (max turbo), 205W
Intel ® Xeon ® Processor E5-2699v4 (Broadwell): 22-core @ 2.2 GHz (base),
3.6 GHz (max turbo), 145W
Enabled Cores Skylake: 56 per node, Broadwell: 44 per node
Total Memory Skylake: 384 GB, Broadwell: 256 GB
Memory Configuration
Skylake: 12 slots * 32 GB @ 2666 MHz Micron DDR4 RDIMMs
Broadwell: 8 slots * 32 GB @ 2400 MHz Kingston DDR4 RDIMMs
Storage
Skylake: Intel® SSD DC P3520 Series (2TB, 2.5in PCIe 3.0 x4, 3D1, MLC)
Broadwell: 8 * 3 TB Seagate HDDs
Network 1 * 10 GbE network per node
OS CentOS Linux release 7.3.1611 (Core), Linux kernel 4.7.2.el7.x86_64
HT On
Turbo On
Computer Type Dual-socket server
Framework Version https://github.com/intel-analytics/BigDL
Topology Version https://github.com/google/inception
Dataset, version ImageNet, 2012; Cifar-10
Performance command (Inception v1)
spark-submit --class com.intel.analytics.bigdl.models.inception.TrainInceptionV1 --master spark://$master_hostname:7077 --executor-cores=36 --num-executors=16 --total-executor-cores=576 --driver-memory=60g --executor-memory=300g $BIGDL_HOME/dist/lib/bigdl-*-SNAPSHOT-jar-with-dependencies.jar --batchSize 2304 --learningRate 0.0896 -f hdfs:///user/root/sequence/ --checkpoint $check_point_folder
Data setup Data was stored on HDFS and cached in memory before training
Java JDK 1.8.0 update 144
MKL Library version Intel MKL 2017
40. Spark Configuration Details
Configurations:
4.3X speedup for Spark MLlib through Intel Math Kernel Library (MKL)
▪ Spark-Perf (same for before and after): 9 nodes each with Intel® Xeon® processor E5-2697A v4 @ 2.60GHz * 2 (16 cores, 32 threads); 256 GB ; 10x SSDs; 10Gbps NIC
19x for HDFS Erasure Coding in micro workload (RawErasureCoderBenchmark) and 1.25x in Terasort, plus 50+% storage capacity saving and higher failure tolerance level.
▪ RawErasureCoderBenchmark (same for before and after): single node with Intel® Xeon® processor E5-2699 v4 @ 2.20GHz *2 (22 cores, 44 threads); 256GB; 8x HDDs; 10Gbps NIC
▪ Terasort (same for before and after): 10 nodes each with Intel® Xeon® processor E5-2699 v4 @ 2.20GHz *2 (22 cores, 44 threads); 256GB; 8x HDDs; 10Gbps NIC
5.6x for HBase off heaping read in micro workload (PE) and 1.3x in real Alibaba production workload
▪ PE (same for before and after): Intel® Xeon® Processor X5670 @ 2.93GHz *2 (6 cores, 12 threads); RAM: 150 GB; 1Gbps NIC
▪ Alibaba (same for before and after): 400 nodes cluster with Intel® Xeon® processors
1.22x Spark Shuffle File Encryption performance for TeraSort and 1.28x for BigBench
▪ Terasort (same for before and after): Single node with Intel® Xeon® Processor E5-2699 v3 @ 2.30GHz *2 (18 cores, 36 threads); 128GB; 4x SSD; 10Gbps NIC
▪ BigBench (same for before and after): 6 nodes each with Intel® Xeon® Processor E5-2699 v3 @ 2.30GHz *2 (18 cores, 36 threads); 256GB; 1x SSD; 8x SATA HDD 3TB, 10Gbps NIC
1.35X Spark Shuffle RPC encryption performance for TeraSort and 1.18x for BigBench
▪ Terasort (same for before and after): 3 nodes each with Intel® Xeon® Processor E5-2699 v3 @ 2.30GHz *2 (18 cores, 36 threads); 128GB; 4x SSD; 10Gbps NIC
▪ BigBench (same for before and after): 5 nodes. 1x head node: Intel® Xeon® Processor E5-2699 v3 @ 2.30GHz *2 (18 cores, 36 threads); 384GB; 1x SSD; 8x SATA HDD 3TB, 10Gbps NIC. 4x
worker nodes: each with Intel® Xeon® processor E5-2699 v4 @ 2.20GHz *2 (22 cores, 44 threads); 384GB; 1x SSD; 8x SATA HDD 3TB, 10Gbps NIC.
10X scalability for Word2Vec
▪ Intel® Xeon® processor E5-2630v2 * 2, 128 GB memory, 12x HDDs; 1000Mb NIC (14 nodes)
70X scalability for LDA (Latent Dirichlet Allocation)
▪ Intel Xeon E5-2630v2 * 2, 288GB Memory, SAS Raid5, 10Gb NIC
Optimization Notice
Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These
optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any
optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain
optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more
information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
41. Spark SQL Configurations
41
Two configurations compared: AEP (Apache Pass) vs. DRAM.
Hardware
• DRAM: 192GB (12x 16GB DDR4) [AEP config] vs. 768GB (24x 32GB DDR4) [DRAM config]
• Apache Pass: 1TB (ES2: 8 x 128GB) [AEP] vs. N/A [DRAM]
• AEP Mode: App Direct (Memkind) [AEP]; N/A [DRAM]
• SSD: N/A in both configurations
• CPU (worker): Intel® Xeon® Platinum 8170 @ 2.10GHz (2 threads/core, 26 cores/socket, 2 sockets; CPU max 3700 MHz, min 1000 MHz; L1d 32K, L1i 32K, L2 1024K, L3 36608K)
• OS: 4.16.6-202.fc27.x86_64 (BKC: WW26, BIOS: SE5C620.86B.01.00.0918.062020181644)
Software
• OAP cache: 1TB AEP-based [AEP] vs. 620GB DRAM-based [DRAM]
• Hadoop: 8 * HDD disk (ST1000NX0313, 1-replica uncompressed & plain encoded data on Hadoop)
• Spark: 1 * Driver (5GB) + 2 * Executor (62 cores, 74GB), spark.sql.oap.rowgroup.size=1MB
• JDK: Oracle JDK 1.8.0_161
Workload
• Data Scale: 2.6TB (the 9 queries' related data totals 729.4GB)
• TPC-DS Queries: 9 I/O-intensive queries (Q19, Q42, Q43, Q52, Q55, Q63, Q68, Q73, Q98)
• Multi-Tenants: 9 threads (fair scheduled)
42. Apache Cassandra Configurations
42
Two configurations compared: NVMe vs. Apache Pass.
Server hardware
• System: Intel® Server Board Purley Platform (2 socket)
• CPU: Dual Intel® Xeon® Platinum 8180 Processors, 28 cores/socket, 2 sockets, 2 threads/core; Hyper-Threading enabled
• DRAM: DDR4 dual rank 192GB total = 12 DIMMs 16GB @2667MHz [NVMe] vs. DDR4 dual rank 384GB total = 12 DIMMs 32GB @2667MHz [Apache Pass]
• Apache Pass: N/A [NVMe]; AEP ES.2 1.5TB total = 12 DIMMs * 128GB each (single rank, 128GB, 15W) [Apache Pass]
• Apache Pass mode: N/A [NVMe]; App-Direct [Apache Pass]
• NVMe: 4 x Intel P3500 1.6TB NVMe devices [NVMe]; N/A [Apache Pass]
• Network: 10Gbit on-board Intel NIC
Server software
• OS: Fedora 27, kernel 4.16.6-202.fc27.x86_64
• Cassandra: 3.11.2 release [NVMe]; Cassandra 4.0 trunk with App Direct patch version 2.1 (software at https://github.com/shyla226/cassandra/tree/13981, with PCJ library: https://github.com/pmem/pcj) [Apache Pass]
• JDK: Oracle Hotspot JDK (JDK1.8 u131)
• Spectre/Meltdown: patched for variants 1/2/3
Cassandra parameters
• Number of Cassandra instances: 1 [NVMe]; 14 [Apache Pass]
• Cluster nodes: one per cluster
• Garbage collector: CMS [NVMe]; Parallel [Apache Pass]
• JVM options (difference from default): -Xms64G -Xmx64G [NVMe]; -Xms20G -Xmx20G -Xmn8G -XX:+UseAdaptiveSizePolicy -XX:ParallelGCThreads=5 [Apache Pass]
• Schema: cqlstress-insanity-example.yaml
• Database size per instance: 1.25 billion entries [NVMe]; 100 K entries [Apache Pass]
Client hardware
• Number of client machines: 1 [NVMe]; 2 [Apache Pass]
• System: Intel® Server Board model S2600WFT (2 socket)
• CPU: Dual Intel® Xeon® Platinum 8176M CPU @ 2.1GHz, 28 cores/socket, 2 sockets, 2 threads/core
• DRAM: DDR4 384GB total = 12 DIMMs 32GB @2666MHz
• Network: 10Gbit on-board Intel NIC
Client software
• OS: Fedora 27, kernel 4.16.6-202.fc27.x86_64
• JDK: Oracle Hotspot JDK (JDK1.8 u131)
Workload
• Benchmark: Cassandra-Stress; instances: 1 [NVMe], 14 [Apache Pass]
• Write command [NVMe]:
cassandra-stress user profile=/root/cassandra_4.0/tools/cqlstress-insanity-example.yaml ops(insert=1) n=1250000000 cl=ONE no-warmup -pop seq=1..1250000000 -mode native cql3 -node <ip_addr> -rate threads=10
• Write command [Apache Pass]:
cassandra-stress user profile=/root/cassandra_4.0/tools/cqlstress-insanity-example.yaml ops(insert=1) n=100000 cl=ONE no-warmup -pop seq=1..100000 -mode native cql3 -node <ip_addr> -rate threads=10
• Read command [NVMe]:
cassandra-stress user profile=/root/cassandra_4.0/tools/cqlstress-insanity-example.yaml ops(simple1=1) duration=10m cl=ONE no-warmup -pop dist=UNIFORM(1..1250000000) -mode native cql3 -node <ip_addr> -rate threads=300
• Read command [Apache Pass]:
cassandra-stress user profile=/root/cassandra_4.0/tools/cqlstress-insanity-example.yaml ops(simple1=1) duration=3m cl=ONE no-warmup -pop dist=UNIFORM(1..100000) -mode native cql3 -node <ip_addr> -rate threads=320