1
Dr HAMADI CHAREF Brahim
Non-Volatile Memory (NVM)
Data Storage Institute (DSI), A*STAR
Recent developments in Deep Learning
May 30, 2016
2
Deep Learning – Convolutional NNets
3
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han, Huizi Mao, William J. Dally
International Conference on Learning Representations (ICLR), 2016
http://arxiv.org/abs/1510.00149

Learning both Weights and Connections for Efficient Neural Networks
Song Han, Jeff Pool, John Tran, William J. Dally
Neural Information Processing Systems (NIPS), 2015
http://arxiv.org/abs/1506.02626

EIE: Efficient Inference Engine on Compressed Deep Neural Network
Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A. Horowitz, William J. Dally
International Symposium on Computer Architecture (ISCA), 2016
http://arxiv.org/abs/1602.01528

SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, Kurt Keutzer
Technical report, 2016
http://arxiv.org/abs/1602.07360
Recent developments in Deep Learning
4
LeNet. The first successful applications of convolutional networks were developed by Yann LeCun in the 1990s. Of these, the best known is the LeNet architecture, which was used to read zip codes, digits, etc.

AlexNet. The first work that popularized convolutional networks in computer vision was AlexNet, developed by Alex Krizhevsky, Ilya Sutskever and Geoff Hinton. AlexNet was submitted to the ImageNet ILSVRC challenge in 2012 and significantly outperformed the runner-up (top-5 error of 16% compared to the runner-up's 26%). The network had an architecture very similar to LeNet, but was deeper and bigger, and featured convolutional layers stacked directly on top of each other (previously it was common to have only a single CONV layer, always immediately followed by a POOL layer).

VGGNet. The runner-up in ILSVRC 2014 was the network from Karen Simonyan and Andrew Zisserman that became known as VGGNet. Its main contribution was showing that the depth of the network is a critical component of good performance. Their final best network contains 16 CONV/FC layers and, appealingly, features an extremely homogeneous architecture that performs only 3x3 convolutions and 2x2 pooling from beginning to end. Their pretrained model is available for plug-and-play use in Caffe. A downside of VGGNet is that it is more expensive to evaluate and uses far more memory and parameters (140M). Most of these parameters are in the first fully connected layer, and it has since been found that these FC layers can be removed with no performance downgrade, significantly reducing the number of necessary parameters.
Convolutional Neural Networks (CNNs / ConvNets)
http://cs231n.github.io/convolutional-networks/
Recent developments in Deep Learning
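The homogeneous 3x3-convolution / 2x2-pooling pattern that VGGNet popularized is easy to see in code. Below is a minimal sketch of one VGG-style stage written with the Keras API and scaled down to CIFAR-sized 32x32 inputs; it is an illustration only, not the authors' implementation, and the layer counts and channel widths are assumptions chosen to keep the example small.

# Minimal sketch of VGG-style stages: stacked 3x3 convolutions followed by 2x2 max pooling.
# Illustrative assumptions: 32x32 input and small filter counts (VGG-16 itself uses 224x224 inputs,
# five stages and three large fully connected layers).
from tensorflow.keras import layers, models

def vgg_stage(x, filters, num_convs):
    # Homogeneous VGG pattern: every conv is 3x3 with 'same' padding and ReLU.
    for _ in range(num_convs):
        x = layers.Conv2D(filters, (3, 3), padding="same", activation="relu")(x)
    # 2x2 max pooling halves the spatial resolution between stages.
    return layers.MaxPooling2D(pool_size=(2, 2))(x)

inputs = layers.Input(shape=(32, 32, 3))
x = vgg_stage(inputs, 64, 2)    # 32 -> 16
x = vgg_stage(x, 128, 2)        # 16 -> 8
x = vgg_stage(x, 256, 3)        # 8 -> 4
x = layers.Flatten()(x)
outputs = layers.Dense(10, activation="softmax")(x)
models.Model(inputs, outputs).summary()

The full VGG-16 repeats this pattern over five stages and ends with three fully connected layers, which is where most of its roughly 140M parameters sit, as noted above.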
5
Deep Learning – Paper 1
6
Deep Learning – Paper 1
1 INTRODUCTION
2 NETWORK PRUNING
3 TRAINED QUANTIZATION AND WEIGHT SHARING
3.1 WEIGHT SHARING
3.2 INITIALIZATION OF SHARED WEIGHTS
3.3 FEED-FORWARD AND BACK-PROPAGATION
4 HUFFMAN CODING
5 EXPERIMENTS
5.1 LENET-300-100 AND LENET-5 ON MNIST
5.2 ALEXNET ON IMAGENET
5.3 VGG-16 ON IMAGENET
6 DISCUSSIONS
6.1 PRUNING AND QUANTIZATION WORKING TOGETHER
6.2 CENTROID INITIALIZATION
6.3 SPEEDUP AND ENERGY EFFICIENCY
6.4 RATIO OF WEIGHTS, INDEX AND CODEBOOK
7 RELATED WORK
8 FUTURE WORK
9 CONCLUSION
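The outline above names the three stages of the Deep Compression pipeline: prune the network, quantize the surviving weights via k-means weight sharing, then Huffman-code the result. The sketch below illustrates the first two stages on a single weight matrix; it is a toy illustration of the idea under assumed thresholds and cluster counts, not the authors' released code.

# Minimal sketch of pruning + k-means weight sharing on one weight matrix.
# The 90th-percentile threshold and 4-bit codebook are arbitrary choices for the example.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 256)).astype(np.float32)   # stand-in for a trained layer's weights

# 1) Pruning: zero out small-magnitude weights (the paper then retrains the survivors).
threshold = np.percentile(np.abs(W), 90)             # keep roughly the largest 10% of weights
mask = np.abs(W) > threshold
pruned = W * mask

# 2) Weight sharing: cluster the surviving weights into 2^bits centroids;
#    each weight is then stored as a small index into the shared codebook.
bits = 4
nonzero = pruned[mask].reshape(-1, 1)
kmeans = KMeans(n_clusters=2**bits, n_init=10, random_state=0).fit(nonzero)
codebook = kmeans.cluster_centers_.flatten()
indices = kmeans.labels_

# Reconstruct the quantized layer: shared centroid values replace the original weights.
quantized = np.zeros_like(pruned)
quantized[mask] = codebook[indices]

print("nonzero weights:", int(mask.sum()), "of", W.size)
print("distinct weight values after sharing:", len(np.unique(quantized[mask])))

In the paper the codebook is additionally fine-tuned by accumulating gradients per cluster, the pruned network is retrained, and the index stream is Huffman coded; this sketch only shows the pruning and weight-sharing structure.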
7
Deep Learning – Paper 1
8
Deep Learning – Paper 1
9
Deep Learning – Paper 1
10
Deep Learning – Paper 1
THE MNIST DATABASE of handwritten digits
http://yann.lecun.com/exdb/mnist/
Visual Geometry Group (University of Oxford)
http://www.robots.ox.ac.uk/~vgg/research/very_deep/
Alex Krizhevsky https://www.cs.toronto.edu/~kriz/
The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
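These are the benchmark datasets behind the experiments outlined above (LeNet on MNIST, AlexNet and VGG-16 on ImageNet). A quick way to get a feel for the smaller two is to load them and print their array shapes; the sketch below uses the Keras dataset loaders purely for convenience, which is an assumption of this example (the papers themselves report results with Caffe models).

# Inspect the MNIST and CIFAR-10 benchmark datasets via the Keras loaders
# (a convenience assumption for this sketch, not the tooling used in the papers).
from tensorflow.keras.datasets import mnist, cifar10

(x_tr, y_tr), (x_te, y_te) = mnist.load_data()
print("MNIST:", x_tr.shape, x_te.shape)      # (60000, 28, 28) train, (10000, 28, 28) test

(x_tr, y_tr), (x_te, y_te) = cifar10.load_data()
print("CIFAR-10:", x_tr.shape, x_te.shape)   # (50000, 32, 32, 3) train, (10000, 32, 32, 3) test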
11
Deep Learning – Paper 1
12
Deep Learning – Paper 1
13
Deep Learning – Paper 1
14
Deep Learning – Paper 1
15
Deep Learning – Paper 1
16
Deep Learning – Paper 1
17
Deep Learning – Paper 1
18
Deep Learning – Paper 1
19
Deep Learning – Paper 1
20
Deep Learning – Paper 2
NIPS2015 Review
http://media.nips.cc/nipsbooks/nipspapers/paper_files/nips28/reviews/708.html
21
Deep Learning – Paper 2
[7] Mark Horowitz. Energy table for 45nm process, Stanford VLSI wiki
Mark Horowitz, Professor of Electrical Engineering and Computer Science, Stanford University
Research areas: VLSI, Hardware, Graphics and Imaging, Applying Engineering to Biology
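The Horowitz energy table is what these papers use to argue that memory access, not arithmetic, dominates inference energy. The back-of-the-envelope calculation below uses the approximate 45 nm figures as summarized in the Deep Compression and EIE papers (roughly 640 pJ per 32-bit DRAM access, about 5 pJ per 32-bit SRAM access, under 1 pJ per 32-bit float add); treat the exact values as assumptions of this sketch, not measurements.

# Rough energy estimate for fetching the weights of a 1M-parameter layer once,
# using approximate 45 nm figures cited in the papers (order-of-magnitude assumptions).
PJ_DRAM_32BIT = 640.0   # ~pJ per 32-bit off-chip DRAM access
PJ_SRAM_32BIT = 5.0     # ~pJ per 32-bit on-chip SRAM access
PJ_FADD_32BIT = 0.9     # ~pJ per 32-bit floating-point add (for comparison)

weights = 1_000_000
dram_energy_uj = weights * PJ_DRAM_32BIT * 1e-6   # pJ -> uJ
sram_energy_uj = weights * PJ_SRAM_32BIT * 1e-6
print(f"DRAM fetch: {dram_energy_uj:.0f} uJ, SRAM fetch: {sram_energy_uj:.0f} uJ "
      f"(~{PJ_DRAM_32BIT / PJ_SRAM_32BIT:.0f}x difference)")

This gap is the motivation for compressing models until they fit in on-chip SRAM, which is exactly what Deep Compression and EIE aim for.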
22
Deep Learning – Paper 2
23
Deep Learning – Paper 2
24
25
Deep Learning – Paper 2
26
Deep Learning – Paper 2
27
Deep Learning – Paper 2
28
Deep Learning – Paper 2
29
Deep Learning – Paper 3
30
Deep Learning – Paper 3
31
Deep Learning – Paper 3
32
Deep Learning – Paper 3
33
Deep Learning – Paper 4
34
Deep Learning – Paper 3
35
Deep Learning – Paper 3
36
Deep Learning – Paper 3
37
Deep Learning – Paper 3
38
Deep Learning – Paper 3
39
Deep Learning – Paper 3
40
Deep Learning – Paper 3
41
Deep Learning – Paper 4
42
Deep Learning – Paper 4
43
Deep Learning – Paper 4
1. Introduction and Motivation
More efficient distributed training
Less overhead when exporting new models to clients
Feasible FPGA and embedded deployment
2. Related Work
2.1. Model Compression
2.2. CNN Microarchitecture
2.3. CNN Macroarchitecture
2.4. Neural Network Design Space Exploration
3. SqueezeNet: preserving accuracy with few parameters
3.1. Architectural Design Strategies
Strategy 1. Replace 3x3 filters with 1x1 filters
Strategy 2. Decrease the number of input channels to 3x3 filters
Strategy 3. Downsample late in the network so that convolution layers have large activation maps
3.2. The Fire Module
3.3. The SqueezeNet architecture
3.3.1 Other SqueezeNet details
5. CNN Microarchitecture Design Space Exploration
5.1. CNN Microarchitecture metaparameters
5.2. Squeeze Ratio
5.3. Trading off 1x1 and 3x3 filters
6. CNN Macroarchitecture Design Space Exploration
7. Model Compression Design Space Exploration
7.1. Sensitivity Analysis: Where to Prune or Add parameters
Sensitivity analysis applied to model compression
Sensitivity analysis applied to increasing accuracy
7.2. Improving Accuracy by Densifying Sparse Models
8. Conclusions
V. Nair and G. E. Hinton. Rectified linear units improve restricted Boltzmann machines. In ICML, 2010.
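The three design strategies above come together in SqueezeNet's Fire module: a "squeeze" layer of 1x1 filters (Strategy 1) limits the number of input channels seen by the following 3x3 filters (Strategy 2), and an "expand" layer mixes 1x1 and 3x3 filters whose outputs are concatenated. Below is a minimal Keras sketch of one Fire module; the structure follows the paper, but the filter counts are example values rather than a claim about the exact published configuration.

# Minimal sketch of a SqueezeNet Fire module: a 1x1 "squeeze" layer feeding parallel
# 1x1 and 3x3 "expand" branches whose outputs are concatenated channel-wise.
# Filter counts are example values for illustration.
from tensorflow.keras import layers, models

def fire_module(x, squeeze_filters, expand_filters):
    s = layers.Conv2D(squeeze_filters, (1, 1), activation="relu")(x)    # squeeze: few 1x1 filters
    e1 = layers.Conv2D(expand_filters, (1, 1), activation="relu")(s)    # expand, 1x1 branch
    e3 = layers.Conv2D(expand_filters, (3, 3), padding="same",
                       activation="relu")(s)                            # expand, 3x3 branch
    return layers.Concatenate()([e1, e3])                               # concatenate along channels

inputs = layers.Input(shape=(55, 55, 96))
x = fire_module(inputs, squeeze_filters=16, expand_filters=64)   # output has 64 + 64 = 128 channels
models.Model(inputs, x).summary()

Because the squeeze layer keeps the 3x3 branch's input channel count small, the module has far fewer parameters than a plain 3x3 convolution of the same output width, which is how SqueezeNet reaches AlexNet-level accuracy with roughly 50x fewer parameters.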
44
Deep Learning – Paper 4
45
Deep Learning – Paper 4
46
Deep Learning – Paper 4
47
Deep Learning – Paper 4
48
Deep Learning – Paper 4
49
Deep Learning – Paper 4
50
Deep Learning – Paper 4
51
Deep Learning – Paper 4
DLD meetup 2017, Efficient Deep LearningDLD meetup 2017, Efficient Deep Learning
DLD meetup 2017, Efficient Deep Learning
 
A Survey on Image Processing using CNN in Deep Learning
A Survey on Image Processing using CNN in Deep LearningA Survey on Image Processing using CNN in Deep Learning
A Survey on Image Processing using CNN in Deep Learning
 
LOAD BALANCED CLUSTERING WITH MIMO UPLOADING TECHNIQUE FOR MOBILE DATA GATHER...
LOAD BALANCED CLUSTERING WITH MIMO UPLOADING TECHNIQUE FOR MOBILE DATA GATHER...LOAD BALANCED CLUSTERING WITH MIMO UPLOADING TECHNIQUE FOR MOBILE DATA GATHER...
LOAD BALANCED CLUSTERING WITH MIMO UPLOADING TECHNIQUE FOR MOBILE DATA GATHER...
 
Reservoir computing fast deep learning for sequences
Reservoir computing   fast deep learning for sequencesReservoir computing   fast deep learning for sequences
Reservoir computing fast deep learning for sequences
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Towards Dropout Training for Convolutional Neural Networks
Towards Dropout Training for Convolutional Neural Networks Towards Dropout Training for Convolutional Neural Networks
Towards Dropout Training for Convolutional Neural Networks
 
Development of 3D convolutional neural network to recognize human activities ...
Development of 3D convolutional neural network to recognize human activities ...Development of 3D convolutional neural network to recognize human activities ...
Development of 3D convolutional neural network to recognize human activities ...
 
convolutional_neural_networks.pptx
convolutional_neural_networks.pptxconvolutional_neural_networks.pptx
convolutional_neural_networks.pptx
 
Efficient mobilenet architecture_as_image_recognit
Efficient mobilenet architecture_as_image_recognitEfficient mobilenet architecture_as_image_recognit
Efficient mobilenet architecture_as_image_recognit
 
Dp2 ppt by_bikramjit_chowdhury_final
Dp2 ppt by_bikramjit_chowdhury_finalDp2 ppt by_bikramjit_chowdhury_final
Dp2 ppt by_bikramjit_chowdhury_final
 
Artificial Neural Network Implementation on FPGA – a Modular Approach
Artificial Neural Network Implementation on FPGA – a Modular ApproachArtificial Neural Network Implementation on FPGA – a Modular Approach
Artificial Neural Network Implementation on FPGA – a Modular Approach
 
캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)
캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)
캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)
 
CNNs: from the Basics to Recent Advances
CNNs: from the Basics to Recent AdvancesCNNs: from the Basics to Recent Advances
CNNs: from the Basics to Recent Advances
 
CNN
CNNCNN
CNN
 
DEEP LEARNING BASED BRAIN STROKE DETECTION
DEEP LEARNING BASED BRAIN STROKE DETECTIONDEEP LEARNING BASED BRAIN STROKE DETECTION
DEEP LEARNING BASED BRAIN STROKE DETECTION
 
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_ReportSaptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
 

Recently uploaded

Exploring Deep Learning Models for Image Recognition: A Comparative Review
Exploring Deep Learning Models for Image Recognition: A Comparative ReviewExploring Deep Learning Models for Image Recognition: A Comparative Review
Exploring Deep Learning Models for Image Recognition: A Comparative Review
sipij
 
Advances in Detect and Avoid for Unmanned Aircraft Systems and Advanced Air M...
Advances in Detect and Avoid for Unmanned Aircraft Systems and Advanced Air M...Advances in Detect and Avoid for Unmanned Aircraft Systems and Advanced Air M...
Advances in Detect and Avoid for Unmanned Aircraft Systems and Advanced Air M...
VICTOR MAESTRE RAMIREZ
 
kiln burning and kiln burner system for clinker
kiln burning and kiln burner system for clinkerkiln burning and kiln burner system for clinker
kiln burning and kiln burner system for clinker
hamedmustafa094
 
Lecture 3 Biomass energy...............ppt
Lecture 3 Biomass energy...............pptLecture 3 Biomass energy...............ppt
Lecture 3 Biomass energy...............ppt
RujanTimsina1
 
LeetCode Database problems solved using PySpark.pdf
LeetCode Database problems solved using PySpark.pdfLeetCode Database problems solved using PySpark.pdf
LeetCode Database problems solved using PySpark.pdf
pavanaroshni1977
 
GUIA_LEGAL_CHAPTER-9_COLOMBIAN ELECTRICITY (1).pdf
GUIA_LEGAL_CHAPTER-9_COLOMBIAN ELECTRICITY (1).pdfGUIA_LEGAL_CHAPTER-9_COLOMBIAN ELECTRICITY (1).pdf
GUIA_LEGAL_CHAPTER-9_COLOMBIAN ELECTRICITY (1).pdf
ProexportColombia1
 
Lecture 6 - The effect of Corona effect in Power systems.pdf
Lecture 6 - The effect of Corona effect in Power systems.pdfLecture 6 - The effect of Corona effect in Power systems.pdf
Lecture 6 - The effect of Corona effect in Power systems.pdf
peacekipu
 
SCADAmetrics Instrumentation for Sensus Water Meters - Core and Main Training...
SCADAmetrics Instrumentation for Sensus Water Meters - Core and Main Training...SCADAmetrics Instrumentation for Sensus Water Meters - Core and Main Training...
SCADAmetrics Instrumentation for Sensus Water Meters - Core and Main Training...
Jim Mimlitz, P.E.
 
OCS Training - Rig Equipment Inspection - Advanced 5 Days_IADC.pdf
OCS Training - Rig Equipment Inspection - Advanced 5 Days_IADC.pdfOCS Training - Rig Equipment Inspection - Advanced 5 Days_IADC.pdf
OCS Training - Rig Equipment Inspection - Advanced 5 Days_IADC.pdf
Muanisa Waras
 
How to Manage Internal Notes in Odoo 17 POS
How to Manage Internal Notes in Odoo 17 POSHow to Manage Internal Notes in Odoo 17 POS
How to Manage Internal Notes in Odoo 17 POS
Celine George
 
Rohini @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
Rohini @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model SafeRohini @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
Rohini @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
binna singh$A17
 
Net Zero Case Study: SRK House and SRK Empire
Net Zero Case Study: SRK House and SRK EmpireNet Zero Case Study: SRK House and SRK Empire
Net Zero Case Study: SRK House and SRK Empire
Global Network for Zero
 
Best Practices of Clothing Businesses in Talavera, Nueva Ecija, A Foundation ...
Best Practices of Clothing Businesses in Talavera, Nueva Ecija, A Foundation ...Best Practices of Clothing Businesses in Talavera, Nueva Ecija, A Foundation ...
Best Practices of Clothing Businesses in Talavera, Nueva Ecija, A Foundation ...
IJAEMSJORNAL
 
Germany Offshore Wind 010724 RE (1) 2 test.pptx
Germany Offshore Wind 010724 RE (1) 2 test.pptxGermany Offshore Wind 010724 RE (1) 2 test.pptx
Germany Offshore Wind 010724 RE (1) 2 test.pptx
rebecca841358
 
Trends in Computer Aided Design and MFG.
Trends in Computer Aided Design and MFG.Trends in Computer Aided Design and MFG.
Trends in Computer Aided Design and MFG.
Tool and Die Tech
 
Development of Chatbot Using AI/ML Technologies
Development of  Chatbot Using AI/ML TechnologiesDevelopment of  Chatbot Using AI/ML Technologies
Development of Chatbot Using AI/ML Technologies
maisnampibarel
 
Response & Safe AI at Summer School of AI at IIITH
Response & Safe AI at Summer School of AI at IIITHResponse & Safe AI at Summer School of AI at IIITH
Response & Safe AI at Summer School of AI at IIITH
IIIT Hyderabad
 
Online music portal management system project report.pdf
Online music portal management system project report.pdfOnline music portal management system project report.pdf
Online music portal management system project report.pdf
Kamal Acharya
 
Biology for computer science BBOC407 vtu
Biology for computer science BBOC407 vtuBiology for computer science BBOC407 vtu
Biology for computer science BBOC407 vtu
santoshpatilrao33
 
Profiling of Cafe Business in Talavera, Nueva Ecija: A Basis for Development ...
Profiling of Cafe Business in Talavera, Nueva Ecija: A Basis for Development ...Profiling of Cafe Business in Talavera, Nueva Ecija: A Basis for Development ...
Profiling of Cafe Business in Talavera, Nueva Ecija: A Basis for Development ...
IJAEMSJORNAL
 

Recently uploaded (20)

Exploring Deep Learning Models for Image Recognition: A Comparative Review
Exploring Deep Learning Models for Image Recognition: A Comparative ReviewExploring Deep Learning Models for Image Recognition: A Comparative Review
Exploring Deep Learning Models for Image Recognition: A Comparative Review
 
Advances in Detect and Avoid for Unmanned Aircraft Systems and Advanced Air M...
Advances in Detect and Avoid for Unmanned Aircraft Systems and Advanced Air M...Advances in Detect and Avoid for Unmanned Aircraft Systems and Advanced Air M...
Advances in Detect and Avoid for Unmanned Aircraft Systems and Advanced Air M...
 
kiln burning and kiln burner system for clinker
kiln burning and kiln burner system for clinkerkiln burning and kiln burner system for clinker
kiln burning and kiln burner system for clinker
 
Lecture 3 Biomass energy...............ppt
Lecture 3 Biomass energy...............pptLecture 3 Biomass energy...............ppt
Lecture 3 Biomass energy...............ppt
 
LeetCode Database problems solved using PySpark.pdf
LeetCode Database problems solved using PySpark.pdfLeetCode Database problems solved using PySpark.pdf
LeetCode Database problems solved using PySpark.pdf
 
GUIA_LEGAL_CHAPTER-9_COLOMBIAN ELECTRICITY (1).pdf
GUIA_LEGAL_CHAPTER-9_COLOMBIAN ELECTRICITY (1).pdfGUIA_LEGAL_CHAPTER-9_COLOMBIAN ELECTRICITY (1).pdf
GUIA_LEGAL_CHAPTER-9_COLOMBIAN ELECTRICITY (1).pdf
 
Lecture 6 - The effect of Corona effect in Power systems.pdf
Lecture 6 - The effect of Corona effect in Power systems.pdfLecture 6 - The effect of Corona effect in Power systems.pdf
Lecture 6 - The effect of Corona effect in Power systems.pdf
 
SCADAmetrics Instrumentation for Sensus Water Meters - Core and Main Training...
SCADAmetrics Instrumentation for Sensus Water Meters - Core and Main Training...SCADAmetrics Instrumentation for Sensus Water Meters - Core and Main Training...
SCADAmetrics Instrumentation for Sensus Water Meters - Core and Main Training...
 
OCS Training - Rig Equipment Inspection - Advanced 5 Days_IADC.pdf
OCS Training - Rig Equipment Inspection - Advanced 5 Days_IADC.pdfOCS Training - Rig Equipment Inspection - Advanced 5 Days_IADC.pdf
OCS Training - Rig Equipment Inspection - Advanced 5 Days_IADC.pdf
 
How to Manage Internal Notes in Odoo 17 POS
How to Manage Internal Notes in Odoo 17 POSHow to Manage Internal Notes in Odoo 17 POS
How to Manage Internal Notes in Odoo 17 POS
 
Rohini @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
Rohini @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model SafeRohini @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
Rohini @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
 
Net Zero Case Study: SRK House and SRK Empire
Net Zero Case Study: SRK House and SRK EmpireNet Zero Case Study: SRK House and SRK Empire
Net Zero Case Study: SRK House and SRK Empire
 
Best Practices of Clothing Businesses in Talavera, Nueva Ecija, A Foundation ...
Best Practices of Clothing Businesses in Talavera, Nueva Ecija, A Foundation ...Best Practices of Clothing Businesses in Talavera, Nueva Ecija, A Foundation ...
Best Practices of Clothing Businesses in Talavera, Nueva Ecija, A Foundation ...
 
Germany Offshore Wind 010724 RE (1) 2 test.pptx
Germany Offshore Wind 010724 RE (1) 2 test.pptxGermany Offshore Wind 010724 RE (1) 2 test.pptx
Germany Offshore Wind 010724 RE (1) 2 test.pptx
 
Trends in Computer Aided Design and MFG.
Trends in Computer Aided Design and MFG.Trends in Computer Aided Design and MFG.
Trends in Computer Aided Design and MFG.
 
Development of Chatbot Using AI/ML Technologies
Development of  Chatbot Using AI/ML TechnologiesDevelopment of  Chatbot Using AI/ML Technologies
Development of Chatbot Using AI/ML Technologies
 
Response & Safe AI at Summer School of AI at IIITH
Response & Safe AI at Summer School of AI at IIITHResponse & Safe AI at Summer School of AI at IIITH
Response & Safe AI at Summer School of AI at IIITH
 
Online music portal management system project report.pdf
Online music portal management system project report.pdfOnline music portal management system project report.pdf
Online music portal management system project report.pdf
 
Biology for computer science BBOC407 vtu
Biology for computer science BBOC407 vtuBiology for computer science BBOC407 vtu
Biology for computer science BBOC407 vtu
 
Profiling of Cafe Business in Talavera, Nueva Ecija: A Basis for Development ...
Profiling of Cafe Business in Talavera, Nueva Ecija: A Basis for Development ...Profiling of Cafe Business in Talavera, Nueva Ecija: A Basis for Development ...
Profiling of Cafe Business in Talavera, Nueva Ecija: A Basis for Development ...
 

Recent developments in Deep Learning

6
Deep Learning – Paper 1
1 INTRODUCTION
2 NETWORK PRUNING
3 TRAINED QUANTIZATION AND WEIGHT SHARING (see the sketch after this outline)
3.1 WEIGHT SHARING
3.2 INITIALIZATION OF SHARED WEIGHTS
3.3 FEED-FORWARD AND BACK-PROPAGATION
4 HUFFMAN CODING
5 EXPERIMENTS
5.1 LENET-300-100 AND LENET-5 ON MNIST
5.2 ALEXNET ON IMAGENET
5.3 VGG-16 ON IMAGENET
6 DISCUSSIONS
6.1 PRUNING AND QUANTIZATION WORKING TOGETHER
6.2 CENTROID INITIALIZATION
6.3 SPEEDUP AND ENERGY EFFICIENCY
6.4 RATIO OF WEIGHTS, INDEX AND CODEBOOK
7 RELATED WORK
8 FUTURE WORK
9 CONCLUSION
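The outline above describes a three-stage pipeline: prune small-magnitude connections, quantize the surviving weights so that many connections share a few centroid values, then Huffman-code the resulting indices. As a rough illustration only (not the paper's implementation), the sketch below applies magnitude pruning and k-means weight sharing to a single toy weight matrix; the threshold, the 16-entry codebook, and the use of NumPy and scikit-learn are assumptions, and the retraining and Huffman-coding stages are omitted.

# Sketch: magnitude pruning followed by k-means weight sharing
# (assumes NumPy and scikit-learn; all parameters are illustrative).
import numpy as np
from sklearn.cluster import KMeans

def prune(weights, threshold=0.05):
    # Zero out connections whose magnitude falls below the threshold.
    mask = np.abs(weights) > threshold
    return weights * mask, mask

def share_weights(pruned, mask, n_clusters=16):
    # Cluster the surviving weights so they share n_clusters centroid
    # values; a 16-entry codebook means each weight is stored as a
    # 4-bit index into the codebook.
    values = pruned[mask].reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(values)
    shared = pruned.copy()
    shared[mask] = km.cluster_centers_[km.labels_].ravel()
    return shared, km.cluster_centers_.ravel()

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(64, 64))   # a toy fully connected layer
w_pruned, mask = prune(w)
w_shared, codebook = share_weights(w_pruned, mask)
print("sparsity:", 1.0 - mask.mean())
print("distinct weight values after sharing:", codebook.size)

In the paper the network is retrained after pruning and after quantization, and the cluster indices are finally Huffman coded, which is a lossless step on top of the sparse, quantized representation.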
10
Deep Learning – Paper 1
THE MNIST DATABASE of handwritten digits
http://yann.lecun.com/exdb/mnist/
Visual Geometry Group (University of Oxford)
http://www.robots.ox.ac.uk/~vgg/research/very_deep/
Alex Krizhevsky
https://www.cs.toronto.edu/~kriz/
The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
20
Deep Learning – Paper 2
NIPS2015 Review
http://media.nips.cc/nipsbooks/nipspapers/paper_files/nips28/reviews/708.html
21
Deep Learning – Paper 2
[7] Mark Horowitz. Energy table for 45nm process, Stanford VLSI wiki
Mark Horowitz, Professor of Electrical Engineering and Computer Science
VLSI, Hardware, Graphics and Imaging, Applying Engineering to Biology
43
Deep Learning – Paper 4
1. Introduction and Motivation
More efficient distributed training
Less overhead when exporting new models to clients
Feasible FPGA and embedded deployment
2. Related Work
2.1. Model Compression
2.2. CNN Microarchitecture
2.3. CNN Macroarchitecture
2.4. Neural Network Design Space Exploration
3. SqueezeNet: preserving accuracy with few parameters
3.1. Architectural Design Strategies
Strategy 1. Replace 3x3 filters with 1x1 filters
Strategy 2. Decrease the number of input channels to 3x3 filters
Strategy 3. Downsample late in the network so that convolution layers have large activation maps
3.2. The Fire Module (see the sketch after this outline)
3.3. The SqueezeNet architecture
3.3.1 Other SqueezeNet details
5. CNN Microarchitecture Design Space Exploration
5.1. CNN Microarchitecture metaparameters
5.2. Squeeze Ratio
5.3. Trading off 1x1 and 3x3 filters
6. CNN Macroarchitecture Design Space Exploration
7. Model Compression Design Space Exploration
7.1. Sensitivity Analysis: Where to Prune or Add parameters
Sensitivity analysis applied to model compression
Sensitivity analysis applied to increasing accuracy
7.2. Improving Accuracy by Densifying Sparse Models
8. Conclusions
Rectified linear units improve restricted Boltzmann machines. V. Nair and G. E. Hinton. In ICML, 2010.
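The Fire module and Strategies 1 and 2 can be made concrete with a short sketch. The following is a minimal PyTorch rendering, used here purely for illustration (the paper's released model is in Caffe); the channel counts are chosen to resemble an early Fire module and are not a definitive reproduction.

# Sketch of a SqueezeNet Fire module (assumes PyTorch; sizes illustrative).
import torch
import torch.nn as nn

class Fire(nn.Module):
    def __init__(self, in_ch, squeeze_ch, expand1x1_ch, expand3x3_ch):
        super().__init__()
        # Squeeze: 1x1 convolutions cut the channel count seen by the
        # 3x3 filters (Strategy 2).
        self.squeeze = nn.Conv2d(in_ch, squeeze_ch, kernel_size=1)
        # Expand: a mix of cheap 1x1 filters and 3x3 filters (Strategy 1
        # replaces many 3x3 filters with 1x1 filters).
        self.expand1x1 = nn.Conv2d(squeeze_ch, expand1x1_ch, kernel_size=1)
        self.expand3x3 = nn.Conv2d(squeeze_ch, expand3x3_ch, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        s = self.relu(self.squeeze(x))
        # The two expand branches are concatenated along the channel axis.
        return torch.cat([self.relu(self.expand1x1(s)),
                          self.relu(self.expand3x3(s))], dim=1)

fire = Fire(in_ch=96, squeeze_ch=16, expand1x1_ch=64, expand3x3_ch=64)
x = torch.randn(1, 96, 55, 55)     # a toy activation map
print(fire(x).shape)               # torch.Size([1, 128, 55, 55])

The squeeze ratio of Section 5.2 corresponds here to squeeze_ch divided by the total number of expand filters (16 / 128 in this example). Downsampling late (Strategy 3) is a property of where pooling is placed in the overall network rather than of the module itself.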