This document discusses using deep learning for seismic tomography. It begins with an overview of seismic tomography and the forward and inverse problems. It then discusses using deep learning approaches like empirical risk minimization with neural networks to solve the inverse problem. Several deep learning architectures are evaluated, including those using semblance cubes, spectrograms of raw seismic data, and raw seismic data directly as input. Recurrent neural networks with LSTM and GRU cells are also explored for image reconstruction. The document concludes that while performance is good on simple models, more data and increased network capacity are needed for complex geology. It also lists several related publications.
This slide deck introduces recent anchor-free object detection methods for general objects and person detection, summarizing more than 10 papers on the topic.
Oral presentation at IEEE International Conference on Image Processing (ICIP), Hong Kong, September 2010. Abstract: Non-uniform filters are frequently used in many image processing applications to describe regions or to detect specific features. However, non-uniform filtering is a computationally complex task. This paper presents a method to perform fast non-uniform filtering using a reduced number of memory accesses. The idea is based on integral images, which are commonly used for box or Haar wavelet filtering. The disadvantage of those filters for several applications is their uniform shape. We describe a method to build Symmetric Weighted Integral Images that are tailored for a variety of kernels, and the process to perform fast filtering with them. We show a relevant speedup when compared to Kernel Integral Images, and a large one when compared to conventional non-uniform filtering, by reducing the computational complexity.
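The Symmetric Weighted Integral Images above generalize the plain integral image (summed-area table) that the paper builds on. As background, a minimal sketch of the standard technique: after one pass to build the table, the sum of any axis-aligned box is four lookups, independent of box size. Function names here are illustrative, not from the paper.

```python
import numpy as np

def integral_image(img):
    # Cumulative sums along rows then columns give the summed-area table:
    # ii[r, c] = sum of img[0:r+1, 0:c+1].
    return img.cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, r0, c0, r1, c1):
    # Sum of img[r0:r1+1, c0:c1+1] from at most four table lookups, O(1) per box.
    total = ii[r1, c1]
    if r0 > 0:
        total -= ii[r0 - 1, c1]
    if c0 > 0:
        total -= ii[r1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total
```

This constant-time box sum is what makes box and Haar filtering fast; the paper's contribution is extending the trick beyond uniform (box-shaped) kernels.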
The document discusses object detection techniques including R-CNN, SPPnet, Fast R-CNN, and Faster R-CNN. R-CNN uses region proposals and CNN features to classify each region. SPPnet improves efficiency by computing CNN features once for the whole image. Fast R-CNN further improves efficiency by sharing computation and using a RoI pooling layer. Faster R-CNN introduces a region proposal network to generate proposals, achieving end-to-end training. The techniques showed improved accuracy and processing speed over prior methods.
The 45th talk of the TensorFlow Korea paper-reading group PR12 covered DeepLab, a semantic image segmentation algorithm, and explained several related papers building on it.
Faster R-CNN improves object detection by introducing a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. The RPN slides over feature maps and predicts object bounds and objectness at each position. During training, anchors are assigned positive or negative labels based on Intersection over Union with ground truth boxes. Faster R-CNN merges the RPN and the Fast R-CNN detector into a single network that can be trained end-to-end. This achieves state-of-the-art object detection speed and accuracy while eliminating the computationally expensive selective search for proposals.
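The IoU-based anchor labeling described above can be sketched in a few lines. The 0.7/0.3 thresholds follow the Faster R-CNN paper; for brevity this sketch omits the paper's extra rule that the highest-IoU anchor for each ground-truth box is also marked positive. Function names are illustrative.

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2); returns intersection-over-union in [0, 1].
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def label_anchor(anchor, gt_boxes, pos=0.7, neg=0.3):
    # Positive above pos, negative below neg; anything in between is
    # ignored and contributes nothing to the training loss.
    best = max(iou(anchor, gt) for gt in gt_boxes)
    if best >= pos:
        return 1
    if best < neg:
        return 0
    return -1
```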
The Scientific Computing, Applied and Industrial Mathematics (SCAIM) Seminar at University of British Columbia. October 2019.
- Two convolutional neural network architectures are presented to reduce noise in low-dose CT images. The first network is inspired by dictionary learning methods. An efficient improved network is also presented. - Important parameters for each network are investigated to determine the best performance. The models are tested and results are compared to state-of-the-art methods, showing superior performance. - Future work could explore advanced deep learning methods like deep residual networks, generative adversarial networks, or improving contrast in DICOM images.
Slides by Miriam Bellver from the Computer Vision Reading Group at the Universitat Politecnica de Catalunya about the paper: Lu, Yongxi, Tara Javidi, and Svetlana Lazebnik. "Adaptive Object Detection Using Adjacency and Zoom Prediction." CVPR 2016 Abstract: State-of-the-art object detection systems rely on an accurate set of region proposals. Several recent methods use a neural network architecture to hypothesize promising object locations. While these approaches are computationally efficient, they rely on fixed image regions as anchors for predictions. In this paper we propose to use a search strategy that adaptively directs computational resources to sub-regions likely to contain objects. Compared to methods based on fixed anchor locations, our approach naturally adapts to cases where object instances are sparse and small. Our approach is comparable in terms of accuracy to the state-of-the-art Faster R-CNN approach while using two orders of magnitude fewer anchors on average. Code is publicly available.
This slide deck provides a brief summary of recent progress on object detection using deep learning. It introduces the concepts of selected previous works (the R-CNN series, YOLO, and SSD) and 6 recent papers (posted to arXiv between Dec 2016 and Mar 2017). Most of the papers focus on improving the performance of small-object detection.
For the full video of this presentation, please visit: http://www.embedded-vision.com/platinum-members/auvizsystems/embedded-vision-training/videos/pages/may-2016-embedded-vision-summit For more information about embedded vision, please visit: http://www.embedded-vision.com Nagesh Gupta, Founder and CEO of Auviz Systems, presents the "Semantic Segmentation for Scene Understanding: Algorithms and Implementations" tutorial at the May 2016 Embedded Vision Summit. Recent research in deep learning provides powerful tools that begin to address the daunting problem of automated scene understanding. Modifying deep learning methods, such as CNNs, to classify pixels in a scene with the help of the neighboring pixels has provided very good results in semantic segmentation. This technique provides a good starting point towards understanding a scene. A second challenge is how such algorithms can be deployed on embedded hardware at the performance required for real-world applications. A variety of approaches are being pursued for this, including GPUs, FPGAs, and dedicated hardware. This talk provides insights into deep learning solutions for semantic segmentation, focusing on current state of the art algorithms and implementation choices. Gupta discusses the effect of porting these algorithms to fixed-point representation and the pros and cons of implementing them on FPGAs.
Locating objects in images (“detection”) quickly and efficiently enables object tracking and counting applications on embedded visual sensors (fixed and mobile). By 2012, progress on techniques for detecting objects in images – a topic of perennial interest in computer vision – had plateaued, and techniques based on histogram of oriented gradients (HOG) were state of the art. Soon, though, convolutional neural networks (CNNs), in addition to classifying objects, were also beginning to become effective at simultaneously detecting objects. Research in CNN-based object detection was jump-started by the groundbreaking region-based CNN (R-CNN). We’ll follow the evolution of neural network algorithms for object detection, starting with R-CNN and proceeding to Fast R-CNN, Faster R-CNN, “You Only Look Once” (YOLO), and up to the latest Single Shot Multibox detector. In this talk, we’ll examine the successive innovations in performance and accuracy embodied in these algorithms – which is a good way to understand the insights behind effective neural-network-based object localization. We’ll also contrast bounding-box approaches with pixel-level segmentation approaches and present pros and cons.
This document describes research on using region-oriented convolutional neural networks for object retrieval. It discusses using local CNNs like CaffeNet, Fast R-CNN, and SDS to extract visual features from object candidates in images. These features are used to match against query descriptors. Pooled regional features are ranked to retrieve relevant shots. Fine-tuning pre-trained networks on larger datasets like COCO can improve retrieval accuracy. Combining global and local approaches through re-ranking provides an additional boost in performance.
This document describes a proposed method for real-time object detection using Single Shot Multi-Box Detection (SSD) with the MobileNet model. SSD is a single, unified network for object detection that eliminates feature resampling and combines predictions. MobileNet is used to create a lightweight network by employing depthwise separable convolutions, which significantly reduces model size compared to regular convolutions. The proposed SSD with MobileNet model achieved improved accuracy in identifying real-time household objects while maintaining the detection speed of SSD.
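The parameter saving from depthwise separable convolutions can be made concrete with a quick count: a standard convolution learns one k x k x c_in kernel per output channel, while the separable version learns one k x k depthwise filter per input channel plus a 1x1 pointwise mixing layer. A minimal sketch (function names are illustrative):

```python
def standard_conv_params(k, c_in, c_out):
    # One k x k x c_in kernel per output channel.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # One k x k depthwise filter per input channel,
    # plus a 1x1 pointwise convolution to mix channels.
    return k * k * c_in + c_in * c_out
```

For a 3x3 convolution mapping 64 channels to 128, this gives 73,728 versus 8,768 parameters, roughly an 8.4x reduction, which is why MobileNet backbones keep SSD lightweight.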
https://telecombcn-dl.github.io/2017-dlcv/ Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
ICANN 2017 26th International Conference on Artificial Neural Networks Alghero, Sardinia, Italy September 2017
The document discusses Mask R-CNN, an extension of Faster R-CNN object detection that also performs semantic segmentation. Mask R-CNN adds a branch for predicting segmentation masks on each Region of Interest independently of class. During training, the mask branch learns to segment objects regardless of class, and at test time predicts masks for all classes using a "winner takes all" approach. The document also compares Mask R-CNN to Faster R-CNN and FCN approaches.
This document summarizes the paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift". It introduces batch normalization, which reduces internal covariate shift by normalizing layer inputs, thereby speeding up the training of neural networks. Normalization statistics are computed over each mini-batch and applied to the inputs. This allows higher learning rates and acts as a regularizer. Experiments show batch normalization stabilizes and accelerates the training of neural networks on ImageNet classification.
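The training-time transform the paper defines is short enough to sketch directly: per-feature mean and variance over the mini-batch, normalization, then a learned scale gamma and shift beta that restore representational power. This sketch covers only the training path; inference uses running averages of the statistics, which is omitted here.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    # x has shape (batch, features). Normalize each feature over the
    # mini-batch, then scale and shift with learned gamma and beta.
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```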
1. Dimensionality reduction techniques like PCA can be used to optimize master event templates for cross-correlation based seismic event detection and location. 2. The document explores using various dimensionality reduction methods such as PCA, IPCA, and SSD on both real and synthetic seismic data to minimize the number of templates needed. 3. Representing seismic data as hypercomplex numbers or tensors can allow dimensionality reduction techniques to utilize the full multidimensional information from seismic arrays for improved master event design.
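The core PCA step behind template optimization can be sketched via the SVD: center the template matrix and keep the top-k right singular vectors as a compact basis that captures most of the template variance. This is a generic PCA sketch, not the document's specific IPCA or SSD variants; function names are illustrative.

```python
import numpy as np

def pca_basis(X, k):
    # Rows of X are waveform templates. Returns the mean and the top-k
    # principal directions, a reduced basis spanning most variance.
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def project(X, mean, basis):
    # Coordinates of each template in the reduced basis.
    return (X - mean) @ basis.T
```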
This document summarizes an adaptive modular approach for mining sensor network data using machine learning techniques. It presents a two-layer architecture that uses an online compression algorithm (PCA) in the first layer to reduce data dimensionality and an adaptive lazy learning algorithm (KNN) in the second layer for prediction and regression tasks. Simulation results on a wave propagation dataset show the approach can handle non-stationarities like concept drift, sensor failures and network changes in an efficient and adaptive manner.
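The second layer's lazy learning idea can be illustrated with a minimal KNN regressor: no global model is fit, and each query is answered by averaging the targets of its k nearest stored samples, which is what makes adaptation to drift cheap (just update the stored samples). A generic sketch, not the document's exact algorithm:

```python
import numpy as np

def knn_regress(X_train, y_train, x, k=3):
    # Lazy learning: defer all computation to query time and average
    # the targets of the k nearest training samples.
    d2 = np.sum((X_train - x) ** 2, axis=1)
    nearest = np.argsort(d2)[:k]
    return y_train[nearest].mean()
```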
ieee nss mic 2016 poster N30-21 Pixel Discrimination using Artificial Neural Network for Gamma Camera Module
1. The document describes using a deep neural network to detect changes between two SAR images by preclassifying the images, training the neural network on selected samples, and analyzing the results. 2. A similarity matrix and variance matrix are calculated during preclassification to identify and jointly label similar pixels, while different pixels are labeled separately. Good samples are selected to train the neural network. 3. The neural network is tested on images with different types and levels of noise and performs well at change detection, with performance increasing as noise decreases. Future work could focus on accelerating the training process.
This document summarizes research on using model counting approaches to analyze nonlinear numerical constraints that arise in applications like probabilistic inference, reliability analysis, and side-channel analysis. It presents two implementations of modular exponentiation with nonlinear constraints and evaluates the performance of various exact and approximate model counting tools on the path conditions extracted from symbolic execution. The results show that for small domains, brute force counting works best, while approximate model counting scales better to larger problems.
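The brute-force baseline mentioned above is simple to state: enumerate every assignment over the finite domain and count the satisfying ones, which is exact but only tractable while |domain| ** n_vars stays small. A minimal sketch with an illustrative nonlinear constraint (not one of the paper's benchmarks):

```python
from itertools import product

def brute_force_count(constraint, domain, n_vars):
    # Exact model count by exhaustive enumeration; cost grows as
    # |domain| ** n_vars, so this only wins on small domains.
    return sum(1 for vals in product(domain, repeat=n_vars)
               if constraint(*vals))
```

Approximate counters trade this exactness for scalability on the larger domains where enumeration becomes infeasible.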
Abstract The Comprehensive Nuclear-Test-Ban Treaty’s verification regime requires uniform distribution of monitoring capabilities over the globe. The use of waveform cross correlation as a monitoring technique demands waveform templates from master events outside regions of natural seismicity and test sites. We populated aseismic areas with masters having synthetic templates for predefined sets (from 3 to 10) of primary array stations of the International Monitoring System. Previously, we tested the global set of master events and synthetic templates using IMS seismic data for February 12, 2013 and demonstrated excellent detection and location capability of the matched filter technique. In this study, we test the global grid of synthetic master events using seismic events from the Reviewed Event Bulletin. For detection, we use the standard STA/LTA (SNR) procedure applied to the time series of the cross correlation coefficient (CC). Phase association is based on SNR, CC, and arrival times. Azimuth and slowness estimates based on f-k analysis of cross correlation traces are used to reject false arrivals.
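The STA/LTA detection statistic applied to the CC trace can be sketched as the ratio of a short-term to a long-term moving average of the absolute trace; a peak in this ratio flags a candidate detection. This simplified sketch uses centered windows for brevity, whereas operational detectors typically use causal windows with the LTA preceding the STA.

```python
import numpy as np

def sta_lta(cc, n_sta=10, n_lta=100):
    # SNR proxy on a cross-correlation trace: short-term average
    # divided by long-term average of the absolute signal.
    e = np.abs(cc)
    sta = np.convolve(e, np.ones(n_sta) / n_sta, mode="same")
    lta = np.convolve(e, np.ones(n_lta) / n_lta, mode="same")
    return sta / np.maximum(lta, 1e-12)
```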
1. The document discusses using an adaptive general regression neural network (GRNN) for automatic mapping of environmental data. 2. A GRNN is a modification of the Nadaraya-Watson nonparametric regressor that can perform nonlinear modeling, feature selection, and characterize uncertainties to produce quality maps. 3. An illustrative case study applies GRNN to precipitation data in Switzerland, showing it can automatically filter out irrelevant variables and produce accurate interpolation maps and uncertainty analyses of residuals.
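Since the GRNN is described as a modification of the Nadaraya-Watson regressor, the underlying estimator is worth sketching: the prediction at a query point is a Gaussian-weighted average of all training targets, with the bandwidth sigma controlling smoothness. This is the base Nadaraya-Watson form, not the document's adaptive variant with feature selection.

```python
import numpy as np

def grnn_predict(X_train, y_train, x, sigma=1.0):
    # Nadaraya-Watson kernel regression: weight every training target
    # by a Gaussian of its distance to the query, then average.
    d2 = np.sum((X_train - x) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return np.sum(w * y_train) / np.sum(w)
```

Small sigma makes the estimate nearly interpolate the training data; large sigma smooths toward the global mean, which is why bandwidth (and per-feature bandwidths, for feature selection) is the key tuning parameter.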
An artificial neural network was used to accurately identify the interaction positions of gamma photons in a gamma camera detector module. Training datasets were acquired along lines parallel to the x and y axes to simplify the training process and optimize the neural network structure. The proposed method improved discrimination accuracy at the edges of the detector compared to conventional algorithms and reduced the energy resolution from 22.8% to 15.7%, demonstrating its effectiveness for gamma camera systems.
(1) The document describes using neural networks called autoencoders to perform dimensionality reduction on data in a nonlinear way. Autoencoders use an encoder network to transform high-dimensional data into a low-dimensional code, and a decoder network to recover the data from the code. (2) The autoencoders are trained to minimize the discrepancy between the original and reconstructed data. Experiments on image and face datasets showed autoencoders outperforming principal components analysis at reconstructing the original data from the low-dimensional code. (3) Pretraining the autoencoder layers using restricted Boltzmann machines helps optimize the many weights in deep autoencoders and scale the approach to large datasets.
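The encoder/decoder structure and the training objective from point (1) and (2) can be sketched in a few lines. This is a single-hidden-layer forward pass with a tanh encoder and linear decoder for illustration only; the paper's deep autoencoders stack many layers and pretrain them with restricted Boltzmann machines, which is not shown here.

```python
import numpy as np

def encode(x, W_enc, b_enc):
    # Encoder: map high-dimensional input to a low-dimensional code.
    return np.tanh(x @ W_enc + b_enc)

def decode(code, W_dec, b_dec):
    # Decoder: reconstruct the input from the code (linear output).
    return code @ W_dec + b_dec

def reconstruction_error(x, W_enc, b_enc, W_dec, b_dec):
    # The training objective: mean squared discrepancy between the
    # input and its reconstruction through the bottleneck.
    x_hat = decode(encode(x, W_enc, b_enc), W_dec, b_dec)
    return np.mean((x - x_hat) ** 2)
```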
This document surveys pruning methods for person re-identification networks. It introduces how convolutional neural networks (CNNs) have achieved high accuracy in tasks like person re-identification but at the cost of high complexity. Siamese networks are used for person re-identification by extracting features from images using a shared-weight backbone network. Pruning techniques can significantly reduce the complexity of these networks by reducing parameters and computations while maintaining high accuracy. The document reviews different pruning methods like filter pruning, adaptive filter pruning, and compares their performance on re-identification datasets.