Questions tagged [neural-network]
Artificial neural networks (ANNs) are composed of 'neurons': programming constructs that mimic the properties of biological neurons. A set of weighted connections between the neurons allows information to propagate through the network to solve artificial intelligence problems without the network designer having to model a real system.
4,383
questions
283
votes
12
answers
278k
views
What are deconvolutional layers?
I recently read Fully Convolutional Networks for Semantic Segmentation by Jonathan Long, Evan Shelhamer, Trevor Darrell. I don't understand what "deconvolutional layers" do / how they work.
The ...
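A minimal plain-Python sketch of what a "deconvolutional" layer computes, assuming the usual reading of the term as a transposed convolution (the kernel values below are made-up for illustration, not taken from the paper):

```python
# A 1-D "deconvolution" (more precisely, a transposed convolution) with
# stride 2: each input value scatters a scaled copy of the kernel into a
# longer output, which is how such layers learn to upsample feature maps.
def conv_transpose1d(signal, kernel, stride=2):
    out = [0.0] * ((len(signal) - 1) * stride + len(kernel))
    for i, v in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i * stride + j] += v * k   # scatter-add the kernel copy
    return out

upsampled = conv_transpose1d([1.0, 2.0, 3.0], [1.0, 0.5])
print(upsampled)  # [1.0, 0.5, 2.0, 1.0, 3.0, 1.5] -- length 3 -> 6
```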
199
votes
5
answers
155k
views
What is the "dying ReLU" problem in neural networks?
Referring to the Stanford course notes on Convolutional Neural Networks for Visual Recognition, a paragraph says:
"Unfortunately, ReLU units can be fragile during training and can 'die'. For ...
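The effect the course notes describe can be sketched in plain Python with a single unit whose bias is hypothetically pushed far negative (all weights and inputs below are made-up illustrative values):

```python
# A "dead" ReLU in miniature: a unit y = relu(w*x + b) whose bias lands
# every input in the flat region of the activation.
def relu(z):
    return max(0.0, z)

def relu_grad(z):
    # Derivative of ReLU w.r.t. its pre-activation (0 for z <= 0).
    return 1.0 if z > 0 else 0.0

w, b = 0.5, -10.0                    # hypothetical weights
inputs = [0.1, 1.3, 2.7, 0.9]

outputs = [relu(w * x + b) for x in inputs]
# Each example's contribution to dL/dw (up to the upstream gradient):
grads = [relu_grad(w * x + b) * x for x in inputs]

print(outputs)  # [0.0, 0.0, 0.0, 0.0] -- the unit always outputs zero
print(grads)    # [0.0, 0.0, 0.0, 0.0] -- so no update can revive it
```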
198
votes
6
answers
373k
views
How to draw Deep learning network architecture diagrams?
I have built my model. Now I want to draw the network architecture diagram for my research paper. Example is shown below:
182
votes
21
answers
260k
views
How do you visualize neural network architectures?
When writing a paper or making a presentation about a topic involving neural networks, one usually visualizes the network's architecture.
What are good / simple ways to visualize common ...
181
votes
6
answers
185k
views
When to use GRU over LSTM?
The key difference between a GRU and an LSTM is that a GRU has two gates (reset and update gates) whereas an LSTM has three gates (namely input, output and forget gates).
Why do we make use of GRU ...
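The gate count stated in the question can be made concrete with one-dimensional toy cells (all weights below are made-up scalars, not trained values):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# One-dimensional GRU step: two gates (update z, reset r), single state h.
def gru_step(x, h):
    z = sigmoid(0.5 * x + 0.5 * h)            # update gate
    r = sigmoid(0.5 * x + 0.5 * h)            # reset gate
    h_cand = math.tanh(1.0 * x + 1.0 * (r * h))
    return (1 - z) * h + z * h_cand           # blend old and new state

# One-dimensional LSTM step: three gates (input i, forget f, output o)
# plus a separate cell state c that the GRU does not have.
def lstm_step(x, h, c):
    i = sigmoid(0.5 * x + 0.5 * h)            # input gate
    f = sigmoid(0.5 * x + 0.5 * h)            # forget gate
    o = sigmoid(0.5 * x + 0.5 * h)            # output gate
    c = f * c + i * math.tanh(0.5 * x + 0.5 * h)
    return o * math.tanh(c), c

h_gru = 0.0
h_lstm, c_lstm = 0.0, 0.0
for x in [1.0, -0.5, 0.2]:
    h_gru = gru_step(x, h_gru)
    h_lstm, c_lstm = lstm_step(x, h_lstm, c_lstm)
print(round(h_gru, 4), round(h_lstm, 4))
```

The practical upshot of the fewer gates is fewer parameters per unit, which is one common argument for trying a GRU first.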
150
votes
17
answers
126k
views
Best python library for neural networks
I'm using neural networks to solve different machine learning problems. I'm using Python and PyBrain, but this library is almost discontinued. Are there other good alternatives in Python?
114
votes
11
answers
127k
views
Choosing a learning rate
I'm currently working on implementing Stochastic Gradient Descent (SGD) for neural nets using back-propagation, and while I understand its purpose I have some ...
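A minimal sketch of why the choice matters, using f(w) = w² as a toy stand-in for a loss surface (the rates are illustrative, not recommendations):

```python
# Gradient descent on f(w) = w^2, whose gradient is 2w. Each step is
# w_{t+1} = (1 - 2*lr) * w_t, so |1 - 2*lr| < 1 converges and
# |1 - 2*lr| > 1 diverges.
def descend(lr, steps=20, w=1.0):
    for _ in range(steps):
        w -= lr * 2 * w
    return w

w_small = descend(0.1)   # |1 - 0.2| = 0.8 -> shrinks toward the minimum
w_large = descend(1.5)   # |1 - 3.0| = 2.0 -> doubles in size each step
print(w_small, w_large)
```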
111
votes
5
answers
85k
views
Backprop Through Max-Pooling Layers?
This is a small conceptual question that's been nagging me for a while: How can we back-propagate through a max-pooling layer in a neural network?
I came across max-pooling layers while going through ...
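The standard answer, sketched in plain Python for one 2×2 window: the gradient is routed entirely to the input that produced the maximum, since the other inputs had no effect on the forward output.

```python
# Backprop through 2x2 max-pooling: only the argmax position receives
# the upstream gradient; all other positions get zero.
window = [[1.0, 3.0],
          [2.0, 0.5]]
upstream_grad = 5.0  # dL/d(pool output), an illustrative value

# Forward: take the max and remember where it came from.
flat = [(v, (i, j)) for i, row in enumerate(window) for j, v in enumerate(row)]
max_val, (mi, mj) = max(flat)

# Backward: route the gradient to the argmax only.
grad = [[0.0, 0.0], [0.0, 0.0]]
grad[mi][mj] = upstream_grad

print(max_val)  # 3.0
print(grad)     # [[0.0, 5.0], [0.0, 0.0]]
```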
90
votes
1
answer
89k
views
When to use (He or Glorot) normal initialization over uniform init? And what are its effects with Batch Normalization?
I know that Residual Networks (ResNet) made He normal initialization popular. In ResNet, He normal initialization is used, while the first layer uses He uniform initialization.
I've looked through ...
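One relevant fact behind the normal-vs-uniform choice: the two He variants are scaled to have the same variance, 2 / fan_in. A stdlib sketch (sample sizes and fan_in are arbitrary):

```python
import math
import random

fan_in = 256
std = math.sqrt(2.0 / fan_in)      # He normal: N(0, sqrt(2/fan_in))
limit = math.sqrt(6.0 / fan_in)    # He uniform: U(-limit, limit);
                                   # Var(U(-a, a)) = a^2/3 = 2/fan_in

random.seed(0)
w_normal = [random.gauss(0.0, std) for _ in range(10000)]
w_uniform = [random.uniform(-limit, limit) for _ in range(10000)]

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Both empirical variances should be close to 2/256 ~= 0.0078.
print(round(var(w_normal), 4), round(var(w_uniform), 4))
```

So the two differ in tail shape, not in scale, which is why the question about their interaction with Batch Normalization is subtler than a variance argument alone.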
87
votes
4
answers
53k
views
How are 1x1 convolutions the same as a fully connected layer?
I recently read Yann LeCun's comment on 1x1 convolutions:
In Convolutional Nets, there is no such thing as "fully-connected layers". There are only convolution layers with 1x1 convolution ...
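The equivalence LeCun is pointing at can be checked numerically: a 1×1 convolution applies the same C_in → C_out linear map at every spatial position, so on a 1×1 feature map it is exactly a fully connected layer. A plain-Python sketch (the weights are arbitrary example values):

```python
def conv1x1(feature_map, weights):
    # feature_map: [C_in][H][W], weights: [C_out][C_in]
    c_in = len(feature_map)
    h, w = len(feature_map[0]), len(feature_map[0][0])
    return [[[sum(wo[c] * feature_map[c][i][j] for c in range(c_in))
              for j in range(w)] for i in range(h)]
            for wo in weights]

def fully_connected(x, weights):
    # x: [C_in], weights: [C_out][C_in]
    return [sum(wo[c] * x[c] for c in range(len(x))) for wo in weights]

weights = [[1.0, -1.0], [0.5, 2.0]]        # C_in = 2, C_out = 2
x = [3.0, 1.0]

fc_out = fully_connected(x, weights)
# The same input viewed as a 2-channel 1x1 feature map:
conv_out = conv1x1([[[3.0]], [[1.0]]], weights)

print(fc_out)    # [2.0, 3.5]
print(conv_out)  # [[[2.0]], [[3.5]]] -- identical values, per channel
```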
83
votes
5
answers
50k
views
What is the difference between "equivariant to translation" and "invariant to translation"
I'm having trouble understanding the difference between equivariant to translation and invariant to translation.
In the book Deep Learning (MIT Press, 2016; I. Goodfellow, A. Courville, and Y. Bengio)...
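The distinction can be demonstrated in one dimension with a toy stride-1 convolution (the kernel is an arbitrary example): shifting the input shifts the convolution output by the same amount (equivariance), while a global max-pool on top yields the same value either way (invariance).

```python
def conv1d(signal, kernel):
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def shift(seq, n):
    # Circular shift; the zero padding in `signal` avoids border effects.
    return seq[-n:] + seq[:-n]

signal = [0.0, 1.0, 2.0, 0.0, 0.0, 0.0]
kernel = [1.0, -1.0]

# Equivariance: conv then shift == shift then conv.
conv_then_shift = shift(conv1d(signal, kernel), 2)
shift_then_conv = conv1d(shift(signal, 2), kernel)
print(conv_then_shift == shift_then_conv)   # True

# Invariance: global max-pooling ignores the shift entirely.
pooled = max(conv1d(signal, kernel))
pooled_shifted = max(conv1d(shift(signal, 2), kernel))
print(pooled == pooled_shifted)             # True
```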
78
votes
6
answers
165k
views
What is the difference between Gradient Descent and Stochastic Gradient Descent?
What is the difference between Gradient Descent and Stochastic Gradient Descent?
I am not very familiar with these; can you describe the difference with a short example?
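The short-example form of the usual answer: batch gradient descent makes one update per pass over the whole dataset, while SGD updates after each (shuffled) example. A least-squares sketch with made-up data:

```python
import random

data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]   # roughly y = 2x
lr = 0.05

def grad(w, x, y):
    # d/dw of 0.5 * (w*x - y)^2
    return (w * x - y) * x

# Batch gradient descent: average gradient over all data, one update.
w_gd = 0.0
w_gd -= lr * sum(grad(w_gd, x, y) for x, y in data) / len(data)

# Stochastic gradient descent: one update per (shuffled) example.
w_sgd = 0.0
examples = data[:]
random.shuffle(examples)
for x, y in examples:
    w_sgd -= lr * grad(w_sgd, x, y)

print(w_gd, w_sgd)  # both move from 0 toward w = 2, by different paths
```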
76
votes
6
answers
152k
views
Cross-entropy loss explanation
Suppose I build a neural network for classification. The last layer is a dense layer with softmax activation. I have five different classes to classify. Suppose for a single training example, the ...
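For the five-class setup described, cross-entropy reduces to the negative log of the probability the softmax assigns to the true class. A stdlib sketch with made-up logits:

```python
import math

def softmax(logits):
    m = max(logits)                          # subtract max for stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1, -1.0, 0.5]          # one example, 5 classes
probs = softmax(logits)
true_class = 0
loss = -math.log(probs[true_class])          # cross-entropy for one example

print([round(p, 3) for p in probs])
print(round(loss, 4))  # smaller when probs[true_class] is closer to 1
```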
70
votes
5
answers
52k
views
Adding Features To Time Series Model LSTM
I have been reading up a bit on LSTMs and their use for time series, and it's been interesting but difficult at the same time. One thing I have had difficulty understanding is the approach to ...
69
votes
11
answers
103k
views
Why should the data be shuffled for machine learning tasks
In machine learning tasks it is common to shuffle the data and normalize it. The purpose of normalization is clear (so that features share the same range of values). But, after struggling a lot, I did not find ...
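One common answer, sketched with toy labels: if the data is ordered by class, every mini-batch is single-class and successive gradient steps pull in conflicting directions; shuffling makes each batch a roughly unbiased sample of the whole dataset.

```python
import random

labels_sorted = [0] * 6 + [1] * 6            # class-ordered toy labels
labels_shuffled = labels_sorted[:]
random.seed(0)                               # deterministic illustration
random.shuffle(labels_shuffled)

def batches(seq, size):
    return [seq[i:i + size] for i in range(0, len(seq), size)]

print(batches(labels_sorted, 3))   # every batch contains a single class
print(batches(labels_shuffled, 3)) # classes are (typically) mixed
```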