
Questions tagged [neural-network]

Artificial neural networks (ANNs) are composed of 'neurons': programming constructs that mimic the properties of biological neurons. A set of weighted connections between the neurons allows information to propagate through the network to solve artificial intelligence problems without the network designer needing a model of a real system.

283 votes
12 answers
278k views

What are deconvolutional layers?

I recently read Fully Convolutional Networks for Semantic Segmentation by Jonathan Long, Evan Shelhamer, Trevor Darrell. I don't understand what "deconvolutional layers" do / how they work. The ...
asked by Martin Thoma
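A minimal sketch of the idea behind a "deconvolutional" (transposed-convolution) layer, assuming PyTorch purely for illustration: a strided convolution halves the spatial size, and a transposed convolution with matching parameters learns to map it back up.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 8, 8)                      # (batch, channels, H, W)
down = nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1)
up = nn.ConvTranspose2d(16, 3, kernel_size=3, stride=2,
                        padding=1, output_padding=1)

y = down(x)   # shape (1, 16, 4, 4) -- downsampled by the stride
z = up(y)     # shape (1, 3, 8, 8)  -- learned upsampling back to 8x8
```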
199 votes
5 answers
155k views

What is the "dying ReLU" problem in neural networks?

Referring to the Stanford course notes on Convolutional Neural Networks for Visual Recognition, a paragraph says: "Unfortunately, ReLU units can be fragile during training and can "die". For ...
asked by tejaskhot (4,085)
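A minimal NumPy sketch of why a ReLU unit can "die": once its pre-activation is negative for every input, the gradient flowing back through it is zero, so its weights stop updating; a leaky ReLU keeps a small gradient instead.

```python
import numpy as np

z = np.array([-3.0, -0.5, 1.2, 4.0])        # pre-activations of one unit

relu = np.maximum(z, 0.0)
relu_grad = (z > 0).astype(float)            # [0, 0, 1, 1] -- zero gradient where z <= 0

alpha = 0.01                                 # leaky ReLU keeps a small slope for z <= 0
leaky = np.where(z > 0, z, alpha * z)
leaky_grad = np.where(z > 0, 1.0, alpha)     # never exactly zero
```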
198 votes
6 answers
373k views

How to draw Deep learning network architecture diagrams?

I have built my model. Now I want to draw the network architecture diagram for my research paper. Example is shown below:
asked by Muhammad Ali (2,487)
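One commonly used option (an assumption here, not necessarily what the asker wants) is Keras's plot_model, which renders the layer graph to an image when pydot and Graphviz are installed.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Requires pydot and Graphviz; writes a block diagram of the model to model.png.
tf.keras.utils.plot_model(model, to_file="model.png", show_shapes=True)
```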
182 votes
21 answers
260k views

How do you visualize neural network architectures?

When writing a paper / making a presentation about a topic involving neural networks, one usually visualizes the network's architecture. What are good / simple ways to visualize common ...
asked by Martin Thoma
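For a plain-text alternative (again assuming Keras only for illustration), model.summary() prints each layer with its output shape and parameter count, which is often enough for a quick figure or table.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.summary()   # text table: layer name, output shape, number of parameters
```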
181 votes
6 answers
185k views

When to use GRU over LSTM?

The key difference between a GRU and an LSTM is that a GRU has two gates (reset and update gates) whereas an LSTM has three gates (namely input, output and forget gates). Why do we make use of GRU ...
asked by Sayali Sonawane
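A quick way to see the practical consequence of the extra gate (a sketch assuming PyTorch): an LSTM keeps four weight blocks per layer versus three for a GRU (reset, update, candidate), so for the same hidden size the LSTM has roughly 4/3 as many parameters.

```python
import torch.nn as nn

lstm = nn.LSTM(input_size=10, hidden_size=20)
gru = nn.GRU(input_size=10, hidden_size=20)

n_lstm = sum(p.numel() for p in lstm.parameters())   # 2560: 4 weight blocks per layer
n_gru = sum(p.numel() for p in gru.parameters())     # 1920: 3 weight blocks per layer
print(n_lstm / n_gru)                                # ~1.33
```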
150 votes
17 answers
126k views

Best python library for neural networks

I'm using neural networks to solve different machine learning problems. I'm using Python and pybrain, but this library is almost discontinued. Are there other good alternatives in Python?
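As a hedged illustration of what a modern alternative looks like (Keras here, chosen only as an example; other libraries are similar), defining and training a small network takes a few lines.

```python
import numpy as np
import tensorflow as tf

X = np.random.rand(200, 10).astype("float32")
y = (X.sum(axis=1) > 5).astype("float32")          # toy binary labels

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
```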
114 votes
11 answers
127k views

Choosing a learning rate

I'm currently working on implementing Stochastic Gradient Descent (SGD) for neural nets using back-propagation, and while I understand its purpose I have some ...
asked by ragingSloth (1,824)
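A minimal NumPy sketch of the role the learning rate plays in an SGD update (the toy problem and names are illustrative, not from the question): each step moves the weights against the gradient, scaled by the learning rate, so too small a value crawls and too large a value can diverge.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)

w = np.zeros(3)
learning_rate = 0.05                              # the hyperparameter being chosen

for epoch in range(20):
    for i in rng.permutation(len(X)):             # one sample at a time (stochastic)
        err = X[i] @ w - y[i]
        grad = err * X[i]                         # gradient of 0.5 * err**2 w.r.t. w
        w -= learning_rate * grad                 # step size controlled by learning_rate
```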
111 votes
5 answers
85k views

Backprop Through Max-Pooling Layers?

This is a small conceptual question that's been nagging me for a while: How can we back-propagate through a max-pooling layer in a neural network? I came across max-pooling layers while going through ...
asked by shinvu (1,240)
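A NumPy sketch of the standard answer (stated here as background, not taken from the thread): the gradient of a max-pooling window is routed entirely to the element that was the maximum in the forward pass and is zero everywhere else.

```python
import numpy as np

window = np.array([[1.0, 3.0],
                   [2.0, 0.5]])        # one 2x2 pooling window
out = window.max()                     # forward pass: 3.0

grad_out = 1.0                         # gradient arriving from the next layer
mask = (window == out).astype(float)   # 1 only at the argmax position
grad_window = mask * grad_out          # [[0, 1], [0, 0]] -- all other inputs get zero gradient
```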
90 votes
1 answer
89k views

When to use (He or Glorot) normal initialization over uniform init? And what are its effects with Batch Normalization?

I know that the Residual Network (ResNet) made He normal initialization popular. In ResNet, He normal initialization is used, while the first layer uses He uniform initialization. I've looked through ...
asked by Rizky Luthfianto
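For reference, a NumPy sketch of the two He variants (the Glorot versions just replace fan_in with the average of fan_in and fan_out): the normal form draws from N(0, 2/fan_in), the uniform form from U(-sqrt(6/fan_in), sqrt(6/fan_in)), and both end up with the same variance.

```python
import numpy as np

rng = np.random.default_rng(0)
fan_in, fan_out = 256, 128

std = np.sqrt(2.0 / fan_in)                                 # He normal
W_he_normal = rng.normal(0.0, std, size=(fan_in, fan_out))

limit = np.sqrt(6.0 / fan_in)                               # He uniform
W_he_uniform = rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Both have variance 2 / fan_in: the uniform limit satisfies limit**2 / 3 == 2 / fan_in.
```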
87 votes
4 answers
53k views

How are 1x1 convolutions the same as a fully connected layer?

I recently read Yann LeCun's comment on 1x1 convolutions: In Convolutional Nets, there is no such thing as "fully-connected layers". There are only convolution layers with 1x1 convolution ...
asked by Martin Thoma
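A sketch (assuming PyTorch) of the equivalence LeCun is pointing at: a 1x1 convolution applied to a 1x1 feature map computes exactly what a fully connected layer with the same weights computes.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(64, 10, kernel_size=1)      # 1x1 convolution
fc = nn.Linear(64, 10)

with torch.no_grad():                         # copy the conv weights into the FC layer
    fc.weight.copy_(conv.weight.view(10, 64))
    fc.bias.copy_(conv.bias)

x = torch.randn(1, 64, 1, 1)                  # a 1x1 spatial feature map
same = torch.allclose(conv(x).view(1, 10), fc(x.view(1, 64)), atol=1e-6)
print(same)                                   # True
```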
83 votes
5 answers
50k views

What is the difference between "equivariant to translation" and "invariant to translation"

I'm having trouble understanding the difference between equivariant to translation and invariant to translation. In the book Deep Learning (I. Goodfellow, A. Courville, and Y. Bengio, MIT Press, 2016) ...
asked by Aamir (993)
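A NumPy sketch of the distinction (a constructed example, not from the book): convolution is equivariant, since shifting the input shifts the output by the same amount, while a global max-pool on top is invariant, since shifting the input leaves the pooled value unchanged.

```python
import numpy as np

x = np.zeros(10)
x[2] = 1.0                                    # impulse at position 2
k = np.array([1.0, 2.0, 1.0])                 # small convolution kernel

y = np.convolve(x, k, mode="same")
y_shifted = np.convolve(np.roll(x, 3), k, mode="same")

print(np.allclose(np.roll(y, 3), y_shifted))  # True: the output shifts with the input (equivariance)
print(y.max() == y_shifted.max())             # True: a global max-pool ignores the shift (invariance)
```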
78 votes
6 answers
165k views

What is the difference between Gradient Descent and Stochastic Gradient Descent?

What is the difference between Gradient Descent and Stochastic Gradient Descent? I am not very familiar with these; can you describe the difference with a short example?
asked by Developer (1,099)
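A minimal NumPy sketch of the difference on a toy linear-regression problem (illustrative, not from the question): batch gradient descent computes the gradient over the whole dataset per update, while stochastic gradient descent updates after each randomly chosen example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5])
lr = 0.1

# Batch gradient descent: one update per step, averaging over all 100 examples.
w_gd = np.zeros(3)
for _ in range(100):
    grad = X.T @ (X @ w_gd - y) / len(X)
    w_gd -= lr * grad

# Stochastic gradient descent: one update per (randomly drawn) example.
w_sgd = np.zeros(3)
for _ in range(100):
    i = rng.integers(len(X))
    grad = (X[i] @ w_sgd - y[i]) * X[i]
    w_sgd -= lr * grad
```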
76 votes
6 answers
152k views

Cross-entropy loss explanation

Suppose I build a neural network for classification. The last layer is a dense layer with Softmax activation. I have five different classes to classify. Suppose for a single training example, the ...
asked by enterML (3,051)
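A NumPy sketch of the computation for a single example with five classes (matching the setup in the question, with made-up numbers): softmax turns the logits into probabilities, and the cross-entropy loss is the negative log-probability assigned to the true class.

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.1, -1.0, 0.5])   # outputs of the last dense layer
probs = np.exp(logits - logits.max())
probs /= probs.sum()                             # softmax, numerically stabilised

true_class = 0                                   # one-hot target [1, 0, 0, 0, 0]
loss = -np.log(probs[true_class])                # cross-entropy for this example
```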
70 votes
5 answers
52k views

Adding Features To Time Series Model LSTM

I have been reading up a bit on LSTMs and their use for time series, and it's been interesting but difficult at the same time. One thing I have had difficulties understanding is the approach to ...
asked by Rjay155 (1,225)
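A shape-only sketch (assuming Keras for illustration) of the usual approach: extra features go on the last axis, so the input is (samples, timesteps, n_features) and only the LSTM's input dimension grows.

```python
import numpy as np
import tensorflow as tf

n_samples, timesteps, n_features = 32, 10, 4       # e.g. a price series plus 3 extra features
X = np.random.rand(n_samples, timesteps, n_features).astype("float32")
y = np.random.rand(n_samples, 1).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(timesteps, n_features)),  # features stacked on the last axis
    tf.keras.layers.LSTM(16),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, verbose=0)
```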
69 votes
11 answers
103k views

Why should the data be shuffled for machine learning tasks?

In machine learning tasks it is common to shuffle the data and normalize it. The purpose of normalization is clear (to put features on the same range). But, after struggling a lot, I did not find ...
asked by Green Falcon (14.1k)
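Mechanically (a sketch of the "how" only, not the "why" the question asks about), shuffling means applying one random permutation to features and labels together before forming minibatches, so the batches are not biased by the original ordering.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.arange(20).reshape(10, 2)       # toy features
y = np.arange(10)                      # toy labels, in the original (ordered) sequence

perm = rng.permutation(len(X))         # one permutation shared by both arrays
X_shuffled, y_shuffled = X[perm], y[perm]   # rows stay paired with their labels
```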
