Questions tagged [neural-network]
Artificial neural networks (ANNs) are composed of 'neurons': programming constructs that mimic the properties of biological neurons. A set of weighted connections between the neurons allows information to propagate through the network to solve artificial intelligence problems without the network designer having had a model of a real system.
4,383
questions
0
votes
1
answer
28
views
Hacky backprop outperforms clean backprop - why?
I implemented a basic NN for MNIST in NumPy and started with a hacky implementation of backprop (just randomly multiplying gradients together), but somehow that one works better than my cleaned-up ...
-1
votes
1
answer
33
views
How can I select subsets of features using a neural network?
This listing selects the best features from the 1000 available columns in a given dataset.
The first three columns are dropped because they are useless data.
The dataset is huge. So, they were read in ...
0
votes
0
answers
45
views
How does the weight vector behave when we initialize the weights to 0 in a perceptron?
While reading a book, I encountered this statement:
Now, the reason we don't initialize the weights to zero is that the learning rate (eta) only has an effect on the classification outcome if the ...
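The quoted statement can be checked numerically. Below is a minimal sketch, assuming the classic perceptron update rule with labels in {-1, +1} and a threshold at zero (the excerpt does not say which formulation the book uses); the helper names `train_perceptron` and `predict` and the toy data are hypothetical. With zero initial weights, every update is a multiple of eta, so changing eta only rescales the weight vector and leaves the predictions unchanged.

```python
import numpy as np

def train_perceptron(X, y, eta, w_init, epochs=10):
    """Classic perceptron rule; w[0] is the bias, labels are -1 or +1."""
    w = np.array(w_init, dtype=float)
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if w[0] + xi @ w[1:] >= 0.0 else -1
            update = eta * (target - pred)  # zero when the sample is classified correctly
            w[0] += update
            w[1:] += update * xi
    return w

def predict(X, w):
    """Threshold the net input at zero."""
    return np.where(w[0] + X @ w[1:] >= 0.0, 1, -1)

# Hypothetical toy data: two features, labels from a linear rule.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.where(X[:, 0] + X[:, 1] >= 0.0, 1, -1)

zeros = np.zeros(3)
w_small = train_perceptron(X, y, eta=0.5, w_init=zeros)
w_large = train_perceptron(X, y, eta=2.0, w_init=zeros)

# Same direction, different length: eta acted only as a scale factor.
print("weights proportional:", np.allclose(w_large, 4 * w_small))
print("same predictions:", np.array_equal(predict(X, w_small), predict(X, w_large)))
```

With a nonzero initial weight vector this proportionality argument no longer applies, because the initial weights are not scaled by eta.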
1
vote
1
answer
24
views
Everything is classified as background by segmentation model
I am training a U-Net model for medical image segmentation. The problem is that the binary masks that I'm using to train the model consist mostly of background pixels and a very small region of the whole ...
0
votes
0
answers
18
views
Is it common for an LM (hundreds of millions of parameters) to beat an LLM (billions of parameters) on a binary classification task?
Preface
I am trying to fine-tune transformer-based models (an LM and an LLM). The LM that I used is DeBERTa, and the LLM is LLaMA 3. The task is to classify whether a text contains condescending language ...
0
votes
0
answers
14
views
How to increase the optimal cutoff point (Youden index) after training a model?
So I trained a model on a medical dataset and got an AUROC of about 0.96 for detecting cancer in brain images. I noticed that the Youden index is 0.1, but I want to increase it to 0.5, ...
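For context, here is a minimal sketch of how a Youden-based cutoff is usually derived from model scores, assuming binary labels and higher scores meaning "positive". The function name `youden_cutoff` and the toy scores are hypothetical and not taken from the question. Youden's J is sensitivity + specificity - 1, and the "optimal" cutoff is the threshold that maximizes J, so its value depends on how the model's scores are distributed.

```python
import numpy as np

def youden_cutoff(y_true, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1."""
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    best_j, best_t = -1.0, None
    for t in np.unique(scores):          # candidate thresholds, ascending
        pred = scores >= t               # predict positive at or above t
        sensitivity = np.mean(pred[y_true])
        specificity = np.mean(~pred[~y_true])
        j = sensitivity + specificity - 1.0
        if j > best_j:                   # keep the first threshold reaching the maximum
            best_j, best_t = j, t
    return best_t, best_j

# Hypothetical scores: positives are separable but sit at low values.
labels = [0, 0, 0, 1, 1, 1]
scores = [0.02, 0.05, 0.08, 0.12, 0.30, 0.90]
threshold, j = youden_cutoff(labels, scores)
print("cutoff:", threshold, "J:", j)
```

In this toy case the selected cutoff is low simply because the positive scores start low; the ranking quality (and hence the AUROC) is unaffected by where the cutoff lands.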
-1
votes
1
answer
8
views
WGAN generating images from the training data
Is it possible for a GAN to somehow remember the training data distribution?
Or maybe something leaks out when I calculate gradients?
...
0
votes
0
answers
25
views
Is it legit to normalize time series with respect to the x-axis?
I have a data set consisting of multivariate time series, e.g. a batch of my data has the shape (batch_size, timesteps, number_input_features) and I want to train a neural network on it to predict ...
1
vote
1
answer
38
views
How does seeing training batches only once influence the generalization of a neural network?
I am referring to this question/scenario, "Train neural network with unlimited training data", but unfortunately I cannot comment.
As I am not seeing any training batch multiple times I would guess that ...
0
votes
0
answers
9
views
How to handle sequences with CrossEntropyLoss
First of all, I am new to the whole thing, so sorry if this is super dumb.
I'm currently training a Transformer model for a sequence classification task using CrossEntropyLoss. My input tensor has the ...
0
votes
0
answers
22
views
What is the most accurate way of computing the evaluation time of a neural network model?
I am training some neural networks in PyTorch to use as an embedded surrogate model. Since I am testing various architectures, I want to compare the accuracy of each one, but I am also interested in ...
0
votes
0
answers
9
views
MobileNet vs ResNet
Q1: Why don't we remove the ReLU after the addition of the skip connection in ResNet-50, like we do in MobileNetV2, for better performance?
Q2: And why don't we have a convolution layer in the skip connection for dimension ...
2
votes
1
answer
22
views
Benchmark Neural Networks on High-Dimensional Functions
For a personal project, I am interested in benchmarking certain neural network architectures in the context of high-dimensional function approximation. Specifically, I am interested in continuous, ...
0
votes
1
answer
19
views
What is the "fast version" of ZFNet referenced in SPPNet and Faster R-CNN papers?
I'm reading old papers:
SPPNet: Link
Faster R-CNN: Link
In both cases, the authors refer to a "fast version of Zeiler and Fergus (ZF) Net"; specifically:
In SPPNet:
ZF-5: this ...
1
vote
0
answers
46
views
Why can't I replicate the results from this paper?
I'm trying to train a model to evaluate chess positions, following the methodology from this paper (note that the author presents several different architectures, but I'm only looking at the ANN with ...
1
vote
1
answer
55
views
Weird neural network approach
I'm working on a problem where I need to create a neural network to optimize the seating arrangement for 24 unique individuals in a 6x4 grid, minimizing conflicts between adjacent (up, down, left, right) ...
2
votes
0
answers
13
views
What's the best way to incorporate momentum and regularization when training a neural network?
I want to implement the momentum algorithm to train a neural network, but I'm uncertain about where the regularization term should be incorporated. For ridge regularization, one option is to have:
$$
...
1
vote
0
answers
9
views
Residual Network Skip Connection Clarification
In ResNets, do skip connections get utilised at every step? If not, what causes a layer to be skipped vs not skipped?
Thank you,
1
vote
1
answer
34
views
Predicted output is only 0s
I am developing a neural network using the Home Credit Default Risk dataset.
The prediction should be between 0.0 and 1.0 but my algorithm's outcome is just 0.0 for every row.
My Code
...
0
votes
0
answers
14
views
Semantics Building in LSTM-Based Models - How is an LSTM able to extract and represent long data using just one value (long-memory)?
How is an LSTM able to extract and represent long sequences of data while using just one value (long-memory / LM) to maintain all this information?
If multiple values were used, it could be ...
0
votes
0
answers
14
views
Impact of Adding Imbalanced Data on Model Performance for Different Groups
Suppose I initially have a dataset with 50 samples of type A and 50 samples of type B, each with several features. I built a neural network model using this data and recorded the prediction accuracy ...
3
votes
1
answer
233
views
What ML model for regression given tabular AND image data?
I'd like to predict the power production of a wind farm given the wind speed, its direction and other variables related to the specific wind turbines. However, due to wake effects (wind speed decreases ...
1
vote
0
answers
38
views
Class imbalance for binary classification tasks
I am looking to train a binary classifier. Most of my experience so far has been with generative models, not classifiers, so I am wondering with respect to training data, what is a good ratio of 0 and ...
0
votes
1
answer
27
views
How to update first layer weights?
I’m trying to make a neural network, without using any deep learning library, that recognizes numbers in the MNIST database. Its structure is: 784 input neurons (for the 784 pixels in the number images), ...
3
votes
1
answer
45
views
Is it legal to use a model found on GitHub for a personal project and upload the personal project to GitHub? [closed]
I found a great model I would like to use and make improvements upon for a personal project. It doesn't contain any licenses, nor does it mention anything about restrictions of use.
Are AI models like ...
3
votes
1
answer
29
views
Outputting handwritten digits with a Neural Network
I know that you can use a neural network to recognize handwritten digits. How would you then use that same neural network to output handwritten digits in the unique style of that network? In other ...
0
votes
0
answers
23
views
Theoretical Limitations of Achieving 100% Accuracy in Modeling Non-linear Relationships with Neural Networks
I am working on a project where I need to model a specific non-linear relationship using a neural network. The relationship is given by $y = 3x_1^2 x_2^3$. The approach involves:
Preprocessing the ...
6
votes
1
answer
180
views
Changing output size from a model
So I am currently training some deep learning models for some basic classification problems, and I am trying to figure out if it is possible to change the output size of the model in case I want to ...
0
votes
1
answer
30
views
How to explain missing dates to a model?
I have this dataset that I'm trying to train a neural network on.
The problem is that since weekend dates are not available, I am not confident in whether the model is able to account for that. ...
1
vote
1
answer
62
views
Improving GPU Utilization in LLM Inference System
I'm trying to build a distributed LLM inference platform with Hugging Face support. The implementation involves utilizing Python for model processing and Java for interfacing with external systems. ...