Newest 'neural-network' Questions - Data Science Stack Exchange

0 votes

1 answer

28 views

Hacky backprop outperforms clean backprop - why?

I implemented a basic NN for MNIST in NumPy and started with a hacky implementation of backprop (just randomly multiplying gradients together), but somehow that one works better than my cleaned up ...

Christoph Hörtnagl

1

asked Jul 14 at 15:34

-1 votes

1 answer

33 views

How can I select subsets of features using neural network?

This listing selects the best features from the 1000 available columns in a given dataset. The first three columns are dropped because they are useless data. The dataset is huge. So, they were read in ...

user366312

99

asked Jul 8 at 4:20

0 votes

0 answers

45 views

How does the weight vector behave when we initialize the weights to 0 in a perceptron?

While reading a book, I encountered this statement: Now, the reason we don't initialize the weights to zero is that the learning rate (eta) only has an effect on the classification outcome if the ...

Vipin Dubey

101

asked Jul 7 at 1:56

1 vote

1 answer

24 views

Everything is classified as background by segmentation model

I am training a U-Net model for medical image segmentation. The problem is that the binary masks that I'm using to train the model mostly consist of background pixels and a very small region of the whole ...

Ashwin Singh

61

asked Jul 6 at 15:35

0 votes

0 answers

18 views

Is it common for an LM (hundreds of millions of parameters) to beat an LLM (billions of parameters) on a binary classification task?

Preface: I am trying to fine-tune transformer-based models (an LM and an LLM). The LM that I used is DeBERTa, and the LLM is LLaMA 3. The task is to classify whether a text contains condescending language ...

sempraEdic

1

asked Jul 1 at 1:16

0 votes

0 answers

14 views

How to increase the optimal cutoff point (Youden index) after training a model?

So I trained a model based on a medical dataset and I got an AUROC for detecting cancer in brain images of about 0.96, and I noticed that the Youden index is 0.1, but I want to increase it to 0.5, ...

mutli-arm-bandit

23

asked Jun 30 at 21:28

-1 votes

1 answer

8 views

WGAN generating images from the training data

Is it possible for a GAN to somehow remember the training data distribution? Or maybe something leaks out when I calculate gradients? ...

Тима

43

asked Jun 27 at 15:40

0 votes

0 answers

25 views

Is it legit to normalize time series with respect to the x-axis?

I have a data set consisting of multivariate time series, e.g. a batch of my data has the shape (batch_size, timesteps, number_input_features) and I want to train a neural network on it to predict ...

ZenDen

13

asked Jun 27 at 9:10

1 vote

1 answer

38 views

How does seeing training batches only once influence the generalization of a neural network?

I am referring to this question/scenario: "Train neural network with unlimited training data", but unfortunately I cannot comment. As I am not seeing any training batch multiple times I would guess that ...

ZenDen

13

asked Jun 26 at 15:07

0 votes

0 answers

9 views

How to handle sequences with CrossEntropyLoss

First of all, I am new to the whole thing, so sorry if this is super dumb. I'm currently training a Transformer model for a sequence classification task using CrossEntropyLoss. My input tensor has the ...

Tobias

101

asked Jun 21 at 15:12

0 votes

0 answers

22 views

What is the most accurate way of computing the evaluation time of a neural network model?

I am training some neural networks in pytorch to use as an embedded surrogate model. Since I am testing various architectures, I want to compare the accuracy of each one, but I am also interested in ...

HWIK

1

asked Jun 20 at 7:47

0 votes

0 answers

9 views

MobileNet vs ResNet

Q1: Why don't we remove the ReLU after the addition of the skip connection in ResNet50, like we do in MobileNetV2, for better performance? Q2: And why don't we have a convolution layer in the skip connection for dimension ...

Tarun Saxena

1

asked Jun 19 at 21:09

2 votes

1 answer

22 views

Benchmark Neural Networks on High-Dimensional Functions

For a personal project, I am interested in benchmarking certain neural network architectures in the context of high-dimensional function approximation. Specifically, I am interested in continuous, ...

user82261

121

asked Jun 19 at 13:48

0 votes

1 answer

19 views

What is the "fast version" of ZFNet referenced in SPPNet and Faster R-CNN papers?

I'm reading old papers: SPPNet (link) and Faster R-CNN (link). In both cases, the authors refer to a "fast version of Zeiler and Fergus (ZF) Net"; specifically: In SPPNet: ZF-5: this ...

Papemax89

1

asked Jun 14 at 17:18

1 vote

0 answers

46 views

Why can't I replicate the results from this paper?

I'm trying to train a model to evaluate chess positions, following the methodology from this paper (note that the author presents several different architectures, but I'm only looking at the ANN with ...

William Markley

11

asked Jun 13 at 15:51

1 vote

1 answer

55 views

Weird neural network approach

I'm working on a problem where I need to create a neural network to optimize the seating arrangement for 24 unique individuals in a 6x4 grid, minimizing conflicts between adjacent (up,down,left,right) ...

Moein

101

asked Jun 9 at 17:16

2 votes

0 answers

13 views

What's the best way to incorporate momentum and regularization when training a neural network?

I want to implement the momentum algorithm to train a neural network, but I'm uncertain about where the regularization term should be incorporated. For ridge regularization, one option is to have: $$ ...

lucaspedroso

21

asked Jun 6 at 12:19

1 vote

0 answers

9 views

Residual Network Skip Connection Clarification

In ResNets do skip connections get utilised at every step? If not what causes a layer to be skipped vs not skipped? Thank you,

joe_credit

11

asked Jun 6 at 11:30

1 vote

1 answer

34 views

Predicted output is only 0s

I am developing a neural network using the Home Credit Default Risk dataset. The prediction should be between 0.0 and 1.0 but my algorithm's outcome is just 0.0 for every row. My Code ...

Erevos

13

asked Jun 6 at 8:07

0 votes

0 answers

14 views

Semantics Building in LSTM-Based Models - How is an LSTM able to extract and represent long data using just one value (long-memory)?

How is an LSTM able to extract and represent long sequences of data while using just one value (long-memory / LM) to maintain all this information? If multiple values were used, it could be ...

Linces games

1

asked Jun 5 at 2:29

0 votes

0 answers

14 views

Impact of Adding Imbalanced Data on Model Performance for Different Groups

Suppose I initially have a dataset with 50 samples of type A and 50 samples of type B, each with several features. I built a neural network model using this data and recorded the prediction accuracy ...

Mickly

1

asked Jun 4 at 4:19

3 votes

1 answer

233 views

What ML model for regression given tabular AND image data?

I'd like to predict the power production of a windfarm given the wind speed, its direction and other variables related to the specific wind turbines. However, due to wake effects (wind speed decreases ...

deque

133

asked Jun 3 at 9:45

1 vote

0 answers

38 views

Class imbalance for binary classification tasks

I am looking to train a binary classifier. Most of my experience so far has been with generative models, not classifiers, so I am wondering with respect to training data, what is a good ratio of 0 and ...

Wigeon

11

asked May 31 at 14:38

0 votes

1 answer

27 views

How to update first layer weights?

I’m trying to make a neural network, without using any deep learning library, that recognizes numbers in the MNIST database. Its structure is: 784 input neurons (for the 784 pixels in the number images),...

Allo Bonjour

1

asked May 28 at 1:58

3 votes

1 answer

45 views

Is it legal to use a model found on github for a personal project and uploading the personal project onto github? [closed]

I found a great model I would like to use and make improvements upon for a personal project. It doesn't contain any licenses nor does it mention anything about restrictions of use. Are AI models like ...

MrIzzat

31

asked May 26 at 11:09

3 votes

1 answer

29 views

Outputting handwritten digits with a Neural Network

I know that you can use a neural network to recognize handwritten digits. How would you then use that same neural network to output handwritten digits in the unique style of that network? In other ...

Uriah Sanders

33

asked May 24 at 23:14

0 votes

0 answers

23 views

Theoretical Limitations of Achieving 100% Accuracy in Modeling Non-linear Relationships with Neural Networks

I am working on a project where I need to model a specific non-linear relationship using a neural network. The relationship is given by $y = 3x_1^2 x_2^3$. The approach involves: Preprocessing the ...

Mo McWebmo

1

asked May 21 at 20:48

6 votes

1 answer

180 views

Changing output size from a model

So I am currently training some deep learning models for some basic classification problems, and I am trying to figure out if it is possible to change the output size of the model in case I want to ...

pdaranda661

163

asked May 17 at 12:29

0 votes

1 answer

30 views

How to explain missing dates to a model?

I have this dataset that I'm trying to train a neural network on. The problem is that since weekend dates are not available, I am not confident in whether the model is able to account for that. ...

Akshat Vats

101

asked May 16 at 10:08

1 vote

1 answer

62 views

Improving GPU Utilization in LLM Inference System

I'm trying to build a distributed LLM inference platform with Huggingface support. The implementation involves utilizing Python for model processing and Java for interfacing with external systems. ...

Cardstdani

111

asked May 14 at 16:17
