Newest 'neural-network+computer-vision' Questions

0 votes

1 answer

19 views

What is the "fast version" of ZFNet referenced in SPPNet and Faster R-CNN papers?

I'm reading old papers: SPPNet (Link) and Faster R-CNN (Link). In both cases, the authors refer to a "fast version of Zeiler and Fergus (ZF) Net"; specifically: In SPPNet: ZF-5: this ...

Papemax89

1

asked Jun 14 at 17:18

0 votes

0 answers

22 views

Losing Information while resizing the image in Segmentation task using U-net

I'm using the U-Net architecture for an image segmentation task. During training my images are of size 256*256. It works very well on the segmentation of images of the same size, 256*256, or near to size 256*...

Akshit Dhillon

1

asked Apr 11 at 20:37

0 votes

1 answer

26 views

How do I ensure final output shape matches input shape for a semantic segmentation task?

I am trying to replicate the semantic segmentation example https://keras.io/examples/vision/oxford_pets_image_segmentation/ but train on my own data. I have 8 labels (7 features + background). My images ...

utx7563yu

3

asked Mar 30 at 21:43

1 vote

1 answer

300 views

Is vision transformer (ViT) always better than CNN?

The paper "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" proposed the vision transformer, which outperformed CNN-based models in many cases. When it comes to sequential data, we ...

Chuck Liu

13

asked Feb 5 at 8:46

0 votes

0 answers

23 views

Loss MAE when estimating the angle of rotation of an object in an image is stuck at about 90

I am dealing with the problem of estimating the angle of rotation of objects in images. The problem is that the network gets stuck when training at a loss level of about 90. Below is the code for my ...

DamianSz

1

asked Nov 3, 2023 at 13:58

0 votes

1 answer

56 views

Classification: ClassA vs. "everything else"

I am trying to create a neural network for recognizing a particular object. Maybe I am approaching this task from the wrong side, but, in my mind, this task boils down to teaching the network to do a ...

Dmytro Titov

121

asked Aug 21, 2022 at 20:01

1 vote

0 answers

24 views

Suggestions for labeling regression data to improve model accuracy

I'm working on a convolutional neural network that should predict up to 3 (x,y) coordinate pairs representing the waypoints of a concrete path, given an input image. This network will be used to help ...

pmitch

11

asked Mar 23, 2022 at 19:11

0 votes

1 answer

74 views

How to handle the case of multiple ground truth boxes having high IOU with the same predicted box?

In the single shot detector (SSD), the matching strategy between ground truth and predicted boxes starts with the following step: for each ground truth box, we select from default boxes that vary over ...

Yandle

231

asked Jan 5, 2022 at 8:13
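
The IoU mentioned in the question title above can be computed as in the following minimal sketch. It assumes axis-aligned boxes given as `(x_min, y_min, x_max, y_max)` tuples; the function name `iou` and the box format are illustrative choices, not taken from the question or from any particular detector implementation.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this format is an assumption.
    """
    # Corners of the overlapping rectangle (if any).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes give an empty intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0


overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))   # partial overlap
same = iou((0, 0, 2, 2), (0, 0, 2, 2))      # identical boxes
apart = iou((0, 0, 1, 1), (2, 2, 3, 3))     # disjoint boxes
```

Two ground truth boxes can each score a high IoU against the same predicted box whenever they overlap each other heavily, which is the situation the question asks about.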

0 votes

1 answer

1k views

What is `Multi-scale` in Multiscale Convolutional Network?

I was reading an article on Deep Learning and came across the term Multi-scale Neural Network. I fully understand the concept of convolutional neural networks, but it is a bit difficult to ...

Aashish Chaubey

103

asked Nov 5, 2021 at 10:35

1 vote

0 answers

22 views

Attention mechanism: Why apply multiple different transformations to obtain query, key, value

I have two questions about the structure of attention modules: Since I work with imagery I will be talking about using convolutions on feature maps in order to obtain attention maps. If we have a set ...

Steve Ahlswede

181

asked Aug 2, 2021 at 17:25

3 votes

2 answers

1k views

Less parameters - in general within ResNets

My question is about the parameters of the ResNet. Why does the network tend to have fewer parameters than the VGG? This would be the case if I got the paper and the summary from Yannic Kilcher ...

bohniti

31

asked Jun 8, 2021 at 12:26

0 votes

1 answer

190 views

Training the network with some batch size - code

Below is my "training" code, which I wrote based on a YouTube tutorial. There is one part I don't actually understand: batch_X = train_X[i:i+BATCH_SIZE], batch_y = train_y[i:i+BATCH_SIZE]. How ...

Adolf Miszka

149

asked May 12, 2021 at 13:26
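
The slicing pattern quoted in the question above can be illustrated with a small sketch. The names `train_X`, `train_y`, `BATCH_SIZE`, `batch_X` and `batch_y` come from the excerpt; the helper `make_batches`, the toy data and the batch size of 4 are assumptions made only for illustration, since the original training code is not shown.

```python
def make_batches(train_X, train_y, batch_size):
    """Split two parallel sequences into consecutive mini-batches."""
    batches = []
    # i takes the values 0, batch_size, 2*batch_size, ... below len(train_X).
    for i in range(0, len(train_X), batch_size):
        # Slices stop at the end of the sequence, so the last batch may be shorter.
        batch_X = train_X[i:i + batch_size]
        batch_y = train_y[i:i + batch_size]
        batches.append((batch_X, batch_y))
    return batches


BATCH_SIZE = 4                        # hypothetical value
train_X = list(range(10))             # toy inputs: 0..9
train_y = [x * x for x in train_X]    # toy targets

batches = make_batches(train_X, train_y, BATCH_SIZE)
for batch_X, batch_y in batches:
    print(len(batch_X), batch_X, batch_y)
```

Because `train_X` and `train_y` are sliced with the same indices, each input stays paired with its target, and the final batch simply holds whatever samples remain.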

1 vote

1 answer

346 views

Feature extraction from sequence of images with Siamese Neural Network

I am trying to train a neural network to recognize certain actions in short movies. Each such movie consists of a fixed number of frames, and each frame (an image) is of course the same size, after ...

JohnyBe

113

asked Feb 26, 2021 at 15:05

1 vote

0 answers

23 views

Machine learning model (neural network or SVM) for unequal feature matrices size

I have feature matrices obtained from a visual bag-of-words model for various dictionary sizes, for example Nx5, Nx10, …, Nx15000, where N is the number of samples and 5, 10, …, 15000 are the visual ...

PManjunatha

11

asked Feb 11, 2021 at 11:14

1 vote

1 answer

337 views

Why are axes-aligned bounding boxes used in object detection

I understand (I think) why in object detection, the result is a rectangle: it is a simple shape that can be defined by 4 variables (2 pairs of coordinates of opposite corners, or 1 pair of coordinates + width and ...

Jan Pisl

195

asked Feb 9, 2021 at 19:11
