Questions tagged [computer-vision]

Questions related to image representation, segmentation, visual object categorization and image processing algorithms in general.

0 votes
0 answers
9 views

Training an image classifier to detect slight differences in line angles

I've been struggling to train a YOLOv8 classifier to detect slight differences in line angles. My hunch is that the differences in line angles are so slight that the classifier is ...
Isaac Padberg
0 votes
0 answers
8 views

Efficient pooling to extract global embedding from local features (for LiDAR point clouds)

Problem: I have 3D point cloud data (autonomous driving setting) and a point cloud encoder (MinkUNet) that extracts local features from it. What are suitable pooling techniques to map those local (...
Hölderlin
1 vote
1 answer
19 views

NeRF vs mesh for text-to-3d generation

There seem to be multiple approaches to generating 3D objects from a text prompt. What's confusing is that some of them are generating NeRFs (https://arxiv.org/pdf/2308.16512), others are generating ...
zlenyk
  • 196
0 votes
0 answers
15 views

Differentiable voting loss

I have a problem that requires me to assign a class to each object in a scene (say 0 or 1) represented by an image. I am posing this as a segmentation problem (since there are many objects in a scene ...
Abhijat Biswas
0 votes
0 answers
14 views

In which cases do validation loss and validation F1 score both increase?

I'm training an image segmentation model using BCE loss. My validation loss is increasing, but at the same time my validation F1 score is also increasing. In which scenarios is this possible?
Thunder
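
One scenario where this can happen, as a rough sketch with made-up numbers (the labels and probabilities below are illustrative assumptions, not data from the question): BCE penalizes how confident each prediction is, while F1 only looks at which side of the threshold each prediction falls on. If more pixels cross the threshold correctly while a few pixels become confidently wrong, both numbers can rise together.

```python
import math

def bce(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy over per-pixel probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def f1(y_true, y_prob, threshold=0.5):
    """F1 score after thresholding the probabilities."""
    pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, q in zip(y_true, pred) if y == 1 and q == 1)
    fp = sum(1 for y, q in zip(y_true, pred) if y == 0 and q == 1)
    fn = sum(1 for y, q in zip(y_true, pred) if y == 1 and q == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical per-pixel labels and predictions at two checkpoints.
y_true = [1, 1, 1, 0, 0, 0]
early = [0.6, 0.45, 0.45, 0.4, 0.4, 0.4]   # hesitant, mostly under threshold
late = [0.9, 0.9, 0.9, 0.1, 0.1, 0.99]     # confident, one confident mistake

print("early: loss=%.3f f1=%.3f" % (bce(y_true, early), f1(y_true, early)))
print("late:  loss=%.3f f1=%.3f" % (bce(y_true, late), f1(y_true, late)))
```

In this toy case the single confidently wrong pixel dominates the mean loss, so the loss rises even though F1 goes from 0.5 to about 0.86. Whether this is what is happening in a real run would need to be checked against the actual prediction distribution.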
0 votes
0 answers
18 views

Can 3D convolutions appropriately capture a frozen embedding space?

My project is a strange combination of NLP and Computer Vision. My data points are 3D tensors in which each element is a token in an NLP vocabulary. The vocabulary is around 1000 unique "words"...
schmixi
  • 43
1 vote
0 answers
33 views

Is it reasonable to use background subtraction to identify some objects in sequential frame images to start labeling objects for YOLO training?

I know background subtraction is not a complete solution for object detection, but I’ve tried it for identifying potential new objects appearing in fixed background camera scenarios (millions of ...
NominalSystems
0 votes
0 answers
7 views

How to apply CAMs on small images

I'm implementing the Grad-CAM algorithm on several architectures, mainly ResNets. The main issue is that the processed data becomes very small in the last block; specifically, the last feature map is 1x1. ...
PiEmmeC
1 vote
0 answers
13 views

Impact of Pixel Normalization Technique on Weights, Gradients, and Activations in Neural Network

There are different ways to process an image either before or during the training of a neural network that takes image inputs. Some of the pixel adjustment techniques used: Scaling each pixel ...
Kinshu
  • 11
0 votes
0 answers
27 views

Fluctuating Validation Loss & Accuracy during Transfer Learning (ResNet50) - FER+ Dataset

I'm trying to build a CNN model for image classification, more specifically emotion classification using the FER+ dataset, which is proving difficult to work with. I've tried several variations of ...
RoliPoliOli
0 votes
0 answers
35 views

How to Plot and Interpret the ROC Curve for Segmentation-based Object Detection Models?

I'm trying to plot the ROC curve for a number of target/object detection models and compare their performance. The pre-trained models in question take an input image and output a mask image where ...
Tungdil
0 votes
0 answers
6 views

Detecting Object Removal in Images

The problem statement is as follows: given an altered image (an image from which some object has been removed), generate a mask for the removed object. For instance, say an original image contains ...
Aditya Kulkarni
0 votes
0 answers
28 views

Depth estimation used in SSL from ego-motion

I have read a recent paper called Kick Back & Relax: Learning to Reconstruct the World by Watching SlowTV, which took inspiration from a famous older one, Unsupervised Learning of Depth and Ego-...
Iledran
0 votes
0 answers
21 views

Computer vision tool to match regions in two images at different pixel locations

I have two image files. One image has subplots of (100) stock price graphs with stock ticker labels. The other image has subplots of (112) stock price graphs shuffled to a different row and column in ...
Jose_Peeterson
0 votes
0 answers
14 views

Is global pooling necessary in image classification models?

In many image classification models, the global pooling operation is performed before the classification layer (i.e. fully connected layer) to reduce model complexity. Is the global pooling layer a ...
Liuji
  • 1
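
The complexity reduction mentioned in the excerpt above can be illustrated with a parameter count. This is a minimal sketch, assuming a final feature map of 2048 x 7 x 7 and 1000 classes (ResNet-50-like numbers chosen for illustration, not taken from the question); `fc_params` is a hypothetical helper name.

```python
def fc_params(in_features, num_classes):
    """Weights plus biases of a single fully connected layer."""
    return in_features * num_classes + num_classes

# Assumed shapes, for illustration only.
channels, height, width = 2048, 7, 7
num_classes = 1000

# Option A: flatten the whole feature map and feed it to the classifier.
flatten_params = fc_params(channels * height * width, num_classes)

# Option B: global average pooling reduces each channel to one value first.
gap_params = fc_params(channels, num_classes)

print("flatten + FC:", flatten_params)
print("global pool + FC:", gap_params)
print("ratio: %.1fx" % (flatten_params / gap_params))
```

Under these assumed shapes the flattened classifier needs about 100 million parameters versus about 2 million with global pooling. Global pooling also makes the classifier independent of the input resolution, though this sketch says nothing about accuracy.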
