Questions tagged [computer-vision]
Questions related to image representation, segmentation, visual object categorization and image processing algorithms in general.
482
questions
0
votes
0
answers
9
views
Training an image classifier to detect slight differences in line angles
I've been struggling to train a YOLOv8 classifier to detect slight differences in line angles.
My hunch is that the differences in line angles are so slight that the classifier is ...
0
votes
0
answers
8
views
Efficient pooling to extract global embedding from local features (for LiDAR point clouds)
Problem:
I have 3D point cloud data (autonomous driving setting) and a point cloud encoder (MinkUNet) that extracts local features from it. What are suitable pooling techniques to map those local (...
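As a minimal sketch of the simplest candidates, the snippet below applies mean and max pooling over per-point features and concatenates the two results. It assumes the encoder output can be read as a list of N feature vectors of equal length C. The function names (`mean_pool`, `max_pool`, `global_embedding`) are hypothetical and not part of MinkUNet or any library.

```python
def mean_pool(features):
    """Average each channel over all points: N x C -> C."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]

def max_pool(features):
    """Take the per-channel maximum over all points: N x C -> C."""
    return [max(column) for column in zip(*features)]

def global_embedding(features):
    """Concatenate mean- and max-pooled vectors: N x C -> 2C (an illustrative choice)."""
    return mean_pool(features) + max_pool(features)

# Toy local features for 3 points with 2 channels each (made-up values).
local_features = [[1.0, 4.0], [3.0, 2.0], [5.0, 0.0]]
print(global_embedding(local_features))
```

Both operations are invariant to the order of the points and produce a fixed-size vector regardless of N, which is why they are common baselines for point sets.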
1
vote
1
answer
19
views
NeRF vs mesh for text-to-3D generation
There seem to be multiple approaches to generating 3D objects from a text prompt. What's confusing is that some of them are generating NeRFs (https://arxiv.org/pdf/2308.16512), others are generating ...
0
votes
0
answers
15
views
Differentiable voting loss
I have a problem which needs me to assign a class to each object in a scene (say 0 or 1) represented by an image. I am posing this as a segmentation problem (since there are many objects in a scene ...
0
votes
0
answers
14
views
In which cases do validation loss and validation F1 score both increase?
I'm training an image segmentation model using BCE loss. My validation loss is increasing, but at the same time my validation F1 score is also increasing. In which scenarios is this possible?
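One scenario that fits this description: BCE is computed from the predicted probabilities, while F1 depends only on which side of the decision threshold each prediction falls. A single confidently wrong pixel can raise the mean loss even as more pixels are classified correctly. The sketch below illustrates this with made-up probabilities for eight pixels and an assumed threshold of 0.5. None of the numbers come from the question.

```python
import math

def bce(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over all pixels."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def f1(y_true, y_prob, threshold=0.5):
    """F1 score after thresholding the probabilities."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 1)
    fp = sum(1 for y, q in zip(y_true, y_pred) if y == 0 and q == 1)
    fn = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
# Early epoch: hesitant predictions, most positives just under the threshold.
early = [0.55, 0.45, 0.45, 0.45, 0.45, 0.45, 0.45, 0.45]
# Later epoch: confident predictions, but one positive is confidently wrong.
late = [0.90, 0.90, 0.90, 0.001, 0.10, 0.10, 0.10, 0.10]

print("early: loss", bce(y_true, early), "F1", f1(y_true, early))
print("late:  loss", bce(y_true, late), "F1", f1(y_true, late))
```

In this toy case the later epoch has both a higher mean loss and a higher F1, because the one overconfident mistake dominates the loss while the thresholded predictions improve.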
0
votes
0
answers
18
views
Can 3D convolutions appropriately capture a frozen embedding space?
My project is a strange combination of NLP and Computer Vision.
My datapoints are 3D tensors where each element is a token in an NLP vocabulary. The vocabulary is around 1000 unique "words"...
1
vote
0
answers
33
views
Is it reasonable to use background subtraction to identify some objects in sequential frame images to start labeling objects for YOLO training?
I know background subtraction is not a complete solution for object detection, but I’ve tried it for identifying potential new objects appearing in fixed background camera scenarios (millions of ...
0
votes
0
answers
7
views
How to apply CAMs on small images
I'm implementing the Grad-CAM algorithm on several architectures, mainly ResNets. The main issue is that the processed data becomes very small in the last block; specifically, the last feature map is 1x1.
...
1
vote
0
answers
13
views
Impact of Pixel Normalization Technique on Weights, Gradients, and Activations in Neural Network
There are different ways to process an image either before or during the training of a neural network that takes images as input.
Some of the pixel adjustment techniques used:
Scaling each pixel ...
0
votes
0
answers
27
views
Fluctuating Validation Loss & Accuracy during Transfer Learning (ResNet50) - FER+ Dataset
I'm trying to build a CNN model for image classification, more specifically emotion classification using the FER+ dataset, which is proving difficult to work with.
I've tried several variations of ...
0
votes
0
answers
35
views
How to Plot and Interpret the ROC Curve for Segmentation-based Object Detection Models?
I'm trying to plot the ROC Curve for a number of target/object detection models and compare their performance. The pre-trained models in question take an input image and they output a mask image where ...
0
votes
0
answers
6
views
Detecting Object Removal in Images
The problem statement is as follows - Given an altered image (an image from which some object has been removed), generate a mask for the removed object.
For instance, say an original image contains ...
0
votes
0
answers
28
views
Depth estimation used in SSL from ego-motion
I have read a recent paper called Kick Back & Relax: Learning to Reconstruct the World by Watching SlowTV, which took inspiration from a famous older one, Unsupervised Learning of Depth and Ego-...
0
votes
0
answers
21
views
Computer vision tool to match regions in two images at different pixel locations
I have two image files. One image has subplots of (100) stock price graphs with stock ticker labels. The other image has subplots of (112) stock price graphs shuffled to a different row and column in ...
0
votes
0
answers
14
views
Is global pooling necessary in image classification models?
In many image classification models, the global pooling operation is performed before the classification layer (i.e. fully connected layer) to reduce model complexity. Is the global pooling layer a ...
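To make the complexity claim concrete, the sketch below counts the parameters of the final fully connected layer with and without global average pooling. The feature map size (2048 channels at 7x7) and the 1000 classes are assumed values typical of ResNet-50 on ImageNet, not taken from the question. `fc_params` is a hypothetical helper.

```python
def fc_params(in_features, num_classes):
    """Weights plus biases of a single fully connected layer."""
    return in_features * num_classes + num_classes

# Assumed final feature map: C x H x W = 2048 x 7 x 7, with 1000 output classes.
channels, height, width, num_classes = 2048, 7, 7, 1000

# Flattening keeps every spatial position as a separate input feature.
flatten_params = fc_params(channels * height * width, num_classes)

# Global average pooling collapses H x W to 1 x 1, leaving one value per channel.
gap_params = fc_params(channels, num_classes)

print("flatten + FC:", flatten_params)
print("global pool + FC:", gap_params)
```

Under these assumptions the pooled head is about 49 times smaller, and its size no longer depends on the input resolution.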