$\begingroup$

Situation:

My dataset is 70k images of people wearing clothes. Each image is labelled with bounding-box positions and classes. There are 10 classes, and I used an 80:20 split. The classes are balanced except for one, and I can accept poor performance on that one.

The goal is clothing recognition in images. When I feed in an image of a person wearing pants and a t-shirt, I want to get two bounding boxes, one for each item.

My problems:

I have already trained a few models from the TensorFlow model zoo. I ran over 100k steps on SSD MobileNet v1 and Faster R-CNN ResNet-101.

The problem with SSD is that it won't converge: the loss plateaus at around 2 and never drops below it, and accuracy is bad. The problem with Faster R-CNN is that the loss is below 1 but fluctuates a lot and sometimes jumps above 1.

What I've done:

I tried different batch sizes for SSD with no luck. Faster R-CNN is locked to batch size 1. I improved the dataset several times, going from 50 unbalanced classes to 10 balanced classes. I didn't change any hyperparameters in the model configs besides batch size.

My access to a strong GPU is limited, so I can't just try random hyperparameter combinations and hope that one works. Could you suggest a few things I can do to improve my models? I would be very thankful.

$\endgroup$

1 Answer

$\begingroup$

Are you training the models from scratch? If so, try starting from pre-trained models and fine-tuning them on your dataset.
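A quick way to check this is to look at the training config itself. The sketch below assumes the TensorFlow Object Detection API's text-format pipeline config, where `train_config` carries a `fine_tune_checkpoint` field and the sample configs ship with a `PATH_TO_BE_CONFIGURED` placeholder. The helper name `fine_tune_checkpoint` and the assumption that the field sits on its own line are mine, so treat it as a rough check rather than a real config parser.

```python
import re


def fine_tune_checkpoint(config_text):
    """Return the configured fine-tune checkpoint path, or None.

    Assumes a text-format pipeline config in which the field appears on its
    own line as: fine_tune_checkpoint: "some/path". Commented-out lines and
    the sample-config placeholder are treated as "not set".
    """
    match = re.search(
        r'^\s*fine_tune_checkpoint\s*:\s*"([^"]*)"',
        config_text,
        flags=re.MULTILINE,
    )
    if match is None:
        return None
    path = match.group(1).strip()
    # Sample configs use this placeholder until a real path is filled in.
    if not path or "PATH_TO_BE_CONFIGURED" in path:
        return None
    return path
```

If this returns `None` for your config, the model is most likely starting from random weights, which would make slow or stalled convergence much less surprising.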

You'll have to experiment with hyperparameters. The learning rate and the optimizer (e.g., SGD, Adam, RMSprop, Adadelta) will typically have the largest effect on model training and performance.
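Since GPU time is limited, one cheap way to experiment is to try a handful of learning rates spaced evenly on a log scale, run each for a short while, and compare smoothed loss curves rather than raw ones (with batch size 1 the raw loss will be noisy no matter what). Here is a minimal sketch in plain Python; the function names, the example range, and the smoothing factor of 0.9 are my own choices, not anything taken from the model configs.

```python
import math


def log_spaced_learning_rates(low, high, count):
    """Return `count` learning rates evenly spaced on a log10 scale."""
    if count < 2:
        raise ValueError("count must be at least 2")
    log_low, log_high = math.log10(low), math.log10(high)
    step = (log_high - log_low) / (count - 1)
    return [10 ** (log_low + i * step) for i in range(count)]


def smoothed(losses, beta=0.9):
    """Exponential moving average of a loss curve, with bias correction.

    Bias correction keeps the first few values from being pulled toward 0.
    """
    average = 0.0
    result = []
    for step, loss in enumerate(losses, start=1):
        average = beta * average + (1 - beta) * loss
        result.append(average / (1 - beta ** step))
    return result


# Example: five candidates between 1e-5 and 1e-1 (range is an assumption).
candidates = log_spaced_learning_rates(1e-5, 1e-1, 5)
```

Each candidate only needs enough steps to show whether the smoothed loss is trending down, flat, or diverging, so the whole sweep can cost far less than one full 100k-step run.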

$\endgroup$