I believe I've configured this model correctly for multi-label classification, but it insists on behaving like a multi-class model: the predictions it outputs always sum to 1 (e.g. [0.2, 0.8]).
First, my dataset config; label_mode='categorical' gives me the one-hot encoded labels I should need for multi-label:
train_dataset = utils.text_dataset_from_directory(
    train_dir,
    batch_size=batch_size,
    validation_split=0.2,
    subset='training',
    seed=seed,
    labels='inferred',
    label_mode='categorical')
val_dataset = utils.text_dataset_from_directory(
    train_dir,
    batch_size=batch_size,
    validation_split=0.2,
    subset='validation',
    seed=seed,
    label_mode='categorical')
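To be explicit about what I mean by one-hot here, this is the label shape I understand label_mode='categorical' to produce for two class directories (a plain-NumPy illustration, not part of my pipeline):

```python
import numpy as np

# Illustrative only: with label_mode='categorical' and two class
# folders, each sample's label has exactly one 1, so every row
# sums to 1. Multi-label (multi-hot) targets could instead have
# several 1s per row.
labels = np.array([
    [1.0, 0.0],  # a file from the first class directory
    [0.0, 1.0],  # a file from the second class directory
])
print(labels.sum(axis=1))  # each row sums to exactly 1
```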
My TextVectorization layer shouldn't be relevant to the problem but I'm including it anyway just in case I'm missing something here:
def standardize(input):
    input = tf.strings.lower(input)
    input = tf.strings.regex_replace(input, r'[^a-z ]', '')
    return input

encoder = layers.TextVectorization(standardize=standardize)
encoder.adapt(train_dataset.map(lambda text, label: text))
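For reference, the custom standardization just lowercases and strips everything outside [a-z ]; here's a plain-Python equivalent (illustrative only, standardize_py is not part of my pipeline):

```python
import re

def standardize_py(text: str) -> str:
    # Same transform as the tf.strings version above: lowercase,
    # then drop every character that isn't a lowercase letter or space.
    return re.sub(r'[^a-z ]', '', text.lower())

print(standardize_py("Hello, World! 123"))  # -> "hello world "
```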
And finally the model:
model = keras.Sequential([
    encoder,
    keras.layers.Embedding(
        input_dim=len(encoder.get_vocabulary()),
        output_dim=50,
        # Use masking to handle the variable sequence lengths
        mask_zero=True),
    keras.layers.Bidirectional(keras.layers.LSTM(50)),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dense(2, activation='sigmoid')
])
model.compile(loss='binary_crossentropy',
              optimizer=keras.optimizers.Adam(1e-4),
              metrics=[keras.metrics.CategoricalAccuracy()])
history = model.fit(
    train_dataset, epochs=4,
    validation_data=val_dataset
)
I've tried binary, categorical, and F1 metrics; the results are the same regardless.
A prediction like this will invariably produce two values whose sum is between 0.98 and 1.02, regardless of the contents of input_string:
tensor = keras.ops.convert_to_tensor(np.array([input_string]))
print(model.predict(tensor, verbose=0)[0])
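For contrast, this is why the summing-to-1 behavior surprises me: independent sigmoid outputs are under no constraint to sum to 1 (a plain-NumPy sketch of the sigmoid math, with arbitrary made-up logits, not my actual model):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Two arbitrary logits standing in for a final Dense(2) layer's
# pre-activation output. Each sigmoid is applied independently,
# so the two probabilities can sum to anything between 0 and 2.
logits = np.array([2.0, 1.5])
probs = sigmoid(logits)
print(probs, probs.sum())  # both labels can be "on" at once
```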
Any help would be appreciated!