
I have a binary classification problem and want to build a NN model that classifies each sample as class 0 or class 1.

My current implementation looks like this:

from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense

# Split the dataset into train and test data
X_train, X_test, Y_train, Y_test = train_test_split(normalized_X, Y, test_size=0.3, random_state=seed)

# Build the model: one hidden Dense layer (23 units), sigmoid output for binary classification
model = Sequential()
model.add(Dense(23, input_dim=45, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal', activation='sigmoid'))

# Compile the model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Fit the model (tensorboard and time_callback are defined elsewhere)
history = model.fit(X_train, Y_train, validation_split=0.3, epochs=200, batch_size=5, verbose=1, callbacks=[tensorboard, time_callback])

and I get a val_accuracy of 79.85% at the last training epoch.

I used a confusion matrix to evaluate the model:

from sklearn.metrics import confusion_matrix

# Threshold the predicted probabilities at 0.5 to get class labels
y_pred = model.predict(X_test)
y_pred = (y_pred > 0.5)

cm = confusion_matrix(Y_test, y_pred)
print(cm)

and I get these values:

[[ 622  205]
 [ 216 1055]]

which gives 79.93% correctly predicted classes ((622 TN + 1055 TP) out of 2098 predictions), approximately the same as the val_accuracy at the last epoch.
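As a sanity check, the same accuracy can be computed directly from the confusion matrix:

# sklearn orders the binary confusion matrix as [[TN, FP], [FN, TP]]
tn, fp, fn, tp = cm.ravel()
accuracy = (tn + tp) / (tn + fp + fn + tp)
print(accuracy)  # (622 + 1055) / 2098 ≈ 0.7993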

Now my question is: how can I improve my NN so that I get above that? Right now I am using one hidden Dense layer with 23 nodes. When should I use Dense layers, and when should I use Conv2D, Dropout, or any of the other Keras layers?

I am classifying numerical data. Here is what my data looks like (the dataframe is shown in two screenshots because it is too wide for one): df 1st part, df 2nd part.

PS: The categorical features were one-hot encoded using:

basic_df = pd.get_dummies(basic_df, columns=['industry', 'weekday', 'category_name', 'page_name', 'type'])

And the label column is 'successful'.


1 Answer


What are Dense layers and when are they useful?

Dense layers are useful when any feature of a data point can be associated with any other feature. Between two layers of sizes $n_{1}$ and $n_{2}$ there can be $n_{1} \times n_{2}$ connections, one for every pair of units, which is why these layers are called dense.
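For instance, using the layer sizes from the question, those $n_{1} \times n_{2}$ connections show up directly in the layer's parameter count (a minimal sketch):

from keras.models import Sequential
from keras.layers import Dense

# A Dense layer connects every input unit to every output unit:
# 45 inputs -> 23 units gives 45 * 23 weights (+ 23 biases)
model = Sequential()
model.add(Dense(23, input_dim=45, activation='relu'))
model.summary()  # Param # for this layer: 45 * 23 + 23 = 1058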

Where do Conv layers come in, and when are they useful?

Conv layers come in when nearby associations among features are what matter, for example in object detection: neighborhoods are what let you classify or detect. It is very unlikely that pixels at opposite corners of an image (very far apart) are somehow helpful in these use cases. Filters do the job of capturing associations within neighborhoods. This answer is great for understanding the difference between 1D and 2D convolutions, so I won't repeat it here.
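For illustration, a minimal Conv2D sketch (the input shape and filter count here are made up, not taken from the question):

from keras.models import Sequential
from keras.layers import Conv2D

# Each 3x3 filter only "sees" a 3x3 neighborhood of the input at a time,
# so the learned associations are local, not all-to-all as in Dense
model = Sequential()
model.add(Conv2D(16, kernel_size=(3, 3), activation='relu',
                 input_shape=(28, 28, 1)))  # e.g. 28x28 grayscale images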

Dropout and Flatten

Dropout is a way of reducing too much association among features by dropping weights (edges) with some probability. The original paper from Hinton et al. is a quick and great read to grasp it. It can be applied between any two layers, and it stops weight updates for the dropped edges. Another key point is that Dropout has no weights of its own; it is just there dropping things.
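For example, Dropout could be slipped between the layers of the model from the question like this (the 0.5 rate is just a common default, not a tuned value):

from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(23, input_dim=45, activation='relu'))
model.add(Dropout(0.5))  # during training, each of the 23 outputs is zeroed with p = 0.5
model.add(Dense(1, activation='sigmoid'))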

Flatten layers are used when you have a multidimensional output and want to make it one-dimensional to pass it on to a Dense layer. If you are familiar with NumPy, it is equivalent to numpy.ravel. The output of a Flatten layer is passed to an MLP for whatever classification or regression task you want to achieve. No weights are associated with Flatten either; it just flattens.
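A small sketch of the usual Conv -> Flatten -> Dense pattern (shapes are illustrative):

from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(16, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(Flatten())  # reshapes (26, 26, 16) -> 10816 values, like numpy.ravel
model.add(Dense(1, activation='sigmoid'))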

To improve the model

Try more layers, different activation functions, and different batch sizes. If there is any chance of improving the features or creating higher-order ones, do it. There are too many variables to speculate further, but a sketch of one possible variant follows below.
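As a hedged starting point (all layer sizes, dropout rates, the batch size, and the early-stopping patience here are guesses to be tuned, not known-good values):

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.callbacks import EarlyStopping

# A deeper variant of the model with dropout between the hidden layers
model = Sequential()
model.add(Dense(64, input_dim=45, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Stop when validation loss stops improving instead of always running 200 epochs
early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
history = model.fit(X_train, Y_train, validation_split=0.3,
                    epochs=200, batch_size=32, callbacks=[early_stop])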

This question is really broad, but I hope I covered it well.

