3

I am training a deep autoencoder to map human faces to a 128-dimensional latent space and then decode them back to their original 128x128x3 format.

I was hoping that, after training the autoencoder, I would somehow be able to 'slice' off the second half of the autoencoder, i.e. the decoder network responsible for mapping the latent space (128,) back to the image space (128, 128, 3), using the functional Keras API and autoenc_model.get_layer().

Here are the relevant layers of my model:

INPUT_SHAPE=(128,128,3)
input_img = Input(shape=INPUT_SHAPE, name='enc_input')

#1
x = Conv2D(64, (3, 3), padding='same', activation='relu')(input_img)
x = BatchNormalization()(x)

# Many Conv2D, BatchNormalization(), MaxPooling() layers
# ...

#Flatten
fc_input = Flatten(name='enc_output')(x)

y = Dropout(DROP_RATE)(fc_input)
y = Dense(128, activation='relu')(y)
y = Dropout(DROP_RATE)(y)
fc_output = Dense(128, activation='linear')(y)   

#Reshape
decoder_input = Reshape((8, 8, 2), name='decoder_input')(fc_output)

#Decoder part

#UnPooling-1
z = UpSampling2D()(decoder_input)
# Many Conv2D, BatchNormalization, UpSampling2D layers
# ...
#16
decoder_output = Conv2D(3, (3, 3), padding='same', activation='linear', name='decoder_output')(z)

autoenc_model = Model(input_img, decoder_output)

Here is the notebook containing the entire model architecture.

To get the decoder network from the trained autoencoder, I have tried using:

dec_model = Model(inputs=autoenc_model.get_layer('decoder_input').input, outputs=autoenc_model.get_layer('decoder_output').output)

and

dec_model = Model(autoenc_model.get_layer('decoder_input'), autoenc_model.get_layer('decoder_output'))

neither of which seems to work.

I need to extract the decoder layers out of the autoencoder as I want to train the entire autoencoder model first, then use the encoder and the decoder independently.

I could not find a satisfactory answer anywhere else. The Keras blog article on building autoencoders only covers how to extract the decoder for 2-layer autoencoders.

The decoder's input and output shapes should be (128,) and (128, 128, 3): the input shape of the 'decoder_input' layer and the output shape of the 'decoder_output' layer, respectively.

3 Answers

2

A couple of changes are needed:

z = UpSampling2D()(decoder_input)

to

direct_input = Input(shape=(8,8,2), name='d_input')
#UnPooling-1
z = UpSampling2D()(direct_input)

and

autoenc_model = Model(input_img, decoder_output)

to

dec_model = Model(direct_input, decoder_output)
autoenc_model = Model(input_img, dec_model(decoder_input))

Now you can train the autoencoder and then predict with the decoder on its own.

import numpy as np

autoenc_model.compile(optimizer='adam', loss='mse')   # compile before fitting; any optimizer/loss works for this smoke test
autoenc_model.fit(np.ones((5, 128, 128, 3)), np.ones((5, 128, 128, 3)))
dec_model.predict(np.ones((1, 8, 8, 2)))
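
If you also want the encoder on its own, no extra Input layer is needed, since input_img already is one. A minimal sketch, assuming fc_output is the 128-dimensional Dense output from the question's code (the sub-model shares weights with the trained autoencoder):

enc_model = Model(input_img, fc_output)                   # encoder: (128, 128, 3) -> (128,)
latent = enc_model.predict(np.ones((1, 128, 128, 3)))     # latent.shape == (1, 128)

Note that dec_model as defined above expects the reshaped (8, 8, 2) tensor, so to chain the two you would pass the latent vector through the 'decoder_input' Reshape layer first.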

You can also refer to this self-contained example: https://github.com/keras-team/keras/blob/master/examples/variational_autoencoder.py

3
  • Thanks for your answer, but shouldn't this line, dec_model = Model(direct_input, decoder_output), be dec_model = Model(z, decoder_output)? Thanks.
    – VansFannel
    Commented Sep 6, 2020 at 5:27
  • The model input needs to be an 'Input' layer. We add the 'Input' layer to the decoder so that it can be used as an independent model later. The upsampling layer in the decoder named 'z' is kind of misleading as it is usually reserved for the latent space output of the encoder. Commented Sep 6, 2020 at 12:54
  • Thanks. There is a related question to this one, stackoverflow.com/questions/63756756/…, that I have asked.
    – VansFannel
    Commented Sep 6, 2020 at 15:40
1

My solution isn't very elegant, and there are probably better solutions out there, but since no one has replied yet, I'll post it (I was actually hoping someone would, so I could improve my own implementation, as you'll see below).

So what I did was build a network that can take a secondary input directly into the latent space. Unfortunately, both inputs are obligatory, so I end up with a network that requires dummy arrays full of zeros for the 'unwanted' input (you'll see in a second).

Using Keras functional API:

image_input = Input(shape=image_shape)
conv1 = Conv2D(..., activation='relu')(image_input)
...
dense_encoder = Dense(...)(<layer>)
z_input = Input(shape=(n_latent,))
decoder_entry = Dense(..., activation='relu')(Add()([dense_encoder, z_input]))
...
decoder_output = Conv2DTranspose(...)(<layer>)

model = Model(inputs=[image_input, z_input], outputs=decoder_output)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

encoder = Model(inputs=image_input, outputs=dense_encoder)
decoder = Model(inputs=[z_input, image_input], outputs=decoder_output)

Note that you shouldn't compile the encoder and decoder.

(some code is either omitted or left with ... for you to fill in your specific needs).

Finally, to train you'll have to provide a dummy array of zeros for the latent input. So, to train the entire autoencoder:

images is X in this context

model.fit([images, np.zeros((len(images), n_latent))], images)

And then you can get the latent features using:

latent_features = encoder.predict(images)

Or use the decoder with latent input and dummy variables (note the order of inputs above):

decoder.predict([Z_inputs,np.zeros(shape=images.shape)])

Finally, another solution I haven't tried is to build two parallel models with the same architecture, one being the full autoencoder and the second only the decoder part, and then use:

decoder_layer.set_weights(model_layer.get_weights()) 

It should work, but I haven't confirmed it. It does have the disadvantage of having to copy the weights again every time you train the autoencoder model.
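
If you do try that route, the copying step might look roughly like this (a sketch only: standalone_decoder is a hypothetical separately built model with the same decoder architecture plus its own Input layer, and decoder_start is a hypothetical index of the first decoder layer inside the trained autoencoder model):

# Copy weights layer by layer from the trained autoencoder into the standalone decoder.
# standalone_decoder and decoder_start are assumptions, not part of the code above.
for dec_layer, ae_layer in zip(standalone_decoder.layers[1:],      # skip the decoder's own Input layer
                               model.layers[decoder_start:]):
    dec_layer.set_weights(ae_layer.get_weights())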

So to conclude, I am aware of the many problems here, but again, I only posted this because I saw no one else had replied, and was hoping this would still be of some use to you.

Please comment if something is not clear.

0

An option is to define a function which uses get_layer and then rebuilds the decoder part inside it. For example, consider a simple autoencoder with the following architecture: [n_inputs, 500, 100, 500, n_outputs]. To run some inputs through the second half (i.e. feed the 100-dimensional bottleneck values through the 500-unit layer and then the output layer), you could define:

# Function to get outputs from a given set of bottleneck inputs
def bottleneck_to_outputs(bottleneck_inputs, autoencoder):
    # Run bottleneck_inputs (eg 100 units) through decoder layer (eg 500 units)
    x = autoencoder.get_layer('decoder')(bottleneck_inputs)
    # Run x (eg 500 units) through output layer (n units = n features)
    x = autoencoder.get_layer('output')(x)
    return x
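
To turn that function into a standalone decoder model, you could call it on a fresh Input tensor. A sketch for this toy architecture, assuming the 100-unit bottleneck and the layer names 'decoder' and 'output' used above (imports shown for tensorflow.keras; adjust if you use standalone Keras):

from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model

bottleneck_in = Input(shape=(100,))   # 100 = bottleneck size of the toy example
decoder_only = Model(bottleneck_in, bottleneck_to_outputs(bottleneck_in, autoencoder))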

For your example, the following function should work (assuming you have given your layers the names referenced here):

def decoder_part(autoenc_model, image):

  #UnPooling-1
  z = autoenc_model.get_layer('upsampling1')(image)

  #9
  z = autoenc_model.get_layer('conv2d1')(z)
  z = autoenc_model.get_layer('batchnorm1')(z)

  #10
  z = autoenc_model.get_layer('conv2d2')(z)
  z = autoenc_model.get_layer('batchnorm2')(z)

  #UnPooling-2
  z = autoenc_model.get_layer('upsampling2')(z)

  #11
  z = autoenc_model.get_layer('conv2d3')(z)
  z = autoenc_model.get_layer('batchnorm3')(z)

  #12
  z = autoenc_model.get_layer('conv2d4')(z)
  z = autoenc_model.get_layer('batchnorm4')(z)

  #UnPooling-3
  z = autoenc_model.get_layer('upsampling3')(z)

  #13
  z = autoenc_model.get_layer('conv2d5')(z)
  z = autoenc_model.get_layer('batchnorm5')(z)

  #14
  z = autoenc_model.get_layer('conv2d6')(z)
  z = autoenc_model.get_layer('batchnorm6')(z)

  #UnPooling-4
  z = autoenc_model.get_layer('upsampling4')(z)

  #15
  z = autoenc_model.get_layer('conv2d7')(z)
  z = autoenc_model.get_layer('batchnorm7')(z)

  #16
  decoder_output = autoenc_model.get_layer('decoder_output')(z)

  return decoder_output
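
Similarly, decoder_part can be wrapped in a standalone Model that accepts the 128-dimensional latent vector directly. A sketch, assuming the layer names above match your model; latent_in is a hypothetical new Input, and the 'decoder_input' Reshape layer converts (128,) to (8, 8, 2):

latent_in = Input(shape=(128,), name='latent_in')                 # hypothetical Input for the latent vector
reshaped = autoenc_model.get_layer('decoder_input')(latent_in)    # reuse the Reshape layer: (128,) -> (8, 8, 2)
standalone_decoder = Model(latent_in, decoder_part(autoenc_model, reshaped))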

Given this function, it would make sense to also have a way to test if it is working correctly. In order to do this, define another model which gets you from inputs to the bottleneck (latent space), such as:

bottleneck_layer = Model(inputs=input_img, outputs=decoder_input)

Then, as a test, run an image of ones through the first part of the model to obtain the latent representation:

import numpy as np
ones_image = np.ones((128,128,3))
bottleneck_ones = bottleneck_layer(ones_image.reshape(1,128,128,3))

Then run that latent representation through the function defined above to create a variable which you will test against the output of the full network:

decoded_test = decoder_part(autoenc_model, bottleneck_ones)

Now, run the ones_image through the whole network and verify that you get the same results:

import tensorflow as tf

model_test = autoenc_model.predict(ones_image.reshape(1, 128, 128, 3))
tf.debugging.assert_equal(model_test, decoded_test, message='Tensors are not equivalent')

If the assert_equal line does not throw an error, your decoder is working correctly.
