
I built a seq2seq model for a chatbot after getting inspired by a GitHub repo. To train the chatbot I used my Facebook chat history. Since most of my chats are Hindi words written in the English alphabet, I had to train word embeddings from scratch. I knew that the model would take about 30-40 hours (500,000 iterations with a batch size of 24) of training on a CPU, so I learned to use the

tf.train.Saver()

method to save the variables and restore them in the future.

To see the progress of my model, I made it output replies to five input text sequences at every 250th iteration. At the beginning of training I was getting blank output (since the blank token is the most common). After a few thousand iterations it started to give the most common words as output. After 90,000 iterations it was giving somewhat illogical but more varied outputs, so I stopped training there.

Now, when I restore the variables from the latest checkpoint, I am again getting blank lines as output. Is this normal behavior, or is there some kind of mistake in my code?

CODE: Full code

Code snippets (code to restore from the latest checkpoint):

sess = tf.Session()
saver = tf.train.Saver()
saver.restore(sess, tf.train.latest_checkpoint('models/'))
sess.run(tf.global_variables_initializer())

(code to save variables in the iteration loop)

    if (i % 10000 == 0 and i != 0):
        savePath = saver.save(sess, "models/pretrained_seq2seq.ckpt", global_step=i)

1 Answer


Every time you run the global variables initializer, it resets all the variables. Run sess.run(tf.global_variables_initializer()) only the first time, when you train the model from scratch. After you've saved the model with the saver, comment that line out. Then, when you restore the model, it won't reset the weights and it should work fine.
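
For illustration, here is a minimal sketch of the restore path, assuming the graph has already been rebuilt exactly as it was during training (the 'models/' checkpoint directory is taken from the question):

    import tensorflow as tf

    # ... rebuild the same seq2seq graph as during training ...

    sess = tf.Session()
    saver = tf.train.Saver()

    ckpt = tf.train.latest_checkpoint('models/')
    if ckpt is not None:
        # Restore the trained weights; do NOT run the initializer afterwards,
        # or the restored values get overwritten with fresh random ones.
        saver.restore(sess, ckpt)
    else:
        # Initialize from scratch only when no checkpoint exists yet.
        sess.run(tf.global_variables_initializer())

This way the initializer only ever runs on a fresh start, never after a restore.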

