I am trying to follow an implementation of an attention decoder
from keras.layers.recurrent import Recurrent
...
class AttentionDecoder(Recurrent):
...
#######################################################
# The functionality of init, build and call is clear
#######################################################
def __init__(self, units, output_dim,
activation='tanh',
...
def build(self, input_shape):
...
def call(self, x):
self.x_seq = x
...
return super(AttentionDecoder, self).call(x)
##################################################################
# What is the purpose of 'get_initial_state' and 'step' functions
# Do these functions override the Recurrent base class functions?
##################################################################
def get_initial_state(self, inputs):
# apply the matrix on the first time step to get the initial s0.
s0 = activations.tanh(K.dot(inputs[:, 0], self.W_s))
# from keras.layers.recurrent to initialize a vector of (batchsize,
# output_dim)
y0 = K.zeros_like(inputs) # (samples, timesteps, input_dims)
y0 = K.sum(y0, axis=(1, 2)) # (samples, )
y0 = K.expand_dims(y0) # (samples, 1)
y0 = K.tile(y0, [1, self.output_dim])
return [y0, s0]
def step(self, x, states):
ytm, stm = states
...
return yt, [yt, st]
The AttentionDecoder
class is inherited from Recurrent
, an abstract base class for recurrent layers ( [email protected], documented here ).
How do the get_initial_state
and step
function work withing the class (who calls them, when, etc.)? If these function are related to the base class Recurrent, where can I find the relevant documentation?