I'm trying to implement this method for use with TensorFlow (taken from here):
    def _jacobian_product_sq_euc(X, Y, E, G):
        m = X.shape[0]
        n = Y.shape[0]
        d = X.shape[1]
        for i in range(m):
            for j in range(n):
                for k in range(d):
                    G[i, k] += E[i, j] * 2 * (X[i, k] - Y[j, k])
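For reference, the sum over j is linear in E, so the triple loop collapses algebraically: summing E[i, j] * 2 * (X[i, k] - Y[j, k]) over j gives 2 * (E.sum(axis=1, keepdims=True) * X - E @ Y). A NumPy sketch of that identity (function name is mine):

```python
import numpy as np

def jacobian_product_sq_euc_vec(X, Y, E):
    # sum_j E[i, j] * 2 * (X[i, k] - Y[j, k])
    #   = 2 * (X[i, k] * sum_j E[i, j] - (E @ Y)[i, k])
    # i.e. one row-sum, one broadcast multiply, one matmul.
    return 2 * (E.sum(axis=1, keepdims=True) * X - E.dot(Y))
```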
I have rewritten this using three nested tf.while_loops, but noticed that it is very slow (working example here):
    def calc_score():
        gm = tf.zeros([16, 256])
        i = 0
        i_max = 16
        j_max = 16
        d_max = 256
        while_condition_loop1 = lambda i, gm_score: tf.less(i, i_max)
        while_condition_loop2 = lambda i, j, gm_score: tf.less(j, j_max)
        while_condition_loop3 = lambda i, j, d, gm_score: tf.less(d, d_max)
        gm_score = tf.constant(0.)

        def loop3(i, j, d, gm_score):
            gm_score = gm_score + e[i + 1, j + 1] * 2 * tf.abs(x[i, d] - y[j, d])
            return [i, j, tf.add(d, 1), gm_score]

        def loop2(i, j, gm_score):
            d = 0
            _, _, _, gm_score = tf.while_loop(while_condition_loop3, loop3, [i, j, d, gm_score])
            return [i, tf.add(j, 1), gm_score]

        def loop1(i, gm_score):
            j = 0
            _, _, gm_score = tf.while_loop(while_condition_loop2, loop2, [i, j, gm_score])
            return [tf.add(i, 1), gm_score]

        _, gm_score = tf.while_loop(while_condition_loop1, loop1, [i, gm_score])
        return gm_score
(Note: I'm aware that I'm returning a single value in this case instead of a matrix, but that is a separate issue.)
A single 16x256 computation takes about 4-5 seconds, and I'm wondering how to optimize this. Are there alternatives to using tf.while_loop in this case? My CPU also seems to have a fairly high load, and I get a lot of these messages while training:
    2017-10-30 17:00:51.234993: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 257610 get requests, put_count=385620 evicted_count=128000 eviction_rate=0.331933 and unsatisfied allocation rate=0
My knowledge of TensorFlow is still limited, and I'm wondering what can be done to optimize this method.
I'm using Python 2.7 and TensorFlow 1.2.0.
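One direction that looks promising to me: since the inner sum is linear in E, the whole computation should collapse into one reduce_sum and one matmul, with no tf.while_loop at all. A sketch (function name is mine; it matches the signed difference and E[i, j] indexing of the original NumPy version, not the tf.abs / e[i+1, j+1] variant in my loop code, and uses tf.expand_dims rather than a keepdims argument so it also builds on TF 1.2):

```python
import tensorflow as tf

def jacobian_product_sq_euc_tf(x, y, e):
    # G[i, k] = sum_j e[i, j] * 2 * (x[i, k] - y[j, k])
    #         = 2 * (rowsum_j(e)[i] * x[i, k] - (e @ y)[i, k])
    # so the three nested loops become a handful of fused tensor ops.
    row_sums = tf.expand_dims(tf.reduce_sum(e, axis=1), 1)  # shape (m, 1)
    return 2.0 * (row_sums * x - tf.matmul(e, y))           # shape (m, d)
```

On TF 1.2 this builds a few graph ops evaluated in a single sess.run, which should be dramatically cheaper than three nested tf.while_loops over 16x16x256 iterations.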