Variables

tf.Variable is used when a tensor's gradients have to be computed. Variables typically hold the parameters of a model, which are updated after each training step.

The gradients are computed by automatic differentiation on the computation graph (a DAG).

Consider the example of solving linear regression in one variable iteratively. (Other ways to solve it include using the second derivative to choose the step size and obtain a closed-form solution, or computing the pseudo-inverse.)
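For comparison, here is a minimal NumPy sketch of the closed-form (pseudo-inverse) solution for t ≈ A*x + B; the data values are made up purely for illustration:

import numpy as np

# Hypothetical data for t ≈ A*x + B (values made up for illustration)
x_data = np.array([0.0, 1.0, 2.0, 3.0])
t_data = np.array([0.1, 2.1, 3.9, 6.2])

# Design matrix with a column of ones for the intercept B
X = np.stack([x_data, np.ones_like(x_data)], axis=1)

# Closed-form least-squares solution via the pseudo-inverse
A_hat, B_hat = np.linalg.pinv(X).dot(t_data)
print(A_hat, B_hat)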

Computing gradients:

The gradient of the output with respect to a variable is the product of the gradients along the path connecting the two nodes. If there is more than one path between the nodes (as when the squared error is summed over a batch of inputs), the gradient is the sum of the products over all such paths.
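Worked out for the model used later (y = A*x + B, E = (y - t)^2): the chain rule along the path A -> y -> E gives dE/dA = (dE/dy)*(dy/dA) = 2*(y - t)*x, and similarly dE/dB = 2*(y - t). For a batch, E = sum_i (y_i - t_i)^2, each example contributes one path from A to E, and the contributions add up: dE/dA = sum_i 2*(y_i - t_i)*x_i.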

Placeholders

tf.placeholder is commonly used for the inputs to the algorithm

In [1]:
import tensorflow as tf
In [2]:
tf.reset_default_graph()

The value of x is passed via feed_dict when the session is run

In [5]:
x = tf.placeholder(tf.float32,name="input")
t = tf.placeholder(tf.float32,name="target") 
# the shape is taken from feed_dict unless it is explicitly specified

# Sometimes we want to fix certain dimensions and leave the others to be determined at run time
# placeholder that handles a varying batch size
batch = tf.placeholder(tf.float32,(None,5))
print(x)
print(t)
print(batch)
Tensor("input_1:0", dtype=float32)
Tensor("target_1:0", dtype=float32)
Tensor("Placeholder_1:0", shape=(?, 5), dtype=float32)

Variables

Variables have to be defined with an initial value

A = tf.Variable(<initial_value>, ...)
In [3]:
A = tf.Variable(1.0,dtype=tf.float32,name="A")
B = tf.Variable(0.0,dtype=tf.float32,name="B")

#Initializing a variable with a tensor
tensor = tf.random_uniform((5,4),0,101)
tensor_variable = tf.Variable(tensor)

print(A)
print(B)
print(tensor_variable)
<tf.Variable 'A_1:0' shape=() dtype=float32_ref>
<tf.Variable 'B_1:0' shape=() dtype=float32_ref>
<tf.Variable 'Variable_1:0' shape=(5, 4) dtype=float32_ref>
In [5]:
y = A*x + B
E = tf.reduce_sum(tf.square(tf.subtract(y,t))) 
#if the placeholder holds a batch, then y is a vector by broadcasting rules
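A minimal sketch of what broadcasting does when a batch is fed (this initializes the variables here, which the notebook itself does a few cells below):

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    y_val, e_val = sess.run([y, E], feed_dict={x: [1.0, 2.0, 3.0],
                                               t: [2.0, 4.0, 6.0]})
    print(y_val)  # one prediction per input: [1. 2. 3.] with the initial A=1, B=0
    print(e_val)  # scalar loss: squared errors summed over the batch (14.0 here)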
In [6]:
!rm tensorboard_logs/*
In [7]:
with tf.Session() as sess:
    writer = tf.summary.FileWriter('./tensorboard_logs', sess.graph)
    writer.close()
In [8]:
!tensorboard --logdir=tensorboard_logs/
TensorBoard 1.12.0 at http://0f5261ab9fe2:6006 (Press CTRL+C to quit)
^C

Variables have to be initialized before running the graph

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    ...

You can also initialize specific variables with

init = tf.variables_initializer([A])
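A minimal sketch of initializing only A (B and the other variables would still be uninitialized until their own initializers are run):

with tf.Session() as sess:
    sess.run(tf.variables_initializer([A]))
    print(sess.run(A))  # 1.0; reading B here would raise an error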
In [6]:
init = tf.global_variables_initializer()
sess = tf.Session() 
sess.run(init)

Set up Optimizer

In [15]:
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(E)
print(train)
name: "GradientDescent_2"
op: "NoOp"
input: "^GradientDescent_2/update_A/ApplyGradientDescent"
input: "^GradientDescent_2/update_B/ApplyGradientDescent"

Train

In [16]:
for i in range(20):
    sess.run(train,feed_dict={x:1,t:2})
    #Get the value of the variables after each iteration
    loss,a,b = sess.run([E,A,B],feed_dict={x:1,t:2})
    print("Loss, A, B : ", loss,a,b)
Loss, A, B :  0.92160004 1.02 0.02
Loss, A, B :  0.84934676 1.0392 0.0392
Loss, A, B :  0.7827579 1.057632 0.057632003
Loss, A, B :  0.7213897 1.0753267 0.075326726
Loss, A, B :  0.66483265 1.0923136 0.092313655
Loss, A, B :  0.6127097 1.1086211 0.108621106
Loss, A, B :  0.56467324 1.1242763 0.12427626
Loss, A, B :  0.52040285 1.1393052 0.1393052
Loss, A, B :  0.4796033 1.153733 0.153733
Loss, A, B :  0.4420024 1.1675837 0.16758367
Loss, A, B :  0.4073495 1.1808803 0.18088032
Loss, A, B :  0.37541324 1.1936451 0.1936451
Loss, A, B :  0.34598076 1.2058994 0.2058993
Loss, A, B :  0.31885594 1.2176634 0.21766332
Loss, A, B :  0.29385763 1.2289568 0.22895679
Loss, A, B :  0.2708192 1.2397985 0.23979852
Loss, A, B :  0.24958698 1.2502066 0.25020656
Loss, A, B :  0.23001932 1.2601984 0.2601983
Loss, A, B :  0.21198583 1.2697904 0.26979035
Loss, A, B :  0.19536613 1.2789989 0.27899873

Getting the gradients

optimizer.minimize is actually a wrapper around two methods:

optimizer.compute_gradients : returns (gradient, variable) pairs, so the gradients can be inspected or modified, e.g. to add custom regularizers

optimizer.apply_gradients : applies the (possibly modified) gradients to the variables

Splitting the two steps is how the gradients are manipulated before they are applied, as sketched below.
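For example, a minimal sketch of gradient clipping using this two-step pattern (the clipping range of ±1.0 is an arbitrary choice for illustration):

optimizer = tf.train.GradientDescentOptimizer(0.01)
grads_and_vars = optimizer.compute_gradients(E, [A, B])
# Clip each gradient before it is applied to its variable
clipped = [(tf.clip_by_value(g, -1.0, 1.0), v) for g, v in grads_and_vars]
train_clipped = optimizer.apply_gradients(clipped)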

In [7]:
optimizer = tf.train.GradientDescentOptimizer(0.01)
grad_A, grad_B = optimizer.compute_gradients(E,[A,B])
print(grad_A[0]) # the gradient tensor
print(grad_A[1]) # the corresponding variable
Tensor("gradients/mul_grad/tuple/control_dependency:0", shape=(), dtype=float32)
<tf.Variable 'A:0' shape=() dtype=float32_ref>
In [8]:
#Manipulate grads if needed
train = optimizer.apply_gradients([grad_A,grad_B])
print(train)
name: "GradientDescent"
op: "NoOp"
input: "^GradientDescent/update_A/ApplyGradientDescent"
input: "^GradientDescent/update_B/ApplyGradientDescent"

In [9]:
for i in range(20):
    sess.run(train,feed_dict={x:1,t:2})
    loss,a,b,g_a,g_b = sess.run([E,A,B,grad_A[0],grad_B[0]],feed_dict={x:1,t:2})
    print("Loss, A, B, Grad_A, Grad_B : ", loss,a,b,g_a,g_b)
Loss, A, B, Grad_A, Grad_B :  0.92160004 1.02 0.02 -1.9200001 -1.9200001
Loss, A, B, Grad_A, Grad_B :  0.84934676 1.0392 0.0392 -1.8432002 -1.8432002
Loss, A, B, Grad_A, Grad_B :  0.7827579 1.057632 0.057632003 -1.7694721 -1.7694721
Loss, A, B, Grad_A, Grad_B :  0.7213897 1.0753267 0.075326726 -1.6986933 -1.6986933
Loss, A, B, Grad_A, Grad_B :  0.66483265 1.0923136 0.092313655 -1.6307454 -1.6307454
Loss, A, B, Grad_A, Grad_B :  0.6127097 1.1086211 0.108621106 -1.5655155 -1.5655155
Loss, A, B, Grad_A, Grad_B :  0.56467324 1.1242763 0.12427626 -1.5028949 -1.5028949
Loss, A, B, Grad_A, Grad_B :  0.52040285 1.1393052 0.1393052 -1.4427791 -1.4427791
Loss, A, B, Grad_A, Grad_B :  0.4796033 1.153733 0.153733 -1.3850679 -1.3850679
Loss, A, B, Grad_A, Grad_B :  0.4420024 1.1675837 0.16758367 -1.3296652 -1.3296652
Loss, A, B, Grad_A, Grad_B :  0.4073495 1.1808803 0.18088032 -1.2764788 -1.2764788
Loss, A, B, Grad_A, Grad_B :  0.37541324 1.1936451 0.1936451 -1.2254195 -1.2254195
Loss, A, B, Grad_A, Grad_B :  0.34598076 1.2058994 0.2058993 -1.1764026 -1.1764026
Loss, A, B, Grad_A, Grad_B :  0.31885594 1.2176634 0.21766332 -1.1293466 -1.1293466
Loss, A, B, Grad_A, Grad_B :  0.29385763 1.2289568 0.22895679 -1.0841727 -1.0841727
Loss, A, B, Grad_A, Grad_B :  0.2708192 1.2397985 0.23979852 -1.0408058 -1.0408058
Loss, A, B, Grad_A, Grad_B :  0.24958698 1.2502066 0.25020656 -0.99917364 -0.99917364
Loss, A, B, Grad_A, Grad_B :  0.23001932 1.2601984 0.2601983 -0.9592066 -0.9592066
Loss, A, B, Grad_A, Grad_B :  0.21198583 1.2697904 0.26979035 -0.92083836 -0.92083836
Loss, A, B, Grad_A, Grad_B :  0.19536613 1.2789989 0.27899873 -0.88400483 -0.88400483
In [10]:
sess.close()
In [35]:
!rm tensorboard_logs/*
In [11]:
with tf.Session() as sess:
    writer = tf.summary.FileWriter('./tensorboard_logs', sess.graph)
    writer.close()
In [12]:
!tensorboard --logdir=tensorboard_logs/
TensorBoard 1.12.0 at http://0f5261ab9fe2:6006 (Press CTRL+C to quit)
^C