
A tf.Variable is used when the gradient with respect to a tensor has to be computed. Variables usually hold the parameters of an algorithm, which are updated after each training step.

The gradients are computed by automatic differentiation on the computation graph, which is a directed acyclic graph (DAG).

Consider the example of solving linear regression in one variable iteratively. (Other ways of solving it include using the second derivative to choose the learning rate, which leads to a closed-form solution, or computing the pseudo-inverse.)
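For comparison with the iterative approach, here is a minimal sketch of the closed-form least-squares fit for a line t ≈ A*x + B. It uses only plain Python. The function name fit_line_closed_form and the sample data are assumptions made for illustration and do not come from the notebook.

```python
def fit_line_closed_form(xs, ts):
    """Least-squares fit of t ~ A*x + B using the normal equations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(ts) / n
    # Slope = covariance(x, t) / variance(x); the intercept follows from the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    A = sxt / sxx
    B = mean_t - A * mean_x
    return A, B


# Hypothetical data generated from t = 2*x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ts = [1.0, 3.0, 5.0, 7.0]
A, B = fit_line_closed_form(xs, ts)
print("A, B :", A, B)
```

A single (x, t) pair does not determine both A and B, which is why the sketch insists on at least two distinct x values.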

Computing gradients:

The gradient of the output with respect to a variable is the product of the local gradients along the path connecting the two nodes. If there is more than one path between the nodes (for example, when the least-squares error is computed over a batch of inputs), the gradient is the sum of the results over all such paths.
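The product-along-a-path, sum-over-paths rule can be checked by hand for the least-squares error E = sum_i (A*x_i + B - t_i)^2. The sketch below is plain Python, not TensorFlow; the function names and the small batch are assumptions chosen for illustration. A finite-difference estimate is included only as a sanity check.

```python
def loss(A, B, xs, ts):
    """Sum of squared errors E = sum_i (A*x_i + B - t_i)^2."""
    return sum((A * x + B - t) ** 2 for x, t in zip(xs, ts))


def grad_by_paths(A, B, xs, ts):
    """Gradient of E with respect to A and B via the chain rule.

    Each sample i contributes one path A -> y_i -> e_i -> E (and one for B).
    Local derivatives are multiplied along a path; the paths are summed.
    """
    dE_dA = 0.0
    dE_dB = 0.0
    for x, t in zip(xs, ts):
        y = A * x + B
        dE_de = 1.0            # E is the sum of the e_i
        de_dy = 2.0 * (y - t)  # e_i = (y_i - t_i)^2
        dy_dA = x              # y_i = A*x_i + B
        dy_dB = 1.0
        dE_dA += dE_de * de_dy * dy_dA
        dE_dB += dE_de * de_dy * dy_dB
    return dE_dA, dE_dB


def grad_numeric(A, B, xs, ts, h=1e-6):
    """Central finite differences, used only to cross-check the result."""
    dA = (loss(A + h, B, xs, ts) - loss(A - h, B, xs, ts)) / (2 * h)
    dB = (loss(A, B + h, xs, ts) - loss(A, B - h, xs, ts)) / (2 * h)
    return dA, dB


xs, ts = [1.0, 2.0, 3.0], [2.0, 3.0, 5.0]  # hypothetical batch of three samples
print("chain rule :", grad_by_paths(1.0, 0.0, xs, ts))
print("numeric    :", grad_numeric(1.0, 0.0, xs, ts))
```

With a single sample there is only one path per variable, so the sum has a single term.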


tf.placeholder nodes are commonly used for the inputs to the algorithm. The value of x is passed through feed_dict when running the session.

import tensorflow as tf

x = tf.placeholder(tf.float32, name="input")
t = tf.placeholder(tf.float32, name="target")
# The shape is taken from the value fed through feed_dict unless it is specified explicitly

# Sometimes we want to fix one dimension and leave another to be determined at run time,
# e.g. a placeholder that accepts any batch size
batch = tf.placeholder(tf.float32, (None, 5))
print(x)
print(t)
print(batch)
Tensor("input_1:0", dtype=float32)
Tensor("target_1:0", dtype=float32)
Tensor("Placeholder_1:0", shape=(?, 5), dtype=float32)


Variables have to be defined with an initial value.

# General form: tf.Variable(<initial_value>, ...)
# dtype must be passed by keyword; the second positional argument is `trainable`
A = tf.Variable(1.0, dtype=tf.float32, name="A")
B = tf.Variable(0.0, dtype=tf.float32, name="B")

# Initializing a variable with a tensor
tensor = tf.random_uniform((5, 4), 0, 101)
tensor_variable = tf.Variable(tensor)

<tf.Variable 'A_1:0' shape=() dtype=float32_ref>
<tf.Variable 'B_1:0' shape=() dtype=float32_ref>
<tf.Variable 'Variable_1:0' shape=(5, 4) dtype=float32_ref>
y = A*x + B
E = tf.reduce_sum(tf.square(tf.subtract(y, t)))
# If the placeholder is fed a batch, y is a vector by the broadcasting rules

# Write the graph so that it can be inspected in TensorBoard
with tf.Session() as sess:
    writer = tf.summary.FileWriter('./tensorboard_logs', sess.graph)
Variables have to be initialized before running the graph.

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)

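You can also initialize specific variables with tf.variables_initializer:

init = tf.variables_initializer([A])  # initializes only A

# Here all variables are initialized, in a session that stays open for the training loops
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)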

Set up Optimizer

optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(E)
name: "GradientDescent_2"
op: "NoOp"
input: "^GradientDescent_2/update_A/ApplyGradientDescent"
input: "^GradientDescent_2/update_B/ApplyGradientDescent"


for i in range(20):
    # Run one gradient descent step
    sess.run(train, feed_dict={x: 1, t: 2})
    # Get the value of the variables after each iteration
    loss, a, b = sess.run([E, A, B], feed_dict={x: 1, t: 2})
    print("Loss, A, B : ", loss, a, b)
Loss, A, B :  0.92160004 1.02 0.02
Loss, A, B :  0.84934676 1.0392 0.0392
Loss, A, B :  0.7827579 1.057632 0.057632003
Loss, A, B :  0.7213897 1.0753267 0.075326726
Loss, A, B :  0.66483265 1.0923136 0.092313655
Loss, A, B :  0.6127097 1.1086211 0.108621106
Loss, A, B :  0.56467324 1.1242763 0.12427626
Loss, A, B :  0.52040285 1.1393052 0.1393052
Loss, A, B :  0.4796033 1.153733 0.153733
Loss, A, B :  0.4420024 1.1675837 0.16758367
Loss, A, B :  0.4073495 1.1808803 0.18088032
Loss, A, B :  0.37541324 1.1936451 0.1936451
Loss, A, B :  0.34598076 1.2058994 0.2058993
Loss, A, B :  0.31885594 1.2176634 0.21766332
Loss, A, B :  0.29385763 1.2289568 0.22895679
Loss, A, B :  0.2708192 1.2397985 0.23979852
Loss, A, B :  0.24958698 1.2502066 0.25020656
Loss, A, B :  0.23001932 1.2601984 0.2601983
Loss, A, B :  0.21198583 1.2697904 0.26979035
Loss, A, B :  0.19536613 1.2789989 0.27899873
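To make the update rule behind these numbers explicit, here is a minimal plain-Python sketch of the same gradient descent on E = (A*x + B - t)^2 with x = 1, t = 2 and a learning rate of 0.01. The function name train_single_point is an assumption for illustration. The sketch computes in Python's double precision, so the trailing digits can differ slightly from the float32 values printed above.

```python
def train_single_point(x=1.0, t=2.0, A=1.0, B=0.0, lr=0.01, steps=20):
    """Plain-Python gradient descent on E = (A*x + B - t)^2."""
    history = []
    for _ in range(steps):
        y = A * x + B
        # Gradients of E with respect to A and B at the current values
        dE_dA = 2.0 * (y - t) * x
        dE_dB = 2.0 * (y - t)
        # Gradient descent update: move against the gradient
        A -= lr * dE_dA
        B -= lr * dE_dB
        # The loss is evaluated after the update, matching the printed values
        loss = (A * x + B - t) ** 2
        history.append((loss, A, B))
    return history


for loss, A, B in train_single_point():
    print("Loss, A, B : ", loss, A, B)
```

Because x = 1, both gradients are equal at every step, so A and B always move by the same amount.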

Getting the gradients

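optimizer.minimize is actually a wrapper around two methods:

optimizer.compute_gradients : returns a list of (gradient, variable) pairs. Custom processing, such as a regularizer, can be applied to these gradients.

optimizer.apply_gradients : applies the (possibly modified) gradients to the variables.

Calling the two methods separately makes it possible to manipulate the gradients.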

optimizer = tf.train.GradientDescentOptimizer(0.01)
# compute_gradients returns one (gradient, variable) pair per variable
grad_A, grad_B = optimizer.compute_gradients(E, [A, B])
print(grad_A[0])  # the gradient tensor
print(grad_A[1])  # the variable it belongs to
Tensor("gradients/mul_grad/tuple/control_dependency:0", shape=(), dtype=float32)
<tf.Variable 'A:0' shape=() dtype=float32_ref>
#Manipulate grads if needed
train = optimizer.apply_gradients([grad_A,grad_B])
name: "GradientDescent"
op: "NoOp"
input: "^GradientDescent/update_A/ApplyGradientDescent"
input: "^GradientDescent/update_B/ApplyGradientDescent"
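The compute, modify, apply pattern can be sketched without TensorFlow. In the plain-Python example below, the functions compute_gradients and apply_gradients are hypothetical stand-ins that only mirror the names of the optimizer methods, and the L2 penalty with weight_decay = 0.1 is an assumed regularizer chosen for illustration.

```python
def compute_gradients(A, B, x, t):
    """Return (gradient, value) pairs for E = (A*x + B - t)^2."""
    d = A * x + B - t
    return [(2.0 * d * x, A), (2.0 * d, B)]


def add_l2_penalty(grads_and_vars, weight_decay):
    """Add the gradient of (weight_decay / 2) * v^2 to each gradient."""
    return [(g + weight_decay * v, v) for g, v in grads_and_vars]


def apply_gradients(grads_and_vars, lr):
    """Gradient descent update applied to each (gradient, value) pair."""
    return [v - lr * g for g, v in grads_and_vars]


A, B = 1.0, 0.0
pairs = compute_gradients(A, B, x=1.0, t=2.0)
# Manipulate the gradients between the two steps (assumed L2 regularizer)
pairs = add_l2_penalty(pairs, weight_decay=0.1)
A, B = apply_gradients(pairs, lr=0.01)
print("A, B :", A, B)
```

Keeping the gradient and its variable together as a pair is what lets the middle step modify each gradient using the value of the variable it belongs to.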

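for i in range(20):
    # Run one gradient descent step
    sess.run(train, feed_dict={x: 1, t: 2})
    # Fetch the loss, the variables and their gradients after the update
    loss, a, b, g_a, g_b = sess.run([E, A, B, grad_A[0], grad_B[0]], feed_dict={x: 1, t: 2})
    print("Loss, A, B, Grad_A, Grad_B : ", loss, a, b, g_a, g_b)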
Loss, A, B, Grad_A, Grad_B :  0.92160004 1.02 0.02 -1.9200001 -1.9200001
Loss, A, B, Grad_A, Grad_B :  0.84934676 1.0392 0.0392 -1.8432002 -1.8432002
Loss, A, B, Grad_A, Grad_B :  0.7827579 1.057632 0.057632003 -1.7694721 -1.7694721
Loss, A, B, Grad_A, Grad_B :  0.7213897 1.0753267 0.075326726 -1.6986933 -1.6986933
Loss, A, B, Grad_A, Grad_B :  0.66483265 1.0923136 0.092313655 -1.6307454 -1.6307454
Loss, A, B, Grad_A, Grad_B :  0.6127097 1.1086211 0.108621106 -1.5655155 -1.5655155
Loss, A, B, Grad_A, Grad_B :  0.56467324 1.1242763 0.12427626 -1.5028949 -1.5028949
Loss, A, B, Grad_A, Grad_B :  0.52040285 1.1393052 0.1393052 -1.4427791 -1.4427791
Loss, A, B, Grad_A, Grad_B :  0.4796033 1.153733 0.153733 -1.3850679 -1.3850679
Loss, A, B, Grad_A, Grad_B :  0.4420024 1.1675837 0.16758367 -1.3296652 -1.3296652
Loss, A, B, Grad_A, Grad_B :  0.4073495 1.1808803 0.18088032 -1.2764788 -1.2764788
Loss, A, B, Grad_A, Grad_B :  0.37541324 1.1936451 0.1936451 -1.2254195 -1.2254195
Loss, A, B, Grad_A, Grad_B :  0.34598076 1.2058994 0.2058993 -1.1764026 -1.1764026
Loss, A, B, Grad_A, Grad_B :  0.31885594 1.2176634 0.21766332 -1.1293466 -1.1293466
Loss, A, B, Grad_A, Grad_B :  0.29385763 1.2289568 0.22895679 -1.0841727 -1.0841727
Loss, A, B, Grad_A, Grad_B :  0.2708192 1.2397985 0.23979852 -1.0408058 -1.0408058
Loss, A, B, Grad_A, Grad_B :  0.24958698 1.2502066 0.25020656 -0.99917364 -0.99917364
Loss, A, B, Grad_A, Grad_B :  0.23001932 1.2601984 0.2601983 -0.9592066 -0.9592066
Loss, A, B, Grad_A, Grad_B :  0.21198583 1.2697904 0.26979035 -0.92083836 -0.92083836
Loss, A, B, Grad_A, Grad_B :  0.19536613 1.2789989 0.27899873 -0.88400483 -0.88400483
with tf.Session() as sess:
    writer = tf.summary.FileWriter('./tensorboard_logs', sess.graph)
