get_grad returns zero matrix when use_gpu=True
Created by: lcy-seso
I use this PR https://github.com/PaddlePaddle/Paddle/pull/3085 to print parameter values and gradients in the event handler.
My usage is as follows:
import numpy as np

# logger is assumed to be a logging.Logger configured elsewhere.


def show_parameter_status(parameters):
    # Debug print: log value and gradient statistics for each parameter.
    for p in parameters:
        value = parameters.get(p)
        grad = parameters.get_grad(p)
        avg_abs_value = np.average(np.abs(value))
        avg_abs_grad = np.average(np.abs(grad))
        logger.info(
            ("%s avg_abs_value=%.6f avg_abs_grad=%.6f "
             "min_value=%.6f max_value=%.6f min_grad=%.6f max_grad=%.6f") %
            (p, avg_abs_value, avg_abs_grad, value.min(), value.max(),
             grad.min(), grad.max()))
- But I found that when I set use_gpu=True in train.init(), get_grad always returns an all-zero matrix.
- When I change it to use_gpu=False and keep everything else unchanged, it returns a non-zero matrix.
- My training cost decreases, so I think the gradient matrices should be non-zero.
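To make the symptom easier to spot in logs, the check can be reduced to a small helper that flags an all-zero gradient explicitly. This is only a sketch: grad_summary is a hypothetical helper, the parameter name "fc.w0" is made up, and the arrays are plain NumPy stand-ins for what parameters.get(p) and parameters.get_grad(p) are assumed to return.

```python
import numpy as np


def grad_summary(name, value, grad):
    """Summarize one parameter and flag an all-zero gradient.

    `value` and `grad` are assumed to be NumPy arrays, such as the ones
    returned by parameters.get(name) and parameters.get_grad(name).
    """
    value = np.asarray(value)
    grad = np.asarray(grad)
    return {
        "name": name,
        "avg_abs_value": float(np.mean(np.abs(value))),
        "avg_abs_grad": float(np.mean(np.abs(grad))),
        # True only when every gradient entry is exactly zero.
        "grad_all_zero": not np.any(grad),
    }


# Stand-in arrays; the parameter name below is hypothetical.
value = np.array([[0.5, -1.5], [2.0, 0.0]], dtype=np.float32)
zero_grad = np.zeros_like(value)  # mimics what is seen with use_gpu=True
nonzero_grad = np.array([[0.1, 0.0], [-0.3, 0.0]], dtype=np.float32)

gpu_like = grad_summary("fc.w0", value, zero_grad)
cpu_like = grad_summary("fc.w0", value, nonzero_grad)
print(gpu_like["grad_all_zero"], cpu_like["grad_all_zero"])
```

Calling such a helper from the event handler for each parameter would show whether every gradient is exactly zero in GPU mode, or only some of them.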