Implementation of paddle.attr.Param ignore the globally set initialization mean and std, and force me to use smart initalization
Created by: lcy-seso
There is a code snippet from paddle/trainer_config_helpers/attrs.py
https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/trainer_config_helpers/attrs.py#L139:
def __init__(self,
name=None,
is_static=False,
initial_std=None,
initial_mean=None,
initial_max=None,
initial_min=None,
l1_rate=None,
l2_rate=None,
learning_rate=None,
momentum=None,
gradient_clipping_threshold=None,
sparse_update=False,
update_hooks=None,
initializer=None):
self.attr = {}
if is_static:
self.attr['is_static'] = True
if initial_std is None and initial_mean is None and initial_max \
is None and initial_min is None:
self.attr['initial_smart'] = True
-
in training, I may prefer another initialization strategy, such as initialization from a Gaussian distribution with a default mean and std, rather than use smart initialization.
-
to do this, I will set the global initialization std and mean.
-
The implementation above ignores the globally set mean and std. If I just use
paddle.attr.Param
to tune the learning rate or any other parameters exceptinitial_std
,initial_mean
,initial_max
, andinitial_min
like below. The codes then force me to use smart initialization. It ignores the default initialization std I set.param_attr=paddle.attr.Param(learning_rate=0.1)