add comment for lazy_mode adam optimizer

eb5d427d · Qiao Longfei · c624417c · eb5d427d

Showing 1 changed file with 7 additions and 2 deletions

python/paddle/fluid/optimizer.py +7 -2

--- a/python/paddle/fluid/optimizer.py
+++ b/python/paddle/fluid/optimizer.py
@@ -641,9 +641,14 @@ class AdamOptimizer(Optimizer):
        beta1 (float): The exponential decay rate for the 1st moment estimates.
        beta2 (float): The exponential decay rate for the 2nd moment estimates.
        epsilon (float): a small float value for numerical stability.
-        regularization: A Regularizer, such as
-                        fluid.regularizer.L2DecayRegularizer.
+        regularization: A Regularizer, such as fluid.regularizer.L2DecayRegularizer.
        name: A optional name prefix.
+        lazy_mode (bool, default False): The original Adam algorithm keeps two moving-average
+        accumulators and updates every element of both at every step, in both dense mode and
+        sparse mode. If the parameter is very large, this update may be very slow. Lazy mode
+        updates only the elements that have a gradient in the current mini-batch, so it can be
+        much faster. However, this mode has different semantics from the original Adam
+        algorithm and may lead to different results.
    Examples:
        .. code-block:: python
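
Note on the new `lazy_mode` comment: the semantic difference it describes can be illustrated with a small pure-Python sketch. This is not Paddle's kernel. The function `adam_step`, the helper `run`, and the representation of a sparse gradient as an `{index: value}` dict are all hypothetical, and the update rule is one common bias-corrected formulation of Adam, assumed here for illustration.

```python
import math


def adam_step(param, m, v, sparse_grad, t, lazy_mode=False,
              lr=0.001, beta1=0.9, beta2=0.999, epsilon=1e-8):
    """One in-place Adam step for a sparse gradient given as {index: value}.

    Hypothetical helper for illustration only; not Paddle's implementation.
    """
    # Bias-corrected step size shared by all elements at step t (t starts at 1).
    lr_t = lr * math.sqrt(1.0 - beta2 ** t) / (1.0 - beta1 ** t)
    # Lazy mode visits only the indices that have a gradient in this mini-batch.
    # The standard update visits every element and treats a missing gradient as 0.
    indices = sorted(sparse_grad) if lazy_mode else range(len(param))
    for i in indices:
        g = sparse_grad.get(i, 0.0)
        m[i] = beta1 * m[i] + (1.0 - beta1) * g
        v[i] = beta2 * v[i] + (1.0 - beta2) * g * g
        param[i] -= lr_t * m[i] / (math.sqrt(v[i]) + epsilon)
    return param


def run(lazy_mode):
    param, m, v = [1.0, 1.0, 1.0], [0.0] * 3, [0.0] * 3
    # Step 1 has a gradient for index 0 only; step 2 for index 2 only.
    for t, grad in enumerate([{0: 0.5}, {2: -0.5}], start=1):
        adam_step(param, m, v, grad, t, lazy_mode=lazy_mode)
    return param, m, v


dense_param, dense_m, dense_v = run(lazy_mode=False)
lazy_param, lazy_m, lazy_v = run(lazy_mode=True)
```

In the standard update, index 0 keeps moving at step 2 because its first-moment accumulator decays and is applied again even though its gradient is absent. In lazy mode, index 0 and its accumulators are left untouched at step 2, which is where the per-step cost is saved and where the results diverge.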