Skip to content
体验新版
项目
组织
正在加载...
登录
切换导航
打开侧边栏
BaiXuePrincess
Paddle
提交
7ff197d3
P
Paddle
项目概览
BaiXuePrincess
/
Paddle
与 Fork 源项目一致
Fork自
PaddlePaddle / Paddle
通知
1
Star
1
Fork
0
代码
文件
提交
分支
Tags
贡献者
分支图
Diff
Issue
0
列表
看板
标记
里程碑
合并请求
0
Wiki
0
Wiki
分析
仓库
DevOps
项目成员
Pages
P
Paddle
项目概览
项目概览
详情
发布
仓库
仓库
文件
提交
分支
标签
贡献者
分支图
比较
Issue
0
Issue
0
列表
看板
标记
里程碑
合并请求
0
合并请求
0
Pages
分析
分析
仓库分析
DevOps
Wiki
0
Wiki
成员
成员
收起侧边栏
关闭侧边栏
动态
分支图
创建新Issue
提交
Issue看板
未验证
提交
7ff197d3
编写于
8月 24, 2020
作者:
W
WangXi
提交者:
GitHub
8月 24, 2020
浏览文件
操作
浏览文件
下载
电子邮件补丁
差异文件
Add fleet dgc amp doc, test=document_fix (#26608)
上级
36868e84
变更
1
隐藏空白更改
内联
并排
Showing
1 changed file
with
60 addition
and
0 deletion
+60
-0
python/paddle/distributed/fleet/base/distributed_strategy.py
python/paddle/distributed/fleet/base/distributed_strategy.py
+60
-0
未找到文件。
python/paddle/distributed/fleet/base/distributed_strategy.py
浏览文件 @
7ff197d3
...
...
@@ -307,6 +307,30 @@ class DistributedStrategy(object):
@
property
def
amp_configs
(
self
):
"""
Set automatic mixed precision training configurations. In general, amp has serveral configurable
settings that can be configured through a dict.
**Notes**:
**init_loss_scaling(float)**: The initial loss scaling factor. Default 32768.
**use_dynamic_loss_scaling(bool)**: Whether to use dynamic loss scaling. Default True.
**incr_every_n_steps(int)**: Increases loss scaling every n consecutive steps with finite gradients. Default 1000.
**decr_every_n_nan_or_inf(int)**: Decreases loss scaling every n accumulated steps with nan or inf gradients. Default 2.
**incr_ratio(float)**: The multiplier to use when increasing the loss scaling. Default 2.0.
**decr_ratio(float)**: The less-than-one-multiplier to use when decreasing the loss scaling. Default 0.5.
**custom_white_list(list[str])**: Users' custom white list which always execution fp16.
**custom_black_list(list[str])**: Users' custom black list which forbidden execution fp16.
Examples:
.. code-block:: python
import paddle.distributed.fleet as fleet
strategy = fleet.DistributedStrategy()
strategy.amp = True
strategy.amp_configs = {
"init_loss_scaling": 32768,
"custom_white_list": ['conv2d']}
"""
return
get_msg_dict
(
self
.
strategy
.
amp_configs
)
@
amp_configs
.
setter
...
...
@@ -620,6 +644,20 @@ class DistributedStrategy(object):
@
property
def
dgc
(
self
):
"""
Indicating whether we are using Deep Gradient Compression training. For more details, please refer to
[Deep Gradient Compression](https://arxiv.org/abs/1712.01887).
Default Value: False
Examples:
.. code-block:: python
import paddle.distributed.fleet as fleet
strategy = fleet.DistributedStrategy()
strategy.dgc = True # by default this is false
"""
return
self
.
strategy
.
dgc
@
dgc
.
setter
...
...
@@ -631,6 +669,28 @@ class DistributedStrategy(object):
@
property
def
dgc_configs
(
self
):
"""
Set Deep Gradient Compression training configurations. In general, dgc has serveral configurable
settings that can be configured through a dict.
**Notes**:
**rampup_begin_step(int)**: The beginning step from which gradient compression is implemented. Default 0.
**rampup_step(int)**: Time steps used in sparsity warm-up periods. Default is 1.
For example, if the sparsity is [0.75, 0.9375, 0.984375, 0.996, 0.999], and the rampup_step is 100,
it will use 0.75 at 0~19 steps, and 0.9375 at 20~39 steps, and so on. And when reach sparsity array
ends, it will use 0.999 then and after.
**sparsity(list[float])**: Get top important element from gradient tensor, the ratio is (1 - sparsity).
Default is [0.999]. For example, if the sparsity is [0.99, 0.999], the top [1%, 0.1%] important
element will be transmitted.
Examples:
.. code-block:: python
import paddle.distributed.fleet as fleet
strategy = fleet.DistributedStrategy()
strategy.dgc = True
strategy.dgc_configs = {"rampup_begin_step": 1252}
"""
return
get_msg_dict
(
self
.
strategy
.
dgc_configs
)
@
dgc_configs
.
setter
...
...
编辑
预览
Markdown
is supported
0%
请重试
或
添加新附件
.
添加附件
取消
You are about to add
0
people
to the discussion. Proceed with caution.
先完成此消息的编辑!
取消
想要评论请
注册
或
登录