BaiXuePrincess / Paddle (forked from PaddlePaddle / Paddle)
Commit cc9c6196
Authored Dec 01, 2020 by 123malin; committed by GitHub on Dec 01, 2020
test=develop, fix doc (#29200)
* fix fleet api doc
Parent: c0a991c8
Showing 2 changed files with 131 additions and 76 deletions (+131, -76)
python/paddle/distributed/fleet/base/distributed_strategy.py (+61, -17)
python/paddle/distributed/fleet/base/fleet_base.py (+70, -59)
python/paddle/distributed/fleet/base/distributed_strategy.py @ cc9c6196
...
@@ -107,7 +107,7 @@ class DistributedStrategy(object):
...
@@ -107,7 +107,7 @@ class DistributedStrategy(object):
All of the distributed training configurations can be configured in DistributedStrategy,
All of the distributed training configurations can be configured in DistributedStrategy,
such as automatic mixed precision (AMP), Layer-wise Adaptive Rate Scaling (LARS),
such as automatic mixed precision (AMP), Layer-wise Adaptive Rate Scaling (LARS),
asynchronous update parameter server(ASGD), etc.
asynchronous update parameter server(ASGD), etc.
DistributedStrategy can be serialized into protobuf file or deserialized from protobuf file
DistributedStrategy can be serialized into protobuf file or deserialized from protobuf file
Users who run local training usually configure BuildStrategy and ExecutionStrategy, and
Users who run local training usually configure BuildStrategy and ExecutionStrategy, and
...
@@ -128,8 +128,9 @@ class DistributedStrategy(object):
...
@@ -128,8 +128,9 @@ class DistributedStrategy(object):
Serialize current DistributedStrategy to string and save to output file
Serialize current DistributedStrategy to string and save to output file
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
strategy = fleet.DistributedStrategy()
strategy = fleet.DistributedStrategy()
strategy.dgc = True
strategy.dgc = True
...
@@ -145,6 +146,7 @@ class DistributedStrategy(object):
...
@@ -145,6 +146,7 @@ class DistributedStrategy(object):
Load from prototxt file for DistributedStrategy initialization
Load from prototxt file for DistributedStrategy initialization
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -161,10 +163,11 @@ class DistributedStrategy(object):
...
@@ -161,10 +163,11 @@ class DistributedStrategy(object):
Configure ExecutionStrategy for DistributedStrategy
Configure ExecutionStrategy for DistributedStrategy
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle
import paddle
- exe_strategy = paddle.fluid.ExecutionStrategy()
+ exe_strategy = paddle.static.ExecutionStrategy()
exe_strategy.num_threads = 10
exe_strategy.num_threads = 10
exe_strategy.num_iteration_per_drop_scope = 10
exe_strategy.num_iteration_per_drop_scope = 10
exe_strategy.num_iteration_per_run = 10
exe_strategy.num_iteration_per_run = 10
...
@@ -195,10 +198,11 @@ class DistributedStrategy(object):
...
@@ -195,10 +198,11 @@ class DistributedStrategy(object):
only if the property is non-distributed strategy.
only if the property is non-distributed strategy.
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle
import paddle
- build_strategy = paddle.fluid.BuildStrategy()
+ build_strategy = paddle.static.BuildStrategy()
build_strategy.enable_sequential_execution = True
build_strategy.enable_sequential_execution = True
build_strategy.fuse_elewise_add_act_ops = True
build_strategy.fuse_elewise_add_act_ops = True
build_strategy.fuse_bn_act_ops = True
build_strategy.fuse_bn_act_ops = True
...
@@ -207,7 +211,7 @@ class DistributedStrategy(object):
...
@@ -207,7 +211,7 @@ class DistributedStrategy(object):
build_strategy.fuse_broadcast_ops = True
build_strategy.fuse_broadcast_ops = True
build_strategy.fuse_all_optimizer_ops = True
build_strategy.fuse_all_optimizer_ops = True
build_strategy.enable_inplace = True
build_strategy.enable_inplace = True
strategy = paddle.distributed.fleet.DistributedStrategy()
strategy = paddle.distributed.fleet.DistributedStrategy()
strategy.build_strategy = build_strategy
strategy.build_strategy = build_strategy
"""
"""
...
@@ -240,6 +244,7 @@ class DistributedStrategy(object):
...
@@ -240,6 +244,7 @@ class DistributedStrategy(object):
Default value: True
Default value: True
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -248,7 +253,7 @@ class DistributedStrategy(object):
...
@@ -248,7 +253,7 @@ class DistributedStrategy(object):
strategy = fleet.DistributedStrategy()
strategy = fleet.DistributedStrategy()
strategy.a_sync = True # by default this is True
strategy.a_sync = True # by default this is True
# code block for defining loss and local optimizer
# code block for defining loss and local optimizer
# sgd = fleet.distributed_optimizer(optimizer, strategy)
# sgd = fleet.distributed_optimizer(optimizer, strategy)
"""
"""
...
@@ -288,6 +293,7 @@ class DistributedStrategy(object):
...
@@ -288,6 +293,7 @@ class DistributedStrategy(object):
runtime_split_send_recv(bool): if we are using Tensor split for send and recv during runtime
runtime_split_send_recv(bool): if we are using Tensor split for send and recv during runtime
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -319,6 +325,7 @@ class DistributedStrategy(object):
...
@@ -319,6 +325,7 @@ class DistributedStrategy(object):
Default Value: False
Default Value: False
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -360,6 +367,7 @@ class DistributedStrategy(object):
...
@@ -360,6 +367,7 @@ class DistributedStrategy(object):
custom_black_list(list[str]): Users' custom black list which forbidden execution fp16.
custom_black_list(list[str]): Users' custom black list which forbidden execution fp16.
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -384,6 +392,7 @@ class DistributedStrategy(object):
...
@@ -384,6 +392,7 @@ class DistributedStrategy(object):
Default value: False
Default value: False
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -401,6 +410,7 @@ class DistributedStrategy(object):
...
@@ -401,6 +410,7 @@ class DistributedStrategy(object):
We note that system overhead is usually lower when sync_nccl_allreduce = True
We note that system overhead is usually lower when sync_nccl_allreduce = True
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -425,6 +435,7 @@ class DistributedStrategy(object):
...
@@ -425,6 +435,7 @@ class DistributedStrategy(object):
allreduce among the leaders of each group
allreduce among the leaders of each group
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -450,6 +461,7 @@ class DistributedStrategy(object):
...
@@ -450,6 +461,7 @@ class DistributedStrategy(object):
Default value: number of GPU cards on each single GPU machine
Default value: number of GPU cards on each single GPU machine
Example:
Example:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -472,10 +484,11 @@ class DistributedStrategy(object):
...
@@ -472,10 +484,11 @@ class DistributedStrategy(object):
def sync_batch_norm(self):
def sync_batch_norm(self):
"""
"""
Indicating whether we are using sync_batch_norm to do synchronous batch normalization among all training nodes.
Indicating whether we are using sync_batch_norm to do synchronous batch normalization among all training nodes.
Default value: False
Default value: False
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -500,6 +513,7 @@ class DistributedStrategy(object):
...
@@ -500,6 +513,7 @@ class DistributedStrategy(object):
Default value: True
Default value: True
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -524,8 +538,9 @@ class DistributedStrategy(object):
...
@@ -524,8 +538,9 @@ class DistributedStrategy(object):
Default value: 32
Default value: 32
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
strategy = fleet.DistributedStrategy()
strategy = fleet.DistributedStrategy()
strategy.fuse_grad_size_in_MB = 50
strategy.fuse_grad_size_in_MB = 50
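To make the effect of a size cap such as `fuse_grad_size_in_MB` more concrete, here is a minimal pure-Python sketch. It assumes the option limits how many megabytes of gradients are grouped into one fused communication call; the greedy packing rule and the helper name `bucket_gradients` are illustrative assumptions, not Paddle's actual implementation.

```python
def bucket_gradients(sizes_in_bytes, fuse_grad_size_in_MB=32):
    """Greedily pack consecutive gradients into buckets no larger than the cap.

    Hypothetical helper for illustration; returns lists of gradient indices.
    """
    limit = fuse_grad_size_in_MB * 1024 * 1024
    buckets, current, current_bytes = [], [], 0
    for index, size in enumerate(sizes_in_bytes):
        # Close the current bucket if adding this gradient would exceed the cap.
        if current and current_bytes + size > limit:
            buckets.append(current)
            current, current_bytes = [], 0
        current.append(index)
        current_bytes += size
    if current:
        buckets.append(current)
    return buckets


MB = 1024 * 1024
sizes = [20 * MB, 20 * MB, 20 * MB, 5 * MB]
print(bucket_gradients(sizes, fuse_grad_size_in_MB=50))
print(bucket_gradients(sizes, fuse_grad_size_in_MB=32))
```

Under this assumption, a larger cap yields fewer, bigger communication calls, which is why the value is usually tuned against model size and network bandwidth.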
...
@@ -562,8 +577,9 @@ class DistributedStrategy(object):
...
@@ -562,8 +577,9 @@ class DistributedStrategy(object):
Default value: 1
Default value: 1
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
strategy = fleet.DistributedStrategy()
strategy = fleet.DistributedStrategy()
strategy.nccl_comm_num = 2
strategy.nccl_comm_num = 2
...
@@ -594,8 +610,9 @@ class DistributedStrategy(object):
...
@@ -594,8 +610,9 @@ class DistributedStrategy(object):
implementation should have some manually assign checkpoints
implementation should have some manually assign checkpoints
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
strategy = fleet.DistributedStrategy()
strategy = fleet.DistributedStrategy()
strategy.recompute = True
strategy.recompute = True
...
@@ -622,8 +639,9 @@ class DistributedStrategy(object):
...
@@ -622,8 +639,9 @@ class DistributedStrategy(object):
Default value: False
Default value: False
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.fleet as fleet
import paddle.fleet as fleet
strategy = fleet.DistributedStrategy()
strategy = fleet.DistributedStrategy()
strategy.sharding = True
strategy.sharding = True
...
@@ -649,8 +667,9 @@ class DistributedStrategy(object):
...
@@ -649,8 +667,9 @@ class DistributedStrategy(object):
and should be an empirical value decided by your model size and network topology.
and should be an empirical value decided by your model size and network topology.
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
strategy = fleet.DistributedStrategy()
strategy = fleet.DistributedStrategy()
strategy.sharding = True
strategy.sharding = True
...
@@ -674,8 +693,9 @@ class DistributedStrategy(object):
...
@@ -674,8 +693,9 @@ class DistributedStrategy(object):
device_guard information in user-defined program.
device_guard information in user-defined program.
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
strategy = fleet.DistributedStrategy()
strategy = fleet.DistributedStrategy()
strategy.pipeline = True
strategy.pipeline = True
...
@@ -709,8 +729,9 @@ class DistributedStrategy(object):
...
@@ -709,8 +729,9 @@ class DistributedStrategy(object):
**micro_batch**: the number of small batches in each user defined batch
**micro_batch**: the number of small batches in each user defined batch
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
strategy = fleet.DistributedStrategy()
strategy = fleet.DistributedStrategy()
strategy.pipeline = True
strategy.pipeline = True
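The docstring describes `micro_batch` as the number of small batches in each user-defined batch. The sketch below shows that relationship on a plain Python list; `split_into_micro_batches` is a hypothetical helper, and requiring the batch size to divide evenly is an assumption made here for simplicity.

```python
def split_into_micro_batches(batch, micro_batch):
    """Split one user-defined batch into `micro_batch` equally sized pieces."""
    if micro_batch <= 0 or len(batch) % micro_batch != 0:
        raise ValueError("batch size must be a positive multiple of micro_batch")
    size = len(batch) // micro_batch
    return [batch[i * size:(i + 1) * size] for i in range(micro_batch)]


# A batch of 8 samples with micro_batch = 4 gives 4 micro-batches of 2 samples.
print(split_into_micro_batches(list(range(8)), 4))
```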
...
@@ -736,6 +757,7 @@ class DistributedStrategy(object):
...
@@ -736,6 +757,7 @@ class DistributedStrategy(object):
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -764,6 +786,7 @@ class DistributedStrategy(object):
...
@@ -764,6 +786,7 @@ class DistributedStrategy(object):
begin_step(int) The step of begining training by localsgd. Default 1.
begin_step(int) The step of begining training by localsgd. Default 1.
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -791,6 +814,7 @@ class DistributedStrategy(object):
...
@@ -791,6 +814,7 @@ class DistributedStrategy(object):
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -821,6 +845,7 @@ class DistributedStrategy(object):
...
@@ -821,6 +845,7 @@ class DistributedStrategy(object):
begin_step(int) The step of begining training by adaptive localsgd. Default 1.
begin_step(int) The step of begining training by adaptive localsgd. Default 1.
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -848,6 +873,7 @@ class DistributedStrategy(object):
...
@@ -848,6 +873,7 @@ class DistributedStrategy(object):
Default Value: False
Default Value: False
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -884,6 +910,7 @@ class DistributedStrategy(object):
...
@@ -884,6 +910,7 @@ class DistributedStrategy(object):
element will be transmitted.
element will be transmitted.
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -906,6 +933,7 @@ class DistributedStrategy(object):
...
@@ -906,6 +933,7 @@ class DistributedStrategy(object):
Default Value: False
Default Value: False
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -935,6 +963,7 @@ class DistributedStrategy(object):
...
@@ -935,6 +963,7 @@ class DistributedStrategy(object):
to model parameters.
to model parameters.
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -963,6 +992,7 @@ class DistributedStrategy(object):
...
@@ -963,6 +992,7 @@ class DistributedStrategy(object):
avg(bool): whether to average the gradients of each mini-batch, the default value is `True`
avg(bool): whether to average the gradients of each mini-batch, the default value is `True`
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
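The `avg` option can be illustrated with a small accumulator. This is a sketch on scalar gradients, assuming gradients are merged every `k_steps` mini-batches (`k_steps` is the companion option in Paddle's `gradient_merge_configs` and is not shown in this excerpt); the class name `GradientMerger` is hypothetical.

```python
class GradientMerger:
    """Accumulate gradients and release a merged value every k_steps calls."""

    def __init__(self, k_steps, avg=True):
        if k_steps < 1:
            raise ValueError("k_steps must be at least 1")
        self.k_steps = k_steps
        self.avg = avg
        self._sum = 0.0
        self._count = 0

    def step(self, grad):
        """Add one mini-batch gradient; return the merged gradient or None."""
        self._sum += grad
        self._count += 1
        if self._count < self.k_steps:
            return None  # still accumulating, no parameter update yet
        merged = self._sum / self.k_steps if self.avg else self._sum
        self._sum, self._count = 0.0, 0
        return merged


merger = GradientMerger(k_steps=2, avg=True)
print(merger.step(1.0))  # None: first of two mini-batches
print(merger.step(3.0))  # averaged gradient of the two mini-batches
```

With `avg=False` the same two calls would release the sum of the two gradients instead of their mean.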
...
@@ -989,6 +1019,7 @@ class DistributedStrategy(object):
...
@@ -989,6 +1019,7 @@ class DistributedStrategy(object):
Default Value: False
Default Value: False
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -1019,6 +1050,7 @@ class DistributedStrategy(object):
...
@@ -1019,6 +1050,7 @@ class DistributedStrategy(object):
will be exclude from weight decay in lars formula.
will be exclude from weight decay in lars formula.
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -1048,8 +1080,9 @@ class DistributedStrategy(object):
...
@@ -1048,8 +1080,9 @@ class DistributedStrategy(object):
[Large Batch Optimization for Deep Learning: Training BERT in 76 minutes](https://arxiv.org/abs/1904.00962).
[Large Batch Optimization for Deep Learning: Training BERT in 76 minutes](https://arxiv.org/abs/1904.00962).
Default Value: False
Default Value: False
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -1078,6 +1111,7 @@ class DistributedStrategy(object):
...
@@ -1078,6 +1111,7 @@ class DistributedStrategy(object):
will be exclude from weight decay in lamb formula.
will be exclude from weight decay in lamb formula.
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -1123,11 +1157,12 @@ class DistributedStrategy(object):
...
@@ -1123,11 +1157,12 @@ class DistributedStrategy(object):
Default Value: False
Default Value: False
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle
import paddle
import paddle.distributed.fleet as fleet
paddle.enable_static()
paddle.enable_static()
import paddle.distributed.fleet as fleet
strategy = fleet.DistributedStrategy()
strategy = fleet.DistributedStrategy()
strategy.auto = True
strategy.auto = True
...
@@ -1156,8 +1191,11 @@ class DistributedStrategy(object):
...
@@ -1156,8 +1191,11 @@ class DistributedStrategy(object):
Default Value: True
Default Value: True
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle
paddle.enable_static()
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
strategy = fleet.DistributedStrategy()
strategy = fleet.DistributedStrategy()
strategy.cudnn_exhaustive_search = False
strategy.cudnn_exhaustive_search = False
...
@@ -1187,15 +1225,18 @@ class DistributedStrategy(object):
...
@@ -1187,15 +1225,18 @@ class DistributedStrategy(object):
Default Value: 4000
Default Value: 4000
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle
paddle.enable_static()
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
strategy = fleet.DistributedStrategy()
strategy = fleet.DistributedStrategy()
strategy.conv_workspace_size_limit = 1024
strategy.conv_workspace_size_limit = 1024
optimizer = paddle.optimizer.SGD(learning_rate=0.01)
optimizer = paddle.optimizer.SGD(learning_rate=0.01)
optimizer = fleet.distributed_optimizer(optimizer, strategy)
optimizer = fleet.distributed_optimizer(optimizer, strategy)
"""
"""
return self.strategy.conv_workspace_size_limit
return self.strategy.conv_workspace_size_limit
...
@@ -1217,8 +1258,11 @@ class DistributedStrategy(object):
...
@@ -1217,8 +1258,11 @@ class DistributedStrategy(object):
Default Value: True
Default Value: True
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle
paddle.enable_static()
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
strategy = fleet.DistributedStrategy()
strategy = fleet.DistributedStrategy()
strategy.cudnn_batchnorm_spatial_persistent = True
strategy.cudnn_batchnorm_spatial_persistent = True
...
...
python/paddle/distributed/fleet/base/fleet_base.py @ cc9c6196
...
@@ -69,8 +69,11 @@ class Fleet(object):
...
@@ -69,8 +69,11 @@ class Fleet(object):
Fleet: A Fleet instance
Fleet: A Fleet instance
Example for collective training:
Example for collective training:
.. code-block:: python
.. code-block:: python
import paddle
paddle.enable_static()
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
fleet.init(is_collective=True)
fleet.init(is_collective=True)
...
@@ -86,6 +89,8 @@ class Fleet(object):
...
@@ -86,6 +89,8 @@ class Fleet(object):
.. code-block:: python
.. code-block:: python
import paddle
paddle.enable_static()
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
fleet.init()
fleet.init()
...
@@ -159,7 +164,7 @@ class Fleet(object):
...
@@ -159,7 +164,7 @@ class Fleet(object):
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
- role = fleet.PaddleCloudRoleMaker
+ role = fleet.PaddleCloudRoleMaker()
fleet.init(role)
fleet.init(role)
"""
"""
...
@@ -233,6 +238,7 @@ class Fleet(object):
...
@@ -233,6 +238,7 @@ class Fleet(object):
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
fleet.init()
fleet.init()
fleet.worker_index()
fleet.worker_index()
...
@@ -246,8 +252,9 @@ class Fleet(object):
...
@@ -246,8 +252,9 @@ class Fleet(object):
Returns:
Returns:
int: worker numbers
int: worker numbers
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
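A common use of the worker index and worker count is to give each worker a disjoint slice of the input files. The following is a minimal sketch with plain integers; in a real job the two values would come from `fleet.worker_index()` and `fleet.worker_num()`, and `files_for_worker` is a hypothetical helper, not a Fleet API.

```python
def files_for_worker(files, worker_index, worker_num):
    """Return the round-robin share of `files` owned by one worker."""
    if not 0 <= worker_index < worker_num:
        raise ValueError("worker_index must be in [0, worker_num)")
    return files[worker_index::worker_num]


files = ["part-0", "part-1", "part-2", "part-3", "part-4"]
for index in range(2):
    print(index, files_for_worker(files, index, 2))
```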
...
@@ -266,6 +273,7 @@ class Fleet(object):
...
@@ -266,6 +273,7 @@ class Fleet(object):
False if not.
False if not.
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -283,6 +291,7 @@ class Fleet(object):
...
@@ -283,6 +291,7 @@ class Fleet(object):
list/string: server endpoints
list/string: server endpoints
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -303,10 +312,12 @@ class Fleet(object):
...
@@ -303,10 +312,12 @@ class Fleet(object):
int: server number
int: server number
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
fleet.init()
import paddle.distributed.fleet as fleet
fleet.server_num()
fleet.init()
fleet.server_num()
"""
"""
return len(self._role_maker._get_pserver_endpoints())
return len(self._role_maker._get_pserver_endpoints())
...
@@ -318,6 +329,7 @@ class Fleet(object):
...
@@ -318,6 +329,7 @@ class Fleet(object):
int: node id
int: node id
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -335,6 +347,7 @@ class Fleet(object):
...
@@ -335,6 +347,7 @@ class Fleet(object):
list/string: server endpoints
list/string: server endpoints
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
...
@@ -359,6 +372,7 @@ class Fleet(object):
...
@@ -359,6 +372,7 @@ class Fleet(object):
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
fleet.init()
fleet.init()
fleet.is_server()
fleet.is_server()
...
@@ -510,21 +524,21 @@ class Fleet(object):
...
@@ -510,21 +524,21 @@ class Fleet(object):
def save_persistables(self, executor, dirname, main_program=None, mode=1):
def save_persistables(self, executor, dirname, main_program=None, mode=1):
"""
"""
- saves all persistable variables from :code:`main_program` to
+ saves all persistable tensors from :code:`main_program` to
the folder :code:`dirname`. You can refer to
the folder :code:`dirname`. You can refer to
- The :code:`dirname` is used to specify the folder where persistable variables
- are going to be saved. If you would like to save variables in separate
+ The :code:`dirname` is used to specify the folder where persistable tensors
+ are going to be saved. If you would like to save tensors in separate
files, set :code:`filename` None.
files, set :code:`filename` None.
Args:
Args:
- executor(Executor): The executor to run for saving persistable variables.
+ executor(Executor): The executor to run for saving persistable tensors.
You can refer to :ref:`api_guide_executor_en` for
You can refer to :ref:`api_guide_executor_en` for
more details.
more details.
dirname(str, optional): The saving directory path.
dirname(str, optional): The saving directory path.
When you need to save the parameter to the memory, set it to None.
When you need to save the parameter to the memory, set it to None.
- main_program(Program, optional): The program whose persistbale variables will
+ main_program(Program, optional): The program whose persistbale tensors will
be saved. Default: None.
be saved. Default: None.
...
@@ -535,16 +549,17 @@ class Fleet(object):
...
@@ -535,16 +549,17 @@ class Fleet(object):
.. code-block:: text
.. code-block:: text
import paddle
paddle.enable_static()
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
import paddle.fluid as fluid
fleet.init()
fleet.init()
# build net
# build net
# fleet.distributed_optimizer(...)
# fleet.distributed_optimizer(...)
- exe = fluid.Executor(fluid.CPUPlace())
- fleet.save_persistables(exe, "dirname", fluid.default_main_program())
+ exe = paddle.static.Executor(paddle.CPUPlace())
+ fleet.save_persistables(exe, "dirname", paddle.static.default_main_program())
"""
"""
...
@@ -569,9 +584,9 @@ class Fleet(object):
...
@@ -569,9 +584,9 @@ class Fleet(object):
.. code-block:: python
.. code-block:: python
import paddle
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
role = fleet.role_maker.PaddleCloudRoleMaker(is_collective=True)
fleet.init(is_collective=True)
fleet.init(role)
strategy = fleet.DistributedStrategy()
strategy = fleet.DistributedStrategy()
optimizer = paddle.optimizer.SGD(learning_rate=0.001)
optimizer = paddle.optimizer.SGD(learning_rate=0.001)
optimizer = fleet.distributed_optimizer(optimizer, strategy=strategy)
optimizer = fleet.distributed_optimizer(optimizer, strategy=strategy)
...
@@ -621,23 +636,20 @@ class Fleet(object):
...
@@ -621,23 +636,20 @@ class Fleet(object):
def forward(self, x):
def forward(self, x):
return self._linear2(self._linear1(x))
return self._linear2(self._linear1(x))
- # 1. enable dynamic mode
- paddle.disable_static()
- # 2. initialize fleet environment
+ # 1. initialize fleet environment
fleet.init(is_collective=True)
fleet.init(is_collective=True)
- # 3. create layer & optimizer
+ # 2. create layer & optimizer
layer = LinearNet()
layer = LinearNet()
loss_fn = nn.MSELoss()
loss_fn = nn.MSELoss()
adam = paddle.optimizer.Adam(
adam = paddle.optimizer.Adam(
learning_rate=0.001, parameters=layer.parameters())
learning_rate=0.001, parameters=layer.parameters())
- # 4. get data_parallel model using fleet
+ # 3. get data_parallel model using fleet
adam = fleet.distributed_optimizer(adam)
adam = fleet.distributed_optimizer(adam)
dp_layer = fleet.distributed_model(layer)
dp_layer = fleet.distributed_model(layer)
- # 5. run layer
+ # 4. run layer
inputs = paddle.randn([10, 10], 'float32')
inputs = paddle.randn([10, 10], 'float32')
outputs = dp_layer(inputs)
outputs = dp_layer(inputs)
labels = paddle.randn([10, 1], 'float32')
labels = paddle.randn([10, 1], 'float32')
...
@@ -675,11 +687,10 @@ class Fleet(object):
...
@@ -675,11 +687,10 @@ class Fleet(object):
import paddle
import paddle
from paddle.distributed import fleet
from paddle.distributed import fleet
paddle.disable_static()
fleet.init(is_collective=True)
fleet.init(is_collective=True)
value = np.arange(26).reshape(2, 13).astype("float32")
value = np.arange(26).reshape(2, 13).astype("float32")
- a = paddle.fluid.dygraph.to_variable(value)
+ a = paddle.to_tensor(value)
layer = paddle.nn.Linear(13, 5)
layer = paddle.nn.Linear(13, 5)
adam = paddle.optimizer.Adam(learning_rate=0.01, parameters=layer.parameters())
adam = paddle.optimizer.Adam(learning_rate=0.01, parameters=layer.parameters())
...
@@ -710,11 +721,10 @@ class Fleet(object):
...
@@ -710,11 +721,10 @@ class Fleet(object):
import paddle
import paddle
from paddle.distributed import fleet
from paddle.distributed import fleet
paddle.disable_static()
fleet.init(is_collective=True)
fleet.init(is_collective=True)
value = np.arange(26).reshape(2, 13).astype("float32")
value = np.arange(26).reshape(2, 13).astype("float32")
- a = paddle.fluid.dygraph.to_variable(value)
+ a = paddle.to_tensor(value)
layer = paddle.nn.Linear(13, 5)
layer = paddle.nn.Linear(13, 5)
adam = paddle.optimizer.Adam(learning_rate=0.01, parameters=layer.parameters())
adam = paddle.optimizer.Adam(learning_rate=0.01, parameters=layer.parameters())
...
@@ -722,9 +732,9 @@ class Fleet(object):
...
@@ -722,9 +732,9 @@ class Fleet(object):
adam = fleet.distributed_optimizer(adam)
adam = fleet.distributed_optimizer(adam)
dp_layer = fleet.distributed_model(layer)
dp_layer = fleet.distributed_model(layer)
state_dict = adam.state_dict()
state_dict = adam.state_dict()
- paddle.framework.save(state_dict, "paddle_dy")
+ paddle.save(state_dict, "paddle_dy")
- para_state_dict, opti_state_dict = paddle.framework.load("paddle_dy")
+ para_state_dict = paddle.load("paddle_dy")
- adam.set_state_dict(opti_state_dict)
+ adam.set_state_dict(para_state_dict)
"""
"""
# imitate target optimizer retrieval
# imitate target optimizer retrieval
return self.user_defined_optimizer.set_state_dict(state_dict)
return self.user_defined_optimizer.set_state_dict(state_dict)
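The API renames that recur in these hunks can be collected into a small rewriting helper. The sketch below is an illustration only: the `RENAMES` table and the `modernize` function are hypothetical names, not part of Paddle, and the table lists only the fully qualified names that appear in this diff.

```python
import re

# Old names (left) and the names this commit switches to (right).
RENAMES = {
    "paddle.fluid.ExecutionStrategy": "paddle.static.ExecutionStrategy",
    "paddle.fluid.BuildStrategy": "paddle.static.BuildStrategy",
    "paddle.fluid.dygraph.to_variable": "paddle.to_tensor",
    "paddle.framework.save": "paddle.save",
    "paddle.framework.load": "paddle.load",
}

# Try longer names first so a short name never matches inside a longer one.
_PATTERN = re.compile(
    "|".join(re.escape(old) for old in sorted(RENAMES, key=len, reverse=True))
)


def modernize(source):
    """Rewrite every old-style name found in `source` to its replacement."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(0)], source)


print(modernize("a = paddle.fluid.dygraph.to_variable(value)"))
print(modernize('paddle.framework.save(state_dict, "paddle_dy")'))
```

A plain text substitution like this does not understand aliases such as `import paddle.fluid as fluid`, so the output still needs a manual review.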
...
@@ -748,11 +758,10 @@ class Fleet(object):
...
@@ -748,11 +758,10 @@ class Fleet(object):
import paddle
import paddle
from paddle.distributed import fleet
from paddle.distributed import fleet
paddle.disable_static()
fleet.init(is_collective=True)
fleet.init(is_collective=True)
value = np.arange(26).reshape(2, 13).astype("float32")
value = np.arange(26).reshape(2, 13).astype("float32")
- a = paddle.fluid.dygraph.to_variable(value)
+ a = paddle.to_tensor(value)
layer = paddle.nn.Linear(13, 5)
layer = paddle.nn.Linear(13, 5)
adam = paddle.optimizer.Adam(learning_rate=0.01, parameters=layer.parameters())
adam = paddle.optimizer.Adam(learning_rate=0.01, parameters=layer.parameters())
...
@@ -785,17 +794,17 @@ class Fleet(object):
...
@@ -785,17 +794,17 @@ class Fleet(object):
float: The learning rate of the current step.
float: The learning rate of the current step.
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import numpy as np
import numpy as np
import paddle
import paddle
from paddle.distributed import fleet
from paddle.distributed import fleet
paddle.disable_static()
fleet.init(is_collective=True)
fleet.init(is_collective=True)
value = np.arange(26).reshape(2, 13).astype("float32")
value = np.arange(26).reshape(2, 13).astype("float32")
- a = paddle.fluid.dygraph.to_variable(value)
+ a = paddle.to_tensor(value)
layer = paddle.nn.Linear(13, 5)
layer = paddle.nn.Linear(13, 5)
adam = paddle.optimizer.Adam(learning_rate=0.01, parameters=layer.parameters())
adam = paddle.optimizer.Adam(learning_rate=0.01, parameters=layer.parameters())
...
@@ -819,6 +828,7 @@ class Fleet(object):
...
@@ -819,6 +828,7 @@ class Fleet(object):
None
None
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle
import paddle
...
@@ -834,23 +844,20 @@ class Fleet(object):
...
@@ -834,23 +844,20 @@ class Fleet(object):
def forward(self, x):
def forward(self, x):
return self._linear2(self._linear1(x))
return self._linear2(self._linear1(x))
- # 1. enable dynamic mode
- paddle.disable_static()
- # 2. initialize fleet environment
+ # 1. initialize fleet environment
fleet.init(is_collective=True)
fleet.init(is_collective=True)
- # 3. create layer & optimizer
+ # 2. create layer & optimizer
layer = LinearNet()
layer = LinearNet()
loss_fn = nn.MSELoss()
loss_fn = nn.MSELoss()
adam = paddle.optimizer.Adam(
adam = paddle.optimizer.Adam(
learning_rate=0.001, parameters=layer.parameters())
learning_rate=0.001, parameters=layer.parameters())
# 4. get data_parallel model using fleet
# 3. get data_parallel model using fleet
adam = fleet.distributed_optimizer(adam)
adam = fleet.distributed_optimizer(adam)
dp_layer = fleet.distributed_model(layer)
dp_layer = fleet.distributed_model(layer)
# 5. run layer
# 4. run layer
inputs = paddle.randn([10, 10], 'float32')
inputs = paddle.randn([10, 10], 'float32')
outputs = dp_layer(inputs)
outputs = dp_layer(inputs)
labels = paddle.randn([10, 1], 'float32')
labels = paddle.randn([10, 1], 'float32')
...
@@ -878,6 +885,7 @@ class Fleet(object):
None
None
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle
import paddle
...
@@ -893,23 +901,20 @@ class Fleet(object):
def forward(self, x):
def forward(self, x):
return self._linear2(self._linear1(x))
return self._linear2(self._linear1(x))
# 1. enable dynamic mode
# 1. initialize fleet environment
paddle.disable_static()
# 2. initialize fleet environment
fleet.init(is_collective=True)
fleet.init(is_collective=True)
# 3. create layer & optimizer
# 2. create layer & optimizer
layer = LinearNet()
layer = LinearNet()
loss_fn = nn.MSELoss()
loss_fn = nn.MSELoss()
adam = paddle.optimizer.Adam(
adam = paddle.optimizer.Adam(
learning_rate=0.001, parameters=layer.parameters())
learning_rate=0.001, parameters=layer.parameters())
# 4. get data_parallel model using fleet
# 3. get data_parallel model using fleet
adam = fleet.distributed_optimizer(adam)
adam = fleet.distributed_optimizer(adam)
dp_layer = fleet.distributed_model(layer)
dp_layer = fleet.distributed_model(layer)
# 5. run layer
# 4. run layer
inputs = paddle.randn([10, 10], 'float32')
inputs = paddle.randn([10, 10], 'float32')
outputs = dp_layer(inputs)
outputs = dp_layer(inputs)
labels = paddle.randn([10, 1], 'float32')
labels = paddle.randn([10, 1], 'float32')
...
@@ -962,38 +967,44 @@ class Fleet(object):
Add distributed operations to minimize ``loss`` by updating ``parameter_list``.
Add distributed operations to minimize ``loss`` by updating ``parameter_list``.
Args:
Args:
loss (Variable): A ``Variable`` containing the value to minimize.
loss (Tensor): A ``Tensor`` containing the value to minimize.
startup_program (Program, optional): :ref:`api_fluid_Program` for
startup_program (Program, optional): :ref:`api_fluid_Program` for
initializing parameters in ``parameter_list``. The default value
initializing parameters in ``parameter_list``. The default value
is None, at this time :ref:`api_fluid_default_startup_program` will be used.
is None, at this time :ref:`api_fluid_default_startup_program` will be used.
parameter_list (Iterable, optional): Iterable of ``Variable`` or ``Variable.name`` to update
parameter_list (Iterable, optional): Iterable of ``Tensor`` or ``Tensor.name`` to update
to minimize ``loss``. The default value is None, at this time all parameters
to minimize ``loss``. The default value is None, at this time all parameters
will be updated.
will be updated.
no_grad_set (set, optional): Set of ``Variable`` or ``Variable.name`` that don't need
no_grad_set (set, optional): Set of ``Tensor`` or ``Tensor.name`` that don't need
to be updated. The default value is None.
to be updated. The default value is None.
Returns:
Returns:
tuple: tuple (optimize_ops, params_grads), A list of operators appended
tuple: tuple (optimize_ops, params_grads), A list of operators appended
by minimize and a list of (param, grad) variable pairs, param is
by minimize and a list of (param, grad) tensor pairs, param is
``Parameter``, grad is the gradient value corresponding to the parameter.
``Parameter``, grad is the gradient value corresponding to the parameter.
The returned tuple can be passed to ``fetch_list`` in ``Executor.run()`` to
The returned tuple can be passed to ``fetch_list`` in ``Executor.run()`` to
indicate program pruning. If so, the program will be pruned by ``feed`` and
indicate program pruning. If so, the program will be pruned by ``feed`` and
``fetch_list`` before run, see details in ``Executor``.
``fetch_list`` before run, see details in ``Executor``.
Examples:
Examples:
.. code-block:: python
.. code-block:: python
import paddle
import paddle
paddle.enable_static()
import paddle.distributed.fleet as fleet
import paddle.distributed.fleet as fleet
import paddle.nn.functional as F
hid_dim = 10
label_dim = 2
input_x = paddle.static.data(name='x', shape=[None, 13], dtype='float32')
input_y = paddle.static.data(name='y', shape=[None, 1], dtype='int64')
fc_1 = paddle.static.nn.fc(x=input_x, size=hid_dim, activation='tanh')
fc_2 = paddle.static.nn.fc(x=fc_1, size=hid_dim, activation='tanh')
prediction = paddle.static.nn.fc(x=[fc_2], size=label_dim, activation='softmax')
cost = F.cross_entropy(input=prediction, label=input_y)
avg_cost = paddle.mean(x=cost)
fc_1 = paddle.fluid.layers.fc(input=input_x, size=hid_dim, act='tanh')
fleet.init(is_collective=True)
fc_2 = paddle.fluid.layers.fc(input=fc_1, size=hid_dim, act='tanh')
prediction = paddle.fluid.layers.fc(input=[fc_2], size=label_dim, act='softmax')
cost = paddle.fluid.layers.cross_entropy(input=prediction, label=input_y)
avg_cost = paddle.fluid.layers.mean(x=cost)
role = fleet.role_maker.PaddleCloudRoleMaker(is_collective=True)
fleet.init(role)
strategy = fleet.DistributedStrategy()
strategy = fleet.DistributedStrategy()
optimizer = paddle.optimizer.SGD(learning_rate=0.001)
optimizer = paddle.optimizer.SGD(learning_rate=0.001)
optimizer = fleet.distributed_optimizer(optimizer, strategy=strategy)
optimizer = fleet.distributed_optimizer(optimizer, strategy=strategy)
...