Unverified · Commit 199da968 · Authored by: Chen Weihang · Committed by: GitHub

Polish api Program/CompiledProgram/ParallelEnv doc & code example (#27656)

* polish Program api doc & example

* polish CompiledProgram api doc & example

* polish ParallelEnv api doc & examples

* polish details, test=document_fix

* polish program doc details, test=document_fix

* polish details, test=document_fix

* fix note format error, test=document_fix

* add lost example, test=document_fix

* fix lost example, test=document_fix
Parent commit: b14ecb86
@@ -93,16 +93,16 @@ class CompiledProgram(object):
         for example, the operators' fusion in the computation graph, memory
         optimization during the execution of the computation graph, etc.
         For more information about build_strategy, please refer to
-        :code:`fluid.BuildStrategy`.
+        :code:`paddle.static.BuildStrategy`.

     Args:
-        program_or_graph (Graph|Program): This parameter is the Program or Graph
+        program_or_graph (Graph|Program): This argument is the Program or Graph
             being executed.
-        build_strategy(BuildStrategy): This parameter is used to compile the
+        build_strategy(BuildStrategy): This argument is used to compile the
            program or graph with the specified options, such as operators' fusion
            in the computational graph and memory optimization during the execution
            of the computational graph. For more information about build_strategy,
-           please refer to :code:`fluid.BuildStrategy`. The default is None.
+           please refer to :code:`paddle.static.BuildStrategy`. The default is None.

     Returns:
         CompiledProgram
@@ -110,25 +110,28 @@ class CompiledProgram(object):
     Example:
         .. code-block:: python

-            import paddle.fluid as fluid
-            import numpy
-
-            place = fluid.CUDAPlace(0) # fluid.CPUPlace()
-            exe = fluid.Executor(place)
-
-            data = fluid.data(name='X', shape=[None, 1], dtype='float32')
-            hidden = fluid.layers.fc(input=data, size=10)
-            loss = fluid.layers.mean(hidden)
-            fluid.optimizer.SGD(learning_rate=0.01).minimize(loss)
-
-            exe.run(fluid.default_startup_program())
-            compiled_prog = fluid.CompiledProgram(
-                fluid.default_main_program())
+            import numpy
+            import paddle
+            import paddle.static as static
+
+            paddle.enable_static()
+
+            place = paddle.CUDAPlace(0) # paddle.CPUPlace()
+            exe = static.Executor(place)
+
+            data = static.data(name='X', shape=[None, 1], dtype='float32')
+            hidden = static.nn.fc(input=data, size=10)
+            loss = paddle.mean(hidden)
+            paddle.optimizer.SGD(learning_rate=0.01).minimize(loss)
+
+            exe.run(static.default_startup_program())
+            compiled_prog = static.CompiledProgram(
+                static.default_main_program())

             x = numpy.random.random(size=(10, 1)).astype('float32')
             loss_data, = exe.run(compiled_prog,
                                  feed={"X": x},
                                  fetch_list=[loss.name])
     """

     def __init__(self, program_or_graph, build_strategy=None):
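For reference, the build_strategy argument documented in this hunk can be passed directly to the constructor. A minimal sketch, assuming the paddle.static API used in the new example; the fuse_elewise_add_act_ops flag is only an illustrative choice, not something prescribed by this commit:

import paddle
import paddle.static as static

paddle.enable_static()

# Configure graph-level optimizations; any BuildStrategy field can be set here.
build_strategy = static.BuildStrategy()
build_strategy.fuse_elewise_add_act_ops = True  # illustrative option

# Compile the default main program with the chosen strategy.
compiled_prog = static.CompiledProgram(
    static.default_main_program(), build_strategy=build_strategy)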
@@ -169,13 +172,16 @@ class CompiledProgram(object):
         exec_strategy to set some optimizations that can be applied during the construction
         and computation of the Graph, such as reducing the number of AllReduce operations,
         specifying the size of the thread pool used in the computation Graph running the model,
-        and so on. **Note: If build_strategy is specified when building CompiledProgram and calling
-        with_data_parallel, build_strategy in CompiledProgram will be overwritten, therefore,
-        if it is data parallel training, it is recommended to set build_strategy when calling
-        with_data_parallel interface.**
+        and so on.
+
+        .. note::
+            If build_strategy is specified when building CompiledProgram and calling
+            with_data_parallel, build_strategy in CompiledProgram will be overwritten, therefore,
+            if it is data parallel training, it is recommended to set build_strategy when calling
+            with_data_parallel interface.

         Args:
-            loss_name (str): This parameter is the name of the loss variable of the model.
+            loss_name (str): This parameter is the name of the loss Tensor of the model.
                 **Note: If it is model training, you must set loss_name, otherwise the
                 result may be problematic**. The default is None.
             build_strategy(BuildStrategy): This parameter is used to compile the
@@ -192,7 +198,7 @@ class CompiledProgram(object):
                 specified by share_vars_from. This parameter needs to be set when model testing
                 is required during model training, and the data parallel mode is used for
                 training and testing. Since CompiledProgram will only distribute parameter
-                variables to other devices when it is first executed, the CompiledProgram
+                Tensors to other devices when it is first executed, the CompiledProgram
                 specified by share_vars_from must be run before the current CompiledProgram.
                 The default is None.
             places(list(CUDAPlace)|list(CPUPlace)|None): This parameter specifies the device
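The note introduced above recommends passing build_strategy through with_data_parallel rather than through the CompiledProgram constructor, since the constructor's strategy is overwritten. A minimal sketch under that assumption; the fuse_all_reduce_ops flag is only an illustrative choice:

import paddle
import paddle.static as static

paddle.enable_static()

data = static.data(name='X', shape=[None, 1], dtype='float32')
loss = paddle.mean(static.nn.fc(input=data, size=10))

build_strategy = static.BuildStrategy()
build_strategy.fuse_all_reduce_ops = True  # illustrative option for data parallel training

# Pass the strategy here; a strategy given to CompiledProgram() would be
# overwritten by this call, as the note explains.
compiled_prog = static.CompiledProgram(
    static.default_main_program()).with_data_parallel(
        loss_name=loss.name, build_strategy=build_strategy)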
@@ -214,50 +220,53 @@ class CompiledProgram(object):
         Example:
             .. code-block:: python

-                import paddle.fluid as fluid
-                import numpy
-                import os
-
-                use_cuda = True
-                place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
-                parallel_places = [fluid.CUDAPlace(0), fluid.CUDAPlace(1)] if use_cuda else [fluid.CPUPlace()] * 2
-
-                # NOTE: If you use CPU to run the program, you need
-                # to specify the CPU_NUM, otherwise, fluid will use
-                # all the number of the logic core as the CPU_NUM,
-                # in that case, the batch size of the input should be
-                # greater than CPU_NUM, if not, the process will be
-                # failed by an exception.
-                if not use_cuda:
-                    os.environ['CPU_NUM'] = str(2)
-
-                exe = fluid.Executor(place)
-
-                data = fluid.data(name='X', shape=[None, 1], dtype='float32')
-                hidden = fluid.layers.fc(input=data, size=10)
-                loss = fluid.layers.mean(hidden)
-
-                test_program = fluid.default_main_program().clone(for_test=True)
-                fluid.optimizer.SGD(learning_rate=0.01).minimize(loss)
-
-                exe.run(fluid.default_startup_program())
-                compiled_train_prog = fluid.CompiledProgram(
-                    fluid.default_main_program()).with_data_parallel(
-                        loss_name=loss.name, places=parallel_places)
-                # NOTE: if not set share_vars_from=compiled_train_prog,
-                # the parameters used in test process are different with
-                # the parameters used by train process
-                compiled_test_prog = fluid.CompiledProgram(
-                    test_program).with_data_parallel(
-                        share_vars_from=compiled_train_prog,
-                        places=parallel_places)
-
-                train_data = numpy.random.random(size=(10, 1)).astype('float32')
-                loss_data, = exe.run(compiled_train_prog,
+                import numpy
+                import os
+                import paddle
+                import paddle.static as static
+
+                paddle.enable_static()
+
+                use_cuda = True
+                place = paddle.CUDAPlace(0) if use_cuda else paddle.CPUPlace()
+                parallel_places = [paddle.CUDAPlace(0), paddle.CUDAPlace(1)] if use_cuda else [paddle.CPUPlace()] * 2
+
+                # NOTE: If you use CPU to run the program, you need
+                # to specify the CPU_NUM, otherwise, paddle will use
+                # all the number of the logic core as the CPU_NUM,
+                # in that case, the batch size of the input should be
+                # greater than CPU_NUM, if not, the process will be
+                # failed by an exception.
+                if not use_cuda:
+                    os.environ['CPU_NUM'] = str(2)
+
+                exe = static.Executor(place)
+
+                data = static.data(name='X', shape=[None, 1], dtype='float32')
+                hidden = static.nn.fc(input=data, size=10)
+                loss = paddle.mean(hidden)
+
+                test_program = static.default_main_program().clone(for_test=True)
+                paddle.optimizer.SGD(learning_rate=0.01).minimize(loss)
+
+                exe.run(static.default_startup_program())
+                compiled_train_prog = static.CompiledProgram(
+                    static.default_main_program()).with_data_parallel(
+                        loss_name=loss.name, places=parallel_places)
+                # NOTE: if not set share_vars_from=compiled_train_prog,
+                # the parameters used in test process are different with
+                # the parameters used by train process
+                compiled_test_prog = static.CompiledProgram(
+                    test_program).with_data_parallel(
+                        share_vars_from=compiled_train_prog,
+                        places=parallel_places)
+
+                train_data = numpy.random.random(size=(10, 1)).astype('float32')
+                loss_data, = exe.run(compiled_train_prog,
                                      feed={"X": train_data},
                                      fetch_list=[loss.name])
                 test_data = numpy.random.random(size=(10, 1)).astype('float32')
                 loss_data, = exe.run(compiled_test_prog,
                                      feed={"X": test_data},
                                      fetch_list=[loss.name])
        """
......
@@ -61,60 +61,44 @@ def prepare_context(strategy=None):

 class ParallelEnv(object):
     """
-    **Notes**:
-        **The old class name was Env and will be deprecated. Please use new class name ParallelEnv.**
+    .. note::
+        This API is not recommended, if you need to get rank and world_size,
+        it is recommended to use ``paddle.distributed.get_rank()`` and
+        ``paddle.distributed.get_world_size()`` .

     This class is used to obtain the environment variables required for
-    the parallel execution of dynamic graph model.
+    the parallel execution of ``paddle.nn.Layer`` in dynamic mode.

-    The dynamic graph parallel mode needs to be started using paddle.distributed.launch.
-    By default, the related environment variable is automatically configured by this module.
-
-    This class is generally used in with `fluid.dygraph.DataParallel` to configure dynamic graph models
-    to run in parallel.
+    The parallel execution in dynamic mode needs to be started using ``paddle.distributed.launch``
+    or ``paddle.distributed.spawn`` .

     Examples:
      .. code-block:: python

-        # This example needs to run with paddle.distributed.launch, The usage is:
-        #   python -m paddle.distributed.launch --selected_gpus=0,1 example.py
-        # And the content of `example.py` is the code of following example.
-
-        import numpy as np
-        import paddle.fluid as fluid
-        import paddle.fluid.dygraph as dygraph
-        from paddle.fluid.optimizer import AdamOptimizer
-        from paddle.fluid.dygraph.nn import Linear
-        from paddle.fluid.dygraph.base import to_variable
-
-        place = fluid.CUDAPlace(fluid.dygraph.ParallelEnv().dev_id)
-        with fluid.dygraph.guard(place=place):
-
-            # prepare the data parallel context
-            strategy=dygraph.prepare_context()
-
-            linear = Linear(1, 10, act="softmax")
-            adam = fluid.optimizer.AdamOptimizer()
-
-            # make the module become the data parallelism module
-            linear = dygraph.DataParallel(linear, strategy)
-
-            x_data = np.random.random(size=[10, 1]).astype(np.float32)
-            data = to_variable(x_data)
-
-            hidden = linear(data)
-            avg_loss = fluid.layers.mean(hidden)
-
-            # scale the loss according to the number of trainers.
-            avg_loss = linear.scale_loss(avg_loss)
-
-            avg_loss.backward()
-
-            # collect the gradients of trainers.
-            linear.apply_collective_grads()
-
-            adam.minimize(avg_loss)
-            linear.clear_gradients()
+        import paddle
+        import paddle.distributed as dist
+
+        def train():
+            # 1. initialize parallel environment
+            dist.init_parallel_env()
+
+            # 2. get current ParallelEnv
+            parallel_env = dist.ParallelEnv()
+            print("rank: ", parallel_env.rank)
+            print("world_size: ", parallel_env.world_size)
+
+            # print result in process 1:
+            # rank: 1
+            # world_size: 2
+            # print result in process 2:
+            # rank: 2
+            # world_size: 2
+
+        if __name__ == '__main__':
+            # 1. start by ``paddle.distributed.spawn`` (default)
+            dist.spawn(train, nprocs=2)
+            # 2. start by ``paddle.distributed.launch``
+            # train()
     """

     def __init__(self):
......
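The new note above points users to paddle.distributed.get_rank() and paddle.distributed.get_world_size() instead of ParallelEnv. A minimal sketch of that recommended usage, launched the same way as the docstring example (paddle.distributed.spawn or paddle.distributed.launch):

import paddle
import paddle.distributed as dist

def train():
    # initialize the parallel environment before querying it
    dist.init_parallel_env()

    # recommended replacements for ParallelEnv().rank / ParallelEnv().world_size
    print("rank: ", dist.get_rank())
    print("world_size: ", dist.get_world_size())

if __name__ == '__main__':
    # start two worker processes, each running train()
    dist.spawn(train, nprocs=2)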
The diff for one file in this commit has been collapsed.