The :ref:`api_fluid_CompiledProgram` is used to transform a program for various optimizations. For example, you can use :code:`with_data_parallel` to transform the program into a data-parallel program, so that it can be run on multiple devices.

.. code-block:: python

    # Note:
    #   - If you want to specify the GPU cards used by ParallelExecutor,
    #     you should define CUDA_VISIBLE_DEVICES in the environment.
    #   - If you want to use multiple CPUs to run the program in
    #     ParallelExecutor, you should define CPU_NUM in the environment.

    # First create the Executor.
    place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
    exe = fluid.Executor(place)

    # Run the startup program once and only once.
    exe.run(fluid.default_startup_program())

    # Run the main program directly, without compiling.
    loss_value, = exe.run(fluid.default_main_program(),
                          feed=feed_dict,
                          fetch_list=[loss.name])

    # Or, compile the program first, and then run the model with data parallel.
    exec_strategy = fluid.ExecutionStrategy()
    exec_strategy.num_threads = dev_count * 4  # the size of the thread pool
    build_strategy = fluid.BuildStrategy()
    build_strategy.memory_optimize = True if memory_opt else False
    compiled_prog = fluid.compiler.CompiledProgram(
        fluid.default_main_program()).with_data_parallel(
            loss_name=loss.name,
            build_strategy=build_strategy,
            exec_strategy=exec_strategy)
    loss_value, = exe.run(compiled_prog,
                          feed=feed_dict,
                          fetch_list=[loss.name])
:code:`Executor` implements a simple executor in which all operators are executed in order. You can run :code:`Executor` in a Python script. There are two kinds of executors in PaddlePaddle Fluid: one is the single-thread executor, which is the default option for :code:`Executor`, and the other is the parallel executor, which is illustrated in :ref:`api_guide_parallel_executor_en`. Because :code:`Executor` and :ref:`api_guide_parallel_executor_en` are configured differently, which may be confusing for some users, we introduce :ref:`api_guide_compiled_program_en` to make the executor easier to use: :ref:`api_guide_compiled_program_en` transforms a program for various optimizations, and the transformed program can be run by :code:`Executor`.
The logic of :code:`Executor` is very simple. It is suggested to thoroughly run the model with :code:`Executor` on a single device during the debugging phase, and then switch to multi-device or multi-machine mode for training.
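This single-device-first workflow can be kept in one script by gating on a flag. The sketch below is plain Python; the :code:`PARALLEL` variable and :code:`choose_run_mode` helper are made up for illustration and are not part of the Fluid API:

.. code-block:: python

    import os

    def choose_run_mode():
        # Debug on a single device first; flip the (hypothetical)
        # PARALLEL flag only after the model runs correctly.
        if os.environ.get("PARALLEL", "0") == "1":
            # Multi-device phase: e.g. compile with with_data_parallel().
            return "parallel"
        # Debugging phase: plain Executor on one device.
        return "single"

With this gate, the same entry point serves both phases, and only the branch taken changes.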
:code:`Executor` receives a :code:`Place` at construction, which can be either :ref:`api_fluid_CPUPlace` or :ref:`api_fluid_CUDAPlace`.

.. code-block:: python

    # First create the Executor.
    place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
    exe = fluid.Executor(place)

    # Run the startup program once and only once.
    exe.run(fluid.default_startup_program())

    # Run the main program directly.
    loss_value, = exe.run(fluid.default_main_program(),
                          feed=feed_dict,
                          fetch_list=[loss.name])
For a simple example, please refer to `quick_start_fit_a_line <http://paddlepaddle.org/documentation/docs/zh/1.1/beginners_guide/quick_start/fit_a_line/README.html>`_ .
For API Reference, please refer to :ref:`api_fluid_Executor`.
Since the execution speed of the model is related to both the model structure and the executor's execution strategy, :code:`ParallelExecutor` allows you to modify the relevant parameters of the executor, such as the size of the thread pool ( :code:`num_threads` ) and how many iterations should pass before temporary variables are cleaned up ( :code:`num_iteration_per_drop_scope` ). For more information, please refer to :ref:`api_fluid_ExecutionStrategy`.

.. code-block:: python

    # Note:
    #   - If you want to specify the GPU cards used by ParallelExecutor,
    #     you should define CUDA_VISIBLE_DEVICES in the environment.
    #   - If you want to use multiple CPUs to run the program in
    #     ParallelExecutor, you should define CPU_NUM in the environment.

    # First create the Executor.
    place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
    exe = fluid.Executor(place)

    # Run the startup program once and only once.
    exe.run(fluid.default_startup_program())

    # Define train_exe and test_exe.
    exec_strategy = fluid.ExecutionStrategy()
    exec_strategy.num_threads = dev_count * 4  # the size of the thread pool
    build_strategy = fluid.BuildStrategy()
    build_strategy.memory_optimize = True if memory_opt else False
    train_exe = fluid.ParallelExecutor(use_cuda=use_cuda,
                                       main_program=train_program,
                                       build_strategy=build_strategy,
                                       exec_strategy=exec_strategy,
                                       loss_name=loss.name)
    test_exe = fluid.ParallelExecutor(use_cuda=use_cuda,
                                      main_program=test_program,
                                      share_vars_from=train_exe,
                                      build_strategy=build_strategy,
                                      exec_strategy=exec_strategy)
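The tuning arithmetic above can be factored out ahead of time. This is a plain-Python sketch; the :code:`executor_tuning` helper is hypothetical, and the default of 100 iterations per scope drop is an illustrative choice, not Fluid's default:

.. code-block:: python

    def executor_tuning(dev_count, drop_scope_every=100):
        """Return the executor tuning values described in the text.

        dev_count: number of devices (GPU cards, or CPU_NUM for CPU runs).
        drop_scope_every: iterations between temporary-variable cleanups;
            an illustrative default, not Fluid's own.
        """
        return {
            # Thread-pool size: four worker threads per device.
            "num_threads": dev_count * 4,
            # How often temporary scopes are dropped.
            "num_iteration_per_drop_scope": drop_scope_every,
        }

    # e.g. two GPUs -> a pool of eight threads
    cfg = executor_tuning(2)

The returned values would then be assigned to the corresponding :code:`ExecutionStrategy` fields before building the executor.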
In multi-card training, you can use :code:`fluid.compiler.CompiledProgram` to compile the :code:`fluid.Program` and then call :code:`with_data_parallel`, as in the example above. Note that:
1. The constructor of :ref:`api_fluid_CompiledProgram` takes the :code:`fluid.Program` to be run, which cannot be modified at runtime.
2. If :code:`exe` is initialized with :code:`CUDAPlace`, the model runs on GPU. In graphics-card training mode, all visible graphics cards will be occupied. Users can configure `CUDA_VISIBLE_DEVICES <http://www.acceleware.com/blog/cudavisibledevices-masking-gpus>`_ to change which cards are used.
3. If :code:`exe` is initialized with :code:`CPUPlace`, the model runs on CPU. In this situation, multiple threads are used to run the model, and the number of threads is equal to the number of logical cores. Users can configure the `CPU_NUM` environment variable to change the number of threads that are used.
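The two environment variables mentioned in the notes can also be set from Python, as long as this happens before the framework initializes its devices; a minimal sketch:

.. code-block:: python

    import multiprocessing
    import os

    # Restrict the run to GPU cards 0 and 1 (ignored for CPU-only runs).
    os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

    # For CPU execution, use one thread per logical core.
    os.environ["CPU_NUM"] = str(multiprocessing.cpu_count())

Setting these in the launching shell instead (e.g. ``CPU_NUM=4 python train.py``) has the same effect.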