From 39e1c74208003b24a280ecd909a50e7ec738ebb5 Mon Sep 17 00:00:00 2001
From: "Mr.Lee" <37506361+ShangCambridge@users.noreply.github.com>
Date: Wed, 20 Feb 2019 16:05:25 +0800
Subject: [PATCH] upload memory_optimize_en.rst and nets_en.rst (#628)

* upload memory_optimize_en.rst

* Review

* fix jargon

* Fix memory ==> video memory
---
 .../low_level/memory_optimize_en.rst        | 50 ++++++++++++++++
 doc/fluid/api_guides/low_level/nets_en.rst  | 59 +++++++++++++++++++
 2 files changed, 109 insertions(+)
 create mode 100644 doc/fluid/api_guides/low_level/memory_optimize_en.rst
 create mode 100644 doc/fluid/api_guides/low_level/nets_en.rst

diff --git a/doc/fluid/api_guides/low_level/memory_optimize_en.rst b/doc/fluid/api_guides/low_level/memory_optimize_en.rst
new file mode 100644
index 000000000..f9ce753e5
--- /dev/null
+++ b/doc/fluid/api_guides/low_level/memory_optimize_en.rst
@@ -0,0 +1,50 @@
+.. _api_guide_memory_optimize_en:
+
+#########################
+Video Memory Optimization
+#########################
+
+**This guide is for GPU training.**
+
+Video memory optimization reduces the video memory consumed by a :code:`Program` during execution by analyzing and reusing the video memory occupied by the :code:`Variable` s in the :code:`Program` . Users can perform video memory optimization from a Python script through the :code:`memory_optimize` interface. The optimization strategy is as follows:
+
+- First, analyze the dependencies between the :code:`Operator` s in the :code:`Program` to obtain the remaining lifetime of each :code:`Variable` ;
+- Second, according to these remaining lifetimes, let :code:`Variable` s created later reuse the video memory of :code:`Variable` s that are approaching the end of their lifetime or have already ceased to exist.
+
+.. code-block:: python
+
+    z = fluid.layers.sum([x, y])
+    m = fluid.layers.matmul(y, z)
+
+In this example, the lifetime of :code:`x` lasts only until :code:`fluid.layers.sum` , while :code:`y` is still needed by :code:`fluid.layers.matmul` , so the video memory of :code:`x` can be reused by :code:`m` .
+
+Disable video memory optimization for specific parts
+=====================================================
+
+:code:`memory_optimize` supports disabling video memory optimization for specific parts of the network. You can specify the :code:`Variable` s whose video memory must not be reused by passing a collection of :code:`Variable` names through :code:`skip_opt_set` .
+:code:`memory_optimize` also supports disabling video memory optimization for the backward part of the network; users can enable this behavior by passing in the :code:`skip_grads` parameter.
+
+.. code-block:: python
+
+    fluid.memory_optimize(fluid.default_main_program(),
+                          skip_opt_set=("fc", ), skip_grads=True)
+
+In this example, the :code:`fluid.memory_optimize` interface analyzes the remaining lifetime of each :code:`Variable` in the default :code:`Program` , and skips the :code:`Variable` named :code:`fc` as well as all the :code:`Variable` s in the backward part of the network.
+The video memory of these :code:`Variable` s will not be reused by any other :code:`Variable` .
+
+Specify the video memory optimization level
+===========================================
+
+:code:`memory_optimize` supports printing video memory reuse information to facilitate debugging. Users can enable this debugging output by specifying :code:`print_log=True` .
+
+:code:`memory_optimize` supports two levels of video memory optimization, namely :code:`0` and :code:`1` :
+
+- When the optimization level is :code:`0` : after analyzing the remaining lifetime of each :code:`Variable` , :code:`memory_optimize` also checks the :code:`shape` of each :code:`Variable` ; video memory is reused only between :code:`Variable` s with exactly the same :code:`shape` ;
+- When the optimization level is :code:`1` : :code:`memory_optimize` reuses video memory as aggressively as possible; after the remaining lifetimes are analyzed, even :code:`Variable` s with different :code:`shape` s may reuse each other's video memory.
+
+.. code-block:: python
+
+    fluid.memory_optimize(fluid.default_main_program(),
+                          level=0, print_log=True)
+
+In this example, the :code:`fluid.memory_optimize` interface analyzes the remaining lifetime of each :code:`Variable` in the default :code:`Program` . Video memory is reused only between :code:`Variable` s whose :code:`shape` s are exactly the same. After the analysis is finished, all the debugging information related to video memory reuse is printed out.
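+
+The following is a minimal end-to-end sketch of where :code:`memory_optimize` is typically called, namely after the :code:`Program` has been built and before it is executed. The small fully-connected network, the layer names and the single-GPU place used here are only illustrative.
+
+.. code-block:: python
+
+    import paddle.fluid as fluid
+
+    # Build a small classification network first (illustrative names and sizes).
+    image = fluid.layers.data(name='image', shape=[784], dtype='float32')
+    label = fluid.layers.data(name='label', shape=[1], dtype='int64')
+    prediction = fluid.layers.fc(input=image, size=10, act='softmax')
+    loss = fluid.layers.mean(fluid.layers.cross_entropy(input=prediction, label=label))
+    fluid.optimizer.SGD(learning_rate=0.01).minimize(loss)
+
+    # Analyze Variable lifetimes and enable video memory reuse
+    # before the Program is run on the GPU.
+    fluid.memory_optimize(fluid.default_main_program(), level=0, print_log=True)
+
+    place = fluid.CUDAPlace(0)
+    exe = fluid.Executor(place)
+    exe.run(fluid.default_startup_program())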
diff --git a/doc/fluid/api_guides/low_level/nets_en.rst b/doc/fluid/api_guides/low_level/nets_en.rst
new file mode 100644
index 000000000..aaf4f3ce9
--- /dev/null
+++ b/doc/fluid/api_guides/low_level/nets_en.rst
@@ -0,0 +1,59 @@
+.. _api_guide_nets_en:
+
+################
+Complex Networks
+################
+
+When dealing with complex functionality, we usually need to write a lot of code to build a complex `Neural Network `_ .
+Therefore, to make it easier for users to build complex network models, we provide some common basic modules that simplify the user's code and reduce development cost.
+These modules are usually composed of fine-grained functions combined according to certain logic. For the implementation, please refer to `nets `_ .
+
+1.simple_img_conv_pool
+----------------------
+
+:code:`simple_img_conv_pool` is obtained by concatenating :ref:`api_fluid_layers_conv2d` with :ref:`api_fluid_layers_pool2d` .
+This module is widely used in image classification models, such as `MNIST `_ digit classification.
+
+For API Reference, please refer to :ref:`api_fluid_nets_simple_img_conv_pool`
+
+
+2.img_conv_group
+----------------
+
+:code:`img_conv_group` is composed of :ref:`api_fluid_layers_conv2d` , :ref:`api_fluid_layers_batch_norm` , :ref:`api_fluid_layers_dropout` and :ref:`api_fluid_layers_pool2d` .
+This module stacks multiple combinations of :ref:`api_fluid_layers_conv2d` , :ref:`api_fluid_layers_batch_norm` and :ref:`api_fluid_layers_dropout` , followed by a single :ref:`api_fluid_layers_pool2d` .
+The numbers of :ref:`api_fluid_layers_conv2d` , :ref:`api_fluid_layers_batch_norm` and :ref:`api_fluid_layers_dropout` can each be controlled separately, resulting in a variety of combinations.
+This module is widely used in more complex image classification tasks, such as `VGG `_ .
+
+For API Reference, please refer to :ref:`api_fluid_nets_img_conv_group`
+
+
+3.sequence_conv_pool
+--------------------
+
+:code:`sequence_conv_pool` is obtained by concatenating :ref:`api_fluid_layers_sequence_conv` with :ref:`api_fluid_layers_sequence_pool` .
+This module is widely used in `natural language processing `_ and `speech recognition `_ , in models such as the `text classification model `_ ,
+`TagSpace `_ and `Multi view Simnet `_ .
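+
+As a quick illustration, the sketch below applies :code:`sequence_conv_pool` to an embedded word sequence; the data name, vocabulary size, embedding size, number of filters and filter size are only example values.
+
+.. code-block:: python
+
+    import paddle.fluid as fluid
+
+    # Word ids form a variable-length sequence (LoD level 1).
+    words = fluid.layers.data(name='words', shape=[1], dtype='int64', lod_level=1)
+    emb = fluid.layers.embedding(input=words, size=[10000, 128])
+    conv_pool = fluid.nets.sequence_conv_pool(input=emb,
+                                              num_filters=64,
+                                              filter_size=3,
+                                              act='tanh',
+                                              pool_type='max')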
+
+For API Reference, please refer to :ref:`api_fluid_nets_sequence_conv_pool`
+
+
+4.glu
+-----
+The full name of :code:`glu` is Gated Linear Units; it originates from the paper `Language Modeling with Gated Convolutional Networks `_ and consists of :ref:`api_fluid_layers_split` , :ref:`api_fluid_layers_sigmoid` and :ref:`api_fluid_layers_elementwise_mul` .
+It splits the input data into two equal parts, applies the `Sigmoid `_ function to the second part, and then multiplies the result element-wise with the first part to obtain the output.
+
+For API Reference, please refer to :ref:`api_fluid_nets_glu`
+
+
+5.scaled_dot_product_attention
+------------------------------
+:code:`scaled_dot_product_attention` originates from the paper `Attention Is All You Need `_ and is mainly composed of :ref:`api_fluid_layers_fc` and :ref:`api_fluid_layers_softmax` .
+For the input :code:`Queries` , :code:`Keys` and :code:`Values` , it computes the :code:`Attention` according to the following formula:
+
+.. math::
+
+    Attention(Q, K, V) = softmax(\frac{QK^\mathrm{T}}{\sqrt{d_k}})V
+
+where :math:`d_k` is the dimension of the keys.
+
+This module is widely used in `machine translation `_ models, such as the `Transformer `_ .
+
+For API Reference, please refer to :ref:`api_fluid_nets_scaled_dot_product_attention`
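+
+As a quick illustration, the sketch below computes attention over three 3-D inputs of shape :code:`[batch_size, sequence_length, hidden_dim]` ; the names, shapes, :code:`num_heads` and :code:`dropout_rate` used here are only example values.
+
+.. code-block:: python
+
+    import paddle.fluid as fluid
+
+    # Queries, keys and values share the same sequence length and hidden size here.
+    queries = fluid.layers.data(name='queries', shape=[50, 128], dtype='float32')
+    keys = fluid.layers.data(name='keys', shape=[50, 128], dtype='float32')
+    values = fluid.layers.data(name='values', shape=[50, 128], dtype='float32')
+    context = fluid.nets.scaled_dot_product_attention(queries=queries,
+                                                      keys=keys,
+                                                      values=values,
+                                                      num_heads=1,
+                                                      dropout_rate=0.0)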