Commit 4537b7bc, authored by qingqing01, committed by GitHub

Merge pull request #2376 from Xreki/warpctc_note

Add doc to the usage of warp-ctc.
@@ -2916,11 +2916,11 @@ def memory(name,
 to specify the layer needs to be remembered as the following:
 .. code-block:: python
 mem = memory(size=256)
 state = fc_layer(input=mem, size=256)
 mem.set_input(mem)
 :param name: the name of the layer which this memory remembers.
 If name is None, user should call set_input() to specify the
 name of the layer which this memory remembers.
@@ -3407,7 +3407,7 @@ def recurrent_group(step,
 else, for training or testing, one of the input type must
 be LayerOutput.
-: type is_generating: bool
+:type is_generating: bool
 :return: LayerOutput object.
 :rtype: LayerOutput
@@ -3814,7 +3814,7 @@ def mse_cost(input, label, weight=None, name=None, coeff=1.0, layer_attr=None):
 .. math::
-\frac{1}{N}\sum_{i=1}^N(t_i-y_i)^2
+\\frac{1}{N}\sum_{i=1}^N(t_i-y_i)^2
 :param name: layer name.
 :type name: basestring
@@ -4769,21 +4769,36 @@ def warp_ctc_layer(input,
 layer_attr=None):
 """
 A layer intergrating the open-source `warp-ctc
-<https://github.com/baidu-research/warp-ctc>` library, which is used in
+<https://github.com/baidu-research/warp-ctc>`_ library, which is used in
 `Deep Speech 2: End-toEnd Speech Recognition in English and Mandarin
-<https://arxiv.org/pdf/1512.02595v1.pdf>`, to compute Connectionist Temporal
-Classification (CTC) loss.
+<https://arxiv.org/pdf/1512.02595v1.pdf>`_, to compute Connectionist Temporal
+Classification (CTC) loss. Besides, another `warp-ctc
+<https://github.com/gangliao/warp-ctc>`_ repository, which is forked from
+the official one, is maintained to enable more compiling options. During the
+building process, PaddlePaddle will clone the source codes, build and
+install it to :code:`third_party/install/warpctc` directory.
+To use warp_ctc layer, you need to specify the path of :code:`libwarpctc.so`,
+using following methods:
+1. Set it in :code:`paddle.init` (python api) or :code:`paddle_init` (c api),
+such as :code:`paddle.init(use_gpu=True,
+warpctc_dir=your_paddle_source_dir/third_party/install/warpctc/lib)`.
+2. Set environment variable LD_LIBRARY_PATH on Linux or DYLD_LIBRARY_PATH
+on Mac OS. For instance, :code:`export
+LD_LIBRARY_PATH=your_paddle_source_dir/third_party/install/warpctc/lib:$LD_LIBRARY_PATH`.
 More details of CTC can be found by referring to `Connectionist Temporal
 Classification: Labelling Unsegmented Sequence Data with Recurrent
 Neural Networks <http://machinelearning.wustl.edu/mlpapers/paper_files/
-icml2006_GravesFGS06.pdf>`_
+icml2006_GravesFGS06.pdf>`_.
 Note:
 - Let num_classes represent the category number. Considering the 'blank'
-label needed by CTC, you need to use (num_classes + 1) as the input
-size. Thus, the size of both warp_ctc_layer and 'input' layer should
-be set to num_classes + 1.
+label needed by CTC, you need to use (num_classes + 1) as the input size.
+Thus, the size of both warp_ctc layer and 'input' layer should be set to
+num_classes + 1.
 - You can set 'blank' to any value ranged in [0, num_classes], which
 should be consistent as that used in your labels.
 - As a native 'softmax' activation is interated to the warp-ctc library,
......
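For readers following the new docstring, here is a minimal sketch of the two setup steps it describes: locating `libwarpctc.so` and sizing the layer for the extra 'blank' label. The helper names (`warpctc_lib_dir`, `library_path_var`, `export_line`, `ctc_layer_size`, `check_blank`) are hypothetical and are not part of PaddlePaddle; only the directory layout, the environment variable names, and the `num_classes + 1` rule come from the docstring. The `paddle.init` call is shown in a comment only, as quoted in the docstring, and is not executed here.

```python
import platform
import posixpath


def warpctc_lib_dir(paddle_source_dir):
    """Return the lib directory the docstring says the build installs into."""
    return posixpath.join(
        paddle_source_dir, "third_party", "install", "warpctc", "lib")


def library_path_var(system=None):
    """Pick DYLD_LIBRARY_PATH on Mac OS, LD_LIBRARY_PATH otherwise."""
    system = system or platform.system()
    return "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"


def export_line(paddle_source_dir, system=None):
    """Build the shell line for method 2 (run it before starting Python)."""
    var = library_path_var(system)
    return "export {0}={1}:${0}".format(var, warpctc_lib_dir(paddle_source_dir))


def ctc_layer_size(num_classes):
    """Size for both the warp_ctc layer and its input: one extra for 'blank'."""
    return num_classes + 1


def check_blank(blank, num_classes):
    """Validate that 'blank' lies in [0, num_classes], as the note requires."""
    if not 0 <= blank <= num_classes:
        raise ValueError(
            "blank must be in [0, %d], got %d" % (num_classes, blank))
    return blank


if __name__ == "__main__":
    source_dir = "/path/to/your_paddle_source_dir"  # placeholder, adjust locally
    # Method 1, as quoted in the docstring (not executed in this sketch):
    #   paddle.init(use_gpu=True, warpctc_dir=warpctc_lib_dir(source_dir))
    # Method 2: print the export line to paste into a shell.
    print(export_line(source_dir))
    print(ctc_layer_size(num_classes=28))
```

The sketch prints the export line instead of modifying `os.environ`, since the dynamic loader typically reads the library search path when the process starts, so the variable should be set in the shell beforehand.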