diff --git a/doc/howto/deep_model/rnn/index_cn.rst b/doc/howto/deep_model/rnn/index_cn.rst
index 9e805ca85191b793c8798a239927a318c70b96f5..9ecab5594cff47cde4700b7ce0f58013a960a16e 100644
--- a/doc/howto/deep_model/rnn/index_cn.rst
+++ b/doc/howto/deep_model/rnn/index_cn.rst
@@ -4,6 +4,7 @@ RNN相关模型
 .. toctree::
   :maxdepth: 1
 
+  rnn_config_cn.rst
   recurrent_group_cn.md
   hierarchical_layer_cn.rst
   hrnn_rnn_api_compare_cn.rst
diff --git a/doc/howto/deep_model/rnn/rnn_cn.md b/doc/howto/deep_model/rnn/rnn_cn.md
deleted file mode 100644
index 5ec05b2cab9ba85f9f6e9644375ee14f647a413c..0000000000000000000000000000000000000000
--- a/doc/howto/deep_model/rnn/rnn_cn.md
+++ /dev/null
@@ -1,226 +0,0 @@
-RNN 配置
-=================
-
-本教程将指导你如何在 PaddlePaddle 中配置循环神经网络(RNN)。PaddlePaddle 高度支持灵活和高效的循环神经网络配置。 在本教程中,您将了解如何:
-
-- 准备用来学习循环神经网络的序列数据。
-- 配置循环神经网络架构。
-- 使用学习完成的循环神经网络模型生成序列。
-
-我们将使用 vanilla 循环神经网络和 sequence to sequence 模型来指导你完成这些步骤。sequence to sequence 模型的代码可以在`demo / seqToseq`找到。
-
-准备序列数据
----------------------
-
-PaddlePaddle 不需要对序列数据进行任何预处理,例如填充。唯一需要做的是将相应类型设置为输入。例如,以下代码段定义了三个输入。 它们都是序列,它们的大小是`src_dict`,`trg_dict`和`trg_dict`:
-
-``` sourceCode
-settings.input_types = [
-  integer_value_sequence(len(settings.src_dict)),
-  integer_value_sequence(len(settings.trg_dict)),
-  integer_value_sequence(len(settings.trg_dict))]
-```
-
-在`process`函数中,每个`yield`函数将返回三个整数列表。每个整数列表被视为一个整数序列:
-
-``` sourceCode
-yield src_ids, trg_ids, trg_ids_next
-```
-
-有关如何编写数据提供程序的更多细节描述,请参考 [PyDataProvider2](../../ui/data_provider/index.html)。完整的数据提供文件在 `demo/seqToseq/dataprovider.py`。
-
-配置循环神经网络架构
------------------------------------------------
-
-### 简单门控循环神经网络(Gated Recurrent Neural Network)
-
-循环神经网络在每个时间步骤顺序地处理序列。下面列出了 LSTM 的架构的示例。
-
-![image](../../../tutorials/sentiment_analysis/bi_lstm.jpg)
-
-一般来说,循环网络从 *t* = 1 到 *t* = *T* 或者反向地从 *t* = *T* 到 *t* = 1 执行以下操作。
-
-*x*<sub>*t* + 1</sub> = *f*<sub>*x*</sub>(*x*<sub>*t*</sub>),*y*<sub>*t*</sub> = *f*<sub>*y*</sub>(*x*<sub>*t*</sub>)
-
-其中 *f*<sub>*x*</sub>(.) 称为**单步函数**(即单时间步执行的函数,step function),而 *f*<sub>*y*</sub>(.) 称为**输出函数**。在 vanilla 循环神经网络中,单步函数和输出函数都非常简单。然而,PaddlePaddle 可以通过修改这两个函数来实现复杂的网络配置。我们将使用 sequence to sequence 模型演示如何配置复杂的循环神经网络模型。在本节中,我们将使用简单的 vanilla 循环神经网络作为使用`recurrent_group`配置简单循环神经网络的例子。 注意,如果你只需要使用简单的RNN,GRU或LSTM,那么推荐使用`grumemory`和`lstmemory`,因为它们的计算效率比`recurrent_group`更高。
-
-对于 vanilla RNN,在每个时间步长,**单步函数**为:
-
-*x*<sub>*t* + 1</sub> = *W*<sub>*x*</sub>*x*<sub>*t*</sub> + *W*<sub>*i*</sub>*I*<sub>*t*</sub> + *b*
-
-其中 *x*<sub>*t*</sub> 是RNN状态,并且 *I*<sub>*t*</sub> 是输入,*W*<sub>*x*</sub> 和 *W*<sub>*i*</sub> 分别是RNN状态和输入的变换矩阵。*b* 是偏差。它的**输出函数**只需要*x*<sub>*t*</sub>作为输出。
-
-`recurrent_group`是构建循环神经网络的最重要的工具。 它定义了**单步函数**,**输出函数**和循环神经网络的输入。注意,这个函数的`step`参数需要实现`step function`(单步函数)和`output function`(输出函数):
-
-``` sourceCode
-def simple_rnn(input,
-               size=None,
-               name=None,
-               reverse=False,
-               rnn_bias_attr=None,
-               act=None,
-               rnn_layer_attr=None):
-    def __rnn_step__(ipt):
-       out_mem = memory(name=name, size=size)
-       rnn_out = mixed_layer(input = [full_matrix_projection(ipt),
-                                      full_matrix_projection(out_mem)],
-                             name = name,
-                             bias_attr = rnn_bias_attr,
-                             act = act,
-                             layer_attr = rnn_layer_attr,
-                             size = size)
-       return rnn_out
-    return recurrent_group(name='%s_recurrent_group' % name,
-                           step=__rnn_step__,
-                           reverse=reverse,
-                           input=input)
-```
-
-PaddlePaddle 使用“Memory”(记忆模块)实现单步函数。**Memory**是在PaddlePaddle中构造循环神经网络时最重要的概念。 Memory是在单步函数中循环使用的状态,例如*x*<sub>*t* + 1</sub> = *f*<sub>*x*</sub>(*x*<sub>*t*</sub>)。 一个Memory包含**输出**和**输入**。当前时间步处的Memory的输出作为下一时间步Memory的输入。Memory也可以具有**boot layer(引导层)**,其输出被用作Memory的初始值。 在我们的例子中,门控循环单元的输出被用作输出Memory。请注意,`rnn_out`层的名称与`out_mem`的名称相同。这意味着`rnn_out` (*x*<sub>*t* + 1</sub>)的输出被用作`out_mem`Memory的**输出**。
-
-Memory也可以是序列。在这种情况下,在每个时间步中,我们有一个序列作为循环神经网络的状态。这在构造非常复杂的循环神经网络时是有用的。 其他高级功能包括定义多个Memory,以及使用子序列来定义分级循环神经网络架构。
-
-我们在函数的结尾返回`rnn_out`。 这意味着 `rnn_out` 层的输出被用作门控循环神经网络的**输出**函数。
-
-### Sequence to Sequence Model with Attention
-
-我们将使用 sequence to sequence model with attention 作为例子演示如何配置复杂的循环神经网络模型。该模型的说明如下图所示。
-
-![image](../../../tutorials/text_generation/encoder-decoder-attention-model.png)
-
-在这个模型中,源序列 *S* = {*s*<sub>1</sub>, …, *s*<sub>*T*</sub>} 用双向门控循环神经网络编码。双向门控循环神经网络的隐藏状态 *H*<sub>*S*</sub> = {*H*<sub>1</sub>, …, *H*<sub>*T*</sub>} 被称为 *编码向量*。解码器是门控循环神经网络。当解读每一个*y*<sub>*t*</sub>时, 这个门控循环神经网络生成一系列权重 *W*<sub>*S*</sub><sup>*t*</sup> = {*W*<sub>1</sub><sup>*t*</sup>, …, *W*<sub>*T*</sub><sup>*t*</sup>}, 用于计算编码向量的加权和。加权和用来生成*y*<sub>*t*</sub>。
-
-模型的编码器部分如下所示。它叫做`grumemory`来表示门控循环神经网络。如果网络架构简单,那么推荐使用循环神经网络的方法,因为它比 `recurrent_group` 更快。我们已经实现了大多数常用的循环神经网络架构,可以参考 [Layers](../../ui/api/trainer_config_helpers/layers_index.html) 了解更多细节。
-
-我们还将编码向量投射到 `decoder_size` 维空间。这通过获得反向循环网络的第一个实例,并将其投射到 `decoder_size` 维空间完成:
-
-``` sourceCode
-# 定义源语句的数据层
-src_word_id = data_layer(name='source_language_word', size=source_dict_dim)
-# 计算每个词的词向量
-src_embedding = embedding_layer(
-    input=src_word_id,
-    size=word_vector_dim,
-    param_attr=ParamAttr(name='_source_language_embedding'))
-# 应用前向循环神经网络
-src_forward = grumemory(input=src_embedding, size=encoder_size)
-# 应用反向递归神经网络(reverse=True表示反向循环神经网络)
-src_backward = grumemory(input=src_embedding,
-                         size=encoder_size,
-                         reverse=True)
-# 将循环神经网络的前向和反向部分混合在一起
-encoded_vector = concat_layer(input=[src_forward, src_backward])
-
-# 投射编码向量到 decoder_size
-encoder_proj = mixed_layer(input = [full_matrix_projection(encoded_vector)],
-                           size = decoder_size)
-
-# 计算反向RNN的第一个实例
-backward_first = first_seq(input=src_backward)
-
-# 投射反向RNN的第一个实例到 decoder size
-decoder_boot = mixed_layer(input=[full_matrix_projection(backward_first)], size=decoder_size, act=TanhActivation())
-```
-
-解码器使用 `recurrent_group` 来定义循环神经网络。单步函数和输出函数在 `gru_decoder_with_attention` 中定义:
-
-``` sourceCode
-group_inputs=[StaticInput(input=encoded_vector,is_seq=True),
-              StaticInput(input=encoded_proj,is_seq=True)]
-trg_embedding = embedding_layer(
-    input=data_layer(name='target_language_word',
-                     size=target_dict_dim),
-    size=word_vector_dim,
-    param_attr=ParamAttr(name='_target_language_embedding'))
-group_inputs.append(trg_embedding)
-
-# 对于配备有注意力机制的解码器,在训练中,
-# 目标向量(groudtruth)是数据输入,
-# 而源序列的编码向量可以被无边界的memory访问
-# StaticInput 意味着不同时间步的输入都是相同的值,
-# 否则它以一个序列输入,不同时间步的输入是不同的。
-# 所有输入序列应该有相同的长度。
-decoder = recurrent_group(name=decoder_group_name,
-                          step=gru_decoder_with_attention,
-                          input=group_inputs)
-```
-
-单步函数的实现如下所示。首先,它定义解码网络的**Memory**。然后定义 attention,门控循环单元单步函数和输出函数:
-
-``` sourceCode
-def gru_decoder_with_attention(enc_vec, enc_proj, current_word):
-    # 定义解码器的Memory
-    # Memory的输出定义在 gru_step 内
-    # 注意 gru_step 应该与它的Memory名字相同
-    decoder_mem = memory(name='gru_decoder',
-                         size=decoder_size,
-                         boot_layer=decoder_boot)
-    # 计算 attention 加权编码向量
-    context = simple_attention(encoded_sequence=enc_vec,
-                               encoded_proj=enc_proj,
-                               decoder_state=decoder_mem)
-    # 混合当前词向量和attention加权编码向量
-    decoder_inputs = mixed_layer(inputs = [full_matrix_projection(context),
-                                           full_matrix_projection(current_word)],
-                                 size = decoder_size * 3)
-    # 定义门控循环单元循环神经网络单步函数
-    gru_step = gru_step_layer(name='gru_decoder',
-                              input=decoder_inputs,
-                              output_mem=decoder_mem,
-                              size=decoder_size)
-    # 定义输出函数
-    out = mixed_layer(input=[full_matrix_projection(input=gru_step)],
-                      size=target_dict_dim,
-                      bias_attr=True,
-                      act=SoftmaxActivation())
-    return out
-```
-
-生成序列
------------------
-
-训练模型后,我们可以使用它来生成序列。通常的做法是使用**beam search** 生成序列。以下代码片段定义 beam search 算法。注意,`beam_search` 函数假设 `step` 的输出函数返回的是下一个时刻输出词的 softmax 归一化概率向量。我们对模型进行了以下更改。
-
-- 使用 `GeneratedInput` 来表示 trg\_embedding。 `GeneratedInput` 将上一时间步所生成的词的向量来作为当前时间步的输入。
-- 使用 `beam_search` 函数。这个函数需要设置:
-  - `bos_id`: 开始标记。每个句子都以开始标记开头。
-  - `eos_id`: 结束标记。每个句子都以结束标记结尾。
-  - `beam_size`: beam search 算法中的beam大小。
-  - `max_length`: 生成序列的最大长度。
-- 使用 `seqtext_printer_evaluator` 根据索引矩阵和字典打印文本。这个函数需要设置:
-  - `id_input`: 数据的整数ID,用于标识生成的文件中的相应输出。
-  - `dict_file`: 用于将词ID转换为词的字典文件。
-  - `result_file`: 生成结果文件的路径。
-
-代码如下:
-
-``` sourceCode
-group_inputs=[StaticInput(input=encoded_vector,is_seq=True),
-              StaticInput(input=encoded_proj,is_seq=True)]
-# 在生成时,解码器基于编码源序列和最后生成的目标词预测下一目标词。
-# 编码源序列(编码器输出)必须由只读Memory的 StaticInput 指定。
-# 这里, GeneratedInputs 自动获取上一个生成的词,并在最开始初始化为起始词,如 <s>。
-trg_embedding = GeneratedInput(
-    size=target_dict_dim,
-    embedding_name='_target_language_embedding',
-    embedding_size=word_vector_dim)
-group_inputs.append(trg_embedding)
-beam_gen = beam_search(name=decoder_group_name,
-                       step=gru_decoder_with_attention,
-                       input=group_inputs,
-                       bos_id=0, # Beginnning token.
-                       eos_id=1, # End of sentence token.
-                       beam_size=beam_size,
-                       max_length=max_length)
-
-seqtext_printer_evaluator(input=beam_gen,
-                          id_input=data_layer(name="sent_id", size=1),
-                          dict_file=trg_dict_path,
-                          result_file=gen_trans_file)
-outputs(beam_gen)
-```
-
-注意,这种生成技术只用于类似解码器的生成过程。如果你正在处理序列标记任务,请参阅 [Semantic Role Labeling Demo](../../demo/semantic_role_labeling/index.html) 了解更多详细信息。
-
-完整的配置文件在`demo/seqToseq/seqToseq_net.py`。
diff --git a/doc/howto/deep_model/rnn_config_cn.rst b/doc/howto/deep_model/rnn/rnn_config_cn.rst
similarity index 88%
rename from doc/howto/deep_model/rnn_config_cn.rst
rename to doc/howto/deep_model/rnn/rnn_config_cn.rst
index e6d8c1133a5e8a481c9bf5340c4641343804dcbe..8d65b3512d0d99438898ec555a57f904691247f2 100644
--- a/doc/howto/deep_model/rnn_config_cn.rst
+++ b/doc/howto/deep_model/rnn/rnn_config_cn.rst
@@ -1,4 +1,4 @@
-RNN 配置
+RNN配置
 ========
 
 本教程将指导你如何在 PaddlePaddle
@@ -20,7 +20,7 @@ PaddlePaddle
 不需要对序列数据进行任何预处理,例如填充。唯一需要做的是将相应类型设置为输入。例如,以下代码段定义了三个输入。
 它们都是序列,它们的大小是\ ``src_dict``\ ,\ ``trg_dict``\ 和\ ``trg_dict``\ :
 
-.. code:: sourcecode
+.. code:: python
 
     settings.input_types = [
         integer_value_sequence(len(settings.src_dict)),
@@ -29,7 +29,7 @@ PaddlePaddle
 
 在\ ``process``\ 函数中,每个\ ``yield``\ 函数将返回三个整数列表。每个整数列表被视为一个整数序列:
 
-.. code:: sourcecode
+.. code:: python
 
     yield src_ids, trg_ids, trg_ids_next
 
@@ -45,18 +45,17 @@ PaddlePaddle
 循环神经网络在每个时间步骤顺序地处理序列。下面列出了 LSTM 的架构的示例。
 
-.. figure:: ../../../tutorials/sentiment_analysis/bi_lstm.jpg
-   :alt: image
+.. image:: ../../../tutorials/sentiment_analysis/bi_lstm.jpg
+   :align: center
 
-   image
+一般来说,循环网络从 :math:`t=1` 到 :math:`t=T` 或者反向地从 :math:`t=T` 到 :math:`t=1` 执行以下操作。
 
-一般来说,循环网络从 *t* = 1 到 *t* = *T* 或者反向地从 *t* = *T* 到 *t*
-= 1 执行以下操作。
+.. math::
 
-*x*\ \ *t* + 1 = *f*\ \ *x*\ (*x*\ \ *t*\ ),\ *y*\ \ *t*\  = *f*\ \ *y*\ (*x*\ \ *t*\ )
+    x_{t+1} = f_x(x_t), y_t = f_y(x_t)
 
-其中 *f*\ \ *x*\ (.) 称为\ **单步函数**\ (即单时间步执行的函数,step
-function),而 *f*\ \ *y*\ (.) 称为\ **输出函数**\ 。在 vanilla
+其中 :math:`f_x(.)` 称为\ **单步函数**\ (即单时间步执行的函数,step
+function),而 :math:`f_y(.)` 称为\ **输出函数**\ 。在 vanilla
 循环神经网络中,单步函数和输出函数都非常简单。然而,PaddlePaddle
 可以通过修改这两个函数来实现复杂的网络配置。我们将使用 sequence to
 sequence
@@ -67,16 +66,17 @@ vanilla
 
 对于 vanilla RNN,在每个时间步长,\ **单步函数**\ 为:
 
-*x*\ \ *t* + 1 = *W*\ \ *x*\ \ *x*\ \ *t*\  + *W*\ \ *i*\ \ *I*\ \ *t*\  + *b*
+.. math::
 
-其中 *x*\ \ *t*\ 是RNN状态,并且 *I*\ \ *t*\ 是输入,\ *W*\ \ *x*\ 和
-*W*\ \ *i*\ 分别是RNN状态和输入的变换矩阵。\ *b*
-是偏差。它的\ **输出函数**\ 只需要\ *x*\ \ *t*\ 作为输出。
+    x_{t+1} = W_x x_t + W_i I_t + b
+
+其中 :math:`x_t` 是RNN状态,并且 :math:`I_t` 是输入,:math:`W_x` 和
+:math:`W_i` 分别是RNN状态和输入的变换矩阵。:math:`b` 是偏差。它的\ **输出函数**\ 只需要 :math:`x_t` 作为输出。
 
 ``recurrent_group``\ 是构建循环神经网络的最重要的工具。
 它定义了\ **单步函数**\ ,\ **输出函数**\ 和循环神经网络的输入。注意,这个函数的\ ``step``\ 参数需要实现\ ``step function``\ (单步函数)和\ ``output function``\ (输出函数):
 
-.. code:: sourcecode
+.. code:: python
 
     def simple_rnn(input,
                    size=None,
@@ -102,7 +102,7 @@ vanilla
 
 PaddlePaddle
 使用“Memory”(记忆模块)实现单步函数。\ **Memory**\ 是在PaddlePaddle中构造循环神经网络时最重要的概念。
-Memory是在单步函数中循环使用的状态,例如\ *x*\ \ *t* + 1 = *f*\ \ *x*\ (*x*\ \ *t*\ )。
+Memory是在单步函数中循环使用的状态,例如 :math:`x_{t+1} = f_x(x_t)` 。
 一个Memory包含\ **输出**\ 和\ **输入**\ 。当前时间步处的Memory的输出作为下一时间步Memory的输入。Memory也可以具有\ **boot
 layer(引导层)**\ ,其输出被用作Memory的初始值。
 在我们的例子中,门控循环单元的输出被用作输出Memory。请注意,\ ``rnn_out``\ 层的名称与\ ``out_mem``\ 的名称相同。这意味着\ ``rnn_out``
@@ -120,18 +120,15 @@ Sequence to Sequence Model with Attention
 我们将使用 sequence to sequence model with attention
 作为例子演示如何配置复杂的循环神经网络模型。该模型的说明如下图所示。
 
-.. figure:: ../../../tutorials/text_generation/encoder-decoder-attention-model.png
-   :alt: image
-
-   image
+.. image:: ../../../tutorials/text_generation/encoder-decoder-attention-model.png
+   :align: center
 
-在这个模型中,源序列 *S* = {*s*\ 1, …, \ *s*\ \ *T*\ }
-用双向门控循环神经网络编码。双向门控循环神经网络的隐藏状态
-*H*\ \ *S*\  = {*H*\ 1, …, \ *H*\ \ *T*\ } 被称为
-*编码向量*\ 。解码器是门控循环神经网络。当解读每一个\ *y*\ \ *t*\ 时,
-这个门控循环神经网络生成一系列权重
-*W*\ \ *S*\ \ *t*\  = {*W*\ 1\ *t*\ , …, \ *W*\ \ *T*\ \ *t*\ },
-用于计算编码向量的加权和。加权和用来生成\ *y*\ \ *t*\ 。
+在这个模型中,源序列 :math:`S = \{s_1, \dots, s_T\}`
+用双向门控循环神经网络编码。双向门控循环神经网络的隐藏状态
+:math:`H_S = \{H_1, \dots, H_T\}` 被称为
+*编码向量*\ 。解码器是门控循环神经网络。当解读每一个 :math:`y_t` 时,
+这个门控循环神经网络生成一系列权重 :math:`W_S^t = \{W_1^t, \dots, W_T^t\}` ,
+用于计算编码向量的加权和。加权和用来生成 :math:`y_t` 。
 
 模型的编码器部分如下所示。它叫做\ ``grumemory``\ 来表示门控循环神经网络。如果网络架构简单,那么推荐使用循环神经网络的方法,因为它比
 ``recurrent_group``
@@ -143,7 +140,7 @@ Sequence to Sequence Model with Attention
 维空间。这通过获得反向循环网络的第一个实例,并将其投射到
 ``decoder_size`` 维空间完成:
 
-.. code:: sourcecode
+.. code:: python
 
     # 定义源语句的数据层
     src_word_id = data_layer(name='source_language_word', size=source_dict_dim)
@@ -174,7 +171,7 @@ Sequence to Sequence Model with Attention
 解码器使用 ``recurrent_group`` 来定义循环神经网络。单步函数和输出函数在
 ``gru_decoder_with_attention`` 中定义:
 
-.. code:: sourcecode
+.. code:: python
 
     group_inputs=[StaticInput(input=encoded_vector,is_seq=True),
                   StaticInput(input=encoded_proj,is_seq=True)]
@@ -198,7 +195,7 @@ Sequence to Sequence Model with Attention
 
 单步函数的实现如下所示。首先,它定义解码网络的\ **Memory**\ 。然后定义
 attention,门控循环单元单步函数和输出函数:
 
-.. code:: sourcecode
+.. code:: python
 
     def gru_decoder_with_attention(enc_vec, enc_proj, current_word):
         # 定义解码器的Memory
@@ -253,7 +250,7 @@ attention,门控循环单元单步函数和输出函数:
 
 代码如下:
 
-.. code:: sourcecode
+.. code:: python
 
     group_inputs=[StaticInput(input=encoded_vector,is_seq=True),
                   StaticInput(input=encoded_proj,is_seq=True)]
diff --git a/doc/howto/index_cn.rst b/doc/howto/index_cn.rst
index 6a14ce8ae75c3dd372184ea6ea9f6034a3dbf919..bd3d0ec292057037414792b1ac176d12605b90d5 100644
--- a/doc/howto/index_cn.rst
+++ b/doc/howto/index_cn.rst
@@ -7,10 +7,11 @@
 .. toctree::
   :maxdepth: 1
 
+  usage/cmd_parameter/index_cn.rst
   usage/concepts/use_concepts_cn.rst
   usage/cluster/cluster_train_cn.md
-  usage/cluster/k8s/k8s_cn.md
-  usage/cluster/k8s/k8s_distributed_cn.md
+  usage/k8s/k8s_cn.md
+  usage/k8s/k8s_distributed_cn.md
 
 开发标准
 --------
diff --git a/doc/howto/index_en.rst b/doc/howto/index_en.rst
index 983dc743eb453a0210bc5fb3c7e4525fa838d428..1fbfcd260b912078f00ed5b720ed607db725c4e2 100644
--- a/doc/howto/index_en.rst
+++ b/doc/howto/index_en.rst
@@ -7,8 +7,10 @@ Usage
 .. toctree::
   :maxdepth: 1
 
-  usage/cmd_parameter/index_en.md
+  usage/cmd_parameter/index_en.rst
   usage/cluster/cluster_train_en.md
+  usage/k8s/k8s_en.md
+  usage/k8s/k8s_aws_en.md
 
 Development
 ------------
diff --git a/doc/howto/usage/cmd_parameter/index_cn.rst b/doc/howto/usage/cmd_parameter/index_cn.rst
new file mode 100644
index 0000000000000000000000000000000000000000..4c8729821110b9aec99351fc0a83a1ba75a8a2bb
--- /dev/null
+++ b/doc/howto/usage/cmd_parameter/index_cn.rst
@@ -0,0 +1,11 @@
+.. _cmd_line_index:
+
+设置命令行参数
+===============
+
+.. toctree::
+  :maxdepth: 1
+
+  use_case_cn.md
+  arguments_cn.md
+  detail_introduction_cn.md
diff --git a/doc/howto/usage/cmd_parameter/index_en.md b/doc/howto/usage/cmd_parameter/index_en.md
deleted file mode 100644
index 2a96e7e976c43fd69befccd78753cee431ef61bc..0000000000000000000000000000000000000000
--- a/doc/howto/usage/cmd_parameter/index_en.md
+++ /dev/null
@@ -1,8 +0,0 @@
-```eval_rst
-.. _cmd_line_index:
-```
-# Set Command-line Parameters
-
-* [Use Case](use_case_en.md)
-* [Arguments](arguments_en.md)
-* [Detailed Descriptions](detail_introduction_en.md)
diff --git a/doc/howto/usage/cmd_parameter/index_en.rst b/doc/howto/usage/cmd_parameter/index_en.rst
new file mode 100644
index 0000000000000000000000000000000000000000..0e3c72d27aca063f1b6f1c23e55718dba373c40a
--- /dev/null
+++ b/doc/howto/usage/cmd_parameter/index_en.rst
@@ -0,0 +1,11 @@
+.. _cmd_line_index:
+
+Set Command-line Parameters
+===========================
+
+.. toctree::
+  :maxdepth: 1
+
+  use_case_en.md
+  arguments_en.md
+  detail_introduction_en.md
diff --git a/doc/howto/usage/cluster/k8s-aws/README.md b/doc/howto/usage/k8s/k8s_aws_en.md
similarity index 99%
rename from doc/howto/usage/cluster/k8s-aws/README.md
rename to doc/howto/usage/k8s/k8s_aws_en.md
index 593158428803c067a07cd741aabfe601f6f8e194..201bcae48df29eecca175a63fb2723ad687e7f69 100644
--- a/doc/howto/usage/cluster/k8s-aws/README.md
+++ b/doc/howto/usage/k8s/k8s_aws_en.md
@@ -1,4 +1,4 @@
-# PaddlePaddle on AWS with Kubernetes
+# Kubernetes on AWS
 
 ## Create AWS Account and IAM Account
 
@@ -331,15 +331,15 @@ For sharing the training data across all the Kubernetes nodes, we use EFS (Elastic File System) in AWS.
 
 1. Make sure you added AmazonElasticFileSystemFullAccess policy in your group.
 
 1. Create the Elastic File System in AWS console, and attach the new VPC with it.
 
-<img src="create_efs.png" width="800">
+<img src="src/create_efs.png" width="800">
 
 1. Modify the Kubernetes security group under ec2/Security Groups, add additional inbound policy "All TCP TCP 0 - 65535 0.0.0.0/0" for Kubernetes default VPC security group.
 
-<img src="add_security_group.png" width="800">
+<img src="src/add_security_group.png" width="800">
 
 1. Follow the EC2 mount instruction to mount the disk onto all the Kubernetes nodes, we recommend to mount EFS disk onto ~/efs.
 
-<img src="efs_mount.png" width="800">
+<img src="src/efs_mount.png" width="800">
 
 Before starting the training, you should place your user config and divided training data onto EFS. When the training start, each task will copy related files from EFS into container, and it will also write the training results back onto EFS, we will show you how to place the data later in this article.
diff --git a/doc/howto/usage/cluster/k8s/k8s_cn.md b/doc/howto/usage/k8s/k8s_cn.md
similarity index 99%
rename from doc/howto/usage/cluster/k8s/k8s_cn.md
rename to doc/howto/usage/k8s/k8s_cn.md
index 2575701053ca12cc3af45682af6cd682a88bb987..ab07cb9cd5b135ddea82b3360720537f1dc5a801 100644
--- a/doc/howto/usage/cluster/k8s/k8s_cn.md
+++ b/doc/howto/usage/k8s/k8s_cn.md
@@ -1,4 +1,4 @@
-# Kubernetes 单机训练
+# Kubernetes单机训练
 
 在这篇文档里,我们介绍如何在 Kubernetes 集群上启动一个单机使用CPU的Paddle训练作业。在下一篇中,我们将介绍如何启动分布式训练作业。
 
diff --git a/doc/howto/usage/cluster/k8s/k8s_distributed_cn.md b/doc/howto/usage/k8s/k8s_distributed_cn.md
similarity index 99%
rename from doc/howto/usage/cluster/k8s/k8s_distributed_cn.md
rename to doc/howto/usage/k8s/k8s_distributed_cn.md
index 53d0b4676c6a3a2dc8c58e231756638cc0b67765..b63b8437a0114a0165971933912da83c2dd770a6 100644
--- a/doc/howto/usage/cluster/k8s/k8s_distributed_cn.md
+++ b/doc/howto/usage/k8s/k8s_distributed_cn.md
@@ -1,4 +1,4 @@
-# Kubernetes 分布式训练
+# Kubernetes分布式训练
 
 前一篇文章介绍了如何在Kubernetes集群上启动一个单机PaddlePaddle训练作业 (Job)。在这篇文章里,我们介绍如何在Kubernetes集群上进行分布式PaddlePaddle训练作业。关于PaddlePaddle的分布式训练,文章 [Cluster Training](https://github.com/baidu/Paddle/blob/develop/doc/cluster/opensource/cluster_train.md)介绍了一种通过SSH远程分发任务,进行分布式训练的方法,与此不同的是,本文将介绍在Kubernetes容器管理平台上快速构建PaddlePaddle容器集群,进行分布式训练的方案。
 
@@ -22,7 +22,7 @@
 
 首先,我们需要拥有一个Kubernetes集群,在这个集群中所有node与pod都可以互相通信。关于Kubernetes集群搭建,可以参考[官方文档](http://kubernetes.io/docs/getting-started-guides/kubeadm/),在以后的文章中我们也会介绍AWS上搭建的方案。本文假设大家能找到几台物理机,并且可以按照官方文档在上面部署Kubernetes。在本文的环境中,Kubernetes集群中所有node都挂载了一个[MFS](http://moosefs.org/)(Moose filesystem,一种分布式文件系统)共享目录,我们通过这个目录来存放训练文件与最终输出的模型。关于MFS的安装部署,可以参考[MooseFS documentation](https://moosefs.com/documentation.html)。在训练之前,用户将配置与训练数据切分好放在MFS目录中,训练时,程序从此目录拷贝文件到容器内进行训练,将结果保存到此目录里。整体的结构图如下:
 
-![paddle on kubernetes结构图](k8s-paddle-arch.png)
+![paddle on kubernetes结构图](src/k8s-paddle-arch.png)
 
 上图描述了一个3节点的分布式训练场景,Kubernetes集群的每个node上都挂载了一个MFS目录,这个目录可以通过volume的形式挂载到容器中。Kubernetes为这次训练创建了3个pod并且调度到了3个node上运行,每个pod包含一个PaddlePaddle容器。在容器创建后,会启动pserver与trainer进程,读取volume中的数据进行这次分布式训练。
 
diff --git a/doc/howto/usage/cluster/k8s/k8s_en.md b/doc/howto/usage/k8s/k8s_en.md
similarity index 100%
rename from doc/howto/usage/cluster/k8s/k8s_en.md
rename to doc/howto/usage/k8s/k8s_en.md
diff --git a/doc/howto/usage/cluster/k8s/Dockerfile b/doc/howto/usage/k8s/src/Dockerfile
similarity index 100%
rename from doc/howto/usage/cluster/k8s/Dockerfile
rename to doc/howto/usage/k8s/src/Dockerfile
diff --git a/doc/howto/usage/cluster/k8s-aws/add_security_group.png b/doc/howto/usage/k8s/src/add_security_group.png
similarity index 100%
rename from doc/howto/usage/cluster/k8s-aws/add_security_group.png
rename to doc/howto/usage/k8s/src/add_security_group.png
diff --git a/doc/howto/usage/cluster/k8s-aws/create_efs.png b/doc/howto/usage/k8s/src/create_efs.png
similarity index 100%
rename from doc/howto/usage/cluster/k8s-aws/create_efs.png
rename to doc/howto/usage/k8s/src/create_efs.png
diff --git a/doc/howto/usage/cluster/k8s-aws/efs_mount.png b/doc/howto/usage/k8s/src/efs_mount.png
similarity index 100%
rename from doc/howto/usage/cluster/k8s-aws/efs_mount.png
rename to doc/howto/usage/k8s/src/efs_mount.png
diff --git a/doc/howto/usage/cluster/k8s/job.yaml b/doc/howto/usage/k8s/src/job.yaml
similarity index 100%
rename from doc/howto/usage/cluster/k8s/job.yaml
rename to doc/howto/usage/k8s/src/job.yaml
diff --git a/doc/howto/usage/cluster/k8s/k8s-paddle-arch.png b/doc/howto/usage/k8s/src/k8s-paddle-arch.png
similarity index 100%
rename from doc/howto/usage/cluster/k8s/k8s-paddle-arch.png
rename to doc/howto/usage/k8s/src/k8s-paddle-arch.png
diff --git a/doc/howto/usage/cluster/k8s-aws/managed_policy.png b/doc/howto/usage/k8s/src/managed_policy.png
similarity index 100%
rename from doc/howto/usage/cluster/k8s-aws/managed_policy.png
rename to doc/howto/usage/k8s/src/managed_policy.png
diff --git a/doc/howto/usage/cluster/k8s/start.sh b/doc/howto/usage/k8s/src/start.sh
similarity index 100%
rename from doc/howto/usage/cluster/k8s/start.sh
rename to doc/howto/usage/k8s/src/start.sh
diff --git a/doc/howto/usage/cluster/k8s/start_paddle.py b/doc/howto/usage/k8s/src/start_paddle.py
similarity index 100%
rename from doc/howto/usage/cluster/k8s/start_paddle.py
rename to doc/howto/usage/k8s/src/start_paddle.py
diff --git a/doc/tutorials/index_cn.md b/doc/tutorials/index_cn.md
index 97014d537655d21871295699381c5dd2106d0b56..6a27004d58d24cc466d930322be8cdbb2f434c74 100644
--- a/doc/tutorials/index_cn.md
+++ b/doc/tutorials/index_cn.md
@@ -2,6 +2,7 @@
 
 * [快速入门](quick_start/index_cn.rst)
 * [个性化推荐](rec/ml_regression_cn.rst)
+* [图像分类](image_classification/index_cn.md)
 * [情感分析](sentiment_analysis/index_cn.md)
 * [语义角色标注](semantic_role_labeling/index_cn.md)
 * [机器翻译](text_generation/index_cn.md)
@@ -9,3 +10,4 @@
 
 ## 常用模型
 * [ResNet模型](imagenet_model/resnet_model_cn.md)
+* [词向量模型](embedding_model/index_cn.md)
diff --git a/doc/tutorials/index_en.md b/doc/tutorials/index_en.md
index cce9d3a176a5e5c87e97c16362ec8a202e8eb80a..77331a703b6f0fdf92921ebcc476325b7327e976 100644
--- a/doc/tutorials/index_en.md
+++ b/doc/tutorials/index_en.md
@@ -7,6 +7,7 @@ There are several examples and demos here.
 * [Sentiment Analysis](sentiment_analysis/index_en.md)
 * [Semantic Role Labeling](semantic_role_labeling/index_en.md)
 * [Text Generation](text_generation/index_en.md)
+* [Image Auto-Generation](gan/index_en.md)
 
 ## Model Zoo
 * [ImageNet: ResNet](imagenet_model/resnet_model_en.md)
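A note on the RNN tutorial touched by this patch: its vanilla-RNN step function `x_{t+1} = W_x x_t + W_i I_t + b` can be illustrated outside the framework with plain Python. This is only a sketch of the math, not PaddlePaddle API; the names `rnn_step`, `run_rnn`, `mat_vec`, `W_x`, `W_i`, and `b` are made up for illustration.

```python
def mat_vec(W, v):
    # Multiply matrix W (a list of rows) by vector v.
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]

def rnn_step(x_t, i_t, W_x, W_i, b):
    # One step: x_{t+1} = W_x x_t + W_i I_t + b.
    # The output function simply returns x_{t+1} itself, as in the tutorial.
    wx = mat_vec(W_x, x_t)
    wi = mat_vec(W_i, i_t)
    return [a + c + d for a, c, d in zip(wx, wi, b)]

def run_rnn(inputs, x0, W_x, W_i, b):
    # Process a sequence from t = 1 to t = T, collecting each state.
    states, x = [], x0
    for i_t in inputs:
        x = rnn_step(x, i_t, W_x, W_i, b)
        states.append(x)
    return states

# With identity W_x and W_i and zero bias, the state accumulates the inputs.
I2 = [[1.0, 0.0], [0.0, 1.0]]
b0 = [0.0, 0.0]
out = run_rnn([[1.0, 0.0], [0.0, 2.0]], [0.0, 0.0], I2, I2, b0)
# out == [[1.0, 0.0], [1.0, 2.0]]
```

The real `recurrent_group` plays the role of `run_rnn` here: it wires the user-supplied step function to a Memory that feeds each state back in at the next time step.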
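The generation section of the same tutorial describes `beam_search(bos_id, eos_id, beam_size, max_length)` over the step function's softmax output. The idea can be sketched in a few lines of plain Python: keep the `beam_size` best partial sequences by log-probability, expand each with every candidate token, and finish a hypothesis at `eos_id` or `max_length`. This is a generic illustration under assumed names (`simple_beam_search`, `next_probs`), not PaddlePaddle's implementation.

```python
import math

def simple_beam_search(next_probs, bos_id, eos_id, beam_size, max_length):
    # next_probs(seq) stands in for the step function's softmax output:
    # a probability for each token given the partial sequence seq.
    beams = [([bos_id], 0.0)]          # (token sequence, log-probability)
    finished = []
    for _ in range(max_length):
        candidates = []
        for seq, score in beams:
            for tok, p in enumerate(next_probs(seq)):
                if p > 0.0:
                    candidates.append((seq + [tok], score + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:
            # Hypotheses ending in eos_id are done; the rest stay on the beam.
            (finished if seq[-1] == eos_id else beams).append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # unfinished beams at max_length still count
    return max(finished, key=lambda c: c[1])[0]

# Toy distribution over 3 tokens (0 = <s>, 1 = <e>, 2 = a word):
def next_probs(seq):
    return [0.0, 0.4, 0.6] if seq[-1] != 2 else [0.0, 0.9, 0.1]

print(simple_beam_search(next_probs, bos_id=0, eos_id=1, beam_size=2, max_length=5))
# prints [0, 2, 1]
```

Greedy search is the `beam_size=1` special case; a wider beam trades compute for a better chance of finding the globally highest-scoring sentence.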