Commit d85646d5 authored by julie

Add nmt_without_attention: English version of readme

Parent b0328bd1
......@@ -10,9 +10,9 @@ RNN-based neural machine translation follows the encoder-decoder architecture. A
The input and output of the neural machine translation model can be any of character, word or phrase. This example illustrates the word-based NMT.
- **Encoder**: Encodes the source language sentence into a vector that serves as input to the decoder. The original input of the encoder is the `id` sequence $w = {w_1, w_2, ..., w_T}$ of the words, expressed in one-hot encoding. To reduce the input dimensionality and to capture semantic associations between words, the model learns a word embedding (i.e. a word vector) for each one-hot encoded word. For more information about word vectors, please refer to the PaddleBook [word vector](https://github.com/PaddlePaddle/book/blob/develop/04.word2vec/README.cn.md) chapter. Finally, the RNN unit processes the input word by word to obtain the encoding vector of the complete sentence.
- **Decoder**: Accepts the output of the encoder and decodes the target language sequence $u = {u_1, u_2, ..., u_{T'}}$ one word at a time. At each time step, the RNN unit outputs a hidden vector, from which the conditional probability of the next target word is computed by `softmax` normalization, i.e. $P(u_t | w, u_1, u_2, ..., u_{t-1})$. Thus, given the input $w$, the probability of generating the corresponding translation result $u$ is:
$$P(u_1, u_2, ..., u_{T'} | w) = \prod_{t=1}^{T'} p(u_t | w, u_1, u_2, ..., u_{t-1})$$
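To make the factorization above concrete, here is a minimal NumPy sketch (not part of the original example; the vocabulary size, decoder scores, and `softmax` helper are illustrative assumptions) that scores a target sentence by summing per-step log probabilities:

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the vocabulary axis
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def sentence_log_prob(step_scores, target_ids):
    """Sum log P(u_t | w, u_1..u_{t-1}) over all time steps.

    step_scores: list of length T'; each entry is a vocabulary-sized score
                 vector produced by the decoder RNN at step t (assumed given).
    target_ids:  list of length T' with the id of the reference word u_t.
    """
    total = 0.0
    for scores, u_t in zip(step_scores, target_ids):
        total += np.log(softmax(scores)[u_t])
    return total

# toy usage: 3 decoding steps over a 5-word vocabulary
rng = np.random.default_rng(0)
scores = [rng.normal(size=5) for _ in range(3)]
print(sentence_log_prob(scores, [1, 4, 0]))  # a (negative) log probability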
......@@ -240,15 +240,7 @@ def event_handler(event):
event.pass_id, event.batch_id, event.cost, event.metrics))
```
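For reference, a slightly fuller handler along the same lines might also save model parameters at the end of every pass. This is a sketch assuming the PaddlePaddle v2 event API (`paddle.event.EndIteration` / `paddle.event.EndPass`) and a `parameters` object created earlier in the example; the snapshot file name is illustrative:

```python
import gzip

def event_handler(event):
    # print the cost every 10 batches
    if isinstance(event, paddle.event.EndIteration):
        if event.batch_id % 10 == 0:
            print("Pass %d, Batch %d, Cost %f, %s" % (
                event.pass_id, event.batch_id, event.cost, event.metrics))
    # save a snapshot of the parameters after each pass (illustrative)
    if isinstance(event, paddle.event.EndPass):
        with gzip.open('params_pass_%d.tar.gz' % event.pass_id, 'w') as f:
            parameters.to_tar(f)
```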
**d) Start training**
```python
# start training
trainer.train(
    reader=wmt14_reader, event_handler=event_handler, num_passes=2)
```
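For context, `trainer` is typically assembled from the network's cost layer, its parameters, and an optimizer before this call. A minimal sketch under the assumption of the PaddlePaddle v2 API (the `cost` variable and the optimizer settings here are illustrative):

```python
# create parameters from the network's cost layer
parameters = paddle.parameters.create(cost)

# the optimizer choice and learning rate are illustrative
optimizer = paddle.optimizer.Adam(learning_rate=1e-3)
trainer = paddle.trainer.SGD(
    cost=cost, parameters=parameters, update_equation=optimizer)
```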
......@@ -340,12 +332,12 @@ Elles connaissent leur entreprise mieux que personne .
-5.026885 They know their business better than anybody . <e>
```
- The first line is the input sentence in the source language.
- Lines 2 through beam_size + 1 are the `beam_size` translation results generated by beam search.
- Each translation result line is split into two columns by "\t": the first column is the log probability of the sentence, and the second column is the text of the translation result (see the parsing sketch below).
- The symbol `<s>` represents the beginning of a sentence, the symbol `<e>` indicates the end of a sentence, and any word not included in the dictionary is replaced with the symbol `<unk>`.
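A small sketch of how such output could be parsed (the helper and sample list below are hypothetical; the format follows the bullet points above):

```python
def parse_generation(lines, beam_size):
    """Split beam-search output into the source sentence and scored candidates."""
    source = lines[0].strip()
    candidates = []
    for line in lines[1:1 + beam_size]:
        log_prob, text = line.strip().split('\t', 1)
        candidates.append((float(log_prob), text))
    return source, candidates

# usage with the sample output shown above
sample = [
    "Elles connaissent leur entreprise mieux que personne .\n",
    "-5.026885\tThey know their business better than anybody . <e>\n",
]
src, cands = parse_generation(sample, beam_size=1)
print(src)
print(cands[0])  # (-5.026885, 'They know their business better than anybody . <e>')
```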
So far, we have implemented a basic machine translation model using PaddlePaddle. As we can see, PaddlePaddle provides a flexible and rich API that enables users to easily choose and combine various complex network configurations. NMT itself is a rapidly developing field in which new ideas continue to emerge. This example is a basic implementation of NMT; users can also implement more complex NMT models using PaddlePaddle.
## References
......