diff --git a/docs/GETTING_STARTED.md b/docs/GETTING_STARTED.md
index 843b119d4484c8ab607b868919b3ee2b78aed066..788f7f408873a5caeee542129c0457111af40ebb 100644
--- a/docs/GETTING_STARTED.md
+++ b/docs/GETTING_STARTED.md
@@ -41,6 +41,8 @@ python tools/train.py -c configs/faster_rcnn_r50_1x.yml -o use_gpu=false
 - `-o`: Set configuration options in config file. Such as: `-o max_iters=180000`. `-o` has higher priority to file configured by `-c`
 - `--use_tb`: Whether to record the data with [tb-paddle](https://github.com/linshuliang/tb-paddle), so as to display in Tensorboard, default is `False`
 - `--tb_log_dir`: tb-paddle logging directory for scalar, default is `tb_log_dir/scalar`
+- `--fp16`: Whether to enable mixed precision training (requires GPU), default is `False`
+- `--loss_scale`: Loss scaling factor for mixed precision training, default is `8.0`
 
 
 ##### Examples
@@ -109,7 +111,7 @@ python tools/eval.py -c configs/faster_rcnn_r50_1x.yml
 
 #### Examples
 
-- Evaluate by specified weights path and dataset path 
+- Evaluate by specified weights path and dataset path
 ```bash
 # run on GPU with:
 export PYTHONPATH=$PYTHONPATH:.
@@ -183,7 +185,7 @@ python tools/infer.py -c configs/faster_rcnn_r50_1x.yml \
                       --use_tb=Ture
 ```
 
-The visualization files are saved in `output` by default, to specify a different path, simply add a `--output_dir=` flag.  
+The visualization files are saved in `output` by default; to specify a different path, add a `--output_dir=` flag.
 `--draw_threshold` is an optional argument. Default is 0.5.
 Different thresholds will produce different results depending on the calculation of [NMS](https://ieeexplore.ieee.org/document/1699659).
 If users want to infer according to customized model path, `-o weights` can be set for specified path.
@@ -208,12 +210,12 @@ Save inference model by set `--save_inference_model`, which can be loaded by Pad
 
 **Q:**  Why do I get `NaN` loss values during single GPU training? </br>
 **A:**  The default learning rate is tuned to multi-GPU training (8x GPUs), it must
-be adapted for single GPU training accordingly (e.g., divide by 8).  
-The calculation rules are as follows，they are equivalent: </br>  
+be adapted for single GPU training accordingly (e.g., divide by 8).
+The calculation rules are as follows; they are equivalent: </br>
 
 
-| GPU number  | Learning rate  | Max_iters | Milestones       |  
-| :---------: | :------------: | :-------: | :--------------: |  
+| GPU number  | Learning rate  | Max_iters | Milestones       |
+| :---------: | :------------: | :-------: | :--------------: |
 | 2           | 0.0025         | 720000    | [480000, 640000] |
 | 4           | 0.005          | 360000    | [240000, 320000] |
 | 8           | 0.01           | 180000    | [120000, 160000] |
diff --git a/docs/GETTING_STARTED_cn.md b/docs/GETTING_STARTED_cn.md
index e2817354905eebe2cd1ee2d2972b622e159375ce..2f0dff5fedc7864842816a1bbfc84dd34cef1108 100644
--- a/docs/GETTING_STARTED_cn.md
+++ b/docs/GETTING_STARTED_cn.md
@@ -42,6 +42,8 @@ python tools/train.py -c configs/faster_rcnn_r50_1x.yml -o use_gpu=false
 - `-o`: 设置配置文件里的参数内容。例如: `-o max_iters=180000`。使用`-o`配置相较于`-c`选择的配置文件具有更高的优先级。
 - `--use_tb`: 是否使用[tb-paddle](https://github.com/linshuliang/tb-paddle)记录数据，进而在TensorBoard中显示，默认是False。
 - `--tb_log_dir`: 指定 tb-paddle 记录数据的存储路径，默认是`tb_log_dir/scalar`。
+- `--fp16`: 是否使用混合精度训练模式（需GPU训练），默认是`False`。
+- `--loss_scale`: 设置混合精度训练模式中损失值的缩放比例，默认是`8.0`。
 
 ##### 例子
 
@@ -184,7 +186,7 @@ python tools/infer.py -c configs/faster_rcnn_r50_1x.yml \
 ```
 
 
-可视化文件默认保存在`output`中，可通过`--output_dir=`指定不同的输出路径。  
+可视化文件默认保存在`output`中，可通过`--output_dir=`指定不同的输出路径。
 `--draw_threshold` 是个可选参数. 根据 [NMS](https://ieeexplore.ieee.org/document/1699659) 的计算，
 不同阈值会产生不同的结果。如果用户需要对自定义路径的模型进行推断，可以设置`-o weights`指定模型路径。
 `--use_tb`是个可选参数，当为`True`时，可使用 TensorBoard 来可视化参数的变化趋势和图片。
@@ -205,12 +207,12 @@ python tools/infer.py -c configs/faster_rcnn_r50_1x.yml --infer_img=demo/0000005
 ## FAQ
 
 **Q:**  为什么我使用单GPU训练loss会出`NaN`? </br>
-**A:**  默认学习率是适配多GPU训练(8x GPU)，若使用单GPU训练，须对应调整学习率（例如，除以8）。  
-计算规则表如下所示，它们是等价的: </br>  
+**A:**  默认学习率是适配多GPU训练(8x GPU)，若使用单GPU训练，须对应调整学习率（例如，除以8）。
+计算规则表如下所示，它们是等价的: </br>
 
 
-| GPU数  | 学习率  | 最大轮数 | 变化节点       |  
-| :---------: | :------------: | :-------: | :--------------: |  
+| GPU数  | 学习率  | 最大轮数 | 变化节点       |
+| :---------: | :------------: | :-------: | :--------------: |
 | 2           | 0.0025         | 720000    | [480000, 640000] |
 | 4           | 0.005          | 360000    | [240000, 320000] |
 | 8           | 0.01           | 180000    | [120000, 160000] |