diff --git a/docs/zh_CN/extension/train_with_DALI.md b/docs/zh_CN/extension/train_with_DALI.md
index 6e7e977c937ea3473c9dd7a085cc0982fd685afc..298deafa224de787ac11fb4fb7b887d59e545a95 100644
--- a/docs/zh_CN/extension/train_with_DALI.md
+++ b/docs/zh_CN/extension/train_with_DALI.md
@@ -49,8 +49,14 @@ python -m paddle.distributed.launch \
 
 ## Training with FP16
 
-Building on the setup above, training in FP16 half precision can further improve speed; just add the field `AMP.use_pure_fp16=True` to the training launch command:
+Building on the setup above, training in FP16 half precision can further improve speed; refer to the configuration and launch command below.
 
 ```shell
-python tools/static/train.py -c configs/ResNet/ResNet50.yaml -o use_dali=True -o AMP.use_pure_fp16=True
+export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
+export FLAGS_fraction_of_gpu_memory_to_use=0.8
+
+python -m paddle.distributed.launch \
+    --gpus="0,1,2,3,4,5,6,7" \
+    tools/static/train.py \
+    -c configs/ResNet/ResNet50_fp16.yaml
 ```
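
The new command writes the same device list twice, once in `CUDA_VISIBLE_DEVICES` and once in `--gpus`. One possible way to keep the two from drifting apart is to define the list once, as in the sketch below. It is illustrative only: `GPUS` and `build_launch_cmd` are hypothetical names that are not part of PaddleClas, and the script only prints the command instead of launching training. It also assumes the launcher accepts the same IDs that are exported as visible devices, which is worth checking against the installed Paddle version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: GPUS and build_launch_cmd are illustrative names,
# not part of PaddleClas or Paddle.
GPUS="${GPUS:-0,1,2,3,4,5,6,7}"

# Define the device list once and reuse it for both settings.
export CUDA_VISIBLE_DEVICES="${GPUS}"
export FLAGS_fraction_of_gpu_memory_to_use=0.8

# Assemble the launch command from the diff as a single line of text.
build_launch_cmd() {
    echo "python -m paddle.distributed.launch" \
        "--gpus=${GPUS}" \
        "tools/static/train.py" \
        "-c configs/ResNet/ResNet50_fp16.yaml"
}

# Dry run: print the command for inspection instead of executing it.
build_launch_cmd
```

Running it with a different list, for example `GPUS=0,1,2,3 bash launch_fp16.sh` (the script name is an assumption), prints the same command with both settings narrowed to four devices.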