Develop (#997)

* add training mainbody doc (#994)
* add training mainbody doc
* fix en doc
* add finetune doc and fix vit typo (#996)

fa831306 · littletomatodonkey · GitHub · 2c7ce3c0 · fa831306 · fa831306
4 changed files
--- a/docs/en/application/mainbody_detection_en.md
+++ b/docs/en/application/mainbody_detection_en.md
@@ -22,7 +22,7 @@ The datasets we used for mainbody detection task are shown in the following tabl
 In the actual training process, all datasets are mixed together. Categories of all the labeled boxes are modified to the category `foreground`, and the detection model we trained just contains one category (`foreground`).
-## 2. Model Training
+## 2. Model Selection
 There are many types of object detection methods such as the commonly used two-stage detectors (FasterRCNN series, etc.), single-stage detectors (YOLO, SSD, etc.), anchor-free detectors (FCOS, etc.) and so on.
@@ -45,3 +45,131 @@ For more information about PP-YOLO, you can refer to [PP-YOLO tutorial](https://
 In the mainbody detection task, we use `ResNet50vd-DCN` as our backbone for better performance. The model is trained with the config file [ppyolov2_r50vd_dcn_365e_coco.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml), in which the dataset path is changed to the mainbody detection dataset.
 The final inference model can be downloaded [here](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/ppyolov2_r50vd_dcn_mainbody_v1.0_infer.tar).
+## 3. Model training
+This section describes how to train your own mainbody detection model on your own dataset using PaddleDetection.
+### 3.1 Prepare the environment
+Download PaddleDetection and install its requirements.
+```shell
+cd <path/to/clone/PaddleDetection>
+git clone https://github.com/PaddlePaddle/PaddleDetection.git
+cd PaddleDetection
+# install requirements
+pip install -r requirements.txt
+```
+For more installation details, please refer to the [Installation tutorial](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/INSTALL.md).
+### 3.2 Prepare the dataset
+A customized dataset should first be converted to COCO format. Please refer to the [Customized dataset tutorial](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/static/docs/tutorials/Custom_DataSet.md) to build your own dataset in COCO format.
+In the mainbody detection task, all objects belong to the foreground. Therefore, the `category_id` of every object in the annotation file should be set to 1, and the `categories` map should be changed as follows, so that it contains only the class `foreground`.
+```json
+[{"id": 1, "name": "foreground", "supercategory": "foreground"}]
+```
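The relabeling described in 3.2 can be sketched in Python as follows. This is a minimal illustration, assuming the annotation file uses the standard COCO layout with top-level `annotations` and `categories` lists; the function name `to_foreground_only` and the sample data are hypothetical and are not part of PaddleDetection.

```python
import json


def to_foreground_only(coco: dict) -> dict:
    """Return a copy of a COCO annotation dict that has one `foreground` class."""
    converted = dict(coco)
    # Every labeled box is reassigned to category 1.
    converted["annotations"] = [
        {**ann, "category_id": 1} for ann in coco.get("annotations", [])
    ]
    # The category map keeps only the single foreground entry.
    converted["categories"] = [
        {"id": 1, "name": "foreground", "supercategory": "foreground"}
    ]
    return converted


# Hypothetical sample with two original categories.
sample = {
    "images": [{"id": 1, "file_name": "a.jpg"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 17, "bbox": [0, 0, 10, 10]},
        {"id": 2, "image_id": 1, "category_id": 3, "bbox": [5, 5, 20, 20]},
    ],
    "categories": [
        {"id": 3, "name": "car", "supercategory": "vehicle"},
        {"id": 17, "name": "cat", "supercategory": "animal"},
    ],
}

converted = to_foreground_only(sample)
print(json.dumps(converted["categories"]))
```

For a real dataset, the dict would be loaded with `json.load` from the annotation file and written back with `json.dump`; the input dict itself is left unmodified.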
+### 3.3 Configuration files
+You can use `configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml` to train the model. More details are as follows.
+<div align='center'>
+  <img src='../../images/det/PaddleDetection_config.png' width='400'/>
+</div>
+`ppyolov2_r50vd_dcn_365e_coco.yml` depends on other configuration files, whose meanings are as follows.
+```
+coco_detection.yml: num_classes of the model, and the train/eval/test datasets.
+runtime.yml: public runtime parameters, such as use_gpu and save_interval.
+optimizer_365e.yml: learning rate and optimizer.
+ppyolov2_r50vd_dcn.yml: model architecture.
+ppyolov2_reader.yml: train/eval/test reader.
+```
+In the mainbody detection task, you need to set `num_classes` in `datasets/coco_detection.yml` to 1 (only `foreground` is included). The dataset paths should also be updated.
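The `num_classes` change can also be scripted. The sketch below is a plain text substitution that avoids a YAML dependency; the helper name `set_num_classes` and the sample config text are assumptions for illustration, so check the actual contents of `coco_detection.yml` before relying on it.

```python
import re


def set_num_classes(config_text: str, num_classes: int = 1) -> str:
    """Replace the value of a `num_classes:` line in YAML text."""
    pattern = re.compile(r"^([ \t]*num_classes[ \t]*:[ \t]*)\d+", flags=re.MULTILINE)
    # A function replacement keeps the key and spacing and swaps only the number.
    updated, count = pattern.subn(lambda m: f"{m.group(1)}{num_classes}", config_text)
    if count == 0:
        raise ValueError("no `num_classes` entry found in the config text")
    return updated


# Hypothetical excerpt; the real file also defines the dataset paths.
sample_config = "metric: COCO\nnum_classes: 80\n"
print(set_num_classes(sample_config))
```

The dataset paths still have to be edited by hand, since they depend on where your data lives.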
+### 3.4 Begin the training process
+PaddleDetection supports several training modes.
+* Training with a single GPU
+```bash
+# not needed on Windows and Mac
+export CUDA_VISIBLE_DEVICES=0
+python tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml
+```
+* Training with multiple GPUs
+```bash
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --eval
+```
+`--eval`: evaluate during training.
+* (**Recommended**) Model finetune
+If you want to finetune the model on your own dataset, you can run the following command.
+```bash
+export CUDA_VISIBLE_DEVICES=0
+# assign pretrain_weights, load the general mainbody-detection pretrained model
+python tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml -o pretrain_weights=https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/ppyolov2_r50vd_dcn_mainbody_v1.0_pretrained.pdparams
+```
+* Resume training: you can use `-r` to load a checkpoint and resume training.
+```bash
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --eval -r output/ppyolov2_r50vd_dcn_365e_coco/10000
+```
+Note:
+If an `out of memory` error occurs, try decreasing `batch_size` in `ppyolov2_reader.yml`.
+### 3.5 Model prediction
+Use the following command to run prediction.
+```bash
+export CUDA_VISIBLE_DEVICES=0
+python tools/infer.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --infer_img=your_image_path.jpg --output_dir=infer_output/ --draw_threshold=0.5 -o weights=output/ppyolov2_r50vd_dcn_365e_coco/model_final
+```
+`--draw_threshold` is an optional parameter.
+### 3.6 Export model and inference
+Use the following command to export the inference model.
+```bash
+python tools/export_model.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --output_dir=./inference -o weights=output/ppyolov2_r50vd_dcn_365e_coco/model_final.pdparams
+```
+The inference model will be saved in the folder `inference/ppyolov2_r50vd_dcn_365e_coco`, which contains `model.pdiparams`, `model.pdiparams.info`, `model.pdmodel` and `infer_cfg.yml` (optional for mainbody detection).
+* Note: The inference model files that `PaddleDetection` exports are named `model.xxx`. If you want to keep them consistent with `PaddleClas`, you can rename `model.xxx` to `inference.xxx` for subsequent inference.
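The renaming in the note above can be done by hand or with a few lines of Python. The following is a minimal sketch using only the standard library; the helper name `rename_exported_model` is hypothetical, and the demo runs on empty placeholder files in a temporary directory rather than on a real export.

```python
import tempfile
from pathlib import Path


def rename_exported_model(model_dir, old_stem="model", new_stem="inference"):
    """Rename `model.*` files in model_dir to `inference.*`; return the new names."""
    renamed = []
    # sorted() materializes the listing before any file is renamed.
    for path in sorted(Path(model_dir).iterdir()):
        # Split at the first dot so `model.pdiparams.info` keeps its full suffix.
        stem, dot, rest = path.name.partition(".")
        if path.is_file() and stem == old_stem and dot:
            target = path.with_name(f"{new_stem}.{rest}")
            path.rename(target)
            renamed.append(target.name)
    return renamed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("model.pdiparams", "model.pdiparams.info",
                     "model.pdmodel", "infer_cfg.yml"):
            (Path(tmp) / name).touch()
        print(rename_exported_model(tmp))
```

Files such as `infer_cfg.yml` are left untouched because their stem is not `model`.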
+For more details on model export, please refer to [EXPORT_MODEL](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/deploy/EXPORT_MODEL.md).
+You now have a detection model trained on your own dataset. In the recognition process, you can replace the detection model path with yours. For a quick start with the recognition process, please refer to the [tutorial](../tutorials/quick_start_recognition_en.md).
--- a/docs/images/det/PaddleDetection_config.png
+++ b/docs/images/det/PaddleDetection_config.png
--- a/docs/zh_CN/application/mainbody_detection.md
+++ b/docs/zh_CN/application/mainbody_detection.md
@@ -20,7 +20,7 @@
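 During actual training, all datasets are mixed together. Since this is mainbody detection, the category of every labeled box is changed to "foreground", so the merged dataset contains only one category: foreground.
-## 2. Model Training
+## 2. Model Selection
 There are many kinds of object detection methods; commonly used ones include two-stage detectors (such as the FasterRCNN series), single-stage detectors (such as YOLO and SSD), and anchor-free detectors (such as FCOS).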
@@ -41,3 +41,130 @@ PP-YOLO由[PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection)提
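 In the mainbody detection task, to ensure detection quality, we use the ResNet50vd-DCN backbone and the config file [ppyolov2_r50vd_dcn_365e_coco.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml), switch the dataset to the custom mainbody detection dataset, and train to obtain the final detection model.
 The inference model for mainbody detection can be downloaded here: [link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/ppyolov2_r50vd_dcn_mainbody_v1.0_infer.tar).
+## 3. Model Training
+This section describes how to train a mainbody detection model on your own dataset with PaddleDetection.
+### 3.1 Environment Preparation
+Download the PaddleDetection code and install the requirements.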
+```shell
+cd <path/to/clone/PaddleDetection>
+git clone https://github.com/PaddlePaddle/PaddleDetection.git
+cd PaddleDetection
+# install other dependencies
+pip install -r requirements.txt
+```
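+For more installation instructions, please refer to the [installation document](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/INSTALL_cn.md).
+### 3.2 Data Preparation
+For a custom dataset, first convert it to COCO format; you can refer to the [custom detection dataset tutorial](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/static/docs/tutorials/Custom_DataSet.md) to build a COCO-format dataset.
+In the mainbody detection task, all detection boxes belong to the foreground. The `category_id` of every box in the annotation file therefore needs to be set to 1, and the `categories` map of the annotation file changed to the following format, so that the whole category map contains only the `foreground` category.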
+```json
+[{"id": 1, "name": "foreground", "supercategory": "foreground"}]
+```
+### 3.3 Configuration File Changes and Description
+We use the `configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml` config for training. A summary of the config file is shown below:
+<div align='center'>
+  <img src='../../images/det/PaddleDetection_config.png' width='400'/>
+</div>
+As the figure above shows, `ppyolov2_r50vd_dcn_365e_coco.yml` depends on other configuration files, whose meanings are as follows:
+```
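+coco_detection.yml: paths of the training and validation data
+runtime.yml: common runtime parameters, such as whether to use the GPU and how many epochs pass between checkpoint saves
+optimizer_365e.yml: learning rate and optimizer configuration
+ppyolov2_r50vd_dcn.yml: model and backbone network
+ppyolov2_reader.yml: data reader configuration, such as batch size and the number of concurrent loading subprocesses, plus post-read preprocessing such as resize and data augmentation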
+```
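+In the mainbody detection task, set the `num_classes` parameter in `datasets/coco_detection.yml` to 1 (there is only one foreground category), and change the training and test set paths to those of your custom dataset.
+You can also modify the files above to suit your situation; for example, if GPU memory overflows, you can reduce the batch size and the learning rate proportionally.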
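The proportional reduction of batch size and learning rate mentioned above amounts to a linear scaling rule. A minimal Python sketch follows; the helper name `scale_learning_rate` and the example numbers are illustrative assumptions, so read the real base values from `optimizer_365e.yml` and `ppyolov2_reader.yml`.

```python
def scale_learning_rate(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Scale the learning rate by the same factor as the batch size."""
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch_size / base_batch_size


# Illustrative values only: halving the batch size halves the learning rate.
new_lr = scale_learning_rate(base_lr=0.005, base_batch_size=12, new_batch_size=6)
print(new_lr)
```

With multiple GPUs, the effective batch size is the per-GPU `batch_size` times the number of GPUs, so the same ratio applies when the GPU count changes.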
+### 3.4 Start Training
+PaddleDetection provides single-GPU and multi-GPU training modes to meet different training needs.
+* Single-GPU training
+```bash
+# not needed on Windows and Mac
+export CUDA_VISIBLE_DEVICES=0
+python tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml
+```
+* Multi-GPU training
+```bash
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --eval
+```
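+`--eval`: evaluate while training.
+* (**Recommended**) Model fine-tuning
+If you want to load the trained mainbody detection model provided by PaddleClas and fine-tune it on your own dataset, you can train with the following command.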
+```bash
+export CUDA_VISIBLE_DEVICES=0
+# set pretrain_weights to load the general mainbody detection pretrained model
+python tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml -o pretrain_weights=https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/ppyolov2_r50vd_dcn_mainbody_v1.0_pretrained.pdparams
+```
+* Resuming training
+If training is interrupted for some reason, you can resume it with the `-r` option:
+```bash
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --eval -r output/ppyolov2_r50vd_dcn_365e_coco/10000
+```
+Note: if you encounter an "`Out of memory error`", try reducing `batch_size` in `ppyolov2_reader.yml`.
+### 3.5 Model Prediction and Debugging
+Use the following command to run prediction with PaddleDetection.
+```bash
+export CUDA_VISIBLE_DEVICES=0
+python tools/infer.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --infer_img=your_image_path.jpg --output_dir=infer_output/ --draw_threshold=0.5 -o weights=output/ppyolov2_r50vd_dcn_365e_coco/model_final
+```
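+`--draw_threshold` is an optional parameter. Depending on the [NMS](https://ieeexplore.ieee.org/document/1699659) computation, different thresholds produce different results. `keep_top_k` sets the maximum number of output objects; its default value is 100, and you can set it according to your needs.
+### 3.6 Model Export and Inference Deployment
+Run the model export script: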
+```bash
+python tools/export_model.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --output_dir=./inference -o weights=output/ppyolov2_r50vd_dcn_365e_coco/model_final.pdparams
+```
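+The inference model is exported to the `inference/ppyolov2_r50vd_dcn_365e_coco` directory, which contains `infer_cfg.yml` (not needed for prediction), `model.pdiparams`, `model.pdiparams.info`, and `model.pdmodel`.
+Note: the inference model files exported by `PaddleDetection` are named `model.xxx`. If you want them to be consistent with the PaddleClas inference model file naming, rename the `model.xxx` files to `inference.xxx` for the subsequent mainbody detection deployment.
+For more on model export, please refer to [EXPORT_MODEL](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/deploy/EXPORT_MODEL.md).
+After exporting the model, you can change the detection model path to this inference model path in the mainbody detection and recognition task to run prediction. For a quick start with image recognition, see the [image recognition quick start tutorial](../tutorials/quick_start_recognition.md).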
--- a/ppcls/configs/ImageNet/VisionTransformer/ViT_base_patch16_384.yaml
+++ b/ppcls/configs/ImageNet/VisionTransformer/ViT_base_patch16_384.yaml
@@ -16,7 +16,7 @@ Global:
 # model architecture
 Arch:
-  name: ViT_base_patch16_224
+  name: ViT_base_patch16_384
  class_num: 1000
 # loss function config for training/eval process