modify classification inference func and readme (#4116)

9b50a73c · littletomatodonkey · ruri · 76f0eeae
7 changed files
--- a/PaddleCV/image_classification/README.md
+++ b/PaddleCV/image_classification/README.md
@@ -16,7 +16,6 @@
    - [Mixed-precision training](#混合精度训练)
    - [Performance analysis](#性能分析)
    - [DALI preprocessing](#DALI预处理)
-    - [Custom datasets](#自定义数据集)
 - [Released models and their performance](#已发布模型及其性能)
 - [FAQ](#faq)
 - [References](#参考文献)
@@ -77,19 +76,34 @@ val/ILSVRC2012_val_00000001.jpeg 65
 ### Model training
 Once the data is prepared, training can be started as follows:
-```
+```bash
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+export FLAGS_fraction_of_gpu_memory_to_use=0.98
 python train.py \
-       --model=ResNet50 \
+        --data_dir=./data/ILSVRC2012/ \
-       --batch_size=256 \
+        --total_images=1281167 \
-       --total_images=1281167 \
+        --class_dim=1000 \
-       --class_dim=1000 \
+        --validate=True \
-       --image_shape=3,224,224 \
+        --model=ResNet50_vd \
-       --model_save_dir=output/ \
+        --batch_size=256 \
-       --lr_strategy=piecewise_decay \
+        --lr_strategy=cosine_decay \
-       --lr=0.1
+        --lr=0.1 \
+        --num_epochs=200 \
+        --model_save_dir=output/ \
+        --l2_decay=7e-5 \
+        --use_mixup=True \
+        --use_label_smoothing=True \
+        --label_smoothing_epsilon=0.1
 ```
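-Note: when passing a list-type argument such as step_epochs, drop the "=", e.g. --step_epochs 10 20 30
+Notes:
+- When passing a list-type argument such as step_epochs, drop the "=", e.g. --step_epochs 10 20 30
+- To train on your own dataset, set `data_dir`, `total_images`, and `class_dim` to match it; if you have to reduce `batch_size` because GPU memory is insufficient, scale `lr` linearly with `batch_size`.
+- To train a different model, change the `model` argument, or start from that model's default launch script in the `scripts/train/` folder and adapt it.
 Alternatively, start training with run.sh: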
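@@ -100,13 +114,14 @@ bash run.sh train model_name
 **Multi-process training:**
 If you have multiple GPUs, we strongly recommend training in multi-process mode, which greatly speeds up training. Launch it as follows: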
-```
+```bash
 CUDA_VISIBLE_DEVICES=0,1,2,3 python -m paddle.distributed.launch train.py \
       --model=ResNet50 \
       --batch_size=256 \
       --total_images=1281167 \
       --class_dim=1000 \
-       --image_shape=3,224,224 \
+       --image_shape 3 224 224 \
       --model_save_dir=output/ \
       --lr_strategy=piecewise_decay \
       --reader_thread=4 \
@@ -196,26 +211,51 @@ CUDA_VISIBLE_DEVICES=0,1,2,3 python -m paddle.distributed.launch train.py \
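 Fine-tuning (Finetune) means adjusting the parameters of an already-trained model for a specific task. Download a model from [Released models and their performance](#已发布模型及其性能), set ```path_to_pretrain_model``` to the path of the model, and fine-tune it with the following command: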
 ```bash
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+export FLAGS_fraction_of_gpu_memory_to_use=0.98
 python train.py \
-       --model=model_name \
+        --data_dir=./data/ILSVRC2012/ \
-       --pretrained_model=${path_to_pretrain_model}
+        --total_images=1281167 \
+        --class_dim=1000 \
+        --validate=True \
+        --model=ResNet50_vd \
+        --batch_size=256 \
+        --lr=0.1 \
+        --num_epochs=200 \
+        --model_save_dir=output/ \
+        --l2_decay=7e-5 \
+        --pretrained_model=${path_to_pretrain_model} \
+        --finetune_exclude_pretrained_params=fc_0.w_0,fc_0.b_0
 ```
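-Note: add and adjust other parameters according to the specific model and task
+Notes:
+- When fine-tuning on your own dataset, set `data_dir`, `total_images`, and `class_dim` to match it.
+- The loaded parameters come from a model pretrained on ImageNet1000. Even for the same architecture, the number or meaning of the output classes may differ, so the final FC layer must be filtered out when loading the pretrained parameters; otherwise loading may fail with a **dimension mismatch** error.
 ### Model evaluation
-Evaluation (Eval) measures the performance metrics of a trained model. Download a model from [Released models and their performance](#已发布模型及其性能) and set ```path_to_pretrain_model``` to the path of the model. Run the following command to get the model's top-1/top-5 accuracy:
+Evaluation (Eval) measures the performance metrics of a trained model. Download a model from [Released models and their performance](#已发布模型及其性能), set ```path_to_pretrain_model``` to the path of the model and ```json_path``` to the path where the metrics are saved. Run the following command to get the model's top-1/top-5 accuracy.
 **Parameters**
 * **save_json_path**: whether to save the eval results to a json file; default: None
+* `model`: model name; must match the pretrained model.
+* `batch_size`: number of images evaluated per minibatch.
+* `data_dir`: data path. Note: this path must contain both the **image files** to evaluate and the **mapping text file** that pairs each image with its class label; the text file must be named `val.txt`.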
 ```bash
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+export FLAGS_fraction_of_gpu_memory_to_use=0.98
 python eval.py \
-       --model=model_name \
+       --model=ResNet50_vd \
-       --pretrained_model=${path_to_pretrain_model}
+       --pretrained_model=${path_to_pretrain_model} \
+       --data_dir=./data/ILSVRC2012/ \
+       --save_json_path=${json_path} \
+       --batch_size=256
 ```
-Note: add and adjust other parameters according to the specific model and task
 ### Model evaluation with exponential moving average
@@ -227,32 +267,89 @@ python ema_clean.py \
       --cleaned_model_dir=your_cleaned_model_dir
 python eval.py \
-       --model=model_name \
+       --model=ResNet50_vd \
       --pretrained_model=your_cleaned_model_dir
 ```
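-### Model inference
+### Model inference with fluid
-Inference (Infer) produces a model's prediction scores or image features. Download a model from [Released models and their performance](#已发布模型及其性能) and set ```path_to_pretrain_model``` to the path of the model. Run the following command to get the predictions:
+Inference (Infer) produces a model's prediction scores or image features. Download a model from [Released models and their performance](#已发布模型及其性能), set ```path_to_pretrain_model``` to the path of the model, ```test_res_json_path``` to the text file where the predictions are saved, and ```image_path``` to the path of the image to predict or of the folder containing the list of images.
 **Parameters:**
-* **save_inference**: whether to save the binary model; default: False
+* **save_inference**: whether to save the binary model; default: `False`
 * **topk**: number of results to return, with labels sorted by confidence from high to low; default: 1
-* **class_map_path**: path of the readable label file; default: "./utils/tools/readable_label.txt"
+* **class_map_path**: path of the readable label file; default: `./utils/tools/readable_label.txt`
-* **image_path**: single file to run inference on; default: None
+* **image_path**: single file to run inference on; default: `None`
-* **save_json_path**: save the predictions to a json file; default: None
+* **save_json_path**: save the predictions to a json file; default: `test_res.json`
+#### Single-image inference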
+```bash
+export CUDA_VISIBLE_DEVICES=0
+python infer.py \
+        --model=ResNet50_vd \
+        --pretrained_model=${path_to_pretrain_model} \
+        --class_map_path=./utils/tools/readable_label.txt \
+        --image_path=${image_path} \
+        --save_json_path=${test_res_json_path}
+```
+#### Inference on a list of images
+* In this case, the ```data_dir``` path must be specified.
+```bash
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python infer.py \
+        --model=ResNet50_vd \
+        --pretrained_model=${path_to_pretrain_model} \
+        --class_map_path=./utils/tools/readable_label.txt \
+        --data_dir=./data/ILSVRC2012/ \
+        --save_json_path=${test_res_json_path}
+```
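+Notes:
+- The model name must match the one used when the model was trained.
+- Inference assumes the ImageNet1000 classes by default; the map between predicted indices and readable labels is stored in `./utils/tools/readable_label.txt`. For custom data, specify the `--class_map_path` argument.
+### Python inference API
+* Fluid provides a highly optimized C++ inference library; for convenience, Paddle also provides a Python interface to it. For a more detailed introduction to the Python inference API, see: [https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_usage/deploy/inference/python_infer_cn.html](https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_usage/deploy/inference/python_infer_cn.html)
+* Inference with the Python inference API takes two steps, model conversion and model inference, described below.
+#### Model conversion
+* First convert the saved fluid model into a binary model as follows, where ```path_to_pretrain_model``` is the path of the pretrained model.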
 ```bash
 python infer.py \
-       --model=model_name \
+        --model=ResNet50_vd \
-       --pretrained_model=${path_to_pretrain_model}
+        --pretrained_model=${path_to_pretrain_model} \
-       --image_path=${path_to_single_image}
+        --save_inference=True
 ```
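-Note: add and adjust other parameters according to the specific model and task
-Inference assumes the ImageNet1000 classes by default; the map between predicted indices and readable labels is stored in /utils/tools/readable_label.txt. For custom data, specify the --class_map_path argument
+Notes:
+- The pretrained model and the model name must match.
+- The conversion uses the `save_inference_model` function. Its `feeded_var_names` argument lists the names of all variables that must be fed with data at inference time; its `target_vars` argument lists all output variables of the model, from which the predictions are obtained.
+- After conversion, the `model` and `params` files are generated in the `ResNet50_vd` folder.
+#### Model inference
+Using the converted binary model files, inference through the Python API works as follows, where ```model_file``` is the path of the model file, ```params_file``` is the path of the params file, and ```image_path``` is the path of the image file.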
+```bash
+python predict.py \
+        --model_file=./ResNet50_vd/model \
+        --params_file=./ResNet50_vd/params \
+        --image_path=${image_path} \
+        --gpu_id=0 \
+        --gpu_mem=1024
+```
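+Notes:
+- Only a script for single-image inference is provided here; to run inference on multiple images in a folder, you need to modify `predict.py` yourself.
+- The `gpu_id` argument selects the GPU to use, and the `gpu_mem` argument sets the initially allocated GPU memory.
 ## Advanced usage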
@@ -335,17 +432,6 @@ python -m paddle.distributed.launch train.py \
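 2. Nvidia DALI requires a git version after [#1371](https://github.com/NVIDIA/DALI/pull/1371). See [this document](https://docs.nvidia.com/deeplearning/sdk/dali-master-branch-user-guide/docs/installation.html) to install the nightly version or build from source.
 3. Because DALI preprocesses images on the GPU and needs part of the GPU memory, adjust the `FLAGS_fraction_of_gpu_memory_to_use` environment variable (e.g. `0.8`) to reserve some memory for DALI.
-### Custom datasets
-PaddlePaddle/Models ImageClassification supports custom data
-1. Organize the custom data and adjust the data reader so that it feeds the data correctly
-2. Remember to change, in the training script,
--data_dim to the number of classes in the custom data
--total_image to the number of images
-3. When fine-tuning,
-specify --pretrained_model to load the pretrained model. Note: this model zoo provides models pretrained on the 1000-class ImageNet data; when using data with a different number of classes, delete the fc_weight and fc_offset parameters from the pretrained model
 ## Released models and their performance
 The table lists the image classification models currently supported under the models directory, together with the top-1 and top-5 accuracy of the trained models on the ImageNet-2012 validation set and the inference time of Paddle Fluid and Paddle TensorRT based on dynamic link libraries (tested on an NVIDIA® Tesla® P4 GPU).
@@ -459,7 +545,7 @@ PaddlePaddle/Models ImageClassification supports custom data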
    ```bash
    python infer.py \
-           --model=model_name \
+           --model=ResNet50_vd \
           --pretrained_model=${path_to_pretrain_model} \
           --save_inference=True
    ```
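
The README note in this diff says that `lr` should be adjusted linearly when `batch_size` changes. A minimal sketch of that arithmetic, assuming the reference pair from the training command (`batch_size=256`, `lr=0.1`); the helper name `scale_lr` is hypothetical and not part of the repository:

```python
def scale_lr(new_batch_size, base_lr=0.1, base_batch_size=256):
    """Linearly rescale the learning rate for a different batch size.

    The defaults mirror the README command (batch_size=256, lr=0.1);
    they are assumptions for illustration, not values read from train.py.
    """
    if new_batch_size <= 0:
        raise ValueError("batch size must be positive")
    return base_lr * new_batch_size / base_batch_size


if __name__ == "__main__":
    # Halving the batch size halves the learning rate, and so on.
    for bs in (256, 128, 64):
        print("batch_size={0} -> lr={1:.4f}".format(bs, scale_lr(bs)))
```

The resulting value would then be passed as `--lr` next to the reduced `--batch_size`.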

--- a/PaddleCV/image_classification/eval.py
+++ b/PaddleCV/image_classification/eval.py
@@ -23,6 +23,7 @@ import math
 import numpy as np
 import argparse
 import functools
+import logging
 import paddle
 import paddle.fluid as fluid

--- a/PaddleCV/image_classification/infer.py
+++ b/PaddleCV/image_classification/infer.py
@@ -30,6 +30,7 @@ import paddle
 import paddle.fluid as fluid
 import reader
 import models
+import json
 from utils import *
 parser = argparse.ArgumentParser(description=__doc__)
@@ -54,7 +55,7 @@ add_arg('padding_type',     str,  "SAME",               "Padding type of convolu
 add_arg('use_se',           bool, True,                 "Whether to use Squeeze-and-Excitation module for EfficientNet.")
 add_arg('image_path',       str,  None,                 "single image path")
 add_arg('batch_size',       int,  8,                    "batch_size on all the devices")
-add_arg('save_json_path',        str,  None,            "save output to a json file")
+add_arg('save_json_path',        str,  "test_res.json",            "save output to a json file")
 # yapf: enable
 logging.basicConfig(level=logging.INFO)
@@ -121,7 +122,7 @@ def infer(args):
            executor=exe,
            model_filename='model',
            params_filename='params')
-        logger.info("model: ", args.model, " is already saved")
+        logger.info("model: {0} is already saved".format(args.model))
        exit(0)
    imagenet_reader = reader.ImageNetReader()
@@ -147,48 +148,47 @@ def infer(args):
    parallel_data = []
    parallel_id = []
    place_num = paddle.fluid.core.get_cuda_device_count() if args.use_gpu else 1
+    with open(args.save_json_path, "w") as fout:
-    for batch_id, data in enumerate(test_reader()):
+        for batch_id, data in enumerate(test_reader()):
-        image_data = [[items[0]] for items in data]
+            image_data = [[items[0]] for items in data]
-        image_id = [items[1] for items in data]
+            image_id = [items[1] for items in data]
-        parallel_id.append(image_id)
+            parallel_id.append(image_id)
-        parallel_data.append(image_data)
+            parallel_data.append(image_data)
-        if place_num == len(parallel_data):
+            if place_num == len(parallel_data):
-            result = exe.run(
+                result = exe.run(
-                compiled_program,
+                    compiled_program,
-                fetch_list=fetch_list,
+                    fetch_list=fetch_list,
-                feed=list(feeder.feed_parallel(parallel_data, place_num)))
+                    feed=list(feeder.feed_parallel(parallel_data, place_num)))
-            for i, res in enumerate(result[0]):
+                for i, res in enumerate(result[0]):
-                pred_label = np.argsort(res)[::-1][:TOPK]
+                    pred_label = np.argsort(res)[::-1][:TOPK]
-                real_id = str(np.array(parallel_id).flatten()[i])
+                    real_id = str(np.array(parallel_id).flatten()[i])
-                _, real_id = os.path.split(real_id)
+                    _, real_id = os.path.split(real_id)
-                if os.path.exists(args.class_map_path):
+                    if os.path.exists(args.class_map_path):
-                    readable_pred_label = []
+                        readable_pred_label = []
-                    for label in pred_label:
+                        for label in pred_label:
-                        readable_pred_label.append(label_dict[str(label)])
+                            readable_pred_label.append(label_dict[str(label)])
-                    info[real_id] = {}
+                        info[real_id] = {}
-                    info[real_id]['score'], info[real_id]['class'], info[
+                        info[real_id]['score'], info[real_id]['class'], info[
-                        real_id]['class_name'] = str(res[pred_label]), str(
+                            real_id]['class_name'] = str(res[pred_label]), str(
-                            pred_label), readable_pred_label
+                                pred_label), readable_pred_label
-                else:
+                    else:
-                    info[real_id] = {}
+                        info[real_id] = {}
-                    info[real_id]['score'], info[real_id]['class'] = str(res[
+                        info[real_id]['score'], info[real_id]['class'] = str(
-                        pred_label]), str(pred_label)
+                            res[pred_label]), str(pred_label)
-                logger.info(real_id, info[real_id])
+                    logger.info("{0}: {1}".format(real_id, info[real_id]))
-                sys.stdout.flush()
+                    sys.stdout.flush()
+                    fout.write(real_id + "\t" + json.dumps(info[real_id]) +
-                if args.save_json_path:
+                               "\n")
-                    save_json(info, args.save_json_path)
+                parallel_data = []
-            parallel_data = []
+                parallel_id = []
-            parallel_id = []
-        if args.image_path:
+    os.remove(".tmp.txt")
-            os.remove(".tmp.txt")
 def main():
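
With this change `infer.py` no longer calls `save_json`; each result is written as one line consisting of the image name, a tab, and a JSON object. A minimal sketch of reading such a file back, assuming that line format and the default `test_res.json` path from the diff; `load_infer_results` is a hypothetical helper, not part of the repository:

```python
import json


def load_infer_results(path="test_res.json"):
    """Parse lines of the form '<image name>\t<JSON object>' into a dict."""
    results = {}
    with open(path) as fin:
        for line in fin:
            line = line.rstrip("\n")
            if not line:
                continue  # tolerate blank lines
            # Split on the first tab only; the JSON payload follows it.
            name, payload = line.split("\t", 1)
            results[name] = json.loads(payload)
    return results
```

Note that despite its `.json` default name, the file as a whole is not a single JSON document, so `json.load` on the entire file would fail once it holds more than the payload of one line.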

--- a/PaddleCV/image_classification/predict.py
+++ b/PaddleCV/image_classification/predict.py
@@ -104,8 +104,10 @@ def predict(args):
    else:
        config.enable_use_gpu(args.gpu_mem, args.gpu_id)
-    # create predictor
+    # you can enable tensorrt engine if paddle is installed with tensorrt
-    predictor = create_paddle_predictor(config.to_native_config())
+    # config.enable_tensorrt_engine() 
+    predictor = create_paddle_predictor(config)
    # input
    inputs = preprocess_image(args.image_path)
@@ -120,8 +122,8 @@ def predict(args):
    cls = np.argmax(output)
    score = output[cls]
-    logger.info("class: ", cls)
+    logger.info("class: {0}".format(cls))
-    logger.info("score: ", score)
+    logger.info("score: {0}".format(score))
    return

--- a/PaddleCV/image_classification/reader.py
+++ b/PaddleCV/image_classification/reader.py
@@ -16,9 +16,10 @@ import os
 import math
 import random
 import functools
-import logging
 import numpy as np
 import cv2
+import logging
+import imghdr
 import paddle
 from paddle import fluid
@@ -210,6 +211,10 @@ def process_image(sample, settings, mode, color_jitter, rotate):
    img_path = sample[0]
    img = cv2.imread(img_path)
+    if img is None:
+        logger.warning("img({0}) is None, pass it.".format(img_path))
+        return None
    if mode == 'train':
        if rotate:
            img = rotate_image(img)
@@ -258,10 +263,13 @@ def process_batch_data(input_data, settings, mode, color_jitter, rotate):
    batch_data = []
    for sample in input_data:
        if os.path.isfile(sample[0]):
-            batch_data.append(
+            tmp_data = process_image(sample, settings, mode, color_jitter,
-                process_image(sample, settings, mode, color_jitter, rotate))
+                                     rotate)
+            if tmp_data is None:
+                continue
+            batch_data.append(tmp_data)
        else:
-            logger.info("File not exist : %s" % sample[0])
+            logger.info("File not exist : {0}".format(sample[0]))
    return batch_data
@@ -310,7 +318,7 @@ class ImageNetReader:
                    full_lines = [line.strip() for line in flist]
                    if mode != "test" and len(full_lines) < settings.batch_size:
                        logger.error(
-                            "Error: The number of the whole data ({}) is smaller than the batch_size ({}), and drop_last is turnning on, so nothing  will feed in program, Terminated now. Please set the batch_size to a smaller number or feed more data!".
+                            "Error: The number of the whole data ({}) is smaller than the batch_size ({}), and drop_last is turning on, so nothing will feed in program, Terminated now. Please reset batch_size to a smaller number or feed more data!".
                            format(len(full_lines), settings.batch_size))
                        os._exit(1)
                    if num_trainers > 1 and mode == "train":
@@ -331,15 +339,12 @@ class ImageNetReader:
                        full_lines.append(temp_file)
                for line in full_lines:
                    img_path, label = line.split()
                    img_path = os.path.join(data_dir, img_path)
                    batch_data.append([img_path, int(label)])
                    if len(batch_data) == batch_size:
                        if mode == 'train' or mode == 'val' or mode == 'test':
                            yield batch_data
                        batch_data = []
            return read_file_list
@@ -434,16 +439,20 @@ class ImageNetReader:
        Returns:
            test reader
        """
-        if settings.image_path:
+        file_list = ".tmp.txt"
-            tmp = open(".tmp.txt", "w")
+        imgType_list = {'jpg', 'bmp', 'png', 'jpeg', 'rgb', 'tif', 'tiff'}
-            tmp.write(settings.image_path + " 0")
+        with open(file_list, "w") as fout:
-            file_list = ".tmp.txt"
+            if settings.image_path:
-            settings.batch_size = 1
+                fout.write(settings.image_path + " 0" + "\n")
-        else:
+                settings.batch_size = 1
-            file_list = os.path.join(settings.data_dir, 'val_list.txt')
+                settings.data_dir = ""
-        assert os.path.isfile(
+            else:
-            file_list), "{} doesn't exist, please check data list path".format(
+                tmp_file_list = os.listdir(settings.data_dir)
-                file_list)
+                for file_name in tmp_file_list:
+                    file_path = os.path.join(settings.data_dir, file_name)
+                    if imghdr.what(file_path) not in imgType_list:
+                        continue
+                    fout.write(file_name + " 0" + "\n")
        return self._reader_creator(
            settings,
            file_list,
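
The new `test` reader builds its file list by scanning `data_dir` and keeping the entries that `imghdr.what` recognizes. `imghdr` is deprecated since Python 3.11 and removed in Python 3.13, and it opens the path it is given, so a subdirectory inside `data_dir` raises an error. The sketch below is one possible substitute, not the repository's code: `sniff_image_type` and `list_images` are hypothetical helpers, and only the JPEG, PNG, and BMP signatures are checked, which is fewer formats than the set in the diff.

```python
import os

# Leading magic bytes of a few common image formats.
_SIGNATURES = (
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"BM", "bmp"),
)


def sniff_image_type(path):
    """Return 'jpeg', 'png' or 'bmp' based on the file header, else None."""
    if not os.path.isfile(path):
        return None  # skip directories and missing paths
    with open(path, "rb") as fin:
        head = fin.read(8)
    for magic, kind in _SIGNATURES:
        if head.startswith(magic):
            return kind
    return None


def list_images(data_dir):
    """Return the sorted names in data_dir whose header looks like an image."""
    names = []
    for name in sorted(os.listdir(data_dir)):
        if sniff_image_type(os.path.join(data_dir, name)) is not None:
            names.append(name)
    return names
```

Checking the header rather than the file extension keeps the behavior of the diff, where a misnamed or non-image file in `data_dir` is skipped instead of being fed to the reader.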

--- a/PaddleCV/image_classification/scripts/train/ResNet50_vd.sh
+++ b/PaddleCV/image_classification/scripts/train/ResNet50_vd.sh
@@ -6,13 +6,17 @@ export FLAGS_eager_delete_tensor_gb=0.0
 export FLAGS_fraction_of_gpu_memory_to_use=0.98
 python train.py \
-            --model=ResNet50_vd \
+        --data_dir=./data/ILSVRC2012/ \
-            --batch_size=256 \
+        --total_images=1281167 \
-            --lr_strategy=cosine_decay \
+        --class_dim=1000 \
-            --lr=0.1 \
+        --validate=1 \
-            --num_epochs=200 \
+        --model=ResNet50_vd \
-            --model_save_dir=output/ \
+        --batch_size=256 \
-            --l2_decay=7e-5 \
+        --lr_strategy=cosine_decay \
-            --use_mixup=True \
+        --lr=0.1 \
-            --use_label_smoothing=True \
+        --num_epochs=200 \
-            --label_smoothing_epsilon=0.1
+        --model_save_dir=output/ \
+        --l2_decay=7e-5 \
+        --use_mixup=True \
+        --use_label_smoothing=True \
+        --label_smoothing_epsilon=0.1
--- a/PaddleCV/image_classification/utils/utility.py
+++ b/PaddleCV/image_classification/utils/utility.py
@@ -19,9 +19,9 @@ from __future__ import print_function
 import six
 import argparse
 import functools
-import logging
 import sys
 import os
+import logging
 import warnings
 import signal
 import json
@@ -172,13 +172,11 @@ def check_gpu():
    Log error and exit when set use_gpu=true in paddlepaddle
    cpu version.
    """
-    logger = logging.getLogger(__name__)
    err = "Config use_gpu cannot be set as true while you are " \
                "using paddlepaddle cpu version ! \nPlease try: \n" \
                "\t1. Install paddlepaddle-gpu to run model on GPU \n" \
                "\t2. Set use_gpu as false in config file to run " \
                "model on CPU"
    try:
        if args.use_gpu and not fluid.is_compiled_with_cuda():
            logger.error(err)
@@ -209,8 +207,6 @@ def check_args(args):
    Args:
        all arguments
    """
-    logging.basicConfig(level=logging.INFO)
-    logger = logging.getLogger(__name__)
    # check models name
    sys.path.append("..")