Commit d9056bcd authored by dongshuilong

fix benchmark

---
name: Issue report
about: PaddleClas issue report
title: ''
labels: ''
assignees: ''
---

Welcome to PaddleClas, and thank you for reporting issues and contributing to PaddleClas!
When opening an issue, please provide the following information so that we can locate and resolve your problem quickly and effectively:
1. PaddleClas and PaddlePaddle versions: please provide the version numbers or branch names you use, e.g. PaddleClas release/2.2 and PaddlePaddle 2.1.0
2. Versions of any other products involved: if you use other products such as PaddleServing or PaddleInference together with PaddleClas, please provide their version numbers
3. Training environment:
    a. Operating system, e.g. Linux/Windows/macOS
    b. Python version, e.g. Python 3.6/3.7/3.8
    c. CUDA/cuDNN version, e.g. CUDA 10.2/cuDNN 7.6.5
4. The complete code (any changes made relative to the repo), the detailed error message, and the relevant logs
@@ -8,6 +8,7 @@
**Recent updates**
+- 2021.07.08, 07.27 Added 26 [FAQ](docs/zh_CN/faq_series/faq_2021_s2.md) entries.
- 2021.06.29 Added the Swin Transformer series: the best Top-1 accuracy on the ImageNet-1k dataset reaches 87.2%; training, inference, and evaluation are supported, as is whl-package deployment. Pretrained models can be downloaded [here](docs/zh_CN/models/models_intro.md).
- 2021.06.22,23,24 The PaddleClas R&D team gave a three-day live course with in-depth technical explanations. Course replay: [https://aistudio.baidu.com/aistudio/course/introduce/24519](https://aistudio.baidu.com/aistudio/course/introduce/24519)
- 2021.06.16 PaddleClas v2.2 released: metric learning, vector search, and other components are integrated; 4 image recognition applications (product, cartoon character, vehicle, and logo recognition) and 30 pretrained models of the LeViT, Twins, TNT, DLA, HarDNet, and RedNet series are added.
@@ -74,7 +75,8 @@ The Res2Net200_vd pretrained model reaches up to 85.1% Top-1 accuracy.
- [Knowledge distillation](./docs/zh_CN/advanced_tutorials/distillation/distillation.md)
- [Model quantization](./docs/zh_CN/extension/paddle_quantization.md)
- [Data augmentation](./docs/zh_CN/advanced_tutorials/image_augmentation/ImageAugment.md)
-- FAQ (updates paused)
+- FAQ
+  - [Image recognition FAQ](docs/zh_CN/faq_series/faq_2021_s2.md)
  - [Image classification FAQ](docs/zh_CN/faq.md)
- [License](#许可证书)
- [Contributing](#贡献代码)
......
Global:
  rec_inference_model_dir: "./models/cartoon_rec_ResNet50_iCartoon_v1.0_infer/"
-  batch_size: 1
+  batch_size: 32
  use_gpu: True
-  enable_mkldnn: False
+  enable_mkldnn: True
-  cpu_num_threads: 100
+  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
......
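For orientation, these `Global` keys are knobs of the underlying Paddle Inference engine. The following is a minimal sketch of how a predictor would typically be created from such a config (hypothetical helper and file names, for illustration only; the real wiring lives in the PaddleClas `deploy` predictor classes):

```python
from paddle.inference import Config, create_predictor

def build_predictor(model_dir, use_gpu=True, gpu_mem=8000,
                    enable_mkldnn=True, cpu_num_threads=10, ir_optim=True):
    # Hypothetical file names inside the exported inference model directory
    config = Config(model_dir + "/inference.pdmodel",
                    model_dir + "/inference.pdiparams")
    if use_gpu:
        config.enable_use_gpu(gpu_mem, 0)  # memory pool size (MB), device id
    else:
        config.disable_gpu()
        if enable_mkldnn:
            config.enable_mkldnn()  # oneDNN acceleration on CPU
        config.set_cpu_math_library_num_threads(cpu_num_threads)
    config.switch_ir_optim(ir_optim)  # graph-level IR optimizations
    return create_predictor(config)
```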
Global:
  rec_inference_model_dir: "./models/logo_rec_ResNet50_Logo3K_v1.0_infer/"
-  batch_size: 1
+  batch_size: 32
  use_gpu: True
-  enable_mkldnn: False
+  enable_mkldnn: True
-  cpu_num_threads: 100
+  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
......
Global:
  rec_inference_model_dir: "./models/product_ResNet50_vd_aliproduct_v1.0_infer"
-  batch_size: 1
+  batch_size: 32
  use_gpu: True
-  enable_mkldnn: False
+  enable_mkldnn: True
-  cpu_num_threads: 100
+  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
......
Global:
  rec_inference_model_dir: "./models/vehicle_cls_ResNet50_CompCars_v1.0_infer/"
-  batch_size: 1
+  batch_size: 32
  use_gpu: True
-  enable_mkldnn: False
+  enable_mkldnn: True
-  cpu_num_threads: 100
+  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
......
@@ -12,8 +12,8 @@ Global:
    - foreground
  use_gpu: True
-  enable_mkldnn: False
+  enable_mkldnn: True
-  cpu_num_threads: 100
+  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
......
@@ -3,8 +3,8 @@ Global:
  inference_model_dir: "./models"
  batch_size: 1
  use_gpu: True
-  enable_mkldnn: False
+  enable_mkldnn: True
-  cpu_num_threads: 100
+  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
@@ -22,6 +22,7 @@ PreProcess:
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
+        channel_num: 3
    - ToCHWImage:
PostProcess:
  main_indicator: Topk
@@ -29,4 +30,4 @@ PostProcess:
    topk: 5
    class_id_map_file: "../ppcls/utils/imagenet1k_label_list.txt"
  SavePreLabel:
    save_dir: ./pre_label/
\ No newline at end of file
Global:
  infer_imgs: "./images/ILSVRC2012_val_00000010.jpeg"
  inference_model_dir: "./models"
  batch_size: 1
  use_gpu: True
  enable_mkldnn: True
  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
  use_tensorrt: False
  gpu_mem: 8000
  enable_profile: False
PreProcess:
  transform_ops:
    - ResizeImage:
        resize_short: 256
    - CropImage:
        size: 224
    - NormalizeImage:
        scale: 0.00392157
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
        channel_num: 4
    - ToCHWImage:
PostProcess:
  main_indicator: Topk
  Topk:
    topk: 5
    class_id_map_file: "../ppcls/utils/imagenet1k_label_list.txt"
  SavePreLabel:
    save_dir: ./pre_label/
@@ -10,8 +10,8 @@ Global:
  # inference engine config
  use_gpu: True
-  enable_mkldnn: False
+  enable_mkldnn: True
-  cpu_num_threads: 100
+  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
......
@@ -13,8 +13,8 @@ Global:
  # inference engine config
  use_gpu: True
-  enable_mkldnn: False
+  enable_mkldnn: True
-  cpu_num_threads: 100
+  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
......
@@ -13,8 +13,8 @@ Global:
  # inference engine config
  use_gpu: True
-  enable_mkldnn: False
+  enable_mkldnn: True
-  cpu_num_threads: 100
+  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
......
@@ -10,8 +10,8 @@ Global:
  # inference engine config
  use_gpu: False
-  enable_mkldnn: False
+  enable_mkldnn: True
-  cpu_num_threads: 100
+  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
......
@@ -13,8 +13,8 @@ Global:
  # inference engine config
  use_gpu: True
-  enable_mkldnn: False
+  enable_mkldnn: True
-  cpu_num_threads: 100
+  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
......
@@ -33,8 +33,10 @@ def get_default_confg():
            "enable_benchmark": False
        },
        'PostProcess': {
-            'name': 'Topk',
-            'topk': 5,
-            'class_id_map_file': './utils/imagenet1k_label_list.txt'
+            'main_indicator': 'Topk',
+            'Topk': {
+                'topk': 5,
+                'class_id_map_file': './utils/imagenet1k_label_list.txt'
+            }
        }
    }
\ No newline at end of file
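For reference, here is a minimal standalone sketch of what a `Topk`-style post-process computes (a hypothetical helper, not the PaddleClas implementation): it sorts each softmax output and keeps the k highest-scoring class ids.

```python
import numpy as np

def topk_postprocess(batch_output, topk=5, id_map=None):
    """Return the top-k class ids and scores for each sample in a batch."""
    results = []
    for probs in batch_output:
        index = probs.argsort()[::-1][:topk]  # indices of the k largest scores
        result = {
            "class_ids": index.tolist(),
            "scores": np.around(probs[index], 5).tolist(),
        }
        if id_map:  # optional class-id -> label-name mapping
            result["label_names"] = [id_map[i] for i in index]
        results.append(result)
    return results

# Example: two samples over four classes
print(topk_postprocess(np.array([[0.1, 0.6, 0.2, 0.1],
                                 [0.7, 0.1, 0.1, 0.1]]), topk=2))
```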
@@ -15,7 +15,7 @@ hubserving/clas/
### 1. Prepare the environment
```shell
# Install PaddleHub 2.x
-pip3 install paddlehub==2.0.0b1 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
+pip3 install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
```
### 2. Download the inference model
@@ -128,8 +128,12 @@ python hubserving/test_hubserving.py server_url image_path
`http://[ip_address]:[port]/predict/[module_name]`
- **image_path**: path of the test image(s); either a single image path or a directory of images.
- **batch_size**: [**Optional**] predict in units of `batch_size`; defaults to `1`.
+- **resize_short**: [**Optional**] resize the short side to this size during preprocessing; defaults to `256`.
+- **crop_size**: [**Optional**] center-crop size during preprocessing; defaults to `224`.
+- **normalize**: [**Optional**] whether to `normalize` during preprocessing; defaults to `True`.
+- **to_chw**: [**Optional**] whether to transpose to `CHW` order during preprocessing; defaults to `True`.

-**Note**: When using `Transformer`-series models such as `DeiT_***_384` and `ViT_***_384`, pay attention to the model input size; you need to specify `--resize_short=384 --resize=384`.
+**Note**: When using `Transformer`-series models such as `DeiT_***_384` and `ViT_***_384`, pay attention to the model input size; you need to specify `--resize_short=384 --crop_size=384`.

Access example:
......
@@ -15,7 +15,7 @@ hubserving/clas/
### 1. Prepare the environment
```shell
# Install PaddleHub 2.x
-pip3 install paddlehub==2.0.0b1 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
+pip3 install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
```
### 2. Download inference model
@@ -126,9 +126,13 @@ Two required parameters need to be passed to the script:
`http://[ip_address]:[port]/predict/[module_name]`
- **image_path**: Test image path; can be a single image path or an image directory path.
- **batch_size**: [**Optional**] batch size. Defaults to `1`.
+- **resize_short**: [**Optional**] In preprocessing, resize the short side to this size. Defaults to `256`.
+- **crop_size**: [**Optional**] In preprocessing, the center-crop size. Defaults to `224`.
+- **normalize**: [**Optional**] In preprocessing, whether to `normalize`. Defaults to `True`.
+- **to_chw**: [**Optional**] In preprocessing, whether to transpose to `CHW` order. Defaults to `True`.

**Notice**:
-If you want to use `Transformer` series models, such as `DeiT_***_384`, `ViT_***_384`, etc., please pay attention to the input size of the model, and you need to set `--resize_short=384`, `--resize=384`.
+If you want to use `Transformer` series models, such as `DeiT_***_384`, `ViT_***_384`, etc., please pay attention to the input size of the model, and you need to set `--resize_short=384`, `--crop_size=384`.

**Eg.**
```shell
......
@@ -32,30 +32,59 @@ from utils import config
from utils.encode_decode import np_to_b64
from python.preprocess import create_operators

-preprocess_config = [{
-    'ResizeImage': {
-        'resize_short': 256
-    }
-}, {
-    'CropImage': {
-        'size': 224
-    }
-}, {
-    'NormalizeImage': {
-        'scale': 0.00392157,
-        'mean': [0.485, 0.456, 0.406],
-        'std': [0.229, 0.224, 0.225],
-        'order': ''
-    }
-}, {
-    'ToCHWImage': None
-}]
+
+def get_args():
+    def str2bool(v):
+        return v.lower() in ("true", "t", "1")
+
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--server_url", type=str)
+    parser.add_argument("--image_file", type=str)
+    parser.add_argument("--batch_size", type=int, default=1)
+    parser.add_argument("--resize_short", type=int, default=256)
+    parser.add_argument("--crop_size", type=int, default=224)
+    parser.add_argument("--normalize", type=str2bool, default=True)
+    parser.add_argument("--to_chw", type=str2bool, default=True)
+    return parser.parse_args()
+
+
+class PreprocessConfig(object):
+    def __init__(self,
+                 resize_short=256,
+                 crop_size=224,
+                 normalize=True,
+                 to_chw=True):
+        self.config = [{
+            'ResizeImage': {
+                'resize_short': resize_short
+            }
+        }, {
+            'CropImage': {
+                'size': crop_size
+            }
+        }]
+        if normalize:
+            self.config.append({
+                'NormalizeImage': {
+                    'scale': 0.00392157,
+                    'mean': [0.485, 0.456, 0.406],
+                    'std': [0.229, 0.224, 0.225],
+                    'order': ''
+                }
+            })
+        if to_chw:
+            self.config.append({'ToCHWImage': None})
+
+    def __call__(self):
+        return self.config


def main(args):
    image_path_list = get_image_list(args.image_file)
    headers = {"Content-type": "application/json"}
-    preprocess_ops = create_operators(preprocess_config)
+    preprocess_ops = create_operators(
+        PreprocessConfig(args.resize_short, args.crop_size, args.normalize,
+                         args.to_chw)())

    cnt = 0
    predict_time = 0
@@ -113,14 +142,10 @@ def main(args):
        for number, result_list in enumerate(preds):
            all_score += result_list["scores"][0]
-            result_str = ""
-            for i in range(len(result_list["class_ids"])):
-                result_str += "{}: {:.2f}\t".format(
-                    result_list["class_ids"][i],
-                    result_list["scores"][i])
+            pred_str = ", ".join(
+                [f"{k}: {result_list[k]}" for k in result_list])
            logger.info(
-                f"File:{img_name_list[number]}, The result(s): {result_str}"
+                f"File:{img_name_list[number]}, The result(s): {pred_str}"
            )
    finally:
@@ -136,10 +161,5 @@ def main(args):
if __name__ == '__main__':
-    parser = argparse.ArgumentParser()
-    parser.add_argument("--server_url", type=str)
-    parser.add_argument("--image_file", type=str)
-    parser.add_argument("--batch_size", type=int, default=1)
-    args = parser.parse_args()
+    args = get_args()
    main(args)
@@ -42,7 +42,7 @@ class ClsPredictor(Predictor):
        self.postprocess = build_postprocess(config["PostProcess"])

        # for whole_chain project to test each repo of paddle
-        self.benchmark = config.get("benchmark", False)
+        self.benchmark = config["Global"].get("benchmark", False)
        if self.benchmark:
            import auto_log
            import os
@@ -88,6 +88,10 @@ class ClsPredictor(Predictor):
        batch_output = output_tensor.copy_to_cpu()
        if self.benchmark:
            self.auto_log.times.stamp()
+        if self.postprocess is not None:
+            batch_output = self.postprocess(batch_output)
+        if self.benchmark:
+            self.auto_log.times.end(stamp=True)
        return batch_output
@@ -95,14 +99,38 @@ def main(config):
    cls_predictor = ClsPredictor(config)
    image_list = get_image_list(config["Global"]["infer_imgs"])
-    assert config["Global"]["batch_size"] == 1
-    for idx, image_file in enumerate(image_list):
-        img = cv2.imread(image_file)[:, :, ::-1]
-        output = cls_predictor.predict(img)
-        output = cls_predictor.postprocess(output, [image_file])
-        if cls_predictor.benchmark:
-            cls_predictor.auto_log.times.end(stamp=True)
-        print(output)
+    batch_imgs = []
+    batch_names = []
+    cnt = 0
+    for idx, img_path in enumerate(image_list):
+        img = cv2.imread(img_path)
+        if img is None:
+            logger.warning(
+                "Image file failed to read and has been skipped. The path: {}".
+                format(img_path))
+        else:
+            img = img[:, :, ::-1]
+            batch_imgs.append(img)
+            img_name = os.path.basename(img_path)
+            batch_names.append(img_name)
+            cnt += 1
+        if cnt % config["Global"]["batch_size"] == 0 or (idx + 1
+                                                         ) == len(image_list):
+            if len(batch_imgs) == 0:
+                continue
+            batch_results = cls_predictor.predict(batch_imgs)
+            for number, result_dict in enumerate(batch_results):
+                filename = batch_names[number]
+                clas_ids = result_dict["class_ids"]
+                scores_str = "[{}]".format(", ".join("{:.2f}".format(
+                    r) for r in result_dict["scores"]))
+                label_names = result_dict["label_names"]
+                print("{}:\tclass id(s): {}, score(s): {}, label_name(s): {}".
+                      format(filename, clas_ids, scores_str, label_names))
+            batch_imgs = []
+            batch_names = []
    cls_predictor.auto_log.report()
    return
......
@@ -54,12 +54,14 @@ class RecPredictor(Predictor):
        input_tensor.copy_from_cpu(image)
        self.paddle_predictor.run()
        batch_output = output_tensor.copy_to_cpu()
        if feature_normalize:
            feas_norm = np.sqrt(
                np.sum(np.square(batch_output), axis=1, keepdims=True))
            batch_output = np.divide(batch_output, feas_norm)
+        if self.postprocess is not None:
+            batch_output = self.postprocess(batch_output)
        return batch_output
@@ -67,14 +69,33 @@ def main(config):
    rec_predictor = RecPredictor(config)
    image_list = get_image_list(config["Global"]["infer_imgs"])
-    assert config["Global"]["batch_size"] == 1
-    for idx, image_file in enumerate(image_list):
-        batch_input = []
-        img = cv2.imread(image_file)[:, :, ::-1]
-        output = rec_predictor.predict(img)
-        if rec_predictor.postprocess is not None:
-            output = rec_predictor.postprocess(output)
-        print(output)
+    batch_imgs = []
+    batch_names = []
+    cnt = 0
+    for idx, img_path in enumerate(image_list):
+        img = cv2.imread(img_path)
+        if img is None:
+            logger.warning(
+                "Image file failed to read and has been skipped. The path: {}".
+                format(img_path))
+        else:
+            img = img[:, :, ::-1]
+            batch_imgs.append(img)
+            img_name = os.path.basename(img_path)
+            batch_names.append(img_name)
+            cnt += 1
+        if cnt % config["Global"]["batch_size"] == 0 or (idx + 1) == len(image_list):
+            if len(batch_imgs) == 0:
+                continue
+            batch_results = rec_predictor.predict(batch_imgs)
+            for number, result_dict in enumerate(batch_results):
+                filename = batch_names[number]
+                print("{}:\t{}".format(filename, result_dict))
+            batch_imgs = []
+            batch_names = []
    return
......
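As the `feature_normalize` branch above shows, retrieval features are L2-normalized so that inner-product scores behave like cosine similarities. A standalone NumPy illustration of the same computation:

```python
import numpy as np

# Hypothetical batch of 4 feature vectors with 8 dimensions each
batch_output = np.random.rand(4, 8).astype("float32")

# Divide each row by its L2 norm, exactly as the code above does
feas_norm = np.sqrt(np.sum(np.square(batch_output), axis=1, keepdims=True))
batch_output = np.divide(batch_output, feas_norm)

# Every normalized vector now has unit length, so x @ y.T is cosine similarity
print(np.linalg.norm(batch_output, axis=1))  # ~[1. 1. 1. 1.]
```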
@@ -35,7 +35,6 @@ sudo apt-get install build-essential gcc g++
Enter this folder and simply run `make`. If you want to regenerate `index.so`, first run `make clean` to clear the existing build cache, then run `make` to generate the updated library file.

### 2.3 Compile the library file on Windows
On Windows, first install the gcc compiler toolchain; [TDM-GCC](https://jmeubank.github.io/tdm-gcc/articles/2020-03/9.2.0-release) is recommended, and a suitable version can be chosen on the official website. We recommend downloading [tdm64-gcc-10.3.0-2.exe](https://github.com/jmeubank/tdm-gcc/releases/download/v10.3.0-tdm64-2/tdm64-gcc-10.3.0-2.exe).
@@ -50,6 +49,25 @@
In this folder (deploy/vector_search), run `mingw32-make` to generate the `index.dll` library file. If you want to regenerate `index.dll`, first run `mingw32-make clean` to clear the existing build cache, then run `mingw32-make` to generate the updated library file.
### 2.4 Compile the library file on macOS
Run the following command to install gcc and g++:
```shell
brew install gcc
```
#### Note:
1. If you see `Error: Running Homebrew as root is extremely dangerous and no longer supported...`, refer to this [link](https://jingyan.baidu.com/article/e52e3615057a2840c60c519c.html)
2. If you see `Error: Failure while executing; tar --extract --no-same-owner --file...`, refer to this [link](https://blog.csdn.net/Dawn510/article/details/117787358)

After installation, the compiled executables are copied to /usr/local/bin; check the gcc there:
```
ls /usr/local/bin/gcc*
```
Here the local gcc version is gcc-11, so the compile command is as follows (if your local gcc is gcc-9, change the command to `CXX=g++-9 make` accordingly):
```
CXX=g++-11 make
```
## 3. Quick start
......
# Vector search
## 1. Introduction
Some vertical-domain recognition tasks (e.g., vehicles, commodities) involve a large number of categories and often use a retrieval-based approach: matching categories are predicted through a fast nearest neighbor search between query vectors and the gallery vectors. The vector search module provides the basic approximate nearest neighbor search based on Baidu's self-developed Möbius algorithm, a graph-based approximate nearest neighbor search algorithm for maximum inner product search (MIPS). This module provides a Python interface, supports numpy and tensor type vectors, and supports L2 and inner-product distance calculation.
Details of the Möbius algorithm can be found in the paper ([Möbius Transformation for Fast Inner Product Search on Graph](http://research.baidu.com/Public/uploads/5e189d36b5cf6.PDF), [Code](https://github.com/sunbelbd/mobius)).
## 2. Installation
### 2.1 Use the provided library files directly
This folder contains the compiled `index.so` (compiled under gcc8.2.0 for Linux) and `index.dll` (compiled under gcc10.3.0 for Windows), which can be used directly, skipping sections 2.2 and 2.3.
If the library files are not available due to a low gcc version or an incompatible environment, you need to manually compile the library files under a different platform.
**Note:** Make sure that the C++ compiler supports the C++11 standard.
### 2.2 Compile and generate library files on Linux
Run the following command to install gcc and g++.
```
sudo apt-get update
sudo apt-get upgrade -y
sudo apt-get install build-essential gcc g++
```
Check the gcc version with the command `gcc -v`.
Then run `make` directly. If you wish to regenerate `index.so`, first use `make clean` to clear the cache, then use `make` to generate the updated library file.
### 2.3 Compile and generate library files on Windows
You need to install the gcc compiler tool first. We recommend using [TDM-GCC](https://jmeubank.github.io/tdm-gcc/articles/2020-03/9.2.0-release); you can choose the right version on the official website. We recommend downloading [tdm64-gcc-10.3.0-2.exe](https://github.com/jmeubank/tdm-gcc/releases/download/v10.3.0-tdm64-2/tdm64-gcc-10.3.0-2.exe).
After downloading, follow the default installation steps. There are 3 points to note here:
1. The vector search module depends on OpenMP, so you need to check the `openmp` option at the `choose components` step; otherwise compilation will report the error `libgomp.spec: No such file or directory` ([reference link](https://github.com/dmlc/xgboost/issues/1027))
2. When asked whether to add it to the system environment variables, it is recommended to do so; otherwise you will need to add them manually later.
3. The compile command is `make` on Linux and `mingw32-make` on Windows, so you need to distinguish here.
After installation, you can open a command line terminal and check the gcc version with the command `gcc -v`.
Run the command `mingw32-make` to generate the `index.dll` library file under the folder (deploy/vector_search). If you want to regenerate the `index.dll` file, you can first use `mingw32-make clean` to clear the cache, and then use `mingw32-make` to generate the updated library file.
### 2.4 Compile and generate library files on MacOS
Run the following command to install gcc and g++:
```
brew install gcc
```
#### Caution:
1. If prompted with `Error: Running Homebrew as root is extremely dangerous and no longer supported...`, refer to this [link](https://jingyan.baidu.com/article/e52e3615057a2840c60c519c.html)
2. If prompted with `Error: Failure while executing; tar --extract --no-same-owner --file...`, refer to this [link](https://blog.csdn.net/Dawn510/article/details/117787358).

After installation, the compiled executables are copied under /usr/local/bin; check the gcc in this folder:
```
ls /usr/local/bin/gcc*
```
Here the local gcc version is gcc-11, so the compile command is as follows (if the local gcc version is gcc-9, change the command to `CXX=g++-9 make` accordingly):
```
CXX=g++-11 make
```
## 3. Quick use
```
import numpy as np
from interface import Graph_Index
# Random sample generation
index_vectors = np.random.rand(100000,128).astype(np.float32)
query_vector = np.random.rand(128).astype(np.float32)
index_docs = ["ID_"+str(i) for i in range(100000)]
# Initialize index structure
indexer = Graph_Index(dist_type="IP") #support "IP" and "L2"
indexer.build(gallery_vectors=index_vectors, gallery_docs=index_docs, pq_size=100, index_path='test')
# Query
scores, docs = indexer.search(query=query_vector, return_k=10, search_budget=100)
print(scores)
print(docs)
# Save and load
indexer.dump(index_path="test")
indexer.load(index_path="test")
```
@@ -3,9 +3,9 @@
## Overview
The Twins network includes Twins-PCPVT and Twins-SVT, which focus on the meticulous design of the spatial attention mechanism, resulting in a simple but more effective solution. Since the architecture only involves matrix multiplication, for which current deep learning frameworks are highly optimized, the architecture is very efficient and easy to implement. Moreover, it can achieve excellent performance in a variety of downstream vision tasks such as image classification, object detection, and semantic segmentation. [Paper](https://arxiv.org/abs/2104.13840).
-## Accuracy, FLOPS and Parameters
+## Accuracy, FLOPs and Parameters
-| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPS<br>(G) | Params<br>(M) |
+| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPs<br>(G) | Params<br>(M) |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| pcpvt_small | 0.8082 | 0.9552 | 0.812 | - | 3.7 | 24.1 |
| pcpvt_base | 0.8242 | 0.9619 | 0.827 | - | 6.4 | 43.8 |
......
# Configuration Instruction
------
## Introduction
The parameters in the PaddleClas configuration files (`ppcls/configs/*.yaml`) are described here, so that you can customize or modify the hyperparameter configuration more quickly.
## Details
### 1. Classification model
Here the configuration of `ResNet50_vd` on `ImageNet-1k` is used as an example to explain each parameter in detail. [Config path](https://github.com/PaddlePaddle/PaddleClas/blob/develop/ppcls/configs/ImageNet/ResNet/ResNet50_vd.yaml).
#### 1.1 Global Configuration
| Parameter name | Specific meaning | Default value | Optional value |
| ------------------ | ------------------------------------------------------- | ---------------- | ----------------- |
| checkpoints | Breakpoint model path for resuming training | null | str |
| pretrained_model | Pre-trained model path | null | str |
| output_dir | Save model path | "./output/" | str |
| save_interval | How many epochs to save the model at each interval | 1 | int |
| eval_during_train | Whether to evaluate at training | True | bool |
| eval_interval | How many epochs to evaluate at each interval | 1 | int |
| epochs | Total number of epochs in training | | int |
| print_batch_step | How many mini-batches to print out at each interval | 10 | int |
| use_visualdl | Whether to visualize the training process with visualdl | False | bool |
| image_shape | Image size | [3,224,224] | list, shape: (3,) |
| save_inference_dir | Inference model save path | "./inference" | str |
| eval_mode | Evaluation mode | "classification" | "retrieval" |
**Note**: The HTTP address of a pretrained model can also be filled in `pretrained_model`.
#### 1.2 Architecture
| Parameter name | Specific meaning | Default value | Optional value |
| -------------- | ----------------- | ------------ | --------------------- |
| name | Model Arch name | ResNet50 | PaddleClas model arch |
| class_num | Category number | 1000 | int |
| pretrained | Pre-trained model | False | bool, str |
**Note**: `pretrained` can be set to True, False, or a weights path. In addition, `pretrained` is disabled when `Global.pretrained_model` is also set to a corresponding path.
#### 1.3 Loss function
| Parameter name | Specific meaning | Default value | Optional value |
| -------------- | ------------------------------------------- | ------------ | ---------------------- |
| CELoss | cross-entropy loss function | —— | —— |
| CELoss.weight | The weight of CELoss in the whole Loss | 1.0 | float |
| CELoss.epsilon | The epsilon value of label_smooth in CELoss (see the formula below) | 0.1 | float, between 0 and 1 |
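For reference, label smoothing softens the one-hot target before the cross-entropy is computed; a standard formulation (stated here for illustration, not quoted from the PaddleClas source) is:

$$
y_k^{smooth} = (1-\epsilon)\, y_k + \frac{\epsilon}{K}, \qquad \mathcal{L}_{CE} = -\sum_{k=1}^{K} y_k^{smooth} \log p_k
$$

where $K$ is the number of classes (`class_num`) and $p_k$ is the predicted probability of class $k$.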
#### 1.4 Optimizer
| Parameter name | Specific meaning | Default value | Optional value |
| ----------------- | -------------------------------- | ------------ | -------------------------------------------------- |
| name | optimizer method name | "Momentum" | Other optimizers, e.g. "RmsProp" |
| momentum | momentum value (see the update rule below) | 0.9 | float |
| lr.name | learning rate decay method | "Cosine" | Other decay methods: "Linear", "Piecewise" |
| lr.learning_rate | initial value of learning rate | 0.1 | float |
| lr.warmup_epoch | warmup rounds | 0 | int, e.g. 5 |
| regularizer.name | regularization method name | "L2" | ["L1", "L2"] |
| regularizer.coeff | regularization factor | 0.00007 | float |
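As a reminder, the Momentum optimizer keeps a velocity term; the textbook update rule (given here for reference, not PaddleClas-specific) is:

$$
v_{t+1} = m\, v_t + \nabla_w \mathcal{L}(w_t), \qquad w_{t+1} = w_t - \eta\, v_{t+1}
$$

with momentum $m$ (0.9 above) and learning rate $\eta$ taken from the `lr` settings.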
**Note**: The additional parameters differ with `lr.name`; for example, when `lr.name=Piecewise`, the following parameters need to be added:
```
lr:
name: Piecewise
learning_rate: 0.1
decay_epochs: [30, 60, 90]
values: [0.1, 0.01, 0.001, 0.0001]
```
Refer to [learning_rate.py](https://github.com/PaddlePaddle/PaddleClas/blob/develop/ppcls/optimizer/learning_rate.py) for how to add methods and parameters.
#### 1.5 Data reading module (DataLoader)
##### 1.5.1 dataset
| Parameter name | Specific meaning | Default value | Optional value |
| ------------------- | ------------------------------------ | ----------------------------------- | ------------------------------ |
| name | The name of the class that reads the data | ImageNetDataset | VeriWild and other Dataset types |
| image_root | The path where the dataset is stored | ./dataset/ILSVRC2012/ | str |
| cls_label_path | data label list | ./dataset/ILSVRC2012/train_list.txt | str |
| transform_ops | data preprocessing for single images | —— | —— |
| batch_transform_ops | Data preprocessing for batch images | —— | —— |
The parameter meaning of transform_ops:
| Function name | Parameter name | Specific meaning |
| -------------- | -------------- | --------------------- |
| DecodeImage | to_rgb | data to RGB |
| | channel_first | image data by CHW |
| RandCropImage | size | Random crop |
| RandFlipImage | | Random flip |
| NormalizeImage | scale | Normalize scale value |
| | mean | Normalize mean value |
| | std | Normalize std value |
| | order | Normalize order |
| CropImage | size | crop size |
| ResizeImage | resize_short | resize by short edge |
The parameter meaning of batch_transform_ops:
| Function name | Parameter name | Specific meaning |
| ------------- | -------------- | --------------------------------------- |
| MixupOperator | alpha | Mixup parameter; the larger the value, the stronger the augmentation |
##### 1.5.2 sampler
| Parameter name | Specific meaning | Default value | Optional value |
| -------------- | ------------------------------------------------------------ | ----------------------- | -------------------------------------------------- |
| name | sampler type | DistributedBatchSampler | DistributedRandomIdentitySampler and other Sampler |
| batch_size | batch size | 64 | int |
| drop_last | Whether to drop the last incomplete batch that does not reach the batch size | False | bool |
| shuffle | whether to shuffle the data | True | bool |
##### 1.5.3 loader
| Parameter name | Specific meaning | Default value | Optional value |
| ----------------- | ---------------------------- | --------------- | ---------------- |
| num_workers | Number of data read threads | 4 | int |
| use_shared_memory | Whether to use shared memory | True | bool |
#### 1.6 Evaluation metric
| Parameter name | Specific meaning | Default value | Optional value |
| -------------- | ---------------- | --------------- | ---------------- |
| TopkAcc | TopkAcc | [1, 5] | list, int |
#### 1.7 Inference
| Parameter name | Specific meaning | Default value | Optional value |
| ----------------------------- | --------------------------------- | ------------------------------------- | ---------------- |
| infer_imgs | Image address to be inferred | docs/images/whl/demo.jpg | str |
| batch_size | batch size | 10 | int |
| PostProcess.name | Post-process name | Topk | str |
| PostProcess.topk | topk value | 5 | int |
| PostProcess.class_id_map_file | mapping file of class id and name | ppcls/utils/imagenet1k_label_list.txt | str |
**Note**: For the meaning of `transforms` in the Infer module, refer to the explanation of `transform_ops` for the dataset in the data reading module.
### 2. Distillation model
**Note**: Here the configuration for distilling `MobileNetV3_small_x1_0` from `MobileNetV3_large_x1_0` on `ImageNet-1k` is used as an example to explain each parameter in detail. [Config path](https://github.com/PaddlePaddle/PaddleClas/blob/develop/ppcls/configs/ImageNet/Distillation/mv3_large_x1_0_distill_mv3_small_x1_0.yaml). Only parameters that differ from the classification model are introduced here.
#### 2.1 Architecture
| Parameter name | Specific meaning | Default value | Optional value |
| ------------------ | --------------------------------------------------------- | ---------------------- | ---------------------------------- |
| name | model arch name | DistillationModel | —— |
| class_num | category number | 1000 | int |
| freeze_params_list | freeze_params_list | [True, False] | list |
| models | model list | [Teacher, Student] | list |
| Teacher.name | teacher model name | MobileNetV3_large_x1_0 | PaddleClas model |
| Teacher.pretrained | teacher model pre-trained weights | True | Boolean or pre-trained weight path |
| Teacher.use_ssld | whether teacher model pretrained weights are ssld weights | True | Boolean |
| infer_model_name | type of the model being inferred | Student | Teacher |
**Note**:
1. A list is represented in YAML as follows:
```
freeze_params_list:
- True
- False
```
2. The Student's parameters are similar and are not repeated here.
#### 2.2 Loss function
| Parameter name | Specific meaning | Default value | Optional value |
| ----------------------------------- | ------------------------------------------------------------ | --------------- | ---------------- |
| DistillationCELoss | Distillation cross-entropy loss function | —— | —— |
| DistillationCELoss.weight | Loss weight | 1.0 | float |
| DistillationCELoss.model_name_pairs | ["Student", "Teacher"] | —— | —— |
| DistillationGTCELoss | Distillation cross-entropy loss function between the model and the ground-truth label | —— | —— |
| DistillationGTCELoss.weight | Loss weight | 1.0 | float |
| DistillationGTCELoss.model_names | Names of the models that use the ground-truth label for cross-entropy | ["Student"] | —— |
#### 2.3 Evaluation metric
| Parameter name | Specific meaning | Default value | Optional value |
| ----------------------------- | ------------------- | ---------------------------- | ---------------- |
| DistillationTopkAcc | DistillationTopkAcc | including model_key and topk | —— |
| DistillationTopkAcc.model_key | the evaluated model | "Student" | "Teacher" |
| DistillationTopkAcc.topk | Topk value | [1, 5] | list, int |
**Note**: `DistillationTopkAcc` has the same meaning as `TopkAcc`, except that it is only used in distillation tasks.
### 3. Recognition model
**Note**: The training configuration of `ResNet50` on `LogoDet-3k` is used here as an example to explain each parameter in detail. [Config path](https://github.com/PaddlePaddle/PaddleClas/blob/develop/ppcls/configs/Logo/ResNet50_ReID.yaml). Only parameters that differ from the classification model are presented here.
#### 3.1 Architecture
| Parameter name | Specific meaning | Default value | Optional value |
| ---------------------- | ------------------------------------------------------------ | --------------------------- | ------------------------------------------------------------ |
| name | Model arch | "RecModel" | ["RecModel"] |
| infer_output_key | inference output value | "feature" | ["feature", "logits"] |
| infer_add_softmax | whether to add softmax to the inference output | False | [True, False] |
| Backbone.name | Backbone name | ResNet50_last_stage_stride1 | other backbone provided by PaddleClas |
| Backbone.pretrained | Backbone pre-trained model | True | Boolean value or pre-trained model path |
| BackboneStopLayer.name | The name of the output layer in Backbone | True | The `full_name` of the feature output layer in Backbone |
| Neck.name | The name of the Neck part | VehicleNeck | the dictionary structure to be passed in, the specific input parameters for the Neck network layer |
| Neck.in_channels | Input dimension size of the Neck part | 2048 | the size is the same as BackboneStopLayer.name |
| Neck.out_channels | Output the dimension size of the Neck part, i.e. feature dimension size | 512 | int |
| Head.name | Name of the network Head part | CircleMargin | ArcMargin, etc. |
| Head.embedding_size | Feature dimension size | 512 | Consistent with Neck.out_channels |
| Head.class_num | number of classes | 3000 | int |
| Head.margin | margin value in CircleMargin | 0.35 | float |
| Head.scale | scale value in CircleMargin | 64 | int |
**Note**:
1. In PaddleClas, the `Neck` part is the connection between the Backbone and the embedding layer, and the `Head` part is the connection between the embedding layer and the classification layer.
2. `BackboneStopLayer.name` can be obtained by visualizing the model; for visualization, refer to [Netron](https://github.com/lutzroeder/netron) or [visualdl](https://github.com/PaddlePaddle/VisualDL).
3. Calling `tools/export_model.py` converts the model weights to an inference model, where the `infer_add_softmax` parameter controls whether a Softmax activation function is appended; the default in code is True (in classification tasks, the final output layer is followed by a Softmax activation function). In recognition tasks, the feature layer needs no activation function, so set it to False here.
#### 3.2 Evaluation metric
| Parameter name | Specific meaning | Default value | Optional value |
| -------------- | --------------------------- | --------------- | ---------------- |
| Recallk | Recall rate at top k (see the definition below) | [1, 5] | list, int |
| mAP | mean average precision of retrieval | None | None |
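For clarity, a common definition of the recall metric above (stated for reference, not quoted from the PaddleClas source):

$$
\mathrm{Recall@}K = \frac{1}{|Q|}\sum_{q \in Q} \mathbb{1}\big[\text{a correct match of } q \text{ appears in its top-}K \text{ retrieved results}\big]
$$

mAP is then the mean over all queries of the average precision of each ranked result list.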
# Configuration
---
## Introduction
This document introduces the configuration (files in `config/*.yaml`) of PaddleClas.
* Note: Some parameters do not appear in the yaml file (because they are not used for this file). During training or validation, you can use the command `-o` to update or add the specified parameters. For the example `-o checkpoints=./ckp_path/ppcls`, it means that the parameter `checkpoints` will be updated or added using the value `./ckp_path/ppcls`.
### Basic
| name | detail | default value | optional value |
|:---:|:---:|:---:|:---:|
| mode | running mode | "train" | ["train", "valid"] |
| checkpoints | checkpoint model path for resuming training process | "" | Str |
| last_epoch | last epoch for the training,used with checkpoints | -1 | int |
| pretrained_model | pretrained model path | "" | Str |
| load_static_weights | whether the pretrained model is saved in static mode | False | bool |
| model_save_dir | model stored path | "" | Str |
| classes_num | class number | 1000 | int |
| total_images | total images | 1281167 | int |
| save_interval | save interval | 1 | int |
| validate | whether to validate when training | TRUE | bool |
| valid_interval | valid interval | 1 | int |
| epochs | epoch | | int |
| topk | K value | 5 | int |
| image_shape | image size | [3,224,224] | list, shape: (3,) |
| use_mix | whether to use mixup | False | ['True', 'False'] |
| ls_epsilon | label_smoothing epsilon value| 0 | float |
| use_distillation | whether to use SSLD distillation training | False | bool |
## ARCHITECTURE
| name | detail | default value | optional value |
|:---:|:---:|:---:|:---:|
| name | model name | "ResNet50_vd" | one of 23 architectures |
| params | model parameters | {} | extra dictionary for the model structure, parameters such as `padding_type` in EfficientNet can be set here |
### LEARNING_RATE
| name | detail | default value | optional value |
|:---:|:---:|:---:|:---:|
| function | decay type (see the cosine schedule below) | "Linear" | ["Linear", "Cosine", <br> "Piecewise", "CosineWarmup"] |
| params.lr | initial learning rate | 0.1 | float |
| params.decay_epochs | milestones in piecewise decay | | list |
| params.gamma | gamma in piecewise decay | 0.1 | float |
| params.warmup_epoch | warmup epochs | 5 | int |
| params.steps | decay steps in linear decay | 100 | int |
| params.end_lr | end lr in linear decay | 0 | float |
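As an example, the "Cosine" decay type listed above typically follows the standard schedule (given here for reference):

$$
\eta_t = \frac{\eta_0}{2}\left(1 + \cos\frac{\pi t}{T}\right)
$$

where $\eta_0$ is `params.lr`, $t$ is the current epoch (or step), and $T$ is the total number of epochs.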
### OPTIMIZER
| name | detail | default value | optional value |
|:---:|:---:|:---:|:---:|
| function | optimizer name | "Momentum" | ["Momentum", "RmsProp"] |
| params.momentum | momentum value | 0.9 | float |
| regularizer.function | regularizer method name | "L2" | ["L1", "L2"] |
| regularizer.factor | regularizer factor | 0.0001 | float |
### reader
| name | detail |
|:---:|:---:|
| batch_size | batch size |
| num_workers | worker number |
| file_list | train list path |
| data_dir | train dataset path |
| shuffle_seed | seed |
processing
| function name | attribute name | detail |
|:---:|:---:|:---:|
| DecodeImage | to_rgb | decode to RGB |
| | to_np | to numpy |
| | channel_first | Channel first |
| RandCropImage | size | random crop |
| RandFlipImage | | random flip |
| NormalizeImage | scale | normalize image |
| | mean | mean |
| | std | std |
| | order | order |
| ToCHWImage | | to CHW |
| CropImage | size | crop size |
| ResizeImage | resize_short | resize according to short size |
mix preprocessing
| name| detail|
|:---:|:---:|
| MixupOperator.alpha | alpha value in mixup (see the sketch below) |
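For reference, Mixup with parameter `alpha` draws a mixing coefficient from a Beta distribution and blends both images and one-hot labels. A minimal standalone sketch (illustration only, not the PaddleClas operator):

```python
import numpy as np

def mixup_batch(images, labels, alpha=0.2):
    """Blend a batch with a randomly shuffled copy of itself (labels one-hot)."""
    lam = np.random.beta(alpha, alpha)           # mixing coefficient in [0, 1]
    index = np.random.permutation(len(images))   # random pairing within the batch
    mixed_images = lam * images + (1 - lam) * images[index]
    mixed_labels = lam * labels + (1 - lam) * labels[index]
    return mixed_images, mixed_labels
```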
@@ -5,7 +5,7 @@
* installing from pypi
```bash
-pip3 install paddleclas==2.2.0
+pip3 install paddleclas==2.2.1
```
* build own whl package and install
......
docs/images/wx_group.png: image updated (648.9 KB → 57.8 KB)
@@ -3,7 +3,7 @@
## Contents
* [Part 1](#第1期) (2021.07.08)
+* [Part 2](#第2期) (2021.07.27)
<a name="第1期"></a>
## Part 1
@@ -99,3 +99,26 @@
### Q1.20 Where is the `train_log` file of PaddleClas?
**A**: `train.log` is stored in the path where the weights are saved.
<a name="第2期"></a>
## Part 2

### Q2.1 Does the Möbius vector search algorithm currently used by PaddleClas support an index.add() feature like faiss? Also, does every newly built graph have to be trained? Is that training for retrieval speed-up or for building a similarity graph?
**A**: The search algorithm provided by Möbius is a graph-based approximate nearest neighbor search algorithm, and it currently supports two distance metrics: inner product and L2 distance. The index.add feature provided by faiss is not supported yet; if you need to add content to the retrieval library, you have to rebuild a new index from scratch. Each time an index is built, the search algorithm internally performs an operation similar to training; unlike the train interface provided by faiss, we call it build, and its main purpose is to speed up retrieval.

### Q2.2 Can every frame of a video be predicted frame by frame?
**A**: Yes, but PaddleClas does not support video input at the moment. You can try to modify the PaddleClas code, or convert the video into frame images in advance and then run prediction with PaddleClas.

### Q2.3 In a live-streaming scenario, we need real-time recognition of the live view, finding the target object within a few seconds of latency and drawing a box around it. Is this feasible?
**A**: To achieve real-time detection, the detection speed must meet real-time requirements. PP-YOLO is a lightweight object detection model provided by the Paddle team that strikes a good balance between detection speed and accuracy; you can try PP-YOLO for detection. For PP-YOLO usage, see: https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/configs/ppyolo/README_cn.md

### Q2.4 For unknown labels, adding them to the gallery dataset enables subsequent recognition without training, but if the upstream detection model cannot locate such unknown labels, does the detection model still need to be trained?
**A**: If the detection model performs poorly on your own dataset, it needs to be fine-tuned on your own detection dataset.

### Q2.5 When recompiling index.so on Mac, the error is: clang: error: unsupported option '-fopenmp'. How can this be handled?
**A**: This problem has been solved. To compile index.so on Mac, refer to the document: https://github.com/PaddlePaddle/PaddleClas/blob/develop/deploy/vector_search/README.md

### Q2.6 Does PaddleClas provide data augmentation for adjusting image brightness, contrast, saturation, hue, etc.?
**A**: PaddleClas provides a variety of data augmentation methods, which fall into 3 categories: 1. image transformation: AutoAugment, RandAugment; 2. image cropping: CutOut, RandErasing, HideAndSeek, GridMask; 3. image mixing: Mixup, Cutmix. Among them, RandAugment provides random combinations of multiple augmentation methods, which can cover brightness, contrast, saturation, hue, and other augmentation needs.
@@ -3,9 +3,9 @@
## Overview
The Twins network includes Twins-PCPVT and Twins-SVT, which focus on the meticulous design of the spatial attention mechanism, resulting in a simple but more effective solution. Since the architecture only involves matrix multiplication, for which current deep learning frameworks are highly optimized, it is very efficient and easy to implement. Moreover, it can achieve excellent performance in a variety of downstream vision tasks such as image classification, object detection, and semantic segmentation. [Paper](https://arxiv.org/abs/2104.13840)
-## Accuracy, FLOPS and Parameters
+## Accuracy, FLOPs and Parameters
-| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPS<br>(G) | Params<br>(M) |
+| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPs<br>(G) | Params<br>(M) |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| pcpvt_small | 0.8082 | 0.9552 | 0.812 | - | 3.7 | 24.1 |
| pcpvt_base | 0.8242 | 0.9619 | 0.827 | - | 6.4 | 43.8 |
......
@@ -5,7 +5,7 @@
* install via pip
```bash
-pip3 install paddleclas==2.2.0
+pip3 install paddleclas==2.2.1
```
* build the whl package locally and install
......
@@ -18,6 +18,7 @@ __dir__ = os.path.dirname(__file__)
sys.path.append(os.path.join(__dir__, ""))
sys.path.append(os.path.join(__dir__, "deploy"))

+from typing import Union, Generator
import argparse
import shutil
import textwrap
@@ -279,8 +280,13 @@ def args_cfg():
        "--save_dir",
        type=str,
        help="The directory to save prediction results as pre-label.")
-    parser.add_argument("--resize_short", type=int, default=256, help="")
-    parser.add_argument("--crop_size", type=int, default=224, help="")
+    parser.add_argument(
+        "--resize_short",
+        type=int,
+        default=256,
+        help="Resize according to short size.")
+    parser.add_argument(
+        "--crop_size", type=int, default=224, help="Center crop size.")

    args = parser.parse_args()
    return vars(args)
@@ -351,7 +357,7 @@ def download_with_progressbar(url, save_path):
def check_model_file(model_name):
    """Check that the model files exist; download and untar them when they do not.
    """
    storage_directory = partial(os.path.join, BASE_INFERENCE_MODEL_DIR,
                                model_name)
@@ -405,11 +411,11 @@ class PaddleClas(object):
        """Init PaddleClas with config.
        Args:
-            model_name: The model name supported by PaddleClas, default by None. If specified, override config.
-            inference_model_dir: The directory that contained model file and params file to be used, default by None. If specified, override config.
-            use_gpu: Wheather use GPU, default by None. If specified, override config.
-            batch_size: The batch size to pridict, default by None. If specified, override config.
-            topk: Return the top k prediction results with the highest score.
+            model_name (str, optional): The model name supported by PaddleClas. If specified, override config. Defaults to None.
+            inference_model_dir (str, optional): The directory that contains the model file and params file to be used. If specified, override config. Defaults to None.
+            use_gpu (bool, optional): Whether to use GPU. If specified, override config. Defaults to True.
+            batch_size (int, optional): The batch size to predict. If specified, override config. Defaults to 1.
+            topk (int, optional): Return the top k prediction results with the highest score. Defaults to 5.
        """
        super().__init__()
        self._config = init_config(model_name, inference_model_dir, use_gpu,
@@ -454,20 +460,26 @@ class PaddleClas(object):
            raise InputModelError(err)
        return

-    def predict(self, input_data, print_pred=False):
+    def predict(self, input_data: Union[str, np.array],
+                print_pred: bool=False) -> Generator[list, None, None]:
        """Predict input_data.
        Args:
-            input_data (str | NumPy.array): The path of image, or the directory containing images, or the URL of image from Internet.
-            print_pred (bool, optional): Wheather print the prediction result. Defaults to False.
+            input_data (Union[str, np.array]):
+                When the type is str, it is the path of an image, a directory containing images, or the URL of an image from the Internet.
+                When the type is np.array, it is the image data whose channel order is RGB.
+            print_pred (bool, optional): Whether to print the prediction result. Defaults to False.
        Raises:
            ImageTypeError: Illegal input_data.
        Yields:
-            list: The prediction result(s) of input_data by batch_size. For every one image, prediction result(s) is zipped as a dict, that includs topk "class_ids", "scores" and "label_names". The format is as follow: [{"class_ids": [...], "scores": [...], "label_names": [...]}, ...]
+            Generator[list, None, None]:
+                The prediction result(s) of input_data by batch_size. For every image,
+                the prediction result is zipped as a dict that includes the topk "class_ids", "scores" and "label_names".
+                The format is as follows: [{"class_ids": [...], "scores": [...], "label_names": [...]}, ...]
        """
        if isinstance(input_data, np.ndarray):
            outputs = self.cls_predictor.predict(input_data)
            yield self.cls_predictor.postprocess(outputs)
@@ -497,6 +509,7 @@ class PaddleClas(object):
                        f"Image file failed to read and has been skipped. The path: {img_path}"
                    )
                    continue
+                img = img[:, :, ::-1]
                img_list.append(img)
                img_path_list.append(img_path)
                cnt += 1
@@ -506,12 +519,12 @@ class PaddleClas(object):
                    preds = self.cls_predictor.postprocess(outputs,
                                                           img_path_list)
                    if print_pred and preds:
-                        for nu, pred in enumerate(preds):
+                        for pred in preds:
+                            filename = pred.pop("file_name")
                            pred_str = ", ".join(
                                [f"{k}: {pred[k]}" for k in pred])
                            print(
-                                f"filename: {img_path_list[nu]}, top-{topk}, {pred_str}"
-                            )
+                                f"filename: {filename}, top-{topk}, {pred_str}")

                    img_list = []
                    img_path_list = []
......
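For context, the whl-package entry point patched above is typically used as follows; this is a hedged sketch based on the docstring in this diff (model name and image path are example values):

```python
from paddleclas import PaddleClas

# Example values; any classification model name supported by PaddleClas works
clas = PaddleClas(model_name="ResNet50", topk=5)

# predict() is a generator that yields one list of result dicts per batch
for batch in clas.predict("docs/images/whl/demo.jpg", print_pred=True):
    print(batch)  # [{"class_ids": [...], "scores": [...], "label_names": [...]}, ...]
```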
@@ -104,7 +104,8 @@ class ConvBNLayer(TheseusLayer):
                 groups=1,
                 is_vd_mode=False,
                 act=None,
-                 lr_mult=1.0):
+                 lr_mult=1.0,
+                 data_format="NCHW"):
        super().__init__()
        self.is_vd_mode = is_vd_mode
        self.act = act
@@ -118,11 +119,13 @@ class ConvBNLayer(TheseusLayer):
            padding=(filter_size - 1) // 2,
            groups=groups,
            weight_attr=ParamAttr(learning_rate=lr_mult),
-            bias_attr=False)
+            bias_attr=False,
+            data_format=data_format)
        self.bn = BatchNorm(
            num_filters,
            param_attr=ParamAttr(learning_rate=lr_mult),
-            bias_attr=ParamAttr(learning_rate=lr_mult))
+            bias_attr=ParamAttr(learning_rate=lr_mult),
+            data_layout=data_format)
        self.relu = nn.ReLU()

    def forward(self, x):
@@ -136,14 +139,14 @@ class ConvBNLayer(TheseusLayer):

class BottleneckBlock(TheseusLayer):
-    def __init__(
-            self,
-            num_channels,
-            num_filters,
-            stride,
-            shortcut=True,
-            if_first=False,
-            lr_mult=1.0, ):
+    def __init__(self,
+                 num_channels,
+                 num_filters,
+                 stride,
+                 shortcut=True,
+                 if_first=False,
+                 lr_mult=1.0,
+                 data_format="NCHW"):
        super().__init__()

        self.conv0 = ConvBNLayer(
@@ -151,20 +154,23 @@ class BottleneckBlock(TheseusLayer):
            num_filters=num_filters,
            filter_size=1,
            act="relu",
-            lr_mult=lr_mult)
+            lr_mult=lr_mult,
+            data_format=data_format)
        self.conv1 = ConvBNLayer(
            num_channels=num_filters,
            num_filters=num_filters,
            filter_size=3,
            stride=stride,
            act="relu",
-            lr_mult=lr_mult)
+            lr_mult=lr_mult,
+            data_format=data_format)
        self.conv2 = ConvBNLayer(
            num_channels=num_filters,
            num_filters=num_filters * 4,
            filter_size=1,
            act=None,
-            lr_mult=lr_mult)
+            lr_mult=lr_mult,
+            data_format=data_format)

        if not shortcut:
            self.short = ConvBNLayer(
@@ -173,7 +179,8 @@ class BottleneckBlock(TheseusLayer):
                filter_size=1,
                stride=stride if if_first else 1,
                is_vd_mode=False if if_first else True,
-                lr_mult=lr_mult)
+                lr_mult=lr_mult,
+                data_format=data_format)
        self.relu = nn.ReLU()
        self.shortcut = shortcut
@@ -199,7 +206,8 @@ class BasicBlock(TheseusLayer):
                 stride,
                 shortcut=True,
                 if_first=False,
-                 lr_mult=1.0):
+                 lr_mult=1.0,
+                 data_format="NCHW"):
        super().__init__()
        self.stride = stride
@@ -209,13 +217,15 @@ class BasicBlock(TheseusLayer):
            filter_size=3,
            stride=stride,
            act="relu",
-            lr_mult=lr_mult)
+            lr_mult=lr_mult,
+            data_format=data_format)
        self.conv1 = ConvBNLayer(
            num_channels=num_filters,
            num_filters=num_filters,
            filter_size=3,
            act=None,
-            lr_mult=lr_mult)
+            lr_mult=lr_mult,
+            data_format=data_format)
        if not shortcut:
            self.short = ConvBNLayer(
                num_channels=num_channels,
@@ -223,7 +233,8 @@ class BasicBlock(TheseusLayer):
                filter_size=1,
                stride=stride if if_first else 1,
                is_vd_mode=False if if_first else True,
-                lr_mult=lr_mult)
+                lr_mult=lr_mult,
+                data_format=data_format)
        self.shortcut = shortcut
        self.relu = nn.ReLU()
@@ -256,7 +267,9 @@ class ResNet(TheseusLayer):
                 config,
                 version="vb",
                 class_num=1000,
-                 lr_mult_list=[1.0, 1.0, 1.0, 1.0, 1.0]):
+                 lr_mult_list=[1.0, 1.0, 1.0, 1.0, 1.0],
+                 data_format="NCHW",
+                 input_image_channel=3):
        super().__init__()
        self.cfg = config
@@ -279,22 +292,25 @@ class ResNet(TheseusLayer):
        self.stem_cfg = {
            #num_channels, num_filters, filter_size, stride
-            "vb": [[3, 64, 7, 2]],
-            "vd": [[3, 32, 3, 2], [32, 32, 3, 1], [32, 64, 3, 1]]
+            "vb": [[input_image_channel, 64, 7, 2]],
+            "vd":
+            [[input_image_channel, 32, 3, 2], [32, 32, 3, 1], [32, 64, 3, 1]]
        }

-        self.stem = nn.Sequential(*[
+        self.stem = nn.Sequential(* [
            ConvBNLayer(
                num_channels=in_c,
                num_filters=out_c,
                filter_size=k,
                stride=s,
                act="relu",
-                lr_mult=self.lr_mult_list[0])
+                lr_mult=self.lr_mult_list[0],
+                data_format=data_format)
            for in_c, out_c, k, s in self.stem_cfg[version]
        ])

-        self.max_pool = MaxPool2D(kernel_size=3, stride=2, padding=1)
+        self.max_pool = MaxPool2D(
+            kernel_size=3, stride=2, padding=1, data_format=data_format)
        block_list = []
        for block_idx in range(len(self.block_depth)):
            shortcut = False
...@@ -306,11 +322,12 @@ class ResNet(TheseusLayer): ...@@ -306,11 +322,12 @@ class ResNet(TheseusLayer):
stride=2 if i == 0 and block_idx != 0 else 1, stride=2 if i == 0 and block_idx != 0 else 1,
shortcut=shortcut, shortcut=shortcut,
if_first=block_idx == i == 0 if version == "vd" else True, if_first=block_idx == i == 0 if version == "vd" else True,
lr_mult=self.lr_mult_list[block_idx + 1])) lr_mult=self.lr_mult_list[block_idx + 1],
data_format=data_format))
shortcut = True shortcut = True
self.blocks = nn.Sequential(*block_list) self.blocks = nn.Sequential(*block_list)
self.avg_pool = AdaptiveAvgPool2D(1) self.avg_pool = AdaptiveAvgPool2D(1, data_format=data_format)
self.flatten = nn.Flatten() self.flatten = nn.Flatten()
self.avg_pool_channels = self.num_channels[-1] * 2 self.avg_pool_channels = self.num_channels[-1] * 2
stdv = 1.0 / math.sqrt(self.avg_pool_channels * 1.0) stdv = 1.0 / math.sqrt(self.avg_pool_channels * 1.0)
...@@ -319,13 +336,19 @@ class ResNet(TheseusLayer): ...@@ -319,13 +336,19 @@ class ResNet(TheseusLayer):
self.class_num, self.class_num,
weight_attr=ParamAttr(initializer=Uniform(-stdv, stdv))) weight_attr=ParamAttr(initializer=Uniform(-stdv, stdv)))
self.data_format = data_format
def forward(self, x): def forward(self, x):
x = self.stem(x) with paddle.static.amp.fp16_guard():
x = self.max_pool(x) if self.data_format == "NHWC":
x = self.blocks(x) x = paddle.transpose(x, [0, 2, 3, 1])
x = self.avg_pool(x) x.stop_gradient = True
x = self.flatten(x) x = self.stem(x)
x = self.fc(x) x = self.max_pool(x)
x = self.blocks(x)
x = self.avg_pool(x)
x = self.flatten(x)
x = self.fc(x)
return x return x
......
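The forward pass above now transposes NCHW input to NHWC once at the stem when data_format is "NHWC", and wraps the whole pass in fp16_guard for pure-fp16 static training. A minimal usage sketch, assuming the ResNet class and a NET_CONFIG dict are importable from this module (the import path and config key are assumptions):

    # Hypothetical sketch: build the backbone in NHWC layout with a 4-channel
    # input and run a dummy NCHW batch; forward() transposes internally.
    # NHWC conv/bn kernels generally require a GPU device.
    import paddle
    from ppcls.arch.backbone.legendary_models.resnet import ResNet, NET_CONFIG  # assumed path

    model = ResNet(config=NET_CONFIG["50"], version="vd",
                   data_format="NHWC", input_image_channel=4)
    x = paddle.randn([8, 4, 224, 224])  # dataloader still yields NCHW
    y = model(x)
    print(y.shape)  # [8, 1000]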
...@@ -56,10 +56,10 @@ class GroupAttention(nn.Layer): ...@@ -56,10 +56,10 @@ class GroupAttention(nn.Layer):
ws=1): ws=1):
super().__init__() super().__init__()
if ws == 1: if ws == 1:
raise Exception(f"ws {ws} should not be 1") raise Exception("ws {ws} should not be 1")
if dim % num_heads != 0: if dim % num_heads != 0:
raise Exception( raise Exception(
f"dim {dim} should be divided by num_heads {num_heads}.") "dim {dim} should be divided by num_heads {num_heads}.")
self.dim = dim self.dim = dim
self.num_heads = num_heads self.num_heads = num_heads
...@@ -78,15 +78,15 @@ class GroupAttention(nn.Layer): ...@@ -78,15 +78,15 @@ class GroupAttention(nn.Layer):
total_groups = h_group * w_group total_groups = h_group * w_group
x = x.reshape([B, h_group, self.ws, w_group, self.ws, C]).transpose( x = x.reshape([B, h_group, self.ws, w_group, self.ws, C]).transpose(
[0, 1, 3, 2, 4, 5]) [0, 1, 3, 2, 4, 5])
qkv = self.qkv(x).reshape( qkv = self.qkv(x).reshape([
[B, total_groups, -1, 3, self.num_heads, B, total_groups, self.ws**2, 3, self.num_heads, C // self.num_heads
C // self.num_heads]).transpose([3, 0, 1, 4, 2, 5]) ]).transpose([3, 0, 1, 4, 2, 5])
q, k, v = qkv[0], qkv[1], qkv[2] q, k, v = qkv[0], qkv[1], qkv[2]
attn = (q @k.transpose([0, 1, 2, 4, 3])) * self.scale attn = paddle.matmul(q, k.transpose([0, 1, 2, 4, 3])) * self.scale
attn = nn.Softmax(axis=-1)(attn) attn = nn.Softmax(axis=-1)(attn)
attn = self.attn_drop(attn) attn = self.attn_drop(attn)
attn = (attn @v).transpose([0, 1, 3, 2, 4]).reshape( attn = paddle.matmul(attn, v).transpose([0, 1, 3, 2, 4]).reshape(
[B, h_group, w_group, self.ws, self.ws, C]) [B, h_group, w_group, self.ws, self.ws, C])
x = attn.transpose([0, 1, 3, 2, 4, 5]).reshape([B, N, C]) x = attn.transpose([0, 1, 3, 2, 4, 5]).reshape([B, N, C])
...@@ -135,22 +135,23 @@ class Attention(nn.Layer): ...@@ -135,22 +135,23 @@ class Attention(nn.Layer):
if self.sr_ratio > 1: if self.sr_ratio > 1:
x_ = x.transpose([0, 2, 1]).reshape([B, C, H, W]) x_ = x.transpose([0, 2, 1]).reshape([B, C, H, W])
x_ = self.sr(x_).reshape([B, C, -1]).transpose([0, 2, 1]) tmp_n = H * W // self.sr_ratio**2
x_ = self.sr(x_).reshape([B, C, tmp_n]).transpose([0, 2, 1])
x_ = self.norm(x_) x_ = self.norm(x_)
kv = self.kv(x_).reshape( kv = self.kv(x_).reshape(
[B, -1, 2, self.num_heads, C // self.num_heads]).transpose( [B, tmp_n, 2, self.num_heads, C // self.num_heads]).transpose(
[2, 0, 3, 1, 4]) [2, 0, 3, 1, 4])
else: else:
kv = self.kv(x).reshape( kv = self.kv(x).reshape(
[B, -1, 2, self.num_heads, C // self.num_heads]).transpose( [B, N, 2, self.num_heads, C // self.num_heads]).transpose(
[2, 0, 3, 1, 4]) [2, 0, 3, 1, 4])
k, v = kv[0], kv[1] k, v = kv[0], kv[1]
attn = (q @k.transpose([0, 1, 3, 2])) * self.scale attn = paddle.matmul(q, k.transpose([0, 1, 3, 2])) * self.scale
attn = nn.Softmax(axis=-1)(attn) attn = nn.Softmax(axis=-1)(attn)
attn = self.attn_drop(attn) attn = self.attn_drop(attn)
x = (attn @v).transpose([0, 2, 1, 3]).reshape([B, N, C]) x = paddle.matmul(attn, v).transpose([0, 2, 1, 3]).reshape([B, N, C])
x = self.proj(x) x = self.proj(x)
x = self.proj_drop(x) x = self.proj_drop(x)
return x return x
...@@ -280,7 +281,7 @@ class PyramidVisionTransformer(nn.Layer): ...@@ -280,7 +281,7 @@ class PyramidVisionTransformer(nn.Layer):
img_size=224, img_size=224,
patch_size=16, patch_size=16,
in_chans=3, in_chans=3,
num_classes=1000, class_num=1000,
embed_dims=[64, 128, 256, 512], embed_dims=[64, 128, 256, 512],
num_heads=[1, 2, 4, 8], num_heads=[1, 2, 4, 8],
mlp_ratios=[4, 4, 4, 4], mlp_ratios=[4, 4, 4, 4],
...@@ -294,7 +295,7 @@ class PyramidVisionTransformer(nn.Layer): ...@@ -294,7 +295,7 @@ class PyramidVisionTransformer(nn.Layer):
sr_ratios=[8, 4, 2, 1], sr_ratios=[8, 4, 2, 1],
block_cls=Block): block_cls=Block):
super().__init__() super().__init__()
self.num_classes = num_classes self.class_num = class_num
self.depths = depths self.depths = depths
# patch_embed # patch_embed
...@@ -317,7 +318,6 @@ class PyramidVisionTransformer(nn.Layer): ...@@ -317,7 +318,6 @@ class PyramidVisionTransformer(nn.Layer):
self.create_parameter( self.create_parameter(
shape=[1, patch_num, embed_dims[i]], shape=[1, patch_num, embed_dims[i]],
default_initializer=zeros_)) default_initializer=zeros_))
self.add_parameter(f"pos_embeds_{i}", self.pos_embeds[i])
self.pos_drops.append(nn.Dropout(p=drop_rate)) self.pos_drops.append(nn.Dropout(p=drop_rate))
dpr = [ dpr = [
...@@ -354,7 +354,7 @@ class PyramidVisionTransformer(nn.Layer): ...@@ -354,7 +354,7 @@ class PyramidVisionTransformer(nn.Layer):
# classification head # classification head
self.head = nn.Linear(embed_dims[-1], self.head = nn.Linear(embed_dims[-1],
num_classes) if num_classes > 0 else Identity() class_num) if class_num > 0 else Identity()
# init weights # init weights
for pos_emb in self.pos_embeds: for pos_emb in self.pos_embeds:
...@@ -433,7 +433,7 @@ class CPVTV2(PyramidVisionTransformer): ...@@ -433,7 +433,7 @@ class CPVTV2(PyramidVisionTransformer):
img_size=224, img_size=224,
patch_size=4, patch_size=4,
in_chans=3, in_chans=3,
num_classes=1000, class_num=1000,
embed_dims=[64, 128, 256, 512], embed_dims=[64, 128, 256, 512],
num_heads=[1, 2, 4, 8], num_heads=[1, 2, 4, 8],
mlp_ratios=[4, 4, 4, 4], mlp_ratios=[4, 4, 4, 4],
...@@ -446,10 +446,10 @@ class CPVTV2(PyramidVisionTransformer): ...@@ -446,10 +446,10 @@ class CPVTV2(PyramidVisionTransformer):
depths=[3, 4, 6, 3], depths=[3, 4, 6, 3],
sr_ratios=[8, 4, 2, 1], sr_ratios=[8, 4, 2, 1],
block_cls=Block): block_cls=Block):
super().__init__(img_size, patch_size, in_chans, num_classes, super().__init__(img_size, patch_size, in_chans, class_num, embed_dims,
embed_dims, num_heads, mlp_ratios, qkv_bias, qk_scale, num_heads, mlp_ratios, qkv_bias, qk_scale, drop_rate,
drop_rate, attn_drop_rate, drop_path_rate, norm_layer, attn_drop_rate, drop_path_rate, norm_layer, depths,
depths, sr_ratios, block_cls) sr_ratios, block_cls)
del self.pos_embeds del self.pos_embeds
del self.cls_token del self.cls_token
self.pos_block = nn.LayerList( self.pos_block = nn.LayerList(
...@@ -488,7 +488,7 @@ class CPVTV2(PyramidVisionTransformer): ...@@ -488,7 +488,7 @@ class CPVTV2(PyramidVisionTransformer):
x = self.pos_block[i](x, H, W) # PEG here x = self.pos_block[i](x, H, W) # PEG here
if i < len(self.depths) - 1: if i < len(self.depths) - 1:
x = x.reshape([B, H, W, -1]).transpose([0, 3, 1, 2]) x = x.reshape([B, H, W, x.shape[-1]]).transpose([0, 3, 1, 2])
x = self.norm(x) x = self.norm(x)
return x.mean(axis=1) # GAP here return x.mean(axis=1) # GAP here
...@@ -499,7 +499,7 @@ class PCPVT(CPVTV2): ...@@ -499,7 +499,7 @@ class PCPVT(CPVTV2):
img_size=224, img_size=224,
patch_size=4, patch_size=4,
in_chans=3, in_chans=3,
num_classes=1000, class_num=1000,
embed_dims=[64, 128, 256], embed_dims=[64, 128, 256],
num_heads=[1, 2, 4], num_heads=[1, 2, 4],
mlp_ratios=[4, 4, 4], mlp_ratios=[4, 4, 4],
...@@ -512,10 +512,10 @@ class PCPVT(CPVTV2): ...@@ -512,10 +512,10 @@ class PCPVT(CPVTV2):
depths=[4, 4, 4], depths=[4, 4, 4],
sr_ratios=[4, 2, 1], sr_ratios=[4, 2, 1],
block_cls=SBlock): block_cls=SBlock):
super().__init__(img_size, patch_size, in_chans, num_classes, super().__init__(img_size, patch_size, in_chans, class_num, embed_dims,
embed_dims, num_heads, mlp_ratios, qkv_bias, qk_scale, num_heads, mlp_ratios, qkv_bias, qk_scale, drop_rate,
drop_rate, attn_drop_rate, drop_path_rate, norm_layer, attn_drop_rate, drop_path_rate, norm_layer, depths,
depths, sr_ratios, block_cls) sr_ratios, block_cls)
class ALTGVT(PCPVT): class ALTGVT(PCPVT):
......
...@@ -38,7 +38,7 @@ class CosMargin(paddle.nn.Layer): ...@@ -38,7 +38,7 @@ class CosMargin(paddle.nn.Layer):
input_norm = paddle.sqrt( input_norm = paddle.sqrt(
paddle.sum(paddle.square(input), axis=1, keepdim=True)) paddle.sum(paddle.square(input), axis=1, keepdim=True))
input = paddle.divide(input, x_norm) input = paddle.divide(input, input_norm)
weight = self.fc.weight weight = self.fc.weight
weight_norm = paddle.sqrt( weight_norm = paddle.sqrt(
......
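The one-line fix above replaces an undefined `x_norm` with `input_norm`, so the features are actually L2-normalized before the cosine product. The operation, sketched standalone:

    import paddle

    feat = paddle.randn([4, 128])
    norm = paddle.sqrt(paddle.sum(paddle.square(feat), axis=1, keepdim=True))
    feat = paddle.divide(feat, norm)  # rows now have unit L2 norm
    print(paddle.sum(paddle.square(feat), axis=1))  # ~[1., 1., 1., 1.]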
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
Eval: Eval:
- CELoss: - CELoss:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
Eval: Eval:
- CELoss: - CELoss:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 120
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_channel: &image_channel 4
image_shape: [*image_channel, 224, 224]
save_inference_dir: ./inference
# training model under @to_static
to_static: False
# mixed precision training
AMP:
scale_loss: 128.0
use_dynamic_loss_scaling: True
use_pure_fp16: &use_pure_fp16 True
# model architecture
Arch:
name: ResNet50
class_num: 1000
input_image_channel: *image_channel
data_format: "NHWC"
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
multi_precision: *use_pure_fp16
lr:
name: Piecewise
learning_rate: 0.1
decay_epochs: [30, 60, 90]
values: [0.1, 0.01, 0.001, 0.0001]
regularizer:
name: 'L2'
coeff: 0.0001
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
output_fp16: *use_pure_fp16
channel_num: *image_channel
sampler:
name: DistributedBatchSampler
batch_size: 32
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
output_fp16: *use_pure_fp16
channel_num: *image_channel
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
output_fp16: *use_pure_fp16
channel_num: *image_channel
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
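This config leans on YAML anchors (&image_channel, &use_pure_fp16) so the 4-channel layout and the pure-fp16 switch stay consistent across Global, Arch, Optimizer, and every NormalizeImage op. A quick consistency check, assuming the file is saved locally under the hypothetical name used below:

    import yaml

    cfg = yaml.safe_load(open("ResNet50_fp16_nhwc.yaml"))  # hypothetical filename
    assert cfg["Global"]["image_shape"][0] == 4
    assert cfg["Arch"]["input_image_channel"] == 4
    assert cfg["Optimizer"]["multi_precision"] == cfg["AMP"]["use_pure_fp16"]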
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 120
print_batch_step: 10
use_visualdl: False
image_channel: &image_channel 4
# used for static mode and model export
image_shape: [*image_channel, 224, 224]
save_inference_dir: ./inference
# training model under @to_static
to_static: False
use_dali: True
# mixed precision training
AMP:
scale_loss: 128.0
use_dynamic_loss_scaling: True
use_pure_fp16: &use_pure_fp16 False
# model architecture
Arch:
name: ResNet50
class_num: 1000
input_image_channel: *image_channel
data_format: "NHWC"
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Piecewise
learning_rate: 0.1
decay_epochs: [30, 60, 90]
values: [0.1, 0.01, 0.001, 0.0001]
regularizer:
name: 'L2'
coeff: 0.0001
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
output_fp16: *use_pure_fp16
channel_num: *image_channel
sampler:
name: DistributedBatchSampler
batch_size: 256
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
output_fp16: *use_pure_fp16
channel_num: *image_channel
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
output_fp16: *use_pure_fp16
channel_num: *image_channel
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 200
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_channel: &image_channel 4
image_shape: [*image_channel, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: SE_ResNeXt101_32x4d
class_num: 1000
input_image_channel: *image_channel
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
# mixed precision training
AMP:
scale_loss: 128.0
use_dynamic_loss_scaling: True
use_pure_fp16: &use_pure_fp16 True
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.1
regularizer:
name: 'L2'
coeff: 0.00007
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
output_fp16: *use_pure_fp16
channel_num: *image_channel
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
output_fp16: *use_pure_fp16
channel_num: *image_channel
sampler:
name: BatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
output_fp16: *use_pure_fp16
channel_num: *image_channel
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
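The AMP block above drives static-graph mixed precision; in dynamic graph the same two knobs map roughly onto paddle.amp.GradScaler (a sketch of the analogy, not what this config executes):

    import paddle

    scaler = paddle.amp.GradScaler(init_loss_scaling=128.0,
                                   use_dynamic_loss_scaling=True)
    # typical step: scaled = scaler.scale(loss); scaled.backward();
    #               scaler.minimize(optimizer, scaled)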
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 120
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# training model under @to_static
to_static: False
# model architecture
Arch:
name: alt_gvt_base
class_num: 1000
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Piecewise
learning_rate: 0.1
decay_epochs: [30, 60, 90]
values: [0.1, 0.01, 0.001, 0.0001]
regularizer:
name: 'L2'
coeff: 0.0001
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 120
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# training model under @to_static
to_static: False
# model architecture
Arch:
name: alt_gvt_large
class_num: 1000
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Piecewise
learning_rate: 0.1
decay_epochs: [30, 60, 90]
values: [0.1, 0.01, 0.001, 0.0001]
regularizer:
name: 'L2'
coeff: 0.0001
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 120
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# training model under @to_static
to_static: False
# model architecture
Arch:
name: alt_gvt_small
class_num: 1000
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Piecewise
learning_rate: 0.1
decay_epochs: [30, 60, 90]
values: [0.1, 0.01, 0.001, 0.0001]
regularizer:
name: 'L2'
coeff: 0.0001
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 10 epochs: 120
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -18,7 +18,7 @@ Global: ...@@ -18,7 +18,7 @@ Global:
# model architecture # model architecture
Arch: Arch:
name: ResNet50 name: pcpvt_base
class_num: 1000 class_num: 1000
# loss function config for training/eval process # loss function config for training/eval process
...@@ -49,8 +49,8 @@ DataLoader: ...@@ -49,8 +49,8 @@ DataLoader:
Train: Train:
dataset: dataset:
name: ImageNetDataset name: ImageNetDataset
image_root: ./dataset/chain_dataset/ image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/chain_dataset/train.txt cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops: transform_ops:
- DecodeImage: - DecodeImage:
to_rgb: True to_rgb: True
...@@ -77,8 +77,8 @@ DataLoader: ...@@ -77,8 +77,8 @@ DataLoader:
Eval: Eval:
dataset: dataset:
name: ImageNetDataset name: ImageNetDataset
image_root: ./dataset/chain_dataset/ image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/chain_dataset/val.txt cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops: transform_ops:
- DecodeImage: - DecodeImage:
to_rgb: True to_rgb: True
......
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 120
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# training model under @to_static
to_static: False
# model architecture
Arch:
name: pcpvt_large
class_num: 1000
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Piecewise
learning_rate: 0.1
decay_epochs: [30, 60, 90]
values: [0.1, 0.01, 0.001, 0.0001]
regularizer:
name: 'L2'
coeff: 0.0001
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 120
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# training model under @to_static
to_static: False
# model architecture
Arch:
name: pcpvt_small
class_num: 1000
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Piecewise
learning_rate: 0.1
decay_epochs: [30, 60, 90]
values: [0.1, 0.01, 0.001, 0.0001]
regularizer:
name: 'L2'
coeff: 0.0001
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
...@@ -16,13 +16,13 @@ Global: ...@@ -16,13 +16,13 @@ Global:
# model architecture # model architecture
Arch: Arch:
name: Xception41_deeplab name: Xception65
class_num: 1000 class_num: 1000
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
...@@ -22,7 +22,7 @@ Arch: ...@@ -22,7 +22,7 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1 epsilon: 0.1
Eval: Eval:
......
# global configs # global configs
Global: Global:
checkpoints: null checkpoints: null
# please download pretrained model via this link: pretrained_model: "https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/product_ResNet50_vd_Aliproduct_v1.0_pretrained.pdparams"
# https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/product_ResNet50_vd_Aliproduct_v1.0_pretrained.pdparams
pretrained_model: product_ResNet50_vd_Aliproduct_v1.0_pretrained
output_dir: ./output/ output_dir: ./output/
device: gpu device: gpu
save_interval: 10 save_interval: 10
......
# global configs # global configs
Global: Global:
checkpoints: null checkpoints: null
# please download pretrained model via this link: pretrained_model: "https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/product_ResNet50_vd_Aliproduct_v1.0_pretrained.pdparams"
# https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/product_ResNet50_vd_Aliproduct_v1.0_pretrained.pdparams
pretrained_model: product_ResNet50_vd_Aliproduct_v1.0_pretrained
output_dir: ./output/ output_dir: ./output/
device: gpu device: gpu
save_interval: 10 save_interval: 10
......
...@@ -53,10 +53,14 @@ def create_operators(params): ...@@ -53,10 +53,14 @@ def create_operators(params):
return ops return ops
def build_dataloader(config, mode, device, seed=None): def build_dataloader(config, mode, device, use_dali=False, seed=None):
assert mode in ['Train', 'Eval', 'Test', 'Gallery', 'Query' assert mode in ['Train', 'Eval', 'Test', 'Gallery', 'Query'
], "Mode should be Train, Eval, Test, Gallery, Query" ], "Mode should be Train, Eval, Test, Gallery, Query"
# build dataset # build dataset
if use_dali:
from ppcls.data.dataloader.dali import dali_dataloader
return dali_dataloader(config, mode, paddle.device.get_device(), seed)
config_dataset = config[mode]['dataset'] config_dataset = config[mode]['dataset']
config_dataset = copy.deepcopy(config_dataset) config_dataset = copy.deepcopy(config_dataset)
dataset_name = config_dataset.pop('name') dataset_name = config_dataset.pop('name')
......
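With the new use_dali flag, build_dataloader dispatches to the DALI pipeline before any paddle Dataset is built. A hedged call sketch (the config path and import location are assumptions for illustration):

    import paddle
    import yaml
    from ppcls.data import build_dataloader  # assumed import path

    cfg = yaml.safe_load(open("ppcls/configs/ImageNet/ResNet/ResNet50_fp16_dali.yaml"))  # hypothetical path
    loader = build_dataloader(cfg["DataLoader"], "Train",
                              paddle.device.get_device(),
                              use_dali=cfg["Global"].get("use_dali", False))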
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. # Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
# #
# Licensed under the Apache License, Version 2.0 (the "License"); # Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License. # you may not use this file except in compliance with the License.
...@@ -14,16 +14,17 @@ ...@@ -14,16 +14,17 @@
from __future__ import division from __future__ import division
import copy
import os import os
import numpy as np import numpy as np
from nvidia.dali.pipeline import Pipeline
import nvidia.dali.ops as ops import nvidia.dali.ops as ops
import nvidia.dali.types as types import nvidia.dali.types as types
from nvidia.dali.plugin.paddle import DALIGenericIterator
import paddle import paddle
from paddle import fluid from nvidia.dali import fn
from nvidia.dali.pipeline import Pipeline
from nvidia.dali.plugin.base_iterator import LastBatchPolicy
from nvidia.dali.plugin.paddle import DALIGenericIterator
class HybridTrainPipe(Pipeline): class HybridTrainPipe(Pipeline):
...@@ -46,10 +47,11 @@ class HybridTrainPipe(Pipeline): ...@@ -46,10 +47,11 @@ class HybridTrainPipe(Pipeline):
num_threads=4, num_threads=4,
seed=42, seed=42,
pad_output=False, pad_output=False,
output_dtype=types.FLOAT): output_dtype=types.FLOAT,
dataset='Train'):
super(HybridTrainPipe, self).__init__( super(HybridTrainPipe, self).__init__(
batch_size, num_threads, device_id, seed=seed) batch_size, num_threads, device_id, seed=seed)
self.input = ops.FileReader( self.input = ops.readers.File(
file_root=file_root, file_root=file_root,
file_list=file_list, file_list=file_list,
shard_id=shard_id, shard_id=shard_id,
...@@ -59,9 +61,9 @@ class HybridTrainPipe(Pipeline): ...@@ -59,9 +61,9 @@ class HybridTrainPipe(Pipeline):
# without additional reallocations # without additional reallocations
device_memory_padding = 211025920 device_memory_padding = 211025920
host_memory_padding = 140544512 host_memory_padding = 140544512
self.decode = ops.ImageDecoderRandomCrop( self.decode = ops.decoders.ImageRandomCrop(
device='mixed', device='mixed',
output_type=types.RGB, output_type=types.DALIImageType.RGB,
device_memory_padding=device_memory_padding, device_memory_padding=device_memory_padding,
host_memory_padding=host_memory_padding, host_memory_padding=host_memory_padding,
random_aspect_ratio=[lower, upper], random_aspect_ratio=[lower, upper],
...@@ -71,15 +73,14 @@ class HybridTrainPipe(Pipeline): ...@@ -71,15 +73,14 @@ class HybridTrainPipe(Pipeline):
device='gpu', resize_x=crop, resize_y=crop, interp_type=interp) device='gpu', resize_x=crop, resize_y=crop, interp_type=interp)
self.cmnp = ops.CropMirrorNormalize( self.cmnp = ops.CropMirrorNormalize(
device="gpu", device="gpu",
output_dtype=output_dtype, dtype=output_dtype,
output_layout=types.NCHW, output_layout='CHW',
crop=(crop, crop), crop=(crop, crop),
image_type=types.RGB,
mean=mean, mean=mean,
std=std, std=std,
pad_output=pad_output) pad_output=pad_output)
self.coin = ops.CoinFlip(probability=0.5) self.coin = ops.random.CoinFlip(probability=0.5)
self.to_int64 = ops.Cast(dtype=types.INT64, device="gpu") self.to_int64 = ops.Cast(dtype=types.DALIDataType.INT64, device="gpu")
def define_graph(self): def define_graph(self):
rng = self.coin() rng = self.coin()
...@@ -113,25 +114,24 @@ class HybridValPipe(Pipeline): ...@@ -113,25 +114,24 @@ class HybridValPipe(Pipeline):
output_dtype=types.FLOAT): output_dtype=types.FLOAT):
super(HybridValPipe, self).__init__( super(HybridValPipe, self).__init__(
batch_size, num_threads, device_id, seed=seed) batch_size, num_threads, device_id, seed=seed)
self.input = ops.FileReader( self.input = ops.readers.File(
file_root=file_root, file_root=file_root,
file_list=file_list, file_list=file_list,
shard_id=shard_id, shard_id=shard_id,
num_shards=num_shards, num_shards=num_shards,
random_shuffle=random_shuffle) random_shuffle=random_shuffle)
self.decode = ops.ImageDecoder(device="mixed", output_type=types.RGB) self.decode = ops.decoders.Image(device="mixed")
self.res = ops.Resize( self.res = ops.Resize(
device="gpu", resize_shorter=resize_shorter, interp_type=interp) device="gpu", resize_shorter=resize_shorter, interp_type=interp)
self.cmnp = ops.CropMirrorNormalize( self.cmnp = ops.CropMirrorNormalize(
device="gpu", device="gpu",
output_dtype=output_dtype, dtype=output_dtype,
output_layout=types.NCHW, output_layout='CHW',
crop=(crop, crop), crop=(crop, crop),
image_type=types.RGB,
mean=mean, mean=mean,
std=std, std=std,
pad_output=pad_output) pad_output=pad_output)
self.to_int64 = ops.Cast(dtype=types.INT64, device="gpu") self.to_int64 = ops.Cast(dtype=types.DALIDataType.INT64, device="gpu")
def define_graph(self): def define_graph(self):
jpegs, labels = self.input(name="Reader") jpegs, labels = self.input(name="Reader")
...@@ -144,64 +144,84 @@ class HybridValPipe(Pipeline): ...@@ -144,64 +144,84 @@ class HybridValPipe(Pipeline):
return self.epoch_size("Reader") return self.epoch_size("Reader")
def build(config, mode='train'): def dali_dataloader(config, mode, device, seed=None):
env = os.environ assert "gpu" in device, "gpu training is required for DALI"
assert config.get('use_gpu', device_id = int(device.split(':')[1])
True) == True, "gpu training is required for DALI" config_dataloader = config[mode]
assert not config.get( seed = 42 if seed is None else seed
'use_aa'), "auto augment is not supported by DALI reader" ops = [
assert float(env.get('FLAGS_fraction_of_gpu_memory_to_use', 0.92)) < 0.9, \ list(x.keys())[0]
"Please leave enough GPU memory for DALI workspace, e.g., by setting" \ for x in config_dataloader["dataset"]["transform_ops"]
" `export FLAGS_fraction_of_gpu_memory_to_use=0.8`" ]
support_ops_train = [
"DecodeImage", "NormalizeImage", "RandFlipImage", "RandCropImage"
]
support_ops_eval = [
"DecodeImage", "ResizeImage", "CropImage", "NormalizeImage"
]
if mode.lower() == 'train':
assert set(ops) == set(
support_ops_train
), "The supported trasform_ops for train_dataset in dali is : {}".format(
",".join(support_ops_train))
else:
assert set(ops) == set(
support_ops_eval
), "The supported trasform_ops for eval_dataset in dali is : {}".format(
",".join(support_ops_eval))
normalize_ops = [
op for op in config_dataloader["dataset"]["transform_ops"]
if "NormalizeImage" in op
][0]["NormalizeImage"]
channel_num = normalize_ops.get("channel_num", 3)
output_dtype = types.FLOAT16 if normalize_ops.get("output_fp16",
False) else types.FLOAT
dataset_config = config[mode.upper()] env = os.environ
# assert float(env.get('FLAGS_fraction_of_gpu_memory_to_use', 0.92)) < 0.9, \
# "Please leave enough GPU memory for DALI workspace, e.g., by setting" \
# " `export FLAGS_fraction_of_gpu_memory_to_use=0.8`"
gpu_num = paddle.fluid.core.get_cuda_device_count() if ( gpu_num = paddle.distributed.get_world_size()
'PADDLE_TRAINERS_NUM') and (
'PADDLE_TRAINER_ID'
) not in env else int(env.get('PADDLE_TRAINERS_NUM', 0))
batch_size = dataset_config.batch_size batch_size = config_dataloader["sampler"]["batch_size"]
assert batch_size % gpu_num == 0, \
"batch size must be multiple of number of devices"
batch_size = batch_size // gpu_num
file_root = dataset_config.data_dir file_root = config_dataloader["dataset"]["image_root"]
file_list = dataset_config.file_list file_list = config_dataloader["dataset"]["cls_label_path"]
interp = 1 # settings.interpolation or 1 # default to linear interp = 1 # settings.interpolation or 1 # default to linear
interp_map = { interp_map = {
0: types.INTERP_NN, # cv2.INTER_NEAREST 0: types.DALIInterpType.INTERP_NN, # cv2.INTER_NEAREST
1: types.INTERP_LINEAR, # cv2.INTER_LINEAR 1: types.DALIInterpType.INTERP_LINEAR, # cv2.INTER_LINEAR
2: types.INTERP_CUBIC, # cv2.INTER_CUBIC 2: types.DALIInterpType.INTERP_CUBIC, # cv2.INTER_CUBIC
4: types.INTERP_LANCZOS3, # XXX use LANCZOS3 for cv2.INTER_LANCZOS4 3: types.DALIInterpType.
INTERP_LANCZOS3, # XXX use LANCZOS3 for cv2.INTER_LANCZOS4
} }
output_dtype = (types.FLOAT16 if 'AMP' in config and
config.AMP.get("use_pure_fp16", False)
else types.FLOAT)
assert interp in interp_map, "interpolation method not supported by DALI" assert interp in interp_map, "interpolation method not supported by DALI"
interp = interp_map[interp] interp = interp_map[interp]
pad_output = False pad_output = channel_num == 4
image_shape = config.get("image_shape", None)
if image_shape and image_shape[0] == 4:
pad_output = True
transforms = { transforms = {
k: v k: v
for d in dataset_config["transforms"] for k, v in d.items() for d in config_dataloader["dataset"]["transform_ops"]
for k, v in d.items()
} }
scale = transforms["NormalizeImage"].get("scale", 1.0 / 255) scale = transforms["NormalizeImage"].get("scale", 1.0 / 255)
if isinstance(scale, str): scale = eval(scale) if isinstance(scale, str) else scale
scale = eval(scale)
mean = transforms["NormalizeImage"].get("mean", [0.485, 0.456, 0.406]) mean = transforms["NormalizeImage"].get("mean", [0.485, 0.456, 0.406])
std = transforms["NormalizeImage"].get("std", [0.229, 0.224, 0.225]) std = transforms["NormalizeImage"].get("std", [0.229, 0.224, 0.225])
mean = [v / scale for v in mean] mean = [v / scale for v in mean]
std = [v / scale for v in std] std = [v / scale for v in std]
if mode == "train": sampler_name = config_dataloader["sampler"].get("name",
"DistributedBatchSampler")
assert sampler_name in ["DistributedBatchSampler", "BatchSampler"]
if mode.lower() == "train":
resize_shorter = 256 resize_shorter = 256
crop = transforms["RandCropImage"]["size"] crop = transforms["RandCropImage"]["size"]
scale = transforms["RandCropImage"].get("scale", [0.08, 1.]) scale = transforms["RandCropImage"].get("scale", [0.08, 1.])
...@@ -229,133 +249,71 @@ def build(config, mode='train'): ...@@ -229,133 +249,71 @@ def build(config, mode='train'):
device_id, device_id,
shard_id, shard_id,
num_shards, num_shards,
seed=42 + shard_id, seed=seed + shard_id,
pad_output=pad_output, pad_output=pad_output,
output_dtype=output_dtype) output_dtype=output_dtype)
pipe.build() pipe.build()
pipelines = [pipe] pipelines = [pipe]
sample_per_shard = len(pipe) // num_shards # sample_per_shard = len(pipe) // num_shards
else: else:
pipelines = [] pipe = HybridTrainPipe(
places = fluid.framework.cuda_places() file_root,
num_shards = len(places) file_list,
for idx, p in enumerate(places): batch_size,
place = fluid.core.Place() resize_shorter,
place.set_place(p) crop,
device_id = place.gpu_device_id() min_area,
pipe = HybridTrainPipe( lower,
file_root, upper,
file_list, interp,
batch_size, mean,
resize_shorter, std,
crop, device_id=device_id,
min_area, shard_id=0,
lower, num_shards=1,
upper, seed=seed,
interp,
mean,
std,
device_id,
idx,
num_shards,
seed=42 + idx,
pad_output=pad_output, pad_output=pad_output,
output_dtype=output_dtype) output_dtype=output_dtype)
pipe.build() pipe.build()
pipelines.append(pipe) pipelines = [pipe]
sample_per_shard = len(pipelines[0]) # sample_per_shard = len(pipelines[0])
return DALIGenericIterator( return DALIGenericIterator(
pipelines, ['feed_image', 'feed_label'], size=sample_per_shard) pipelines, ['data', 'label'], reader_name='Reader')
else: else:
resize_shorter = transforms["ResizeImage"].get("resize_short", 256) resize_shorter = transforms["ResizeImage"].get("resize_short", 256)
crop = transforms["CropImage"]["size"] crop = transforms["CropImage"]["size"]
if 'PADDLE_TRAINER_ID' in env and 'PADDLE_TRAINERS_NUM' in env and sampler_name == "DistributedBatchSampler":
shard_id = int(env['PADDLE_TRAINER_ID'])
num_shards = int(env['PADDLE_TRAINERS_NUM'])
device_id = int(env['FLAGS_selected_gpus'])
p = fluid.framework.cuda_places()[0] pipe = HybridValPipe(
place = fluid.core.Place() file_root,
place.set_place(p) file_list,
device_id = place.gpu_device_id() batch_size,
pipe = HybridValPipe( resize_shorter,
file_root, crop,
file_list, interp,
batch_size, mean,
resize_shorter, std,
crop, device_id=device_id,
interp, shard_id=shard_id,
mean, num_shards=num_shards,
std, pad_output=pad_output,
device_id=device_id, output_dtype=output_dtype)
pad_output=pad_output, else:
output_dtype=output_dtype) pipe = HybridValPipe(
file_root,
file_list,
batch_size,
resize_shorter,
crop,
interp,
mean,
std,
device_id=device_id,
pad_output=pad_output,
output_dtype=output_dtype)
pipe.build() pipe.build()
return DALIGenericIterator( return DALIGenericIterator(
pipe, ['feed_image', 'feed_label'], [pipe], ['data', 'label'], reader_name="Reader")
size=len(pipe),
dynamic_shape=True,
fill_last_batch=True,
last_batch_padded=True)
-def train(config):
-    return build(config, 'train')
-
-
-def val(config):
-    return build(config, 'valid')
-
-
-def _to_Tensor(lod_tensor, dtype):
-    data_tensor = fluid.layers.create_tensor(dtype=dtype)
-    data = np.array(lod_tensor).astype(dtype)
-    fluid.layers.assign(data, data_tensor)
-    return data_tensor
-
-
-def normalize(feeds, config):
-    image, label = feeds['image'], feeds['label']
-    img_mean = np.array([0.485, 0.456, 0.406]).reshape((3, 1, 1))
-    img_std = np.array([0.229, 0.224, 0.225]).reshape((3, 1, 1))
-    image = fluid.layers.cast(image, 'float32')
-    costant = fluid.layers.fill_constant(
-        shape=[1], value=255.0, dtype='float32')
-    image = fluid.layers.elementwise_div(image, costant)
-    mean = fluid.layers.create_tensor(dtype="float32")
-    fluid.layers.assign(input=img_mean.astype("float32"), output=mean)
-    std = fluid.layers.create_tensor(dtype="float32")
-    fluid.layers.assign(input=img_std.astype("float32"), output=std)
-    image = fluid.layers.elementwise_sub(image, mean)
-    image = fluid.layers.elementwise_div(image, std)
-    image.stop_gradient = True
-    feeds['image'] = image
-    return feeds
-
-
-def mix(feeds, config, is_train=True):
-    env = os.environ
-    gpu_num = paddle.fluid.core.get_cuda_device_count() if (
-        'PADDLE_TRAINERS_NUM') and (
-            'PADDLE_TRAINER_ID'
-        ) not in env else int(env.get('PADDLE_TRAINERS_NUM', 0))
-    batch_size = config.TRAIN.batch_size // gpu_num
-    images = feeds['image']
-    label = feeds['label']
-    # TODO: hard code here, should be fixed!
-    alpha = 0.2
-    idx = _to_Tensor(np.random.permutation(batch_size), 'int32')
-    lam = np.random.beta(alpha, alpha)
-    images = lam * images + (1 - lam) * paddle.fluid.layers.gather(images, idx)
-    feed = {
-        'image': images,
-        'feed_y_a': label,
-        'feed_y_b': paddle.fluid.layers.gather(label, idx),
-        'feed_lam': _to_Tensor([lam] * batch_size, 'float32')
-    }
-    return feed if is_train else feeds
@@ -197,14 +197,26 @@ class NormalizeImage(object):
     """ normalize image such as substract mean, divide std
     """
 
-    def __init__(self, scale=None, mean=None, std=None, order='chw'):
+    def __init__(self,
+                 scale=None,
+                 mean=None,
+                 std=None,
+                 order='chw',
+                 output_fp16=False,
+                 channel_num=3):
         if isinstance(scale, str):
             scale = eval(scale)
+        assert channel_num in [
+            3, 4
+        ], "channel number of input image should be set to 3 or 4."
+        self.channel_num = channel_num
+        self.output_dtype = 'float16' if output_fp16 else 'float32'
         self.scale = np.float32(scale if scale is not None else 1.0 / 255.0)
+        self.order = order
         mean = mean if mean is not None else [0.485, 0.456, 0.406]
         std = std if std is not None else [0.229, 0.224, 0.225]
 
-        shape = (3, 1, 1) if order == 'chw' else (1, 1, 3)
+        shape = (3, 1, 1) if self.order == 'chw' else (1, 1, 3)
         self.mean = np.array(mean).reshape(shape).astype('float32')
         self.std = np.array(std).reshape(shape).astype('float32')
@@ -215,7 +227,20 @@ class NormalizeImage(object):
         assert isinstance(img,
                           np.ndarray), "invalid input 'img' in NormalizeImage"
-        return (img.astype('float32') * self.scale - self.mean) / self.std
+
+        img = (img.astype('float32') * self.scale - self.mean) / self.std
+
+        if self.channel_num == 4:
+            img_h = img.shape[1] if self.order == 'chw' else img.shape[0]
+            img_w = img.shape[2] if self.order == 'chw' else img.shape[1]
+            pad_zeros = np.zeros(
+                (1, img_h, img_w)) if self.order == 'chw' else np.zeros(
+                    (img_h, img_w, 1))
+            img = (np.concatenate(
+                (img, pad_zeros), axis=0)
+                   if self.order == 'chw' else np.concatenate(
+                       (img, pad_zeros), axis=2))
+        return img.astype(self.output_dtype)
 
 
 class ToCHWImage(object):
......
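Note: the new `channel_num=4` path zero-pads a fourth channel and casts to the configured dtype, presumably so a 4-channel fp16 tensor can be fed straight to the benchmark predictor. A standalone sketch of the behaviour (illustrative values; `order=''` means HWC input here):

    import numpy as np

    norm = NormalizeImage(order='', output_fp16=True, channel_num=4)
    img = (np.random.rand(224, 224, 3) * 255).astype(np.uint8)  # HWC uint8
    out = norm(img)  # a zero channel is appended, result cast to float16
    print(out.shape, out.dtype)  # (224, 224, 4) float16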
@@ -17,6 +17,7 @@ from __future__ import print_function
 import os
 import sys
 import numpy as np
 
 __dir__ = os.path.dirname(os.path.abspath(__file__))
 sys.path.append(os.path.abspath(os.path.join(__dir__, '../../')))
@@ -40,7 +41,7 @@ from ppcls.arch import apply_to_static
 from ppcls.loss import build_loss
 from ppcls.metric import build_metrics
 from ppcls.optimizer import build_optimizer
-from ppcls.utils.save_load import load_dygraph_pretrain
+from ppcls.utils.save_load import load_dygraph_pretrain, load_dygraph_pretrain_from_url
 from ppcls.utils.save_load import init_model
 from ppcls.utils import save_load
@@ -78,8 +79,12 @@ class Trainer(object):
             apply_to_static(self.config, self.model)
 
         if self.config["Global"]["pretrained_model"] is not None:
-            load_dygraph_pretrain(self.model,
-                                  self.config["Global"]["pretrained_model"])
+            if self.config["Global"]["pretrained_model"].startswith("http"):
+                load_dygraph_pretrain_from_url(
+                    self.model, self.config["Global"]["pretrained_model"])
+            else:
+                load_dygraph_pretrain(
+                    self.model, self.config["Global"]["pretrained_model"])
 
         if self.config["Global"]["distributed"]:
             self.model = paddle.DataParallel(self.model)
@@ -99,10 +104,25 @@ class Trainer(object):
         self.query_dataloader = None
         self.eval_mode = self.config["Global"].get("eval_mode",
                                                    "classification")
+        self.amp = True if "AMP" in self.config else False
+        if self.amp and self.config["AMP"] is not None:
+            self.scale_loss = self.config["AMP"].get("scale_loss", 1.0)
+            self.use_dynamic_loss_scaling = self.config["AMP"].get(
+                "use_dynamic_loss_scaling", False)
+        else:
+            self.scale_loss = 1.0
+            self.use_dynamic_loss_scaling = False
+        if self.amp:
+            AMP_RELATED_FLAGS_SETTING = {
+                'FLAGS_cudnn_batchnorm_spatial_persistent': 1,
+                'FLAGS_max_inplace_grad_add': 8,
+            }
+            paddle.fluid.set_flags(AMP_RELATED_FLAGS_SETTING)
         self.train_loss_func = None
         self.eval_loss_func = None
         self.train_metric_func = None
         self.eval_metric_func = None
+        self.use_dali = self.config['Global'].get("use_dali", False)
 
     def train(self):
         # build train loss and metric info
@@ -117,8 +137,8 @@ class Trainer(object):
             self.train_metric_func = build_metrics(metric_config)
 
         if self.train_dataloader is None:
-            self.train_dataloader = build_dataloader(self.config["DataLoader"],
-                                                     "Train", self.device)
+            self.train_dataloader = build_dataloader(
+                self.config["DataLoader"], "Train", self.device, self.use_dali)
 
         step_each_epoch = len(self.train_dataloader)
@@ -134,7 +154,7 @@ class Trainer(object):
             "metric": 0.0,
             "epoch": 0,
         }
         # key:
         # val: metrics list word
         output_info = dict()
         time_info = {
@@ -152,31 +172,52 @@ class Trainer(object):
             if metric_info is not None:
                 best_metric.update(metric_info)
 
+        # for amp training
+        if self.amp:
+            scaler = paddle.amp.GradScaler(
+                init_loss_scaling=self.scale_loss,
+                use_dynamic_loss_scaling=self.use_dynamic_loss_scaling)
+
         tic = time.time()
         max_iter = len(self.train_dataloader) - 1 if platform.system(
         ) == "Windows" else len(self.train_dataloader)
         for epoch_id in range(best_metric["epoch"] + 1,
                               self.config["Global"]["epochs"] + 1):
             acc = 0.0
-            for iter_id, batch in enumerate(self.train_dataloader()):
+            train_dataloader = self.train_dataloader if self.use_dali else self.train_dataloader(
+            )
+            for iter_id, batch in enumerate(train_dataloader):
                 if iter_id >= max_iter:
                     break
                 if iter_id == 5:
                     for key in time_info:
                         time_info[key].reset()
                 time_info["reader_cost"].update(time.time() - tic)
+                if self.use_dali:
+                    batch = [
+                        paddle.to_tensor(batch[0]['data']),
+                        paddle.to_tensor(batch[0]['label'])
+                    ]
                 batch_size = batch[0].shape[0]
                 batch[1] = batch[1].reshape([-1, 1]).astype("int64")
 
                 global_step += 1
                 # image input
-                if not self.is_rec:
-                    out = self.model(batch[0])
-                else:
-                    out = self.model(batch[0], batch[1])
-
-                # calc loss
-                loss_dict = self.train_loss_func(out, batch[1])
+                if self.amp:
+                    with paddle.amp.auto_cast(custom_black_list={
+                            "flatten_contiguous_range", "greater_than"
+                    }):
+                        out = self.forward(batch)
+                        loss_dict = self.train_loss_func(out, batch[1])
+                else:
+                    out = self.forward(batch)
+
+                    # calc loss
+                    if self.config["DataLoader"]["Train"]["dataset"].get(
+                            "batch_transform_ops", None):
+                        loss_dict = self.train_loss_func(out, batch[1:])
+                    else:
+                        loss_dict = self.train_loss_func(out, batch[1])
 
                 for key in loss_dict:
                     if not key in output_info:
@@ -193,8 +234,13 @@ class Trainer(object):
                                                  batch_size)
 
                 # step opt and lr
-                loss_dict["loss"].backward()
-                optimizer.step()
+                if self.amp:
+                    scaled = scaler.scale(loss_dict["loss"])
+                    scaled.backward()
+                    scaler.minimize(optimizer, scaled)
+                else:
+                    loss_dict["loss"].backward()
+                    optimizer.step()
                 optimizer.clear_grad()
                 lr_sch.step()
@@ -237,7 +283,8 @@ class Trainer(object):
                         step=global_step,
                         writer=self.vdl_writer)
                 tic = time.time()
 
+            if self.use_dali:
+                self.train_dataloader.reset()
             metric_msg = ", ".join([
                 "{}: {:.5f}".format(key, output_info[key].avg)
                 for key in output_info
@@ -307,7 +354,8 @@ class Trainer(object):
         if self.eval_mode == "classification":
             if self.eval_dataloader is None:
                 self.eval_dataloader = build_dataloader(
-                    self.config["DataLoader"], "Eval", self.device)
+                    self.config["DataLoader"], "Eval", self.device,
+                    self.use_dali)
 
             if self.eval_metric_func is None:
                 metric_config = self.config.get("Metric")
@@ -321,11 +369,13 @@ class Trainer(object):
         elif self.eval_mode == "retrieval":
             if self.gallery_dataloader is None:
                 self.gallery_dataloader = build_dataloader(
-                    self.config["DataLoader"]["Eval"], "Gallery", self.device)
+                    self.config["DataLoader"]["Eval"], "Gallery", self.device,
+                    self.use_dali)
 
             if self.query_dataloader is None:
                 self.query_dataloader = build_dataloader(
-                    self.config["DataLoader"]["Eval"], "Query", self.device)
+                    self.config["DataLoader"]["Eval"], "Query", self.device,
+                    self.use_dali)
             # build metric info
             if self.eval_metric_func is None:
                 metric_config = self.config.get("Metric", None)
@@ -341,6 +391,13 @@ class Trainer(object):
         self.model.train()
         return eval_result
 
+    def forward(self, batch):
+        if not self.is_rec:
+            out = self.model(batch[0])
+        else:
+            out = self.model(batch[0], batch[1])
+        return out
+
     @paddle.no_grad()
     def eval_cls(self, epoch_id=0):
         output_info = dict()
@@ -354,24 +411,27 @@ class Trainer(object):
         metric_key = None
         tic = time.time()
+        eval_dataloader = self.eval_dataloader if self.use_dali else self.eval_dataloader(
+        )
         max_iter = len(self.eval_dataloader) - 1 if platform.system(
         ) == "Windows" else len(self.eval_dataloader)
-        for iter_id, batch in enumerate(self.eval_dataloader()):
+        for iter_id, batch in enumerate(eval_dataloader):
             if iter_id >= max_iter:
                 break
             if iter_id == 5:
                 for key in time_info:
                     time_info[key].reset()
+            if self.use_dali:
+                batch = [
+                    paddle.to_tensor(batch[0]['data']),
+                    paddle.to_tensor(batch[0]['label'])
+                ]
             time_info["reader_cost"].update(time.time() - tic)
             batch_size = batch[0].shape[0]
             batch[0] = paddle.to_tensor(batch[0]).astype("float32")
             batch[1] = batch[1].reshape([-1, 1]).astype("int64")
             # image input
-            if self.is_rec:
-                out = self.model(batch[0], batch[1])
-            else:
-                out = self.model(batch[0])
+            out = self.forward(batch)
             # calc loss
             if self.eval_loss_func is not None:
                 loss_dict = self.eval_loss_func(out, batch[-1])
@@ -419,7 +479,8 @@ class Trainer(object):
                     len(self.eval_dataloader), metric_msg, time_msg, ips_msg))
 
             tic = time.time()
 
+        if self.use_dali:
+            self.eval_dataloader.reset()
         metric_msg = ", ".join([
             "{}: {:.5f}".format(key, output_info[key].avg)
             for key in output_info
@@ -434,7 +495,6 @@ class Trainer(object):
     def eval_retrieval(self, epoch_id=0):
         self.model.eval()
-        cum_similarity_matrix = None
         # step1. build gallery
         gallery_feas, gallery_img_id, gallery_unique_id = self._cal_feature(
             name='gallery')
@@ -509,14 +569,20 @@ class Trainer(object):
         has_unique_id = False
         max_iter = len(dataloader) - 1 if platform.system(
         ) == "Windows" else len(dataloader)
-        for idx, batch in enumerate(dataloader(
-        )):  # load is very time-consuming
+        dataloader_tmp = dataloader if self.use_dali else dataloader()
+        for idx, batch in enumerate(
+                dataloader_tmp):  # load is very time-consuming
            if idx >= max_iter:
                break
            if idx % self.config["Global"]["print_batch_step"] == 0:
                logger.info(
                    f"{name} feature calculation process: [{idx}/{len(dataloader)}]"
                )
+            if self.use_dali:
+                batch = [
+                    paddle.to_tensor(batch[0]['data']),
+                    paddle.to_tensor(batch[0]['label'])
+                ]
            batch = [paddle.to_tensor(x) for x in batch]
            batch[1] = batch[1].reshape([-1, 1]).astype("int64")
            if len(batch) == 3:
@@ -542,7 +608,8 @@ class Trainer(object):
                 all_image_id = paddle.concat([all_image_id, batch[1]])
                 if has_unique_id:
                     all_unique_id = paddle.concat([all_unique_id, batch[2]])
 
+        if self.use_dali:
+            dataloader_tmp.reset()
         if paddle.distributed.get_world_size() > 1:
             feat_list = []
             img_id_list = []
......
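Note: the AMP path added to the Trainer pairs `paddle.amp.auto_cast` with a `paddle.amp.GradScaler`. A minimal, self-contained sketch of the same pattern (toy model and data; the scaler arguments mirror the `scale_loss` / `use_dynamic_loss_scaling` config keys):

    import paddle
    import paddle.nn.functional as F

    model = paddle.nn.Linear(8, 2)
    optimizer = paddle.optimizer.Momentum(
        learning_rate=0.01, parameters=model.parameters())
    scaler = paddle.amp.GradScaler(
        init_loss_scaling=128.0, use_dynamic_loss_scaling=True)

    x = paddle.rand([4, 8])
    label = paddle.randint(0, 2, [4, 1])
    with paddle.amp.auto_cast():        # run the forward pass in fp16 where safe
        loss = F.cross_entropy(model(x), label)
    scaled = scaler.scale(loss)         # scale the loss against fp16 underflow
    scaled.backward()
    scaler.minimize(optimizer, scaled)  # unscale gradients, then apply the update
    optimizer.clear_grad()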
@@ -4,7 +4,7 @@ import paddle
 import paddle.nn as nn
 
 from ppcls.utils import logger
 
-from .celoss import CELoss
+from .celoss import CELoss, MixCELoss
 from .googlenetloss import GoogLeNetLoss
 from .centerloss import CenterLoss
 from .emlloss import EmlLoss
@@ -30,7 +30,6 @@ class CombinedLoss(nn.Layer):
         assert isinstance(config_list, list), (
             'operator config should be a list')
         for config in config_list:
-            print(config)
             assert isinstance(config,
                               dict) and len(config) == 1, "yaml format error"
             name = list(config)[0]
......
@@ -18,6 +18,10 @@ import paddle.nn.functional as F
 
 
 class CELoss(nn.Layer):
+    """
+    Cross entropy loss
+    """
+
     def __init__(self, epsilon=None):
         super().__init__()
         if epsilon is not None and (epsilon <= 0 or epsilon >= 1):
@@ -50,3 +54,21 @@ class CELoss(nn.Layer):
             loss = F.cross_entropy(x, label=label, soft_label=soft_label)
         loss = loss.mean()
         return {"CELoss": loss}
+
+
+class MixCELoss(CELoss):
+    """
+    Cross entropy loss with mix(mixup, cutmix, fmix)
+    """
+
+    def __init__(self, epsilon=None):
+        super().__init__()
+        self.epsilon = epsilon
+
+    def __call__(self, input, batch):
+        target0, target1, lam = batch
+        loss0 = super().forward(input, target0)["CELoss"]
+        loss1 = super().forward(input, target1)["CELoss"]
+        loss = lam * loss0 + (1.0 - lam) * loss1
+        loss = paddle.mean(loss)
+        return {"MixCELoss": loss}
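Note: a quick illustration of how `MixCELoss` is called — the second argument bundles the two label tensors produced by mixup/cutmix and the mixing coefficient (toy shapes, illustrative values only):

    import paddle

    loss_func = MixCELoss()
    logits = paddle.rand([4, 10])          # model output
    y_a = paddle.randint(0, 10, [4, 1])    # labels of the original images
    y_b = paddle.randint(0, 10, [4, 1])    # labels of the mixed-in images
    lam = 0.7                              # drawn from Beta(alpha, alpha) in practice
    loss = loss_func(logits, (y_a, y_b, lam))
    print(loss["MixCELoss"])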
@@ -41,7 +41,7 @@ def build_lr_scheduler(lr_config, epochs, step_each_epoch):
     return lr
 
 
-def build_optimizer(config, epochs, step_each_epoch, parameters):
+def build_optimizer(config, epochs, step_each_epoch, parameters=None):
     config = copy.deepcopy(config)
     # step1 build lr
     lr = build_lr_scheduler(config.pop('lr'), epochs, step_each_epoch)
......
@@ -33,12 +33,14 @@ class Momentum(object):
                  learning_rate,
                  momentum,
                  weight_decay=None,
-                 grad_clip=None):
+                 grad_clip=None,
+                 multi_precision=False):
         super(Momentum, self).__init__()
         self.learning_rate = learning_rate
         self.momentum = momentum
         self.weight_decay = weight_decay
         self.grad_clip = grad_clip
+        self.multi_precision = multi_precision
 
     def __call__(self, parameters):
         opt = optim.Momentum(
@@ -46,6 +48,7 @@ class Momentum(object):
             momentum=self.momentum,
             weight_decay=self.weight_decay,
             grad_clip=self.grad_clip,
+            multi_precision=self.multi_precision,
             parameters=parameters)
         return opt
@@ -60,7 +63,8 @@ class Adam(object):
                  weight_decay=None,
                  grad_clip=None,
                  name=None,
-                 lazy_mode=False):
+                 lazy_mode=False,
+                 multi_precision=False):
         self.learning_rate = learning_rate
         self.beta1 = beta1
         self.beta2 = beta2
@@ -71,6 +75,7 @@ class Adam(object):
         self.grad_clip = grad_clip
         self.name = name
         self.lazy_mode = lazy_mode
+        self.multi_precision = multi_precision
 
     def __call__(self, parameters):
         opt = optim.Adam(
@@ -82,6 +87,7 @@ class Adam(object):
             grad_clip=self.grad_clip,
             name=self.name,
             lazy_mode=self.lazy_mode,
+            multi_precision=self.multi_precision,
             parameters=parameters)
         return opt
@@ -104,7 +110,8 @@ class RMSProp(object):
                  rho=0.95,
                  epsilon=1e-6,
                  weight_decay=None,
-                 grad_clip=None):
+                 grad_clip=None,
+                 multi_precision=False):
         super(RMSProp, self).__init__()
         self.learning_rate = learning_rate
         self.momentum = momentum
@@ -122,4 +129,4 @@ class RMSProp(object):
             weight_decay=self.weight_decay,
             grad_clip=self.grad_clip,
             parameters=parameters)
-        return opt
\ No newline at end of file
+        return opt
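Note: `multi_precision=True` asks the underlying Paddle optimizer to keep float32 master weights while the parameters themselves may be float16, which is what keeps the pure-fp16 path numerically stable. A minimal sketch (toy layer, illustrative hyper-parameters):

    import paddle

    model = paddle.nn.Linear(8, 2)
    opt = paddle.optimizer.Momentum(
        learning_rate=0.1,
        momentum=0.9,
        multi_precision=True,  # keep fp32 master copies of fp16 parameters
        parameters=model.parameters())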
@@ -16,29 +16,32 @@ from __future__ import absolute_import
 from __future__ import division
 from __future__ import print_function
 
-import os
 import time
 import numpy as np
 from collections import OrderedDict
 
-from ppcls.optimizer import OptimizerBuilder
 import paddle
 import paddle.nn.functional as F
-from ppcls.optimizer.learning_rate import LearningRateBuilder
-from ppcls.arch import backbone
-from ppcls.arch.loss import CELoss
-from ppcls.arch.loss import MixCELoss
-from ppcls.arch.loss import JSDivLoss
-from ppcls.arch.loss import GoogLeNetLoss
-from ppcls.utils.misc import AverageMeter
-from ppcls.utils import logger, profiler
 from paddle.distributed import fleet
 from paddle.distributed.fleet import DistributedStrategy
 
+# from ppcls.optimizer import OptimizerBuilder
+# from ppcls.optimizer.learning_rate import LearningRateBuilder
+
+from ppcls.arch import build_model
+from ppcls.loss import build_loss
+from ppcls.metric import build_metrics
+from ppcls.optimizer import build_optimizer
+from ppcls.optimizer import build_lr_scheduler
+from ppcls.utils.misc import AverageMeter
+from ppcls.utils import logger, profiler
 
-def create_feeds(image_shape, use_mix=None, use_dali=None, dtype="float32"):
+
+def create_feeds(image_shape, use_mix=None, dtype="float32"):
     """
     Create feeds as model input
@@ -50,164 +53,33 @@ def create_feeds(image_shape, use_mix=None, dtype="float32"):
         feeds(dict): dict of model input variables
     """
     feeds = OrderedDict()
-    feeds['image'] = paddle.static.data(
-        name="feed_image", shape=[None] + image_shape, dtype=dtype)
+    feeds['data'] = paddle.static.data(
+        name="data", shape=[None] + image_shape, dtype=dtype)
 
-    if use_mix and not use_dali:
-        feeds['feed_y_a'] = paddle.static.data(
-            name="feed_y_a", shape=[None, 1], dtype="int64")
-        feeds['feed_y_b'] = paddle.static.data(
-            name="feed_y_b", shape=[None, 1], dtype="int64")
-        feeds['feed_lam'] = paddle.static.data(
-            name="feed_lam", shape=[None, 1], dtype=dtype)
+    if use_mix:
+        feeds['y_a'] = paddle.static.data(
+            name="y_a", shape=[None, 1], dtype="int64")
+        feeds['y_b'] = paddle.static.data(
+            name="y_b", shape=[None, 1], dtype="int64")
+        feeds['lam'] = paddle.static.data(
+            name="lam", shape=[None, 1], dtype=dtype)
     else:
         feeds['label'] = paddle.static.data(
-            name="feed_label", shape=[None, 1], dtype="int64")
+            name="label", shape=[None, 1], dtype="int64")
 
     return feeds
-def create_model(architecture, image, classes_num, config, is_train):
-    """
-    Create a model
-
-    Args:
-        architecture(dict): architecture information,
-            name(such as ResNet50) is needed
-        image(variable): model input variable
-        classes_num(int): num of classes
-        config(dict): model config
-
-    Returns:
-        out(variable): model output variable
-    """
-    name = architecture["name"]
-    params = architecture.get("params", {})
-
-    if "data_format" in config:
-        params["data_format"] = config["data_format"]
-        data_format = config["data_format"]
-    input_image_channel = config.get('image_shape', [3, 224, 224])[0]
-    if input_image_channel != 3:
-        logger.warning(
-            "Input image channel is changed to {}, maybe for better speed-up".
-            format(input_image_channel))
-        params["input_image_channel"] = input_image_channel
-    if "is_test" in params:
-        params['is_test'] = not is_train
-    model = backbone.__dict__[name](class_dim=classes_num, **params)
-
-    out = model(image)
-    return out
-
-
-def create_loss(out,
-                feeds,
-                architecture,
-                classes_num=1000,
-                epsilon=None,
-                use_mix=False,
-                use_distillation=False):
-    """
-    Create a loss for optimization, such as:
-        1. CrossEntropy loss
-        2. CrossEntropy loss with label smoothing
-        3. CrossEntropy loss with mix(mixup, cutmix, fmix)
-        4. CrossEntropy loss with label smoothing and (mixup, cutmix, fmix)
-        5. GoogLeNet loss
-
-    Args:
-        out(variable): model output variable
-        feeds(dict): dict of model input variables
-        architecture(dict): architecture information,
-            name(such as ResNet50) is needed
-        classes_num(int): num of classes
-        epsilon(float): parameter for label smoothing, 0.0 <= epsilon <= 1.0
-        use_mix(bool): whether to use mix(include mixup, cutmix, fmix)
-
-    Returns:
-        loss(variable): loss variable
-    """
-    if use_mix:
-        feed_y_a = paddle.reshape(feeds['feed_y_a'], [-1, 1])
-        feed_y_b = paddle.reshape(feeds['feed_y_b'], [-1, 1])
-        feed_lam = paddle.reshape(feeds['feed_lam'], [-1, 1])
-    else:
-        target = paddle.reshape(feeds['label'], [-1, 1])
-
-    if architecture["name"] == "GoogLeNet":
-        assert len(out) == 3, "GoogLeNet should have 3 outputs"
-        loss = GoogLeNetLoss(class_dim=classes_num, epsilon=epsilon)
-        return loss(out[0], out[1], out[2], target)
-
-    if use_distillation:
-        assert len(out) == 2, ("distillation output length must be 2, "
-                               "but got {}".format(len(out)))
-        loss = JSDivLoss(class_dim=classes_num, epsilon=epsilon)
-        return loss(out[1], out[0])
-
-    if use_mix:
-        loss = MixCELoss(class_dim=classes_num, epsilon=epsilon)
-        return loss(out, feed_y_a, feed_y_b, feed_lam)
-    else:
-        loss = CELoss(class_dim=classes_num, epsilon=epsilon)
-        return loss(out, target)
-
-
-def create_metric(out,
-                  feeds,
-                  architecture,
-                  topk=5,
-                  classes_num=1000,
-                  config=None,
-                  use_distillation=False):
-    """
-    Create measures of model accuracy, such as top1 and top5
-
-    Args:
-        out(variable): model output variable
-        feeds(dict): dict of model input variables(included label)
-        topk(int): usually top5
-        classes_num(int): num of classes
-        config(dict) : model config
-
-    Returns:
-        fetchs(dict): dict of measures
-    """
-    label = paddle.reshape(feeds['label'], [-1, 1])
-    if architecture["name"] == "GoogLeNet":
-        assert len(out) == 3, "GoogLeNet should have 3 outputs"
-        out = out[0]
-    else:
-        # just need student label to get metrics
-        if use_distillation:
-            out = out[1]
-    softmax_out = F.softmax(out)
-
-    fetchs = OrderedDict()
-    # set top1 to fetchs
-    top1 = paddle.metric.accuracy(softmax_out, label=label, k=1)
-    fetchs['top1'] = (top1, AverageMeter('top1', '.4f', need_avg=True))
-    # set topk to fetchs
-    k = min(topk, classes_num)
-    topk = paddle.metric.accuracy(softmax_out, label=label, k=k)
-    topk_name = 'top{}'.format(k)
-    fetchs[topk_name] = (topk, AverageMeter(topk_name, '.4f', need_avg=True))
-
-    return fetchs
 def create_fetchs(out,
                   feeds,
                   architecture,
                   topk=5,
-                  classes_num=1000,
                   epsilon=None,
                   use_mix=False,
                   config=None,
-                  use_distillation=False):
+                  mode="Train"):
     """
     Create fetchs as model outputs(included loss and measures),
     will call create_loss and create_metric(if use_mix).
 
     Args:
         out(variable): model output variable
         feeds(dict): dict of model input variables.
@@ -215,7 +87,6 @@ def create_fetchs(out,
         architecture(dict): architecture information,
             name(such as ResNet50) is needed
         topk(int): usually top5
-        classes_num(int): num of classes
         epsilon(float): parameter for label smoothing, 0.0 <= epsilon <= 1.0
         use_mix(bool): whether to use mix(include mixup, cutmix, fmix)
         config(dict): model config
@@ -224,53 +95,57 @@ def create_fetchs(out,
         fetchs(dict): dict of model outputs(included loss and measures)
     """
     fetchs = OrderedDict()
-    loss = create_loss(out, feeds, architecture, classes_num, epsilon, use_mix,
-                       use_distillation)
-    fetchs['loss'] = (loss, AverageMeter('loss', '7.4f', need_avg=True))
+    # build loss
+    # TODO(littletomatodonkey): support mix training
+    if use_mix:
+        y_a = paddle.reshape(feeds['y_a'], [-1, 1])
+        y_b = paddle.reshape(feeds['y_b'], [-1, 1])
+        lam = paddle.reshape(feeds['lam'], [-1, 1])
+    else:
+        target = paddle.reshape(feeds['label'], [-1, 1])
+
+    loss_func = build_loss(config["Loss"][mode])
+    # TODO: support mix training
+    loss_dict = loss_func(out, target)
+
+    loss_out = loss_dict["loss"]
+    # if "AMP" in config and config.AMP.get("use_pure_fp16", False):
+    #     loss_out = loss_out.astype("float16")
+    # if use_mix:
+    #     return loss_func(out, feed_y_a, feed_y_b, feed_lam)
+    # else:
+    #     return loss_func(out, target)
+    fetchs['loss'] = (loss_out, AverageMeter('loss', '7.4f', need_avg=True))
+
+    assert use_mix is False
+    # build metric
     if not use_mix:
-        metric = create_metric(out, feeds, architecture, topk, classes_num,
-                               config, use_distillation)
-        fetchs.update(metric)
-    return fetchs
+        metric_func = build_metrics(config["Metric"][mode])
 
+        metric_dict = metric_func(out, target)
 
-def create_optimizer(config):
-    """
-    Create an optimizer using config, usually including
-    learning rate and regularization.
+        for key in metric_dict:
+            if mode != "Train" and paddle.distributed.get_world_size() > 1:
+                paddle.distributed.all_reduce(
+                    metric_dict[key], op=paddle.distributed.ReduceOp.SUM)
+                metric_dict[key] = metric_dict[
+                    key] / paddle.distributed.get_world_size()
 
-    Args:
-        config(dict): such as
-            {
-                'LEARNING_RATE':
-                    {'function': 'Cosine',
-                     'params': {'lr': 0.1}
-                    },
-                'OPTIMIZER':
-                    {'function': 'Momentum',
-                     'params':{'momentum': 0.9},
-                     'regularizer':
-                        {'function': 'L2', 'factor': 0.0001}
-                    }
-            }
-
-    Returns:
-        an optimizer instance
-    """
-    # create learning_rate instance
-    lr_config = config['LEARNING_RATE']
-    lr_config['params'].update({
-        'epochs': config['epochs'],
-        'step_each_epoch':
-        config['total_images'] // config['TRAIN']['batch_size'],
-    })
-    lr = LearningRateBuilder(**lr_config)()
-
-    # create optimizer instance
-    opt_config = config['OPTIMIZER']
-    opt = OptimizerBuilder(**opt_config)
-    return opt(lr), lr
+            fetchs[key] = (metric_dict[key], AverageMeter(
+                key, '7.4f', need_avg=True))
+
+    return fetchs
+
+
+def create_optimizer(config, step_each_epoch):
+    # create learning_rate instance
+    optimizer, lr_sch = build_optimizer(
+        config["Optimizer"], config["Global"]["epochs"], step_each_epoch)
+    return optimizer, lr_sch
 
 
 def create_strategy(config):
@@ -299,32 +174,11 @@ def create_strategy(config):
     fuse_bn_add_act_ops = config.get('fuse_bn_add_act_ops', fuse_op)
     enable_addto = config.get('enable_addto', fuse_op)
 
-    try:
-        build_strategy.fuse_bn_act_ops = fuse_bn_act_ops
-    except Exception as e:
-        logger.info(
-            "PaddlePaddle version 1.7.0 or higher is "
-            "required when you want to fuse batch_norm and activation_op.")
-    try:
-        build_strategy.fuse_elewise_add_act_ops = fuse_elewise_add_act_ops
-    except Exception as e:
-        logger.info(
-            "PaddlePaddle version 1.7.0 or higher is "
-            "required when you want to fuse elewise_add_act and activation_op.")
-    try:
-        build_strategy.fuse_bn_add_act_ops = fuse_bn_add_act_ops
-    except Exception as e:
-        logger.info(
-            "PaddlePaddle 2.0-rc or higher is "
-            "required when you want to enable fuse_bn_add_act_ops strategy.")
-    try:
-        build_strategy.enable_addto = enable_addto
-    except Exception as e:
-        logger.info("PaddlePaddle 2.0-rc or higher is "
-                    "required when you want to enable addto strategy.")
+    build_strategy.fuse_bn_act_ops = fuse_bn_act_ops
+    build_strategy.fuse_elewise_add_act_ops = fuse_elewise_add_act_ops
+    build_strategy.fuse_bn_add_act_ops = fuse_bn_add_act_ops
+    build_strategy.enable_addto = enable_addto
 
     return build_strategy, exec_strategy
@@ -370,7 +224,12 @@ def mixed_precision_optimizer(config, optimizer):
     return optimizer
 
 
-def build(config, main_prog, startup_prog, is_train=True, is_distributed=True):
+def build(config,
+          main_prog,
+          startup_prog,
+          step_each_epoch=100,
+          is_train=True,
+          is_distributed=True):
     """
     Build a program using a model and an optimizer
         1. create feeds
@@ -383,7 +242,7 @@ def build(config,
         config(dict): config
         main_prog(): main program
         startup_prog(): startup program
-        is_train(bool): train or valid
+        is_train(bool): train or eval
         is_distributed(bool): whether to use distributed training method
 
     Returns:
@@ -392,34 +251,37 @@ def build(config,
     """
     with paddle.static.program_guard(main_prog, startup_prog):
         with paddle.utils.unique_name.guard():
-            use_mix = config.get('use_mix') and is_train
-            use_dali = config.get('use_dali', False)
-            use_distillation = config.get('use_distillation')
+            mode = "Train" if is_train else "Eval"
+            use_mix = "batch_transform_ops" in config["DataLoader"][mode][
+                "dataset"]
+            use_dali = config["Global"].get('use_dali', False)
             feeds = create_feeds(
-                config.image_shape,
+                config["Global"]["image_shape"],
                 use_mix=use_mix,
-                use_dali=use_dali,
                 dtype="float32")
-            if use_dali and use_mix:
-                import dali
-                feeds = dali.mix(feeds, config, is_train)
-            out = create_model(config.ARCHITECTURE, feeds['image'],
-                               config.classes_num, config, is_train)
+
+            # build model
+            # data_format should be assigned in arch-dict
+            input_image_channel = config["Global"]["image_shape"][
+                0]  # default as [3, 224, 224]
+            model = build_model(config["Arch"])
+            out = model(feeds["data"])
+            # end of build model
+
             fetchs = create_fetchs(
                 out,
                 feeds,
-                config.ARCHITECTURE,
-                config.topk,
-                config.classes_num,
+                config["Arch"],
                 epsilon=config.get('ls_epsilon'),
                 use_mix=use_mix,
                 config=config,
-                use_distillation=use_distillation)
+                mode=mode)
+
             lr_scheduler = None
             optimizer = None
             if is_train:
-                optimizer, lr_scheduler = create_optimizer(config)
+                optimizer, lr_scheduler = build_optimizer(
+                    config["Optimizer"], config["Global"]["epochs"],
+                    step_each_epoch)
                 optimizer = mixed_precision_optimizer(config, optimizer)
                 if is_distributed:
                     optimizer = dist_optimizer(config, optimizer)
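Note: `build()` relies on Paddle's standard static-graph scoping. A minimal sketch of that pattern in isolation (toy network, illustrative names):

    import paddle

    paddle.enable_static()
    main_prog = paddle.static.Program()
    startup_prog = paddle.static.Program()
    # ops/variables created inside the guards are recorded into main_prog;
    # unique_name.guard() keeps parameter names unique across programs
    with paddle.static.program_guard(main_prog, startup_prog):
        with paddle.utils.unique_name.guard():
            data = paddle.static.data(
                name="data", shape=[None, 8], dtype="float32")
            out = paddle.static.nn.fc(data, size=2)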
@@ -474,36 +336,32 @@ def run(dataloader,
         exe():
         program():
         fetchs(dict): dict of measures and the loss
-        epoch(int): epoch of training or validation
+        epoch(int): epoch of training or evaluation
         model(str): log only
 
     Returns:
     """
     fetch_list = [f[0] for f in fetchs.values()]
-    metric_list = [
-        ("lr", AverageMeter(
-            'lr', 'f', postfix=",", need_avg=False)),
-        ("batch_time", AverageMeter(
-            'batch_cost', '.5f', postfix=" s,")),
-        ("reader_time", AverageMeter(
-            'reader_cost', '.5f', postfix=" s,")),
-    ]
-    topk_name = 'top{}'.format(config.topk)
-    metric_list.insert(0, ("loss", fetchs["loss"][1]))
-    use_mix = config.get("use_mix", False) and mode == "train"
-    if not use_mix:
-        metric_list.insert(0, (topk_name, fetchs[topk_name][1]))
-        metric_list.insert(0, ("top1", fetchs["top1"][1]))
-
-    metric_list = OrderedDict(metric_list)
+    metric_dict = OrderedDict([("lr", AverageMeter(
+        'lr', 'f', postfix=",", need_avg=False))])
+
+    for k in fetchs:
+        metric_dict[k] = fetchs[k][1]
+
+    metric_dict["batch_time"] = AverageMeter(
+        'batch_cost', '.5f', postfix=" s,")
+    metric_dict["reader_time"] = AverageMeter(
+        'reader_cost', '.5f', postfix=" s,")
 
-    for m in metric_list.values():
+    for m in metric_dict.values():
         m.reset()
 
-    use_dali = config.get('use_dali', False)
+    use_dali = config["Global"].get('use_dali', False)
+    dataloader = dataloader if use_dali else dataloader()
     tic = time.time()
-    if not use_dali:
-        dataloader = dataloader()
     idx = 0
     batch_size = None
     while True:
@@ -520,15 +378,15 @@ def run(dataloader,
         idx += 1
         # ignore the warmup iters
         if idx == 5:
-            metric_list["batch_time"].reset()
-            metric_list["reader_time"].reset()
+            metric_dict["batch_time"].reset()
+            metric_dict["reader_time"].reset()
 
-        metric_list['reader_time'].update(time.time() - tic)
+        metric_dict['reader_time'].update(time.time() - tic)
 
         profiler.add_profiler_step(profiler_options)
 
         if use_dali:
-            batch_size = batch[0]["feed_image"].shape()[0]
+            batch_size = batch[0]["data"].shape()[0]
             feed_dict = batch[0]
         else:
             batch_size = batch[0].shape()[0]
@@ -536,41 +394,34 @@ def run(dataloader,
                 key.name: batch[idx]
                 for idx, key in enumerate(feeds.values())
             }
 
         metrics = exe.run(program=program,
                           feed=feed_dict,
                           fetch_list=fetch_list)
 
         for name, m in zip(fetchs.keys(), metrics):
-            metric_list[name].update(np.mean(m), batch_size)
-        metric_list["batch_time"].update(time.time() - tic)
+            metric_dict[name].update(np.mean(m), batch_size)
+        metric_dict["batch_time"].update(time.time() - tic)
         if mode == "train":
-            metric_list['lr'].update(lr_scheduler.get_lr())
+            metric_dict['lr'].update(lr_scheduler.get_lr())
 
         fetchs_str = ' '.join([
-            str(metric_list[key].mean)
-            if "time" in key else str(metric_list[key].value)
-            for key in metric_list
+            str(metric_dict[key].mean)
+            if "time" in key else str(metric_dict[key].value)
+            for key in metric_dict
         ])
         ips_info = " ips: {:.5f} images/sec.".format(
-            batch_size / metric_list["batch_time"].avg)
+            batch_size / metric_dict["batch_time"].avg)
         fetchs_str += ips_info
 
         if lr_scheduler is not None:
-            if lr_scheduler.update_specified:
-                curr_global_counter = lr_scheduler.step_each_epoch * epoch + idx
-                update = max(
-                    0, curr_global_counter - lr_scheduler.
-                    update_start_step) % lr_scheduler.update_step_interval == 0
-                if update:
-                    lr_scheduler.step()
-            else:
-                lr_scheduler.step()
+            lr_scheduler.step()
 
         if vdl_writer:
             global total_step
             logger.scaler('loss', metrics[0][0], total_step, vdl_writer)
             total_step += 1
-        if mode == 'valid':
+        if mode == 'eval':
             if idx % config.get('print_interval', 10) == 0:
                 logger.info("{:s} step:{:<4d} {:s}".format(mode, idx,
                                                            fetchs_str))
@@ -579,20 +430,17 @@ def run(dataloader,
             step_str = "{:s} step:{:<4d}".format(mode, idx)
 
             if idx % config.get('print_interval', 10) == 0:
-                logger.info("{:s} {:s} {:s}".format(
-                    logger.coloring(epoch_str, "HEADER")
-                    if idx == 0 else epoch_str,
-                    logger.coloring(step_str, "PURPLE"),
-                    logger.coloring(fetchs_str, 'OKGREEN')))
+                logger.info("{:s} {:s} {:s}".format(epoch_str, step_str,
+                                                    fetchs_str))
 
         tic = time.time()
 
-    end_str = ' '.join([str(m.mean) for m in metric_list.values()] +
-                       [metric_list["batch_time"].total])
+    end_str = ' '.join([str(m.mean) for m in metric_dict.values()] +
+                       [metric_dict["batch_time"].total])
     ips_info = "ips: {:.5f} images/sec.".format(
-        batch_size * metric_list["batch_time"].count /
-        metric_list["batch_time"].sum)
-    if mode == 'valid':
+        batch_size * metric_dict["batch_time"].count /
+        metric_dict["batch_time"].sum)
+    if mode == 'eval':
         logger.info("END {:s} {:s} {:s}".format(mode, end_str, ips_info))
     else:
         end_epoch_str = "END epoch:{:<3d}".format(epoch)
@@ -602,5 +450,5 @@ def run(dataloader,
         dataloader.reset()
 
     # return top1_acc in order to save the best model
-    if mode == 'valid':
+    if mode == 'eval':
         return fetchs["top1"][1].avg
 #!/usr/bin/env bash
-export CUDA_VISIBLE_DEVICES="0,1,2,3"
+export CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7"
 export FLAGS_fraction_of_gpu_memory_to_use=0.80
 
 python3.7 -m paddle.distributed.launch \
-    --gpus="0,1,2,3" \
-    tools/static/train.py \
-    -c ./configs/ResNet/ResNet50.yaml \
-    -o print_interval=10 \
-    -o use_dali=True
+    --gpus="0,1,2,3,4,5,6,7" \
+    ppcls/static/train.py \
+    -c ./ppcls/configs/ImageNet/ResNet/ResNet50_fp16.yaml \
+    -o Global.use_dali=True
@@ -74,9 +74,7 @@ def load_params(exe, prog, path, ignore_params=None):
         raise ValueError("Model pretrain path {} does not "
                          "exists.".format(path))
 
-    logger.info(
-        logger.coloring('Loading parameters from {}...'.format(path),
-                        'HEADER'))
+    logger.info("Loading parameters from {}...".format(path))
 
     ignore_set = set()
     state = _load_state(path)
@@ -116,9 +114,7 @@ def init_model(config, program, exe):
     checkpoints = config.get('checkpoints')
     if checkpoints:
         paddle.static.load(program, checkpoints, exe)
-        logger.info(
-            logger.coloring("Finish initing model from {}".format(checkpoints),
-                            "HEADER"))
+        logger.info("Finish initing model from {}".format(checkpoints))
         return
 
     pretrained_model = config.get('pretrained_model')
@@ -127,19 +123,17 @@ def init_model(config, program, exe):
         pretrained_model = [pretrained_model]
     for pretrain in pretrained_model:
         load_params(exe, program, pretrain)
-    logger.info(
-        logger.coloring("Finish initing model from {}".format(
-            pretrained_model), "HEADER"))
+    logger.info("Finish initing model from {}".format(pretrained_model))
 
 
 def save_model(program, model_path, epoch_id, prefix='ppcls'):
     """
     save model to the target path
     """
+    if paddle.distributed.get_rank() != 0:
+        return
     model_path = os.path.join(model_path, str(epoch_id))
     _mkdir_if_not_exist(model_path)
     model_prefix = os.path.join(model_path, prefix)
     paddle.static.save(program, model_prefix)
-    logger.info(
-        logger.coloring("Already save model in {}".format(model_path),
-                        "HEADER"))
+    logger.info("Already save model in {}".format(model_path))
@@ -23,16 +23,16 @@ __dir__ = os.path.dirname(os.path.abspath(__file__))
 sys.path.append(__dir__)
 sys.path.append(os.path.abspath(os.path.join(__dir__, '../../')))
 
-from sys import version_info
-
 import paddle
 from paddle.distributed import fleet
+from visualdl import LogWriter
 
-from ppcls.data import Reader
-from ppcls.utils.config import get_config
+from ppcls.data import build_dataloader
+from ppcls.utils.config import get_config, print_config
 from ppcls.utils import logger
-from tools.static import program
-from save_load import init_model, save_model
+from ppcls.utils.logger import init_logger
+from ppcls.static.save_load import init_model, save_model
+from ppcls.static import program
 
 
 def parse_args():
@@ -43,11 +43,6 @@ def parse_args():
         type=str,
         default='configs/ResNet/ResNet50.yaml',
         help='config file path')
-    parser.add_argument(
-        '--vdl_dir',
-        type=str,
-        default=None,
-        help='VisualDL logging directory for image.')
     parser.add_argument(
         '-p',
         '--profiler_options',
@@ -66,32 +61,64 @@ def parse_args():
 def main(args):
-    config = get_config(args.config, overrides=args.override, show=True)
-    if config.get("is_distributed", True):
+    """
+    all the config of training paradigm should be in config["Global"]
+    """
+    config = get_config(args.config, overrides=args.override, show=False)
+
+    global_config = config["Global"]
+
+    mode = "train"
+
+    log_file = os.path.join(global_config['output_dir'],
+                            config["Arch"]["name"], f"{mode}.log")
+    init_logger(name='root', log_file=log_file)
+    print_config(config)
+
+    if global_config.get("is_distributed", True):
         fleet.init(is_collective=True)
-    # assign the place
-    use_gpu = config.get("use_gpu", True)
+    # assign the device
+    use_gpu = global_config.get("use_gpu", True)
     # amp related config
     if 'AMP' in config:
         AMP_RELATED_FLAGS_SETTING = {
-            'FLAGS_cudnn_exhaustive_search': 1,
-            'FLAGS_conv_workspace_size_limit': 1500,
-            'FLAGS_cudnn_batchnorm_spatial_persistent': 1,
-            'FLAGS_max_inplace_grad_add': 8,
+            'FLAGS_cudnn_exhaustive_search': "1",
+            'FLAGS_conv_workspace_size_limit': "1500",
+            'FLAGS_cudnn_batchnorm_spatial_persistent': "1",
+            'FLAGS_max_inplace_grad_add': "8",
         }
-        os.environ['FLAGS_cudnn_batchnorm_spatial_persistent'] = '1'
-        paddle.fluid.set_flags(AMP_RELATED_FLAGS_SETTING)
-    use_xpu = config.get("use_xpu", False)
+        for k in AMP_RELATED_FLAGS_SETTING:
+            os.environ[k] = AMP_RELATED_FLAGS_SETTING[k]
+
+    use_xpu = global_config.get("use_xpu", False)
     assert (
         use_gpu and use_xpu
     ) is not True, "gpu and xpu can not be true in the same time in static mode!"
     if use_gpu:
-        place = paddle.set_device('gpu')
+        device = paddle.set_device('gpu')
     elif use_xpu:
-        place = paddle.set_device('xpu')
+        device = paddle.set_device('xpu')
     else:
-        place = paddle.set_device('cpu')
+        device = paddle.set_device('cpu')
+
+    # visualDL
+    vdl_writer = None
+    if global_config["use_visualdl"]:
+        vdl_dir = os.path.join(global_config["output_dir"], "vdl")
+        vdl_writer = LogWriter(vdl_dir)
+
+    # build dataloader
+    eval_dataloader = None
+    use_dali = global_config.get('use_dali', False)
+    train_dataloader = build_dataloader(
+        config["DataLoader"], "Train", device=device, use_dali=use_dali)
+    if global_config["eval_during_train"]:
+        eval_dataloader = build_dataloader(
+            config["DataLoader"], "Eval", device=device, use_dali=use_dali)
+
+    step_each_epoch = len(train_dataloader)
 
     # startup_prog is used to do some parameter init work,
     # and train prog is used to hold the network
@@ -104,89 +131,71 @@ def main(args):
         config,
         train_prog,
         startup_prog,
+        step_each_epoch=step_each_epoch,
         is_train=True,
-        is_distributed=config.get("is_distributed", True))
+        is_distributed=global_config.get("is_distributed", True))
 
-    if config.validate:
-        valid_prog = paddle.static.Program()
-        valid_fetchs, _, valid_feeds, _ = program.build(
+    if global_config["eval_during_train"]:
+        eval_prog = paddle.static.Program()
+        eval_fetchs, _, eval_feeds, _ = program.build(
             config,
-            valid_prog,
+            eval_prog,
             startup_prog,
             is_train=False,
-            is_distributed=config.get("is_distributed", True))
-        # clone to prune some content which is irrelevant in valid_prog
-        valid_prog = valid_prog.clone(for_test=True)
+            is_distributed=global_config.get("is_distributed", True))
+        # clone to prune some content which is irrelevant in eval_prog
+        eval_prog = eval_prog.clone(for_test=True)
 
-    # create the "Executor" with the statement of which place
-    exe = paddle.static.Executor(place)
+    # create the "Executor" with the statement of which device
+    exe = paddle.static.Executor(device)
     # Parameter initialization
     exe.run(startup_prog)
     # load pretrained models or checkpoints
-    init_model(config, train_prog, exe)
+    init_model(global_config, train_prog, exe)
 
     if 'AMP' in config and config.AMP.get("use_pure_fp16", False):
         optimizer.amp_init(
-            place,
+            device,
             scope=paddle.static.global_scope(),
-            test_program=valid_prog if config.validate else None)
+            test_program=eval_prog
+            if global_config["eval_during_train"] else None)
 
-    if not config.get("is_distributed", True):
+    if not global_config.get("is_distributed", True):
         compiled_train_prog = program.compile(
             config, train_prog, loss_name=train_fetchs["loss"][0].name)
     else:
         compiled_train_prog = train_prog
 
-    if not config.get('use_dali', False):
-        train_dataloader = Reader(config, 'train', places=place)()
-        if config.validate and paddle.distributed.get_rank() == 0:
-            valid_dataloader = Reader(config, 'valid', places=place)()
-            compiled_valid_prog = program.compile(config, valid_prog)
-    else:
-        assert use_gpu is True, "DALI only support gpu, please set use_gpu to True!"
-        import dali
-        train_dataloader = dali.train(config)
-        if config.validate and paddle.distributed.get_rank() == 0:
-            valid_dataloader = dali.val(config)
-            compiled_valid_prog = program.compile(config, valid_prog)
+    if eval_dataloader is not None:
+        compiled_eval_prog = program.compile(config, eval_prog)
 
-    vdl_writer = None
-    if args.vdl_dir:
-        if version_info.major == 2:
-            logger.info(
-                "visualdl is just supported for python3, so it is disabled in python2..."
-            )
-        else:
-            from visualdl import LogWriter
-            vdl_writer = LogWriter(args.vdl_dir)
-
-    for epoch_id in range(config.epochs):
+    for epoch_id in range(global_config["epochs"]):
         # 1. train with train dataset
         program.run(train_dataloader, exe, compiled_train_prog, train_feeds,
                     train_fetchs, epoch_id, 'train', config, vdl_writer,
                     lr_scheduler, args.profiler_options)
-        if paddle.distributed.get_rank() == 0:
-            # 2. validate with validate dataset
-            if config.validate and epoch_id % config.valid_interval == 0:
-                top1_acc = program.run(valid_dataloader, exe,
-                                       compiled_valid_prog, valid_feeds,
-                                       valid_fetchs, epoch_id, 'valid', config)
-                if top1_acc > best_top1_acc:
-                    best_top1_acc = top1_acc
-                    message = "The best top1 acc {:.5f}, in epoch: {:d}".format(
-                        best_top1_acc, epoch_id)
-                    logger.info("{:s}".format(logger.coloring(message, "RED")))
-                    if epoch_id % config.save_interval == 0:
-                        model_path = os.path.join(config.model_save_dir,
-                                                  config.ARCHITECTURE["name"])
-                        save_model(train_prog, model_path, "best_model")
-            # 3. save the persistable model
-            if epoch_id % config.save_interval == 0:
-                model_path = os.path.join(config.model_save_dir,
-                                          config.ARCHITECTURE["name"])
-                save_model(train_prog, model_path, epoch_id)
+        # 2. evaluate with eval dataset
+        if global_config["eval_during_train"] and epoch_id % global_config[
+                "eval_interval"] == 0:
+            top1_acc = program.run(eval_dataloader, exe, compiled_eval_prog,
+                                   eval_feeds, eval_fetchs, epoch_id, "eval",
+                                   config)
+            if top1_acc > best_top1_acc:
+                best_top1_acc = top1_acc
+                message = "The best top1 acc {:.5f}, in epoch: {:d}".format(
+                    best_top1_acc, epoch_id)
+                logger.info(message)
+                if epoch_id % global_config["save_interval"] == 0:
+                    model_path = os.path.join(global_config["output_dir"],
+                                              config["Arch"]["name"])
+                    save_model(train_prog, model_path, "best_model")
+        # 3. save the persistable model
+        if epoch_id % global_config["save_interval"] == 0:
+            model_path = os.path.join(global_config["output_dir"],
                                       config["Arch"]["name"])
+            save_model(train_prog, model_path, epoch_id)
 
 
 if __name__ == '__main__':
......
@@ -54,7 +54,7 @@ def load_dygraph_pretrain(model, path=None):
     return
 
 
-def load_dygraph_pretrain_from_url(model, pretrained_url, use_ssld):
+def load_dygraph_pretrain_from_url(model, pretrained_url, use_ssld=False):
     if use_ssld:
         pretrained_url = pretrained_url.replace("_pretrained",
                                                 "_ssld_pretrained")
......
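Note: with `use_ssld` now defaulting to `False`, callers such as the Trainer's `startswith("http")` branch can pass a bare URL straight through. A hedged usage sketch (the URL and the stand-in model are illustrative only, not a verified download link):

    import paddle
    from ppcls.utils.save_load import load_dygraph_pretrain_from_url

    model = paddle.nn.Linear(8, 2)  # stands in for a real backbone here
    url = "https://example.com/ResNet50_pretrained.pdparams"  # illustrative
    load_dygraph_pretrain_from_url(model, url)  # use_ssld defaults to False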
#!/bin/bash
FILENAME=$1
# MODE should be one of ['lite_train_infer', 'whole_infer', 'whole_train_infer', 'infer']
MODE=$2
dataline=$(cat ${FILENAME})
# parser params
IFS=$'\n'
lines=(${dataline})
function func_parser_key(){
strs=$1
IFS=":"
array=(${strs})
tmp=${array[0]}
echo ${tmp}
}
function func_parser_value(){
strs=$1
IFS=":"
array=(${strs})
tmp=${array[1]}
echo ${tmp}
}
function status_check(){
    last_status=$1   # the exit code
    run_command=$2
    run_log=$3
    if [ $last_status -eq 0 ]; then
        echo -e "\033[33m Run successfully with command - ${run_command}! \033[0m" | tee -a ${run_log}
    else
        echo -e "\033[33m Run failed with command - ${run_command}! \033[0m" | tee -a ${run_log}
    fi
}
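Typical usage, as done throughout the script: run a command with eval, then pass its exit code straight to status_check (values here are illustrative):

cmd="python3.7 tools/train.py -c test/benchmark.yaml"
eval $cmd
status_check $? "${cmd}" "./test/output/results.log"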
IFS=$'\n'
# The training params
model_name_list=$(func_parser_value "${lines[1]}")
model_name_pact_list=$(func_parser_value "${lines[2]}")
model_name_fpgm_list=$(func_parser_value "${lines[3]}")
model_name_kl_list=$(func_parser_value "${lines[4]}")
python=$(func_parser_value "${lines[5]}")
gpu_list=$(func_parser_value "${lines[6]}")
epoch_key=$(func_parser_key "${lines[7]}")
epoch_value=$(func_parser_value "${lines[7]}")
save_model_key=$(func_parser_key "${lines[8]}")
save_model_value=$(func_parser_value "${lines[8]}")
pretrain_model_key=$(func_parser_key "${lines[9]}")
save_infer_key=$(func_parser_key "${lines[10]}")
# scripts
train_py=$(func_parser_value "${lines[20]}")
eval_py=$(func_parser_value "${lines[21]}")
norm_export=$(func_parser_value "${lines[22]}")
inference_py=$(func_parser_value "${lines[23]}")
# The inference params
use_gpu_key=$(func_parser_key "${lines[33]}")
use_gpu_list=$(func_parser_value "${lines[33]}")
use_mkldnn_key=$(func_parser_key "${lines[34]}")
use_mkldnn_list=$(func_parser_value "${lines[34]}")
cpu_threads_key=$(func_parser_key "${lines[35]}")
cpu_threads_list=$(func_parser_value "${lines[35]}")
batch_size_key=$(func_parser_key "${lines[36]}")
batch_size_list=$(func_parser_value "${lines[36]}")
use_trt_key=$(func_parser_key "${lines[37]}")
use_trt_list=$(func_parser_value "${lines[37]}")
precision_key=$(func_parser_key "${lines[38]}")
precision_list=$(func_parser_value "${lines[38]}")
infer_model_key=$(func_parser_key "${lines[39]}")
infer_model=$(func_parser_value "${lines[39]}")
image_dir_key=$(func_parser_key "${lines[40]}")
infer_img_dir=$(func_parser_value "${lines[40]}")
save_log_key=$(func_parser_key "${lines[32]}")
LOG_PATH="./test/output"
mkdir -p ${LOG_PATH}
status_log="${LOG_PATH}/results.log"
function func_inference(){
    IFS='|'
    _python=$1
    _script=$2
    _model_dir=$3
    _log_path=$4
    _img_dir=$5
    _model_name=$6
    # inference
    for use_gpu in ${use_gpu_list[*]}; do
        if [ ${use_gpu} = "False" ]; then
            for use_mkldnn in ${use_mkldnn_list[*]}; do
                for threads in ${cpu_threads_list[*]}; do
                    for batch_size in ${batch_size_list[*]}; do
                        _save_log_path="${_log_path}/${_model_name}_infer_cpu_usemkldnn_${use_mkldnn}_threads_${threads}_batchsize_${batch_size}.log"
                        command="${_python} ${_script} -o ${use_gpu_key}=${use_gpu} -o ${use_mkldnn_key}=${use_mkldnn} -o ${cpu_threads_key}=${threads} -o ${infer_model_key}=${_model_dir} -o ${batch_size_key}=${batch_size} -o ${image_dir_key}=${_img_dir} -o ${save_log_key}=${_save_log_path} -o benchmark=True -o Global.model_name=${_model_name}"
                        eval $command
                        status_check $? "${command}" "../${status_log}"
                    done
                done
            done
        else
            for use_trt in ${use_trt_list[*]}; do
                for precision in ${precision_list[*]}; do
                    if [ ${use_trt} = "False" ] && [ ${precision} != "fp32" ]; then
                        continue
                    fi
                    for batch_size in ${batch_size_list[*]}; do
                        _save_log_path="${_log_path}/${_model_name}_infer_gpu_usetrt_${use_trt}_precision_${precision}_batchsize_${batch_size}.log"
                        command="${_python} ${_script} -o ${use_gpu_key}=${use_gpu} -o ${use_trt_key}=${use_trt} -o ${precision_key}=${precision} -o ${infer_model_key}=${_model_dir} -o ${batch_size_key}=${batch_size} -o ${image_dir_key}=${_img_dir} -o ${save_log_key}=${_save_log_path} -o benchmark=True -o Global.model_name=${_model_name}"
                        eval $command
                        status_check $? "${command}" "../${status_log}"
                    done
                done
            done
        fi
    done
}
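For reference, a representative command the CPU branch expands to once the keys are read from the old params file (all values illustrative):

python3.7 python/predict_cls.py -c configs/inference_cls.yaml \
    -o Global.use_gpu=False -o Global.enable_mkldnn=True \
    -o Global.cpu_num_threads=6 -o Global.inference_model_dir=../inference_models/ResNet50_vd \
    -o Global.batch_size=1 -o Global.infer_imgs=../dataset/chain_dataset/val \
    -o Global.save_log_path=../test/output/ResNet50_vd_infer_cpu_usemkldnn_True_threads_6_batchsize_1.log \
    -o benchmark=True -o Global.model_name=ResNet50_vd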
if [ ${MODE} != "infer" ]; then
IFS="|"
for gpu in ${gpu_list[*]}; do
use_gpu=True
if [ ${gpu} = "-1" ];then
use_gpu=False
env=""
elif [ ${#gpu} -le 1 ];then
env="export CUDA_VISIBLE_DEVICES=${gpu}"
eval ${env}
elif [ ${#gpu} -le 15 ];then
IFS=","
array=(${gpu})
env="export CUDA_VISIBLE_DEVICES=${array[0]}"
IFS="|"
else
IFS=";"
array=(${gpu})
ips=${array[0]}
gpu=${array[1]}
IFS="|"
env=" "
fi
for model_name in ${model_name_list[*]}; do
# not set epoch when whole_train_infer
if [ ${MODE} != "whole_train_infer" ]; then
set_epoch="-o ${epoch_key}=${epoch_value}"
else
set_epoch=" "
fi
save_log="${LOG_PATH}/${model_name}_gpus_${gpu}"
# train with cpu
if [ ${gpu} = "-1" ];then
cmd="${python} ${train_py} -o Arch.name=${model_name} -o Global.device=cpu -o ${save_model_key}=${save_log} ${set_epoch}"
# train with single gpu
elif [ ${#gpu} -le 2 ];then # train with single gpu
cmd="${python} ${train_py} -o Arch.name=${model_name} -o ${save_model_key}=${save_log} ${set_epoch}"
elif [ ${#gpu} -le 15 ];then # train with multi-gpu
cmd="${python} -m paddle.distributed.launch --gpus=${gpu} ${train_py} -o Arch.name=${model_name} -o ${save_model_key}=${save_log} ${set_epoch}"
else # train with multi-machine
cmd="${python} -m paddle.distributed.launch --ips=${ips} --gpus=${gpu} ${train_py} -o Arch.name=${model_name} -c ${save_model_key}=${save_log} ${set_epoch}"
fi
# run train
eval $cmd
status_check $? "${cmd}" "${status_log}"
# run eval
eval_cmd="${python} ${eval_py} -o Arch.name=${model_name} -o ${pretrain_model_key}=${save_log}/${model_name}/latest"
eval $eval_cmd
status_check $? "${eval_cmd}" "${status_log}"
# run export model
save_infer_path="${save_log}/inference"
export_cmd="${python} ${norm_export} -o Arch.name=${model_name} -o ${pretrain_model_key}=${save_log}/${model_name}/latest -o ${save_infer_key}=${save_infer_path}"
eval $export_cmd
status_check $? "${export_cmd}" "${status_log}"
#run inference
eval $env
save_infer_path="${save_log}/inference"
cd deploy
func_inference "${python}" "${inference_py}" "../${save_infer_path}" "../${LOG_PATH}" "../${infer_img_dir}" "${model_name}"
eval "unset CUDA_VISIBLE_DEVICES"
cd ..
done
done
else
GPUID=$3
if [ ${#GPUID} -le 0 ];then
env=" "
else
env="export CUDA_VISIBLE_DEVICES=${GPUID}"
fi
echo $env
# export inference model
mkdir -p inference_models
for model_name in ${model_name_list[*]}; do
export_cmd="${python} ${norm_export} -o Arch.name=${model_name} -o ${pretrain_model_key}=pretrained_models/${model_name}_pretrained -o ${save_infer_key}=./inference_models/${model_name}"
eval $export_cmd
done
#run inference
cd deploy
for model_name in ${model_name_list[*]}; do
func_inference "${python}" "${inference_py}" "../inference_models/${model_name}" "../${LOG_PATH}" "../${infer_img_dir}" "${model_name}"
done
cd ..
fi
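Putting it together, the old chain would be driven like this (the params-file path is an assumed example; the third argument selects the GPU and only applies in infer mode):

bash test/test.sh test/params.txt lite_train_infer
bash test/test.sh test/params.txt infer 0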
-===========================train_params===========================
-model_name:ResNet50_vd|ResNeXt101_vd_64x4d|HRNet_W18_C|MobileNetV3_large_x1_0|DarkNet53|MobileNetV1|MobileNetV2|ShuffleNetV2_x1_0
-model_name_pact:ResNet50_vd|MobileNetV3_large_x1_0
-model_name_fpgm:ResNet50_vd|MobileNetV3_large_x1_0
-model_name_kl:ResNet50_vd|MobileNetV3_large_x1_0
-python:python3.7
-gpu_list:0|0,1|-1
-Global.epochs:10
-Global.output_dir:./output/
-Global.pretrained_model:null
-Global.save_inference_dir:null
-#
-#
-#
-#
-#
-#
-#
-#
-===========================scripts===========================
-train:tools/train.py -c test/benchmark.yaml
-eval:tools/eval.py -c test/benchmark.yaml
-norm_export:tools/export_model.py -c test/benchmark.yaml
-inference:python/predict_cls.py -c configs/inference_cls.yaml
-#
-#
-#
-#
-#
-#
-#
-===========================infer_params===========================
-Global.save_log_path:./test/output/
-Global.use_gpu:True|False
-Global.enable_mkldnn:True|False
-Global.cpu_num_threads:1|6
-Global.batch_size:1
-Global.use_tensorrt:True|False
-Global.use_fp16:True|False
-Global.inference_model_dir:./inference
-Global.infer_imgs:./dataset/chain_dataset/val
-#
-#
-#
+===========================train_params===========================
+model_name:DarkNet53
+python:python3.7
+gpu_list:0|0,1
+-o Global.device:gpu
+-o Global.auto_cast:null
+-o Global.epochs:lite_train_infer=2|whole_train_infer=120
+-o Global.output_dir:./output/
+-o DataLoader.Train.sampler.batch_size:8
+-o Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./dataset/ILSVRC2012/val
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c ppcls/configs/ImageNet/DarkNet/DarkNet53.yaml
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:tools/eval.py -c ppcls/configs/ImageNet/DarkNet/DarkNet53.yaml
+null:null
+##
+===========================infer_params==========================
+-o Global.save_inference_dir:./inference
+-o Global.pretrained_model:
+norm_export:tools/export_model.py -c ppcls/configs/ImageNet/DarkNet/DarkNet53.yaml
+quant_export:null
+fpgm_export:null
+distill_export:null
+export1:null
+export2:null
+##
+infer_model:../inference/
+infer_export:null
+infer_quant:False
+inference:python/predict_cls.py -c configs/inference_cls.yaml
+-o Global.use_gpu:True|False
+-o Global.enable_mkldnn:True|False
+-o Global.cpu_num_threads:1|6
+-o Global.batch_size:1
+-o Global.use_tensorrt:True|False
+-o Global.use_fp16:True|False
+-o Global.inference_model_dir:../inference
+-o Global.infer_imgs:../dataset/ILSVRC2012/val
+-o Global.save_log_path:null
+-o Global.benchmark:True
+#
+#
+#
+#
......
===========================train_params===========================
model_name:HRNet_W18_C
python:python3.7
gpu_list:0|0,1
-o Global.device:gpu
-o Global.auto_cast:null
-o Global.epochs:lite_train_infer=2|whole_train_infer=120
-o Global.output_dir:./output/
-o DataLoader.Train.sampler.batch_size:8
-o Global.pretrained_model:null
train_model_name:latest
train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/HRNet/HRNet_W18_C.yaml
pact_train:null
fpgm_train:null
distill_train:null
null:null
null:null
##
===========================eval_params===========================
eval:tools/eval.py -c ppcls/configs/ImageNet/HRNet/HRNet_W18_C.yaml
null:null
##
===========================infer_params==========================
-o Global.save_inference_dir:./inference
-o Global.pretrained_model:
norm_export:tools/export_model.py -c ppcls/configs/ImageNet/HRNet/HRNet_W18_C.yaml
quant_export:null
fpgm_export:null
distill_export:null
export1:null
export2:null
##
infer_model:../inference/
infer_export:null
infer_quant:False
inference:python/predict_cls.py -c configs/inference_cls.yaml
-o Global.use_gpu:True|False
-o Global.enable_mkldnn:True|False
-o Global.cpu_num_threads:1|6
-o Global.batch_size:1
-o Global.use_tensorrt:True|False
-o Global.use_fp16:True|False
-o Global.inference_model_dir:../inference
-o Global.infer_imgs:../dataset/ILSVRC2012/val
-o Global.save_log_path:null
-o Global.benchmark:True
#
#
#
#
===========================pretrained_model===========================
ResNet50_vd:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet50_vd_pretrained.pdparams
ResNeXt101_vd_64x4d:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNeXt101_vd_64x4d_pretrained.pdparams
HRNet_W18_C:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W18_C_pretrained.pdparams
MobileNetV3_large_x1_0:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_large_x1_0_pretrained.pdparams
DarkNet53:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DarkNet53_pretrained.pdparams
MobileNetV1:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV1_pretrained.pdparams
MobileNetV2:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV2_pretrained.pdparams
ShuffleNetV2_x1_0:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ShuffleNetV2_x1_0_pretrained.pdparams
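Each of these blocks is a standalone params file that tests/test.sh (shown below) addresses purely by line number. A quick, illustrative way to sanity-check the mapping, assuming the HRNet_W18_C block above is saved at the hypothetical path used here:

dataline=$(cat tests/configs/HRNet_W18_C.txt)   # assumed location of the file
IFS=$'\n'
lines=(${dataline})
echo "${lines[1]}"    # model_name:HRNet_W18_C
echo "${lines[6]}"    # -o Global.epochs:lite_train_infer=2|whole_train_infer=120
echo "${lines[40]}"   # -o Global.use_gpu:True|False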
===========================train_params===========================
model_name:MobileNetV1
python:python3.7
gpu_list:0|0,1
-o Global.device:gpu
-o Global.auto_cast:null
-o Global.epochs:lite_train_infer=2|whole_train_infer=120
-o Global.output_dir:./output/
-o DataLoader.Train.sampler.batch_size:8
-o Global.pretrained_model:null
train_model_name:latest
train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/MobileNetV1/MobileNetV1.yaml
pact_train:null
fpgm_train:null
distill_train:null
null:null
null:null
##
===========================eval_params===========================
eval:tools/eval.py -c ppcls/configs/ImageNet/MobileNetV1/MobileNetV1.yaml
null:null
##
===========================infer_params==========================
-o Global.save_inference_dir:./inference
-o Global.pretrained_model:
norm_export:tools/export_model.py -c ppcls/configs/ImageNet/MobileNetV1/MobileNetV1.yaml
quant_export:null
fpgm_export:null
distill_export:null
export1:null
export2:null
##
infer_model:../inference/
infer_export:null
infer_quant:False
inference:python/predict_cls.py -c configs/inference_cls.yaml
-o Global.use_gpu:True|False
-o Global.enable_mkldnn:True|False
-o Global.cpu_num_threads:1|6
-o Global.batch_size:1
-o Global.use_tensorrt:True|False
-o Global.use_fp16:True|False
-o Global.inference_model_dir:../inference
-o Global.infer_imgs:../dataset/ILSVRC2012/val
-o Global.save_log_path:null
-o Global.benchmark:True
#
#
#
#
===========================pretrained_model===========================
ResNet50_vd:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet50_vd_pretrained.pdparams
ResNeXt101_vd_64x4d:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNeXt101_vd_64x4d_pretrained.pdparams
HRNet_W18_C:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W18_C_pretrained.pdparams
MobileNetV3_large_x1_0:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_large_x1_0_pretrained.pdparams
DarkNet53:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DarkNet53_pretrained.pdparams
MobileNetV1:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV1_pretrained.pdparams
MobileNetV2:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV2_pretrained.pdparams
ShuffleNetV2_x1_0:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ShuffleNetV2_x1_0_pretrained.pdparams
===========================train_params===========================
model_name:MobileNetV2
python:python3.7
gpu_list:0|0,1
-o Global.device:gpu
-o Global.auto_cast:null
-o Global.epochs:lite_train_infer=2|whole_train_infer=120
-o Global.output_dir:./output/
-o DataLoader.Train.sampler.batch_size:8
-o Global.pretrained_model:null
train_model_name:latest
train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/MobileNetV2/MobileNetV2.yaml
pact_train:null
fpgm_train:null
distill_train:null
null:null
null:null
##
===========================eval_params===========================
eval:tools/eval.py -c ppcls/configs/ImageNet/MobileNetV2/MobileNetV2.yaml
null:null
##
===========================infer_params==========================
-o Global.save_inference_dir:./inference
-o Global.pretrained_model:
norm_export:tools/export_model.py -c ppcls/configs/ImageNet/MobileNetV2/MobileNetV2.yaml
quant_export:null
fpgm_export:null
distill_export:null
export1:null
export2:null
##
infer_model:../inference/
infer_export:null
infer_quant:False
inference:python/predict_cls.py -c configs/inference_cls.yaml
-o Global.use_gpu:True|False
-o Global.enable_mkldnn:True|False
-o Global.cpu_num_threads:1|6
-o Global.batch_size:1
-o Global.use_tensorrt:True|False
-o Global.use_fp16:True|False
-o Global.inference_model_dir:../inference
-o Global.infer_imgs:../dataset/ILSVRC2012/val
-o Global.save_log_path:null
-o Global.benchmark:True
#
#
#
#
===========================pretrained_model===========================
ResNet50_vd:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet50_vd_pretrained.pdparams
ResNeXt101_vd_64x4d:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNeXt101_vd_64x4d_pretrained.pdparams
HRNet_W18_C:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W18_C_pretrained.pdparams
MobileNetV3_large_x1_0:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_large_x1_0_pretrained.pdparams
DarkNet53:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DarkNet53_pretrained.pdparams
MobileNetV1:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV1_pretrained.pdparams
MobileNetV2:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV2_pretrained.pdparams
ShuffleNetV2_x1_0:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ShuffleNetV2_x1_0_pretrained.pdparams
===========================train_params===========================
model_name:MobileNetV3_large_x1_0
python:python3.7
gpu_list:0|0,1
-o Global.device:gpu
-o Global.auto_cast:null
-o Global.epochs:lite_train_infer=2|whole_train_infer=120
-o Global.output_dir:./output/
-o DataLoader.Train.sampler.batch_size:8
-o Global.pretrained_model:null
train_model_name:latest
train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/MobileNetV3/MobileNetV3_large_x1_0.yaml
pact_train:deploy/slim/slim.py -c ppcls/configs/slim/MobileNetV3_large_x1_0_quantization.yaml
fpgm_train:deploy/slim/slim.py -c ppcls/configs/slim/MobileNetV3_large_x1_0_prune.yaml
distill_train:null
null:null
null:null
##
===========================eval_params===========================
eval:tools/eval.py -c ppcls/configs/ImageNet/MobileNetV3/MobileNetV3_large_x1_0.yaml
null:null
##
===========================infer_params==========================
-o Global.save_inference_dir:./inference
-o Global.pretrained_model:
norm_export:tools/export_model.py -c ppcls/configs/ImageNet/MobileNetV3/MobileNetV3_large_x1_0.yaml
quant_export:deploy/slim/slim.py -m export -c ppcls/configs/slim/MobileNetV3_large_x1_0_quantization.yaml
fpgm_export:deploy/slim/slim.py -m export -c ppcls/configs/slim/MobileNetV3_large_x1_0_prune.yaml
distill_export:null
export1:null
export2:null
##
infer_model:../inference/
infer_export:null
infer_quant:False
inference:python/predict_cls.py -c configs/inference_cls.yaml
-o Global.use_gpu:True|False
-o Global.enable_mkldnn:True|False
-o Global.cpu_num_threads:1|6
-o Global.batch_size:1
-o Global.use_tensorrt:True|False
-o Global.use_fp16:True|False
-o Global.inference_model_dir:../inference
-o Global.infer_imgs:../dataset/ILSVRC2012/val
-o Global.save_log_path:null
-o Global.benchmark:True
#
#
#
#
===========================pretrained_model===========================
ResNet50_vd:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet50_vd_pretrained.pdparams
ResNeXt101_vd_64x4d:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNeXt101_vd_64x4d_pretrained.pdparams
HRNet_W18_C:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W18_C_pretrained.pdparams
MobileNetV3_large_x1_0:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_large_x1_0_pretrained.pdparams
DarkNet53:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DarkNet53_pretrained.pdparams
MobileNetV1:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV1_pretrained.pdparams
MobileNetV2:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV2_pretrained.pdparams
ShuffleNetV2_x1_0:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ShuffleNetV2_x1_0_pretrained.pdparams
===========================train_params===========================
model_name:ResNeXt101_vd_64x4d
python:python3.7
gpu_list:0|0,1
-o Global.device:gpu
-o Global.auto_cast:null
-o Global.epochs:lite_train_infer=2|whole_train_infer=120
-o Global.output_dir:./output/
-o DataLoader.Train.sampler.batch_size:8
-o Global.pretrained_model:null
train_model_name:latest
train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/ResNeXt/ResNeXt101_vd_64x4d.yaml
pact_train:null
fpgm_train:null
distill_train:null
null:null
null:null
##
===========================eval_params===========================
eval:tools/eval.py -c ppcls/configs/ImageNet/ResNeXt/ResNeXt101_vd_64x4d.yaml
null:null
##
===========================infer_params==========================
-o Global.save_inference_dir:./inference
-o Global.pretrained_model:
norm_export:tools/export_model.py -c ppcls/configs/ImageNet/ResNeXt/ResNeXt101_vd_64x4d.yaml
quant_export:null
fpgm_export:null
distill_export:null
export1:null
export2:null
##
infer_model:../inference/
infer_export:null
infer_quant:False
inference:python/predict_cls.py -c configs/inference_cls.yaml
-o Global.use_gpu:True|False
-o Global.enable_mkldnn:True|False
-o Global.cpu_num_threads:1|6
-o Global.batch_size:1
-o Global.use_tensorrt:True|False
-o Global.use_fp16:True|False
-o Global.inference_model_dir:../inference
-o Global.infer_imgs:../dataset/ILSVRC2012/val
-o Global.save_log_path:null
-o Global.benchmark:True
#
#
#
#
===========================pretrained_model===========================
ResNet50_vd:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet50_vd_pretrained.pdparams
ResNeXt101_vd_64x4d:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNeXt101_vd_64x4d_pretrained.pdparams
HRNet_W18_C:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W18_C_pretrained.pdparams
MobileNetV3_large_x1_0:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_large_x1_0_pretrained.pdparams
DarkNet53:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DarkNet53_pretrained.pdparams
MobileNetV1:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV1_pretrained.pdparams
MobileNetV2:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV2_pretrained.pdparams
ShuffleNetV2_x1_0:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ShuffleNetV2_x1_0_pretrained.pdparams
===========================train_params===========================
model_name:ResNet50_vd
python:python3.7
gpu_list:3,4
-o Global.device:gpu
-o Global.auto_cast:null
-o Global.epochs:lite_train_infer=2|whole_train_infer=120
-o Global.output_dir:./output/
-o DataLoader.Train.sampler.batch_size:8
-o Global.pretrained_model:null
train_model_name:latest
train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/ResNet/ResNet50_vd.yaml
pact_train:deploy/slim/slim.py -c ppcls/configs/slim/ResNet50_vd_quantization.yaml
fpgm_train:deploy/slim/slim.py -c ppcls/configs/slim/ResNet50_vd_prune.yaml
distill_train:null
null:null
null:null
##
===========================eval_params===========================
eval:tools/eval.py -c ppcls/configs/ImageNet/ResNet/ResNet50_vd.yaml
null:null
##
===========================infer_params==========================
-o Global.save_inference_dir:./inference
-o Global.pretrained_model:
norm_export:tools/export_model.py -c ppcls/configs/ImageNet/ResNet/ResNet50_vd.yaml
quant_export:deploy/slim/slim.py -m export -c ppcls/configs/slim/ResNet50_vd_quantization.yaml
fpgm_export:deploy/slim/slim.py -m export -c ppcls/configs/slim/ResNet50_vd_prune.yaml
distill_export:null
export1:null
export2:null
##
infer_model:../inference/
infer_export:null
infer_quant:False
inference:python/predict_cls.py -c configs/inference_cls.yaml
-o Global.use_gpu:True|False
-o Global.enable_mkldnn:True|False
-o Global.cpu_num_threads:1|6
-o Global.batch_size:1
-o Global.use_tensorrt:True|False
-o Global.use_fp16:True|False
-o Global.inference_model_dir:../inference
-o Global.infer_imgs:../dataset/ILSVRC2012/val
-o Global.save_log_path:null
-o Global.benchmark:True
#
#
#
#
===========================pretrained_model===========================
ResNet50_vd:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet50_vd_pretrained.pdparams
ResNeXt101_vd_64x4d:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNeXt101_vd_64x4d_pretrained.pdparams
HRNet_W18_C:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W18_C_pretrained.pdparams
MobileNetV3_large_x1_0:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_large_x1_0_pretrained.pdparams
DarkNet53:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DarkNet53_pretrained.pdparams
MobileNetV1:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV1_pretrained.pdparams
MobileNetV2:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV2_pretrained.pdparams
ShuffleNetV2_x1_0:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ShuffleNetV2_x1_0_pretrained.pdparams
===========================train_params===========================
model_name:ShuffleNetV2_x1_0
python:python3.7
gpu_list:0|0,1
-o Global.device:gpu
-o Global.auto_cast:null
-o Global.epochs:lite_train_infer=2|whole_train_infer=120
-o Global.output_dir:./output/
-o DataLoader.Train.sampler.batch_size:8
-o Global.pretrained_model:null
train_model_name:latest
train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/ShuffleNet/ShuffleNetV2_x1_0.yaml
pact_train:null
fpgm_train:null
distill_train:null
null:null
null:null
##
===========================eval_params===========================
eval:tools/eval.py -c ppcls/configs/ImageNet/ShuffleNet/ShuffleNetV2_x1_0.yaml
null:null
##
===========================infer_params==========================
-o Global.save_inference_dir:./inference
-o Global.pretrained_model:
norm_export:tools/export_model.py -c ppcls/configs/ImageNet/ShuffleNet/ShuffleNetV2_x1_0.yaml
quant_export:null
fpgm_export:null
distill_export:null
export1:null
export2:null
##
infer_model:../inference/
infer_export:null
infer_quant:False
inference:python/predict_cls.py -c configs/inference_cls.yaml
-o Global.use_gpu:True|False
-o Global.enable_mkldnn:True|False
-o Global.cpu_num_threads:1|6
-o Global.batch_size:1
-o Global.use_tensorrt:True|False
-o Global.use_fp16:True|False
-o Global.inference_model_dir:../inference
-o Global.infer_imgs:../dataset/ILSVRC2012/val
-o Global.save_log_path:null
-o Global.benchmark:True
#
#
#
#
===========================pretrained_model===========================
ResNet50_vd:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet50_vd_pretrained.pdparams
ResNeXt101_vd_64x4d:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNeXt101_vd_64x4d_pretrained.pdparams
HRNet_W18_C:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W18_C_pretrained.pdparams
MobileNetV3_large_x1_0:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_large_x1_0_pretrained.pdparams
DarkNet53:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/DarkNet53_pretrained.pdparams
MobileNetV1:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV1_pretrained.pdparams
MobileNetV2:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV2_pretrained.pdparams
ShuffleNetV2_x1_0:https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ShuffleNetV2_x1_0_pretrained.pdparams
@@ -15,44 +15,38 @@ function func_parser_value(){
     tmp="${array[1]}:${array[2]}"
     echo ${tmp}
 }
-ResNet50_vd=$(func_parser_value "${lines[49]}")
-ResNeXt101_vd_64x4d=$(func_parser_value "${lines[50]}")
-HRNet_W18_C=$(func_parser_value "${lines[51]}")
-MobileNetV3_large_x1_0=$(func_parser_value "${lines[52]}")
-DarkNet53=$(func_parser_value "${lines[53]}")
-MobileNetV1=$(func_parser_value "${lines[54]}")
-MobileNetV2=$(func_parser_value "${lines[55]}")
-ShuffleNetV2_x1_0=$(func_parser_value "${lines[56]}")
+inference_model_url=$(func_parser_value "${lines[50]}")
 if [ ${MODE} = "lite_train_infer" ] || [ ${MODE} = "whole_infer" ];then
     # pretrain lite train data
     cd dataset
     wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/data/whole_chain/whole_chain_little_train.tar
     tar xf whole_chain_little_train.tar
-    ln -s whole_chain_little_train chain_dataset
-    cd ../
+    ln -s whole_chain_little_train ILSVRC2012
+    cd ILSVRC2012
+    mv train.txt train_list.txt
+    mv val.txt val_list.txt
+    cd ../../
 elif [ ${MODE} = "infer" ];then
     # download data
     cd dataset
     wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/data/whole_chain/whole_chain_infer.tar
     tar xf whole_chain_infer.tar
-    ln -s whole_chain_infer chain_dataset
-    cd ../
-    # download pretrained model
-    mkdir -p pretrained_models
-    cd pretrained_models
-    eval "wget -nc $ResNet50_vd"
-    eval "wget -nc $ResNeXt101_vd_64x4d"
-    eval "wget -nc $HRNet_W18_C"
-    eval "wget -nc $MobileNetV3_large_x1_0"
-    eval "wget -nc $DarkNet53"
-    eval "wget -nc $MobileNetV1"
-    eval "wget -nc $MobileNetV2"
-    eval "wget -nc $ShuffleNetV2_x1_0"
+    ln -s whole_chain_infer ILSVRC2012
+    cd ILSVRC2012
+    mv train.txt train_list.txt
+    mv val.txt val_list.txt
+    cd ../../
+    # download inference model
+    eval "wget -nc $inference_model_url"
 elif [ ${MODE} = "whole_train_infer" ];then
     cd dataset
     wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/data/whole_chain/whole_chain_CIFAR100.tar
     tar xf whole_chain_CIFAR100.tar
-    ln -s whole_chain_CIFAR100 chain_dataset
+    ln -s whole_chain_CIFAR100 ILSVRC2012
+    cd ILSVRC2012
+    mv train.txt train_list.txt
+    mv val.txt val_list.txt
+    cd ../../
 fi
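After this change every mode exposes the same dataset/ILSVRC2012 layout, with the list files the configs expect. A hypothetical invocation and check (script and params-file paths assumed):

bash tests/prepare.sh tests/configs/DarkNet53.txt lite_train_infer
ls dataset/ILSVRC2012/train_list.txt dataset/ILSVRC2012/val_list.txt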
#!/bin/bash
FILENAME=$1
# MODE must be one of ['lite_train_infer', 'whole_infer', 'whole_train_infer', 'infer']
MODE=$2
dataline=$(cat ${FILENAME})
# parser params
IFS=$'\n'
lines=(${dataline})
function func_parser_key(){
    strs=$1
    IFS=":"
    array=(${strs})
    tmp=${array[0]}
    echo ${tmp}
}
function func_parser_value(){
    strs=$1
    IFS=":"
    array=(${strs})
    tmp=${array[1]}
    echo ${tmp}
}
function func_set_params(){
    key=$1
    value=$2
    if [ "${key}" = "null" ];then
        echo " "
    elif [[ ${value} = "null" ]] || [[ ${value} = " " ]] || [ ${#value} -le 0 ];then
        echo " "
    else
        echo "${key}=${value}"
    fi
}
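func_set_params turns a key/value pair from the params file into a command-line fragment, and collapses disabled entries into a blank so the assembled command stays valid. Illustrative calls:

func_set_params "-o Global.epochs" "2"      # prints: -o Global.epochs=2
func_set_params "-o Global.epochs" "null"   # prints a blank, option is dropped
func_set_params "null" "whatever"           # prints a blank, key is disabled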
function func_parser_params(){
    strs=$1
    IFS=":"
    array=(${strs})
    key=${array[0]}
    tmp=${array[1]}
    IFS="|"
    res=""
    for _params in ${tmp[*]}; do
        IFS="="
        array=(${_params})
        mode=${array[0]}
        value=${array[1]}
        if [[ ${mode} = ${MODE} ]]; then
            IFS="|"
            #echo $(func_set_params "${mode}" "${value}")
            echo $value
            break
        fi
        IFS="|"
    done
    echo ${res}
}
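This is what makes the mode=value|mode=value syntax in the params files work: the function returns only the value whose mode matches the current MODE. An illustrative call:

MODE="lite_train_infer"
func_parser_params "-o Global.epochs:lite_train_infer=2|whole_train_infer=120"   # prints: 2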
function status_check(){
    last_status=$1   # the exit code
    run_command=$2
    run_log=$3
    if [ $last_status -eq 0 ]; then
        echo -e "\033[33m Run successfully with command - ${run_command}! \033[0m" | tee -a ${run_log}
    else
        echo -e "\033[33m Run failed with command - ${run_command}! \033[0m" | tee -a ${run_log}
    fi
}
IFS=$'\n'
# The training params
model_name=$(func_parser_value "${lines[1]}")
python=$(func_parser_value "${lines[2]}")
gpu_list=$(func_parser_value "${lines[3]}")
train_use_gpu_key=$(func_parser_key "${lines[4]}")
train_use_gpu_value=$(func_parser_value "${lines[4]}")
autocast_list=$(func_parser_value "${lines[5]}")
autocast_key=$(func_parser_key "${lines[5]}")
epoch_key=$(func_parser_key "${lines[6]}")
epoch_num=$(func_parser_params "${lines[6]}")
save_model_key=$(func_parser_key "${lines[7]}")
train_batch_key=$(func_parser_key "${lines[8]}")
train_batch_value=$(func_parser_params "${lines[8]}")
pretrain_model_key=$(func_parser_key "${lines[9]}")
pretrain_model_value=$(func_parser_value "${lines[9]}")
train_model_name=$(func_parser_value "${lines[10]}")
train_infer_img_dir=$(func_parser_value "${lines[11]}")
train_param_key1=$(func_parser_key "${lines[12]}")
train_param_value1=$(func_parser_value "${lines[12]}")
trainer_list=$(func_parser_value "${lines[14]}")
trainer_norm=$(func_parser_key "${lines[15]}")
norm_trainer=$(func_parser_value "${lines[15]}")
pact_key=$(func_parser_key "${lines[16]}")
pact_trainer=$(func_parser_value "${lines[16]}")
fpgm_key=$(func_parser_key "${lines[17]}")
fpgm_trainer=$(func_parser_value "${lines[17]}")
distill_key=$(func_parser_key "${lines[18]}")
distill_trainer=$(func_parser_value "${lines[18]}")
trainer_key1=$(func_parser_key "${lines[19]}")
trainer_value1=$(func_parser_value "${lines[19]}")
trainer_key2=$(func_parser_key "${lines[20]}")
trainer_value2=$(func_parser_value "${lines[20]}")
eval_py=$(func_parser_value "${lines[23]}")
eval_key1=$(func_parser_key "${lines[24]}")
eval_value1=$(func_parser_value "${lines[24]}")
save_infer_key=$(func_parser_key "${lines[27]}")
export_weight=$(func_parser_key "${lines[28]}")
norm_export=$(func_parser_value "${lines[29]}")
pact_export=$(func_parser_value "${lines[30]}")
fpgm_export=$(func_parser_value "${lines[31]}")
distill_export=$(func_parser_value "${lines[32]}")
export_key1=$(func_parser_key "${lines[33]}")
export_value1=$(func_parser_value "${lines[33]}")
export_key2=$(func_parser_key "${lines[34]}")
export_value2=$(func_parser_value "${lines[34]}")
# parse inference model params
infer_model_dir_list=$(func_parser_value "${lines[36]}")
infer_export_list=$(func_parser_value "${lines[37]}")
infer_is_quant=$(func_parser_value "${lines[38]}")
# parse inference params
inference_py=$(func_parser_value "${lines[39]}")
use_gpu_key=$(func_parser_key "${lines[40]}")
use_gpu_list=$(func_parser_value "${lines[40]}")
use_mkldnn_key=$(func_parser_key "${lines[41]}")
use_mkldnn_list=$(func_parser_value "${lines[41]}")
cpu_threads_key=$(func_parser_key "${lines[42]}")
cpu_threads_list=$(func_parser_value "${lines[42]}")
batch_size_key=$(func_parser_key "${lines[43]}")
batch_size_list=$(func_parser_value "${lines[43]}")
use_trt_key=$(func_parser_key "${lines[44]}")
use_trt_list=$(func_parser_value "${lines[44]}")
precision_key=$(func_parser_key "${lines[45]}")
precision_list=$(func_parser_value "${lines[45]}")
infer_model_key=$(func_parser_key "${lines[46]}")
image_dir_key=$(func_parser_key "${lines[47]}")
infer_img_dir=$(func_parser_value "${lines[47]}")
save_log_key=$(func_parser_key "${lines[48]}")
benchmark_key=$(func_parser_key "${lines[49]}")
benchmark_value=$(func_parser_value "${lines[49]}")
infer_key1=$(func_parser_key "${lines[50]}")
infer_value1=$(func_parser_value "${lines[50]}")
LOG_PATH="./tests/output"
mkdir -p ${LOG_PATH}
status_log="${LOG_PATH}/results.log"
function func_inference(){
    IFS='|'
    _python=$1
    _script=$2
    _model_dir=$3
    _log_path=$4
    _img_dir=$5
    _flag_quant=$6
    # inference
    for use_gpu in ${use_gpu_list[*]}; do
        if [ ${use_gpu} = "False" ] || [ ${use_gpu} = "cpu" ]; then
            for use_mkldnn in ${use_mkldnn_list[*]}; do
                if [ ${use_mkldnn} = "False" ] && [ ${_flag_quant} = "True" ]; then
                    continue
                fi
                for threads in ${cpu_threads_list[*]}; do
                    for batch_size in ${batch_size_list[*]}; do
                        _save_log_path="${_log_path}/infer_cpu_usemkldnn_${use_mkldnn}_threads_${threads}_batchsize_${batch_size}.log"
                        set_infer_data=$(func_set_params "${image_dir_key}" "${_img_dir}")
                        set_benchmark=$(func_set_params "${benchmark_key}" "${benchmark_value}")
                        set_batchsize=$(func_set_params "${batch_size_key}" "${batch_size}")
                        set_cpu_threads=$(func_set_params "${cpu_threads_key}" "${threads}")
                        set_model_dir=$(func_set_params "${infer_model_key}" "${_model_dir}")
                        set_infer_params1=$(func_set_params "${infer_key1}" "${infer_value1}")
                        command="${_python} ${_script} ${use_gpu_key}=${use_gpu} ${use_mkldnn_key}=${use_mkldnn} ${set_cpu_threads} ${set_model_dir} ${set_batchsize} ${set_infer_data} ${set_benchmark} ${set_infer_params1} > ${_save_log_path} 2>&1 "
                        eval $command
                        last_status=${PIPESTATUS[0]}
                        eval "cat ${_save_log_path}"
                        status_check $last_status "${command}" "../${status_log}"
                    done
                done
            done
        elif [ ${use_gpu} = "True" ] || [ ${use_gpu} = "gpu" ]; then
            for use_trt in ${use_trt_list[*]}; do
                for precision in ${precision_list[*]}; do
                    if [[ ${precision} =~ "fp16" || ${precision} =~ "int8" ]] && [ ${use_trt} = "False" ]; then
                        continue
                    fi
                    if [[ ${use_trt} = "False" || ${precision} =~ "int8" ]] && [ ${_flag_quant} = "True" ]; then
                        continue
                    fi
                    for batch_size in ${batch_size_list[*]}; do
                        _save_log_path="${_log_path}/infer_gpu_usetrt_${use_trt}_precision_${precision}_batchsize_${batch_size}.log"
                        set_infer_data=$(func_set_params "${image_dir_key}" "${_img_dir}")
                        set_benchmark=$(func_set_params "${benchmark_key}" "${benchmark_value}")
                        set_batchsize=$(func_set_params "${batch_size_key}" "${batch_size}")
                        set_tensorrt=$(func_set_params "${use_trt_key}" "${use_trt}")
                        set_precision=$(func_set_params "${precision_key}" "${precision}")
                        set_model_dir=$(func_set_params "${infer_model_key}" "${_model_dir}")
                        command="${_python} ${_script} ${use_gpu_key}=${use_gpu} ${set_tensorrt} ${set_precision} ${set_model_dir} ${set_batchsize} ${set_infer_data} ${set_benchmark} > ${_save_log_path} 2>&1 "
                        eval $command
                        last_status=${PIPESTATUS[0]}
                        eval "cat ${_save_log_path}"
                        status_check $last_status "${command}" "../${status_log}"
                    done
                done
            done
        else
            echo "Currently, hardware other than CPU and GPU is not supported!"
        fi
    done
}
if [ ${MODE} = "infer" ]; then
GPUID=$3
if [ ${#GPUID} -le 0 ];then
env=" "
else
env="export CUDA_VISIBLE_DEVICES=${GPUID}"
fi
# set CUDA_VISIBLE_DEVICES
eval $env
export Count=0
IFS="|"
infer_run_exports=(${infer_export_list})
infer_quant_flag=(${infer_is_quant})
for infer_model in ${infer_model_dir_list[*]}; do
# run export
if [ ${infer_run_exports[Count]} != "null" ];then
set_export_weight=$(func_set_params "${export_weight}" "${infer_model}")
set_save_infer_key=$(func_set_params "${save_infer_key}" "${infer_model}")
export_cmd="${python} ${norm_export} ${set_export_weight} ${set_save_infer_key}"
eval $export_cmd
status_export=$?
if [ ${status_export} = 0 ];then
status_check $status_export "${export_cmd}" "${status_log}"
fi
fi
#run inference
is_quant=${infer_quant_flag[Count]}
echo "is_quant: ${is_quant}"
func_inference "${python}" "${inference_py}" "${infer_model}" "${LOG_PATH}" "${infer_img_dir}" ${is_quant}
Count=$(($Count + 1))
done
else
IFS="|"
export Count=0
USE_GPU_KEY=(${train_use_gpu_value})
for gpu in ${gpu_list[*]}; do
use_gpu=${USE_GPU_KEY[Count]}
Count=$(($Count + 1))
if [ ${gpu} = "-1" ];then
env=""
elif [ ${#gpu} -le 1 ];then
env="export CUDA_VISIBLE_DEVICES=${gpu}"
eval ${env}
elif [ ${#gpu} -le 15 ];then
IFS=","
array=(${gpu})
env="export CUDA_VISIBLE_DEVICES=${array[0]}"
IFS="|"
else
IFS=";"
array=(${gpu})
ips=${array[0]}
gpu=${array[1]}
IFS="|"
env=" "
fi
for autocast in ${autocast_list[*]}; do
for trainer in ${trainer_list[*]}; do
flag_quant=False
if [ ${trainer} = ${pact_key} ]; then
run_train=${pact_trainer}
run_export=${pact_export}
flag_quant=True
elif [ ${trainer} = "${fpgm_key}" ]; then
run_train=${fpgm_trainer}
run_export=${fpgm_export}
elif [ ${trainer} = "${distill_key}" ]; then
run_train=${distill_trainer}
run_export=${distill_export}
elif [ ${trainer} = ${trainer_key1} ]; then
run_train=${trainer_value1}
run_export=${export_value1}
elif [[ ${trainer} = ${trainer_key2} ]]; then
run_train=${trainer_value2}
run_export=${export_value2}
else
run_train=${norm_trainer}
run_export=${norm_export}
fi
if [ ${run_train} = "null" ]; then
continue
fi
set_autocast=$(func_set_params "${autocast_key}" "${autocast}")
set_epoch=$(func_set_params "${epoch_key}" "${epoch_num}")
set_pretrain=$(func_set_params "${pretrain_model_key}" "${pretrain_model_value}")
set_batchsize=$(func_set_params "${train_batch_key}" "${train_batch_value}")
set_train_params1=$(func_set_params "${train_param_key1}" "${train_param_value1}")
set_use_gpu=$(func_set_params "${train_use_gpu_key}" "${use_gpu}")
save_log="${LOG_PATH}/${trainer}_gpus_${gpu}_autocast_${autocast}"
# load pretrain from norm training if current trainer is pact or fpgm trainer
if [ ${trainer} = ${pact_key} ] || [ ${trainer} = ${fpgm_key} ]; then
set_pretrain="${load_norm_train_model}"
fi
set_save_model=$(func_set_params "${save_model_key}" "${save_log}")
if [ ${#gpu} -le 2 ];then # train with cpu or single gpu
cmd="${python} ${run_train} ${set_use_gpu} ${set_save_model} ${set_epoch} ${set_pretrain} ${set_autocast} ${set_batchsize} ${set_train_params1} "
elif [ ${#gpu} -le 15 ];then # train with multi-gpu
cmd="${python} -m paddle.distributed.launch --gpus=${gpu} ${run_train} ${set_save_model} ${set_epoch} ${set_pretrain} ${set_autocast} ${set_batchsize} ${set_train_params1}"
else # train with multi-machine
cmd="${python} -m paddle.distributed.launch --ips=${ips} --gpus=${gpu} ${run_train} ${set_save_model} ${set_pretrain} ${set_epoch} ${set_autocast} ${set_batchsize} ${set_train_params1}"
fi
# run train
eval "unset CUDA_VISIBLE_DEVICES"
eval $cmd
status_check $? "${cmd}" "${status_log}"
set_eval_pretrain=$(func_set_params "${pretrain_model_key}" "${save_log}/${$model_name}/${train_model_name}")
# save norm trained models to set pretrain for pact training and fpgm training
if [ ${trainer} = ${trainer_norm} ]; then
load_norm_train_model=${set_eval_pretrain}
fi
# run eval
if [ ${eval_py} != "null" ]; then
set_eval_params1=$(func_set_params "${eval_key1}" "${eval_value1}")
eval_cmd="${python} ${eval_py} ${set_eval_pretrain} ${set_use_gpu} ${set_eval_params1}"
eval $eval_cmd
status_check $? "${eval_cmd}" "${status_log}"
fi
# run export model
if [ ${run_export} != "null" ]; then
# run export model
save_infer_path="${save_log}"
set_export_weight=$(func_set_params "${export_weight}" "${save_log}/${model_name}/${train_model_name}")
set_save_infer_key=$(func_set_params "${save_infer_key}" "${save_infer_path}")
export_cmd="${python} ${run_export} ${set_export_weight} ${set_save_infer_key}"
eval $export_cmd
status_check $? "${export_cmd}" "${status_log}"
#run inference
eval $env
save_infer_path="${save_log}"
cd deploy
func_inference "${python}" "${inference_py}" "../${save_infer_path}" "../${LOG_PATH}" "${infer_img_dir}" "${flag_quant}"
cd ..
fi
eval "unset CUDA_VISIBLE_DEVICES"
done # done with: for trainer in ${trainer_list[*]}; do
done # done with: for autocast in ${autocast_list[*]}; do
done # done with: for gpu in ${gpu_list[*]}; do
fi # end if [ ${MODE} = "infer" ]; then
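A hypothetical end-to-end run of the new chain, assuming the per-model params files live under tests/configs/ as the paths above suggest:

bash tests/test.sh tests/configs/ResNet50_vd.txt lite_train_infer
bash tests/test.sh tests/configs/ResNet50_vd.txt infer 0   # GPU id only used in infer mode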