Commit 5aa57d2c authored by: D dongshuilong

Merge branch 'develop' of https://github.com/PaddlePaddle/PaddleClas into arcmargin

@@ -7,7 +7,7 @@
PaddleClas, the PaddlePaddle image recognition suite, is a toolset for image recognition tasks prepared by PaddlePaddle for industry and academia, helping users train better vision models and bring applications to production.
**Recent updates**
- 2021.09.17 Added the PP-LCNet series of models developed by PaddleClas; these models show strong competitiveness on Intel CPUs. The metrics and pretrained weights can be downloaded [here](docs/zh_CN/ImageNet_models.md).
- 2021.08.11 Updated 7 [FAQs](docs/zh_CN/faq_series/faq_2021_s2.md).
- 2021.06.29 Added the Swin-Transformer series of models, with a highest Top-1 accuracy of 87.2% on the ImageNet1k dataset; training, inference, evaluation, and whl-package deployment are supported, and pretrained models can be downloaded [here](docs/zh_CN/models/models_intro.md).
- 2021.06.22,23,24 The PaddleClas R&D team gave a three-day live course with in-depth technical explanations. Course replay: [https://aistudio.baidu.com/aistudio/course/introduce/24519](https://aistudio.baidu.com/aistudio/course/introduce/24519)
......
@@ -8,6 +8,8 @@ PaddleClas is an image recognition toolset for industry and academia, helping users train better computer vision models and apply them in real scenarios.
**Recent updates**
- 2021.09.17 Added the PP-LCNet series of models developed by PaddleClas; these models show strong competitiveness on Intel CPUs. The metrics and pretrained models are available [here](docs/en/ImageNet_models_en.md).
- 2021.06.29 Added the Swin-Transformer series of models; the highest Top-1 accuracy on the ImageNet1k dataset reaches 87.2%. Training, evaluation, and inference are all supported, and pretrained models can be downloaded [here](docs/en/models/models_intro_en.md).
- 2021.06.16 PaddleClas release/2.2. Added metric learning and vector search modules, as well as product recognition, animation character recognition, vehicle recognition, and logo recognition. Added 30 pretrained models of LeViT, Twins, TNT, DLA, HarDNet, and RedNet, with accuracy roughly the same as reported in the papers.
- [more](./docs/en/update_history_en.md)
......
Global:
  infer_imgs: "./images/0517_2715693311.jpg"
  inference_model_dir: "../inference/"
  batch_size: 1
  use_gpu: True
  enable_mkldnn: False
  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
  use_tensorrt: False
  gpu_mem: 8000
  enable_profile: False

PreProcess:
  transform_ops:
    - ResizeImage:
        resize_short: 256
    - CropImage:
        size: 224
    - NormalizeImage:
        scale: 0.00392157
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
        channel_num: 3
    - ToCHWImage:

PostProcess:
  main_indicator: MultiLabelTopk
  MultiLabelTopk:
    topk: 5
    class_id_map_file: None
  SavePreLabel:
    save_dir: ./pre_label/
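For reference, here is a rough standalone sketch (plain OpenCV/numpy, not the PaddleClas operators) of the arithmetic the `PreProcess` chain above performs; it ignores the channel-order details controlled by `order` and `channel_num`:

```python
import cv2
import numpy as np

def preprocess(img_path):
    img = cv2.imread(img_path)                      # HWC, uint8
    # ResizeImage: scale the short side to 256, keeping the aspect ratio
    h, w = img.shape[:2]
    s = 256 / min(h, w)
    img = cv2.resize(img, (round(w * s), round(h * s)))
    # CropImage: 224 x 224 center crop
    h, w = img.shape[:2]
    top, left = (h - 224) // 2, (w - 224) // 2
    img = img[top:top + 224, left:left + 224]
    # NormalizeImage: x * scale (~1/255), then (x - mean) / std per channel
    img = img.astype("float32") * 0.00392157
    img = (img - np.array([0.485, 0.456, 0.406])) / np.array([0.229, 0.224, 0.225])
    # ToCHWImage: HWC -> CHW
    return img.transpose((2, 0, 1))
```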
@@ -4,9 +4,9 @@
PaddleClas provides two service deployment methods:
- Based on **PaddleHub Serving**: Code path is "`./deploy/hubserving`". Please refer to the [tutorial](../../deploy/hubserving/readme_en.md)
- Based on **PaddleServing**: Code path is "`./deploy/paddleserving`". If you prefer the retrieval-based image recognition service, please refer to the [tutorial](./recognition/README.md); for the image classification service, please follow this tutorial.
# Image Classification Service deployment based on PaddleServing
This document will introduce how to use [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README.md) to deploy the ResNet50_vd model as a pipeline online service.
@@ -131,7 +131,7 @@ fetch_var {
config.yml # configuration file of starting the service
pipeline_http_client.py # script to send pipeline prediction request by http
pipeline_rpc_client.py # script to send pipeline prediction request by rpc
classification_web_service.py # start the script of the pipeline server
```
2. Run the following command to start the service.
@@ -147,7 +147,7 @@ fetch_var {
python3 pipeline_http_client.py
```
After successfully running, the predicted result of the model will be printed in the cmd window. An example of the result is:
![](./imgs/results.png)
Adjust the number of concurrency in config.yml to get the largest QPS.
......
@@ -4,9 +4,9 @@
PaddleClas provides two service deployment methods:
- Based on PaddleHub Serving: the code path is "`./deploy/hubserving`"; see the [document](../../deploy/hubserving/readme.md) for usage.
- Based on PaddleServing: the code path is "`./deploy/paddleserving`"; for the retrieval-based image recognition service see the [document](./recognition/README_CN.md), and follow this tutorial for the image classification service.
# Image classification service deployment based on PaddleServing
Taking the classic ResNet50_vd model as an example, this document introduces how to use [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md) to deploy the pipeline online service of a PaddleClas dynamic-graph model.
@@ -127,7 +127,7 @@ fetch_var {
config.yml # configuration file for starting the service
pipeline_http_client.py # script to send pipeline prediction requests over HTTP
pipeline_rpc_client.py # script to send pipeline prediction requests over RPC
classification_web_service.py # script to start the pipeline server
```
2. Run the following command to start the service:
......
# Product Recognition Service deployment based on PaddleServing
(English|[简体中文](./README_CN.md))
This document will introduce how to use [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README.md) to deploy the retrieval-based product recognition model as a pipeline online service.
Some Key Features of Paddle Serving:
- Integrates with the Paddle training pipeline seamlessly; most Paddle models can be deployed with a single command.
- Supports industrial serving features such as model management, online loading, and online A/B testing.
- Supports highly concurrent and efficient communication between clients and servers.
For an introduction to the Paddle Serving deployment framework and more tutorials, refer to the [document](https://github.com/PaddlePaddle/Serving/blob/develop/README.md).
## Contents
- [Environmental preparation](#environmental-preparation)
- [Model conversion](#model-conversion)
- [Paddle Serving pipeline deployment](#paddle-serving-pipeline-deployment)
- [FAQ](#faq)
<a name="environmental-preparation"></a>
## Environmental preparation
Both the PaddleClas and PaddleServing operating environments are needed.
1. Prepare the PaddleClas operating environment by following this [link](../../docs/zh_CN/tutorials/install.md).
Download the paddle whl package corresponding to your environment; version 2.1.0 is recommended.
2. The steps to prepare the PaddleServing operating environment are as follows:
Install serving, which is used to start the service:
```
pip3 install paddle-serving-server==0.6.1 # for CPU
pip3 install paddle-serving-server-gpu==0.6.1 # for GPU
# For other GPU environments, confirm the environment and then choose one of the following commands
pip3 install paddle-serving-server-gpu==0.6.1.post101 # GPU with CUDA10.1 + TensorRT6
pip3 install paddle-serving-server-gpu==0.6.1.post11 # GPU with CUDA11 + TensorRT7
```
3. Install the client, which is used to send requests to the service.
Find the client installation package corresponding to your Python version at the [download link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
Python 3.7 is recommended here:
```
wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.0.0-cp37-none-any.whl
pip3 install paddle_serving_client-0.0.0-cp37-none-any.whl
```
4. Install serving-app
```
pip3 install paddle-serving-app==0.6.1
```
**Note:** If you want to install the latest version of PaddleServing, refer to this [link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
<a name="model-conversion"></a>
## Model conversion
When using PaddleServing for service deployment, you need to convert the saved inference model into a serving model that is easy to deploy.
The following assumes that the current working directory is the PaddleClas root directory.
First, download the inference model of ResNet50_vd:
```
cd deploy
# Download and unzip the ResNet50_vd model
wget -P models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/product_ResNet50_vd_aliproduct_v1.0_infer.tar
cd models
tar -xf product_ResNet50_vd_aliproduct_v1.0_infer.tar
```
Then, use the installed paddle_serving_client tool to convert the inference model into a format that is easy for the server to deploy:
```
# Product recognition model conversion
python3 -m paddle_serving_client.convert --dirname ./product_ResNet50_vd_aliproduct_v1.0_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./product_ResNet50_vd_aliproduct_v1.0_serving/ \
--serving_client ./product_ResNet50_vd_aliproduct_v1.0_client/
```
After the ResNet50_vd inference model is converted, two additional folders, `product_ResNet50_vd_aliproduct_v1.0_serving` and `product_ResNet50_vd_aliproduct_v1.0_client`, will appear in the current directory, with the following layout:
```
|- product_ResNet50_vd_aliproduct_v1.0_serving/
   |- __model__
   |- __params__
   |- serving_server_conf.prototxt
   |- serving_server_conf.stream.prototxt
|- product_ResNet50_vd_aliproduct_v1.0_client
   |- serving_client_conf.prototxt
   |- serving_client_conf.stream.prototxt
```
Once you have the model files for deployment, you need to change the alias name in `serving_server_conf.prototxt`: change `alias_name` in `fetch_var` to `features`.
The modified serving_server_conf.prototxt is as follows:
```
feed_var {
  name: "x"
  alias_name: "x"
  is_lod_tensor: false
  feed_type: 1
  shape: 3
  shape: 224
  shape: 224
}
fetch_var {
  name: "save_infer_model/scale_0.tmp_1"
  alias_name: "features"
  is_lod_tensor: true
  fetch_type: 1
  shape: -1
}
```
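If you would rather not edit the file by hand, a minimal sketch like the following can patch the alias automatically. It assumes the converter wrote the variable name `save_infer_model/scale_0.tmp_1` as the default `alias_name`; verify this against your generated file first:

```python
# Hypothetical helper: rewrite fetch_var's alias_name to "features".
path = "product_ResNet50_vd_aliproduct_v1.0_serving/serving_server_conf.prototxt"
with open(path) as f:
    conf = f.read()
# Assumption: the default alias equals the fetch variable name.
conf = conf.replace('alias_name: "save_infer_model/scale_0.tmp_1"',
                    'alias_name: "features"', 1)
with open(path, "w") as f:
    f.write(conf)
```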
Next, download and unpack the prebuilt index of the product gallery:
```
cd ../
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_v1.1.tar && tar -xf recognition_demo_data_v1.1.tar
```
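As an optional sanity check that the downloaded index loads, a minimal sketch (the paths assume the demo data unpacked above, and faiss must be installed):

```python
import os
import pickle

import faiss

index_dir = "recognition_demo_data_v1.1/gallery_product/index"
searcher = faiss.read_index(os.path.join(index_dir, "vector.index"))
with open(os.path.join(index_dir, "id_map.pkl"), "rb") as f:
    id_map = pickle.load(f)
print(searcher.ntotal, "gallery vectors,", len(id_map), "id-map entries")
```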
<a name="paddle-serving-pipeline-deployment"></a>
## Paddle Serving pipeline deployment
1. Download the PaddleClas code; if you have already downloaded it, you can skip this step.
```
git clone https://github.com/PaddlePaddle/PaddleClas
# Enter the working directory
cd PaddleClas/deploy/paddleserving/recognition
```
The paddleserving directory contains the code to start the pipeline service and send prediction requests, including:
```
__init__.py
config.yml # configuration file of starting the service
pipeline_http_client.py # script to send pipeline prediction request by http
pipeline_rpc_client.py # script to send pipeline prediction request by rpc
recognition_web_service.py # start the script of the pipeline server
```
2. Run the following command to start the service.
```
# Start the service and save the running log in log.txt
python3 recognition_web_service.py &>log.txt &
```
After the service is successfully started, a log similar to the following will be printed in log.txt
![](../imgs/start_server_recog.png)
3. Send service request
```
python3 pipeline_http_client.py
```
After successfully running, the predicted result of the model will be printed in the cmd window. An example of the result is:
![](../imgs/results_recog.png)
Adjust the number of concurrency in config.yml to get the largest QPS.
```
op:
    concurrency: 8
    ...
```
Multiple service requests can be sent at the same time if necessary.
The predicted performance data will be automatically written into the `PipelineServingLogs/pipeline.tracer` file.
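For a quick load test, here is a minimal sketch that sends several requests in parallel; it assumes the service from step 2 is listening on port 18081 and reuses the demo image:

```python
import base64
import json
from concurrent.futures import ThreadPoolExecutor

import requests

url = "http://127.0.0.1:18081/recognition/prediction"
with open("daoxiangcunjinzhubing_6.jpg", "rb") as f:
    image = base64.b64encode(f.read()).decode("utf8")
payload = json.dumps({"key": ["image"], "value": [image]})

def send(_):
    # each worker posts the same image and returns the parsed response
    return requests.post(url=url, data=payload).json()

with ThreadPoolExecutor(max_workers=8) as pool:
    for result in pool.map(send, range(16)):
        print(result)
```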
<a name="faq"></a>
## FAQ
**Q1**: No result is returned after sending the request.
**A1**: Do not set the proxy when starting the service and sending the request. You can close the proxy before starting the service and before sending the request. The command to close the proxy is:
```
unset https_proxy
unset http_proxy
```
# Product recognition service deployment based on PaddleServing
([English](./README.md) | Simplified Chinese)
Taking product recognition as an example, this document introduces how to use [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md) to deploy the pipeline online service of a PaddleClas dynamic-graph model.
Compared with hubserving deployment, PaddleServing has the following advantages:
- Supports highly concurrent and efficient communication between clients and servers
- Supports industrial-grade service capabilities, such as model management, online loading, and online A/B testing
- Supports client development in multiple programming languages, such as C++, Python, and Java
For more on the PaddleServing deployment framework and its usage, refer to the [document](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md).
## Contents
- [Environment preparation](#环境准备)
- [Model conversion](#模型转换)
- [Paddle Serving pipeline deployment](#部署)
- [FAQ](#FAQ)
<a name="环境准备"></a>
## Environment preparation
Both the PaddleClas and PaddleServing runtime environments are needed.
- Prepare the PaddleClas [runtime environment](../../docs/zh_CN/tutorials/install.md) and download the paddle whl package matching your environment; version 2.1.0 is recommended
- Prepare the PaddleServing runtime environment as follows
1. Install serving, which is used to start the service
```
pip3 install paddle-serving-server==0.6.1 # for CPU
pip3 install paddle-serving-server-gpu==0.6.1 # for GPU
# For other GPU environments, confirm the environment and then choose one of the following commands
pip3 install paddle-serving-server-gpu==0.6.1.post101 # GPU with CUDA10.1 + TensorRT6
pip3 install paddle-serving-server-gpu==0.6.1.post11 # GPU with CUDA11 + TensorRT7
```
2. Install the client, which is used to send requests to the service
Find the client installation package matching your Python version at the [download link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md); Python 3.7 is recommended here:
```
wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.0.0-cp37-none-any.whl
pip3 install paddle_serving_client-0.0.0-cp37-none-any.whl
```
3. Install serving-app
```
pip3 install paddle-serving-app==0.6.1
```
**Note:** To install the latest version of PaddleServing, refer to this [link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
<a name="模型转换"></a>
## Model conversion
When deploying with PaddleServing, the saved inference model needs to be converted into a serving model that is easy to deploy.
The following assumes that the current working directory is the PaddleClas root directory.
First, download the product recognition inference model:
```
cd deploy
# Download and unzip the product recognition model
wget -P models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/product_ResNet50_vd_aliproduct_v1.0_infer.tar
cd models
tar -xf product_ResNet50_vd_aliproduct_v1.0_infer.tar
```
Next, use the installed paddle_serving_client to convert the downloaded inference model into a format that is easy for the server to deploy:
```
# Convert the product recognition model
python3 -m paddle_serving_client.convert --dirname ./product_ResNet50_vd_aliproduct_v1.0_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./product_ResNet50_vd_aliproduct_v1.0_serving/ \
--serving_client ./product_ResNet50_vd_aliproduct_v1.0_client/
```
After the product recognition inference model is converted, the folders `product_ResNet50_vd_aliproduct_v1.0_serving` and `product_ResNet50_vd_aliproduct_v1.0_client` will appear in the current directory, with the following layout:
```
|- product_ResNet50_vd_aliproduct_v1.0_serving/
   |- __model__
   |- __params__
   |- serving_server_conf.prototxt
   |- serving_server_conf.stream.prototxt
|- product_ResNet50_vd_aliproduct_v1.0_client
   |- serving_client_conf.prototxt
   |- serving_client_conf.stream.prototxt
```
After obtaining the model files, modify the alias name in serving_server_conf.prototxt: change `alias_name` under `fetch_var` to `features`.
The modified serving_server_conf.prototxt is as follows:
```
feed_var {
  name: "x"
  alias_name: "x"
  is_lod_tensor: false
  feed_type: 1
  shape: 3
  shape: 224
  shape: 224
}
fetch_var {
  name: "save_infer_model/scale_0.tmp_1"
  alias_name: "features"
  is_lod_tensor: true
  fetch_type: 1
  shape: -1
}
```
Next, download and unpack the prebuilt product gallery index:
```
cd ../
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_v1.1.tar && tar -xf recognition_demo_data_v1.1.tar
```
<a name="部署"></a>
## Paddle Serving pipeline deployment
1. Download the PaddleClas code; skip this step if you have already downloaded it.
```
git clone https://github.com/PaddlePaddle/PaddleClas
# Enter the working directory
cd PaddleClas/deploy/paddleserving/recognition
```
The paddleserving directory contains the code to start the pipeline service and send prediction requests, including:
```
__init__.py
config.yml # configuration file for starting the service
pipeline_http_client.py # script to send pipeline prediction requests over HTTP
pipeline_rpc_client.py # script to send pipeline prediction requests over RPC
recognition_web_service.py # script to start the pipeline server
```
2. Run the following command to start the service:
```
# Start the service; the running log is saved in log.txt
python3 recognition_web_service.py &>log.txt &
```
After the service starts successfully, a log similar to the following will be printed in log.txt:
![](../imgs/start_server_recog.png)
3. Send a service request:
```
python3 pipeline_http_client.py
```
After it runs successfully, the model's prediction result is printed in the terminal window; an example result:
![](../imgs/results_recog.png)
Adjust the concurrency in config.yml to obtain the highest QPS:
```
op:
    # concurrency: thread-level when is_thread_op=True, otherwise process-level
    concurrency: 8
    ...
```
Multiple service requests can be sent at the same time if necessary.
The prediction performance data is automatically written to the `PipelineServingLogs/pipeline.tracer` file.
<a name="FAQ"></a>
## FAQ
**Q1**: No result is returned after sending a request, or an output decoding error is reported.
**A1**: Do not set a proxy when starting the service or sending requests. You can turn the proxy off beforehand; the commands are:
```
unset https_proxy
unset http_proxy
```
# worker_num: the maximum concurrency.
# When build_dag_each_worker=True, the framework creates worker_num processes, each building its own grpcServer and DAG.
# When build_dag_each_worker=False, the framework sets max_workers=worker_num for the gRPC thread pool of the main thread.
worker_num: 1

# HTTP port; rpc_port and http_port must not both be empty. When rpc_port is valid and http_port is empty, no http_port is generated automatically.
http_port: 18081
rpc_port: 9994

dag:
    # Op resource type: True for the thread model, False for the process model
    is_thread_op: False
op:
    rec:
        # Concurrency: thread-level when is_thread_op=True, otherwise process-level
        concurrency: 1

        # When the op config has no server_endpoints, the local service configuration is read from local_service_conf
        local_service_conf:
            # Model path
            model_config: ../../models/product_ResNet50_vd_aliproduct_v1.0_serving

            # Compute hardware type: when empty, decided by devices (CPU/GPU); 0=cpu, 1=gpu, 2=tensorRT, 3=arm cpu, 4=kunlun xpu
            device_type: 1

            # Compute hardware IDs: CPU prediction when devices is "" or unset; GPU prediction when devices is "0" or "0,1,2", listing the GPU cards to use
            devices: "0" # "0,1"

            # Client type: brpc, grpc, or local_predictor. local_predictor does not start a Serving service and predicts in-process
            client_type: local_predictor

            # Fetch result list, following the alias_name of fetch_var in the client_config
            fetch_list: ["features"]
    det:
        concurrency: 1
        local_service_conf:
            client_type: local_predictor
            device_type: 1
            devices: '0'
            fetch_list:
            - save_infer_model/scale_0.tmp_1
            model_config: ../../models/ppyolov2_r50vd_dcn_mainbody_v1.0_serving/
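To double-check which parallelism knobs are in effect, a small sketch that loads the file above and prints them (it assumes PyYAML is installed and is run from the directory containing config.yml):

```python
import yaml

with open("config.yml") as f:
    conf = yaml.safe_load(f)
print(conf["worker_num"])               # gRPC worker count
print(conf["dag"]["is_thread_op"])      # thread vs process ops
print(conf["op"]["rec"]["concurrency"]) # parallel instances of the rec op
```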
foreground
background
import requests
import json
import base64
import os

imgpath = "daoxiangcunjinzhubing_6.jpg"


def cv2_to_base64(image):
    # encode raw image bytes as a base64 string for the JSON payload
    return base64.b64encode(image).decode('utf8')


if __name__ == "__main__":
    url = "http://127.0.0.1:18081/recognition/prediction"

    with open(os.path.join(".", imgpath), 'rb') as file:
        image_data1 = file.read()
    image = cv2_to_base64(image_data1)
    data = {"key": ["image"], "value": [image]}
    for i in range(1):
        r = requests.post(url=url, data=json.dumps(data))
        print(r.json())
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
try:
    from paddle_serving_server_gpu.pipeline import PipelineClient
except ImportError:
    from paddle_serving_server.pipeline import PipelineClient
import base64

client = PipelineClient()
client.connect(['127.0.0.1:9994'])

imgpath = "daoxiangcunjinzhubing_6.jpg"


def cv2_to_base64(image):
    # encode raw image bytes as a base64 string for the RPC payload
    return base64.b64encode(image).decode('utf8')


if __name__ == "__main__":
    with open(imgpath, 'rb') as file:
        image_data = file.read()
    image = cv2_to_base64(image_data)

    for i in range(1):
        ret = client.predict(feed_dict={"image": image}, fetch=["result"])
        print(ret)
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle_serving_server.web_service import WebService, Op
import logging
import numpy as np
import sys
import cv2
from paddle_serving_app.reader import *
import base64
import os
import faiss
import pickle
import json


class DetOp(Op):
    """Mainbody detection op: proposes candidate object boxes."""

    def init_op(self):
        self.img_preprocess = Sequential([
            BGR2RGB(), Div(255.0),
            Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], False),
            Resize((640, 640)), Transpose((2, 0, 1))
        ])
        self.img_postprocess = RCNNPostprocess("label_list.txt", "output")
        self.threshold = 0.2
        self.max_det_results = 5

    def generate_scale(self, im):
        """
        Args:
            im (np.ndarray): image (np.ndarray)
        Returns:
            im_scale_x: the resize ratio of X
            im_scale_y: the resize ratio of Y
        """
        target_size = [640, 640]
        origin_shape = im.shape[:2]
        resize_h, resize_w = target_size
        im_scale_y = resize_h / float(origin_shape[0])
        im_scale_x = resize_w / float(origin_shape[1])
        return im_scale_y, im_scale_x

    def preprocess(self, input_dicts, data_id, log_id):
        (_, input_dict), = input_dicts.items()
        imgs = []
        raw_imgs = []
        for key in input_dict.keys():
            data = base64.b64decode(input_dict[key].encode('utf8'))
            raw_imgs.append(data)
            data = np.frombuffer(data, np.uint8)
            raw_im = cv2.imdecode(data, cv2.IMREAD_COLOR)

            im_scale_y, im_scale_x = self.generate_scale(raw_im)
            im = self.img_preprocess(raw_im)

            imgs.append({
                "image": im[np.newaxis, :],
                "im_shape":
                np.array(list(im.shape[1:])).reshape(-1)[np.newaxis, :],
                "scale_factor":
                np.array([im_scale_y, im_scale_x]).astype('float32'),
            })
        # keep the raw bytes so RecOp can crop from the original image
        self.raw_img = raw_imgs

        feed_dict = {
            "image": np.concatenate(
                [x["image"] for x in imgs], axis=0),
            "im_shape": np.concatenate(
                [x["im_shape"] for x in imgs], axis=0),
            "scale_factor": np.concatenate(
                [x["scale_factor"] for x in imgs], axis=0)
        }
        return feed_dict, False, None, ""

    def postprocess(self, input_dicts, fetch_dict, log_id):
        boxes = self.img_postprocess(fetch_dict, visualize=False)
        boxes.sort(key=lambda x: x["score"], reverse=True)
        boxes = filter(lambda x: x["score"] >= self.threshold,
                       boxes[:self.max_det_results])
        boxes = list(boxes)
        # convert [x, y, w, h] boxes to [x1, y1, x2, y2]
        for i in range(len(boxes)):
            boxes[i]["bbox"][2] += boxes[i]["bbox"][0] - 1
            boxes[i]["bbox"][3] += boxes[i]["bbox"][1] - 1
        result = json.dumps(boxes)
        res_dict = {"bbox_result": result, "image": self.raw_img}
        return res_dict, None, ""


class RecOp(Op):
    """Recognition op: embeds each box and queries the gallery index."""

    def init_op(self):
        self.seq = Sequential([
            BGR2RGB(), Resize((224, 224)), Div(255),
            Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], False),
            Transpose((2, 0, 1))
        ])

        index_dir = "../../recognition_demo_data_v1.1/gallery_product/index"
        assert os.path.exists(os.path.join(
            index_dir, "vector.index")), "vector.index not found ..."
        assert os.path.exists(os.path.join(
            index_dir, "id_map.pkl")), "id_map.pkl not found ... "

        self.searcher = faiss.read_index(
            os.path.join(index_dir, "vector.index"))

        with open(os.path.join(index_dir, "id_map.pkl"), "rb") as fd:
            self.id_map = pickle.load(fd)

        self.rec_nms_thresold = 0.05
        self.rec_score_thres = 0.5
        self.feature_normalize = True
        self.return_k = 1

    def preprocess(self, input_dicts, data_id, log_id):
        (_, input_dict), = input_dicts.items()
        raw_img = input_dict["image"][0]
        data = np.frombuffer(raw_img, np.uint8)
        origin_img = cv2.imdecode(data, cv2.IMREAD_COLOR)
        dt_boxes = input_dict["bbox_result"]
        boxes = json.loads(dt_boxes)
        # append the whole image as one extra candidate region
        boxes.append({
            "category_id": 0,
            "score": 1.0,
            "bbox": [0, 0, origin_img.shape[1], origin_img.shape[0]]
        })
        self.det_boxes = boxes

        # construct batch images for rec
        imgs = []
        for box in boxes:
            box = [int(x) for x in box["bbox"]]
            im = origin_img[box[1]:box[3], box[0]:box[2]].copy()
            img = self.seq(im)
            imgs.append(img[np.newaxis, :].copy())

        input_imgs = np.concatenate(imgs, axis=0)
        return {"x": input_imgs}, False, None, ""

    def nms_to_rec_results(self, results, thresh=0.1):
        """Greedy NMS over recognition results, keeping the highest scores."""
        filtered_results = []
        x1 = np.array([r["bbox"][0] for r in results]).astype("float32")
        y1 = np.array([r["bbox"][1] for r in results]).astype("float32")
        x2 = np.array([r["bbox"][2] for r in results]).astype("float32")
        y2 = np.array([r["bbox"][3] for r in results]).astype("float32")
        scores = np.array([r["rec_scores"] for r in results])

        areas = (x2 - x1 + 1) * (y2 - y1 + 1)
        order = scores.argsort()[::-1]
        while order.size > 0:
            i = order[0]
            xx1 = np.maximum(x1[i], x1[order[1:]])
            yy1 = np.maximum(y1[i], y1[order[1:]])
            xx2 = np.minimum(x2[i], x2[order[1:]])
            yy2 = np.minimum(y2[i], y2[order[1:]])

            w = np.maximum(0.0, xx2 - xx1 + 1)
            h = np.maximum(0.0, yy2 - yy1 + 1)
            inter = w * h
            ovr = inter / (areas[i] + areas[order[1:]] - inter)
            inds = np.where(ovr <= thresh)[0]
            order = order[inds + 1]
            filtered_results.append(results[i])
        return filtered_results

    def postprocess(self, input_dicts, fetch_dict, log_id):
        batch_features = fetch_dict["features"]

        # L2-normalize so inner-product search equals cosine similarity
        if self.feature_normalize:
            feas_norm = np.sqrt(
                np.sum(np.square(batch_features), axis=1, keepdims=True))
            batch_features = np.divide(batch_features, feas_norm)

        scores, docs = self.searcher.search(batch_features, self.return_k)

        results = []
        for i in range(scores.shape[0]):
            pred = {}
            if scores[i][0] >= self.rec_score_thres:
                pred["bbox"] = [int(x) for x in self.det_boxes[i]["bbox"]]
                pred["rec_docs"] = self.id_map[docs[i][0]].split()[1]
                pred["rec_scores"] = scores[i][0]
                results.append(pred)

        # do nms
        results = self.nms_to_rec_results(results, self.rec_nms_thresold)
        return {"result": str(results)}, None, ""


class RecognitionService(WebService):
    def get_pipeline_response(self, read_op):
        det_op = DetOp(name="det", input_ops=[read_op])
        rec_op = RecOp(name="rec", input_ops=[det_op])
        return rec_op


product_recog_service = RecognitionService(name="recognition")
product_recog_service.prepare_pipeline_config("config.yml")
product_recog_service.run_service()
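To see the normalize-then-search step of `RecOp.postprocess` in isolation, here is a hedged standalone sketch that uses a random gallery instead of the demo index. It assumes the gallery index is an inner-product faiss index, which is why the features are L2-normalized first:

```python
import faiss
import numpy as np

d = 512                                   # feature dimension (assumed)
gallery = np.random.rand(1000, d).astype("float32")
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)

index = faiss.IndexFlatIP(d)              # inner product == cosine after normalization
index.add(gallery)

feats = np.random.rand(3, d).astype("float32")
feas_norm = np.sqrt(np.sum(np.square(feats), axis=1, keepdims=True))
feats = np.divide(feats, feas_norm)       # same normalization as RecOp

scores, docs = index.search(feats, 1)     # top-1 gallery id per detected box
print(scores.shape, docs.shape)           # (3, 1) (3, 1)
```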
@@ -81,12 +81,14 @@ class Topk(object):
            class_id_map = None
        return class_id_map

    def __call__(self, x, file_names=None, multilabel=False):
        if file_names is not None:
            assert x.shape[0] == len(file_names)
        y = []
        for idx, probs in enumerate(x):
            index = probs.argsort(axis=0)[-self.topk:][::-1].astype(
                "int32") if not multilabel else np.where(
                    probs >= 0.5)[0].astype("int32")
            clas_id_list = []
            score_list = []
            label_name_list = []
@@ -108,6 +110,14 @@ class Topk(object):
        return y


class MultiLabelTopk(Topk):
    def __init__(self, topk=1, class_id_map_file=None):
        super().__init__()

    def __call__(self, x, file_names=None):
        return super().__call__(x, file_names, multilabel=True)


class SavePreLabel(object):
    def __init__(self, save_dir):
        if save_dir is None:
@@ -128,23 +138,24 @@ class SavePreLabel(object):
        os.makedirs(output_dir, exist_ok=True)
        shutil.copy(image_file, output_dir)


class Binarize(object):
    def __init__(self, method="round"):
        self.method = method
        self.unit = np.array([[128, 64, 32, 16, 8, 4, 2, 1]]).T

    def __call__(self, x, file_names=None):
        if self.method == "round":
            x = np.round(x + 1).astype("uint8") - 1

        if self.method == "sign":
            x = ((np.sign(x) + 1) / 2).astype("uint8")

        embedding_size = x.shape[1]
        assert embedding_size % 8 == 0, "The Binary index only support vectors with sizes multiple of 8"

        byte = np.zeros([x.shape[0], embedding_size // 8], dtype=np.uint8)
        for i in range(embedding_size // 8):
            byte[:, i:i + 1] = np.dot(x[:, i * 8:(i + 1) * 8], self.unit)

        return byte
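As a quick illustration of the two selection rules above, a plain-numpy sketch (not the PaddleClas API itself):

```python
import numpy as np

probs = np.array([0.05, 0.9, 0.4, 0.7, 0.1])
topk = 3

# Topk rule: indices of the top-k scores, highest first
topk_ids = probs.argsort(axis=0)[-topk:][::-1].astype("int32")
# MultiLabelTopk rule: every class whose probability clears 0.5
multilabel_ids = np.where(probs >= 0.5)[0].astype("int32")

print(topk_ids)        # [1 3 2]
print(multilabel_ids)  # [1 3]
```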
@@ -71,7 +71,6 @@ class ClsPredictor(Predictor):
        output_names = self.paddle_predictor.get_output_names()
        output_tensor = self.paddle_predictor.get_output_handle(output_names[
            0])
        if self.benchmark:
            self.auto_logger.times.start()
        if not isinstance(images, (list, )):
@@ -119,7 +118,6 @@ def main(config):
        ) == len(image_list):
            if len(batch_imgs) == 0:
                continue
            batch_results = cls_predictor.predict(batch_imgs)
            for number, result_dict in enumerate(batch_results):
                filename = batch_names[number]
......
@@ -19,12 +19,14 @@ from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals

from functools import partial
import six
import math
import random
import cv2
import numpy as np
import importlib
from PIL import Image

from python.det_preprocess import DetNormalizeImage, DetPadStride, DetPermute, DetResize
@@ -50,6 +52,50 @@ def create_operators(params):
    return ops
class UnifiedResize(object):
    def __init__(self, interpolation=None, backend="cv2"):
        _cv2_interp_from_str = {
            'nearest': cv2.INTER_NEAREST,
            'bilinear': cv2.INTER_LINEAR,
            'area': cv2.INTER_AREA,
            'bicubic': cv2.INTER_CUBIC,
            'lanczos': cv2.INTER_LANCZOS4
        }
        _pil_interp_from_str = {
            'nearest': Image.NEAREST,
            'bilinear': Image.BILINEAR,
            'bicubic': Image.BICUBIC,
            'box': Image.BOX,
            'lanczos': Image.LANCZOS,
            'hamming': Image.HAMMING
        }

        def _pil_resize(src, size, resample):
            pil_img = Image.fromarray(src)
            pil_img = pil_img.resize(size, resample)
            return np.asarray(pil_img)

        if backend.lower() == "cv2":
            if isinstance(interpolation, str):
                interpolation = _cv2_interp_from_str[interpolation.lower()]
            # compatible with opencv < version 4.4.0
            elif not interpolation:
                interpolation = cv2.INTER_LINEAR
            self.resize_func = partial(cv2.resize, interpolation=interpolation)
        elif backend.lower() == "pil":
            if isinstance(interpolation, str):
                interpolation = _pil_interp_from_str[interpolation.lower()]
            self.resize_func = partial(_pil_resize, resample=interpolation)
        else:
            logger.warning(
                f"The backend of Resize only support \"cv2\" or \"PIL\". \"{backend}\" is unavailable. Use \"cv2\" instead."
            )
            self.resize_func = cv2.resize

    def __call__(self, src, size):
        return self.resize_func(src, size)
class OperatorParamError(ValueError):
    """ OperatorParamError
    """
@@ -87,8 +133,11 @@ class DecodeImage(object):
class ResizeImage(object):
    """ resize image """

    def __init__(self,
                 size=None,
                 resize_short=None,
                 interpolation=None,
                 backend="cv2"):
        if resize_short is not None and resize_short > 0:
            self.resize_short = resize_short
            self.w = None
@@ -101,6 +150,9 @@ class ResizeImage(object):
            raise OperatorParamError("invalid params for ResizeImage for '\
                'both 'size' and 'resize_short' are None")

        self._resize_func = UnifiedResize(
            interpolation=interpolation, backend=backend)

    def __call__(self, img):
        img_h, img_w = img.shape[:2]
        if self.resize_short is not None:
@@ -110,10 +162,7 @@ class ResizeImage(object):
        else:
            w = self.w
            h = self.h

        return self._resize_func(img, (w, h))
class CropImage(object):
@@ -145,9 +194,12 @@ class CropImage(object):
class RandCropImage(object):
    """ random crop image """

    def __init__(self,
                 size,
                 scale=None,
                 ratio=None,
                 interpolation=None,
                 backend="cv2"):
        if type(size) is int:
            self.size = (size, size)  # (h, w)
        else:
@@ -156,6 +208,9 @@ class RandCropImage(object):
        self.scale = [0.08, 1.0] if scale is None else scale
        self.ratio = [3. / 4., 4. / 3.] if ratio is None else ratio

        self._resize_func = UnifiedResize(
            interpolation=interpolation, backend=backend)

    def __call__(self, img):
        size = self.size
        scale = self.scale
@@ -181,10 +236,8 @@ class RandCropImage(object):
        j = random.randint(0, img_h - h)

        img = img[j:j + h, i:i + w, :]

        return self._resize_func(img, size)

class RandFlipImage(object):
......
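A hedged usage sketch of the new `backend` switch introduced above. The import path is an assumption based on this repo's `deploy` layout (run from the `deploy` directory) and may differ in your checkout:

```python
import numpy as np
from python.preprocess import ResizeImage  # assumed module path

resize_cv2 = ResizeImage(resize_short=256, interpolation="bilinear", backend="cv2")
resize_pil = ResizeImage(resize_short=256, interpolation="bilinear", backend="pil")

img = (np.random.rand(480, 640, 3) * 255).astype("uint8")
print(resize_cv2(img).shape)  # (256, 341, 3) -- short side scaled to 256
print(resize_pil(img).shape)  # same shape; pixel values may differ slightly per backend
```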
# classification
python3.7 python/predict_cls.py -c configs/inference_cls.yaml

# multilabel_classification
#python3.7 python/predict_cls.py -c configs/inference_multilabel_cls.yaml

# feature extractor
# python3.7 python/predict_rec.py -c configs/inference_rec.yaml
......
@@ -24,13 +24,13 @@ Accuracy and inference time of the pretrained models based on SSLD distillation are as follows.
* Server-side distillation pretrained models

| Model | Top-1 Acc | Reference<br>Top-1 Acc | Acc gain | time(ms)<br>bs=1 | time(ms)<br>bs=4 | Flops(G) | Params(M) | Download Address |
|---------------------|-----------|-----------|---------------|----------------|----------------|----------|-----------|-----------------------------------|
| ResNet34_vd_ssld | 0.797 | 0.760 | 0.037 | 2.434 | 6.222 | 7.39 | 21.82 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet34_vd_ssld_pretrained.pdparams) |
| ResNet50_vd_ssld | 0.830 | 0.792 | 0.039 | 3.531 | 8.090 | 8.67 | 25.58 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet50_vd_ssld_pretrained.pdparams) |
| ResNet101_vd_ssld | 0.837 | 0.802 | 0.035 | 6.117 | 13.762 | 16.1 | 44.57 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet101_vd_ssld_pretrained.pdparams) |
| Res2Net50_vd_26w_4s_ssld | 0.831 | 0.798 | 0.033 | 4.527 | 9.657 | 8.37 | 25.06 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net50_vd_26w_4s_ssld_pretrained.pdparams) |
| Res2Net101_vd_26w_4s_ssld | 0.839 | 0.806 | 0.033 | 8.087 | 17.312 | 16.67 | 45.22 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net101_vd_26w_4s_ssld_pretrained.pdparams) |
| Res2Net200_vd_26w_4s_ssld | 0.851 | 0.812 | 0.049 | 14.678 | 32.350 | 31.49 | 76.21 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net200_vd_26w_4s_ssld_pretrained.pdparams) |
| HRNet_W18_C_ssld | 0.812 | 0.769 | 0.043 | 7.406 | 13.297 | 4.14 | 21.29 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W18_C_ssld_pretrained.pdparams) |
| HRNet_W48_C_ssld | 0.836 | 0.790 | 0.046 | 13.707 | 34.435 | 34.58 | 77.47 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W48_C_ssld_pretrained.pdparams) |
| SE_HRNet_W64_C_ssld | 0.848 | - | - | 31.697 | 94.995 | 57.83 | 128.97 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/SE_HRNet_W64_C_ssld_pretrained.pdparams) |
@@ -38,19 +38,44 @@ Accuracy and inference time of the pretrained models based on SSLD distillation are as follows.
* Mobile-side distillation pretrained models

| Model | Top-1 Acc | Reference<br>Top-1 Acc | Acc gain | SD855 time(ms)<br>bs=1 | Flops(G) | Params(M) | Storage Size(M) | Download Address |
|---------------------|-----------|-----------|---------------|----------------|----------|-----------|-----------------|-----------------------------------|
| MobileNetV1_ssld | 0.779 | 0.710 | 0.069 | 32.523 | 1.11 | 4.19 | 16 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV1_ssld_pretrained.pdparams) |
| MobileNetV2_ssld | 0.767 | 0.722 | 0.045 | 23.318 | 0.6 | 3.44 | 14 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV2_ssld_pretrained.pdparams) |
| MobileNetV3_small_x0_35_ssld | 0.556 | 0.530 | 0.026 | 2.635 | 0.026 | 1.66 | 6.9 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_small_x0_35_ssld_pretrained.pdparams) |
| MobileNetV3_large_x1_0_ssld | 0.790 | 0.753 | 0.036 | 19.308 | 0.45 | 5.47 | 21 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_large_x1_0_ssld_pretrained.pdparams) |
| MobileNetV3_small_x1_0_ssld | 0.713 | 0.682 | 0.031 | 6.546 | 0.123 | 2.94 | 12 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_small_x1_0_ssld_pretrained.pdparams) |
| GhostNet_x1_3_ssld | 0.794 | 0.757 | 0.037 | 19.983 | 0.44 | 7.3 | 29 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/GhostNet_x1_3_ssld_pretrained.pdparams) |

* Intel-CPU-side distillation pretrained models

| Model | Top-1 Acc | Reference<br>Top-1 Acc | Acc gain | Intel-Xeon-Gold-6148 time(ms)<br>bs=1 | Flops(M) | Params(M) | Download Address |
|---------------------|-----------|-----------|---------------|----------------|----------|-----------|-----------------------------------|
| PPLCNet_x0_5_ssld | 0.661 | 0.631 | 0.030 | 2.05 | 47 | 1.9 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_5_ssld_pretrained.pdparams) |
| PPLCNet_x1_0_ssld | 0.744 | 0.713 | 0.033 | 2.46 | 161 | 3.0 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_0_ssld_pretrained.pdparams) |
| PPLCNet_x2_5_ssld | 0.808 | 0.766 | 0.042 | 5.39 | 906 | 9.0 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_5_ssld_pretrained.pdparams) |

* Note: `Reference Top-1 Acc` means the accuracy of pretrained models trained on the ImageNet1k dataset.
<a name="PPLCNet_series"></a>
### PPLCNet_series
Accuracy and inference time metrics of the PPLCNet series models are shown as follows. More detailed information can be found in the [PPLCNet series tutorial](../en/models/PPLCNet_en.md).
| Model | Top-1 Acc | Top-5 Acc | Intel-Xeon-Gold-6148 time(ms)<br>bs=1 | FLOPs(M) | Params(M) | Download Address |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| PPLCNet_x0_25 |0.5186 | 0.7565 | 1.74 | 18 | 1.5 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_25_pretrained.pdparams) |
| PPLCNet_x0_35 |0.5809 | 0.8083 | 1.92 | 29 | 1.6 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_35_pretrained.pdparams) |
| PPLCNet_x0_5 |0.6314 | 0.8466 | 2.05 | 47 | 1.9 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_5_pretrained.pdparams) |
| PPLCNet_x0_75 |0.6818 | 0.8830 | 2.29 | 99 | 2.4 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams) |
| PPLCNet_x1_0 |0.7132 | 0.9003 | 2.46 | 161 | 3.0 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_0_pretrained.pdparams) |
| PPLCNet_x1_5 |0.7371 | 0.9153 | 3.19 | 342 | 4.5 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_5_pretrained.pdparams) |
| PPLCNet_x2_0 |0.7518 | 0.9227 | 4.27 | 590 | 6.5 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_0_pretrained.pdparams) |
| PPLCNet_x2_5 |0.7660 | 0.9300 | 5.39 | 906 | 9.0 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_5_pretrained.pdparams) |
<a name="ResNet_and_Vd_series"></a>
### ResNet and Vd series
......
@@ -25,58 +25,68 @@ tar -xf NUS-SCENE-dataset.tar
cd ../../
```

## Training

```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
    --gpus="0,1,2,3" \
    tools/train.py \
        -c ./ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml
```

After training for 10 epochs, the best accuracy over the validation set should be around 0.95.

## Evaluation

```bash
python tools/eval.py \
    -c ./ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml \
    -o Arch.pretrained="./output/MobileNetV1/best_model"
```

## Prediction

```bash
python3 tools/infer.py \
    -c ./ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml \
    -o Arch.pretrained="./output/MobileNetV1/best_model"
```

You will get multiple outputs such as the following:

```
[{'class_ids': [6, 13, 17, 23, 26, 30], 'scores': [0.95683, 0.5567, 0.55211, 0.99088, 0.5943, 0.78767], 'file_name': './deploy/images/0517_2715693311.jpg', 'label_names': []}]
```

## Prediction based on prediction engine

### Export model

```bash
python3 tools/export_model.py \
    -c ./ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml \
    -o Arch.pretrained="./output/MobileNetV1/best_model"
```

By default, the inference model is exported under the current path `./inference`.

### Prediction based on prediction engine

Enter the deploy directory:

```bash
cd ./deploy
```

Run prediction with the prediction engine:

```
python3 python/predict_cls.py \
    -c configs/inference_multilabel_cls.yaml
```

You will get outputs such as the following:

```
0517_2715693311.jpg: class id(s): [6, 13, 17, 23, 26, 30], score(s): [0.96, 0.56, 0.55, 0.99, 0.59, 0.79], label_name(s): []
```
# PPLCNet series
## Overview
PPLCNet is a series of networks with excellent performance on Intel CPUs, proposed by the Baidu PaddleCV team. The authors summarize several methods that improve model accuracy on Intel CPUs while barely increasing inference time, and combine them into a new network, namely PPLCNet. Compared with other lightweight networks, PPLCNet achieves higher accuracy at the same inference time. It has shown strong competitiveness in image classification, object detection, and semantic segmentation.
## Accuracy, FLOPs and Parameters
| Models | Top1 | Top5 | FLOPs<br>(M) | Parameters<br>(M) |
|:--:|:--:|:--:|:--:|:--:|
| PPLCNet_x0_25 |0.5186 | 0.7565 | 18 | 1.5 |
| PPLCNet_x0_35 |0.5809 | 0.8083 | 29 | 1.6 |
| PPLCNet_x0_5 |0.6314 | 0.8466 | 47 | 1.9 |
| PPLCNet_x0_75 |0.6818 | 0.8830 | 99 | 2.4 |
| PPLCNet_x1_0 |0.7132 | 0.9003 | 161 | 3.0 |
| PPLCNet_x1_5 |0.7371 | 0.9153 | 342 | 4.5 |
| PPLCNet_x2_0 |0.7518 | 0.9227 | 590 | 6.5 |
| PPLCNet_x2_5 |0.7660 | 0.9300 | 906 | 9.0 |
| PPLCNet_x0_5_ssld |0.6610 | 0.8646 | 47 | 1.9 |
| PPLCNet_x1_0_ssld |0.7439 | 0.9209 | 161 | 3.0 |
| PPLCNet_x2_5_ssld |0.8082 | 0.9533 | 906 | 9.0 |
## Inference speed based on Intel(R)-Xeon(R)-Gold-6148-CPU
| Models | Crop Size | Resize Short Size | FP32<br>Batch Size=1<br>(ms) |
|------------------|-----------|-------------------|--------------------------|
| PPLCNet_x0_25 | 224 | 256 | 1.74 |
| PPLCNet_x0_35 | 224 | 256 | 1.92 |
| PPLCNet_x0_5 | 224 | 256 | 2.05 |
| PPLCNet_x0_75 | 224 | 256 | 2.29 |
| PPLCNet_x1_0 | 224 | 256 | 2.46 |
| PPLCNet_x1_5 | 224 | 256 | 3.19 |
| PPLCNet_x2_0 | 224 | 256 | 4.27 |
| PPLCNet_x2_5 | 224 | 256 | 5.39 |
| PPLCNet_x0_5_ssld | 224 | 256 | 2.05 |
| PPLCNet_x1_0_ssld | 224 | 256 | 2.46 |
| PPLCNet_x2_5_ssld | 224 | 256 | 5.39 |
...@@ -14,13 +14,13 @@ After preparing the configuration file, The training process can be started in t ...@@ -14,13 +14,13 @@ After preparing the configuration file, The training process can be started in t
``` ```
python tools/train.py \ python tools/train.py \
-c configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \ -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
-o pretrained_model="" \ -o Arch.pretrained=False \
-o use_gpu=False -o Global.device=gpu
``` ```
Among them, `-c` is used to specify the path of the configuration file, `-o` is used to specify the parameters needed to be modified or added, `-o pretrained_model=""` means to not using pre-trained models. Among them, `-c` is used to specify the path of the configuration file, `-o` is used to specify the parameters needed to be modified or added, `-o Arch.pretrained=False` means to not using pre-trained models.
`-o use_gpu=True` means to use GPU for training. If you want to use the CPU for training, you need to set `use_gpu` to `False`. `-o Global.device=gpu` means to use GPU for training. If you want to use the CPU for training, you need to set `Global.device` to `cpu`.
Of course, you can also directly modify the configuration file to update the configuration. For specific configuration parameters, please refer to [Configuration Document](config_description_en.md). Of course, you can also directly modify the configuration file to update the configuration. For specific configuration parameters, please refer to [Configuration Document](config_description_en.md).
...@@ -54,12 +54,12 @@ After configuring the configuration file, you can finetune it by loading the pre ...@@ -54,12 +54,12 @@ After configuring the configuration file, you can finetune it by loading the pre
``` ```
python tools/train.py \ python tools/train.py \
-c configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \ -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
-o pretrained_model="./pretrained/MobileNetV3_large_x1_0_pretrained" \ -o Arch.pretrained=True \
-o use_gpu=True -o Global.device=gpu
``` ```
Among them, `-o pretrained_model` is used to set the address to load the pretrained weights. When using it, you need to replace it with your own pretrained weights' path, or you can modify the path directly in the configuration file. Among them, `-o Arch.pretrained` is used to set the address to load the pretrained weights. When using it, you need to replace it with your own pretrained weights' path, or you can modify the path directly in the configuration file. You can also set it into `True` to use pretrained weights that trained in ImageNet1k.
We also provide a lot of pre-trained models trained on the ImageNet-1k dataset. For the model list and download address, please refer to the [model library overview](../models/models_intro_en.md). We also provide a lot of pre-trained models trained on the ImageNet-1k dataset. For the model list and download address, please refer to the [model library overview](../models/models_intro_en.md).
...@@ -69,28 +69,26 @@ If the training process is terminated for some reasons, you can also load the ch ...@@ -69,28 +69,26 @@ If the training process is terminated for some reasons, you can also load the ch
``` ```
python tools/train.py \ python tools/train.py \
-c configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \ -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
-o checkpoints="./output/MobileNetV3_large_x1_0/5/ppcls" \ -o Global.checkpoints="./output/MobileNetV3_large_x1_0/epoch_5" \
-o last_epoch=5 \ -o Global.device=gpu
-o use_gpu=True
``` ```
The configuration file does not need to be modified. You only need to add the `Global.checkpoints` parameter during training, which represents the path of the checkpoints. The parameter weights, learning rate, optimizer and other information will be loaded using this parameter.

**Note**:
* The `-o Global.checkpoints` parameter does not need to include the suffix of the checkpoints. The above training command will generate the checkpoints shown below during the training process. If you want to continue training from epoch `5`, just set `Global.checkpoints` to `./output/MobileNetV3_large_x1_0/epoch_5`, and PaddleClas will automatically fill in the `pdopt` and `pdparams` suffixes.
```shell
output
├── MobileNetV3_large_x1_0
│   ├── best_model.pdopt
│   ├── best_model.pdparams
│   ├── best_model.pdstates
│   ├── epoch_1.pdopt
│   ├── epoch_1.pdparams
│   ├── epoch_1.pdstates
│   .
│   .
│   .
```
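Given this layout, the checkpoint prefix passed to `Global.checkpoints` is just the file path without its suffix. The small helper below is not part of PaddleClas; it is only a sketch of how the newest epoch prefix could be located automatically:

```python
import glob
import os
import re

def latest_checkpoint(output_dir="./output/MobileNetV3_large_x1_0"):
    # Collect epoch numbers from files named like epoch_5.pdparams.
    epochs = []
    for path in glob.glob(os.path.join(output_dir, "epoch_*.pdparams")):
        match = re.search(r"epoch_(\d+)\.pdparams$", path)
        if match:
            epochs.append(int(match.group(1)))
    if not epochs:
        return None
    # Return the prefix without suffix, as Global.checkpoints expects.
    return os.path.join(output_dir, "epoch_%d" % max(epochs))

print(latest_checkpoint())  # e.g. ./output/MobileNetV3_large_x1_0/epoch_5
```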
### 1.4 Model evaluation

The model evaluation process can be started as follows.
```bash
python tools/eval.py \
    -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
    -o Global.pretrained_model=./output/MobileNetV3_large_x1_0/best_model
```
The above command will use `./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml` as the configuration file to evaluate the model `./output/MobileNetV3_large_x1_0/best_model`. You can also set the evaluation by changing the parameters in the configuration file, or you can update the configuration with the `-o` parameter, as shown above.

Some of the configurable evaluation parameters are described as follows:
* `Arch.name`: Model name
* `Global.pretrained_model`: The path of the model file to be evaluated
**Note:** If the model is a dygraph type, you only need to specify the prefix of the model file when loading the model, instead of specifying the suffix, as in [1.3 Resume Training](#13-resume-training).
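In other words, a prefix such as `./output/MobileNetV3_large_x1_0/best_model` stands for `best_model.pdparams` plus `best_model.pdopt`. A minimal sketch of loading dygraph weights by prefix (the path is an example):

```python
import paddle

prefix = "./output/MobileNetV3_large_x1_0/best_model"
# Weights only; the ".pdopt" file holds the optimizer state used for resuming.
state_dict = paddle.load(prefix + ".pdparams")
# model.set_state_dict(state_dict)  # apply to an instantiated model of the same architecture
print(len(state_dict), "parameter tensors loaded")
```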
## 2. Training and evaluation on Linux+GPU

If you want to run PaddleClas on Linux with GPU, it is highly recommended to use `paddle.distributed.launch` to start the training with multiple GPUs, as described below.
### 2.1 Model training

After preparing the configuration file, the training process can be started in the following way. `paddle.distributed.launch` specifies the GPU cards to use through the `--gpus` parameter:
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
    --gpus="0,1,2,3" \
    tools/train.py \
    -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml
```
The format of the output log information is the same as above; see [1.1 Model training](#11-model-training) for details.
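For reference, a minimal sketch of what each process started by `paddle.distributed.launch` does (this is illustrative, not the PaddleClas training code): the launcher spawns one Python process per card listed in `--gpus`, and each process initializes the parallel environment and wraps its model so gradients are synchronized:

```python
import paddle
import paddle.distributed as dist

def main():
    # Called once in every process spawned by `paddle.distributed.launch`.
    dist.init_parallel_env()
    model = paddle.vision.models.resnet18()  # stand-in for the configured model
    model = paddle.DataParallel(model)       # all-reduce gradients across cards
    # ...the usual forward/backward/optimizer loop follows, unchanged.

if __name__ == "__main__":
    main()
```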
### 2.2 Model finetuning

After configuring the configuration file, you can finetune it by loading the pretrained weights:
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
    --gpus="0,1,2,3" \
    tools/train.py \
    -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
    -o Arch.pretrained=True
```
Among them, `Arch.pretrained` can be set to `True` or `False`, and it can also be used to set the path from which to load the pretrained weights. When using it, you need to replace it with your own pretrained weights' path, or you can modify the path directly in the configuration file.

There are many examples of model finetuning in [Quick Start](./quick_start_en.md). You can refer to this tutorial to finetune the model on a specific dataset.
### 2.3 Resume training

If the training process is terminated for some reason, you can also load the checkpoints to continue training:
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
    --gpus="0,1,2,3" \
    tools/train.py \
    -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
    -o Global.checkpoints="./output/MobileNetV3_large_x1_0/epoch_5" \
    -o Global.device=gpu
```
The configuration file does not need to be modified. You only need to add the `Global.checkpoints` parameter during training, which represents the path of the checkpoints. The parameter weights, learning rate, optimizer and other information will be loaded using this parameter, as described in [1.3 Resume training](#13-resume-training).
### 2.4 Model evaluation

The model evaluation process can be started as follows.
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
    tools/eval.py \
    -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
    -o Global.pretrained_model=./output/MobileNetV3_large_x1_0/best_model
```
For parameter descriptions, see [1.4 Model evaluation](#14-model-evaluation) for details.

## 3. Use the pre-trained model to predict
After the training is completed, you can predict by using the pre-trained model obtained by the training, as follows:
```bash
python3 tools/infer.py \
    -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
    -o Infer.infer_imgs=dataset/flowers102/jpg/image_00001.jpg \
    -o Global.pretrained_model=./output/MobileNetV3_large_x1_0/best_model
```
Among them:
+ `Infer.infer_imgs`: The path of the image file or folder to be predicted;
+ `Global.pretrained_model`: The weight file path, such as `./output/MobileNetV3_large_x1_0/best_model`.
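The prediction result is a list of Top-k class ids with their scores. A minimal sketch of how such an output can be derived from the model's logits (a plain softmax Top-k, shown for illustration rather than the exact `tools/infer.py` code):

```python
import numpy as np

def topk_result(logits, k=5):
    # Softmax over the class dimension, then keep the k highest-scoring classes.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    class_ids = probs.argsort()[::-1][:k]
    return class_ids.tolist(), probs[class_ids].round(5).tolist()

logits = np.random.randn(102)  # flowers102 has 102 classes
class_ids, scores = topk_result(logits)
print({"class_ids": class_ids, "scores": scores})
```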
<a name="model_inference"></a>
## 4. Use the inference model to predict

PaddlePaddle supports inference using prediction engines, which will be introduced next.

First, export the inference model using `tools/export_model.py`.
```bash
python3 tools/export_model.py \
    -c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
    -o Global.pretrained_model=output/MobileNetV3_large_x1_0/best_model
```
Among them, the `Global.pretrained_model` parameter is used to specify the model file path; the path does not need to include the model file suffix.

The above command will generate the model structure file (`inference.pdmodel`) and the model weight file (`inference.pdiparams`), and then the inference engine can be used for inference:
Go to the deploy directory:
```
cd deploy
```
Use the inference engine to predict. Because the mapping file of the ImageNet1k dataset is used by default, we should set `PostProcess.Topk.class_id_map_file` to `None`.
```bash
python3 python/predict_cls.py \
    -c configs/inference_cls.yaml \
    -o Global.infer_imgs=../dataset/flowers102/jpg/image_00001.jpg \
    -o Global.inference_model_dir=../inference/ \
    -o PostProcess.Topk.class_id_map_file=None
```
Among them:
+ `Global.infer_imgs`: The path of the image file to be predicted;
+ `Global.inference_model_dir`: The directory of the inference model files, such as `../inference/`, which contains `inference.pdmodel` and `inference.pdiparams`;
+ `Global.use_tensorrt`: Whether to use TensorRT, `False` by default;
+ `Global.use_gpu`: Whether to use the GPU, `True` by default;
+ `Global.enable_mkldnn`: Whether to use MKL-DNN, `False` by default. It is valid only when `Global.use_gpu` is `False`;
+ `Global.use_fp16`: Whether to enable FP16, `False` by default.
**Note**: If you want to use Transformer series models, such as `DeiT_***_384`, `ViT_***_384`, etc., please pay attention to the input size of the model: you need to set `resize_short=384` and `resize=384`.

If you want to evaluate the speed of the model, it is recommended to enable TensorRT for acceleration on GPU, and MKL-DNN on CPU.
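Under the hood, `python/predict_cls.py` drives the Paddle Inference API. A condensed sketch of that flow (the paths and GPU memory size are illustrative; the real script also applies the preprocessing ops from the YAML file):

```python
import numpy as np
from paddle.inference import Config, create_predictor

# Build the predictor from the exported structure and weight files.
config = Config("../inference/inference.pdmodel", "../inference/inference.pdiparams")
config.enable_use_gpu(8000, 0)       # 8000 MB GPU memory on card 0; omit for CPU
predictor = create_predictor(config)

# Feed one preprocessed image (NCHW float32) and fetch the class scores.
input_handle = predictor.get_input_handle(predictor.get_input_names()[0])
input_handle.copy_from_cpu(np.random.rand(1, 3, 224, 224).astype("float32"))
predictor.run()
output_handle = predictor.get_output_handle(predictor.get_output_names()[0])
print(output_handle.copy_to_cpu().shape)  # (1, num_classes)
```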
`-c` is used to specify the path to the configuration file, and `-o` is used to specify the parameters that need to be modified or added. `-o Arch.Backbone.pretrained=True` indicates that the Backbone part uses the pre-trained model; in addition, `Arch.Backbone.pretrained` can also specify the path of a specific model weight file, which needs to be replaced with the path to your own pre-trained model weight file when using it. `-o Global.device=gpu` indicates that the GPU is used for training. If you want to use a CPU for training, you need to set `Global.device` to `cpu`.

For more detailed training configuration, you can also modify the corresponding configuration file of the model directly. Refer to the [configuration document](config_description_en.md) for specific configuration parameters.

Run the above commands to check the output log, an example of which is as follows:
| Model | Top-1 Acc | Reference<br>Top-1 Acc | Acc gain | time(ms)<br>bs=1 | time(ms)<br>bs=4 | Flops(G) | Params(M) | Download link |
|---------------------|-----------|-----------|---------------|----------------|-----------|----------|-----------|-----------------------------------|
| ResNet34_vd_ssld | 0.797 | 0.760 | 0.037 | 2.434 | 6.222 | 7.39 | 21.82 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet34_vd_ssld_pretrained.pdparams) |
| ResNet50_vd_ssld | 0.830 | 0.792 | 0.039 | 3.531 | 8.090 | 8.67 | 25.58 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet50_vd_ssld_pretrained.pdparams) |
| ResNet101_vd_ssld | 0.837 | 0.802 | 0.035 | 6.117 | 13.762 | 16.1 | 44.57 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet101_vd_ssld_pretrained.pdparams) |
| Res2Net50_vd_26w_4s_ssld | 0.831 | 0.798 | 0.033 | 4.527 | 9.657 | 8.37 | 25.06 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net50_vd_26w_4s_ssld_pretrained.pdparams) |
| Res2Net101_vd_26w_4s_ssld | 0.839 | 0.806 | 0.033 | 8.087 | 17.312 | 16.67 | 45.22 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net101_vd_26w_4s_ssld_pretrained.pdparams) |
| Res2Net200_vd_26w_4s_ssld | 0.851 | 0.812 | 0.049 | 14.678 | 32.350 | 31.49 | 76.21 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net200_vd_26w_4s_ssld_pretrained.pdparams) |
| HRNet_W18_C_ssld | 0.812 | 0.769 | 0.043 | 7.406 | 13.297 | 4.14 | 21.29 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W18_C_ssld_pretrained.pdparams) |
| Model | Top-1 Acc | Reference<br>Top-1 Acc | Acc gain | SD855 time(ms)<br>bs=1 | Flops(G) | Params(M) | Model size (M) | Download link |
|---------------------|-----------|-----------|---------------|----------------|-----------|----------|-----------|-----------------------------------|
| MobileNetV1_ssld | 0.779 | 0.710 | 0.069 | 32.523 | 1.11 | 4.19 | 16 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV1_ssld_pretrained.pdparams) |
| MobileNetV2_ssld | 0.767 | 0.722 | 0.045 | 23.318 | 0.6 | 3.44 | 14 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV2_ssld_pretrained.pdparams) |
| MobileNetV3_small_x0_35_ssld | 0.556 | 0.530 | 0.026 | 2.635 | 0.026 | 1.66 | 6.9 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_small_x0_35_ssld_pretrained.pdparams) |
| MobileNetV3_large_x1_0_ssld | 0.790 | 0.753 | 0.036 | 19.308 | 0.45 | 5.47 | 21 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_large_x1_0_ssld_pretrained.pdparams) |
| MobileNetV3_small_x1_0_ssld | 0.713 | 0.682 | 0.031 | 6.546 | 0.123 | 2.94 | 12 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_small_x1_0_ssld_pretrained.pdparams) |
| GhostNet_x1_3_ssld | 0.794 | 0.757 | 0.037 | 19.983 | 0.44 | 7.3 | 29 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/GhostNet_x1_3_ssld_pretrained.pdparams) |
* Knowledge distillation models for Intel CPU

| Model | Top-1 Acc | Reference<br>Top-1 Acc | Acc gain | Intel-Xeon-Gold-6148 time(ms)<br>bs=1 | Flops(M) | Params(M) | Download link |
|---------------------|-----------|-----------|---------------|----------------|----------|-----------|-----------------------------------|
| PPLCNet_x0_5_ssld | 0.661 | 0.631 | 0.030 | 2.05 | 47 | 1.9 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_5_ssld_pretrained.pdparams) |
| PPLCNet_x1_0_ssld | 0.744 | 0.713 | 0.033 | 2.46 | 161 | 3.0 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_0_ssld_pretrained.pdparams) |
| PPLCNet_x2_5_ssld | 0.808 | 0.766 | 0.042 | 5.39 | 906 | 9.0 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_5_ssld_pretrained.pdparams) |

* Note: `Reference Top-1 Acc` denotes the accuracy of the pretrained model trained by PaddleClas on the ImageNet1k dataset.
<a name="PPLCNet系列"></a>
### PPLCNet系列
PPLCNet系列模型的精度、速度指标如下表所示,更多关于该系列的模型介绍可以参考:[PPLCNet系列模型文档](./models/PPLCNet.md)
| 模型 | Top-1 Acc | Top-5 Acc | Intel-Xeon-Gold-6148 time(ms)<br>bs=1 | FLOPs(M) | Params(M) | 下载地址 |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| PPLCNet_x0_25 |0.5186 | 0.7565 | 1.74 | 18 | 1.5 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_25_pretrained.pdparams) |
| PPLCNet_x0_35 |0.5809 | 0.8083 | 1.92 | 29 | 1.6 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_35_pretrained.pdparams) |
| PPLCNet_x0_5 |0.6314 | 0.8466 | 2.05 | 47 | 1.9 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_5_pretrained.pdparams) |
| PPLCNet_x0_75 |0.6818 | 0.8830 | 2.29 | 99 | 2.4 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams) |
| PPLCNet_x1_0 |0.7132 | 0.9003 | 2.46 | 161 | 3.0 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_0_pretrained.pdparams) |
| PPLCNet_x1_5 |0.7371 | 0.9153 | 3.19 | 342 | 4.5 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_5_pretrained.pdparams) |
| PPLCNet_x2_0 |0.7518 | 0.9227 | 4.27 | 590 | 6.5 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_0_pretrained.pdparams) |
| PPLCNet_x2_5 |0.7660 | 0.9300 | 5.39 | 906 | 9.0 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_5_pretrained.pdparams) |
<a name="ResNet及其Vd系列"></a> <a name="ResNet及其Vd系列"></a>
### ResNet及其Vd系列 ### ResNet及其Vd系列
| Model | Top-1 Acc | Top-5 Acc | time(ms)<br>bs=1 | time(ms)<br>bs=4 | Flops(G) | Params(M) | Download link |
| ---------- | --------- | --------- | ---------------- | ---------------- | -------- | --------- | ------------------------------------------------------------ |
| TNT_small | 0.8121 | 0.9563 | | | 5.2 | 23.8 | [Download](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/TNT_small_pretrained.pdparams) |

**Note**: In the data preprocessing of the TNT model, both `mean` and `std` in `NormalizeImage` are 0.5.
```shell
tar -xf NUS-SCENE-dataset.tar
cd ../../
```
## 2. Model training

```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
    --gpus="0,1,2,3" \
    tools/train.py \
        -c ./ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml
```

After training for 10 epochs, the best accuracy on the validation set should be around 0.95.

## 3. Model evaluation

```bash
python3 tools/eval.py \
    -c ./ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml \
    -o Arch.pretrained="./output/MobileNetV1/best_model"
```

## 4. Model prediction

```bash
python3 tools/infer.py \
    -c ./ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml \
    -o Arch.pretrained="./output/MobileNetV1/best_model"
```

You will get output similar to the following:

```
[{'class_ids': [6, 13, 17, 23, 26, 30], 'scores': [0.95683, 0.5567, 0.55211, 0.99088, 0.5943, 0.78767], 'file_name': './deploy/images/0517_2715693311.jpg', 'label_names': []}]
```

## 5. Prediction with the inference engine

### 5.1 Export the inference model

```bash
python3 tools/export_model.py \
    -c ./ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml \
    -o Arch.pretrained="./output/MobileNetV1/best_model"
```

By default, the inference model is saved in `./inference` under the current path.

### 5.2 Prediction with the inference engine

First, enter the deploy directory:

```bash
cd ./deploy
```

Predict through the inference engine:

```
python3 python/predict_cls.py \
    -c configs/inference_multilabel_cls.yaml
```

You will get output similar to the following:

```
0517_2715693311.jpg: class id(s): [6, 13, 17, 23, 26, 30], score(s): [0.96, 0.56, 0.55, 0.99, 0.59, 0.79], label_name(s): []
```
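The `class id(s)` and `score(s)` above come from a multi-label post-process: instead of a softmax Top-k, each class gets an independent sigmoid probability and every class above a threshold is kept. A minimal sketch of this logic (the 0.5 threshold and the logits shape are assumptions for illustration, not the exact PaddleClas implementation):

```python
import numpy as np

def multilabel_postprocess(logits, threshold=0.5):
    # Sigmoid turns each class logit into an independent probability.
    probs = 1.0 / (1.0 + np.exp(-logits))
    class_ids = np.where(probs > threshold)[0]
    return class_ids.tolist(), np.round(probs[class_ids], 5).tolist()

# 33 scene classes in the NUS-WIDE-SCENE subset used above.
logits = np.random.randn(33)
class_ids, scores = multilabel_postprocess(logits)
print({'class_ids': class_ids, 'scores': scores})
```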
* There are many experts in image classification, recognition and retrieval, and models and papers are updated very quickly. The answers in this document mainly rely on limited project experience, so omissions are hard to avoid. If anything is missing or incorrect, we sincerely hope that knowledgeable readers will help supplement and correct it. Many thanks.

## Contents

* [Recent updates](#近期更新) (2021.09.08)
* [Highlights](#精选)
* [1. Theory](#1.理论篇)
    * [1.1 PaddleClas basics](#1.1PaddleClas基础知识)
<a name="近期更新"></a> <a name="近期更新"></a>
## 近期更新 ## 近期更新
#### Q2.1.7: During training, the error `ERROR: Unexpected segmentation fault encountered in DataLoader workers.` is reported. How can this be diagnosed and resolved?
**A**: Try setting the `num_workers` field in the training configuration file to `0`; try reducing the `batch_size` field in the training configuration file; and check whether the dataset format and the dataset path in the configuration file are correct.

#### Q2.1.8: How to use `Mixup` and `Cutmix` during training?
**A**:
* For the usage of `Mixup`, please refer to [Mixup](https://github.com/PaddlePaddle/PaddleClas/blob/cf9fc9363877f919996954a63716acfb959619d0/ppcls/configs/ImageNet/DataAugment/ResNet50_Mixup.yaml#L63-L65); for `Cutmix`, refer to [Cutmix](https://github.com/PaddlePaddle/PaddleClas/blob/cf9fc9363877f919996954a63716acfb959619d0/ppcls/configs/ImageNet/DataAugment/ResNet50_Cutmix.yaml#L63-L65).
* When using `Mixup` or `Cutmix`, note that:
    * `Loss.Train.CELoss` in the configuration file needs to be changed to `Loss.Train.MixCELoss`; see [MixCELoss](https://github.com/PaddlePaddle/PaddleClas/blob/cf9fc9363877f919996954a63716acfb959619d0/ppcls/configs/ImageNet/DataAugment/ResNet50_Cutmix.yaml#L23-L26);
    * The training accuracy (Acc) metric cannot be computed when training with `Mixup` or `Cutmix`, so the `Metric.Train.TopkAcc` field needs to be removed from the configuration file; see [Metric.Train.TopkAcc](https://github.com/PaddlePaddle/PaddleClas/blob/cf9fc9363877f919996954a63716acfb959619d0/ppcls/configs/ImageNet/DataAugment/ResNet50_Cutmix.yaml#L125-L128).

#### Q2.1.9: In the training configuration yaml file, what are the fields `Global.pretrained_model` and `Global.checkpoints` used for?
**A**:
* When fine-tuning is needed, the path of the pretrained weight file can be configured via the field `Global.pretrained_model`; the pretrained weight file suffix is usually `.pdparams`;
* During training, the training program automatically saves checkpoint information at the end of each epoch, including the optimizer information `.pdopt` and the model weights `.pdparams`. When training needs to be resumed after an unexpected interruption, the checkpoint saved during training can be configured via the field `Global.checkpoints`. For example, configuring `checkpoints: ./output/ResNet18/epoch_18` restores the checkpoint saved at the end of epoch 18: PaddleClas will automatically load `epoch_18.pdopt` and `epoch_18.pdparams` and continue training from epoch 19.

#### Q2.6.3: How to convert a model to `ONNX` format?
**A**: Paddle supports two ways of converting a model to ONNX format, both relying on the `paddle2onnx` tool, so first install `paddle2onnx`:

```shell
pip install paddle2onnx
```

* Converting an inference model to an ONNX model:

Taking a `combined`-format inference model exported from a dynamic graph (containing both `.pdmodel` and `.pdiparams` files) as an example, run the following command to convert the model format:

```shell
paddle2onnx --model_dir ${model_path} --model_filename ${model_path}/inference.pdmodel --params_filename ${model_path}/inference.pdiparams --save_file ${save_path}/model.onnx --enable_onnx_checker True
```

In the above command:
* `model_dir`: the directory that needs to contain the `.pdmodel` and `.pdiparams` files;
* `model_filename`: the path of the `.pdmodel` file under `model_dir`;
* `params_filename`: the path of the `.pdiparams` file under `model_dir`;
* `save_file`: the directory path where the converted model is saved.

For converting a non-`combined` inference model exported from a static graph (usually a `__model__` file plus multiple parameter files), and for more parameter descriptions, please refer to the official paddle2onnx documentation [paddle2onnx](https://github.com/PaddlePaddle/Paddle2ONNX/blob/develop/README_zh.md#%E5%8F%82%E6%95%B0%E9%80%89%E9%A1%B9).

* Exporting an ONNX model directly from model definition code:

Taking dynamic-graph model code as an example, the model class is a subclass of `paddle.nn.Layer` and the code is as follows:

```python
import paddle
from paddle.static import InputSpec

class SimpleNet(paddle.nn.Layer):
    def __init__(self):
        super().__init__()  # initialize the base Layer so the class can be instantiated
    def forward(self, x):
        return x  # placeholder forward; a real model defines its computation here

net = SimpleNet()
x_spec = InputSpec(shape=[None, 3, 224, 224], dtype='float32', name='x')
paddle.onnx.export(layer=net, path="./SimpleNet", input_spec=[x_spec])
```

Among them:
* The `InputSpec()` function describes the signature of the model input, including the `shape`, `type` and `name` (which can be omitted) of the input data;
* The `paddle.onnx.export()` function requires the model object `net`, the save path of the exported model `save_path`, and the description of the model input `input_spec`.

Note that the `paddlepaddle` version must be greater than `2.0.0`. For more parameter descriptions of `paddle.onnx.export()`, please refer to [paddle.onnx.export](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/onnx/export_cn.html#export).

#### Q2.5.4: When building the retrieval gallery, how should the parameter `pq_size` be set?
**A**: `pq_size` is a parameter of the PQ retrieval algorithm. PQ retrieval can be roughly understood as a "hierarchical" retrieval algorithm in which `pq_size` is the "capacity" of each level, so this parameter affects retrieval performance. However, when the total amount of gallery data is not large (fewer than 10,000 images), it has little impact on performance, so for most use cases it does not need to be modified when building the gallery. For more details on the PQ retrieval algorithm, see the related [paper](https://lear.inrialpes.fr/pubs/2011/JDS11/jegou_searching_with_quantization.pdf).
<a name="精选"></a> <a name="精选"></a>
## 精选 ## 精选
2. The best model (`best_model.pdopt`, `best_model.pdparams`, `best_model.pdstates`);
3. The checkpoint at the end of each epoch during training (`epoch_xxx.pdopt`, `epoch_xxx.pdparams`, `epoch_xxx.pdstates`). The `Global.save_interval` field in the training configuration file specifies the save interval of these checkpoints. If it is set larger than the total number of epochs, intermediate checkpoints will no longer be saved.
#### Q2.1.7: During training, the error `ERROR: Unexpected segmentation fault encountered in DataLoader workers.` is reported. How can this be diagnosed and resolved?
**A**: Try setting the `num_workers` field in the training configuration file to `0`; try reducing the `batch_size` field in the training configuration file; and check whether the dataset format and the dataset path in the configuration file are correct.

#### Q2.1.8: How to use `Mixup` and `Cutmix` during training?
**A**:
* For the usage of `Mixup`, please refer to [Mixup](https://github.com/PaddlePaddle/PaddleClas/blob/cf9fc9363877f919996954a63716acfb959619d0/ppcls/configs/ImageNet/DataAugment/ResNet50_Mixup.yaml#L63-L65); for `Cutmix`, refer to [Cutmix](https://github.com/PaddlePaddle/PaddleClas/blob/cf9fc9363877f919996954a63716acfb959619d0/ppcls/configs/ImageNet/DataAugment/ResNet50_Cutmix.yaml#L63-L65).
* When using `Mixup` or `Cutmix`, note that:
    * `Loss.Train.CELoss` in the configuration file needs to be changed to `Loss.Train.MixCELoss`; see [MixCELoss](https://github.com/PaddlePaddle/PaddleClas/blob/cf9fc9363877f919996954a63716acfb959619d0/ppcls/configs/ImageNet/DataAugment/ResNet50_Cutmix.yaml#L23-L26);
    * The training accuracy (Acc) metric cannot be computed when training with `Mixup` or `Cutmix`, so the `Metric.Train.TopkAcc` field needs to be removed from the configuration file; see [Metric.Train.TopkAcc](https://github.com/PaddlePaddle/PaddleClas/blob/cf9fc9363877f919996954a63716acfb959619d0/ppcls/configs/ImageNet/DataAugment/ResNet50_Cutmix.yaml#L125-L128).

#### Q2.1.9: In the training configuration yaml file, what are the fields `Global.pretrained_model` and `Global.checkpoints` used for?
**A**:
* When fine-tuning is needed, the path of the pretrained weight file can be configured via the field `Global.pretrained_model`; the pretrained weight file suffix is usually `.pdparams`;
* During training, the training program automatically saves checkpoint information at the end of each epoch, including the optimizer information `.pdopt` and the model weights `.pdparams`. When training needs to be resumed after an unexpected interruption, the checkpoint saved during training can be configured via the field `Global.checkpoints`. For example, configuring `checkpoints: ./output/ResNet18/epoch_18` restores the checkpoint saved at the end of epoch 18: PaddleClas will automatically load `epoch_18.pdopt` and `epoch_18.pdparams` and continue training from epoch 19.
<a name="2.2图像分类"></a> <a name="2.2图像分类"></a>
### 2.2 图像分类 ### 2.2 图像分类
#### Q2.5.3: When recompiling index.so on Mac, the following error is reported: clang: error: unsupported option '-fopenmp'. How to deal with it?
**A**: This problem has been solved. You can refer to the [documentation](../../../develop/deploy/vector_search/README.md) to recompile index.so.

#### Q2.5.4: When building the retrieval gallery, how should the parameter `pq_size` be set?
**A**: `pq_size` is a parameter of the PQ retrieval algorithm. PQ retrieval can be roughly understood as a "hierarchical" retrieval algorithm in which `pq_size` is the "capacity" of each level, so this parameter affects retrieval performance. However, when the total amount of gallery data is not large (fewer than 10,000 images), it has little impact on performance, so for most use cases it does not need to be modified when building the gallery. For more details on the PQ retrieval algorithm, see the related [paper](https://lear.inrialpes.fr/pubs/2011/JDS11/jegou_searching_with_quantization.pdf).
<a name="2.6模型预测部署"></a> <a name="2.6模型预测部署"></a>
### 2.6 模型预测部署 ### 2.6 模型预测部署
```
UserWarning: Skip loading for ***. *** is not found in the provided dict.
```
If this warning appears, the model weights were not loaded successfully. Please further check whether the `Global.pretrained_model` field in the configuration file is correctly set to the path of the model weight file. The model weight file suffix is usually `pdparams`; note that the file suffix does not need to be included when configuring this path.
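To locate the mismatch behind such warnings, one can diff the checkpoint keys against the model's parameter names. A hedged sketch (the model and the path are placeholders for your own):

```python
import paddle

model = paddle.vision.models.resnet18()              # placeholder for your model
ckpt = paddle.load("./output/ResNet18/best_model.pdparams")
missing = set(model.state_dict()) - set(ckpt)        # params the checkpoint lacks
unexpected = set(ckpt) - set(model.state_dict())     # checkpoint keys the model lacks
print("missing:", sorted(missing)[:5])
print("unexpected:", sorted(unexpected)[:5])
```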
#### Q2.6.3: How to convert a model to `ONNX` format?
**A**: Paddle supports two ways of converting a model to ONNX format, both relying on the `paddle2onnx` tool, so first install `paddle2onnx`:

```shell
pip install paddle2onnx
```

* Converting an inference model to an ONNX model:

Taking a `combined`-format inference model exported from a dynamic graph (containing both `.pdmodel` and `.pdiparams` files) as an example, run the following command to convert the model format:

```shell
paddle2onnx --model_dir ${model_path} --model_filename ${model_path}/inference.pdmodel --params_filename ${model_path}/inference.pdiparams --save_file ${save_path}/model.onnx --enable_onnx_checker True
```

In the above command:
* `model_dir`: the directory that needs to contain the `.pdmodel` and `.pdiparams` files;
* `model_filename`: the path of the `.pdmodel` file under `model_dir`;
* `params_filename`: the path of the `.pdiparams` file under `model_dir`;
* `save_file`: the directory path where the converted model is saved.

For converting a non-`combined` inference model exported from a static graph (usually a `__model__` file plus multiple parameter files), and for more parameter descriptions, please refer to the official paddle2onnx documentation [paddle2onnx](https://github.com/PaddlePaddle/Paddle2ONNX/blob/develop/README_zh.md#%E5%8F%82%E6%95%B0%E9%80%89%E9%A1%B9).

* Exporting an ONNX model directly from model definition code:

Taking dynamic-graph model code as an example, the model class is a subclass of `paddle.nn.Layer` and the code is as follows:

```python
import paddle
from paddle.static import InputSpec

class SimpleNet(paddle.nn.Layer):
    def __init__(self):
        super().__init__()  # initialize the base Layer so the class can be instantiated
    def forward(self, x):
        return x  # placeholder forward; a real model defines its computation here

net = SimpleNet()
x_spec = InputSpec(shape=[None, 3, 224, 224], dtype='float32', name='x')
paddle.onnx.export(layer=net, path="./SimpleNet", input_spec=[x_spec])
```

Among them:
* The `InputSpec()` function describes the signature of the model input, including the `shape`, `type` and `name` (which can be omitted) of the input data;
* The `paddle.onnx.export()` function requires the model object `net`, the save path of the exported model `save_path`, and the description of the model input `input_spec`.

Note that the `paddlepaddle` version must be greater than `2.0.0`. For more parameter descriptions of `paddle.onnx.export()`, please refer to [paddle.onnx.export](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/onnx/export_cn.html#export).
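After conversion, it can help to sanity-check the exported file. The sketch below assumes the `onnxruntime` package is installed and that a `model.onnx` with a 1x3x224x224 float input named `x` was produced as above; it simply runs one forward pass:

```python
import numpy as np
import onnxruntime as ort

# Load the exported ONNX model and run a dummy forward pass to verify it.
sess = ort.InferenceSession("model.onnx")
input_name = sess.get_inputs()[0].name          # "x" for the export example above
dummy = np.random.rand(1, 3, 224, 224).astype("float32")
outputs = sess.run(None, {input_name: dummy})   # None -> fetch all outputs
print([o.shape for o in outputs])
```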
# PPLCNet series

## Overview

The PPLCNet series is a family of networks proposed by the Baidu PaddleCV team that perform particularly well on Intel CPUs. The authors summarized several methods that improve model accuracy on Intel CPUs while adding almost no inference latency, and combined them into a new network, PPLCNet. Compared with other lightweight networks, PPLCNet achieves higher accuracy at the same latency. PPLCNet has shown strong competitiveness in image classification, object detection and semantic segmentation.

## Accuracy, FLOPs and parameter counts

| Models | Top1 | Top5 | FLOPs<br>(M) | Parameters<br>(M) |
|:--:|:--:|:--:|:--:|:--:|
| PPLCNet_x0_25 |0.5186 | 0.7565 | 18 | 1.5 |
| PPLCNet_x0_35 |0.5809 | 0.8083 | 29 | 1.6 |
| PPLCNet_x0_5 |0.6314 | 0.8466 | 47 | 1.9 |
| PPLCNet_x0_75 |0.6818 | 0.8830 | 99 | 2.4 |
| PPLCNet_x1_0 |0.7132 | 0.9003 | 161 | 3.0 |
| PPLCNet_x1_5 |0.7371 | 0.9153 | 342 | 4.5 |
| PPLCNet_x2_0 |0.7518 | 0.9227 | 590 | 6.5 |
| PPLCNet_x2_5 |0.7660 | 0.9300 | 906 | 9.0 |
| PPLCNet_x0_5_ssld |0.6610 | 0.8646 | 47 | 1.9 |
| PPLCNet_x1_0_ssld |0.7439 | 0.9209 | 161 | 3.0 |
| PPLCNet_x2_5_ssld |0.8082 | 0.9533 | 906 | 9.0 |

## Inference speed based on Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz

| Models | Crop Size | Resize Short Size | FP32<br>Batch Size=1<br>(ms) |
|------------------|-----------|-------------------|--------------------------|
| PPLCNet_x0_25 | 224 | 256 | 1.74 |
| PPLCNet_x0_35 | 224 | 256 | 1.92 |
| PPLCNet_x0_5 | 224 | 256 | 2.05 |
| PPLCNet_x0_75 | 224 | 256 | 2.29 |
| PPLCNet_x1_0 | 224 | 256 | 2.46 |
| PPLCNet_x1_5 | 224 | 256 | 3.19 |
| PPLCNet_x2_0 | 224 | 256 | 4.27 |
| PPLCNet_x2_5 | 224 | 256 | 5.39 |
| PPLCNet_x0_5_ssld | 224 | 256 | 2.05 |
| PPLCNet_x1_0_ssld | 224 | 256 | 2.46 |
| PPLCNet_x2_5_ssld | 224 | 256 | 5.39 |
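PaddleClas exposes the whole series as ready-to-use backbones (see the imports added to `ppcls/arch/backbone/__init__.py` later in this commit). A minimal usage sketch:

```python
import paddle
from ppcls.arch.backbone.legendary_models.pp_lcnet import PPLCNet_x1_0

# pretrained=True downloads the ImageNet1k weights listed in the tables above;
# use_ssld=True selects the SSLD-distilled variant when available.
model = PPLCNet_x1_0(pretrained=False)
model.eval()
x = paddle.rand([1, 3, 224, 224])
print(model(x).shape)  # [1, 1000]
```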
Among them, `-c` is used to specify the path of the configuration file and `-o` is used to specify the parameters that need to be modified or added. `-o Arch.Backbone.pretrained=True` indicates that the Backbone part uses the pre-trained model; in addition, `Arch.Backbone.pretrained` can also specify the path of a specific model weight file, which needs to be replaced with the path of your own pre-trained model weight file when using it. `-o Global.device=gpu` indicates that the GPU is used for training. If you want to use the CPU for training, set `Global.device` to `cpu`.

For more detailed training configuration, you can also directly modify the configuration file of the model. For specific configuration parameters, refer to the [configuration document](config_description.md).

Run the above command, and you can see the output log. An example is as follows:
- mean Average Precision (mAP)
    - AP: AP refers to the average precision over different recall levels
    - mAP: the mean of the APs of all images in the test set
from ppcls.arch.backbone.legendary_models.vgg import VGG11, VGG13, VGG16, VGG19
from ppcls.arch.backbone.legendary_models.inception_v3 import InceptionV3
from ppcls.arch.backbone.legendary_models.hrnet import HRNet_W18_C, HRNet_W30_C, HRNet_W32_C, HRNet_W40_C, HRNet_W44_C, HRNet_W48_C, HRNet_W60_C, HRNet_W64_C, SE_HRNet_W64_C
from ppcls.arch.backbone.legendary_models.pp_lcnet import PPLCNet_x0_25, PPLCNet_x0_35, PPLCNet_x0_5, PPLCNet_x0_75, PPLCNet_x1_0, PPLCNet_x1_5, PPLCNet_x2_0, PPLCNet_x2_5
from ppcls.arch.backbone.model_zoo.resnet_vc import ResNet50_vc
from ppcls.arch.backbone.model_zoo.resnext import ResNeXt50_32x4d, ResNeXt50_64x4d, ResNeXt101_32x4d, ResNeXt101_64x4d, ResNeXt152_32x4d, ResNeXt152_64x4d
# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import, division, print_function
import paddle
import paddle.nn as nn
from paddle import ParamAttr
from paddle.nn import AdaptiveAvgPool2D, BatchNorm, Conv2D, Dropout, Linear
from paddle.regularizer import L2Decay
from paddle.nn.initializer import KaimingNormal
from ppcls.arch.backbone.base.theseus_layer import TheseusLayer
from ppcls.utils.save_load import load_dygraph_pretrain, load_dygraph_pretrain_from_url
MODEL_URLS = {
"PPLCNet_x0_25":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_25_pretrained.pdparams",
"PPLCNet_x0_35":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_35_pretrained.pdparams",
"PPLCNet_x0_5":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_5_pretrained.pdparams",
"PPLCNet_x0_75":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams",
"PPLCNet_x1_0":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_0_pretrained.pdparams",
"PPLCNet_x1_5":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_5_pretrained.pdparams",
"PPLCNet_x2_0":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_0_pretrained.pdparams",
"PPLCNet_x2_5":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_5_pretrained.pdparams"
}
__all__ = list(MODEL_URLS.keys())
# Each element(list) represents a depthwise block, which is composed of k, in_c, out_c, s, use_se.
# k: kernel_size
# in_c: input channel number in depthwise block
# out_c: output channel number in depthwise block
# s: stride in depthwise block
# use_se: whether to use SE block
NET_CONFIG = {
"blocks2":
#k, in_c, out_c, s, use_se
[[3, 16, 32, 1, False]],
"blocks3": [[3, 32, 64, 2, False], [3, 64, 64, 1, False]],
"blocks4": [[3, 64, 128, 2, False], [3, 128, 128, 1, False]],
"blocks5": [[3, 128, 256, 2, False], [5, 256, 256, 1, False],
[5, 256, 256, 1, False], [5, 256, 256, 1, False],
[5, 256, 256, 1, False], [5, 256, 256, 1, False]],
"blocks6": [[5, 256, 512, 2, True], [5, 512, 512, 1, True]]
}
def make_divisible(v, divisor=8, min_value=None):
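    # Round the scaled channel count to the nearest multiple of `divisor`
    # (8 by default, a common choice for hardware-friendly channel widths),
    # while ensuring the rounding never removes more than 10% of the channels.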
if min_value is None:
min_value = divisor
new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
if new_v < 0.9 * v:
new_v += divisor
return new_v
class ConvBNLayer(TheseusLayer):
def __init__(self,
num_channels,
filter_size,
num_filters,
stride,
num_groups=1):
super().__init__()
self.conv = Conv2D(
in_channels=num_channels,
out_channels=num_filters,
kernel_size=filter_size,
stride=stride,
padding=(filter_size - 1) // 2,
groups=num_groups,
weight_attr=ParamAttr(initializer=KaimingNormal()),
bias_attr=False)
self.bn = BatchNorm(
num_filters,
param_attr=ParamAttr(regularizer=L2Decay(0.0)),
bias_attr=ParamAttr(regularizer=L2Decay(0.0)))
self.hardswish = nn.Hardswish()
def forward(self, x):
x = self.conv(x)
x = self.bn(x)
x = self.hardswish(x)
return x
class DepthwiseSeparable(TheseusLayer):
def __init__(self,
num_channels,
num_filters,
stride,
dw_size=3,
use_se=False):
super().__init__()
self.use_se = use_se
self.dw_conv = ConvBNLayer(
num_channels=num_channels,
num_filters=num_channels,
filter_size=dw_size,
stride=stride,
num_groups=num_channels)
if use_se:
self.se = SEModule(num_channels)
self.pw_conv = ConvBNLayer(
num_channels=num_channels,
filter_size=1,
num_filters=num_filters,
stride=1)
def forward(self, x):
x = self.dw_conv(x)
if self.use_se:
x = self.se(x)
x = self.pw_conv(x)
return x
class SEModule(TheseusLayer):
def __init__(self, channel, reduction=4):
super().__init__()
self.avg_pool = AdaptiveAvgPool2D(1)
self.conv1 = Conv2D(
in_channels=channel,
out_channels=channel // reduction,
kernel_size=1,
stride=1,
padding=0)
self.relu = nn.ReLU()
self.conv2 = Conv2D(
in_channels=channel // reduction,
out_channels=channel,
kernel_size=1,
stride=1,
padding=0)
self.hardsigmoid = nn.Hardsigmoid()
def forward(self, x):
identity = x
x = self.avg_pool(x)
x = self.conv1(x)
x = self.relu(x)
x = self.conv2(x)
x = self.hardsigmoid(x)
x = paddle.multiply(x=identity, y=x)
return x
class PPLCNet(TheseusLayer):
def __init__(self,
scale=1.0,
class_num=1000,
dropout_prob=0.2,
class_expand=1280):
super().__init__()
self.scale = scale
self.class_expand = class_expand
self.conv1 = ConvBNLayer(
num_channels=3,
filter_size=3,
num_filters=make_divisible(16 * scale),
stride=2)
self.blocks2 = nn.Sequential(*[
DepthwiseSeparable(
num_channels=make_divisible(in_c * scale),
num_filters=make_divisible(out_c * scale),
dw_size=k,
stride=s,
use_se=se)
for i, (k, in_c, out_c, s, se) in enumerate(NET_CONFIG["blocks2"])
])
self.blocks3 = nn.Sequential(*[
DepthwiseSeparable(
num_channels=make_divisible(in_c * scale),
num_filters=make_divisible(out_c * scale),
dw_size=k,
stride=s,
use_se=se)
for i, (k, in_c, out_c, s, se) in enumerate(NET_CONFIG["blocks3"])
])
self.blocks4 = nn.Sequential(*[
DepthwiseSeparable(
num_channels=make_divisible(in_c * scale),
num_filters=make_divisible(out_c * scale),
dw_size=k,
stride=s,
use_se=se)
for i, (k, in_c, out_c, s, se) in enumerate(NET_CONFIG["blocks4"])
])
self.blocks5 = nn.Sequential(*[
DepthwiseSeparable(
num_channels=make_divisible(in_c * scale),
num_filters=make_divisible(out_c * scale),
dw_size=k,
stride=s,
use_se=se)
for i, (k, in_c, out_c, s, se) in enumerate(NET_CONFIG["blocks5"])
])
self.blocks6 = nn.Sequential(*[
DepthwiseSeparable(
num_channels=make_divisible(in_c * scale),
num_filters=make_divisible(out_c * scale),
dw_size=k,
stride=s,
use_se=se)
for i, (k, in_c, out_c, s, se) in enumerate(NET_CONFIG["blocks6"])
])
self.avg_pool = AdaptiveAvgPool2D(1)
self.last_conv = Conv2D(
in_channels=make_divisible(NET_CONFIG["blocks6"][-1][2] * scale),
out_channels=self.class_expand,
kernel_size=1,
stride=1,
padding=0,
bias_attr=False)
self.hardswish = nn.Hardswish()
self.dropout = Dropout(p=dropout_prob, mode="downscale_in_infer")
self.flatten = nn.Flatten(start_axis=1, stop_axis=-1)
self.fc = Linear(self.class_expand, class_num)
def forward(self, x):
x = self.conv1(x)
x = self.blocks2(x)
x = self.blocks3(x)
x = self.blocks4(x)
x = self.blocks5(x)
x = self.blocks6(x)
x = self.avg_pool(x)
x = self.last_conv(x)
x = self.hardswish(x)
x = self.dropout(x)
x = self.flatten(x)
x = self.fc(x)
return x
def _load_pretrained(pretrained, model, model_url, use_ssld):
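    # pretrained=False: random initialization; pretrained=True: download the
    # official weights from MODEL_URLS (the SSLD-distilled variant when
    # use_ssld=True); pretrained=<str>: load weights from a local path.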
if pretrained is False:
pass
elif pretrained is True:
load_dygraph_pretrain_from_url(model, model_url, use_ssld=use_ssld)
elif isinstance(pretrained, str):
load_dygraph_pretrain(model, pretrained)
else:
raise RuntimeError(
"pretrained type is not available. Please use `string` or `boolean` type."
)
def PPLCNet_x0_25(pretrained=False, use_ssld=False, **kwargs):
"""
PPLCNet_x0_25
Args:
pretrained: bool=False or str. If `True` load pretrained parameters, `False` otherwise.
If str, means the path of the pretrained model.
use_ssld: bool=False. Whether using distillation pretrained model when pretrained=True.
Returns:
model: nn.Layer. Specific `PPLCNet_x0_25` model depends on args.
"""
model = PPLCNet(scale=0.25, **kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["PPLCNet_x0_25"], use_ssld)
    return model


def PPLCNet_x0_35(pretrained=False, use_ssld=False, **kwargs):
    """
    PPLCNet_x0_35
    Args:
        pretrained: bool or str, default False. If True, download and load
            the pretrained weights; if a str, use it as the path of a local
            pretrained model; if False, no pretrained weights are loaded.
        use_ssld: bool, default False. Whether to use the SSLD
            distillation-pretrained weights when pretrained=True.
    Returns:
        model: nn.Layer. A `PPLCNet_x0_35` model built according to the args.
    """
model = PPLCNet(scale=0.35, **kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["PPLCNet_x0_35"], use_ssld)
    return model


def PPLCNet_x0_5(pretrained=False, use_ssld=False, **kwargs):
    """
    PPLCNet_x0_5
    Args:
        pretrained: bool or str, default False. If True, download and load
            the pretrained weights; if a str, use it as the path of a local
            pretrained model; if False, no pretrained weights are loaded.
        use_ssld: bool, default False. Whether to use the SSLD
            distillation-pretrained weights when pretrained=True.
    Returns:
        model: nn.Layer. A `PPLCNet_x0_5` model built according to the args.
    """
model = PPLCNet(scale=0.5, **kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["PPLCNet_x0_5"], use_ssld)
    return model


def PPLCNet_x0_75(pretrained=False, use_ssld=False, **kwargs):
    """
    PPLCNet_x0_75
    Args:
        pretrained: bool or str, default False. If True, download and load
            the pretrained weights; if a str, use it as the path of a local
            pretrained model; if False, no pretrained weights are loaded.
        use_ssld: bool, default False. Whether to use the SSLD
            distillation-pretrained weights when pretrained=True.
    Returns:
        model: nn.Layer. A `PPLCNet_x0_75` model built according to the args.
    """
model = PPLCNet(scale=0.75, **kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["PPLCNet_x0_75"], use_ssld)
    return model


def PPLCNet_x1_0(pretrained=False, use_ssld=False, **kwargs):
    """
    PPLCNet_x1_0
    Args:
        pretrained: bool or str, default False. If True, download and load
            the pretrained weights; if a str, use it as the path of a local
            pretrained model; if False, no pretrained weights are loaded.
        use_ssld: bool, default False. Whether to use the SSLD
            distillation-pretrained weights when pretrained=True.
    Returns:
        model: nn.Layer. A `PPLCNet_x1_0` model built according to the args.
    """
model = PPLCNet(scale=1.0, **kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["PPLCNet_x1_0"], use_ssld)
    return model


def PPLCNet_x1_5(pretrained=False, use_ssld=False, **kwargs):
    """
    PPLCNet_x1_5
    Args:
        pretrained: bool or str, default False. If True, download and load
            the pretrained weights; if a str, use it as the path of a local
            pretrained model; if False, no pretrained weights are loaded.
        use_ssld: bool, default False. Whether to use the SSLD
            distillation-pretrained weights when pretrained=True.
    Returns:
        model: nn.Layer. A `PPLCNet_x1_5` model built according to the args.
    """
model = PPLCNet(scale=1.5, **kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["PPLCNet_x1_5"], use_ssld)
    return model


def PPLCNet_x2_0(pretrained=False, use_ssld=False, **kwargs):
    """
    PPLCNet_x2_0
    Args:
        pretrained: bool or str, default False. If True, download and load
            the pretrained weights; if a str, use it as the path of a local
            pretrained model; if False, no pretrained weights are loaded.
        use_ssld: bool, default False. Whether to use the SSLD
            distillation-pretrained weights when pretrained=True.
    Returns:
        model: nn.Layer. A `PPLCNet_x2_0` model built according to the args.
    """
model = PPLCNet(scale=2.0, **kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["PPLCNet_x2_0"], use_ssld)
    return model


def PPLCNet_x2_5(pretrained=False, use_ssld=False, **kwargs):
    """
    PPLCNet_x2_5
    Args:
        pretrained: bool or str, default False. If True, download and load
            the pretrained weights; if a str, use it as the path of a local
            pretrained model; if False, no pretrained weights are loaded.
        use_ssld: bool, default False. Whether to use the SSLD
            distillation-pretrained weights when pretrained=True.
    Returns:
        model: nn.Layer. A `PPLCNet_x2_5` model built according to the args.
    """
model = PPLCNet(scale=2.5, **kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["PPLCNet_x2_5"], use_ssld)
return model
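Note: the factory functions above differ only in the width multiplier `scale` passed to `PPLCNet`. A minimal usage sketch follows (the import path is an assumption about the repo layout; `pretrained=False` avoids any download):

```python
import paddle

# Assumed import path for the PP-LCNet backbone module in this repo.
from ppcls.arch.backbone.legendary_models.pp_lcnet import PPLCNet_x1_0

model = PPLCNet_x1_0(pretrained=False, class_num=1000)
model.eval()
x = paddle.rand([1, 3, 224, 224])   # NCHW input, ImageNet resolution
with paddle.no_grad():
    logits = model(x)
print(logits.shape)                  # [1, 1000]
```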
@@ -131,7 +131,7 @@ class GoogLeNetDY(nn.Layer):
         self._ince5b = Inception(
             832, 832, 384, 192, 384, 48, 128, 128, name="ince5b")
-        self._pool_5 = AvgPool2D(kernel_size=7, stride=7)
+        self._pool_5 = AdaptiveAvgPool2D(1)
         self._drop = Dropout(p=0.4, mode="downscale_in_infer")
         self._fc_out = Linear(
...
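The single functional change in this hunk replaces the fixed 7x7 average pool with an adaptive one, so the classifier head no longer assumes a 224x224 input. A small sketch of the difference (shapes illustrative; 1024 is GoogLeNet's final channel count):

```python
import paddle
from paddle.nn import AvgPool2D, AdaptiveAvgPool2D

# A 448x448 input leaves a 14x14 final feature map: the old fixed 7x7 window
# yields 2x2, while AdaptiveAvgPool2D(1) always reduces to 1x1.
feat = paddle.rand([1, 1024, 14, 14])
print(AvgPool2D(kernel_size=7, stride=7)(feat).shape)   # [1, 1024, 2, 2]
print(AdaptiveAvgPool2D(1)(feat).shape)                 # [1, 1024, 1, 1]
```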
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 100
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
eval_mode: retrieval
use_dali: False
to_static: False
# model architecture
Arch:
name: RecModel
infer_output_key: features
infer_add_softmax: False
Backbone:
name: PPLCNet_x2_5
pretrained: True
use_ssld: True
BackboneStopLayer:
name: flatten_0
Neck:
name: FC
embedding_size: 1280
class_num: 512
Head:
name: ArcMargin
embedding_size: 512
class_num: 185341
margin: 0.2
scale: 30
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.04
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00001
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/
cls_label_path: ./dataset/train_reg_all_data.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 256
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
Query:
dataset:
name: VeriWild
image_root: ./dataset/Aliproduct/
cls_label_path: ./dataset/Aliproduct/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Gallery:
dataset:
name: VeriWild
image_root: ./dataset/Aliproduct/
cls_label_path: ./dataset/Aliproduct/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Metric:
Eval:
- Recallk:
topk: [1, 5]
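In the recognition config above, the `Head` applies an additive angular margin (ArcMargin, `margin: 0.2`, `scale: 30`) on top of the 512-d `Neck` embedding before the CELoss is computed. A minimal NumPy sketch of that logit computation, assuming L2-normalized embeddings and class weights (not the exact ppcls implementation):

```python
import numpy as np

def arc_margin_logits(emb, weight, label, margin=0.2, scale=30.0):
    """Sketch of ArcMargin: logits are scale * cos(theta + margin) for the
    ground-truth class and scale * cos(theta) for every other class."""
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)           # [N, 512]
    weight = weight / np.linalg.norm(weight, axis=0, keepdims=True)  # [512, C]
    cos = np.clip(emb @ weight, -1.0, 1.0)
    theta = np.arccos(cos)
    m = np.zeros_like(theta)
    m[np.arange(len(label)), label] = margin    # margin on the target class only
    return scale * np.cos(theta + m)
```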
@@ -34,9 +34,8 @@ Optimizer:
   momentum: 0.9
   lr:
     name: Piecewise
-    learning_rate: 0.01
     decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
+    values: [0.01, 0.001, 0.0001, 0.00001]
   regularizer:
     name: 'L2'
     coeff: 0.0001
...
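For reference, `Piecewise` holds each value until the corresponding `decay_epochs` boundary passes (so `len(values) == len(decay_epochs) + 1`); this hunk drops the now-unused `learning_rate` key and scales the whole schedule down tenfold. A one-function sketch of the assumed semantics:

```python
def piecewise_lr(epoch, decay_epochs=(30, 60, 90),
                 values=(0.01, 0.001, 0.0001, 0.00001)):
    """Sketch of Piecewise: values[i] is used until decay_epochs[i] is
    reached; the final value applies afterwards."""
    for boundary, value in zip(decay_epochs, values):
        if epoch < boundary:
            return value
    return values[-1]
```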
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -22,25 +22,27 @@ Arch:
 # loss function config for traing/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0

 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: norm cls_token pos_embed dist_token
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 1e-3
+    eta_min: 1e-5
+    warmup_epoch: 5
+    warmup_start_lr: 1e-6

 # data loader for train and eval
 DataLoader:
@@ -55,17 +57,38 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 224
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 256
       drop_last: False
       shuffle: True
     loader:
@@ -83,6 +106,8 @@ DataLoader:
             channel_first: False
         - ResizeImage:
             resize_short: 256
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 224
         - NormalizeImage:
@@ -92,7 +117,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 256
       drop_last: False
       shuffle: False
     loader:
@@ -108,6 +133,8 @@ Infer:
         channel_first: False
     - ResizeImage:
         resize_short: 256
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 224
     - NormalizeImage:
@@ -122,9 +149,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt

 Metric:
-  Train:
-    - TopkAcc:
-      topk: [1, 5]
   Eval:
     - TopkAcc:
       topk: [1, 5]
...
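The recurring recipe change in these config diffs pairs `MixCELoss` (label smoothing with `epsilon: 0.1`) with batch-level Mixup/Cutmix drawn by `OpSampler`. A NumPy sketch of the two ingredients under their standard definitions (not the exact ppcls code):

```python
import numpy as np

rng = np.random.default_rng(0)

def mixup_batch(x, y, alpha=0.8):
    """Sketch of MixupOperator: convexly blend image pairs; the loss is later
    computed against both label sets, weighted by lam."""
    lam = rng.beta(alpha, alpha)
    idx = rng.permutation(len(x))
    return lam * x + (1 - lam) * x[idx], (y, y[idx], lam)

def smoothed_ce(logits, target, epsilon=0.1):
    """Sketch of the label smoothing that MixCELoss's epsilon enables."""
    n_cls = logits.shape[-1]
    logp = logits - np.log(np.exp(logits).sum(-1, keepdims=True))
    soft = np.full_like(logp, epsilon / n_cls)
    soft[np.arange(len(target)), target] += 1.0 - epsilon
    return float(-(soft * logp).sum(-1).mean())

# MixCELoss on a mixed batch, conceptually:
#   loss = lam * smoothed_ce(logits, y_a) + (1 - lam) * smoothed_ce(logits, y_b)
```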
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -22,25 +22,27 @@ Arch:
 # loss function config for traing/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0

 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: norm cls_token pos_embed dist_token
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 1e-3
+    eta_min: 1e-5
+    warmup_epoch: 5
+    warmup_start_lr: 1e-6

 # data loader for train and eval
 DataLoader:
@@ -54,18 +56,39 @@ DataLoader:
             to_rgb: True
             channel_first: False
         - RandCropImage:
             size: 384
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 384
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 256
       drop_last: False
       shuffle: True
     loader:
@@ -82,7 +105,9 @@ DataLoader:
             to_rgb: True
             channel_first: False
         - ResizeImage:
-            resize_short: 426
+            resize_short: 438
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 384
         - NormalizeImage:
@@ -92,7 +117,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 256
       drop_last: False
       shuffle: False
     loader:
@@ -107,7 +132,9 @@ Infer:
         to_rgb: True
         channel_first: False
     - ResizeImage:
-        resize_short: 426
+        resize_short: 438
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 384
     - NormalizeImage:
@@ -122,9 +149,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt

 Metric:
-  Train:
-    - TopkAcc:
-      topk: [1, 5]
   Eval:
     - TopkAcc:
       topk: [1, 5]
...
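A plausible reading of the 426 -> 438 change in the 384-input recipes (an assumption, not stated in the diff): it brings the eval resize back to roughly the 224/256 = 0.875 center-crop ratio:

```python
# Assumed rationale: keep the eval center-crop ratio near the 0.875 convention.
print(224 / 256)               # 0.875
print(int(384 / 0.875))        # 438 -> the new resize_short
print(384 / 426, 384 / 438)    # ~0.901 before vs ~0.877 after
```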
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -22,25 +22,27 @@ Arch:
 # loss function config for traing/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0

 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: norm cls_token pos_embed dist_token
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 1e-3
+    eta_min: 1e-5
+    warmup_epoch: 5
+    warmup_start_lr: 1e-6

 # data loader for train and eval
 DataLoader:
@@ -55,17 +57,38 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 224
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 256
       drop_last: False
       shuffle: True
     loader:
@@ -83,6 +106,8 @@ DataLoader:
             channel_first: False
         - ResizeImage:
             resize_short: 256
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 224
         - NormalizeImage:
@@ -92,7 +117,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 256
       drop_last: False
       shuffle: False
     loader:
@@ -108,6 +133,8 @@ Infer:
         channel_first: False
     - ResizeImage:
         resize_short: 256
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 224
     - NormalizeImage:
@@ -122,9 +149,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt

 Metric:
-  Train:
-    - TopkAcc:
-      topk: [1, 5]
   Eval:
     - TopkAcc:
       topk: [1, 5]
...
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -22,25 +22,27 @@ Arch:
 # loss function config for traing/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0

 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: norm cls_token pos_embed dist_token
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 1e-3
+    eta_min: 1e-5
+    warmup_epoch: 5
+    warmup_start_lr: 1e-6

 # data loader for train and eval
 DataLoader:
@@ -54,18 +56,39 @@ DataLoader:
             to_rgb: True
             channel_first: False
         - RandCropImage:
             size: 384
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 384
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 256
       drop_last: False
       shuffle: True
     loader:
@@ -82,7 +105,9 @@ DataLoader:
             to_rgb: True
             channel_first: False
         - ResizeImage:
-            resize_short: 426
+            resize_short: 438
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 384
         - NormalizeImage:
@@ -92,7 +117,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 256
       drop_last: False
       shuffle: False
     loader:
@@ -107,7 +132,9 @@ Infer:
         to_rgb: True
         channel_first: False
     - ResizeImage:
-        resize_short: 426
+        resize_short: 438
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 384
     - NormalizeImage:
@@ -122,9 +149,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt

 Metric:
-  Train:
-    - TopkAcc:
-      topk: [1, 5]
   Eval:
     - TopkAcc:
       topk: [1, 5]
...
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -22,25 +22,27 @@ Arch:
 # loss function config for traing/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0

 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: norm cls_token pos_embed dist_token
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 1e-3
+    eta_min: 1e-5
+    warmup_epoch: 5
+    warmup_start_lr: 1e-6

 # data loader for train and eval
 DataLoader:
@@ -55,17 +57,38 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 224
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 256
       drop_last: False
       shuffle: True
     loader:
@@ -83,6 +106,8 @@ DataLoader:
             channel_first: False
         - ResizeImage:
             resize_short: 256
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 224
         - NormalizeImage:
@@ -92,7 +117,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 256
       drop_last: False
       shuffle: False
     loader:
@@ -108,6 +133,8 @@ Infer:
         channel_first: False
     - ResizeImage:
         resize_short: 256
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 224
     - NormalizeImage:
@@ -122,9 +149,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt

 Metric:
-  Train:
-    - TopkAcc:
-      topk: [1, 5]
   Eval:
     - TopkAcc:
       topk: [1, 5]
...
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -22,25 +22,27 @@ Arch:
 # loss function config for traing/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0

 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: norm cls_token pos_embed dist_token
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 1e-3
+    eta_min: 1e-5
+    warmup_epoch: 5
+    warmup_start_lr: 1e-6

 # data loader for train and eval
 DataLoader:
@@ -55,17 +57,38 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 224
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 256
       drop_last: False
       shuffle: True
     loader:
@@ -83,6 +106,8 @@ DataLoader:
             channel_first: False
         - ResizeImage:
             resize_short: 256
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 224
         - NormalizeImage:
@@ -92,7 +117,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 256
       drop_last: False
       shuffle: False
     loader:
@@ -108,6 +133,8 @@ Infer:
         channel_first: False
     - ResizeImage:
         resize_short: 256
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 224
     - NormalizeImage:
@@ -122,9 +149,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt

 Metric:
-  Train:
-    - TopkAcc:
-      topk: [1, 5]
   Eval:
     - TopkAcc:
       topk: [1, 5]
...
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -22,25 +22,27 @@ Arch:
 # loss function config for traing/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0

 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: norm cls_token pos_embed dist_token
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 1e-3
+    eta_min: 1e-5
+    warmup_epoch: 5
+    warmup_start_lr: 1e-6

 # data loader for train and eval
 DataLoader:
@@ -55,17 +57,38 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 224
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
      name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 256
       drop_last: False
       shuffle: True
     loader:
@@ -83,6 +106,8 @@ DataLoader:
             channel_first: False
         - ResizeImage:
             resize_short: 256
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 224
         - NormalizeImage:
@@ -92,7 +117,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 256
       drop_last: False
       shuffle: False
     loader:
@@ -108,6 +133,8 @@ Infer:
         channel_first: False
     - ResizeImage:
         resize_short: 256
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 224
     - NormalizeImage:
@@ -122,9 +149,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt

 Metric:
-  Train:
-    - TopkAcc:
-      topk: [1, 5]
   Eval:
     - TopkAcc:
       topk: [1, 5]
...
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -22,25 +22,27 @@ Arch:
 # loss function config for traing/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0

 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: norm cls_token pos_embed dist_token
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 1e-3
+    eta_min: 1e-5
+    warmup_epoch: 5
+    warmup_start_lr: 1e-6

 # data loader for train and eval
 DataLoader:
@@ -55,17 +57,38 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 224
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 256
       drop_last: False
       shuffle: True
     loader:
@@ -83,6 +106,8 @@ DataLoader:
             channel_first: False
         - ResizeImage:
             resize_short: 256
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 224
         - NormalizeImage:
@@ -92,7 +117,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 256
       drop_last: False
       shuffle: False
     loader:
@@ -108,6 +133,8 @@ Infer:
         channel_first: False
     - ResizeImage:
         resize_short: 256
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 224
     - NormalizeImage:
@@ -122,9 +149,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt

 Metric:
-  Train:
-    - TopkAcc:
-      topk: [1, 5]
   Eval:
     - TopkAcc:
       topk: [1, 5]
...
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
class_num: 1000
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: PPLCNet_x0_25
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.8
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00003
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
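The PP-LCNet recipes above all use a cosine schedule with a 5-epoch linear warmup from a large base LR of 0.8 (matching the batch size of 512 set in the sampler). A sketch of the schedule at epoch granularity (the actual ppcls scheduler steps per iteration):

```python
import math

def lr_at(epoch, base_lr=0.8, total_epochs=360, warmup_epoch=5):
    """Sketch of the Cosine-with-warmup schedule configured above."""
    if epoch < warmup_epoch:                      # linear ramp from ~0
        return base_lr * (epoch + 1) / warmup_epoch
    t = (epoch - warmup_epoch) / (total_epochs - warmup_epoch)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * t))
```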
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
class_num: 1000
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: PPLCNet_x0_35
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.8
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00003
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
class_num: 1000
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: PPLCNet_x0_5
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.8
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00003
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
class_num: 1000
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: PPLCNet_x0_75
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.8
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00003
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
class_num: 1000
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: PPLCNet_x1_0
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.8
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00003
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
class_num: 1000
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: PPLCNet_x1_5
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.8
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00004
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
class_num: 1000
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: PPLCNet_x2_0
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.8
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00004
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
class_num: 1000
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: PPLCNet_x2_5
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.8
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00004
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- AutoAugment:
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
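Among the configs shown here, only this x2_5 recipe inserts `AutoAugment` into the training transforms, and the three widest variants (x1_5/x2_0/x2_5) raise the L2 coefficient from 3e-5 to 4e-5. A hypothetical launch of this recipe via the standard PaddleClas entry point (the config path is an assumption about the repo layout):

```python
import subprocess

# Hypothetical distributed launch; adjust --gpus and the config path to taste.
subprocess.run([
    "python3", "-m", "paddle.distributed.launch", "--gpus", "0,1,2,3",
    "tools/train.py",
    "-c", "ppcls/configs/ImageNet/PPLCNet/PPLCNet_x2_5.yaml",
], check=True)
```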
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -24,24 +24,28 @@ Arch:
 # loss function config for traing/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0

 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: absolute_pos_embed relative_position_bias_table .bias norm
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 5e-4
+    eta_min: 1e-5
+    warmup_epoch: 20
+    warmup_start_lr: 1e-6

 # data loader for train and eval
@@ -57,17 +61,39 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 384
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 384
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: True
     loader:
@@ -84,7 +110,11 @@ DataLoader:
             to_rgb: True
             channel_first: False
         - ResizeImage:
-            size: [384, 384]
+            resize_short: 438
+            interpolation: bicubic
+            backend: pil
+        - CropImage:
+            size: 384
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
@@ -92,7 +122,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: False
     loader:
@@ -107,7 +137,11 @@ Infer:
         to_rgb: True
         channel_first: False
     - ResizeImage:
-        size: [384, 384]
+        resize_short: 438
+        interpolation: bicubic
+        backend: pil
+    - CropImage:
+        size: 384
     - NormalizeImage:
         scale: 1.0/255.0
         mean: [0.485, 0.456, 0.406]
@@ -120,9 +154,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt

 Metric:
-  Train:
-    - TopkAcc:
-      topk: [1, 5]
   Eval:
     - TopkAcc:
       topk: [1, 5]
...
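These Swin diffs use a different no-decay list than the DeiT-style ones above (`absolute_pos_embed relative_position_bias_table .bias norm`). A sketch of how such keyword lists, together with `one_dim_param_no_weight_decay`, would partition parameters for AdamW (assumed semantics, not the exact ppcls code):

```python
def split_decay_params(model,
                       skip=("absolute_pos_embed",
                             "relative_position_bias_table",
                             ".bias", "norm"),
                       skip_1d=True):
    """Parameters whose name matches a keyword, or that are 1-D tensors
    (biases, norm scales), go to the no-weight-decay group."""
    decay, no_decay = [], []
    for name, param in model.named_parameters():
        if (skip_1d and param.ndim == 1) or any(k in name for k in skip):
            no_decay.append(param)
        else:
            decay.append(param)
    return decay, no_decay
```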
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -24,24 +24,28 @@ Arch:
 # loss function config for traing/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0

 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: absolute_pos_embed relative_position_bias_table .bias norm
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 5e-4
+    eta_min: 1e-5
+    warmup_epoch: 20
+    warmup_start_lr: 1e-6

 # data loader for train and eval
@@ -57,17 +61,39 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 224
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: True
     loader:
@@ -85,6 +111,8 @@ DataLoader:
             channel_first: False
         - ResizeImage:
             resize_short: 256
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 224
         - NormalizeImage:
@@ -94,7 +122,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: False
     loader:
@@ -110,6 +138,8 @@ Infer:
         channel_first: False
     - ResizeImage:
         resize_short: 256
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 224
     - NormalizeImage:
@@ -124,9 +154,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt

 Metric:
-  Train:
-    - TopkAcc:
-      topk: [1, 5]
   Eval:
     - TopkAcc:
       topk: [1, 5]
...
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -24,24 +24,28 @@ Arch:
 # loss function config for training/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0
 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: absolute_pos_embed relative_position_bias_table .bias norm
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 5e-4
+    eta_min: 1e-5
+    warmup_epoch: 20
+    warmup_start_lr: 1e-6
 # data loader for train and eval
@@ -57,17 +61,39 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 384
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 384
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: True
     loader:
@@ -84,7 +110,11 @@ DataLoader:
             to_rgb: True
             channel_first: False
         - ResizeImage:
-            size: [384, 384]
+            resize_short: 438
+            interpolation: bicubic
+            backend: pil
+        - CropImage:
+            size: 384
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
@@ -92,7 +122,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: False
     loader:
@@ -107,7 +137,11 @@ Infer:
        to_rgb: True
        channel_first: False
     - ResizeImage:
-        size: [384, 384]
+        resize_short: 438
+        interpolation: bicubic
+        backend: pil
+    - CropImage:
+        size: 384
     - NormalizeImage:
         scale: 1.0/255.0
         mean: [0.485, 0.456, 0.406]
@@ -120,9 +154,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
 Metric:
-  Train:
-    - TopkAcc:
-        topk: [1, 5]
   Eval:
     - TopkAcc:
         topk: [1, 5]
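Note: the eval/infer pipeline in this 384-input config no longer resizes directly to 384x384; it resizes the short side to 438 and then center-crops to 384, which presumably preserves the familiar 256/224 resize-to-crop ratio at the larger resolution:

```python
# 438 is (approximately) 384 scaled by the standard 256/224 ratio
print(384 * 256 / 224)  # 438.857..., rounded to the configured resize_short: 438
```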
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -24,24 +24,28 @@ Arch:
 # loss function config for training/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0
 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: absolute_pos_embed relative_position_bias_table .bias norm
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 5e-4
+    eta_min: 1e-5
+    warmup_epoch: 20
+    warmup_start_lr: 1e-6
 # data loader for train and eval
@@ -57,17 +61,39 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 224
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: True
     loader:
@@ -85,6 +111,8 @@ DataLoader:
             channel_first: False
         - ResizeImage:
             resize_short: 256
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 224
         - NormalizeImage:
@@ -94,7 +122,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: False
     loader:
@@ -110,6 +138,8 @@ Infer:
         channel_first: False
     - ResizeImage:
         resize_short: 256
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 224
     - NormalizeImage:
@@ -124,9 +154,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
 Metric:
-  Train:
-    - TopkAcc:
-        topk: [1, 5]
   Eval:
     - TopkAcc:
         topk: [1, 5]
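Note: with `scale: 1.0/255.0` and the ImageNet statistics used throughout these configs, a raw red-channel value of 128 normalizes as follows (a worked example, not repo code):

```python
x = 128 * (1.0 / 255.0)   # scale to [0, 1] -> 0.50196
x = (x - 0.485) / 0.229   # subtract mean, divide by std -> 0.07406
```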
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -24,24 +24,28 @@ Arch:
 # loss function config for training/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0
 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: absolute_pos_embed relative_position_bias_table .bias norm
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 5e-4
+    eta_min: 1e-5
+    warmup_epoch: 20
+    warmup_start_lr: 1e-6
 # data loader for train and eval
@@ -57,17 +61,39 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 224
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: True
     loader:
@@ -85,6 +111,8 @@ DataLoader:
             channel_first: False
         - ResizeImage:
             resize_short: 256
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 224
         - NormalizeImage:
@@ -94,7 +122,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: False
     loader:
@@ -110,6 +138,8 @@ Infer:
         channel_first: False
     - ResizeImage:
         resize_short: 256
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 224
     - NormalizeImage:
@@ -124,9 +154,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
 Metric:
-  Train:
-    - TopkAcc:
-        topk: [1, 5]
   Eval:
     - TopkAcc:
         topk: [1, 5]
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -24,24 +24,28 @@ Arch:
 # loss function config for training/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0
 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: absolute_pos_embed relative_position_bias_table .bias norm
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 5e-4
+    eta_min: 1e-5
+    warmup_epoch: 20
+    warmup_start_lr: 1e-6
 # data loader for train and eval
@@ -57,17 +61,39 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 224
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: True
     loader:
@@ -85,6 +111,8 @@ DataLoader:
             channel_first: False
         - ResizeImage:
             resize_short: 256
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 224
         - NormalizeImage:
@@ -94,7 +122,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: False
     loader:
@@ -110,6 +138,8 @@ Infer:
         channel_first: False
     - ResizeImage:
         resize_short: 256
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 224
     - NormalizeImage:
@@ -124,9 +154,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
 Metric:
-  Train:
-    - TopkAcc:
-        topk: [1, 5]
   Eval:
     - TopkAcc:
         topk: [1, 5]
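Note: in the `batch_transform_ops` entry these configs all add, `OpSampler` draws exactly one batch operator per batch; because the two probabilities sum to 1, every batch gets either Mixup or Cutmix, never a pass-through. A sketch of the sampling step, mirroring the `OpSampler` implementation that appears later in this commit:

```python
import random

ops = {"MixupOperator": 0.5, "CutmixOperator": 0.5}  # probs from the config above
chosen = random.choices(list(ops), weights=list(ops.values()), k=1)[0]
# the chosen operator then transforms the whole batch
```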
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -20,28 +20,34 @@ Global:
 Arch:
   name: alt_gvt_base
   class_num: 1000
+  drop_rate: 0.0
+  drop_path_rate: 0.3
 # loss function config for training/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0
 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: norm cls_token proj.0.weight proj.1.weight proj.2.weight proj.3.weight pos_block
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 5e-4
+    eta_min: 1e-5
+    warmup_epoch: 5
+    warmup_start_lr: 1e-6
 # data loader for train and eval
@@ -57,17 +63,39 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 224
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: True
     loader:
@@ -85,6 +113,8 @@ DataLoader:
             channel_first: False
         - ResizeImage:
             resize_short: 256
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 224
         - NormalizeImage:
@@ -94,7 +124,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: False
     loader:
@@ -110,6 +140,8 @@ Infer:
         channel_first: False
     - ResizeImage:
         resize_short: 256
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 224
     - NormalizeImage:
@@ -124,9 +156,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
 Metric:
-  Train:
-    - TopkAcc:
-        topk: [1, 5]
   Eval:
     - TopkAcc:
         topk: [1, 5]
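Note: the new `drop_rate`/`drop_path_rate` keys enable dropout and stochastic depth; the Twins configs scale `drop_path_rate` with model size (0.2 small, 0.3 base, 0.5 large). A minimal sketch of drop-path at training time, assuming per-sample Bernoulli gating of the residual branch:

```python
import numpy as np

def drop_path(x, drop_prob=0.3, training=True):
    """Zero a residual branch per sample with prob drop_prob; rescale survivors."""
    if not training or drop_prob == 0.:
        return x
    keep = 1.0 - drop_prob
    mask = np.random.binomial(1, keep, size=(x.shape[0],) + (1,) * (x.ndim - 1))
    return x / keep * mask
```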
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -20,28 +20,34 @@ Global:
 Arch:
   name: alt_gvt_large
   class_num: 1000
+  drop_rate: 0.0
+  drop_path_rate: 0.5
 # loss function config for training/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0
 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: norm cls_token proj.0.weight proj.1.weight proj.2.weight proj.3.weight pos_block
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 5e-4
+    eta_min: 1e-5
+    warmup_epoch: 5
+    warmup_start_lr: 1e-6
 # data loader for train and eval
@@ -57,17 +63,39 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 224
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: True
     loader:
@@ -85,6 +113,8 @@ DataLoader:
             channel_first: False
         - ResizeImage:
             resize_short: 256
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 224
         - NormalizeImage:
@@ -94,7 +124,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: False
     loader:
@@ -110,6 +140,8 @@ Infer:
         channel_first: False
     - ResizeImage:
         resize_short: 256
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 224
     - NormalizeImage:
@@ -124,9 +156,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
 Metric:
-  Train:
-    - TopkAcc:
-        topk: [1, 5]
   Eval:
     - TopkAcc:
         topk: [1, 5]
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -20,28 +20,34 @@ Global:
 Arch:
   name: alt_gvt_small
   class_num: 1000
+  drop_rate: 0.0
+  drop_path_rate: 0.2
 # loss function config for training/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0
 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: norm cls_token proj.0.weight proj.1.weight proj.2.weight proj.3.weight pos_block
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 5e-4
+    eta_min: 1e-5
+    warmup_epoch: 5
+    warmup_start_lr: 1e-6
 # data loader for train and eval
@@ -57,17 +63,39 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 224
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: True
     loader:
@@ -85,6 +113,8 @@ DataLoader:
             channel_first: False
         - ResizeImage:
             resize_short: 256
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 224
         - NormalizeImage:
@@ -94,7 +124,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: False
     loader:
@@ -110,6 +140,8 @@ Infer:
         channel_first: False
     - ResizeImage:
         resize_short: 256
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 224
     - NormalizeImage:
@@ -124,9 +156,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
 Metric:
-  Train:
-    - TopkAcc:
-        topk: [1, 5]
   Eval:
     - TopkAcc:
         topk: [1, 5]
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -20,28 +20,34 @@ Global:
 Arch:
   name: pcpvt_base
   class_num: 1000
+  drop_rate: 0.0
+  drop_path_rate: 0.3
 # loss function config for training/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0
 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: norm cls_token proj.0.weight proj.1.weight proj.2.weight proj.3.weight pos_block
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 5e-4
+    eta_min: 1e-5
+    warmup_epoch: 5
+    warmup_start_lr: 1e-6
 # data loader for train and eval
@@ -57,17 +63,39 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 224
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
      name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: True
     loader:
@@ -85,6 +113,8 @@ DataLoader:
             channel_first: False
         - ResizeImage:
             resize_short: 256
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 224
         - NormalizeImage:
@@ -94,7 +124,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: False
     loader:
@@ -110,6 +140,8 @@ Infer:
         channel_first: False
     - ResizeImage:
         resize_short: 256
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 224
     - NormalizeImage:
@@ -124,9 +156,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
 Metric:
-  Train:
-    - TopkAcc:
-        topk: [1, 5]
   Eval:
     - TopkAcc:
         topk: [1, 5]
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -20,28 +20,34 @@ Global:
 Arch:
   name: pcpvt_large
   class_num: 1000
+  drop_rate: 0.0
+  drop_path_rate: 0.5
 # loss function config for training/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0
 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: norm cls_token proj.0.weight proj.1.weight proj.2.weight proj.3.weight pos_block
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 5e-4
+    eta_min: 1e-5
+    warmup_epoch: 5
+    warmup_start_lr: 1e-6
 # data loader for train and eval
@@ -57,17 +63,39 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 224
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: True
     loader:
@@ -85,6 +113,8 @@ DataLoader:
             channel_first: False
         - ResizeImage:
             resize_short: 256
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 224
         - NormalizeImage:
@@ -94,7 +124,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: False
     loader:
@@ -110,6 +140,8 @@ Infer:
         channel_first: False
     - ResizeImage:
         resize_short: 256
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 224
     - NormalizeImage:
@@ -124,9 +156,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
 Metric:
-  Train:
-    - TopkAcc:
-        topk: [1, 5]
   Eval:
     - TopkAcc:
         topk: [1, 5]
@@ -7,7 +7,7 @@ Global:
   save_interval: 1
   eval_during_train: True
   eval_interval: 1
-  epochs: 120
+  epochs: 300
   print_batch_step: 10
   use_visualdl: False
   # used for static mode and model export
@@ -20,28 +20,34 @@ Global:
 Arch:
   name: pcpvt_small
   class_num: 1000
+  drop_rate: 0.0
+  drop_path_rate: 0.2
 # loss function config for training/eval process
 Loss:
   Train:
-    - CELoss:
+    - MixCELoss:
         weight: 1.0
+        epsilon: 0.1
   Eval:
     - CELoss:
         weight: 1.0
 Optimizer:
-  name: Momentum
-  momentum: 0.9
+  name: AdamW
+  beta1: 0.9
+  beta2: 0.999
+  epsilon: 1e-8
+  weight_decay: 0.05
+  no_weight_decay_name: norm cls_token proj.0.weight proj.1.weight proj.2.weight proj.3.weight pos_block
+  one_dim_param_no_weight_decay: True
   lr:
-    name: Piecewise
-    learning_rate: 0.1
-    decay_epochs: [30, 60, 90]
-    values: [0.1, 0.01, 0.001, 0.0001]
-  regularizer:
-    name: 'L2'
-    coeff: 0.0001
+    name: Cosine
+    learning_rate: 5e-4
+    eta_min: 1e-5
+    warmup_epoch: 5
+    warmup_start_lr: 1e-6
 # data loader for train and eval
@@ -57,17 +63,39 @@ DataLoader:
             channel_first: False
         - RandCropImage:
             size: 224
+            interpolation: bicubic
+            backend: pil
         - RandFlipImage:
             flip_code: 1
+        - TimmAutoAugment:
+            config_str: rand-m9-mstd0.5-inc1
+            interpolation: bicubic
+            img_size: 224
         - NormalizeImage:
             scale: 1.0/255.0
             mean: [0.485, 0.456, 0.406]
             std: [0.229, 0.224, 0.225]
             order: ''
+        - RandomErasing:
+            EPSILON: 0.25
+            sl: 0.02
+            sh: 1.0/3.0
+            r1: 0.3
+            attempt: 10
+            use_log_aspect: True
+            mode: pixel
+      batch_transform_ops:
+        - OpSampler:
+            MixupOperator:
+              alpha: 0.8
+              prob: 0.5
+            CutmixOperator:
+              alpha: 1.0
+              prob: 0.5
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: True
     loader:
@@ -85,6 +113,8 @@ DataLoader:
             channel_first: False
         - ResizeImage:
             resize_short: 256
+            interpolation: bicubic
+            backend: pil
         - CropImage:
             size: 224
         - NormalizeImage:
@@ -94,7 +124,7 @@ DataLoader:
             order: ''
     sampler:
       name: DistributedBatchSampler
-      batch_size: 64
+      batch_size: 128
       drop_last: False
       shuffle: False
     loader:
@@ -110,6 +140,8 @@ Infer:
         channel_first: False
     - ResizeImage:
         resize_short: 256
+        interpolation: bicubic
+        backend: pil
     - CropImage:
         size: 224
     - NormalizeImage:
@@ -124,9 +156,6 @@ Infer:
   class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
 Metric:
-  Train:
-    - TopkAcc:
-        topk: [1, 5]
   Eval:
     - TopkAcc:
         topk: [1, 5]
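Note: `no_weight_decay_name` plus `one_dim_param_no_weight_decay: True` exclude normalization/position parameters (and all 1-D tensors such as biases) from the 0.05 AdamW weight decay. A sketch of the intended filtering, assuming each parameter exposes `.name` and `.ndim` (illustrative, not the PaddleClas implementation):

```python
def split_decay_groups(params, skip_names=("norm", "cls_token", "pos_block")):
    decay, no_decay = [], []
    for p in params:
        if p.ndim == 1 or any(s in p.name for s in skip_names):
            no_decay.append(p)  # weight_decay = 0 for these
        else:
            decay.append(p)     # weight_decay = 0.05 from the config
    return decay, no_decay
```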
@@ -54,7 +54,7 @@ Optimizer:
   momentum: 0.9
   lr:
     name: Cosine
-    learning_rate: 0.01
+    learning_rate: 0.04
   regularizer:
     name: 'L2'
     coeff: 0.0001
@@ -84,10 +84,10 @@ DataLoader:
         - RandomErasing:
             EPSILON: 0.5
     sampler:
-      name: DistributedRandomIdentitySampler
+      name: PKSampler
       batch_size: 128
-      num_instances: 2
-      drop_last: False
+      sample_per_id: 2
+      drop_last: True
     loader:
       num_workers: 6
@@ -97,7 +97,7 @@ DataLoader:
     dataset:
       name: LogoDataset
      image_root: "dataset/LogoDet-3K-crop/val/"
-      cls_label_path: "dataset/LogoDet-3K-crop/LogoDet-3K+query.txt"
+      cls_label_path: "dataset/LogoDet-3K-crop/LogoDet-3K+val.txt"
      transform_ops:
        - DecodeImage:
            to_rgb: True
@@ -122,7 +122,7 @@ DataLoader:
     dataset:
       name: LogoDataset
      image_root: "dataset/LogoDet-3K-crop/train/"
-      cls_label_path: "dataset/LogoDet-3K-crop/LogoDet-3K+gallery.txt"
+      cls_label_path: "dataset/LogoDet-3K-crop/LogoDet-3K+train.txt"
      transform_ops:
        - DecodeImage:
            to_rgb: True
......
@@ -54,7 +54,7 @@ Optimizer:
   momentum: 0.9
   lr:
     name: MultiStepDecay
-    learning_rate: 0.01
+    learning_rate: 0.04
     milestones: [30, 60, 70, 80, 90, 100]
     gamma: 0.5
     verbose: False
@@ -90,10 +90,10 @@ DataLoader:
             r1: 0.3
             mean: [0., 0., 0.]
     sampler:
-      name: DistributedRandomIdentitySampler
+      name: PKSampler
       batch_size: 64
-      num_instances: 2
-      drop_last: False
+      sample_per_id: 2
+      drop_last: True
       shuffle: True
     loader:
       num_workers: 4
......
@@ -53,7 +53,7 @@ Optimizer:
   momentum: 0.9
   lr:
     name: Cosine
-    learning_rate: 0.01
+    learning_rate: 0.04
   regularizer:
     name: 'L2'
     coeff: 0.0005
@@ -88,10 +88,10 @@ DataLoader:
             mean: [0., 0., 0.]
     sampler:
-      name: DistributedRandomIdentitySampler
+      name: PKSampler
       batch_size: 128
-      num_instances: 2
-      drop_last: False
+      sample_per_id: 2
+      drop_last: True
       shuffle: True
     loader:
       num_workers: 6
......
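Note: the sampler swap in the three retrieval configs above (DistributedRandomIdentitySampler -> PKSampler) keeps identity-balanced batches and now drops ragged final batches. The batch composition is simple arithmetic:

```python
batch_size, sample_per_id = 128, 2                   # values from the logo config above
identities_per_batch = batch_size // sample_per_id   # P = 64 identities, K = 2 each
```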
# global configs
Global:
  checkpoints: null
  pretrained_model: null
  output_dir: ./output/
  device: gpu
  save_interval: 1
  eval_during_train: True
  eval_interval: 1
  epochs: 10
  print_batch_step: 10
  use_visualdl: False
  # used for static mode and model export
  image_shape: [3, 224, 224]
  save_inference_dir: ./inference
  use_multilabel: True

# model architecture
Arch:
  name: MobileNetV1
  class_num: 33
  pretrained: True

# loss function config for training/eval process
Loss:
  Train:
    - MultiLabelLoss:
        weight: 1.0
  Eval:
    - MultiLabelLoss:
        weight: 1.0

Optimizer:
  name: Momentum
  momentum: 0.9
  lr:
    name: Cosine
    learning_rate: 0.1
  regularizer:
    name: 'L2'
    coeff: 0.00004

# data loader for train and eval
DataLoader:
  Train:
    dataset:
      name: MultiLabelDataset
      image_root: ./dataset/NUS-WIDE-SCENE/NUS-SCENE-dataset/images/
      cls_label_path: ./dataset/NUS-WIDE-SCENE/NUS-SCENE-dataset/multilabel_train_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - RandCropImage:
            size: 224
        - RandFlipImage:
            flip_code: 1
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
    sampler:
      name: DistributedBatchSampler
      batch_size: 64
      drop_last: False
      shuffle: True
    loader:
      num_workers: 4
      use_shared_memory: True

  Eval:
    dataset:
      name: MultiLabelDataset
      image_root: ./dataset/NUS-WIDE-SCENE/NUS-SCENE-dataset/images/
      cls_label_path: ./dataset/NUS-WIDE-SCENE/NUS-SCENE-dataset/multilabel_test_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - ResizeImage:
            resize_short: 256
        - CropImage:
            size: 224
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
    sampler:
      name: DistributedBatchSampler
      batch_size: 256
      drop_last: False
      shuffle: False
    loader:
      num_workers: 4
      use_shared_memory: True

Infer:
  infer_imgs: ./deploy/images/0517_2715693311.jpg
  batch_size: 10
  transforms:
    - DecodeImage:
        to_rgb: True
        channel_first: False
    - ResizeImage:
        resize_short: 256
    - CropImage:
        size: 224
    - NormalizeImage:
        scale: 1.0/255.0
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
    - ToCHWImage:
  PostProcess:
    name: MultiLabelTopk
    topk: 5
    class_id_map_file: None

Metric:
  Train:
    - HammingDistance:
    - AccuracyScore:
  Eval:
    - HammingDistance:
    - AccuracyScore:
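Note: in this multilabel config each of the 33 classes is an independent binary decision, and HammingDistance reports the fraction of label bits that disagree. A worked example, assuming sigmoid outputs thresholded at 0.5 as in the MultiLabelTopk postprocess added below:

```python
import numpy as np

probs = np.array([0.9, 0.2, 0.7, 0.4])  # sigmoid outputs for four classes
pred = (probs >= 0.5).astype(int)       # -> [1, 0, 1, 0]
target = np.array([1, 0, 0, 0])
hamming = (pred != target).mean()       # one wrong bit out of four = 0.25
```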
@@ -26,9 +26,12 @@ from ppcls.data.dataloader.common_dataset import create_operators
 from ppcls.data.dataloader.vehicle_dataset import CompCars, VeriWild
 from ppcls.data.dataloader.logo_dataset import LogoDataset
 from ppcls.data.dataloader.icartoon_dataset import ICartoonDataset
+from ppcls.data.dataloader.mix_dataset import MixDataset

 # sampler
 from ppcls.data.dataloader.DistributedRandomIdentitySampler import DistributedRandomIdentitySampler
+from ppcls.data.dataloader.pk_sampler import PKSampler
+from ppcls.data.dataloader.mix_sampler import MixSampler

 from ppcls.data import preprocess
 from ppcls.data.preprocess import transform
......
from ppcls.data.dataloader.imagenet_dataset import ImageNetDataset
from ppcls.data.dataloader.multilabel_dataset import MultiLabelDataset
from ppcls.data.dataloader.common_dataset import create_operators
from ppcls.data.dataloader.vehicle_dataset import CompCars, VeriWild
from ppcls.data.dataloader.logo_dataset import LogoDataset
from ppcls.data.dataloader.icartoon_dataset import ICartoonDataset
from ppcls.data.dataloader.mix_dataset import MixDataset
from ppcls.data.dataloader.mix_sampler import MixSampler
from ppcls.data.dataloader.pk_sampler import PKSampler
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import print_function
import numpy as np
import os
from paddle.io import Dataset
from .. import dataloader
class MixDataset(Dataset):
    def __init__(self, datasets_config):
        super().__init__()
        self.dataset_list = []
        start_idx = 0
        end_idx = 0
        for config_i in datasets_config:
            dataset_name = config_i.pop('name')
            dataset = getattr(dataloader, dataset_name)(**config_i)
            end_idx += len(dataset)
            self.dataset_list.append([end_idx, start_idx, dataset])
            start_idx = end_idx

        self.length = end_idx

    def __getitem__(self, idx):
        for dataset_i in self.dataset_list:
            if dataset_i[0] > idx:
                dataset_i_idx = idx - dataset_i[1]
                return dataset_i[2][dataset_i_idx]

    def __len__(self):
        return self.length

    def get_dataset_list(self):
        return self.dataset_list
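A hypothetical use of MixDataset (the dataset names and paths below are placeholders; each config dict is consumed via `pop('name')` and dispatched to the matching dataloader class):

```python
datasets_config = [
    {"name": "ImageNetDataset",
     "image_root": "./dataset/a/", "cls_label_path": "./dataset/a/train.txt"},
    {"name": "LogoDataset",
     "image_root": "./dataset/b/", "cls_label_path": "./dataset/b/train.txt"},
]
mix = MixDataset(datasets_config)
# len(mix) is the summed length; mix[i] maps a global index to the
# owning sub-dataset through the [end_idx, start_idx, dataset] records
```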
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from paddle.io import DistributedBatchSampler, Sampler
from ppcls.utils import logger
from ppcls.data.dataloader.mix_dataset import MixDataset
from ppcls.data import dataloader
class MixSampler(DistributedBatchSampler):
    def __init__(self, dataset, batch_size, sample_configs, iter_per_epoch):
        super().__init__(dataset, batch_size)
        assert isinstance(dataset,
                          MixDataset), "MixSampler only support MixDataset"
        self.sampler_list = []
        self.batch_size = batch_size
        self.start_list = []
        self.length = iter_per_epoch
        dataset_list = dataset.get_dataset_list()
        batch_size_left = self.batch_size
        self.iter_list = []
        for i, config_i in enumerate(sample_configs):
            self.start_list.append(dataset_list[i][1])
            sample_method = config_i.pop("name")
            ratio_i = config_i.pop("ratio")
            if i < len(sample_configs) - 1:
                batch_size_i = int(self.batch_size * ratio_i)
                batch_size_left -= batch_size_i
            else:
                batch_size_i = batch_size_left
            assert batch_size_i <= len(dataset_list[i][2])
            config_i["batch_size"] = batch_size_i
            if sample_method == "DistributedBatchSampler":
                sampler_i = DistributedBatchSampler(dataset_list[i][2],
                                                    **config_i)
            else:
                sampler_i = getattr(dataloader, sample_method)(
                    dataset_list[i][2], **config_i)
            self.sampler_list.append(sampler_i)
            self.iter_list.append(iter(sampler_i))
            self.length += len(dataset_list[i][2]) * ratio_i
        self.iter_counter = 0

    def __iter__(self):
        while self.iter_counter < self.length:
            batch = []
            for i, iter_i in enumerate(self.iter_list):
                batch_i = next(iter_i, None)
                if batch_i is None:
                    iter_i = iter(self.sampler_list[i])
                    self.iter_list[i] = iter_i
                    batch_i = next(iter_i, None)
                    assert batch_i is not None, "dataset {} return None".format(i)
                batch += [idx + self.start_list[i] for idx in batch_i]
            if len(batch) == self.batch_size:
                self.iter_counter += 1
                yield batch
            else:
                logger.info("Some dataset reaches end")
        self.iter_counter = 0

    def __len__(self):
        return self.length
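Note: MixSampler splits each global batch by the configured `ratio` and gives the remainder to the last sampler, so the per-dataset shares always sum to `batch_size`; every yielded index is shifted by the sub-dataset's `start_idx` so it addresses the concatenated MixDataset. For example:

```python
batch_size, ratios = 128, [0.6, 0.4]
first = int(batch_size * ratios[0])  # int(76.8) = 76 indices from dataset 0
last = batch_size - first            # 52 indices from dataset 1 (the remainder)
```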
@@ -33,7 +33,7 @@ class MultiLabelDataset(CommonDataset):
         with open(self._cls_path) as fd:
             lines = fd.readlines()
             for l in lines:
-                l = l.strip().split(" ")
+                l = l.strip().split("\t")
                 self.images.append(os.path.join(self._img_root, l[0]))
                 labels = l[1].split(',')
@@ -44,13 +44,14 @@ class MultiLabelDataset(CommonDataset):
     def __getitem__(self, idx):
         try:
-            img = cv2.imread(self.images[idx])
-            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
+            with open(self.images[idx], 'rb') as f:
+                img = f.read()
             if self._transform_ops:
                 img = transform(img, self._transform_ops)
             img = img.transpose((2, 0, 1))
             label = np.array(self.labels[idx]).astype("float32")
             return (img, label)
         except Exception as ex:
             logger.error("Exception occurred when parsing line: {} with msg: {}".
                          format(self.images[idx], ex))
......
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from collections import defaultdict
import numpy as np
import random
from paddle.io import DistributedBatchSampler
from ppcls.utils import logger
class PKSampler(DistributedBatchSampler):
    """
    First, randomly sample P identities.
    Then for each identity randomly sample K instances.
    The batch size is therefore P * K, hence the name PKSampler.

    Args:
        dataset (paddle.io.Dataset): list of (img_path, pid, cam_id).
        sample_per_id (int): number of instances per identity in a batch.
        batch_size (int): number of examples in a batch.
        shuffle (bool): whether to shuffle indices order before generating
            batch indices. Default False.
    """

    def __init__(self,
                 dataset,
                 batch_size,
                 sample_per_id,
                 shuffle=True,
                 drop_last=True,
                 sample_method="sample_avg_prob"):
        super().__init__(
            dataset, batch_size, shuffle=shuffle, drop_last=drop_last)
        assert batch_size % sample_per_id == 0, \
            "PKSampler config error: sample_per_id must be a divisor of batch_size."
        assert hasattr(self.dataset,
                       "labels"), "Dataset must have labels attribute."
        self.sample_per_label = sample_per_id
        self.label_dict = defaultdict(list)
        self.sample_method = sample_method
        for idx, label in enumerate(self.dataset.labels):
            self.label_dict[label].append(idx)
        self.label_list = list(self.label_dict)
        assert len(self.label_list) * self.sample_per_label > self.batch_size, \
            "batch size should be smaller than the number of labels times sample_per_id"
        if self.sample_method == "id_avg_prob":
            self.prob_list = np.array([1 / len(self.label_list)] *
                                      len(self.label_list))
        elif self.sample_method == "sample_avg_prob":
            counter = []
            for label_i in self.label_list:
                counter.append(len(self.label_dict[label_i]))
            self.prob_list = np.array(counter) / sum(counter)
        else:
            logger.error(
                "PKSampler only supports id_avg_prob and sample_avg_prob sample methods, "
                "but received {}.".format(self.sample_method))
        if np.abs(self.prob_list.sum() - 1) > 0.00000001:
            self.prob_list[-1] = 1 - sum(self.prob_list[:-1])
            if self.prob_list[-1] > 1 or self.prob_list[-1] < 0:
                logger.error("PKSampler prob list error")
            else:
                logger.info(
                    "PKSampler: sum of prob list not equal to 1, change the last prob"
                )

    def __iter__(self):
        label_per_batch = self.batch_size // self.sample_per_label
        if self.shuffle:
            np.random.RandomState(self.epoch).shuffle(self.label_list)
        for i in range(len(self)):
            batch_index = []
            batch_label_list = np.random.choice(
                self.label_list,
                size=label_per_batch,
                replace=False,
                p=self.prob_list)
            for label_i in batch_label_list:
                label_i_indexes = self.label_dict[label_i]
                if self.sample_per_label <= len(label_i_indexes):
                    batch_index.extend(
                        np.random.choice(
                            label_i_indexes,
                            size=self.sample_per_label,
                            replace=False))
                else:
                    batch_index.extend(
                        np.random.choice(
                            label_i_indexes,
                            size=self.sample_per_label,
                            replace=True))
            if not self.drop_last or len(batch_index) == self.batch_size:
                yield batch_index
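A hypothetical construction of PKSampler (the toy dataset is a stand-in; the sampler only requires a `labels` attribute):

```python
class ToyDataset:
    def __init__(self):
        # four identities with 3, 2, 3 and 2 instances
        self.labels = [0, 0, 0, 1, 1, 2, 2, 2, 3, 3]

    def __len__(self):
        return len(self.labels)

sampler = PKSampler(ToyDataset(), batch_size=4, sample_per_id=2)
# each yielded batch holds P = 2 identities x K = 2 instances = 4 indices
```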
@@ -16,7 +16,7 @@ import importlib
 from . import topk
-from .topk import Topk
+from .topk import Topk, MultiLabelTopk


 def build_postprocess(config):
......
@@ -45,15 +45,17 @@ class Topk(object):
             class_id_map = None
         return class_id_map

-    def __call__(self, x, file_names=None):
+    def __call__(self, x, file_names=None, multilabel=False):
         assert isinstance(x, paddle.Tensor)
         if file_names is not None:
             assert x.shape[0] == len(file_names)
-        x = F.softmax(x, axis=-1)
+        x = F.softmax(x, axis=-1) if not multilabel else F.sigmoid(x)
         x = x.numpy()
         y = []
         for idx, probs in enumerate(x):
-            index = probs.argsort(axis=0)[-self.topk:][::-1].astype("int32")
+            index = probs.argsort(axis=0)[-self.topk:][::-1].astype(
+                "int32") if not multilabel else np.where(
+                    probs >= 0.5)[0].astype("int32")
             clas_id_list = []
             score_list = []
             label_name_list = []
@@ -73,3 +75,11 @@ class Topk(object):
             result["label_names"] = label_name_list
             y.append(result)
         return y
+
+
+class MultiLabelTopk(Topk):
+    def __init__(self, topk=1, class_id_map_file=None):
+        super().__init__(topk, class_id_map_file)
+
+    def __call__(self, x, file_names=None):
+        return super().__call__(x, file_names, multilabel=True)
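Note: the multilabel path above swaps softmax + top-k for sigmoid + a fixed 0.5 threshold. Independent of PaddleClas, the decode step reduces to:

```python
import numpy as np

logits = np.array([2.2, -1.0, 0.3])
probs = 1 / (1 + np.exp(-logits))     # sigmoid instead of softmax
pred_ids = np.where(probs >= 0.5)[0]  # -> array([0, 2])
```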
@@ -14,6 +14,7 @@
 from ppcls.data.preprocess.ops.autoaugment import ImageNetPolicy as RawImageNetPolicy
 from ppcls.data.preprocess.ops.randaugment import RandAugment as RawRandAugment
+from ppcls.data.preprocess.ops.timm_autoaugment import RawTimmAutoAugment
 from ppcls.data.preprocess.ops.cutout import Cutout
 from ppcls.data.preprocess.ops.hide_and_seek import HideAndSeek
@@ -29,9 +30,8 @@ from ppcls.data.preprocess.ops.operators import NormalizeImage
 from ppcls.data.preprocess.ops.operators import ToCHWImage
 from ppcls.data.preprocess.ops.operators import AugMix
-from ppcls.data.preprocess.batch_ops.batch_operators import MixupOperator, CutmixOperator, FmixOperator
+from ppcls.data.preprocess.batch_ops.batch_operators import MixupOperator, CutmixOperator, OpSampler, FmixOperator
-import six
 import numpy as np
 from PIL import Image
@@ -45,21 +45,16 @@ def transform(data, ops=[]):
 class AutoAugment(RawImageNetPolicy):
     """ ImageNetPolicy wrapper to auto fit different img types """

     def __init__(self, *args, **kwargs):
-        if six.PY2:
-            super(AutoAugment, self).__init__(*args, **kwargs)
-        else:
-            super().__init__(*args, **kwargs)
+        super().__init__(*args, **kwargs)

     def __call__(self, img):
         if not isinstance(img, Image.Image):
             img = np.ascontiguousarray(img)
             img = Image.fromarray(img)
-        if six.PY2:
-            img = super(AutoAugment, self).__call__(img)
-        else:
-            img = super().__call__(img)
+        img = super().__call__(img)
         if isinstance(img, Image.Image):
             img = np.asarray(img)
@@ -69,21 +64,35 @@ class AutoAugment(RawImageNetPolicy):
 class RandAugment(RawRandAugment):
     """ RandAugment wrapper to auto fit different img types """

     def __init__(self, *args, **kwargs):
-        if six.PY2:
-            super(RandAugment, self).__init__(*args, **kwargs)
-        else:
-            super().__init__(*args, **kwargs)
+        super().__init__(*args, **kwargs)

     def __call__(self, img):
         if not isinstance(img, Image.Image):
             img = np.ascontiguousarray(img)
             img = Image.fromarray(img)
-        if six.PY2:
-            img = super(RandAugment, self).__call__(img)
-        else:
-            img = super().__call__(img)
+        img = super().__call__(img)
         if isinstance(img, Image.Image):
             img = np.asarray(img)
         return img
+
+
+class TimmAutoAugment(RawTimmAutoAugment):
+    """ TimmAutoAugment wrapper to auto fit different img types. """
+
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+
+    def __call__(self, img):
+        if not isinstance(img, Image.Image):
+            img = np.ascontiguousarray(img)
+            img = Image.fromarray(img)
+        img = super().__call__(img)
+        if isinstance(img, Image.Image):
+            img = np.asarray(img)
+        return img
......
...@@ -16,13 +16,17 @@ from __future__ import absolute_import ...@@ -16,13 +16,17 @@ from __future__ import absolute_import
from __future__ import division from __future__ import division
from __future__ import print_function from __future__ import print_function
from __future__ import unicode_literals from __future__ import unicode_literals
import random
import numpy as np import numpy as np
from ppcls.utils import logger
from ppcls.data.preprocess.ops.fmix import sample_mask from ppcls.data.preprocess.ops.fmix import sample_mask
class BatchOperator(object): class BatchOperator(object):
""" BatchOperator """ """ BatchOperator """
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
pass pass
...@@ -46,9 +50,20 @@ class BatchOperator(object): ...@@ -46,9 +50,20 @@ class BatchOperator(object):
class MixupOperator(BatchOperator): class MixupOperator(BatchOperator):
""" Mixup operator """ """ Mixup operator """
def __init__(self, alpha=0.2):
assert alpha > 0., \ def __init__(self, alpha: float=1.):
'parameter alpha[%f] should > 0.0' % (alpha) """Build Mixup operator
Args:
alpha (float, optional): The parameter alpha of mixup. Defaults to 1..
Raises:
Exception: The value of parameter is illegal.
"""
if alpha <= 0:
raise Exception(
f"Parameter \"alpha\" of Mixup should be greater than 0. \"alpha\": {alpha}."
)
self._alpha = alpha self._alpha = alpha
def __call__(self, batch): def __call__(self, batch):
...@@ -62,9 +77,20 @@ class MixupOperator(BatchOperator): ...@@ -62,9 +77,20 @@ class MixupOperator(BatchOperator):
class CutmixOperator(BatchOperator): class CutmixOperator(BatchOperator):
""" Cutmix operator """ """ Cutmix operator """
def __init__(self, alpha=0.2): def __init__(self, alpha=0.2):
assert alpha > 0., \ """Build Cutmix operator
'parameter alpha[%f] should > 0.0' % (alpha)
Args:
alpha (float, optional): The parameter alpha of cutmix. Defaults to 0.2.
Raises:
Exception: The value of parameter is illegal.
"""
if alpha <= 0:
raise Exception(
f"Parameter \"alpha\" of Cutmix should be greater than 0. \"alpha\": {alpha}."
)
self._alpha = alpha self._alpha = alpha
    def _rand_bbox(self, size, lam):
...@@ -72,8 +98,8 @@ class CutmixOperator(BatchOperator):
        w = size[2]
        h = size[3]
        cut_rat = np.sqrt(1. - lam)
        cut_w = int(w * cut_rat)
        cut_h = int(h * cut_rat)

        # uniform
        cx = np.random.randint(w)
...@@ -101,6 +127,7 @@ class CutmixOperator(BatchOperator):
class FmixOperator(BatchOperator):
    """ Fmix operator """

    def __init__(self, alpha=1, decay_power=3, max_soft=0., reformulate=False):
        self._alpha = alpha
        self._decay_power = decay_power
...@@ -115,3 +142,42 @@ class FmixOperator(BatchOperator):
            size, self._max_soft, self._reformulate)
        imgs = mask * imgs + (1 - mask) * imgs[idx]
        return list(zip(imgs, labels, labels[idx], [lam] * bs))
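Both Mixup and Cutmix reject alpha <= 0 above because the mixing coefficient is drawn from a Beta(alpha, alpha) distribution, which is undefined otherwise. A minimal sketch of the idea (hypothetical helper, not the PaddleClas implementation):

import numpy as np

def mixup_pair(x_a, x_b, alpha=0.2):
    # lam ~ Beta(alpha, alpha); small alpha favors lam near 0 or 1,
    # alpha = 1 makes lam uniform on [0, 1]
    lam = np.random.beta(alpha, alpha)
    return lam * x_a + (1 - lam) * x_b, lam  # the loss is blended with the same lam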
class OpSampler(object):
    """ Sample an operator from a list of candidate operators according to "prob". """

    def __init__(self, **op_dict):
        """Build OpSampler

        Raises:
            Exception: The parameter "prob" of the operator(s) is set incorrectly.
        """
        if len(op_dict) < 1:
            msg = "ConfigWarning: No operator in \"OpSampler\". \"OpSampler\" has been skipped."
            logger.warning(msg)

        self.ops = {}
        total_prob = 0
        for op_name in op_dict:
            param = op_dict[op_name]
            if "prob" not in param:
                msg = f"ConfigWarning: Parameter \"prob\" should be set when using an operator in \"OpSampler\". The operator \"{op_name}\"'s prob has been set \"0\"."
                logger.warning(msg)
            prob = param.pop("prob", 0)
            total_prob += prob
            op = eval(op_name)(**param)
            self.ops.update({op: prob})

        if total_prob > 1:
            msg = "ConfigError: The total prob of operators in \"OpSampler\" should not exceed 1."
            logger.error(msg)
            raise Exception(msg)

        # add "None Op" when total_prob < 1; the "None Op" does nothing
        self.ops[None] = 1 - total_prob

    def __call__(self, batch):
        op = random.choices(
            list(self.ops.keys()), weights=list(self.ops.values()), k=1)[0]
        # return the batch directly for the "None Op"
        return op(batch) if op else batch
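A usage sketch for OpSampler; the operator names match the classes above, but the alpha/prob values are illustrative:

# Apply Mixup to 50% of batches, Cutmix to 30%, and leave 20% untouched
# (the remaining probability mass goes to the built-in "None Op").
sampler = OpSampler(
    MixupOperator={"alpha": 0.8, "prob": 0.5},
    CutmixOperator={"alpha": 1.0, "prob": 0.3})
# mixed_batch = sampler(batch)  # samples exactly one operator per batch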
...@@ -19,15 +19,62 @@ from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals

from functools import partial

import six
import math
import random
import cv2
import numpy as np
from PIL import Image
from paddle.vision.transforms import ColorJitter as RawColorJitter

from .autoaugment import ImageNetPolicy
from .functional import augmentations
from ppcls.utils import logger
class UnifiedResize(object):
def __init__(self, interpolation=None, backend="cv2"):
_cv2_interp_from_str = {
'nearest': cv2.INTER_NEAREST,
'bilinear': cv2.INTER_LINEAR,
'area': cv2.INTER_AREA,
'bicubic': cv2.INTER_CUBIC,
'lanczos': cv2.INTER_LANCZOS4
}
_pil_interp_from_str = {
'nearest': Image.NEAREST,
'bilinear': Image.BILINEAR,
'bicubic': Image.BICUBIC,
'box': Image.BOX,
'lanczos': Image.LANCZOS,
'hamming': Image.HAMMING
}
def _pil_resize(src, size, resample):
pil_img = Image.fromarray(src)
pil_img = pil_img.resize(size, resample)
return np.asarray(pil_img)
if backend.lower() == "cv2":
if isinstance(interpolation, str):
interpolation = _cv2_interp_from_str[interpolation.lower()]
# compatible with opencv < version 4.4.0
elif not interpolation:
interpolation = cv2.INTER_LINEAR
self.resize_func = partial(cv2.resize, interpolation=interpolation)
elif backend.lower() == "pil":
if isinstance(interpolation, str):
interpolation = _pil_interp_from_str[interpolation.lower()]
self.resize_func = partial(_pil_resize, resample=interpolation)
else:
            logger.warning(
                f"The backend of Resize only supports \"cv2\" or \"PIL\". \"{backend}\" is unavailable. Use \"cv2\" instead."
            )
self.resize_func = cv2.resize
def __call__(self, src, size):
return self.resize_func(src, size)
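A quick sketch of how UnifiedResize is called; the input image below is a placeholder:

import numpy as np

img = np.zeros((480, 640, 3), dtype=np.uint8)  # HWC uint8 placeholder
cv2_resize = UnifiedResize(interpolation="bilinear", backend="cv2")
pil_resize = UnifiedResize(interpolation="bilinear", backend="pil")
# both backends take a (width, height) target and return a NumPy array
out_a = cv2_resize(img, (224, 224))
out_b = pil_resize(img, (224, 224))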
class OperatorParamError(ValueError):
...@@ -67,8 +114,11 @@ class DecodeImage(object):
class ResizeImage(object):
    """ resize image """

    def __init__(self,
                 size=None,
                 resize_short=None,
                 interpolation=None,
                 backend="cv2"):
        if resize_short is not None and resize_short > 0:
            self.resize_short = resize_short
            self.w = None
...@@ -81,6 +131,9 @@ class ResizeImage(object):
            raise OperatorParamError("invalid params for ResizeImage for '\
                'both 'size' and 'resize_short' are None")

        self._resize_func = UnifiedResize(
            interpolation=interpolation, backend=backend)

    def __call__(self, img):
        img_h, img_w = img.shape[:2]
        if self.resize_short is not None:
...@@ -90,10 +143,7 @@ class ResizeImage(object):
        else:
            w = self.w
            h = self.h
        return self._resize_func(img, (w, h))
class CropImage(object):
...@@ -119,9 +169,12 @@ class CropImage(object):
class RandCropImage(object):
    """ random crop image """

    def __init__(self,
                 size,
                 scale=None,
                 ratio=None,
                 interpolation=None,
                 backend="cv2"):
        if type(size) is int:
            self.size = (size, size)  # (h, w)
        else:
...@@ -130,6 +183,9 @@ class RandCropImage(object):
        self.scale = [0.08, 1.0] if scale is None else scale
        self.ratio = [3. / 4., 4. / 3.] if ratio is None else ratio

        self._resize_func = UnifiedResize(
            interpolation=interpolation, backend=backend)

    def __call__(self, img):
        size = self.size
        scale = self.scale
...@@ -155,10 +211,8 @@ class RandCropImage(object):
        j = random.randint(0, img_h - h)

        img = img[j:j + h, i:i + w, :]

        return self._resize_func(img, size)
class RandFlipImage(object):
...@@ -313,3 +367,20 @@ class AugMix(object):
        mixed = (1 - m) * image + m * mix
        return mixed.astype(np.uint8)
class ColorJitter(RawColorJitter):
"""ColorJitter.
"""
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
def __call__(self, img):
if not isinstance(img, Image.Image):
img = np.ascontiguousarray(img)
img = Image.fromarray(img)
img = super()._apply_image(img)
if isinstance(img, Image.Image):
img = np.asarray(img)
return img
...@@ -12,7 +12,9 @@
# See the License for the specific language governing permissions and
# limitations under the License.

# This code is adapted from https://github.com/zhunzhong07/Random-Erasing, and refers to Timm.

from functools import partial

import math
import random

...@@ -20,36 +22,69 @@ import random
import numpy as np


class Pixels(object):
    def __init__(self, mode="const", mean=[0., 0., 0.]):
        self._mode = mode
        self._mean = mean

    def __call__(self, h=224, w=224, c=3):
        if self._mode == "rand":
            return np.random.normal(size=(1, 1, 3))
        elif self._mode == "pixel":
            return np.random.normal(size=(h, w, c))
        elif self._mode == "const":
            return self._mean
        else:
            raise Exception(
                "Invalid mode in RandomErasing, only support \"const\", \"rand\", \"pixel\""
            )


class RandomErasing(object):
    """RandomErasing.
    """

    def __init__(self,
                 EPSILON=0.5,
                 sl=0.02,
                 sh=0.4,
                 r1=0.3,
                 mean=[0., 0., 0.],
                 attempt=100,
                 use_log_aspect=False,
                 mode='const'):
        self.EPSILON = eval(EPSILON) if isinstance(EPSILON, str) else EPSILON
        self.sl = eval(sl) if isinstance(sl, str) else sl
        self.sh = eval(sh) if isinstance(sh, str) else sh
        r1 = eval(r1) if isinstance(r1, str) else r1
        self.r1 = (math.log(r1), math.log(1 / r1)) if use_log_aspect else (
            r1, 1 / r1)
        self.use_log_aspect = use_log_aspect
        self.attempt = attempt
        self.get_pixels = Pixels(mode, mean)

    def __call__(self, img):
        if random.random() > self.EPSILON:
            return img

        for _ in range(self.attempt):
            area = img.shape[0] * img.shape[1]

            target_area = random.uniform(self.sl, self.sh) * area
            aspect_ratio = random.uniform(*self.r1)
            if self.use_log_aspect:
                aspect_ratio = math.exp(aspect_ratio)

            h = int(round(math.sqrt(target_area * aspect_ratio)))
            w = int(round(math.sqrt(target_area / aspect_ratio)))

            if w < img.shape[1] and h < img.shape[0]:
                pixels = self.get_pixels(h, w, img.shape[2])
                x1 = random.randint(0, img.shape[0] - h)
                y1 = random.randint(0, img.shape[1] - w)
                if img.shape[2] == 3:
                    img[x1:x1 + h, y1:y1 + w, :] = pixels
                else:
                    img[x1:x1 + h, y1:y1 + w, 0] = pixels[0]
                return img

        return img
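A short usage sketch of the extended RandomErasing; the parameter values are illustrative:

import numpy as np

# "pixel" fills the erased box with per-pixel Gaussian noise, "rand" with a
# single random color, and "const" with the given mean.
eraser = RandomErasing(EPSILON=0.25, sl=0.02, sh=1 / 3, r1=0.3,
                       mode="pixel", use_log_aspect=True)
img = np.random.rand(224, 224, 3).astype(np.float32)
img = eraser(img)  # erases a random box with probability EPSILON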
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
This code is borrowed from Timm: https://github.com/rwightman/pytorch-image-models.
hacked together by / Copyright 2020 Ross Wightman
"""
import random
import math
import re
from PIL import Image, ImageOps, ImageEnhance, ImageChops
import PIL
import numpy as np
IMAGENET_DEFAULT_MEAN = (0.485, 0.456, 0.406)
_PIL_VER = tuple([int(x) for x in PIL.__version__.split('.')[:2]])
_FILL = (128, 128, 128)
# This signifies the max integer that the controller RNN could predict for the
# augmentation scheme.
_MAX_LEVEL = 10.
_HPARAMS_DEFAULT = dict(
translate_const=250,
img_mean=_FILL, )
_RANDOM_INTERPOLATION = (Image.BILINEAR, Image.BICUBIC)
def _pil_interp(method):
if method == 'bicubic':
return Image.BICUBIC
elif method == 'lanczos':
return Image.LANCZOS
elif method == 'hamming':
return Image.HAMMING
else:
# default bilinear, do we want to allow nearest?
return Image.BILINEAR
def _interpolation(kwargs):
interpolation = kwargs.pop('resample', Image.BILINEAR)
if isinstance(interpolation, (list, tuple)):
return random.choice(interpolation)
else:
return interpolation
def _check_args_tf(kwargs):
if 'fillcolor' in kwargs and _PIL_VER < (5, 0):
kwargs.pop('fillcolor')
kwargs['resample'] = _interpolation(kwargs)
def shear_x(img, factor, **kwargs):
_check_args_tf(kwargs)
return img.transform(img.size, Image.AFFINE, (1, factor, 0, 0, 1, 0),
**kwargs)
def shear_y(img, factor, **kwargs):
_check_args_tf(kwargs)
return img.transform(img.size, Image.AFFINE, (1, 0, 0, factor, 1, 0),
**kwargs)
def translate_x_rel(img, pct, **kwargs):
pixels = pct * img.size[0]
_check_args_tf(kwargs)
return img.transform(img.size, Image.AFFINE, (1, 0, pixels, 0, 1, 0),
**kwargs)
def translate_y_rel(img, pct, **kwargs):
pixels = pct * img.size[1]
_check_args_tf(kwargs)
return img.transform(img.size, Image.AFFINE, (1, 0, 0, 0, 1, pixels),
**kwargs)
def translate_x_abs(img, pixels, **kwargs):
_check_args_tf(kwargs)
return img.transform(img.size, Image.AFFINE, (1, 0, pixels, 0, 1, 0),
**kwargs)
def translate_y_abs(img, pixels, **kwargs):
_check_args_tf(kwargs)
return img.transform(img.size, Image.AFFINE, (1, 0, 0, 0, 1, pixels),
**kwargs)
def rotate(img, degrees, **kwargs):
_check_args_tf(kwargs)
if _PIL_VER >= (5, 2):
return img.rotate(degrees, **kwargs)
elif _PIL_VER >= (5, 0):
w, h = img.size
post_trans = (0, 0)
rotn_center = (w / 2.0, h / 2.0)
angle = -math.radians(degrees)
matrix = [
round(math.cos(angle), 15),
round(math.sin(angle), 15),
0.0,
round(-math.sin(angle), 15),
round(math.cos(angle), 15),
0.0,
]
def transform(x, y, matrix):
(a, b, c, d, e, f) = matrix
return a * x + b * y + c, d * x + e * y + f
matrix[2], matrix[5] = transform(-rotn_center[0] - post_trans[0],
-rotn_center[1] - post_trans[1],
matrix)
matrix[2] += rotn_center[0]
matrix[5] += rotn_center[1]
return img.transform(img.size, Image.AFFINE, matrix, **kwargs)
else:
return img.rotate(degrees, resample=kwargs['resample'])
def auto_contrast(img, **__):
return ImageOps.autocontrast(img)
def invert(img, **__):
return ImageOps.invert(img)
def equalize(img, **__):
return ImageOps.equalize(img)
def solarize(img, thresh, **__):
return ImageOps.solarize(img, thresh)
def solarize_add(img, add, thresh=128, **__):
lut = []
for i in range(256):
if i < thresh:
lut.append(min(255, i + add))
else:
lut.append(i)
if img.mode in ("L", "RGB"):
if img.mode == "RGB" and len(lut) == 256:
lut = lut + lut + lut
return img.point(lut)
else:
return img
def posterize(img, bits_to_keep, **__):
if bits_to_keep >= 8:
return img
return ImageOps.posterize(img, bits_to_keep)
def contrast(img, factor, **__):
return ImageEnhance.Contrast(img).enhance(factor)
def color(img, factor, **__):
return ImageEnhance.Color(img).enhance(factor)
def brightness(img, factor, **__):
return ImageEnhance.Brightness(img).enhance(factor)
def sharpness(img, factor, **__):
return ImageEnhance.Sharpness(img).enhance(factor)
def _randomly_negate(v):
"""With 50% prob, negate the value"""
return -v if random.random() > 0.5 else v
def _rotate_level_to_arg(level, _hparams):
# range [-30, 30]
level = (level / _MAX_LEVEL) * 30.
level = _randomly_negate(level)
return level,
def _enhance_level_to_arg(level, _hparams):
# range [0.1, 1.9]
return (level / _MAX_LEVEL) * 1.8 + 0.1,
def _enhance_increasing_level_to_arg(level, _hparams):
# the 'no change' level is 1.0, moving away from that towards 0. or 2.0 increases the enhancement blend
# range [0.1, 1.9]
level = (level / _MAX_LEVEL) * .9
level = 1.0 + _randomly_negate(level)
return level,
def _shear_level_to_arg(level, _hparams):
# range [-0.3, 0.3]
level = (level / _MAX_LEVEL) * 0.3
level = _randomly_negate(level)
return level,
def _translate_abs_level_to_arg(level, hparams):
translate_const = hparams['translate_const']
level = (level / _MAX_LEVEL) * float(translate_const)
level = _randomly_negate(level)
return level,
def _translate_rel_level_to_arg(level, hparams):
# default range [-0.45, 0.45]
translate_pct = hparams.get('translate_pct', 0.45)
level = (level / _MAX_LEVEL) * translate_pct
level = _randomly_negate(level)
return level,
def _posterize_level_to_arg(level, _hparams):
# As per Tensorflow TPU EfficientNet impl
# range [0, 4], 'keep 0 up to 4 MSB of original image'
# intensity/severity of augmentation decreases with level
return int((level / _MAX_LEVEL) * 4),
def _posterize_increasing_level_to_arg(level, hparams):
# As per Tensorflow models research and UDA impl
# range [4, 0], 'keep 4 down to 0 MSB of original image',
# intensity/severity of augmentation increases with level
return 4 - _posterize_level_to_arg(level, hparams)[0],
def _posterize_original_level_to_arg(level, _hparams):
# As per original AutoAugment paper description
# range [4, 8], 'keep 4 up to 8 MSB of image'
# intensity/severity of augmentation decreases with level
return int((level / _MAX_LEVEL) * 4) + 4,
def _solarize_level_to_arg(level, _hparams):
# range [0, 256]
# intensity/severity of augmentation decreases with level
return int((level / _MAX_LEVEL) * 256),
def _solarize_increasing_level_to_arg(level, _hparams):
# range [0, 256]
# intensity/severity of augmentation increases with level
return 256 - _solarize_level_to_arg(level, _hparams)[0],
def _solarize_add_level_to_arg(level, _hparams):
# range [0, 110]
return int((level / _MAX_LEVEL) * 110),
LEVEL_TO_ARG = {
'AutoContrast': None,
'Equalize': None,
'Invert': None,
'Rotate': _rotate_level_to_arg,
# There are several variations of the posterize level scaling in various Tensorflow/Google repositories/papers
'Posterize': _posterize_level_to_arg,
'PosterizeIncreasing': _posterize_increasing_level_to_arg,
'PosterizeOriginal': _posterize_original_level_to_arg,
'Solarize': _solarize_level_to_arg,
'SolarizeIncreasing': _solarize_increasing_level_to_arg,
'SolarizeAdd': _solarize_add_level_to_arg,
'Color': _enhance_level_to_arg,
'ColorIncreasing': _enhance_increasing_level_to_arg,
'Contrast': _enhance_level_to_arg,
'ContrastIncreasing': _enhance_increasing_level_to_arg,
'Brightness': _enhance_level_to_arg,
'BrightnessIncreasing': _enhance_increasing_level_to_arg,
'Sharpness': _enhance_level_to_arg,
'SharpnessIncreasing': _enhance_increasing_level_to_arg,
'ShearX': _shear_level_to_arg,
'ShearY': _shear_level_to_arg,
'TranslateX': _translate_abs_level_to_arg,
'TranslateY': _translate_abs_level_to_arg,
'TranslateXRel': _translate_rel_level_to_arg,
'TranslateYRel': _translate_rel_level_to_arg,
}
NAME_TO_OP = {
'AutoContrast': auto_contrast,
'Equalize': equalize,
'Invert': invert,
'Rotate': rotate,
'Posterize': posterize,
'PosterizeIncreasing': posterize,
'PosterizeOriginal': posterize,
'Solarize': solarize,
'SolarizeIncreasing': solarize,
'SolarizeAdd': solarize_add,
'Color': color,
'ColorIncreasing': color,
'Contrast': contrast,
'ContrastIncreasing': contrast,
'Brightness': brightness,
'BrightnessIncreasing': brightness,
'Sharpness': sharpness,
'SharpnessIncreasing': sharpness,
'ShearX': shear_x,
'ShearY': shear_y,
'TranslateX': translate_x_abs,
'TranslateY': translate_y_abs,
'TranslateXRel': translate_x_rel,
'TranslateYRel': translate_y_rel,
}
class AugmentOp(object):
def __init__(self, name, prob=0.5, magnitude=10, hparams=None):
hparams = hparams or _HPARAMS_DEFAULT
self.aug_fn = NAME_TO_OP[name]
self.level_fn = LEVEL_TO_ARG[name]
self.prob = prob
self.magnitude = magnitude
self.hparams = hparams.copy()
self.kwargs = dict(
fillcolor=hparams['img_mean'] if 'img_mean' in hparams else _FILL,
resample=hparams['interpolation']
if 'interpolation' in hparams else _RANDOM_INTERPOLATION, )
# If magnitude_std is > 0, we introduce some randomness
# in the usually fixed policy and sample magnitude from a normal distribution
# with mean `magnitude` and std-dev of `magnitude_std`.
# NOTE This is my own hack, being tested, not in papers or reference impls.
self.magnitude_std = self.hparams.get('magnitude_std', 0)
def __call__(self, img):
if self.prob < 1.0 and random.random() > self.prob:
return img
magnitude = self.magnitude
if self.magnitude_std and self.magnitude_std > 0:
magnitude = random.gauss(magnitude, self.magnitude_std)
magnitude = min(_MAX_LEVEL, max(0, magnitude)) # clip to valid range
level_args = self.level_fn(
magnitude, self.hparams) if self.level_fn is not None else tuple()
return self.aug_fn(img, *level_args, **self.kwargs)
def auto_augment_policy_v0(hparams):
# ImageNet v0 policy from TPU EfficientNet impl, cannot find a paper reference.
policy = [
[('Equalize', 0.8, 1), ('ShearY', 0.8, 4)],
[('Color', 0.4, 9), ('Equalize', 0.6, 3)],
[('Color', 0.4, 1), ('Rotate', 0.6, 8)],
[('Solarize', 0.8, 3), ('Equalize', 0.4, 7)],
[('Solarize', 0.4, 2), ('Solarize', 0.6, 2)],
[('Color', 0.2, 0), ('Equalize', 0.8, 8)],
[('Equalize', 0.4, 8), ('SolarizeAdd', 0.8, 3)],
[('ShearX', 0.2, 9), ('Rotate', 0.6, 8)],
[('Color', 0.6, 1), ('Equalize', 1.0, 2)],
[('Invert', 0.4, 9), ('Rotate', 0.6, 0)],
[('Equalize', 1.0, 9), ('ShearY', 0.6, 3)],
[('Color', 0.4, 7), ('Equalize', 0.6, 0)],
[('Posterize', 0.4, 6), ('AutoContrast', 0.4, 7)],
[('Solarize', 0.6, 8), ('Color', 0.6, 9)],
[('Solarize', 0.2, 4), ('Rotate', 0.8, 9)],
[('Rotate', 1.0, 7), ('TranslateYRel', 0.8, 9)],
[('ShearX', 0.0, 0), ('Solarize', 0.8, 4)],
[('ShearY', 0.8, 0), ('Color', 0.6, 4)],
[('Color', 1.0, 0), ('Rotate', 0.6, 2)],
[('Equalize', 0.8, 4), ('Equalize', 0.0, 8)],
[('Equalize', 1.0, 4), ('AutoContrast', 0.6, 2)],
[('ShearY', 0.4, 7), ('SolarizeAdd', 0.6, 7)],
[('Posterize', 0.8, 2), ('Solarize', 0.6, 10)
], # This results in black image with Tpu posterize
[('Solarize', 0.6, 8), ('Equalize', 0.6, 1)],
[('Color', 0.8, 6), ('Rotate', 0.4, 5)],
]
pc = [[AugmentOp(*a, hparams=hparams) for a in sp] for sp in policy]
return pc
def auto_augment_policy_v0r(hparams):
# ImageNet v0 policy from TPU EfficientNet impl, with variation of Posterize used
# in Google research implementation (number of bits discarded increases with magnitude)
policy = [
[('Equalize', 0.8, 1), ('ShearY', 0.8, 4)],
[('Color', 0.4, 9), ('Equalize', 0.6, 3)],
[('Color', 0.4, 1), ('Rotate', 0.6, 8)],
[('Solarize', 0.8, 3), ('Equalize', 0.4, 7)],
[('Solarize', 0.4, 2), ('Solarize', 0.6, 2)],
[('Color', 0.2, 0), ('Equalize', 0.8, 8)],
[('Equalize', 0.4, 8), ('SolarizeAdd', 0.8, 3)],
[('ShearX', 0.2, 9), ('Rotate', 0.6, 8)],
[('Color', 0.6, 1), ('Equalize', 1.0, 2)],
[('Invert', 0.4, 9), ('Rotate', 0.6, 0)],
[('Equalize', 1.0, 9), ('ShearY', 0.6, 3)],
[('Color', 0.4, 7), ('Equalize', 0.6, 0)],
[('PosterizeIncreasing', 0.4, 6), ('AutoContrast', 0.4, 7)],
[('Solarize', 0.6, 8), ('Color', 0.6, 9)],
[('Solarize', 0.2, 4), ('Rotate', 0.8, 9)],
[('Rotate', 1.0, 7), ('TranslateYRel', 0.8, 9)],
[('ShearX', 0.0, 0), ('Solarize', 0.8, 4)],
[('ShearY', 0.8, 0), ('Color', 0.6, 4)],
[('Color', 1.0, 0), ('Rotate', 0.6, 2)],
[('Equalize', 0.8, 4), ('Equalize', 0.0, 8)],
[('Equalize', 1.0, 4), ('AutoContrast', 0.6, 2)],
[('ShearY', 0.4, 7), ('SolarizeAdd', 0.6, 7)],
[('PosterizeIncreasing', 0.8, 2), ('Solarize', 0.6, 10)],
[('Solarize', 0.6, 8), ('Equalize', 0.6, 1)],
[('Color', 0.8, 6), ('Rotate', 0.4, 5)],
]
pc = [[AugmentOp(*a, hparams=hparams) for a in sp] for sp in policy]
return pc
def auto_augment_policy_original(hparams):
# ImageNet policy from https://arxiv.org/abs/1805.09501
policy = [
[('PosterizeOriginal', 0.4, 8), ('Rotate', 0.6, 9)],
[('Solarize', 0.6, 5), ('AutoContrast', 0.6, 5)],
[('Equalize', 0.8, 8), ('Equalize', 0.6, 3)],
[('PosterizeOriginal', 0.6, 7), ('PosterizeOriginal', 0.6, 6)],
[('Equalize', 0.4, 7), ('Solarize', 0.2, 4)],
[('Equalize', 0.4, 4), ('Rotate', 0.8, 8)],
[('Solarize', 0.6, 3), ('Equalize', 0.6, 7)],
[('PosterizeOriginal', 0.8, 5), ('Equalize', 1.0, 2)],
[('Rotate', 0.2, 3), ('Solarize', 0.6, 8)],
[('Equalize', 0.6, 8), ('PosterizeOriginal', 0.4, 6)],
[('Rotate', 0.8, 8), ('Color', 0.4, 0)],
[('Rotate', 0.4, 9), ('Equalize', 0.6, 2)],
[('Equalize', 0.0, 7), ('Equalize', 0.8, 8)],
[('Invert', 0.6, 4), ('Equalize', 1.0, 8)],
[('Color', 0.6, 4), ('Contrast', 1.0, 8)],
[('Rotate', 0.8, 8), ('Color', 1.0, 2)],
[('Color', 0.8, 8), ('Solarize', 0.8, 7)],
[('Sharpness', 0.4, 7), ('Invert', 0.6, 8)],
[('ShearX', 0.6, 5), ('Equalize', 1.0, 9)],
[('Color', 0.4, 0), ('Equalize', 0.6, 3)],
[('Equalize', 0.4, 7), ('Solarize', 0.2, 4)],
[('Solarize', 0.6, 5), ('AutoContrast', 0.6, 5)],
[('Invert', 0.6, 4), ('Equalize', 1.0, 8)],
[('Color', 0.6, 4), ('Contrast', 1.0, 8)],
[('Equalize', 0.8, 8), ('Equalize', 0.6, 3)],
]
pc = [[AugmentOp(*a, hparams=hparams) for a in sp] for sp in policy]
return pc
def auto_augment_policy_originalr(hparams):
# ImageNet policy from https://arxiv.org/abs/1805.09501 with research posterize variation
policy = [
[('PosterizeIncreasing', 0.4, 8), ('Rotate', 0.6, 9)],
[('Solarize', 0.6, 5), ('AutoContrast', 0.6, 5)],
[('Equalize', 0.8, 8), ('Equalize', 0.6, 3)],
[('PosterizeIncreasing', 0.6, 7), ('PosterizeIncreasing', 0.6, 6)],
[('Equalize', 0.4, 7), ('Solarize', 0.2, 4)],
[('Equalize', 0.4, 4), ('Rotate', 0.8, 8)],
[('Solarize', 0.6, 3), ('Equalize', 0.6, 7)],
[('PosterizeIncreasing', 0.8, 5), ('Equalize', 1.0, 2)],
[('Rotate', 0.2, 3), ('Solarize', 0.6, 8)],
[('Equalize', 0.6, 8), ('PosterizeIncreasing', 0.4, 6)],
[('Rotate', 0.8, 8), ('Color', 0.4, 0)],
[('Rotate', 0.4, 9), ('Equalize', 0.6, 2)],
[('Equalize', 0.0, 7), ('Equalize', 0.8, 8)],
[('Invert', 0.6, 4), ('Equalize', 1.0, 8)],
[('Color', 0.6, 4), ('Contrast', 1.0, 8)],
[('Rotate', 0.8, 8), ('Color', 1.0, 2)],
[('Color', 0.8, 8), ('Solarize', 0.8, 7)],
[('Sharpness', 0.4, 7), ('Invert', 0.6, 8)],
[('ShearX', 0.6, 5), ('Equalize', 1.0, 9)],
[('Color', 0.4, 0), ('Equalize', 0.6, 3)],
[('Equalize', 0.4, 7), ('Solarize', 0.2, 4)],
[('Solarize', 0.6, 5), ('AutoContrast', 0.6, 5)],
[('Invert', 0.6, 4), ('Equalize', 1.0, 8)],
[('Color', 0.6, 4), ('Contrast', 1.0, 8)],
[('Equalize', 0.8, 8), ('Equalize', 0.6, 3)],
]
pc = [[AugmentOp(*a, hparams=hparams) for a in sp] for sp in policy]
return pc
def auto_augment_policy(name='v0', hparams=None):
hparams = hparams or _HPARAMS_DEFAULT
if name == 'original':
return auto_augment_policy_original(hparams)
elif name == 'originalr':
return auto_augment_policy_originalr(hparams)
elif name == 'v0':
return auto_augment_policy_v0(hparams)
elif name == 'v0r':
return auto_augment_policy_v0r(hparams)
else:
assert False, 'Unknown AA policy (%s)' % name
class AutoAugment(object):
def __init__(self, policy):
self.policy = policy
def __call__(self, img):
sub_policy = random.choice(self.policy)
for op in sub_policy:
img = op(img)
return img
def auto_augment_transform(config_str, hparams):
"""
Create an AutoAugment transform
:param config_str: String defining configuration of auto augmentation. Consists of multiple sections separated by
dashes ('-'). The first section defines the AutoAugment policy (one of 'v0', 'v0r', 'original', 'originalr').
The remaining sections, not order specific, determine
'mstd' - float std deviation of magnitude noise applied
Ex 'original-mstd0.5' results in AutoAugment with original policy, magnitude_std 0.5
:param hparams: Other hparams (kwargs) for the AutoAugmentation scheme
:return: A callable Transform Op
"""
config = config_str.split('-')
policy_name = config[0]
config = config[1:]
for c in config:
cs = re.split(r'(\d.*)', c)
if len(cs) < 2:
continue
key, val = cs[:2]
if key == 'mstd':
# noise param injected via hparams for now
hparams.setdefault('magnitude_std', float(val))
else:
assert False, 'Unknown AutoAugment config section'
aa_policy = auto_augment_policy(policy_name, hparams=hparams)
return AutoAugment(aa_policy)
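For example, the call below (an illustrative sketch) builds the original-paper policy with magnitude noise of std 0.5; a copy of the default hparams is passed because the function mutates it via setdefault:

aa = auto_augment_transform('original-mstd0.5', dict(_HPARAMS_DEFAULT))
# aug_img = aa(pil_img)  # expects a PIL.Image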
_RAND_TRANSFORMS = [
'AutoContrast',
'Equalize',
'Invert',
'Rotate',
'Posterize',
'Solarize',
'SolarizeAdd',
'Color',
'Contrast',
'Brightness',
'Sharpness',
'ShearX',
'ShearY',
'TranslateXRel',
'TranslateYRel',
#'Cutout' # NOTE I've implemented this as random erasing separately
]
_RAND_INCREASING_TRANSFORMS = [
'AutoContrast',
'Equalize',
'Invert',
'Rotate',
'PosterizeIncreasing',
'SolarizeIncreasing',
'SolarizeAdd',
'ColorIncreasing',
'ContrastIncreasing',
'BrightnessIncreasing',
'SharpnessIncreasing',
'ShearX',
'ShearY',
'TranslateXRel',
'TranslateYRel',
#'Cutout' # NOTE I've implemented this as random erasing separately
]
# These experimental weights are based loosely on the relative improvements mentioned in the paper.
# They may not result in increased performance, but could likely be tuned to do so.
_RAND_CHOICE_WEIGHTS_0 = {
'Rotate': 0.3,
'ShearX': 0.2,
'ShearY': 0.2,
'TranslateXRel': 0.1,
'TranslateYRel': 0.1,
'Color': .025,
'Sharpness': 0.025,
'AutoContrast': 0.025,
'Solarize': .005,
'SolarizeAdd': .005,
'Contrast': .005,
'Brightness': .005,
'Equalize': .005,
'Posterize': 0,
'Invert': 0,
}
def _select_rand_weights(weight_idx=0, transforms=None):
transforms = transforms or _RAND_TRANSFORMS
assert weight_idx == 0 # only one set of weights currently
rand_weights = _RAND_CHOICE_WEIGHTS_0
probs = [rand_weights[k] for k in transforms]
probs /= np.sum(probs)
return probs
def rand_augment_ops(magnitude=10, hparams=None, transforms=None):
hparams = hparams or _HPARAMS_DEFAULT
transforms = transforms or _RAND_TRANSFORMS
return [
AugmentOp(
name, prob=0.5, magnitude=magnitude, hparams=hparams)
for name in transforms
]
class RandAugment(object):
def __init__(self, ops, num_layers=2, choice_weights=None):
self.ops = ops
self.num_layers = num_layers
self.choice_weights = choice_weights
def __call__(self, img):
# no replacement when using weighted choice
ops = np.random.choice(
self.ops,
self.num_layers,
replace=self.choice_weights is None,
p=self.choice_weights)
for op in ops:
img = op(img)
return img
def rand_augment_transform(config_str, hparams):
"""
Create a RandAugment transform
:param config_str: String defining configuration of random augmentation. Consists of multiple sections separated by
dashes ('-'). The first section defines the specific variant of rand augment (currently only 'rand'). The remaining
sections, not order specific, determine
'm' - integer magnitude of rand augment
'n' - integer num layers (number of transform ops selected per image)
'w' - integer probability weight index (index of a set of weights to influence choice of op)
'mstd' - float std deviation of magnitude noise applied
'inc' - integer (bool), use augmentations that increase in severity with magnitude (default: 0)
Ex 'rand-m9-n3-mstd0.5' results in RandAugment with magnitude 9, num_layers 3, magnitude_std 0.5
'rand-mstd1-w0' results in magnitude_std 1.0, weights 0, default magnitude of 10 and num_layers 2
:param hparams: Other hparams (kwargs) for the RandAugmentation scheme
:return: A callable Transform Op
"""
magnitude = _MAX_LEVEL # default to _MAX_LEVEL for magnitude (currently 10)
num_layers = 2 # default to 2 ops per image
weight_idx = None # default to no probability weights for op choice
transforms = _RAND_TRANSFORMS
config = config_str.split('-')
assert config[0] == 'rand'
config = config[1:]
for c in config:
cs = re.split(r'(\d.*)', c)
if len(cs) < 2:
continue
key, val = cs[:2]
if key == 'mstd':
# noise param injected via hparams for now
hparams.setdefault('magnitude_std', float(val))
elif key == 'inc':
if bool(val):
transforms = _RAND_INCREASING_TRANSFORMS
elif key == 'm':
magnitude = int(val)
elif key == 'n':
num_layers = int(val)
elif key == 'w':
weight_idx = int(val)
else:
assert False, 'Unknown RandAugment config section'
ra_ops = rand_augment_ops(
magnitude=magnitude, hparams=hparams, transforms=transforms)
choice_weights = None if weight_idx is None else _select_rand_weights(
weight_idx)
return RandAugment(ra_ops, num_layers, choice_weights=choice_weights)
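An illustrative sketch of the config string parsing above:

aa_params = dict(translate_const=100, img_mean=(128, 128, 128))
ra = rand_augment_transform('rand-m9-n2-mstd0.5-inc1', aa_params)
# selects 2 ops per image at magnitude ~9 (Gaussian std 0.5) from the
# "increasing" transform list; each selected op still fires with prob 0.5
# aug_img = ra(pil_img)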
_AUGMIX_TRANSFORMS = [
'AutoContrast',
'ColorIncreasing', # not in paper
'ContrastIncreasing', # not in paper
'BrightnessIncreasing', # not in paper
'SharpnessIncreasing', # not in paper
'Equalize',
'Rotate',
'PosterizeIncreasing',
'SolarizeIncreasing',
'ShearX',
'ShearY',
'TranslateXRel',
'TranslateYRel',
]
def augmix_ops(magnitude=10, hparams=None, transforms=None):
hparams = hparams or _HPARAMS_DEFAULT
transforms = transforms or _AUGMIX_TRANSFORMS
return [
AugmentOp(
name, prob=1.0, magnitude=magnitude, hparams=hparams)
for name in transforms
]
class AugMixAugment(object):
""" AugMix Transform
Adapted and improved from impl here: https://github.com/google-research/augmix/blob/master/imagenet.py
From paper: 'AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty -
https://arxiv.org/abs/1912.02781
"""
def __init__(self, ops, alpha=1., width=3, depth=-1, blended=False):
self.ops = ops
self.alpha = alpha
self.width = width
self.depth = depth
self.blended = blended # blended mode is faster but not well tested
def _calc_blended_weights(self, ws, m):
ws = ws * m
cump = 1.
rws = []
for w in ws[::-1]:
alpha = w / cump
cump *= (1 - alpha)
rws.append(alpha)
return np.array(rws[::-1], dtype=np.float32)
def _apply_blended(self, img, mixing_weights, m):
# This is my first crack and implementing a slightly faster mixed augmentation. Instead
# of accumulating the mix for each chain in a Numpy array and then blending with original,
# it recomputes the blending coefficients and applies one PIL image blend per chain.
# TODO the results appear in the right ballpark but they differ by more than rounding.
img_orig = img.copy()
ws = self._calc_blended_weights(mixing_weights, m)
for w in ws:
depth = self.depth if self.depth > 0 else np.random.randint(1, 4)
ops = np.random.choice(self.ops, depth, replace=True)
img_aug = img_orig # no ops are in-place, deep copy not necessary
for op in ops:
img_aug = op(img_aug)
img = Image.blend(img, img_aug, w)
return img
def _apply_basic(self, img, mixing_weights, m):
# This is a literal adaptation of the paper/official implementation without normalizations and
# PIL <-> Numpy conversions between every op. It is still quite CPU compute heavy compared to the
# typical augmentation transforms, could use a GPU / Kornia implementation.
img_shape = img.size[0], img.size[1], len(img.getbands())
mixed = np.zeros(img_shape, dtype=np.float32)
for mw in mixing_weights:
depth = self.depth if self.depth > 0 else np.random.randint(1, 4)
ops = np.random.choice(self.ops, depth, replace=True)
img_aug = img # no ops are in-place, deep copy not necessary
for op in ops:
img_aug = op(img_aug)
mixed += mw * np.asarray(img_aug, dtype=np.float32)
np.clip(mixed, 0, 255., out=mixed)
mixed = Image.fromarray(mixed.astype(np.uint8))
return Image.blend(img, mixed, m)
def __call__(self, img):
mixing_weights = np.float32(
np.random.dirichlet([self.alpha] * self.width))
m = np.float32(np.random.beta(self.alpha, self.alpha))
if self.blended:
mixed = self._apply_blended(img, mixing_weights, m)
else:
mixed = self._apply_basic(img, mixing_weights, m)
return mixed
def augment_and_mix_transform(config_str, hparams):
""" Create AugMix transform
:param config_str: String defining configuration of random augmentation. Consists of multiple sections separated by
dashes ('-'). The first section defines the specific variant (currently only 'augmix'). The remaining
sections, not order specific, determine
'm' - integer magnitude (severity) of augmentation mix (default: 3)
'w' - integer width of augmentation chain (default: 3)
'd' - integer depth of augmentation chain (-1 is random [1, 3], default: -1)
'b' - integer (bool), blend each branch of chain into end result without a final blend, less CPU (default: 0)
'mstd' - float std deviation of magnitude noise applied (default: 0)
Ex 'augmix-m5-w4-d2' results in AugMix with severity 5, chain width 4, chain depth 2
:param hparams: Other hparams (kwargs) for the Augmentation transforms
:return: A callable Transform Op
"""
magnitude = 3
width = 3
depth = -1
alpha = 1.
blended = False
config = config_str.split('-')
assert config[0] == 'augmix'
config = config[1:]
for c in config:
cs = re.split(r'(\d.*)', c)
if len(cs) < 2:
continue
key, val = cs[:2]
if key == 'mstd':
# noise param injected via hparams for now
hparams.setdefault('magnitude_std', float(val))
elif key == 'm':
magnitude = int(val)
elif key == 'w':
width = int(val)
elif key == 'd':
depth = int(val)
elif key == 'a':
alpha = float(val)
elif key == 'b':
blended = bool(val)
else:
assert False, 'Unknown AugMix config section'
ops = augmix_ops(magnitude=magnitude, hparams=hparams)
return AugMixAugment(
ops, alpha=alpha, width=width, depth=depth, blended=blended)
class RawTimmAutoAugment(object):
"""TimmAutoAugment API for PaddleClas."""
def __init__(self,
config_str="rand-m9-mstd0.5-inc1",
interpolation="bicubic",
img_size=224,
mean=IMAGENET_DEFAULT_MEAN):
if isinstance(img_size, (tuple, list)):
img_size_min = min(img_size)
else:
img_size_min = img_size
aa_params = dict(
translate_const=int(img_size_min * 0.45),
img_mean=tuple([min(255, round(255 * x)) for x in mean]), )
if interpolation and interpolation != 'random':
aa_params['interpolation'] = _pil_interp(interpolation)
if config_str.startswith('rand'):
self.augment_func = rand_augment_transform(config_str, aa_params)
elif config_str.startswith('augmix'):
aa_params['translate_pct'] = 0.3
self.augment_func = augment_and_mix_transform(config_str,
aa_params)
elif config_str.startswith('auto'):
self.augment_func = auto_augment_transform(config_str, aa_params)
else:
raise Exception(
"ConfigError: The TimmAutoAugment Op only support RandAugment, AutoAugment, AugMix, and the config_str only starts with \"rand\", \"augmix\", \"auto\"."
)
def __call__(self, img):
return self.augment_func(img)
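A usage sketch of the wrapper; the config string shown is the module's default:

timm_aa = RawTimmAutoAugment(config_str="rand-m9-mstd0.5-inc1",
                             interpolation="bicubic", img_size=224)
# aug_img = timm_aa(pil_img)  # expects a PIL.Image, like the ops above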
...@@ -200,7 +200,7 @@ class Engine(object):
        if self.mode == 'train':
            self.optimizer, self.lr_sch = build_optimizer(
                self.config["Optimizer"], self.config["Global"]["epochs"],
                len(self.train_dataloader), [self.model])

        # for distributed
        self.config["Global"][
...@@ -355,7 +355,8 @@ class Engine(object):
    def export(self):
        assert self.mode == "export"
        use_multilabel = self.config["Global"].get("use_multilabel", False)
        model = ExportModel(self.config["Arch"], self.model, use_multilabel)
        if self.config["Global"]["pretrained_model"] is not None:
            load_dygraph_pretrain(model.base_model,
                                  self.config["Global"]["pretrained_model"])
...@@ -388,10 +389,9 @@ class ExportModel(nn.Layer):
    ExportModel: add softmax onto the model
    """

    def __init__(self, config, model, use_multilabel):
        super().__init__()
        self.base_model = model
        # we should choose a final model to export
        if isinstance(self.base_model, DistillationModel):
            self.infer_model_name = config["infer_model_name"]
...@@ -402,10 +402,13 @@ class ExportModel(nn.Layer):
        if self.infer_output_key == "features" and isinstance(self.base_model,
                                                              RecModel):
            self.base_model.head = IdentityHead()
        if use_multilabel:
            self.out_act = nn.Sigmoid()
        else:
            if config.get("infer_add_softmax", True):
                self.out_act = nn.Softmax(axis=-1)
            else:
                self.out_act = None

    def eval(self):
        self.training = False
...@@ -421,6 +424,6 @@ class ExportModel(nn.Layer):
        x = x[self.infer_model_name]
        if self.infer_output_key is not None:
            x = x[self.infer_output_key]
        if self.out_act is not None:
            x = self.out_act(x)
        return x
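The out_act switch matters for multilabel export: softmax forces class scores to compete and sum to 1, while sigmoid scores each class independently. A small illustration (values are arbitrary):

import paddle
import paddle.nn as nn

logits = paddle.to_tensor([[2.0, 1.5, -3.0]])
print(nn.Softmax(axis=-1)(logits))  # sums to 1: single-label confidences
print(nn.Sigmoid()(logits))         # independent per-class probabilities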
...@@ -22,7 +22,7 @@ from ppcls.utils.misc import AverageMeter
from ppcls.utils import logger


def classification_eval(engine, epoch_id=0):
    output_info = dict()
    time_info = {
        "batch_cost": AverageMeter(
...@@ -30,21 +30,19 @@ def classification_eval(evaler, epoch_id=0):
        "reader_cost": AverageMeter(
            "reader_cost", ".5f", postfix=" s,"),
    }
    print_batch_step = engine.config["Global"]["print_batch_step"]

    metric_key = None
    tic = time.time()
    max_iter = len(engine.eval_dataloader) - 1 if platform.system(
    ) == "Windows" else len(engine.eval_dataloader)
    for iter_id, batch in enumerate(engine.eval_dataloader):
        if iter_id >= max_iter:
            break
        if iter_id == 5:
            for key in time_info:
                time_info[key].reset()
        if engine.use_dali:
            batch = [
                paddle.to_tensor(batch[0]['data']),
                paddle.to_tensor(batch[0]['label'])
...@@ -52,19 +50,20 @@ def classification_eval(evaler, epoch_id=0):
        time_info["reader_cost"].update(time.time() - tic)
        batch_size = batch[0].shape[0]
        batch[0] = paddle.to_tensor(batch[0]).astype("float32")
        if not engine.config["Global"].get("use_multilabel", False):
            batch[1] = batch[1].reshape([-1, 1]).astype("int64")
        # image input
        out = engine.model(batch[0])
        # calc loss
        if engine.eval_loss_func is not None:
            loss_dict = engine.eval_loss_func(out, batch[1])
            for key in loss_dict:
                if key not in output_info:
                    output_info[key] = AverageMeter(key, '7.5f')
                output_info[key].update(loss_dict[key].numpy()[0], batch_size)
        # calc metric
        if engine.eval_metric_func is not None:
            metric_dict = engine.eval_metric_func(out, batch[1])
            if paddle.distributed.get_world_size() > 1:
                for key in metric_dict:
                    paddle.distributed.all_reduce(
...@@ -97,18 +96,18 @@ def classification_eval(evaler, epoch_id=0):
            ])
            logger.info("[Eval][Epoch {}][Iter: {}/{}]{}, {}, {}".format(
                epoch_id, iter_id,
                len(engine.eval_dataloader), metric_msg, time_msg, ips_msg))

            tic = time.time()
    if engine.use_dali:
        engine.eval_dataloader.reset()
    metric_msg = ", ".join([
        "{}: {:.5f}".format(key, output_info[key].avg) for key in output_info
    ])
    logger.info("[Eval][Epoch {}][Avg]{}".format(epoch_id, metric_msg))

    # do not try to save best eval.model
    if engine.eval_metric_func is None:
        return -1
    # return 1st metric in the dict
    return output_info[metric_key].avg
...@@ -20,21 +20,21 @@ import paddle
from ppcls.utils import logger


def retrieval_eval(engine, epoch_id=0):
    engine.model.eval()
    # step1. build gallery
    if engine.gallery_query_dataloader is not None:
        gallery_feas, gallery_img_id, gallery_unique_id = cal_feature(
            engine, name='gallery_query')
        query_feas, query_img_id, query_query_id = gallery_feas, gallery_img_id, gallery_unique_id
    else:
        gallery_feas, gallery_img_id, gallery_unique_id = cal_feature(
            engine, name='gallery')
        query_feas, query_img_id, query_query_id = cal_feature(
            engine, name='query')

    # step2. do evaluation
    sim_block_size = engine.config["Global"].get("sim_block_size", 64)
    sections = [sim_block_size] * (len(query_feas) // sim_block_size)
    if len(query_feas) % sim_block_size:
        sections.append(len(query_feas) % sim_block_size)
...@@ -45,7 +45,7 @@ def retrieval_eval(evaler, epoch_id=0):
    image_id_blocks = paddle.split(query_img_id, num_or_sections=sections)

    metric_key = None
    if engine.eval_loss_func is None:
        metric_dict = {metric_key: 0.}
    else:
        metric_dict = dict()
...@@ -65,7 +65,7 @@ def retrieval_eval(evaler, epoch_id=0):
            else:
                keep_mask = None

            metric_tmp = engine.eval_metric_func(similarity_matrix,
                                                 image_id_blocks[block_idx],
                                                 gallery_img_id, keep_mask)
...@@ -88,32 +88,31 @@ def retrieval_eval(evaler, epoch_id=0):
    return metric_dict[metric_key]


def cal_feature(engine, name='gallery'):
    all_feas = None
    all_image_id = None
    all_unique_id = None
    has_unique_id = False

    if name == 'gallery':
        dataloader = engine.gallery_dataloader
    elif name == 'query':
        dataloader = engine.query_dataloader
    elif name == 'gallery_query':
        dataloader = engine.gallery_query_dataloader
    else:
        raise RuntimeError("Only support gallery or query dataset")

    max_iter = len(dataloader) - 1 if platform.system() == "Windows" else len(
        dataloader)
    for idx, batch in enumerate(dataloader):  # load is very time-consuming
        if idx >= max_iter:
            break
        if idx % engine.config["Global"]["print_batch_step"] == 0:
            logger.info(
                f"{name} feature calculation process: [{idx}/{len(dataloader)}]"
            )
        if engine.use_dali:
            batch = [
                paddle.to_tensor(batch[0]['data']),
                paddle.to_tensor(batch[0]['label'])
...@@ -123,20 +122,20 @@ def cal_feature(evaler, name='gallery'):
        if len(batch) == 3:
            has_unique_id = True
            batch[2] = batch[2].reshape([-1, 1]).astype("int64")
        out = engine.model(batch[0], batch[1])
        batch_feas = out["features"]

        # do norm
        if engine.config["Global"].get("feature_normalize", True):
            feas_norm = paddle.sqrt(
                paddle.sum(paddle.square(batch_feas), axis=1, keepdim=True))
            batch_feas = paddle.divide(batch_feas, feas_norm)

        # do binarize
        if engine.config["Global"].get("feature_binarize") == "round":
            batch_feas = paddle.round(batch_feas).astype("float32") * 2.0 - 1.0

        if engine.config["Global"].get("feature_binarize") == "sign":
            batch_feas = paddle.sign(batch_feas).astype("float32")

        if all_feas is None:
...@@ -150,8 +149,8 @@ def cal_feature(evaler, name='gallery'):
            if has_unique_id:
                all_unique_id = paddle.concat([all_unique_id, batch[2]])

    if engine.use_dali:
        dataloader.reset()

    if paddle.distributed.get_world_size() > 1:
        feat_list = []
...
...@@ -18,68 +18,66 @@ import paddle
from ppcls.engine.train.utils import update_loss, update_metric, log_info


def train_epoch(engine, epoch_id, print_batch_step):
    tic = time.time()
    for iter_id, batch in enumerate(engine.train_dataloader):
        if iter_id >= engine.max_iter:
            break
        if iter_id == 5:
            for key in engine.time_info:
                engine.time_info[key].reset()
        engine.time_info["reader_cost"].update(time.time() - tic)
        if engine.use_dali:
            batch = [
                paddle.to_tensor(batch[0]['data']),
                paddle.to_tensor(batch[0]['label'])
            ]
        batch_size = batch[0].shape[0]
        if not engine.config["Global"].get("use_multilabel", False):
            batch[1] = batch[1].reshape([-1, 1]).astype("int64")
        engine.global_step += 1

        # image input
        if engine.amp:
            with paddle.amp.auto_cast(custom_black_list={
                    "flatten_contiguous_range", "greater_than"
            }):
                out = forward(engine, batch)
                loss_dict = engine.train_loss_func(out, batch[1])
        else:
            out = forward(engine, batch)
            # calc loss
            if engine.config["DataLoader"]["Train"]["dataset"].get(
                    "batch_transform_ops", None):
                loss_dict = engine.train_loss_func(out, batch[1:])
            else:
                loss_dict = engine.train_loss_func(out, batch[1])

        # step opt and lr
        if engine.amp:
            scaled = engine.scaler.scale(loss_dict["loss"])
            scaled.backward()
            engine.scaler.minimize(engine.optimizer, scaled)
        else:
            loss_dict["loss"].backward()
            engine.optimizer.step()
        engine.optimizer.clear_grad()
        engine.lr_sch.step()

        # below code just for logging
        # update metric_for_logger
        update_metric(engine, out, batch, batch_size)
        # update_loss_for_logger
        update_loss(engine, loss_dict, batch_size)
        engine.time_info["batch_cost"].update(time.time() - tic)
        if iter_id % print_batch_step == 0:
            log_info(engine, batch_size, epoch_id, iter_id)
        tic = time.time()


def forward(engine, batch):
    if not engine.is_rec:
        return engine.model(batch[0])
    else:
        return engine.model(batch[0], batch[1])
...@@ -20,6 +20,7 @@ from .distanceloss import DistanceLoss
from .distillationloss import DistillationCELoss
from .distillationloss import DistillationGTCELoss
from .distillationloss import DistillationDMLLoss
from .multilabelloss import MultiLabelLoss


class CombinedLoss(nn.Layer):
...
import paddle
import paddle.nn as nn
import paddle.nn.functional as F
class MultiLabelLoss(nn.Layer):
"""
Multi-label loss
"""
def __init__(self, epsilon=None):
super().__init__()
if epsilon is not None and (epsilon <= 0 or epsilon >= 1):
epsilon = None
self.epsilon = epsilon
def _labelsmoothing(self, target, class_num):
if target.ndim == 1 or target.shape[-1] != class_num:
one_hot_target = F.one_hot(target, class_num)
else:
one_hot_target = target
soft_target = F.label_smooth(one_hot_target, epsilon=self.epsilon)
soft_target = paddle.reshape(soft_target, shape=[-1, class_num])
return soft_target
def _binary_crossentropy(self, input, target, class_num):
    # the two branches only differed in whether the target is smoothed first
    if self.epsilon is not None:
        target = self._labelsmoothing(target, class_num)
    cost = F.binary_cross_entropy_with_logits(logit=input, label=target)
    return cost
def forward(self, x, target):
if isinstance(x, dict):
x = x["logits"]
class_num = x.shape[-1]
loss = self._binary_crossentropy(x, target, class_num)
loss = loss.mean()
return {"MultiLabelLoss": loss}
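A minimal usage sketch of MultiLabelLoss with multi-hot targets (shapes are illustrative):

import paddle

loss_fn = MultiLabelLoss(epsilon=0.1)  # label smoothing enabled
logits = paddle.randn([4, 5])          # batch of 4, 5 classes
target = paddle.randint(0, 2, [4, 5]).astype("float32")  # multi-hot labels
print(loss_fn(logits, target)["MultiLabelLoss"])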
@@ -19,6 +19,8 @@ from collections import OrderedDict
 from .metrics import TopkAcc, mAP, mINP, Recallk, Precisionk
 from .metrics import DistillationTopkAcc
 from .metrics import GoogLeNetTopkAcc
+from .metrics import HammingDistance, AccuracyScore

 class CombinedMetrics(nn.Layer):
     def __init__(self, config_list):
@@ -32,7 +34,8 @@ class CombinedMetrics(nn.Layer):
             metric_name = list(config)[0]
             metric_params = config[metric_name]
             if metric_params is not None:
-                self.metric_func_list.append(eval(metric_name)(**metric_params))
+                self.metric_func_list.append(
+                    eval(metric_name)(**metric_params))
             else:
                 self.metric_func_list.append(eval(metric_name)())
@@ -42,6 +45,7 @@ class CombinedMetrics(nn.Layer):
         metric_dict.update(metric_func(*args, **kwargs))
         return metric_dict

 def build_metrics(config):
     metrics_list = CombinedMetrics(copy.deepcopy(config))
     return metrics_list
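As a sketch, CombinedMetrics can be driven by a config list of single-key dicts, mirroring the Metric section of a PaddleClas YAML; the entries below are illustrative:

# hypothetical metric config: each entry maps a metric class name to its kwargs
metric_config = [
    {"HammingDistance": None},
    {"AccuracyScore": {"base": "label"}},
]
metrics = build_metrics(metric_config)
# calling the combined metric merges each metric's result dict into one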
@@ -15,6 +15,12 @@
 import numpy as np
 import paddle
 import paddle.nn as nn
+import paddle.nn.functional as F
+from sklearn.metrics import hamming_loss
+from sklearn.metrics import accuracy_score as accuracy_metric
+from sklearn.metrics import multilabel_confusion_matrix
+from sklearn.preprocessing import binarize

 class TopkAcc(nn.Layer):
@@ -198,7 +204,7 @@ class Precisionk(nn.Layer):
         equal_flag = paddle.logical_and(equal_flag,
                                         keep_mask.astype('bool'))
         equal_flag = paddle.cast(equal_flag, 'float32')
         Ns = paddle.arange(gallery_img_id.shape[0]) + 1
         equal_flag_cumsum = paddle.cumsum(equal_flag, axis=1)
         Precision_at_k = (paddle.mean(equal_flag_cumsum, axis=0) / Ns).numpy()
@@ -232,3 +238,71 @@ class GoogLeNetTopkAcc(TopkAcc):
     def forward(self, x, label):
         return super().forward(x[0], label)
class MultiLabelMetric(object):
    def __init__(self):
        pass

    def _multi_hot_encode(self, logits, threshold=0.5):
        return binarize(logits, threshold=threshold)

    def __call__(self, output):
        # sigmoid + 0.5 threshold turns raw logits into multi-hot predictions
        output = F.sigmoid(output)
        preds = self._multi_hot_encode(logits=output.numpy(), threshold=0.5)
        return preds


class HammingDistance(MultiLabelMetric):
    """
    Soft, label-wise metric for multilabel classification.
    Returns:
        The smaller the return value is, the better the model is.
    """

    def __init__(self):
        super().__init__()

    def __call__(self, output, target):
        preds = super().__call__(output)
        metric_dict = dict()
        metric_dict["HammingDistance"] = paddle.to_tensor(
            hamming_loss(target, preds))
        return metric_dict
class AccuracyScore(MultiLabelMetric):
    """
    Hard metric for multilabel classification.
    Args:
        base: ["sample", "label"], default="label".
            If "sample", return the sample-based accuracy;
            if "label", return the label-based accuracy.
    Returns:
        accuracy
    """

    def __init__(self, base="label"):
        super().__init__()
        assert base in ["sample",
                        "label"], 'base must be one of ["sample", "label"]'
        self.base = base

    def __call__(self, output, target):
        preds = super().__call__(output)
        metric_dict = dict()
        if self.base == "sample":
            accuracy = accuracy_metric(target, preds)
        elif self.base == "label":
            mcm = multilabel_confusion_matrix(target, preds)
            tns = mcm[:, 0, 0]
            fns = mcm[:, 1, 0]
            tps = mcm[:, 1, 1]
            fps = mcm[:, 0, 1]
            accuracy = (sum(tps) + sum(tns)) / (
                sum(tps) + sum(tns) + sum(fns) + sum(fps))
            # precision/recall/F1 are computed for reference but not returned;
            # F1 is derived from precision and recall
            precision = sum(tps) / (sum(tps) + sum(fps))
            recall = sum(tps) / (sum(tps) + sum(fns))
            F1 = 2 * (precision * recall) / (precision + recall)
        metric_dict["AccuracyScore"] = paddle.to_tensor(accuracy)
        return metric_dict
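A small end-to-end sketch of both metrics, given the classes above; the logits and multi-hot targets are made up, and NumPy arrays are fine for the targets since the sklearn helpers accept them:

import numpy as np
import paddle

# hypothetical outputs: 2 samples, 3 labels
logits = paddle.to_tensor([[2.0, -1.0, 0.5],
                           [-0.5, 1.5, -2.0]])
target = np.array([[1, 0, 1],
                   [0, 1, 0]])

print(HammingDistance()(logits, target))            # fraction of wrong labels
print(AccuracyScore(base="label")(logits, target))  # label-based accuracy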
@@ -41,19 +41,22 @@ def build_lr_scheduler(lr_config, epochs, step_each_epoch):
     return lr

-def build_optimizer(config, epochs, step_each_epoch, parameters=None):
+def build_optimizer(config, epochs, step_each_epoch, model_list):
     config = copy.deepcopy(config)
     # step1 build lr
     lr = build_lr_scheduler(config.pop('lr'), epochs, step_each_epoch)
     logger.debug("build lr ({}) success..".format(lr))
     # step2 build regularization
     if 'regularizer' in config and config['regularizer'] is not None:
+        if 'weight_decay' in config:
+            logger.warning(
+                "ConfigError: Only one of regularizer and weight_decay can be set in Optimizer Config. \"weight_decay\" has been ignored."
+            )
         reg_config = config.pop('regularizer')
         reg_name = reg_config.pop('name') + 'Decay'
         reg = getattr(paddle.regularizer, reg_name)(**reg_config)
-    else:
-        reg = None
-    logger.debug("build regularizer ({}) success..".format(reg))
+        config["weight_decay"] = reg
+        logger.debug("build regularizer ({}) success..".format(reg))
     # step3 build optimizer
     optim_name = config.pop('name')
     if 'clip_norm' in config:
@@ -62,8 +65,7 @@ def build_optimizer(config, epochs, step_each_epoch, model_list):
     else:
         grad_clip = None
     optim = getattr(optimizer, optim_name)(learning_rate=lr,
-                                           weight_decay=reg,
                                            grad_clip=grad_clip,
-                                           **config)(parameters=parameters)
+                                           **config)(model_list=model_list)
     logger.debug("build optimizer ({}) success..".format(optim))
     return optim, lr
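For orientation, a hypothetical config dict that exercises the new path above might look like this; the exact keys of the lr sub-config depend on build_lr_scheduler, which is truncated here, and the model is a stand-in:

import paddle

model = paddle.nn.Linear(8, 2)  # hypothetical stand-in for a real network

# hypothetical config mirroring the Optimizer section of a PaddleClas YAML
optim_config = {
    "name": "Momentum",
    "momentum": 0.9,
    "lr": {"name": "Cosine", "learning_rate": 0.1, "warmup_epoch": 5},
    "regularizer": {"name": "L2", "coeff": 1e-4},
}
# model_list replaces the old `parameters` argument: parameters are collected
# from every model in the list, so backbones and heads can be trained jointly
optim, lr_sch = build_optimizer(
    optim_config, epochs=120, step_each_epoch=100, model_list=[model])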
@@ -11,12 +11,15 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.

 from __future__ import (absolute_import, division, print_function,
                         unicode_literals)

 from paddle.optimizer import lr
 from paddle.optimizer.lr import LRScheduler

+from ppcls.utils import logger

 class Linear(object):
     """
@@ -26,6 +29,8 @@ class Linear(object):
         epochs(int): The decay step size. It determines the decay cycle.
         end_lr(float, optional): The minimum final learning rate. Default: 0.0001.
         power(float, optional): Power of polynomial. Default: 1.0.
+        warmup_epoch(int): The number of warm-up epochs for LinearWarmup. Default: 0.
+        warmup_start_lr(float): Initial learning rate of the warm-up. Default: 0.0.
         last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
     """
@@ -36,28 +41,35 @@ class Linear(object):
                  end_lr=0.0,
                  power=1.0,
                  warmup_epoch=0,
+                 warmup_start_lr=0.0,
                  last_epoch=-1,
                  **kwargs):
-        super(Linear, self).__init__()
+        super().__init__()
+        if warmup_epoch >= epochs:
+            msg = f"When using warm up, the value of \"Global.epochs\" must be greater than value of \"Optimizer.lr.warmup_epoch\". The value of \"Optimizer.lr.warmup_epoch\" has been set to {epochs}."
+            logger.warning(msg)
+            warmup_epoch = epochs
         self.learning_rate = learning_rate
-        self.epochs = epochs * step_each_epoch
+        self.steps = (epochs - warmup_epoch) * step_each_epoch
         self.end_lr = end_lr
         self.power = power
         self.last_epoch = last_epoch
-        self.warmup_epoch = round(warmup_epoch * step_each_epoch)
+        self.warmup_steps = round(warmup_epoch * step_each_epoch)
+        self.warmup_start_lr = warmup_start_lr

     def __call__(self):
         learning_rate = lr.PolynomialDecay(
             learning_rate=self.learning_rate,
-            decay_steps=self.epochs,
+            decay_steps=self.steps,
             end_lr=self.end_lr,
             power=self.power,
-            last_epoch=self.last_epoch)
-        if self.warmup_epoch > 0:
+            last_epoch=self.last_epoch) if self.steps > 0 else self.learning_rate
+        if self.warmup_steps > 0:
             learning_rate = lr.LinearWarmup(
                 learning_rate=learning_rate,
-                warmup_steps=self.warmup_epoch,
-                start_lr=0.0,
+                warmup_steps=self.warmup_steps,
+                start_lr=self.warmup_start_lr,
                 end_lr=self.learning_rate,
                 last_epoch=self.last_epoch)
         return learning_rate
@@ -71,6 +83,9 @@ class Cosine(object):
         lr(float): initial learning rate
         step_each_epoch(int): steps each epoch
         epochs(int): total training epochs
+        eta_min(float): Minimum learning rate. Default: 0.0.
+        warmup_epoch(int): The number of warm-up epochs for LinearWarmup. Default: 0.
+        warmup_start_lr(float): Initial learning rate of the warm-up. Default: 0.0.
         last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
     """
@@ -78,25 +93,35 @@ class Cosine(object):
                  learning_rate,
                  step_each_epoch,
                  epochs,
+                 eta_min=0.0,
                  warmup_epoch=0,
+                 warmup_start_lr=0.0,
                  last_epoch=-1,
                  **kwargs):
-        super(Cosine, self).__init__()
+        super().__init__()
+        if warmup_epoch >= epochs:
+            msg = f"When using warm up, the value of \"Global.epochs\" must be greater than value of \"Optimizer.lr.warmup_epoch\". The value of \"Optimizer.lr.warmup_epoch\" has been set to {epochs}."
+            logger.warning(msg)
+            warmup_epoch = epochs
         self.learning_rate = learning_rate
-        self.T_max = step_each_epoch * epochs
+        self.T_max = (epochs - warmup_epoch) * step_each_epoch
+        self.eta_min = eta_min
         self.last_epoch = last_epoch
-        self.warmup_epoch = round(warmup_epoch * step_each_epoch)
+        self.warmup_steps = round(warmup_epoch * step_each_epoch)
+        self.warmup_start_lr = warmup_start_lr

     def __call__(self):
         learning_rate = lr.CosineAnnealingDecay(
             learning_rate=self.learning_rate,
             T_max=self.T_max,
-            last_epoch=self.last_epoch)
-        if self.warmup_epoch > 0:
+            eta_min=self.eta_min,
+            last_epoch=self.last_epoch) if self.T_max > 0 else self.learning_rate
+        if self.warmup_steps > 0:
             learning_rate = lr.LinearWarmup(
                 learning_rate=learning_rate,
-                warmup_steps=self.warmup_epoch,
-                start_lr=0.0,
+                warmup_steps=self.warmup_steps,
+                start_lr=self.warmup_start_lr,
                 end_lr=self.learning_rate,
                 last_epoch=self.last_epoch)
         return learning_rate
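A quick sketch of how the rewritten Cosine schedule behaves with warm-up; the values are arbitrary, and note that the cosine phase now spans epochs - warmup_epoch:

# hypothetical: 10 steps/epoch, 5 epochs total, 1 warm-up epoch
sched = Cosine(learning_rate=0.1, step_each_epoch=10, epochs=5,
               warmup_epoch=1, warmup_start_lr=0.001, eta_min=0.0)()
for step in range(3):
    print(step, sched.get_lr())  # ramps linearly from 0.001 toward 0.1
    sched.step()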
@@ -111,6 +136,8 @@ class Step(object):
         step_size (int): the interval to update.
         gamma (float, optional): The ratio by which the learning rate is reduced. ``new_lr = origin_lr * gamma`` .
             It should be less than 1.0. Default: 0.1.
+        warmup_epoch(int): The number of warm-up epochs for LinearWarmup. Default: 0.
+        warmup_start_lr(float): Initial learning rate of the warm-up. Default: 0.0.
         last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
     """
@@ -118,16 +145,23 @@ class Step(object):
                  learning_rate,
                  step_size,
                  step_each_epoch,
+                 epochs,
                  gamma,
                  warmup_epoch=0,
+                 warmup_start_lr=0.0,
                  last_epoch=-1,
                  **kwargs):
-        super(Step, self).__init__()
+        super().__init__()
+        if warmup_epoch >= epochs:
+            msg = f"When using warm up, the value of \"Global.epochs\" must be greater than value of \"Optimizer.lr.warmup_epoch\". The value of \"Optimizer.lr.warmup_epoch\" has been set to {epochs}."
+            logger.warning(msg)
+            warmup_epoch = epochs
         self.step_size = step_each_epoch * step_size
         self.learning_rate = learning_rate
         self.gamma = gamma
         self.last_epoch = last_epoch
-        self.warmup_epoch = round(warmup_epoch * step_each_epoch)
+        self.warmup_steps = round(warmup_epoch * step_each_epoch)
+        self.warmup_start_lr = warmup_start_lr

     def __call__(self):
         learning_rate = lr.StepDecay(
@@ -135,11 +169,11 @@ class Step(object):
             step_size=self.step_size,
             gamma=self.gamma,
             last_epoch=self.last_epoch)
-        if self.warmup_epoch > 0:
+        if self.warmup_steps > 0:
             learning_rate = lr.LinearWarmup(
                 learning_rate=learning_rate,
-                warmup_steps=self.warmup_epoch,
-                start_lr=0.0,
+                warmup_steps=self.warmup_steps,
+                start_lr=self.warmup_start_lr,
                 end_lr=self.learning_rate,
                 last_epoch=self.last_epoch)
         return learning_rate
@@ -152,6 +186,8 @@ class Piecewise(object):
         boundaries(list): A list of step numbers. The type of element in the list is python int.
         values(list): A list of learning rate values that will be picked during different epoch boundaries.
             The type of element in the list is python float.
+        warmup_epoch(int): The number of warm-up epochs for LinearWarmup. Default: 0.
+        warmup_start_lr(float): Initial learning rate of the warm-up. Default: 0.0.
         last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
     """
@@ -159,25 +195,32 @@ class Piecewise(object):
                  step_each_epoch,
                  decay_epochs,
                  values,
+                 epochs,
                  warmup_epoch=0,
+                 warmup_start_lr=0.0,
                  last_epoch=-1,
                  **kwargs):
-        super(Piecewise, self).__init__()
+        super().__init__()
+        if warmup_epoch >= epochs:
+            msg = f"When using warm up, the value of \"Global.epochs\" must be greater than value of \"Optimizer.lr.warmup_epoch\". The value of \"Optimizer.lr.warmup_epoch\" has been set to {epochs}."
+            logger.warning(msg)
+            warmup_epoch = epochs
         self.boundaries = [step_each_epoch * e for e in decay_epochs]
         self.values = values
         self.last_epoch = last_epoch
-        self.warmup_epoch = round(warmup_epoch * step_each_epoch)
+        self.warmup_steps = round(warmup_epoch * step_each_epoch)
+        self.warmup_start_lr = warmup_start_lr

     def __call__(self):
         learning_rate = lr.PiecewiseDecay(
             boundaries=self.boundaries,
             values=self.values,
             last_epoch=self.last_epoch)
-        if self.warmup_epoch > 0:
+        if self.warmup_steps > 0:
             learning_rate = lr.LinearWarmup(
                 learning_rate=learning_rate,
-                warmup_steps=self.warmup_epoch,
-                start_lr=0.0,
+                warmup_steps=self.warmup_steps,
+                start_lr=self.warmup_start_lr,
                 end_lr=self.values[0],
                 last_epoch=self.last_epoch)
         return learning_rate
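As a quick sketch, the Piecewise schedule above can be exercised directly; the epoch counts and learning-rate values here are illustrative only:

# hypothetical: decay at epochs 30 and 60, 100 steps/epoch, 5 warm-up epochs
sched = Piecewise(step_each_epoch=100, decay_epochs=[30, 60],
                  values=[0.1, 0.01, 0.001], epochs=90, warmup_epoch=5)()
print(sched.get_lr())  # starts from warmup_start_lr (0.0 by default)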
@@ -186,7 +229,7 @@ class Piecewise(object):

 class MultiStepDecay(LRScheduler):
     """
     Update the learning rate by ``gamma`` once ``epoch`` reaches one of the milestones.
     The algorithm can be described as the code below.
     .. code-block:: text
         learning_rate = 0.5
         milestones = [30, 50]
@@ -200,15 +243,15 @@ class MultiStepDecay(LRScheduler):
     Args:
         learning_rate (float): The initial learning rate. It is a python float number.
         milestones (tuple|list): List or tuple of each boundaries. Must be increasing.
         gamma (float, optional): The ratio by which the learning rate is reduced. ``new_lr = origin_lr * gamma`` .
             It should be less than 1.0. Default: 0.1.
         last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
         verbose (bool, optional): If ``True``, prints a message to stdout for each update. Default: ``False`` .
     Returns:
         ``MultiStepDecay`` instance to schedule learning rate.
     Examples:
         .. code-block:: python
             import paddle
             import numpy as np
@@ -274,8 +317,7 @@ class MultiStepDecay(LRScheduler):
             raise ValueError('gamma should be < 1.0.')
         self.milestones = [x * step_each_epoch for x in milestones]
         self.gamma = gamma
-        super(MultiStepDecay, self).__init__(learning_rate, last_epoch,
-                                             verbose)
+        super().__init__(learning_rate, last_epoch, verbose)

     def get_lr(self):
         for i in range(len(self.milestones)):
    ...
@@ -35,14 +35,15 @@ class Momentum(object):
                  weight_decay=None,
                  grad_clip=None,
                  multi_precision=False):
-        super(Momentum, self).__init__()
+        super().__init__()
         self.learning_rate = learning_rate
         self.momentum = momentum
         self.weight_decay = weight_decay
         self.grad_clip = grad_clip
         self.multi_precision = multi_precision

-    def __call__(self, parameters):
+    def __call__(self, model_list):
+        parameters = sum([m.parameters() for m in model_list], [])
         opt = optim.Momentum(
             learning_rate=self.learning_rate,
             momentum=self.momentum,
@@ -77,7 +78,8 @@ class Adam(object):
         self.lazy_mode = lazy_mode
         self.multi_precision = multi_precision

-    def __call__(self, parameters):
+    def __call__(self, model_list):
+        parameters = sum([m.parameters() for m in model_list], [])
         opt = optim.Adam(
             learning_rate=self.learning_rate,
             beta1=self.beta1,
@@ -112,7 +114,7 @@ class RMSProp(object):
                  weight_decay=None,
                  grad_clip=None,
                  multi_precision=False):
-        super(RMSProp, self).__init__()
+        super().__init__()
         self.learning_rate = learning_rate
         self.momentum = momentum
         self.rho = rho
@@ -120,7 +122,8 @@ class RMSProp(object):
         self.weight_decay = weight_decay
         self.grad_clip = grad_clip

-    def __call__(self, parameters):
+    def __call__(self, model_list):
+        parameters = sum([m.parameters() for m in model_list], [])
         opt = optim.RMSProp(
             learning_rate=self.learning_rate,
             momentum=self.momentum,
@@ -130,3 +133,57 @@ class RMSProp(object):
             grad_clip=self.grad_clip,
             parameters=parameters)
         return opt
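A hedged illustration of the new calling convention: each wrapper now receives a list of nn.Layer models and flattens their parameters itself. The Linear layers below are placeholders for real backbones and heads:

import paddle

backbone = paddle.nn.Linear(16, 8)  # hypothetical stand-ins
head = paddle.nn.Linear(8, 2)

# parameters from every model in the list are optimized jointly
opt = Momentum(learning_rate=0.01, momentum=0.9)(model_list=[backbone, head])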
class AdamW(object):
    def __init__(self,
                 learning_rate=0.001,
                 beta1=0.9,
                 beta2=0.999,
                 epsilon=1e-8,
                 weight_decay=None,
                 multi_precision=False,
                 grad_clip=None,
                 no_weight_decay_name=None,
                 one_dim_param_no_weight_decay=False,
                 **args):
        super().__init__()
        self.learning_rate = learning_rate
        self.beta1 = beta1
        self.beta2 = beta2
        self.epsilon = epsilon
        self.grad_clip = grad_clip
        self.weight_decay = weight_decay
        self.multi_precision = multi_precision
        self.no_weight_decay_name_list = no_weight_decay_name.split(
        ) if no_weight_decay_name else []
        self.one_dim_param_no_weight_decay = one_dim_param_no_weight_decay

    def __call__(self, model_list):
        parameters = sum([m.parameters() for m in model_list], [])
        # collect parameters whose name path matches any excluded substring
        self.no_weight_decay_param_name_list = [
            p.name for model in model_list for n, p in model.named_parameters()
            if any(nd in n for nd in self.no_weight_decay_name_list)
        ]
        # optionally exclude all 1-D parameters (biases, norm scales)
        if self.one_dim_param_no_weight_decay:
            self.no_weight_decay_param_name_list += [
                p.name for model in model_list
                for n, p in model.named_parameters() if len(p.shape) == 1
            ]
        opt = optim.AdamW(
            learning_rate=self.learning_rate,
            beta1=self.beta1,
            beta2=self.beta2,
            epsilon=self.epsilon,
            parameters=parameters,
            weight_decay=self.weight_decay,
            multi_precision=self.multi_precision,
            grad_clip=self.grad_clip,
            apply_decay_param_fun=self._apply_decay_param_fun)
        return opt

    def _apply_decay_param_fun(self, name):
        return name not in self.no_weight_decay_param_name_list
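A sketch of how the selective weight-decay logic above might be exercised; the model and the name substring here are hypothetical and depend on the actual network's parameter name paths:

import paddle

model = paddle.nn.Sequential(paddle.nn.Linear(16, 8), paddle.nn.LayerNorm(8))

adamw = AdamW(
    learning_rate=1e-3,
    weight_decay=0.05,
    no_weight_decay_name="norm",          # hypothetical substring of name paths
    one_dim_param_no_weight_decay=True)   # biases and norm scales skip decay
opt = adamw(model_list=[model])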
@@ -4,4 +4,4 @@
 # python3.7 tools/train.py -c ./ppcls/configs/ImageNet/ResNet/ResNet50.yaml
 # for multi-cards train
-python3.7 -m paddle.distributed.launch --gpus="0,1,2,3" tools/train.py -c ./ppcls/configs/ImageNet/ResNet/ResNet50.yaml
+python3.7 -m paddle.distributed.launch --gpus="0,1,2,3" tools/train.py -c ./ppcls/configs/ImageNet/ResNet/ResNet50.yaml
\ No newline at end of file