Unverified commit 6497727b authored by: Wei Shengyu, committed by: GitHub

Merge branch 'PaddlePaddle:develop' into develop

@@ -3,10 +3,11 @@ __pycache__/
*.sw*
*/workerlog*
checkpoints/
output*/
pretrained/
.ipynb_checkpoints/
*.ipynb*
_build/
build/
log/
nohup.out
@@ -7,7 +7,7 @@
PaddleClas, the PaddlePaddle image recognition toolkit, is a toolset for image recognition tasks prepared by PaddlePaddle for industry and academia, helping users train better vision models and deploy them in applications.
**Recent updates**
- 2021.09.17 Added the PP-LCNet series models developed by PaddleClas; these models are highly competitive on Intel CPUs. The metrics and pretrained weights can be downloaded [here](docs/zh_CN/ImageNet_models.md).
- 2021.08.11 Updated 7 [FAQs](docs/zh_CN/faq_series/faq_2021_s2.md)
- 2021.06.29 Added the Swin Transformer series models; the highest top-1 accuracy on the ImageNet-1k dataset reaches 87.2%. Training, prediction, evaluation and whl-package deployment are supported, and pretrained models can be downloaded [here](docs/zh_CN/models/models_intro.md).
- 2021.06.22,23,24 The PaddleClas R&D team gave a three-day live course with in-depth technical explanations. Course replay: [https://aistudio.baidu.com/aistudio/course/introduce/24519](https://aistudio.baidu.com/aistudio/course/introduce/24519)
......
@@ -8,6 +8,8 @@ PaddleClas is an image recognition toolset for industry and academia, helping us
**Recent updates**
- 2021.09.17 Added the PP-LCNet series models developed by PaddleClas; these models show strong competitiveness on Intel CPUs. The metrics and pretrained models are available [here](docs/en/ImageNet_models_en.md).
- 2021.06.29 Added the Swin Transformer series models; the highest top-1 accuracy on the ImageNet-1k dataset reaches 87.2%. Training, evaluation and inference are all supported, and pretrained models can be downloaded [here](docs/en/models/models_intro_en.md).
- 2021.06.16 PaddleClas release/2.2. Added metric learning and vector search modules, plus product recognition, animation character recognition, vehicle recognition and logo recognition. Added 30 pretrained models of LeViT, Twins, TNT, DLA, HarDNet, and RedNet, with accuracy roughly matching the papers.
- [more](./docs/en/update_history_en.md)
......
Global:
rec_inference_model_dir: "./models/product_MV3_x1_0_aliproduct_bin_v1.0_infer"
batch_size: 32
use_gpu: True
enable_mkldnn: True
cpu_num_threads: 10
enable_benchmark: True
use_fp16: False
ir_optim: True
use_tensorrt: False
gpu_mem: 8000
enable_profile: False
RecPreProcess:
transform_ops:
- ResizeImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
RecPostProcess:
main_indicator: Binarize
Binarize:
method: "round"
# indexing engine config
IndexProcess:
index_method: "Flat" # supported: HNSW32, Flat
index_dir: "./recognition_demo_data_v1.1/gallery_product/index_binary"
image_root: "./recognition_demo_data_v1.1/gallery_product/"
data_file: "./recognition_demo_data_v1.1/gallery_product/data_file.txt"
  index_operation: "new" # supported: "append", "remove", "new"
delimiter: "\t"
dist_type: "hamming"
embedding_size: 512
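The `Binarize` post-process and `hamming` distance above work together: the 512-dim float embedding is rounded to a bit vector, and gallery retrieval compares bit vectors by Hamming distance. A minimal illustrative sketch of that idea (my own, not repo code; PaddleClas implements this inside its postprocess and indexing modules):

```python
import numpy as np

def binarize(embedding: np.ndarray) -> np.ndarray:
    # "method: round" -> each of the 512 float dims becomes 0 or 1
    return np.round(embedding).astype(np.uint8)

def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    # number of differing bit positions
    return int(np.count_nonzero(a != b))

# toy gallery of 3 binarized 512-d features and one query
rng = np.random.default_rng(0)
gallery = binarize(rng.random((3, 512)))
query = binarize(rng.random(512))
dists = [hamming_distance(query, g) for g in gallery]
print("nearest gallery id:", int(np.argmin(dists)))
```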
Global:
infer_imgs: "./recognition_demo_data_v1.1/test_product/daoxiangcunjinzhubing_6.jpg"
det_inference_model_dir: "./models/ppyolov2_r50vd_dcn_mainbody_v1.0_infer"
rec_inference_model_dir: "./models/product_MV3_x1_0_aliproduct_bin_v1.0_infer"
rec_nms_thresold: 0.05
batch_size: 1
image_shape: [3, 640, 640]
threshold: 0.2
max_det_results: 5
labe_list:
- foreground
# inference engine config
use_gpu: True
enable_mkldnn: True
cpu_num_threads: 10
enable_benchmark: True
use_fp16: False
ir_optim: True
use_tensorrt: False
gpu_mem: 8000
enable_profile: False
DetPreProcess:
transform_ops:
- DetResize:
interp: 2
keep_ratio: false
target_size: [640, 640]
- DetNormalizeImage:
is_scale: true
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
- DetPermute: {}
DetPostProcess: {}
RecPreProcess:
transform_ops:
- ResizeImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
RecPostProcess:
main_indicator: Binarize
Binarize:
method: "round"
# indexing engine config
IndexProcess:
binary_index: true
index_dir: "./recognition_demo_data_v1.1/gallery_product/index_binary"
return_k: 5
score_thres: 0
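For intuition, `threshold` and `max_det_results` in the config above bound the mainbody detections that are passed on to recognition. A hedged sketch of that filtering step (illustrative only; the actual logic lives in PaddleClas's prediction system):

```python
def filter_detections(dets, threshold=0.2, max_det_results=5):
    # dets: list of (score, box) pairs from the mainbody detector
    kept = [d for d in dets if d[0] >= threshold]
    kept.sort(key=lambda d: d[0], reverse=True)  # highest score first
    return kept[:max_det_results]

print(filter_detections([(0.9, "box_a"), (0.1, "box_b"), (0.4, "box_c")]))
```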
project(clas_system CXX C)
cmake_minimum_required(VERSION 3.14)
option(WITH_MKL "Compile demo with MKL/OpenBlas support, default use MKL." ON)
option(WITH_GPU "Compile demo with GPU/CPU, default use CPU." OFF)
@@ -13,7 +14,6 @@ SET(TENSORRT_DIR "" CACHE PATH "Compile demo with TensorRT")
set(DEMO_NAME "clas_system")
macro(safe_set_static_flag)
foreach(flag_var
    CMAKE_CXX_FLAGS CMAKE_CXX_FLAGS_DEBUG CMAKE_CXX_FLAGS_RELEASE
@@ -198,6 +198,10 @@ endif()
set(DEPS ${DEPS} ${OpenCV_LIBS})
include(FetchContent)
include(external-cmake/auto-log.cmake)
include_directories(${FETCHCONTENT_BASE_DIR}/extern_autolog-src)
AUX_SOURCE_DIRECTORY(./src SRCS)
add_executable(${DEMO_NAME} ${SRCS})
......
find_package(Git REQUIRED)
include(FetchContent)
set(FETCHCONTENT_BASE_DIR "${CMAKE_CURRENT_BINARY_DIR}/third-party")
FetchContent_Declare(
extern_Autolog
PREFIX autolog
GIT_REPOSITORY https://github.com/LDOUBLEV/AutoLog.git
GIT_TAG main
)
FetchContent_MakeAvailable(extern_Autolog)
@@ -61,7 +61,7 @@ public:
  void LoadModel(const std::string &model_path, const std::string &params_path);
  // Run predictor
  double Run(cv::Mat &img, std::vector<double> *times);
private:
  std::shared_ptr<Predictor> predictor_;
......
// Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
@@ -36,8 +36,7 @@ public:
    this->gpu_mem = stoi(config_map_["gpu_mem"]);
    this->cpu_threads = stoi(config_map_["cpu_threads"]);
    this->use_mkldnn = bool(stoi(config_map_["use_mkldnn"]));
@@ -51,6 +50,8 @@ public:
    this->resize_short_size = stoi(config_map_["resize_short_size"]);
    this->crop_size = stoi(config_map_["crop_size"]);
    this->benchmark = bool(stoi(config_map_["benchmark"]));
  }
  bool use_gpu = false;
@@ -59,12 +60,13 @@ public:
  int gpu_mem = 4000;
  int cpu_threads = 1;
  bool use_mkldnn = false;
  bool use_tensorrt = false;
  bool use_fp16 = false;
  bool benchmark = false;
  std::string cls_model_path;
......
@@ -12,7 +12,6 @@
// See the License for the specific language governing permissions and
// limitations under the License.
#include <include/cls.h>
namespace PaddleClas {
@@ -53,11 +52,12 @@ void Classifier::LoadModel(const std::string &model_path,
  this->predictor_ = CreatePredictor(config);
}
double Classifier::Run(cv::Mat &img, std::vector<double> *times) {
  cv::Mat srcimg;
  cv::Mat resize_img;
  img.copyTo(srcimg);
  auto preprocess_start = std::chrono::system_clock::now();
  this->resize_op_.Run(img, resize_img, this->resize_short_size_);
  this->crop_op_.Run(resize_img, this->crop_size_);
@@ -70,7 +70,9 @@ double Classifier::Run(cv::Mat &img) {
  auto input_names = this->predictor_->GetInputNames();
  auto input_t = this->predictor_->GetInputHandle(input_names[0]);
  input_t->Reshape({1, 3, resize_img.rows, resize_img.cols});
  auto preprocess_end = std::chrono::system_clock::now();
  auto infer_start = std::chrono::system_clock::now();
  input_t->CopyFromCpu(input.data());
  this->predictor_->Run();
@@ -83,21 +85,29 @@ double Classifier::Run(cv::Mat &img) {
  out_data.resize(out_num);
  output_t->CopyToCpu(out_data.data());
  auto infer_end = std::chrono::system_clock::now();
  auto postprocess_start = std::chrono::system_clock::now();
  int maxPosition =
      max_element(out_data.begin(), out_data.end()) - out_data.begin();
  auto postprocess_end = std::chrono::system_clock::now();
  std::chrono::duration<float> preprocess_diff =
      preprocess_end - preprocess_start;
  times->push_back(double(preprocess_diff.count() * 1000));
  std::chrono::duration<float> inference_diff = infer_end - infer_start;
  double inference_cost_time = double(inference_diff.count() * 1000);
  times->push_back(inference_cost_time);
  std::chrono::duration<float> postprocess_diff =
      postprocess_end - postprocess_start;
  times->push_back(double(postprocess_diff.count() * 1000));
  std::cout << "result: " << std::endl;
  std::cout << "\tclass id: " << maxPosition << std::endl;
  std::cout << std::fixed << std::setprecision(10)
            << "\tscore: " << double(out_data[maxPosition]) << std::endl;
  return inference_cost_time;
}
} // namespace PaddleClas
// Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
@@ -26,6 +26,7 @@
#include <fstream>
#include <numeric>
#include <auto_log/autolog.h>
#include <include/cls.h>
#include <include/cls_config.h>
@@ -61,11 +62,12 @@ int main(int argc, char **argv) {
  Classifier classifier(config.cls_model_path, config.cls_params_path,
                        config.use_gpu, config.gpu_id, config.gpu_mem,
                        config.cpu_threads, config.use_mkldnn,
                        config.use_tensorrt, config.use_fp16,
                        config.resize_short_size, config.crop_size);
  double elapsed_time = 0.0;
  std::vector<double> cls_times;
  int warmup_iter = img_files_list.size() > 5 ? 5 : 0;
  for (int idx = 0; idx < img_files_list.size(); ++idx) {
    std::string img_path = img_files_list[idx];
@@ -78,7 +80,7 @@ int main(int argc, char **argv) {
    cv::cvtColor(srcimg, srcimg, cv::COLOR_BGR2RGB);
    double run_time = classifier.Run(srcimg, &cls_times);
    if (idx >= warmup_iter) {
      elapsed_time += run_time;
      std::cout << "Current image path: " << img_path << std::endl;
@@ -90,5 +92,16 @@ int main(int argc, char **argv) {
    }
  }
  std::string precision = "fp32";
  if (config.use_fp16)
    precision = "fp16";
  if (config.benchmark) {
    AutoLogger autolog("Classification", config.use_gpu, config.use_tensorrt,
                       config.use_mkldnn, config.cpu_threads, 1,
                       "1, 3, 224, 224", precision, cls_times,
                       img_files_list.size());
    autolog.report();
  }
  return 0;
}
OPENCV_DIR=/work/project/project/cpp_infer/opencv-3.4.7/opencv3
LIB_DIR=/work/project/project/cpp_infer/paddle_inference/
CUDA_LIB_DIR=/usr/local/cuda/lib64
CUDNN_LIB_DIR=/usr/lib/x86_64-linux-gnu/
......
@@ -2,7 +2,7 @@
use_gpu 0
gpu_id 0
gpu_mem 4000
cpu_threads 10
use_mkldnn 1
use_tensorrt 0
use_fp16 0
@@ -12,3 +12,6 @@ cls_model_path /PaddleClas/inference/cls_infer.pdmodel
cls_params_path /PaddleClas/inference/cls_infer.pdiparams
resize_short_size 256
crop_size 224
# for log env info
benchmark 0
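The config file above is a plain whitespace-separated `key value` list, which the C++ `ClsConfig` class reads into a map and converts with `stoi`. For illustration, an equivalent parser sketched in Python (an assumption about the format; this helper is not shipped with PaddleClas):

```python
def load_config(path: str) -> dict:
    """Parse 'key value' lines, skipping blanks and '#' comments."""
    config = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, value = line.split(None, 1)
            config[key] = value
    return config

# e.g. int(load_config("tools/config.txt")["cpu_threads"]) -> 10
```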
@@ -76,8 +76,7 @@ class ClasSystem(nn.Layer):
        starttime = time.time()
        outputs = self.cls_predictor.predict(inputs)
        elapse = time.time() - starttime
        return {"prediction": outputs, "elapse": elapse}
    @serving
    def serving_method(self, images, revert_params):
......
# PaddleClas Pipeline WebService
(English|[简体中文](./README_CN.md))
PaddleClas provides two service deployment methods:
- Based on **PaddleHub Serving**: Code path is "`./deploy/hubserving`". Please refer to the [tutorial](../../deploy/hubserving/readme_en.md)
- Based on **PaddleServing**: Code path is "`./deploy/paddleserving`". Please follow this tutorial.
# Service deployment based on PaddleServing
This document introduces how to use [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README.md) to deploy the ResNet50_vd model as a pipeline online service.
Some key features of Paddle Serving:
- Integrates with the Paddle training pipeline seamlessly; most Paddle models can be deployed with a one-line command.
- Supports industrial serving features such as model management, online loading, and online A/B testing.
- Supports highly concurrent and efficient communication between clients and servers.
For an introduction to the Paddle Serving deployment framework and its tutorials, refer to the [document](https://github.com/PaddlePaddle/Serving/blob/develop/README.md).
## Contents
- [Environmental preparation](#environmental-preparation)
- [Model conversion](#model-conversion)
- [Paddle Serving pipeline deployment](#paddle-serving-pipeline-deployment)
- [FAQ](#faq)
<a name="environmental-preparation"></a>
## Environmental preparation
Both the PaddleClas and PaddleServing operating environments are needed.
1. Please prepare the PaddleClas operating environment by referring to this [link](../../docs/zh_CN/tutorials/install.md).
Download the paddle whl package matching your environment; installing version 2.1.0 is recommended.
2. Prepare the PaddleServing operating environment as follows:
Install serving, which is used to start the service
```
pip3 install paddle-serving-server==0.6.1 # for CPU
pip3 install paddle-serving-server-gpu==0.6.1 # for GPU
# For other GPU environments, confirm the environment before choosing one of the following commands
pip3 install paddle-serving-server-gpu==0.6.1.post101 # GPU with CUDA10.1 + TensorRT6
pip3 install paddle-serving-server-gpu==0.6.1.post11 # GPU with CUDA11 + TensorRT7
```
3. Install the client to send requests to the service
Find the client installation package corresponding to your Python version at the [download link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
Python 3.7 is recommended here:
```
wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.0.0-cp37-none-any.whl
pip3 install paddle_serving_client-0.0.0-cp37-none-any.whl
```
4. Install serving-app
```
pip3 install paddle-serving-app==0.6.1
```
**Note:** If you want to install the latest version of PaddleServing, refer to this [link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
<a name="model-conversion"></a>
## Model conversion
When using PaddleServing for service deployment, you need to convert the saved inference model into a serving model that is easy to deploy.
First, download the ResNet50_vd inference model
```
# Download and unzip the ResNet50_vd model
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar && tar xf ResNet50_vd_infer.tar
```
Then, use the installed paddle_serving_client tool to convert the inference model into the serving format.
```
# ResNet50_vd model conversion
python3 -m paddle_serving_client.convert --dirname ./ResNet50_vd_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./ResNet50_vd_serving/ \
--serving_client ./ResNet50_vd_client/
```
After the ResNet50_vd inference model is converted, there will be additional folders of `ResNet50_vd_serving` and `ResNet50_vd_client` in the current folder, with the following format:
```
|- ResNet50_vd_serving/
|- __model__
|- __params__
|- serving_server_conf.prototxt
|- serving_server_conf.stream.prototxt
|- ResNet50_vd_client
|- serving_client_conf.prototxt
|- serving_client_conf.stream.prototxt
```
Once you have the model files for deployment, you need to change the alias names in `serving_server_conf.prototxt`: change `alias_name` in `feed_var` to `image`, and change `alias_name` in `fetch_var` to `prediction`. A scripted variant of this edit is sketched after the listing below.
The modified serving_server_conf.prototxt file is as follows:
```
feed_var {
name: "inputs"
alias_name: "image"
is_lod_tensor: false
feed_type: 1
shape: 3
shape: 224
shape: 224
}
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "prediction"
is_lod_tensor: true
fetch_type: 1
shape: -1
}
```
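If you would rather script the alias change than edit the file by hand, a hypothetical helper like the following could work. It assumes the file contains exactly the two `alias_name` fields shown above (feed first, fetch second); the helper name and approach are mine, not part of PaddleClas:

```python
import re

def set_alias_names(path="ResNet50_vd_serving/serving_server_conf.prototxt"):
    with open(path) as f:
        text = f.read()
    # the first alias_name belongs to feed_var, the second to fetch_var
    new_aliases = iter(['"image"', '"prediction"'])
    text = re.sub(r'alias_name: "[^"]*"',
                  lambda m: "alias_name: " + next(new_aliases), text)
    with open(path, "w") as f:
        f.write(text)

set_alias_names()
```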
<a name="paddle-serving-pipeline-deployment"></a>
## Paddle Serving pipeline deployment
1. Download the PaddleClas code (skip this step if you have already downloaded it).
```
git clone https://github.com/PaddlePaddle/PaddleClas
# Enter the working directory
cd PaddleClas/deploy/paddleserving/
```
The paddleserving directory contains the code to start the pipeline service and send prediction requests, including:
```
__init__.py
config.yml # configuration file of starting the service
pipeline_http_client.py # script to send pipeline prediction request by http
pipeline_rpc_client.py # script to send pipeline prediction request by rpc
resnet50_web_service.py # start the script of the pipeline server
```
2. Run the following command to start the service.
```
# Start the service and save the running log in log.txt
python3 classification_web_service.py &>log.txt &
```
After the service is successfully started, a log similar to the following will be printed in log.txt
![](./imgs/start_server.png)
3. Send a service request
```
python3 pipeline_http_client.py
```
After the request runs successfully, the model's prediction will be printed in the cmd window. An example result:
![](./imgs/results.png)
Adjust the concurrency in config.yml to obtain the maximum QPS.
```
op:
concurrency: 8
...
```
Multiple service requests can be sent at the same time if necessary; a sketch of doing so follows below.
The prediction performance data will be automatically written into the `PipelineServingLogs/pipeline.tracer` file.
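For example, a hedged sketch of sending parallel requests from Python with a thread pool. It reuses the request format from `pipeline_http_client.py` and assumes the default `http_port: 18080` from config.yml and a local `daisy.jpg`:

```python
import base64
import json
from concurrent.futures import ThreadPoolExecutor

import requests

url = "http://127.0.0.1:18080/imagenet/prediction"
with open("daisy.jpg", "rb") as f:
    image = base64.b64encode(f.read()).decode("utf8")
payload = json.dumps({"key": ["image"], "value": [image]})

def one_request(_):
    # each worker posts the same image; responses arrive concurrently
    return requests.post(url=url, data=payload).json()

# 8 workers roughly matches "concurrency: 8" on the server side
with ThreadPoolExecutor(max_workers=8) as pool:
    for result in pool.map(one_request, range(32)):
        print(result)
```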
<a name="faq"></a>
## FAQ
**Q1**: No result is returned after sending the request.
**A1**: Do not set a proxy when starting the service or sending requests. You can disable the proxy before starting the service and before sending requests; the commands to disable it are:
```
unset https_proxy
unset http_proxy
```
# PaddleClas Service Deployment
([English](./README.md) | Simplified Chinese)
PaddleClas provides two service deployment methods:
- Based on PaddleHub Serving: the code path is "`./deploy/hubserving`"; see the [tutorial](../../deploy/hubserving/readme.md) for usage
- Based on PaddleServing: the code path is "`./deploy/paddleserving`"; follow this tutorial.
# Service deployment based on PaddleServing
Taking the classic ResNet50_vd model as an example, this document introduces how to use [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md) to deploy a PaddleClas
dynamic-graph model as a pipeline online service.
Compared with hubserving deployment, PaddleServing has the following advantages:
- Supports highly concurrent and efficient communication between clients and servers
- Supports industrial-grade serving features such as model management, online loading, and online A/B testing
- Supports clients in multiple programming languages, such as C++, Python, and Java
For more on the PaddleServing deployment framework and its tutorials, see the [document](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md).
## Contents
- [Environment preparation](#环境准备)
- [Model conversion](#模型转换)
- [Paddle Serving pipeline deployment](#部署)
- [FAQ](#FAQ)
<a name="环境准备"></a>
## Environment preparation
The operating environments for both PaddleClas and PaddleServing are needed.
- Prepare the PaddleClas [operating environment](../../docs/zh_CN/tutorials/install.md); download the paddle whl package matching your environment (installing version 2.1.0 is recommended)
- Prepare the PaddleServing operating environment as follows
1. Install serving, which is used to start the service
```
pip3 install paddle-serving-server==0.6.1 # for CPU
pip3 install paddle-serving-server-gpu==0.6.1 # for GPU
# For other GPU environments, confirm the environment before choosing one of the following commands
pip3 install paddle-serving-server-gpu==0.6.1.post101 # GPU with CUDA10.1 + TensorRT6
pip3 install paddle-serving-server-gpu==0.6.1.post11 # GPU with CUDA11 + TensorRT7
```
2. Install the client, which is used to send requests to the service
Find the client installation package for your Python version at the [download link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md); Python 3.7 is recommended here:
```
wget https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.0.0-cp37-none-any.whl
pip3 install paddle_serving_client-0.0.0-cp37-none-any.whl
```
3. Install serving-app
```
pip3 install paddle-serving-app==0.6.1
```
**Note:** To install the latest version of PaddleServing, refer to this [link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
<a name="模型转换"></a>
## Model conversion
When deploying with PaddleServing, the saved inference model needs to be converted into a model format that serving can easily deploy.
First, download the ResNet50_vd inference model
```
# Download and untar the ResNet50_vd model
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar && tar xf ResNet50_vd_infer.tar
```
Next, use the installed paddle_serving_client to convert the downloaded inference model into a format that is easy for the server to deploy.
```
# Convert the ResNet50_vd model
python3 -m paddle_serving_client.convert --dirname ./ResNet50_vd_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./ResNet50_vd_serving/ \
--serving_client ./ResNet50_vd_client/
```
After the ResNet50_vd inference model is converted, additional `ResNet50_vd_serving` and `ResNet50_vd_client` folders appear in the current directory, with the following layout:
```
|- ResNet50_vd_serving/
|- __model__
|- __params__
|- serving_server_conf.prototxt
|- serving_server_conf.stream.prototxt
|- ResNet50_vd_client
|- serving_client_conf.prototxt
|- serving_client_conf.stream.prototxt
```
After obtaining the model files, modify the alias names in serving_server_conf.prototxt: change `alias_name` in `feed_var` to `image`, and change `alias_name` in `fetch_var` to `prediction`.
The modified serving_server_conf.prototxt is as follows:
```
feed_var {
name: "inputs"
alias_name: "image"
is_lod_tensor: false
feed_type: 1
shape: 3
shape: 224
shape: 224
}
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "prediction"
is_lod_tensor: true
fetch_type: 1
shape: -1
}
```
<a name="部署"></a>
## Paddle Serving pipeline deployment
1. Download the PaddleClas code (skip this step if already downloaded)
```
git clone https://github.com/PaddlePaddle/PaddleClas
# Enter the working directory
cd PaddleClas/deploy/paddleserving/
```
The paddleserving directory contains the code for starting the pipeline service and sending prediction requests, including:
```
__init__.py
config.yml # configuration file for starting the service
pipeline_http_client.py # script for sending pipeline prediction requests over HTTP
pipeline_rpc_client.py # script for sending pipeline prediction requests over RPC
resnet50_web_service.py # script for starting the pipeline server
```
2. Run the following command to start the service:
```
# Start the service; the running log is saved in log.txt
python3 classification_web_service.py &>log.txt &
```
After the service starts successfully, a log similar to the following is printed in log.txt
![](./imgs/start_server.png)
3. Send a service request:
```
python3 pipeline_http_client.py
```
After it runs successfully, the model's prediction results are printed in the cmd window. An example result:
![](./imgs/results.png)
Adjust the concurrency in config.yml to obtain the maximum QPS
```
op:
#Concurrency; thread concurrency when is_thread_op=True, otherwise process concurrency
concurrency: 8
...
```
Multiple service requests can be sent at the same time if necessary.
The prediction performance data is automatically written into the `PipelineServingLogs/pipeline.tracer` file.
<a name="FAQ"></a>
## FAQ
**Q1**: No result is returned after sending a request, or an output decoding error is reported.
**A1**: Do not set a proxy when starting the service or sending requests. You can disable the proxy before starting the service and before sending requests; the commands to disable it are:
```
unset https_proxy
unset http_proxy
```
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
from paddle_serving_app.reader import Sequential, URL2Image, Resize, CenterCrop, RGB2BGR, Transpose, Div, Normalize, Base64ToImage
try:
from paddle_serving_server_gpu.web_service import WebService, Op
except ImportError:
from paddle_serving_server.web_service import WebService, Op
import logging
import numpy as np
import base64, cv2
class ImagenetOp(Op):
def init_op(self):
self.seq = Sequential([
Resize(256), CenterCrop(224), RGB2BGR(), Transpose((2, 0, 1)),
Div(255), Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225],
True)
])
self.label_dict = {}
label_idx = 0
with open("imagenet.label") as fin:
for line in fin:
self.label_dict[label_idx] = line.strip()
label_idx += 1
def preprocess(self, input_dicts, data_id, log_id):
(_, input_dict), = input_dicts.items()
batch_size = len(input_dict.keys())
imgs = []
for key in input_dict.keys():
data = base64.b64decode(input_dict[key].encode('utf8'))
data = np.frombuffer(data, np.uint8)  # np.fromstring is deprecated for binary input
im = cv2.imdecode(data, cv2.IMREAD_COLOR)
img = self.seq(im)
imgs.append(img[np.newaxis, :].copy())
input_imgs = np.concatenate(imgs, axis=0)
return {"image": input_imgs}, False, None, ""
def postprocess(self, input_dicts, fetch_dict, log_id):
score_list = fetch_dict["prediction"]
result = {"label": [], "prob": []}
for score in score_list:
score = score.tolist()
max_score = max(score)
result["label"].append(self.label_dict[score.index(max_score)]
.strip().replace(",", ""))
result["prob"].append(max_score)
result["label"] = str(result["label"])
result["prob"] = str(result["prob"])
return result, None, ""
class ImageService(WebService):
def get_pipeline_response(self, read_op):
image_op = ImagenetOp(name="imagenet", input_ops=[read_op])
return image_op
uci_service = ImageService(name="imagenet")
uci_service.prepare_pipeline_config("config.yml")
uci_service.run_service()
#worker_num: maximum concurrency. When build_dag_each_worker=True, the framework creates worker_num processes, each building its own gRPC server and DAG
##When build_dag_each_worker=False, the framework sets max_workers=worker_num for the main thread's gRPC thread pool
worker_num: 1
#HTTP port; rpc_port and http_port must not both be empty. When rpc_port is available and http_port is empty, http_port is not generated automatically
http_port: 18080
rpc_port: 9993
dag:
    #op resource type: True for the thread model, False for the process model
    is_thread_op: False
op:
    imagenet:
        #Concurrency; thread concurrency when is_thread_op=True, otherwise process concurrency
        concurrency: 1
        #When the op configuration has no server_endpoints, the local service configuration is read from local_service_conf
        local_service_conf:
            #Model path
            model_config: ResNet50_vd_serving
            #Compute device type: if empty, determined by devices (CPU/GPU); 0=cpu, 1=gpu, 2=tensorRT, 3=arm cpu, 4=kunlun xpu
            device_type: 1
            #Compute device IDs: when devices is "" or unset, predict on CPU; when devices is "0" or "0,1,2", predict on the listed GPU cards
            devices: "0" # "0,1"
            #Client type: one of brpc, grpc, and local_predictor; local_predictor does not start a Serving service and predicts in-process
            client_type: local_predictor
            #Fetch result list, based on the alias_name of fetch_var in client_config
            fetch_list: ["prediction"]
import psutil

cpu_utilization = psutil.cpu_percent(1, False)
print('CPU_UTILIZATION:', cpu_utilization)
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import base64
from paddle_serving_server.web_service import WebService
import utils
class ImageService(WebService):
def __init__(self, name):
super(ImageService, self).__init__(name=name)
self.operators = self.create_operators()
def create_operators(self):
size = 224
img_mean = [0.485, 0.456, 0.406]
img_std = [0.229, 0.224, 0.225]
img_scale = 1.0 / 255.0
decode_op = utils.DecodeImage()
resize_op = utils.ResizeImage(resize_short=256)
crop_op = utils.CropImage(size=(size, size))
normalize_op = utils.NormalizeImage(
scale=img_scale, mean=img_mean, std=img_std)
totensor_op = utils.ToTensor()
return [decode_op, resize_op, crop_op, normalize_op, totensor_op]
def _process_image(self, data, ops):
for op in ops:
data = op(data)
return data
def preprocess(self, feed={}, fetch=[]):
feed_batch = []
for ins in feed:
if "image" not in ins:
raise ("feed data error!")
sample = base64.b64decode(ins["image"])
img = self._process_image(sample, self.operators)
feed_batch.append({"image": img})
return feed_batch, fetch
image_service = ImageService(name="image")
image_service.load_model_config(sys.argv[1])
image_service.prepare_server(
workdir=sys.argv[2], port=int(sys.argv[3]), device="cpu")
image_service.run_server()
image_service.run_flask()
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import base64
from paddle_serving_server_gpu.web_service import WebService
import utils
class ImageService(WebService):
def __init__(self, name):
super(ImageService, self).__init__(name=name)
self.operators = self.create_operators()
def create_operators(self):
size = 224
img_mean = [0.485, 0.456, 0.406]
img_std = [0.229, 0.224, 0.225]
img_scale = 1.0 / 255.0
decode_op = utils.DecodeImage()
resize_op = utils.ResizeImage(resize_short=256)
crop_op = utils.CropImage(size=(size, size))
normalize_op = utils.NormalizeImage(
scale=img_scale, mean=img_mean, std=img_std)
totensor_op = utils.ToTensor()
return [decode_op, resize_op, crop_op, normalize_op, totensor_op]
def _process_image(self, data, ops):
for op in ops:
data = op(data)
return data
def preprocess(self, feed={}, fetch=[]):
feed_batch = []
for ins in feed:
if "image" not in ins:
raise ("feed data error!")
sample = base64.b64decode(ins["image"])
img = self._process_image(sample, self.operators)
feed_batch.append({"image": img})
return feed_batch, fetch
image_service = ImageService(name="image")
image_service.load_model_config(sys.argv[1])
image_service.set_gpus("0")
image_service.prepare_server(
workdir=sys.argv[2], port=int(sys.argv[3]), device="gpu")
image_service.run_server()
image_service.run_flask()
tench, Tinca tinca,
goldfish, Carassius auratus,
great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias,
tiger shark, Galeocerdo cuvieri,
hammerhead, hammerhead shark,
electric ray, crampfish, numbfish, torpedo,
stingray,
cock,
hen,
ostrich, Struthio camelus,
brambling, Fringilla montifringilla,
goldfinch, Carduelis carduelis,
house finch, linnet, Carpodacus mexicanus,
junco, snowbird,
indigo bunting, indigo finch, indigo bird, Passerina cyanea,
robin, American robin, Turdus migratorius,
bulbul,
jay,
magpie,
chickadee,
water ouzel, dipper,
kite,
bald eagle, American eagle, Haliaeetus leucocephalus,
vulture,
great grey owl, great gray owl, Strix nebulosa,
European fire salamander, Salamandra salamandra,
common newt, Triturus vulgaris,
eft,
spotted salamander, Ambystoma maculatum,
axolotl, mud puppy, Ambystoma mexicanum,
bullfrog, Rana catesbeiana,
tree frog, tree-frog,
tailed frog, bell toad, ribbed toad, tailed toad, Ascaphus trui,
loggerhead, loggerhead turtle, Caretta caretta,
leatherback turtle, leatherback, leathery turtle, Dermochelys coriacea,
mud turtle,
terrapin,
box turtle, box tortoise,
banded gecko,
common iguana, iguana, Iguana iguana,
American chameleon, anole, Anolis carolinensis,
whiptail, whiptail lizard,
agama,
frilled lizard, Chlamydosaurus kingi,
alligator lizard,
Gila monster, Heloderma suspectum,
green lizard, Lacerta viridis,
African chameleon, Chamaeleo chamaeleon,
Komodo dragon, Komodo lizard, dragon lizard, giant lizard, Varanus komodoensis,
African crocodile, Nile crocodile, Crocodylus niloticus,
American alligator, Alligator mississipiensis,
triceratops,
thunder snake, worm snake, Carphophis amoenus,
ringneck snake, ring-necked snake, ring snake,
hognose snake, puff adder, sand viper,
green snake, grass snake,
king snake, kingsnake,
garter snake, grass snake,
water snake,
vine snake,
night snake, Hypsiglena torquata,
boa constrictor, Constrictor constrictor,
rock python, rock snake, Python sebae,
Indian cobra, Naja naja,
green mamba,
sea snake,
horned viper, cerastes, sand viper, horned asp, Cerastes cornutus,
diamondback, diamondback rattlesnake, Crotalus adamanteus,
sidewinder, horned rattlesnake, Crotalus cerastes,
trilobite,
harvestman, daddy longlegs, Phalangium opilio,
scorpion,
black and gold garden spider, Argiope aurantia,
barn spider, Araneus cavaticus,
garden spider, Aranea diademata,
black widow, Latrodectus mactans,
tarantula,
wolf spider, hunting spider,
tick,
centipede,
black grouse,
ptarmigan,
ruffed grouse, partridge, Bonasa umbellus,
prairie chicken, prairie grouse, prairie fowl,
peacock,
quail,
partridge,
African grey, African gray, Psittacus erithacus,
macaw,
sulphur-crested cockatoo, Kakatoe galerita, Cacatua galerita,
lorikeet,
coucal,
bee eater,
hornbill,
hummingbird,
jacamar,
toucan,
drake,
red-breasted merganser, Mergus serrator,
goose,
black swan, Cygnus atratus,
tusker,
echidna, spiny anteater, anteater,
platypus, duckbill, duckbilled platypus, duck-billed platypus, Ornithorhynchus anatinus,
wallaby, brush kangaroo,
koala, koala bear, kangaroo bear, native bear, Phascolarctos cinereus,
wombat,
jellyfish,
sea anemone, anemone,
brain coral,
flatworm, platyhelminth,
nematode, nematode worm, roundworm,
conch,
snail,
slug,
sea slug, nudibranch,
chiton, coat-of-mail shell, sea cradle, polyplacophore,
chambered nautilus, pearly nautilus, nautilus,
Dungeness crab, Cancer magister,
rock crab, Cancer irroratus,
fiddler crab,
king crab, Alaska crab, Alaskan king crab, Alaska king crab, Paralithodes camtschatica,
American lobster, Northern lobster, Maine lobster, Homarus americanus,
spiny lobster, langouste, rock lobster, crawfish, crayfish, sea crawfish,
crayfish, crawfish, crawdad, crawdaddy,
hermit crab,
isopod,
white stork, Ciconia ciconia,
black stork, Ciconia nigra,
spoonbill,
flamingo,
little blue heron, Egretta caerulea,
American egret, great white heron, Egretta albus,
bittern,
crane,
limpkin, Aramus pictus,
European gallinule, Porphyrio porphyrio,
American coot, marsh hen, mud hen, water hen, Fulica americana,
bustard,
ruddy turnstone, Arenaria interpres,
red-backed sandpiper, dunlin, Erolia alpina,
redshank, Tringa totanus,
dowitcher,
oystercatcher, oyster catcher,
pelican,
king penguin, Aptenodytes patagonica,
albatross, mollymawk,
grey whale, gray whale, devilfish, Eschrichtius gibbosus, Eschrichtius robustus,
killer whale, killer, orca, grampus, sea wolf, Orcinus orca,
dugong, Dugong dugon,
sea lion,
Chihuahua,
Japanese spaniel,
Maltese dog, Maltese terrier, Maltese,
Pekinese, Pekingese, Peke,
Shih-Tzu,
Blenheim spaniel,
papillon,
toy terrier,
Rhodesian ridgeback,
Afghan hound, Afghan,
basset, basset hound,
beagle,
bloodhound, sleuthhound,
bluetick,
black-and-tan coonhound,
Walker hound, Walker foxhound,
English foxhound,
redbone,
borzoi, Russian wolfhound,
Irish wolfhound,
Italian greyhound,
whippet,
Ibizan hound, Ibizan Podenco,
Norwegian elkhound, elkhound,
otterhound, otter hound,
Saluki, gazelle hound,
Scottish deerhound, deerhound,
Weimaraner,
Staffordshire bullterrier, Staffordshire bull terrier,
American Staffordshire terrier, Staffordshire terrier, American pit bull terrier, pit bull terrier,
Bedlington terrier,
Border terrier,
Kerry blue terrier,
Irish terrier,
Norfolk terrier,
Norwich terrier,
Yorkshire terrier,
wire-haired fox terrier,
Lakeland terrier,
Sealyham terrier, Sealyham,
Airedale, Airedale terrier,
cairn, cairn terrier,
Australian terrier,
Dandie Dinmont, Dandie Dinmont terrier,
Boston bull, Boston terrier,
miniature schnauzer,
giant schnauzer,
standard schnauzer,
Scotch terrier, Scottish terrier, Scottie,
Tibetan terrier, chrysanthemum dog,
silky terrier, Sydney silky,
soft-coated wheaten terrier,
West Highland white terrier,
Lhasa, Lhasa apso,
flat-coated retriever,
curly-coated retriever,
golden retriever,
Labrador retriever,
Chesapeake Bay retriever,
German short-haired pointer,
vizsla, Hungarian pointer,
English setter,
Irish setter, red setter,
Gordon setter,
Brittany spaniel,
clumber, clumber spaniel,
English springer, English springer spaniel,
Welsh springer spaniel,
cocker spaniel, English cocker spaniel, cocker,
Sussex spaniel,
Irish water spaniel,
kuvasz,
schipperke,
groenendael,
malinois,
briard,
kelpie,
komondor,
Old English sheepdog, bobtail,
Shetland sheepdog, Shetland sheep dog, Shetland,
collie,
Border collie,
Bouvier des Flandres, Bouviers des Flandres,
Rottweiler,
German shepherd, German shepherd dog, German police dog, alsatian,
Doberman, Doberman pinscher,
miniature pinscher,
Greater Swiss Mountain dog,
Bernese mountain dog,
Appenzeller,
EntleBucher,
boxer,
bull mastiff,
Tibetan mastiff,
French bulldog,
Great Dane,
Saint Bernard, St Bernard,
Eskimo dog, husky,
malamute, malemute, Alaskan malamute,
Siberian husky,
dalmatian, coach dog, carriage dog,
affenpinscher, monkey pinscher, monkey dog,
basenji,
pug, pug-dog,
Leonberg,
Newfoundland, Newfoundland dog,
Great Pyrenees,
Samoyed, Samoyede,
Pomeranian,
chow, chow chow,
keeshond,
Brabancon griffon,
Pembroke, Pembroke Welsh corgi,
Cardigan, Cardigan Welsh corgi,
toy poodle,
miniature poodle,
standard poodle,
Mexican hairless,
timber wolf, grey wolf, gray wolf, Canis lupus,
white wolf, Arctic wolf, Canis lupus tundrarum,
red wolf, maned wolf, Canis rufus, Canis niger,
coyote, prairie wolf, brush wolf, Canis latrans,
dingo, warrigal, warragal, Canis dingo,
dhole, Cuon alpinus,
African hunting dog, hyena dog, Cape hunting dog, Lycaon pictus,
hyena, hyaena,
red fox, Vulpes vulpes,
kit fox, Vulpes macrotis,
Arctic fox, white fox, Alopex lagopus,
grey fox, gray fox, Urocyon cinereoargenteus,
tabby, tabby cat,
tiger cat,
Persian cat,
Siamese cat, Siamese,
Egyptian cat,
cougar, puma, catamount, mountain lion, painter, panther, Felis concolor,
lynx, catamount,
leopard, Panthera pardus,
snow leopard, ounce, Panthera uncia,
jaguar, panther, Panthera onca, Felis onca,
lion, king of beasts, Panthera leo,
tiger, Panthera tigris,
cheetah, chetah, Acinonyx jubatus,
brown bear, bruin, Ursus arctos,
American black bear, black bear, Ursus americanus, Euarctos americanus,
ice bear, polar bear, Ursus Maritimus, Thalarctos maritimus,
sloth bear, Melursus ursinus, Ursus ursinus,
mongoose,
meerkat, mierkat,
tiger beetle,
ladybug, ladybeetle, lady beetle, ladybird, ladybird beetle,
ground beetle, carabid beetle,
long-horned beetle, longicorn, longicorn beetle,
leaf beetle, chrysomelid,
dung beetle,
rhinoceros beetle,
weevil,
fly,
bee,
ant, emmet, pismire,
grasshopper, hopper,
cricket,
walking stick, walkingstick, stick insect,
cockroach, roach,
mantis, mantid,
cicada, cicala,
leafhopper,
lacewing, lacewing fly,
"dragonfly, darning needle, devils darning needle, sewing needle, snake feeder, snake doctor, mosquito hawk, skeeter hawk",
damselfly,
admiral,
ringlet, ringlet butterfly,
monarch, monarch butterfly, milkweed butterfly, Danaus plexippus,
cabbage butterfly,
sulphur butterfly, sulfur butterfly,
lycaenid, lycaenid butterfly,
starfish, sea star,
sea urchin,
sea cucumber, holothurian,
wood rabbit, cottontail, cottontail rabbit,
hare,
Angora, Angora rabbit,
hamster,
porcupine, hedgehog,
fox squirrel, eastern fox squirrel, Sciurus niger,
marmot,
beaver,
guinea pig, Cavia cobaya,
sorrel,
zebra,
hog, pig, grunter, squealer, Sus scrofa,
wild boar, boar, Sus scrofa,
warthog,
hippopotamus, hippo, river horse, Hippopotamus amphibius,
ox,
water buffalo, water ox, Asiatic buffalo, Bubalus bubalis,
bison,
ram, tup,
bighorn, bighorn sheep, cimarron, Rocky Mountain bighorn, Rocky Mountain sheep, Ovis canadensis,
ibex, Capra ibex,
hartebeest,
impala, Aepyceros melampus,
gazelle,
Arabian camel, dromedary, Camelus dromedarius,
llama,
weasel,
mink,
polecat, fitch, foulmart, foumart, Mustela putorius,
black-footed ferret, ferret, Mustela nigripes,
otter,
skunk, polecat, wood pussy,
badger,
armadillo,
three-toed sloth, ai, Bradypus tridactylus,
orangutan, orang, orangutang, Pongo pygmaeus,
gorilla, Gorilla gorilla,
chimpanzee, chimp, Pan troglodytes,
gibbon, Hylobates lar,
siamang, Hylobates syndactylus, Symphalangus syndactylus,
guenon, guenon monkey,
patas, hussar monkey, Erythrocebus patas,
baboon,
macaque,
langur,
colobus, colobus monkey,
proboscis monkey, Nasalis larvatus,
marmoset,
capuchin, ringtail, Cebus capucinus,
howler monkey, howler,
titi, titi monkey,
spider monkey, Ateles geoffroyi,
squirrel monkey, Saimiri sciureus,
Madagascar cat, ring-tailed lemur, Lemur catta,
indri, indris, Indri indri, Indri brevicaudatus,
Indian elephant, Elephas maximus,
African elephant, Loxodonta africana,
lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens,
giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca,
barracouta, snoek,
eel,
coho, cohoe, coho salmon, blue jack, silver salmon, Oncorhynchus kisutch,
rock beauty, Holocanthus tricolor,
anemone fish,
sturgeon,
gar, garfish, garpike, billfish, Lepisosteus osseus,
lionfish,
puffer, pufferfish, blowfish, globefish,
abacus,
abaya,
"academic gown, academic robe, judges robe",
accordion, piano accordion, squeeze box,
acoustic guitar,
aircraft carrier, carrier, flattop, attack aircraft carrier,
airliner,
airship, dirigible,
altar,
ambulance,
amphibian, amphibious vehicle,
analog clock,
apiary, bee house,
apron,
ashcan, trash can, garbage can, wastebin, ash bin, ash-bin, ashbin, dustbin, trash barrel, trash bin,
assault rifle, assault gun,
backpack, back pack, knapsack, packsack, rucksack, haversack,
bakery, bakeshop, bakehouse,
balance beam, beam,
balloon,
ballpoint, ballpoint pen, ballpen, Biro,
Band Aid,
banjo,
bannister, banister, balustrade, balusters, handrail,
barbell,
barber chair,
barbershop,
barn,
barometer,
barrel, cask,
barrow, garden cart, lawn cart, wheelbarrow,
baseball,
basketball,
bassinet,
bassoon,
bathing cap, swimming cap,
bath towel,
bathtub, bathing tub, bath, tub,
beach wagon, station wagon, wagon, estate car, beach waggon, station waggon, waggon,
beacon, lighthouse, beacon light, pharos,
beaker,
bearskin, busby, shako,
beer bottle,
beer glass,
bell cote, bell cot,
bib,
bicycle-built-for-two, tandem bicycle, tandem,
bikini, two-piece,
binder, ring-binder,
binoculars, field glasses, opera glasses,
birdhouse,
boathouse,
bobsled, bobsleigh, bob,
bolo tie, bolo, bola tie, bola,
bonnet, poke bonnet,
bookcase,
bookshop, bookstore, bookstall,
bottlecap,
bow,
bow tie, bow-tie, bowtie,
brass, memorial tablet, plaque,
brassiere, bra, bandeau,
breakwater, groin, groyne, mole, bulwark, seawall, jetty,
breastplate, aegis, egis,
broom,
bucket, pail,
buckle,
bulletproof vest,
bullet train, bullet,
butcher shop, meat market,
cab, hack, taxi, taxicab,
caldron, cauldron,
candle, taper, wax light,
cannon,
canoe,
can opener, tin opener,
cardigan,
car mirror,
carousel, carrousel, merry-go-round, roundabout, whirligig,
"carpenters kit, tool kit",
carton,
car wheel,
cash machine, cash dispenser, automated teller machine, automatic teller machine, automated teller, automatic teller, ATM,
cassette,
cassette player,
castle,
catamaran,
CD player,
cello, violoncello,
cellular telephone, cellular phone, cellphone, cell, mobile phone,
chain,
chainlink fence,
chain mail, ring mail, mail, chain armor, chain armour, ring armor, ring armour,
chain saw, chainsaw,
chest,
chiffonier, commode,
chime, bell, gong,
china cabinet, china closet,
Christmas stocking,
church, church building,
cinema, movie theater, movie theatre, movie house, picture palace,
cleaver, meat cleaver, chopper,
cliff dwelling,
cloak,
clog, geta, patten, sabot,
cocktail shaker,
coffee mug,
coffeepot,
coil, spiral, volute, whorl, helix,
combination lock,
computer keyboard, keypad,
confectionery, confectionary, candy store,
container ship, containership, container vessel,
convertible,
corkscrew, bottle screw,
cornet, horn, trumpet, trump,
cowboy boot,
cowboy hat, ten-gallon hat,
cradle,
crane,
crash helmet,
crate,
crib, cot,
Crock Pot,
croquet ball,
crutch,
cuirass,
dam, dike, dyke,
desk,
desktop computer,
dial telephone, dial phone,
diaper, nappy, napkin,
digital clock,
digital watch,
dining table, board,
dishrag, dishcloth,
dishwasher, dish washer, dishwashing machine,
disk brake, disc brake,
dock, dockage, docking facility,
dogsled, dog sled, dog sleigh,
dome,
doormat, welcome mat,
drilling platform, offshore rig,
drum, membranophone, tympan,
drumstick,
dumbbell,
Dutch oven,
electric fan, blower,
electric guitar,
electric locomotive,
entertainment center,
envelope,
espresso maker,
face powder,
feather boa, boa,
file, file cabinet, filing cabinet,
fireboat,
fire engine, fire truck,
fire screen, fireguard,
flagpole, flagstaff,
flute, transverse flute,
folding chair,
football helmet,
forklift,
fountain,
fountain pen,
four-poster,
freight car,
French horn, horn,
frying pan, frypan, skillet,
fur coat,
garbage truck, dustcart,
gasmask, respirator, gas helmet,
gas pump, gasoline pump, petrol pump, island dispenser,
goblet,
go-kart,
golf ball,
golfcart, golf cart,
gondola,
gong, tam-tam,
gown,
grand piano, grand,
greenhouse, nursery, glasshouse,
grille, radiator grille,
grocery store, grocery, food market, market,
guillotine,
hair slide,
hair spray,
half track,
hammer,
hamper,
hand blower, blow dryer, blow drier, hair dryer, hair drier,
hand-held computer, hand-held microcomputer,
handkerchief, hankie, hanky, hankey,
hard disc, hard disk, fixed disk,
harmonica, mouth organ, harp, mouth harp,
harp,
harvester, reaper,
hatchet,
holster,
home theater, home theatre,
honeycomb,
hook, claw,
hoopskirt, crinoline,
horizontal bar, high bar,
horse cart, horse-cart,
hourglass,
iPod,
iron, smoothing iron,
"jack-o-lantern",
jean, blue jean, denim,
jeep, landrover,
jersey, T-shirt, tee shirt,
jigsaw puzzle,
jinrikisha, ricksha, rickshaw,
joystick,
kimono,
knee pad,
knot,
lab coat, laboratory coat,
ladle,
lampshade, lamp shade,
laptop, laptop computer,
lawn mower, mower,
lens cap, lens cover,
letter opener, paper knife, paperknife,
library,
lifeboat,
lighter, light, igniter, ignitor,
limousine, limo,
liner, ocean liner,
lipstick, lip rouge,
Loafer,
lotion,
loudspeaker, speaker, speaker unit, loudspeaker system, speaker system,
"loupe, jewelers loupe",
lumbermill, sawmill,
magnetic compass,
mailbag, postbag,
mailbox, letter box,
maillot,
maillot, tank suit,
manhole cover,
maraca,
marimba, xylophone,
mask,
matchstick,
maypole,
maze, labyrinth,
measuring cup,
medicine chest, medicine cabinet,
megalith, megalithic structure,
microphone, mike,
microwave, microwave oven,
military uniform,
milk can,
minibus,
miniskirt, mini,
minivan,
missile,
mitten,
mixing bowl,
mobile home, manufactured home,
Model T,
modem,
monastery,
monitor,
moped,
mortar,
mortarboard,
mosque,
mosquito net,
motor scooter, scooter,
mountain bike, all-terrain bike, off-roader,
mountain tent,
mouse, computer mouse,
mousetrap,
moving van,
muzzle,
nail,
neck brace,
necklace,
nipple,
notebook, notebook computer,
obelisk,
oboe, hautboy, hautbois,
ocarina, sweet potato,
odometer, hodometer, mileometer, milometer,
oil filter,
organ, pipe organ,
oscilloscope, scope, cathode-ray oscilloscope, CRO,
overskirt,
oxcart,
oxygen mask,
packet,
paddle, boat paddle,
paddlewheel, paddle wheel,
padlock,
paintbrush,
"pajama, pyjama, pjs, jammies",
palace,
panpipe, pandean pipe, syrinx,
paper towel,
parachute, chute,
parallel bars, bars,
park bench,
parking meter,
passenger car, coach, carriage,
patio, terrace,
pay-phone, pay-station,
pedestal, plinth, footstall,
pencil box, pencil case,
pencil sharpener,
perfume, essence,
Petri dish,
photocopier,
pick, plectrum, plectron,
pickelhaube,
picket fence, paling,
pickup, pickup truck,
pier,
piggy bank, penny bank,
pill bottle,
pillow,
ping-pong ball,
pinwheel,
pirate, pirate ship,
pitcher, ewer,
"plane, carpenters plane, woodworking plane",
planetarium,
plastic bag,
plate rack,
plow, plough,
"plunger, plumbers helper",
Polaroid camera, Polaroid Land camera,
pole,
police van, police wagon, paddy wagon, patrol wagon, wagon, black Maria,
poncho,
pool table, billiard table, snooker table,
pop bottle, soda bottle,
pot, flowerpot,
"potters wheel",
power drill,
prayer rug, prayer mat,
printer,
prison, prison house,
projectile, missile,
projector,
puck, hockey puck,
punching bag, punch bag, punching ball, punchball,
purse,
quill, quill pen,
quilt, comforter, comfort, puff,
racer, race car, racing car,
racket, racquet,
radiator,
radio, wireless,
radio telescope, radio reflector,
rain barrel,
recreational vehicle, RV, R.V.,
reel,
reflex camera,
refrigerator, icebox,
remote control, remote,
restaurant, eating house, eating place, eatery,
revolver, six-gun, six-shooter,
rifle,
rocking chair, rocker,
rotisserie,
rubber eraser, rubber, pencil eraser,
rugby ball,
rule, ruler,
running shoe,
safe,
safety pin,
saltshaker, salt shaker,
sandal,
sarong,
sax, saxophone,
scabbard,
scale, weighing machine,
school bus,
schooner,
scoreboard,
screen, CRT screen,
screw,
screwdriver,
seat belt, seatbelt,
sewing machine,
shield, buckler,
shoe shop, shoe-shop, shoe store,
shoji,
shopping basket,
shopping cart,
shovel,
shower cap,
shower curtain,
ski,
ski mask,
sleeping bag,
slide rule, slipstick,
sliding door,
slot, one-armed bandit,
snorkel,
snowmobile,
snowplow, snowplough,
soap dispenser,
soccer ball,
sock,
solar dish, solar collector, solar furnace,
sombrero,
soup bowl,
space bar,
space heater,
space shuttle,
spatula,
speedboat,
"spider web, spiders web",
spindle,
sports car, sport car,
spotlight, spot,
stage,
steam locomotive,
steel arch bridge,
steel drum,
stethoscope,
stole,
stone wall,
stopwatch, stop watch,
stove,
strainer,
streetcar, tram, tramcar, trolley, trolley car,
stretcher,
studio couch, day bed,
stupa, tope,
submarine, pigboat, sub, U-boat,
suit, suit of clothes,
sundial,
sunglass,
sunglasses, dark glasses, shades,
sunscreen, sunblock, sun blocker,
suspension bridge,
swab, swob, mop,
sweatshirt,
swimming trunks, bathing trunks,
swing,
switch, electric switch, electrical switch,
syringe,
table lamp,
tank, army tank, armored combat vehicle, armoured combat vehicle,
tape player,
teapot,
teddy, teddy bear,
television, television system,
tennis ball,
thatch, thatched roof,
theater curtain, theatre curtain,
thimble,
thresher, thrasher, threshing machine,
throne,
tile roof,
toaster,
tobacco shop, tobacconist shop, tobacconist,
toilet seat,
torch,
totem pole,
tow truck, tow car, wrecker,
toyshop,
tractor,
trailer truck, tractor trailer, trucking rig, rig, articulated lorry, semi,
tray,
trench coat,
tricycle, trike, velocipede,
trimaran,
tripod,
triumphal arch,
trolleybus, trolley coach, trackless trolley,
trombone,
tub, vat,
turnstile,
typewriter keyboard,
umbrella,
unicycle, monocycle,
upright, upright piano,
vacuum, vacuum cleaner,
vase,
vault,
velvet,
vending machine,
vestment,
viaduct,
violin, fiddle,
volleyball,
waffle iron,
wall clock,
wallet, billfold, notecase, pocketbook,
wardrobe, closet, press,
warplane, military plane,
washbasin, handbasin, washbowl, lavabo, wash-hand basin,
washer, automatic washer, washing machine,
water bottle,
water jug,
water tower,
whiskey jug,
whistle,
wig,
window screen,
window shade,
Windsor tie,
wine bottle,
wing,
wok,
wooden spoon,
wool, woolen, woollen,
worm fence, snake fence, snake-rail fence, Virginia fence,
wreck,
yawl,
yurt,
web site, website, internet site, site,
comic book,
crossword puzzle, crossword,
street sign,
traffic light, traffic signal, stoplight,
book jacket, dust cover, dust jacket, dust wrapper,
menu,
plate,
guacamole,
consomme,
hot pot, hotpot,
trifle,
ice cream, icecream,
ice lolly, lolly, lollipop, popsicle,
French loaf,
bagel, beigel,
pretzel,
cheeseburger,
hotdog, hot dog, red hot,
mashed potato,
head cabbage,
broccoli,
cauliflower,
zucchini, courgette,
spaghetti squash,
acorn squash,
butternut squash,
cucumber, cuke,
artichoke, globe artichoke,
bell pepper,
cardoon,
mushroom,
Granny Smith,
strawberry,
orange,
lemon,
fig,
pineapple, ananas,
banana,
jackfruit, jak, jack,
custard apple,
pomegranate,
hay,
carbonara,
chocolate sauce, chocolate syrup,
dough,
meat loaf, meatloaf,
pizza, pizza pie,
potpie,
burrito,
red wine,
espresso,
cup,
eggnog,
alp,
bubble,
cliff, drop, drop-off,
coral reef,
geyser,
lakeside, lakeshore,
promontory, headland, head, foreland,
sandbar, sand bar,
seashore, coast, seacoast, sea-coast,
valley, vale,
volcano,
ballplayer, baseball player,
groom, bridegroom,
scuba diver,
rapeseed,
daisy,
"yellow ladys slipper, yellow lady-slipper, Cypripedium calceolus, Cypripedium parviflorum",
corn,
acorn,
hip, rose hip, rosehip,
buckeye, horse chestnut, conker,
coral fungus,
agaric,
gyromitra,
stinkhorn, carrion fungus,
earthstar,
hen-of-the-woods, hen of the woods, Polyporus frondosus, Grifola frondosa,
bolete,
ear, spike, capitulum,
toilet tissue, toilet paper, bathroom tissue
import requests
import json
import base64
import os
def cv2_to_base64(image):
    # Encode the raw image bytes as a base64 string for the JSON payload.
    return base64.b64encode(image).decode('utf8')


if __name__ == "__main__":
    url = "http://127.0.0.1:18080/imagenet/prediction"
    with open(os.path.join(".", "daisy.jpg"), 'rb') as file:
        image_data1 = file.read()
    image = cv2_to_base64(image_data1)
    # The pipeline HTTP service expects parallel "key"/"value" lists.
    data = {"key": ["image"], "value": [image]}
    # Repeat the request 100 times as a simple smoke/latency check.
    for i in range(100):
        r = requests.post(url=url, data=json.dumps(data))
        print(r.json())
...@@ -11,36 +11,23 @@ ...@@ -11,36 +11,23 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
# limitations under the License. # limitations under the License.
try:
import requests from paddle_serving_server_gpu.pipeline import PipelineClient
except ImportError:
from paddle_serving_server.pipeline import PipelineClient
import base64 import base64
import json
import sys
import numpy as np
py_version = sys.version_info[0]
def predict(image_path, server): client = PipelineClient()
client.connect(['127.0.0.1:9993'])
with open(image_path, "rb") as f:
image = base64.b64encode(f.read()).decode("utf-8")
req = json.dumps({"feed": [{"image": image}], "fetch": ["prediction"]})
r = requests.post(
server, data=req, headers={"Content-Type": "application/json"})
try:
pred = r.json()["result"]["prediction"][0]
cls_id = np.argmax(pred)
score = pred[cls_id]
pred = {"cls_id": cls_id, "score": score}
return pred
except ValueError:
print(r.text)
return r
def cv2_to_base64(image):
return base64.b64encode(image).decode('utf8')
if __name__ == "__main__": if __name__ == "__main__":
server = "http://127.0.0.1:{}/image/prediction".format(sys.argv[1]) with open("daisy.jpg", 'rb') as file:
image_file = sys.argv[2] image_data = file.read()
res = predict(image_file, server) image = cv2_to_base64(image_data)
print("res:", res)
for i in range(1):
ret = client.predict(feed_dict={"image": image}, fetch=["label", "prob"])
print(ret)
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import cv2
import numpy as np
class DecodeImage(object):
def __init__(self, to_rgb=True):
self.to_rgb = to_rgb
def __call__(self, img):
data = np.frombuffer(img, dtype='uint8')
img = cv2.imdecode(data, 1)
if self.to_rgb:
assert img.shape[2] == 3, 'invalid shape of image[%s]' % (
img.shape)
img = img[:, :, ::-1]
return img
class ResizeImage(object):
def __init__(self, resize_short=None):
self.resize_short = resize_short
def __call__(self, img):
img_h, img_w = img.shape[:2]
percent = float(self.resize_short) / min(img_w, img_h)
w = int(round(img_w * percent))
h = int(round(img_h * percent))
return cv2.resize(img, (w, h))
class CropImage(object):
def __init__(self, size):
if type(size) is int:
self.size = (size, size)
else:
self.size = size
def __call__(self, img):
w, h = self.size
img_h, img_w = img.shape[:2]
w_start = (img_w - w) // 2
h_start = (img_h - h) // 2
w_end = w_start + w
h_end = h_start + h
return img[h_start:h_end, w_start:w_end, :]
class NormalizeImage(object):
def __init__(self, scale=None, mean=None, std=None):
self.scale = np.float32(scale if scale is not None else 1.0 / 255.0)
mean = mean if mean is not None else [0.485, 0.456, 0.406]
std = std if std is not None else [0.229, 0.224, 0.225]
shape = (1, 1, 3)
self.mean = np.array(mean).reshape(shape).astype('float32')
self.std = np.array(std).reshape(shape).astype('float32')
def __call__(self, img):
return (img.astype('float32') * self.scale - self.mean) / self.std
class ToTensor(object):
def __init__(self):
pass
def __call__(self, img):
img = img.transpose((2, 0, 1))
return img
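These operators are meant to be chained in order. Below is a minimal sketch of composing them on a local file; the file name and the 256/224 sizes are illustrative, following the usual ImageNet setup:

```python
# Hypothetical composition of the operators defined above.
ops = [
    DecodeImage(to_rgb=True),
    ResizeImage(resize_short=256),
    CropImage(size=224),
    NormalizeImage(),  # defaults: scale=1/255, ImageNet mean/std
    ToTensor(),
]

with open("test.jpg", "rb") as f:  # "test.jpg" is a placeholder
    data = f.read()
for op in ops:
    data = op(data)
print(data.shape)  # (3, 224, 224): CHW float32, ready for the predictor
```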
...@@ -71,7 +71,6 @@ class GalleryBuilder(object): ...@@ -71,7 +71,6 @@ class GalleryBuilder(object):
# when removing data from the index, there is no need to extract features # when removing data from the index, there is no need to extract features
if operation_method != "remove": if operation_method != "remove":
gallery_features = self._extract_features(gallery_images, config) gallery_features = self._extract_features(gallery_images, config)
assert operation_method in [ assert operation_method in [
"new", "remove", "append" "new", "remove", "append"
], "Only append, remove and new operation are supported" ], "Only append, remove and new operation are supported"
...@@ -104,11 +103,23 @@ class GalleryBuilder(object): ...@@ -104,11 +103,23 @@ class GalleryBuilder(object):
if index_method == "IVF": if index_method == "IVF":
index_method = index_method + str( index_method = index_method + str(
min(int(len(gallery_images) // 8), 65536)) + ",Flat" min(int(len(gallery_images) // 8), 65536)) + ",Flat"
# for binary index, add B at head of index_method
if config["dist_type"] == "hamming":
index_method = "B" + index_method
#dist_type
dist_type = faiss.METRIC_INNER_PRODUCT if config[ dist_type = faiss.METRIC_INNER_PRODUCT if config[
"dist_type"] == "IP" else faiss.METRIC_L2 "dist_type"] == "IP" else faiss.METRIC_L2
index = faiss.index_factory(config["embedding_size"], index_method,
dist_type) #build index
index = faiss.IndexIDMap2(index) if config["dist_type"] == "hamming":
index = faiss.index_binary_factory(config["embedding_size"],
index_method)
else:
index = faiss.index_factory(config["embedding_size"],
index_method, dist_type)
index = faiss.IndexIDMap2(index)
ids = {} ids = {}
if config["index_method"] == "HNSW32": if config["index_method"] == "HNSW32":
...@@ -123,8 +134,13 @@ class GalleryBuilder(object): ...@@ -123,8 +134,13 @@ class GalleryBuilder(object):
# only train when new index file # only train when new index file
if operation_method == "new": if operation_method == "new":
index.train(gallery_features) if config["dist_type"] == "hamming":
index.add_with_ids(gallery_features, ids_now) index.add(gallery_features)
else:
index.train(gallery_features)
if not config["dist_type"] == "hamming":
index.add_with_ids(gallery_features, ids_now)
for i, d in zip(list(ids_now), gallery_docs): for i, d in zip(list(ids_now), gallery_docs):
ids[i] = d ids[i] = d
...@@ -142,15 +158,26 @@ class GalleryBuilder(object): ...@@ -142,15 +158,26 @@ class GalleryBuilder(object):
del ids[k] del ids[k]
# store faiss index file and id_map file # store faiss index file and id_map file
faiss.write_index(index, if config["dist_type"] == "hamming":
os.path.join(config["index_dir"], "vector.index")) faiss.write_index_binary(
index, os.path.join(config["index_dir"], "vector.index"))
else:
faiss.write_index(
index, os.path.join(config["index_dir"], "vector.index"))
with open(os.path.join(config["index_dir"], "id_map.pkl"), 'wb') as fd: with open(os.path.join(config["index_dir"], "id_map.pkl"), 'wb') as fd:
pickle.dump(ids, fd) pickle.dump(ids, fd)
def _extract_features(self, gallery_images, config): def _extract_features(self, gallery_images, config):
# extract gallery features # extract gallery features
gallery_features = np.zeros( if config["dist_type"] == "hamming":
[len(gallery_images), config['embedding_size']], dtype=np.float32) gallery_features = np.zeros(
[len(gallery_images), config['embedding_size'] // 8],
dtype=np.uint8)
else:
gallery_features = np.zeros(
[len(gallery_images), config['embedding_size']],
dtype=np.float32)
#construct batch imgs and do inference #construct batch imgs and do inference
batch_size = config.get("batch_size", 32) batch_size = config.get("batch_size", 32)
...@@ -172,6 +199,7 @@ class GalleryBuilder(object): ...@@ -172,6 +199,7 @@ class GalleryBuilder(object):
rec_feat = self.rec_predictor.predict(batch_img) rec_feat = self.rec_predictor.predict(batch_img)
gallery_features[-len(batch_img):, :] = rec_feat gallery_features[-len(batch_img):, :] = rec_feat
batch_img = [] batch_img = []
return gallery_features return gallery_features
......
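The hamming branch added above uses faiss's binary index family, where vectors are stored as packed uint8 bytes and distances are Hamming distances. A self-contained sketch of that API (dimensions and data here are made up):

```python
import faiss
import numpy as np

d = 128  # embedding size in bits; must be a multiple of 8
db = np.random.randint(0, 256, size=(1000, d // 8), dtype=np.uint8)

# "BFlat" is the binary counterpart of "Flat"; hence the "B" prefix used above.
index = faiss.index_binary_factory(d, "BFlat")
index.add(db)

distances, ids = index.search(db[:3], 5)  # integer Hamming distances, top-5 ids
print(distances.shape, ids.shape)         # (3, 5) (3, 5)
```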
...@@ -62,6 +62,7 @@ class Topk(object): ...@@ -62,6 +62,7 @@ class Topk(object):
def parse_class_id_map(self, class_id_map_file): def parse_class_id_map(self, class_id_map_file):
if class_id_map_file is None: if class_id_map_file is None:
return None return None
if not os.path.exists(class_id_map_file): if not os.path.exists(class_id_map_file):
print( print(
"Warning: If want to use your own label_dict, please input legal path!\nOtherwise label_names will be empty!" "Warning: If want to use your own label_dict, please input legal path!\nOtherwise label_names will be empty!"
...@@ -126,3 +127,24 @@ class SavePreLabel(object): ...@@ -126,3 +127,24 @@ class SavePreLabel(object):
output_dir = self.save_dir(str(id)) output_dir = self.save_dir(str(id))
os.makedirs(output_dir, exist_ok=True) os.makedirs(output_dir, exist_ok=True)
shutil.copy(image_file, output_dir) shutil.copy(image_file, output_dir)
class Binarize(object):
    def __init__(self, method="round"):
        self.method = method
        # Bit weights for packing 8 binary values into one byte, MSB first.
        self.unit = np.array([[128, 64, 32, 16, 8, 4, 2, 1]]).T

    def __call__(self, x, file_names=None):
        if self.method == "round":
            x = np.round(x + 1).astype("uint8") - 1
        if self.method == "sign":
            x = ((np.sign(x) + 1) / 2).astype("uint8")

        embedding_size = x.shape[1]
        assert embedding_size % 8 == 0, "The binary index only supports vectors whose size is a multiple of 8"

        byte = np.zeros([x.shape[0], embedding_size // 8], dtype=np.uint8)
        for i in range(embedding_size // 8):
            byte[:, i:i + 1] = np.dot(x[:, i * 8:(i + 1) * 8], self.unit)
        return byte
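Incidentally, since `self.unit` packs bits MSB-first, the byte loop in `__call__` agrees with NumPy's default big-endian `np.packbits`. A quick sketch verifying the equivalence on 0/1 inputs (not part of the original code):

```python
import numpy as np

x = (np.random.rand(4, 32) > 0.5).astype("uint8")  # 0/1 codes, width divisible by 8
unit = np.array([[128, 64, 32, 16, 8, 4, 2, 1]]).T

packed = np.zeros([x.shape[0], x.shape[1] // 8], dtype=np.uint8)
for i in range(x.shape[1] // 8):  # loop-based packing, as in Binarize above
    packed[:, i:i + 1] = np.dot(x[:, i * 8:(i + 1) * 8], unit)

assert np.array_equal(packed, np.packbits(x, axis=1))  # vectorized equivalent
```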
...@@ -47,8 +47,14 @@ class SystemPredictor(object): ...@@ -47,8 +47,14 @@ class SystemPredictor(object):
index_dir, "vector.index")), "vector.index not found ..." index_dir, "vector.index")), "vector.index not found ..."
assert os.path.exists(os.path.join( assert os.path.exists(os.path.join(
index_dir, "id_map.pkl")), "id_map.pkl not found ... " index_dir, "id_map.pkl")), "id_map.pkl not found ... "
self.Searcher = faiss.read_index(
os.path.join(index_dir, "vector.index")) if config['IndexProcess'].get("binary_index", False):
self.Searcher = faiss.read_index_binary(
os.path.join(index_dir, "vector.index"))
else:
self.Searcher = faiss.read_index(
os.path.join(index_dir, "vector.index"))
with open(os.path.join(index_dir, "id_map.pkl"), "rb") as fd: with open(os.path.join(index_dir, "id_map.pkl"), "rb") as fd:
self.id_map = pickle.load(fd) self.id_map = pickle.load(fd)
...@@ -105,6 +111,7 @@ class SystemPredictor(object): ...@@ -105,6 +111,7 @@ class SystemPredictor(object):
rec_results = self.rec_predictor.predict(crop_img) rec_results = self.rec_predictor.predict(crop_img)
preds["bbox"] = [xmin, ymin, xmax, ymax] preds["bbox"] = [xmin, ymin, xmax, ymax]
scores, docs = self.Searcher.search(rec_results, self.return_k) scores, docs = self.Searcher.search(rec_results, self.return_k)
# just top-1 result will be returned for the final # just top-1 result will be returned for the final
if scores[0][0] >= self.config["IndexProcess"]["score_thres"]: if scores[0][0] >= self.config["IndexProcess"]["score_thres"]:
preds["rec_docs"] = self.id_map[docs[0][0]].split()[1] preds["rec_docs"] = self.id_map[docs[0][0]].split()[1]
......
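Correspondingly, loading and querying the binary index only changes the reader call and the query dtype. A small sketch with placeholder paths:

```python
import faiss
import numpy as np

# Binary branch: the index stores packed uint8 codes and returns Hamming distances.
searcher = faiss.read_index_binary("./index_dir/vector.index")
query = np.zeros((1, searcher.d // 8), dtype=np.uint8)  # searcher.d is in bits
scores, docs = searcher.search(query, 5)
```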
...@@ -19,12 +19,14 @@ from __future__ import division ...@@ -19,12 +19,14 @@ from __future__ import division
from __future__ import print_function from __future__ import print_function
from __future__ import unicode_literals from __future__ import unicode_literals
from functools import partial
import six import six
import math import math
import random import random
import cv2 import cv2
import numpy as np import numpy as np
import importlib import importlib
from PIL import Image
from python.det_preprocess import DetNormalizeImage, DetPadStride, DetPermute, DetResize from python.det_preprocess import DetNormalizeImage, DetPadStride, DetPermute, DetResize
...@@ -50,6 +52,47 @@ def create_operators(params): ...@@ -50,6 +52,47 @@ def create_operators(params):
return ops return ops
class UnifiedResize(object):
def __init__(self, interpolation=None, backend="cv2"):
_cv2_interp_from_str = {
'nearest': cv2.INTER_NEAREST,
'bilinear': cv2.INTER_LINEAR,
'area': cv2.INTER_AREA,
'bicubic': cv2.INTER_CUBIC,
'lanczos': cv2.INTER_LANCZOS4
}
_pil_interp_from_str = {
'nearest': Image.NEAREST,
'bilinear': Image.BILINEAR,
'bicubic': Image.BICUBIC,
'box': Image.BOX,
'lanczos': Image.LANCZOS,
'hamming': Image.HAMMING
}
def _pil_resize(src, size, resample):
pil_img = Image.fromarray(src)
pil_img = pil_img.resize(size, resample)
return np.asarray(pil_img)
if backend.lower() == "cv2":
if isinstance(interpolation, str):
interpolation = _cv2_interp_from_str[interpolation.lower()]
self.resize_func = partial(cv2.resize, interpolation=interpolation)
elif backend.lower() == "pil":
if isinstance(interpolation, str):
interpolation = _pil_interp_from_str[interpolation.lower()]
self.resize_func = partial(_pil_resize, resample=interpolation)
else:
logger.warning(
f"The backend of Resize only support \"cv2\" or \"PIL\". \"f{backend}\" is unavailable. Use \"cv2\" instead."
)
self.resize_func = cv2.resize
def __call__(self, src, size):
return self.resize_func(src, size)
class OperatorParamError(ValueError): class OperatorParamError(ValueError):
""" OperatorParamError """ OperatorParamError
""" """
...@@ -87,8 +130,11 @@ class DecodeImage(object): ...@@ -87,8 +130,11 @@ class DecodeImage(object):
class ResizeImage(object): class ResizeImage(object):
""" resize image """ """ resize image """
def __init__(self, size=None, resize_short=None, interpolation=-1): def __init__(self,
self.interpolation = interpolation if interpolation >= 0 else None size=None,
resize_short=None,
interpolation=None,
backend="cv2"):
if resize_short is not None and resize_short > 0: if resize_short is not None and resize_short > 0:
self.resize_short = resize_short self.resize_short = resize_short
self.w = None self.w = None
...@@ -101,6 +147,9 @@ class ResizeImage(object): ...@@ -101,6 +147,9 @@ class ResizeImage(object):
raise OperatorParamError("invalid params for ResizeImage for '\ raise OperatorParamError("invalid params for ResizeImage for '\
'both 'size' and 'resize_short' are None") 'both 'size' and 'resize_short' are None")
self._resize_func = UnifiedResize(
interpolation=interpolation, backend=backend)
def __call__(self, img): def __call__(self, img):
img_h, img_w = img.shape[:2] img_h, img_w = img.shape[:2]
if self.resize_short is not None: if self.resize_short is not None:
...@@ -110,10 +159,7 @@ class ResizeImage(object): ...@@ -110,10 +159,7 @@ class ResizeImage(object):
else: else:
w = self.w w = self.w
h = self.h h = self.h
if self.interpolation is None: return self._resize_func(img, (w, h))
return cv2.resize(img, (w, h))
else:
return cv2.resize(img, (w, h), interpolation=self.interpolation)
class CropImage(object): class CropImage(object):
...@@ -145,9 +191,12 @@ class CropImage(object): ...@@ -145,9 +191,12 @@ class CropImage(object):
class RandCropImage(object): class RandCropImage(object):
""" random crop image """ """ random crop image """
def __init__(self, size, scale=None, ratio=None, interpolation=-1): def __init__(self,
size,
self.interpolation = interpolation if interpolation >= 0 else None scale=None,
ratio=None,
interpolation=None,
backend="cv2"):
if type(size) is int: if type(size) is int:
self.size = (size, size) # (h, w) self.size = (size, size) # (h, w)
else: else:
...@@ -156,6 +205,9 @@ class RandCropImage(object): ...@@ -156,6 +205,9 @@ class RandCropImage(object):
self.scale = [0.08, 1.0] if scale is None else scale self.scale = [0.08, 1.0] if scale is None else scale
self.ratio = [3. / 4., 4. / 3.] if ratio is None else ratio self.ratio = [3. / 4., 4. / 3.] if ratio is None else ratio
self._resize_func = UnifiedResize(
interpolation=interpolation, backend=backend)
def __call__(self, img): def __call__(self, img):
size = self.size size = self.size
scale = self.scale scale = self.scale
...@@ -181,10 +233,8 @@ class RandCropImage(object): ...@@ -181,10 +233,8 @@ class RandCropImage(object):
j = random.randint(0, img_h - h) j = random.randint(0, img_h - h)
img = img[j:j + h, i:i + w, :] img = img[j:j + h, i:i + w, :]
if self.interpolation is None:
return cv2.resize(img, size) return self._resize_func(img, size)
else:
return cv2.resize(img, size, interpolation=self.interpolation)
class RandFlipImage(object): class RandFlipImage(object):
......
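As a side note on the `UnifiedResize` wrapper introduced in the diff above, here is a minimal, hypothetical usage sketch (the input array and target size are made up):

```python
import numpy as np

img = np.zeros((256, 320, 3), dtype=np.uint8)  # HWC uint8 image, as DecodeImage yields

# cv2 backend with a named interpolation method
resize_cv2 = UnifiedResize(interpolation="bilinear", backend="cv2")
print(resize_cv2(img, (224, 224)).shape)  # (224, 224, 3); size is (w, h)

# PIL backend: the array is round-tripped through PIL.Image internally
resize_pil = UnifiedResize(interpolation="bicubic", backend="pil")
print(resize_pil(img, (224, 224)).shape)  # (224, 224, 3)
```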
## Introduction ## Introduction to Slim Features
Complex models help improve model performance, but they also introduce a certain amount of redundancy; model quantization shrinks full precision to fixed-point numbers to reduce this redundancy, so as to lower the model's computational complexity and improve its inference performance. Complex models help improve model performance, but they also introduce a certain amount of redundancy. This part provides functions for slimming models, in two parts: model quantization (quantization-aware training and offline quantization) and model pruning.
Among them, model quantization shrinks full precision to fixed-point numbers to reduce this redundancy, so as to lower the model's computational complexity and improve its inference performance.
Model quantization can convert FP32 model parameters to Int8 precision with essentially no loss of accuracy, reducing parameter size and speeding up computation; the quantized model has a clear speed advantage when deployed on mobile and similar targets. Model quantization can convert FP32 model parameters to Int8 precision with essentially no loss of accuracy, reducing parameter size and speeding up computation; the quantized model has a clear speed advantage when deployed on mobile and similar targets.
Model pruning cuts out unimportant convolution kernels in the CNN to reduce the number of model parameters and thus the model's computational complexity.
This tutorial introduces how to use PaddleSlim, Paddle's model compression library, to compress PaddleClas models. This tutorial introduces how to use PaddleSlim, Paddle's model compression library, to compress PaddleClas models.
[PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim) integrates model pruning, quantization (including quantization-aware training and offline quantization), distillation, neural architecture search and other model compression techniques that are widely used and leading in the industry; if you are interested, feel free to follow and learn more. [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim) integrates model pruning, quantization (including quantization-aware training and offline quantization), distillation, neural architecture search and other model compression techniques that are widely used and leading in the industry; if you are interested, feel free to follow and learn more.
Before starting this tutorial, it is recommended to first read [the training method of PaddleClas models](../../../docs/zh_CN/tutorials/quick_start.md) and [PaddleSlim](https://paddleslim.readthedocs.io/zh_CN/latest/index.html) Before starting this tutorial, it is recommended to first read [the training method of PaddleClas models](../../docs/zh_CN/tutorials/getting_started.md) and [PaddleSlim](https://paddleslim.readthedocs.io/zh_CN/latest/index.html)
## Quick Start ## Quick Start
Quantization is mostly used for deploying lightweight models on mobile devices; after a model is trained, if you want to further compress the model size and speed up prediction, quantization can be used to compress the model. After a model is trained, if you want to further compress the model size and speed up prediction, quantization or pruning can be used to compress the model.
Model quantization mainly includes five steps: Model compression mainly includes five steps:
1. Install PaddleSlim 1. Install PaddleSlim
2. Prepare the trained model 2. Prepare the trained model
3. Quantization training 3. Model compression
4. Export the quantized inference model 4. Export the quantized inference model
5. Deploy the quantized model for prediction 5. Deploy the quantized model for prediction
...@@ -24,7 +28,7 @@ ...@@ -24,7 +28,7 @@
* It can be installed via pip install. * It can be installed via pip install.
```bash ```bash
pip3.7 install paddleslim==2.0.0 pip install paddleslim -i https://pypi.tuna.tsinghua.edu.cn/simple
``` ```
* To get the latest features of PaddleSlim, you can install it from source. * To get the latest features of PaddleSlim, you can install it from source.
...@@ -37,70 +41,104 @@ python3.7 setup.py install ...@@ -37,70 +41,104 @@ python3.7 setup.py install
### 2. Prepare the trained model ### 2. Prepare the trained model
PaddleClas provides a series of trained [models](../../../docs/zh_CN/models/models_intro.md); if the model to be quantized is not in the list, you need to obtain a trained model by following the [regular training](../../../docs/zh_CN/tutorials/getting_started.md) method. PaddleClas provides a series of trained [models](../../docs/zh_CN/models/models_intro.md); if the model to be quantized is not in the list, you need to obtain a trained model by following the [regular training](../../docs/zh_CN/tutorials/getting_started.md) method.
### 3. Model compression
Go to the PaddleClas root directory
```bash
cd PaddleClas
```
The `slim` training code has been integrated into `ppcls/engine/`; the offline quantization code is located in `deploy/slim/quant_post_static.py`
#### 3.1 Model quantization
### 3. Quantization training
Quantization training includes offline quantization training and online quantization training; online quantization training works better, requires loading a pre-trained model, and the model can be quantized once the quantization strategy is defined. Quantization training includes offline quantization training and online quantization training; online quantization training works better, requires loading a pre-trained model, and the model can be quantized once the quantization strategy is defined.
##### 3.1.1 Online quantization training
The quantization training code is located in `deploy/slim/quant/quant.py`, and the training command is as follows: The training command is as follows:
* CPU/single-node single-GPU launch * CPU/single GPU
Taking CPU as an example; to use a GPU, change `cpu` in the command to `gpu`
```bash ```bash
python3.7 deploy/slim/quant/quant.py \ python3.7 tools/train.py -c ppcls/configs/slim/ResNet50_vd_quantization.yaml -o Global.device=cpu
-c configs/MobileNetV3/MobileNetV3_large_x1_0.yaml \
-o pretrained_model="./MobileNetV3_large_x1_0_pretrained"
``` ```
* Single-node single-GPU/single-node multi-GPU/multi-node multi-GPU launch For the parsing of the `yaml` file, see the [reference document](../../docs/zh_CN/tutorials/config_description.md). To ensure accuracy, the `yaml` file already uses a `pretrained model`.
* Single-node multi-GPU/multi-node multi-GPU launch
```bash ```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 export CUDA_VISIBLE_DEVICES=0,1,2,3
python3.7 -m paddle.distributed.launch \ python3.7 -m paddle.distributed.launch \
--gpus="0,1,2,3,4,5,6,7" \ --gpus="0,1,2,3" \
deploy/slim/quant/quant.py \ tools/train.py \
-c configs/MobileNetV3/MobileNetV3_large_x1_0.yaml \ -c ppcls/configs/slim/ResNet50_vd_quantization.yaml
-o pretrained_model="./MobileNetV3_large_x1_0_pretrained"
``` ```
##### 3.1.2 Offline quantization
**Note**: At present, offline quantization must use an `inference model` exported from a trained model. For how to export an `inference model` from a trained model, see this [tutorial](../../docs/zh_CN/inference.md).
Generally speaking, offline quantization loses more model accuracy than online quantization training.
After the `inference model` is generated, offline quantization is run as follows
```bash
python3.7 deploy/slim/quant_post_static.py -c ppcls/configs/ImageNet/ResNet/ResNet50_vd.yaml -o Global.save_inference_dir=./deploy/models/class_ResNet50_vd_ImageNet_infer
```
`Global.save_inference_dir``inference model`存放的目录。
执行成功后,在`Global.save_inference_dir`的目录下,生成`quant_post_static_model`文件夹,其中存储生成的离线量化模型,其可以直接进行预测部署,无需再重新导出模型。
#### 3.2 模型剪枝
训练指令如下:
- CPU/单卡GPU
以CPU为例,若使用GPU,则将命令中改成`cpu`改成`gpu`
```bash
python3.7 tools/train.py -c ppcls/configs/slim/ResNet50_vd_prune.yaml -o Global.device=cpu
```
* Below is an example training script for quantizing the `MobileNetV3_large_x1_0` model. - Single-node single-GPU/single-node multi-GPU/multi-node multi-GPU launch
```bash ```bash
# Download the pre-trained model export CUDA_VISIBLE_DEVICES=0,1,2,3
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV3_large_x1_0_pretrained.pdparams
# Start training. If the batch size cannot be set large enough due to GPU memory limits, scale down the batch size and learning rate proportionally.
python3.7 -m paddle.distributed.launch \ python3.7 -m paddle.distributed.launch \
--gpus="0,1,2,3,4,5,6,7" \ --gpus="0,1,2,3" \
deploy/slim/quant/quant.py \ tools/train.py \
-c configs/MobileNetV3/MobileNetV3_large_x1_0.yaml \ -c ppcls/configs/slim/ResNet50_vd_prune.yaml
-o pretrained_model="./MobileNetV3_large_x1_0_pretrained"
-o LEARNING_RATE.params.lr=0.13 \
-o epochs=100
``` ```
### 4. Export the model ### 4. Export the model
After obtaining the model saved by quantization training, it can be exported as an inference model for prediction deployment: After obtaining the model saved by online quantization training or model pruning, it can be exported as an inference model for prediction deployment. Taking model pruning as an example:
```bash ```bash
python3.7 deploy/slim/quant/export_model.py \ python3.7 tools/export.py \
-m MobileNetV3_large_x1_0 \ -c ppcls/configs/slim/ResNet50_vd_prune.yaml \
-p output/MobileNetV3_large_x1_0/best_model/ppcls \ -o Global.pretrained_model=./output/ResNet50_vd/best_model \
-o ./MobileNetV3_large_x1_0_infer/ \ -o Global.save_inference_dir=./inference
--img_size=224 \
--class_dim=1000
``` ```
### 5. Deploying the quantized model ### 5. Model deployment
The parameters of the quantized model exported in the steps above are still FP32, but their numerical range is int8; the exported model can be converted with PaddleLite's opt model conversion tool. The model exported in the steps above can be converted with PaddleLite's opt model conversion tool.
For quantized model deployment, refer to [Mobile model deployment](../../lite/readme.md) For model deployment, refer to [Mobile model deployment](../lite/readme.md)
## Suggested hyperparameters for quantization training ## Suggested hyperparameters for training
* For quantization training, it is recommended to load a pre-trained model obtained from regular training, to speed up the convergence of quantization training. * For quantization training, it is recommended to load a pre-trained model obtained from regular training, to speed up the convergence of quantization training.
* For quantization training, it is recommended to set the initial learning rate to `1/20~1/10` of that used in regular training and the number of epochs to `1/5~1/2` of regular training; for the learning rate schedule, add Warmup, and other configurations are best left unchanged. * For quantization training, it is recommended to set the initial learning rate to `1/20~1/10` of that used in regular training and the number of epochs to `1/5~1/2` of regular training; for the learning rate schedule, add Warmup, and other configurations are best left unchanged.
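As a concrete illustration of these two rules of thumb: a recipe that normally trains with an initial learning rate of 0.1 for 120 epochs would start quantization training at roughly lr 0.005~0.01 (1/20~1/10) for about 24~60 epochs (1/5~1/2), with a Warmup stage prepended. The numbers here are illustrative only.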
## Introduction ## Introduction to Slim
Generally, a more complex model would achieve better performance in the task, but it also leads to some redundancy in the model. Generally, a more complex model would achieve better performance in the task, but it also leads to some redundancy in the model. This part provides the model compression function, including two parts: model quantization (offline quantization and online quantization training) and model pruning.
Quantization is a technique that reduces this redundancy by reducing full-precision data to fixed-point numbers, Quantization is a technique that reduces this redundancy by reducing full-precision data to fixed-point numbers, so as to reduce model calculation complexity and improve model inference performance.
so as to reduce model calculation complexity and improve model inference performance.
This example uses the [quantization APIs](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/) provided by PaddleSlim to compress PaddleClas models. Model pruning cuts off unimportant convolution kernels in the CNN to reduce the number of model parameters, so as to reduce the computational complexity of the model.
It is recommended that you read the following pages before reading this example: It is recommended that you read the following pages before reading this example:
- [The training strategy of PaddleClas models](../../../docs/en/tutorials/quick_start_en.md) - [The training strategy of PaddleClas models](../../docs/en/tutorials/getting_started_en.md)
- [PaddleSlim Document](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/) - [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim)
## Quick Start ## Quick Start
Quantization is mostly suitable for the deployment of lightweight models on mobile terminals. After training a model, if you want to further compress the model size and speed up the prediction, you can use quantization or pruning to compress the model according to the following steps.
After training, if you want to further compress the model size and accelerate the prediction, you can use quantization methods to compress the model according to the following steps.
1. Install PaddleSlim 1. Install PaddleSlim
2. Prepare trained model 2. Prepare trained model
3. Quantization-Aware Training 3. Model compression
4. Export inference model 4. Export inference model
5. Deploy quantization inference model 5. Deploy quantization inference model
...@@ -27,7 +25,7 @@ After training, if you want to further compress the model size and accelerate th ...@@ -27,7 +25,7 @@ After training, if you want to further compress the model size and accelerate th
* Install by pip. * Install by pip.
```bash ```bash
pip3.7 install paddleslim==2.0.0 pip install paddleslim -i https://pypi.tuna.tsinghua.edu.cn/simple
``` ```
* Install from source code to get the latest features. * Install from source code to get the latest features.
...@@ -40,71 +38,105 @@ python setup.py install ...@@ -40,71 +38,105 @@ python setup.py install
### 2. Download Pretrain Model ### 2. Download Pretrain Model
PaddleClas provides a series of trained [models](../../../docs/en/models/models_intro_en.md). PaddleClas provides a series of trained [models](../../docs/en/models/models_intro_en.md).
If the model to be quantized is not in the list, you need to follow the [Regular Training](../../../docs/en/tutorials/getting_started_en.md) method to get the trained model. If the model to be quantized is not in the list, you need to follow the [Regular Training](../../docs/en/tutorials/getting_started_en.md) method to get the trained model.
### 3. Model Compression
Go to the root directory of PaddleClas
```bash
cd PaddleClas
```
The training-related code has been integrated into `ppcls/engine/`. The offline quantization code is located in `deploy/slim/quant_post_static.py`
#### 3.1 Model Quantization
Quantization training includes offline quantization and online quantization training.
##### 3.1.1 Online quantization training
### 3. Quant-Aware Training
Quantization training includes offline quantization training and online quantization training.
Online quantization training is more effective. It is necessary to load the pre-trained model. Online quantization training is more effective. It is necessary to load the pre-trained model.
After the quantization strategy is defined, the model can be quantized. After the quantization strategy is defined, the model can be quantized.
The code for quantization training is located in `deploy/slim/quant/quant.py`. The training command is as follows: The training command is as follows:
* CPU/Single GPU
* CPU/Single GPU training If using GPU, change `cpu` to `gpu` in the following command.
```bash ```bash
python3.7 deploy/slim/quant/quant.py \ python3.7 tools/train.py -c ppcls/configs/slim/ResNet50_vd_quantization.yaml -o Global.device=cpu
-c configs/MobileNetV3/MobileNetV3_large_x1_0.yaml \
-o pretrained_model="./MobileNetV3_large_x1_0_pretrained"
``` ```
The description of the `yaml` file can be found in this [doc](../../docs/en/tutorials/config_en.md). To get better accuracy, the `pretrained model` is used in the `yaml` file.
* Distributed training * Distributed training
```bash ```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 export CUDA_VISIBLE_DEVICES=0,1,2,3
python3.7 -m paddle.distributed.launch \ python3.7 -m paddle.distributed.launch \
--gpus="0,1,2,3,4,5,6,7" \ --gpus="0,1,2,3" \
deploy/slim/quant/quant.py \ tools/train.py \
-c configs/MobileNetV3/MobileNetV3_large_x1_0.yaml \ -m train \
-o pretrained_model="./MobileNetV3_large_x1_0_pretrained" -c ppcls/configs/slim/ResNet50_vd_quantization.yaml
```
##### 3.1.2 Offline quantization
**Attention**: At present, offline quantization must use an `inference model` exported from a trained model as input. The process of exporting an `inference model` from a trained model is described in this [doc](../../docs/en/inference.md).
Generally speaking, offline quantization loses more accuracy than online quantization training.
After getting the `inference model`, we can run the following command to get the offline-quantized model.
``` ```
python3.7 deploy/slim/quant_post_static.py -c ppcls/configs/ImageNet/ResNet/ResNet50_vd.yaml -o Global.save_inference_dir=./deploy/models/class_ResNet50_vd_ImageNet_infer
```
`Global.save_inference_dir` is the directory storing the `inference model`.
If run successfully, the directory `quant_post_static_model` is generated in `Global.save_inference_dir`, which stores the offline-quantized model that can be used for deployment directly.
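Under the hood, `deploy/slim/quant_post_static.py` (included later in this changeset) wraps PaddleSlim's post-training quantization API. A trimmed sketch of the core call follows; the paths and the random calibration generator are placeholders:

```python
import numpy as np
import paddle
import paddleslim

def calibration_reader():
    # Placeholder calibration data: a few random NCHW batches.
    def reader():
        for _ in range(10):
            yield np.random.rand(1, 3, 224, 224).astype("float32")
    return reader

paddle.enable_static()
exe = paddle.static.Executor(paddle.CPUPlace())

paddleslim.quant.quant_post_static(
    executor=exe,
    model_dir="./inference",              # holds inference.pdmodel / inference.pdiparams
    model_filename="inference.pdmodel",
    params_filename="inference.pdiparams",
    quantize_model_path="./inference/quant_post_static_model",
    sample_generator=calibration_reader(),
    batch_nums=10)
```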
* The command for quantizing the `MobileNetV3_large_x1_0` model is as follows: #### 3.2 Model Pruning
- CPU/Single GPU
If using GPU, change the `cpu` to `gpu` in the following command.
```bash ```bash
# download pre-trained model python3.7 tools/train.py -c ppcls/configs/slim/ResNet50_vd_prune.yaml -o Global.device=cpu
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV3_large_x1_0_pretrained.pdparams ```
# run training - Distributed training
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3.7 -m paddle.distributed.launch \ python3.7 -m paddle.distributed.launch \
--gpus="0,1,2,3,4,5,6,7" \ --gpus="0,1,2,3" \
deploy/slim/quant/quant.py \ tools/train.py \
-c configs/MobileNetV3/MobileNetV3_large_x1_0.yaml \ -c ppcls/configs/slim/ResNet50_vd_prune.yaml
-o pretrained_model="./MobileNetV3_large_x1_0_pretrained"
-o LEARNING_RATE.params.lr=0.13 \
-o epochs=100
``` ```
### 4. Export inference model ### 4. Export inference model
After the quantization-aware training finishes, we can export the model as an inference model for predictive deployment: After getting the compressed model, we can export it as an inference model for predictive deployment. Taking the pruned model as an example:
```bash ```bash
python3.7 deploy/slim/quant/export_model.py \ python3.7 tools/export.py \
-m MobileNetV3_large_x1_0 \ -c ppcls/configs/slim/ResNet50_vd_prune.yaml \
-p output/MobileNetV3_large_x1_0/best_model/ppcls \ -o Global.pretrained_model=./output/ResNet50_vd/best_model \
-o ./MobileNetV3_large_x1_0_infer/ \ -o Global.save_inference_dir=./inference
--img_size=224 \
--class_dim=1000
``` ```
### 5. Deploy ### 5. Deploy
The type of quantized model's parameters derived from the above steps is still FP32, but the numerical range of the parameters is int8.
The derived model can be converted through the `opt tool` of PaddleLite. The derived model can be converted through the `opt tool` of PaddleLite.
For quantized model deployment, please refer to [Mobile terminal model deployment](../../lite/readme_en.md) For compressed model deployment, please refer to [Mobile terminal model deployment](../lite/readme_en.md)
## Notes: ## Notes:
......
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import argparse
import os
import sys
__dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(__dir__)
sys.path.append(os.path.abspath(os.path.join(__dir__, '..', '..', '..')))
sys.path.append(
os.path.abspath(os.path.join(__dir__, '..', '..', '..', 'tools')))
from ppcls.arch import backbone
from ppcls.utils.save_load import load_dygraph_pretrain
import paddle
import paddle.nn.functional as F
from paddle.jit import to_static
from paddleslim.dygraph.quant import QAT
from pact_helper import get_default_quant_config
def parse_args():
def str2bool(v):
return v.lower() in ("true", "t", "1")
parser = argparse.ArgumentParser()
parser.add_argument("-m", "--model", type=str)
parser.add_argument("-p", "--pretrained_model", type=str)
parser.add_argument("-o", "--output_path", type=str, default="./inference")
parser.add_argument("--class_dim", type=int, default=1000)
parser.add_argument("--load_static_weights", type=str2bool, default=False)
parser.add_argument("--img_size", type=int, default=224)
return parser.parse_args()
class Net(paddle.nn.Layer):
def __init__(self, net, class_dim, model=None):
super(Net, self).__init__()
self.pre_net = net(class_dim=class_dim)
self.model = model
def forward(self, inputs):
x = self.pre_net(inputs)
if self.model == "GoogLeNet":
x = x[0]
x = F.softmax(x)
return x
def main():
args = parse_args()
net = backbone.__dict__[args.model]
model = Net(net, args.class_dim, args.model)
# get QAT model
quant_config = get_default_quant_config()
# TODO(littletomatodonkey): add PACT for export model
# quant_config["activation_preprocess_type"] = "PACT"
quanter = QAT(config=quant_config)
quanter.quantize(model)
load_dygraph_pretrain(
model.pre_net,
path=args.pretrained_model,
load_static_weights=args.load_static_weights)
model.eval()
save_path = os.path.join(args.output_path, "inference")
quanter.save_quantized_model(
model,
save_path,
input_spec=[
paddle.static.InputSpec(
shape=[None, 3, args.img_size, args.img_size], dtype='float32')
])
print('inference QAT model is saved to {}'.format(save_path))
if __name__ == "__main__":
main()
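For reference, a typical invocation of this exporter, reproducing the command that the slim README diff above removes from the docs (paths are examples):

```bash
python3.7 deploy/slim/quant/export_model.py \
    -m MobileNetV3_large_x1_0 \
    -p output/MobileNetV3_large_x1_0/best_model/ppcls \
    -o ./MobileNetV3_large_x1_0_infer/ \
    --img_size=224 \
    --class_dim=1000
```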
# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import paddle
def get_default_quant_config():
quant_config = {
# weight preprocess type, default is None and no preprocessing is performed.
'weight_preprocess_type': None,
# activation preprocess type, default is None and no preprocessing is performed.
'activation_preprocess_type': None,
# weight quantize type, default is 'channel_wise_abs_max'
'weight_quantize_type': 'channel_wise_abs_max',
# activation quantize type, default is 'moving_average_abs_max'
'activation_quantize_type': 'moving_average_abs_max',
# weight quantize bit num, default is 8
'weight_bits': 8,
# activation quantize bit num, default is 8
'activation_bits': 8,
# data type after quantization, such as 'uint8', 'int8', etc. default is 'int8'
'dtype': 'int8',
# window size for 'range_abs_max' quantization. default is 10000
'window_size': 10000,
# The decay coefficient of moving average, default is 0.9
'moving_rate': 0.9,
# for dygraph quantization, layers of type in quantizable_layer_type will be quantized
'quantizable_layer_type': ['Conv2D', 'Linear'],
}
return quant_config
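As `quant.py` and `export_model.py` in this changeset do, the returned dict is passed directly to PaddleSlim's dygraph `QAT` wrapper. A minimal sketch; the model here is an arbitrary placeholder:

```python
import paddle
from paddleslim.dygraph.quant import QAT

model = paddle.vision.models.resnet18()  # placeholder paddle.nn.Layer

quant_config = get_default_quant_config()
quant_config["activation_preprocess_type"] = "PACT"  # optional, as in quant.py
quanter = QAT(config=quant_config)
quanter.quantize(model)  # wraps Conv2D/Linear layers with fake-quant ops in place
```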
# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import argparse
import os
import sys
__dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(__dir__)
sys.path.append(os.path.abspath(os.path.join(__dir__, '..', '..', '..')))
sys.path.append(
os.path.abspath(os.path.join(__dir__, '..', '..', '..', 'tools')))
import paddle
from paddleslim.dygraph.quant import QAT
from ppcls.data import Reader
from ppcls.utils.config import get_config
from ppcls.utils.save_load import init_model, save_model
from ppcls.utils import logger
import program
from pact_helper import get_default_quant_config
def parse_args():
parser = argparse.ArgumentParser("PaddleClas train script")
parser.add_argument(
'-c',
'--config',
type=str,
default='configs/ResNet/ResNet50.yaml',
help='config file path')
parser.add_argument(
'-o',
'--override',
action='append',
default=[],
help='config options to be overridden')
args = parser.parse_args()
return args
def main(args):
paddle.seed(12345)
config = get_config(args.config, overrides=args.override, show=True)
# assign the place
use_gpu = config.get("use_gpu", True)
place = paddle.set_device('gpu' if use_gpu else 'cpu')
trainer_num = paddle.distributed.get_world_size()
use_data_parallel = trainer_num != 1
config["use_data_parallel"] = use_data_parallel
if config["use_data_parallel"]:
paddle.distributed.init_parallel_env()
net = program.create_model(config.ARCHITECTURE, config.classes_num)
# prepare to quant
quant_config = get_default_quant_config()
quant_config["activation_preprocess_type"] = "PACT"
quanter = QAT(config=quant_config)
quanter.quantize(net)
optimizer, lr_scheduler = program.create_optimizer(
config, parameter_list=net.parameters())
init_model(config, net, optimizer)
if config["use_data_parallel"]:
net = paddle.DataParallel(net)
train_dataloader = Reader(config, 'train', places=place)()
if config.validate:
valid_dataloader = Reader(config, 'valid', places=place)()
last_epoch_id = config.get("last_epoch", -1)
best_top1_acc = 0.0 # best top1 acc record
best_top1_epoch = last_epoch_id
for epoch_id in range(last_epoch_id + 1, config.epochs):
net.train()
# 1. train with train dataset
program.run(train_dataloader, config, net, optimizer, lr_scheduler,
epoch_id, 'train')
# 2. validate with validate dataset
if config.validate and epoch_id % config.valid_interval == 0:
net.eval()
with paddle.no_grad():
top1_acc = program.run(valid_dataloader, config, net, None,
None, epoch_id, 'valid')
if top1_acc > best_top1_acc:
best_top1_acc = top1_acc
best_top1_epoch = epoch_id
model_path = os.path.join(config.model_save_dir,
config.ARCHITECTURE["name"])
save_model(net, optimizer, model_path, "best_model")
message = "The best top1 acc {:.5f}, in epoch: {:d}".format(
best_top1_acc, best_top1_epoch)
logger.info(message)
# 3. save the persistable model
if epoch_id % config.save_interval == 0:
model_path = os.path.join(config.model_save_dir,
config.ARCHITECTURE["name"])
save_model(net, optimizer, model_path, epoch_id)
if __name__ == '__main__':
args = parse_args()
main(args)
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import, division, print_function
import os
import sys
import numpy as np
import paddle
import paddleslim
from paddle.jit import to_static
from paddleslim.analysis import dygraph_flops as flops
__dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.abspath(os.path.join(__dir__, '../../')))
from paddleslim.dygraph.quant import QAT
from ppcls.data import build_dataloader
from ppcls.utils import config as conf
from ppcls.utils.logger import init_logger
def main():
args = conf.parse_args()
config = conf.get_config(args.config, overrides=args.override, show=False)
assert os.path.exists(
os.path.join(config["Global"]["save_inference_dir"],
'inference.pdmodel')) and os.path.exists(
os.path.join(config["Global"]["save_inference_dir"],
'inference.pdiparams'))
config["DataLoader"]["Train"]["sampler"]["batch_size"] = 1
config["DataLoader"]["Train"]["loader"]["num_workers"] = 0
init_logger()
device = paddle.set_device("cpu")
train_dataloader = build_dataloader(config["DataLoader"], "Train", device,
False)
def sample_generator(loader):
def __reader__():
for indx, data in enumerate(loader):
images = np.array(data[0])
yield images
return __reader__
paddle.enable_static()
place = paddle.CPUPlace()
exe = paddle.static.Executor(place)
paddleslim.quant.quant_post_static(
executor=exe,
model_dir=config["Global"]["save_inference_dir"],
model_filename='inference.pdmodel',
params_filename='inference.pdiparams',
quantize_model_path=os.path.join(
config["Global"]["save_inference_dir"], "quant_post_static_model"),
sample_generator=sample_generator(train_dataloader),
batch_nums=10)
if __name__ == "__main__":
main()
...@@ -24,13 +24,13 @@ Accuracy and inference time of the pretrained models based on SSLD distillation a ...@@ -24,13 +24,13 @@ Accuracy and inference time of the pretrained models based on SSLD distillation a
* Server-side distillation pretrained models * Server-side distillation pretrained models
| Model | Top-1 Acc | Reference<br>Top-1 Acc | Acc gain | time(ms)<br>bs=1 | time(ms)<br>bs=4 | Flops(G) | Params(M) | Download Address | | Model | Top-1 Acc | Reference<br>Top-1 Acc | Acc gain | time(ms)<br>bs=1 | time(ms)<br>bs=4 | Flops(G) | Params(M) | Download Address |
|---------------------|-----------|-----------|---------------|----------------|-----------|----------|-----------|-----------------------------------| |---------------------|-----------|-----------|---------------|----------------|-----------|----------|-----------|-----------------------------------|
| ResNet34_vd_ssld | 0.797 | 0.760 | 0.037 | 2.434 | 6.222 | 7.39 | 21.82 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet34_vd_ssld_pretrained.pdparams) | | ResNet34_vd_ssld | 0.797 | 0.760 | 0.037 | 2.434 | 6.222 | 7.39 | 21.82 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet34_vd_ssld_pretrained.pdparams) |
| ResNet50_vd_<br>ssld | 0.830 | 0.792 | 0.039 | 3.531 | 8.090 | 8.67 | 25.58 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet50_vd_ssld_pretrained.pdparams) | | ResNet50_vd_ssld | 0.830 | 0.792 | 0.039 | 3.531 | 8.090 | 8.67 | 25.58 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet50_vd_ssld_pretrained.pdparams) |
| ResNet101_vd_<br>ssld | 0.837 | 0.802 | 0.035 | 6.117 | 13.762 | 16.1 | 44.57 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet101_vd_ssld_pretrained.pdparams) | | ResNet101_vd_ssld | 0.837 | 0.802 | 0.035 | 6.117 | 13.762 | 16.1 | 44.57 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet101_vd_ssld_pretrained.pdparams) |
| Res2Net50_vd_<br>26w_4s_ssld | 0.831 | 0.798 | 0.033 | 4.527 | 9.657 | 8.37 | 25.06 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net50_vd_26w_4s_ssld_pretrained.pdparams) | | Res2Net50_vd_26w_4s_ssld | 0.831 | 0.798 | 0.033 | 4.527 | 9.657 | 8.37 | 25.06 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net50_vd_26w_4s_ssld_pretrained.pdparams) |
| Res2Net101_vd_<br>26w_4s_ssld | 0.839 | 0.806 | 0.033 | 8.087 | 17.312 | 16.67 | 45.22 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net101_vd_26w_4s_ssld_pretrained.pdparams) | | Res2Net101_vd_26w_4s_ssld | 0.839 | 0.806 | 0.033 | 8.087 | 17.312 | 16.67 | 45.22 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net101_vd_26w_4s_ssld_pretrained.pdparams) |
| Res2Net200_vd_<br>26w_4s_ssld | 0.851 | 0.812 | 0.049 | 14.678 | 32.350 | 31.49 | 76.21 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net200_vd_26w_4s_ssld_pretrained.pdparams) | | Res2Net200_vd_26w_4s_ssld | 0.851 | 0.812 | 0.049 | 14.678 | 32.350 | 31.49 | 76.21 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net200_vd_26w_4s_ssld_pretrained.pdparams) |
| HRNet_W18_C_ssld | 0.812 | 0.769 | 0.043 | 7.406 | 13.297 | 4.14 | 21.29 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W18_C_ssld_pretrained.pdparams) | | HRNet_W18_C_ssld | 0.812 | 0.769 | 0.043 | 7.406 | 13.297 | 4.14 | 21.29 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W18_C_ssld_pretrained.pdparams) |
| HRNet_W48_C_ssld | 0.836 | 0.790 | 0.046 | 13.707 | 34.435 | 34.58 | 77.47 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W48_C_ssld_pretrained.pdparams) | | HRNet_W48_C_ssld | 0.836 | 0.790 | 0.046 | 13.707 | 34.435 | 34.58 | 77.47 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W48_C_ssld_pretrained.pdparams) |
| SE_HRNet_W64_C_ssld | 0.848 | - | - | 31.697 | 94.995 | 57.83 | 128.97 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/SE_HRNet_W64_C_ssld_pretrained.pdparams) | | SE_HRNet_W64_C_ssld | 0.848 | - | - | 31.697 | 94.995 | 57.83 | 128.97 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/SE_HRNet_W64_C_ssld_pretrained.pdparams) |
...@@ -38,19 +38,44 @@ Accuracy and inference time of the prtrained models based on SSLD distillation a ...@@ -38,19 +38,44 @@ Accuracy and inference time of the prtrained models based on SSLD distillation a
* Mobile-side distillation pretrained models * Mobile-side distillation pretrained models
| Model | Top-1 Acc | Reference<br>Top-1 Acc | Acc gain | SD855 time(ms)<br>bs=1 | Flops(G) | Params(M) | 模型大小(M) | Download Address | | Model | Top-1 Acc | Reference<br>Top-1 Acc | Acc gain | SD855 time(ms)<br>bs=1 | Flops(G) | Params(M) | Storage Size(M) | Download Address |
|---------------------|-----------|-----------|---------------|----------------|-----------|----------|-----------|-----------------------------------|
| MobileNetV1_ssld | 0.779 | 0.710 | 0.069 | 32.523 | 1.11 | 4.19 | 16 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV1_ssld_pretrained.pdparams) |
| MobileNetV2_ssld | 0.767 | 0.722 | 0.045 | 23.318 | 0.6 | 3.44 | 14 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV2_ssld_pretrained.pdparams) |
| MobileNetV3_small_x0_35_ssld | 0.556 | 0.530 | 0.026 | 2.635 | 0.026 | 1.66 | 6.9 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_small_x0_35_ssld_pretrained.pdparams) |
| MobileNetV3_large_x1_0_ssld | 0.790 | 0.753 | 0.036 | 19.308 | 0.45 | 5.47 | 21 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_large_x1_0_ssld_pretrained.pdparams) |
| MobileNetV3_small_x1_0_ssld | 0.713 | 0.682 | 0.031 | 6.546 | 0.123 | 2.94 | 12 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_small_x1_0_ssld_pretrained.pdparams) |
| GhostNet_x1_3_ssld | 0.794 | 0.757 | 0.037 | 19.983 | 0.44 | 7.3 | 29 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/GhostNet_x1_3_ssld_pretrained.pdparams)
* Intel-CPU-side distillation pretrained models
| Model | Top-1 Acc | Reference<br>Top-1 Acc | Acc gain | Intel-Xeon-Gold-6148 time(ms)<br>bs=1 | Flops(M) | Params(M) | Download Address |
|---------------------|-----------|-----------|---------------|----------------|-----------|----------|-----------|-----------------------------------| |-------------------|-----------|-----------|----------|----------------|----------|-----------|------------------|
| MobileNetV1_<br>ssld | 0.779 | 0.710 | 0.069 | 32.523 | 1.11 | 4.19 | 16 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV1_ssld_pretrained.pdparams) | | PPLCNet_x0_5_ssld | 0.661 | 0.631 | 0.030 | 2.05 | 47 | 1.9 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_5_ssld_pretrained.pdparams) |
| MobileNetV2_<br>ssld | 0.767 | 0.722 | 0.045 | 23.318 | 0.6 | 3.44 | 14 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV2_ssld_pretrained.pdparams) | | PPLCNet_x1_0_ssld | 0.744 | 0.713 | 0.033 | 2.46 | 161 | 3.0 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_0_ssld_pretrained.pdparams) |
| MobileNetV3_<br>small_x0_35_ssld | 0.556 | 0.530 | 0.026 | 2.635 | 0.026 | 1.66 | 6.9 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_small_x0_35_ssld_pretrained.pdparams) | | PPLCNet_x2_5_ssld | 0.808 | 0.766 | 0.042 | 5.39 | 906 | 9.0 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_5_ssld_pretrained.pdparams) |
| MobileNetV3_<br>large_x1_0_ssld | 0.790 | 0.753 | 0.036 | 19.308 | 0.45 | 5.47 | 21 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_large_x1_0_ssld_pretrained.pdparams) |
| MobileNetV3_small_<br>x1_0_ssld | 0.713 | 0.682 | 0.031 | 6.546 | 0.123 | 2.94 | 12 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_small_x1_0_ssld_pretrained.pdparams) |
| GhostNet_<br>x1_3_ssld | 0.794 | 0.757 | 0.037 | 19.983 | 0.44 | 7.3 | 29 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/GhostNet_x1_3_ssld_pretrained.pdparams)
* Note: `Reference Top-1 Acc` means accuracy of pretrained models which are trained on ImageNet1k dataset. * Note: `Reference Top-1 Acc` means accuracy of pretrained models which are trained on ImageNet1k dataset.
<a name="PPLCNet_series"></a>
### PPLCNet_series
Accuracy and inference time metrics of the PPLCNet series models are shown as follows. More detailed information can be found in the [PPLCNet series tutorial](../en/models/PPLCNet_en.md).
| Model | Top-1 Acc | Top-5 Acc | Intel-Xeon-Gold-6148 time(ms)<br>bs=1 | FLOPs(M) | Params(M) | Download Address |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| PPLCNet_x0_25 |0.5186 | 0.7565 | 1.74 | 18 | 1.5 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_25_pretrained.pdparams) |
| PPLCNet_x0_35 |0.5809 | 0.8083 | 1.92 | 29 | 1.6 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_35_pretrained.pdparams) |
| PPLCNet_x0_5 |0.6314 | 0.8466 | 2.05 | 47 | 1.9 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_5_pretrained.pdparams) |
| PPLCNet_x0_75 |0.6818 | 0.8830 | 2.29 | 99 | 2.4 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams) |
| PPLCNet_x1_0 |0.7132 | 0.9003 | 2.46 | 161 | 3.0 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_0_pretrained.pdparams) |
| PPLCNet_x1_5 |0.7371 | 0.9153 | 3.19 | 342 | 4.5 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_5_pretrained.pdparams) |
| PPLCNet_x2_0 |0.7518 | 0.9227 | 4.27 | 590 | 6.5 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_0_pretrained.pdparams) |
| PPLCNet_x2_5 |0.7660 | 0.9300 | 5.39 | 906 | 9.0 | [Download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_5_pretrained.pdparams) |
<a name="ResNet_and_Vd_series"></a> <a name="ResNet_and_Vd_series"></a>
### ResNet and Vd series ### ResNet and Vd series
......
# PPLCNet series
## Overview
The PPLCNet series is a family of networks proposed by the Baidu PaddleCV team with excellent performance on Intel CPUs. The authors summarize several methods that improve model accuracy on Intel CPUs while barely increasing inference time, and combine them into a new network, namely PPLCNet. Compared with other lightweight networks, PPLCNet can achieve higher accuracy with the same inference time. PPLCNet has shown strong competitiveness in image classification, object detection, and semantic segmentation.
## Accuracy, FLOPs and Parameters
| Models | Top1 | Top5 | FLOPs<br>(M) | Parameters<br>(M) |
|:--:|:--:|:--:|:--:|:--:|
| PPLCNet_x0_25 |0.5186 | 0.7565 | 18 | 1.5 |
| PPLCNet_x0_35 |0.5809 | 0.8083 | 29 | 1.6 |
| PPLCNet_x0_5 |0.6314 | 0.8466 | 47 | 1.9 |
| PPLCNet_x0_75 |0.6818 | 0.8830 | 99 | 2.4 |
| PPLCNet_x1_0 |0.7132 | 0.9003 | 161 | 3.0 |
| PPLCNet_x1_5 |0.7371 | 0.9153 | 342 | 4.5 |
| PPLCNet_x2_0 |0.7518 | 0.9227 | 590 | 6.5 |
| PPLCNet_x2_5 |0.7660 | 0.9300 | 906 | 9.0 |
| PPLCNet_x0_5_ssld |0.6610 | 0.8646 | 47 | 1.9 |
| PPLCNet_x1_0_ssld |0.7439 | 0.9209 | 161 | 3.0 |
| PPLCNet_x2_5_ssld |0.8082 | 0.9533 | 906 | 9.0 |
## Inference speed based on Intel(R)-Xeon(R)-Gold-6148-CPU
| Models | Crop Size | Resize Short Size | FP32<br>Batch Size=1<br>(ms) |
|------------------|-----------|-------------------|--------------------------|
| PPLCNet_x0_25 | 224 | 256 | 1.74 |
| PPLCNet_x0_35 | 224 | 256 | 1.92 |
| PPLCNet_x0_5 | 224 | 256 | 2.05 |
| PPLCNet_x0_75 | 224 | 256 | 2.29 |
| PPLCNet_x1_0 | 224 | 256 | 2.46 |
| PPLCNet_x1_5 | 224 | 256 | 3.19 |
| PPLCNet_x2_0 | 224 | 256 | 4.27 |
| PPLCNet_x2_5 | 224 | 256 | 5.39 |
| PPLCNet_x0_5_ssld | 224 | 256 | 2.05 |
| PPLCNet_x1_0_ssld | 224 | 256 | 2.46 |
| PPLCNet_x2_5_ssld | 224 | 256 | 5.39 |
...@@ -90,13 +90,13 @@ wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/infere ...@@ -90,13 +90,13 @@ wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/infere
cd .. cd ..
# Download the demo data and unzip it # Download the demo data and unzip it
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_en_v1.0.tar && tar -xf recognition_demo_data_en_v1.0.tar wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_en_v1.1.tar && tar -xf recognition_demo_data_en_v1.1.tar
``` ```
Once unpacked, the `recognition_demo_data_v1.0` folder should have the following file structure. Once unpacked, the `recognition_demo_data_v1.1` folder should have the following file structure.
``` ```
├── recognition_demo_data_v1.0 ├── recognition_demo_data_v1.1
│ ├── gallery_cartoon │ ├── gallery_cartoon
│ ├── gallery_logo │ ├── gallery_logo
│ ├── gallery_product │ ├── gallery_product
...@@ -126,13 +126,21 @@ The `models` folder should have the following file structure. ...@@ -126,13 +126,21 @@ The `models` folder should have the following file structure.
<a name="Product_recognition_and_retrival"></a> <a name="Product_recognition_and_retrival"></a>
### 2.2 Product Recognition and Retrieval ### 2.2 Product Recognition and Retrieval
Take the product recognition demo as an example to show the recognition and retrieval process (if you wish to try other scenarios of recognition and retrieval, replace the corresponding configuration file after downloading and unzipping the corresponding demo data and model to complete the prediction) Take the product recognition demo as an example to show the recognition and retrieval process (if you wish to try other scenarios of recognition and retrieval, replace the corresponding configuration file after downloading and unzipping the corresponding demo data and model to complete the prediction).
**Note:** `faiss` is used as the search library. The installation method is as follows:
```
pip install faiss-cpu==1.7.1post2
```
If an error occurs when running `import faiss`, please uninstall `faiss` and reinstall it, especially on `Windows`.
<a name="recognition_of_single_image"></a> <a name="recognition_of_single_image"></a>
#### 2.2.1 Single Image Recognition #### 2.2.1 Single Image Recognition
Run the following command to perform recognition and retrieval on the image `./recognition_demo_data_v1.0/test_product/daoxiangcunjinzhubing_6.jpg` Run the following command to perform recognition and retrieval on the image `./recognition_demo_data_v1.1/test_product/daoxiangcunjinzhubing_6.jpg`
```shell ```shell
# use the following command to predict using GPU. # use the following command to predict using GPU.
...@@ -141,8 +149,6 @@ python3.7 python/predict_system.py -c configs/inference_product.yaml ...@@ -141,8 +149,6 @@ python3.7 python/predict_system.py -c configs/inference_product.yaml
python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.use_gpu=False python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.use_gpu=False
``` ```
**Note:** The program library used to build the index is compiled on our machine; if an error occurs because of the environment, you can refer to the [vector search tutorial](../../../deploy/vector_search/README.md) to rebuild the library.
The image to be retrieved is shown below. The image to be retrieved is shown below.
...@@ -175,7 +181,7 @@ If you want to predict the images in the folder, you can directly modify the `Gl ...@@ -175,7 +181,7 @@ If you want to predict the images in the folder, you can directly modify the `Gl
```shell ```shell
# using the following command to predict using GPU, you can append `-o Global.use_gpu=False` to predict using CPU. # using the following command to predict using GPU, you can append `-o Global.use_gpu=False` to predict using CPU.
python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.infer_imgs="./recognition_demo_data_v1.0/test_product/" python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.infer_imgs="./recognition_demo_data_v1.1/test_product/"
``` ```
...@@ -195,16 +201,16 @@ The results on the screen are shown as follows. ...@@ -195,16 +201,16 @@ The results on the screen are shown as follows.
All the visualization results are also saved in folder `output`. All the visualization results are also saved in folder `output`.
Furthermore, the recognition inference model path can be changed by modifying the `Global.rec_inference_model_dir` field, and the path of the index database can be changed by modifying the `IndexProcess.index_path` field. Furthermore, the recognition inference model path can be changed by modifying the `Global.rec_inference_model_dir` field, and the path of the index database can be changed by modifying the `IndexProcess.index_dir` field.
<a name="unkonw_category_image_recognition_experience"></a> <a name="unkonw_category_image_recognition_experience"></a>
## 3. Recognize Images of Unknown Category ## 3. Recognize Images of Unknown Category
To recognize the image `./recognition_demo_data_v1.0/test_product/anmuxi.jpg`, run the command as follows: To recognize the image `./recognition_demo_data_v1.1/test_product/anmuxi.jpg`, run the command as follows:
```shell ```shell
python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.infer_imgs="./recognition_demo_data_v1.0/test_product/anmuxi.jpg" python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.infer_imgs="./recognition_demo_data_v1.1/test_product/anmuxi.jpg"
``` ```
The image to be retrieved is shown below. The image to be retrieved is shown below.
...@@ -225,14 +231,14 @@ When the index database cannot cover the scenes we actually recognise, i.e. when ...@@ -225,14 +231,14 @@ When the index database cannot cover the scenes we actually recognise, i.e. when
First, copy the images that are similar to the query image into the original image folder of the index database. The command is as follows. First, copy the images that are similar to the query image into the original image folder of the index database. The command is as follows.
```shell ```shell
cp -r ../docs/images/recognition/product_demo/gallery/anmuxi ./recognition_demo_data_v1.0/gallery_product/gallery/ cp -r ../docs/images/recognition/product_demo/gallery/anmuxi ./recognition_demo_data_v1.1/gallery_product/gallery/
``` ```
Then you need to create a new label file which records the image path and label information. Use the following command to create a new file based on the original one. Then you need to create a new label file which records the image path and label information. Use the following command to create a new file based on the original one.
```shell ```shell
# copy the file # copy the file
cp recognition_demo_data_v1.0/gallery_product/data_file.txt recognition_demo_data_v1.0/gallery_product/data_file_update.txt cp recognition_demo_data_v1.1/gallery_product/data_file.txt recognition_demo_data_v1.1/gallery_product/data_file_update.txt
``` ```
Then add some new lines into the new label file, which is shown as follows. Then add some new lines into the new label file, which is shown as follows.
...@@ -255,20 +261,20 @@ Each line can be split into two fields. The first field denotes the relative i ...@@ -255,20 +261,20 @@ Each line can be split into two fields. The first field denotes the relative i ...
Use the following command to build the index to accelerate the retrieval process after recognition. Use the following command to build the index to accelerate the retrieval process after recognition.
```shell ```shell
python3.7 python/build_gallery.py -c configs/build_product.yaml -o IndexProcess.data_file="./recognition_demo_data_v1.0/gallery_product/data_file_update.txt" -o IndexProcess.index_path="./recognition_demo_data_v1.0/gallery_product/index_update" python3.7 python/build_gallery.py -c configs/build_product.yaml -o IndexProcess.data_file="./recognition_demo_data_v1.1/gallery_product/data_file_update.txt" -o IndexProcess.index_dir="./recognition_demo_data_v1.1/gallery_product/index_update"
``` ```
Finally, the new index information is stored in the folder `./recognition_demo_data_v1.0/gallery_product/index_update`. Use the new index database for the retrieval above. Finally, the new index information is stored in the folder `./recognition_demo_data_v1.1/gallery_product/index_update`. Use the new index database for the retrieval above.
<a name="Image_differentiation_based_on_the_new_index_library"></a> <a name="Image_differentiation_based_on_the_new_index_library"></a>
### 3.2 Recognize the Unknown Category Images ### 3.2 Recognize the Unknown Category Images
To recognize the image `./recognition_demo_data_v1.0/test_product/anmuxi.jpg`, run the command as follows. To recognize the image `./recognition_demo_data_v1.1/test_product/anmuxi.jpg`, run the command as follows.
```shell ```shell
# using the following command to predict using GPU, you can append `-o Global.use_gpu=False` to predict using CPU. # using the following command to predict using GPU, you can append `-o Global.use_gpu=False` to predict using CPU.
python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.infer_imgs="./recognition_demo_data_v1.0/test_product/anmuxi.jpg" -o IndexProcess.index_path="./recognition_demo_data_v1.0/gallery_product/index_update" python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.infer_imgs="./recognition_demo_data_v1.1/test_product/anmuxi.jpg" -o IndexProcess.index_dir="./recognition_demo_data_v1.1/gallery_product/index_update"
``` ```
The output is as follows: The output is as follows:
......
...@@ -31,9 +31,9 @@ ...@@ -31,9 +31,9 @@
| 模型 | Top-1 Acc | Reference<br>Top-1 Acc | Acc gain | time(ms)<br>bs=1 | time(ms)<br>bs=4 | Flops(G) | Params(M) | 下载地址 | | 模型 | Top-1 Acc | Reference<br>Top-1 Acc | Acc gain | time(ms)<br>bs=1 | time(ms)<br>bs=4 | Flops(G) | Params(M) | 下载地址 |
|---------------------|-----------|-----------|---------------|----------------|-----------|----------|-----------|-----------------------------------| |---------------------|-----------|-----------|---------------|----------------|-----------|----------|-----------|-----------------------------------|
| ResNet34_vd_ssld | 0.797 | 0.760 | 0.037 | 2.434 | 6.222 | 7.39 | 21.82 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet34_vd_ssld_pretrained.pdparams) | | ResNet34_vd_ssld | 0.797 | 0.760 | 0.037 | 2.434 | 6.222 | 7.39 | 21.82 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet34_vd_ssld_pretrained.pdparams) |
| ResNet50_vd_<br>ssld | 0.830 | 0.792 | 0.039 | 3.531 | 8.090 | 8.67 | 25.58 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet50_vd_ssld_pretrained.pdparams) | | ResNet50_vd_ssld | 0.830 | 0.792 | 0.039 | 3.531 | 8.090 | 8.67 | 25.58 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet50_vd_ssld_pretrained.pdparams) |
| ResNet101_vd_<br>ssld | 0.837 | 0.802 | 0.035 | 6.117 | 13.762 | 16.1 | 44.57 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet101_vd_ssld_pretrained.pdparams) | | ResNet101_vd_ssld | 0.837 | 0.802 | 0.035 | 6.117 | 13.762 | 16.1 | 44.57 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/ResNet101_vd_ssld_pretrained.pdparams) |
| Res2Net50_vd_<br>26w_4s_ssld | 0.831 | 0.798 | 0.033 | 4.527 | 9.657 | 8.37 | 25.06 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net50_vd_26w_4s_ssld_pretrained.pdparams) | | Res2Net50_vd_26w_4s_ssld | 0.831 | 0.798 | 0.033 | 4.527 | 9.657 | 8.37 | 25.06 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net50_vd_26w_4s_ssld_pretrained.pdparams) |
| Res2Net101_vd_<br>26w_4s_ssld | 0.839 | 0.806 | 0.033 | 8.087 | 17.312 | 16.67 | 45.22 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net101_vd_26w_4s_ssld_pretrained.pdparams) | | Res2Net101_vd_<br>26w_4s_ssld | 0.839 | 0.806 | 0.033 | 8.087 | 17.312 | 16.67 | 45.22 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net101_vd_26w_4s_ssld_pretrained.pdparams) |
| Res2Net200_vd_<br>26w_4s_ssld | 0.851 | 0.812 | 0.049 | 14.678 | 32.350 | 31.49 | 76.21 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net200_vd_26w_4s_ssld_pretrained.pdparams) | | Res2Net200_vd_<br>26w_4s_ssld | 0.851 | 0.812 | 0.049 | 14.678 | 32.350 | 31.49 | 76.21 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/Res2Net200_vd_26w_4s_ssld_pretrained.pdparams) |
| HRNet_W18_C_ssld | 0.812 | 0.769 | 0.043 | 7.406 | 13.297 | 4.14 | 21.29 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W18_C_ssld_pretrained.pdparams) | | HRNet_W18_C_ssld | 0.812 | 0.769 | 0.043 | 7.406 | 13.297 | 4.14 | 21.29 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/HRNet_W18_C_ssld_pretrained.pdparams) |
...@@ -45,16 +45,44 @@ ...@@ -45,16 +45,44 @@
| 模型 | Top-1 Acc | Reference<br>Top-1 Acc | Acc gain | SD855 time(ms)<br>bs=1 | Flops(G) | Params(M) | 模型大小(M) | 下载地址 | | 模型 | Top-1 Acc | Reference<br>Top-1 Acc | Acc gain | SD855 time(ms)<br>bs=1 | Flops(G) | Params(M) | 模型大小(M) | 下载地址 |
|---------------------|-----------|-----------|---------------|----------------|-----------|----------|-----------|-----------------------------------| |---------------------|-----------|-----------|---------------|----------------|-----------|----------|-----------|-----------------------------------|
| MobileNetV1_<br>ssld | 0.779 | 0.710 | 0.069 | 32.523 | 1.11 | 4.19 | 16 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV1_ssld_pretrained.pdparams) | | MobileNetV1_ssld | 0.779 | 0.710 | 0.069 | 32.523 | 1.11 | 4.19 | 16 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV1_ssld_pretrained.pdparams) |
| MobileNetV2_<br>ssld | 0.767 | 0.722 | 0.045 | 23.318 | 0.6 | 3.44 | 14 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV2_ssld_pretrained.pdparams) | | MobileNetV2_ssld | 0.767 | 0.722 | 0.045 | 23.318 | 0.6 | 3.44 | 14 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV2_ssld_pretrained.pdparams) |
| MobileNetV3_<br>small_x0_35_ssld | 0.556 | 0.530 | 0.026 | 2.635 | 0.026 | 1.66 | 6.9 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_small_x0_35_ssld_pretrained.pdparams) | | MobileNetV3_small_x0_35_ssld | 0.556 | 0.530 | 0.026 | 2.635 | 0.026 | 1.66 | 6.9 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_small_x0_35_ssld_pretrained.pdparams) |
| MobileNetV3_<br>large_x1_0_ssld | 0.790 | 0.753 | 0.036 | 19.308 | 0.45 | 5.47 | 21 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_large_x1_0_ssld_pretrained.pdparams) | | MobileNetV3_large_x1_0_ssld | 0.790 | 0.753 | 0.036 | 19.308 | 0.45 | 5.47 | 21 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_large_x1_0_ssld_pretrained.pdparams) |
| MobileNetV3_small_<br>x1_0_ssld | 0.713 | 0.682 | 0.031 | 6.546 | 0.123 | 2.94 | 12 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_small_x1_0_ssld_pretrained.pdparams) | | MobileNetV3_small_x1_0_ssld | 0.713 | 0.682 | 0.031 | 6.546 | 0.123 | 2.94 | 12 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/MobileNetV3_small_x1_0_ssld_pretrained.pdparams) |
| GhostNet_<br>x1_3_ssld | 0.794 | 0.757 | 0.037 | 19.983 | 0.44 | 7.3 | 29 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/GhostNet_x1_3_ssld_pretrained.pdparams) | | GhostNet_x1_3_ssld | 0.794 | 0.757 | 0.037 | 19.983 | 0.44 | 7.3 | 29 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/GhostNet_x1_3_ssld_pretrained.pdparams) |
* Intel CPU端知识蒸馏模型
| 模型 | Top-1 Acc | Reference<br>Top-1 Acc | Acc gain | Intel-Xeon-Gold-6148 time(ms)<br>bs=1 | Flops(M) | Params(M) | 下载地址 |
|---------------------|-----------|-----------|---------------|----------------|----------|-----------|-----------------------------------|
| PPLCNet_x0_5_ssld | 0.661 | 0.631 | 0.030 | 2.05 | 47 | 1.9 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_5_ssld_pretrained.pdparams) |
| PPLCNet_x1_0_ssld | 0.744 | 0.713 | 0.033 | 2.46 | 161 | 3.0 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_0_ssld_pretrained.pdparams) |
| PPLCNet_x2_5_ssld | 0.808 | 0.766 | 0.042 | 5.39 | 906 | 9.0 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_5_ssld_pretrained.pdparams) |
* 注: `Reference Top-1 Acc`表示PaddleClas基于ImageNet1k数据集训练得到的预训练模型精度。 * 注: `Reference Top-1 Acc`表示PaddleClas基于ImageNet1k数据集训练得到的预训练模型精度。
<a name="PPLCNet系列"></a>
### PPLCNet系列
PPLCNet系列模型的精度、速度指标如下表所示,更多关于该系列的模型介绍可以参考:[PPLCNet系列模型文档](./models/PPLCNet.md)
| 模型 | Top-1 Acc | Top-5 Acc | Intel-Xeon-Gold-6148 time(ms)<br>bs=1 | FLOPs(M) | Params(M) | 下载地址 |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| PPLCNet_x0_25 |0.5186 | 0.7565 | 1.74 | 18 | 1.5 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_25_pretrained.pdparams) |
| PPLCNet_x0_35 |0.5809 | 0.8083 | 1.92 | 29 | 1.6 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_35_pretrained.pdparams) |
| PPLCNet_x0_5 |0.6314 | 0.8466 | 2.05 | 47 | 1.9 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_5_pretrained.pdparams) |
| PPLCNet_x0_75 |0.6818 | 0.8830 | 2.29 | 99 | 2.4 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams) |
| PPLCNet_x1_0 |0.7132 | 0.9003 | 2.46 | 161 | 3.0 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_0_pretrained.pdparams) |
| PPLCNet_x1_5 |0.7371 | 0.9153 | 3.19 | 342 | 4.5 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_5_pretrained.pdparams) |
| PPLCNet_x2_0 |0.7518 | 0.9227 | 4.27 | 590 | 6.5 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_0_pretrained.pdparams) |
| PPLCNet_x2_5 |0.7660 | 0.9300 | 5.39 | 906 | 9.0 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_5_pretrained.pdparams) |
<a name="ResNet及其Vd系列"></a> <a name="ResNet及其Vd系列"></a>
### ResNet及其Vd系列 ### ResNet及其Vd系列
...@@ -429,7 +457,7 @@ ViT(Vision Transformer)与DeiT(Data-efficient Image Transformers)系列 ...@@ -429,7 +457,7 @@ ViT(Vision Transformer)与DeiT(Data-efficient Image Transformers)系列
| 模型 | Top-1 Acc | Top-5 Acc | time(ms)<br>bs=1 | time(ms)<br>bs=4 | Flops(G) | Params(M) | 下载地址 | | 模型 | Top-1 Acc | Top-5 Acc | time(ms)<br>bs=1 | time(ms)<br>bs=4 | Flops(G) | Params(M) | 下载地址 |
| ---------- | --------- | --------- | ---------------- | ---------------- | -------- | --------- | ------------------------------------------------------------ | | ---------- | --------- | --------- | ---------------- | ---------------- | -------- | --------- | ------------------------------------------------------------ |
| TNT_small | 0.8121 |0.9563 | | | 5.2 | 23.8 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/TNT_small_pretrained.pdparams) | | | TNT_small | 0.8121 |0.9563 | | | 5.2 | 23.8 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/TNT_small_pretrained.pdparams) | |
**注**:TNT模型的数据预处理部分`NormalizeImage`中的`mean``std`均为0.5。 **注**:TNT模型的数据预处理部分`NormalizeImage`中的`mean``std`均为0.5。
......
...@@ -7,7 +7,7 @@ ...@@ -7,7 +7,7 @@
* 图像分类、识别、检索领域大佬众多,模型和论文更新速度也很快,本文档回答主要依赖有限的项目实践,难免挂一漏万,如有遗漏和不足,也希望有识之士帮忙补充和修正,万分感谢。 * 图像分类、识别、检索领域大佬众多,模型和论文更新速度也很快,本文档回答主要依赖有限的项目实践,难免挂一漏万,如有遗漏和不足,也希望有识之士帮忙补充和修正,万分感谢。
## 目录 ## 目录
* [近期更新](#近期更新)(2021.08.11) * [近期更新](#近期更新)(2021.09.08)
* [精选](#精选) * [精选](#精选)
* [1. 理论篇](#1.理论篇) * [1. 理论篇](#1.理论篇)
* [1.1 PaddleClas基础知识](#1.1PaddleClas基础知识) * [1.1 PaddleClas基础知识](#1.1PaddleClas基础知识)
...@@ -27,60 +27,69 @@ ...@@ -27,60 +27,69 @@
<a name="近期更新"></a> <a name="近期更新"></a>
## 近期更新 ## 近期更新
#### Q2.6.2: 导出inference模型进行预测部署,准确率异常,为什么呢? #### Q2.1.7: 在训练时,出现如下报错信息:`ERROR: Unexpected segmentation fault encountered in DataLoader workers.`,如何排查解决问题呢?
**A**: 该问题通常是由于在导出时未能正确加载模型参数导致的,首先检查模型导出时的日志,是否存在类似下述内容: **A**:尝试将训练配置文件中的字段 `num_workers` 设置为 `0`;尝试将训练配置文件中的字段 `batch_size` 调小一些;检查数据集格式和配置文件中的数据集路径是否正确。
```
UserWarning: Skip loading for ***. *** is not found in the provided dict.
```
如果存在,则说明模型权重未能加载成功,请进一步检查配置文件中的 `Global.pretrained_model` 字段,是否正确配置了模型权重文件的路径。模型权重文件后缀名通常为 `pdparams`,注意在配置该路径时无需填写文件后缀名。
#### Q2.1.4: 数据预处理中,不想对输入数据进行裁剪,该如何设置?或者如何设置剪裁的尺寸。 #### Q2.1.8: 如何在训练时使用 `Mixup` 和 `Cutmix` ?
**A**: PaddleClas 支持的数据预处理算子可在这里查看:`ppcls/data/preprocess/__init__.py`,所有支持的算子均可在配置文件中进行配置,配置的算子名称需要和算子类名一致,参数与对应算子类的构造函数参数一致。如不需要对图像裁剪,则可去掉 `CropImage``RandCropImage`,使用 `ResizeImage` 替换即可,可通过其参数设置不同的resize方式, 使用 `size` 参数则直接将图像缩放至固定大小,使用`resize_short` 参数则会维持图像宽高比进行缩放。设置裁剪尺寸时,可通过 `CropImage` 算子的 `size` 参数,或 `RandCropImage` 算子的 `size` 参数。 **A**
* `Mixup` 的使用方法请参考 [Mixup](https://github.com/PaddlePaddle/PaddleClas/blob/cf9fc9363877f919996954a63716acfb959619d0/ppcls/configs/ImageNet/DataAugment/ResNet50_Mixup.yaml#L63-L65);`Cutmix` 请参考 [Cutmix](https://github.com/PaddlePaddle/PaddleClas/blob/cf9fc9363877f919996954a63716acfb959619d0/ppcls/configs/ImageNet/DataAugment/ResNet50_Cutmix.yaml#L63-L65);
#### Q1.1.3: Momentum 优化器中的 momentum 参数是什么意思呢? * 在使用 `Mixup``Cutmix` 时,需要注意:
**A**: Momentum 优化器是在 SGD 优化器的基础上引入了“动量”的概念。在 SGD 优化器中,在 `t+1` 时刻,参数 `w` 的更新可表示为: * 配置文件中的 `Loss.Train.CELoss` 需要修改为 `Loss.Train.MixCELoss`,可参考 [MixCELoss](https://github.com/PaddlePaddle/PaddleClas/blob/cf9fc9363877f919996954a63716acfb959619d0/ppcls/configs/ImageNet/DataAugment/ResNet50_Cutmix.yaml#L23-L26);
```latex * 使用 `Mixup``Cutmix` 做训练时无法计算训练的精度(Acc)指标,因此需要在配置文件中取消 `Metric.Train.TopkAcc` 字段,可参考 [Metric.Train.TopkAcc](https://github.com/PaddlePaddle/PaddleClas/blob/cf9fc9363877f919996954a63716acfb959619d0/ppcls/configs/ImageNet/DataAugment/ResNet50_Cutmix.yaml#L125-L128)
w_t+1 = w_t - lr * grad
``` #### Q2.1.9: 训练配置yaml文件中,字段 `Global.pretrain_model` 和 `Global.checkpoints` 分别用于配置什么呢?
其中,`lr` 为学习率,`grad` 为此时参数 `w` 的梯度。在引入动量的概念后,参数 `w` 的更新可表示为: **A**
```latex * 当需要 `fine-tune` 时,可以通过字段 `Global.pretrain_model` 配置预训练模型权重文件的路径,预训练模型权重文件后缀名通常为 `.pdparams`
v_t+1 = m * v_t + lr * grad * 在训练过程中,训练程序会自动保存每个epoch结束时的断点信息,包括优化器信息 `.pdopt` 和模型权重信息 `.pdparams`。在训练过程意外中断等情况下,需要恢复训练时,可以通过字段 `Global.checkpoints` 配置训练过程中保存的断点信息文件,例如通过配置 `checkpoints: ./output/ResNet18/epoch_18` 即可恢复18epoch训练结束时的断点信息,PaddleClas将自动加载 `epoch_18.pdopt``epoch_18.pdparams`,从19epoch继续训练。
w_t+1 = w_t - v_t+1
#### Q2.6.3: 如何将模型转为 `ONNX` 格式?
**A**:Paddle支持两种转ONNX格式模型的方式,且依赖于 `paddle2onnx` 工具,首先需要安装 `paddle2onnx`
```shell
pip install paddle2onnx
``` ```
其中,`m` 即为动量 `momentum`,表示累积动量的加权值,一般取 `0.9`,当取值小于 `1` 时,则越早期的梯度对当前的影响越小,例如,当动量参数 `m``0.9` 时,在 `t` 时刻,`t-5` 的梯度加权值为 `0.9 ^ 5 = 0.59049`,而 `t-2` 时刻的梯度加权值为 `0.9 ^ 2 = 0.81`。因此,太过“久远”的梯度信息对当前的参考意义很小,而“最近”的历史梯度信息对当前影响更大,这也是符合直觉的。
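下面给出上述公式的一个最小数值示意(超参数 `lr=0.1`、`m=0.9` 均为假设值,非 PaddleClas 实现):

```python
# 动量 SGD 更新的数值示意:v 累积历史梯度,再用 v 更新参数 w
lr, m = 0.1, 0.9
w, v = 1.0, 0.0
for grad in [0.5, 0.5, 0.5]:   # 假设连续三步梯度相同
    v = m * v + lr * grad      # v_{t+1} = m * v_t + lr * grad
    w = w - v                  # w_{t+1} = w_t - v_{t+1}
    print(w, v)
```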
<div align="center"> * 从 inference model 转为 ONNX 格式模型:
<img src="../../images/faq/momentum.jpeg" width="400">
</div>
*该图来自 `https://blog.csdn.net/tsyccnh/article/details/76270707`* 以动态图导出的 `combined` 格式 inference model(包含 `.pdmodel` 和 `.pdiparams` 两个文件)为例,使用以下命令进行模型格式转换:
```shell
paddle2onnx --model_dir ${model_path} --model_filename ${model_path}/inference.pdmodel --params_filename ${model_path}/inference.pdiparams --save_file ${save_path}/model.onnx --enable_onnx_checker True
```
上述命令中:
* `model_dir`:该参数下需要包含 `.pdmodel` 和 `.pdiparams` 两个文件;
* `model_filename`:该参数用于指定参数 `model_dir` 下的 `.pdmodel` 文件路径;
* `params_filename`:该参数用于指定参数 `model_dir` 下的 `.pdiparams` 文件路径;
* `save_file`:该参数用于指定转换后的模型保存路径。
通过引入动量的概念,在参数更新时考虑了历史更新的影响,因此可以加快收敛速度,也改善了 `SGD` 优化器带来的损失(cost、loss)震荡问题 关于静态图导出的非 `combined` 格式的 inference model(通常包含文件 `__model__` 和多个参数文件)转换模型格式,以及更多参数说明请参考 paddle2onnx 官方文档 [paddle2onnx](https://github.com/PaddlePaddle/Paddle2ONNX/blob/develop/README_zh.md#%E5%8F%82%E6%95%B0%E9%80%89%E9%A1%B9)
#### Q1.1.4: PaddleClas 是否有 `Fixing the train-test resolution discrepancy` 这篇论文的实现呢? * 直接从模型组网代码导出ONNX格式模型:
**A**: 目前 PaddleClas 没有实现。如果需要,可以尝试自己修改代码。简单来说,该论文所提出的思想是使用较大分辨率作为输入,对已经训练好的模型最后的FC层进行fine-tune。具体操作上,首先在较低分辨率的数据集上对模型网络进行训练,完成训练后,对网络除最后的FC层外的其他层的权重设置参数 `stop_gradient=True`,然后使用较大分辨率的输入对网络进行fine-tune训练。
#### Q1.6.2: PaddleClas 图像识别用于 Eval 的配置文件中,`Query` 和 `Gallery` 配置具体是用于做什么呢? 以动态图模型组网代码为例,模型类为继承于 `paddle.nn.Layer` 的子类,代码如下所示:
**A**: `Query``Gallery` 均为数据集配置,其中 `Gallery` 用于配置底库数据,`Query` 用于配置验证集。在进行 Eval 时,首先使用模型对 `Gallery` 底库数据进行前向计算特征向量,特征向量用于构建底库,然后模型对 `Query` 验证集中的数据进行前向计算特征向量,再与底库计算召回率等指标。
#### Q2.1.5: PaddlePaddle 安装后,使用报错,无法导入 paddle 下的任何模块(import paddle.xxx),是为什么呢? ```python
**A**: 首先可以使用以下代码测试 Paddle 是否安装正确: import paddle
```python from paddle.static import InputSpec
import paddle
paddle.utils.install_check.run_check()
```
正确安装时,通常会有如下提示:
```
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
```
如未能安装成功,则会有相应问题的提示。
另外,在同时安装CPU版本和GPU版本Paddle后,由于两个版本存在冲突,需要将两个版本全部卸载,然后重新安装所需要的版本。
#### Q2.1.6: 使用PaddleClas训练时,如何设置仅保存最优模型?不想保存中间模型。 class SimpleNet(paddle.nn.Layer):
**A**: PaddleClas在训练过程中,会保存/更新以下三类模型: def __init__(self):
1. 最新的模型(`latest.pdopt``latest.pdparams``latest.pdstates`),当训练意外中断时,可使用最新保存的模型恢复训练; pass
2. 最优的模型(`best_model.pdopt``best_model.pdparams``best_model.pdstates`); def forward(self, x):
3. 训练过程中,一个epoch结束时的断点(`epoch_xxx.pdopt``epoch_xxx.pdparams``epoch_xxx.pdstates`)。训练配置文件中 `Global.save_interval` 字段表示该模型的保存间隔。将该字段设置大于总epochs数,则不再保存中间断点模型。 pass
net = SimpleNet()
x_spec = InputSpec(shape=[None, 3, 224, 224], dtype='float32', name='x')
paddle.onnx.export(layer=net, path="./SimpleNet", input_spec=[x_spec])
```
其中:
* `InputSpec()` 函数用于描述模型输入的签名信息,包括输入数据的 `shape`、`type` 和 `name`(可省略);
* `paddle.onnx.export()` 函数需要指定模型组网对象 `net`,导出模型的保存路径 `save_path`,模型的输入数据描述 `input_spec`。
需要注意,`paddlepaddle` 版本需大于 `2.0.0`。关于 `paddle.onnx.export()` 函数的更多参数说明请参考[paddle.onnx.export](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/onnx/export_cn.html#export)。
#### Q2.5.4: 在 build 检索底库时,参数 `pq_size` 应该如何设置?
**A**`pq_size` 是PQ检索算法的参数。PQ检索算法可以简单理解为“分层”检索算法,`pq_size` 是每层的“容量”,因此该参数的设置会影响检索性能,不过,在底库总数据量不太大(小于10000张)的情况下,这个参数对性能的影响很小,因此对于大多数使用场景而言,在构建底库时无需修改该参数。关于PQ检索算法的更多内容,可以查看相关[论文](https://lear.inrialpes.fr/pubs/2011/JDS11/jegou_searching_with_quantization.pdf)
<a name="精选"></a> <a name="精选"></a>
## 精选 ## 精选
...@@ -204,6 +213,22 @@ PaddlePaddle is installed successfully! Let's start deep learning with PaddlePad ...@@ -204,6 +213,22 @@ PaddlePaddle is installed successfully! Let's start deep learning with PaddlePad
2. 最优的模型(`best_model.pdopt``best_model.pdparams``best_model.pdstates`); 2. 最优的模型(`best_model.pdopt``best_model.pdparams``best_model.pdstates`);
3. 训练过程中,一个epoch结束时的断点(`epoch_xxx.pdopt``epoch_xxx.pdparams``epoch_xxx.pdstates`)。训练配置文件中 `Global.save_interval` 字段表示该模型的保存间隔。将该字段设置大于总epochs数,则不再保存中间断点模型。 3. 训练过程中,一个epoch结束时的断点(`epoch_xxx.pdopt``epoch_xxx.pdparams``epoch_xxx.pdstates`)。训练配置文件中 `Global.save_interval` 字段表示该模型的保存间隔。将该字段设置大于总epochs数,则不再保存中间断点模型。
#### Q2.1.7: 在训练时,出现如下报错信息:`ERROR: Unexpected segmentation fault encountered in DataLoader workers.`,如何排查解决问题呢?
**A**:尝试将训练配置文件中的字段 `num_workers` 设置为 `0`;尝试将训练配置文件中的字段 `batch_size` 调小一些;检查数据集格式和配置文件中的数据集路径是否正确。
#### Q2.1.8: 如何在训练时使用 `Mixup` 和 `Cutmix` ?
**A**
* `Mixup` 的使用方法请参考 [Mixup](https://github.com/PaddlePaddle/PaddleClas/blob/cf9fc9363877f919996954a63716acfb959619d0/ppcls/configs/ImageNet/DataAugment/ResNet50_Mixup.yaml#L63-L65);`Cutmix` 请参考 [Cutmix](https://github.com/PaddlePaddle/PaddleClas/blob/cf9fc9363877f919996954a63716acfb959619d0/ppcls/configs/ImageNet/DataAugment/ResNet50_Cutmix.yaml#L63-L65);
* 在使用 `Mixup``Cutmix` 时,需要注意:
* 配置文件中的 `Loss.Train.CELoss` 需要修改为 `Loss.Train.MixCELoss`,可参考 [MixCELoss](https://github.com/PaddlePaddle/PaddleClas/blob/cf9fc9363877f919996954a63716acfb959619d0/ppcls/configs/ImageNet/DataAugment/ResNet50_Cutmix.yaml#L23-L26);
* 使用 `Mixup``Cutmix` 做训练时无法计算训练的精度(Acc)指标,因此需要在配置文件中取消 `Metric.Train.TopkAcc` 字段,可参考 [Metric.Train.TopkAcc](https://github.com/PaddlePaddle/PaddleClas/blob/cf9fc9363877f919996954a63716acfb959619d0/ppcls/configs/ImageNet/DataAugment/ResNet50_Cutmix.yaml#L125-L128)
#### Q2.1.9: 训练配置yaml文件中,字段 `Global.pretrain_model` 和 `Global.checkpoints` 分别用于配置什么呢?
**A**
* 当需要 `fine-tune` 时,可以通过字段 `Global.pretrain_model` 配置预训练模型权重文件的路径,预训练模型权重文件后缀名通常为 `.pdparams`
* 在训练过程中,训练程序会自动保存每个epoch结束时的断点信息,包括优化器信息 `.pdopt` 和模型权重信息 `.pdparams`。在训练过程意外中断等情况下,需要恢复训练时,可以通过字段 `Global.checkpoints` 配置训练过程中保存的断点信息文件,例如通过配置 `checkpoints: ./output/ResNet18/epoch_18` 即可恢复18epoch训练结束时的断点信息,PaddleClas将自动加载 `epoch_18.pdopt``epoch_18.pdparams`,从19epoch继续训练。
<a name="2.2图像分类"></a> <a name="2.2图像分类"></a>
### 2.2 图像分类 ### 2.2 图像分类
...@@ -255,6 +280,9 @@ PaddlePaddle is installed successfully! Let's start deep learning with PaddlePad ...@@ -255,6 +280,9 @@ PaddlePaddle is installed successfully! Let's start deep learning with PaddlePad
#### Q2.5.3: Mac重新编译index.so时报错如下:clang: error: unsupported option '-fopenmp', 该如何处理? #### Q2.5.3: Mac重新编译index.so时报错如下:clang: error: unsupported option '-fopenmp', 该如何处理?
**A**:该问题已经解决。可以参照[文档](../../../develop/deploy/vector_search/README.md)重新编译 index.so。 **A**:该问题已经解决。可以参照[文档](../../../develop/deploy/vector_search/README.md)重新编译 index.so。
#### Q2.5.4: 在 build 检索底库时,参数 `pq_size` 应该如何设置?
**A**`pq_size` 是PQ检索算法的参数。PQ检索算法可以简单理解为“分层”检索算法,`pq_size` 是每层的“容量”,因此该参数的设置会影响检索性能,不过,在底库总数据量不太大(小于10000张)的情况下,这个参数对性能的影响很小,因此对于大多数使用场景而言,在构建底库时无需修改该参数。关于PQ检索算法的更多内容,可以查看相关[论文](https://lear.inrialpes.fr/pubs/2011/JDS11/jegou_searching_with_quantization.pdf)
<a name="2.6模型预测部署"></a> <a name="2.6模型预测部署"></a>
### 2.6 模型预测部署 ### 2.6 模型预测部署
...@@ -267,3 +295,48 @@ PaddlePaddle is installed successfully! Let's start deep learning with PaddlePad ...@@ -267,3 +295,48 @@ PaddlePaddle is installed successfully! Let's start deep learning with PaddlePad
UserWarning: Skip loading for ***. *** is not found in the provided dict. UserWarning: Skip loading for ***. *** is not found in the provided dict.
``` ```
如果存在,则说明模型权重未能加载成功,请进一步检查配置文件中的 `Global.pretrained_model` 字段,是否正确配置了模型权重文件的路径。模型权重文件后缀名通常为 `pdparams`,注意在配置该路径时无需填写文件后缀名。 如果存在,则说明模型权重未能加载成功,请进一步检查配置文件中的 `Global.pretrained_model` 字段,是否正确配置了模型权重文件的路径。模型权重文件后缀名通常为 `pdparams`,注意在配置该路径时无需填写文件后缀名。
#### Q2.6.3: 如何将模型转为 `ONNX` 格式?
**A**:Paddle支持两种转ONNX格式模型的方式,且依赖于 `paddle2onnx` 工具,首先需要安装 `paddle2onnx`
```shell
pip install paddle2onnx
```
* 从 inference model 转为 ONNX 格式模型:
以动态图导出的 `combined` 格式 inference model(包含 `.pdmodel` 和 `.pdiparams` 两个文件)为例,使用以下命令进行模型格式转换:
```shell
paddle2onnx --model_dir ${model_path} --model_filename ${model_path}/inference.pdmodel --params_filename ${model_path}/inference.pdiparams --save_file ${save_path}/model.onnx --enable_onnx_checker True
```
上述命令中:
* `model_dir`:该参数下需要包含 `.pdmodel` 和 `.pdiparams` 两个文件;
* `model_filename`:该参数用于指定参数 `model_dir` 下的 `.pdmodel` 文件路径;
* `params_filename`:该参数用于指定参数 `model_dir` 下的 `.pdiparams` 文件路径;
* `save_file`:该参数用于指定转换后的模型保存路径。
关于静态图导出的非 `combined` 格式的 inference model(通常包含文件 `__model__` 和多个参数文件)转换模型格式,以及更多参数说明请参考 paddle2onnx 官方文档 [paddle2onnx](https://github.com/PaddlePaddle/Paddle2ONNX/blob/develop/README_zh.md#%E5%8F%82%E6%95%B0%E9%80%89%E9%A1%B9)。
* 直接从模型组网代码导出ONNX格式模型:
以动态图模型组网代码为例,模型类为继承于 `paddle.nn.Layer` 的子类,代码如下所示:
```python
import paddle
from paddle.static import InputSpec
class SimpleNet(paddle.nn.Layer):
    def __init__(self):
        super().__init__()  # 动态图组网必须先调用父类构造函数

    def forward(self, x):
        return x  # 示例:直接返回输入,实际模型在此编写前向计算
net = SimpleNet()
x_spec = InputSpec(shape=[None, 3, 224, 224], dtype='float32', name='x')
paddle.onnx.export(layer=net, path="./SimpleNet", input_spec=[x_spec])
```
其中:
* `InputSpec()` 函数用于描述模型输入的签名信息,包括输入数据的 `shape`、`type` 和 `name`(可省略);
* `paddle.onnx.export()` 函数需要指定模型组网对象 `net`,导出模型的保存路径 `save_path`,模型的输入数据描述 `input_spec`。
需要注意,`paddlepaddle` 版本需大于 `2.0.0`。关于 `paddle.onnx.export()` 函数的更多参数说明请参考[paddle.onnx.export](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/onnx/export_cn.html#export)。
# PPLCNet系列
## 概述
PPLCNet系列是百度PaddleCV团队提出的一种在Intel CPU上表现优异的网络,作者总结了一些在Intel CPU上可以提升模型精度但几乎不增加推理耗时的方法,将这些方法组合成了一个新的网络,即PPLCNet。与其他轻量级网络相比,PPLCNet可以在相同延时下取得更高的精度。PPLCNet已在图像分类、目标检测、语义分割上表现出了强大的竞争力。
## 精度、FLOPs和参数量
| Models | Top1 | Top5 | FLOPs<br>(M) | Parameters<br>(M) |
|:--:|:--:|:--:|:--:|:--:|
| PPLCNet_x0_25 |0.5186 | 0.7565 | 18 | 1.5 |
| PPLCNet_x0_35 |0.5809 | 0.8083 | 29 | 1.6 |
| PPLCNet_x0_5 |0.6314 | 0.8466 | 47 | 1.9 |
| PPLCNet_x0_75 |0.6818 | 0.8830 | 99 | 2.4 |
| PPLCNet_x1_0 |0.7132 | 0.9003 | 161 | 3.0 |
| PPLCNet_x1_5 |0.7371 | 0.9153 | 342 | 4.5 |
| PPLCNet_x2_0 |0.7518 | 0.9227 | 590 | 6.5 |
| PPLCNet_x2_5 |0.7660 | 0.9300 | 906 | 9.0 |
| PPLCNet_x0_5_ssld |0.6610 | 0.8646 | 47 | 1.9 |
| PPLCNet_x1_0_ssld |0.7439 | 0.9209 | 161 | 3.0 |
| PPLCNet_x2_5_ssld |0.8082 | 0.9533 | 906 | 9.0 |
## 基于Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz的预测速度
| Models | Crop Size | Resize Short Size | FP32<br>Batch Size=1<br>(ms) |
|------------------|-----------|-------------------|--------------------------|
| PPLCNet_x0_25 | 224 | 256 | 1.74 |
| PPLCNet_x0_35 | 224 | 256 | 1.92 |
| PPLCNet_x0_5 | 224 | 256 | 2.05 |
| PPLCNet_x0_75 | 224 | 256 | 2.29 |
| PPLCNet_x1_0 | 224 | 256 | 2.46 |
| PPLCNet_x1_5 | 224 | 256 | 3.19 |
| PPLCNet_x2_0 | 224 | 256 | 4.27 |
| PPLCNet_x2_5 | 224 | 256 | 5.39 |
| PPLCNet_x0_5_ssld | 224 | 256 | 2.05 |
| PPLCNet_x1_0_ssld | 224 | 256 | 2.46 |
| PPLCNet_x2_5_ssld | 224 | 256 | 5.39 |
...@@ -46,8 +46,8 @@ ...@@ -46,8 +46,8 @@
本章节demo数据下载地址如下: [数据下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_v1.1.tar) 本章节demo数据下载地址如下: [数据下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_v1.1.tar)
**注意** **注意**
1. windows 环境下如果没有安装wget,可以按照下面的步骤安装wget与tar命令,也可以在下载模型时将链接复制到浏览器中下载,并解压放置在相应目录下;linux或者macOS用户可以右键点击,然后复制下载链接,即可通过`wget`命令下载。 1. windows 环境下如果没有安装wget,可以按照下面的步骤安装wget与tar命令,也可以在下载模型时将链接复制到浏览器中下载,并解压放置在相应目录下;linux或者macOS用户可以右键点击,然后复制下载链接,即可通过`wget`命令下载。
2. 如果macOS环境下没有安装`wget`命令,可以运行下面的命令进行安装。 2. 如果macOS环境下没有安装`wget`命令,可以运行下面的命令进行安装。
...@@ -74,8 +74,8 @@ cd .. ...@@ -74,8 +74,8 @@ cd ..
wget {数据下载链接地址} && tar -xf {压缩包的名称} wget {数据下载链接地址} && tar -xf {压缩包的名称}
``` ```
<a name="下载、解压inference_模型与demo数据"></a> <a name="下载、解压inference_模型与demo数据"></a>
### 2.1 下载、解压inference 模型与demo数据 ### 2.1 下载、解压inference 模型与demo数据
以商品识别为例,下载demo数据集以及通用检测、识别模型,命令如下。 以商品识别为例,下载demo数据集以及通用检测、识别模型,命令如下。
...@@ -90,13 +90,13 @@ wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/infere ...@@ -90,13 +90,13 @@ wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/infere
cd ../ cd ../
# 下载demo数据并解压 # 下载demo数据并解压
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_v1.0.tar && tar -xf recognition_demo_data_v1.0.tar wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_v1.1.tar && tar -xf recognition_demo_data_v1.1.tar
``` ```
解压完毕后,`recognition_demo_data_v1.0`文件夹下应有如下文件结构: 解压完毕后,`recognition_demo_data_v1.1`文件夹下应有如下文件结构:
``` ```
├── recognition_demo_data_v1.0 ├── recognition_demo_data_v1.1
│ ├── gallery_cartoon │ ├── gallery_cartoon
│ ├── gallery_logo │ ├── gallery_logo
│ ├── gallery_product │ ├── gallery_product
...@@ -129,11 +129,19 @@ wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognit ...@@ -129,11 +129,19 @@ wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognit
以商品识别demo为例,展示识别与检索过程(如果希望尝试其他方向的识别与检索效果,在下载解压好对应的demo数据与模型之后,替换对应的配置文件即可完成预测)。 以商品识别demo为例,展示识别与检索过程(如果希望尝试其他方向的识别与检索效果,在下载解压好对应的demo数据与模型之后,替换对应的配置文件即可完成预测)。
注意,此部分使用了`faiss`作为检索库,安装方法如下:
```shell
pip install faiss-cpu==1.7.1post2
```
若安装后无法正常 `import faiss`,请先卸载(`pip uninstall faiss-cpu`)再重新安装,Windows 环境下尤其容易出现该问题。
<a name="识别单张图像"></a> <a name="识别单张图像"></a>
#### 2.2.1 识别单张图像 #### 2.2.1 识别单张图像
运行下面的命令,对图像`./recognition_demo_data_v1.0/test_product/daoxiangcunjinzhubing_6.jpg`进行识别与检索 运行下面的命令,对图像`./recognition_demo_data_v1.1/test_product/daoxiangcunjinzhubing_6.jpg`进行识别与检索
```shell ```shell
# 使用下面的命令使用GPU进行预测 # 使用下面的命令使用GPU进行预测
...@@ -142,8 +150,6 @@ python3.7 python/predict_system.py -c configs/inference_product.yaml ...@@ -142,8 +150,6 @@ python3.7 python/predict_system.py -c configs/inference_product.yaml
python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.use_gpu=False python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.use_gpu=False
``` ```
注意:这里使用了默认编译生成的库文件进行特征索引,如果与您的环境不兼容,导致程序报错,可以参考[向量检索教程](../../../deploy/vector_search/README.md)重新编译库文件。
待检索图像如下所示。 待检索图像如下所示。
<div align="center"> <div align="center">
...@@ -173,7 +179,7 @@ python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.u ...@@ -173,7 +179,7 @@ python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.u
```shell ```shell
# 使用下面的命令使用GPU进行预测,如果希望使用CPU预测,可以在命令后面添加-o Global.use_gpu=False # 使用下面的命令使用GPU进行预测,如果希望使用CPU预测,可以在命令后面添加-o Global.use_gpu=False
python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.infer_imgs="./recognition_demo_data_v1.0/test_product/" python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.infer_imgs="./recognition_demo_data_v1.1/test_product/"
``` ```
终端中会输出该文件夹内所有图像的识别结果,如下所示。 终端中会输出该文件夹内所有图像的识别结果,如下所示。
...@@ -193,17 +199,17 @@ python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.i ...@@ -193,17 +199,17 @@ python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.i
所有图像的识别结果可视化图像也保存在`output`文件夹内。 所有图像的识别结果可视化图像也保存在`output`文件夹内。
更多地,可以通过修改`Global.rec_inference_model_dir`字段来更改识别inference模型的路径,通过修改`IndexProcess.index_path`字段来更改索引库索引的路径。 更多地,可以通过修改`Global.rec_inference_model_dir`字段来更改识别inference模型的路径,通过修改`IndexProcess.index_dir`字段来更改索引库索引的路径。
<a name="未知类别的图像识别体验"></a> <a name="未知类别的图像识别体验"></a>
## 3. 未知类别的图像识别体验 ## 3. 未知类别的图像识别体验
对图像`./recognition_demo_data_v1.0/test_product/anmuxi.jpg`进行识别,命令如下 对图像`./recognition_demo_data_v1.1/test_product/anmuxi.jpg`进行识别,命令如下
```shell ```shell
# 使用下面的命令使用GPU进行预测,如果希望使用CPU预测,可以在命令后面添加-o Global.use_gpu=False # 使用下面的命令使用GPU进行预测,如果希望使用CPU预测,可以在命令后面添加-o Global.use_gpu=False
python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.infer_imgs="./recognition_demo_data_v1.0/test_product/anmuxi.jpg" python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.infer_imgs="./recognition_demo_data_v1.1/test_product/anmuxi.jpg"
``` ```
待检索图像如下所示。 待检索图像如下所示。
...@@ -222,20 +228,20 @@ python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.i ...@@ -222,20 +228,20 @@ python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.i
<a name="准备新的数据与标签"></a> <a name="准备新的数据与标签"></a>
### 3.1 准备新的数据与标签 ### 3.1 准备新的数据与标签
首先需要将与待检索图像相似的图像列表拷贝到索引库原始图像的文件夹(`./recognition_demo_data_v1.0/gallery_product/gallery`)中,运行下面的命令拷贝相似图像。 首先需要将与待检索图像相似的图像列表拷贝到索引库原始图像的文件夹(`./recognition_demo_data_v1.1/gallery_product/gallery`)中,运行下面的命令拷贝相似图像。
```shell ```shell
cp -r ../docs/images/recognition/product_demo/gallery/anmuxi ./recognition_demo_data_v1.0/gallery_product/gallery/ cp -r ../docs/images/recognition/product_demo/gallery/anmuxi ./recognition_demo_data_v1.1/gallery_product/gallery/
``` ```
然后需要编辑记录了图像路径和标签信息的文本文件(`./recognition_demo_data_v1.0/gallery_product/data_file_update.txt`),这里基于原始标签文件,新建一个文件。命令如下。 然后需要编辑记录了图像路径和标签信息的文本文件(`./recognition_demo_data_v1.1/gallery_product/data_file_update.txt`),这里基于原始标签文件,新建一个文件。命令如下。
```shell ```shell
# 复制文件 # 复制文件
cp recognition_demo_data_v1.0/gallery_product/data_file.txt recognition_demo_data_v1.0/gallery_product/data_file_update.txt cp recognition_demo_data_v1.1/gallery_product/data_file.txt recognition_demo_data_v1.1/gallery_product/data_file_update.txt
``` ```
然后在文件`recognition_demo_data_v1.0/gallery_product/data_file_update.txt`中添加以下的信息, 然后在文件`recognition_demo_data_v1.1/gallery_product/data_file_update.txt`中添加以下的信息,
``` ```
gallery/anmuxi/001.jpg 安慕希酸奶 gallery/anmuxi/001.jpg 安慕希酸奶
...@@ -255,20 +261,20 @@ gallery/anmuxi/006.jpg 安慕希酸奶 ...@@ -255,20 +261,20 @@ gallery/anmuxi/006.jpg 安慕希酸奶
使用下面的命令构建index索引,加速识别后的检索过程。 使用下面的命令构建index索引,加速识别后的检索过程。
```shell ```shell
python3.7 python/build_gallery.py -c configs/build_product.yaml -o IndexProcess.data_file="./recognition_demo_data_v1.0/gallery_product/data_file_update.txt" -o IndexProcess.index_path="./recognition_demo_data_v1.0/gallery_product/index_update" python3.7 python/build_gallery.py -c configs/build_product.yaml -o IndexProcess.data_file="./recognition_demo_data_v1.1/gallery_product/data_file_update.txt" -o IndexProcess.index_dir="./recognition_demo_data_v1.1/gallery_product/index_update"
``` ```
最终新的索引信息保存在文件夹`./recognition_demo_data_v1.0/gallery_product/index_update`中。 最终新的索引信息保存在文件夹`./recognition_demo_data_v1.1/gallery_product/index_update`中。
<a name="基于新的索引库的图像识别"></a> <a name="基于新的索引库的图像识别"></a>
### 3.3 基于新的索引库的图像识别 ### 3.3 基于新的索引库的图像识别
使用新的索引库,对上述图像进行识别,运行命令如下。 使用新的索引库,对上述图像进行识别,运行命令如下。
```shell ```shell
# 使用下面的命令使用GPU进行预测,如果希望使用CPU预测,可以在命令后面添加-o Global.use_gpu=False # 使用下面的命令使用GPU进行预测,如果希望使用CPU预测,可以在命令后面添加-o Global.use_gpu=False
python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.infer_imgs="./recognition_demo_data_v1.0/test_product/anmuxi.jpg" -o IndexProcess.index_path="./recognition_demo_data_v1.0/gallery_product/index_update" python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.infer_imgs="./recognition_demo_data_v1.1/test_product/anmuxi.jpg" -o IndexProcess.index_dir="./recognition_demo_data_v1.1/gallery_product/index_update"
``` ```
输出结果如下。 输出结果如下。
......
# Metric Learning
## 简介
在机器学习中,我们经常会遇到度量数据间距离的问题。一般来说,对于可度量的数据,我们可以直接通过欧式距离(Euclidean Distance)、向量内积(Inner Product)或者余弦相似度(Cosine Similarity)来进行计算。但对于非结构化数据来说,我们却很难进行这样的操作,如计算一段视频和一首音乐的匹配程度。由于数据格式的不同,我们难以直接进行上述的向量运算,但先验知识告诉我们 ED(laugh_video, laugh_music) < ED(laugh_video, blue_music)。如何有效地表征这种“距离”关系呢?这就是Metric Learning所要研究的课题。
Metric learning全称是 Distance Metric Learning,它是通过机器学习的形式,根据训练数据,自动构造出一种基于特定任务的度量函数。Metric Learning的目标是学习一个变换函数(线性非线性均可)L,将数据点从原始的向量空间映射到一个新的向量空间,在新的向量空间里相似点的距离更近,非相似点的距离更远,使得度量更符合任务的要求,如下图所示。 Deep Metric Learning,就是用深度神经网络来拟合这个变换函数。
![example](../../images/ml_illustration.jpg)
## 应用
Metric Learning技术在实际生活中应用广泛,如我们耳熟能详的人脸识别(Face Recognition)、行人重识别(Person ReID)、图像检索(Image Retrieval)、细粒度分类(Fine-grained Classification)等。随着深度学习在工业实践中越来越广泛的应用,目前大家研究的方向基本都偏向于Deep Metric Learning(DML)。
一般来说,DML包含三个部分:特征提取网络(将输入映射为embedding)、采样策略(将一个mini-batch里的样本组合成多个sub-set),以及在每个sub-set上计算loss的损失函数。如下图所示:
![image](../../images/ml_pipeline.jpg)
## 算法
Metric Learning主要有如下两种学习范式:
### 1. Classification based:
这是一类基于分类标签的Metric Learning方法。这类方法通过将每个样本分类到正确的类别中,来学习有效的特征表示,学习过程中需要每个样本的显式标签参与Loss计算。常见的算法有[L2-Softmax](https://arxiv.org/abs/1703.09507), [Large-margin Softmax](https://arxiv.org/abs/1612.02295), [Angular Softmax](https://arxiv.org/pdf/1704.08063.pdf), [NormFace](https://arxiv.org/abs/1704.06369), [AM-Softmax](https://arxiv.org/abs/1801.05599), [CosFace](https://arxiv.org/abs/1801.09414), [ArcFace](https://arxiv.org/abs/1801.07698)等。
这类方法也被称作proxy-based,因为其本质上优化的是样本和一组proxies之间的相似度。
### 2. Pairwise based:
这是一类基于样本对的学习范式。它以样本对作为输入,通过直接学习样本对之间的相似度来得到有效的特征表示,常见的算法包括:[Contrastive loss](http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf)、[Triplet loss](https://arxiv.org/abs/1503.03832)、[Lifted-Structure loss](https://arxiv.org/abs/1511.06452)、[N-pair loss](https://papers.nips.cc/paper/2016/file/6b180037abbebea991d8b1232f8a8ca9-Paper.pdf)、[Multi-Similarity loss](https://arxiv.org/pdf/1904.06627.pdf)等,下面给出其中 Triplet loss 的一个最小示意。
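示意代码如下(随机张量代替真实特征,embedding 维度与 margin 均为假设值,非 PaddleClas 中的实现):

```python
import paddle
import paddle.nn.functional as F

emb = paddle.nn.Linear(512, 128)                 # 变换函数 L 的最简形式:线性映射
f = lambda x: F.normalize(emb(x), axis=1)        # 映射后做 L2 归一化
anchor, pos, neg = (paddle.randn([4, 512]) for _ in range(3))
d_ap = ((f(anchor) - f(pos)) ** 2).sum(axis=1)   # 相似样本对的距离
d_an = ((f(anchor) - f(neg)) ** 2).sum(axis=1)   # 非相似样本对的距离
loss = F.relu(d_ap - d_an + 0.2).mean()          # 拉近相似对、推远非相似对,margin 取 0.2
```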
2020年发表的[CircleLoss](https://arxiv.org/abs/2002.10857),从一个全新的视角统一了两种学习范式,让研究人员和从业者对Metric Learning问题有了更进一步的思考。
...@@ -23,6 +23,7 @@ from . import backbone, gears ...@@ -23,6 +23,7 @@ from . import backbone, gears
from .backbone import * from .backbone import *
from .gears import build_gear from .gears import build_gear
from .utils import * from .utils import *
from ppcls.arch.backbone.base.theseus_layer import TheseusLayer
from ppcls.utils import logger from ppcls.utils import logger
from ppcls.utils.save_load import load_dygraph_pretrain from ppcls.utils.save_load import load_dygraph_pretrain
......
...@@ -21,6 +21,7 @@ from ppcls.arch.backbone.legendary_models.resnet import ResNet18, ResNet18_vd, R ...@@ -21,6 +21,7 @@ from ppcls.arch.backbone.legendary_models.resnet import ResNet18, ResNet18_vd, R
from ppcls.arch.backbone.legendary_models.vgg import VGG11, VGG13, VGG16, VGG19 from ppcls.arch.backbone.legendary_models.vgg import VGG11, VGG13, VGG16, VGG19
from ppcls.arch.backbone.legendary_models.inception_v3 import InceptionV3 from ppcls.arch.backbone.legendary_models.inception_v3 import InceptionV3
from ppcls.arch.backbone.legendary_models.hrnet import HRNet_W18_C, HRNet_W30_C, HRNet_W32_C, HRNet_W40_C, HRNet_W44_C, HRNet_W48_C, HRNet_W60_C, HRNet_W64_C, SE_HRNet_W64_C from ppcls.arch.backbone.legendary_models.hrnet import HRNet_W18_C, HRNet_W30_C, HRNet_W32_C, HRNet_W40_C, HRNet_W44_C, HRNet_W48_C, HRNet_W60_C, HRNet_W64_C, SE_HRNet_W64_C
from ppcls.arch.backbone.legendary_models.pp_lcnet import PPLCNet_x0_25, PPLCNet_x0_35, PPLCNet_x0_5, PPLCNet_x0_75, PPLCNet_x1_0, PPLCNet_x1_5, PPLCNet_x2_0, PPLCNet_x2_5
from ppcls.arch.backbone.model_zoo.resnet_vc import ResNet50_vc from ppcls.arch.backbone.model_zoo.resnet_vc import ResNet50_vc
from ppcls.arch.backbone.model_zoo.resnext import ResNeXt50_32x4d, ResNeXt50_64x4d, ResNeXt101_32x4d, ResNeXt101_64x4d, ResNeXt152_32x4d, ResNeXt152_64x4d from ppcls.arch.backbone.model_zoo.resnext import ResNeXt50_32x4d, ResNeXt50_64x4d, ResNeXt101_32x4d, ResNeXt101_64x4d, ResNeXt152_32x4d, ResNeXt152_64x4d
......
...@@ -38,24 +38,31 @@ class TheseusLayer(nn.Layer): ...@@ -38,24 +38,31 @@ class TheseusLayer(nn.Layer):
for layer_i in self._sub_layers: for layer_i in self._sub_layers:
layer_name = self._sub_layers[layer_i].full_name() layer_name = self._sub_layers[layer_i].full_name()
if isinstance(self._sub_layers[layer_i], (nn.Sequential, nn.LayerList)): if isinstance(self._sub_layers[layer_i], (nn.Sequential, nn.LayerList)):
self._sub_layers[layer_i] = wrap_theseus(self._sub_layers[layer_i]) self._sub_layers[layer_i] = wrap_theseus(self._sub_layers[layer_i], self.res_dict)
self._sub_layers[layer_i].res_dict = self.res_dict
self._sub_layers[layer_i].update_res(return_patterns) self._sub_layers[layer_i].update_res(return_patterns)
else: else:
for return_pattern in return_patterns: for return_pattern in return_patterns:
if re.match(return_pattern, layer_name): if re.match(return_pattern, layer_name):
if not isinstance(self._sub_layers[layer_i], TheseusLayer): if not isinstance(self._sub_layers[layer_i], TheseusLayer):
self._sub_layers[layer_i] = wrap_theseus(self._sub_layers[layer_i]) self._sub_layers[layer_i] = wrap_theseus(self._sub_layers[layer_i], self.res_dict)
else:
self._sub_layers[layer_i].res_dict = self.res_dict
self._sub_layers[layer_i].register_forward_post_hook( self._sub_layers[layer_i].register_forward_post_hook(
self._sub_layers[layer_i]._save_sub_res_hook) self._sub_layers[layer_i]._save_sub_res_hook)
self._sub_layers[layer_i].res_dict = self.res_dict if isinstance(self._sub_layers[layer_i], TheseusLayer):
if isinstance(self._sub_layers[layer_i], TheseusLayer): self._sub_layers[layer_i].res_dict = self.res_dict
self._sub_layers[layer_i].res_dict = self.res_dict self._sub_layers[layer_i].update_res(return_patterns)
self._sub_layers[layer_i].update_res(return_patterns)
def _save_sub_res_hook(self, layer, input, output): def _save_sub_res_hook(self, layer, input, output):
self.res_dict[layer.full_name()] = output self.res_dict[layer.full_name()] = output
def _return_dict_hook(self, layer, input, output):
res_dict = {"output": output}
for res_key in list(self.res_dict):
res_dict[res_key] = self.res_dict.pop(res_key)
return res_dict
def replace_sub(self, layer_name_pattern, replace_function, recursive=True): def replace_sub(self, layer_name_pattern, replace_function, recursive=True):
for layer_i in self._sub_layers: for layer_i in self._sub_layers:
layer_name = self._sub_layers[layer_i].full_name() layer_name = self._sub_layers[layer_i].full_name()
...@@ -85,10 +92,12 @@ class TheseusLayer(nn.Layer): ...@@ -85,10 +92,12 @@ class TheseusLayer(nn.Layer):
class WrapLayer(TheseusLayer): class WrapLayer(TheseusLayer):
def __init__(self, sub_layer): def __init__(self, sub_layer, res_dict=None):
super(WrapLayer, self).__init__() super(WrapLayer, self).__init__()
self.sub_layer = sub_layer self.sub_layer = sub_layer
self.name = sub_layer.full_name() self.name = sub_layer.full_name()
if res_dict is not None:
self.res_dict = res_dict
def full_name(self): def full_name(self):
return self.name return self.name
...@@ -101,14 +110,14 @@ class WrapLayer(TheseusLayer): ...@@ -101,14 +110,14 @@ class WrapLayer(TheseusLayer):
return return
for layer_i in self.sub_layer._sub_layers: for layer_i in self.sub_layer._sub_layers:
if isinstance(self.sub_layer._sub_layers[layer_i], (nn.Sequential, nn.LayerList)): if isinstance(self.sub_layer._sub_layers[layer_i], (nn.Sequential, nn.LayerList)):
self.sub_layer._sub_layers[layer_i] = wrap_theseus(self.sub_layer._sub_layers[layer_i]) self.sub_layer._sub_layers[layer_i] = wrap_theseus(self.sub_layer._sub_layers[layer_i], self.res_dict)
self.sub_layer._sub_layers[layer_i].res_dict = self.res_dict
self.sub_layer._sub_layers[layer_i].update_res(return_patterns) self.sub_layer._sub_layers[layer_i].update_res(return_patterns)
elif isinstance(self.sub_layer._sub_layers[layer_i], TheseusLayer):
self.sub_layer._sub_layers[layer_i].res_dict = self.res_dict
layer_name = self.sub_layer._sub_layers[layer_i].full_name() layer_name = self.sub_layer._sub_layers[layer_i].full_name()
for return_pattern in return_patterns: for return_pattern in return_patterns:
if re.match(return_pattern, layer_name): if re.match(return_pattern, layer_name):
self.sub_layer._sub_layers[layer_i].res_dict = self.res_dict
self.sub_layer._sub_layers[layer_i].register_forward_post_hook( self.sub_layer._sub_layers[layer_i].register_forward_post_hook(
self._sub_layers[layer_i]._save_sub_res_hook) self._sub_layers[layer_i]._save_sub_res_hook)
...@@ -116,6 +125,6 @@ class WrapLayer(TheseusLayer): ...@@ -116,6 +125,6 @@ class WrapLayer(TheseusLayer):
self.sub_layer._sub_layers[layer_i].update_res(return_patterns) self.sub_layer._sub_layers[layer_i].update_res(return_patterns)
def wrap_theseus(sub_layer): def wrap_theseus(sub_layer, res_dict=None):
wrapped_layer = WrapLayer(sub_layer) wrapped_layer = WrapLayer(sub_layer, res_dict)
return wrapped_layer return wrapped_layer
...@@ -367,7 +367,7 @@ class HRNet(TheseusLayer): ...@@ -367,7 +367,7 @@ class HRNet(TheseusLayer):
model: nn.Layer. Specific HRNet model depends on args. model: nn.Layer. Specific HRNet model depends on args.
""" """
def __init__(self, width=18, has_se=False, class_num=1000): def __init__(self, width=18, has_se=False, class_num=1000, return_patterns=None):
super().__init__() super().__init__()
self.width = width self.width = width
...@@ -456,8 +456,11 @@ class HRNet(TheseusLayer): ...@@ -456,8 +456,11 @@ class HRNet(TheseusLayer):
2048, 2048,
class_num, class_num,
weight_attr=ParamAttr(initializer=Uniform(-stdv, stdv))) weight_attr=ParamAttr(initializer=Uniform(-stdv, stdv)))
if return_patterns is not None:
self.update_res(return_patterns)
self.register_forward_post_hook(self._return_dict_hook)
def forward(self, x, res_dict=None): def forward(self, x):
x = self.conv_layer1_1(x) x = self.conv_layer1_1(x)
x = self.conv_layer1_2(x) x = self.conv_layer1_2(x)
......
...@@ -454,7 +454,7 @@ class Inception_V3(TheseusLayer): ...@@ -454,7 +454,7 @@ class Inception_V3(TheseusLayer):
model: nn.Layer. Specific Inception_V3 model depends on args. model: nn.Layer. Specific Inception_V3 model depends on args.
""" """
def __init__(self, config, class_num=1000): def __init__(self, config, class_num=1000, return_patterns=None):
super().__init__() super().__init__()
self.inception_a_list = config["inception_a"] self.inception_a_list = config["inception_a"]
...@@ -496,6 +496,9 @@ class Inception_V3(TheseusLayer): ...@@ -496,6 +496,9 @@ class Inception_V3(TheseusLayer):
class_num, class_num,
weight_attr=ParamAttr(initializer=Uniform(-stdv, stdv)), weight_attr=ParamAttr(initializer=Uniform(-stdv, stdv)),
bias_attr=ParamAttr()) bias_attr=ParamAttr())
if return_patterns is not None:
self.update_res(return_patterns)
self.register_forward_post_hook(self._return_dict_hook)
def forward(self, x): def forward(self, x):
x = self.inception_stem(x) x = self.inception_stem(x)
......
...@@ -102,7 +102,7 @@ class MobileNet(TheseusLayer): ...@@ -102,7 +102,7 @@ class MobileNet(TheseusLayer):
model: nn.Layer. Specific MobileNet model depends on args. model: nn.Layer. Specific MobileNet model depends on args.
""" """
def __init__(self, scale=1.0, class_num=1000): def __init__(self, scale=1.0, class_num=1000, return_patterns=None):
super().__init__() super().__init__()
self.scale = scale self.scale = scale
...@@ -145,6 +145,9 @@ class MobileNet(TheseusLayer): ...@@ -145,6 +145,9 @@ class MobileNet(TheseusLayer):
int(1024 * scale), int(1024 * scale),
class_num, class_num,
weight_attr=ParamAttr(initializer=KaimingNormal())) weight_attr=ParamAttr(initializer=KaimingNormal()))
if return_patterns is not None:
self.update_res(return_patterns)
self.register_forward_post_hook(self._return_dict_hook)
def forward(self, x): def forward(self, x):
x = self.conv(x) x = self.conv(x)
......
...@@ -142,7 +142,8 @@ class MobileNetV3(TheseusLayer): ...@@ -142,7 +142,8 @@ class MobileNetV3(TheseusLayer):
inplanes=STEM_CONV_NUMBER, inplanes=STEM_CONV_NUMBER,
class_squeeze=LAST_SECOND_CONV_LARGE, class_squeeze=LAST_SECOND_CONV_LARGE,
class_expand=LAST_CONV, class_expand=LAST_CONV,
dropout_prob=0.2): dropout_prob=0.2,
return_patterns=None):
super().__init__() super().__init__()
self.cfg = config self.cfg = config
...@@ -162,7 +163,7 @@ class MobileNetV3(TheseusLayer): ...@@ -162,7 +163,7 @@ class MobileNetV3(TheseusLayer):
if_act=True, if_act=True,
act="hardswish") act="hardswish")
self.blocks = nn.Sequential(*[ self.blocks = nn.Sequential(* [
ResidualUnit( ResidualUnit(
in_c=_make_divisible(self.inplanes * self.scale if i == 0 else in_c=_make_divisible(self.inplanes * self.scale if i == 0 else
self.cfg[i - 1][2] * self.scale), self.cfg[i - 1][2] * self.scale),
...@@ -199,6 +200,9 @@ class MobileNetV3(TheseusLayer): ...@@ -199,6 +200,9 @@ class MobileNetV3(TheseusLayer):
self.flatten = nn.Flatten(start_axis=1, stop_axis=-1) self.flatten = nn.Flatten(start_axis=1, stop_axis=-1)
self.fc = Linear(self.class_expand, class_num) self.fc = Linear(self.class_expand, class_num)
if return_patterns is not None:
self.update_res(return_patterns)
self.register_forward_post_hook(self._return_dict_hook)
def forward(self, x): def forward(self, x):
x = self.conv(x) x = self.conv(x)
......
# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import, division, print_function
import paddle
import paddle.nn as nn
from paddle import ParamAttr
from paddle.nn import AdaptiveAvgPool2D, BatchNorm, Conv2D, Dropout, Linear
from paddle.regularizer import L2Decay
from paddle.nn.initializer import KaimingNormal
from ppcls.arch.backbone.base.theseus_layer import TheseusLayer
from ppcls.utils.save_load import load_dygraph_pretrain, load_dygraph_pretrain_from_url
MODEL_URLS = {
"PPLCNet_x0_25":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_25_pretrained.pdparams",
"PPLCNet_x0_35":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_35_pretrained.pdparams",
"PPLCNet_x0_5":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_5_pretrained.pdparams",
"PPLCNet_x0_75":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams",
"PPLCNet_x1_0":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_0_pretrained.pdparams",
"PPLCNet_x1_5":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x1_5_pretrained.pdparams",
"PPLCNet_x2_0":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_0_pretrained.pdparams",
"PPLCNet_x2_5":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_5_pretrained.pdparams"
}
__all__ = list(MODEL_URLS.keys())
# Each element(list) represents a depthwise block, which is composed of k, in_c, out_c, s, use_se.
# k: kernel_size
# in_c: input channel number in depthwise block
# out_c: output channel number in depthwise block
# s: stride in depthwise block
# use_se: whether to use SE block
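# e.g. [3, 16, 32, 1, False]: a 3x3 depthwise conv over 16 channels, then a 1x1 pointwise conv to 32 channels, stride 1, no SE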
NET_CONFIG = {
"blocks2":
#k, in_c, out_c, s, use_se
[[3, 16, 32, 1, False]],
"blocks3": [[3, 32, 64, 2, False], [3, 64, 64, 1, False]],
"blocks4": [[3, 64, 128, 2, False], [3, 128, 128, 1, False]],
"blocks5": [[3, 128, 256, 2, False], [5, 256, 256, 1, False],
[5, 256, 256, 1, False], [5, 256, 256, 1, False],
[5, 256, 256, 1, False], [5, 256, 256, 1, False]],
"blocks6": [[5, 256, 512, 2, True], [5, 512, 512, 1, True]]
}
def make_divisible(v, divisor=8, min_value=None):
if min_value is None:
min_value = divisor
new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
if new_v < 0.9 * v:
new_v += divisor
return new_v
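# Worked examples with the default divisor of 8 (added for illustration):
#   make_divisible(16 * 0.25)  -> 8   (4 is rounded up to the minimum of 8)
#   make_divisible(16 * 0.875) -> 16  (14 rounds to the nearest multiple of 8)
#   make_divisible(36)         -> 40  (40 >= 0.9 * 36, so no downward correction)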
class ConvBNLayer(TheseusLayer):
def __init__(self,
num_channels,
filter_size,
num_filters,
stride,
num_groups=1):
super().__init__()
self.conv = Conv2D(
in_channels=num_channels,
out_channels=num_filters,
kernel_size=filter_size,
stride=stride,
padding=(filter_size - 1) // 2,
groups=num_groups,
weight_attr=ParamAttr(initializer=KaimingNormal()),
bias_attr=False)
self.bn = BatchNorm(
num_filters,
param_attr=ParamAttr(regularizer=L2Decay(0.0)),
bias_attr=ParamAttr(regularizer=L2Decay(0.0)))
self.hardswish = nn.Hardswish()
def forward(self, x):
x = self.conv(x)
x = self.bn(x)
x = self.hardswish(x)
return x
class DepthwiseSeparable(TheseusLayer):
def __init__(self,
num_channels,
num_filters,
stride,
dw_size=3,
use_se=False):
super().__init__()
self.use_se = use_se
self.dw_conv = ConvBNLayer(
num_channels=num_channels,
num_filters=num_channels,
filter_size=dw_size,
stride=stride,
num_groups=num_channels)
if use_se:
self.se = SEModule(num_channels)
self.pw_conv = ConvBNLayer(
num_channels=num_channels,
filter_size=1,
num_filters=num_filters,
stride=1)
def forward(self, x):
x = self.dw_conv(x)
if self.use_se:
x = self.se(x)
x = self.pw_conv(x)
return x
class SEModule(TheseusLayer):
def __init__(self, channel, reduction=4):
super().__init__()
self.avg_pool = AdaptiveAvgPool2D(1)
self.conv1 = Conv2D(
in_channels=channel,
out_channels=channel // reduction,
kernel_size=1,
stride=1,
padding=0)
self.relu = nn.ReLU()
self.conv2 = Conv2D(
in_channels=channel // reduction,
out_channels=channel,
kernel_size=1,
stride=1,
padding=0)
self.hardsigmoid = nn.Hardsigmoid()
def forward(self, x):
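        # squeeze: global average pooling reduces HxW to per-channel context;
        # excite: two 1x1 convs produce gates in (0, 1) that rescale the input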
identity = x
x = self.avg_pool(x)
x = self.conv1(x)
x = self.relu(x)
x = self.conv2(x)
x = self.hardsigmoid(x)
x = paddle.multiply(x=identity, y=x)
return x
class PPLCNet(TheseusLayer):
def __init__(self,
scale=1.0,
class_num=1000,
dropout_prob=0.2,
class_expand=1280):
super().__init__()
self.scale = scale
self.class_expand = class_expand
self.conv1 = ConvBNLayer(
num_channels=3,
filter_size=3,
num_filters=make_divisible(16 * scale),
stride=2)
self.blocks2 = nn.Sequential(*[
DepthwiseSeparable(
num_channels=make_divisible(in_c * scale),
num_filters=make_divisible(out_c * scale),
dw_size=k,
stride=s,
use_se=se)
            for (k, in_c, out_c, s, se) in NET_CONFIG["blocks2"]
])
self.blocks3 = nn.Sequential(*[
DepthwiseSeparable(
num_channels=make_divisible(in_c * scale),
num_filters=make_divisible(out_c * scale),
dw_size=k,
stride=s,
use_se=se)
            for (k, in_c, out_c, s, se) in NET_CONFIG["blocks3"]
])
self.blocks4 = nn.Sequential(*[
DepthwiseSeparable(
num_channels=make_divisible(in_c * scale),
num_filters=make_divisible(out_c * scale),
dw_size=k,
stride=s,
use_se=se)
            for (k, in_c, out_c, s, se) in NET_CONFIG["blocks4"]
])
self.blocks5 = nn.Sequential(*[
DepthwiseSeparable(
num_channels=make_divisible(in_c * scale),
num_filters=make_divisible(out_c * scale),
dw_size=k,
stride=s,
use_se=se)
            for (k, in_c, out_c, s, se) in NET_CONFIG["blocks5"]
])
self.blocks6 = nn.Sequential(*[
DepthwiseSeparable(
num_channels=make_divisible(in_c * scale),
num_filters=make_divisible(out_c * scale),
dw_size=k,
stride=s,
use_se=se)
            for (k, in_c, out_c, s, se) in NET_CONFIG["blocks6"]
])
self.avg_pool = AdaptiveAvgPool2D(1)
self.last_conv = Conv2D(
in_channels=make_divisible(NET_CONFIG["blocks6"][-1][2] * scale),
out_channels=self.class_expand,
kernel_size=1,
stride=1,
padding=0,
bias_attr=False)
self.hardswish = nn.Hardswish()
self.dropout = Dropout(p=dropout_prob, mode="downscale_in_infer")
self.flatten = nn.Flatten(start_axis=1, stop_axis=-1)
self.fc = Linear(self.class_expand, class_num)
def forward(self, x):
x = self.conv1(x)
x = self.blocks2(x)
x = self.blocks3(x)
x = self.blocks4(x)
x = self.blocks5(x)
x = self.blocks6(x)
x = self.avg_pool(x)
x = self.last_conv(x)
x = self.hardswish(x)
x = self.dropout(x)
x = self.flatten(x)
x = self.fc(x)
return x
def _load_pretrained(pretrained, model, model_url, use_ssld):
if pretrained is False:
pass
elif pretrained is True:
load_dygraph_pretrain_from_url(model, model_url, use_ssld=use_ssld)
elif isinstance(pretrained, str):
load_dygraph_pretrain(model, pretrained)
else:
        raise RuntimeError(
            "pretrained type is not supported. Please use `str` or `bool` type."
        )
def PPLCNet_x0_25(pretrained=False, use_ssld=False, **kwargs):
"""
PPLCNet_x0_25
Args:
        pretrained: bool or str, default False. If True, download and load the
                    pretrained parameters; if str, it is the path of a local
                    pretrained model.
        use_ssld: bool, default False. Whether to use the SSLD distillation
                  pretrained model when pretrained is True.
    Returns:
        model: nn.Layer. A `PPLCNet_x0_25` model configured by the args.
"""
model = PPLCNet(scale=0.25, **kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["PPLCNet_x0_25"], use_ssld)
return model
def PPLCNet_x0_35(pretrained=False, use_ssld=False, **kwargs):
"""
PPLCNet_x0_35
Args:
        pretrained: bool or str, default False. If True, download and load the
                    pretrained parameters; if str, it is the path of a local
                    pretrained model.
        use_ssld: bool, default False. Whether to use the SSLD distillation
                  pretrained model when pretrained is True.
    Returns:
        model: nn.Layer. A `PPLCNet_x0_35` model configured by the args.
"""
model = PPLCNet(scale=0.35, **kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["PPLCNet_x0_35"], use_ssld)
return model
def PPLCNet_x0_5(pretrained=False, use_ssld=False, **kwargs):
"""
PPLCNet_x0_5
Args:
        pretrained: bool or str, default False. If True, download and load the
                    pretrained parameters; if str, it is the path of a local
                    pretrained model.
        use_ssld: bool, default False. Whether to use the SSLD distillation
                  pretrained model when pretrained is True.
    Returns:
        model: nn.Layer. A `PPLCNet_x0_5` model configured by the args.
"""
model = PPLCNet(scale=0.5, **kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["PPLCNet_x0_5"], use_ssld)
return model
def PPLCNet_x0_75(pretrained=False, use_ssld=False, **kwargs):
"""
PPLCNet_x0_75
Args:
        pretrained: bool or str, default False. If True, download and load the
                    pretrained parameters; if str, it is the path of a local
                    pretrained model.
        use_ssld: bool, default False. Whether to use the SSLD distillation
                  pretrained model when pretrained is True.
    Returns:
        model: nn.Layer. A `PPLCNet_x0_75` model configured by the args.
"""
model = PPLCNet(scale=0.75, **kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["PPLCNet_x0_75"], use_ssld)
return model
def PPLCNet_x1_0(pretrained=False, use_ssld=False, **kwargs):
"""
PPLCNet_x1_0
Args:
        pretrained: bool or str, default False. If True, download and load the
                    pretrained parameters; if str, it is the path of a local
                    pretrained model.
        use_ssld: bool, default False. Whether to use the SSLD distillation
                  pretrained model when pretrained is True.
    Returns:
        model: nn.Layer. A `PPLCNet_x1_0` model configured by the args.
"""
model = PPLCNet(scale=1.0, **kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["PPLCNet_x1_0"], use_ssld)
return model
def PPLCNet_x1_5(pretrained=False, use_ssld=False, **kwargs):
"""
PPLCNet_x1_5
Args:
        pretrained: bool or str, default False. If True, download and load the
                    pretrained parameters; if str, it is the path of a local
                    pretrained model.
        use_ssld: bool, default False. Whether to use the SSLD distillation
                  pretrained model when pretrained is True.
    Returns:
        model: nn.Layer. A `PPLCNet_x1_5` model configured by the args.
"""
model = PPLCNet(scale=1.5, **kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["PPLCNet_x1_5"], use_ssld)
return model
def PPLCNet_x2_0(pretrained=False, use_ssld=False, **kwargs):
"""
PPLCNet_x2_0
Args:
        pretrained: bool or str, default False. If True, download and load the
                    pretrained parameters; if str, it is the path of a local
                    pretrained model.
        use_ssld: bool, default False. Whether to use the SSLD distillation
                  pretrained model when pretrained is True.
    Returns:
        model: nn.Layer. A `PPLCNet_x2_0` model configured by the args.
"""
model = PPLCNet(scale=2.0, **kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["PPLCNet_x2_0"], use_ssld)
return model
def PPLCNet_x2_5(pretrained=False, use_ssld=False, **kwargs):
"""
PPLCNet_x2_5
Args:
        pretrained: bool or str, default False. If True, download and load the
                    pretrained parameters; if str, it is the path of a local
                    pretrained model.
        use_ssld: bool, default False. Whether to use the SSLD distillation
                  pretrained model when pretrained is True.
    Returns:
        model: nn.Layer. A `PPLCNet_x2_5` model configured by the args.
"""
model = PPLCNet(scale=2.5, **kwargs)
_load_pretrained(pretrained, model, MODEL_URLS["PPLCNet_x2_5"], use_ssld)
return model
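A smoke-test sketch for the new backbone (not part of the upstream file): build the x1.0 variant without pretrained weights and push a random batch through it.

if __name__ == "__main__":
    import paddle
    net = PPLCNet_x1_0(pretrained=False)
    logits = net(paddle.rand([1, 3, 224, 224]))
    print(logits.shape)  # [1, 1000] with the default class_num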
...@@ -269,7 +269,8 @@ class ResNet(TheseusLayer): ...@@ -269,7 +269,8 @@ class ResNet(TheseusLayer):
class_num=1000, class_num=1000,
lr_mult_list=[1.0, 1.0, 1.0, 1.0, 1.0], lr_mult_list=[1.0, 1.0, 1.0, 1.0, 1.0],
data_format="NCHW", data_format="NCHW",
input_image_channel=3): input_image_channel=3,
return_patterns=None):
super().__init__() super().__init__()
self.cfg = config self.cfg = config
...@@ -337,6 +338,9 @@ class ResNet(TheseusLayer): ...@@ -337,6 +338,9 @@ class ResNet(TheseusLayer):
weight_attr=ParamAttr(initializer=Uniform(-stdv, stdv))) weight_attr=ParamAttr(initializer=Uniform(-stdv, stdv)))
self.data_format = data_format self.data_format = data_format
if return_patterns is not None:
self.update_res(return_patterns)
self.register_forward_post_hook(self._return_dict_hook)
def forward(self, x): def forward(self, x):
with paddle.static.amp.fp16_guard(): with paddle.static.amp.fp16_guard():
......
...@@ -137,8 +137,11 @@ class VGGNet(TheseusLayer): ...@@ -137,8 +137,11 @@ class VGGNet(TheseusLayer):
self.fc1 = Linear(7 * 7 * 512, 4096) self.fc1 = Linear(7 * 7 * 512, 4096)
self.fc2 = Linear(4096, 4096) self.fc2 = Linear(4096, 4096)
self.fc3 = Linear(4096, class_num) self.fc3 = Linear(4096, class_num)
if return_patterns is not None:
self.update_res(return_patterns)
self.register_forward_post_hook(self._return_dict_hook)
def forward(self, inputs, res_dict=None): def forward(self, inputs):
x = self.conv_block_1(inputs) x = self.conv_block_1(inputs)
x = self.conv_block_2(x) x = self.conv_block_2(x)
x = self.conv_block_3(x) x = self.conv_block_3(x)
...@@ -152,9 +155,6 @@ class VGGNet(TheseusLayer): ...@@ -152,9 +155,6 @@ class VGGNet(TheseusLayer):
x = self.relu(x) x = self.relu(x)
x = self.drop(x) x = self.drop(x)
x = self.fc3(x) x = self.fc3(x)
if self.res_dict and res_dict is not None:
for res_key in list(self.res_dict):
res_dict[res_key] = self.res_dict.pop(res_key)
return x return x
......
...@@ -193,9 +193,11 @@ class Block(nn.Layer): ...@@ -193,9 +193,11 @@ class Block(nn.Layer):
self.drop_path(self.mlp_in(self.norm_mlp_in(pixel_embed)))) self.drop_path(self.mlp_in(self.norm_mlp_in(pixel_embed))))
# outer # outer
B, N, C = patch_embed.shape B, N, C = patch_embed.shape
patch_embed[:, 1:] = paddle.add( norm1_proj = self.norm1_proj(pixel_embed)
patch_embed[:, 1:], norm1_proj = norm1_proj.reshape(
self.proj(self.norm1_proj(pixel_embed).reshape((B, N - 1, -1)))) (B, N - 1, norm1_proj.shape[1] * norm1_proj.shape[2]))
patch_embed[:, 1:] = paddle.add(patch_embed[:, 1:],
self.proj(norm1_proj))
patch_embed = paddle.add( patch_embed = paddle.add(
patch_embed, patch_embed,
self.drop_path(self.attn_out(self.norm_out(patch_embed)))) self.drop_path(self.attn_out(self.norm_out(patch_embed))))
...@@ -328,7 +330,7 @@ class TNT(nn.Layer): ...@@ -328,7 +330,7 @@ class TNT(nn.Layer):
ones_(m.weight) ones_(m.weight)
def forward_features(self, x): def forward_features(self, x):
B = x.shape[0] B = paddle.shape(x)[0]
pixel_embed = self.pixel_embed(x, self.pixel_pos) pixel_embed = self.pixel_embed(x, self.pixel_pos)
patch_embed = self.norm2_proj( patch_embed = self.norm2_proj(
......
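Both TNT changes are export-friendly rewrites with the same semantics: the slice assignment on `patch_embed` now goes through an explicit reshape whose inner dimensions are read from the tensor itself, and the batch size is obtained via `paddle.shape` so it stays symbolic under dynamic-to-static conversion. A minimal illustration of the second point (eager-mode values shown for concreteness):

import paddle

x = paddle.rand([2, 3, 4, 4])
b_py = x.shape[0]            # a plain Python int in dygraph; may freeze to -1 after export
b_dyn = paddle.shape(x)[0]   # a tensor that tracks the true batch size at runtime
print(b_py, int(b_dyn))      # 2 2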
...@@ -24,30 +24,25 @@ class ArcMargin(nn.Layer): ...@@ -24,30 +24,25 @@ class ArcMargin(nn.Layer):
margin=0.5, margin=0.5,
scale=80.0, scale=80.0,
easy_margin=False): easy_margin=False):
super(ArcMargin, self).__init__() super().__init__()
self.embedding_size = embedding_size self.embedding_size = embedding_size
self.class_num = class_num self.class_num = class_num
self.margin = margin self.margin = margin
self.scale = scale self.scale = scale
self.easy_margin = easy_margin self.easy_margin = easy_margin
self.weight = self.create_parameter(
weight_attr = paddle.ParamAttr( shape=[self.embedding_size, self.class_num],
initializer=paddle.nn.initializer.XavierNormal()) is_bias=False,
self.fc = nn.Linear( default_initializer=paddle.nn.initializer.XavierNormal())
self.embedding_size,
self.class_num,
weight_attr=weight_attr,
bias_attr=False)
def forward(self, input, label=None): def forward(self, input, label=None):
input_norm = paddle.sqrt( input_norm = paddle.sqrt(
paddle.sum(paddle.square(input), axis=1, keepdim=True)) paddle.sum(paddle.square(input), axis=1, keepdim=True))
input = paddle.divide(input, input_norm) input = paddle.divide(input, input_norm)
weight = self.fc.weight
weight_norm = paddle.sqrt( weight_norm = paddle.sqrt(
paddle.sum(paddle.square(weight), axis=0, keepdim=True)) paddle.sum(paddle.square(self.weight), axis=0, keepdim=True))
weight = paddle.divide(weight, weight_norm) weight = paddle.divide(self.weight, weight_norm)
cos = paddle.matmul(input, weight) cos = paddle.matmul(input, weight)
if not self.training or label is None: if not self.training or label is None:
......
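The refactor stores the class-weight matrix directly with `create_parameter` instead of wrapping it in an `nn.Linear` whose forward is never called; the computation is unchanged. With L2-normalized features and weight columns, the logits are cosine similarities, and ArcFace adds an angular margin on the target class (the margin branch is truncated from this hunk):

$$\cos\theta_j=\frac{x^{\top}w_j}{\lVert x\rVert_2\,\lVert w_j\rVert_2},\qquad \mathrm{logit}_j=s\cdot\cos\bigl(\theta_j+m\,\mathbb{1}[j=y]\bigr)$$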
...@@ -26,20 +26,19 @@ class CircleMargin(nn.Layer): ...@@ -26,20 +26,19 @@ class CircleMargin(nn.Layer):
self.embedding_size = embedding_size self.embedding_size = embedding_size
self.class_num = class_num self.class_num = class_num
weight_attr = paddle.ParamAttr( self.weight = self.create_parameter(
initializer=paddle.nn.initializer.XavierNormal()) shape=[self.embedding_size, self.class_num],
self.fc = paddle.nn.Linear( is_bias=False,
self.embedding_size, self.class_num, weight_attr=weight_attr) default_initializer=paddle.nn.initializer.XavierNormal())
def forward(self, input, label): def forward(self, input, label):
feat_norm = paddle.sqrt( feat_norm = paddle.sqrt(
paddle.sum(paddle.square(input), axis=1, keepdim=True)) paddle.sum(paddle.square(input), axis=1, keepdim=True))
input = paddle.divide(input, feat_norm) input = paddle.divide(input, feat_norm)
weight = self.fc.weight
weight_norm = paddle.sqrt( weight_norm = paddle.sqrt(
paddle.sum(paddle.square(weight), axis=0, keepdim=True)) paddle.sum(paddle.square(self.weight), axis=0, keepdim=True))
weight = paddle.divide(weight, weight_norm) weight = paddle.divide(self.weight, weight_norm)
logits = paddle.matmul(input, weight) logits = paddle.matmul(input, weight)
if not self.training or label is None: if not self.training or label is None:
...@@ -49,9 +48,9 @@ class CircleMargin(nn.Layer): ...@@ -49,9 +48,9 @@ class CircleMargin(nn.Layer):
alpha_n = paddle.clip(logits.detach() + self.margin, min=0.) alpha_n = paddle.clip(logits.detach() + self.margin, min=0.)
delta_p = 1 - self.margin delta_p = 1 - self.margin
delta_n = self.margin delta_n = self.margin
m_hot = F.one_hot(label.reshape([-1]), num_classes=logits.shape[1]) m_hot = F.one_hot(label.reshape([-1]), num_classes=logits.shape[1])
logits_p = alpha_p * (logits - delta_p) logits_p = alpha_p * (logits - delta_p)
logits_n = alpha_n * (logits - delta_n) logits_n = alpha_n * (logits - delta_n)
pre_logits = logits_p * m_hot + logits_n * (1 - m_hot) pre_logits = logits_p * m_hot + logits_n * (1 - m_hot)
......
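The tail of this hunk is the Circle-loss re-weighting: each similarity is pushed toward its optimum with a self-paced coefficient (`alpha_p`, defined in the line truncated just above `alpha_n`, is clipped at zero in the same fashion). In the notation of the code:

$$\alpha_p=\bigl[1+m-s\bigr]_+,\quad \alpha_n=\bigl[s+m\bigr]_+,\quad \Delta_p=1-m,\quad \Delta_n=m$$
$$\mathrm{pre\_logits}=\alpha_p\,(s-\Delta_p)\cdot m_{hot}+\alpha_n\,(s-\Delta_n)\cdot(1-m_{hot})$$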
...@@ -25,13 +25,10 @@ class CosMargin(paddle.nn.Layer): ...@@ -25,13 +25,10 @@ class CosMargin(paddle.nn.Layer):
self.embedding_size = embedding_size self.embedding_size = embedding_size
self.class_num = class_num self.class_num = class_num
weight_attr = paddle.ParamAttr( self.weight = self.create_parameter(
initializer=paddle.nn.initializer.XavierNormal()) shape=[self.embedding_size, self.class_num],
self.fc = nn.Linear( is_bias=False,
self.embedding_size, default_initializer=paddle.nn.initializer.XavierNormal())
self.class_num,
weight_attr=weight_attr,
bias_attr=False)
def forward(self, input, label): def forward(self, input, label):
label.stop_gradient = True label.stop_gradient = True
...@@ -40,15 +37,14 @@ class CosMargin(paddle.nn.Layer): ...@@ -40,15 +37,14 @@ class CosMargin(paddle.nn.Layer):
paddle.sum(paddle.square(input), axis=1, keepdim=True)) paddle.sum(paddle.square(input), axis=1, keepdim=True))
input = paddle.divide(input, input_norm) input = paddle.divide(input, input_norm)
weight = self.fc.weight
weight_norm = paddle.sqrt( weight_norm = paddle.sqrt(
paddle.sum(paddle.square(weight), axis=0, keepdim=True)) paddle.sum(paddle.square(self.weight), axis=0, keepdim=True))
weight = paddle.divide(weight, weight_norm) weight = paddle.divide(self.weight, weight_norm)
cos = paddle.matmul(input, weight) cos = paddle.matmul(input, weight)
if not self.training or label is None: if not self.training or label is None:
return cos return cos
cos_m = cos - self.margin cos_m = cos - self.margin
one_hot = paddle.nn.functional.one_hot(label, self.class_num) one_hot = paddle.nn.functional.one_hot(label, self.class_num)
......
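CosMargin is the additive-cosine (CosFace-style) variant: the margin is subtracted from the target-class cosine before scaling, which is what the `cos_m` and `one_hot` lines assemble (the scaling step is truncated from this hunk):

$$\mathrm{logit}_j=s\bigl(\cos\theta_j-m\,\mathbb{1}[j=y]\bigr)$$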
...@@ -34,9 +34,8 @@ Optimizer: ...@@ -34,9 +34,8 @@ Optimizer:
momentum: 0.9 momentum: 0.9
lr: lr:
name: Piecewise name: Piecewise
learning_rate: 0.01
decay_epochs: [30, 60, 90] decay_epochs: [30, 60, 90]
values: [0.1, 0.01, 0.001, 0.0001] values: [0.01, 0.001, 0.0001, 0.00001]
regularizer: regularizer:
name: 'L2' name: 'L2'
coeff: 0.0001 coeff: 0.0001
......
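The change above removes the stray `learning_rate` key (the piecewise values themselves define the schedule) and shifts the step values an order of magnitude down so the run starts at 0.01. In raw Paddle API terms the new block corresponds roughly to (a sketch, not the PaddleClas builder itself):

import paddle

# values must have exactly len(boundaries) + 1 entries
sched = paddle.optimizer.lr.PiecewiseDecay(
    boundaries=[30, 60, 90],
    values=[0.01, 0.001, 0.0001, 0.00001])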
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -22,25 +22,27 @@ Arch: ...@@ -22,25 +22,27 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: norm cls_token pos_embed dist_token
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 1e-3
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 5
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
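This swaps the Momentum-plus-piecewise recipe for the DeiT-style one: label-smoothed MixCELoss, AdamW with weight decay disabled for norm/token parameters, and a 5-epoch warmup into a cosine decay from 1e-3 down to 1e-5. The lr block maps roughly onto Paddle's schedulers as follows (a sketch; the PaddleClas builder additionally works in steps rather than epochs):

import paddle

cosine = paddle.optimizer.lr.CosineAnnealingDecay(
    learning_rate=1e-3, T_max=300, eta_min=1e-5)
sched = paddle.optimizer.lr.LinearWarmup(
    learning_rate=cosine, warmup_steps=5, start_lr=1e-6, end_lr=1e-3)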
# data loader for train and eval # data loader for train and eval
DataLoader: DataLoader:
...@@ -55,17 +57,38 @@ DataLoader: ...@@ -55,17 +57,38 @@ DataLoader:
channel_first: False channel_first: False
- RandCropImage: - RandCropImage:
size: 224 size: 224
interpolation: bicubic
backend: pil
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 256
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -83,6 +106,8 @@ DataLoader: ...@@ -83,6 +106,8 @@ DataLoader:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -92,7 +117,7 @@ DataLoader: ...@@ -92,7 +117,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 256
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -108,6 +133,8 @@ Infer: ...@@ -108,6 +133,8 @@ Infer:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -122,9 +149,6 @@ Infer: ...@@ -122,9 +149,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
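The new `batch_transform_ops` entry makes `OpSampler` pick exactly one of Mixup or Cutmix for every batch (the probabilities sum to 1), which is why the train loss above becomes MixCELoss. A minimal Mixup sketch under the same `alpha = 0.8` convention (illustrative only; PaddleClas ships its own operator):

import paddle

def mixup(images, labels, alpha=0.8):
    # blend each sample with a randomly chosen partner from the same batch,
    # with a coefficient drawn from Beta(alpha, alpha)
    lam = float(paddle.distribution.Beta(alpha, alpha).sample([1]))
    idx = paddle.randperm(images.shape[0])
    mixed = lam * images + (1.0 - lam) * images[idx]
    # the loss is then lam * CE(pred, labels) + (1 - lam) * CE(pred, labels[idx])
    return mixed, labels, labels[idx], lam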
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -22,25 +22,27 @@ Arch: ...@@ -22,25 +22,27 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: norm cls_token pos_embed dist_token
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 1e-3
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 5
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
DataLoader: DataLoader:
...@@ -54,18 +56,39 @@ DataLoader: ...@@ -54,18 +56,39 @@ DataLoader:
to_rgb: True to_rgb: True
channel_first: False channel_first: False
- RandCropImage: - RandCropImage:
size: 384 size: 384
interpolation: bicubic
backend: pil
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 384
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 256
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -82,7 +105,9 @@ DataLoader: ...@@ -82,7 +105,9 @@ DataLoader:
to_rgb: True to_rgb: True
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 426 resize_short: 438
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 384 size: 384
- NormalizeImage: - NormalizeImage:
...@@ -92,7 +117,7 @@ DataLoader: ...@@ -92,7 +117,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 256
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -107,7 +132,9 @@ Infer: ...@@ -107,7 +132,9 @@ Infer:
to_rgb: True to_rgb: True
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 426 resize_short: 438
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 384 size: 384
- NormalizeImage: - NormalizeImage:
...@@ -122,9 +149,6 @@ Infer: ...@@ -122,9 +149,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -22,25 +22,27 @@ Arch: ...@@ -22,25 +22,27 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: norm cls_token pos_embed dist_token
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 1e-3
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 5
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
DataLoader: DataLoader:
...@@ -55,17 +57,38 @@ DataLoader: ...@@ -55,17 +57,38 @@ DataLoader:
channel_first: False channel_first: False
- RandCropImage: - RandCropImage:
size: 224 size: 224
interpolation: bicubic
backend: pil
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 256
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -83,6 +106,8 @@ DataLoader: ...@@ -83,6 +106,8 @@ DataLoader:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -92,7 +117,7 @@ DataLoader: ...@@ -92,7 +117,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 256
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -108,6 +133,8 @@ Infer: ...@@ -108,6 +133,8 @@ Infer:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -122,9 +149,6 @@ Infer: ...@@ -122,9 +149,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -22,25 +22,27 @@ Arch: ...@@ -22,25 +22,27 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: norm cls_token pos_embed dist_token
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 1e-3
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 5
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
DataLoader: DataLoader:
...@@ -54,18 +56,39 @@ DataLoader: ...@@ -54,18 +56,39 @@ DataLoader:
to_rgb: True to_rgb: True
channel_first: False channel_first: False
- RandCropImage: - RandCropImage:
size: 384 size: 384
interpolation: bicubic
backend: pil
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 384
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 256
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -82,7 +105,9 @@ DataLoader: ...@@ -82,7 +105,9 @@ DataLoader:
to_rgb: True to_rgb: True
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 426 resize_short: 438
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 384 size: 384
- NormalizeImage: - NormalizeImage:
...@@ -92,7 +117,7 @@ DataLoader: ...@@ -92,7 +117,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 256
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -107,7 +132,9 @@ Infer: ...@@ -107,7 +132,9 @@ Infer:
to_rgb: True to_rgb: True
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 426 resize_short: 438
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 384 size: 384
- NormalizeImage: - NormalizeImage:
...@@ -122,9 +149,6 @@ Infer: ...@@ -122,9 +149,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -22,25 +22,27 @@ Arch: ...@@ -22,25 +22,27 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: norm cls_token pos_embed dist_token
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 1e-3
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 5
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
DataLoader: DataLoader:
...@@ -55,17 +57,38 @@ DataLoader: ...@@ -55,17 +57,38 @@ DataLoader:
channel_first: False channel_first: False
- RandCropImage: - RandCropImage:
size: 224 size: 224
interpolation: bicubic
backend: pil
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 256
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -83,6 +106,8 @@ DataLoader: ...@@ -83,6 +106,8 @@ DataLoader:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -92,7 +117,7 @@ DataLoader: ...@@ -92,7 +117,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 256
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -108,6 +133,8 @@ Infer: ...@@ -108,6 +133,8 @@ Infer:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -122,9 +149,6 @@ Infer: ...@@ -122,9 +149,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -22,25 +22,27 @@ Arch: ...@@ -22,25 +22,27 @@ Arch:
# loss function config for traing/eval process # loss function config for traing/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: norm cls_token pos_embed dist_token
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 1e-3
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 5
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
DataLoader: DataLoader:
...@@ -55,17 +57,38 @@ DataLoader: ...@@ -55,17 +57,38 @@ DataLoader:
channel_first: False channel_first: False
- RandCropImage: - RandCropImage:
size: 224 size: 224
interpolation: bicubic
backend: pil
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 256
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -83,6 +106,8 @@ DataLoader: ...@@ -83,6 +106,8 @@ DataLoader:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -92,7 +117,7 @@ DataLoader: ...@@ -92,7 +117,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 256
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -108,6 +133,8 @@ Infer: ...@@ -108,6 +133,8 @@ Infer:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -122,9 +149,6 @@ Infer: ...@@ -122,9 +149,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -22,25 +22,27 @@ Arch: ...@@ -22,25 +22,27 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: norm cls_token pos_embed dist_token
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 1e-3
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 5
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
DataLoader: DataLoader:
...@@ -55,17 +57,38 @@ DataLoader: ...@@ -55,17 +57,38 @@ DataLoader:
channel_first: False channel_first: False
- RandCropImage: - RandCropImage:
size: 224 size: 224
interpolation: bicubic
backend: pil
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 256
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -83,6 +106,8 @@ DataLoader: ...@@ -83,6 +106,8 @@ DataLoader:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -92,7 +117,7 @@ DataLoader: ...@@ -92,7 +117,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 256
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -108,6 +133,8 @@ Infer: ...@@ -108,6 +133,8 @@ Infer:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -122,9 +149,6 @@ Infer: ...@@ -122,9 +149,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -22,25 +22,27 @@ Arch: ...@@ -22,25 +22,27 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: norm cls_token pos_embed dist_token
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 1e-3
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 5
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
DataLoader: DataLoader:
...@@ -55,17 +57,38 @@ DataLoader: ...@@ -55,17 +57,38 @@ DataLoader:
channel_first: False channel_first: False
- RandCropImage: - RandCropImage:
size: 224 size: 224
interpolation: bicubic
backend: pil
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 256
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -83,6 +106,8 @@ DataLoader: ...@@ -83,6 +106,8 @@ DataLoader:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -92,7 +117,7 @@ DataLoader: ...@@ -92,7 +117,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 256
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -108,6 +133,8 @@ Infer: ...@@ -108,6 +133,8 @@ Infer:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -122,9 +149,6 @@ Infer: ...@@ -122,9 +149,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 200 epochs: 360
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -38,6 +38,7 @@ Optimizer: ...@@ -38,6 +38,7 @@ Optimizer:
lr: lr:
name: Cosine name: Cosine
learning_rate: 0.032 learning_rate: 0.032
warmup_epoch: 5
regularizer: regularizer:
name: 'L2' name: 'L2'
coeff: 0.00001 coeff: 0.00001
...@@ -67,7 +68,7 @@ DataLoader: ...@@ -67,7 +68,7 @@ DataLoader:
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
......
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: EfficientNetB1
class_num: 1000
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: RMSProp
momentum: 0.9
rho: 0.9
epsilon: 0.001
lr:
name: Cosine
learning_rate: 0.032
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00001
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 240
- RandFlipImage:
flip_code: 1
- AutoAugment:
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 272
- CropImage:
size: 240
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
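Each EfficientNet config follows the same pattern: the model's native train resolution (240 for B1 up through 600 for B7), an eval resize 32 pixels larger, AutoAugment, and an RMSProp/cosine recipe with a 5-epoch warmup. The optimizer block corresponds roughly to (a sketch; warmup and per-step scaling live in the PaddleClas builder, and `model` is assumed to exist):

import paddle

sched = paddle.optimizer.lr.CosineAnnealingDecay(learning_rate=0.032, T_max=360)
opt = paddle.optimizer.RMSProp(
    learning_rate=sched, rho=0.9, epsilon=0.001, momentum=0.9,
    weight_decay=0.00001, parameters=model.parameters())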
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: EfficientNetB2
class_num: 1000
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: RMSProp
momentum: 0.9
rho: 0.9
epsilon: 0.001
lr:
name: Cosine
learning_rate: 0.032
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00001
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 260
- RandFlipImage:
flip_code: 1
- AutoAugment:
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 292
- CropImage:
size: 260
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: EfficientNetB3
class_num: 1000
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: RMSProp
momentum: 0.9
rho: 0.9
epsilon: 0.001
lr:
name: Cosine
learning_rate: 0.032
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00001
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 300
- RandFlipImage:
flip_code: 1
- AutoAugment:
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 332
- CropImage:
size: 300
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: EfficientNetB4
class_num: 1000
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: RMSProp
momentum: 0.9
rho: 0.9
epsilon: 0.001
lr:
name: Cosine
learning_rate: 0.032
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00001
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 380
- RandFlipImage:
flip_code: 1
- AutoAugment:
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 412
- CropImage:
size: 380
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: EfficientNetB5
class_num: 1000
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: RMSProp
momentum: 0.9
rho: 0.9
epsilon: 0.001
lr:
name: Cosine
learning_rate: 0.032
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00001
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 456
- RandFlipImage:
flip_code: 1
- AutoAugment:
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 488
- CropImage:
size: 456
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: EfficientNetB6
class_num: 1000
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: RMSProp
momentum: 0.9
rho: 0.9
epsilon: 0.001
lr:
name: Cosine
learning_rate: 0.032
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00001
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 528
- RandFlipImage:
flip_code: 1
- AutoAugment:
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 560
- CropImage:
size: 528
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: EfficientNetB7
class_num: 1000
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: RMSProp
momentum: 0.9
rho: 0.9
epsilon: 0.001
lr:
name: Cosine
learning_rate: 0.032
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00001
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 600
- RandFlipImage:
flip_code: 1
- AutoAugment:
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 632
- CropImage:
size: 600
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
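Across the three EfficientNet configs above, the train-time RandCropImage size steps up from 456 (B5) to 528 (B6) to 600 (B7), and the eval pipeline resizes the short side to the train crop plus 32 pixels (488/560/632) before center-cropping back to the train resolution. A minimal Python sketch of that relation (the +32 margin is inferred from these three files, not documented anywhere):

# Sketch: eval-time resize rule observed in the EfficientNet B5/B6/B7
# configs above. Assumption: the "+32 short-side margin" is inferred
# from the pairs 456->488, 528->560, 600->632; it is not documented.
def eval_resize_short(train_crop: int, margin: int = 32) -> int:
    return train_crop + margin

for name, crop in [("B5", 456), ("B6", 528), ("B7", 600)]:
    print(name, crop, "->", eval_resize_short(crop))  # 488, 560, 632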
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
class_num: 1000
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: PPLCNet_x0_25
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.8
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00003
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
class_num: 1000
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: PPLCNet_x0_35
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.8
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00003
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
class_num: 1000
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: PPLCNet_x0_5
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.8
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00003
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
class_num: 1000
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: PPLCNet_x0_75
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.8
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00003
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
class_num: 1000
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: PPLCNet_x1_0
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.8
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00003
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
class_num: 1000
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: PPLCNet_x1_5
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.8
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00004
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
class_num: 1000
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: PPLCNet_x2_0
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.8
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00004
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
class_num: 1000
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# model architecture
Arch:
name: PPLCNet_x2_5
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.8
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00004
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- AutoAugment:
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 512
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
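The eight PPLCNet configs above are identical apart from Arch.name, the L2 coefficient (3e-5 for x0_25 through x1_0, 4e-5 for x1_5 through x2_5) and the extra AutoAugment op in the x2_5 train pipeline; all of them train at batch_size 512 with learning_rate 0.8. If you retrain at a different batch size, a common heuristic (not stated in these files) is to scale the learning rate linearly. A minimal sketch, assuming a local copy of one of these YAML files saved as PPLCNet_x1_0.yaml:

import yaml  # pip install pyyaml

with open("PPLCNet_x1_0.yaml") as f:  # hypothetical local copy of the config above
    cfg = yaml.safe_load(f)

base_bs = cfg["DataLoader"]["Train"]["sampler"]["batch_size"]   # 512
base_lr = cfg["Optimizer"]["lr"]["learning_rate"]               # 0.8

new_bs = 256
cfg["DataLoader"]["Train"]["sampler"]["batch_size"] = new_bs
cfg["Optimizer"]["lr"]["learning_rate"] = base_lr * new_bs / base_bs  # 0.4

with open("PPLCNet_x1_0_bs256.yaml", "w") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)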
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -24,24 +24,28 @@ Arch: ...@@ -24,24 +24,28 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: absolute_pos_embed relative_position_bias_table .bias norm
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 5e-4
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 20
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
...@@ -59,15 +63,35 @@ DataLoader: ...@@ -59,15 +63,35 @@ DataLoader:
size: 384 size: 384
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 384
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -84,7 +108,9 @@ DataLoader: ...@@ -84,7 +108,9 @@ DataLoader:
to_rgb: True to_rgb: True
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
size: [384, 384] resize_short: 438
- CropImage:
size: 384
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
...@@ -92,7 +118,7 @@ DataLoader: ...@@ -92,7 +118,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -107,7 +133,9 @@ Infer: ...@@ -107,7 +133,9 @@ Infer:
to_rgb: True to_rgb: True
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
size: [384, 384] resize_short: 438
- CropImage:
size: 384
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
...@@ -120,9 +148,6 @@ Infer: ...@@ -120,9 +148,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
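In the diff above, eval preprocessing moves from a direct [384, 384] resize to resize_short 438 followed by a 384 center crop. The 438 matches the usual 0.875 center-crop ratio (384 / 0.875 ≈ 438.9, truncated), the same convention behind the 256 -> 224 pair used by the 224-resolution configs. A quick sanity check:

# Sketch: the resize_short values follow the common 0.875 center-crop ratio.
for crop in (224, 384):
    print(crop, "->", int(crop / 0.875))  # 224 -> 256, 384 -> 438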
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -24,24 +24,28 @@ Arch: ...@@ -24,24 +24,28 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: absolute_pos_embed relative_position_bias_table .bias norm
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 5e-4
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 20
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
...@@ -59,15 +63,35 @@ DataLoader: ...@@ -59,15 +63,35 @@ DataLoader:
size: 224 size: 224
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -94,7 +118,7 @@ DataLoader: ...@@ -94,7 +118,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -124,9 +148,6 @@ Infer: ...@@ -124,9 +148,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -24,24 +24,28 @@ Arch: ...@@ -24,24 +24,28 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: absolute_pos_embed relative_position_bias_table .bias norm
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 5e-4
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 20
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
...@@ -59,15 +63,35 @@ DataLoader: ...@@ -59,15 +63,35 @@ DataLoader:
size: 384 size: 384
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 384
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -84,7 +108,9 @@ DataLoader: ...@@ -84,7 +108,9 @@ DataLoader:
to_rgb: True to_rgb: True
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
size: [384, 384] resize_short: 438
- CropImage:
size: 384
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
...@@ -92,7 +118,7 @@ DataLoader: ...@@ -92,7 +118,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -107,7 +133,9 @@ Infer: ...@@ -107,7 +133,9 @@ Infer:
to_rgb: True to_rgb: True
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
size: [384, 384] resize_short: 438
- CropImage:
size: 384
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
...@@ -120,9 +148,6 @@ Infer: ...@@ -120,9 +148,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -24,24 +24,28 @@ Arch: ...@@ -24,24 +24,28 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: absolute_pos_embed relative_position_bias_table .bias norm
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 5e-4
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 20
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
...@@ -59,15 +63,35 @@ DataLoader: ...@@ -59,15 +63,35 @@ DataLoader:
size: 224 size: 224
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -94,7 +118,7 @@ DataLoader: ...@@ -94,7 +118,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -124,9 +148,6 @@ Infer: ...@@ -124,9 +148,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -24,24 +24,28 @@ Arch: ...@@ -24,24 +24,28 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: absolute_pos_embed relative_position_bias_table .bias norm
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 5e-4
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 20
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
...@@ -59,15 +63,35 @@ DataLoader: ...@@ -59,15 +63,35 @@ DataLoader:
size: 224 size: 224
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -94,7 +118,7 @@ DataLoader: ...@@ -94,7 +118,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -124,9 +148,6 @@ Infer: ...@@ -124,9 +148,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -24,24 +24,28 @@ Arch: ...@@ -24,24 +24,28 @@ Arch:
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: absolute_pos_embed relative_position_bias_table .bias norm
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 5e-4
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 20
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
...@@ -59,15 +63,35 @@ DataLoader: ...@@ -59,15 +63,35 @@ DataLoader:
size: 224 size: 224
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -94,7 +118,7 @@ DataLoader: ...@@ -94,7 +118,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -124,9 +148,6 @@ Infer: ...@@ -124,9 +148,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -20,28 +20,34 @@ Global: ...@@ -20,28 +20,34 @@ Global:
Arch: Arch:
name: alt_gvt_base name: alt_gvt_base
class_num: 1000 class_num: 1000
drop_rate: 0.0
drop_path_rate: 0.3
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: norm cls_token proj.0.weight proj.1.weight proj.2.weight proj.3.weight pos_block
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 5e-4
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 5
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
...@@ -57,17 +63,39 @@ DataLoader: ...@@ -57,17 +63,39 @@ DataLoader:
channel_first: False channel_first: False
- RandCropImage: - RandCropImage:
size: 224 size: 224
interpolation: bicubic
backend: pil
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -85,6 +113,8 @@ DataLoader: ...@@ -85,6 +113,8 @@ DataLoader:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -94,7 +124,7 @@ DataLoader: ...@@ -94,7 +124,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -110,6 +140,8 @@ Infer: ...@@ -110,6 +140,8 @@ Infer:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -124,9 +156,6 @@ Infer: ...@@ -124,9 +156,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
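These diffs also add a batch-level OpSampler that routes each batch through MixupOperator (alpha 0.8) with probability 0.5 or CutmixOperator (alpha 1.0) with probability 0.5. A rough NumPy sketch of that behavior, inferred from the operator names and parameters (the actual PaddleClas operators may differ in detail):

import numpy as np

def mixup_batch(x, y, alpha=0.8):
    # blend the whole batch with a shuffled copy of itself
    lam = np.random.beta(alpha, alpha)
    idx = np.random.permutation(len(x))
    return lam * x + (1 - lam) * x[idx], lam * y + (1 - lam) * y[idx]

def cutmix_batch(x, y, alpha=1.0):
    # paste a random box from shuffled images into the batch (x is NCHW)
    lam = np.random.beta(alpha, alpha)
    idx = np.random.permutation(len(x))
    h, w = x.shape[2], x.shape[3]
    rh, rw = int(h * np.sqrt(1 - lam)), int(w * np.sqrt(1 - lam))
    cy, cx = np.random.randint(h), np.random.randint(w)
    top, bot = np.clip(cy - rh // 2, 0, h), np.clip(cy + rh // 2, 0, h)
    lft, rgt = np.clip(cx - rw // 2, 0, w), np.clip(cx + rw // 2, 0, w)
    x = x.copy()
    x[:, :, top:bot, lft:rgt] = x[idx, :, top:bot, lft:rgt]
    lam = 1 - (bot - top) * (rgt - lft) / (h * w)  # correct lam for the actual cut area
    return x, lam * y + (1 - lam) * y[idx]

def op_sampler(x, y, mixup_prob=0.5):
    # per-batch choice between the two ops, as configured above
    op = mixup_batch if np.random.rand() < mixup_prob else cutmix_batch
    return op(x, y)

x = np.random.rand(8, 3, 224, 224).astype("float32")
y = np.eye(1000, dtype="float32")[np.random.randint(1000, size=8)]
xm, ym = op_sampler(x, y)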
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -20,28 +20,34 @@ Global: ...@@ -20,28 +20,34 @@ Global:
Arch: Arch:
name: alt_gvt_large name: alt_gvt_large
class_num: 1000 class_num: 1000
drop_rate: 0.0
drop_path_rate: 0.5
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: norm cls_token proj.0.weight proj.1.weight proj.2.weight proj.3.weight pos_block
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 5e-4
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 5
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
...@@ -57,17 +63,39 @@ DataLoader: ...@@ -57,17 +63,39 @@ DataLoader:
channel_first: False channel_first: False
- RandCropImage: - RandCropImage:
size: 224 size: 224
interpolation: bicubic
backend: pil
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -85,6 +113,8 @@ DataLoader: ...@@ -85,6 +113,8 @@ DataLoader:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -94,7 +124,7 @@ DataLoader: ...@@ -94,7 +124,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -110,6 +140,8 @@ Infer: ...@@ -110,6 +140,8 @@ Infer:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -124,9 +156,6 @@ Infer: ...@@ -124,9 +156,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -20,28 +20,34 @@ Global: ...@@ -20,28 +20,34 @@ Global:
Arch: Arch:
name: alt_gvt_small name: alt_gvt_small
class_num: 1000 class_num: 1000
drop_rate: 0.0
drop_path_rate: 0.2
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: norm cls_token proj.0.weight proj.1.weight proj.2.weight proj.3.weight pos_block
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 5e-4
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 5
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
...@@ -57,17 +63,39 @@ DataLoader: ...@@ -57,17 +63,39 @@ DataLoader:
channel_first: False channel_first: False
- RandCropImage: - RandCropImage:
size: 224 size: 224
interpolation: bicubic
backend: pil
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -85,6 +113,8 @@ DataLoader: ...@@ -85,6 +113,8 @@ DataLoader:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -94,7 +124,7 @@ DataLoader: ...@@ -94,7 +124,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -110,6 +140,8 @@ Infer: ...@@ -110,6 +140,8 @@ Infer:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -124,9 +156,6 @@ Infer: ...@@ -124,9 +156,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -20,28 +20,34 @@ Global: ...@@ -20,28 +20,34 @@ Global:
Arch: Arch:
name: pcpvt_base name: pcpvt_base
class_num: 1000 class_num: 1000
drop_rate: 0.0
drop_path_rate: 0.3
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: norm cls_token proj.0.weight proj.1.weight proj.2.weight proj.3.weight pos_block
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 5e-4
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 5
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
...@@ -57,17 +63,39 @@ DataLoader: ...@@ -57,17 +63,39 @@ DataLoader:
channel_first: False channel_first: False
- RandCropImage: - RandCropImage:
size: 224 size: 224
interpolation: bicubic
backend: pil
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -85,6 +113,8 @@ DataLoader: ...@@ -85,6 +113,8 @@ DataLoader:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -94,7 +124,7 @@ DataLoader: ...@@ -94,7 +124,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -110,6 +140,8 @@ Infer: ...@@ -110,6 +140,8 @@ Infer:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -124,9 +156,6 @@ Infer: ...@@ -124,9 +156,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -20,28 +20,34 @@ Global: ...@@ -20,28 +20,34 @@ Global:
Arch: Arch:
name: pcpvt_large name: pcpvt_large
class_num: 1000 class_num: 1000
drop_rate: 0.0
drop_path_rate: 0.5
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: norm cls_token proj.0.weight proj.1.weight proj.2.weight proj.3.weight pos_block
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 5e-4
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 5
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
...@@ -57,17 +63,39 @@ DataLoader: ...@@ -57,17 +63,39 @@ DataLoader:
channel_first: False channel_first: False
- RandCropImage: - RandCropImage:
size: 224 size: 224
interpolation: bicubic
backend: pil
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -85,6 +113,8 @@ DataLoader: ...@@ -85,6 +113,8 @@ DataLoader:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -94,7 +124,7 @@ DataLoader: ...@@ -94,7 +124,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -110,6 +140,8 @@ Infer: ...@@ -110,6 +140,8 @@ Infer:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -124,9 +156,6 @@ Infer: ...@@ -124,9 +156,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
...@@ -7,7 +7,7 @@ Global: ...@@ -7,7 +7,7 @@ Global:
save_interval: 1 save_interval: 1
eval_during_train: True eval_during_train: True
eval_interval: 1 eval_interval: 1
epochs: 120 epochs: 300
print_batch_step: 10 print_batch_step: 10
use_visualdl: False use_visualdl: False
# used for static mode and model export # used for static mode and model export
...@@ -20,28 +20,34 @@ Global: ...@@ -20,28 +20,34 @@ Global:
Arch: Arch:
name: pcpvt_small name: pcpvt_small
class_num: 1000 class_num: 1000
drop_rate: 0.0
drop_path_rate: 0.2
# loss function config for training/eval process # loss function config for training/eval process
Loss: Loss:
Train: Train:
- CELoss: - MixCELoss:
weight: 1.0 weight: 1.0
epsilon: 0.1
Eval: Eval:
- CELoss: - CELoss:
weight: 1.0 weight: 1.0
Optimizer: Optimizer:
name: Momentum name: AdamW
momentum: 0.9 beta1: 0.9
beta2: 0.999
epsilon: 1e-8
weight_decay: 0.05
no_weight_decay_name: norm cls_token proj.0.weight proj.1.weight proj.2.weight proj.3.weight pos_block
one_dim_param_no_weight_decay: True
lr: lr:
name: Piecewise name: Cosine
learning_rate: 0.1 learning_rate: 5e-4
decay_epochs: [30, 60, 90] eta_min: 1e-5
values: [0.1, 0.01, 0.001, 0.0001] warmup_epoch: 5
regularizer: warmup_start_lr: 1e-6
name: 'L2'
coeff: 0.0001
# data loader for train and eval # data loader for train and eval
...@@ -57,17 +63,39 @@ DataLoader: ...@@ -57,17 +63,39 @@ DataLoader:
channel_first: False channel_first: False
- RandCropImage: - RandCropImage:
size: 224 size: 224
interpolation: bicubic
backend: pil
- RandFlipImage: - RandFlipImage:
flip_code: 1 flip_code: 1
- TimmAutoAugment:
config_str: rand-m9-mstd0.5-inc1
interpolation: bicubic
img_size: 224
- NormalizeImage: - NormalizeImage:
scale: 1.0/255.0 scale: 1.0/255.0
mean: [0.485, 0.456, 0.406] mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225] std: [0.229, 0.224, 0.225]
order: '' order: ''
- RandomErasing:
EPSILON: 0.25
sl: 0.02
sh: 1.0/3.0
r1: 0.3
attempt: 10
use_log_aspect: True
mode: pixel
batch_transform_ops:
- OpSampler:
MixupOperator:
alpha: 0.8
prob: 0.5
CutmixOperator:
alpha: 1.0
prob: 0.5
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: True shuffle: True
loader: loader:
...@@ -85,6 +113,8 @@ DataLoader: ...@@ -85,6 +113,8 @@ DataLoader:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -94,7 +124,7 @@ DataLoader: ...@@ -94,7 +124,7 @@ DataLoader:
order: '' order: ''
sampler: sampler:
name: DistributedBatchSampler name: DistributedBatchSampler
batch_size: 64 batch_size: 128
drop_last: False drop_last: False
shuffle: False shuffle: False
loader: loader:
...@@ -110,6 +140,8 @@ Infer: ...@@ -110,6 +140,8 @@ Infer:
channel_first: False channel_first: False
- ResizeImage: - ResizeImage:
resize_short: 256 resize_short: 256
interpolation: bicubic
backend: pil
- CropImage: - CropImage:
size: 224 size: 224
- NormalizeImage: - NormalizeImage:
...@@ -124,9 +156,6 @@ Infer: ...@@ -124,9 +156,6 @@ Infer:
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric: Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval: Eval:
- TopkAcc: - TopkAcc:
topk: [1, 5] topk: [1, 5]
@@ -51,12 +51,8 @@ Optimizer:
   name: Momentum
   momentum: 0.9
   lr:
-    name: MultiStepDecay
+    name: Cosine
     learning_rate: 0.01
-    milestones: [30, 60, 70, 80, 90, 100, 120, 140]
-    gamma: 0.5
-    verbose: False
-    last_epoch: -1
   regularizer:
     name: 'L2'
     coeff: 0.0005
...
@@ -54,7 +54,6 @@ Optimizer:
   lr:
     name: Cosine
     learning_rate: 0.01
-    last_epoch: -1
   regularizer:
     name: 'L2'
     coeff: 0.0005
...
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 360
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# for quantization or prune model
Slim:
## for prune
prune:
name: fpgm
pruned_ratio: 0.3
# model architecture
Arch:
name: MobileNetV3_large_x1_0
class_num: 1000
pretrained: True
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.65
warmup_epoch: 5
regularizer:
name: 'L2'
coeff: 0.00002
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- AutoAugment:
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 256
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
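The schedule implied by the Optimizer block above (Cosine with learning_rate 0.65, warmup_epoch 5, epochs 360) can be sketched as follows. This is an editorial illustration of the conventional linear-warmup-into-cosine-decay reading of these fields, not code from the repository:

import math

def lr_at(epoch, base_lr=0.65, total_epochs=360, warmup_epoch=5):
    # Linear warmup from 0 to base_lr over the first warmup_epoch epochs,
    # then cosine decay towards 0 over the remaining epochs.
    if epoch < warmup_epoch:
        return base_lr * epoch / warmup_epoch
    progress = (epoch - warmup_epoch) / (total_epochs - warmup_epoch)
    return 0.5 * base_lr * (1 + math.cos(math.pi * progress))

print(lr_at(0), lr_at(5), lr_at(360))  # 0.0, 0.65, ~0.0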
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 60
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# for quantization or prune model
Slim:
## for quantization
quant:
name: pact
# model architecture
Arch:
name: MobileNetV3_large_x1_0
class_num: 1000
pretrained: True
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.065
warmup_epoch: 0
regularizer:
name: 'L2'
coeff: 0.00002
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- AutoAugment:
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 256
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 200
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# for quantization or prune model
Slim:
## for prune
prune:
name: fpgm
pruned_ratio: 0.3
# model architecture
Arch:
name: ResNet50_vd
class_num: 1000
pretrained: True
# loss function config for training/eval process
Loss:
Train:
- MixCELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.1
regularizer:
name: 'L2'
coeff: 0.00007
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
batch_transform_ops:
- MixupOperator:
alpha: 0.2
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: ./output/
device: gpu
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 30
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: ./inference
# for quantization or prune model
Slim:
## for quantization
quant:
name: pact
# model architecture
Arch:
name: ResNet50_vd
class_num: 1000
pretrained: True
# loss function config for training/eval process
Loss:
Train:
- MixCELoss:
weight: 1.0
epsilon: 0.1
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.01
regularizer:
name: 'L2'
coeff: 0.00007
# data loader for train and eval
DataLoader:
Train:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/train_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- RandCropImage:
size: 224
- RandFlipImage:
flip_code: 1
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
batch_transform_ops:
- MixupOperator:
alpha: 0.2
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: True
loader:
num_workers: 4
use_shared_memory: True
Eval:
dataset:
name: ImageNetDataset
image_root: ./dataset/ILSVRC2012/
cls_label_path: ./dataset/ILSVRC2012/val_list.txt
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 4
use_shared_memory: True
Infer:
infer_imgs: docs/images/whl/demo.jpg
batch_size: 10
transforms:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:
PostProcess:
name: Topk
topk: 5
class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
Metric:
Train:
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: "./output_vehicle_cls_prune/"
device: "gpu"
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 160
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: "./inference"
Slim:
prune:
name: fpgm
pruned_ratio: 0.3
# model architecture
Arch:
name: "RecModel"
infer_output_key: "features"
infer_add_softmax: False
Backbone:
name: "ResNet50_last_stage_stride1"
pretrained: True
BackboneStopLayer:
name: "adaptive_avg_pool2d_0"
Neck:
name: "VehicleNeck"
in_channels: 2048
out_channels: 512
Head:
name: "ArcMargin"
embedding_size: 512
class_num: 431
margin: 0.15
scale: 32
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
- SupConLoss:
weight: 1.0
views: 2
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.01
regularizer:
name: 'L2'
coeff: 0.0005
# data loader for train and eval
DataLoader:
Train:
dataset:
name: "CompCars"
image_root: "./dataset/CompCars/image/"
label_root: "./dataset/CompCars/label/"
bbox_crop: True
cls_label_path: "./dataset/CompCars/train_test_split/classification/train_label.txt"
transform_ops:
- ResizeImage:
size: 224
- RandFlipImage:
flip_code: 1
- AugMix:
prob: 0.5
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- RandomErasing:
EPSILON: 0.5
sl: 0.02
sh: 0.4
r1: 0.3
mean: [0., 0., 0.]
sampler:
name: DistributedRandomIdentitySampler
batch_size: 128
num_instances: 2
drop_last: False
shuffle: True
loader:
num_workers: 8
use_shared_memory: True
Eval:
dataset:
name: "CompCars"
image_root: "./dataset/CompCars/image/"
label_root: "./dataset/CompCars/label/"
cls_label_path: "./dataset/CompCars/train_test_split/classification/test_label.txt"
bbox_crop: True
transform_ops:
- ResizeImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 8
use_shared_memory: True
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
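For reference, the ArcMargin head configured above (embedding_size 512, margin m = 0.15, scale s = 32) is the standard ArcFace-style additive angular margin; the config only lists the hyperparameters, so the formulation is stated here editorially. With the embedding and the class weights L2-normalized and $\theta_j$ the angle between them, the logits for a sample with label $y$ are

$$
\mathrm{logit}_j = \begin{cases} s\,\cos(\theta_j + m), & j = y \\ s\,\cos\theta_j, & j \neq y \end{cases}
$$

so the margin m is applied only to the ground-truth class before scaling by s.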
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: "./output_vehicle_cls_pact/"
device: "gpu"
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 80
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: "./inference"
Slim:
quant:
name: pact
# model architecture
Arch:
name: "RecModel"
infer_output_key: "features"
infer_add_softmax: False
Backbone:
name: "ResNet50_last_stage_stride1"
pretrained: True
BackboneStopLayer:
name: "adaptive_avg_pool2d_0"
Neck:
name: "VehicleNeck"
in_channels: 2048
out_channels: 512
Head:
name: "ArcMargin"
embedding_size: 512
class_num: 431
margin: 0.15
scale: 32
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
- SupConLoss:
weight: 1.0
views: 2
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.001
regularizer:
name: 'L2'
coeff: 0.0005
# data loader for train and eval
DataLoader:
Train:
dataset:
name: "CompCars"
image_root: "./dataset/CompCars/image/"
label_root: "./dataset/CompCars/label/"
bbox_crop: True
cls_label_path: "./dataset/CompCars/train_test_split/classification/train_label.txt"
transform_ops:
- ResizeImage:
size: 224
- RandFlipImage:
flip_code: 1
- AugMix:
prob: 0.5
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- RandomErasing:
EPSILON: 0.5
sl: 0.02
sh: 0.4
r1: 0.3
mean: [0., 0., 0.]
sampler:
name: DistributedRandomIdentitySampler
batch_size: 128
num_instances: 2
drop_last: False
shuffle: True
loader:
num_workers: 8
use_shared_memory: True
Eval:
dataset:
name: "CompCars"
image_root: "./dataset/CompCars/image/"
label_root: "./dataset/CompCars/label/"
cls_label_path: "./dataset/CompCars/train_test_split/classification/test_label.txt"
bbox_crop: True
transform_ops:
- ResizeImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 128
drop_last: False
shuffle: False
loader:
num_workers: 8
use_shared_memory: True
Metric:
Train:
- TopkAcc:
topk: [1, 5]
Eval:
- TopkAcc:
topk: [1, 5]
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: "./output_fpgm/"
device: "gpu"
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 160
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: "./inference"
eval_mode: "retrieval"
# for quantization or prune model
Slim:
## for prune
prune:
name: fpgm
pruned_ratio: 0.3
# model architecture
Arch:
name: "RecModel"
infer_output_key: "features"
infer_add_softmax: False
Backbone:
name: "ResNet50_last_stage_stride1"
pretrained: True
BackboneStopLayer:
name: "adaptive_avg_pool2d_0"
Neck:
name: "VehicleNeck"
in_channels: 2048
out_channels: 512
Head:
name: "ArcMargin"
embedding_size: 512
class_num: 30671
margin: 0.15
scale: 32
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
- SupConLoss:
weight: 1.0
views: 2
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.01
regularizer:
name: 'L2'
coeff: 0.0005
# data loader for train and eval
DataLoader:
Train:
dataset:
name: "VeriWild"
image_root: "./dataset/VeRI-Wild/images/"
cls_label_path: "./dataset/VeRI-Wild/train_test_split/train_list_start0.txt"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 224
- RandFlipImage:
flip_code: 1
- AugMix:
prob: 0.5
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- RandomErasing:
EPSILON: 0.5
sl: 0.02
sh: 0.4
r1: 0.3
mean: [0., 0., 0.]
sampler:
name: DistributedRandomIdentitySampler
batch_size: 128
num_instances: 2
drop_last: False
shuffle: True
loader:
num_workers: 6
use_shared_memory: True
Eval:
Query:
dataset:
name: "VeriWild"
image_root: "./dataset/VeRI-Wild/images"
cls_label_path: "./dataset/VeRI-Wild/train_test_split/test_3000_id_query.txt"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 6
use_shared_memory: True
Gallery:
dataset:
name: "VeriWild"
image_root: "./dataset/VeRI-Wild/images"
cls_label_path: "./dataset/VeRI-Wild/train_test_split/test_3000_id.txt"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 6
use_shared_memory: True
Metric:
Eval:
- Recallk:
topk: [1, 5]
- mAP: {}
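For reference, the retrieval metrics configured above follow the standard definitions (stated here editorially; the config only names them): Recall@K counts a query as a hit if at least one gallery image with the same identity appears among its top-K neighbors, and mAP averages per-query average precision:

$$
\mathrm{Recall@}K = \frac{1}{|Q|} \sum_{q \in Q} \mathbf{1}\big[\exists\, i \le K : y_{q,i} = y_q\big], \qquad \mathrm{mAP} = \frac{1}{|Q|} \sum_{q \in Q} \mathrm{AP}(q)
$$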
# global configs
Global:
checkpoints: null
pretrained_model: null
output_dir: "./output_vehicle_reid_pact/"
device: "gpu"
save_interval: 1
eval_during_train: True
eval_interval: 1
epochs: 40
print_batch_step: 10
use_visualdl: False
# used for static mode and model export
image_shape: [3, 224, 224]
save_inference_dir: "./inference"
eval_mode: "retrieval"
# for quantization or prune model
Slim:
  ## for quantization
quant:
name: pact
# model architecture
Arch:
name: "RecModel"
infer_output_key: "features"
infer_add_softmax: False
Backbone:
name: "ResNet50_last_stage_stride1"
pretrained: True
BackboneStopLayer:
name: "adaptive_avg_pool2d_0"
Neck:
name: "VehicleNeck"
in_channels: 2048
out_channels: 512
Head:
name: "ArcMargin"
embedding_size: 512
class_num: 30671
margin: 0.15
scale: 32
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
- SupConLoss:
weight: 1.0
views: 2
Eval:
- CELoss:
weight: 1.0
Optimizer:
name: Momentum
momentum: 0.9
lr:
name: Cosine
learning_rate: 0.001
regularizer:
name: 'L2'
coeff: 0.0005
# data loader for train and eval
DataLoader:
Train:
dataset:
name: "VeriWild"
image_root: "./dataset/VeRI-Wild/images/"
cls_label_path: "./dataset/VeRI-Wild/train_test_split/train_list_start0.txt"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 224
- RandFlipImage:
flip_code: 1
- AugMix:
prob: 0.5
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- RandomErasing:
EPSILON: 0.5
sl: 0.02
sh: 0.4
r1: 0.3
mean: [0., 0., 0.]
sampler:
name: DistributedRandomIdentitySampler
batch_size: 64
num_instances: 2
drop_last: False
shuffle: True
loader:
num_workers: 6
use_shared_memory: True
Eval:
Query:
dataset:
name: "VeriWild"
image_root: "./dataset/VeRI-Wild/images"
cls_label_path: "./dataset/VeRI-Wild/train_test_split/test_3000_id_query.txt"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 6
use_shared_memory: True
Gallery:
dataset:
name: "VeriWild"
image_root: "./dataset/VeRI-Wild/images"
cls_label_path: "./dataset/VeRI-Wild/train_test_split/test_3000_id.txt"
transform_ops:
- DecodeImage:
to_rgb: True
channel_first: False
- ResizeImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
sampler:
name: DistributedBatchSampler
batch_size: 64
drop_last: False
shuffle: False
loader:
num_workers: 6
use_shared_memory: True
Metric:
Eval:
- Recallk:
topk: [1, 5]
- mAP: {}
@@ -14,6 +14,7 @@
 from ppcls.data.preprocess.ops.autoaugment import ImageNetPolicy as RawImageNetPolicy
 from ppcls.data.preprocess.ops.randaugment import RandAugment as RawRandAugment
+from ppcls.data.preprocess.ops.timm_autoaugment import RawTimmAutoAugment
 from ppcls.data.preprocess.ops.cutout import Cutout
 from ppcls.data.preprocess.ops.hide_and_seek import HideAndSeek
@@ -29,9 +30,8 @@ from ppcls.data.preprocess.ops.operators import NormalizeImage
 from ppcls.data.preprocess.ops.operators import ToCHWImage
 from ppcls.data.preprocess.ops.operators import AugMix
-from ppcls.data.preprocess.batch_ops.batch_operators import MixupOperator, CutmixOperator, FmixOperator
+from ppcls.data.preprocess.batch_ops.batch_operators import MixupOperator, CutmixOperator, OpSampler, FmixOperator

-import six
 import numpy as np
 from PIL import Image
@@ -45,21 +45,16 @@ def transform(data, ops=[]):
 class AutoAugment(RawImageNetPolicy):
     """ ImageNetPolicy wrapper to auto fit different img types """

     def __init__(self, *args, **kwargs):
-        if six.PY2:
-            super(AutoAugment, self).__init__(*args, **kwargs)
-        else:
-            super().__init__(*args, **kwargs)
+        super().__init__(*args, **kwargs)

     def __call__(self, img):
         if not isinstance(img, Image.Image):
             img = np.ascontiguousarray(img)
             img = Image.fromarray(img)

-        if six.PY2:
-            img = super(AutoAugment, self).__call__(img)
-        else:
-            img = super().__call__(img)
+        img = super().__call__(img)

         if isinstance(img, Image.Image):
             img = np.asarray(img)
@@ -69,21 +64,35 @@ class AutoAugment(RawImageNetPolicy):
 class RandAugment(RawRandAugment):
     """ RandAugment wrapper to auto fit different img types """

+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+
+    def __call__(self, img):
+        if not isinstance(img, Image.Image):
+            img = np.ascontiguousarray(img)
+            img = Image.fromarray(img)
+
+        img = super().__call__(img)
+
+        if isinstance(img, Image.Image):
+            img = np.asarray(img)
+
+        return img
+
+
+class TimmAutoAugment(RawTimmAutoAugment):
+    """ TimmAutoAugment wrapper to auto fit different img types. """
+
     def __init__(self, *args, **kwargs):
-        if six.PY2:
-            super(RandAugment, self).__init__(*args, **kwargs)
-        else:
-            super().__init__(*args, **kwargs)
+        super().__init__(*args, **kwargs)

     def __call__(self, img):
         if not isinstance(img, Image.Image):
             img = np.ascontiguousarray(img)
             img = Image.fromarray(img)

-        if six.PY2:
-            img = super(RandAugment, self).__call__(img)
-        else:
-            img = super().__call__(img)
+        img = super().__call__(img)

         if isinstance(img, Image.Image):
             img = np.asarray(img)
...
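A minimal usage sketch for the new wrapper (editorial illustration, not part of the diff; the keyword arguments mirror the TimmAutoAugment entry in the pcpvt_small config above):

import numpy as np
from ppcls.data.preprocess import TimmAutoAugment

aug = TimmAutoAugment(
    config_str="rand-m9-mstd0.5-inc1", interpolation="bicubic", img_size=224)
img = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
img = aug(img)  # numpy array in, numpy array out; PIL images pass through as PIL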
@@ -16,13 +16,17 @@ from __future__ import absolute_import
 from __future__ import division
 from __future__ import print_function
 from __future__ import unicode_literals

+import random
 import numpy as np

+from ppcls.utils import logger
 from ppcls.data.preprocess.ops.fmix import sample_mask


 class BatchOperator(object):
     """ BatchOperator """

     def __init__(self, *args, **kwargs):
         pass
@@ -46,9 +50,20 @@ class BatchOperator(object):
 class MixupOperator(BatchOperator):
     """ Mixup operator """

-    def __init__(self, alpha=0.2):
-        assert alpha > 0., \
-            'parameter alpha[%f] should > 0.0' % (alpha)
+    def __init__(self, alpha: float=1.):
+        """Build Mixup operator
+
+        Args:
+            alpha (float, optional): The parameter alpha of mixup. Defaults to 1.0.
+
+        Raises:
+            Exception: The value of the parameter is illegal.
+        """
+        if alpha <= 0:
+            raise Exception(
+                f"Parameter \"alpha\" of Mixup should be greater than 0. \"alpha\": {alpha}."
+            )
         self._alpha = alpha

     def __call__(self, batch):
@@ -62,9 +77,20 @@ class MixupOperator(BatchOperator):
 class CutmixOperator(BatchOperator):
     """ Cutmix operator """

     def __init__(self, alpha=0.2):
-        assert alpha > 0., \
-            'parameter alpha[%f] should > 0.0' % (alpha)
+        """Build Cutmix operator
+
+        Args:
+            alpha (float, optional): The parameter alpha of cutmix. Defaults to 0.2.
+
+        Raises:
+            Exception: The value of the parameter is illegal.
+        """
+        if alpha <= 0:
+            raise Exception(
+                f"Parameter \"alpha\" of Cutmix should be greater than 0. \"alpha\": {alpha}."
+            )
         self._alpha = alpha

     def _rand_bbox(self, size, lam):
@@ -72,8 +98,8 @@ class CutmixOperator(BatchOperator):
         w = size[2]
         h = size[3]
         cut_rat = np.sqrt(1. - lam)
-        cut_w = np.int(w * cut_rat)
-        cut_h = np.int(h * cut_rat)
+        cut_w = int(w * cut_rat)
+        cut_h = int(h * cut_rat)

         # uniform
         cx = np.random.randint(w)
@@ -101,6 +127,7 @@ class CutmixOperator(BatchOperator):
 class FmixOperator(BatchOperator):
     """ Fmix operator """

     def __init__(self, alpha=1, decay_power=3, max_soft=0., reformulate=False):
         self._alpha = alpha
         self._decay_power = decay_power
@@ -115,3 +142,42 @@ class FmixOperator(BatchOperator):
             size, self._max_soft, self._reformulate)
         imgs = mask * imgs + (1 - mask) * imgs[idx]
         return list(zip(imgs, labels, labels[idx], [lam] * bs))
+
+
+class OpSampler(object):
+    """ Randomly sample one operator from the configured candidates. """
+
+    def __init__(self, **op_dict):
+        """Build OpSampler
+
+        Raises:
+            Exception: The parameter "prob" of one or more operators is set incorrectly.
+        """
+        if len(op_dict) < 1:
+            msg = "ConfigWarning: No operator in \"OpSampler\". \"OpSampler\" has been skipped."
+            logger.warning(msg)
+
+        self.ops = {}
+        total_prob = 0
+        for op_name in op_dict:
+            param = op_dict[op_name]
+            if "prob" not in param:
+                msg = f"ConfigWarning: Parameter \"prob\" should be set when use operator in \"OpSampler\". The operator \"{op_name}\"'s prob has been set \"0\"."
+                logger.warning(msg)
+            prob = param.pop("prob", 0)
+            total_prob += prob
+            op = eval(op_name)(**param)
+            self.ops.update({op: prob})
+
+        if total_prob > 1:
+            msg = "ConfigError: The total prob of operators in \"OpSampler\" should be less than 1."
+            logger.error(msg)
+            raise Exception(msg)
+
+        # add "None Op" when total_prob < 1; the "None Op" does nothing
+        self.ops[None] = 1 - total_prob
+
+    def __call__(self, batch):
+        op = random.choices(
+            list(self.ops.keys()), weights=list(self.ops.values()), k=1)[0]
+        # return the batch directly for the None Op
+        return op(batch) if op else batch
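A minimal sketch of how OpSampler drives the two batch operators configured earlier (editorial illustration, not part of the diff; the keyword arguments mirror the YAML keys, and the batch format of (image, label) pairs produced by the dataloader is assumed):

import numpy as np
from ppcls.data.preprocess import OpSampler

# Each keyword names an operator class defined above; "prob" is popped off
# and the remaining entries are passed to that operator's constructor.
op_sampler = OpSampler(
    MixupOperator={"alpha": 0.8, "prob": 0.5},
    CutmixOperator={"alpha": 1.0, "prob": 0.5})

# Probs sum to 1.0, so every batch gets exactly one of Mixup/Cutmix; had they
# summed to less, the remainder would be the chance of leaving the batch as-is.
batch = [(np.zeros((3, 224, 224), dtype=np.float32), np.int64(i)) for i in range(4)]
mixed_batch = op_sampler(batch)  # list of (img, label_a, label_b, lam) tuples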
@@ -19,15 +19,59 @@ from __future__ import division
 from __future__ import print_function
 from __future__ import unicode_literals

+from functools import partial
 import six
 import math
 import random
 import cv2
 import numpy as np
 from PIL import Image
+from paddle.vision.transforms import ColorJitter as RawColorJitter

 from .autoaugment import ImageNetPolicy
 from .functional import augmentations
+from ppcls.utils import logger
+
+
+class UnifiedResize(object):
+    def __init__(self, interpolation=None, backend="cv2"):
+        _cv2_interp_from_str = {
+            'nearest': cv2.INTER_NEAREST,
+            'bilinear': cv2.INTER_LINEAR,
+            'area': cv2.INTER_AREA,
+            'bicubic': cv2.INTER_CUBIC,
+            'lanczos': cv2.INTER_LANCZOS4
+        }
+        _pil_interp_from_str = {
+            'nearest': Image.NEAREST,
+            'bilinear': Image.BILINEAR,
+            'bicubic': Image.BICUBIC,
+            'box': Image.BOX,
+            'lanczos': Image.LANCZOS,
+            'hamming': Image.HAMMING
+        }
+
+        def _pil_resize(src, size, resample):
+            pil_img = Image.fromarray(src)
+            pil_img = pil_img.resize(size, resample)
+            return np.asarray(pil_img)
+
+        if backend.lower() == "cv2":
+            if isinstance(interpolation, str):
+                interpolation = _cv2_interp_from_str[interpolation.lower()]
+            self.resize_func = partial(cv2.resize, interpolation=interpolation)
+        elif backend.lower() == "pil":
+            if isinstance(interpolation, str):
+                interpolation = _pil_interp_from_str[interpolation.lower()]
+            self.resize_func = partial(_pil_resize, resample=interpolation)
+        else:
+            logger.warning(
+                f"The backend of Resize only supports \"cv2\" or \"PIL\". \"{backend}\" is unavailable. Use \"cv2\" instead."
+            )
+            self.resize_func = cv2.resize
+
+    def __call__(self, src, size):
+        return self.resize_func(src, size)
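A quick sketch of the abstraction this adds (editorial illustration, not part of the diff; assumes an HWC uint8 numpy image): both backends sit behind a single callable, so the ResizeImage and RandCropImage changes below no longer need to branch on an interpolation flag.

import numpy as np

resize = UnifiedResize(interpolation="bicubic", backend="pil")
img = np.random.randint(0, 256, (320, 480, 3), dtype=np.uint8)
out = resize(img, (224, 224))  # size is (w, h); returns a numpy array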
 class OperatorParamError(ValueError):
@@ -67,8 +111,11 @@ class DecodeImage(object):
 class ResizeImage(object):
     """ resize image """

-    def __init__(self, size=None, resize_short=None, interpolation=-1):
-        self.interpolation = interpolation if interpolation >= 0 else None
+    def __init__(self,
+                 size=None,
+                 resize_short=None,
+                 interpolation=None,
+                 backend="cv2"):
         if resize_short is not None and resize_short > 0:
             self.resize_short = resize_short
             self.w = None
@@ -81,6 +128,9 @@ class ResizeImage(object):
             raise OperatorParamError("invalid params for ResizeImage for '\
                'both 'size' and 'resize_short' are None")

+        self._resize_func = UnifiedResize(
+            interpolation=interpolation, backend=backend)
+
     def __call__(self, img):
         img_h, img_w = img.shape[:2]
         if self.resize_short is not None:
@@ -90,10 +140,7 @@ class ResizeImage(object):
         else:
             w = self.w
             h = self.h
-        if self.interpolation is None:
-            return cv2.resize(img, (w, h))
-        else:
-            return cv2.resize(img, (w, h), interpolation=self.interpolation)
+        return self._resize_func(img, (w, h))


 class CropImage(object):
@@ -119,9 +166,12 @@ class CropImage(object):
 class RandCropImage(object):
     """ random crop image """

-    def __init__(self, size, scale=None, ratio=None, interpolation=-1):
-        self.interpolation = interpolation if interpolation >= 0 else None
+    def __init__(self,
+                 size,
+                 scale=None,
+                 ratio=None,
+                 interpolation=None,
+                 backend="cv2"):
         if type(size) is int:
             self.size = (size, size)  # (h, w)
         else:
@@ -130,6 +180,9 @@ class RandCropImage(object):
         self.scale = [0.08, 1.0] if scale is None else scale
         self.ratio = [3. / 4., 4. / 3.] if ratio is None else ratio

+        self._resize_func = UnifiedResize(
+            interpolation=interpolation, backend=backend)
+
     def __call__(self, img):
         size = self.size
         scale = self.scale
@@ -155,10 +208,8 @@ class RandCropImage(object):
         j = random.randint(0, img_h - h)

         img = img[j:j + h, i:i + w, :]
-        if self.interpolation is None:
-            return cv2.resize(img, size)
-        else:
-            return cv2.resize(img, size, interpolation=self.interpolation)
+
+        return self._resize_func(img, size)


 class RandFlipImage(object):
@@ -313,3 +364,20 @@ class AugMix(object):
         mixed = (1 - m) * image + m * mix
         return mixed.astype(np.uint8)
+
+
+class ColorJitter(RawColorJitter):
+    """ColorJitter.
+    """
+
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+
+    def __call__(self, img):
+        if not isinstance(img, Image.Image):
+            img = np.ascontiguousarray(img)
+            img = Image.fromarray(img)
+        img = super()._apply_image(img)
+        if isinstance(img, Image.Image):
+            img = np.asarray(img)
+        return img
@@ -12,7 +12,9 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-# This code is based on https://github.com/zhunzhong07/Random-Erasing
+# This code is adapted from https://github.com/zhunzhong07/Random-Erasing, with reference to Timm.
+
+from functools import partial

 import math
 import random
@@ -20,36 +22,69 @@ import random
 import numpy as np


+class Pixels(object):
+    def __init__(self, mode="const", mean=[0., 0., 0.]):
+        self._mode = mode
+        self._mean = mean
+
+    def __call__(self, h=224, w=224, c=3):
+        if self._mode == "rand":
+            return np.random.normal(size=(1, 1, 3))
+        elif self._mode == "pixel":
+            return np.random.normal(size=(h, w, c))
+        elif self._mode == "const":
+            return self._mean
+        else:
+            raise Exception(
+                "Invalid mode in RandomErasing, only support \"const\", \"rand\", \"pixel\""
+            )
+
+
 class RandomErasing(object):
-    def __init__(self, EPSILON=0.5, sl=0.02, sh=0.4, r1=0.3,
-                 mean=[0., 0., 0.]):
-        self.EPSILON = EPSILON
-        self.mean = mean
-        self.sl = sl
-        self.sh = sh
-        self.r1 = r1
+    """RandomErasing.
+    """
+
+    def __init__(self,
+                 EPSILON=0.5,
+                 sl=0.02,
+                 sh=0.4,
+                 r1=0.3,
+                 mean=[0., 0., 0.],
+                 attempt=100,
+                 use_log_aspect=False,
+                 mode='const'):
+        self.EPSILON = eval(EPSILON) if isinstance(EPSILON, str) else EPSILON
+        self.sl = eval(sl) if isinstance(sl, str) else sl
+        self.sh = eval(sh) if isinstance(sh, str) else sh
+        r1 = eval(r1) if isinstance(r1, str) else r1
+        self.r1 = (math.log(r1), math.log(1 / r1)) if use_log_aspect else (
+            r1, 1 / r1)
+        self.use_log_aspect = use_log_aspect
+        self.attempt = attempt
+        self.get_pixels = Pixels(mode, mean)

     def __call__(self, img):
-        if random.uniform(0, 1) > self.EPSILON:
+        if random.random() > self.EPSILON:
             return img

-        for _ in range(100):
+        for _ in range(self.attempt):
             area = img.shape[0] * img.shape[1]

             target_area = random.uniform(self.sl, self.sh) * area
-            aspect_ratio = random.uniform(self.r1, 1 / self.r1)
+            aspect_ratio = random.uniform(*self.r1)
+            if self.use_log_aspect:
+                aspect_ratio = math.exp(aspect_ratio)

             h = int(round(math.sqrt(target_area * aspect_ratio)))
             w = int(round(math.sqrt(target_area / aspect_ratio)))

             if w < img.shape[1] and h < img.shape[0]:
+                pixels = self.get_pixels(h, w, img.shape[2])
                 x1 = random.randint(0, img.shape[0] - h)
                 y1 = random.randint(0, img.shape[1] - w)
-                if img.shape[0] == 3:
-                    img[x1:x1 + h, y1:y1 + w, 0] = self.mean[0]
-                    img[x1:x1 + h, y1:y1 + w, 1] = self.mean[1]
-                    img[x1:x1 + h, y1:y1 + w, 2] = self.mean[2]
+                if img.shape[2] == 3:
+                    img[x1:x1 + h, y1:y1 + w, :] = pixels
                 else:
-                    img[0, x1:x1 + h, y1:y1 + w] = self.mean[1]
+                    img[x1:x1 + h, y1:y1 + w, 0] = pixels[0]
                 return img
         return img
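A sketch wiring this up with the values used in the pcpvt_small config above (editorial illustration, not part of the diff; note that string-valued fractions such as "1.0/3.0" are eval'ed by the constructor, and an HWC uint8 numpy image is assumed):

import numpy as np

eraser = RandomErasing(EPSILON=0.25, sl=0.02, sh="1.0/3.0", r1=0.3,
                       attempt=10, use_log_aspect=True, mode="pixel")
img = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
img = eraser(img)  # the erased patch is filled with per-pixel Gaussian noise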
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
This code is borrowed from Timm: https://github.com/rwightman/pytorch-image-models.
hacked together by / Copyright 2020 Ross Wightman
"""
import random
import math
import re
from PIL import Image, ImageOps, ImageEnhance, ImageChops
import PIL
import numpy as np
IMAGENET_DEFAULT_MEAN = (0.485, 0.456, 0.406)
_PIL_VER = tuple([int(x) for x in PIL.__version__.split('.')[:2]])
_FILL = (128, 128, 128)
# This signifies the max integer that the controller RNN could predict for the
# augmentation scheme.
_MAX_LEVEL = 10.
_HPARAMS_DEFAULT = dict(
translate_const=250,
img_mean=_FILL, )
_RANDOM_INTERPOLATION = (Image.BILINEAR, Image.BICUBIC)
def _pil_interp(method):
if method == 'bicubic':
return Image.BICUBIC
elif method == 'lanczos':
return Image.LANCZOS
elif method == 'hamming':
return Image.HAMMING
else:
# default bilinear, do we want to allow nearest?
return Image.BILINEAR
def _interpolation(kwargs):
interpolation = kwargs.pop('resample', Image.BILINEAR)
if isinstance(interpolation, (list, tuple)):
return random.choice(interpolation)
else:
return interpolation
def _check_args_tf(kwargs):
if 'fillcolor' in kwargs and _PIL_VER < (5, 0):
kwargs.pop('fillcolor')
kwargs['resample'] = _interpolation(kwargs)
def shear_x(img, factor, **kwargs):
_check_args_tf(kwargs)
return img.transform(img.size, Image.AFFINE, (1, factor, 0, 0, 1, 0),
**kwargs)
def shear_y(img, factor, **kwargs):
_check_args_tf(kwargs)
return img.transform(img.size, Image.AFFINE, (1, 0, 0, factor, 1, 0),
**kwargs)
def translate_x_rel(img, pct, **kwargs):
pixels = pct * img.size[0]
_check_args_tf(kwargs)
return img.transform(img.size, Image.AFFINE, (1, 0, pixels, 0, 1, 0),
**kwargs)
def translate_y_rel(img, pct, **kwargs):
pixels = pct * img.size[1]
_check_args_tf(kwargs)
return img.transform(img.size, Image.AFFINE, (1, 0, 0, 0, 1, pixels),
**kwargs)
def translate_x_abs(img, pixels, **kwargs):
_check_args_tf(kwargs)
return img.transform(img.size, Image.AFFINE, (1, 0, pixels, 0, 1, 0),
**kwargs)
def translate_y_abs(img, pixels, **kwargs):
_check_args_tf(kwargs)
return img.transform(img.size, Image.AFFINE, (1, 0, 0, 0, 1, pixels),
**kwargs)
def rotate(img, degrees, **kwargs):
_check_args_tf(kwargs)
if _PIL_VER >= (5, 2):
return img.rotate(degrees, **kwargs)
elif _PIL_VER >= (5, 0):
w, h = img.size
post_trans = (0, 0)
rotn_center = (w / 2.0, h / 2.0)
angle = -math.radians(degrees)
matrix = [
round(math.cos(angle), 15),
round(math.sin(angle), 15),
0.0,
round(-math.sin(angle), 15),
round(math.cos(angle), 15),
0.0,
]
def transform(x, y, matrix):
(a, b, c, d, e, f) = matrix
return a * x + b * y + c, d * x + e * y + f
matrix[2], matrix[5] = transform(-rotn_center[0] - post_trans[0],
-rotn_center[1] - post_trans[1],
matrix)
matrix[2] += rotn_center[0]
matrix[5] += rotn_center[1]
return img.transform(img.size, Image.AFFINE, matrix, **kwargs)
else:
return img.rotate(degrees, resample=kwargs['resample'])
def auto_contrast(img, **__):
return ImageOps.autocontrast(img)
def invert(img, **__):
return ImageOps.invert(img)
def equalize(img, **__):
return ImageOps.equalize(img)
def solarize(img, thresh, **__):
return ImageOps.solarize(img, thresh)
def solarize_add(img, add, thresh=128, **__):
lut = []
for i in range(256):
if i < thresh:
lut.append(min(255, i + add))
else:
lut.append(i)
if img.mode in ("L", "RGB"):
if img.mode == "RGB" and len(lut) == 256:
lut = lut + lut + lut
return img.point(lut)
else:
return img
def posterize(img, bits_to_keep, **__):
if bits_to_keep >= 8:
return img
return ImageOps.posterize(img, bits_to_keep)
def contrast(img, factor, **__):
return ImageEnhance.Contrast(img).enhance(factor)
def color(img, factor, **__):
return ImageEnhance.Color(img).enhance(factor)
def brightness(img, factor, **__):
return ImageEnhance.Brightness(img).enhance(factor)
def sharpness(img, factor, **__):
return ImageEnhance.Sharpness(img).enhance(factor)
def _randomly_negate(v):
"""With 50% prob, negate the value"""
return -v if random.random() > 0.5 else v
def _rotate_level_to_arg(level, _hparams):
# range [-30, 30]
level = (level / _MAX_LEVEL) * 30.
level = _randomly_negate(level)
return level,
def _enhance_level_to_arg(level, _hparams):
# range [0.1, 1.9]
return (level / _MAX_LEVEL) * 1.8 + 0.1,
def _enhance_increasing_level_to_arg(level, _hparams):
# the 'no change' level is 1.0, moving away from that towards 0. or 2.0 increases the enhancement blend
# range [0.1, 1.9]
level = (level / _MAX_LEVEL) * .9
level = 1.0 + _randomly_negate(level)
return level,
def _shear_level_to_arg(level, _hparams):
# range [-0.3, 0.3]
level = (level / _MAX_LEVEL) * 0.3
level = _randomly_negate(level)
return level,
def _translate_abs_level_to_arg(level, hparams):
translate_const = hparams['translate_const']
level = (level / _MAX_LEVEL) * float(translate_const)
level = _randomly_negate(level)
return level,
def _translate_rel_level_to_arg(level, hparams):
# default range [-0.45, 0.45]
translate_pct = hparams.get('translate_pct', 0.45)
level = (level / _MAX_LEVEL) * translate_pct
level = _randomly_negate(level)
return level,
def _posterize_level_to_arg(level, _hparams):
# As per Tensorflow TPU EfficientNet impl
# range [0, 4], 'keep 0 up to 4 MSB of original image'
# intensity/severity of augmentation decreases with level
return int((level / _MAX_LEVEL) * 4),
def _posterize_increasing_level_to_arg(level, hparams):
# As per Tensorflow models research and UDA impl
# range [4, 0], 'keep 4 down to 0 MSB of original image',
# intensity/severity of augmentation increases with level
return 4 - _posterize_level_to_arg(level, hparams)[0],
def _posterize_original_level_to_arg(level, _hparams):
# As per original AutoAugment paper description
# range [4, 8], 'keep 4 up to 8 MSB of image'
# intensity/severity of augmentation decreases with level
return int((level / _MAX_LEVEL) * 4) + 4,
def _solarize_level_to_arg(level, _hparams):
# range [0, 256]
# intensity/severity of augmentation decreases with level
return int((level / _MAX_LEVEL) * 256),
def _solarize_increasing_level_to_arg(level, _hparams):
# range [0, 256]
# intensity/severity of augmentation increases with level
return 256 - _solarize_level_to_arg(level, _hparams)[0],
def _solarize_add_level_to_arg(level, _hparams):
# range [0, 110]
return int((level / _MAX_LEVEL) * 110),
LEVEL_TO_ARG = {
'AutoContrast': None,
'Equalize': None,
'Invert': None,
'Rotate': _rotate_level_to_arg,
# There are several variations of the posterize level scaling in various Tensorflow/Google repositories/papers
'Posterize': _posterize_level_to_arg,
'PosterizeIncreasing': _posterize_increasing_level_to_arg,
'PosterizeOriginal': _posterize_original_level_to_arg,
'Solarize': _solarize_level_to_arg,
'SolarizeIncreasing': _solarize_increasing_level_to_arg,
'SolarizeAdd': _solarize_add_level_to_arg,
'Color': _enhance_level_to_arg,
'ColorIncreasing': _enhance_increasing_level_to_arg,
'Contrast': _enhance_level_to_arg,
'ContrastIncreasing': _enhance_increasing_level_to_arg,
'Brightness': _enhance_level_to_arg,
'BrightnessIncreasing': _enhance_increasing_level_to_arg,
'Sharpness': _enhance_level_to_arg,
'SharpnessIncreasing': _enhance_increasing_level_to_arg,
'ShearX': _shear_level_to_arg,
'ShearY': _shear_level_to_arg,
'TranslateX': _translate_abs_level_to_arg,
'TranslateY': _translate_abs_level_to_arg,
'TranslateXRel': _translate_rel_level_to_arg,
'TranslateYRel': _translate_rel_level_to_arg,
}
NAME_TO_OP = {
'AutoContrast': auto_contrast,
'Equalize': equalize,
'Invert': invert,
'Rotate': rotate,
'Posterize': posterize,
'PosterizeIncreasing': posterize,
'PosterizeOriginal': posterize,
'Solarize': solarize,
'SolarizeIncreasing': solarize,
'SolarizeAdd': solarize_add,
'Color': color,
'ColorIncreasing': color,
'Contrast': contrast,
'ContrastIncreasing': contrast,
'Brightness': brightness,
'BrightnessIncreasing': brightness,
'Sharpness': sharpness,
'SharpnessIncreasing': sharpness,
'ShearX': shear_x,
'ShearY': shear_y,
'TranslateX': translate_x_abs,
'TranslateY': translate_y_abs,
'TranslateXRel': translate_x_rel,
'TranslateYRel': translate_y_rel,
}
class AugmentOp(object):
def __init__(self, name, prob=0.5, magnitude=10, hparams=None):
hparams = hparams or _HPARAMS_DEFAULT
self.aug_fn = NAME_TO_OP[name]
self.level_fn = LEVEL_TO_ARG[name]
self.prob = prob
self.magnitude = magnitude
self.hparams = hparams.copy()
self.kwargs = dict(
fillcolor=hparams['img_mean'] if 'img_mean' in hparams else _FILL,
resample=hparams['interpolation']
if 'interpolation' in hparams else _RANDOM_INTERPOLATION, )
# If magnitude_std is > 0, we introduce some randomness
# in the usually fixed policy and sample magnitude from a normal distribution
# with mean `magnitude` and std-dev of `magnitude_std`.
# NOTE This is my own hack, being tested, not in papers or reference impls.
self.magnitude_std = self.hparams.get('magnitude_std', 0)
def __call__(self, img):
if self.prob < 1.0 and random.random() > self.prob:
return img
magnitude = self.magnitude
if self.magnitude_std and self.magnitude_std > 0:
magnitude = random.gauss(magnitude, self.magnitude_std)
magnitude = min(_MAX_LEVEL, max(0, magnitude)) # clip to valid range
level_args = self.level_fn(
magnitude, self.hparams) if self.level_fn is not None else tuple()
return self.aug_fn(img, *level_args, **self.kwargs)
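For orientation (an illustrative snippet, not part of the file), a single op can be exercised on its own, here a rotation applied with probability 0.5; at magnitude 9 the rotation range is plus or minus 27 degrees:

from PIL import Image

op = AugmentOp('Rotate', prob=0.5, magnitude=9)
img = Image.new('RGB', (224, 224), (128, 128, 128))
img_out = op(img)  # PIL.Image in, PIL.Image out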
def auto_augment_policy_v0(hparams):
# ImageNet v0 policy from TPU EfficientNet impl, cannot find a paper reference.
policy = [
[('Equalize', 0.8, 1), ('ShearY', 0.8, 4)],
[('Color', 0.4, 9), ('Equalize', 0.6, 3)],
[('Color', 0.4, 1), ('Rotate', 0.6, 8)],
[('Solarize', 0.8, 3), ('Equalize', 0.4, 7)],
[('Solarize', 0.4, 2), ('Solarize', 0.6, 2)],
[('Color', 0.2, 0), ('Equalize', 0.8, 8)],
[('Equalize', 0.4, 8), ('SolarizeAdd', 0.8, 3)],
[('ShearX', 0.2, 9), ('Rotate', 0.6, 8)],
[('Color', 0.6, 1), ('Equalize', 1.0, 2)],
[('Invert', 0.4, 9), ('Rotate', 0.6, 0)],
[('Equalize', 1.0, 9), ('ShearY', 0.6, 3)],
[('Color', 0.4, 7), ('Equalize', 0.6, 0)],
[('Posterize', 0.4, 6), ('AutoContrast', 0.4, 7)],
[('Solarize', 0.6, 8), ('Color', 0.6, 9)],
[('Solarize', 0.2, 4), ('Rotate', 0.8, 9)],
[('Rotate', 1.0, 7), ('TranslateYRel', 0.8, 9)],
[('ShearX', 0.0, 0), ('Solarize', 0.8, 4)],
[('ShearY', 0.8, 0), ('Color', 0.6, 4)],
[('Color', 1.0, 0), ('Rotate', 0.6, 2)],
[('Equalize', 0.8, 4), ('Equalize', 0.0, 8)],
[('Equalize', 1.0, 4), ('AutoContrast', 0.6, 2)],
[('ShearY', 0.4, 7), ('SolarizeAdd', 0.6, 7)],
[('Posterize', 0.8, 2), ('Solarize', 0.6, 10)
], # This results in a black image with TPU posterize
[('Solarize', 0.6, 8), ('Equalize', 0.6, 1)],
[('Color', 0.8, 6), ('Rotate', 0.4, 5)],
]
pc = [[AugmentOp(*a, hparams=hparams) for a in sp] for sp in policy]
return pc
def auto_augment_policy_v0r(hparams):
# ImageNet v0 policy from TPU EfficientNet impl, with variation of Posterize used
# in Google research implementation (number of bits discarded increases with magnitude)
policy = [
[('Equalize', 0.8, 1), ('ShearY', 0.8, 4)],
[('Color', 0.4, 9), ('Equalize', 0.6, 3)],
[('Color', 0.4, 1), ('Rotate', 0.6, 8)],
[('Solarize', 0.8, 3), ('Equalize', 0.4, 7)],
[('Solarize', 0.4, 2), ('Solarize', 0.6, 2)],
[('Color', 0.2, 0), ('Equalize', 0.8, 8)],
[('Equalize', 0.4, 8), ('SolarizeAdd', 0.8, 3)],
[('ShearX', 0.2, 9), ('Rotate', 0.6, 8)],
[('Color', 0.6, 1), ('Equalize', 1.0, 2)],
[('Invert', 0.4, 9), ('Rotate', 0.6, 0)],
[('Equalize', 1.0, 9), ('ShearY', 0.6, 3)],
[('Color', 0.4, 7), ('Equalize', 0.6, 0)],
[('PosterizeIncreasing', 0.4, 6), ('AutoContrast', 0.4, 7)],
[('Solarize', 0.6, 8), ('Color', 0.6, 9)],
[('Solarize', 0.2, 4), ('Rotate', 0.8, 9)],
[('Rotate', 1.0, 7), ('TranslateYRel', 0.8, 9)],
[('ShearX', 0.0, 0), ('Solarize', 0.8, 4)],
[('ShearY', 0.8, 0), ('Color', 0.6, 4)],
[('Color', 1.0, 0), ('Rotate', 0.6, 2)],
[('Equalize', 0.8, 4), ('Equalize', 0.0, 8)],
[('Equalize', 1.0, 4), ('AutoContrast', 0.6, 2)],
[('ShearY', 0.4, 7), ('SolarizeAdd', 0.6, 7)],
[('PosterizeIncreasing', 0.8, 2), ('Solarize', 0.6, 10)],
[('Solarize', 0.6, 8), ('Equalize', 0.6, 1)],
[('Color', 0.8, 6), ('Rotate', 0.4, 5)],
]
pc = [[AugmentOp(*a, hparams=hparams) for a in sp] for sp in policy]
return pc
def auto_augment_policy_original(hparams):
# ImageNet policy from https://arxiv.org/abs/1805.09501
policy = [
[('PosterizeOriginal', 0.4, 8), ('Rotate', 0.6, 9)],
[('Solarize', 0.6, 5), ('AutoContrast', 0.6, 5)],
[('Equalize', 0.8, 8), ('Equalize', 0.6, 3)],
[('PosterizeOriginal', 0.6, 7), ('PosterizeOriginal', 0.6, 6)],
[('Equalize', 0.4, 7), ('Solarize', 0.2, 4)],
[('Equalize', 0.4, 4), ('Rotate', 0.8, 8)],
[('Solarize', 0.6, 3), ('Equalize', 0.6, 7)],
[('PosterizeOriginal', 0.8, 5), ('Equalize', 1.0, 2)],
[('Rotate', 0.2, 3), ('Solarize', 0.6, 8)],
[('Equalize', 0.6, 8), ('PosterizeOriginal', 0.4, 6)],
[('Rotate', 0.8, 8), ('Color', 0.4, 0)],
[('Rotate', 0.4, 9), ('Equalize', 0.6, 2)],
[('Equalize', 0.0, 7), ('Equalize', 0.8, 8)],
[('Invert', 0.6, 4), ('Equalize', 1.0, 8)],
[('Color', 0.6, 4), ('Contrast', 1.0, 8)],
[('Rotate', 0.8, 8), ('Color', 1.0, 2)],
[('Color', 0.8, 8), ('Solarize', 0.8, 7)],
[('Sharpness', 0.4, 7), ('Invert', 0.6, 8)],
[('ShearX', 0.6, 5), ('Equalize', 1.0, 9)],
[('Color', 0.4, 0), ('Equalize', 0.6, 3)],
[('Equalize', 0.4, 7), ('Solarize', 0.2, 4)],
[('Solarize', 0.6, 5), ('AutoContrast', 0.6, 5)],
[('Invert', 0.6, 4), ('Equalize', 1.0, 8)],
[('Color', 0.6, 4), ('Contrast', 1.0, 8)],
[('Equalize', 0.8, 8), ('Equalize', 0.6, 3)],
]
pc = [[AugmentOp(*a, hparams=hparams) for a in sp] for sp in policy]
return pc
def auto_augment_policy_originalr(hparams):
# ImageNet policy from https://arxiv.org/abs/1805.09501 with research posterize variation
policy = [
[('PosterizeIncreasing', 0.4, 8), ('Rotate', 0.6, 9)],
[('Solarize', 0.6, 5), ('AutoContrast', 0.6, 5)],
[('Equalize', 0.8, 8), ('Equalize', 0.6, 3)],
[('PosterizeIncreasing', 0.6, 7), ('PosterizeIncreasing', 0.6, 6)],
[('Equalize', 0.4, 7), ('Solarize', 0.2, 4)],
[('Equalize', 0.4, 4), ('Rotate', 0.8, 8)],
[('Solarize', 0.6, 3), ('Equalize', 0.6, 7)],
[('PosterizeIncreasing', 0.8, 5), ('Equalize', 1.0, 2)],
[('Rotate', 0.2, 3), ('Solarize', 0.6, 8)],
[('Equalize', 0.6, 8), ('PosterizeIncreasing', 0.4, 6)],
[('Rotate', 0.8, 8), ('Color', 0.4, 0)],
[('Rotate', 0.4, 9), ('Equalize', 0.6, 2)],
[('Equalize', 0.0, 7), ('Equalize', 0.8, 8)],
[('Invert', 0.6, 4), ('Equalize', 1.0, 8)],
[('Color', 0.6, 4), ('Contrast', 1.0, 8)],
[('Rotate', 0.8, 8), ('Color', 1.0, 2)],
[('Color', 0.8, 8), ('Solarize', 0.8, 7)],
[('Sharpness', 0.4, 7), ('Invert', 0.6, 8)],
[('ShearX', 0.6, 5), ('Equalize', 1.0, 9)],
[('Color', 0.4, 0), ('Equalize', 0.6, 3)],
[('Equalize', 0.4, 7), ('Solarize', 0.2, 4)],
[('Solarize', 0.6, 5), ('AutoContrast', 0.6, 5)],
[('Invert', 0.6, 4), ('Equalize', 1.0, 8)],
[('Color', 0.6, 4), ('Contrast', 1.0, 8)],
[('Equalize', 0.8, 8), ('Equalize', 0.6, 3)],
]
pc = [[AugmentOp(*a, hparams=hparams) for a in sp] for sp in policy]
return pc
def auto_augment_policy(name='v0', hparams=None):
hparams = hparams or _HPARAMS_DEFAULT
if name == 'original':
return auto_augment_policy_original(hparams)
elif name == 'originalr':
return auto_augment_policy_originalr(hparams)
elif name == 'v0':
return auto_augment_policy_v0(hparams)
elif name == 'v0r':
return auto_augment_policy_v0r(hparams)
else:
assert False, 'Unknown AA policy (%s)' % name
class AutoAugment(object):
def __init__(self, policy):
self.policy = policy
def __call__(self, img):
sub_policy = random.choice(self.policy)
for op in sub_policy:
img = op(img)
return img
def auto_augment_transform(config_str, hparams):
"""
Create a AutoAugment transform
:param config_str: String defining configuration of auto augmentation. Consists of multiple sections separated by
dashes ('-'). The first section defines the AutoAugment policy (one of 'v0', 'v0r', 'original', 'originalr').
The remaining sections, not order specific, determine
'mstd' - float std deviation of magnitude noise applied
Ex 'original-mstd0.5' results in AutoAugment with original policy, magnitude_std 0.5
:param hparams: Other hparams (kwargs) for the AutoAugmentation scheme
:return: A callable Transform Op
"""
config = config_str.split('-')
policy_name = config[0]
config = config[1:]
for c in config:
cs = re.split(r'(\d.*)', c)
if len(cs) < 2:
continue
key, val = cs[:2]
if key == 'mstd':
# noise param injected via hparams for now
hparams.setdefault('magnitude_std', float(val))
else:
assert False, 'Unknown AutoAugment config section'
aa_policy = auto_augment_policy(policy_name, hparams=hparams)
return AutoAugment(aa_policy)
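For example (illustrative, not part of the file), the following builds AutoAugment with the original policy and magnitude noise of 0.5, exactly as described in the docstring above:

from PIL import Image

hparams = dict(_HPARAMS_DEFAULT)  # translate_const=250, img_mean=(128, 128, 128)
aa = auto_augment_transform('original-mstd0.5', hparams)
img_out = aa(Image.new('RGB', (224, 224)))  # applies one randomly chosen sub-policy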
_RAND_TRANSFORMS = [
'AutoContrast',
'Equalize',
'Invert',
'Rotate',
'Posterize',
'Solarize',
'SolarizeAdd',
'Color',
'Contrast',
'Brightness',
'Sharpness',
'ShearX',
'ShearY',
'TranslateXRel',
'TranslateYRel',
#'Cutout' # NOTE I've implemented this as random erasing separately
]
_RAND_INCREASING_TRANSFORMS = [
'AutoContrast',
'Equalize',
'Invert',
'Rotate',
'PosterizeIncreasing',
'SolarizeIncreasing',
'SolarizeAdd',
'ColorIncreasing',
'ContrastIncreasing',
'BrightnessIncreasing',
'SharpnessIncreasing',
'ShearX',
'ShearY',
'TranslateXRel',
'TranslateYRel',
#'Cutout' # NOTE I've implemented this as random erasing separately
]
# These experimental weights are based loosely on the relative improvements mentioned in the paper.
# They may not result in increased performance, but could likely be tuned to do so.
_RAND_CHOICE_WEIGHTS_0 = {
'Rotate': 0.3,
'ShearX': 0.2,
'ShearY': 0.2,
'TranslateXRel': 0.1,
'TranslateYRel': 0.1,
'Color': .025,
'Sharpness': 0.025,
'AutoContrast': 0.025,
'Solarize': .005,
'SolarizeAdd': .005,
'Contrast': .005,
'Brightness': .005,
'Equalize': .005,
'Posterize': 0,
'Invert': 0,
}
def _select_rand_weights(weight_idx=0, transforms=None):
transforms = transforms or _RAND_TRANSFORMS
assert weight_idx == 0 # only one set of weights currently
rand_weights = _RAND_CHOICE_WEIGHTS_0
probs = [rand_weights[k] for k in transforms]
probs /= np.sum(probs)
return probs
def rand_augment_ops(magnitude=10, hparams=None, transforms=None):
hparams = hparams or _HPARAMS_DEFAULT
transforms = transforms or _RAND_TRANSFORMS
return [
AugmentOp(
name, prob=0.5, magnitude=magnitude, hparams=hparams)
for name in transforms
]
class RandAugment(object):
def __init__(self, ops, num_layers=2, choice_weights=None):
self.ops = ops
self.num_layers = num_layers
self.choice_weights = choice_weights
def __call__(self, img):
# no replacement when using weighted choice
ops = np.random.choice(
self.ops,
self.num_layers,
replace=self.choice_weights is None,
p=self.choice_weights)
for op in ops:
img = op(img)
return img
def rand_augment_transform(config_str, hparams):
"""
Create a RandAugment transform
:param config_str: String defining configuration of random augmentation. Consists of multiple sections separated by
dashes ('-'). The first section defines the specific variant of rand augment (currently only 'rand'). The remaining
sections, not order specific, determine
'm' - integer magnitude of rand augment
'n' - integer num layers (number of transform ops selected per image)
'w' - integer probability weight index (index of a set of weights to influence choice of op)
'mstd' - float std deviation of magnitude noise applied
'inc' - integer (bool), use augmentations that increase in severity with magnitude (default: 0)
Ex 'rand-m9-n3-mstd0.5' results in RandAugment with magnitude 9, num_layers 3, magnitude_std 0.5
'rand-mstd1-w0' results in magnitude_std 1.0, weights 0, default magnitude of 10 and num_layers 2
:param hparams: Other hparams (kwargs) for the RandAugmentation scheme
:return: A callable Transform Op
"""
magnitude = _MAX_LEVEL # default to _MAX_LEVEL for magnitude (currently 10)
num_layers = 2 # default to 2 ops per image
weight_idx = None # default to no probability weights for op choice
transforms = _RAND_TRANSFORMS
config = config_str.split('-')
assert config[0] == 'rand'
config = config[1:]
for c in config:
cs = re.split(r'(\d.*)', c)
if len(cs) < 2:
continue
key, val = cs[:2]
if key == 'mstd':
# noise param injected via hparams for now
hparams.setdefault('magnitude_std', float(val))
elif key == 'inc':
if bool(val):
transforms = _RAND_INCREASING_TRANSFORMS
elif key == 'm':
magnitude = int(val)
elif key == 'n':
num_layers = int(val)
elif key == 'w':
weight_idx = int(val)
else:
assert False, 'Unknown RandAugment config section'
ra_ops = rand_augment_ops(
magnitude=magnitude, hparams=hparams, transforms=transforms)
choice_weights = None if weight_idx is None else _select_rand_weights(
weight_idx)
return RandAugment(ra_ops, num_layers, choice_weights=choice_weights)
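For reference, a minimal usage sketch of the config-string API described above (the image path and hparams values are illustrative, not taken from this repo):

from PIL import Image

aa_params = dict(translate_const=int(224 * 0.45), img_mean=(124, 116, 104))
transform = rand_augment_transform("rand-m9-mstd0.5-inc1", aa_params)
img = Image.open("demo.jpg").convert("RGB")  # hypothetical sample image
augmented = transform(img)                   # returns an augmented PIL.Image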
_AUGMIX_TRANSFORMS = [
'AutoContrast',
'ColorIncreasing', # not in paper
'ContrastIncreasing', # not in paper
'BrightnessIncreasing', # not in paper
'SharpnessIncreasing', # not in paper
'Equalize',
'Rotate',
'PosterizeIncreasing',
'SolarizeIncreasing',
'ShearX',
'ShearY',
'TranslateXRel',
'TranslateYRel',
]
def augmix_ops(magnitude=10, hparams=None, transforms=None):
hparams = hparams or _HPARAMS_DEFAULT
transforms = transforms or _AUGMIX_TRANSFORMS
return [
AugmentOp(
name, prob=1.0, magnitude=magnitude, hparams=hparams)
for name in transforms
]
class AugMixAugment(object):
""" AugMix Transform
Adapted and improved from impl here: https://github.com/google-research/augmix/blob/master/imagenet.py
From paper: 'AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty -
https://arxiv.org/abs/1912.02781
"""
def __init__(self, ops, alpha=1., width=3, depth=-1, blended=False):
self.ops = ops
self.alpha = alpha
self.width = width
self.depth = depth
self.blended = blended # blended mode is faster but not well tested
def _calc_blended_weights(self, ws, m):
ws = ws * m
cump = 1.
rws = []
for w in ws[::-1]:
alpha = w / cump
cump *= (1 - alpha)
rws.append(alpha)
return np.array(rws[::-1], dtype=np.float32)
def _apply_blended(self, img, mixing_weights, m):
# This is my first crack at implementing a slightly faster mixed augmentation. Instead
# of accumulating the mix for each chain in a Numpy array and then blending with the original,
# it recomputes the blending coefficients and applies one PIL image blend per chain.
# TODO the results appear in the right ballpark, but they differ by more than rounding.
img_orig = img.copy()
ws = self._calc_blended_weights(mixing_weights, m)
for w in ws:
depth = self.depth if self.depth > 0 else np.random.randint(1, 4)
ops = np.random.choice(self.ops, depth, replace=True)
img_aug = img_orig # no ops are in-place, deep copy not necessary
for op in ops:
img_aug = op(img_aug)
img = Image.blend(img, img_aug, w)
return img
def _apply_basic(self, img, mixing_weights, m):
# This is a literal adaptation of the paper/official implementation, without normalizations and
# PIL <-> Numpy conversions between every op. It is still quite CPU-compute-heavy compared to
# typical augmentation transforms; it could use a GPU / Kornia implementation.
img_shape = img.size[0], img.size[1], len(img.getbands())
mixed = np.zeros(img_shape, dtype=np.float32)
for mw in mixing_weights:
depth = self.depth if self.depth > 0 else np.random.randint(1, 4)
ops = np.random.choice(self.ops, depth, replace=True)
img_aug = img # no ops are in-place, deep copy not necessary
for op in ops:
img_aug = op(img_aug)
mixed += mw * np.asarray(img_aug, dtype=np.float32)
np.clip(mixed, 0, 255., out=mixed)
mixed = Image.fromarray(mixed.astype(np.uint8))
return Image.blend(img, mixed, m)
def __call__(self, img):
mixing_weights = np.float32(
np.random.dirichlet([self.alpha] * self.width))
m = np.float32(np.random.beta(self.alpha, self.alpha))
if self.blended:
mixed = self._apply_blended(img, mixing_weights, m)
else:
mixed = self._apply_basic(img, mixing_weights, m)
return mixed
def augment_and_mix_transform(config_str, hparams):
""" Create AugMix transform
:param config_str: String defining configuration of random augmentation. Consists of multiple sections separated by
    dashes ('-'). The first section defines the variant of the method (currently only 'augmix'). The remaining
    sections, which are not order-specific, determine
        'm' - integer magnitude (severity) of augmentation mix (default: 3)
        'w' - integer width of augmentation chain (default: 3)
        'd' - integer depth of augmentation chain (-1 is random [1, 3], default: -1)
        'b' - integer (bool), blend each branch of chain into end result without a final blend, less CPU (default: 0)
        'mstd' - float std deviation of magnitude noise applied (default: 0)
    Ex 'augmix-m5-w4-d2' results in AugMix with severity 5, chain width 4, chain depth 2
:param hparams: Other hparams (kwargs) for the Augmentation transforms
:return: A callable Transform Op
"""
magnitude = 3
width = 3
depth = -1
alpha = 1.
blended = False
config = config_str.split('-')
assert config[0] == 'augmix'
config = config[1:]
for c in config:
cs = re.split(r'(\d.*)', c)
if len(cs) < 2:
continue
key, val = cs[:2]
if key == 'mstd':
# noise param injected via hparams for now
hparams.setdefault('magnitude_std', float(val))
elif key == 'm':
magnitude = int(val)
elif key == 'w':
width = int(val)
elif key == 'd':
depth = int(val)
elif key == 'a':
alpha = float(val)
elif key == 'b':
blended = bool(val)
else:
assert False, 'Unknown AugMix config section'
ops = augmix_ops(magnitude=magnitude, hparams=hparams)
return AugMixAugment(
ops, alpha=alpha, width=width, depth=depth, blended=blended)
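Similarly, a hedged AugMix usage sketch following the config grammar above (image path and hparams values are illustrative):

from PIL import Image

aa_params = dict(translate_const=int(224 * 0.45), img_mean=(124, 116, 104))
aa_params["translate_pct"] = 0.3  # AugMix uses relative translation
augmix = augment_and_mix_transform("augmix-m5-w3-d2", aa_params)
img = Image.open("demo.jpg").convert("RGB")  # hypothetical sample image
mixed = augmix(img)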
class RawTimmAutoAugment(object):
"""TimmAutoAugment API for PaddleClas."""
def __init__(self,
config_str="rand-m9-mstd0.5-inc1",
interpolation="bicubic",
img_size=224,
mean=IMAGENET_DEFAULT_MEAN):
if isinstance(img_size, (tuple, list)):
img_size_min = min(img_size)
else:
img_size_min = img_size
aa_params = dict(
translate_const=int(img_size_min * 0.45),
img_mean=tuple([min(255, round(255 * x)) for x in mean]), )
if interpolation and interpolation != 'random':
aa_params['interpolation'] = _pil_interp(interpolation)
if config_str.startswith('rand'):
self.augment_func = rand_augment_transform(config_str, aa_params)
elif config_str.startswith('augmix'):
aa_params['translate_pct'] = 0.3
self.augment_func = augment_and_mix_transform(config_str,
aa_params)
elif config_str.startswith('auto'):
self.augment_func = auto_augment_transform(config_str, aa_params)
else:
raise Exception(
"ConfigError: The TimmAutoAugment Op only support RandAugment, AutoAugment, AugMix, and the config_str only starts with \"rand\", \"augmix\", \"auto\"."
)
def __call__(self, img):
return self.augment_func(img)
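Putting it together, the wrapper class can be exercised directly; a sketch, assuming a local sample image (the file name is hypothetical):

from PIL import Image

aug = RawTimmAutoAugment(config_str="rand-m7-mstd0.5", img_size=224)
img = Image.open("demo.jpg").convert("RGB")
img = aug(img)  # augmented PIL.Image, ready for the rest of the pipeline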
@@ -20,6 +20,8 @@ import paddle
import paddle.distributed as dist
from visualdl import LogWriter
from paddle import nn
import numpy as np
import random
from ppcls.utils.check import check_gpu
from ppcls.utils.misc import AverageMeter
@@ -42,6 +44,7 @@ from ppcls.data import create_operators
from ppcls.engine.train import train_epoch
from ppcls.engine import evaluation
from ppcls.arch.gears.identity_head import IdentityHead
from ppcls.engine.slim import get_pruner, get_quaner
class Engine(object):
@@ -56,6 +59,14 @@ class Engine(object):
        else:
            self.is_rec = False

        # set seed
        seed = self.config["Global"].get("seed", False)
        if seed:
            assert isinstance(seed, int), "The 'seed' must be an integer!"
            paddle.seed(seed)
            np.random.seed(seed)
            random.seed(seed)

        # init logger
        self.output_dir = self.config['Global']['output_dir']
        log_file = os.path.join(self.output_dir, self.config["Arch"]["name"],
@@ -182,12 +193,14 @@ class Engine(object):
                self.model, self.config["Global"]["pretrained_model"])

        # for slim
        self.pruner = get_pruner(self.config, self.model)
        self.quanter = get_quaner(self.config, self.model)

        # build optimizer
        if self.mode == 'train':
            self.optimizer, self.lr_sch = build_optimizer(
                self.config["Optimizer"], self.config["Global"]["epochs"],
                len(self.train_dataloader), [self.model])

        # for distributed
        self.config["Global"][
@@ -333,6 +346,8 @@ class Engine(object):
            out = self.model(batch_tensor)
            if isinstance(out, list):
                out = out[0]
            if isinstance(out, dict):
                out = out["output"]
            result = self.postprocess_func(out, image_file_list)
            print(result)
            batch_data.clear()
@@ -346,18 +361,26 @@ class Engine(object):
            self.config["Global"]["pretrained_model"])
        model.eval()

        save_path = os.path.join(self.config["Global"]["save_inference_dir"],
                                 "inference")
        if self.quanter:
            self.quanter.save_quantized_model(
                model,
                save_path,
                input_spec=[
                    paddle.static.InputSpec(
                        shape=[None] + self.config["Global"]["image_shape"],
                        dtype='float32')
                ])
        else:
            model = paddle.jit.to_static(
                model,
                input_spec=[
                    paddle.static.InputSpec(
                        shape=[None] + self.config["Global"]["image_shape"],
                        dtype='float32')
                ])
            paddle.jit.save(model, save_path)


class ExportModel(nn.Layer):
...
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from ppcls.engine.slim.prune import get_pruner
from ppcls.engine.slim.quant import get_quaner
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import, division, print_function
import paddle
from ppcls.utils import logger
def get_pruner(config, model):
if config.get("Slim", False) and config["Slim"].get("prune", False):
import paddleslim
prune_method_name = config["Slim"]["prune"]["name"].lower()
assert prune_method_name in [
"fpgm", "l1_norm"
], "The prune methods only support 'fpgm' and 'l1_norm'"
if prune_method_name == "fpgm":
pruner = paddleslim.dygraph.FPGMFilterPruner(
model, [1] + config["Global"]["image_shape"])
else:
pruner = paddleslim.dygraph.L1NormFilterPruner(
model, [1] + config["Global"]["image_shape"])
# prune model
_prune_model(pruner, config, model)
else:
pruner = None
return pruner
def _prune_model(pruner, config, model):
from paddleslim.analysis import dygraph_flops as flops
logger.info("FLOPs before pruning: {}GFLOPs".format(
flops(model, [1] + config["Global"]["image_shape"]) / 1e9))
model.eval()
params = []
for sublayer in model.sublayers():
for param in sublayer.parameters(include_sublayers=False):
if isinstance(sublayer, paddle.nn.Conv2D):
params.append(param.name)
ratios = {}
for param in params:
ratios[param] = config["Slim"]["prune"]["pruned_ratio"]
plan = pruner.prune_vars(ratios, [0])
logger.info("FLOPs after pruning: {}GFLOPs; pruned ratio: {}".format(
flops(model, [1] + config["Global"]["image_shape"]) / 1e9,
plan.pruned_flops))
for param in model.parameters():
if "conv2d" in param.name:
logger.info("{}\t{}".format(param.name, param.shape))
model.train()
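As a usage sketch, the config fragment below (written as the parsed Python dict get_pruner receives; the model choice and pruned ratio are illustrative, and paddleslim must be installed):

import paddle

config = {
    "Global": {"image_shape": [3, 224, 224]},
    "Slim": {"prune": {"name": "fpgm", "pruned_ratio": 0.3}},
}
model = paddle.vision.models.mobilenet_v1()  # stand-in network for the sketch
pruner = get_pruner(config, model)  # prunes 30% of the conv filters in place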
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import, division, print_function
import paddle
from ppcls.utils import logger
QUANT_CONFIG = {
# weight preprocess type, default is None and no preprocessing is performed.
'weight_preprocess_type': None,
# activation preprocess type, default is None and no preprocessing is performed.
'activation_preprocess_type': None,
# weight quantize type, default is 'channel_wise_abs_max'
'weight_quantize_type': 'channel_wise_abs_max',
# activation quantize type, default is 'moving_average_abs_max'
'activation_quantize_type': 'moving_average_abs_max',
# weight quantize bit num, default is 8
'weight_bits': 8,
# activation quantize bit num, default is 8
'activation_bits': 8,
# data type after quantization, such as 'uint8', 'int8', etc. default is 'int8'
'dtype': 'int8',
# window size for 'range_abs_max' quantization. default is 10000
'window_size': 10000,
# The decay coefficient of moving average, default is 0.9
'moving_rate': 0.9,
# for dygraph quantization, layers of type in quantizable_layer_type will be quantized
'quantizable_layer_type': ['Conv2D', 'Linear'],
}
def get_quaner(config, model):
if config.get("Slim", False) and config["Slim"].get("quant", False):
from paddleslim.dygraph.quant import QAT
assert config["Slim"]["quant"]["name"].lower(
) == 'pact', 'Only PACT quantization method is supported now'
QUANT_CONFIG["activation_preprocess_type"] = "PACT"
quanter = QAT(config=QUANT_CONFIG)
quanter.quantize(model)
logger.info("QAT model summary:")
paddle.summary(model, (1, 3, 224, 224))
else:
quanter = None
return quanter
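A matching sketch for PACT quantization-aware training (again expressed as the parsed config dict; the model is a stand-in and paddleslim is required):

import paddle

config = {
    "Global": {"image_shape": [3, 224, 224]},
    "Slim": {"quant": {"name": "pact"}},
}
model = paddle.vision.models.mobilenet_v1()  # stand-in network for the sketch
quanter = get_quaner(config, model)  # wraps Conv2D/Linear layers for QAT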
@@ -29,7 +29,7 @@ class CELoss(nn.Layer):
        self.epsilon = epsilon

    def _labelsmoothing(self, target, class_num):
        if len(target.shape) == 1 or target.shape[-1] != class_num:
            one_hot_target = F.one_hot(target, class_num)
        else:
            one_hot_target = target
...
@@ -41,19 +41,22 @@ def build_lr_scheduler(lr_config, epochs, step_each_epoch):
    return lr


def build_optimizer(config, epochs, step_each_epoch, model_list):
    config = copy.deepcopy(config)
    # step1 build lr
    lr = build_lr_scheduler(config.pop('lr'), epochs, step_each_epoch)
    logger.debug("build lr ({}) success..".format(lr))
    # step2 build regularization
    if 'regularizer' in config and config['regularizer'] is not None:
        if 'weight_decay' in config:
            logger.warning(
                "ConfigError: Only one of regularizer and weight_decay can be set in Optimizer Config. \"weight_decay\" has been ignored."
            )
        reg_config = config.pop('regularizer')
        reg_name = reg_config.pop('name') + 'Decay'
        reg = getattr(paddle.regularizer, reg_name)(**reg_config)
        config["weight_decay"] = reg
        logger.debug("build regularizer ({}) success..".format(reg))
    # step3 build optimizer
    optim_name = config.pop('name')
    if 'clip_norm' in config:
@@ -62,8 +65,7 @@ def build_optimizer(config, epochs, step_each_epoch, model_list):
    else:
        grad_clip = None
    optim = getattr(optimizer, optim_name)(learning_rate=lr,
                                           grad_clip=grad_clip,
                                           **config)(model_list=model_list)
    logger.debug("build optimizer ({}) success..".format(optim))
    return optim, lr
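For orientation, a hedged sketch of the dict this function consumes, mirroring a typical Optimizer section of a PaddleClas YAML after parsing (the model and all values are illustrative):

import paddle

model = paddle.vision.models.resnet18()  # stand-in network for the sketch
optim_config = {
    "name": "Momentum",
    "momentum": 0.9,
    "lr": {"name": "Cosine", "learning_rate": 0.1, "warmup_epoch": 5},
    "regularizer": {"name": "L2", "coeff": 1e-4},
}
optimizer, lr_sch = build_optimizer(
    optim_config, epochs=120, step_each_epoch=500, model_list=[model])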
@@ -26,6 +26,8 @@ class Linear(object):
        epochs(int): The decay step size. It determines the decay cycle.
        end_lr(float, optional): The minimum final learning rate. Default: 0.0001.
        power(float, optional): Power of polynomial. Default: 1.0.
        warmup_epoch(int): The epoch numbers for LinearWarmup. Default: 0.
        warmup_start_lr(float): Initial learning rate of warm up. Default: 0.0.
        last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
    """
@@ -36,28 +38,30 @@ class Linear(object):
                 end_lr=0.0,
                 power=1.0,
                 warmup_epoch=0,
                 warmup_start_lr=0.0,
                 last_epoch=-1,
                 **kwargs):
        super(Linear, self).__init__()
        self.learning_rate = learning_rate
        self.steps = (epochs - warmup_epoch) * step_each_epoch
        self.end_lr = end_lr
        self.power = power
        self.last_epoch = last_epoch
        self.warmup_steps = round(warmup_epoch * step_each_epoch)
        self.warmup_start_lr = warmup_start_lr

    def __call__(self):
        learning_rate = lr.PolynomialDecay(
            learning_rate=self.learning_rate,
            decay_steps=self.steps,
            end_lr=self.end_lr,
            power=self.power,
            last_epoch=self.last_epoch)
        if self.warmup_steps > 0:
            learning_rate = lr.LinearWarmup(
                learning_rate=learning_rate,
                warmup_steps=self.warmup_steps,
                start_lr=self.warmup_start_lr,
                end_lr=self.learning_rate,
                last_epoch=self.last_epoch)
        return learning_rate
@@ -71,6 +75,9 @@ class Cosine(object):
        lr(float): initial learning rate
        step_each_epoch(int): steps each epoch
        epochs(int): total training epochs
        eta_min(float): Minimum learning rate. Default: 0.0.
        warmup_epoch(int): The epoch numbers for LinearWarmup. Default: 0.
        warmup_start_lr(float): Initial learning rate of warm up. Default: 0.0.
        last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
    """
@@ -78,25 +85,30 @@ class Cosine(object):
                 learning_rate,
                 step_each_epoch,
                 epochs,
                 eta_min=0.0,
                 warmup_epoch=0,
                 warmup_start_lr=0.0,
                 last_epoch=-1,
                 **kwargs):
        super(Cosine, self).__init__()
        self.learning_rate = learning_rate
        self.T_max = (epochs - warmup_epoch) * step_each_epoch
        self.eta_min = eta_min
        self.last_epoch = last_epoch
        self.warmup_steps = round(warmup_epoch * step_each_epoch)
        self.warmup_start_lr = warmup_start_lr

    def __call__(self):
        learning_rate = lr.CosineAnnealingDecay(
            learning_rate=self.learning_rate,
            T_max=self.T_max,
            eta_min=self.eta_min,
            last_epoch=self.last_epoch)
        if self.warmup_steps > 0:
            learning_rate = lr.LinearWarmup(
                learning_rate=learning_rate,
                warmup_steps=self.warmup_steps,
                start_lr=self.warmup_start_lr,
                end_lr=self.learning_rate,
                last_epoch=self.last_epoch)
        return learning_rate
@@ -111,6 +123,8 @@ class Step(object):
        step_size (int): the interval to update.
        gamma (float, optional): The Ratio that the learning rate will be reduced. ``new_lr = origin_lr * gamma`` .
            It should be less than 1.0. Default: 0.1.
        warmup_epoch(int): The epoch numbers for LinearWarmup. Default: 0.
        warmup_start_lr(float): Initial learning rate of warm up. Default: 0.0.
        last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
    """
@@ -120,6 +134,7 @@ class Step(object):
                 step_each_epoch,
                 gamma,
                 warmup_epoch=0,
                 warmup_start_lr=0.0,
                 last_epoch=-1,
                 **kwargs):
        super(Step, self).__init__()
@@ -127,7 +142,8 @@ class Step(object):
        self.learning_rate = learning_rate
        self.gamma = gamma
        self.last_epoch = last_epoch
        self.warmup_steps = round(warmup_epoch * step_each_epoch)
        self.warmup_start_lr = warmup_start_lr

    def __call__(self):
        learning_rate = lr.StepDecay(
@@ -135,11 +151,11 @@ class Step(object):
            step_size=self.step_size,
            gamma=self.gamma,
            last_epoch=self.last_epoch)
        if self.warmup_steps > 0:
            learning_rate = lr.LinearWarmup(
                learning_rate=learning_rate,
                warmup_steps=self.warmup_steps,
                start_lr=self.warmup_start_lr,
                end_lr=self.learning_rate,
                last_epoch=self.last_epoch)
        return learning_rate
@@ -152,6 +168,8 @@ class Piecewise(object):
        boundaries(list): A list of steps numbers. The type of element in the list is python int.
        values(list): A list of learning rate values that will be picked during different epoch boundaries.
            The type of element in the list is python float.
        warmup_epoch(int): The epoch numbers for LinearWarmup. Default: 0.
        warmup_start_lr(float): Initial learning rate of warm up. Default: 0.0.
        last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
    """
@@ -160,24 +178,26 @@ class Piecewise(object):
                 decay_epochs,
                 values,
                 warmup_epoch=0,
                 warmup_start_lr=0.0,
                 last_epoch=-1,
                 **kwargs):
        super(Piecewise, self).__init__()
        self.boundaries = [step_each_epoch * e for e in decay_epochs]
        self.values = values
        self.last_epoch = last_epoch
        self.warmup_steps = round(warmup_epoch * step_each_epoch)
        self.warmup_start_lr = warmup_start_lr

    def __call__(self):
        learning_rate = lr.PiecewiseDecay(
            boundaries=self.boundaries,
            values=self.values,
            last_epoch=self.last_epoch)
        if self.warmup_steps > 0:
            learning_rate = lr.LinearWarmup(
                learning_rate=learning_rate,
                warmup_steps=self.warmup_steps,
                start_lr=self.warmup_start_lr,
                end_lr=self.values[0],
                last_epoch=self.last_epoch)
        return learning_rate
@@ -186,7 +206,7 @@ class Piecewise(object):

class MultiStepDecay(LRScheduler):
    """
    Update the learning rate by ``gamma`` once ``epoch`` reaches one of the milestones.
    The algorithm can be described as the code below.
    .. code-block:: text
        learning_rate = 0.5
        milestones = [30, 50]
@@ -200,15 +220,15 @@ class MultiStepDecay(LRScheduler):
    Args:
        learning_rate (float): The initial learning rate. It is a python float number.
        milestones (tuple|list): List or tuple of each boundaries. Must be increasing.
        gamma (float, optional): The Ratio that the learning rate will be reduced. ``new_lr = origin_lr * gamma`` .
            It should be less than 1.0. Default: 0.1.
        last_epoch (int, optional): The index of last epoch. Can be set to restart training. Default: -1, means initial learning rate.
        verbose (bool, optional): If ``True``, prints a message to stdout for each update. Default: ``False`` .
    Returns:
        ``MultiStepDecay`` instance to schedule learning rate.
    Examples:
        .. code-block:: python
            import paddle
            import numpy as np
...
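The net effect of the warmup additions above can be seen in a small sketch (values are illustrative; Cosine is the builder class defined earlier in this file):

sched = Cosine(learning_rate=0.1,
               step_each_epoch=500,
               epochs=120,
               warmup_epoch=5,
               warmup_start_lr=0.001)()
# LinearWarmup ramps the lr from 0.001 to 0.1 over 5 * 500 steps, then
# CosineAnnealingDecay runs for the remaining (120 - 5) * 500 steps.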
@@ -35,14 +35,15 @@ class Momentum(object):
                 weight_decay=None,
                 grad_clip=None,
                 multi_precision=False):
        super().__init__()
        self.learning_rate = learning_rate
        self.momentum = momentum
        self.weight_decay = weight_decay
        self.grad_clip = grad_clip
        self.multi_precision = multi_precision

    def __call__(self, model_list):
        parameters = sum([m.parameters() for m in model_list], [])
        opt = optim.Momentum(
            learning_rate=self.learning_rate,
            momentum=self.momentum,
@@ -77,7 +78,8 @@ class Adam(object):
        self.lazy_mode = lazy_mode
        self.multi_precision = multi_precision

    def __call__(self, model_list):
        parameters = sum([m.parameters() for m in model_list], [])
        opt = optim.Adam(
            learning_rate=self.learning_rate,
            beta1=self.beta1,
@@ -112,7 +114,7 @@ class RMSProp(object):
                 weight_decay=None,
                 grad_clip=None,
                 multi_precision=False):
        super().__init__()
        self.learning_rate = learning_rate
        self.momentum = momentum
        self.rho = rho
@@ -120,7 +122,8 @@ class RMSProp(object):
        self.weight_decay = weight_decay
        self.grad_clip = grad_clip

    def __call__(self, model_list):
        parameters = sum([m.parameters() for m in model_list], [])
        opt = optim.RMSProp(
            learning_rate=self.learning_rate,
            momentum=self.momentum,
@@ -130,3 +133,57 @@ class RMSProp(object):
            grad_clip=self.grad_clip,
            parameters=parameters)
        return opt
class AdamW(object):
def __init__(self,
learning_rate=0.001,
beta1=0.9,
beta2=0.999,
epsilon=1e-8,
weight_decay=None,
multi_precision=False,
grad_clip=None,
no_weight_decay_name=None,
one_dim_param_no_weight_decay=False,
**args):
super().__init__()
self.learning_rate = learning_rate
self.beta1 = beta1
self.beta2 = beta2
self.epsilon = epsilon
self.grad_clip = grad_clip
self.weight_decay = weight_decay
self.multi_precision = multi_precision
self.no_weight_decay_name_list = no_weight_decay_name.split(
) if no_weight_decay_name else []
self.one_dim_param_no_weight_decay = one_dim_param_no_weight_decay
def __call__(self, model_list):
parameters = sum([m.parameters() for m in model_list], [])
self.no_weight_decay_param_name_list = [
p.name for model in model_list for n, p in model.named_parameters()
if any(nd in n for nd in self.no_weight_decay_name_list)
]
if self.one_dim_param_no_weight_decay:
self.no_weight_decay_param_name_list += [
p.name for model in model_list
for n, p in model.named_parameters() if len(p.shape) == 1
]
opt = optim.AdamW(
learning_rate=self.learning_rate,
beta1=self.beta1,
beta2=self.beta2,
epsilon=self.epsilon,
parameters=parameters,
weight_decay=self.weight_decay,
multi_precision=self.multi_precision,
grad_clip=self.grad_clip,
apply_decay_param_fun=self._apply_decay_param_fun)
return opt
def _apply_decay_param_fun(self, name):
return name not in self.no_weight_decay_param_name_list
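A hedged usage sketch of the new AdamW builder (the model and the "bias norm" token list are illustrative, not prescribed by this repo):

import paddle

model = paddle.vision.models.resnet18()  # stand-in network for the sketch
builder = AdamW(
    learning_rate=1e-3,
    weight_decay=0.05,
    no_weight_decay_name="bias norm",
    one_dim_param_no_weight_decay=True)
opt = builder([model])  # decay is skipped for matching and 1-D parameters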
@@ -96,7 +96,6 @@ def create_fetchs(out,
    """
    fetchs = OrderedDict()
    # build loss
    if use_mix:
        y_a = paddle.reshape(feeds['y_a'], [-1, 1])
        y_b = paddle.reshape(feeds['y_b'], [-1, 1])
@@ -106,22 +105,14 @@ def create_fetchs(out,
    loss_func = build_loss(config["Loss"][mode])
    if use_mix:
        loss_dict = loss_func(out, [y_a, y_b, lam])
    else:
        loss_dict = loss_func(out, target)
    loss_out = loss_dict["loss"]
    fetchs['loss'] = (loss_out, AverageMeter('loss', '7.4f', need_avg=True))

    # build metric
    if not use_mix:
        metric_func = build_metrics(config["Metric"][mode])
...
@@ -8,4 +8,4 @@ visualdl >= 2.0.0b
scipy
scikit-learn==0.23.2
gast==0.3.3
faiss-cpu==1.7.1.post2
@@ -13,7 +13,7 @@ train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/DarkNet/DarkNet53.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
pact_train:null
fpgm_train:null
distill_train:null
...
@@ -13,7 +13,7 @@ train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/HRNet/HRNet_W18_C.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
pact_train:null
fpgm_train:null
distill_train:null
...
@@ -13,7 +13,7 @@ train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/LeViT/LeViT_128S.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
pact_train:null
fpgm_train:null
distill_train:null
...
@@ -13,7 +13,7 @@ train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/MobileNetV1/MobileNetV1.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
pact_train:null
fpgm_train:null
distill_train:null
...
@@ -13,7 +13,7 @@ train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/MobileNetV2/MobileNetV2.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
pact_train:null
fpgm_train:null
distill_train:null
...
@@ -12,10 +12,10 @@ train_model_name:latest
train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train|pact_train|fpgm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/MobileNetV3/MobileNetV3_large_x1_0.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
pact_train:tools/train.py -c ppcls/configs/slim/MobileNetV3_large_x1_0_quantization.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
fpgm_train:tools/train.py -c ppcls/configs/slim/MobileNetV3_large_x1_0_prune.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
distill_train:null
null:null
null:null
@@ -28,8 +28,8 @@ null:null
-o Global.save_inference_dir:./inference
-o Global.pretrained_model:
norm_export:tools/export_model.py -c ppcls/configs/ImageNet/MobileNetV3/MobileNetV3_large_x1_0.yaml
quant_export:tools/export_model.py -c ppcls/configs/slim/MobileNetV3_large_x1_0_quantization.yaml
fpgm_export:tools/export_model.py -c ppcls/configs/slim/MobileNetV3_large_x1_0_prune.yaml
distill_export:null
export1:null
export2:null
@@ -49,3 +49,10 @@ inference:python/predict_cls.py -c configs/inference_cls.yaml
-o Global.save_log_path:null
-o Global.benchmark:True
null:null
null:null
===========================cpp_infer_params===========================
use_gpu:0|1
cpu_threads:1|6
use_mkldnn:0|1
use_tensorrt:0|1
use_fp16:0|1
@@ -13,7 +13,7 @@ train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/ResNeXt/ResNeXt101_vd_64x4d.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
pact_train:null
fpgm_train:null
distill_train:null
...
@@ -12,10 +12,10 @@ train_model_name:latest
train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train|pact_train|fpgm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/ResNet/ResNet50_vd.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
pact_train:tools/train.py -c ppcls/configs/slim/ResNet50_vd_quantization.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
fpgm_train:tools/train.py -c ppcls/configs/slim/ResNet50_vd_prune.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
distill_train:null
null:null
null:null
@@ -28,8 +28,8 @@ null:null
-o Global.save_inference_dir:./inference
-o Global.pretrained_model:
norm_export:tools/export_model.py -c ppcls/configs/ImageNet/ResNet/ResNet50_vd.yaml
quant_export:tools/export_model.py -c ppcls/configs/slim/ResNet50_vd_quantization.yaml
fpgm_export:tools/export_model.py -c ppcls/configs/slim/ResNet50_vd_prune.yaml
distill_export:null
export1:null
export2:null
@@ -49,3 +49,10 @@ inference:python/predict_cls.py -c configs/inference_cls.yaml
-o Global.save_log_path:null
-o Global.benchmark:True
null:null
null:null
===========================cpp_infer_params===========================
use_gpu:0|1
cpu_threads:1|6
use_mkldnn:0|1
use_tensorrt:0|1
use_fp16:0|1
@@ -13,7 +13,7 @@ train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/ShuffleNet/ShuffleNetV2_x1_0.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
pact_train:null
fpgm_train:null
distill_train:null
...
@@ -13,7 +13,7 @@ train_infer_img_dir:./dataset/ILSVRC2012/val
null:null
##
trainer:norm_train
norm_train:tools/train.py -c ppcls/configs/ImageNet/SwinTransformer/SwinTransformer_tiny_patch4_window7_224.yaml -o Global.seed=1234 -o DataLoader.Train.sampler.shuffle=False -o DataLoader.Train.loader.num_workers=0 -o DataLoader.Train.loader.use_shared_memory=False
pact_train:null
fpgm_train:null
distill_train:null
...
# model load config
gpu_id 0
gpu_mem 2000
# the whole-chain test will append the following config
# use_gpu 0
# cpu_threads 10
# use_mkldnn 1
# use_tensorrt 0
# use_fp16 0
# cls config
cls_model_path ../inference/inference.pdmodel
cls_params_path ../inference/inference.pdiparams
resize_short_size 256
crop_size 224
# for log env info
benchmark 1
@@ -33,7 +33,7 @@ if [ ${MODE} = "lite_train_infer" ] || [ ${MODE} = "whole_infer" ];then
    mv train.txt train_list.txt
    mv val.txt val_list.txt
    cd ../../
elif [ ${MODE} = "infer" ] || [ ${MODE} = "cpp_infer" ];then
    # download data
    cd dataset
    rm -rf ILSVRC2012
@@ -58,3 +58,63 @@ elif [ ${MODE} = "whole_train_infer" ];then
    mv val.txt val_list.txt
    cd ../../
fi
if [ ${MODE} = "cpp_infer" ];then
cd deploy/cpp
echo "################### build opencv ###################"
rm -rf 3.4.7.tar.gz opencv-3.4.7/
wget https://github.com/opencv/opencv/archive/3.4.7.tar.gz
tar -xf 3.4.7.tar.gz
install_path=$(pwd)/opencv-3.4.7/opencv3
cd opencv-3.4.7/
rm -rf build
mkdir build
cd build
cmake .. \
-DCMAKE_INSTALL_PREFIX=${install_path} \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=OFF \
-DWITH_IPP=OFF \
-DBUILD_IPP_IW=OFF \
-DWITH_LAPACK=OFF \
-DWITH_EIGEN=OFF \
-DCMAKE_INSTALL_LIBDIR=lib64 \
-DWITH_ZLIB=ON \
-DBUILD_ZLIB=ON \
-DWITH_JPEG=ON \
-DBUILD_JPEG=ON \
-DWITH_PNG=ON \
-DBUILD_PNG=ON \
-DWITH_TIFF=ON \
-DBUILD_TIFF=ON
make -j
make install
cd ../../
echo "################### build opencv finished ###################"
echo "################### build PaddleClas demo ####################"
OPENCV_DIR=$(pwd)/opencv-3.4.7/opencv3/
LIB_DIR=$(pwd)/Paddle/build/paddle_inference_install_dir/
CUDA_LIB_DIR=$(dirname `find /usr -name libcudart.so`)
CUDNN_LIB_DIR=$(dirname `find /usr -name libcudnn.so`)
BUILD_DIR=build
rm -rf ${BUILD_DIR}
mkdir ${BUILD_DIR}
cd ${BUILD_DIR}
cmake .. \
-DPADDLE_LIB=${LIB_DIR} \
-DWITH_MKL=ON \
-DDEMO_NAME=clas_system \
-DWITH_GPU=OFF \
-DWITH_STATIC_LIB=OFF \
-DWITH_TENSORRT=OFF \
-DTENSORRT_DIR=${TENSORRT_DIR} \
-DOPENCV_DIR=${OPENCV_DIR} \
-DCUDNN_LIB=${CUDNN_LIB_DIR} \
-DCUDA_LIB=${CUDA_LIB_DIR}
make -j
echo "################### build PaddleClas demo finished ###################"
fi
#!/bin/bash
FILENAME=$1
# MODE must be one of ['lite_train_infer', 'whole_infer', 'whole_train_infer', 'infer', 'cpp_infer']
MODE=$2

dataline=$(cat ${FILENAME})
@@ -145,10 +145,80 @@ benchmark_value=$(func_parser_value "${lines[49]}")
infer_key1=$(func_parser_key "${lines[50]}")
infer_value1=$(func_parser_value "${lines[50]}")
if [ ${MODE} = "cpp_infer" ]; then
cpp_use_gpu_key=$(func_parser_key "${lines[53]}")
cpp_use_gpu_list=$(func_parser_value "${lines[53]}")
cpp_cpu_threads_key=$(func_parser_key "${lines[54]}")
cpp_cpu_threads_list=$(func_parser_value "${lines[54]}")
cpp_use_mkldnn_key=$(func_parser_key "${lines[55]}")
cpp_use_mkldnn_list=$(func_parser_value "${lines[55]}")
cpp_use_tensorrt_key=$(func_parser_key "${lines[56]}")
cpp_use_tensorrt_list=$(func_parser_value "${lines[56]}")
cpp_use_fp16_key=$(func_parser_key "${lines[57]}")
cpp_use_fp16_list=$(func_parser_value "${lines[57]}")
fi
LOG_PATH="./tests/output"
mkdir -p ${LOG_PATH}
status_log="${LOG_PATH}/results.log"
function func_cpp_inference(){
IFS='|'
_script=$1
_log_path=$2
_img_dir=$3
# inference
for use_gpu in ${cpp_use_gpu_list[*]}; do
if [ ${use_gpu} = "0" ] || [ ${use_gpu} = "cpu" ]; then
for use_mkldnn in ${cpp_use_mkldnn_list[*]}; do
if [ ${use_mkldnn} = "0" ] && [ ${_flag_quant} = "True" ]; then
continue
fi
for threads in ${cpp_cpu_threads_list[*]}; do
_save_log_path="${_log_path}/cpp_infer_cpu_usemkldnn_${use_mkldnn}_threads_${threads}.log"
set_infer_data=$(func_set_params "${cpp_image_dir_key}" "${_img_dir}")
cp ../tests/config/cpp_config.txt cpp_config.txt
echo "${cpp_use_gpu_key} ${use_gpu}" >> cpp_config.txt
echo "${cpp_cpu_threads_key} ${threads}" >> cpp_config.txt
echo "${cpp_use_mkldnn_key} ${use_mkldnn}" >> cpp_config.txt
echo "${cpp_use_tensorrt_key} 0" >> cpp_config.txt
echo "${cpp_use_fp16_key} 0" >> cpp_config.txt
command="${_script} cpp_config.txt ${_img_dir} > ${_save_log_path} 2>&1 "
eval $command
last_status=${PIPESTATUS[0]}
eval "cat ${_save_log_path}"
status_check $last_status "${command}" "${status_log}"
done
done
elif [ ${use_gpu} = "1" ] || [ ${use_gpu} = "gpu" ]; then
for use_trt in ${cpp_use_tensorrt_list[*]}; do
for precision in ${cpp_use_fp16_list[*]}; do
if [[ ${precision} =~ "fp16" || ${precision} =~ "int8" ]] && [ ${use_trt} = "False" ]; then
continue
fi
if [[ ${use_trt} = "False" || ${precision} =~ "int8" ]] && [ ${_flag_quant} = "True" ]; then
continue
fi
_save_log_path="${_log_path}/cpp_infer_gpu_usetrt_${use_trt}_precision_${precision}_batchsize_${batch_size}.log"
cp ../tests/config/cpp_config.txt cpp_config.txt
echo "${cpp_use_gpu_key} ${use_gpu}" >> cpp_config.txt
echo "${cpp_cpu_threads_key} ${threads}" >> cpp_config.txt
echo "${cpp_use_mkldnn_key} ${use_mkldnn}" >> cpp_config.txt
echo "${cpp_use_tensorrt_key} ${use_trt}" >> cpp_config.txt
echo "${cpp_use_fp16_key} ${precision}" >> cpp_config.txt
command="${_script} cpp_config.txt ${_img_dir} > ${_save_log_path} 2>&1 "
eval $command
last_status=${PIPESTATUS[0]}
eval "cat ${_save_log_path}"
status_check $last_status "${command}" "${status_log}"
done
done
else
echo "Does not support hardware other than CPU and GPU Currently!"
fi
done
}
function func_inference(){
    IFS='|'
@@ -247,6 +317,10 @@ if [ ${MODE} = "infer" ]; then
        Count=$(($Count + 1))
    done
    cd ..
elif [ ${MODE} = "cpp_infer" ]; then
    cd deploy
    func_cpp_inference "./cpp/build/clas_system" "../${LOG_PATH}" "${infer_img_dir}"
    cd ..
else
    IFS="|"
@@ -306,7 +380,7 @@ else
            set_pretrain=$(func_set_params "${pretrain_model_key}" "${pretrain_model_value}")
            set_batchsize=$(func_set_params "${train_batch_key}" "${train_batch_value}")
            set_train_params1=$(func_set_params "${train_param_key1}" "${train_param_value1}")
            set_use_gpu=$(func_set_params "${train_use_gpu_key}" "${train_use_gpu_value}")
            save_log="${LOG_PATH}/${trainer}_gpus_${gpu}_autocast_${autocast}"

            # load pretrain from norm training if current trainer is pact or fpgm trainer
@@ -324,6 +398,7 @@ else
            fi
            # run train
            eval "unset CUDA_VISIBLE_DEVICES"
            export FLAGS_cudnn_deterministic=True
            eval $cmd
            status_check $? "${cmd}" "${status_log}"
...