Unverified commit 3c8f4910 authored by: T Tingquan Gao, committed by: GitHub

Merge branch 'develop' into dev/multi-scale

repos:
- repo: https://github.com/PaddlePaddle/mirrors-yapf.git
sha: 0d79c0c469bab64f7229c9aca2b1186ef47f0e37
rev: 0d79c0c469bab64f7229c9aca2b1186ef47f0e37
hooks:
- id: yapf
files: \.py$
- repo: https://github.com/pre-commit/pre-commit-hooks
sha: a11d9314b22d8f8c7556443875b731ef05965464
rev: a11d9314b22d8f8c7556443875b731ef05965464
hooks:
- id: check-merge-conflict
- id: check-symlinks
......@@ -15,7 +16,7 @@
- id: trailing-whitespace
files: \.md$
- repo: https://github.com/Lucas-C/pre-commit-hooks
sha: v1.0.1
rev: v1.0.1
hooks:
- id: forbid-crlf
files: \.md$
......
README_ch.md
README_ch.md
\ No newline at end of file
......@@ -7,7 +7,7 @@
PaddleClas is an image recognition toolkit prepared by PaddlePaddle for industry and academia, helping users train better vision models and land real-world applications.
**Recent updates**
- 2022.4.21 Added the related [code](https://github.com/PaddlePaddle/PaddleClas/pull/1820/files) of the CVPR2022 oral paper [MixFormer](https://arxiv.org/pdf/2204.02557.pdf).
- 2022.1.27 Fully upgraded the documentation; added the [PaddleServing C++ pipeline deployment](./deploy/paddleserving) and the [18M image recognition Android deployment demo](./deploy/lite_shitu).
- 2021.11.1 Released the [PP-ShiTu technical report](https://arxiv.org/pdf/2111.00775.pdf) and added a beverage recognition demo.
- 2021.10.23 Released the lightweight image recognition system PP-ShiTu, which completes recognition over a gallery of 100k+ images in 0.2 s on CPU.
......@@ -35,10 +35,11 @@ Res2Net200_vd预训练模型Top-1精度高达85.1%。
## Welcome to Join the Technical Exchange Group
* You can scan the WeChat group QR code below to join the PaddleClas WeChat group, get more efficient answers to your questions, and communicate with developers from all walks of life. We look forward to your joining.
* You can scan the QQ/WeChat QR codes below (add the assistant on WeChat and reply "C") to join the PaddleClas WeChat group, get more efficient answers to your questions, and communicate with developers from all walks of life. We look forward to your joining.
<div align="center">
<img src="https://user-images.githubusercontent.com/12560511/153565053-d6cbc57b-1610-4a64-87b2-50c948352d87.jpeg" width = "200" />
<img src="https://user-images.githubusercontent.com/80816848/164383225-e375eb86-716e-41b4-a9e0-4b8a3976c1aa.jpg" width="200"/>
<img src="https://user-images.githubusercontent.com/48054808/160531099-9811bbe6-cfbb-47d5-8bdb-c2b40684d7dd.png" width="200"/>
</div>
## Quick Start
......
......@@ -8,6 +8,8 @@ PaddleClas is an image recognition toolset for industry and academia, helping us
**Recent updates**
- 2022.4.21 Added the related [code](https://github.com/PaddlePaddle/PaddleClas/pull/1820/files) of the CVPR2022 oral paper [MixFormer](https://arxiv.org/pdf/2204.02557.pdf).
- 2021.09.17 Added the PP-LCNet series models developed by PaddleClas; these models show strong competitiveness on Intel CPUs.
For the introduction of PP-LCNet, please refer to [paper](https://arxiv.org/pdf/2109.15099.pdf) or [PP-LCNet model introduction](docs/en/models/PP-LCNet_en.md). The metrics and pretrained model are available [here](docs/en/ImageNet_models_en.md).
......@@ -38,10 +40,11 @@ Four sample solutions are provided, including product recognition, vehicle recog
## Welcome to Join the Technical Exchange Group
* You can also scan the QR code below to join the PaddleClas WeChat group to get more efficient answers to your questions and to communicate with developers from all walks of life. We look forward to hearing from you.
* You can also scan the QR codes below to join the PaddleClas QQ group and WeChat group (add the assistant and reply "C") to get more efficient answers to your questions and to communicate with developers from all walks of life. We look forward to hearing from you.
<div align="center">
<img src="https://user-images.githubusercontent.com/12560511/153565053-d6cbc57b-1610-4a64-87b2-50c948352d87.jpeg" width = "200" />
<img src="https://user-images.githubusercontent.com/80816848/164383225-e375eb86-716e-41b4-a9e0-4b8a3976c1aa.jpg" width="200"/>
<img src="https://user-images.githubusercontent.com/48054808/160531099-9811bbe6-cfbb-47d5-8bdb-c2b40684d7dd.png" width="200"/>
</div>
## Quick Start
......
......@@ -8,7 +8,7 @@ Global:
image_shape: [3, 640, 640]
threshold: 0.2
max_det_results: 5
labe_list:
label_list:
- foreground
use_gpu: True
......
......@@ -5,7 +5,7 @@ Global:
image_shape: [3, 640, 640]
threshold: 0.2
max_det_results: 1
labe_list:
label_list:
- foreground
# inference engine config
......
......@@ -8,7 +8,7 @@ Global:
image_shape: [3, 640, 640]
threshold: 0.2
max_det_results: 5
labe_list:
label_list:
- foreground
use_gpu: True
......@@ -51,8 +51,8 @@ RecPostProcess: null
# indexing engine config
IndexProcess:
index_method: "HNSW32" # supported: HNSW32, IVF, Flat
index_dir: "./drink_dataset_v1.0/gallery"
image_root: "./drink_dataset_v1.0/index"
image_root: "./drink_dataset_v1.0/gallery"
index_dir: "./drink_dataset_v1.0/index"
data_file: "./drink_dataset_v1.0/gallery/drink_label.txt"
index_operation: "new" # supported: "append", "remove", "new"
delimiter: " "
......
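The two swapped lines above correct the previously reversed `index_dir`/`image_root` assignments. As a side note (not part of this change), here is a minimal Python sketch of how a gallery `data_file` with the configured `delimiter` might be read; the per-line layout `<relative image path><delimiter><label>` is an assumption based on the config above:

```python
# Hypothetical illustration only: parse the gallery data_file referenced
# in the IndexProcess config above.
import os

def load_gallery(data_file, image_root, delimiter=" "):
    entries = []
    with open(data_file, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            rel_path, label = line.split(delimiter, 1)
            entries.append((os.path.join(image_root, rel_path), label))
    return entries

# e.g. load_gallery("./drink_dataset_v1.0/gallery/drink_label.txt",
#                   "./drink_dataset_v1.0/gallery")
```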
......@@ -8,7 +8,7 @@ Global:
image_shape: [3, 640, 640]
threshold: 0.2
max_det_results: 5
labe_list:
label_list:
- foreground
use_gpu: True
......
......@@ -8,7 +8,7 @@ Global:
image_shape: [3, 640, 640]
threshold: 0.2
max_det_results: 5
labe_list:
label_list:
- foreground
use_gpu: True
......
......@@ -8,7 +8,7 @@ Global:
image_shape: [3, 640, 640]
threshold: 0.2
max_det_results: 5
labe_list:
label_list:
- foreground
use_gpu: True
......
......@@ -8,7 +8,7 @@ Global:
image_shape: [3, 640, 640]
threshold: 0.2
max_det_results: 5
labe_list:
label_list:
- foreground
# inference engine config
......
......@@ -8,7 +8,7 @@ Global:
image_shape: [3, 640, 640]
threshold: 0.2
max_det_results: 5
labe_list:
label_list:
- foreground
use_gpu: True
......
......@@ -33,106 +33,106 @@ using namespace paddle_infer;
namespace Detection {
// Object Detection Result
struct ObjectResult {
// Rectangle coordinates of detected object: left, right, top, down
std::vector<int> rect;
// Class id of detected object
int class_id;
// Confidence of detected object
float confidence;
};
struct ObjectResult {
// Rectangle coordinates of detected object: left, right, top, down
std::vector<int> rect;
// Class id of detected object
int class_id;
// Confidence of detected object
float confidence;
};
// Generate visualization colormap for each class
std::vector<int> GenerateColorMap(int num_class);
std::vector<int> GenerateColorMap(int num_class);
// Visualization Detection Result
cv::Mat VisualizeResult(const cv::Mat &img,
const std::vector <ObjectResult> &results,
const std::vector <std::string> &lables,
const std::vector<int> &colormap, const bool is_rbox);
class ObjectDetector {
public:
explicit ObjectDetector(const YAML::Node &config_file) {
this->use_gpu_ = config_file["Global"]["use_gpu"].as<bool>();
if (config_file["Global"]["gpu_id"].IsDefined())
this->gpu_id_ = config_file["Global"]["gpu_id"].as<int>();
this->gpu_mem_ = config_file["Global"]["gpu_mem"].as<int>();
this->cpu_math_library_num_threads_ =
config_file["Global"]["cpu_num_threads"].as<int>();
this->use_mkldnn_ = config_file["Global"]["enable_mkldnn"].as<bool>();
this->use_tensorrt_ = config_file["Global"]["use_tensorrt"].as<bool>();
this->use_fp16_ = config_file["Global"]["use_fp16"].as<bool>();
this->model_dir_ =
config_file["Global"]["det_inference_model_dir"].as<std::string>();
this->threshold_ = config_file["Global"]["threshold"].as<float>();
this->max_det_results_ = config_file["Global"]["max_det_results"].as<int>();
this->image_shape_ =
config_file["Global"]["image_shape"].as < std::vector < int >> ();
this->label_list_ =
config_file["Global"]["labe_list"].as < std::vector < std::string >> ();
this->ir_optim_ = config_file["Global"]["ir_optim"].as<bool>();
this->batch_size_ = config_file["Global"]["batch_size"].as<int>();
preprocessor_.Init(config_file["DetPreProcess"]["transform_ops"]);
LoadModel(model_dir_, batch_size_, run_mode);
}
// Load Paddle inference model
void LoadModel(const std::string &model_dir, const int batch_size = 1,
const std::string &run_mode = "fluid");
// Run predictor
void Predict(const std::vector <cv::Mat> imgs, const int warmup = 0,
const int repeats = 1,
std::vector <ObjectResult> *result = nullptr,
std::vector<int> *bbox_num = nullptr,
std::vector<double> *times = nullptr);
const std::vector <std::string> &GetLabelList() const {
return this->label_list_;
}
const float &GetThreshold() const { return this->threshold_; }
private:
bool use_gpu_ = true;
int gpu_id_ = 0;
int gpu_mem_ = 800;
int cpu_math_library_num_threads_ = 6;
std::string run_mode = "fluid";
bool use_mkldnn_ = false;
bool use_tensorrt_ = false;
bool batch_size_ = 1;
bool use_fp16_ = false;
std::string model_dir_;
float threshold_ = 0.5;
float max_det_results_ = 5;
std::vector<int> image_shape_ = {3, 640, 640};
std::vector <std::string> label_list_;
bool ir_optim_ = true;
bool det_permute_ = true;
bool det_postprocess_ = true;
int min_subgraph_size_ = 30;
bool use_dynamic_shape_ = false;
int trt_min_shape_ = 1;
int trt_max_shape_ = 1280;
int trt_opt_shape_ = 640;
bool trt_calib_mode_ = false;
// Preprocess image and copy data to input buffer
void Preprocess(const cv::Mat &image_mat);
// Postprocess result
void Postprocess(const std::vector <cv::Mat> mats,
std::vector <ObjectResult> *result, std::vector<int> bbox_num,
bool is_rbox);
std::shared_ptr <Predictor> predictor_;
Preprocessor preprocessor_;
ImageBlob inputs_;
std::vector<float> output_data_;
std::vector<int> out_bbox_num_data_;
};
cv::Mat VisualizeResult(const cv::Mat &img,
const std::vector<ObjectResult> &results,
const std::vector<std::string> &lables,
const std::vector<int> &colormap, const bool is_rbox);
class ObjectDetector {
public:
explicit ObjectDetector(const YAML::Node &config_file) {
this->use_gpu_ = config_file["Global"]["use_gpu"].as<bool>();
if (config_file["Global"]["gpu_id"].IsDefined())
this->gpu_id_ = config_file["Global"]["gpu_id"].as<int>();
this->gpu_mem_ = config_file["Global"]["gpu_mem"].as<int>();
this->cpu_math_library_num_threads_ =
config_file["Global"]["cpu_num_threads"].as<int>();
this->use_mkldnn_ = config_file["Global"]["enable_mkldnn"].as<bool>();
this->use_tensorrt_ = config_file["Global"]["use_tensorrt"].as<bool>();
this->use_fp16_ = config_file["Global"]["use_fp16"].as<bool>();
this->model_dir_ =
config_file["Global"]["det_inference_model_dir"].as<std::string>();
this->threshold_ = config_file["Global"]["threshold"].as<float>();
this->max_det_results_ = config_file["Global"]["max_det_results"].as<int>();
this->image_shape_ =
config_file["Global"]["image_shape"].as<std::vector<int>>();
this->label_list_ =
config_file["Global"]["label_list"].as<std::vector<std::string>>();
this->ir_optim_ = config_file["Global"]["ir_optim"].as<bool>();
this->batch_size_ = config_file["Global"]["batch_size"].as<int>();
preprocessor_.Init(config_file["DetPreProcess"]["transform_ops"]);
LoadModel(model_dir_, batch_size_, run_mode);
}
// Load Paddle inference model
void LoadModel(const std::string &model_dir, const int batch_size = 1,
const std::string &run_mode = "fluid");
// Run predictor
void Predict(const std::vector<cv::Mat> imgs, const int warmup = 0,
const int repeats = 1,
std::vector<ObjectResult> *result = nullptr,
std::vector<int> *bbox_num = nullptr,
std::vector<double> *times = nullptr);
const std::vector<std::string> &GetLabelList() const {
return this->label_list_;
}
const float &GetThreshold() const { return this->threshold_; }
private:
bool use_gpu_ = true;
int gpu_id_ = 0;
int gpu_mem_ = 800;
int cpu_math_library_num_threads_ = 6;
std::string run_mode = "fluid";
bool use_mkldnn_ = false;
bool use_tensorrt_ = false;
bool batch_size_ = 1;
bool use_fp16_ = false;
std::string model_dir_;
float threshold_ = 0.5;
float max_det_results_ = 5;
std::vector<int> image_shape_ = {3, 640, 640};
std::vector<std::string> label_list_;
bool ir_optim_ = true;
bool det_permute_ = true;
bool det_postprocess_ = true;
int min_subgraph_size_ = 30;
bool use_dynamic_shape_ = false;
int trt_min_shape_ = 1;
int trt_max_shape_ = 1280;
int trt_opt_shape_ = 640;
bool trt_calib_mode_ = false;
// Preprocess image and copy data to input buffer
void Preprocess(const cv::Mat &image_mat);
// Postprocess result
void Postprocess(const std::vector<cv::Mat> mats,
std::vector<ObjectResult> *result, std::vector<int> bbox_num,
bool is_rbox);
std::shared_ptr<Predictor> predictor_;
Preprocessor preprocessor_;
ImageBlob inputs_;
std::vector<float> output_data_;
std::vector<int> out_bbox_num_data_;
};
} // namespace Detection
ARM_ABI = arm8
export ARM_ABI
include ../Makefile.def
LITE_ROOT=./inference_lite_lib.android.armv8
LITE_ROOT=../../../
include ${LITE_ROOT}/demo/cxx/Makefile.def
THIRD_PARTY_DIR=${LITE_ROOT}/third_party
......@@ -29,7 +29,7 @@ OPENCV_LIBS = ${THIRD_PARTY_DIR}/${OPENCV_VERSION}/${ARM_PATH}/libs/libopencv_im
${THIRD_PARTY_DIR}/${OPENCV_VERSION}/${ARM_PATH}/3rdparty/libs/libtbb.a \
${THIRD_PARTY_DIR}/${OPENCV_VERSION}/${ARM_PATH}/3rdparty/libs/libcpufeatures.a
OPENCV_INCLUDE = -I../../../third_party/${OPENCV_VERSION}/${ARM_PATH}/include
OPENCV_INCLUDE = -I${LITE_ROOT}/third_party/${OPENCV_VERSION}/${ARM_PATH}/include
CXX_INCLUDES = $(INCLUDES) ${OPENCV_INCLUDE} -I$(LITE_ROOT)/cxx/include
......
clas_model_file ./MobileNetV3_large_x1_0.nb
label_path ./imagenet1k_label_list.txt
clas_model_file /data/local/tmp/arm_cpu/MobileNetV3_large_x1_0.nb
label_path /data/local/tmp/arm_cpu/imagenet1k_label_list.txt
resize_short_size 256
crop_size 224
visualize 0
num_threads 1
batch_size 1
precision FP32
runtime_device arm_cpu
enable_benchmark 0
tipc_benchmark 0
......@@ -21,6 +21,7 @@
#include <opencv2/opencv.hpp>
#include <sys/time.h>
#include <vector>
#include "AutoLog/auto_log/lite_autolog.h"
using namespace paddle::lite_api; // NOLINT
using namespace std;
......@@ -149,8 +150,10 @@ cv::Mat CenterCropImg(const cv::Mat &img, const int &crop_size) {
std::vector<RESULT>
RunClasModel(std::shared_ptr<PaddlePredictor> predictor, const cv::Mat &img,
const std::map<std::string, std::string> &config,
const std::vector<std::string> &word_labels, double &cost_time) {
const std::vector<std::string> &word_labels, double &cost_time,
std::vector<double> *time_info) {
// Read img
auto preprocess_start = std::chrono::steady_clock::now();
int resize_short_size = stoi(config.at("resize_short_size"));
int crop_size = stoi(config.at("crop_size"));
int visualize = stoi(config.at("visualize"));
......@@ -172,8 +175,8 @@ RunClasModel(std::shared_ptr<PaddlePredictor> predictor, const cv::Mat &img,
std::vector<float> scale = {1 / 0.229f, 1 / 0.224f, 1 / 0.225f};
const float *dimg = reinterpret_cast<const float *>(img_fp.data);
NeonMeanScale(dimg, data0, img_fp.rows * img_fp.cols, mean, scale);
auto start = std::chrono::system_clock::now();
auto preprocess_end = std::chrono::steady_clock::now();
auto inference_start = std::chrono::system_clock::now();
// Run predictor
predictor->Run();
......@@ -181,9 +184,10 @@ RunClasModel(std::shared_ptr<PaddlePredictor> predictor, const cv::Mat &img,
std::unique_ptr<const Tensor> output_tensor(
std::move(predictor->GetOutput(0)));
auto *output_data = output_tensor->data<float>();
auto end = std::chrono::system_clock::now();
auto inference_end = std::chrono::system_clock::now();
auto postprocess_start = std::chrono::system_clock::now();
auto duration =
std::chrono::duration_cast<std::chrono::microseconds>(end - start);
std::chrono::duration_cast<std::chrono::microseconds>(inference_end - inference_start);
cost_time = double(duration.count()) *
std::chrono::microseconds::period::num /
std::chrono::microseconds::period::den;
......@@ -196,6 +200,13 @@ RunClasModel(std::shared_ptr<PaddlePredictor> predictor, const cv::Mat &img,
cv::Mat output_image;
auto results =
PostProcess(output_data, output_size, word_labels, output_image);
auto postprocess_end = std::chrono::system_clock::now();
std::chrono::duration<float> preprocess_diff = preprocess_end - preprocess_start;
time_info->push_back(double(preprocess_diff.count() * 1000));
std::chrono::duration<float> inference_diff = inference_end - inference_start;
time_info->push_back(double(inference_diff.count() * 1000));
std::chrono::duration<float> postprocess_diff = postprocess_end - postprocess_start;
time_info->push_back(double(postprocess_diff.count() * 1000));
if (visualize) {
std::string output_image_path = "./clas_result.png";
......@@ -309,6 +320,12 @@ int main(int argc, char **argv) {
std::string clas_model_file = config.at("clas_model_file");
std::string label_path = config.at("label_path");
std::string crop_size = config.at("crop_size");
int num_threads = stoi(config.at("num_threads"));
int batch_size = stoi(config.at("batch_size"));
std::string precision = config.at("precision");
std::string runtime_device = config.at("runtime_device");
bool tipc_benchmark = bool(stoi(config.at("tipc_benchmark")));
// Load Labels
std::vector<std::string> word_labels = LoadLabels(label_path);
......@@ -319,8 +336,9 @@ int main(int argc, char **argv) {
cv::cvtColor(srcimg, srcimg, cv::COLOR_BGR2RGB);
double run_time = 0;
std::vector<double> time_info;
std::vector<RESULT> results =
RunClasModel(clas_predictor, srcimg, config, word_labels, run_time);
RunClasModel(clas_predictor, srcimg, config, word_labels, run_time, &time_info);
std::cout << "===clas result for image: " << img_path << "===" << std::endl;
for (int i = 0; i < results.size(); i++) {
......@@ -338,6 +356,19 @@ int main(int argc, char **argv) {
} else {
std::cout << "Current time cost: " << run_time << " s." << std::endl;
}
if (tipc_benchmark) {
AutoLogger autolog(clas_model_file,
runtime_device,
num_threads,
batch_size,
crop_size,
precision,
time_info,
1);
std::cout << "=======================TIPC Lite Information=======================" << std::endl;
autolog.report();
}
}
return 0;
......
......@@ -25,8 +25,8 @@ Paddle Lite是飞桨轻量化推理引擎,为手机、IOT端提供高效推理
1. [Recommended] Download directly; the prediction library download links are as follows:
|Platform|Prediction library download link|
|-|-|
|Android|[arm7](https://paddlelite-data.bj.bcebos.com/Release/2.8-rc/Android/gcc/inference_lite_lib.android.armv7.gcc.c++_static.with_extra.with_cv.tar.gz) / [arm8](https://paddlelite-data.bj.bcebos.com/Release/2.8-rc/Android/gcc/inference_lite_lib.android.armv8.gcc.c++_static.with_extra.with_cv.tar.gz)|
|iOS|[arm7](https://paddlelite-data.bj.bcebos.com/Release/2.8-rc/iOS/inference_lite_lib.ios.armv7.with_cv.with_extra.tiny_publish.tar.gz) / [arm8](https://paddlelite-data.bj.bcebos.com/Release/2.8-rc/iOS/inference_lite_lib.ios.armv8.with_cv.with_extra.tiny_publish.tar.gz)|
|Android|[arm7](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10/inference_lite_lib.android.armv7.clang.c++_static.with_extra.with_cv.tar.gz) / [arm8](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10/inference_lite_lib.android.armv8.clang.c++_static.with_extra.with_cv.tar.gz)|
|iOS|[arm7](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10/inference_lite_lib.ios.armv7.with_cv.with_extra.tiny_publish.tar.gz) / [arm8](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10/inference_lite_lib.ios.armv8.with_cv.with_extra.tiny_publish.tar.gz)|
**Note**:
1. If the prediction library is downloaded from the Paddle-Lite [official documentation](https://paddle-lite.readthedocs.io/zh/latest/quick_start/release_lib.html#android-toolchain-gcc),
......@@ -44,11 +44,11 @@ git checkout develop
**Note**: When compiling Paddle-Lite to obtain the prediction library, the two options `--with_cv=ON --with_extra=ON` must be enabled; `--arch` specifies the `arm` version, set to armv8 here. For more compilation commands, please refer to this [link](https://paddle-lite.readthedocs.io/zh/latest/user_guides/Compile/Android.html#id2)
After downloading the prediction library directly and decompressing it, you get the `inference_lite_lib.android.armv8/` folder; the prediction library obtained by compiling Paddle-Lite is located in the `Paddle-Lite/build.lite.android.armv8.gcc/inference_lite_lib.android.armv8/` folder.
After downloading the prediction library directly and decompressing it, you get the `inference_lite_lib.android.armv8.clang.c++_static.with_extra.with_cv/` folder; the prediction library obtained by compiling Paddle-Lite is located in the `Paddle-Lite/build.lite.android.armv8.gcc/inference_lite_lib.android.armv8/` folder.
The directory structure of the prediction library is as follows:
```
inference_lite_lib.android.armv8/
inference_lite_lib.android.armv8.clang.c++_static.with_extra.with_cv/
|-- cxx  C++ prediction library and header files
| |-- include  C++ header files
| | |-- paddle_api.h
......@@ -86,7 +86,7 @@ Python下安装 `paddlelite`,目前最高支持`Python3.7`。
**Note**: The version of the `paddlelite` wheel package must match the version of the prediction library.
```shell
pip install paddlelite==2.8
pip install paddlelite==2.10
```
After that, the `paddle_lite_opt` tool can be used to convert the inference model. Some of the parameters of `paddle_lite_opt` are as follows:
......@@ -146,6 +146,24 @@ paddle_lite_opt --model_file=./MobileNetV3_large_x1_0_infer/inference.pdmodel --
**Note**: The `--optimize_out` parameter is the save path of the optimized model (no need to add the `.nb` suffix); `--model_file` is the path of the model structure file and `--param_file` is the path of the model weights file. Please pay attention to the file names.
<a name="2.1.4"></a>
#### 2.1.4 Compile to obtain the executable clas_system
```shell
# Clone the AutoLog repository to obtain automatic logging
cd PaddleClas_root_path
cd deploy/lite/
git clone https://github.com/LDOUBLEV/AutoLog.git
```
```shell
# Compile
make -j
```
After running the `make` command, the executable `clas_system` is generated in the current directory; it is used for Lite inference.
<a name="2.2与手机联调"></a>
### 2.2 Run and Debug on the Phone
......@@ -167,7 +185,7 @@ paddle_lite_opt --model_file=./MobileNetV3_large_x1_0_infer/inference.pdmodel --
On Windows, download the ADB package from Google's Android platform and install it: [link](https://developer.android.com/studio)
4. After connecting the phone to the computer, enable the `USB debugging` option on the phone, select `File Transfer` mode, and run the following in the computer terminal:
3. After connecting the phone to the computer, enable the `USB debugging` option on the phone, select `File Transfer` mode, and run the following in the computer terminal:
```shell
adb devices
......@@ -178,40 +196,18 @@ List of devices attached
744be294 device
```
5. Prepare the optimized model, prediction library files, test image, and class mapping file.
```shell
cd PaddleClas_root_path
cd deploy/lite/
# Run prepare.sh
# prepare.sh places the prediction library files, test image, and label file used into the demo/cxx/clas folder of the prediction library
sh prepare.sh /{lite prediction library path}/inference_lite_lib.android.armv8
4. Push the optimized model, prediction library files, test image, and class mapping file to the phone.
# Enter the working directory of the lite demo
cd /{lite prediction library path}/inference_lite_lib.android.armv8/
cd demo/cxx/clas/
# Copy the C++ prediction shared library (.so) into the debug folder
cp ../../../cxx/lib/libpaddle_light_api_shared.so ./debug/
```
`prepare.sh` uses `PaddleClas/deploy/lite/imgs/tabby_cat.jpg` as the test image and copies it into the `demo/cxx/clas/debug/` folder.
Place the model file optimized by the `paddle_lite_opt` tool into the `/{lite prediction library path}/inference_lite_lib.android.armv8/demo/cxx/clas/debug/` folder. In this example, the `MobileNetV3_large_x1_0.nb` model file generated in [2.1.3](#2.1.3) is used.
After these steps, the clas folder will contain the following files:
```
demo/cxx/clas/
|-- debug/
| |--MobileNetV3_large_x1_0.nb optimized classification model file
| |--tabby_cat.jpg test image
| |--imagenet1k_label_list.txt class mapping file
| |--libpaddle_light_api_shared.so C++ prediction library file
| |--config.txt hyperparameter configuration for classification inference
|-- config.txt hyperparameter configuration for classification inference
|-- image_classfication.cpp image classification source file
|-- Makefile build file
```shell
adb shell mkdir -p /data/local/tmp/arm_cpu/
adb push clas_system /data/local/tmp/arm_cpu/
adb shell chmod +x /data/local/tmp/arm_cpu//clas_system
adb push inference_lite_lib.android.armv8.clang.c++_static.with_extra.with_cv/cxx/lib/libpaddle_light_api_shared.so /data/local/tmp/arm_cpu/
adb push MobileNetV3_large_x1_0.nb /data/local/tmp/arm_cpu/
adb push config.txt /data/local/tmp/arm_cpu/
adb push ../../ppcls/utils/imagenet1k_label_list.txt /data/local/tmp/arm_cpu/
adb push imgs/tabby_cat.jpg /data/local/tmp/arm_cpu/
```
#### Note:
......@@ -224,32 +220,22 @@ clas_model_file ./MobileNetV3_large_x1_0.nb # 模型文件地址
label_path ./imagenet1k_label_list.txt # class mapping text file
resize_short_size 256 # length of the short side after resizing
crop_size 224 # side length used for inference after cropping
visualize 0 # whether to visualize; if enabled, an image file named clas_result.png is generated in the current folder.
visualize 0 # whether to visualize; if enabled, an image file named clas_result.png is generated in the current folder
num_threads 1 # number of threads, default 1
precision FP32 # precision type, FP32 or INT8, default FP32
runtime_device arm_cpu # device type, default arm_cpu
enable_benchmark 0 # whether to enable benchmark, default 0
tipc_benchmark 0 # whether to enable tipc_benchmark, default 0
```
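To make the meaning of these keys concrete, here is a small hedged Python sketch (illustrative only, not part of the repository) that parses such a whitespace-separated `config.txt` into a dict, mirroring the lookups the C++ demo performs via `config.at(...)`:

```python
# Illustrative sketch: parse "key value  # comment" lines from config.txt.
def load_config(path):
    config = {}
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.split("#", 1)[0].strip()  # drop inline comments
            if not line:
                continue
            key, value = line.split(None, 1)
            config[key] = value.strip()
    return config

# cfg = load_config("config.txt")
# print(cfg["clas_model_file"], int(cfg["num_threads"]), cfg["precision"])
```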
5. Start debugging. After completing the above steps, you can use ADB to push the `debug/` folder to the phone and run it; the steps are as follows:
5. Run the inference command
```shell
# Compile to obtain the executable clas_system
make -j
# Move the compiled executable into the debug folder
mv clas_system ./debug/
# Push the debug folder above to the phone
adb push debug /data/local/tmp/
adb shell
cd /data/local/tmp/debug
export LD_LIBRARY_PATH=/data/local/tmp/debug:$LD_LIBRARY_PATH
Run the following command to complete inference on the phone.
# Usage of the clas_system executable:
# ./clas_system <config file path> <test image path>
./clas_system ./config.txt ./tabby_cat.jpg
```shell
adb shell 'export LD_LIBRARY_PATH=/data/local/tmp/arm_cpu/; /data/local/tmp/arm_cpu/clas_system /data/local/tmp/arm_cpu/config.txt /data/local/tmp/arm_cpu/tabby_cat.jpg'
```
If you modify the code, you need to recompile it and push it to the phone again.
The running result is as follows:
<div align="center">
......@@ -263,3 +249,4 @@ A1:如果已经走通了上述步骤,更换模型只需要替换 `.nb` 模
Q2: How do I test with a different image?
A2: Replace the test image under debug with the image you want to test, and push it to the phone again with ADB.
......@@ -17,7 +17,6 @@ ${info LITE_ROOT: $(abspath ${LITE_ROOT})}
THIRD_PARTY_DIR=third_party
${info THIRD_PARTY_DIR: $(abspath ${THIRD_PARTY_DIR})}
OPENCV_VERSION=opencv4.1.0
OPENCV_LIBS = ${THIRD_PARTY_DIR}/${OPENCV_VERSION}/${ARM_PLAT}/libs/libopencv_imgcodecs.a \
${THIRD_PARTY_DIR}/${OPENCV_VERSION}/${ARM_PLAT}/libs/libopencv_imgproc.a \
......@@ -32,6 +31,8 @@ OPENCV_LIBS = ${THIRD_PARTY_DIR}/${OPENCV_VERSION}/${ARM_PLAT}/libs/libopencv_im
${THIRD_PARTY_DIR}/${OPENCV_VERSION}/${ARM_PLAT}/3rdparty/libs/libtbb.a \
${THIRD_PARTY_DIR}/${OPENCV_VERSION}/${ARM_PLAT}/3rdparty/libs/libcpufeatures.a
FAISS_VERSION=faiss1.5.3
FAISS_LIBS = ${THIRD_PARTY_DIR}/${FAISS_VERSION}/libs/${ARM_PLAT}/libfaiss.a
LITE_LIBS = -L${LITE_ROOT}/cxx/lib/ -lpaddle_light_api_shared
###############################################################
......@@ -45,7 +46,7 @@ LITE_LIBS = -L${LITE_ROOT}/cxx/lib/ -lpaddle_light_api_shared
# 2. Undo comment below line using `libpaddle_api_light_bundled.a`
# LITE_LIBS = ${LITE_ROOT}/cxx/lib/libpaddle_api_light_bundled.a
CXX_LIBS = $(LITE_LIBS) ${OPENCV_LIBS} $(SYSTEM_LIBS)
CXX_LIBS = $(LITE_LIBS) ${OPENCV_LIBS} ${FAISS_LIBS} $(SYSTEM_LIBS)
LOCAL_DIRSRCS=$(wildcard src/*.cc)
LOCAL_SRCS=$(notdir $(LOCAL_DIRSRCS))
......@@ -53,9 +54,17 @@ LOCAL_OBJS=$(patsubst %.cpp, %.o, $(patsubst %.cc, %.o, $(LOCAL_SRCS)))
JSON_OBJS = json_reader.o json_value.o json_writer.o
pp_shitu: $(LOCAL_OBJS) $(JSON_OBJS) fetch_opencv
pp_shitu: $(LOCAL_OBJS) $(JSON_OBJS) fetch_opencv fetch_faiss
$(CC) $(SYSROOT_LINK) $(CXXFLAGS_LINK) $(LOCAL_OBJS) $(JSON_OBJS) -o pp_shitu $(CXX_LIBS) $(LDFLAGS)
fetch_faiss:
@ test -d ${THIRD_PARTY_DIR} || mkdir ${THIRD_PARTY_DIR}
@ test -e ${THIRD_PARTY_DIR}/${FAISS_VERSION}.tar.gz || \
(echo "fetch faiss libs" && \
wget -P ${THIRD_PARTY_DIR} https://paddle-inference-dist.bj.bcebos.com/${FAISS_VERSION}.tar.gz)
@ test -d ${THIRD_PARTY_DIR}/${FAISS_VERSION} || \
tar -xf ${THIRD_PARTY_DIR}/${FAISS_VERSION}.tar.gz -C ${THIRD_PARTY_DIR}
fetch_opencv:
@ test -d ${THIRD_PARTY_DIR} || mkdir ${THIRD_PARTY_DIR}
@ test -e ${THIRD_PARTY_DIR}/${OPENCV_VERSION}.tar.gz || \
......@@ -74,11 +83,12 @@ fetch_json_code:
LOCAL_INCLUDES = -I./ -Iinclude
OPENCV_INCLUDE = -I${THIRD_PARTY_DIR}/${OPENCV_VERSION}/${ARM_PLAT}/include
FAISS_INCLUDE = -I${THIRD_PARTY_DIR}/${FAISS_VERSION}/include
JSON_INCLUDE = -I${THIRD_PARTY_DIR}/jsoncpp_code/include
CXX_INCLUDES = ${LOCAL_INCLUDES} ${INCLUDES} ${OPENCV_INCLUDE} ${JSON_INCLUDE} -I$(LITE_ROOT)/cxx/include
CXX_INCLUDES = ${LOCAL_INCLUDES} ${INCLUDES} ${OPENCV_INCLUDE} ${FAISS_INCLUDE} ${JSON_INCLUDE} -I$(LITE_ROOT)/cxx/include
$(LOCAL_OBJS): %.o: src/%.cc fetch_opencv fetch_json_code
$(LOCAL_OBJS): %.o: src/%.cc fetch_opencv fetch_json_code fetch_faiss
$(CC) $(SYSROOT_COMPLILE) $(CXX_DEFINES) $(CXX_INCLUDES) $(CXX_FLAGS) -c $< -o $@
$(JSON_OBJS): %.o: ${THIRD_PARTY_DIR}/jsoncpp_code/%.cpp fetch_json_code
......
......@@ -2,7 +2,7 @@
This tutorial introduces the detailed steps of deploying the PaddleClas PP-ShiTu model on mobile devices based on [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite).
Paddle Lite is PaddlePaddle's lightweight inference engine. It provides efficient inference capability for mobile and IOT devices, integrates a wide range of cross-platform hardware, and offers a lightweight deployment solution for on-device deployment and application.
Paddle Lite is PaddlePaddle's lightweight inference engine. It provides efficient inference capability for mobile and IoT devices, integrates a wide range of cross-platform hardware, and offers a lightweight deployment solution for on-device deployment and application.
## 1. Environment Preparation
......@@ -81,35 +81,134 @@ inference_lite_lib.android.armv8/
| `-- java Java prediction library demo
```
## 2 Getting Started
## 2 Model Preparation
### 2.1 Model Preparation
PaddleClas provides converted and optimized inference models, which can be downloaded directly by following Section 2.1.1 below. To use other models, please refer to Section 2.1.2 to convert and optimize them yourself.
#### 2.1.1 Model Preparation
#### 2.1.1 Use the Inference Models Provided by PaddleClas
```shell
# Enter the lite_shitu directory
cd $PaddleClas/deploy/lite_shitu
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/lite/ppshitu_lite_models_v1.0.tar
tar -xf ppshitu_lite_models_v1.0.tar
rm -f ppshitu_lite_models_v1.0.tar
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/lite/ppshitu_lite_models_v1.1.tar
tar -xf ppshitu_lite_models_v1.1.tar
rm -f ppshitu_lite_models_v1.1.tar
```
#### 2.1.2 Convert the yaml File to a json File
#### 2.1.2 Use Other Models
Paddle-Lite provides a variety of strategies to automatically optimize the original model, including quantization, subgraph fusion, hybrid scheduling, and kernel selection. With Paddle-Lite's `opt` tool, the inference model can be optimized automatically. Two optimization approaches are currently supported; the optimized model is more lightweight and runs faster.
**Note**: If model files ending in `.nb` have already been prepared, this step can be skipped.
##### 2.1.2.1 Install the paddle_lite_opt Tool
There are two ways to install the `paddle_lite_opt` tool:
1. [**Recommended**] Install paddlelite via pip and convert the model
```shell
pip install paddlelite==2.10rc
```
2. Compile Paddle-Lite from source to generate the `paddle_lite_opt` tool
Model optimization requires Paddle-Lite's `opt` executable, which can be obtained by compiling the Paddle-Lite source code. The compilation steps are as follows:
```shell
# If Paddle-Lite was already cloned when preparing the environment, there is no need to clone it again
git clone https://github.com/PaddlePaddle/Paddle-Lite.git
cd Paddle-Lite
git checkout develop
# Start the build
./lite/tools/build.sh build_optimize_tool
```
After compilation, the `opt` file is located under `build.opt/lite/api/`. The options and usage of `opt` can be viewed as follows:
```shell
cd build.opt/lite/api/
./opt
```
The usage and parameters of `opt` are exactly the same as those of `paddle_lite_opt` above.
After that, the `paddle_lite_opt` tool can be used to convert the inference model. Some of the parameters of `paddle_lite_opt` are as follows:
|Option|Description|
|-|-|
|--model_file|Path of the network structure file of the PaddlePaddle model (combined format) to be optimized|
|--param_file|Path of the weights file of the PaddlePaddle model (combined format) to be optimized|
|--optimize_out_type|Output model type; protobuf and naive_buffer are currently supported, where naive_buffer is a more lightweight serialization/deserialization implementation; defaults to naive_buffer|
|--optimize_out|Output path of the optimized model|
|--valid_targets|Backends on which the model can run; defaults to arm. Currently x86, arm, opencl, npu, and xpu are supported, and multiple backends can be specified at the same time (separated by spaces); the Model Optimize Tool will automatically choose the best one. To support Huawei NPU (the DaVinci-architecture NPU in Kirin 810/990 SoCs), set it to npu, arm|
For more detailed usage instructions of the `paddle_lite_opt` tool, please refer to the [documentation on converting models with opt](https://paddle-lite.readthedocs.io/zh/latest/user_guides/opt/opt_bin.html)
`--model_file` is the path of the inference model's model file and `--param_file` is the path of its params file; `--optimize_out` specifies the name of the output file (no need to add the `.nb` suffix). Running `paddle_lite_opt` directly on the command line also shows all parameters and their descriptions.
##### 2.1.2.2 Conversion Examples
The following describes how to use `paddle_lite_opt` to convert the pretrained mainbody detection model and recognition model into inference models and finally into Paddle-Lite optimized models.
1. Convert the mainbody detection model
```shell
# The current directory is $PaddleClas/deploy/lite_shitu
# Replace $code_path with your working directory; set $code_path to any directory you need
export $code_path=~
cd $code_path
git clone https://github.com/PaddlePaddle/PaddleDetection.git
# Enter the PaddleDetection root directory
cd PaddleDetection
# Export the pretrained model as an inference model
python tools/export_model.py -c configs/picodet/application/mainbody_detection/picodet_lcnet_x2_5_640_mainbody.yml -o weights=https://paddledet.bj.bcebos.com/models/picodet_lcnet_x2_5_640_mainbody.pdparams --output_dir=inference
# Convert the inference model into a Paddle-Lite optimized model
paddle_lite_opt --model_file=inference/picodet_lcnet_x2_5_640_mainbody/model.pdmodel --param_file=inference/picodet_lcnet_x2_5_640_mainbody/model.pdiparams --optimize_out=inference/picodet_lcnet_x2_5_640_mainbody/mainbody_det
# Copy the converted model to the lite_shitu directory
cd $PaddleClas/deploy/lite_shitu
mkdir models
cp $code_path/PaddleDetection/inference/picodet_lcnet_x2_5_640_mainbody/mainbody_det.nb $PaddleClas/deploy/lite_shitu/models
```
2. Convert the recognition model
```shell
# Convert to a Paddle-Lite model
paddle_lite_opt --model_file=inference/inference.pdmodel --param_file=inference/inference.pdiparams --optimize_out=inference/rec
# Copy the model file to lite_shitu
cp inference/rec.nb deploy/lite_shitu/models/
cd deploy/lite_shitu
```
**Note**: The `--optimize_out` parameter is the save path of the optimized model (no need to add the `.nb` suffix); `--model_file` is the path of the model structure file and `--param_file` is the path of the model weights file. Please pay attention to the file names.
### 2.2 Convert the yaml File to a json File
```shell
# To test a single image
python generate_json_config.py --det_model_path ppshitu_lite_models_v1.0/mainbody_PPLCNet_x2_5_640_quant_v1.0_lite.nb --rec_model_path ppshitu_lite_models_v1.0/general_PPLCNet_x2_5_quant_v1.0_lite.nb --rec_label_path ppshitu_lite_models_v1.0/label.txt --img_path images/demo.jpg
python generate_json_config.py --det_model_path ppshitu_lite_models_v1.1/mainbody_PPLCNet_x2_5_640_quant_v1.1_lite.nb --rec_model_path ppshitu_lite_models_v1.1/general_PPLCNet_x2_5_lite_v1.1_infer.nb --img_path images/demo.jpg
# or
# To test multiple images
python generate_json_config.py --det_model_path ppshitu_lite_models_v1.0/mainbody_PPLCNet_x2_5_640_quant_v1.0_lite.nb --rec_model_path ppshitu_lite_models_v1.0/general_PPLCNet_x2_5_quant_v1.0_lite.nb --rec_label_path ppshitu_lite_models_v1.0/label.txt --img_dir images
python generate_json_config.py --det_model_path ppshitu_lite_models_v1.1/mainbody_PPLCNet_x2_5_640_quant_v1.1_lite.nb --rec_model_path ppshitu_lite_models_v1.1/general_PPLCNet_x2_5_lite_v1.1_infer.nb --img_dir images
# After execution, the shitu_config.json configuration file is generated under lite_shitu
```
### 2.3 Convert the Index Dictionary
Because the Python retrieval library's dictionary is serialized with `pickle`, it is inconvenient to read from C++, so it needs to be converted.
```shell
# Download the bottled drinks dataset
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v1.0.tar && tar -xf drink_dataset_v1.0.tar
rm -rf drink_dataset_v1.0.tar
# Convert id_map.pkl to id_map.txt
python transform_id_map.py -c ../configs/inference_drink.yaml
```
After a successful conversion, `id_map.txt` is generated under the `IndexProcess.index_dir` directory.
### 2.2 Run and Debug on the Phone
### 2.4 Run and Debug on the Phone
Some preparation is needed first.
1. Prepare an arm8 Android phone. If the compiled prediction library targets armv7, an arm7 phone is needed instead, and `ARM_ABI=arm7` should be set in the Makefile.
......@@ -160,7 +259,8 @@ make ARM_ABI=arm8
```shell
mkdir deploy
mv ppshitu_lite_models_v1.0 deploy/
mv ppshitu_lite_models_v1.1 deploy/
mv drink_dataset_v1.0 deploy/
mv images deploy/
mv shitu_config.json deploy/
cp pp_shitu deploy/
......@@ -173,13 +273,13 @@ cp ../../../cxx/lib/libpaddle_light_api_shared.so deploy/
```shell
deploy/
|-- ppshitu_lite_models_v1.0/
| |--mainbody_PPLCNet_x2_5_640_v1.0_lite.nb optimized mainbody detection model file
| |--general_PPLCNet_x2_5_quant_v1.0_lite.nb optimized recognition model file
| |--label.txt label file of the recognition model
|-- ppshitu_lite_models_v1.1/
| |--mainbody_PPLCNet_x2_5_640_quant_v1.1_lite.nb optimized mainbody detection model file
| |--general_PPLCNet_x2_5_lite_v1.1_infer.nb optimized recognition model file
|-- images/
| |--demo.jpg image file
| ... image files
|-- drink_dataset_v1.0/ bottled drinks demo data
| |--index retrieval index directory
|-- pp_shitu generated mobile executable
|-- shitu_config.json runtime parameter configuration file
|-- libpaddle_light_api_shared.so Paddle-Lite library file
......@@ -207,8 +307,10 @@ chmod 777 pp_shitu
If you modify the code, you need to recompile it and push it to the phone again.
The running result is as follows:
![](../../docs/images/ppshitu_lite_demo.png)
```
images/demo.jpg:
result0: bbox[253, 275, 1146, 872], score: 0.974196, label: 伊藤园_果蔬汁
```
## FAQ
Q1: What if I want to switch models? Do I need to go through the whole process again?
......
......@@ -95,7 +95,7 @@ def main():
config_json["Global"]["det_model_path"] = args.det_model_path
config_json["Global"]["rec_model_path"] = args.rec_model_path
config_json["Global"]["rec_label_path"] = args.rec_label_path
config_json["Global"]["label_list"] = config_yaml["Global"]["labe_list"]
config_json["Global"]["label_list"] = config_yaml["Global"]["label_list"]
config_json["Global"]["rec_nms_thresold"] = config_yaml["Global"][
"rec_nms_thresold"]
config_json["Global"]["max_det_results"] = config_yaml["Global"][
......@@ -130,6 +130,8 @@ def main():
y["type"] = k
config_json["RecPreProcess"]["transform_ops"].append(y)
# set IndexProcess
config_json["IndexProcess"] = config_yaml["IndexProcess"]
with open('shitu_config.json', 'w') as fd:
json.dump(config_json, fd, indent=4)
......
......@@ -36,10 +36,9 @@ struct RESULT {
float score;
};
class Recognition {
class FeatureExtract {
public:
explicit Recognition(const Json::Value &config_file) {
explicit FeatureExtract(const Json::Value &config_file) {
MobileConfig config;
if (config_file["Global"]["rec_model_path"].as<std::string>().empty()) {
std::cout << "Please set [rec_model_path] in config file" << std::endl;
......@@ -53,29 +52,8 @@ public:
std::cout << "Please set [rec_label_path] in config file" << std::endl;
exit(-1);
}
LoadLabel(config_file["Global"]["rec_label_path"].as<std::string>());
SetPreProcessParam(config_file["RecPreProcess"]["transform_ops"]);
if (!config_file["Global"].isMember("return_k")){
this->topk = config_file["Global"]["return_k"].as<int>();
}
printf("rec model create!\n");
}
void LoadLabel(std::string path) {
std::ifstream file;
std::vector<std::string> label_list;
file.open(path);
while (file) {
std::string line;
std::getline(file, line);
std::string::size_type pos = line.find(" ");
if (pos != std::string::npos) {
line = line.substr(pos);
}
this->label_list.push_back(line);
}
file.clear();
file.close();
printf("feature extract model create!\n");
}
void SetPreProcessParam(const Json::Value &config_file) {
......@@ -97,19 +75,17 @@ public:
}
}
std::vector<RESULT> RunRecModel(const cv::Mat &img, double &cost_time);
std::vector<RESULT> PostProcess(const float *output_data, int output_size,
cv::Mat &output_image);
void RunRecModel(const cv::Mat &img, double &cost_time, std::vector<float> &feature);
//void PostProcess(std::vector<float> &feature);
cv::Mat ResizeImage(const cv::Mat &img);
void NeonMeanScale(const float *din, float *dout, int size);
private:
std::shared_ptr<PaddlePredictor> predictor;
std::vector<std::string> label_list;
//std::vector<std::string> label_list;
std::vector<float> mean = {0.485f, 0.456f, 0.406f};
std::vector<float> std = {1 / 0.229f, 1 / 0.224f, 1 / 0.225f};
double scale = 0.00392157;
float size = 224;
int topk = 5;
};
} // namespace PPShiTu
......@@ -16,7 +16,7 @@
#include <algorithm>
#include <ctime>
#include <include/recognition.h>
#include <include/feature_extractor.h>
#include <memory>
#include <numeric>
#include <string>
......
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#ifdef WIN32
#define OS_PATH_SEP "\\"
#else
#define OS_PATH_SEP "/"
#endif
#include "json/json.h"
#include <cstring>
#include <faiss/Index.h>
#include <faiss/index_io.h>
#include <map>
namespace PPShiTu {
struct SearchResult {
std::vector<faiss::Index::idx_t> I;
std::vector<float> D;
int return_k;
};
class VectorSearch {
public:
explicit VectorSearch(const Json::Value &config) {
// IndexProcess
this->index_dir = config["IndexProcess"]["index_dir"].as<std::string>();
this->return_k = config["IndexProcess"]["return_k"].as<int>();
this->score_thres = config["IndexProcess"]["score_thres"].as<float>();
this->max_query_number = config["Global"]["max_det_results"].as<int>() + 1;
LoadIdMap();
LoadIndexFile();
this->I.resize(this->return_k * this->max_query_number);
this->D.resize(this->return_k * this->max_query_number);
printf("faiss index load success!\n");
};
void LoadIdMap();
void LoadIndexFile();
const SearchResult &Search(float *feature, int query_number);
const std::string &GetLabel(faiss::Index::idx_t ind);
const float &GetThreshold() { return this->score_thres; }
private:
std::string index_dir;
int return_k = 5;
float score_thres = 0.5;
std::map<long int, std::string> id_map;
faiss::Index *index;
int max_query_number = 6;
std::vector<float> D;
std::vector<faiss::Index::idx_t> I;
SearchResult sr;
};
}
......@@ -12,12 +12,12 @@
// See the License for the specific language governing permissions and
// limitations under the License.
#include "include/recognition.h"
#include "include/feature_extractor.h"
namespace PPShiTu {
std::vector<RESULT> Recognition::RunRecModel(const cv::Mat &img,
double &cost_time) {
void FeatureExtract::RunRecModel(const cv::Mat &img,
double &cost_time,
std::vector<float> &feature) {
// Read img
cv::Mat resize_image = ResizeImage(img);
......@@ -38,8 +38,7 @@ std::vector<RESULT> Recognition::RunRecModel(const cv::Mat &img,
// Get output and post process
std::unique_ptr<const Tensor> output_tensor(
std::move(this->predictor->GetOutput(1)));
auto *output_data = output_tensor->data<float>();
std::move(this->predictor->GetOutput(0))); //only one output
auto end = std::chrono::system_clock::now();
auto duration =
std::chrono::duration_cast<std::chrono::microseconds>(end - start);
......@@ -47,17 +46,27 @@ std::vector<RESULT> Recognition::RunRecModel(const cv::Mat &img,
std::chrono::microseconds::period::num /
std::chrono::microseconds::period::den;
//do postprocess
int output_size = 1;
for (auto dim : output_tensor->shape()) {
output_size *= dim;
}
feature.resize(output_size);
output_tensor->CopyToCpu(feature.data());
cv::Mat output_image;
auto results = PostProcess(output_data, output_size, output_image);
return results;
//postprocess include sqrt or binarize.
//PostProcess(feature);
return;
}
void Recognition::NeonMeanScale(const float *din, float *dout, int size) {
// void FeatureExtract::PostProcess(std::vector<float> &feature){
// float feature_sqrt = std::sqrt(std::inner_product(
// feature.begin(), feature.end(), feature.begin(), 0.0f));
// for (int i = 0; i < feature.size(); ++i)
// feature[i] /= feature_sqrt;
// }
void FeatureExtract::NeonMeanScale(const float *din, float *dout, int size) {
if (this->mean.size() != 3 || this->std.size() != 3) {
std::cerr << "[ERROR] mean or scale size must equal to 3\n";
......@@ -99,45 +108,9 @@ void Recognition::NeonMeanScale(const float *din, float *dout, int size) {
}
}
cv::Mat Recognition::ResizeImage(const cv::Mat &img) {
cv::Mat FeatureExtract::ResizeImage(const cv::Mat &img) {
cv::Mat resize_img;
cv::resize(img, resize_img, cv::Size(this->size, this->size));
return resize_img;
}
std::vector<RESULT> Recognition::PostProcess(const float *output_data,
int output_size,
cv::Mat &output_image) {
int max_indices[this->topk];
double max_scores[this->topk];
for (int i = 0; i < this->topk; i++) {
max_indices[i] = 0;
max_scores[i] = 0;
}
for (int i = 0; i < output_size; i++) {
float score = output_data[i];
int index = i;
for (int j = 0; j < this->topk; j++) {
if (score > max_scores[j]) {
index += max_indices[j];
max_indices[j] = index - max_indices[j];
index -= max_indices[j];
score += max_scores[j];
max_scores[j] = score - max_scores[j];
score -= max_scores[j];
}
}
}
std::vector<RESULT> results(this->topk);
for (int i = 0; i < results.size(); i++) {
results[i].class_name = "Unknown";
if (max_indices[i] >= 0 && max_indices[i] < this->label_list.size()) {
results[i].class_name = this->label_list[max_indices[i]];
}
results[i].score = max_scores[i];
results[i].class_id = max_indices[i];
}
return results;
}
}
......@@ -24,9 +24,10 @@
#include <vector>
#include "include/config_parser.h"
#include "include/feature_extractor.h"
#include "include/object_detector.h"
#include "include/preprocess_op.h"
#include "include/recognition.h"
#include "include/vector_search.h"
#include "json/json.h"
Json::Value RT_Config;
......@@ -111,14 +112,18 @@ void DetPredictImage(const std::vector<cv::Mat> &batch_imgs,
}
}
void PrintResult(const std::string &image_path,
std::vector<PPShiTu::ObjectResult> &det_result) {
printf("%s:\n", image_path.c_str());
void PrintResult(std::string &img_path,
std::vector<PPShiTu::ObjectResult> &det_result,
PPShiTu::VectorSearch &vector_search,
PPShiTu::SearchResult &search_result) {
printf("%s:\n", img_path.c_str());
for (int i = 0; i < det_result.size(); ++i) {
int t = i;
printf("\tresult%d: bbox[%d, %d, %d, %d], score: %f, label: %s\n", i,
det_result[i].rect[0], det_result[i].rect[1], det_result[i].rect[2],
det_result[i].rect[3], det_result[i].rec_result[0].score,
det_result[i].rec_result[0].class_name.c_str());
det_result[t].rect[0], det_result[t].rect[1], det_result[t].rect[2],
det_result[t].rect[3], det_result[t].confidence,
vector_search.GetLabel(search_result.I[search_result.return_k * t])
.c_str());
}
}
......@@ -159,11 +164,16 @@ int main(int argc, char **argv) {
RT_Config["Global"]["cpu_num_threads"].as<int>(),
RT_Config["Global"]["batch_size"].as<int>());
// create rec model
PPShiTu::Recognition rec(RT_Config);
PPShiTu::FeatureExtract rec(RT_Config);
PPShiTu::VectorSearch searcher(RT_Config);
// Do inference on input image
std::vector<PPShiTu::ObjectResult> det_result;
std::vector<cv::Mat> batch_imgs;
// for vector search
std::vector<float> feature;
std::vector<float> features;
double rec_time;
if (!RT_Config["Global"]["infer_imgs"].as<std::string>().empty() ||
!img_dir.empty()) {
......@@ -178,8 +188,7 @@ int main(int argc, char **argv) {
return -1;
}
} else {
cv::glob(img_dir,
cv_all_img_paths);
cv::glob(img_dir, cv_all_img_paths);
for (const auto &img_path : cv_all_img_paths) {
all_img_paths.push_back(img_path);
}
......@@ -199,24 +208,25 @@ int main(int argc, char **argv) {
RT_Config["Global"]["max_det_results"].as<int>(), false, &det);
// add the whole image for recognition to improve recall
PPShiTu::ObjectResult result_whole_img = {
{0, 0, srcimg.cols, srcimg.rows}, 0, 1.0};
det_result.push_back(result_whole_img);
// PPShiTu::ObjectResult result_whole_img = {
// {0, 0, srcimg.cols, srcimg.rows}, 0, 1.0};
// det_result.push_back(result_whole_img);
// get rec result
PPShiTu::SearchResult search_result;
for (int j = 0; j < det_result.size(); ++j) {
int w = det_result[j].rect[2] - det_result[j].rect[0];
int h = det_result[j].rect[3] - det_result[j].rect[1];
cv::Rect rect(det_result[j].rect[0], det_result[j].rect[1], w, h);
cv::Mat crop_img = srcimg(rect);
std::vector<PPShiTu::RESULT> result =
rec.RunRecModel(crop_img, rec_time);
det_result[j].rec_result.assign(result.begin(), result.end());
rec.RunRecModel(crop_img, rec_time, feature);
features.insert(features.end(), feature.begin(), feature.end());
}
// rec nms
PPShiTu::nms(det_result,
RT_Config["Global"]["rec_nms_thresold"].as<float>(), true);
PrintResult(img_path, det_result);
// do vector search
search_result = searcher.Search(features.data(), det_result.size());
PrintResult(img_path, det_result, searcher, search_result);
batch_imgs.clear();
det_result.clear();
}
......
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include "include/vector_search.h"
#include <cstdio>
#include <faiss/index_io.h>
#include <fstream>
#include <iostream>
#include <regex>
namespace PPShiTu {
// load the vector.index
void VectorSearch::LoadIndexFile() {
std::string file_path = this->index_dir + OS_PATH_SEP + "vector.index";
const char *fname = file_path.c_str();
this->index = faiss::read_index(fname, 0);
}
// load id_map.txt
void VectorSearch::LoadIdMap() {
std::string file_path = this->index_dir + OS_PATH_SEP + "id_map.txt";
std::ifstream in(file_path);
std::string line;
std::vector<std::string> m_vec;
if (in) {
while (getline(in, line)) {
std::regex ws_re("\\s+");
std::vector<std::string> v(
std::sregex_token_iterator(line.begin(), line.end(), ws_re, -1),
std::sregex_token_iterator());
if (v.size() != 2) {
std::cout << "The number of element for each line in : " << file_path
<< "must be 2, exit the program..." << std::endl;
exit(1);
} else
this->id_map.insert(std::pair<long int, std::string>(
std::stol(v[0], nullptr, 10), v[1]));
}
}
}
// doing search
const SearchResult &VectorSearch::Search(float *feature, int query_number) {
this->D.resize(this->return_k * query_number);
this->I.resize(this->return_k * query_number);
this->index->search(query_number, feature, return_k, D.data(), I.data());
this->sr.return_k = this->return_k;
this->sr.D = this->D;
this->sr.I = this->I;
return this->sr;
}
const std::string &VectorSearch::GetLabel(faiss::Index::idx_t ind) {
return this->id_map.at(ind);
}
}
\ No newline at end of file
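The `VectorSearch` class above is a thin wrapper around faiss: `LoadIndexFile` reads `vector.index` from `index_dir`, and `Search` retrieves `return_k` neighbours per query feature. A rough Python equivalent is shown here only as a hedged illustration; it assumes `faiss` and `numpy` are installed and that the index was built as described in the lite_shitu documentation:

```python
import os
import faiss       # e.g. installed via `pip install faiss-cpu` (assumption)
import numpy as np

index_dir = "drink_dataset_v1.0/index"            # assumed location
index = faiss.read_index(os.path.join(index_dir, "vector.index"))

# `features` stands in for the extracted feature vectors, shape (n_queries, dim).
features = np.random.rand(2, index.d).astype("float32")  # placeholder queries
return_k = 5
D, I = index.search(features, return_k)  # scores and gallery ids per query
print(I[:, 0], D[:, 0])                  # best-match id and score for each query
```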
import argparse
import os
import pickle
import yaml
def parse_args():
parser = argparse.ArgumentParser()
parser.add_argument('-c', '--config', type=str, required=True)
args = parser.parse_args()
return args
def main():
args = parse_args()
with open(args.config) as fd:
config = yaml.load(fd.read(), yaml.FullLoader)
index_dir = ""
try:
index_dir = config["IndexProcess"]["index_dir"]
except Exception as e:
print("The IndexProcess.index_dir in config_file dose not exist")
exit(1)
id_map_path = os.path.join(index_dir, "id_map.pkl")
assert os.path.exists(
id_map_path), "The id_map file dose not exist: {}".format(id_map_path)
with open(id_map_path, "rb") as fd:
ids = pickle.load(fd)
with open(os.path.join(index_dir, "id_map.txt"), "w") as fd:
for k, v in ids.items():
v = v.split("\t")[1]
fd.write(str(k) + " " + v + "\n")
print('Transform id_map success')
if __name__ == "__main__":
main()
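For completeness, here is a hedged sketch of how the `id_map.txt` written above could be read back into a dict; it mirrors the C++ `LoadIdMap` shown earlier, and the helper name is illustrative:

```python
# Read id_map.txt back: each line is "<integer id> <label>", as written above.
def load_id_map(path):
    id_map = {}
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            key, label = line.split(" ", 1)
            id_map[int(key)] = label
    return id_map

# id_map = load_id_map("drink_dataset_v1.0/index/id_map.txt")
```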
......@@ -128,13 +128,10 @@ class DetPredictor(Predictor):
results = []
if reduce(lambda x, y: x * y, np_boxes.shape) < 6:
print('[WARNING] No object detected.')
results = np.array([])
else:
results = np_boxes
results = self.parse_det_results(results,
self.config["Global"]["threshold"],
self.config["Global"]["labe_list"])
results = self.parse_det_results(
np_boxes, self.config["Global"]["threshold"],
self.config["Global"]["label_list"])
return results
......
......@@ -11,6 +11,7 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import platform
import os
import argparse
import base64
......@@ -50,8 +51,10 @@ class Predictor(object):
else:
config.disable_gpu()
if args.enable_mkldnn:
# cache 10 different shapes for mkldnn to avoid memory leak
config.set_mkldnn_cache_capacity(10)
# there is no set_mkldnn_cache_capacity() on macOS
if platform.system() != "Darwin":
# cache 10 different shapes for mkldnn to avoid memory leak
config.set_mkldnn_cache_capacity(10)
config.enable_mkldnn()
config.set_cpu_math_library_num_threads(args.cpu_num_threads)
......
# ISE
---
## Catalogue
- [1. Introduction](#1)
- [2. Performance on Market1501 and MSMT17](#2)
- [3. Test](#3)
- [4. Reference](#4)
<a name='1'></a>
## 1. Introduction
ISE (Implicit Sample Extension) is a simple, efficient, and effective learning algorithm for unsupervised person Re-ID. ISE generates what we call support samples around the cluster boundaries. The sample generation process in ISE depends on two critical mechanisms, i.e., a progressive linear interpolation strategy and a label-preserving loss function. The generated support samples from ISE provide complementary information, which can nicely handle the "sub and mixed" clustering errors. ISE achieves superior performance to other unsupervised methods on the Market1501 and MSMT17 datasets.
> [**Implicit Sample Extension for Unsupervised Person Re-Identification**](https://arxiv.org/abs/2204.06892v1)<br>
> Xinyu Zhang, Dongdong Li, Zhigang Wang, Jian Wang, Errui Ding, Javen Qinfeng Shi, Zhaoxiang Zhang, Jingdong Wang<br>
> CVPR2022
![image](../../images/ISE_ReID/ISE_pipeline.png)
<a name='2'></a>
## 2. Performance on Market1501 and MSMT17
The main results on Market1501 (M) and MSMT17 (MS). PIL denotes the progressive linear interpolation strategy. LP represents the label-preserving loss function.
| Methods | M | Link | MS | Link |
| --- | -- | -- | -- | - |
| Baseline | 82.5 (92.5) | - | 30.1 (58.6) | - |
| ISE (+PIL) | 83.9 (93.9) | - | 33.5 (63.9) | - |
| ISE (+LP) | 83.6 (92.7) | - | 31.4 (59.9) | - |
| ISE (Ours) (+PIL+LP) | **84.7 (94.0)** | [ISE_M](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ISE_M_model.pdparams) | **35.0 (64.7)** | [ISE_MS](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ISE_MS_model.pdparams) |
<a name="3"></a>
## 3. Test
The training code is coming soon. We first release the test code with the pretrained models.
**Test:** You can simply run the following script for the evaluation.
```
python tools/eval.py -c ./ppcls/configs/Person/ResNet50_UReID_infer.yaml
```
**Steps:**
1. Download the pretrained model first, and put the model into: ```./pd_model_trace/ISE/```.
2. Change the dataset name in: ```./ppcls/configs/Person/ResNet50_UReID_infer.yaml```.
3. Run the above script.
<a name="4"></a>
## 4. Reference
If you find ISE useful in your research, please kindly consider citing our paper:
```
@inproceedings{zhang2022Implicit,
title={Implicit Sample Extension for Unsupervised Person Re-Identification},
author={Xinyu Zhang, Dongdong Li, Zhigang Wang, Jian Wang, Errui Ding, Javen Qinfeng Shi, Zhaoxiang Zhang, Jingdong Wang},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2022}
}
```
......@@ -55,7 +55,7 @@ After the data is settled, the model often determines the upper limit of the fin
<a name="2.3"></a>
### 2.3 Train the Model
After preparing the data and model, you can start training the model and updating the parameters of the model. After many iterations, a trained model can finally be obtained for image classification tasks. The training process of image classification requires a lot of experience and involves the setting of many hyperparameters. PaddleClas provides a series of [training tuning methods](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/en/models/Tricks_en.md), which can help you quickly obtain a high-precision model.
After preparing the data and model, you can start training the model and updating the parameters of the model. After many iterations, a trained model can finally be obtained for image classification tasks. The training process of image classification requires a lot of experience and involves the setting of many hyperparameters. PaddleClas provides a series of [training tuning methods](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/en/models_training/train_strategy_en.md), which can help you quickly obtain a high-precision model.
<a name="2.4"></a>
......
......@@ -108,7 +108,7 @@ PaddleClas strictly follows the resolution used by the authors of the paper. Sin
**A**:
There are many ssld pre-trained models available in PaddleClas, which obtain better pre-trained weights by semi-supervised knowledge distillation, so accuracy can be improved in transfer tasks or downstream vision tasks simply by replacing the pre-trained weights with the higher-accuracy ssld ones, without replacing the structure files. For example, in PaddleSeg, [HRNet](../models/HRNet_en.md), with the weights of the ssld pre-trained model, achieves much better accuracy than comparable models in the industry; in PaddleDetection, [PP-YOLO](https://github.com/PaddlePaddle/PaddleDetection/blob/release/0.4/configs/ppyolo/README_cn.md) with ssld pre-trained weights further improves on an already high baseline. Transfer learning for classification with ssld pre-trained weights also yields impressive results, and the benefits of knowledge distillation for classification transfer tasks are detailed in [SSLD Distillation Strategy](../advanced_tutorials/knowledge_distillation_en.md)
There are many ssld pre-trained models available in PaddleClas, which obtain better pre-trained weights by semi-supervised knowledge distillation, so accuracy can be improved in transfer tasks or downstream vision tasks simply by replacing the pre-trained weights with the higher-accuracy ssld ones, without replacing the structure files. For example, in PaddleSeg, [HRNet](../models/HRNet_en.md), with the weights of the ssld pre-trained model, achieves much better accuracy than comparable models in the industry; in PaddleDetection, [PP-YOLO](https://github.com/PaddlePaddle/PaddleDetection/blob/release/0.4/configs/ppyolo/README_cn.md) with ssld pre-trained weights further improves on an already high baseline. Transfer learning for classification with ssld pre-trained weights also yields impressive results, and the benefits of knowledge distillation for classification transfer tasks are detailed in [SSLD Distillation Strategy](../advanced_tutorials/distillation/distillation_en.md)
<a name="3"></a>
......@@ -143,7 +143,7 @@ When adopting multiple models for inference, it is recommended to first export t
**A**
- You can adopt auto-mixed precision training, which can gain a significantly faster speed with almost zero precision loss. Take ResNet50 as an example, the configuration file of auto-mixed precision training in PaddleClas can be found at: [ResNet50_fp16.yml](../../../ppcls/configs/ImageNet/ResNet/ResNet50_fp16.yaml). The main step is to add the following lines to the standard configuration file.
- You can adopt auto-mixed precision training, which can gain a significantly faster speed with almost zero precision loss. Take ResNet50 as an example, the configuration file of auto-mixed precision training in PaddleClas can be found at: [ResNet50_amp_O1.yml](../../../ppcls/configs/ImageNet/ResNet/ResNet50_amp_O1.yaml). The main step is to add the following lines to the standard configuration file.
```
# mixed precision training
......@@ -351,7 +351,7 @@ At this stage, it has become a common practice in the image recognition field to
**A**: If the existing strategy cannot further improve the accuracy of the model, it means that the model has almost reached saturation with the existing dataset and strategy, and two methods are provided here.
- Mining relevant data: Use the model trained on the existing dataset to make predictions on the relevant data, label the data with higher confidence and add it to the training set for further training. Repeat the steps above to further improve the accuracy of the model.
- Knowledge distillation: You can use a larger model to train a teacher model with higher accuracy on the dataset, and then adopt the teacher model to teach a Student model, where the Student model is the target model. PaddleClas provides Baidu's own SSLD knowledge distillation scheme, which can steadily improve by more than 3% even on such a challenging classification task as ImageNet-1k. For the chapter on SSLD knowledge distillation, please refer to [**SSLD Knowledge Distillation**](../advanced_tutorials/knowledge_distillation_en.md).
- Knowledge distillation: You can use a larger model to train a teacher model with higher accuracy on the dataset, and then adopt the teacher model to teach a Student model, where the Student model is the target model. PaddleClas provides Baidu's own SSLD knowledge distillation scheme, which can steadily improve by more than 3% even on such a challenging classification task as ImageNet-1k. For the chapter on SSLD knowledge distillation, please refer to [**SSLD Knowledge Distillation**](../advanced_tutorials/distillation/distillation_en.md).
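To make the knowledge-distillation point concrete, here is a rough sketch of a soft-label distillation loss in the spirit of SSLD. It is illustrative only and is not PaddleClas' actual implementation; the function name and `temperature` parameter are made up for the example.

```python
# Schematic soft-label distillation loss: the student is trained to match the
# teacher's softened predictions on (possibly unlabeled) images.
import paddle
import paddle.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    # Teacher predictions are treated as soft targets (no gradient).
    soft_targets = F.softmax(teacher_logits.detach() / temperature, axis=1)
    log_probs = F.log_softmax(student_logits / temperature, axis=1)
    # Cross entropy between soft targets and student predictions.
    return -(soft_targets * log_probs).sum(axis=1).mean()
```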
<a name="6"></a>
......
......@@ -248,7 +248,7 @@ PaddleClas saves/updates the following three types of models during training.
#### Q2.4.2: How can recognition models be fine-tuned to train on the basis of pre-trained models?
**A**: The fine-tuning of the recognition model is similar to that of the classification model. The recognition model can be loaded with a pre-trained product recognition model, and the training process can be found in [recognition model training](../../models_training/recognition_en.md); we will continue to refine the documentation.
**A**: The fine-tuning of the recognition model is similar to that of the classification model. The recognition model can be loaded with a pre-trained product recognition model, and the training process can be found in [recognition model training](../models_training/recognition_en.md); we will continue to refine the documentation.
#### Q2.4.3: Why does it fail to run all mini-batches in each epoch when training metric learning?
......@@ -353,4 +353,4 @@ pip install paddle2onnx
- `InputSpec()` function is used to describe the signature information of the model input, including the `shape`, `type` and `name` of the input data (can be omitted).
- The `paddle.onnx.export()` function needs to specify the model grouping object `net`, the save path of the exported model `save_path`, and the description of the model's input data `input_spec`.
Note that `paddlepaddle` `2.0.0` or above should be used. See [paddle.onnx.export](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/onnx/) for more details on the parameters of the `paddle.onnx.export()` function.
Note that `paddlepaddle` `2.0.0` or above should be used. See [paddle.onnx.export](https://www.paddlepaddle.org.cn/documentation/docs/en/api/paddle/onnx/export_en.html) for more details on the parameters of the `paddle.onnx.export()` function.
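For reference, a minimal sketch of such an export is shown below. The `resnet50` model from `paddle.vision.models` is used only as a stand-in for whatever trained network you actually want to export, the output path is arbitrary, and `paddle2onnx` must be installed for `paddle.onnx.export` to work.

```python
# Minimal sketch: export a dygraph model to ONNX with an explicit input spec.
import paddle
from paddle.static import InputSpec

# Stand-in network; replace with your own trained paddle.nn.Layer.
net = paddle.vision.models.resnet50(pretrained=False)

# Describe the input signature: shape, dtype and (optional) name.
x_spec = InputSpec(shape=[None, 3, 224, 224], dtype="float32", name="x")

# save_path is a prefix; "inference/model.onnx" will be generated.
paddle.onnx.export(net, "inference/model", input_spec=[x_spec], opset_version=10)
```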
......@@ -27,8 +27,8 @@
> >
- Q: 怎样根据自己的任务选择合适的模型进行训练?How to choose the right training model?
- A: If you want to deploy on the server with a high requirement for accuracy but not model storage size or prediction speed, then it is recommended to use ResNet_vd, Res2Net_vd, DenseNet, Xception, etc., which are suitable for server-side models. If you want to deploy on the mobile side, then it is recommended to use MobileNetV3 and GhostNet. Meanwhile, we suggest you refer to the speed-accuracy metrics chart in [Model Library](../models/models_intro_en.md) when choosing models.
- Q: How to choose the right training model?
- A: If you want to deploy on the server with a high requirement for accuracy but not model storage size or prediction speed, then it is recommended to use ResNet_vd, Res2Net_vd, DenseNet, Xception, etc., which are suitable for server-side models. If you want to deploy on the mobile side, then it is recommended to use MobileNetV3 and GhostNet. Meanwhile, we suggest you refer to the speed-accuracy metrics chart in [Model Library](../algorithm_introduction/ImageNet_models_en.md) when choosing models.
> >
......@@ -280,7 +280,7 @@ Loss:
> >
- Q: How to train with Automatic Mixed Precision (AMP) during training?
- A: You can refer to [ResNet50_fp16.yaml](../../../ppcls/configs/ImageNet/ResNet/ResNet50_fp16.yaml). Specifically, if you want your configuration file to support automatic mixed precision during model training, you can add the following information to the file.
- A: You can refer to [ResNet50_amp_O1.yaml](../../../ppcls/configs/ImageNet/ResNet/ResNet50_amp_O1.yaml). Specifically, if you want your configuration file to support automatic mixed precision during model training, you can add the following information to the file.
```
# mixed precision training
......
......@@ -48,7 +48,7 @@ Before installing the service module, you need to prepare the inference model an
**Notice**:
* The model file path can be viewed and modified in `PaddleClas/deploy/hubserving/clas/params.py`.
* It should be noted that the prefix of model structure file and model parameters file must be `inference`.
* More models provided by PaddleClas can be obtained from the [model library](../models/models_intro_en.md). You can also use models trained by yourself.
* More models provided by PaddleClas can be obtained from the [model library](../algorithm_introduction/ImageNet_models_en.md). You can also use models trained by yourself.
<a name="4"></a>
## 4. Install Service Module
......
......@@ -4,7 +4,7 @@ This tutorial will introduce how to use [Paddle-Lite](https://github.com/PaddleP
Paddle-Lite is a lightweight inference engine for PaddlePaddle. It provides efficient inference capabilities for mobile phones and IoT devices, and extensively integrates cross-platform hardware to provide lightweight deployment solutions for mobile-side deployment.
If you only want to test speed, please refer to [The tutorial of Paddle-Lite mobile-side benchmark test](../extension/paddle_mobile_inference_en.md).
If you only want to test speed, please refer to [The tutorial of Paddle-Lite mobile-side benchmark test](../others/paddle_mobile_inference_en.md).
---
......@@ -18,6 +18,7 @@ If you only want to test speed, please refer to [The tutorial of Paddle-Lite mob
- [2.1.1 [RECOMMEND] Use pip to install Paddle-Lite and optimize model](#2.1.1)
- [2.1.2 Compile Paddle-Lite to generate opt tool](#2.1.2)
- [2.1.3 Demo of get the optimized model](#2.1.3)
- [2.1.4 Compile to get the executable file clas_system](#2.1.4)
- [2.2 Run optimized model on Phone](#2.2)
- [3. FAQ](#3)
......@@ -40,20 +41,20 @@ For the detailed compilation directions of different development environments, p
|Platform|Inference Library Download Link|
|-|-|
|Android|[arm7](https://paddlelite-data.bj.bcebos.com/Release/2.8-rc/Android/gcc/inference_lite_lib.android.armv7.gcc.c++_static.with_extra.with_cv.tar.gz) / [arm8](https://paddlelite-data.bj.bcebos.com/Release/2.8-rc/Android/gcc/inference_lite_lib.android.armv8.gcc.c++_static.with_extra.with_cv.tar.gz)|
|iOS|[arm7](https://paddlelite-data.bj.bcebos.com/Release/2.8-rc/iOS/inference_lite_lib.ios.armv7.with_cv.with_extra.tiny_publish.tar.gz) / [arm8](https://paddlelite-data.bj.bcebos.com/Release/2.8-rc/iOS/inference_lite_lib.ios.armv8.with_cv.with_extra.tiny_publish.tar.gz)|
|Android|[arm7](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10/inference_lite_lib.android.armv7.clang.c++_static.with_extra.with_cv.tar.gz) / [arm8](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10/inference_lite_lib.android.armv8.clang.c++_static.with_extra.with_cv.tar.gz) |
|iOS|[arm7](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10/inference_lite_lib.ios.armv7.with_cv.with_extra.tiny_publish.tar.gz) / [arm8](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10/inference_lite_lib.ios.armv8.with_cv.with_extra.tiny_publish.tar.gz)|
**NOTE**:
1. If you download the inference library from [Paddle-Lite official document](https://paddle-lite.readthedocs.io/zh/latest/quick_start/release_lib.html#android-toolchain-gcc), please choose `with_extra=ON` , `with_cv=ON` .
2. It is recommended to build inference library using [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) develop branch if you want to deploy the [quantitative](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/slim/quantization/README_en.md) model to mobile phones. Please refer to the [link](https://paddle-lite.readthedocs.io/zh/latest/user_guides/Compile/Android.html#id2) for more detailed information about compiling.
2. It is recommended to build inference library using [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) develop branch if you want to deploy the [quantitative](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/slim/quantization/README_en.md) model to mobile phones. Please refer to the [link](https://paddle-lite.readthedocs.io/) for more detailed information about compiling.
The structure of the inference library is as follows:
```
inference_lite_lib.android.armv8/
inference_lite_lib.android.armv8.clang.c++_static.with_extra.with_cv/
|-- cxx C++ inference library and header files
| |-- include C++ header files
| | |-- paddle_api.h
......@@ -148,6 +149,23 @@ paddle_lite_opt --model_file=./MobileNetV3_large_x1_0_infer/inference.pdmodel --
```
When the above command finishes, a file named `MobileNetV3_large_x1_0.nb` will be generated in the current directory, which is the converted model file.
<a name="2.1.4"></a>
#### 2.1.4 Compile to get the executable file clas_system
```shell
# Clone the Autolog repository to get automation logs
cd PaddleClas_root_path
cd deploy/lite/
git clone https://github.com/LDOUBLEV/AutoLog.git
```
```shell
# Compile
make -j
```
After executing the `make` command, the `clas_system` executable file is generated in the current directory, which is used for Lite prediction.
<a name="2.2"></a>
## 2.2 Run optimized model on Phone
......@@ -172,7 +190,7 @@ When the above code command is completed, there will be ``MobileNetV3_large_x1_0
* Install ADB for windows
To install ADB on Windows, you need to download it from Google's Android platform: [Download Link](https://developer.android.com/studio).
First, make sure the phone is connected to the computer, turn on the `USB debugging` option of the phone, and select the `file transfer` mode. Verify whether ADB is installed successfully as follows:
3. First, make sure the phone is connected to the computer, turn on the `USB debugging` option of the phone, and select the `file transfer` mode. Verify whether ADB is installed successfully as follows:
```shell
$ adb devices
......@@ -183,42 +201,22 @@ When the above code command is completed, there will be ``MobileNetV3_large_x1_0
If there is `device` output like the above, it means the installation was successful.
4. Prepare optimized model, inference library files, test image and dictionary file used.
4. Push the optimized model, prediction library file, test image and class map file to the phone.
```shell
cd PaddleClas_root_path
cd deploy/lite/
# prepare.sh will put the inference library files, the test image and the dictionary files in demo/cxx/clas
sh prepare.sh /{lite inference library path}/inference_lite_lib.android.armv8
# enter the working directory of lite demo
cd /{lite inference library path}/inference_lite_lib.android.armv8/
cd demo/cxx/clas/
# copy the C++ inference dynamic library file (ie. .so) to the debug folder
cp ../../../cxx/lib/libpaddle_light_api_shared.so ./debug/
```shell
adb shell mkdir -p /data/local/tmp/arm_cpu/
adb push clas_system /data/local/tmp/arm_cpu/
adb shell chmod +x /data/local/tmp/arm_cpu//clas_system
adb push inference_lite_lib.android.armv8.clang.c++_static.with_extra.with_cv/cxx/lib/libpaddle_light_api_shared.so /data/local/tmp/arm_cpu/
adb push MobileNetV3_large_x1_0.nb /data/local/tmp/arm_cpu/
adb push config.txt /data/local/tmp/arm_cpu/
adb push ../../ppcls/utils/imagenet1k_label_list.txt /data/local/tmp/arm_cpu/
adb push imgs/tabby_cat.jpg /data/local/tmp/arm_cpu/
```
The `prepare.sh` take `PaddleClas/deploy/lite/imgs/tabby_cat.jpg` as the test image, and copy it to the `demo/cxx/clas/debug/` directory.
You should put the model that optimized by `paddle_lite_opt` under the `demo/cxx/clas/debug/` directory. In this example, use `MobileNetV3_large_x1_0.nb` model file generated in [2.1.3](#2.1.3).
The structure of the clas demo is as follows after the above command is completed:
```
demo/cxx/clas/
|-- debug/
| |--MobileNetV3_large_x1_0.nb class model
| |--tabby_cat.jpg test image
| |--imagenet1k_label_list.txt dictionary file
| |--libpaddle_light_api_shared.so C++ .so file
| |--config.txt config file
|-- config.txt config file
|-- image_classfication.cpp source code
|-- Makefile compile file
```
**NOTE**:
* `imagenet1k_label_list.txt` is the category mapping file of the `ImageNet1k` dataset. If you use custom categories, you need to replace this mapping file with your own.
......@@ -229,33 +227,22 @@ clas_model_file ./MobileNetV3_large_x1_0.nb # path of model file
label_path ./imagenet1k_label_list.txt # path of category mapping file
resize_short_size 256 # the short side length after resize
crop_size 224 # side length used for inference after cropping
visualize 0 # whether to visualize. If you set it to 1, an image file named 'clas_result.png' will be generated in the current directory.
num_threads 1 # The number of threads, the default is 1
precision FP32 # Precision type, you can choose FP32 or INT8, the default is FP32
runtime_device arm_cpu # Device type, the default is arm_cpu
enable_benchmark 0 # Whether to enable benchmark, the default is 0
tipc_benchmark 0 # Whether to enable tipc_benchmark, the default is 0
```
5. Run Model on Phone
```shell
# run compile to get the executable file 'clas_system'
make -j
# move the compiled executable file to the debug folder
mv clas_system ./debug/
# push the debug folder to Phone
adb push debug /data/local/tmp/
Execute the following command to complete the prediction on the mobile phone.
adb shell
cd /data/local/tmp/debug
export LD_LIBRARY_PATH=/data/local/tmp/debug:$LD_LIBRARY_PATH
# the usage of clas_system is as follows:
# ./clas_system "path of config file" "path of test image"
./clas_system ./config.txt ./tabby_cat.jpg
```shell
adb shell 'export LD_LIBRARY_PATH=/data/local/tmp/arm_cpu/; /data/local/tmp/arm_cpu/clas_system /data/local/tmp/arm_cpu/config.txt /data/local/tmp/arm_cpu/tabby_cat.jpg'
```
**NOTE**: If you make changes to the code, you need to recompile and repush the `debug ` folder to the phone.
The result is as follows:
![](../../images/inference_deployment/lite_demo_result.png)
......
# CSWinTransformer
---
## Catalogue
* [1. Overview](#1)
* [2. Accuracy, FLOPs and Parameters](#2)
<a name='1'></a>
## 1. Overview
CSWinTransformer is a new vision Transformer network that can be used as a general backbone in the field of computer vision. CSWinTransformer computes self-attention within a cross-shaped window, which is not only very computationally efficient but also obtains a global receptive field within two successive layers. CSWinTransformer also proposes a new positional encoding, LePE, which further improves model accuracy. [Paper](https://arxiv.org/abs/2107.00652)
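As a quick sanity check of the model interface, the sketch below instantiates the smallest variant and runs a forward pass. It assumes the factory function registered in this commit (`CSWinTransformer_tiny_224`) is re-exported from `ppcls.arch.backbone` and that the default head outputs 1000 ImageNet classes.

```python
# Hypothetical usage sketch for CSWinTransformer in PaddleClas.
import paddle
from ppcls.arch.backbone import CSWinTransformer_tiny_224  # assumed re-export

model = CSWinTransformer_tiny_224(pretrained=False)
model.eval()

x = paddle.rand([1, 3, 224, 224])  # the *_224 variants expect 224x224 inputs
with paddle.no_grad():
    logits = model(x)
print(logits.shape)  # expected [1, 1000] with the default ImageNet head
```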
<a name='2'></a>
## 2. Accuracy, FLOPs and Parameters
| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPs<br>(G) | Params<br>(M) |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| CSWinTransformer_tiny_224 | 0.8281 | 0.9628 | 0.828 | - | 4.1 | 22 |
| CSWinTransformer_small_224 | 0.8358 | 0.9658 | 0.836 | - | 6.4 | 35 |
| CSWinTransformer_base_224 | 0.8420 | 0.9692 | 0.842 | - | 14.3 | 77 |
| CSWinTransformer_large_224 | 0.8643 | 0.9799 | 0.865 | - | 32.2 | 173.3 |
| CSWinTransformer_base_384 | 0.8550 | 0.9749 | 0.855 | - | 42.2 | 77 |
| CSWinTransformer_large_384 | 0.8748 | 0.9833 | 0.875 | - | 94.7 | 173.3 |
# MobileviT
---
## Catalogue
* [1. Overview](#1)
* [2. Accuracy, FLOPs and Parameters](#2)
<a name='1'></a>
## 1. Overview
MobileViT is a lightweight vision Transformer network that can be used as a general backbone in the field of computer vision. MobileViT combines the advantages of CNNs and Transformers, handling both global and local features well and better addressing the Transformer's lack of inductive bias. As a result, with the same number of parameters, it achieves large improvements over other SOTA models on image classification, object detection, and semantic segmentation tasks. [Paper](https://arxiv.org/pdf/2110.02178.pdf)
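The short sketch below builds the small variant and counts its parameters; it is only an illustration, assuming `MobileViT_S` (registered in this commit) is re-exported from `ppcls.arch.backbone` and that the model is fed the 256x256 inputs it is commonly trained with.

```python
# Hypothetical usage sketch for MobileViT in PaddleClas.
import numpy as np
import paddle
from ppcls.arch.backbone import MobileViT_S  # assumed re-export

model = MobileViT_S(pretrained=False)
model.eval()

# Count trainable parameters (illustrative only).
num_params = sum(int(np.prod(p.shape)) for p in model.parameters())
print(f"MobileViT_S parameters: {num_params / 1e6:.2f} M")

x = paddle.rand([1, 3, 256, 256])  # MobileViT is commonly trained at 256x256
with paddle.no_grad():
    print(model(x).shape)
```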
<a name='2'></a>
## 2. Accuracy, FLOPs and Parameters
| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPs<br>(M) | Params<br>(M) |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| MobileViT_XXS | 0.6867 | 0.8878 | 0.690 | - | 1849.35 | 5.59 |
| MobileViT_XS | 0.7454 | 0.9227 | 0.747 | - | 930.75 | 2.33 |
| MobileViT_S | 0.7814 | 0.9413 | 0.783 | - | 337.24 | 1.28 |
......@@ -107,7 +107,7 @@ For image classification, ImageNet dataset is adopted. Compared with the current
| PPLCNet_x1_0_ssld | 3.0 | 161 | 74.39 | 92.09 | 2.46 |
| PPLCNet_x2_5_ssld | 9.0 | 906 | 80.82 | 95.33 | 5.39 |
where `_ssld` represents the model after using `SSLD distillation`. For details about `SSLD distillation`, see [SSLD distillation](../advanced_tutorials/knowledge_distillation_en.md).
where `_ssld` represents the model after using `SSLD distillation`. For details about `SSLD distillation`, see [SSLD distillation](../advanced_tutorials/distillation/distillation_en.md).
Performance comparison with other lightweight networks:
......@@ -190,7 +190,7 @@ Rather than holding on to perfect FLOPs and Params as academics do, PP-LCNet foc
Reference to cite when you use PP-LCNet in a paper:
```
@misc{cui2021pplcnet,
title={PP-LCNet: A Lightweight CPU Convolutional Neural Network},
title={PP-LCNet: A Lightweight CPU Convolutional Neural Network},
author={Cheng Cui and Tingquan Gao and Shengyu Wei and Yuning Du and Ruoyu Guo and Shuilong Dong and Bin Lu and Ying Zhou and Xueying Lv and Qiwen Liu and Xiaoguang Hu and Dianhai Yu and Yanjun Ma},
year={2021},
eprint={2109.15099},
......
......@@ -217,7 +217,7 @@ Some of the configurable evaluation parameters are described as follows:
**Note:** When loading the model to be evaluated, you only need to specify the path of the model file instead of including the suffix; PaddleClas will automatically add the `.pdparams` suffix, as in [3.1.3 Resume Training](#3.1.3).
When loading the model to be evaluated, you only need to specify the path of the model file instead of including the suffix; PaddleClas will automatically add the `.pdparams` suffix, as in [3.1.3 Resume Training](https://github.com/PaddlePaddle/PaddleClas/blob/ develop/docs/zh_CN/models_training/classification.md#3.1.3).
When loading the model to be evaluated, you only need to specify the path of the model file instead of including the suffix; PaddleClas will automatically add the `.pdparams` suffix, as in [3.1.3 Resume Training](../models_training/classification_en.md#3.1.3).
<a name="3.2"></a>
......
......@@ -27,7 +27,7 @@ The first step is to select the model to be studied, here we choose ResNet50. Co
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNet50_pretrained.pdparams
```
For the network structure code and pre-trained weights of other models, please download them from the [model library](../../../ppcls/arch/backbone/) and [pre-trained models](../models/models_intro_en.md).
For the network structure code and pre-trained weights of other models, please download them from the [model library](../../../ppcls/arch/backbone/) and [pre-trained models](../algorithm_introduction/ImageNet_models_en.md).
<a name='3'></a>
......
......@@ -75,9 +75,23 @@ python3 -m paddle.distributed.launch \
The highest accuracy of the validation set is around 0.415.
* **Note**
Multiple GPUs are used for training here. If only one GPU is used, please set `CUDA_VISIBLE_DEVICES` to the desired GPU and specify it with the `--gpus` option, the same below. For example, to train with GPU 0 only:
```shell
export CUDA_VISIBLE_DEVICES=0
python3 -m paddle.distributed.launch \
--gpus="0" \
tools/train.py \
-c ./ppcls/configs/quick_start/professional/ResNet50_vd_CIFAR100.yaml \
-o Global.output_dir="output_CIFAR" \
-o Optimizer.lr.learning_rate=0.01
```
* **Notice**:
* The GPUs specified in `--gpus` can be a subset of the GPUs specified in `CUDA_VISIBLE_DEVICES`.
* Since the initial learning rate and the batch size need to maintain a linear relationship, when training is switched from 4 GPUs to 1 GPU, the total batch size is reduced to 1/4 of the original, and the learning rate also needs to be reduced to 1/4, so the default learning rate is changed from 0.04 to 0.01 (see the short helper after this list).
* If the number of GPU cards is not 4, the accuracy of the validation set may be different from 0.415. To maintain a comparable accuracy, you need to change the learning rate in the configuration file to `the current learning rate / 4 \* current card number`. The same below.
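The linear scaling rule in the notes above can be written out explicitly. The helper below is only an illustration of the arithmetic, not a PaddleClas utility; the name and arguments are invented for the example, and it assumes the per-GPU batch size stays unchanged.

```python
# Illustrative helper for the linear scaling rule between GPUs and learning rate.
def scale_lr(base_lr, base_num_gpus, new_num_gpus):
    # With the per-GPU batch size unchanged, the total batch size scales with
    # the number of GPUs, so the learning rate scales by the same factor.
    return base_lr * new_num_gpus / base_num_gpus

print(scale_lr(0.04, 4, 1))  # 0.01, matching the note above
```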
<a name="2.1.2"></a>
......
# Quick Start of Multi-label Classification
Experience the training, evaluation, and prediction of multi-label classification based on the [NUS-WIDE-SCENE](https://lms.comp.nus.edu.sg/wp-content/uploads/2019/research/nuswide/NUS-WIDE.html) dataset, which is a subset of the NUS-WIDE dataset. Please first install PaddlePaddle and PaddleClas, see [Paddle Installation](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/installation) and [PaddleClas installation](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/installation/install_ paddleclas.md) for more details.
Experience the training, evaluation, and prediction of multi-label classification based on the [NUS-WIDE-SCENE](https://lms.comp.nus.edu.sg/wp-content/uploads/2019/research/nuswide/NUS-WIDE.html) dataset, which is a subset of the NUS-WIDE dataset. Please first install PaddlePaddle and PaddleClas, see [Paddle Installation](../installation/install_paddle_en.md) and [PaddleClas installation](../installation/install_paddleclas_en.md) for more details.
## Catalogue
......
# Converting a Recognition Model into a Classification Model
PaddleClas provides the `gallery2fc.py` tool to help you convert a recognition model into a classification model. Currently the tool only supports converting quantized models, so it is recommended to use the `general_PPLCNet_x2_5_pretrained_v1.0_quant` pretrained model provided by PaddleClas, which is a quantized general recognition model with PPLCNet_x2_5 as its backbone.
If you need to use other models, please refer to the document [Model Quantization](./model_prune_quantization.md) for the details of quantization.
## 1. Model Conversion Instructions
### 1.1 Prepare the Gallery Data and the Pretrained Model
#### 1. Gallery dataset
First, prepare the gallery data. The drink dataset (drink_dataset_v1.0) provided by PaddleClas is taken as an example below; it can be obtained as follows:
```shell
cd PaddleClas/
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v1.0.tar
tar -xf drink_dataset_v1.0.tar
```
The gallery images of the drink dataset are located in `drink_dataset_v1.0/gallery/`, and the image list can be found in `drink_dataset_v1.0/gallery/drink_label.txt`. For the gallery data format, please refer to the document [Dataset Format Description](../data_preparation/recognition_dataset.md#1-数据集格式说明).
#### 2. Pretrained model
Before converting the model, prepare the pretrained model. Taking the quantized `general_PPLCNet_x2_5` model as an example, download the pretrained model:
```shell
cd PaddleClas/pretrained/
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/general_PPLCNet_x2_5_pretrained_v1.0_quant.pdparams
```
### 1.2 Prepare the Configuration File
Model conversion requires a configuration file that defines the necessary parameters. The configuration file used in this example is `ppcls/configs/GeneralRecognition/Gallery2FC_PPLCNet_x2_5.yaml`; its fields are described as follows:
* Global:
  * pretrained_model: path of the pretrained model, without the `.pdparams` suffix;
  * image_shape: input size of the model, without the batch size dimension;
  * save_inference_dir: directory where the converted model is saved;
* Arch: definition of the model architecture; see [Configuration Description](../models_training/config_description.md#3-%E8%AF%86%E5%88%AB%E6%A8%A1%E5%9E%8B);
* IndexProcess: definitions related to the gallery dataset
  * image_root: root path of the gallery dataset;
  * data_file: path of the gallery list file;
### 1.3 Model Conversion
After the above preparation, run the following command to convert the model:
```python
python ppcls/utils/gallery2fc.py -c ppcls/configs/GeneralRecognition/Gallery2FC_PPLCNet_x2_5.yaml
```
After the command finishes, the converted and exported model is saved under `./inference/general_PPLCNet_x2_5_quant/`. Note that during inference deployment the model usually has multiple outputs, so be careful to pick the classification result as the model output.
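The following is a minimal sketch, not part of the original tutorial, of loading the exported model with the Paddle Inference Python API and listing its outputs so that the classification result can be identified. The `inference.pdmodel` / `inference.pdiparams` file names are assumptions based on PaddleClas' usual export naming.

```python
# Hypothetical sketch: run the converted model with Paddle Inference and
# inspect its outputs to find the classification result.
import numpy as np
from paddle.inference import Config, create_predictor

# Assumed file names; adjust to what gallery2fc.py actually exported.
config = Config("./inference/general_PPLCNet_x2_5_quant/inference.pdmodel",
                "./inference/general_PPLCNet_x2_5_quant/inference.pdiparams")
predictor = create_predictor(config)

input_handle = predictor.get_input_handle(predictor.get_input_names()[0])
input_handle.copy_from_cpu(np.random.rand(1, 3, 224, 224).astype("float32"))
predictor.run()

# The converted model may expose several outputs; print their shapes and pick
# the one whose last dimension equals the number of gallery classes.
for name in predictor.get_output_names():
    out = predictor.get_output_handle(name).copy_to_cpu()
    print(name, out.shape)
```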
......@@ -80,7 +80,7 @@
Since the model needs to be trained, collect your own dataset. For data preparation and the corresponding format, please refer to the `4.1 Data Preparation` part of the [feature extraction document](../image_recognition_pipeline/feature_extraction.md) and the [recognition dataset description](../data_preparation/recognition_dataset.md). Note that this step requires a large amount of data to guarantee the effect of the recognition model. For the training configuration file, refer to the [general recognition model configuration file](../../../ppcls/configs/GeneralRecognition/GeneralRecognition_PPLCNet_x2_5.yaml), and for the training method, refer to [recognition model training](../models_training/recognition.md).
- Data augmentation: choose different augmentation methods according to the actual situation. For example, if occlusion is severe in the real application, it is recommended to add the `RandomErasing` augmentation. See the [data augmentation document](./DataAugmentation.md) for details.
- Switch to a different `backbone`; generally, a larger model has a stronger feature-extraction ability. For the available `backbone`s, see the [model introduction](../models/models_intro.md).
- Switch to a different `backbone`; generally, a larger model has a stronger feature-extraction ability. For the available `backbone`s, see the [model introduction](../algorithm_introduction/ImageNet_models.md).
- Choose a different `Metric Learning` method. Different `Metric Learning` methods may behave differently on different datasets; it is recommended to try other `Loss` functions, see [Metric Learning](../algorithm_introduction/metric_learning.md) for details.
- Use distillation to improve the capability of the small model, see [model distillation](../algorithm_introduction/knowledge_distillation.md) for details.
- Enlarge the dataset: add badcase data for the samples that are misrecognized.
......
# ISE
---
## Catalogue
- [1. Introduction](#1)
- [2. Results on Market1501 and MSMT17](#2)
- [3. Testing](#3)
- [4. Citation](#4)
<a name='1'></a>
## 1. Introduction
ISE (Implicit Sample Extension) is a simple, efficient and effective learning algorithm for unsupervised person re-identification. ISE generates samples around the cluster boundaries, which we call support samples. The sample generation process of ISE relies on two key mechanisms: a progressive linear interpolation strategy and a label-preserving loss function. The support samples generated by ISE provide complementary information, which can well handle the "sub- and mixed-cluster" errors. ISE achieves better performance than other unsupervised methods on the Market1501 and MSMT17 datasets.
> [**Implicit Sample Extension for Unsupervised Person Re-Identification**](https://arxiv.org/abs/2204.06892v1)<br>
> Xinyu Zhang, Dongdong Li, Zhigang Wang, Jian Wang, Errui Ding, Javen Qinfeng Shi, Zhaoxiang Zhang, Jingdong Wang<br>
> CVPR2022
![image](../../images/ISE_ReID/ISE_pipeline.png)
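To give a feel for the idea, the snippet below sketches the progressive linear interpolation at the heart of support-sample generation. It is a schematic illustration only; the function name, arguments and schedule are invented here and this is not the authors' released code.

```python
# Schematic sketch of progressive linear interpolation for support samples.
import paddle

def generate_support_sample(anchor_feat, neighbor_feat, cur_iter, total_iters):
    # Interpolate from the anchor embedding toward the embedding of a sample
    # from a neighbouring cluster. The coefficient grows progressively during
    # training, so support samples move closer to the cluster boundary.
    lam = min(cur_iter / float(total_iters), 1.0)
    return anchor_feat + lam * (neighbor_feat - anchor_feat)

# Toy example with random 2048-d embeddings.
a = paddle.rand([2048])
b = paddle.rand([2048])
print(generate_support_sample(a, b, cur_iter=100, total_iters=1000).shape)
```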
<a name='2'></a>
## 2. Results on Market1501 and MSMT17
Main results on Market1501 and MSMT17. "PIL" denotes the progressive linear interpolation strategy, and "LP" denotes the label-preserving loss function.
| Method | Market1501 | Download | MSMT17 | Download |
| --- | -- | -- | -- | - |
| Baseline | 82.5 (92.5) | - | 30.1 (58.6) | - |
| ISE (+PIL) | 83.9 (93.9) | - | 33.5 (63.9) | - |
| ISE (+LP) | 83.6 (92.7) | - | 31.4 (59.9) | - |
| ISE (Ours) (+PIL+LP) | **84.7 (94.0)** | [ISE_M](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ISE_M_model.pdparams) | **35.0 (64.7)** | [ISE_MS](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ISE_MS_model.pdparams) |
<a name="3"></a>
## 3. Testing
The training code will be released soon; for now we provide the testing code and models.
**Testing:** Model evaluation can simply be performed with the following script.
```
python tools/eval.py -c ./ppcls/configs/Person/ResNet50_UReID_infer.yaml
```
**Steps:**
1. First download the model and put it under ```./pd_model_trace/ISE/```.
2. Change the dataset name in ```./ppcls/configs/Person/ResNet50_UReID_infer.yaml```.
3. Run the script above.
<a name="4"></a>
## 4. Citation
If ISE is helpful to your research, please consider citing our paper:
```
@inproceedings{zhang2022Implicit,
title={Implicit Sample Extension for Unsupervised Person Re-Identification},
author={Xinyu Zhang, Dongdong Li, Zhigang Wang, Jian Wang, Errui Ding, Javen Qinfeng Shi, Zhaoxiang Zhang, Jingdong Wang},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2022}
}
```
......@@ -122,14 +122,15 @@ ResNet 系列模型中,相比于其他模型,ResNet_vd 模型在预测速度
**A**
* You can use automatic mixed precision for training, which brings a noticeable speedup with almost no loss of accuracy. Taking ResNet50 as an example, the configuration file for automatic mixed precision training in PaddleClas can be found at [ResNet50_fp16.yml](../../../ppcls/configs/ImageNet/ResNet/ResNet50_fp16.yaml); the main step is to add the following lines to the standard configuration file.
* You can use automatic mixed precision for training, which brings a noticeable speedup with almost no loss of accuracy. Taking ResNet50 as an example, the configuration file for automatic mixed precision training in PaddleClas can be found at [ResNet50_amp_O1.yml](../../../ppcls/configs/ImageNet/ResNet/ResNet50_amp_O1.yaml); the main step is to add the following lines to the standard configuration file.
```
```yaml
# mixed precision training
AMP:
scale_loss: 128.0
use_dynamic_loss_scaling: True
use_pure_fp16: &use_pure_fp16 True
# O1: mixed fp16
level: O1
```
* You can enable DALI to run the data preprocessing on the GPU. When the model is small (so the reader accounts for a larger share of the time), enabling DALI brings an obvious training speedup. Add `-o Global.use_dali=True` when training to use DALI. For more about installing DALI, please refer to the [DALI installation tutorial](https://docs.nvidia.com/deeplearning/dali/user-guide/docs/installation.html#nightly-builds).
......
......@@ -31,7 +31,8 @@
>>
* Q: How do I choose a suitable model to train for my own task?
* A: If you want to deploy on the server side, or want accuracy to be as high as possible without strict requirements on model size or prediction speed, then server-side series such as ResNet_vd, Res2Net_vd, DenseNet and Xception are recommended; if you want to deploy on the mobile side, then mobile-side series such as MobileNetV3 and GhostNet are recommended. We also suggest referring to the speed-accuracy chart in the [Model Library](../models/models_intro.md) when choosing a model.
* A: If you want to deploy on the server side, or want accuracy to be as high as possible without strict requirements on model size or prediction speed, then server-side series such as ResNet_vd, Res2Net_vd, DenseNet and Xception are recommended; if you want to deploy on the mobile side, then mobile-side series such as MobileNetV3 and GhostNet are recommended. We also suggest referring to the speed-accuracy chart in the [Model Library](../algorithm_introduction/ImageNet_models.md) when choosing a model.
>>
* Q: How should parameters be initialized, and what kind of initialization can speed up model convergence?
......@@ -232,11 +233,13 @@ Loss:
* A: If you want to use TensorRT for model inference, you need to install or compile PaddlePaddle with TensorRT. Users on Linux, Windows and macOS can refer to [Download Inference Library](https://paddleinference.paddlepaddle.org.cn/user_guides/download_lib.html); if there is no version that meets your needs, you need to compile and install it locally, see [Compile from Source](https://paddleinference.paddlepaddle.org.cn/user_guides/source_compile.html).
>>
* Q: How can I use Automatic Mixed Precision (AMP) during training?
* A: You can refer to the configuration file [ResNet50_fp16.yaml](../../../ppcls/configs/ImageNet/ResNet/ResNet50_fp16.yaml); specifically, if you want your own configuration file to support automatic mixed precision during training, you can add the following information to the file.
```
* A: You can refer to the configuration file [ResNet50_amp_O1.yaml](../../../ppcls/configs/ImageNet/ResNet/ResNet50_amp_O1.yaml); specifically, if you want your own configuration file to support automatic mixed precision during training, you can add the following information to the file.
```yaml
# mixed precision training
AMP:
scale_loss: 128.0
use_dynamic_loss_scaling: True
use_pure_fp16: &use_pure_fp16 True
# O1: mixed fp16
level: O1
```
......@@ -15,8 +15,8 @@ PaddleClas 支持通过 PaddleHub 快速进行服务化部署。目前支持图
- [5.2 Start with the configuration file](#5.2)
- [6. Send a prediction request](#6)
- [7. Customize the service module](#7)
<a name="1"></a>
## 1. Introduction
......@@ -55,7 +55,7 @@ pip3 install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/sim
```
Please note:
* The model file names (including `.pdmodel` and `.pdiparams`) must be `inference`.
* We also provide a large number of pretrained models based on the ImageNet-1k dataset; for the model list and download links, see the [Model Library Overview](../models/models_intro.md). You can also use your own trained and converted models.
* We also provide a large number of pretrained models based on the ImageNet-1k dataset; for the model list and download links, see the [Model Library Overview](../algorithm_introduction/ImageNet_models.md). You can also use your own trained and converted models.
<a name="4"></a>
......
# CSWinTransformer
---
## Catalogue
* [1. Overview](#1)
* [2. Accuracy, FLOPs and Parameters](#2)
<a name='1'></a>
## 1. Overview
CSWinTransformer is a new vision Transformer network that can be used as a general backbone in the field of computer vision. CSWinTransformer computes self-attention within a cross-shaped window, which is not only very computationally efficient but also obtains a global receptive field within two successive layers. CSWinTransformer also proposes a new positional encoding, LePE, which further improves model accuracy. [Paper](https://arxiv.org/abs/2107.00652)
<a name='2'></a>
## 2. Accuracy, FLOPs and Parameters
| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPs<br>(G) | Params<br>(M) |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| CSWinTransformer_tiny_224 | 0.8281 | 0.9628 | 0.828 | - | 4.1 | 22 |
| CSWinTransformer_small_224 | 0.8358 | 0.9658 | 0.836 | - | 6.4 | 35 |
| CSWinTransformer_base_224 | 0.8420 | 0.9692 | 0.842 | - | 14.3 | 77 |
| CSWinTransformer_large_224 | 0.8643 | 0.9799 | 0.865 | - | 32.2 | 173.3 |
| CSWinTransformer_base_384 | 0.8550 | 0.9749 | 0.855 | - | 42.2 | 77 |
| CSWinTransformer_large_384 | 0.8748 | 0.9833 | 0.875 | - | 94.7 | 173.3 |
# MobileviT
---
## Catalogue
* [1. Overview](#1)
* [2. Accuracy, FLOPs and Parameters](#2)
<a name='1'></a>
## 1. Overview
MobileViT is a lightweight vision Transformer network that can be used as a general backbone in the field of computer vision. MobileViT combines the advantages of CNNs and Transformers, handling both global and local features well and better addressing the Transformer's lack of inductive bias. As a result, with the same number of parameters, it achieves large improvements over other SOTA models on image classification, object detection, and semantic segmentation tasks. [Paper](https://arxiv.org/pdf/2110.02178.pdf)
<a name='2'></a>
## 2. Accuracy, FLOPs and Parameters
| Models | Top1 | Top5 | Reference<br>top1 | Reference<br>top5 | FLOPs<br>(M) | Params<br>(M) |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| MobileViT_XXS | 0.6867 | 0.8878 | 0.690 | - | 1849.35 | 5.59 |
| MobileViT_XS | 0.7454 | 0.9227 | 0.747 | - | 930.75 | 2.33 |
| MobileViT_S | 0.7814 | 0.9413 | 0.783 | - | 337.24 | 1.28 |
......@@ -23,7 +23,7 @@
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNet50_pretrained.pdparams
```
For the network structure code and pretrained weights of other models, please download them yourself: [model library](../../../ppcls/arch/backbone/), [pretrained models](../models/models_intro.md).
For the network structure code and pretrained weights of other models, please download them yourself: [model library](../../../ppcls/arch/backbone/), [pretrained models](../algorithm_introduction/ImageNet_models.md).
<a name='3'></a>
......
......@@ -3,12 +3,8 @@
- 2021.11.1 发布[PP-ShiTu技术报告](https://arxiv.org/pdf/2111.00775.pdf),新增饮料识别demo。
- 2021.10.23 发布轻量级图像识别系统PP-ShiTu,CPU上0.2s即可完成在10w+库的图像识别。[点击这里](../quick_start/quick_start_recognition.md)立即体验。
- 2021.09.17 发布PP-LCNet系列超轻量骨干网络模型, 在Intel CPU上,单张图像预测速度约5ms,ImageNet-1K数据集上Top1识别准确率达到80.82%,超越ResNet152的模型效果。PP-LCNet的介绍可以参考[论文](https://arxiv.org/pdf/2109.15099.pdf), 或者[PP-LCNet模型介绍](../models/PP-LCNet.md),相关指标和预训练权重可以从 [这里](../algorithm_introduction/ImageNet_models.md)下载。
- 2021.08.11 更新7个[FAQ](../faq_series/faq_2021_s2.md)
- 2021.06.29 添加Swin-transformer系列模型,ImageNet1k数据集上Top1 acc最高精度可达87.2%;支持训练预测评估与whl包部署,预训练模型可以从[这里](../models/models_intro.md)下载。
- 2021.06.22,23,24 PaddleClas官方研发团队带来技术深入解读三日直播课。课程回放:[https://aistudio.baidu.com/aistudio/course/introduce/24519](https://aistudio.baidu.com/aistudio/course/introduce/24519)
- 2021.06.16 PaddleClas v2.2版本升级,集成Metric learning,向量检索等组件。新增商品识别、动漫人物识别、车辆识别和logo识别等4个图像识别应用。新增LeViT、Twins、TNT、DLA、HarDNet、RedNet系列30个预训练模型。
- 2021.08.11 更新 7 个[FAQ](../faq_series/faq_2021_s2.md)
- 2021.06.29 添加 Swin-transformer 系列模型,ImageNet1k 数据集上 Top1 acc 最高精度可达 87.2%;支持训练预测评估与 whl 包部署,预训练模型可以从[这里](../models/models_intro.md)下载。
- 2021.06.29 添加 Swin-transformer 系列模型,ImageNet1k 数据集上 Top1 acc 最高精度可达 87.2%;支持训练预测评估与 whl 包部署,预训练模型可以从[这里](../algorithm_introduction/ImageNet_models.md)下载。
- 2021.06.22,23,24 PaddleClas 官方研发团队带来技术深入解读三日直播课。课程回放:[https://aistudio.baidu.com/aistudio/course/introduce/24519](https://aistudio.baidu.com/aistudio/course/introduce/24519)
- 2021.06.16 PaddleClas v2.2 版本升级,集成 Metric learning,向量检索等组件。新增商品识别、动漫人物识别、车辆识别和 logo 识别等 4 个图像识别应用。新增 LeViT、Twins、TNT、DLA、HarDNet、RedNet 系列 30 个预训练模型。
- 2021.04.15
......
......@@ -75,9 +75,22 @@ python3 -m paddle.distributed.launch \
The highest accuracy on the validation set is around 0.415.
* **Note**
Multiple GPUs are used for training here. If only one GPU is used, please set `CUDA_VISIBLE_DEVICES` to the desired GPU and specify it with `--gpus`, the same below. For example, to train with GPU 0 only:
```shell
export CUDA_VISIBLE_DEVICES=0
python3 -m paddle.distributed.launch \
--gpus="0" \
tools/train.py \
-c ./ppcls/configs/quick_start/professional/ResNet50_vd_CIFAR100.yaml \
-o Global.output_dir="output_CIFAR" \
-o Optimizer.lr.learning_rate=0.01
```
* **Note**:
* If the number of GPUs is not 4, the validation accuracy may differ from 0.415. To keep a comparable accuracy, change the learning rate in the configuration file to `current learning rate / 4 \* current number of GPUs`. The same below.
* The GPUs specified in `--gpus` can be a subset of the GPUs specified in `CUDA_VISIBLE_DEVICES`.
* Since the initial learning rate and the batch size need to maintain a linear relationship, when training is switched from 4 GPUs to 1 GPU, the total batch size is reduced to 1/4 of the original, and the learning rate also needs to be reduced to 1/4, so the default learning rate is changed from 0.04 to 0.01.
<a name="2.1.2"></a>
......@@ -157,7 +170,7 @@ python3 -m paddle.distributed.launch \
* **Note**
* For other data augmentation configuration files, please refer to those in `ppcls/configs/ImageNet/DataAugment/`.
* Since CIFAR100 is trained for relatively few epochs, the validation accuracy may fluctuate by about 1% during training.
<a name="4"></a>
......
# Detecting Electric Bicycles Entering Elevators
In recent years, fires caused by electric bicycles being brought into buildings and homes have become common. To address this, an electric-bicycle entry detection model has been developed, aiming to reduce such incidents at the source. To handle the false alarms that an indoor motorcycle model may produce, an additional image-retrieval step is used for more precise recognition. This case uses PP-ShiTu, the general image recognition system in the PaddlePaddle image classification toolkit.
![result](./imgs/result.png)
Note: To run the code online on AI Studio, please refer to [the full pipeline of electric bicycle detection in elevators](https://aistudio.baidu.com/aistudio/projectdetail/3497217?contributionType=1).
\ No newline at end of file
**Product recognition** means accurately and quickly identifying product categories, attributes, etc. in smart retail scenarios. Today's retail market needs to reduce labor and operating costs and to run 24 hours a day, and product recognition can help more and more retail stores achieve the digital transformation of smart retail. With the smart retail concept booming, product recognition has very broad application scenarios, such as **shelf display analysis**, **smart checkout**, **warehouse management** and **search by image**.
This case uses PP-ShiTu, the general image recognition system in PaddleClas, to implement **product recognition**.
![result](./imgs/result.jpg)
**Note**: To run the code online on AI Studio, please refer to [the smart supermarket product recognition system](https://aistudio.baidu.com/aistudio/projectdetail/3460304).
\ No newline at end of file
......@@ -27,8 +27,9 @@ from ppcls.arch.backbone.base.theseus_layer import TheseusLayer
from ppcls.utils import logger
from ppcls.utils.save_load import load_dygraph_pretrain
from ppcls.arch.slim import prune_model, quantize_model
from ppcls.arch.distill.afd_attention import LinearTransformStudent, LinearTransformTeacher
__all__ = ["build_model", "RecModel", "DistillationModel"]
__all__ = ["build_model", "RecModel", "DistillationModel", "AttentionModel"]
def build_model(config):
......@@ -132,3 +133,24 @@ class DistillationModel(nn.Layer):
else:
result_dict[model_name] = self.model_list[idx](x, label)
return result_dict
class AttentionModel(DistillationModel):
    def __init__(self,
                 models=None,
                 pretrained_list=None,
                 freeze_params_list=None,
                 **kargs):
        super().__init__(models, pretrained_list, freeze_params_list, **kargs)

    def forward(self, x, label=None):
        result_dict = dict()
        out = x
        # Unlike DistillationModel, the sub-models are chained: each model
        # consumes the previous model's output, and the per-model result
        # dicts are merged into a single result dict.
        for idx, model_name in enumerate(self.model_name_list):
            if label is None:
                out = self.model_list[idx](out)
                result_dict.update(out)
            else:
                out = self.model_list[idx](out, label)
                result_dict.update(out)
        return result_dict
......@@ -51,6 +51,7 @@ from ppcls.arch.backbone.model_zoo.regnet import RegNetX_200MF, RegNetX_4GF, Reg
from ppcls.arch.backbone.model_zoo.vision_transformer import ViT_small_patch16_224, ViT_base_patch16_224, ViT_base_patch16_384, ViT_base_patch32_384, ViT_large_patch16_224, ViT_large_patch16_384, ViT_large_patch32_384
from ppcls.arch.backbone.model_zoo.distilled_vision_transformer import DeiT_tiny_patch16_224, DeiT_small_patch16_224, DeiT_base_patch16_224, DeiT_tiny_distilled_patch16_224, DeiT_small_distilled_patch16_224, DeiT_base_distilled_patch16_224, DeiT_base_patch16_384, DeiT_base_distilled_patch16_384
from ppcls.arch.backbone.model_zoo.swin_transformer import SwinTransformer_tiny_patch4_window7_224, SwinTransformer_small_patch4_window7_224, SwinTransformer_base_patch4_window7_224, SwinTransformer_base_patch4_window12_384, SwinTransformer_large_patch4_window7_224, SwinTransformer_large_patch4_window12_384
from ppcls.arch.backbone.model_zoo.cswin_transformer import CSWinTransformer_tiny_224, CSWinTransformer_small_224, CSWinTransformer_base_224, CSWinTransformer_large_224, CSWinTransformer_base_384, CSWinTransformer_large_384
from ppcls.arch.backbone.model_zoo.mixnet import MixNet_S, MixNet_M, MixNet_L
from ppcls.arch.backbone.model_zoo.rexnet import ReXNet_1_0, ReXNet_1_3, ReXNet_1_5, ReXNet_2_0, ReXNet_3_0
from ppcls.arch.backbone.model_zoo.gvt import pcpvt_small, pcpvt_base, pcpvt_large, alt_gvt_small, alt_gvt_base, alt_gvt_large
......@@ -61,6 +62,9 @@ from ppcls.arch.backbone.model_zoo.tnt import TNT_small
from ppcls.arch.backbone.model_zoo.hardnet import HarDNet68, HarDNet85, HarDNet39_ds, HarDNet68_ds
from ppcls.arch.backbone.model_zoo.cspnet import CSPDarkNet53
from ppcls.arch.backbone.model_zoo.pvt_v2 import PVT_V2_B0, PVT_V2_B1, PVT_V2_B2_Linear, PVT_V2_B2, PVT_V2_B3, PVT_V2_B4, PVT_V2_B5
from ppcls.arch.backbone.model_zoo.mobilevit import MobileViT_XXS, MobileViT_XS, MobileViT_S
from ppcls.arch.backbone.model_zoo.repvgg import RepVGG_A0, RepVGG_A1, RepVGG_A2, RepVGG_B0, RepVGG_B1, RepVGG_B2, RepVGG_B1g2, RepVGG_B1g4, RepVGG_B2g4, RepVGG_B3g4
from ppcls.arch.backbone.model_zoo.van import VAN_tiny
from ppcls.arch.backbone.variant_models.resnet_variant import ResNet50_last_stage_stride1
from ppcls.arch.backbone.variant_models.vgg_variant import VGG19Sigmoid
from ppcls.arch.backbone.variant_models.pp_lcnet_variant import PPLCNet_x2_5_Tanh
......
......@@ -35,7 +35,7 @@ class TheseusLayer(nn.Layer):
self.quanter = None
def _return_dict_hook(self, layer, input, output):
res_dict = {"output": output}
res_dict = {"logits": output}
# 'list' is needed to avoid error raised by popping self.res_dict
for res_key in list(self.res_dict):
# clear the res_dict because the forward process may change according to input
......
......@@ -12,6 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# reference: https://arxiv.org/abs/1908.07919
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
......@@ -459,6 +461,7 @@ class HRNet(TheseusLayer):
self.avg_pool = nn.AdaptiveAvgPool2D(1)
stdv = 1.0 / math.sqrt(2048 * 1.0)
self.flatten = nn.Flatten(start_axis=1, stop_axis=-1)
self.fc = nn.Linear(
2048,
......@@ -496,7 +499,7 @@ class HRNet(TheseusLayer):
y = self.conv_last(y)
y = self.avg_pool(y)
y = paddle.reshape(y, shape=[-1, y.shape[1]])
y = self.flatten(y)
y = self.fc(y)
return y
......
......@@ -12,6 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# reference: https://arxiv.org/abs/1512.00567v3
from __future__ import absolute_import, division, print_function
import math
import paddle
......
......@@ -12,6 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# reference: https://arxiv.org/abs/1704.04861
from __future__ import absolute_import, division, print_function
from paddle import ParamAttr
......
......@@ -12,6 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# reference: https://arxiv.org/abs/1905.02244
from __future__ import absolute_import, division, print_function
import paddle
......
......@@ -12,6 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# reference: https://arxiv.org/pdf/1512.03385
from __future__ import absolute_import, division, print_function
import numpy as np
......@@ -276,6 +278,7 @@ class ResNet(TheseusLayer):
config,
stages_pattern,
version="vb",
stem_act="relu",
class_num=1000,
lr_mult_list=[1.0, 1.0, 1.0, 1.0, 1.0],
data_format="NCHW",
......@@ -309,13 +312,13 @@ class ResNet(TheseusLayer):
[[input_image_channel, 32, 3, 2], [32, 32, 3, 1], [32, 64, 3, 1]]
}
self.stem = nn.Sequential(* [
self.stem = nn.Sequential(*[
ConvBNLayer(
num_channels=in_c,
num_filters=out_c,
filter_size=k,
stride=s,
act="relu",
act=stem_act,
lr_mult=self.lr_mult_list[0],
data_format=data_format)
for in_c, out_c, k, s in self.stem_cfg[version]
......
......@@ -12,6 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# reference: https://arxiv.org/abs/1409.1556
from __future__ import absolute_import, division, print_function
import paddle.nn as nn
......
......@@ -12,6 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# reference: https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
import paddle
from paddle import ParamAttr
import paddle.nn as nn
......
......@@ -13,6 +13,7 @@
# limitations under the License.
# Code was heavily based on https://github.com/rwightman/pytorch-image-models
# reference: https://arxiv.org/abs/1911.11929
import paddle
import paddle.nn as nn
......
......@@ -12,6 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# reference: https://arxiv.org/abs/1804.02767
import paddle
from paddle import ParamAttr
import paddle.nn as nn
......
......@@ -12,6 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# reference: https://arxiv.org/abs/1608.06993
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
......
......@@ -13,6 +13,7 @@
# limitations under the License.
# Code was heavily based on https://github.com/facebookresearch/deit
# reference: https://arxiv.org/abs/2012.12877
import paddle
import paddle.nn as nn
......
......@@ -13,6 +13,7 @@
# limitations under the License.
# Code was based on https://github.com/ucbdrive/dla
# reference: https://arxiv.org/abs/1707.06484
import math
......
......@@ -12,6 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# reference: https://arxiv.org/abs/1707.01629
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
......
......@@ -13,6 +13,7 @@
# limitations under the License.
# Code was based on https://github.com/lukemelas/EfficientNet-PyTorch
# reference: https://arxiv.org/abs/1905.11946
import paddle
from paddle import ParamAttr
......
......@@ -13,6 +13,7 @@
# limitations under the License.
# Code was based on https://github.com/huawei-noah/CV-Backbones/tree/master/ghostnet_pytorch
# reference: https://arxiv.org/abs/1911.11907
import math
import paddle
......
# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# reference: https://arxiv.org/abs/1409.4842
import paddle
from paddle import ParamAttr
import paddle.nn as nn
......
......@@ -13,6 +13,7 @@
# limitations under the License.
# Code was based on https://github.com/Meituan-AutoML/Twins
# reference: https://arxiv.org/abs/2104.13840
from functools import partial
......
......@@ -13,6 +13,7 @@
# limitations under the License.
# Code was based on https://github.com/PingoLH/Pytorch-HarDNet
# reference: https://arxiv.org/abs/1909.00948
import paddle
import paddle.nn as nn
......
......@@ -12,6 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# reference: https://arxiv.org/abs/1602.07261
import paddle
from paddle import ParamAttr
import paddle.nn as nn
......
......@@ -13,6 +13,7 @@
# limitations under the License.
# Code was based on https://github.com/facebookresearch/LeViT
# reference: https://openaccess.thecvf.com/content/ICCV2021/html/Graham_LeViT_A_Vision_Transformer_in_ConvNets_Clothing_for_Faster_Inference_ICCV_2021_paper.html
import itertools
import math
......
......@@ -11,11 +11,8 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
MixNet for ImageNet-1K, implemented in Paddle.
Original paper: 'MixConv: Mixed Depthwise Convolutional Kernels,'
https://arxiv.org/abs/1907.09595.
"""
# reference: https://arxiv.org/abs/1907.09595
import os
from inspect import isfunction
......
......@@ -12,6 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# reference: https://arxiv.org/abs/1801.04381
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
......
......@@ -13,6 +13,7 @@
# limitations under the License.
# Code was heavily based on https://github.com/whai362/PVT
# reference: https://arxiv.org/abs/2106.13797
from functools import partial
import math
......
......@@ -13,6 +13,7 @@
# limitations under the License.
# Code was based on https://github.com/d-li14/involution
# reference: https://arxiv.org/abs/2103.06255
import paddle
import paddle.nn as nn
......
......@@ -13,6 +13,7 @@
# limitations under the License.
# Code was based on https://github.com/facebookresearch/pycls
# reference: https://arxiv.org/abs/1905.13214
from __future__ import absolute_import
from __future__ import division
......
......@@ -13,6 +13,7 @@
# limitations under the License.
# Code was based on https://github.com/DingXiaoH/RepVGG
# reference: https://arxiv.org/abs/2101.03697
import paddle.nn as nn
import paddle
......@@ -33,18 +34,12 @@ MODEL_URLS = {
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/RepVGG_B1_pretrained.pdparams",
"RepVGG_B2":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/RepVGG_B2_pretrained.pdparams",
"RepVGG_B3":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/RepVGG_B3_pretrained.pdparams",
"RepVGG_B1g2":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/RepVGG_B1g2_pretrained.pdparams",
"RepVGG_B1g4":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/RepVGG_B1g4_pretrained.pdparams",
"RepVGG_B2g2":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/RepVGG_B2g2_pretrained.pdparams",
"RepVGG_B2g4":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/RepVGG_B2g4_pretrained.pdparams",
"RepVGG_B3g2":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/RepVGG_B3g2_pretrained.pdparams",
"RepVGG_B3g4":
"https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/RepVGG_B3g4_pretrained.pdparams",
}
......@@ -92,6 +87,8 @@ class RepVGGBlock(nn.Layer):
groups=1,
padding_mode='zeros'):
super(RepVGGBlock, self).__init__()
self.is_repped = False
self.in_channels = in_channels
self.out_channels = out_channels
self.kernel_size = kernel_size
......@@ -127,6 +124,12 @@ class RepVGGBlock(nn.Layer):
groups=groups)
def forward(self, inputs):
if not self.training and not self.is_repped:
self.rep()
self.is_repped = True
if self.training and self.is_repped:
self.is_repped = False
if not self.training:
return self.nonlinearity(self.rbr_reparam(inputs))
......@@ -137,7 +140,7 @@ class RepVGGBlock(nn.Layer):
return self.nonlinearity(
self.rbr_dense(inputs) + self.rbr_1x1(inputs) + id_out)
def eval(self):
def rep(self):
if not hasattr(self, 'rbr_reparam'):
self.rbr_reparam = nn.Conv2D(
in_channels=self.in_channels,
......@@ -148,12 +151,9 @@ class RepVGGBlock(nn.Layer):
dilation=self.dilation,
groups=self.groups,
padding_mode=self.padding_mode)
self.training = False
kernel, bias = self.get_equivalent_kernel_bias()
self.rbr_reparam.weight.set_value(kernel)
self.rbr_reparam.bias.set_value(bias)
for layer in self.sublayers():
layer.eval()
def get_equivalent_kernel_bias(self):
kernel3x3, bias3x3 = self._fuse_bn_tensor(self.rbr_dense)
......@@ -248,12 +248,6 @@ class RepVGG(nn.Layer):
self.cur_layer_idx += 1
return nn.Sequential(*blocks)
def eval(self):
self.training = False
for layer in self.sublayers():
layer.training = False
layer.eval()
def forward(self, x):
out = self.stage0(x)
out = self.stage1(out)
......@@ -367,17 +361,6 @@ def RepVGG_B2(pretrained=False, use_ssld=False, **kwargs):
return model
def RepVGG_B2g2(pretrained=False, use_ssld=False, **kwargs):
model = RepVGG(
num_blocks=[4, 6, 16, 1],
width_multiplier=[2.5, 2.5, 2.5, 5],
override_groups_map=g2_map,
**kwargs)
_load_pretrained(
pretrained, model, MODEL_URLS["RepVGG_B2g2"], use_ssld=use_ssld)
return model
def RepVGG_B2g4(pretrained=False, use_ssld=False, **kwargs):
model = RepVGG(
num_blocks=[4, 6, 16, 1],
......@@ -389,28 +372,6 @@ def RepVGG_B2g4(pretrained=False, use_ssld=False, **kwargs):
return model
def RepVGG_B3(pretrained=False, use_ssld=False, **kwargs):
model = RepVGG(
num_blocks=[4, 6, 16, 1],
width_multiplier=[3, 3, 3, 5],
override_groups_map=None,
**kwargs)
_load_pretrained(
pretrained, model, MODEL_URLS["RepVGG_B3"], use_ssld=use_ssld)
return model
def RepVGG_B3g2(pretrained=False, use_ssld=False, **kwargs):
model = RepVGG(
num_blocks=[4, 6, 16, 1],
width_multiplier=[3, 3, 3, 5],
override_groups_map=g2_map,
**kwargs)
_load_pretrained(
pretrained, model, MODEL_URLS["RepVGG_B3g2"], use_ssld=use_ssld)
return model
def RepVGG_B3g4(pretrained=False, use_ssld=False, **kwargs):
model = RepVGG(
num_blocks=[4, 6, 16, 1],
......
......@@ -12,6 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# reference: https://arxiv.org/abs/1904.01169
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
......