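## Recent Updates
- 📢 A three-day live course will run on **June 15 to June 17 at 20:30 each evening**. It gives a detailed introduction to the ultra-lightweight image classification solution and breaks down the model optimization principles and usage for each scenario, followed by full-process, hands-on practice with industrial cases, step-by-step guidance on solving common pain points, and live Q&A. Scan the QR code to join!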
<div align="center">
<img src="https://user-images.githubusercontent.com/80816848/173404459-9426c0ed-4801-4f75-876f-2e6ec47255f5.png" width = "200" height = "200"/>
<img src="https://user-images.githubusercontent.com/45199522/173483779-2332f990-4941-4f8d-baee-69b62035fc31.png" width = "200" height = "200"/>
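- 🔥️ 2022.6.15 Released the [PULC practical ultra-lightweight image classification solution](docs/zh_CN/PULC/PULC_train.md): 3 ms CPU inference, accuracy comparable to SwinTransformer, covering nine common tasks across person, vehicle, and OCR scenarios.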
......@@ -47,11 +47,11 @@ PaddleClas released [PP-HGNet](docs/zh_CN/models/PP-HGNet.md), [PP-LCNetv2](docs
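## Join the Technical Discussion Group
* Scan the WeChat/QQ QR codes below (add the assistant on WeChat and reply "C") to join the PaddleClas WeChat group, get faster answers to your questions, and exchange ideas with developers from all industries. We look forward to having you.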
<div align="center">
<img src="https://user-images.githubusercontent.com/80816848/164383225-e375eb86-716e-41b4-a9e0-4b8a3976c1aa.jpg" width="200"/>
<img src="https://user-images.githubusercontent.com/48054808/160531099-9811bbe6-cfbb-47d5-8bdb-c2b40684d7dd.png" width="200"/>
<img src="https://user-images.githubusercontent.com/80816848/164383225-e375eb86-716e-41b4-a9e0-4b8a3976c1aa.jpg" width="200"/>
## Quick Start
......@@ -33,8 +33,8 @@ For the introduction of PP-LCNet, please refer to [paper](https://arxiv.org/pdf/
## Features
PaddleClas has released PP-HGNet, PP-LCNetv2, PP-LCNet, and the **S**imple **S**emi-supervised **L**abel **D**istillation algorithms, and supports a wide range of
image classification and image recognition algorithms.
Based on the algorithms above, PaddleClas has released the PP-ShiTu image recognition system and [**P**ractical **U**ltra **L**ight-weight image **C**lassification solutions](docs/en/PULC/PULC_quickstart_en.md).
......@@ -59,6 +59,10 @@ Quick experience of **P**ractical **U**ltra **L**ight-weight image **C**lassific
- [Install Paddle](./docs/en/installation/install_paddle_en.md)
- [Install PaddleClas Environment](./docs/en/installation/install_paddleclas_en.md)
- [Practical Ultra Light-weight image Classification solutions](./docs/en/PULC/PULC_quickstart_en.md)
- [PULC Quick Start](docs/en/PULC/PULC_quickstart_en.md)
- [PULC Model Zoo](docs/en/PULC/PULC_model_list_en.md)
- [PULC Classification Model of Someone or Nobody](docs/en/PULC/PULC_person_exists_en.md)
- [Quick Start of Recognition](./docs/en/tutorials/quick_start_recognition_en.md)
- [Quick Start of Recognition](./docs/en/quick_start/quick_start_recognition_en.md)
- [Introduction to Image Recognition Systems](#Introduction_to_Image_Recognition_Systems)
- [Image Recognition Demo images](#Rec_Demo_images)
# Paddle2ONNX: Converting To ONNX and Deployment
This section describes how to convert the Paddle Inference Model ResNet50_vd to an ONNX model and deploy it with the ONNX engine.
## 1. Installation
First, install Paddle2ONNX and onnxruntime. Paddle2ONNX is a toolkit for converting a Paddle Inference Model to an ONNX model. Please refer to [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX/blob/develop/README_en.md) for more information.
- Paddle2ONNX Installation
python3.7 -m pip install paddle2onnx
- ONNX Runtime Installation
python3.7 -m pip install onnxruntime
## 2. Converting to ONNX
Download the Paddle Inference Model ResNet50_vd:
cd deploy
mkdir models && cd models
wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/ResNet50_vd_infer.tar && tar xf ResNet50_vd_infer.tar
cd ..
Converting to ONNX model:
paddle2onnx --model_dir=./models/ResNet50_vd_infer/ \
--model_filename=inference.pdmodel \
--params_filename=inference.pdiparams \
--save_file=./models/ResNet50_vd_infer/inference.onnx \
--opset_version=10
After running the above command, the converted ONNX model file is saved in `./models/ResNet50_vd_infer/`.
## 3. Deployment
Run the following command to deploy with the ONNX model.
python3.7 python/predict_cls.py \
-c configs/inference_cls.yaml \
-o Global.use_onnx=True \
-o Global.use_gpu=False \
-o Global.inference_model_dir=./models/ResNet50_vd_infer
The prediction results:
ILSVRC2012_val_00000010.jpeg: class id(s): [153, 204, 229, 332, 155], score(s): [0.69, 0.10, 0.02, 0.01, 0.01], label_name(s): ['Maltese dog, Maltese terrier, Maltese', 'Lhasa, Lhasa apso', 'Old English sheepdog, bobtail', 'Angora, Angora rabbit', 'Shih-Tzu']
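To make the result line above easier to read, here is a minimal sketch of the top-k step that turns a score vector into `class id(s)` and `score(s)`. It is an illustration only, not the code in `python/predict_cls.py`. The helper names `topk` and `run_onnx` are hypothetical, and `run_onnx` assumes that `onnxruntime` and `numpy` are installed and that the model takes a single NCHW float32 input.

```python
from typing import List, Sequence, Tuple


def topk(scores: Sequence[float], k: int = 5) -> Tuple[List[int], List[float]]:
    """Return the indices and values of the k largest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return order, [scores[i] for i in order]


def run_onnx(model_path: str, batch):
    """Hypothetical helper: run one forward pass with ONNX Runtime on CPU.

    `batch` is assumed to be a float32 numpy array of shape (N, 3, 224, 224).
    """
    import onnxruntime as ort  # imported lazily so topk() works without it

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name
    # Passing None as the first argument returns every model output.
    return session.run(None, {input_name: batch})[0]


if __name__ == "__main__":
    # Toy scores standing in for one row of model output.
    ids, vals = topk([0.02, 0.69, 0.10, 0.01, 0.18], k=3)
    print(ids, vals)
```

In a real run, one row of the array returned by `run_onnx` would be passed to `topk`, and the class ids would then be mapped to label names.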
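# Docker image used:
# registry.baidubce.com/paddlepaddle/paddle:latest-dev-cuda10.1-cudnn7-gcc82
# Build the Serving server:
# the client and app can use the release versions directly
# the server must be rebuilt because custom OPs were added
# by default, ${PWD}=PaddleClas/deploy/paddleserving/ at build time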
apt-get update
apt install -y libcurl4-openssl-dev libbz2-dev
wget -nc https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar
tar xf centos_ssl.tar
rm -rf centos_ssl.tar
mv libcrypto.so.1.0.2k /usr/lib/libcrypto.so.1.0.2k
mv libssl.so.1.0.2k /usr/lib/libssl.so.1.0.2k
ln -sf /usr/lib/libcrypto.so.1.0.2k /usr/lib/libcrypto.so.10
ln -sf /usr/lib/libssl.so.1.0.2k /usr/lib/libssl.so.10
ln -sf /usr/lib/libcrypto.so.10 /usr/lib/libcrypto.so
ln -sf /usr/lib/libssl.so.10 /usr/lib/libssl.so
# install Go dependencies
rm -rf /usr/local/go
wget -qO- https://paddle-ci.cdn.bcebos.com/go1.17.2.linux-amd64.tar.gz | tar -xz -C /usr/local
export GOROOT=/usr/local/go
export GOPATH=/root/gopath
export PATH=$PATH:$GOPATH/bin:$GOROOT/bin
go env -w GO111MODULE=on
go env -w GOPROXY=https://goproxy.cn,direct
go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway@v1.15.2
go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-swagger@v1.15.2
go install github.com/golang/protobuf/protoc-gen-go@v1.4.3
go install google.golang.org/grpc@v1.33.0
go env -w GO111MODULE=auto
# download the OpenCV library
wget https://paddle-qa.bj.bcebos.com/PaddleServing/opencv3.tar.gz
tar -xvf opencv3.tar.gz
rm -rf opencv3.tar.gz
export OPENCV_DIR=$PWD/opencv3
# clone Serving
git clone https://github.com/PaddlePaddle/Serving.git -b develop --depth=1
cd Serving # PaddleClas/deploy/paddleserving/Serving
export Serving_repo_path=$PWD
git submodule update --init --recursive
${python_name} -m pip install -r python/requirements.txt
# set env
export PYTHON_INCLUDE_DIR=$(${python_name} -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())")
export PYTHON_LIBRARIES=$(${python_name} -c "import distutils.sysconfig as sysconfig; print(sysconfig.get_config_var('LIBDIR'))")
export PYTHON_EXECUTABLE=`which ${python_name}`
export CUDA_PATH='/usr/local/cuda'
export CUDNN_LIBRARY='/usr/local/cuda/lib64/'
export CUDA_CUDART_LIBRARY='/usr/local/cuda/lib64/'
export TENSORRT_LIBRARY_PATH='/usr/local/TensorRT6-cuda10.1-cudnn7/targets/x86_64-linux-gnu/'
# copy the custom OP code
\cp ../preprocess/general_clas_op.* ${Serving_repo_path}/core/general-server/op
\cp ../preprocess/preprocess_op.* ${Serving_repo_path}/core/predictor/tools/pp_shitu_tools
# build the server
mkdir server-build-gpu-opencv
cd server-build-gpu-opencv
make -j32
${python_name} -m pip install python/dist/paddle*
# export SERVING_BIN
export SERVING_BIN=$PWD/core/general-server/serving
cd ../../
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// http://www.apache.org/licenses/LICENSE-2.0
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// See the License for the specific language governing permissions and
// limitations under the License.
#include "core/general-server/op/general_clas_op.h"
#include "core/predictor/framework/infer.h"
#include "core/predictor/framework/memory.h"
#include "core/predictor/framework/resource.h"
#include "core/util/include/timer.h"
#include <algorithm>
#include <iostream>
#include <memory>
#include <sstream>
namespace baidu {
namespace paddle_serving {
namespace serving {
using baidu::paddle_serving::Timer;
using baidu::paddle_serving::predictor::MempoolWrapper;
using baidu::paddle_serving::predictor::general_model::Tensor;
using baidu::paddle_serving::predictor::general_model::Response;
using baidu::paddle_serving::predictor::general_model::Request;
using baidu::paddle_serving::predictor::InferManager;
using baidu::paddle_serving::predictor::PaddleGeneralModelConfig;
int GeneralClasOp::inference() {
VLOG(2) << "Going to run inference";
const std::vector<std::string> pre_node_names = pre_names();
if (pre_node_names.size() != 1) {
LOG(ERROR) << "This op(" << op_name()
<< ") can only have one predecessor op, but received "
<< pre_node_names.size();
return -1;
const std::string pre_name = pre_node_names[0];
const GeneralBlob *input_blob = get_depend_argument<GeneralBlob>(pre_name);
if (!input_blob) {
LOG(ERROR) << "input_blob is nullptr,error";
return -1;
uint64_t log_id = input_blob->GetLogId();
VLOG(2) << "(logid=" << log_id << ") Get precedent op name: " << pre_name;
GeneralBlob *output_blob = mutable_data<GeneralBlob>();
if (!output_blob) {
LOG(ERROR) << "output_blob is nullptr,error";
return -1;
if (!input_blob) {
LOG(ERROR) << "(logid=" << log_id
<< ") Failed mutable depended argument, op:" << pre_name;
return -1;
const TensorVector *in = &input_blob->tensor_vector;
TensorVector *out = &output_blob->tensor_vector;
int batch_size = input_blob->_batch_size;
output_blob->_batch_size = batch_size;
VLOG(2) << "(logid=" << log_id << ") infer batch size: " << batch_size;
Timer timeline;
int64_t start = timeline.TimeStampUS();
// only support string type
char *total_input_ptr = static_cast<char *>(in->at(0).data.data());
std::string base64str = total_input_ptr;
cv::Mat img = Base2Mat(base64str);
cv::cvtColor(img, img, cv::COLOR_BGR2RGB);
// Resize
cv::Mat resize_img;
resize_op_.Run(img, resize_img, resize_short_size_);
// CenterCrop
crop_op_.Run(resize_img, crop_size_);
// Normalize
normalize_op_.Run(&resize_img, mean_, scale_, is_scale_);
// Permute
std::vector<float> input(1 * 3 * resize_img.rows * resize_img.cols, 0.0f);
permute_op_.Run(&resize_img, input.data());
float maxValue = *max_element(input.begin(), input.end());
float minValue = *min_element(input.begin(), input.end());
TensorVector *real_in = new TensorVector();
if (!real_in) {
LOG(ERROR) << "real_in is nullptr,error";
return -1;
std::vector<int> input_shape;
int in_num = 0;
void *databuf_data = NULL;
char *databuf_char = NULL;
size_t databuf_size = 0;
input_shape = {1, 3, resize_img.rows, resize_img.cols};
in_num = std::accumulate(input_shape.begin(), input_shape.end(), 1,
databuf_size = in_num * sizeof(float);
databuf_data = MempoolWrapper::instance().malloc(databuf_size);
if (!databuf_data) {
LOG(ERROR) << "Malloc failed, size: " << databuf_size;
return -1;
memcpy(databuf_data, input.data(), databuf_size);
databuf_char = reinterpret_cast<char *>(databuf_data);
paddle::PaddleBuf paddleBuf(databuf_char, databuf_size);
paddle::PaddleTensor tensor_in;
tensor_in.name = in->at(0).name;
tensor_in.dtype = paddle::PaddleDType::FLOAT32;
tensor_in.shape = {1, 3, resize_img.rows, resize_img.cols};
tensor_in.lod = in->at(0).lod;
tensor_in.data = paddleBuf;
if (InferManager::instance().infer(engine_name().c_str(), real_in, out,
batch_size)) {
LOG(ERROR) << "(logid=" << log_id
<< ") Failed do infer in fluid model: " << engine_name().c_str();
return -1;
int64_t end = timeline.TimeStampUS();
CopyBlobInfo(input_blob, output_blob);
AddBlobInfo(output_blob, start);
AddBlobInfo(output_blob, end);
return 0;
cv::Mat GeneralClasOp::Base2Mat(std::string &base64_data) {
cv::Mat img;
std::string s_mat;
s_mat = base64Decode(base64_data.data(), base64_data.size());
std::vector<char> base64_img(s_mat.begin(), s_mat.end());
img = cv::imdecode(base64_img, cv::IMREAD_COLOR); // CV_LOAD_IMAGE_COLOR
return img;
std::string GeneralClasOp::base64Decode(const char *Data, int DataByte) {
const char DecodeTable[] = {
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0,
62, // '+'
0, 0, 0,
63, // '/'
52, 53, 54, 55, 56, 57, 58, 59, 60, 61, // '0'-'9'
0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, // 'A'-'Z'
0, 0, 0, 0, 0, 0, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, // 'a'-'z'
std::string strDecode;
int nValue;
int i = 0;
while (i < DataByte) {
if (*Data != '\r' && *Data != '\n') {
nValue = DecodeTable[*Data++] << 18;
nValue += DecodeTable[*Data++] << 12;
strDecode += (nValue & 0x00FF0000) >> 16;
if (*Data != '=') {
nValue += DecodeTable[*Data++] << 6;
strDecode += (nValue & 0x0000FF00) >> 8;
if (*Data != '=') {
nValue += DecodeTable[*Data++];
strDecode += nValue & 0x000000FF;
i += 4;
} else // carriage return or line feed, skip
return strDecode;
} // namespace serving
} // namespace paddle_serving
} // namespace baidu
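The `base64Decode` routine above reads four base64 characters at a time and emits up to three bytes, skipping carriage returns and line feeds. The following Python sketch mirrors that idea so it can be checked against the standard library. It is a simplified illustration under the assumption of standard, `=`-padded input, not a port of the C++ code, and the names `ALPHABET`, `DECODE`, and `decode_base64` are hypothetical.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
DECODE = {ch: i for i, ch in enumerate(ALPHABET)}


def decode_base64(text: str) -> bytes:
    """Decode standard padded base64, ignoring CR and LF characters."""
    chars = [c for c in text if c not in "\r\n"]
    if len(chars) % 4 != 0:
        raise ValueError("padded base64 length must be a multiple of 4")
    out = bytearray()
    for pos in range(0, len(chars), 4):
        c0, c1, c2, c3 = chars[pos:pos + 4]
        # The first two characters always carry data: 6 + 6 bits.
        value = (DECODE[c0] << 18) | (DECODE[c1] << 12)
        out.append((value >> 16) & 0xFF)
        if c2 != "=":
            value |= DECODE[c2] << 6
            out.append((value >> 8) & 0xFF)
            if c3 != "=":
                value |= DECODE[c3]
                out.append(value & 0xFF)
    return bytes(out)


if __name__ == "__main__":
    payload = b"PaddleClas"
    encoded = base64.b64encode(payload).decode("ascii")
    print(decode_base64(encoded) == payload)
```

Comparing the result with `base64.b64decode` on the same input is a quick way to sanity-check any hand-written decoder of this kind.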
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// http://www.apache.org/licenses/LICENSE-2.0
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#include "core/general-server/general_model_service.pb.h"
#include "core/general-server/op/general_infer_helper.h"
#include "core/predictor/tools/pp_shitu_tools/preprocess_op.h"
#include "paddle_inference_api.h" // NOLINT
#include <string>
#include <vector>
#include "opencv2/core.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/imgproc.hpp"
#include <chrono>
#include <iomanip>
#include <iostream>
#include <ostream>
#include <vector>
#include <cstring>
#include <fstream>
#include <numeric>
namespace baidu {
namespace paddle_serving {
namespace serving {
class GeneralClasOp
: public baidu::paddle_serving::predictor::OpWithChannel<GeneralBlob> {
typedef std::vector<paddle::PaddleTensor> TensorVector;
int inference();
// clas preprocess
std::vector<float> mean_ = {0.485f, 0.456f, 0.406f};
std::vector<float> scale_ = {0.229f, 0.224f, 0.225f};
bool is_scale_ = true;
int resize_short_size_ = 256;
int crop_size_ = 224;
PaddleClas::ResizeImg resize_op_;
PaddleClas::Normalize normalize_op_;
PaddleClas::Permute permute_op_;
PaddleClas::CenterCropImg crop_op_;
// read pics
cv::Mat Base2Mat(std::string &base64_data);
std::string base64Decode(const char *Data, int DataByte);
} // namespace serving
} // namespace paddle_serving
} // namespace baidu
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// http://www.apache.org/licenses/LICENSE-2.0
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// See the License for the specific language governing permissions and
// limitations under the License.
#include "opencv2/core.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/imgproc.hpp"
#include "paddle_api.h"
#include "paddle_inference_api.h"
#include <chrono>
#include <iomanip>
#include <iostream>
#include <ostream>
#include <vector>
#include <cstring>
#include <fstream>
#include <math.h>
#include <numeric>
#include "preprocess_op.h"
namespace Feature {
void Permute::Run(const cv::Mat *im, float *data) {
int rh = im->rows;
int rw = im->cols;
int rc = im->channels();
for (int i = 0; i < rc; ++i) {
cv::extractChannel(*im, cv::Mat(rh, rw, CV_32FC1, data + i * rh * rw), i);
void Normalize::Run(cv::Mat *im, const std::vector<float> &mean,
const std::vector<float> &std, float scale) {
(*im).convertTo(*im, CV_32FC3, scale);
for (int h = 0; h < im->rows; h++) {
for (int w = 0; w < im->cols; w++) {
im->at<cv::Vec3f>(h, w)[0] =
(im->at<cv::Vec3f>(h, w)[0] - mean[0]) / std[0];
im->at<cv::Vec3f>(h, w)[1] =
(im->at<cv::Vec3f>(h, w)[1] - mean[1]) / std[1];
im->at<cv::Vec3f>(h, w)[2] =
(im->at<cv::Vec3f>(h, w)[2] - mean[2]) / std[2];
void CenterCropImg::Run(cv::Mat &img, const int crop_size) {
int resize_w = img.cols;
int resize_h = img.rows;
int w_start = int((resize_w - crop_size) / 2);
int h_start = int((resize_h - crop_size) / 2);
cv::Rect rect(w_start, h_start, crop_size, crop_size);
img = img(rect);
void ResizeImg::Run(const cv::Mat &img, cv::Mat &resize_img,
int resize_short_size, int size) {
int resize_h = 0;
int resize_w = 0;
if (size > 0) {
resize_h = size;
resize_w = size;
} else {
int w = img.cols;
int h = img.rows;
float ratio = 1.f;
if (h < w) {
ratio = float(resize_short_size) / float(h);
} else {
ratio = float(resize_short_size) / float(w);
resize_h = round(float(h) * ratio);
resize_w = round(float(w) * ratio);
cv::resize(img, resize_img, cv::Size(resize_w, resize_h));
} // namespace Feature
namespace PaddleClas {
void Permute::Run(const cv::Mat *im, float *data) {
int rh = im->rows;
int rw = im->cols;
int rc = im->channels();
for (int i = 0; i < rc; ++i) {
cv::extractChannel(*im, cv::Mat(rh, rw, CV_32FC1, data + i * rh * rw), i);
void Normalize::Run(cv::Mat *im, const std::vector<float> &mean,
const std::vector<float> &scale, const bool is_scale) {
double e = 1.0;
if (is_scale) {
e /= 255.0;
(*im).convertTo(*im, CV_32FC3, e);
for (int h = 0; h < im->rows; h++) {
for (int w = 0; w < im->cols; w++) {
im->at<cv::Vec3f>(h, w)[0] =
(im->at<cv::Vec3f>(h, w)[0] - mean[0]) / scale[0];
im->at<cv::Vec3f>(h, w)[1] =
(im->at<cv::Vec3f>(h, w)[1] - mean[1]) / scale[1];
im->at<cv::Vec3f>(h, w)[2] =
(im->at<cv::Vec3f>(h, w)[2] - mean[2]) / scale[2];
void CenterCropImg::Run(cv::Mat &img, const int crop_size) {
int resize_w = img.cols;
int resize_h = img.rows;
int w_start = int((resize_w - crop_size) / 2);
int h_start = int((resize_h - crop_size) / 2);
cv::Rect rect(w_start, h_start, crop_size, crop_size);
img = img(rect);
void ResizeImg::Run(const cv::Mat &img, cv::Mat &resize_img,
int resize_short_size) {
int w = img.cols;
int h = img.rows;
float ratio = 1.f;
if (h < w) {
ratio = float(resize_short_size) / float(h);
} else {
ratio = float(resize_short_size) / float(w);
int resize_h = round(float(h) * ratio);
int resize_w = round(float(w) * ratio);
cv::resize(img, resize_img, cv::Size(resize_w, resize_h));
} // namespace PaddleClas
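The `PaddleClas` preprocessing ops above resize the short side, take a center crop, and normalize each channel as `(pixel * e - mean) / scale`. The sketch below reproduces only the arithmetic in plain Python, without OpenCV, so the resulting shapes and values can be inspected. The function names are hypothetical, and the default values (256, 224, and the mean and scale vectors) are taken from `general_clas_op.h`.

```python
from typing import List, Sequence, Tuple

MEAN = (0.485, 0.456, 0.406)
SCALE = (0.229, 0.224, 0.225)


def resize_short_shape(h: int, w: int, short: int = 256) -> Tuple[int, int]:
    """Return (new_h, new_w) after scaling the shorter side to `short`."""
    ratio = short / min(h, w)
    # int(x + 0.5) mimics C round() for the positive values used here.
    return int(h * ratio + 0.5), int(w * ratio + 0.5)


def center_crop_box(h: int, w: int, crop: int = 224) -> Tuple[int, int, int, int]:
    """Return (x, y, width, height) of the centered crop rectangle."""
    # Floor division matches C integer division while w and h are >= crop.
    return (w - crop) // 2, (h - crop) // 2, crop, crop


def normalize_pixel(rgb: Sequence[float],
                    mean: Sequence[float] = MEAN,
                    scale: Sequence[float] = SCALE,
                    is_scale: bool = True) -> List[float]:
    """Apply (value * e - mean) / scale per channel, with e = 1/255 if is_scale."""
    e = 1.0 / 255.0 if is_scale else 1.0
    return [(c * e - m) / s for c, m, s in zip(rgb, mean, scale)]


if __name__ == "__main__":
    new_h, new_w = resize_short_shape(480, 640)
    print(new_h, new_w, center_crop_box(new_h, new_w))
    print(normalize_pixel((255, 255, 255)))
```

Note that the crop rectangle is only valid when the resized image is at least `crop` pixels on each side, which holds for the defaults because the short side is resized to 256 before a 224 crop.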
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// http://www.apache.org/licenses/LICENSE-2.0
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#include "opencv2/core.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/imgproc.hpp"
#include <chrono>
#include <iomanip>
#include <iostream>
#include <ostream>
#include <vector>
#include <cstring>
#include <fstream>
#include <numeric>
namespace Feature {
class Normalize {
virtual void Run(cv::Mat *im, const std::vector<float> &mean,
const std::vector<float> &std, float scale);
// RGB -> CHW
class Permute {
virtual void Run(const cv::Mat *im, float *data);
class CenterCropImg {
virtual void Run(cv::Mat &im, const int crop_size = 224);
class ResizeImg {
virtual void Run(const cv::Mat &img, cv::Mat &resize_img, int max_size_len,
int size = 0);
} // namespace Feature
namespace PaddleClas {
class Normalize {
virtual void Run(cv::Mat *im, const std::vector<float> &mean,
const std::vector<float> &scale, const bool is_scale = true);
// RGB -> CHW
class Permute {
virtual void Run(const cv::Mat *im, float *data);
class CenterCropImg {
virtual void Run(cv::Mat &im, const int crop_size = 224);
class ResizeImg {
virtual void Run(const cv::Mat &img, cv::Mat &resize_img, int max_size_len);
} // namespace PaddleClas
feed_var {
name: "x"
alias_name: "x"
is_lod_tensor: false
feed_type: 1
shape: 3
shape: 224
shape: 224
feed_var {
name: "boxes"
alias_name: "boxes"
is_lod_tensor: false
feed_type: 1
shape: 6
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "features"
is_lod_tensor: false
fetch_type: 1
shape: 512
fetch_var {
name: "boxes"
alias_name: "boxes"
is_lod_tensor: false
fetch_type: 1
shape: 6
feed_var {
name: "x"
alias_name: "x"
is_lod_tensor: false
feed_type: 1
shape: 3
shape: 224
shape: 224
feed_var {
name: "boxes"
alias_name: "boxes"
is_lod_tensor: false
feed_type: 1
shape: 6
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "features"
is_lod_tensor: false
fetch_type: 1
shape: 512
fetch_var {
name: "boxes"
alias_name: "boxes"
is_lod_tensor: false
fetch_type: 1
shape: 6
feed_var {
name: "im_shape"
alias_name: "im_shape"
is_lod_tensor: false
feed_type: 1
shape: 2
feed_var {
name: "image"
alias_name: "image"
is_lod_tensor: false
feed_type: 7
shape: -1
shape: -1
shape: 3
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "save_infer_model/scale_0.tmp_1"
is_lod_tensor: true
fetch_type: 1
shape: -1
fetch_var {
name: "save_infer_model/scale_1.tmp_1"
alias_name: "save_infer_model/scale_1.tmp_1"
is_lod_tensor: false
fetch_type: 2
feed_var {
name: "im_shape"
alias_name: "im_shape"
is_lod_tensor: false
feed_type: 1
shape: 2
feed_var {
name: "image"
alias_name: "image"
is_lod_tensor: false
feed_type: 7
shape: -1
shape: -1
shape: 3
fetch_var {
name: "save_infer_model/scale_0.tmp_1"
alias_name: "save_infer_model/scale_0.tmp_1"
is_lod_tensor: true
fetch_type: 1
shape: -1
fetch_var {
name: "save_infer_model/scale_1.tmp_1"
alias_name: "save_infer_model/scale_1.tmp_1"
is_lod_tensor: false
fetch_type: 2
......@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import numpy as np
from paddle_serving_client import Client
......@@ -22,181 +21,101 @@ import faiss
import os
import pickle
class MainbodyDetect():
pp-shitu mainbody detect.
include preprocess, process, postprocess
return detect results
Attention: Postprocess include num limit and box filter; no nms
def __init__(self):
self.preprocess = DetectionSequential([
DetectionFile2Image(), DetectionNormalize(
[0.485, 0.456, 0.406], [0.229, 0.224, 0.225], True),
(640, 640), False, interpolation=2), DetectionTranspose(
(2, 0, 1))
self.client = Client()
self.max_det_result = 5
self.conf_threshold = 0.2
def predict(self, imgpath):
im, im_info = self.preprocess(imgpath)
im_shape = np.array(im.shape[1:]).reshape(-1)
scale_factor = np.array(list(im_info['scale_factor'])).reshape(-1)
fetch_map = self.client.predict(
"image": im,
"im_shape": im_shape,
"scale_factor": scale_factor,
return self.postprocess(fetch_map, imgpath)
def postprocess(self, fetch_map, imgpath):
#1. get top max_det_result
det_results = fetch_map["save_infer_model/scale_0.tmp_1"]
if len(det_results) > self.max_det_result:
boxes_reserved = fetch_map[
boxes_reserved = det_results
#2. do conf threshold
boxes_list = []
for i in range(boxes_reserved.shape[0]):
if (boxes_reserved[i, 1]) > self.conf_threshold:
boxes_list.append(boxes_reserved[i, :])
#3. add origin image box
origin_img = cv2.imread(imgpath)
np.array([0, 1.0, 0, 0, origin_img.shape[1], origin_img.shape[0]]))
return np.array(boxes_list)
class ObjectRecognition():
pp-shitu object recognition for all objects detected by MainbodyDetect.
include preprocess, process, postprocess
preprocess include preprocess for each image and batching.
Batch process
postprocess include retrieval and nms
def __init__(self):
self.client = Client()
self.seq = Sequential([
BGR2RGB(), Resize((224, 224)), Div(255),
Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225],
False), Transpose((2, 0, 1))
self.searcher, self.id_map = self.init_index()
self.rec_nms_thresold = 0.05
self.rec_score_thres = 0.5
self.feature_normalize = True
self.return_k = 1
def init_index(self):
index_dir = "../../drink_dataset_v1.0/index"
assert os.path.exists(os.path.join(
index_dir, "vector.index")), "vector.index not found ..."
assert os.path.exists(os.path.join(
index_dir, "id_map.pkl")), "id_map.pkl not found ... "
searcher = faiss.read_index(os.path.join(index_dir, "vector.index"))
with open(os.path.join(index_dir, "id_map.pkl"), "rb") as fd:
id_map = pickle.load(fd)
return searcher, id_map
def predict(self, det_boxes, imgpath):
#1. preprocess
batch_imgs = []
origin_img = cv2.imread(imgpath)
for i in range(det_boxes.shape[0]):
box = det_boxes[i]
x1, y1, x2, y2 = [int(x) for x in box[2:]]
cropped_img = origin_img[y1:y2, x1:x2, :].copy()
tmp = self.seq(cropped_img)
batch_imgs = np.array(batch_imgs)
#2. process
fetch_map = self.client.predict(
feed={"x": batch_imgs}, fetch=["features"], batch=True)
batch_features = fetch_map["features"]
#3. postprocess
if self.feature_normalize:
feas_norm = np.sqrt(
np.sum(np.square(batch_features), axis=1, keepdims=True))
batch_features = np.divide(batch_features, feas_norm)
scores, docs = self.searcher.search(batch_features, self.return_k)
results = []
for i in range(scores.shape[0]):
pred = {}
if scores[i][0] >= self.rec_score_thres:
pred["bbox"] = [int(x) for x in det_boxes[i, 2:]]
pred["rec_docs"] = self.id_map[docs[i][0]].split()[1]
pred["rec_scores"] = scores[i][0]
return self.nms_to_rec_results(results)
def nms_to_rec_results(self, results):
filtered_results = []
x1 = np.array([r["bbox"][0] for r in results]).astype("float32")
y1 = np.array([r["bbox"][1] for r in results]).astype("float32")
x2 = np.array([r["bbox"][2] for r in results]).astype("float32")
y2 = np.array([r["bbox"][3] for r in results]).astype("float32")
scores = np.array([r["rec_scores"] for r in results])
areas = (x2 - x1 + 1) * (y2 - y1 + 1)
order = scores.argsort()[::-1]
while order.size > 0:
i = order[0]
xx1 = np.maximum(x1[i], x1[order[1:]])
yy1 = np.maximum(y1[i], y1[order[1:]])
xx2 = np.minimum(x2[i], x2[order[1:]])
yy2 = np.minimum(y2[i], y2[order[1:]])
w = np.maximum(0.0, xx2 - xx1 + 1)
h = np.maximum(0.0, yy2 - yy1 + 1)
inter = w * h
ovr = inter / (areas[i] + areas[order[1:]] - inter)
inds = np.where(ovr <= self.rec_nms_thresold)[0]
order = order[inds + 1]
return filtered_results
rec_nms_thresold = 0.05
rec_score_thres = 0.5
feature_normalize = True
return_k = 1
index_dir = "../../drink_dataset_v1.0/index"
def init_index(index_dir):
assert os.path.exists(os.path.join(
index_dir, "vector.index")), "vector.index not found ..."
assert os.path.exists(os.path.join(
index_dir, "id_map.pkl")), "id_map.pkl not found ... "
searcher = faiss.read_index(os.path.join(index_dir, "vector.index"))
with open(os.path.join(index_dir, "id_map.pkl"), "rb") as fd:
id_map = pickle.load(fd)
return searcher, id_map
#get box
def nms_to_rec_results(results, thresh=0.1):
filtered_results = []
x1 = np.array([r["bbox"][0] for r in results]).astype("float32")
y1 = np.array([r["bbox"][1] for r in results]).astype("float32")
x2 = np.array([r["bbox"][2] for r in results]).astype("float32")
y2 = np.array([r["bbox"][3] for r in results]).astype("float32")
scores = np.array([r["rec_scores"] for r in results])
areas = (x2 - x1 + 1) * (y2 - y1 + 1)
order = scores.argsort()[::-1]
while order.size > 0:
i = order[0]
xx1 = np.maximum(x1[i], x1[order[1:]])
yy1 = np.maximum(y1[i], y1[order[1:]])
xx2 = np.minimum(x2[i], x2[order[1:]])
yy2 = np.minimum(y2[i], y2[order[1:]])
w = np.maximum(0.0, xx2 - xx1 + 1)
h = np.maximum(0.0, yy2 - yy1 + 1)
inter = w * h
ovr = inter / (areas[i] + areas[order[1:]] - inter)
inds = np.where(ovr <= thresh)[0]
order = order[inds + 1]
return filtered_results
def postprocess(fetch_dict, feature_normalize, det_boxes, searcher, id_map,
return_k, rec_score_thres, rec_nms_thresold):
batch_features = fetch_dict["features"]
#do feature norm
if feature_normalize:
feas_norm = np.sqrt(
np.sum(np.square(batch_features), axis=1, keepdims=True))
batch_features = np.divide(batch_features, feas_norm)
scores, docs = searcher.search(batch_features, return_k)
results = []
for i in range(scores.shape[0]):
pred = {}
if scores[i][0] >= rec_score_thres:
pred["bbox"] = [int(x) for x in det_boxes[i, 2:]]
pred["rec_docs"] = id_map[docs[i][0]].split()[1]
pred["rec_scores"] = scores[i][0]
#do nms
results = nms_to_rec_results(results, rec_nms_thresold)
return results
#do client
if __name__ == "__main__":
det = MainbodyDetect()
rec = ObjectRecognition()
#1. get det_results
imgpath = "../../drink_dataset_v1.0/test_images/001.jpeg"
det_results = det.predict(imgpath)
#2. get rec_results
rec_results = rec.predict(det_results, imgpath)
client = Client()
im = cv2.imread("../../drink_dataset_v1.0/test_images/001.jpeg")
im_shape = np.array(im.shape[:2]).reshape(-1)
fetch_map = client.predict(
feed={"image": im,
"im_shape": im_shape},
fetch=["features", "boxes"],
#add retrieval procedure
det_boxes = fetch_map["boxes"]
searcher, id_map = init_index(index_dir)
results = postprocess(fetch_map, feature_normalize, det_boxes, searcher,
id_map, return_k, rec_score_thres, rec_nms_thresold)
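The `nms_to_rec_results` step in this file keeps the highest-scoring recognition result and drops any remaining box whose overlap with it exceeds the threshold, then repeats. As a rough illustration, the same greedy procedure can be written without NumPy as below. This is a sketch under the assumption that each result is a dict with `bbox` as `[x1, y1, x2, y2]` and a `rec_scores` value, and the names `iou` and `nms` are hypothetical.

```python
from typing import Dict, List, Sequence


def iou(a: Sequence[float], b: Sequence[float]) -> float:
    """IoU of two [x1, y1, x2, y2] boxes with the inclusive +1 pixel convention."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]) + 1)
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]) + 1)
    inter = w * h
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


def nms(results: List[Dict], thresh: float = 0.05) -> List[Dict]:
    """Greedy NMS: keep the best result, drop boxes overlapping it, repeat."""
    remaining = sorted(results, key=lambda r: r["rec_scores"], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [r for r in remaining
                     if iou(best["bbox"], r["bbox"]) <= thresh]
    return kept


if __name__ == "__main__":
    demo = [
        {"bbox": [0, 0, 99, 99], "rec_docs": "a", "rec_scores": 0.9},
        {"bbox": [10, 10, 109, 109], "rec_docs": "b", "rec_scores": 0.8},
        {"bbox": [200, 200, 299, 299], "rec_docs": "c", "rec_scores": 0.7},
    ]
    print([r["rec_docs"] for r in nms(demo)])
```

With a threshold as low as 0.05, almost any overlap suppresses the lower-scoring box, which suits a setting where one object should yield a single recognition result.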
......@@ -12,16 +12,20 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import base64
import time
from paddle_serving_client import Client
from paddle_serving_app.reader import Sequential, URL2Image, Resize
from paddle_serving_app.reader import CenterCrop, RGB2BGR, Transpose, Div, Normalize
import time
def bytes_to_base64(image: bytes) -> str:
"""encode bytes into base64 string
return base64.b64encode(image).decode('utf8')
client = Client()
label_dict = {}
......@@ -31,22 +35,17 @@ with open("imagenet.label") as fin:
label_dict[label_idx] = line.strip()
label_idx += 1
seq = Sequential([
URL2Image(), Resize(256), CenterCrop(224), RGB2BGR(), Transpose((2, 0, 1)),
Div(255), Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], True)
start = time.time()
image_file = "https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg"
image_file = "./daisy.jpg"
for i in range(1):
img = seq(image_file)
fetch_map = client.predict(
feed={"inputs": img}, fetch=["prediction"], batch=False)
prob = max(fetch_map["prediction"][0])
label = label_dict[fetch_map["prediction"][0].tolist().index(prob)].strip(
).replace(",", "")
print("prediction: {}, probability: {}".format(label, prob))
end = time.time()
print(end - start)
start = time.time()
with open(image_file, 'rb') as img_file:
image_data = img_file.read()
image = bytes_to_base64(image_data)
fetch_dict = client.predict(
feed={"inputs": image}, fetch=["prediction"], batch=False)
prob = max(fetch_dict["prediction"][0])
label = label_dict[fetch_dict["prediction"][0].tolist().index(
prob)].strip().replace(",", "")
print("prediction: {}, probability: {}".format(label, prob))
end = time.time()
print(end - start)
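The client code above sends the raw image bytes as a base64 string and then picks the label with the highest probability. The following sketch isolates those two steps so they can be tried without a running server. `top1` is a hypothetical helper, and the label dictionary is toy data standing in for the contents of `imagenet.label`.

```python
import base64
from typing import Dict, Sequence, Tuple


def bytes_to_base64(image: bytes) -> str:
    """Encode raw bytes as a base64 string, as the client does before predict()."""
    return base64.b64encode(image).decode("utf8")


def top1(probs: Sequence[float], labels: Dict[int, str]) -> Tuple[str, float]:
    """Return the cleaned label and probability of the highest-scoring class."""
    idx = max(range(len(probs)), key=probs.__getitem__)
    return labels[idx].strip().replace(",", ""), probs[idx]


if __name__ == "__main__":
    # Three bytes standing in for the start of a JPEG file.
    payload = bytes_to_base64(b"\xff\xd8\xff")
    print(payload)
    toy_labels = {0: "tench, Tinca tinca", 1: "daisy"}
    print(top1([0.1, 0.9], toy_labels))
```

Because the server-side op decodes the string back into bytes before calling `cv::imdecode`, the client no longer needs the `Sequential` preprocessing pipeline.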
# PULC Classification Model of Someone or Nobody
## Catalogue
- [1. Introduction](#1)
- [2. Quick Start](#2)
    - [2.1 PaddlePaddle Installation](#2.1)
    - [2.2 PaddleClas wheel Installation](#2.2)
    - [2.3 Prediction](#2.3)
- [3. Training, Evaluation and Inference](#3)
    - [3.1 Installation](#3.1)
    - [3.2 Dataset](#3.2)
        - [3.2.1 Dataset Introduction](#3.2.1)
        - [3.2.2 Getting Dataset](#3.2.2)
    - [3.3 Training](#3.3)
    - [3.4 Evaluation](#3.4)
    - [3.5 Inference](#3.5)
- [4. Model Compression](#4)
    - [4.1 SKL-UGI Knowledge Distillation](#4.1)
        - [4.1.1 Teacher Model Training](#4.1.1)
        - [4.1.2 Knowledge Distillation Training](#4.1.2)
- [5. Hyperparameters Searching](#5)
- [6. Inference Deployment](#6)
    - [6.1 Getting Paddle Inference Model](#6.1)
        - [6.1.1 Exporting Paddle Inference Model](#6.1.1)
        - [6.1.2 Downloading Inference Model](#6.1.2)
    - [6.2 Prediction with Python](#6.2)
        - [6.2.1 Image Prediction](#6.2.1)
        - [6.2.2 Images Prediction](#6.2.2)
    - [6.3 Deployment with C++](#6.3)
    - [6.4 Deployment as Service](#6.4)
    - [6.5 Deployment on Mobile](#6.5)
    - [6.6 Converting To ONNX and Deployment](#6.6)
<a name="1"></a>
## 1. Introduction
This case shows how to use PaddleClas PULC (Practical Ultra Lightweight Classification) to quickly build a lightweight, high-precision and practical model that classifies whether an image contains a person. The model can be widely used in monitoring, personnel access control, massive data filtering and similar scenarios.
The following table lists the relevant metrics of the model. The first two rows use SwinTransformer_tiny and MobileNetV3_small_x0_35 as the backbone. The third to sixth rows replace the backbone with PPLCNet_x1_0 and then add, in turn, the SSLD pretrained model, the EDA strategy, and the SKL-UGI knowledge distillation strategy.
| Backbone | Tpr(%) | Latency(ms) | Size(M) | Training Strategy |
|-------|-----------|----------|---------------|---------------|
| SwinTransformer_tiny | 95.69 | 95.30 | 107 | using ImageNet pretrained |
| MobileNetV3_small_x0_35 | 68.25 | 2.85 | 1.6 | using ImageNet pretrained |
| PPLCNet_x1_0 | 89.57 | 2.12 | 6.5 | using ImageNet pretrained |
| PPLCNet_x1_0 | 92.10 | 2.12 | 6.5 | using SSLD pretrained |
| PPLCNet_x1_0 | 93.43 | 2.12 | 6.5 | using SSLD pretrained + EDA strategy |
| <b>PPLCNet_x1_0</b> | <b>95.60</b> | <b>2.12</b> | <b>6.5</b> | using SSLD pretrained + EDA strategy + SKL-UGI knowledge distillation strategy |
The table shows that SwinTransformer_tiny reaches a high Tpr but is slow. Replacing the backbone with the lightweight MobileNetV3_small_x0_35 greatly improves speed but greatly reduces Tpr. Replacing it with the faster PPLCNet_x1_0 gives a Tpr more than 20 percentage points higher than MobileNetV3_small_x0_35 while also being more than 20% faster. Adding the SSLD pretrained model improves Tpr by about 2.5 percentage points without affecting inference speed. Adding the EDA strategy improves Tpr by a further 1.3 percentage points. Finally, SKL-UGI knowledge distillation improves Tpr by about 2.2 percentage points. At this point the Tpr is close to that of SwinTransformer_tiny, but inference is more than 40 times faster. The training method and deployment instructions of PULC are described in detail below.
* For the `Tpr` metric, please refer to [Section 3.3](#3.3) for more information.
* Latency is tested on an Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz with MKLDNN enabled and 10 threads.
* For PP-LCNet, please refer to [PP-LCNet Introduction](../models/PP-LCNet_en.md) and the [PP-LCNet Paper](https://arxiv.org/abs/2109.15099).
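The gains and speed-ups quoted for this table can be re-derived from its rows. The sketch below only copies the Tpr and latency columns and recomputes the differences; the helper names are illustrative and not part of PaddleClas.

```python
# (Tpr in %, latency in ms), copied from the table above.
results = {
    "SwinTransformer_tiny": (95.69, 95.30),
    "MobileNetV3_small_x0_35": (68.25, 2.85),
    "PPLCNet_x1_0": (89.57, 2.12),
    "PPLCNet_x1_0 + SSLD": (92.10, 2.12),
    "PPLCNet_x1_0 + SSLD + EDA": (93.43, 2.12),
    "PPLCNet_x1_0 + SSLD + EDA + SKL-UGI": (95.60, 2.12),
}


def tpr_gain(new: str, old: str) -> float:
    """Tpr difference in percentage points."""
    return round(results[new][0] - results[old][0], 2)


def speedup(slow: str, fast: str) -> float:
    """How many times lower the latency of `fast` is."""
    return round(results[slow][1] / results[fast][1], 2)


def latency_reduction_percent(old: str, new: str) -> float:
    """Relative latency reduction when moving from `old` to `new`."""
    old_ms, new_ms = results[old][1], results[new][1]
    return round((old_ms - new_ms) / old_ms * 100, 1)


print(tpr_gain("PPLCNet_x1_0", "MobileNetV3_small_x0_35"))                   # 21.32
print(latency_reduction_percent("MobileNetV3_small_x0_35", "PPLCNet_x1_0"))  # 25.6
print(tpr_gain("PPLCNet_x1_0 + SSLD", "PPLCNet_x1_0"))                       # 2.53
print(speedup("SwinTransformer_tiny", "PPLCNet_x1_0"))                       # 44.95
```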
<a name="2"></a>
## 2. Quick Start
<a name="2.1"></a>
### 2.1 PaddlePaddle Installation
- Run the following command to install if CUDA9 or CUDA10 is available.
python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
- Run the following command to install if GPU device is unavailable.
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
Please refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/en/install/pip/linux-pip_en.html) for more information about installation, for example for other versions.
<a name="2.2"></a>
### 2.2 PaddleClas wheel Installation
The command for installing PaddleClas is as follows:
pip3 install paddleclas
<a name="2.3"></a>
### 2.3 Prediction
First, click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download and unzip the demo images.
* Prediction with CLI
paddleclas --model_name=person_exists --infer_imgs=pulc_demo_imgs/person_exists/objects365_01780782.jpg
>>> result
class_ids: [0], scores: [0.9955421453341842], label_names: ['nobody'], filename: pulc_demo_imgs/person_exists/objects365_01780782.jpg
Predict complete!
**Note**: To test other images, just specify the `--infer_imgs` argument. A directory containing images is also supported.
* Prediction in Python
import paddleclas
model = paddleclas.PaddleClas(model_name="person_exists")
result = model.predict(input_data="pulc_demo_imgs/person_exists/objects365_01780782.jpg")
print(next(result))
**Note**: The `result` returned by `model.predict()` is a generator, so you need to call `next()` on it or iterate over it with a `for` loop. Each call predicts one batch of `batch_size` images and returns the results. The default `batch_size` is 1; you can also specify `batch_size` when instantiating, such as `model = paddleclas.PaddleClas(model_name="person_exists", batch_size=2)`. The result of the demo above:
>>> result
[{'class_ids': [0], 'scores': [0.9955421453341842], 'label_names': ['nobody'], 'filename': 'pulc_demo_imgs/person_exists/objects365_01780782.jpg'}]
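The generator behaviour described in the note can be illustrated without installing PaddleClas. The sketch below uses a hypothetical stand-in, `predict_in_batches`, that yields one list of results per batch; it is not the PaddleClas implementation and only mirrors the shape of the output shown above.

```python
from typing import Dict, Iterator, List


def predict_in_batches(paths: List[str], batch_size: int = 1) -> Iterator[List[Dict]]:
    """Hypothetical stand-in for model.predict(): yields one result list per batch."""
    for start in range(0, len(paths), batch_size):
        batch = paths[start:start + batch_size]
        # A real model would run inference here; the labels below are dummy values.
        yield [{"class_ids": [0], "label_names": ["nobody"], "filename": p} for p in batch]


images = ["a.jpg", "b.jpg", "c.jpg"]
result = predict_in_batches(images, batch_size=2)

first_batch = next(result)  # results for a.jpg and b.jpg
remaining = [r for batch in result for r in batch]  # the loop drains what is left (c.jpg)
```

Nothing is computed until the generator is advanced, so a call to `predict` that is never iterated produces no output.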
<a name="3"></a>
## 3. Training, Evaluation and Inference
<a name="3.1"></a>
### 3.1 Installation
Please refer to [Installation](../installation/install_paddleclas_en.md) for installation instructions.
<a name="3.2"></a>
### 3.2 Dataset
<a name="3.2.1"></a>
#### 3.2.1 Dataset Introduction
All datasets used in this case are open source. The training data is a subset of the [MS-COCO](https://cocodataset.org/#overview) training data, and the validation data is a subset of the [Objects365](https://www.objects365.org/overview.html) training data. ImageNet_val is the [ImageNet-1k](https://www.image-net.org/) validation data.
<a name="3.2.2"></a>
#### 3.2.2 Getting Dataset
The data used in this case is obtained by processing the open source data. The detailed process is as follows:
- Training data: this case processes the annotation file of the MS-COCO training data. If an image has a "person" label and the area of that box is greater than 10% of the whole image, the image is considered to contain a human. If an image has no "person" label, it is considered not to contain a human. After processing, 92964 usable images were obtained, including 39813 images containing a human and 53151 images without a human.
- Validation data: a small part of the Objects365 data is selected at random and predicted with a better model trained on MS-COCO. The intersection of the prediction results and the annotation file is then filtered into the validation set using the same method as for the training set. After processing, 27820 usable images were obtained: 2255 with a human and 25565 without.
Some images of the processed dataset are as follows:
You can also download the processed data directly.
cd path_to_PaddleClas
Enter the `dataset/` directory, download and unzip the dataset.
cd dataset
wget https://paddleclas.bj.bcebos.com/data/PULC/person_exists.tar
tar -xf person_exists.tar
cd ../
The contents of the `person_exists` directory:
├── train
│   ├── 000000000009.jpg
│   ├── 000000000025.jpg
├── val
│   ├── objects365_01780637.jpg
│   ├── objects365_01780640.jpg
├── ImageNet_val
│   ├── ILSVRC2012_val_00000001.JPEG
│   ├── ILSVRC2012_val_00000002.JPEG
├── train_list.txt
├── train_list.txt.debug
├── train_list_for_distill.txt
├── val_list.txt
└── val_list.txt.debug
`train/` and `val/` are the training set and validation set respectively. `train_list.txt` and `val_list.txt` are the label files of the training and validation data respectively. `train_list.txt.debug` and `val_list.txt.debug` are subsets of `train_list.txt` and `val_list.txt` respectively. `ImageNet_val/` is the validation data of ImageNet-1k, which is used for SKL-UGI knowledge distillation, and its label file is `train_list_for_distill.txt`.
* For the format of `train_list.txt` and `val_list.txt`, please refer to [Description about Classification Dataset in PaddleClas](../data_preparation/classification_dataset_en.md).
* For `train_list_for_distill.txt`, please refer to [Knowledge Distillation Label](../advanced_tutorials/distillation/distillation_en.md).
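The training-data rule described in this section (a "person" box covering more than 10% of the image means the image contains a human; no "person" label means it does not) can be sketched as below. The annotation layout is a simplified, hypothetical one rather than the raw MS-COCO JSON, and the handling of images that only contain small persons is an assumption, since the text does not state it.

```python
from typing import Dict, List, Optional

PERSON = "person"
MIN_AREA_RATIO = 0.10  # "greater than 10% of the whole image"


def label_image(width: int, height: int, boxes: List[Dict]) -> Optional[int]:
    """Return 1 (contains human), 0 (no human) or None (image is skipped).

    `boxes` is a hypothetical simplified annotation list: each item has a
    "category" name and a "bbox" given as [x, y, w, h] in pixels.
    """
    person_boxes = [b for b in boxes if b["category"] == PERSON]
    if not person_boxes:
        return 0
    image_area = width * height
    for box in person_boxes:
        _, _, w, h = box["bbox"]
        if w * h > MIN_AREA_RATIO * image_area:
            return 1
    # Only small persons are present: neither rule applies, so assume the image is dropped.
    return None


big = label_image(100, 100, [{"category": "person", "bbox": [0, 0, 50, 50]}])
empty = label_image(100, 100, [{"category": "dog", "bbox": [0, 0, 80, 80]}])
small = label_image(100, 100, [{"category": "person", "bbox": [5, 5, 10, 10]}])
```

Dropping the ambiguous images keeps both classes clean, at the cost of discarding part of the source data.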
<a name="3.3"></a>
### 3.3 Training
The training configuration is provided in `ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml`. The training command is as follows:
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml
The best metric on the validation data is between `0.94` and `0.95`. There will be fluctuations because the dataset is small.
* Tpr, one of the commonly used metrics for binary classification, is the True Positive Rate when the False Positive Rate (Fpr) is less than a certain threshold (1/1000 in this case). For details on Fpr and Tpr, please refer to [here](https://en.wikipedia.org/wiki/Receiver_operating_characteristic).
* During evaluation, the best TprAtFpr metric is printed, which includes `Fpr`, `Tpr` and the current `threshold`. `Tpr` is the recall at the current `Fpr`; the higher the `Tpr`, the better the model. The `threshold` is the classification threshold at the best `Fpr` and is used in deployment.
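To make the metric concrete, the following sketch computes the highest Tpr whose Fpr stays within a limit, together with the score threshold that achieves it. It is a plain-Python illustration of the definition above, not the PaddleClas `TprAtFpr` implementation, and the toy scores and labels are invented for the example.

```python
from typing import List, Tuple


def tpr_at_fpr(scores: List[float], labels: List[int], max_fpr: float) -> Tuple[float, float, float]:
    """Return (fpr, tpr, threshold) with the highest Tpr whose Fpr is <= max_fpr.

    scores: predicted probability of the positive class; labels: 1 = positive, 0 = negative.
    A sample is predicted positive when score >= threshold.
    Both classes must be present in `labels`.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = (0.0, 0.0, 1.0)
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        fpr, tpr = fp / negatives, tp / positives
        if fpr <= max_fpr and tpr > best[1]:
            best = (fpr, tpr, threshold)
    return best


scores = [0.99, 0.95, 0.90, 0.80, 0.60, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0, 0]
strict = tpr_at_fpr(scores, labels, max_fpr=0.0)   # no false positives allowed
loose = tpr_at_fpr(scores, labels, max_fpr=0.25)   # one false positive out of four negatives
```

With the strict limit, only two of the three positives can be recovered (threshold 0.95); allowing one false positive lowers the threshold to 0.80 and recovers all three.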
<a name="3.4"></a>
### 3.4 Evaluation
After training, you can use the following commands to evaluate the model.
python3 tools/eval.py \
-c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml \
-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"
In the command above, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weights. You can specify another path if needed.
<a name="3.5"></a>
### 3.5 Inference
After training, you can use the trained model for inference. The command is as follows:
python3 tools/infer.py \
-c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml \
-o Global.pretrained_model=output/PPLCNet_x1_0/best_model
The results:
[{'class_ids': [1], 'scores': [0.9999976], 'label_names': ['someone'], 'file_name': 'deploy/images/PULC/person_exists/objects365_02035329.jpg'}]
* In the command above, the argument `-o Global.pretrained_model="output/PPLCNet_x1_0/best_model"` specifies the path of the best model weights. You can specify another path if needed.
* The default test image is `deploy/images/PULC/person_exists/objects365_02035329.jpg`. To test another image, specify the argument `-o Infer.infer_imgs=path_to_test_image`.
* The default threshold is `0.5`. If needed, you can specify the argument `Infer.PostProcess.threshold`, such as `-o Infer.PostProcess.threshold=0.9794`. The `threshold` should be chosen for your specific case; `0.9794` is the best threshold when `Fpr` is less than `1/1000` on this validation dataset.
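The effect of the threshold can be shown with a small sketch. It assumes the post-processing compares the score of the `someone` class with the threshold and reports `nobody` otherwise; the function `apply_threshold` is hypothetical and only mimics the output format shown above.

```python
from typing import Dict, List

LABELS = ["nobody", "someone"]  # class 0 / class 1, as in the outputs above


def apply_threshold(positive_score: float, threshold: float = 0.5) -> Dict[str, List]:
    """Map the score of the 'someone' class to a prediction (assumed post-processing)."""
    if positive_score >= threshold:
        class_id, score = 1, positive_score
    else:
        class_id, score = 0, 1.0 - positive_score
    return {"class_ids": [class_id], "scores": [score], "label_names": [LABELS[class_id]]}


default = apply_threshold(0.97)                    # default threshold 0.5
strict = apply_threshold(0.97, threshold=0.9794)   # threshold tuned on this validation set
```

The same score of 0.97 is reported as `someone` with the default threshold and as `nobody` with the stricter one, which is how a higher threshold trades recall for a lower false positive rate.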
<a name="4"></a>
## 4. Model Compression
<a name="4.1"></a>
### 4.1 SKL-UGI Knowledge Distillation
SKL-UGI is a simple but effective knowledge distillation algorithm proposed by PaddleClas.
<!-- todo -->
<!-- Please refer to [SKL-UGI](../advanced_tutorials/distillation/distillation_en.md) for more details. -->
<a name="4.1.1"></a>
#### 4.1.1 Teacher Model Training
Train the teacher model with the hyperparameters specified in `ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml`. The command is as follows:
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml \
-o Arch.name=ResNet101_vd
The best metric on the validation data is between `0.96` and `0.98`. The best teacher model weights are saved in `output/ResNet101_vd/best_model.pdparams`.
<a name="4.1.2"></a>
#### 4.1.2 Knowledge Distillation Training
The training strategy is specified in the config file `ppcls/configs/PULC/person_exists/PPLCNet_x1_0_distillation.yaml`: the teacher model is `ResNet101_vd`, the student model is `PPLCNet_x1_0`, and the additional unlabeled training data is the validation data of ImageNet-1k. The command is as follows:
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0_distillation.yaml \
-o Arch.models.0.Teacher.pretrained=output/ResNet101_vd/best_model
The best metric is between `0.95` and `0.97`. The best student model weights are saved in `output/DistillationModel/best_model_student.pdparams`.
<a name="5"></a>
## 5. Hyperparameters Searching
The hyperparameters used in [Section 3.3](#3.3) and [Section 4.1](#4.1) were obtained with the `Hyperparameters Searching` strategy in PaddleClas. If you want better results on your own dataset, you can refer to [Hyperparameters Searching](PULC_train_en.md#4) to obtain better hyperparameters.
**Note**: This section is optional. Because the search takes a long time, you can decide whether to run it according to your specific situation. If you have not replaced the dataset, you can ignore this section.
<a name="6"></a>
## 6. Inference Deployment
<a name="6.1"></a>
### 6.1 Getting Paddle Inference Model
Paddle Inference is the native inference library of PaddlePaddle and provides high-performance inference for server deployment. Compared with predicting directly from the pretrained model, Paddle Inference can use tools to accelerate prediction and so achieve better inference performance. Please refer to [Paddle Inference](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/infer/inference/inference_cn.html) for more information.
Paddle Inference needs a Paddle Inference Model to predict. There are two ways to get one. If you want to use the model provided by PaddleClas, you can download it directly; see [Downloading Inference Model](#6.1.2).
<a name="6.1.1"></a>
#### 6.1.1 Exporting Paddle Inference Model
The command for exporting the Paddle Inference Model is as follows:
python3 tools/export_model.py \
-c ./ppcls/configs/PULC/person_exists/PPLCNet_x1_0.yaml \
-o Global.pretrained_model=output/DistillationModel/best_model_student \
-o Global.save_inference_dir=deploy/models/PPLCNet_x1_0_person_exists_infer
After running the command above, the inference model files are saved in `deploy/models/PPLCNet_x1_0_person_exists_infer`, as shown below:
├── PPLCNet_x1_0_person_exists_infer
│ ├── inference.pdiparams
│ ├── inference.pdiparams.info
│ └── inference.pdmodel
**Note**: The best model here comes from knowledge distillation training. If knowledge distillation training is not used, the best model is saved in `output/PPLCNet_x1_0/best_model.pdparams`.
<a name="6.1.2"></a>
#### 6.1.2 Downloading Inference Model
You can also download the inference model directly.
cd deploy/models
# download and decompress the inference model
wget https://paddleclas.bj.bcebos.com/models/PULC/person_exists_infer.tar && tar -xf person_exists_infer.tar
After decompression, the `models` directory should look as follows.
├── person_exists_infer
│ ├── inference.pdiparams
│ ├── inference.pdiparams.info
│ └── inference.pdmodel
<a name="6.2"></a>
### 6.2 Prediction with Python
<a name="6.2.1"></a>
#### 6.2.1 Image Prediction
Return to the `deploy` directory:
cd ../
Run the following command to classify whether there is a human in the image `./images/PULC/person_exists/objects365_02035329.jpg`.
# Use the following command to predict with GPU.
python3.7 python/predict_cls.py -c configs/PULC/person_exists/inference_person_exists.yaml
# Use the following command to predict with CPU.
python3.7 python/predict_cls.py -c configs/PULC/person_exists/inference_person_exists.yaml -o Global.use_gpu=False
The prediction results:
objects365_02035329.jpg: class id(s): [1], score(s): [1.00], label_name(s): ['someone']
**Note**: The default threshold is `0.5`. If needed, you can specify the argument `Infer.PostProcess.threshold`, such as `-o Infer.PostProcess.threshold=0.9794`. The `threshold` should be chosen for your specific case; `0.9794` is the best threshold when `Fpr` is less than `1/1000` on this validation dataset. Please refer to [Section 3.3](#3.3) for details.
<a name="6.2.2"></a>
#### 6.2.2 Images Prediction
If you want to predict all images in a directory, set the argument `Global.infer_imgs` to the directory path with `-o Global.infer_imgs`. The command is as follows.
# Use the following command to predict with GPU. To use CPU instead, add the argument -o Global.use_gpu=False
python3.7 python/predict_cls.py -c configs/PULC/person_exists/inference_person_exists.yaml -o Global.infer_imgs="./images/PULC/person_exists/"
All prediction results will be printed, as shown below.
objects365_01780782.jpg: class id(s): [0], score(s): [1.00], label_name(s): ['nobody']
objects365_02035329.jpg: class id(s): [1], score(s): [1.00], label_name(s): ['someone']
Among the prediction results above, `someone` means that there is a human in the image, `nobody` means that there is no human in the image.
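When the printed results need to be consumed by another script, each line can be parsed back into a record. The sketch below assumes the single-class line format shown above; the pattern and the function name `parse_prediction` are illustrative and not part of PaddleClas.

```python
import re
from typing import Dict, Optional

# Matches lines such as:
# objects365_01780782.jpg: class id(s): [0], score(s): [1.00], label_name(s): ['nobody']
LINE_PATTERN = re.compile(
    r"^(?P<filename>\S+):\s+class id\(s\): \[(?P<class_id>\d+)\], "
    r"score\(s\): \[(?P<score>[\d.]+)\], label_name\(s\): \['(?P<label>[^']+)'\]$"
)


def parse_prediction(line: str) -> Optional[Dict]:
    """Parse one printed result line; return None if the line has another format."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    return {
        "filename": match.group("filename"),
        "class_id": int(match.group("class_id")),
        "score": float(match.group("score")),
        "label": match.group("label"),
    }


lines = [
    "objects365_01780782.jpg: class id(s): [0], score(s): [1.00], label_name(s): ['nobody']",
    "objects365_02035329.jpg: class id(s): [1], score(s): [1.00], label_name(s): ['someone']",
]
records = [parse_prediction(line) for line in lines]
with_person = [r["filename"] for r in records if r and r["label"] == "someone"]
```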
<a name="6.3"></a>
### 6.3 Deployment with C++
PaddleClas provides an example about how to deploy with C++. Please refer to [Deployment with C++](../inference_deployment/cpp_deploy_en.md).
<a name="6.4"></a>
### 6.4 Deployment as Service
Paddle Serving is a flexible, high-performance carrier for machine learning models. It supports different protocols, such as RESTful, gRPC and bRPC, and provides deployment solutions for a variety of heterogeneous hardware and operating system environments. Please refer to [Paddle Serving](https://github.com/PaddlePaddle/Serving) for more information.
PaddleClas provides an example of how to deploy a model as a service with Paddle Serving. Please refer to [Paddle Serving Deployment](../inference_deployment/paddle_serving_deploy_en.md).
<a name="6.5"></a>
### 6.5 Deployment on Mobile
Paddle-Lite is an open source deep learning framework designed to make it easy to perform inference on mobile, embedded, and IoT devices. Please refer to [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) for more information.
PaddleClas provides an example of how to deploy on mobile by Paddle-Lite. Please refer to [Paddle-Lite deployment](../inference_deployment/paddle_lite_deploy_en.md).
<a name="6.6"></a>
### 6.6 Converting To ONNX and Deployment
Paddle2ONNX supports converting a Paddle Inference model to an ONNX model. You can then deploy the ONNX model on different inference engines, such as TensorRT, OpenVINO, MNN/TNN and NCNN. For details about Paddle2ONNX, please refer to [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX).
PaddleClas provides an example of how to convert a Paddle Inference model to an ONNX model with the paddle2onnx toolkit and predict with the ONNX model. You can refer to [paddle2onnx](../../../deploy/paddle2onnx/readme_en.md) for deployment details.
......@@ -14,7 +14,7 @@
- [3.3 EDA strategy](#3.3)
- [3.4 SKL-UGI knowledge distillation](#3.4)
- [3.5 Summary](#3.5)
- [4. Hyperparameter Search](#4)
- [4. Hyperparameters Searching](#4)
- [4.1 Search based on default configuration](#4.1)
- [4.2 Custom search configuration](#4.2)
......@@ -31,7 +31,7 @@ The PULC solution has been verified to be effective in many scenarios, such as h
<img src="https://user-images.githubusercontent.com/19523330/173011854-b10fcd7a-b799-4dfd-a1cf-9504952a3c44.png" width = "800" />
The solution mainly includes 4 parts, namely: PP-LCNet lightweight backbone network, SSLD pre-trained model, Ensemble Data Augmentation (EDA) and SKL-UGI knowledge distillation algorithm. In addition, we also adopt the method of hyperparameter search to efficiently optimize the hyperparameters in training. Below, we take the person exists or not scene as an example to illustrate the solution.
The solution mainly includes 4 parts, namely: PP-LCNet lightweight backbone network, SSLD pre-trained model, Ensemble Data Augmentation (EDA) and SKL-UGI knowledge distillation algorithm. In addition, we also adopt the method of hyperparameters searching to efficiently optimize the hyperparameters in training. Below, we take the person exists or not scene as an example to illustrate the solution.
**Note**: For some specific scenarios, we provide basic training documents for reference, such as [person exists or not classification model](PULC_person_exists_en.md), etc. You can find these documents [here](./PULC_model_list_en.md). If the methods in these documents do not meet your needs, or if you need a custom training task, you can refer to this document.
......@@ -201,22 +201,22 @@ We also used the same optimization strategy in the other 8 scenarios and got the
| Text Image Orientation Classification | SwinTransformer_tiny |99.12 | PPLCNet_x1_0 | 99.06 |
| Text-line Orientation Classification | SwinTransformer_tiny | 93.61 | PPLCNet_x1_0 | 96.01 |
| Language Classification | SwinTransformer_tiny | 98.12 | PPLCNet_x1_0 | 99.26 |
It can be seen from the results that the PULC scheme can improve the model accuracy in multiple application scenarios. Using the PULC scheme can greatly reduce the workload of model optimization and quickly obtain models with higher accuracy.
<a name="4"></a>
### 4. Hyperparameter Search
### 4. Hyperparameters Searching
In the above training process, we adjusted parameters such as learning rate, data augmentation probability, and stage learning rate mult list. The optimal values of these parameters may not be the same in different scenarios. We provide a quick hyperparameter search script to automate the process of hyperparameter tuning. This script traverses the parameters in the search value list to replace the parameters in the default configuration, then trains in sequence, and finally selects the parameters corresponding to the model with the highest accuracy as the search result.
In the above training process, we adjusted parameters such as learning rate, data augmentation probability, and stage learning rate mult list. The optimal values of these parameters may not be the same in different scenarios. We provide a quick hyperparameters searching script to automate the process of hyperparameter tuning. This script traverses the parameters in the search value list to replace the parameters in the default configuration, then trains in sequence, and finally selects the parameters corresponding to the model with the highest accuracy as the search result.
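The search procedure described above can be sketched in a few lines. This is a simplified illustration, not the code of `tools/search_strategy.py`: it assumes a greedy, one-parameter-at-a-time search in which the best value found so far is kept before the next parameter is tried, and `fake_train_and_eval` is a toy stand-in for a real training run.

```python
import copy
from typing import Callable, Dict, List, Tuple


def search(default_config: Dict, search_space: Dict[str, List],
           train_and_eval: Callable[[Dict], float]) -> Tuple[Dict, float]:
    """Replace one parameter at a time, train, and keep the best-scoring config."""
    best_config = copy.deepcopy(default_config)
    best_score = train_and_eval(best_config)
    for name, candidates in search_space.items():
        for value in candidates:
            trial = copy.deepcopy(best_config)
            trial[name] = value  # override a single parameter of the current best config
            score = train_and_eval(trial)
            if score > best_score:
                best_config, best_score = trial, score
    return best_config, best_score


def fake_train_and_eval(config: Dict) -> float:
    """Toy objective that peaks at lr=0.01 and prob=0.5 (stands in for real training)."""
    return 1.0 - abs(config["lr"] - 0.01) * 10 - abs(config["prob"] - 0.5)


best_config, best_score = search(
    {"lr": 0.1, "prob": 0.0},
    {"lr": [0.1, 0.01, 0.001], "prob": [0.1, 0.5, 1.0]},
    fake_train_and_eval,
)
```

Searching parameters one after another needs a number of training runs that grows with the sum of the list lengths rather than their product, which matters because every trial is a full training run.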
<a name="4.1"></a>
#### 4.1 Search based on default configuration
The configuration file [search.yaml](../../../ppcls/configs/PULC/person_exists/search.yaml) defines the configuration of hyperparameter search in person exists or not scenarios. Use the following commands to complete hyperparameter search.
The configuration file [search.yaml](../../../ppcls/configs/PULC/person_exists/search.yaml) defines the configuration of hyperparameters searching in person exists or not scenarios. Use the following commands to complete hyperparameters searching.
python3 tools/search_strategy.py -c ppcls/configs/PULC/person_exists/search.yaml
......@@ -228,8 +228,8 @@ python3 tools/search_strategy.py -c ppcls/configs/PULC/person_exists/search.yaml
#### 4.2 Custom search configuration
You can also modify the configuration of hyperparameter search based on training results or your parameter tuning experience.
You can also modify the configuration of hyperparameters searching based on training results or your parameter tuning experience.
Modify the `search_values` field in `lrs` to modify the list of learning rate search values;
......@@ -8,6 +8,8 @@ PaddleClas supports Python wheel package for prediction. At present, PaddleClas
- [1. Installation](#1)
- [2. Quick Start](#2)
- [2.1 ImageNet1k models](#2.1)
- [2.2 PULC models](#2.2)
- [3. Definition of Parameters](#3)
- [4. More usage](#4)
- [4.1 View help information](#4.1)
......@@ -75,7 +77,6 @@ filename: docs/images/inference_deployment/whl_demo.jpg, top-5, class_ids: [8, 7
Predict complete!
<a name="2.2"></a>
### 2.2 PULC models
......@@ -54,22 +54,22 @@
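| PPLCNet_x1_0 | 95.48 | 2.12 | 6.5 | using SSLD pretrained + EDA strategy |
| <b>PPLCNet_x1_0</b> | <b>95.92</b> | <b>2.12</b> | <b>6.5</b> | using SSLD pretrained + EDA strategy + SKL-UGI knowledge distillation strategy |
As the table shows, accuracy is high when the backbone is SwinTransformer_tiny, but inference is slow. Replacing the backboone with the lightweight model MobileNetV3_small_x0_35 greatly improves speed, but accuracy drops sharply. Replacing the backbone with the faster PPLCNet_x1_0 gives accuracy 13 percentage points higher than MobileNetV3_small_x0_35 while still being more than 20% faster. On this basis, using the SSLD pretrained model improves accuracy by about 0.7 percentage points without changing inference speed, and SKL-UGI knowledge distillation improves it by a further 0.44 percentage points. At this point PPLCNet_x1_0 reaches accuracy close to that of SwinTransformer_tiny while being more than 40 times faster. The training and inference deployment methods of PULC are described in detail below.
As the table shows, accuracy is high when the backbone is SwinTransformer_tiny, but inference is slow. Replacing the backbone with the lightweight model MobileNetV3_small_x0_35 greatly improves speed, but accuracy drops sharply. Replacing the backbone with the faster PPLCNet_x1_0 gives accuracy 13 percentage points higher than MobileNetV3_small_x0_35 while still being more than 20% faster. On this basis, using the SSLD pretrained model improves accuracy by about 0.7 percentage points without changing inference speed, and SKL-UGI knowledge distillation improves it by a further 0.44 percentage points. At this point PPLCNet_x1_0 reaches accuracy close to that of SwinTransformer_tiny while being more than 40 times faster. The training and inference deployment methods of PULC are described in detail below.
* For an introduction to the `Tpr` metric, see the notes in [Section 3.3](#3.3). Latency was measured on an Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz with MKLDNN acceleration enabled and 10 threads.
* For an introduction to PP-LCNet, see [PP-LCNet Introduction](../models/PP-LCNet.md); the paper is available at [PP-LCNet paper](https://arxiv.org/abs/2109.15099).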
<a name="2"></a>
## 2. Quick Start
<a name="2.1"></a>
### 2.1 PaddlePaddle Installation
- If CUDA9 or CUDA10 is installed on your machine, run the following command to install
......@@ -81,11 +81,11 @@ python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
<a name="2.2"></a>
### 2.2 PaddleClas Installation
Use the following command to install paddleclas quickly
......@@ -93,11 +93,11 @@ python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
pip3 install paddleclas
<a name="2.3"></a>
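### 2.3 Prediction
Click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download and unzip the demo data, then switch to the corresponding directory in the terminal.
* Quick prediction with the command line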
......@@ -130,7 +130,7 @@ print(next(result))
>>> result
[{'class_ids': [1], 'scores': [0.9871138], 'label_names': ['contains_car'], 'filename': 'pulc_demo_imgs/car_exists/objects365_00001507.jpeg'}]
<a name="3"></a>
......@@ -326,7 +326,7 @@ python3 -m paddle.distributed.launch \
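## 5. Hyperparameters Searching
The hyperparameters used in [Section 3.3](#3.3) and [Section 4.1](#4.1) were found with the `SHAS hyperparameter search strategy` provided by PaddleClas. If you want better results on your own dataset, refer to the [SHAS hyperparameter search strategy](PULC_train.md#4-超参搜索) to obtain better training hyperparameters.
The hyperparameters used in [Section 3.3](#3.3) and [Section 4.1](#4.1) were found with the `hyperparameter search strategy` provided by PaddleClas. If you want better results on your own dataset, refer to the [hyperparameter search strategy](PULC_train.md#4-超参搜索) to obtain better training hyperparameters.
**Note:** This section is optional. The search takes a long time, so you can decide whether to run it according to your hardware. If you have not changed the dataset, you can skip this section.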
......@@ -49,7 +49,7 @@
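| PPLCNet_x1_0 | 99.12 | 2.58 | 6.5 | using SSLD pretrained + EDA strategy |
| **PPLCNet_x1_0** | **99.26** | **2.58** | **6.5** | using SSLD pretrained + EDA strategy + SKL-UGI knowledge distillation strategy |
As the table shows, accuracy is relatively high when the backbone is SwinTransformer_tiny, but inference is slow. Replacing the backboone with the lightweight model MobileNetV3_small_x0_35 improves speed noticeably, but accuracy drops sharply. Replacing the backbone with PPLCNet_x1_0 and adjusting the preprocessing input size and the network's downsampling stride improves speed slightly, while accuracy is 2.43 percentage points higher than MobileNetV3_large_x1_0. On this basis, using the SSLD pretrained model improves accuracy by 0.35 percentage points without changing inference speed; adding the EDA strategy improves it by a further 0.42 percentage points; finally, SKL-UGI knowledge distillation improves it by another 0.14 percentage points. At this point PPLCNet_x1_0 exceeds the accuracy of SwinTransformer_tiny and is noticeably faster. The training and inference deployment methods of PULC are described in detail below.
As the table shows, accuracy is relatively high when the backbone is SwinTransformer_tiny, but inference is slow. Replacing the backbone with the lightweight model MobileNetV3_small_x0_35 improves speed noticeably, but accuracy drops sharply. Replacing the backbone with PPLCNet_x1_0 and adjusting the preprocessing input size and the network's downsampling stride improves speed slightly, while accuracy is 2.43 percentage points higher than MobileNetV3_large_x1_0. On this basis, using the SSLD pretrained model improves accuracy by 0.35 percentage points without changing inference speed; adding the EDA strategy improves it by a further 0.42 percentage points; finally, SKL-UGI knowledge distillation improves it by another 0.14 percentage points. At this point PPLCNet_x1_0 exceeds the accuracy of SwinTransformer_tiny and is noticeably faster. The training and inference deployment methods of PULC are described in detail below.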
......@@ -60,9 +60,9 @@
## 2. Quick Start
<a name="2.1"></a>
### 2.1 PaddlePaddle Installation
- If CUDA9 or CUDA10 is installed on your machine, run the following command to install
......@@ -74,23 +74,23 @@ python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
<a name="2.2"></a>
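### 2.2 PaddleClas Installation
Use the following command to install paddleclas quickly
pip3 install paddleclas
<a name="2.3"></a>
### 2.3 Prediction
Click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download and unzip the demo data, then switch to the corresponding directory in the terminal.
* Quick prediction with the command line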
......@@ -309,7 +309,7 @@ python3 -m paddle.distributed.launch \
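## 5. Hyperparameters Searching
The hyperparameters used in [Section 3.2](#3.2) and [Section 4.1](#4.1) were found with the `SHAS hyperparameter search strategy` provided by PaddleClas. If you want better results on your own dataset, refer to the [SHAS hyperparameter search strategy](PULC_train.md#4-超参搜索) to obtain better training hyperparameters.
The hyperparameters used in [Section 3.2](#3.2) and [Section 4.1](#4.1) were found with the `hyperparameter search strategy` provided by PaddleClas. If you want better results on your own dataset, refer to the [hyperparameter search strategy](PULC_train.md#4-超参搜索) to obtain better training hyperparameters.
**Note:** This section is optional. The search takes a long time, so you can decide whether to run it according to your hardware. If you have not changed the dataset, you can skip this section.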
......@@ -67,9 +67,9 @@
## 2. Quick Start
<a name="2.1"></a>
### 2.1 PaddlePaddle Installation
- If CUDA9 or CUDA10 is installed on your machine, run the following command to install
......@@ -81,23 +81,23 @@ python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
<a name="2.2"></a>
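### 2.2 PaddleClas Installation
Use the following command to install paddleclas quickly
pip3 install paddleclas
<a name="2.3"></a>
### 2.3 Prediction
Click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download and unzip the demo data, then switch to the corresponding directory in the terminal.
* Quick prediction with the command line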
......@@ -313,7 +313,7 @@ python3 -m paddle.distributed.launch \
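## 5. Hyperparameters Searching
The hyperparameters used in [Section 3.2](#3.2) and [Section 4.1](#4.1) were found with the `SHAS hyperparameter search strategy` provided by PaddleClas. If you want better results on your own dataset, refer to the [SHAS hyperparameter search strategy](PULC_train.md#4-超参搜索) to obtain better training hyperparameters.
The hyperparameters used in [Section 3.2](#3.2) and [Section 4.1](#4.1) were found with the `hyperparameter search strategy` provided by PaddleClas. If you want better results on your own dataset, refer to the [hyperparameter search strategy](PULC_train.md#4-超参搜索) to obtain better training hyperparameters.
**Note:** This section is optional. The search takes a long time, so you can decide whether to run it according to your hardware. If you have not changed the dataset, you can skip this section.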
......@@ -54,7 +54,7 @@
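| PPLCNet_x1_0 | 93.43 | 2.12 | 6.5 | using SSLD pretrained + EDA strategy |
| <b>PPLCNet_x1_0</b> | <b>95.60</b> | <b>2.12</b> | <b>6.5</b> | using SSLD pretrained + EDA strategy + SKL-UGI knowledge distillation strategy |
As the table shows, accuracy is high when the backbone is SwinTransformer_tiny, but inference is slow. Replacing the backboone with the lightweight model MobileNetV3_small_x0_35 greatly improves speed, but accuracy drops sharply. Replacing the backbone with the faster PPLCNet_x1_0 gives accuracy more than 20 percentage points higher than MobileNetV3_small_x0_35 while still being more than 20% faster. On this basis, using the SSLD pretrained model improves accuracy by about 2.6 percentage points without changing inference speed; adding the EDA strategy improves it by a further 1.3 percentage points; finally, SKL-UGI knowledge distillation improves it by another 2.2 percentage points. At this point PPLCNet_x1_0 reaches the accuracy of SwinTransformer_tiny while being more than 40 times faster. The training and inference deployment methods of PULC are described in detail below.
As the table shows, accuracy is high when the backbone is SwinTransformer_tiny, but inference is slow. Replacing the backbone with the lightweight model MobileNetV3_small_x0_35 greatly improves speed, but accuracy drops sharply. Replacing the backbone with the faster PPLCNet_x1_0 gives accuracy more than 20 percentage points higher than MobileNetV3_small_x0_35 while still being more than 20% faster. On this basis, using the SSLD pretrained model improves accuracy by about 2.6 percentage points without changing inference speed; adding the EDA strategy improves it by a further 1.3 percentage points; finally, SKL-UGI knowledge distillation improves it by another 2.2 percentage points. At this point PPLCNet_x1_0 reaches the accuracy of SwinTransformer_tiny while being more than 40 times faster. The training and inference deployment methods of PULC are described in detail below.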
......@@ -67,9 +67,9 @@
## 2. Quick Start
<a name="2.1"></a>
### 2.1 PaddlePaddle Installation
- If CUDA9 or CUDA10 is installed on your machine, run the following command to install
......@@ -81,23 +81,23 @@ python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
<a name="2.2"></a>
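### 2.2 PaddleClas Installation
Use the following command to install paddleclas quickly
pip3 install paddleclas
<a name="2.3"></a>
### 2.3 Prediction
Click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download and unzip the demo data, then switch to the corresponding directory in the terminal.
* Quick prediction with the command line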
......@@ -328,7 +328,7 @@ python3 -m paddle.distributed.launch \
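## 5. Hyperparameters Searching
The hyperparameters used in [Section 3.3](#3.3) and [Section 4.1](#4.1) were found with the `SHAS hyperparameter search strategy` provided by PaddleClas. If you want better results on your own dataset, refer to the [SHAS hyperparameter search strategy](PULC_train.md#4-超参搜索) to obtain better training hyperparameters.
The hyperparameters used in [Section 3.3](#3.3) and [Section 4.1](#4.1) were found with the `hyperparameter search strategy` provided by PaddleClas. If you want better results on your own dataset, refer to the [hyperparameter search strategy](PULC_train.md#4-超参搜索) to obtain better training hyperparameters.
**Note:** This section is optional. The search takes a long time, so you can decide whether to run it according to your hardware. If you have not changed the dataset, you can skip this section.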
......@@ -53,12 +53,12 @@
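| PPLCNet_x1_0 | 99.30 | 2.03 | 6.5 | using SSLD pretrained + EDA strategy |
| <b>PPLCNet_x1_0</b> | <b>99.38</b> | <b>2.03</b> | <b>6.5</b> | using SSLD pretrained + EDA strategy + UDML knowledge distillation strategy |
As the table shows, when large server-side models are used as the backbone, SwinTransformer_tiny has lower accuracy and Res2Net200_vd_26w_4s has higher accuracy, but large server-side models are generally slow at inference. Replacing the backboone with the lightweight model MobileNetV3_small_x0_35 greatly improves speed, but accuracy drops significantly. Replacing the backbone with PPLCNet_x1_0 raises accuracy by about 8.5 percentage points over MobileNetV3_small_x0_35 while being more than 20% faster. On this basis, replacing the pretrained model of PPLCNet_x1_0 with the SSLD pretrained model raises accuracy by about 4.9 percentage points with no effect on inference speed, and further applying the EDA strategy raises it by another 1.1 percentage points. At this point PPLCNet_x1_0 already exceeds the accuracy of Res2Net200_vd_26w_4s while being more than 70 times faster. Finally, UDML knowledge distillation raises accuracy by a further 0.08 percentage points. The training and inference deployment methods of the PULC safety helmet model are described in detail below.
As the table shows, when large server-side models are used as the backbone, SwinTransformer_tiny has lower accuracy and Res2Net200_vd_26w_4s has higher accuracy, but large server-side models are generally slow at inference. Replacing the backbone with the lightweight model MobileNetV3_small_x0_35 greatly improves speed, but accuracy drops significantly. Replacing the backbone with PPLCNet_x1_0 raises accuracy by about 8.5 percentage points over MobileNetV3_small_x0_35 while being more than 20% faster. On this basis, replacing the pretrained model of PPLCNet_x1_0 with the SSLD pretrained model raises accuracy by about 4.9 percentage points with no effect on inference speed, and further applying the EDA strategy raises it by another 1.1 percentage points. At this point PPLCNet_x1_0 already exceeds the accuracy of Res2Net200_vd_26w_4s while being more than 70 times faster. Finally, UDML knowledge distillation raises accuracy by a further 0.08 percentage points. The training and inference deployment methods of the PULC safety helmet model are described in detail below.
* For an introduction to the `Tpr` metric, see the notes in [Section 3.3](#3.3). Latency was measured on an Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz with MKLDNN acceleration enabled and 10 threads.
* For an introduction to PP-LCNet, see [PP-LCNet Introduction](../models/PP-LCNet.md); the paper is available at [PP-LCNet paper](https://arxiv.org/abs/2109.15099).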
......@@ -67,9 +67,9 @@
## 2. Quick Start
<a name="2.1"></a>
### 2.1 Install paddlepaddle
- If your machine has CUDA 9 or CUDA 10 installed, run the following command to install
......@@ -81,23 +81,23 @@ python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
<a name="2.2"></a>
### 2.2 Install paddleclas
Use the following command to install paddleclas quickly
pip3 install paddleclas
<a name="2.3"></a>
### 2.3 Prediction
Click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download the demo data and extract it, then change to the corresponding directory in a terminal.
* Quick prediction from the command line
......@@ -295,7 +295,7 @@ python3 -m paddle.distributed.launch \
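## 5. Hyperparameter search
The hyperparameters used in [Section 3.2](#3.2) and [Section 4.1](#4.1) were obtained with the `SHAS hyperparameter search strategy` provided by PaddleClas. To get better results on your own dataset, refer to the [SHAS hyperparameter search strategy](PULC_train.md#4-超参搜索) to find better training hyperparameters.
The hyperparameters used in [Section 3.2](#3.2) and [Section 4.1](#4.1) were obtained with the `hyperparameter search strategy` provided by PaddleClas. To get better results on your own dataset, refer to the [hyperparameter search strategy](PULC_train.md#4-超参搜索) to find better training hyperparameters.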
......@@ -38,7 +38,7 @@
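In scenarios such as document scanning and ID photo capture, the capture device is sometimes rotated to get a clearer shot, so the resulting images come in different orientations. A standard OCR pipeline does not handle such data well. Image classification can be used to determine the orientation of an image containing text in advance and correct it, which improves OCR accuracy. This case shows how to use PaddleClas's ultra-lightweight image classification solution (PULC, Practical Ultra Lightweight Classification) to quickly build a lightweight, high-accuracy, deployable orientation classification model for images containing text. The model can be widely applied to OCR processing of rotated images in industries such as finance and government services.
The table below lists the metrics of the text image orientation classification model. The first two rows show the metrics of models trained with SwinTransformer_tiny and MobileNetV3_small_x0_35 as the backbone. Rows three to five show, in order, the metrics of models obtained by replacing the backbone with PPLCNet_x1_0, using the SSLD pretrained model, and using the SHAS hyperparameter search strategy.
The table below lists the metrics of the text image orientation classification model. The first two rows show the metrics of models trained with SwinTransformer_tiny and MobileNetV3_small_x0_35 as the backbone. Rows three to five show, in order, the metrics of models obtained by replacing the backbone with PPLCNet_x1_0, using the SSLD pretrained model, and using the hyperparameter search strategy.
| Model | Accuracy (%) | Latency (ms) | Storage (M) | Strategy |
| ----------------------- | --------- | ---------- | --------- | ------------------------------------- |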
......@@ -48,9 +48,9 @@
| PPLCNet_x1_0 | 98.02 | 2.16 | 6.5 | SSLD pretrained model |
| **PPLCNet_x1_0** | **99.06** | **2.16** | **6.5** | SSLD pretrained model + SHAS hyperparameter search strategy |
As the table shows, with SwinTransformer_tiny as the backbone the accuracy is relatively high but inference is slow. Replacing the backbone with the lightweight MobileNetV3_small_x0_35 gives a clear speedup but a large drop in accuracy. Switching the backbone to PPLCNet_x1_0 is slightly faster still and raises accuracy by 14.24 percentage points over MobileNetV3_small_x0_35. On top of that, using the SSLD pretrained model improves accuracy by 0.17 percentage points without changing inference speed, and searching for the best hyperparameters with the SHAS hyperparameter search strategy adds a further 1.04 percentage points. At this point the accuracy of PPLCNet_x1_0 is close to that of SwinTransformer_tiny, while inference is noticeably faster. The training and inference deployment methods for PULC are described in detail below.
* For an introduction to PP-LCNet, see [PP-LCNet introduction](../models/PP-LCNet.md); the paper is available at [PP-LCNet paper](https://arxiv.org/abs/2109.15099).
......@@ -59,9 +59,9 @@
## 2. Quick Start
<a name="2.1"></a>
### 2.1 Install paddlepaddle
- If your machine has CUDA 9 or CUDA 10 installed, run the following command to install
......@@ -73,23 +73,23 @@ python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
<a name="2.2"></a>
### 2.2 Install paddleclas
Use the following command to install paddleclas quickly
pip3 install paddleclas
<a name="2.3"></a>
### 2.3 Prediction
Click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download the demo data and extract it, then change to the corresponding directory in a terminal.
* Quick prediction from the command line
......@@ -319,7 +319,7 @@ python3 -m paddle.distributed.launch \
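## 5. Hyperparameter search
The hyperparameters used in [Section 3.2](#3.2) and [Section 4.1](#4.1) were obtained with the `SHAS hyperparameter search strategy` provided by PaddleClas. To get better results on your own dataset, refer to the [SHAS hyperparameter search strategy](PULC_train.md#4-超参搜索) to find better training hyperparameters.
The hyperparameters used in [Section 3.2](#3.2) and [Section 4.1](#4.1) were obtained with the `hyperparameter search strategy` provided by PaddleClas. To get better results on your own dataset, refer to the [hyperparameter search strategy](PULC_train.md#4-超参搜索) to find better training hyperparameters.
**Note:** This part is optional. The search takes a long time, so decide whether to run it based on your hardware. If you have not changed the dataset, you can skip this section.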
......@@ -55,11 +55,11 @@
| <b>PPLCNet_x1_0\*\*</b> | <b>96.01</b> | <b>2.72</b> | <b>6.5</b> | SSLD pretrained model + EDA strategy |
| PPLCNet_x1_0\*\* | 95.86 | 2.72 | 6.5 | SSLD pretrained model + EDA strategy + SKL-UGI knowledge distillation strategy |
As the table shows, with SwinTransformer_tiny as the backbone the accuracy is high but inference is slow. Replacing the backbone with the lightweight MobileNetV3_small_x0_35 speeds up inference substantially, with a noticeable drop in accuracy. Switching to PPLCNet_x1_0 raises accuracy by 8.6 percentage points over MobileNetV3_small_x0_35 and is about 10% faster. On top of that, changing the resolution and strides (the approach used by [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)) slows inference by 27% but improves accuracy by 4.5 percentage points. Using the SSLD pretrained model then improves accuracy by about 0.05 percentage points, and adding the EDA strategy gives a further 1.9 percentage points. Finally, the SKL-UGI knowledge distillation strategy brings no gain in this scenario. The training and inference deployment methods for PULC are described in detail below.
* Models without \* use a resolution of 224x224. Models marked with \* use a resolution of 48x192 (h\*w), with the strides in the network changed to `[2, [2, 1], [2, 1], [2, 1], [2, 1]]`, where each element of the outer list is the stride of a downsampling layer in the network; this is the text line orientation classifier approach provided by [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR). Models marked with \*\* use a resolution of 80x160 (h\*w), with the network strides changed to `[2, [2, 1], [2, 1], [2, 1], [2, 1]]` in the same way; this resolution was found with the [SHAS hyperparameter search strategy](PULC_train.md#4-超参搜索).
* Models without \* use a resolution of 224x224. Models marked with \* use a resolution of 48x192 (h\*w), with the strides in the network changed to `[2, [2, 1], [2, 1], [2, 1], [2, 1]]`, where each element of the outer list is the stride of a downsampling layer in the network; this is the text line orientation classifier approach provided by [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR). Models marked with \*\* use a resolution of 80x160 (h\*w), with the network strides changed to `[2, [2, 1], [2, 1], [2, 1], [2, 1]]` in the same way; this resolution was found with the [hyperparameter search strategy](PULC_train.md#4-超参搜索).
* Latency was measured on an Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz with the MKLDNN acceleration strategy enabled and 10 threads.
* For an introduction to PP-LCNet, see [PP-LCNet introduction](../models/PP-LCNet.md); the paper is available at [PP-LCNet paper](https://arxiv.org/abs/2109.15099).
......@@ -68,9 +68,9 @@
## 2. Quick Start
<a name="2.1"></a>
### 2.1 Install paddlepaddle
- If your machine has CUDA 9 or CUDA 10 installed, run the following command to install
......@@ -82,23 +82,23 @@ python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
<a name="2.2"></a>
### 2.2 Install paddleclas
Use the following command to install paddleclas quickly
pip3 install paddleclas
<a name="2.3"></a>
### 2.3 Prediction
Click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download the demo data and extract it, then change to the corresponding directory in a terminal.
* Quick prediction from the command line
......@@ -314,7 +314,7 @@ python3 -m paddle.distributed.launch \
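## 5. Hyperparameter search
The hyperparameters used in [Section 3.3](#3.3) and [Section 4.1](#4.1) were obtained with the `SHAS hyperparameter search strategy` provided by PaddleClas. To get better results on your own dataset, refer to the [SHAS hyperparameter search strategy](PULC_train.md#4-超参搜索) to find better training hyperparameters.
The hyperparameters used in [Section 3.3](#3.3) and [Section 4.1](#4.1) were obtained with the `hyperparameter search strategy` provided by PaddleClas. To get better results on your own dataset, refer to the [hyperparameter search strategy](PULC_train.md#4-超参搜索) to find better training hyperparameters.
**Note:** This part is optional. The search takes a long time, so decide whether to run it based on your hardware.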
......@@ -66,9 +66,9 @@
## 2. Quick Start
<a name="2.1"></a>
### 2.1 Install paddlepaddle
- If your machine has CUDA 9 or CUDA 10 installed, run the following command to install
......@@ -80,23 +80,23 @@ python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
<a name="2.2"></a>
### 2.2 Install paddleclas
Use the following command to install paddleclas quickly
pip3 install paddleclas
<a name="2.3"></a>
### 2.3 Prediction
Click [here](https://paddleclas.bj.bcebos.com/data/PULC/pulc_demo_imgs.zip) to download the demo data and extract it, then change to the corresponding directory in a terminal.
* Quick prediction from the command line
......@@ -344,7 +344,7 @@ python3 -m paddle.distributed.launch \
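## 5. Hyperparameter search
The hyperparameters used in [Section 3.2](#3.2) and [Section 4.1](#4.1) were obtained with the `SHAS hyperparameter search strategy` provided by PaddleClas. To get better results on your own dataset, refer to the [SHAS hyperparameter search strategy](PULC_train.md#4-超参搜索) to find better training hyperparameters.
The hyperparameters used in [Section 3.2](#3.2) and [Section 4.1](#4.1) were obtained with the `hyperparameter search strategy` provided by PaddleClas. To get better results on your own dataset, refer to the [hyperparameter search strategy](PULC_train.md#4-超参搜索) to find better training hyperparameters.
**Note:** This part is optional. The search takes a long time, so decide whether to run it based on your hardware. If you have not changed the dataset, you can skip this section.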
......@@ -337,7 +337,7 @@ python3 -m paddle.distributed.launch \
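## 5. Hyperparameter search
The hyperparameters used in [Section 3.3](#3.3) and [Section 4.1](#4.1) were obtained with the `SHAS hyperparameter search strategy` provided by PaddleClas. To get better results on your own dataset, refer to the [SHAS hyperparameter search strategy](PULC_train.md#4-超参搜索) to find better training hyperparameters.
The hyperparameters used in [Section 3.3](#3.3) and [Section 4.1](#4.1) were obtained with the `hyperparameter search strategy` provided by PaddleClas. To get better results on your own dataset, refer to the [hyperparameter search strategy](PULC_train.md#4-超参搜索) to find better training hyperparameters.
**Note:** This part is optional. The search takes a long time, so decide whether to run it based on your hardware. If you have not changed the dataset, you can skip this section.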
......@@ -233,7 +233,7 @@ class ShuffleNet(Layer):
elif scale == 1.5:
stage_out_channels = [-1, 24, 176, 352, 704, 1024]
elif scale == 2.0:
stage_out_channels = [-1, 24, 224, 488, 976, 2048]
stage_out_channels = [-1, 24, 244, 488, 976, 2048]
raise NotImplementedError("This scale size:[" + str(scale) +
"] is not implemented!")
......@@ -51,10 +51,10 @@ Optimizer:
one_dim_param_no_weight_decay: True
name: Cosine
learning_rate: 1e-4
eta_min: 2e-6
learning_rate: 5e-5
eta_min: 1e-6
warmup_epoch: 5
warmup_start_lr: 2e-7
warmup_start_lr: 1e-7
# data loader for train and eval
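The `Cosine` settings in the hunk above combine a linear warmup with cosine decay. The following is a rough per-epoch sketch of that shape, using the new values from the hunk as defaults. The function name `cosine_lr` and the `total_epochs` argument are assumptions made for illustration, not PaddleClas API, and the framework's own scheduler may step per iteration rather than per epoch.

```python
import math


def cosine_lr(epoch, total_epochs, learning_rate=5e-5, eta_min=1e-6,
              warmup_epoch=5, warmup_start_lr=1e-7):
    """Illustrative schedule: linear warmup, then cosine decay to eta_min."""
    if epoch < warmup_epoch:
        # Ramp linearly from warmup_start_lr up to learning_rate.
        frac = epoch / warmup_epoch
        return warmup_start_lr + (learning_rate - warmup_start_lr) * frac
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (epoch - warmup_epoch) / max(1, total_epochs - warmup_epoch)
    return eta_min + 0.5 * (learning_rate - eta_min) * (1 + math.cos(math.pi * progress))
```

With these defaults the rate starts at `warmup_start_lr`, reaches `learning_rate` at the end of warmup, and falls to `eta_min` at the final epoch.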
......@@ -371,6 +371,11 @@ def run(dataloader,
"Except RuntimeError when reading data from dataloader, try to read once again..."
except IndexError:
"Except IndexError when reading data from dataloader, try to read once again..."
idx += 1
# ignore the warmup iters
if idx == 5:
......@@ -112,4 +112,5 @@ bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/MobileNetV3/Mo
- [test_lite_arm_cpu_cpp usage](docs/test_lite_arm_cpu_cpp.md): tests C++ inference deployment on ARM CPUs based on Paddle-Lite.
- [test_paddle2onnx usage](docs/test_paddle2onnx.md): tests model conversion with Paddle2ONNX and verifies correctness.
- [test_serving_infer_python usage](docs/test_serving_infer_python.md): tests Python serving.
- [test_serving_infer_cpp usage](docs/test_serving_infer_cpp.md): tests C++ serving.
- [test_train_fleet_inference_python usage](./docs/test_train_fleet_inference_python.md): tests basic Python-based multi-machine, multi-GPU training and inference.
trans_model:-m paddle_serving_client.convert
# Linux GPU/CPU C++ Serving Deployment Test
The main program for the Linux GPU/CPU C++ serving deployment test is `test_serving_infer_cpp.sh`, which tests C++-based model serving deployment.
## 1. Test result summary
- Inference:
| Algorithm | Model | device_CPU | device_GPU |
| :----: | :----: | :----: | :----: |
| MobileNetV3 | MobileNetV3_large_x1_0 | Supported | Supported |
| PP-ShiTu | PPShiTu_general_rec, PPShiTu_mainbody_det | Supported | Supported |
| PPHGNet | PPHGNet_small | Supported | Supported |
| PPHGNet | PPHGNet_tiny | Supported | Supported |
| PPLCNet | PPLCNet_x0_25 | Supported | Supported |
| PPLCNet | PPLCNet_x0_35 | Supported | Supported |
| PPLCNet | PPLCNet_x0_5 | Supported | Supported |
| PPLCNet | PPLCNet_x0_75 | Supported | Supported |
| PPLCNet | PPLCNet_x1_0 | Supported | Supported |
| PPLCNet | PPLCNet_x1_5 | Supported | Supported |
| PPLCNet | PPLCNet_x2_0 | Supported | Supported |
| PPLCNet | PPLCNet_x2_5 | Supported | Supported |
| PPLCNetV2 | PPLCNetV2_base | Supported | Supported |
| ResNet | ResNet50 | Supported | Supported |
| ResNet | ResNet50_vd | Supported | Supported |
| SwinTransformer | SwinTransformer_tiny_patch4_window7_224 | Supported | Supported |
## 2. Test procedure
### 2.1 Prepare the data
By default, the recognition model uses `drink_dataset_v1.0/test_images/001.jpeg` as the test input image; it is downloaded during **2.2 Prepare the environment**.
### 2.2 Prepare the environment
- Install PaddlePaddle: if paddlepaddle 2.2 or later is already installed, you do not need to run the commands below.
# Paddle 2.2 or later is required
# Install the GPU version of Paddle
python3.7 -m pip install paddlepaddle-gpu==2.2.0
# Install the CPU version of Paddle
python3.7 -m pip install paddlepaddle==2.2.0
- Install dependencies
python3.7 -m pip install -r requirements.txt
- Install the PaddleServing components, including serving_client and serving-app, automatically build and install the serving_server package with custom OPs, and automatically download and extract the inference model
bash test_tipc/prepare.sh test_tipc/configs/PPLCNet/PPLCNet_x1_0_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt serving_infer
### 2.3 Functional test
bash test_tipc/test_serving_infer_cpp.sh ${your_params_file}
Taking the `Linux GPU/CPU C++ serving deployment test` of `PPLCNet_x1_0` as an example, the command is as follows.
bash test_tipc/test_serving_infer_cpp.sh test_tipc/configs/PPLCNet/PPLCNet_x1_0_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
Run successfully with command - PPLCNet_x1_0 - python3.7 test_cpp_serving_client.py > ../../test_tipc/output/PPLCNet_x1_0/server_infer_cpp_gpu_pipeline_batchsize_1.log 2>&1 !
Run successfully with command - PPLCNet_x1_0 - python3.7 test_cpp_serving_client.py > ../../test_tipc/output/PPLCNet_x1_0/server_infer_cpp_cpu_pipeline_batchsize_1.log 2>&1 !
The prediction results are saved automatically to `./test_tipc/output/PPLCNet_x1_0/server_infer_cpp_gpu_pipeline_batchsize_1.log`, where you can see the PaddleServing output:
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0612 09:55:16.109890 38303 naming_service_thread.cpp:202] brpc::policy::ListNamingService(""): added 1
I0612 09:55:16.172924 38303 general_model.cpp:490] [client]logid=0,client_cost=60.772ms,server_cost=57.6ms.
prediction: daisy, probability: 0.9099399447441101
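When checking many such logs, it can help to pull the predicted label and probability out of the final line programmatically. The following is a minimal sketch that assumes the log line has exactly the `prediction: ..., probability: ...` form shown above; `parse_prediction` is a hypothetical helper, not part of the test scripts.

```python
import re

# Sample line copied from the log output shown above.
LOG_LINE = "prediction: daisy, probability: 0.9099399447441101"


def parse_prediction(line):
    """Return (label, probability) from a result line, or None if absent."""
    match = re.search(
        r"prediction:\s*(\S+),\s*probability:\s*([0-9.eE+-]+)", line)
    if match is None:
        return None
    return match.group(1), float(match.group(2))


label, prob = parse_prediction(LOG_LINE)
print(label, prob)
```

Lines that carry no prediction, such as the `WARNING` and `I0612` lines, return `None`, so the helper can be applied to every line of the log.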
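# Linux GPU/CPU PYTHON Serving Deployment Test
The main program for the Linux GPU/CPU PYTHON serving deployment test is `test_serving_infer.sh`, which tests Python-based model serving deployment.
The main program for the Linux GPU/CPU PYTHON serving deployment test is `test_serving_infer_python.sh`, which tests Python-based model serving deployment.
## 1. Test result summary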
......@@ -60,14 +60,14 @@ Linux GPU/CPU PYTHON 服务化部署测试的主程序为`test_serving_infer.sh
bash test_tipc/test_serving_infer_python.sh ${your_params_file} lite_train_lite_infer
bash test_tipc/test_serving_infer_python.sh ${your_params_file}
Taking the `Linux GPU/CPU PYTHON serving deployment test` of `ResNet50` as an example, the command is as follows.
bash test_tipc/test_serving_infer_python.sh test_tipc/configs/ResNet50/ResNet50_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt serving_infer
bash test_tipc/test_serving_infer_python.sh test_tipc/configs/ResNet50/ResNet50_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
......@@ -200,15 +200,25 @@ fi
if [[ ${MODE} = "serving_infer" ]]; then
# prepare serving env
python_name=$(func_parser_value "${lines[2]}")
${python_name} -m pip install paddle-serving-server-gpu==0.7.0.post102
${python_name} -m pip install paddle_serving_client==0.7.0
${python_name} -m pip install paddle-serving-app==0.7.0
${python_name} -m pip install paddle_serving_client==0.9.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
${python_name} -m pip install paddle-serving-app==0.9.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
python_name=$(func_parser_value "${lines[2]}")
if [[ ${FILENAME} =~ "cpp" ]]; then
pushd ./deploy/paddleserving
bash build_server.sh ${python_name}
${python_name} -m pip install paddle-serving-server-gpu==0.9.0.post101 -i https://pypi.tuna.tsinghua.edu.cn/simple
if [[ ${model_name} =~ "ShiTu" ]]; then
${python_name} -m pip install faiss-cpu==1.7.1post2 -i https://pypi.tuna.tsinghua.edu.cn/simple
cls_inference_model_url=$(func_parser_value "${lines[3]}")
cls_tar_name=$(func_get_url_file_name "${cls_inference_model_url}")
det_inference_model_url=$(func_parser_value "${lines[4]}")
det_tar_name=$(func_get_url_file_name "${det_inference_model_url}")
cd ./deploy
wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v1.0.tar --no-check-certificate
tar -xf drink_dataset_v1.0.tar
mkdir models
cd models
wget -nc ${cls_inference_model_url} && tar xf ${cls_tar_name}
......@@ -8,5 +8,11 @@ num_workers=8
# get data
bash test_tipc/static/${model_item}/benchmark_common/prepare.sh
cd ./dataset/ILSVRC2012
cat train_list.txt >> tmp
for i in {1..10}; do cat tmp >> train_list.txt; done
cd ../../
# run
bash test_tipc/static/${model_item}/benchmark_common/run_benchmark.sh ${model_item} ${bs_item} ${fp_item} ${run_mode} ${device_num} ${max_epochs} ${num_workers} 2>&1;
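The benchmark hunk above enlarges `train_list.txt` by copying it into `tmp` and then appending that copy ten times, so the list ends up 11 times its original length (assuming `tmp` did not exist beforehand). The following is a minimal in-memory sketch of the same effect; `replicate_train_list` is a hypothetical name used only for illustration.

```python
def replicate_train_list(lines, copies=10):
    """Append `copies` snapshots of the original list to itself."""
    snapshot = list(lines)   # plays the role of the `tmp` file
    result = list(lines)     # plays the role of `train_list.txt`
    for _ in range(copies):
        result.extend(snapshot)
    return result


# Made-up entries in "path label" form, for illustration only.
original = ["img_0.jpeg 0", "img_1.jpeg 1"]
enlarged = replicate_train_list(original)
print(len(original), len(enlarged))
```

Snapshotting first matters: appending the file to itself inside the loop would double it on every pass instead of growing it linearly.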
......@@ -8,5 +8,11 @@ num_workers=8
# get data
bash test_tipc/static/${model_item}/benchmark_common/prepare.sh
cd ./dataset/ILSVRC2012
cat train_list.txt >> tmp
for i in {1..10}; do cat tmp >> train_list.txt; done
cd ../../
# run
bash test_tipc/static/${model_item}/benchmark_common/run_benchmark.sh ${model_item} ${bs_item} ${fp_item} ${run_mode} ${device_num} ${max_epochs} ${num_workers} 2>&1;
......@@ -8,5 +8,11 @@ num_workers=8
# get data
bash test_tipc/static/${model_item}/benchmark_common/prepare.sh
cd ./dataset/ILSVRC2012
cat train_list.txt >> tmp
for i in {1..10}; do cat tmp >> train_list.txt; done
cd ../../
# run
bash test_tipc/static/${model_item}/benchmark_common/run_benchmark.sh ${model_item} ${bs_item} ${fp_item} ${run_mode} ${device_num} ${max_epochs} ${num_workers} 2>&1;
source test_tipc/common_func.sh
dataline=$(awk 'NR==1, NR==19{print}' $FILENAME)
# parser params
function func_get_url_file_name(){
echo ${tmp}
# parser serving
model_name=$(func_parser_value "${lines[1]}")
python=$(func_parser_value "${lines[2]}")
trans_model_py=$(func_parser_value "${lines[4]}")
infer_model_dir_key=$(func_parser_key "${lines[5]}")
infer_model_dir_value=$(func_parser_value "${lines[5]}")
model_filename_key=$(func_parser_key "${lines[6]}")
model_filename_value=$(func_parser_value "${lines[6]}")
params_filename_key=$(func_parser_key "${lines[7]}")
params_filename_value=$(func_parser_value "${lines[7]}")
serving_server_key=$(func_parser_key "${lines[8]}")
serving_server_value=$(func_parser_value "${lines[8]}")
serving_client_key=$(func_parser_key "${lines[9]}")
serving_client_value=$(func_parser_value "${lines[9]}")
serving_dir_value=$(func_parser_value "${lines[10]}")
web_service_py=$(func_parser_value "${lines[11]}")
web_use_gpu_key=$(func_parser_key "${lines[12]}")
web_use_gpu_list=$(func_parser_value "${lines[12]}")
pipeline_py=$(func_parser_value "${lines[13]}")
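The block above reads each setting from one line of the params file through `func_parser_key` and `func_parser_value`, which are sourced from `test_tipc/common_func.sh` and not shown here. The following sketch only illustrates the idea, assuming each line has the form `key:value` and is split at the first colon; `parse_config_line` is a hypothetical name, and the real helpers may treat values that contain further colons differently.

```python
def parse_config_line(line):
    """Split a 'key:value' params line at the first colon (assumed format)."""
    key, _, value = line.strip().partition(":")
    return key, value


# A params line of the kind these configs contain.
key, value = parse_config_line("trans_model:-m paddle_serving_client.convert")
print(key)    # trans_model
print(value)  # -m paddle_serving_client.convert
```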
function func_serving_cls(){
mkdir -p ${LOG_PATH}
# pdserving
set_dirname=$(func_set_params "${infer_model_dir_key}" "${infer_model_dir_value}")
set_model_filename=$(func_set_params "${model_filename_key}" "${model_filename_value}")
set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
set_serving_server=$(func_set_params "${serving_server_key}" "${serving_server_value}")
set_serving_client=$(func_set_params "${serving_client_key}" "${serving_client_value}")
for python_ in ${python[*]}; do
if [[ ${python_} =~ "python" ]]; then
trans_model_cmd="${python_} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client}"
eval ${trans_model_cmd}
# modify the alias_name of fetch_var to "prediction"
server_fetch_var_line_cmd="sed -i '/fetch_var/,/is_lod_tensor/s/alias_name: .*/alias_name: \"prediction\"/' ${serving_server_value}/serving_server_conf.prototxt"
eval ${server_fetch_var_line_cmd}
client_fetch_var_line_cmd="sed -i '/fetch_var/,/is_lod_tensor/s/alias_name: .*/alias_name: \"prediction\"/' ${serving_client_value}/serving_client_conf.prototxt"
eval ${client_fetch_var_line_cmd}
prototxt_dataline=$(awk 'NR==1, NR==3{print}' ${serving_server_value}/serving_server_conf.prototxt)
feed_var_name=$(func_parser_value "${prototxt_lines[2]}")
cd ${serving_dir_value}
unset https_proxy
unset http_proxy
for item in ${python[*]}; do
if [[ ${item} =~ "python" ]]; then
serving_client_dir_name=$(func_get_url_file_name "$serving_client_value")
set_client_feed_type_cmd="sed -i '/feed_type/,/: .*/s/feed_type: .*/feed_type: 20/' ${serving_client_dir_name}/serving_client_conf.prototxt"
eval ${set_client_feed_type_cmd}
set_client_shape_cmd="sed -i '/shape: 3/,/shape: 3/s/shape: 3/shape: 1/' ${serving_client_dir_name}/serving_client_conf.prototxt"
eval ${set_client_shape_cmd}
set_client_shape224_cmd="sed -i '/shape: 224/,/shape: 224/s/shape: 224//' ${serving_client_dir_name}/serving_client_conf.prototxt"
eval ${set_client_shape224_cmd}
set_client_shape224_cmd="sed -i '/shape: 224/,/shape: 224/s/shape: 224//' ${serving_client_dir_name}/serving_client_conf.prototxt"
eval ${set_client_shape224_cmd}
set_pipeline_load_config_cmd="sed -i '/load_client_config/,/.prototxt/s/.\/.*\/serving_client_conf.prototxt/.\/${serving_client_dir_name}\/serving_client_conf.prototxt/' ${pipeline_py}"
eval ${set_pipeline_load_config_cmd}
set_pipeline_feed_var_cmd="sed -i '/feed=/,/: image}/s/feed={.*: image}/feed={${feed_var_name}: image}/' ${pipeline_py}"
eval ${set_pipeline_feed_var_cmd}
serving_server_dir_name=$(func_get_url_file_name "$serving_server_value")
for use_gpu in ${web_use_gpu_list[*]}; do
if [[ ${use_gpu} = "null" ]]; then
web_service_cpp_cmd="${python_} -m paddle_serving_server.serve --model ${serving_server_dir_name} --op GeneralClasOp --port 9292 &"
eval ${web_service_cpp_cmd}
sleep 5s
pipeline_cmd="${python_} test_cpp_serving_client.py > ${_save_log_path} 2>&1 "
eval ${pipeline_cmd}
eval "cat ${_save_log_path}"
status_check ${last_status} "${pipeline_cmd}" "${status_log}" "${model_name}"
eval "${python_} -m paddle_serving_server.serve stop"
sleep 5s
web_service_cpp_cmd="${python_} -m paddle_serving_server.serve --model ${serving_server_dir_name} --op GeneralClasOp --port 9292 --gpu_id=${use_gpu} &"
eval ${web_service_cpp_cmd}
sleep 8s
pipeline_cmd="${python_} test_cpp_serving_client.py > ${_save_log_path} 2>&1 "
eval ${pipeline_cmd}
eval "cat ${_save_log_path}"
status_check ${last_status} "${pipeline_cmd}" "${status_log}" "${model_name}"
sleep 5s
eval "${python_} -m paddle_serving_server.serve stop"
function func_serving_rec(){
mkdir -p ${LOG_PATH}
trans_model_py=$(func_parser_value "${lines[5]}")
cls_infer_model_dir_key=$(func_parser_key "${lines[6]}")
cls_infer_model_dir_value=$(func_parser_value "${lines[6]}")
det_infer_model_dir_key=$(func_parser_key "${lines[7]}")
det_infer_model_dir_value=$(func_parser_value "${lines[7]}")
model_filename_key=$(func_parser_key "${lines[8]}")
model_filename_value=$(func_parser_value "${lines[8]}")
params_filename_key=$(func_parser_key "${lines[9]}")
params_filename_value=$(func_parser_value "${lines[9]}")
cls_serving_server_key=$(func_parser_key "${lines[10]}")
cls_serving_server_value=$(func_parser_value "${lines[10]}")
cls_serving_client_key=$(func_parser_key "${lines[11]}")
cls_serving_client_value=$(func_parser_value "${lines[11]}")
det_serving_server_key=$(func_parser_key "${lines[12]}")
det_serving_server_value=$(func_parser_value "${lines[12]}")
det_serving_client_key=$(func_parser_key "${lines[13]}")
det_serving_client_value=$(func_parser_value "${lines[13]}")
serving_dir_value=$(func_parser_value "${lines[14]}")
web_service_py=$(func_parser_value "${lines[15]}")
web_use_gpu_key=$(func_parser_key "${lines[16]}")
web_use_gpu_list=$(func_parser_value "${lines[16]}")
pipeline_py=$(func_parser_value "${lines[17]}")
for python_ in ${python[*]}; do
if [[ ${python_} =~ "python" ]]; then
# pdserving
cd ./deploy
set_dirname=$(func_set_params "${cls_infer_model_dir_key}" "${cls_infer_model_dir_value}")
set_model_filename=$(func_set_params "${model_filename_key}" "${model_filename_value}")
set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
set_serving_server=$(func_set_params "${cls_serving_server_key}" "${cls_serving_server_value}")
set_serving_client=$(func_set_params "${cls_serving_client_key}" "${cls_serving_client_value}")
cls_trans_model_cmd="${python_interp} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client}"
eval ${cls_trans_model_cmd}
set_dirname=$(func_set_params "${det_infer_model_dir_key}" "${det_infer_model_dir_value}")
set_model_filename=$(func_set_params "${model_filename_key}" "${model_filename_value}")
set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
set_serving_server=$(func_set_params "${det_serving_server_key}" "${det_serving_server_value}")
set_serving_client=$(func_set_params "${det_serving_client_key}" "${det_serving_client_value}")
det_trans_model_cmd="${python_interp} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client}"
eval ${det_trans_model_cmd}
cp_prototxt_cmd="cp ./paddleserving/recognition/preprocess/general_PPLCNet_x2_5_lite_v1.0_serving/*.prototxt ${cls_serving_server_value}"
eval ${cp_prototxt_cmd}
cp_prototxt_cmd="cp ./paddleserving/recognition/preprocess/general_PPLCNet_x2_5_lite_v1.0_client/*.prototxt ${cls_serving_client_value}"
eval ${cp_prototxt_cmd}
cp_prototxt_cmd="cp ./paddleserving/recognition/preprocess/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_client/*.prototxt ${det_serving_client_value}"
eval ${cp_prototxt_cmd}
cp_prototxt_cmd="cp ./paddleserving/recognition/preprocess/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_serving/*.prototxt ${det_serving_server_value}"
eval ${cp_prototxt_cmd}
prototxt_dataline=$(awk 'NR==1, NR==3{print}' ${cls_serving_server_value}/serving_server_conf.prototxt)
feed_var_name=$(func_parser_value "${prototxt_lines[2]}")
cd ${serving_dir_value}
unset https_proxy
unset http_proxy
export SERVING_BIN=${PWD}/../Serving/server-build-gpu-opencv/core/general-server/serving
for use_gpu in ${web_use_gpu_list[*]}; do
if [ ${use_gpu} = "null" ]; then
det_serving_server_dir_name=$(func_get_url_file_name "$det_serving_server_value")
web_service_cpp_cmd="${python_interp} -m paddle_serving_server.serve --model ../../${det_serving_server_value} ../../${cls_serving_server_value} --op GeneralPicodetOp GeneralFeatureExtractOp --port 9400 &"
eval ${web_service_cpp_cmd}
sleep 5s
pipeline_cmd="${python_interp} ${pipeline_py} > ${_save_log_path} 2>&1 "
eval ${pipeline_cmd}
eval "cat ${_save_log_path}"
status_check ${last_status} "${pipeline_cmd}" "${status_log}" "${model_name}"
eval "${python_} -m paddle_serving_server.serve stop"
sleep 5s
det_serving_server_dir_name=$(func_get_url_file_name "$det_serving_server_value")
web_service_cpp_cmd="${python_interp} -m paddle_serving_server.serve --model ../../${det_serving_server_value} ../../${cls_serving_server_value} --op GeneralPicodetOp GeneralFeatureExtractOp --port 9400 --gpu_id=${use_gpu} &"
eval ${web_service_cpp_cmd}
sleep 5s
pipeline_cmd="${python_interp} ${pipeline_py} > ${_save_log_path} 2>&1 "
eval ${pipeline_cmd}
eval "cat ${_save_log_path}"
status_check ${last_status} "${pipeline_cmd}" "${status_log}" "${model_name}"
eval "${python_} -m paddle_serving_server.serve stop"
sleep 5s
# set cuda device
if [ ${#GPUID} -le 0 ];then
env=" "
eval ${env}
echo "################### run test ###################"
export Count=0
if [[ ${model_name} =~ "ShiTu" ]]; then
......@@ -38,8 +38,9 @@ pipeline_py=$(func_parser_value "${lines[13]}")
function func_serving_cls(){
mkdir -p ${LOG_PATH}
......@@ -50,12 +51,18 @@ function func_serving_cls(){
set_serving_server=$(func_set_params "${serving_server_key}" "${serving_server_value}")
set_serving_client=$(func_set_params "${serving_client_key}" "${serving_client_value}")
trans_model_cmd="${python} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client}"
eval $trans_model_cmd
for python_ in ${python[*]}; do
if [[ ${python_} =~ "python" ]]; then
trans_model_cmd="${python_} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client}"
eval ${trans_model_cmd}
# modify the alias_name of fetch_var to "prediction"
server_fetch_var_line_cmd="sed -i '/fetch_var/,/is_lod_tensor/s/alias_name: .*/alias_name: \"prediction\"/' ${serving_server_value}/serving_server_conf.prototxt"
eval ${server_fetch_var_line_cmd}
client_fetch_var_line_cmd="sed -i '/fetch_var/,/is_lod_tensor/s/alias_name: .*/alias_name: \"prediction\"/' ${serving_client_value}/serving_client_conf.prototxt"
eval ${client_fetch_var_line_cmd}
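The two `sed` commands above use an address range: from a line matching `fetch_var` through the next line matching `is_lod_tensor`, any `alias_name: ...` is rewritten to `alias_name: "prediction"`, while `feed_var` blocks are left alone. The following Python sketch mirrors that range logic. The sample prototxt text and the name `set_fetch_alias` are assumptions for illustration, not the files the script actually edits.

```python
import re

# Hypothetical prototxt content, for illustration only.
SAMPLE = """feed_var {
  name: "x"
  alias_name: "x"
  is_lod_tensor: false
}
fetch_var {
  name: "softmax_1.tmp_0"
  alias_name: "softmax_1.tmp_0"
  is_lod_tensor: false
}"""


def set_fetch_alias(prototxt_text, alias="prediction"):
    """Rewrite alias_name only between 'fetch_var' and 'is_lod_tensor' lines."""
    out = []
    in_range = False
    for line in prototxt_text.splitlines():
        started_here = False
        if not in_range and "fetch_var" in line:
            in_range = True
            started_here = True
        if in_range:
            line = re.sub(r"alias_name: .*", 'alias_name: "%s"' % alias, line)
            # As in sed, the end pattern is only checked after the start line.
            if not started_here and "is_lod_tensor" in line:
                in_range = False
        out.append(line)
    return "\n".join(out)


print(set_fetch_alias(SAMPLE))
```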
......@@ -69,109 +76,82 @@ function func_serving_cls(){
unset https_proxy
unset http_proxy
# python serving
# modify the input_name in "classification_web_service.py" to be consistent with feed_var.name in prototxt
set_web_service_feet_var_cmd="sed -i '/preprocess/,/input_imgs}/s/{.*: input_imgs}/{${feed_var_name}: input_imgs}/' ${web_service_py}"
eval ${set_web_service_feet_var_cmd}
set_web_service_feed_var_cmd="sed -i '/preprocess/,/input_imgs}/s/{.*: input_imgs}/{${feed_var_name}: input_imgs}/' ${web_service_py}"
eval ${set_web_service_feed_var_cmd}
serving_server_dir_name=$(func_get_url_file_name "$serving_server_value")
set_model_config_cmd="sed -i '${model_config}s/model_config: .*/model_config: ${serving_server_dir_name}/' config.yml"
eval ${set_model_config_cmd}
for python in ${python[*]}; do
if [[ ${python} = "cpp" ]]; then
for use_gpu in ${web_use_gpu_list[*]}; do
if [ ${use_gpu} = "null" ]; then
web_service_cpp_cmd="${python} -m paddle_serving_server.serve --model ppocr_det_mobile_2.0_serving/ ppocr_rec_mobile_2.0_serving/ --port 9293"
eval $web_service_cmd
sleep 5s
pipeline_cmd="${python} ocr_cpp_client.py ppocr_det_mobile_2.0_client/ ppocr_rec_mobile_2.0_client/"
eval $pipeline_cmd
status_check $last_status "${pipeline_cmd}" "${status_log}" "${model_name}"
sleep 5s
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
web_service_cpp_cmd="${python} -m paddle_serving_server.serve --model ppocr_det_mobile_2.0_serving/ ppocr_rec_mobile_2.0_serving/ --port 9293 --gpu_id=0"
eval $web_service_cmd
sleep 5s
pipeline_cmd="${python} ocr_cpp_client.py ppocr_det_mobile_2.0_client/ ppocr_rec_mobile_2.0_client/"
eval $pipeline_cmd
status_check $last_status "${pipeline_cmd}" "${status_log}" "${model_name}"
sleep 5s
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
for use_gpu in ${web_use_gpu_list[*]}; do
if [[ ${use_gpu} = "null" ]]; then
set_device_type_cmd="sed -i '${device_type_line}s/device_type: .*/device_type: 0/' config.yml"
eval ${set_device_type_cmd}
set_devices_cmd="sed -i '${devices_line}s/devices: .*/devices: \"\"/' config.yml"
eval ${set_devices_cmd}
web_service_cmd="${python_} ${web_service_py} &"
eval ${web_service_cmd}
sleep 5s
for pipeline in ${pipeline_py[*]}; do
pipeline_cmd="${python_} ${pipeline} > ${_save_log_path} 2>&1 "
eval ${pipeline_cmd}
eval "cat ${_save_log_path}"
status_check $last_status "${pipeline_cmd}" "${status_log}" "${model_name}"
sleep 5s
# python serving
for use_gpu in ${web_use_gpu_list[*]}; do
if [[ ${use_gpu} = "null" ]]; then
set_device_type_cmd="sed -i '${device_type_line}s/device_type: .*/device_type: 0/' config.yml"
eval $set_device_type_cmd
set_devices_cmd="sed -i '${devices_line}s/devices: .*/devices: \"\"/' config.yml"
eval $set_devices_cmd
web_service_cmd="${python} ${web_service_py} &"
eval $web_service_cmd
sleep 5s
for pipeline in ${pipeline_py[*]}; do
pipeline_cmd="${python} ${pipeline} > ${_save_log_path} 2>&1 "
eval $pipeline_cmd
eval "cat ${_save_log_path}"
status_check $last_status "${pipeline_cmd}" "${status_log}" "${model_name}"
sleep 5s
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
elif [ ${use_gpu} -eq 0 ]; then
if [[ ${_flag_quant} = "False" ]] && [[ ${precision} =~ "int8" ]]; then
if [[ ${precision} =~ "fp16" || ${precision} =~ "int8" ]] && [ ${use_trt} = "False" ]; then
if [[ ${use_trt} = "False" || ${precision} =~ "int8" ]] && [[ ${_flag_quant} = "True" ]]; then
set_device_type_cmd="sed -i '${device_type_line}s/device_type: .*/device_type: 1/' config.yml"
eval $set_device_type_cmd
set_devices_cmd="sed -i '${devices_line}s/devices: .*/devices: \"${use_gpu}\"/' config.yml"
eval $set_devices_cmd
web_service_cmd="${python} ${web_service_py} & "
eval $web_service_cmd
sleep 5s
for pipeline in ${pipeline_py[*]}; do
pipeline_cmd="${python} ${pipeline} > ${_save_log_path} 2>&1"
eval $pipeline_cmd
eval "cat ${_save_log_path}"
status_check $last_status "${pipeline_cmd}" "${status_log}" "${model_name}"
sleep 5s
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
echo "Does not support hardware [${use_gpu}] other than CPU and GPU Currently!"
eval "${python_} -m paddle_serving_server.serve stop"
elif [ ${use_gpu} -eq 0 ]; then
if [[ ${_flag_quant} = "False" ]] && [[ ${precision} =~ "int8" ]]; then
if [[ ${precision} =~ "fp16" || ${precision} =~ "int8" ]] && [ ${use_trt} = "False" ]; then
if [[ ${use_trt} = "False" || ${precision} =~ "int8" ]] && [[ ${_flag_quant} = "True" ]]; then
set_device_type_cmd="sed -i '${device_type_line}s/device_type: .*/device_type: 1/' config.yml"
eval ${set_device_type_cmd}
set_devices_cmd="sed -i '${devices_line}s/devices: .*/devices: \"${use_gpu}\"/' config.yml"
eval ${set_devices_cmd}
web_service_cmd="${python_} ${web_service_py} & "
eval ${web_service_cmd}
sleep 5s
for pipeline in ${pipeline_py[*]}; do
pipeline_cmd="${python_} ${pipeline} > ${_save_log_path} 2>&1"
eval ${pipeline_cmd}
eval "cat ${_save_log_path}"
status_check $last_status "${pipeline_cmd}" "${status_log}" "${model_name}"
sleep 5s
eval "${python_} -m paddle_serving_server.serve stop"
echo "Does not support hardware [${use_gpu}] other than CPU and GPU Currently!"
function func_serving_rec(){
mkdir -p ${LOG_PATH}
trans_model_py=$(func_parser_value "${lines[5]}")
cls_infer_model_dir_key=$(func_parser_key "${lines[6]}")
@@ -200,6 +180,12 @@ function func_serving_rec(){
pipeline_py=$(func_parser_value "${lines[17]}")
for python_ in ${python[*]}; do
if [[ ${python_} =~ "python" ]]; then
# pdserving
cd ./deploy
@@ -208,16 +194,16 @@ function func_serving_rec(){
set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
set_serving_server=$(func_set_params "${cls_serving_server_key}" "${cls_serving_server_value}")
set_serving_client=$(func_set_params "${cls_serving_client_key}" "${cls_serving_client_value}")
cls_trans_model_cmd="${python} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client}"
eval $cls_trans_model_cmd
cls_trans_model_cmd="${python_interp} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client}"
eval ${cls_trans_model_cmd}
set_dirname=$(func_set_params "${det_infer_model_dir_key}" "${det_infer_model_dir_value}")
set_model_filename=$(func_set_params "${model_filename_key}" "${model_filename_value}")
set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
set_serving_server=$(func_set_params "${det_serving_server_key}" "${det_serving_server_value}")
set_serving_client=$(func_set_params "${det_serving_client_key}" "${det_serving_client_value}")
det_trans_model_cmd="${python} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client}"
eval $det_trans_model_cmd
det_trans_model_cmd="${python_interp} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client}"
eval ${det_trans_model_cmd}
# modify the alias_name of fetch_var to "features"
server_fetch_var_line_cmd="sed -i '/fetch_var/,/is_lod_tensor/s/alias_name: .*/alias_name: \"features\"/' $cls_serving_server_value/serving_server_conf.prototxt"
@@ -236,96 +222,66 @@ function func_serving_rec(){
unset http_proxy
# modify the input_name in "recognition_web_service.py" to be consistent with feed_var.name in prototxt
set_web_service_feet_var_cmd="sed -i '/preprocess/,/input_imgs}/s/{.*: input_imgs}/{${feed_var_name}: input_imgs}/' ${web_service_py}"
eval ${set_web_service_feet_var_cmd}
for python in ${python[*]}; do
if [[ ${python} = "cpp" ]]; then
for use_gpu in ${web_use_gpu_list[*]}; do
if [ ${use_gpu} = "null" ]; then
web_service_cpp_cmd="${python} web_service_py"
eval $web_service_cpp_cmd
sleep 5s
pipeline_cmd="${python} ocr_cpp_client.py ppocr_det_mobile_2.0_client/ ppocr_rec_mobile_2.0_client/"
eval $pipeline_cmd
status_check $last_status "${pipeline_cmd}" "${status_log}" "${model_name}"
sleep 5s
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
web_service_cpp_cmd="${python} web_service_py"
eval $web_service_cpp_cmd
sleep 5s
pipeline_cmd="${python} ocr_cpp_client.py ppocr_det_mobile_2.0_client/ ppocr_rec_mobile_2.0_client/"
eval $pipeline_cmd
status_check $last_status "${pipeline_cmd}" "${status_log}" "${model_name}"
sleep 5s
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
set_web_service_feed_var_cmd="sed -i '/preprocess/,/input_imgs}/s/{.*: input_imgs}/{${feed_var_name}: input_imgs}/' ${web_service_py}"
eval ${set_web_service_feed_var_cmd}
# python serving
for use_gpu in ${web_use_gpu_list[*]}; do
if [[ ${use_gpu} = "null" ]]; then
set_device_type_cmd="sed -i '${device_type_line}s/device_type: .*/device_type: 0/' config.yml"
eval ${set_device_type_cmd}
set_devices_cmd="sed -i '${devices_line}s/devices: .*/devices: \"\"/' config.yml"
eval ${set_devices_cmd}
web_service_cmd="${python} ${web_service_py} &"
eval ${web_service_cmd}
sleep 5s
for pipeline in ${pipeline_py[*]}; do
pipeline_cmd="${python} ${pipeline} > ${_save_log_path} 2>&1 "
eval ${pipeline_cmd}
eval "cat ${_save_log_path}"
status_check $last_status "${pipeline_cmd}" "${status_log}" "${model_name}"
sleep 5s
# python serving
for use_gpu in ${web_use_gpu_list[*]}; do
if [[ ${use_gpu} = "null" ]]; then
set_device_type_cmd="sed -i '${device_type_line}s/device_type: .*/device_type: 0/' config.yml"
eval $set_device_type_cmd
set_devices_cmd="sed -i '${devices_line}s/devices: .*/devices: \"\"/' config.yml"
eval $set_devices_cmd
web_service_cmd="${python} ${web_service_py} &"
eval $web_service_cmd
sleep 5s
for pipeline in ${pipeline_py[*]}; do
pipeline_cmd="${python} ${pipeline} > ${_save_log_path} 2>&1 "
eval $pipeline_cmd
eval "cat ${_save_log_path}"
status_check $last_status "${pipeline_cmd}" "${status_log}" "${model_name}"
sleep 5s
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
elif [ ${use_gpu} -eq 0 ]; then
if [[ ${_flag_quant} = "False" ]] && [[ ${precision} =~ "int8" ]]; then
if [[ ${precision} =~ "fp16" || ${precision} =~ "int8" ]] && [ ${use_trt} = "False" ]; then
if [[ ${use_trt} = "False" || ${precision} =~ "int8" ]] && [[ ${_flag_quant} = "True" ]]; then
set_device_type_cmd="sed -i '${device_type_line}s/device_type: .*/device_type: 1/' config.yml"
eval $set_device_type_cmd
set_devices_cmd="sed -i '${devices_line}s/devices: .*/devices: \"${use_gpu}\"/' config.yml"
eval $set_devices_cmd
web_service_cmd="${python} ${web_service_py} & "
eval $web_service_cmd
sleep 10s
for pipeline in ${pipeline_py[*]}; do
pipeline_cmd="${python} ${pipeline} > ${_save_log_path} 2>&1"
eval $pipeline_cmd
eval "cat ${_save_log_path}"
status_check $last_status "${pipeline_cmd}" "${status_log}" "${model_name}"
sleep 10s
ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
echo "Does not support hardware [${use_gpu}] other than CPU and GPU Currently!"
eval "${python_} -m paddle_serving_server.serve stop"
elif [ ${use_gpu} -eq 0 ]; then
if [[ ${_flag_quant} = "False" ]] && [[ ${precision} =~ "int8" ]]; then
if [[ ${precision} =~ "fp16" || ${precision} =~ "int8" ]] && [ ${use_trt} = "False" ]; then
if [[ ${use_trt} = "False" || ${precision} =~ "int8" ]] && [[ ${_flag_quant} = "True" ]]; then
set_device_type_cmd="sed -i '${device_type_line}s/device_type: .*/device_type: 1/' config.yml"
eval ${set_device_type_cmd}
set_devices_cmd="sed -i '${devices_line}s/devices: .*/devices: \"${use_gpu}\"/' config.yml"
eval ${set_devices_cmd}
web_service_cmd="${python} ${web_service_py} & "
eval ${web_service_cmd}
sleep 10s
for pipeline in ${pipeline_py[*]}; do
pipeline_cmd="${python} ${pipeline} > ${_save_log_path} 2>&1"
eval ${pipeline_cmd}
eval "cat ${_save_log_path}"
status_check $last_status "${pipeline_cmd}" "${status_log}" "${model_name}"
sleep 10s
eval "${python_} -m paddle_serving_server.serve stop"
echo "Does not support hardware [${use_gpu}] other than CPU and GPU Currently!"
@@ -339,7 +295,7 @@ else
eval $env
eval ${env}
echo "################### run test ###################"
