diff --git a/deploy/pdserving/README.md b/deploy/pdserving/README.md
deleted file mode 100644
index 88426ba9c508a4020af0a6203010d683cb73eba9..0000000000000000000000000000000000000000
--- a/deploy/pdserving/README.md
+++ /dev/null
@@ -1,158 +0,0 @@
-# OCR Pipeline WebService
-
-(English|[简体中文](./README_CN.md))
-
-PaddleOCR provides two service deployment methods:
-- Based on **PaddleHub Serving**: Code path is "`./deploy/hubserving`". Please refer to the [tutorial](../../deploy/hubserving/readme_en.md)
-- Based on **PaddleServing**: Code path is "`./deploy/pdserving`". Please follow this tutorial.
-
-# Service deployment based on PaddleServing
-
-This document describes how to use [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README.md) to deploy the PPOCR dynamic graph model as a pipeline online service.
-
-Some key features of Paddle Serving:
-- Integrates seamlessly with the Paddle training pipeline; most Paddle models can be deployed with a single command.
-- Supports industrial serving features such as model management, online loading, and online A/B testing.
-- Supports highly concurrent, efficient communication between clients and servers.
-
-For an introduction to the Paddle Serving deployment framework and a tutorial, see the [document](https://github.com/PaddlePaddle/Serving/blob/develop/README.md).
-
-
-## Contents
-- [Environmental preparation](#environmental-preparation)
-- [Model conversion](#model-conversion)
-- [Paddle Serving pipeline deployment](#paddle-serving-pipeline-deployment)
-- [FAQ](#faq)
-
-
-## Environmental preparation
-
-Both the PaddleOCR operating environment and the Paddle Serving operating environment are required.
-
-1. Prepare the PaddleOCR operating environment by following this [link](../../doc/doc_ch/installation.md).
-
-2. Prepare the PaddleServing operating environment as follows:
-
-    Install the server package, which is used to start the service
-    ```
-    pip3 install paddle-serving-server==0.5.0 # for CPU
-    pip3 install paddle-serving-server-gpu==0.5.0 # for GPU
-    # For other GPU environments, confirm your environment and then run the matching command below
-    pip3 install paddle-serving-server-gpu==0.5.0.post9 # GPU with CUDA9.0
-    pip3 install paddle-serving-server-gpu==0.5.0.post10 # GPU with CUDA10.0
-    pip3 install paddle-serving-server-gpu==0.5.0.post101 # GPU with CUDA10.1 + TensorRT6
-    pip3 install paddle-serving-server-gpu==0.5.0.post11 # GPU with CUDA10.1 + TensorRT7
-    ```
-
-3. Install the client to send requests to the service
-    ```
-    pip3 install paddle-serving-client==0.5.0 # for CPU
-
-    pip3 install paddle-serving-client-gpu==0.5.0 # for GPU
-    ```
-
-4. Install serving-app
-    ```
-    pip3 install paddle-serving-app==0.3.0
-    # patch local_predict so that it can load dynamic graph models
-    # find the install directory of paddle_serving_app
-    vim /usr/local/lib/python3.7/site-packages/paddle_serving_app/local_predict.py
-    # replace line 85 of local_predict.py, config = AnalysisConfig(model_path), with:
-    if os.path.exists(os.path.join(model_path, "__params__")):
-        config = AnalysisConfig(os.path.join(model_path, "__model__"), os.path.join(model_path, "__params__"))
-    else:
-        config = AnalysisConfig(model_path)
-    ```
-
-    **note:** If you want to install the latest version of PaddleServing, refer to [link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
-
-
-
-## Model conversion
-When using PaddleServing for service deployment, you need to convert the saved inference model into a serving model that is easy to deploy.
-
-First, download the PPOCR [inference model](https://github.com/PaddlePaddle/PaddleOCR#pp-ocr-20-series-model-listupdate-on-dec-15)
-```
-# Download and unzip the OCR text detection model
-wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar && tar xf ch_ppocr_server_v2.0_det_infer.tar
-# Download and unzip the OCR text recognition model
-wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar && tar xf ch_ppocr_server_v2.0_rec_infer.tar
-
-```
-Then, use the installed paddle_serving_client tool to convert the inference model to a serving model.
-```
-# Detection model conversion
-python3 -m paddle_serving_client.convert --dirname ./ch_ppocr_server_v2.0_det_infer/ \
-                                         --model_filename inference.pdmodel \
-                                         --params_filename inference.pdiparams \
-                                         --serving_server ./ppocr_det_server_2.0_serving/ \
-                                         --serving_client ./ppocr_det_server_2.0_client/
-
-# Recognition model conversion
-python3 -m paddle_serving_client.convert --dirname ./ch_ppocr_server_v2.0_rec_infer/ \
-                                         --model_filename inference.pdmodel \
-                                         --params_filename inference.pdiparams \
-                                         --serving_server ./ppocr_rec_server_2.0_serving/ \
-                                         --serving_client ./ppocr_rec_server_2.0_client/
-
-```
-
-After the detection model is converted, two additional folders, `ppocr_det_server_2.0_serving` and `ppocr_det_server_2.0_client`, appear in the current folder, with the following layout:
-```
-|- ppocr_det_server_2.0_serving/
-   |- __model__
-   |- __params__
-   |- serving_server_conf.prototxt
-   |- serving_server_conf.stream.prototxt
-
-|- ppocr_det_server_2.0_client
-   |- serving_client_conf.prototxt
-   |- serving_client_conf.stream.prototxt
-
-```
-The recognition model is converted in the same way.
-
-
-## Paddle Serving pipeline deployment
-
-1. Download the PaddleOCR code. If you have already downloaded it, skip this step.
-    ```
-    git clone https://github.com/PaddlePaddle/PaddleOCR
-
-    # Enter the working directory
-    cd PaddleOCR/deploy/pdserving/
-    ```
-
-    The pdserving directory contains the code to start the pipeline service and send prediction requests, including:
-    ```
-    __init__.py
-    config.yml # Configuration file for starting the service
-    ocr_reader.py # OCR model pre-processing and post-processing code
-    pipeline_http_client.py # Script that sends pipeline prediction requests
-    web_service.py # Script that starts the pipeline server
-    ```
-
-2. Run the following command to start the service.
-    ```
-    # Start the service and save the running log in log.txt
-    python3 web_service.py &>log.txt &
-    ```
-    After the service starts successfully, a log similar to the following is written to log.txt
-    ![](./imgs/start_server.png)
-
-3. Send a service request
-    ```
-    python3 pipeline_http_client.py
-    ```
-    After the script runs successfully, the model's prediction results are printed in the terminal window. An example of the result is:
-    ![](./imgs/results.png)
-
-
-## FAQ
-**Q1**: No result is returned after sending the request.
-
-**A1**: Do not set a proxy when starting the service or sending requests. Disable the proxy before starting the service and before sending requests. The command to disable the proxy is:
-```
-unset https_proxy
-unset http_proxy
-```
diff --git a/deploy/pdserving/README_CN.md b/deploy/pdserving/README_CN.md
deleted file mode 100644
index 3e3f1bde0e824fe6133a1c169b9b03e614904c26..0000000000000000000000000000000000000000
--- a/deploy/pdserving/README_CN.md
+++ /dev/null
@@ -1,160 +0,0 @@
-# PPOCR Service Deployment
-
-([English](./README.md)|简体中文)
-
-PaddleOCR provides two service deployment methods:
-- Deployment based on PaddleHub Serving: the code path is "`./deploy/hubserving`"; see the [document](../../deploy/hubserving/readme.md) for usage;
-- Deployment based on PaddleServing: the code path is "`./deploy/pdserving`"; follow this tutorial.
-
-# Service deployment based on PaddleServing
-
-This document describes how to use the [PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md) tool to deploy the PPOCR
-dynamic graph model as a pipeline online service.
-
-Compared with hubserving deployment, PaddleServing has the following advantages:
-- Supports highly concurrent, efficient communication between clients and servers
-- Supports industrial-grade serving capabilities such as model management, online loading, and online A/B testing
-- Supports client development in multiple programming languages, such as C++, Python, and Java
-
-For more about the PaddleServing deployment framework and its tutorials, see the [document](https://github.com/PaddlePaddle/Serving/blob/develop/README_CN.md).
-
-## Contents
-- [Environment preparation](#environment-preparation)
-- [Model conversion](#model-conversion)
-- [Paddle Serving pipeline deployment](#paddle-serving-pipeline-deployment)
-- [FAQ](#faq)
-
-
-## Environment preparation
-
-Both the PaddleOCR operating environment and the Paddle Serving operating environment are required.
-
-- To prepare the PaddleOCR operating environment, see this [link](../../doc/doc_ch/installation.md)
-
-- To prepare the PaddleServing operating environment, follow these steps
-
-1. Install serving, which is used to start the service
-    ```
-    pip3 install paddle-serving-server==0.5.0 # for CPU
-    pip3 install paddle-serving-server-gpu==0.5.0 # for GPU
-    # For other GPU environments, confirm your environment and then run the matching command below
-    pip3 install paddle-serving-server-gpu==0.5.0.post9 # GPU with CUDA9.0
-    pip3 install paddle-serving-server-gpu==0.5.0.post10 # GPU with CUDA10.0
-    pip3 install paddle-serving-server-gpu==0.5.0.post101 # GPU with CUDA10.1 + TensorRT6
-    pip3 install paddle-serving-server-gpu==0.5.0.post11 # GPU with CUDA10.1 + TensorRT7
-    ```
-
-2. Install the client, which is used to send requests to the service
-    ```
-    pip3 install paddle-serving-client==0.5.0 # for CPU
-
-    pip3 install paddle-serving-client-gpu==0.5.0 # for GPU
-    ```
-
-3. Install serving-app
-    ```
-    pip3 install paddle-serving-app==0.3.0
-    ```
-    **note:** After installing serving-app 0.3.0, the serving_app source code must be modified so that it can load dynamic graph models, as follows:
-    ```
-    # Find the paddle_serving_app installation directory, then locate and edit local_predict.py
-    vim /usr/local/lib/python3.7/site-packages/paddle_serving_app/local_predict.py
-    # Replace line 85 of local_predict.py, config = AnalysisConfig(model_path), with:
-    if os.path.exists(os.path.join(model_path, "__params__")):
-        config = AnalysisConfig(os.path.join(model_path, "__model__"), os.path.join(model_path, "__params__"))
-    else:
-        config = AnalysisConfig(model_path)
-    ```
-
-    **Note:** To install the latest version of PaddleServing, see this [link](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md).
-
-
-## Model conversion
-
-When using PaddleServing for service deployment, the saved inference model must be converted into a model that serving can deploy easily.
-
-First, download the PPOCR [inference model](https://github.com/PaddlePaddle/PaddleOCR#pp-ocr-20-series-model-listupdate-on-dec-15)
-```
-# Download and unzip the OCR text detection model
-wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar && tar xf ch_ppocr_server_v2.0_det_infer.tar
-# Download and unzip the OCR text recognition model
-wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar && tar xf ch_ppocr_server_v2.0_rec_infer.tar
-```
-
-Next, use the installed paddle_serving_client to convert the downloaded inference model into a model format that is easy for the server to deploy.
-
-```
-# Convert the detection model
-python3 -m paddle_serving_client.convert --dirname ./ch_ppocr_server_v2.0_det_infer/ \
-                                         --model_filename inference.pdmodel \
-                                         --params_filename inference.pdiparams \
-                                         --serving_server ./ppocr_det_server_2.0_serving/ \
-                                         --serving_client ./ppocr_det_server_2.0_client/
-
-# Convert the recognition model
-python3 -m paddle_serving_client.convert --dirname ./ch_ppocr_server_v2.0_rec_infer/ \
-                                         --model_filename inference.pdmodel \
-                                         --params_filename inference.pdiparams \
-                                         --serving_server ./ppocr_rec_server_2.0_serving/ \
-                                         --serving_client ./ppocr_rec_server_2.0_client/
-```
-
-After the detection model is converted, two additional folders, `ppocr_det_server_2.0_serving` and `ppocr_det_server_2.0_client`, appear in the current folder, with the following layout:
-```
-|- ppocr_det_server_2.0_serving/
-   |- __model__
-   |- __params__
-   |- serving_server_conf.prototxt
-   |- serving_server_conf.stream.prototxt
-
-|- ppocr_det_server_2.0_client
-   |- serving_client_conf.prototxt
-   |- serving_client_conf.stream.prototxt
-
-```
-The recognition model is converted in the same way.
-
-
-## Paddle Serving pipeline deployment
-
-1. Download the PaddleOCR code. If you have already downloaded it, skip this step
-    ```
-    git clone https://github.com/PaddlePaddle/PaddleOCR
-
-    # Enter the working directory
-    cd PaddleOCR/deploy/pdserving/
-    ```
-    The pdserving directory contains the code to start the pipeline service and send prediction requests, including:
-    ```
-    __init__.py
-    config.yml # Configuration file for starting the service
-    ocr_reader.py # OCR model pre-processing and post-processing code
-    pipeline_http_client.py # Script that sends pipeline prediction requests
-    web_service.py # Script that starts the pipeline server
-    ```
-
-2. Run the following command to start the service:
-    ```
-    # Start the service and save the running log in log.txt
-    python3 web_service.py &>log.txt &
-    ```
-    After the service starts successfully, a log similar to the following is written to log.txt
-    ![](./imgs/start_server.png)
-
-3. Send a service request:
-    ```
-    python3 pipeline_http_client.py
-    ```
-    After the script runs successfully, the model's prediction results are printed in the terminal window. An example of the result is:
-    ![](./imgs/results.png)
-
-
-
-## FAQ
-**Q1**: No result is returned after sending the request, or an output decoding error is reported
-
-**A1**: Do not set a proxy when starting the service or sending requests. Disable the proxy before starting the service and before sending requests. The command to disable the proxy is:
-```
-unset https_proxy
-unset http_proxy
-```
diff --git a/deploy/pdserving/__init__.py b/deploy/pdserving/__init__.py
deleted file mode 100644
index 185a92b8d94d3426d616c0624f0f2ee04339349e..0000000000000000000000000000000000000000
--- a/deploy/pdserving/__init__.py
+++ /dev/null
@@ -1,13 +0,0 @@
-# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
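Reviewer note on the removed README: step 3 runs `pipeline_http_client.py`, which sends each image as a base64 string inside a JSON body. The sketch below shows, under stated assumptions, how that body is built using only the standard library. The helper name `build_ocr_request` is hypothetical and not part of the repository; the `key`/`value` layout is copied from the client script removed in this diff, and no request is actually sent.

```python
import base64
import json


def build_ocr_request(image_bytes: bytes) -> str:
    """Return the JSON body for one image (hypothetical helper, not in the repo)."""
    # The pipeline client base64-encodes the raw file bytes and decodes to text.
    encoded = base64.b64encode(image_bytes).decode("utf8")
    # One feed named "image"; keys and values are parallel lists.
    return json.dumps({"key": ["image"], "value": [encoded]})


if __name__ == "__main__":
    # Placeholder bytes stand in for the contents of a real image file.
    body = build_ocr_request(b"placeholder image bytes")
    print(body)
```

The resulting string is what the removed client passes as `data=` to `requests.post` against `http://127.0.0.1:9999/ocr/prediction`, where port 9999 comes from `http_port` in `config.yml`.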
diff --git a/deploy/pdserving/config.yml b/deploy/pdserving/config.yml
deleted file mode 100644
index aef735dbfab5b314f9209a7cc91e7fd5b6fc615c..0000000000000000000000000000000000000000
--- a/deploy/pdserving/config.yml
+++ /dev/null
@@ -1,71 +0,0 @@
-#RPC port. rpc_port and http_port must not both be empty. When rpc_port is empty and http_port is not, rpc_port is automatically set to http_port+1
-rpc_port: 18090
-
-#HTTP port. rpc_port and http_port must not both be empty. When rpc_port is available and http_port is empty, http_port is not generated automatically
-http_port: 9999
-
-#worker_num: maximum concurrency. When build_dag_each_worker=True, the framework creates worker_num processes, each of which builds its own gRPC server and DAG
-##When build_dag_each_worker=False, the framework sets max_workers=worker_num for the main thread's gRPC thread pool
-worker_num: 20
-
-#build_dag_each_worker: False, the framework creates one DAG inside the process; True, the framework creates multiple independent DAGs, one in each process
-build_dag_each_worker: false
-
-dag:
-    #Op resource type: True for the thread model; False for the process model
-    is_thread_op: False
-
-    #Number of retries
-    retry: 1
-
-    #Profiling: True generates Timeline performance data, at some cost in performance; False disables it
-    use_profile: False
-
-    tracer:
-        interval_s: 10
-op:
-    det:
-        #Concurrency: thread concurrency when is_thread_op=True; otherwise process concurrency
-        concurrency: 4
-
-        #When the op config has no server_endpoints, the local service config is read from local_service_conf
-        local_service_conf:
-            #Client type: brpc, grpc, or local_predictor. local_predictor does not start the Serving service and predicts in-process
-            client_type: local_predictor
-
-            #det model path
-            model_config: /paddle/serving/models/det_serving_server/ #ocr_det_model
-
-            #Fetch result list, based on the alias_name of fetch_var in client_config
-            fetch_list: ["save_infer_model/scale_0.tmp_1"]
-
-            #Compute device IDs: CPU prediction when devices is "" or omitted; GPU prediction when devices is "0" or "0,1,2", naming the GPU cards to use
-            devices: "2"
-
-            ir_optim: True
-    rec:
-        #Concurrency: thread concurrency when is_thread_op=True; otherwise process concurrency
-        concurrency: 1
-
-        #Timeout, in ms
-        timeout: -1
-
-        #Number of retries for Serving interaction; no retry by default
-        retry: 1
-
-        #When the op config has no server_endpoints, the local service config is read from local_service_conf
-        local_service_conf:
-
-            #Client type: brpc, grpc, or local_predictor. local_predictor does not start the Serving service and predicts in-process
-            client_type: local_predictor
-
-            #rec model path
-            model_config: /paddle/serving/models/rec_serving_server/ #ocr_rec_model
-
-            #Fetch result list, based on the alias_name of fetch_var in client_config
-            fetch_list: ["save_infer_model/scale_0.tmp_1"] #["ctc_greedy_decoder_0.tmp_0", "softmax_0.tmp_0"]
-
-            #Compute device IDs: CPU prediction when devices is "" or omitted; GPU prediction when devices is "0" or "0,1,2", naming the GPU cards to use
-            devices: "2"
-
-            ir_optim: True
diff --git a/deploy/pdserving/imgs/cpp_infer_pred_12.png b/deploy/pdserving/imgs/cpp_infer_pred_12.png
deleted file mode 100644
index eb5f64e1f6c329f7ae772c50edce7fc8afcb1211..0000000000000000000000000000000000000000
Binary files a/deploy/pdserving/imgs/cpp_infer_pred_12.png and /dev/null differ
diff --git a/deploy/pdserving/imgs/demo.png b/deploy/pdserving/imgs/demo.png
deleted file mode 100644
index 761bfb9baa505cf3450d0702151555bba4196ec5..0000000000000000000000000000000000000000
Binary files a/deploy/pdserving/imgs/demo.png and /dev/null differ
diff --git a/deploy/pdserving/imgs/results.png b/deploy/pdserving/imgs/results.png
deleted file mode 100644
index 35322bf9462c859bb2d158dd1d50ab52f35bad41..0000000000000000000000000000000000000000
Binary files a/deploy/pdserving/imgs/results.png and /dev/null differ
diff --git a/deploy/pdserving/imgs/start_server.png b/deploy/pdserving/imgs/start_server.png
deleted file mode 100644
index 60e19ccaed6bd4382f5342101a54d8957879bd60..0000000000000000000000000000000000000000
Binary files a/deploy/pdserving/imgs/start_server.png and /dev/null differ
diff --git a/deploy/pdserving/ocr_reader.py b/deploy/pdserving/ocr_reader.py
deleted file mode 100644
index 95110706af13662de11ef0f668558d0dd3abcf52..0000000000000000000000000000000000000000
--- a/deploy/pdserving/ocr_reader.py
+++ /dev/null
@@ -1,438 +0,0 @@
-# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import cv2
-import copy
-import numpy as np
-import math
-import re
-import sys
-import argparse
-import string
-from copy import deepcopy
-import paddle
-
-
-class DetResizeForTest(object):
-    def __init__(self, **kwargs):
-        super(DetResizeForTest, self).__init__()
-        self.resize_type = 0
-        if 'image_shape' in kwargs:
-            self.image_shape = kwargs['image_shape']
-            self.resize_type = 1
-        elif 'limit_side_len' in kwargs:
-            self.limit_side_len = kwargs['limit_side_len']
-            self.limit_type = kwargs.get('limit_type', 'min')
-        elif 'resize_long' in kwargs:
-            self.resize_type = 2
-            self.resize_long = kwargs.get('resize_long', 960)
-        else:
-            self.limit_side_len = 736
-            self.limit_type = 'min'
-
-    def __call__(self, data):
-        img = deepcopy(data)
-        src_h, src_w, _ = img.shape
-
-        if self.resize_type == 0:
-            img, [ratio_h, ratio_w] = self.resize_image_type0(img)
-        elif self.resize_type == 2:
-            img, [ratio_h, ratio_w] = self.resize_image_type2(img)
-        else:
-            img, [ratio_h, ratio_w] = self.resize_image_type1(img)
-
-        return img
-
-    def resize_image_type1(self, img):
-        resize_h, resize_w = self.image_shape
-        ori_h, ori_w = img.shape[:2]  # (h, w, c)
-        ratio_h = float(resize_h) / ori_h
-        ratio_w = float(resize_w) / ori_w
-        img = cv2.resize(img, (int(resize_w), int(resize_h)))
-        return img, [ratio_h, ratio_w]
-
-    def resize_image_type0(self, img):
-        """
-        resize image to a size multiple of 32 which is required by the network
-        args:
-            img(array): array with shape [h, w, c]
-        return(tuple):
-            img, (ratio_h, ratio_w)
-        """
-        limit_side_len = self.limit_side_len
-        h, w, _ = img.shape
-
-        # limit the max side
-        if self.limit_type == 'max':
-            if max(h, w) > limit_side_len:
-                if h > w:
-                    ratio = float(limit_side_len) / h
-                else:
-                    ratio = float(limit_side_len) / w
-            else:
-                ratio = 1.
-        else:
-            if min(h, w) < limit_side_len:
-                if h < w:
-                    ratio = float(limit_side_len) / h
-                else:
-                    ratio = float(limit_side_len) / w
-            else:
-                ratio = 1.
-        resize_h = int(h * ratio)
-        resize_w = int(w * ratio)
-
-        resize_h = int(round(resize_h / 32) * 32)
-        resize_w = int(round(resize_w / 32) * 32)
-
-        try:
-            if int(resize_w) <= 0 or int(resize_h) <= 0:
-                return None, (None, None)
-            img = cv2.resize(img, (int(resize_w), int(resize_h)))
-        except:
-            print(img.shape, resize_w, resize_h)
-            sys.exit(0)
-        ratio_h = resize_h / float(h)
-        ratio_w = resize_w / float(w)
-        # return img, np.array([h, w])
-        return img, [ratio_h, ratio_w]
-
-    def resize_image_type2(self, img):
-        h, w, _ = img.shape
-
-        resize_w = w
-        resize_h = h
-
-        # Fix the longer side
-        if resize_h > resize_w:
-            ratio = float(self.resize_long) / resize_h
-        else:
-            ratio = float(self.resize_long) / resize_w
-
-        resize_h = int(resize_h * ratio)
-        resize_w = int(resize_w * ratio)
-
-        max_stride = 128
-        resize_h = (resize_h + max_stride - 1) // max_stride * max_stride
-        resize_w = (resize_w + max_stride - 1) // max_stride * max_stride
-        img = cv2.resize(img, (int(resize_w), int(resize_h)))
-        ratio_h = resize_h / float(h)
-        ratio_w = resize_w / float(w)
-
-        return img, [ratio_h, ratio_w]
-
-
-class BaseRecLabelDecode(object):
-    """ Convert between text-label and text-index """
-
-    def __init__(self, config):
-        support_character_type = [
-            'ch', 'en', 'EN_symbol', 'french', 'german', 'japan', 'korean',
-            'it', 'xi', 'pu', 'ru', 'ar', 'ta', 'ug', 'fa', 'ur', 'rs', 'oc',
-            'rsc', 'bg', 'uk', 'be', 'te', 'ka', 'chinese_cht', 'hi', 'mr',
-            'ne', 'EN'
-        ]
-        character_type = config['character_type']
-        character_dict_path = config['character_dict_path']
-        use_space_char = True
-        assert character_type in support_character_type, "Only {} are supported now but get {}".format(
-            support_character_type, character_type)
-
-        self.beg_str = "sos"
-        self.end_str = "eos"
-
-        if character_type == "en":
-            self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
-            dict_character = list(self.character_str)
-        elif character_type == "EN_symbol":
-            # same with ASTER setting (use 94 char).
-            self.character_str = string.printable[:-6]
-            dict_character = list(self.character_str)
-        elif character_type in support_character_type:
-            self.character_str = ""
-            assert character_dict_path is not None, "character_dict_path should not be None when character_type is {}".format(
-                character_type)
-            with open(character_dict_path, "rb") as fin:
-                lines = fin.readlines()
-                for line in lines:
-                    line = line.decode('utf-8').strip("\n").strip("\r\n")
-                    self.character_str += line
-            if use_space_char:
-                self.character_str += " "
-            dict_character = list(self.character_str)
-
-        else:
-            raise NotImplementedError
-        self.character_type = character_type
-        dict_character = self.add_special_char(dict_character)
-        self.dict = {}
-        for i, char in enumerate(dict_character):
-            self.dict[char] = i
-        self.character = dict_character
-
-    def add_special_char(self, dict_character):
-        return dict_character
-
-    def decode(self, text_index, text_prob=None, is_remove_duplicate=False):
-        """ convert text-index into text-label. """
-        result_list = []
-        ignored_tokens = self.get_ignored_tokens()
-        batch_size = len(text_index)
-        for batch_idx in range(batch_size):
-            char_list = []
-            conf_list = []
-            for idx in range(len(text_index[batch_idx])):
-                if text_index[batch_idx][idx] in ignored_tokens:
-                    continue
-                if is_remove_duplicate:
-                    # only for predict
-                    if idx > 0 and text_index[batch_idx][idx - 1] == text_index[
-                            batch_idx][idx]:
-                        continue
-                char_list.append(self.character[int(text_index[batch_idx][
-                    idx])])
-                if text_prob is not None:
-                    conf_list.append(text_prob[batch_idx][idx])
-                else:
-                    conf_list.append(1)
-            text = ''.join(char_list)
-            result_list.append((text, np.mean(conf_list)))
-        return result_list
-
-    def get_ignored_tokens(self):
-        return [0]  # for ctc blank
-
-
-class CTCLabelDecode(BaseRecLabelDecode):
-    """ Convert between text-label and text-index """
-
-    def __init__(
-            self,
-            config,
-            #character_dict_path=None,
-            #character_type='ch',
-            #use_space_char=False,
-            **kwargs):
-        super(CTCLabelDecode, self).__init__(config)
-
-    def __call__(self, preds, label=None, *args, **kwargs):
-        if isinstance(preds, paddle.Tensor):
-            preds = preds.numpy()
-        preds_idx = preds.argmax(axis=2)
-        preds_prob = preds.max(axis=2)
-        text = self.decode(preds_idx, preds_prob, is_remove_duplicate=True)
-        if label is None:
-            return text
-        label = self.decode(label)
-        return text, label
-
-    def add_special_char(self, dict_character):
-        dict_character = ['blank'] + dict_character
-        return dict_character
-
-
-class CharacterOps(object):
-    """ Convert between text-label and text-index """
-
-    def __init__(self, config):
-        self.character_type = config['character_type']
-        self.loss_type = config['loss_type']
-        if self.character_type == "en":
-            self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
-            dict_character = list(self.character_str)
-        elif self.character_type == "ch":
-            character_dict_path = config['character_dict_path']
-            self.character_str = ""
-            with open(character_dict_path, "rb") as fin:
-                lines = fin.readlines()
-                for line in lines:
-                    line = line.decode('utf-8').strip("\n").strip("\r\n")
-                    self.character_str += line
-            dict_character = list(self.character_str)
-        elif self.character_type == "en_sensitive":
-            # same with ASTER setting (use 94 char).
-            self.character_str = string.printable[:-6]
-            dict_character = list(self.character_str)
-        else:
-            self.character_str = None
-        assert self.character_str is not None, \
-            "Nonsupport type of the character: {}".format(self.character_str)
-        self.beg_str = "sos"
-        self.end_str = "eos"
-        if self.loss_type == "attention":
-            dict_character = [self.beg_str, self.end_str] + dict_character
-        self.dict = {}
-        for i, char in enumerate(dict_character):
-            self.dict[char] = i
-        self.character = dict_character
-
-    def encode(self, text):
-        """convert text-label into text-index.
-        input:
-            text: text labels of each image. [batch_size]
-
-        output:
-            text: concatenated text index for CTCLoss.
-                    [sum(text_lengths)] = [text_index_0 + text_index_1 + ... + text_index_(n - 1)]
-            length: length of each text. [batch_size]
-        """
-        if self.character_type == "en":
-            text = text.lower()
-
-        text_list = []
-        for char in text:
-            if char not in self.dict:
-                continue
-            text_list.append(self.dict[char])
-        text = np.array(text_list)
-        return text
-
-    def decode(self, text_index, is_remove_duplicate=False):
-        """ convert text-index into text-label. """
-        char_list = []
-        char_num = self.get_char_num()
-
-        if self.loss_type == "attention":
-            beg_idx = self.get_beg_end_flag_idx("beg")
-            end_idx = self.get_beg_end_flag_idx("end")
-            ignored_tokens = [beg_idx, end_idx]
-        else:
-            ignored_tokens = [char_num]
-
-        for idx in range(len(text_index)):
-            if text_index[idx] in ignored_tokens:
-                continue
-            if is_remove_duplicate:
-                if idx > 0 and text_index[idx - 1] == text_index[idx]:
-                    continue
-            char_list.append(self.character[text_index[idx]])
-        text = ''.join(char_list)
-        return text
-
-    def get_char_num(self):
-        return len(self.character)
-
-    def get_beg_end_flag_idx(self, beg_or_end):
-        if self.loss_type == "attention":
-            if beg_or_end == "beg":
-                idx = np.array(self.dict[self.beg_str])
-            elif beg_or_end == "end":
-                idx = np.array(self.dict[self.end_str])
-            else:
-                assert False, "Unsupport type %s in get_beg_end_flag_idx"\
-                    % beg_or_end
-            return idx
-        else:
-            err = "error in get_beg_end_flag_idx when using the loss %s"\
-                % (self.loss_type)
-            assert False, err
-
-
-class OCRReader(object):
-    def __init__(self,
-                 algorithm="CRNN",
-                 image_shape=[3, 32, 320],
-                 char_type="ch",
-                 batch_num=1,
-                 char_dict_path="./ppocr_keys_v1.txt"):
-        self.rec_image_shape = image_shape
-        self.character_type = char_type
-        self.rec_batch_num = batch_num
-        char_ops_params = {}
-        char_ops_params["character_type"] = char_type
-        char_ops_params["character_dict_path"] = char_dict_path
-        char_ops_params['loss_type'] = 'ctc'
-        self.char_ops = CharacterOps(char_ops_params)
-        self.label_ops = CTCLabelDecode(char_ops_params)
-
-    def resize_norm_img(self, img, max_wh_ratio):
-        imgC, imgH, imgW = self.rec_image_shape
-        if self.character_type == "ch":
-            imgW = int(32 * max_wh_ratio)
-        h = img.shape[0]
-        w = img.shape[1]
-        ratio = w / float(h)
-        if math.ceil(imgH * ratio) > imgW:
-            resized_w = imgW
-        else:
-            resized_w = int(math.ceil(imgH * ratio))
-        resized_image = cv2.resize(img, (resized_w, imgH))
-        resized_image = resized_image.astype('float32')
-        resized_image = resized_image.transpose((2, 0, 1)) / 255
-        resized_image -= 0.5
-        resized_image /= 0.5
-        padding_im = np.zeros((imgC, imgH, imgW), dtype=np.float32)
-
-        padding_im[:, :, 0:resized_w] = resized_image
-        return padding_im
-
-    def preprocess(self, img_list):
-        img_num = len(img_list)
-        norm_img_batch = []
-        max_wh_ratio = 0
-        for ino in range(img_num):
-            h, w = img_list[ino].shape[0:2]
-            wh_ratio = w * 1.0 / h
-            max_wh_ratio = max(max_wh_ratio, wh_ratio)
-
-        for ino in range(img_num):
-            norm_img = self.resize_norm_img(img_list[ino], max_wh_ratio)
-            norm_img = norm_img[np.newaxis, :]
-            norm_img_batch.append(norm_img)
-        norm_img_batch = np.concatenate(norm_img_batch)
-        norm_img_batch = norm_img_batch.copy()
-
-        return norm_img_batch[0]
-
-    def postprocess_old(self, outputs, with_score=False):
-        rec_res = []
-        rec_idx_lod = outputs["ctc_greedy_decoder_0.tmp_0.lod"]
-        rec_idx_batch = outputs["ctc_greedy_decoder_0.tmp_0"]
-        if with_score:
-            predict_lod = outputs["softmax_0.tmp_0.lod"]
-        for rno in range(len(rec_idx_lod) - 1):
-            beg = rec_idx_lod[rno]
-            end = rec_idx_lod[rno + 1]
-            if isinstance(rec_idx_batch, list):
-                rec_idx_tmp = [x[0] for x in rec_idx_batch[beg:end]]
-            else:  #nd array
-                rec_idx_tmp = rec_idx_batch[beg:end, 0]
-            preds_text = self.char_ops.decode(rec_idx_tmp)
-            if with_score:
-                beg = predict_lod[rno]
-                end = predict_lod[rno + 1]
-                if isinstance(outputs["softmax_0.tmp_0"], list):
-                    outputs["softmax_0.tmp_0"] = np.array(outputs[
-                        "softmax_0.tmp_0"]).astype(np.float32)
-                probs = outputs["softmax_0.tmp_0"][beg:end, :]
-                ind = np.argmax(probs, axis=1)
-                blank = probs.shape[1]
-                valid_ind = np.where(ind != (blank - 1))[0]
-                score = np.mean(probs[valid_ind, ind[valid_ind]])
-                rec_res.append([preds_text, score])
-            else:
-                rec_res.append([preds_text])
-        return rec_res
-
-    def postprocess(self, outputs, with_score=False):
-        preds = outputs["save_infer_model/scale_0.tmp_1"]
-        try:
-            preds = preds.numpy()
-        except:
-            pass
-        preds_idx = preds.argmax(axis=2)
-        preds_prob = preds.max(axis=2)
-        text = self.label_ops.decode(
-            preds_idx, preds_prob, is_remove_duplicate=True)
-        return text
diff --git a/deploy/pdserving/pipeline_http_client.py b/deploy/pdserving/pipeline_http_client.py
deleted file mode 100644
index 88c4a81ea8bbed80d37b5fbfea6bf01b38f9613a..0000000000000000000000000000000000000000
--- a/deploy/pdserving/pipeline_http_client.py
+++ /dev/null
@@ -1,40 +0,0 @@
-# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import numpy as np
-import requests
-import json
-import base64
-import os
-
-
-def cv2_to_base64(image):
-    return base64.b64encode(image).decode('utf8')
-
-
-url = "http://127.0.0.1:9999/ocr/prediction"
-test_img_dir = "../doc/imgs/"
-for idx, img_file in enumerate(os.listdir(test_img_dir)):
-    with open(os.path.join(test_img_dir, img_file), 'rb') as file:
-        image_data1 = file.read()
-
-    image = cv2_to_base64(image_data1)
-
-    for i in range(1):
-        data = {"key": ["image"], "value": [image]}
-        r = requests.post(url=url, data=json.dumps(data))
-        print(r.json())
-
-test_img_dir = "../doc/imgs/"
-print("==> total number of test imgs: ", len(os.listdir(test_img_dir)))
diff --git a/deploy/pdserving/pipeline_rpc_client.py b/deploy/pdserving/pipeline_rpc_client.py
deleted file mode 100644
index 7471f7ed6c1254d550bcf2c19f6ee7c610a2e20e..0000000000000000000000000000000000000000
--- a/deploy/pdserving/pipeline_rpc_client.py
+++ /dev/null
@@ -1,42 +0,0 @@
-# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-try:
-    from paddle_serving_server_gpu.pipeline import PipelineClient
-except ImportError:
-    from paddle_serving_server.pipeline import PipelineClient
-import numpy as np
-import requests
-import json
-import cv2
-import base64
-import os
-
-client = PipelineClient()
-client.connect(['127.0.0.1:18090'])
-
-
-def cv2_to_base64(image):
-    return base64.b64encode(image).decode('utf8')
-
-
-test_img_dir = "imgs/"
-for img_file in os.listdir(test_img_dir):
-    with open(os.path.join(test_img_dir, img_file), 'rb') as file:
-        image_data = file.read()
-    image = cv2_to_base64(image_data)
-
-for i in range(1):
-    ret = client.predict(feed_dict={"image": image}, fetch=["res"])
-    print(ret)
-    #print(ret)
diff --git a/deploy/pdserving/web_service.py b/deploy/pdserving/web_service.py
deleted file mode 100644
index b47ef65d09dd7aad0e4d00ca852a5c32161ad45b..0000000000000000000000000000000000000000
--- a/deploy/pdserving/web_service.py
+++ /dev/null
@@ -1,127 +0,0 @@
-# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-try:
-    from paddle_serving_server_gpu.web_service import WebService, Op
-except ImportError:
-    from paddle_serving_server.web_service import WebService, Op
-
-import logging
-import numpy as np
-import cv2
-import base64
-# from paddle_serving_app.reader import OCRReader
-from ocr_reader import OCRReader, DetResizeForTest
-from paddle_serving_app.reader import Sequential, ResizeByFactor
-from paddle_serving_app.reader import Div, Normalize, Transpose
-from paddle_serving_app.reader import DBPostProcess, FilterBoxes, GetRotateCropImage, SortedBoxes
-
-_LOGGER = logging.getLogger()
-
-
-class DetOp(Op):
-    def init_op(self):
-        self.det_preprocess = Sequential([
-            DetResizeForTest(), Div(255),
-            Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]), Transpose(
-                (2, 0, 1))
-        ])
-        self.filter_func = FilterBoxes(10, 10)
-        self.post_func = DBPostProcess({
-            "thresh": 0.3,
-            "box_thresh": 0.5,
-            "max_candidates": 1000,
-            "unclip_ratio": 1.5,
-            "min_size": 3
-        })
-
-    def preprocess(self, input_dicts, data_id, log_id):
-        (_, input_dict), = input_dicts.items()
-        data = base64.b64decode(input_dict["image"].encode('utf8'))
-        data = np.fromstring(data, np.uint8)
-        # Note: class variables(self.var) can only be used in process op mode
-        im = cv2.imdecode(data, cv2.IMREAD_COLOR)
-        self.im = im
-        self.ori_h, self.ori_w, _ = im.shape
-
-        det_img = self.det_preprocess(self.im)
-        _, self.new_h, self.new_w = det_img.shape
-        print("det image shape", det_img.shape)
-        return {"x": det_img[np.newaxis, :].copy()}, False, None, ""
-
-    def postprocess(self, input_dicts, fetch_dict, log_id):
-        print("input_dicts: ", input_dicts)
-        det_out = fetch_dict["save_infer_model/scale_0.tmp_1"]
-        ratio_list = [
-            float(self.new_h) / self.ori_h, float(self.new_w) / self.ori_w
-        ]
-        dt_boxes_list = self.post_func(det_out, [ratio_list])
-        dt_boxes = self.filter_func(dt_boxes_list[0], [self.ori_h, self.ori_w])
-        out_dict = {"dt_boxes": dt_boxes, "image": self.im}
-
-        print("out dict", out_dict["dt_boxes"])
-        return out_dict, None, ""
-
-
-class RecOp(Op):
-    def init_op(self):
-        self.ocr_reader = OCRReader(
-            char_dict_path="../../ppocr/utils/ppocr_keys_v1.txt")
-
-        self.get_rotate_crop_image = GetRotateCropImage()
-        self.sorted_boxes = SortedBoxes()
-
-    def preprocess(self, input_dicts, data_id, log_id):
-        (_, input_dict), = input_dicts.items()
-        im = input_dict["image"]
-        dt_boxes = input_dict["dt_boxes"]
-        dt_boxes = self.sorted_boxes(dt_boxes)
-        feed_list = []
-        img_list = []
-        max_wh_ratio = 0
-        for i, dtbox in enumerate(dt_boxes):
-            boximg = self.get_rotate_crop_image(im, dt_boxes[i])
-            img_list.append(boximg)
-            h, w = boximg.shape[0:2]
-            wh_ratio = w * 1.0 / h
-            max_wh_ratio = max(max_wh_ratio, wh_ratio)
-        _, w, h = self.ocr_reader.resize_norm_img(img_list[0],
-                                                  max_wh_ratio).shape
-
-        imgs = np.zeros((len(img_list), 3, w, h)).astype('float32')
-        for id, img in enumerate(img_list):
-            norm_img = self.ocr_reader.resize_norm_img(img, max_wh_ratio)
-            imgs[id] = norm_img
-        print("rec image shape", imgs.shape)
-        feed = {"x": imgs.copy()}
-        return feed, False, None, ""
-
-    def postprocess(self, input_dicts, fetch_dict, log_id):
-        rec_res = self.ocr_reader.postprocess(fetch_dict, with_score=True)
-        res_lst = []
-        for res in rec_res:
-            res_lst.append(res[0])
-        res = {"res": str(res_lst)}
-        return res, None, ""
-
-
-class OcrService(WebService):
-    def get_pipeline_response(self, read_op):
-        det_op = DetOp(name="det", input_ops=[read_op])
-        rec_op = RecOp(name="rec", input_ops=[det_op])
-        return rec_op
-
-
-uci_service = OcrService(name="ocr")
-uci_service.prepare_pipeline_config("config.yml")
-uci_service.run_service()
diff --git a/deploy/slim/prune/README.md b/deploy/slim/prune/README.md
deleted file mode 100644
index d9675c5a3cfc281a3a2af69b364bd528597096d7..0000000000000000000000000000000000000000
--- a/deploy/slim/prune/README.md
+++ /dev/null
@@ -1,64 +0,0 @@
-
-## Introduction
-
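-A more complex model generally performs better, but it also carries some redundancy. Model pruning reduces this redundancy by removing sub-models from the network, which lowers the model's computational complexity and improves inference performance.
-This tutorial describes how to compress PaddleOCR models with PaddleSlim, the PaddlePaddle model compression library.
-[PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim) integrates model pruning, quantization (both quantization training and offline quantization), distillation, neural architecture search, and other widely used, leading model compression techniques; see the project if you are interested.
-
-
-Before starting this tutorial, it is recommended to read:
-1. [How to train PaddleOCR models](../../../doc/doc_ch/quickstart.md)
-2. [Model pruning tutorial](https://github.com/PaddlePaddle/PaddleSlim/blob/release%2F2.0.0/docs/zh_cn/tutorials/pruning/dygraph/filter_pruning.md)
-
-
-## Quick start
-
-Model pruning consists of four main steps:
-1. Install PaddleSlim
-2. Prepare a trained model
-3. Sensitivity analysis and pruning training
-4. Export the model and deploy it for prediction
-
-### 1. Install PaddleSlim
-
-```bash
-git clone https://github.com/PaddlePaddle/PaddleSlim.git
-cd PaddleSlim
-git checkout develop
-python3 setup.py install
-```
-
-### 2. Get a pretrained model
-Model pruning needs to load a previously trained model. PaddleOCR also provides a series of [models](../../../doc/doc_ch/models_list.md); developers can choose one of them or use their own model as needed.
-
-### 3. Sensitivity analysis training
-
-After the pretrained model is loaded, sensitivity analysis is run on each network layer of the model and produces a sensitivity file, sen.pickle. The file can be loaded through the [interface](https://github.com/PaddlePaddle/PaddleSlim/blob/9b01b195f0c4bc34a1ab434751cb260e13d64d9e/paddleslim/dygraph/prune/filter_pruner.py#L75) provided by PaddleSlim to obtain the accuracy loss of each layer at different pruning ratios. This shows how redundant each layer is and guides the choice of a pruning ratio for each layer.
-Format of the sensitivity file:
-    sen.pickle(Dict){
-        'layer_weight_name_0': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss}
-        'layer_weight_name_1': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss}
-    }
-
-    Example:
-        {
-            'conv10_expand_weights': {0.1: 0.006509952684312718, 0.2: 0.01827734339798862, 0.3: 0.014528405644659832, 0.6: 0.06536008804270439, 0.8: 0.11798612250664964, 0.7: 0.12391408417493704, 0.4: 0.030615754498018757, 0.5: 0.047105205602406594}
-            'conv10_linear_weights': {0.1: 0.05113190831455035, 0.2: 0.07705573833558801, 0.3: 0.12096721757739311, 0.6: 0.5135061352930738, 0.8: 0.7908166677143281, 0.7: 0.7272187676899062, 0.4: 0.1819252083008504, 0.5: 0.3728054727792405}
-        }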
-Loading the sensitivity file returns a dict. Its keys are the names of the model's parameters, and each value is a dict holding the pruning sensitivity information for the corresponding layer. In the example, pruning 10% of the filters of the layer corresponding to conv10_expand_weights reduces model performance by 0.65% relative to the original model. For details, see [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/algo/algo.md#2-%E5%8D%B7%E7%A7%AF%E6%A0%B8%E5%89%AA%E8%A3%81%E5%8E%9F%E7%90%86)
-
-From the PaddleOCR root directory, run sensitivity analysis training on the model with the following command:
-```bash
-python3.7 deploy/slim/prune/sensitivity_anal.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o Global.pretrain_weights="your trained model"
-```
-
-### 4. Export the model and deploy it for prediction
-
-Once the model saved by pruning training is available, it can be exported as an inference_model:
-```bash
-python3.7 deploy/slim/prune/export_prune_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o Global.pretrain_weights=./output/det_db/best_accuracy Global.save_inference_dir=inference_model
-```
-
-For prediction and deployment of the inference model, see:
-1. [inference model Python prediction](../../../doc/doc_ch/inference.md)
-2. [inference model C++ prediction](../../cpp_infer/readme.md)
diff --git a/deploy/slim/prune/README_en.md b/deploy/slim/prune/README_en.md
deleted file mode 100644
index 70cfd580b30dde2070e27cd3512f54f222acfaed..0000000000000000000000000000000000000000
--- a/deploy/slim/prune/README_en.md
+++ /dev/null
@@ -1,71 +0,0 @@
-
-## Introduction
-
-Generally, a more complex model achieves better performance on the task, but it also leads to some redundancy in the model. Model pruning is a technique that reduces this redundancy by removing sub-models from the neural network, so as to reduce model calculation complexity and improve model inference performance.
-
-This example uses the [pruning APIs](https://paddlepaddle.github.io/PaddleSlim/api/prune_api/) provided by PaddleSlim to compress the OCR model.
-[PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim) is an open source library that integrates model pruning, quantization (including quantization training and offline quantization), distillation, neural network architecture search, and many other commonly used and leading model compression techniques in the industry.
-
-It is recommended that you read the following pages before this example:
-1. [PaddleOCR training methods](../../../doc/doc_ch/quickstart.md)
-2. [The demo of prune](https://github.com/PaddlePaddle/PaddleSlim/blob/release%2F2.0.0/docs/zh_cn/tutorials/pruning/dygraph/filter_pruning.md)
-
-## Quick start
-
-Four steps for OCR model pruning:
-1. Install PaddleSlim
-2. Prepare the trained model
-3. Sensitivity analysis and pruning training
-4. Export the model and deploy it for prediction
-
-### 1. Install PaddleSlim
-
-```bash
-git clone https://github.com/PaddlePaddle/PaddleSlim.git
-cd PaddleSlim
-git checkout develop
-python3 setup.py install
-```
-
-
-### 2. Download Pretrained Model
-Model pruning needs to load pre-trained models.
-PaddleOCR also provides a series of [models](../../../doc/doc_en/models_list_en.md). Developers can choose one of these models or use their own, according to their needs.
-
-
-### 3. Pruning sensitivity analysis
-
-  After the pre-trained model is loaded, sensitivity analysis is performed on each network layer of the model to measure the redundancy of each layer, and the result is saved to a sensitivity file named sen.pickle. After that, the user can load the sensitivity file via the [methods provided by PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/prune/sensitive.py#L221) and determine the pruning ratio of each network layer automatically. For specific details of sensitivity analysis, see: [Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md)
-  The data format of the sensitivity file:
-      sen.pickle(Dict){
-          'layer_weight_name_0': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss}
-          'layer_weight_name_1': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss}
-      }
-
-      example:
-          {
-              'conv10_expand_weights': {0.1: 0.006509952684312718, 0.2: 0.01827734339798862, 0.3: 0.014528405644659832, 0.6: 0.06536008804270439, 0.8: 0.11798612250664964, 0.7: 0.12391408417493704, 0.4: 0.030615754498018757, 0.5: 0.047105205602406594}
-              'conv10_linear_weights': {0.1: 0.05113190831455035, 0.2: 0.07705573833558801, 0.3: 0.12096721757739311, 0.6: 0.5135061352930738, 0.8: 0.7908166677143281, 0.7: 0.7272187676899062, 0.4: 0.1819252083008504, 0.5: 0.3728054727792405}
-          }
-  Loading the sensitivity file returns a dict. The keys of the dict are the names of the parameters in each layer, and each value holds the pruning sensitivity information of the corresponding layer. In the example, pruning 10% of the filters of the layer corresponding to conv10_expand_weights would lead to a 0.65% degradation of model performance. For details, see: [Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/algo/algo.md#2-%E5%8D%B7%E7%A7%AF%E6%A0%B8%E5%89%AA%E8%A3%81%E5%8E%9F%E7%90%86)
-
-
-Enter the PaddleOCR root directory and perform sensitivity analysis on the model with the following command:
-
-```bash
-
-python3.7 deploy/slim/prune/sensitivity_anal.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o Global.pretrain_weights="your trained model"
-
-```
-
-
-### 4. Export inference model and deploy it
-
-We can export the pruned model as inference_model for deployment:
-```bash
-python deploy/slim/prune/export_prune_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o Global.pretrain_weights=./output/det_db/best_accuracy Global.test_batch_size_per_card=1 Global.save_inference_dir=inference_model
-```
-
-Reference for prediction and deployment of inference model:
-1. [inference model python prediction](../../../doc/doc_en/inference_en.md)
-2. [inference model C++ prediction](../../cpp_infer/readme_en.md)
diff --git a/deploy/slim/prune/export_prune_model.py b/deploy/slim/prune/export_prune_model.py
deleted file mode 100644
index 29f7d211df7b2ad02bf2229f0be81c3cbe005503..0000000000000000000000000000000000000000
--- a/deploy/slim/prune/export_prune_model.py
+++ /dev/null
@@ -1,125 +0,0 @@
-# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-
-import os
-import sys
-
-__dir__ = os.path.dirname(__file__)
-sys.path.append(__dir__)
-sys.path.append(os.path.join(__dir__, '..', '..', '..'))
-sys.path.append(os.path.join(__dir__, '..', '..', '..', 'tools'))
-
-import paddle
-from ppocr.data import build_dataloader
-from ppocr.modeling.architectures import build_model
-
-from ppocr.postprocess import build_post_process
-from ppocr.metrics import build_metric
-from ppocr.utils.save_load import init_model
-import tools.program as program
-
-
-def main(config, device, logger, vdl_writer):
-
-    global_config = config['Global']
-
-    # build dataloader
-    valid_dataloader = build_dataloader(config, 'Eval', device, logger)
-
-    # build post process
-    post_process_class = build_post_process(config['PostProcess'],
-                                            global_config)
-
-    # build model
-    # for rec algorithm
-    if hasattr(post_process_class, 'character'):
-        char_num = len(getattr(post_process_class, 'character'))
-        config['Architecture']["Head"]['out_channels'] = char_num
-    model = build_model(config['Architecture'])
-
-    flops = paddle.flops(model, [1, 3, 640, 640])
-    logger.info(f"FLOPs before pruning: {flops}")
-
-    from paddleslim.dygraph import FPGMFilterPruner
-    model.train()
-    pruner = FPGMFilterPruner(model, [1, 3, 640, 640])
-
-    # build metric
-    eval_class = build_metric(config['Metric'])
-
-    def eval_fn():
-        metric = program.eval(model, valid_dataloader, post_process_class,
-                              eval_class)
-        logger.info(f"metric['hmean']: {metric['hmean']}")
-        return metric['hmean']
-
-    params_sensitive = pruner.sensitive(
-        eval_func=eval_fn,
-        sen_file="./sen.pickle",
-        skip_vars=[
-            "conv2d_57.w_0", "conv2d_transpose_2.w_0", "conv2d_transpose_3.w_0"
-        ])
-
-    logger.info(
-        "The sensitivity analysis results of model parameters saved in sen.pickle"
-    )
-    # calculate pruned params's ratio
-    params_sensitive = pruner._get_ratios_by_loss(params_sensitive, loss=0.02)
-    for key in params_sensitive.keys():
-        logger.info(f"{key}, {params_sensitive[key]}")
-
-    plan = pruner.prune_vars(params_sensitive, [0])
-
-    flops = paddle.flops(model, [1, 3, 640, 640])
-    logger.info(f"FLOPs after pruning: {flops}")
-
-    # load pretrain model
-    pre_best_model_dict = init_model(config, model, logger, None)
-    metric = program.eval(model, valid_dataloader, post_process_class,
-                          eval_class)
-    logger.info(f"metric['hmean']: {metric['hmean']}")
-
-    # start export model
-    from paddle.jit import to_static
-
-    infer_shape = [3, -1, -1]
-    if config['Architecture']['model_type'] == "rec":
-        infer_shape = [3, 32, -1]  # for rec model, H must be 32
-
-        if 'Transform' in config['Architecture'] and config['Architecture'][
-                'Transform'] is not None and config['Architecture'][
-                    'Transform']['name'] == 'TPS':
-            logger.info(
-                'When there is tps in the network, variable length input is not supported, and the input size needs to be the same as during training'
-            )
-            infer_shape[-1] = 100
-    model = to_static(
-        model,
-        input_spec=[
-            paddle.static.InputSpec(
-                shape=[None] + infer_shape, dtype='float32')
-        ])
-
-    save_path = '{}/inference'.format(config['Global']['save_inference_dir'])
-    paddle.jit.save(model, save_path)
-    logger.info('inference model is saved to {}'.format(save_path))
-
-
-if __name__ == '__main__':
-    config, device, logger, vdl_writer = program.preprocess(is_train=True)
-    main(config, device, logger, vdl_writer)
diff --git a/deploy/slim/prune/sensitivity_anal.py b/deploy/slim/prune/sensitivity_anal.py
deleted file mode 100644
index bd2b96497221fd886c83b9401cc8ed2a1a201a50..0000000000000000000000000000000000000000
--- a/deploy/slim/prune/sensitivity_anal.py
+++ /dev/null
@@ -1,146 +0,0 @@
-# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-
-import os
-import sys
-
-__dir__ = os.path.dirname(__file__)
-sys.path.append(__dir__)
-sys.path.append(os.path.join(__dir__, '..', '..', '..'))
-sys.path.append(os.path.join(__dir__, '..', '..', '..', 'tools'))
-
-import paddle
-import paddle.distributed as dist
-from ppocr.data import build_dataloader
-from ppocr.modeling.architectures import build_model
-from ppocr.losses import build_loss
-from ppocr.optimizer import build_optimizer
-from ppocr.postprocess import build_post_process
-from ppocr.metrics import build_metric
-from ppocr.utils.save_load import init_model
-import tools.program as program
-
-dist.get_world_size()
-
-
-def get_pruned_params(parameters):
-    params = []
-
-    for param in parameters:
-        if len(
-                param.shape
-        ) == 4 and 'depthwise' not in param.name and 'transpose' not in param.name and "conv2d_57" not in param.name and "conv2d_56" not in param.name:
-            params.append(param.name)
-    return params
-
-
-def main(config, device, logger, vdl_writer):
-    # init dist environment
-    if config['Global']['distributed']:
-        dist.init_parallel_env()
-
-    global_config = config['Global']
-
-    # build dataloader
-    train_dataloader = build_dataloader(config, 'Train', device, logger)
-    if config['Eval']:
-        valid_dataloader = build_dataloader(config, 'Eval', device, logger)
-    else:
-        valid_dataloader = None
-
-    # build post process
-    post_process_class = build_post_process(config['PostProcess'],
-                                            global_config)
-
-    # build model
-    # for rec algorithm
-    if hasattr(post_process_class, 'character'):
-        char_num = len(getattr(post_process_class, 'character'))
-        config['Architecture']["Head"]['out_channels'] = char_num
-    model = build_model(config['Architecture'])
-
-    flops = paddle.flops(model, [1, 3, 640, 640])
-    logger.info(f"FLOPs before pruning: {flops}")
-
-    from paddleslim.dygraph import FPGMFilterPruner
-    model.train()
-    pruner = FPGMFilterPruner(model, [1, 3, 640, 640])
-
-    # build loss
-    loss_class = build_loss(config['Loss'])
-
-    # build optim
-    optimizer, lr_scheduler = build_optimizer(
-        config['Optimizer'],
-        epochs=config['Global']['epoch_num'],
-        step_each_epoch=len(train_dataloader),
-        parameters=model.parameters())
-
-    # build metric
-    eval_class = build_metric(config['Metric'])
-    # load pretrain model
-    pre_best_model_dict = init_model(config, model, logger, optimizer)
-
-    logger.info('train dataloader has {} iters, valid dataloader has {} iters'.
-                format(len(train_dataloader), len(valid_dataloader)))
-    # build metric
-    eval_class = build_metric(config['Metric'])
-
-    logger.info('train dataloader has {} iters, valid dataloader has {} iters'.
-                format(len(train_dataloader), len(valid_dataloader)))
-
-    def eval_fn():
-        metric = program.eval(model, valid_dataloader, post_process_class,
-                              eval_class)
-        logger.info(f"metric['hmean']: {metric['hmean']}")
-        return metric['hmean']
-
-    params_sensitive = pruner.sensitive(
-        eval_func=eval_fn,
-        sen_file="./sen.pickle",
-        skip_vars=[
-            "conv2d_57.w_0", "conv2d_transpose_2.w_0", "conv2d_transpose_3.w_0"
-        ])
-
-    logger.info(
-        "The sensitivity analysis results of model parameters saved in sen.pickle"
-    )
-    # calculate pruned params's ratio
-    params_sensitive = pruner._get_ratios_by_loss(params_sensitive, loss=0.02)
-    for key in params_sensitive.keys():
-        logger.info(f"{key}, {params_sensitive[key]}")
-
-    plan = pruner.prune_vars(params_sensitive, [0])
-    for param in model.parameters():
-        if ("weights" in param.name and "conv" in param.name) or (
-                "w_0" in param.name and "conv2d" in param.name):
-            logger.info(f"{param.name}: {param.shape}")
-
-    flops = paddle.flops(model, [1, 3, 640, 640])
-    logger.info(f"FLOPs after pruning: {flops}")
-
-    # start train
-
-    program.train(config, train_dataloader, valid_dataloader, device, model,
-                  loss_class, optimizer, lr_scheduler, post_process_class,
-                  eval_class, pre_best_model_dict, logger, vdl_writer)
-
-
-if __name__ == '__main__':
-    config, device, logger, vdl_writer = program.preprocess(is_train=True)
-    main(config, device, logger, vdl_writer)
diff --git a/deploy/slim/quantization/README.md b/deploy/slim/quantization/README.md
deleted file mode 100644
index 4ac3f7c3016c9ef53724ad6f7745507cef3580a8..0000000000000000000000000000000000000000
--- a/deploy/slim/quantization/README.md
+++ /dev/null
@@ -1,61 +0,0 @@
-
-## Introduction
-A more complex model generally performs better, but it also carries some redundancy. Model quantization reduces this redundancy by converting full-precision values to fixed-point numbers, which lowers the model's computational complexity and improves inference performance.
-With almost no loss of accuracy, quantization converts FP32 model parameters to Int8, shrinking the parameters and speeding up computation, so a quantized model has a speed advantage when deployed on mobile and similar devices.
-
-This tutorial describes how to compress PaddleOCR models with PaddleSlim, the PaddlePaddle model compression library.
-[PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim) integrates model pruning, quantization (both quantization training and offline quantization), distillation, neural architecture search, and other widely used, leading model compression techniques; see the project if you are interested.
-
-Before starting this tutorial, it is recommended to read [How to train PaddleOCR models](../../../doc/doc_ch/quickstart.md) and the [PaddleSlim](https://paddleslim.readthedocs.io/zh_CN/latest/index.html) documentation.
-
-
-## Quick start
-Quantization is mostly used to deploy lightweight models on mobile devices. After a model is trained, quantization can be used to further reduce its size and speed up prediction.
-
-Model quantization consists of five main steps:
-1. Install PaddleSlim
-2. Prepare a trained model
-3. Quantization training
-4. Export the quantized inference model
-5. Deploy the quantized model for prediction
-
-### 1. Install PaddleSlim
-
-```bash
-git clone https://github.com/PaddlePaddle/PaddleSlim.git
-cd PaddleSlim
-python setup.py install
-```
-
-### 2. Prepare a trained model
-
-PaddleOCR provides a series of trained [models](../../../doc/doc_ch/models_list.md). If the model to be quantized is not in the list, train it with the [regular training](../../../doc/doc_ch/quickstart.md) method first.
-
-### 3. Quantization training
-Quantization training includes offline quantization training and online quantization training. Online quantization training gives better results and requires loading a pretrained model; once the quantization strategy is defined, the model can be quantized.
-
-
-The quantization training code is in slim/quantization/quant.py. For example, to train a detection model, run:
-```bash
-python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights='your trained model' Global.save_model_dir=./output/quant_model
-
-# For example, download the provided trained model
-wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar
-tar -xf ch_ppocr_mobile_v2.0_det_train.tar
-python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./ch_ppocr_mobile_v2.0_det_train/best_accuracy Global.save_model_dir=./output/quant_model
-
-```
-To run quantization training for a recognition model, just change the configuration file and the loaded model parameters.
-
-### 4. Export the model
-
-Once the model saved by quantization training is available, it can be exported as an inference_model for prediction and deployment:
-
-```bash
-python deploy/slim/quantization/export_model.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=output/quant_model/best_accuracy Global.save_inference_dir=./output/quant_inference_model
-```
-
-### 5. Deploy the quantized model
-
-The parameters of the quantized model exported above are still stored in FP32 precision, but their numerical range is int8. The exported model can be converted with PaddleLite's opt model conversion tool.
-For deploying the quantized model, see [Mobile model deployment](../../lite/readme.md)
diff --git a/deploy/slim/quantization/README_en.md b/deploy/slim/quantization/README_en.md
deleted file mode 100644
index 36407a2bb58ee3a36afc211ca7a8f0d786d1714f..0000000000000000000000000000000000000000
--- a/deploy/slim/quantization/README_en.md
+++ /dev/null
@@ -1,68 +0,0 @@
-
-## Introduction
-
-Generally, a more complex model achieves better performance on the task, but it also leads to some redundancy in the model.
-Quantization is a technique that reduces this redundancy by reducing full-precision data to fixed-point numbers,
-so as to reduce model calculation complexity and improve model inference performance.
-
-This example uses the [quantization APIs](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/) provided by PaddleSlim to compress the OCR model.
-
-It is recommended that you read the following pages before this example:
-- [The training strategy of OCR model](../../../doc/doc_en/quickstart_en.md)
-- [PaddleSlim Document](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/)
-
-## Quick Start
-Quantization is mostly suitable for the deployment of lightweight models on mobile terminals.
-After training, if you want to further compress the model size and accelerate the prediction, you can use quantization methods to compress the model according to the following steps.
-
-1. Install PaddleSlim
-2. Prepare trained model
-3. Quantization-Aware Training
-4. Export inference model
-5. Deploy quantization inference model
-
-
-### 1. Install PaddleSlim
-
-```bash
-git clone https://github.com/PaddlePaddle/PaddleSlim.git
-cd PaddleSlim
-python setup.py install
-```
-
-
-### 2. Download Pretrained Model
-PaddleOCR provides a series of trained [models](../../../doc/doc_en/models_list_en.md).
-If the model to be quantized is not in the list, you need to follow the [Regular Training](../../../doc/doc_en/quickstart_en.md) method to get the trained model.
-
-
-### 3. Quant-Aware Training
-Quantization training includes offline quantization training and online quantization training.
-Online quantization training is more effective. It requires loading the pre-trained model.
-After the quantization strategy is defined, the model can be quantized.
-
-The code for quantization training is located in `slim/quantization/quant.py`. For example, to train a detection model, the training instructions are as follows:
-```bash
-python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights='your trained model' Global.save_model_dir=./output/quant_model
-
-# download provided model
-wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar
-tar -xf ch_ppocr_mobile_v2.0_det_train.tar
-python deploy/slim/quantization/quant.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./ch_ppocr_mobile_v2.0_det_train/best_accuracy Global.save_model_dir=./output/quant_model
-
-```
-
-
-### 4. Export inference model
-
-After getting the model saved by quantization training, we can export it as inference_model for prediction and deployment:
-
-```bash
-python deploy/slim/quantization/export_model.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=output/quant_model/best_accuracy Global.save_inference_dir=./output/quant_inference_model
-```
-
-### 5. Deploy
-The parameters of the quantized model exported by the above steps are still stored in FP32 precision, but their numerical range is int8.
-The exported model can be converted with the `opt` tool of PaddleLite.
-
-For quantized model deployment, please refer to [Mobile terminal model deployment](../../lite/readme_en.md)
diff --git a/deploy/slim/quantization/export_model.py b/deploy/slim/quantization/export_model.py
deleted file mode 100755
index 100b107a1deb1ce9932c9cefa50659c060f5803e..0000000000000000000000000000000000000000
--- a/deploy/slim/quantization/export_model.py
+++ /dev/null
@@ -1,118 +0,0 @@
-# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import os
-import sys
-
-__dir__ = os.path.dirname(os.path.abspath(__file__))
-sys.path.append(__dir__)
-sys.path.append(os.path.abspath(os.path.join(__dir__, '..', '..', '..')))
-sys.path.append(
-    os.path.abspath(os.path.join(__dir__, '..', '..', '..', 'tools')))
-
-import argparse
-
-import paddle
-from paddle.jit import to_static
-
-from ppocr.modeling.architectures import build_model
-from ppocr.postprocess import build_post_process
-from ppocr.utils.save_load import init_model
-from ppocr.utils.logging import get_logger
-from tools.program import load_config, merge_config, ArgsParser
-from ppocr.metrics import build_metric
-import tools.program as program
-from paddleslim.dygraph.quant import QAT
-from ppocr.data import build_dataloader
-
-
-def main():
-    ############################################################################################################
-    # 1. quantization configs
-    ############################################################################################################
-    quant_config = {
-        # weight preprocess type, default is None and no preprocessing is performed.
-        'weight_preprocess_type': None,
-        # activation preprocess type, default is None and no preprocessing is performed.
-        'activation_preprocess_type': None,
-        # weight quantize type, default is 'channel_wise_abs_max'
-        'weight_quantize_type': 'channel_wise_abs_max',
-        # activation quantize type, default is 'moving_average_abs_max'
-        'activation_quantize_type': 'moving_average_abs_max',
-        # weight quantize bit num, default is 8
-        'weight_bits': 8,
-        # activation quantize bit num, default is 8
-        'activation_bits': 8,
-        # data type after quantization, such as 'uint8', 'int8', etc. default is 'int8'
-        'dtype': 'int8',
-        # window size for 'range_abs_max' quantization. default is 10000
-        'window_size': 10000,
-        # The decay coefficient of moving average, default is 0.9
-        'moving_rate': 0.9,
-        # for dygraph quantization, layers of type in quantizable_layer_type will be quantized
-        'quantizable_layer_type': ['Conv2D', 'Linear'],
-    }
-    FLAGS = ArgsParser().parse_args()
-    config = load_config(FLAGS.config)
-    merge_config(FLAGS.opt)
-    logger = get_logger()
-    # build post process
-
-    post_process_class = build_post_process(config['PostProcess'],
-                                            config['Global'])
-
-    # build model
-    # for rec algorithm
-    if hasattr(post_process_class, 'character'):
-        char_num = len(getattr(post_process_class, 'character'))
-        config['Architecture']["Head"]['out_channels'] = char_num
-    model = build_model(config['Architecture'])
-
-    # get QAT model
-    quanter = QAT(config=quant_config)
-    quanter.quantize(model)
-
-    init_model(config, model, logger)
-    model.eval()
-
-    # build metric
-    eval_class = build_metric(config['Metric'])
-
-    # build dataloader
-    valid_dataloader = build_dataloader(config, 'Eval', device, logger)
-
-    # start eval
-    metric = program.eval(model, valid_dataloader, post_process_class,
-                          eval_class)
-    logger.info('metric eval ***************')
-    for k, v in metric.items():
-        logger.info('{}:{}'.format(k, v))
-
-    save_path = '{}/inference'.format(config['Global']['save_inference_dir'])
-    infer_shape = [3, 32, 100] if config['Architecture'][
-        'model_type'] != "det" else [3, 640, 640]
-
-    quanter.save_quantized_model(
-        model,
-        save_path,
-        input_spec=[
-            paddle.static.InputSpec(
-                shape=[None] + infer_shape, dtype='float32')
-        ])
-    logger.info('inference QAT model is saved to {}'.format(save_path))
-
-
-if __name__ == "__main__":
-    config, device, logger, vdl_writer = program.preprocess()
-    main()
diff --git a/deploy/slim/quantization/quant.py b/deploy/slim/quantization/quant.py
deleted file mode 100755
index 7671e5f871ce6769fc51876d1fa2e5f0af63d904..0000000000000000000000000000000000000000
--- a/deploy/slim/quantization/quant.py
+++ /dev/null
@@ -1,166 +0,0 @@
-# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-
-import os
-import sys
-
-__dir__ = os.path.dirname(os.path.abspath(__file__))
-sys.path.append(__dir__)
-sys.path.append(os.path.abspath(os.path.join(__dir__, '..', '..', '..')))
-sys.path.append(
-    os.path.abspath(os.path.join(__dir__, '..', '..', '..', 'tools')))
-
-import yaml
-import paddle
-import paddle.distributed as dist
-
-paddle.seed(2)
-
-from ppocr.data import build_dataloader
-from ppocr.modeling.architectures import build_model
-from ppocr.losses import build_loss
-from ppocr.optimizer import build_optimizer
-from ppocr.postprocess import build_post_process
-from ppocr.metrics import build_metric
-from ppocr.utils.save_load import init_model
-import tools.program as program
-from paddleslim.dygraph.quant import QAT
-
-dist.get_world_size()
-
-
-class PACT(paddle.nn.Layer):
-    def __init__(self):
-        super(PACT, self).__init__()
-        alpha_attr = paddle.ParamAttr(
-            name=self.full_name() + ".pact",
-            initializer=paddle.nn.initializer.Constant(value=20),
-            learning_rate=1.0,
-            regularizer=paddle.regularizer.L2Decay(2e-5))
-
-        self.alpha = self.create_parameter(
-            shape=[1], attr=alpha_attr, dtype='float32')
-
-    def forward(self, x):
-        out_left = paddle.nn.functional.relu(x - self.alpha)
-        out_right = paddle.nn.functional.relu(-self.alpha - x)
-        x = x - out_left + out_right
-        return x
-
-
-quant_config = {
-    # weight preprocess type, default is None and no preprocessing is performed.
-    'weight_preprocess_type': None,
-    # activation preprocess type, default is None and no preprocessing is performed.
-    'activation_preprocess_type': None,
-    # weight quantize type, default is 'channel_wise_abs_max'
-    'weight_quantize_type': 'channel_wise_abs_max',
-    # activation quantize type, default is 'moving_average_abs_max'
-    'activation_quantize_type': 'moving_average_abs_max',
-    # weight quantize bit num, default is 8
-    'weight_bits': 8,
-    # activation quantize bit num, default is 8
-    'activation_bits': 8,
-    # data type after quantization, such as 'uint8', 'int8', etc. default is 'int8'
-    'dtype': 'int8',
-    # window size for 'range_abs_max' quantization. default is 10000
-    'window_size': 10000,
-    # The decay coefficient of moving average, default is 0.9
-    'moving_rate': 0.9,
-    # for dygraph quantization, layers of type in quantizable_layer_type will be quantized
-    'quantizable_layer_type': ['Conv2D', 'Linear'],
-}
-
-
-def main(config, device, logger, vdl_writer):
-    # init dist environment
-    if config['Global']['distributed']:
-        dist.init_parallel_env()
-
-    global_config = config['Global']
-
-    # build dataloader
-    train_dataloader = build_dataloader(config, 'Train', device, logger)
-    if config['Eval']:
-        valid_dataloader = build_dataloader(config, 'Eval', device, logger)
-    else:
-        valid_dataloader = None
-
-    # build post process
-    post_process_class = build_post_process(config['PostProcess'],
-                                            global_config)
-
-    # build model
-    # for rec algorithm
-    if hasattr(post_process_class, 'character'):
-        char_num = len(getattr(post_process_class, 'character'))
-        config['Architecture']["Head"]['out_channels'] = char_num
-    model = build_model(config['Architecture'])
-
-    # prepare to quant
-    quanter = QAT(config=quant_config, act_preprocess=PACT)
-    quanter.quantize(model)
-
-    if config['Global']['distributed']:
-        model = paddle.DataParallel(model)
-
-    # build loss
-    loss_class = build_loss(config['Loss'])
-
-    # build optim
-    optimizer, lr_scheduler = build_optimizer(
-        config['Optimizer'],
-        epochs=config['Global']['epoch_num'],
-        step_each_epoch=len(train_dataloader),
-        parameters=model.parameters())
-
-    # build metric
-    eval_class = build_metric(config['Metric'])
-    # load pretrain model
-    pre_best_model_dict = init_model(config, model, logger, optimizer)
-
-    logger.info('train dataloader has {} iters, valid dataloader has {} iters'.
-                format(len(train_dataloader), len(valid_dataloader)))
-    # start train
-    program.train(config, train_dataloader, valid_dataloader, device, model,
-                  loss_class, optimizer, lr_scheduler, post_process_class,
-                  eval_class, pre_best_model_dict, logger, vdl_writer)
-
-
-def test_reader(config, device, logger):
-    loader = build_dataloader(config, 'Train', device, logger)
-    import time
-    starttime = time.time()
-    count = 0
-    try:
-        for data in loader():
-            count += 1
-            if count % 1 == 0:
-                batch_time = time.time() - starttime
-                starttime = time.time()
-                logger.info("reader: {}, {}, {}".format(
-                    count, len(data[0]), batch_time))
-    except Exception as e:
-        logger.info(e)
-    logger.info("finish reader: {}, Success!".format(count))
-
-
-if __name__ == '__main__':
-    config, device, logger, vdl_writer = program.preprocess(is_train=True)
-    main(config, device, logger, vdl_writer)
-    # test_reader(config, device, logger)