diff --git a/PPOCRLabel/README.md b/PPOCRLabel/README.md
index 9e5b3245b0cfb56d300155a94f64d38edcdbb599..7a936543277e2e1d5687681ac78eca96f5f3f400 100644
--- a/PPOCRLabel/README.md
+++ b/PPOCRLabel/README.md
@@ -21,12 +21,9 @@ PPOCRLabel is a semi-automatic graphic annotation tool suitable for OCR field, w
- Click to modify the recognition result.(If you can't change the result, please switch to the system default input method, or switch back to the original input method again)
- 2020.12.18: Support re-recognition of a single label box (by [ninetailskim](https://github.com/ninetailskim) ), perfect shortcut keys.
-### TODO:
-- Lock box mode: For the same scene data, the size and position of the locked detection box can be transferred between different pictures.
+## 1. Installation
-## Installation
-
-### 1. Environment Preparation
+### 1.1 Environment Preparation
#### **Install PaddlePaddle 2.0**
@@ -66,7 +63,7 @@ If you getting this error `OSError: [WinError 126] The specified module could no
Reference: [Solve shapely installation on windows](https://stackoverflow.com/questions/44398265/install-shapely-oserror-winerror-126-the-specified-module-could-not-be-found)
-### 2. Install PPOCRLabel
+### 1.2 Install PPOCRLabel
#### Windows
@@ -94,9 +91,9 @@ cd ./PPOCRLabel # Change the directory to the PPOCRLabel folder
python3 PPOCRLabel.py
```
-## Usage
+## 2. Usage
-### Steps
+### 2.1 Steps
1. Build and launch using the instructions above.
@@ -140,9 +137,9 @@ python3 PPOCRLabel.py
| rec_gt.txt | The recognition label file, which can be directly used for PPOCR identification model training, is generated after the user clicks on the menu bar "File"-"Export recognition result". |
| crop_img | The recognition data, generated at the same time with *rec_gt.txt* |
-## Explanation
+## 3. Explanation
-### Shortcut keys
+### 3.1 Shortcut keys
| Shortcut keys | Description |
| ------------------------ | ------------------------------------------------ |
@@ -162,31 +159,37 @@ python3 PPOCRLabel.py
| Ctrl-- | Zoom out |
| ↑→↓← | Move selected box |
-### Built-in Model
+### 3.2 Built-in Model
- Default model: PPOCRLabel uses the Chinese and English ultra-lightweight OCR model in PaddleOCR by default, supports Chinese, English and number recognition, and multiple language detection.
- Model language switching: Changing the built-in model language is supportable by clicking "PaddleOCR"-"Choose OCR Model" in the menu bar. Currently supported languagesinclude French, German, Korean, and Japanese.
For specific model download links, please refer to [PaddleOCR Model List](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/models_list_en.md#multilingual-recognition-modelupdating)
-- Custom model: The model trained by users can be replaced by modifying PPOCRLabel.py in [PaddleOCR class instantiation](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/PPOCRLabel/PPOCRLabel.py#L110) referring [Custom Model Code](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/whl_en.md#use-custom-model)
+- **Custom Model**: If users want to replace the built-in model with their own inference model, they can follow the [Custom Model Code Usage](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/doc/doc_en/whl_en.md#31-use-by-code) by modifying PPOCRLabel.py for [Instantiation of PaddleOCR class](https://github.com/PaddlePaddle/PaddleOCR/blob/release/ 2.3/PPOCRLabel/PPOCRLabel.py#L116) :
+
+ add parameter `det_model_dir` in `self.ocr = PaddleOCR(use_pdserving=False, use_angle_cls=True, det=True, cls=True, use_gpu=gpu, lang=lang) `
-### Export Label Result
+### 3.3 Export Label Result
PPOCRLabel supports three ways to export Label.txt
- Automatically export: After selecting "File - Auto Export Label Mode", the program will automatically write the annotations into Label.txt every time the user confirms an image. If this option is not turned on, it will be automatically exported after detecting that the user has manually checked 5 images.
+
+ > The automatically export mode is turned off by default
+
- Manual export: Click "File-Export Marking Results" to manually export the label.
+
- Close application export
-### Export Partial Recognition Results
+### 3.4 Export Partial Recognition Results
-For some data that are difficult to recognize, the recognition results will not be exported by **unchecking** the corresponding tags in the recognition results checkbox.
+For some data that are difficult to recognize, the recognition results will not be exported by **unchecking** the corresponding tags in the recognition results checkbox. The unchecked recognition result is saved as `True` in the `difficult` variable in the label file `label.txt`.
-*Note: The status of the checkboxes in the recognition results still needs to be saved manually by clicking Save Button.*
+> *Note: The status of the checkboxes in the recognition results still needs to be saved manually by clicking Save Button.*
-### Error message
+### 3.5 Error message
- If paddleocr is installed with whl, it has a higher priority than calling PaddleOCR class with paddleocr.py, which may cause an exception if whl package is not updated.
diff --git a/PPOCRLabel/README_ch.md b/PPOCRLabel/README_ch.md
index 7f9351dfe185be2417162f2c786f5eec0b58816a..17bb95c08267bfac5e2cecde2e1d88c4a7cb1b60 100644
--- a/PPOCRLabel/README_ch.md
+++ b/PPOCRLabel/README_ch.md
@@ -21,16 +21,12 @@ PPOCRLabel是一款适用于OCR领域的半自动化图形标注工具,内置P
- 识别结果更改为单击修改。(如果无法修改,请切换为系统自带输入法,或再次切回原输入法)
- 2020.12.18: 支持对单个标记框进行重新识别(by [ninetailskim](https://github.com/ninetailskim)),完善快捷键。
-#### 尽请期待
-
-- 锁定框模式:针对同一场景数据,被锁定的检测框的大小与位置能在不同图片之间传递。
-
如果您对以上内容感兴趣或对完善工具有不一样的想法,欢迎加入我们的SIG队伍与我们共同开发。可以在[此处](https://github.com/PaddlePaddle/PaddleOCR/issues/1728)完成问卷和前置任务,经过我们确认相关内容后即可正式加入,享受SIG福利,共同为OCR开源事业贡献(特别说明:针对PPOCRLabel的改进也属于PaddleOCR前置任务)
-## 安装
+## 1. 安装
-### 1. 环境搭建
+### 1.1 环境搭建
#### 安装PaddlePaddle
```bash
@@ -67,7 +63,7 @@ pip3 install -r requirements.txt
注意,windows环境下,建议从[这里](https://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely)下载shapely安装包完成安装, 直接通过pip安装的shapely库可能出现`[winRrror 126] 找不到指定模块的问题`。
-### 2. 安装PPOCRLabel
+### 1.2 安装PPOCRLabel
#### Windows
@@ -95,11 +91,9 @@ cd ./PPOCRLabel # 将目录切换到PPOCRLabel文件夹下
python3 PPOCRLabel.py --lang ch
```
+## 2. 使用
-
-## 使用
-
-### 操作步骤
+### 2.1 操作步骤
1. 安装与运行:使用上述命令安装与运行程序。
2. 打开文件夹:在菜单栏点击 “文件” - "打开目录" 选择待标记图片的文件夹[1].
@@ -130,9 +124,9 @@ python3 PPOCRLabel.py --lang ch
| rec_gt.txt | 识别标签。可直接用于PPOCR识别模型训练。需用户手动点击菜单栏“文件” - "导出识别结果"后产生。 |
| crop_img | 识别数据。按照检测框切割后的图片。与rec_gt.txt同时产生。 |
-## 说明
+## 3. 说明
-### 快捷键
+### 3.1 快捷键
| 快捷键 | 说明 |
| ---------------- | ---------------------------- |
@@ -152,29 +146,35 @@ python3 PPOCRLabel.py --lang ch
| Ctrl-- | 放大 |
| ↑→↓← | 移动标记框 |
-### 内置模型
+### 3.2 内置模型
- 默认模型:PPOCRLabel默认使用PaddleOCR中的中英文超轻量OCR模型,支持中英文与数字识别,多种语言检测。
- 模型语言切换:用户可通过菜单栏中 "PaddleOCR" - "选择模型" 切换内置模型语言,目前支持的语言包括法文、德文、韩文、日文。具体模型下载链接可参考[PaddleOCR模型列表](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/models_list.md).
- - 自定义模型:用户可根据[自定义模型代码使用](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/whl.md#%E8%87%AA%E5%AE%9A%E4%B9%89%E6%A8%A1%E5%9E%8B),通过修改PPOCRLabel.py中针对[PaddleOCR类的实例化](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/PPOCRLabel/PPOCRLabel.py#L110)替换成自己训练的模型。
+ - **自定义模型**:如果用户想将内置模型更换为自己的推理模型,可根据[自定义模型代码使用](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/whl.md#%E8%87%AA%E5%AE%9A%E4%B9%89%E6%A8%A1%E5%9E%8B),通过修改PPOCRLabel.py中针对[PaddleOCR类的实例化](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.3/PPOCRLabel/PPOCRLabel.py#L116) :
-### 导出标记结果
+ `self.ocr = PaddleOCR(use_pdserving=False, use_angle_cls=True, det=True, cls=True, use_gpu=gpu, lang=lang) `,在 `det_model_dir` 中传入 自己的模型即可。
+
+### 3.3 导出标记结果
PPOCRLabel支持三种导出方式:
- 自动导出:点击“文件 - 自动导出标记结果”后,用户每确认过一张图片,程序自动将标记结果写入Label.txt中。若未开启此选项,则检测到用户手动确认过5张图片后进行自动导出。
+
+ > 默认情况下自动导出功能为关闭状态
+
- 手动导出:点击“文件 - 导出标记结果”手动导出标记。
+
- 关闭应用程序导出
-### 导出部分识别结果
+### 3.4 导出部分识别结果
-针对部分难以识别的数据,通过在识别结果的复选框中**取消勾选**相应的标记,其识别结果不会被导出。
+针对部分难以识别的数据,通过在识别结果的复选框中**取消勾选**相应的标记,其识别结果不会被导出。被取消勾选的识别结果在标记文件 `label.txt` 中的 `difficult` 变量保存为 `True` 。
-*注意:识别结果中的复选框状态仍需用户手动点击确认后才能保留*
+> *注意:识别结果中的复选框状态仍需用户手动点击确认后才能保留*
-### 错误提示
+### 3.5 错误提示
- 如果同时使用whl包安装了paddleocr,其优先级大于通过paddleocr.py调用PaddleOCR类,whl包未更新时会导致程序异常。
- PPOCRLabel**不支持对中文文件名**的图片进行自动标注。
@@ -194,6 +194,6 @@ PPOCRLabel支持三种导出方式:
pip install opencv-contrib-python-headless==4.2.0.32
```
-### 参考资料
+### 4. 参考资料
1.[Tzutalin. LabelImg. Git code (2015)](https://github.com/tzutalin/labelImg)
diff --git a/README.md b/README.md
index c19493b07b0e615876404689f4eaac0802dbda60..f6a73a7e15739443b5af7d4a73893f63ae1c0a20 100644
--- a/README.md
+++ b/README.md
@@ -119,7 +119,7 @@ For a new language request, please refer to [Guideline for new language_requests
- [Table Recognition](./ppstructure/table/README.md)
- Academic Circles
- [Two-stage Algorithm](./doc/doc_en/algorithm_overview_en.md)
- - [PGNet Algorithm](./doc/doc_en/algorithm_overview_en.md)
+ - [PGNet Algorithm](./doc/doc_en/pgnet_en.md)
- [Python Inference](./doc/doc_en/inference_en.md)
- Data Annotation and Synthesis
- [Semi-automatic Annotation Tool: PPOCRLabel](./PPOCRLabel/README.md)
diff --git a/README_ch.md b/README_ch.md
index 7e088e30116a4dd636b044fcc55169972ef04eb6..7091523d2f6e24dd4256aad9e6a978a6929f9443 100755
--- a/README_ch.md
+++ b/README_ch.md
@@ -109,15 +109,16 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力
- [PP-Structure信息提取](./ppstructure/README_ch.md)
- [版面分析](./ppstructure/layout/README_ch.md)
- [表格识别](./ppstructure/table/README_ch.md)
+- OCR学术圈
+ - [两阶段模型介绍与下载](./doc/doc_ch/algorithm_overview.md)
+ - [端到端PGNet算法](./doc/doc_ch/pgnet.md)
+ - [基于Python脚本预测引擎推理](./doc/doc_ch/inference.md)
+ - [使用PaddleOCR架构添加新算法](./doc/doc_ch/add_new_algorithm.md)
- 数据标注与合成
- [半自动标注工具PPOCRLabel](./PPOCRLabel/README_ch.md)
- [数据合成工具Style-Text](./StyleText/README_ch.md)
- [其它数据标注工具](./doc/doc_ch/data_annotation.md)
- [其它数据合成工具](./doc/doc_ch/data_synthesis.md)
-- OCR学术圈
- - [两阶段模型介绍与下载](./doc/doc_ch/algorithm_overview.md)
- - [端到端PGNet算法](./doc/doc_ch/pgnet.md)
- - [基于Python脚本预测引擎推理](./doc/doc_ch/inference.md)
- 数据集
- [通用中英文OCR数据集](./doc/doc_ch/datasets.md)
- [手写中文OCR数据集](./doc/doc_ch/handwritten_datasets.md)
diff --git a/benchmark/readme.md b/benchmark/readme.md
index 7f7704cca5341d495dfbcdc66ddfd29fbea1e1df..d90d21468e7c9d0c9068a273ae704c0c8a086eab 100644
--- a/benchmark/readme.md
+++ b/benchmark/readme.md
@@ -1,5 +1,5 @@
-# PaddleOCR DB/EAST 算法训练benchmark测试
+# PaddleOCR DB/EAST/PSE 算法训练benchmark测试
PaddleOCR/benchmark目录下的文件用于获取并分析训练日志。
训练采用icdar2015数据集,包括1000张训练图像和500张测试图像。模型配置采用resnet18_vd作为backbone,分别训练batch_size=8和batch_size=16的情况。
@@ -18,7 +18,7 @@ run_det.sh 执行方式如下:
```
# cd PaddleOCR/
-bash benchmark/run_det.sh
+bash benchmark/run_det.sh
```
以DB为例,将得到四个日志文件,如下:
@@ -28,7 +28,3 @@ det_res18_db_v2.0_sp_bs8_fp32_1
det_res18_db_v2.0_mp_bs16_fp32_1
det_res18_db_v2.0_mp_bs8_fp32_1
```
-
-
-
-
diff --git a/benchmark/run_benchmark_det.sh b/benchmark/run_benchmark_det.sh
index 26bcda5d20ba4e4d0498da28aafb93f29468169d..46144b43a09baf787216728eabaab5f8548fa924 100644
--- a/benchmark/run_benchmark_det.sh
+++ b/benchmark/run_benchmark_det.sh
@@ -6,7 +6,7 @@ function _set_params(){
run_mode=${1:-"sp"} # 单卡sp|多卡mp
batch_size=${2:-"64"}
fp_item=${3:-"fp32"} # fp32|fp16
- max_iter=${4:-"500"} # 可选,如果需要修改代码提前中断
+ max_iter=${4:-"10"} # 可选,如果需要修改代码提前中断
model_name=${5:-"model_name"}
run_log_path=${TRAIN_LOG_DIR:-$(pwd)} # TRAIN_LOG_DIR 后续QA设置该参数
@@ -20,7 +20,7 @@ function _train(){
echo "Train on ${num_gpu_devices} GPUs"
echo "current CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES, gpus=$num_gpu_devices, batch_size=$batch_size"
- train_cmd="-c configs/det/${model_name}.yml -o Train.loader.batch_size_per_card=${batch_size} Global.epoch_num=${max_iter} "
+ train_cmd="-c configs/det/${model_name}.yml -o Train.loader.batch_size_per_card=${batch_size} Global.epoch_num=${max_iter} Global.eval_batch_step=[0,20000] Global.print_batch_step=2"
case ${run_mode} in
sp)
train_cmd="python3.7 tools/train.py "${train_cmd}""
@@ -39,18 +39,24 @@ function _train(){
echo -e "${model_name}, SUCCESS"
export job_fail_flag=0
fi
- kill -9 `ps -ef|grep 'python3.7'|awk '{print $2}'`
if [ $run_mode = "mp" -a -d mylog ]; then
rm ${log_file}
cp mylog/workerlog.0 ${log_file}
fi
+}
- # run log analysis
- analysis_cmd="python3.7 benchmark/analysis.py --filename ${log_file} --mission_name ${model_name} --run_mode ${mode} --direction_id 0 --keyword 'ips:' --base_batch_size ${batch_szie} --skip_steps 1 --gpu_num ${num_gpu_devices} --index 1 --model_mode=-1 --ips_unit=samples/sec"
+function _analysis_log(){
+ analysis_cmd="python3.7 benchmark/analysis.py --filename ${log_file} --mission_name ${model_name} --run_mode ${run_mode} --direction_id 0 --keyword 'ips:' --base_batch_size ${batch_size} --skip_steps 1 --gpu_num ${num_gpu_devices} --index 1 --model_mode=-1 --ips_unit=samples/sec"
eval $analysis_cmd
}
+function _kill_process(){
+ kill -9 `ps -ef|grep 'python3.7'|awk '{print $2}'`
+}
+
+
_set_params $@
_train
-
+_analysis_log
+_kill_process
\ No newline at end of file
diff --git a/benchmark/run_det.sh b/benchmark/run_det.sh
index c507510c615a60177e07300976947b010dbae990..68109b3ab2c3b8b61a0c90b4b31fd855c1ba2d46 100644
--- a/benchmark/run_det.sh
+++ b/benchmark/run_det.sh
@@ -3,11 +3,11 @@
# 1 安装该模型需要的依赖 (如需开启优化策略请注明)
python3.7 -m pip install -r requirements.txt
# 2 拷贝该模型需要数据、预训练模型
-wget -c -p ./tain_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015.tar && cd train_data && tar xf icdar2015.tar && cd ../
-wget -c -p ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNet50_vd_pretrained.pdparams
+wget -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015.tar && cd train_data && tar xf icdar2015.tar && cd ../
+wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNet50_vd_pretrained.pdparams
# 3 批量运行(如不方便批量,1,2需放到单个模型中)
-model_mode_list=(det_res18_db_v2.0 det_r50_vd_east)
+model_mode_list=(det_res18_db_v2.0 det_r50_vd_east det_r50_vd_pse)
fp_item_list=(fp32)
bs_list=(8 16)
for model_mode in ${model_mode_list[@]}; do
@@ -15,11 +15,11 @@ for model_mode in ${model_mode_list[@]}; do
for bs_item in ${bs_list[@]}; do
echo "index is speed, 1gpus, begin, ${model_name}"
run_mode=sp
- CUDA_VISIBLE_DEVICES=0 bash benchmark/run_benchmark_det.sh ${run_mode} ${bs_item} ${fp_item} 10 ${model_mode} # (5min)
+ CUDA_VISIBLE_DEVICES=0 bash benchmark/run_benchmark_det.sh ${run_mode} ${bs_item} ${fp_item} 2 ${model_mode} # (5min)
sleep 60
echo "index is speed, 8gpus, run_mode is multi_process, begin, ${model_name}"
run_mode=mp
- CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash benchmark/run_benchmark_det.sh ${run_mode} ${bs_item} ${fp_item} 10 ${model_mode}
+ CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash benchmark/run_benchmark_det.sh ${run_mode} ${bs_item} ${fp_item} 2 ${model_mode}
sleep 60
done
done
diff --git a/configs/det/ch_PP-OCRv2/ch_PP-OCR_det_distill.yml b/configs/det/ch_PP-OCRv2/ch_PP-OCR_det_distill.yml
index 46daeeb86d004772a6fb964d602369dcd53b3a01..d8d5135dd73ee438f76f5796b63e0dae4331402b 100644
--- a/configs/det/ch_PP-OCRv2/ch_PP-OCR_det_distill.yml
+++ b/configs/det/ch_PP-OCRv2/ch_PP-OCR_det_distill.yml
@@ -90,7 +90,7 @@ Optimizer:
PostProcess:
name: DistillationDBPostProcess
- model_name: ["Student", "Student2"]
+ model_name: ["Student"]
key: head_out
thresh: 0.3
box_thresh: 0.6
diff --git a/deploy/cpp_infer/include/ocr_rec.h b/deploy/cpp_infer/include/ocr_rec.h
index d585112b051daff7c03060836a4c065ba6e3949c..ff80ba5299014885fc4c900fb87b5dcc6042744a 100644
--- a/deploy/cpp_infer/include/ocr_rec.h
+++ b/deploy/cpp_infer/include/ocr_rec.h
@@ -44,7 +44,8 @@ public:
const int &gpu_id, const int &gpu_mem,
const int &cpu_math_library_num_threads,
const bool &use_mkldnn, const string &label_path,
- const bool &use_tensorrt, const std::string &precision) {
+ const bool &use_tensorrt, const std::string &precision,
+ const int &rec_batch_num) {
this->use_gpu_ = use_gpu;
this->gpu_id_ = gpu_id;
this->gpu_mem_ = gpu_mem;
@@ -52,6 +53,7 @@ public:
this->use_mkldnn_ = use_mkldnn;
this->use_tensorrt_ = use_tensorrt;
this->precision_ = precision;
+ this->rec_batch_num_ = rec_batch_num;
this->label_list_ = Utility::ReadDict(label_path);
this->label_list_.insert(this->label_list_.begin(),
@@ -64,7 +66,7 @@ public:
// Load Paddle inference model
void LoadModel(const std::string &model_dir);
- void Run(cv::Mat &img, std::vector *times);
+ void Run(std::vector img_list, std::vector *times);
private:
std::shared_ptr predictor_;
@@ -82,10 +84,12 @@ private:
bool is_scale_ = true;
bool use_tensorrt_ = false;
std::string precision_ = "fp32";
+ int rec_batch_num_ = 6;
+
// pre-process
CrnnResizeImg resize_op_;
Normalize normalize_op_;
- Permute permute_op_;
+ PermuteBatch permute_op_;
// post-process
PostProcessor post_processor_;
diff --git a/deploy/cpp_infer/include/preprocess_op.h b/deploy/cpp_infer/include/preprocess_op.h
index ab4c140059fdcaed9872d2d99b4aea57c7e5208f..31217de301573e078f8e11ef88657f369ede9b31 100644
--- a/deploy/cpp_infer/include/preprocess_op.h
+++ b/deploy/cpp_infer/include/preprocess_op.h
@@ -44,6 +44,11 @@ public:
virtual void Run(const cv::Mat *im, float *data);
};
+class PermuteBatch {
+public:
+ virtual void Run(const std::vector imgs, float *data);
+};
+
class ResizeImgType0 {
public:
virtual void Run(const cv::Mat &img, cv::Mat &resize_img, int max_size_len,
diff --git a/deploy/cpp_infer/include/utility.h b/deploy/cpp_infer/include/utility.h
index 678187d3fabfb1c91584226950155b3c47b5f93f..5797559f7550da6bb38b014c46c1492124a9e065 100644
--- a/deploy/cpp_infer/include/utility.h
+++ b/deploy/cpp_infer/include/utility.h
@@ -50,6 +50,9 @@ public:
static cv::Mat GetRotateCropImage(const cv::Mat &srcimage,
std::vector> box);
+
+ static std::vector argsort(const std::vector& array);
+
};
} // namespace PaddleOCR
\ No newline at end of file
diff --git a/deploy/cpp_infer/src/main.cpp b/deploy/cpp_infer/src/main.cpp
index 82a248416f086dd2b90e891a23774c294ed50ae3..b7a199b548beca881e4ab69491adcc9351f52c0f 100644
--- a/deploy/cpp_infer/src/main.cpp
+++ b/deploy/cpp_infer/src/main.cpp
@@ -61,7 +61,7 @@ DEFINE_string(cls_model_dir, "", "Path of cls inference model.");
DEFINE_double(cls_thresh, 0.9, "Threshold of cls_thresh.");
// recognition related
DEFINE_string(rec_model_dir, "", "Path of rec inference model.");
-DEFINE_int32(rec_batch_num, 1, "rec_batch_num.");
+DEFINE_int32(rec_batch_num, 6, "rec_batch_num.");
DEFINE_string(char_list_file, "../../ppocr/utils/ppocr_keys_v1.txt", "Path of dictionary.");
@@ -146,8 +146,9 @@ int main_rec(std::vector cv_all_img_names) {
CRNNRecognizer rec(FLAGS_rec_model_dir, FLAGS_use_gpu, FLAGS_gpu_id,
FLAGS_gpu_mem, FLAGS_cpu_threads,
FLAGS_enable_mkldnn, char_list_file,
- FLAGS_use_tensorrt, FLAGS_precision);
+ FLAGS_use_tensorrt, FLAGS_precision, FLAGS_rec_batch_num);
+ std::vector img_list;
for (int i = 0; i < cv_all_img_names.size(); ++i) {
LOG(INFO) << "The predict img: " << cv_all_img_names[i];
@@ -156,22 +157,21 @@ int main_rec(std::vector cv_all_img_names) {
std::cerr << "[ERROR] image read failed! image path: " << cv_all_img_names[i] << endl;
exit(1);
}
-
- std::vector rec_times;
- rec.Run(srcimg, &rec_times);
-
- time_info[0] += rec_times[0];
- time_info[1] += rec_times[1];
- time_info[2] += rec_times[2];
+ img_list.push_back(srcimg);
}
-
+ std::vector rec_times;
+ rec.Run(img_list, &rec_times);
+ time_info[0] += rec_times[0];
+ time_info[1] += rec_times[1];
+ time_info[2] += rec_times[2];
+
if (FLAGS_benchmark) {
AutoLogger autolog("ocr_rec",
FLAGS_use_gpu,
FLAGS_use_tensorrt,
FLAGS_enable_mkldnn,
FLAGS_cpu_threads,
- 1,
+ FLAGS_rec_batch_num,
"dynamic",
FLAGS_precision,
time_info,
@@ -209,7 +209,7 @@ int main_system(std::vector cv_all_img_names) {
CRNNRecognizer rec(FLAGS_rec_model_dir, FLAGS_use_gpu, FLAGS_gpu_id,
FLAGS_gpu_mem, FLAGS_cpu_threads,
FLAGS_enable_mkldnn, char_list_file,
- FLAGS_use_tensorrt, FLAGS_precision);
+ FLAGS_use_tensorrt, FLAGS_precision, FLAGS_rec_batch_num);
for (int i = 0; i < cv_all_img_names.size(); ++i) {
LOG(INFO) << "The predict img: " << cv_all_img_names[i];
@@ -228,19 +228,22 @@ int main_system(std::vector cv_all_img_names) {
time_info_det[1] += det_times[1];
time_info_det[2] += det_times[2];
- cv::Mat crop_img;
+ std::vector img_list;
for (int j = 0; j < boxes.size(); j++) {
- crop_img = Utility::GetRotateCropImage(srcimg, boxes[j]);
-
- if (cls != nullptr) {
- crop_img = cls->Run(crop_img);
- }
- rec.Run(crop_img, &rec_times);
- time_info_rec[0] += rec_times[0];
- time_info_rec[1] += rec_times[1];
- time_info_rec[2] += rec_times[2];
+ cv::Mat crop_img;
+ crop_img = Utility::GetRotateCropImage(srcimg, boxes[j]);
+ if (cls != nullptr) {
+ crop_img = cls->Run(crop_img);
+ }
+ img_list.push_back(crop_img);
}
+
+ rec.Run(img_list, &rec_times);
+ time_info_rec[0] += rec_times[0];
+ time_info_rec[1] += rec_times[1];
+ time_info_rec[2] += rec_times[2];
}
+
if (FLAGS_benchmark) {
AutoLogger autolog_det("ocr_det",
FLAGS_use_gpu,
@@ -257,7 +260,7 @@ int main_system(std::vector cv_all_img_names) {
FLAGS_use_tensorrt,
FLAGS_enable_mkldnn,
FLAGS_cpu_threads,
- 1,
+ FLAGS_rec_batch_num,
"dynamic",
FLAGS_precision,
time_info_rec,
diff --git a/deploy/cpp_infer/src/ocr_rec.cpp b/deploy/cpp_infer/src/ocr_rec.cpp
index 3739a66ad802fd108df16bbbbe8c8695963b7693..f1a97a99a3d3487988c35766592668ba3f43c784 100644
--- a/deploy/cpp_infer/src/ocr_rec.cpp
+++ b/deploy/cpp_infer/src/ocr_rec.cpp
@@ -15,83 +15,108 @@
#include
namespace PaddleOCR {
-
-void CRNNRecognizer::Run(cv::Mat &img, std::vector *times) {
- cv::Mat srcimg;
- img.copyTo(srcimg);
- cv::Mat resize_img;
-
- float wh_ratio = float(srcimg.cols) / float(srcimg.rows);
- auto preprocess_start = std::chrono::steady_clock::now();
- this->resize_op_.Run(srcimg, resize_img, wh_ratio, this->use_tensorrt_);
-
- this->normalize_op_.Run(&resize_img, this->mean_, this->scale_,
- this->is_scale_);
-
- std::vector input(1 * 3 * resize_img.rows * resize_img.cols, 0.0f);
-
- this->permute_op_.Run(&resize_img, input.data());
- auto preprocess_end = std::chrono::steady_clock::now();
-
- // Inference.
- auto input_names = this->predictor_->GetInputNames();
- auto input_t = this->predictor_->GetInputHandle(input_names[0]);
- input_t->Reshape({1, 3, resize_img.rows, resize_img.cols});
- auto inference_start = std::chrono::steady_clock::now();
- input_t->CopyFromCpu(input.data());
- this->predictor_->Run();
-
- std::vector predict_batch;
- auto output_names = this->predictor_->GetOutputNames();
- auto output_t = this->predictor_->GetOutputHandle(output_names[0]);
- auto predict_shape = output_t->shape();
-
- int out_num = std::accumulate(predict_shape.begin(), predict_shape.end(), 1,
+
+void CRNNRecognizer::Run(std::vector img_list, std::vector *times) {
+ std::chrono::duration preprocess_diff = std::chrono::steady_clock::now() - std::chrono::steady_clock::now();
+ std::chrono::duration inference_diff = std::chrono::steady_clock::now() - std::chrono::steady_clock::now();
+ std::chrono::duration postprocess_diff = std::chrono::steady_clock::now() - std::chrono::steady_clock::now();
+
+ int img_num = img_list.size();
+ std::vector width_list;
+ for (int i = 0; i < img_num; i++) {
+ width_list.push_back(float(img_list[i].cols) / img_list[i].rows);
+ }
+ std::vector indices = Utility::argsort(width_list);
+
+ for (int beg_img_no = 0; beg_img_no < img_num; beg_img_no += this->rec_batch_num_) {
+ auto preprocess_start = std::chrono::steady_clock::now();
+ int end_img_no = min(img_num, beg_img_no + this->rec_batch_num_);
+ float max_wh_ratio = 0;
+ for (int ino = beg_img_no; ino < end_img_no; ino ++) {
+ int h = img_list[indices[ino]].rows;
+ int w = img_list[indices[ino]].cols;
+ float wh_ratio = w * 1.0 / h;
+ max_wh_ratio = max(max_wh_ratio, wh_ratio);
+ }
+ std::vector norm_img_batch;
+ for (int ino = beg_img_no; ino < end_img_no; ino ++) {
+ cv::Mat srcimg;
+ img_list[indices[ino]].copyTo(srcimg);
+ cv::Mat resize_img;
+ this->resize_op_.Run(srcimg, resize_img, max_wh_ratio, this->use_tensorrt_);
+ this->normalize_op_.Run(&resize_img, this->mean_, this->scale_, this->is_scale_);
+ norm_img_batch.push_back(resize_img);
+ }
+
+ int batch_width = int(ceilf(32 * max_wh_ratio)) - 1;
+ std::vector input(this->rec_batch_num_ * 3 * 32 * batch_width, 0.0f);
+ this->permute_op_.Run(norm_img_batch, input.data());
+ auto preprocess_end = std::chrono::steady_clock::now();
+ preprocess_diff += preprocess_end - preprocess_start;
+
+ // Inference.
+ auto input_names = this->predictor_->GetInputNames();
+ auto input_t = this->predictor_->GetInputHandle(input_names[0]);
+ input_t->Reshape({this->rec_batch_num_, 3, 32, batch_width});
+ auto inference_start = std::chrono::steady_clock::now();
+ input_t->CopyFromCpu(input.data());
+ this->predictor_->Run();
+
+ std::vector predict_batch;
+ auto output_names = this->predictor_->GetOutputNames();
+ auto output_t = this->predictor_->GetOutputHandle(output_names[0]);
+ auto predict_shape = output_t->shape();
+
+ int out_num = std::accumulate(predict_shape.begin(), predict_shape.end(), 1,
std::multiplies());
- predict_batch.resize(out_num);
-
- output_t->CopyToCpu(predict_batch.data());
- auto inference_end = std::chrono::steady_clock::now();
-
- // ctc decode
- auto postprocess_start = std::chrono::steady_clock::now();
- std::vector str_res;
- int argmax_idx;
- int last_index = 0;
- float score = 0.f;
- int count = 0;
- float max_value = 0.0f;
-
- for (int n = 0; n < predict_shape[1]; n++) {
- argmax_idx =
- int(Utility::argmax(&predict_batch[n * predict_shape[2]],
- &predict_batch[(n + 1) * predict_shape[2]]));
- max_value =
- float(*std::max_element(&predict_batch[n * predict_shape[2]],
- &predict_batch[(n + 1) * predict_shape[2]]));
-
- if (argmax_idx > 0 && (!(n > 0 && argmax_idx == last_index))) {
- score += max_value;
- count += 1;
- str_res.push_back(label_list_[argmax_idx]);
+ predict_batch.resize(out_num);
+
+ output_t->CopyToCpu(predict_batch.data());
+ auto inference_end = std::chrono::steady_clock::now();
+ inference_diff += inference_end - inference_start;
+
+ // ctc decode
+ auto postprocess_start = std::chrono::steady_clock::now();
+ for (int m = 0; m < predict_shape[0]; m++) {
+ std::vector str_res;
+ int argmax_idx;
+ int last_index = 0;
+ float score = 0.f;
+ int count = 0;
+ float max_value = 0.0f;
+
+ for (int n = 0; n < predict_shape[1]; n++) {
+ argmax_idx =
+ int(Utility::argmax(&predict_batch[(m * predict_shape[1] + n) * predict_shape[2]],
+ &predict_batch[(m * predict_shape[1] + n + 1) * predict_shape[2]]));
+ max_value =
+ float(*std::max_element(&predict_batch[(m * predict_shape[1] + n) * predict_shape[2]],
+ &predict_batch[(m * predict_shape[1] + n + 1) * predict_shape[2]]));
+
+ if (argmax_idx > 0 && (!(n > 0 && argmax_idx == last_index))) {
+ score += max_value;
+ count += 1;
+ str_res.push_back(label_list_[argmax_idx]);
+ }
+ last_index = argmax_idx;
+ }
+ score /= count;
+ if (isnan(score))
+ continue;
+ for (int i = 0; i < str_res.size(); i++) {
+ std::cout << str_res[i];
+ }
+ std::cout << "\tscore: " << score << std::endl;
+ }
+ auto postprocess_end = std::chrono::steady_clock::now();
+ postprocess_diff += postprocess_end - postprocess_start;
}
- last_index = argmax_idx;
- }
- auto postprocess_end = std::chrono::steady_clock::now();
- score /= count;
- for (int i = 0; i < str_res.size(); i++) {
- std::cout << str_res[i];
- }
- std::cout << "\tscore: " << score << std::endl;
-
- std::chrono::duration preprocess_diff = preprocess_end - preprocess_start;
- times->push_back(double(preprocess_diff.count() * 1000));
- std::chrono::duration inference_diff = inference_end - inference_start;
- times->push_back(double(inference_diff.count() * 1000));
- std::chrono::duration postprocess_diff = postprocess_end - postprocess_start;
- times->push_back(double(postprocess_diff.count() * 1000));
+ times->push_back(double(preprocess_diff.count() * 1000));
+ times->push_back(double(inference_diff.count() * 1000));
+ times->push_back(double(postprocess_diff.count() * 1000));
}
+
void CRNNRecognizer::LoadModel(const std::string &model_dir) {
// AnalysisConfig config;
paddle_infer::Config config;
diff --git a/deploy/cpp_infer/src/preprocess_op.cpp b/deploy/cpp_infer/src/preprocess_op.cpp
index 23c51c2008dc7280ce4d6c232ed766dbf2a53226..14e8bd1d8425fa6c539c4f3673ea861e24b3b3c8 100644
--- a/deploy/cpp_infer/src/preprocess_op.cpp
+++ b/deploy/cpp_infer/src/preprocess_op.cpp
@@ -40,6 +40,17 @@ void Permute::Run(const cv::Mat *im, float *data) {
}
}
+void PermuteBatch::Run(const std::vector imgs, float *data) {
+ for (int j = 0; j < imgs.size(); j ++){
+ int rh = imgs[j].rows;
+ int rw = imgs[j].cols;
+ int rc = imgs[j].channels();
+ for (int i = 0; i < rc; ++i) {
+ cv::extractChannel(imgs[j], cv::Mat(rh, rw, CV_32FC1, data + (j * rc + i) * rh * rw), i);
+ }
+ }
+}
+
void Normalize::Run(cv::Mat *im, const std::vector &mean,
const std::vector &scale, const bool is_scale) {
double e = 1.0;
@@ -90,16 +101,17 @@ void CrnnResizeImg::Run(const cv::Mat &img, cv::Mat &resize_img, float wh_ratio,
imgC = rec_image_shape[0];
imgH = rec_image_shape[1];
imgW = rec_image_shape[2];
-
+
imgW = int(32 * wh_ratio);
float ratio = float(img.cols) / float(img.rows);
int resize_w, resize_h;
+
if (ceilf(imgH * ratio) > imgW)
resize_w = imgW;
else
resize_w = int(ceilf(imgH * ratio));
-
+
cv::resize(img, resize_img, cv::Size(resize_w, imgH), 0.f, 0.f,
cv::INTER_LINEAR);
cv::copyMakeBorder(resize_img, resize_img, 0, 0, 0,
diff --git a/deploy/cpp_infer/src/utility.cpp b/deploy/cpp_infer/src/utility.cpp
index dba445b747ff3f3c0d2db91061650c369977c4dd..c3c7b8485520579e8e2a23ae03543e3a9fc821bf 100644
--- a/deploy/cpp_infer/src/utility.cpp
+++ b/deploy/cpp_infer/src/utility.cpp
@@ -147,4 +147,17 @@ cv::Mat Utility::GetRotateCropImage(const cv::Mat &srcimage,
}
}
+std::vector Utility::argsort(const std::vector& array)
+{
+ const int array_len(array.size());
+ std::vector array_index(array_len, 0);
+ for (int i = 0; i < array_len; ++i)
+ array_index[i] = i;
+
+ std::sort(array_index.begin(), array_index.end(),
+ [&array](int pos1, int pos2) {return (array[pos1] < array[pos2]); });
+
+ return array_index;
+}
+
} // namespace PaddleOCR
\ No newline at end of file
diff --git a/deploy/pdserving/README.md b/deploy/pdserving/README.md
index de7965bac752f6bc9cd1de224b791b0a84f0e699..cb2845c581d244e80ca597e0eb485a16ad369f20 100644
--- a/deploy/pdserving/README.md
+++ b/deploy/pdserving/README.md
@@ -114,7 +114,7 @@ The recognition model is the same.
git clone https://github.com/PaddlePaddle/PaddleOCR
# Enter the working directory
- cd PaddleOCR/deploy/pdserver/
+ cd PaddleOCR/deploy/pdserving/
```
The pdserver directory contains the code to start the pipeline service and send prediction requests, including:
diff --git a/deploy/pdserving/README_CN.md b/deploy/pdserving/README_CN.md
index 5106fd9b2d03e6169cf0c723b241cb1385a22906..067be8bbda10d971b709afdf822aea96a979d000 100644
--- a/deploy/pdserving/README_CN.md
+++ b/deploy/pdserving/README_CN.md
@@ -112,7 +112,7 @@ python3 -m paddle_serving_client.convert --dirname ./ch_ppocr_mobile_v2.0_rec_in
git clone https://github.com/PaddlePaddle/PaddleOCR
# 进入到工作目录
- cd PaddleOCR/deploy/pdserver/
+ cd PaddleOCR/deploy/pdserving/
```
pdserver目录包含启动pipeline服务和发送预测请求的代码,包括:
```
@@ -206,7 +206,7 @@ pip3 install paddle-serving-app==0.3.1
1. 启动服务端程序
```
-cd win
+cd win
python3 ocr_web_server.py gpu(使用gpu方式)
或者
python3 ocr_web_server.py cpu(使用cpu方式)
diff --git a/doc/doc_ch/detection.md b/doc/doc_ch/detection.md
index dc50e8388128017d608a4ff38471d24bcb143bf3..166e15fb03b604b63f47b95304ac06c1f4ae9dd2 100644
--- a/doc/doc_ch/detection.md
+++ b/doc/doc_ch/detection.md
@@ -96,15 +96,28 @@ python3 tools/train.py -c configs/det/det_mv3_db.yml \
# 单机多卡训练,通过 --gpus 参数设置使用的GPU ID
python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/det/det_mv3_db.yml \
-o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
+
+# 多机多卡训练,通过 --ips 参数设置使用的机器IP地址,通过 --gpus 参数设置使用的GPU ID
+python3 -m paddle.distributed.launch --ips="10.21.226.181,10.21.226.133" --gpus '0,1,2,3' tools/train.py -c configs/det/det_mv3_db.yml \
+ -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
```
上述指令中,通过-c 选择训练使用configs/det/det_db_mv3.yml配置文件。
有关配置文件的详细解释,请参考[链接](./config.md)。
-
+
您也可以通过-o参数在不需要修改yml文件的情况下,改变训练的参数,比如,调整训练的学习率为0.0001
```shell
python3 tools/train.py -c configs/det/det_mv3_db.yml -o Optimizer.base_lr=0.0001
```
+
+**注意:** 采用多机多卡训练时,需要替换上面命令中的ips值为您机器的地址,机器之间需要能够相互ping通。查看机器ip地址的命令为`ifconfig`。
+
+如果您想进一步加快训练速度,可以使用[自动混合精度训练](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/01_paddle2.0_introduction/basic_concept/amp_cn.html), 以单机单卡为例,命令如下:
+```shell
+python3 tools/train.py -c configs/det/det_mv3_db.yml \
+ -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained \
+ Global.use_amp=True Global.scale_loss=1024.0 Global.use_dynamic_loss_scaling=True
+ ```
## 1.4 断点训练
diff --git a/doc/doc_ch/training.md b/doc/doc_ch/training.md
index c6c7b87d9925197b36a246c651ab7179ff9d2e81..d2deea6f205e0402e8c9c0010d5d8dd3e325fa3b 100644
--- a/doc/doc_ch/training.md
+++ b/doc/doc_ch/training.md
@@ -4,15 +4,16 @@
同时会简单介绍PaddleOCR模型训练数据的组成部分,以及如何在垂类场景中准备数据finetune模型。
-- [1. 基本概念](#基本概念)
- * [1.1 学习率](#学习率)
- * [1.2 正则化](#正则化)
- * [1.3 评估指标](#评估指标)
-- [2. 数据与垂类场景](#数据与垂类场景)
- * [2.1 训练数据](#训练数据)
- * [2.2 垂类场景](#垂类场景)
- * [2.3 自己构建数据集](#自己构建数据集)
-* [3. 常见问题](#常见问题)
+- [1.配置文件说明](#配置文件)
+- [2. 基本概念](#基本概念)
+ * [2.1 学习率](#学习率)
+ * [2.2 正则化](#正则化)
+ * [2.3 评估指标](#评估指标)
+- [3. 数据与垂类场景](#数据与垂类场景)
+ * [3.1 训练数据](#训练数据)
+ * [3.2 垂类场景](#垂类场景)
+ * [3.3 自己构建数据集](#自己构建数据集)
+* [4. 常见问题](#常见问题)
## 1. 基本概念
@@ -23,7 +24,7 @@ OCR(Optical Character Recognition,光学字符识别)是指对图像进行分析
模型调优时需要关注以下参数:
-### 1.1 学习率
+### 2.1 学习率
学习率是训练神经网络的重要超参数之一,它代表在每一次迭代中梯度向损失函数最优解移动的步长。
在PaddleOCR中提供了多种学习率更新策略,可以通过配置文件修改,例如:
@@ -42,7 +43,7 @@ Piecewise 代表分段常数衰减,在不同的学习阶段指定不同的学
warmup_epoch 代表在前5个epoch中,学习率将逐渐从0增加到base_lr。全部策略可以参考代码[learning_rate.py](../../ppocr/optimizer/learning_rate.py) 。
-### 1.2 正则化
+### 2.2 正则化
正则化可以有效的避免算法过拟合,PaddleOCR中提供了L1、L2正则方法,L1 和 L2 正则化是最常用的正则化方法。L1 正则化向目标函数添加正则化项,以减少参数的绝对值总和;而 L2 正则化中,添加正则化项的目的在于减少参数平方的总和。配置方法如下:
@@ -55,7 +56,7 @@ Optimizer:
```
-### 1.3 评估指标
+### 2.3 评估指标
(1)检测阶段:先按照检测框和标注框的IOU评估,IOU大于某个阈值判断为检测准确。这里检测框和标注框不同于一般的通用目标检测框,是采用多边形进行表示。检测准确率:正确的检测框个数在全部检测框的占比,主要是判断检测指标。检测召回率:正确的检测框个数在全部标注框的占比,主要是判断漏检的指标。
@@ -65,10 +66,10 @@ Optimizer:
-## 2. 数据与垂类场景
+## 3. 数据与垂类场景
-### 2.1 训练数据
+### 3.1 训练数据
目前开源的模型,数据集和量级如下:
- 检测:
@@ -83,13 +84,14 @@ Optimizer:
其中,公开数据集都是开源的,用户可自行搜索下载,也可参考[中文数据集](./datasets.md),合成数据暂不开源,用户可使用开源合成工具自行合成,可参考的合成工具包括[text_renderer](https://github.com/Sanster/text_renderer) 、[SynthText](https://github.com/ankush-me/SynthText) 、[TextRecognitionDataGenerator](https://github.com/Belval/TextRecognitionDataGenerator) 等。
-### 2.2 垂类场景
+### 3.2 垂类场景
PaddleOCR主要聚焦通用OCR,如果有垂类需求,您可以用PaddleOCR+垂类数据自己训练;
如果缺少带标注的数据,或者不想投入研发成本,建议直接调用开放的API,开放的API覆盖了目前比较常见的一些垂类。
-### 2.3 自己构建数据集
+
+### 3.3 自己构建数据集
在构建数据集时有几个经验可供参考:
@@ -107,7 +109,7 @@ PaddleOCR主要聚焦通用OCR,如果有垂类需求,您可以用PaddleOCR+
-## 3. 常见问题
+## 4. 常见问题
**Q**:训练CRNN识别时,如何选择合适的网络输入shape?
diff --git a/doc/doc_en/detection_en.md b/doc/doc_en/detection_en.md
index df96fd5336cd64049e8f5d9b898f60c55b82b7b4..68c5691cda9f78c7f805c0f0ecdf82f00534de72 100644
--- a/doc/doc_en/detection_en.md
+++ b/doc/doc_en/detection_en.md
@@ -98,7 +98,19 @@ python3 tools/train.py -c configs/det/det_mv3_db.yml -o \
# multi-GPU training
# Set the GPU ID used by the '--gpus' parameter.
python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
-
+
+# multi-Node, multi-GPU training
+# Set the IPs of your nodes used by the '--ips' parameter. Set the GPU ID used by the '--gpus' parameter.
+python3 -m paddle.distributed.launch --ips="10.21.226.181,10.21.226.133" --gpus '0,1,2,3' tools/train.py -c configs/det/det_mv3_db.yml \
+ -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
+```
+**Note:** For multi-Node multi-GPU training, you need to replace the `ips` value in the preceding command with the address of your machine, and the machines must be able to ping each other. The command for viewing the IP address of the machine is `ifconfig`.
+
+If you want to further speed up the training, you can use [automatic mixed precision training](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/01_paddle2.0_introduction/basic_concept/amp_en.html). for single card training, the command is as follows:
+```
+python3 tools/train.py -c configs/det/det_mv3_db.yml \
+ -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained \
+ Global.use_amp=True Global.scale_loss=1024.0 Global.use_dynamic_loss_scaling=True
```
### 2.2 Load Trained Model and Continue Training
diff --git a/doc/joinus.PNG b/doc/joinus.PNG
index 202ad0a5c6edf2190b71d5a7a544f1df94f866c4..79bd3143d082636dc85a3a6d5f0601a05b4a784f 100644
Binary files a/doc/joinus.PNG and b/doc/joinus.PNG differ
diff --git a/ppocr/data/imaug/east_process.py b/ppocr/data/imaug/east_process.py
index b1d7a5e51939af981dd62c269c930f4bf9ba4179..df08adfa1516c59229e95af193c172dfcdd5af08 100644
--- a/ppocr/data/imaug/east_process.py
+++ b/ppocr/data/imaug/east_process.py
@@ -11,7 +11,10 @@
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
-
+"""
+This code is refered from:
+https://github.com/songdejia/EAST/blob/master/data_utils.py
+"""
import math
import cv2
import numpy as np
@@ -24,10 +27,10 @@ __all__ = ['EASTProcessTrain']
class EASTProcessTrain(object):
def __init__(self,
- image_shape = [512, 512],
- background_ratio = 0.125,
- min_crop_side_ratio = 0.1,
- min_text_size = 10,
+ image_shape=[512, 512],
+ background_ratio=0.125,
+ min_crop_side_ratio=0.1,
+ min_text_size=10,
**kwargs):
self.input_size = image_shape[1]
self.random_scale = np.array([0.5, 1, 2.0, 3.0])
@@ -282,12 +285,7 @@ class EASTProcessTrain(object):
1.0 / max(min(poly_h, poly_w), 1.0)
return score_map, geo_map, training_mask
- def crop_area(self,
- im,
- polys,
- tags,
- crop_background=False,
- max_tries=50):
+ def crop_area(self, im, polys, tags, crop_background=False, max_tries=50):
"""
make random crop from the input image
:param im:
@@ -435,5 +433,4 @@ class EASTProcessTrain(object):
data['score_map'] = score_map
data['geo_map'] = geo_map
data['training_mask'] = training_mask
- # print(im.shape, score_map.shape, geo_map.shape, training_mask.shape)
- return data
\ No newline at end of file
+ return data
diff --git a/ppocr/data/imaug/iaa_augment.py b/ppocr/data/imaug/iaa_augment.py
index 9ce6bd4209034389df04334a83717142ca8c7b40..0aac7877c257f3e7532ca2806891775913d416b7 100644
--- a/ppocr/data/imaug/iaa_augment.py
+++ b/ppocr/data/imaug/iaa_augment.py
@@ -11,6 +11,10 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+"""
+This code is refer from:
+https://github.com/WenmuZhou/DBNet.pytorch/blob/master/data_loader/modules/iaa_augment.py
+"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
diff --git a/ppocr/data/imaug/make_border_map.py b/ppocr/data/imaug/make_border_map.py
index cc2c9034e147eb7bb6a70e43eda4903337a523f0..abab38368db2de84e54b060598fc509a65219296 100644
--- a/ppocr/data/imaug/make_border_map.py
+++ b/ppocr/data/imaug/make_border_map.py
@@ -1,4 +1,20 @@
-# -*- coding:utf-8 -*-
+# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+This code is refer from:
+https://github.com/WenmuZhou/DBNet.pytorch/blob/master/data_loader/modules/make_border_map.py
+"""
from __future__ import absolute_import
from __future__ import division
diff --git a/ppocr/data/imaug/make_pse_gt.py b/ppocr/data/imaug/make_pse_gt.py
index 55abc8970784fd00843d2e91f259c58b65ae8579..255d076bde848d53f3b2fb04e80594872f4ae8c7 100644
--- a/ppocr/data/imaug/make_pse_gt.py
+++ b/ppocr/data/imaug/make_pse_gt.py
@@ -1,4 +1,16 @@
-# -*- coding:utf-8 -*-
+# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
from __future__ import absolute_import
from __future__ import division
@@ -12,12 +24,8 @@ from shapely.geometry import Polygon
__all__ = ['MakePseGt']
-class MakePseGt(object):
- r'''
- Making binary mask from detection data with ICDAR format.
- Typically following the process of class `MakeICDARData`.
- '''
+class MakePseGt(object):
def __init__(self, kernel_num=7, size=640, min_shrink_ratio=0.4, **kwargs):
self.kernel_num = kernel_num
self.min_shrink_ratio = min_shrink_ratio
@@ -38,16 +46,20 @@ class MakePseGt(object):
text_polys *= scale
gt_kernels = []
- for i in range(1,self.kernel_num+1):
+ for i in range(1, self.kernel_num + 1):
# s1->sn, from big to small
- rate = 1.0 - (1.0 - self.min_shrink_ratio) / (self.kernel_num - 1) * i
- text_kernel, ignore_tags = self.generate_kernel(image.shape[0:2], rate, text_polys, ignore_tags)
+ rate = 1.0 - (1.0 - self.min_shrink_ratio) / (self.kernel_num - 1
+ ) * i
+ text_kernel, ignore_tags = self.generate_kernel(
+ image.shape[0:2], rate, text_polys, ignore_tags)
gt_kernels.append(text_kernel)
training_mask = np.ones(image.shape[0:2], dtype='uint8')
for i in range(text_polys.shape[0]):
if ignore_tags[i]:
- cv2.fillPoly(training_mask, text_polys[i].astype(np.int32)[np.newaxis, :, :], 0)
+ cv2.fillPoly(training_mask,
+ text_polys[i].astype(np.int32)[np.newaxis, :, :],
+ 0)
gt_kernels = np.array(gt_kernels)
gt_kernels[gt_kernels > 0] = 1
@@ -59,16 +71,25 @@ class MakePseGt(object):
data['mask'] = training_mask.astype('float32')
return data
- def generate_kernel(self, img_size, shrink_ratio, text_polys, ignore_tags=None):
+ def generate_kernel(self,
+ img_size,
+ shrink_ratio,
+ text_polys,
+ ignore_tags=None):
+ """
+ Refer to part of the code:
+ https://github.com/open-mmlab/mmocr/blob/main/mmocr/datasets/pipelines/textdet_targets/base_textdet_targets.py
+ """
+
h, w = img_size
text_kernel = np.zeros((h, w), dtype=np.float32)
for i, poly in enumerate(text_polys):
polygon = Polygon(poly)
- distance = polygon.area * (1 - shrink_ratio * shrink_ratio) / (polygon.length + 1e-6)
+ distance = polygon.area * (1 - shrink_ratio * shrink_ratio) / (
+ polygon.length + 1e-6)
subject = [tuple(l) for l in poly]
pco = pyclipper.PyclipperOffset()
- pco.AddPath(subject, pyclipper.JT_ROUND,
- pyclipper.ET_CLOSEDPOLYGON)
+ pco.AddPath(subject, pyclipper.JT_ROUND, pyclipper.ET_CLOSEDPOLYGON)
shrinked = np.array(pco.Execute(-distance))
if len(shrinked) == 0 or shrinked.size == 0:
diff --git a/ppocr/data/imaug/make_shrink_map.py b/ppocr/data/imaug/make_shrink_map.py
index 15e8afa05bb9f7315a2e9342c78cb98718a54df9..6c65c20e5621f91a5b1fba549b059c92923fca6f 100644
--- a/ppocr/data/imaug/make_shrink_map.py
+++ b/ppocr/data/imaug/make_shrink_map.py
@@ -1,4 +1,20 @@
-# -*- coding:utf-8 -*-
+# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+This code is refer from:
+https://github.com/WenmuZhou/DBNet.pytorch/blob/master/data_loader/modules/make_shrink_map.py
+"""
from __future__ import absolute_import
from __future__ import division
diff --git a/ppocr/data/imaug/random_crop_data.py b/ppocr/data/imaug/random_crop_data.py
index 7c1c25abb56a0cf7d4d59b8523962bd5d81c873a..64aa110de4e3df950ce21e6d657877081b0fdd13 100644
--- a/ppocr/data/imaug/random_crop_data.py
+++ b/ppocr/data/imaug/random_crop_data.py
@@ -1,4 +1,20 @@
-# -*- coding:utf-8 -*-
+# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+This code is refer from:
+https://github.com/WenmuZhou/DBNet.pytorch/blob/master/data_loader/modules/random_crop_data.py
+"""
from __future__ import absolute_import
from __future__ import division
diff --git a/ppocr/data/imaug/sast_process.py b/ppocr/data/imaug/sast_process.py
index 1536dceb8ee5999226cfe7cf455d70e39b449530..08d03b194dcfab92ab59329857d4a1326531218e 100644
--- a/ppocr/data/imaug/sast_process.py
+++ b/ppocr/data/imaug/sast_process.py
@@ -11,7 +11,10 @@
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
-
+"""
+This part code is refered from:
+https://github.com/songdejia/EAST/blob/master/data_utils.py
+"""
import math
import cv2
import numpy as np
diff --git a/ppocr/data/imaug/text_image_aug/augment.py b/ppocr/data/imaug/text_image_aug/augment.py
index 1aeff3733a4521c56dd5972fc058f6e0c245e4b7..2d15dd5f353c72a6cc3876481c423d81a8175c95 100644
--- a/ppocr/data/imaug/text_image_aug/augment.py
+++ b/ppocr/data/imaug/text_image_aug/augment.py
@@ -11,6 +11,10 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+"""
+This code is refer from:
+https://github.com/RubanSeven/Text-Image-Augmentation-python/blob/master/augment.py
+"""
import numpy as np
from .warp_mls import WarpMLS
diff --git a/ppocr/data/imaug/text_image_aug/warp_mls.py b/ppocr/data/imaug/text_image_aug/warp_mls.py
index d6cbe749b61aa4cf3163927c096868c83f4a4cdd..75de11115cf9ba824a7cd62b8b880ea7f99e4cb2 100644
--- a/ppocr/data/imaug/text_image_aug/warp_mls.py
+++ b/ppocr/data/imaug/text_image_aug/warp_mls.py
@@ -11,6 +11,10 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+"""
+This code is refer from:
+https://github.com/RubanSeven/Text-Image-Augmentation-python/blob/master/warp_mls.py
+"""
import numpy as np
@@ -161,4 +165,4 @@ class WarpMLS:
dst = np.clip(dst, 0, 255)
dst = np.array(dst, dtype=np.uint8)
- return dst
\ No newline at end of file
+ return dst
diff --git a/ppocr/losses/ace_loss.py b/ppocr/losses/ace_loss.py
index bf15f8e3a7b355bd9e8b69435a5dae01fc75a892..915b99e6ec1d6cb4641d8032fa188c61006dfbb3 100644
--- a/ppocr/losses/ace_loss.py
+++ b/ppocr/losses/ace_loss.py
@@ -11,6 +11,9 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+
+# This code is refer from: https://github.com/viig99/LS-ACELoss
+
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
@@ -32,7 +35,7 @@ class ACELoss(nn.Layer):
def __call__(self, predicts, batch):
if isinstance(predicts, (list, tuple)):
predicts = predicts[-1]
-
+
B, N = predicts.shape[:2]
div = paddle.to_tensor([N]).astype('float32')
diff --git a/ppocr/losses/center_loss.py b/ppocr/losses/center_loss.py
index cbef4df965e2659c6aa63c0c69cd8798143df485..f8c57fdd5c9b3f0dec5c3d0a811e5532abd2e45a 100644
--- a/ppocr/losses/center_loss.py
+++ b/ppocr/losses/center_loss.py
@@ -12,6 +12,8 @@
#See the License for the specific language governing permissions and
#limitations under the License.
+# This code is refer from: https://github.com/KaiyangZhou/pytorch-center-loss
+
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
@@ -27,6 +29,7 @@ class CenterLoss(nn.Layer):
"""
Reference: Wen et al. A Discriminative Feature Learning Approach for Deep Face Recognition. ECCV 2016.
"""
+
def __init__(self,
num_classes=6625,
feat_dim=96,
diff --git a/ppocr/losses/det_basic_loss.py b/ppocr/losses/det_basic_loss.py
index 7017236c284e55710f242275a413d56d32158d34..61ea579b41d3cdf7831c168f563a1e3cd72463a0 100644
--- a/ppocr/losses/det_basic_loss.py
+++ b/ppocr/losses/det_basic_loss.py
@@ -11,7 +11,10 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
-
+"""
+This code is refer from:
+https://github.com/WenmuZhou/DBNet.pytorch/blob/master/models/losses/basic_loss.py
+"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
@@ -147,4 +150,4 @@ class BCELoss(nn.Layer):
def forward(self, input, label, mask=None, weight=None, name=None):
loss = F.binary_cross_entropy(input, label, reduction=self.reduction)
- return loss
\ No newline at end of file
+ return loss
diff --git a/ppocr/losses/det_db_loss.py b/ppocr/losses/det_db_loss.py
index b079aabff7c7deccc7e365b91c9407f7e894bcb9..708ffbdb47f349304e2bfd781a836e79348475f4 100755
--- a/ppocr/losses/det_db_loss.py
+++ b/ppocr/losses/det_db_loss.py
@@ -11,6 +11,10 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+"""
+This code is refer from:
+https://github.com/WenmuZhou/DBNet.pytorch/blob/master/models/losses/DB_loss.py
+"""
from __future__ import absolute_import
from __future__ import division
diff --git a/ppocr/losses/det_pse_loss.py b/ppocr/losses/det_pse_loss.py
index 78423091f841f29b1217f73f79beb26fe1575844..9b8ac4b5a5dfac176c398dd0a9e490e5ca67ad5f 100644
--- a/ppocr/losses/det_pse_loss.py
+++ b/ppocr/losses/det_pse_loss.py
@@ -11,6 +11,10 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+"""
+This code is refer from:
+https://github.com/whai362/PSENet/blob/python3/models/head/psenet_head.py
+"""
import paddle
from paddle import nn
diff --git a/ppocr/losses/rec_nrtr_loss.py b/ppocr/losses/rec_nrtr_loss.py
index 41714dd2a3ae15eeedc62521d97935f68271c598..200a6d0486dbf6f76dd674eb58f641b31a70f31c 100644
--- a/ppocr/losses/rec_nrtr_loss.py
+++ b/ppocr/losses/rec_nrtr_loss.py
@@ -22,7 +22,7 @@ class NRTRLoss(nn.Layer):
log_prb = F.log_softmax(pred, axis=1)
non_pad_mask = paddle.not_equal(
tgt, paddle.zeros(
- tgt.shape, dtype='int64'))
+ tgt.shape, dtype=tgt.dtype))
loss = -(one_hot * log_prb).sum(axis=1)
loss = loss.masked_select(non_pad_mask).mean()
else:
diff --git a/ppocr/modeling/backbones/rec_mv1_enhance.py b/ppocr/modeling/backbones/rec_mv1_enhance.py
index 04a909b8ccafd8e62f9a7076c7dedf63ff745303..d8a7f4b5646eb70b5202aa3b3ac6494318b424ad 100644
--- a/ppocr/modeling/backbones/rec_mv1_enhance.py
+++ b/ppocr/modeling/backbones/rec_mv1_enhance.py
@@ -12,6 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
+# This code is refer from: https://github.com/PaddlePaddle/PaddleClas/blob/develop/ppcls/arch/backbone/legendary_models/pp_lcnet.py
+
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
diff --git a/ppocr/modeling/backbones/rec_resnet_31.py b/ppocr/modeling/backbones/rec_resnet_31.py
index f60729cdcced2af7626e5615ca323e32c99432ec..965170138d00a53fca720b3b5f535a3dd34272d9 100644
--- a/ppocr/modeling/backbones/rec_resnet_31.py
+++ b/ppocr/modeling/backbones/rec_resnet_31.py
@@ -1,3 +1,22 @@
+# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+This code is refer from:
+https://github.com/open-mmlab/mmocr/blob/main/mmocr/models/textrecog/layers/conv_layer.py
+https://github.com/open-mmlab/mmocr/blob/main/mmocr/models/textrecog/backbones/resnet31_ocr.py
+"""
+
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
@@ -18,12 +37,12 @@ def conv3x3(in_channel, out_channel, stride=1):
kernel_size=3,
stride=stride,
padding=1,
- bias_attr=False
- )
+ bias_attr=False)
class BasicBlock(nn.Layer):
expansion = 1
+
def __init__(self, in_channels, channels, stride=1, downsample=False):
super().__init__()
self.conv1 = conv3x3(in_channels, channels, stride)
@@ -34,9 +53,13 @@ class BasicBlock(nn.Layer):
self.downsample = downsample
if downsample:
self.downsample = nn.Sequential(
- nn.Conv2D(in_channels, channels * self.expansion, 1, stride, bias_attr=False),
- nn.BatchNorm2D(channels * self.expansion),
- )
+ nn.Conv2D(
+ in_channels,
+ channels * self.expansion,
+ 1,
+ stride,
+ bias_attr=False),
+ nn.BatchNorm2D(channels * self.expansion), )
else:
self.downsample = nn.Sequential()
self.stride = stride
@@ -57,7 +80,7 @@ class BasicBlock(nn.Layer):
out += residual
out = self.relu(out)
- return out
+ return out
class ResNet31(nn.Layer):
@@ -69,12 +92,13 @@ class ResNet31(nn.Layer):
out_indices (None | Sequence[int]): Indices of output stages.
last_stage_pool (bool): If True, add `MaxPool2d` layer to last stage.
'''
- def __init__(self,
- in_channels=3,
- layers=[1, 2, 5, 3],
- channels=[64, 128, 256, 256, 512, 512, 512],
- out_indices=None,
- last_stage_pool=False):
+
+ def __init__(self,
+ in_channels=3,
+ layers=[1, 2, 5, 3],
+ channels=[64, 128, 256, 256, 512, 512, 512],
+ out_indices=None,
+ last_stage_pool=False):
super(ResNet31, self).__init__()
assert isinstance(in_channels, int)
assert isinstance(last_stage_pool, bool)
@@ -83,46 +107,56 @@ class ResNet31(nn.Layer):
self.last_stage_pool = last_stage_pool
# conv 1 (Conv Conv)
- self.conv1_1 = nn.Conv2D(in_channels, channels[0], kernel_size=3, stride=1, padding=1)
+ self.conv1_1 = nn.Conv2D(
+ in_channels, channels[0], kernel_size=3, stride=1, padding=1)
self.bn1_1 = nn.BatchNorm2D(channels[0])
self.relu1_1 = nn.ReLU()
- self.conv1_2 = nn.Conv2D(channels[0], channels[1], kernel_size=3, stride=1, padding=1)
+ self.conv1_2 = nn.Conv2D(
+ channels[0], channels[1], kernel_size=3, stride=1, padding=1)
self.bn1_2 = nn.BatchNorm2D(channels[1])
self.relu1_2 = nn.ReLU()
# conv 2 (Max-pooling, Residual block, Conv)
- self.pool2 = nn.MaxPool2D(kernel_size=2, stride=2, padding=0, ceil_mode=True)
+ self.pool2 = nn.MaxPool2D(
+ kernel_size=2, stride=2, padding=0, ceil_mode=True)
self.block2 = self._make_layer(channels[1], channels[2], layers[0])
- self.conv2 = nn.Conv2D(channels[2], channels[2], kernel_size=3, stride=1, padding=1)
+ self.conv2 = nn.Conv2D(
+ channels[2], channels[2], kernel_size=3, stride=1, padding=1)
self.bn2 = nn.BatchNorm2D(channels[2])
self.relu2 = nn.ReLU()
# conv 3 (Max-pooling, Residual block, Conv)
- self.pool3 = nn.MaxPool2D(kernel_size=2, stride=2, padding=0, ceil_mode=True)
+ self.pool3 = nn.MaxPool2D(
+ kernel_size=2, stride=2, padding=0, ceil_mode=True)
self.block3 = self._make_layer(channels[2], channels[3], layers[1])
- self.conv3 = nn.Conv2D(channels[3], channels[3], kernel_size=3, stride=1, padding=1)
+ self.conv3 = nn.Conv2D(
+ channels[3], channels[3], kernel_size=3, stride=1, padding=1)
self.bn3 = nn.BatchNorm2D(channels[3])
self.relu3 = nn.ReLU()
# conv 4 (Max-pooling, Residual block, Conv)
- self.pool4 = nn.MaxPool2D(kernel_size=(2, 1), stride=(2, 1), padding=0, ceil_mode=True)
+ self.pool4 = nn.MaxPool2D(
+ kernel_size=(2, 1), stride=(2, 1), padding=0, ceil_mode=True)
self.block4 = self._make_layer(channels[3], channels[4], layers[2])
- self.conv4 = nn.Conv2D(channels[4], channels[4], kernel_size=3, stride=1, padding=1)
+ self.conv4 = nn.Conv2D(
+ channels[4], channels[4], kernel_size=3, stride=1, padding=1)
self.bn4 = nn.BatchNorm2D(channels[4])
self.relu4 = nn.ReLU()
# conv 5 ((Max-pooling), Residual block, Conv)
self.pool5 = None
if self.last_stage_pool:
- self.pool5 = nn.MaxPool2D(kernel_size=2, stride=2, padding=0, ceil_mode=True)
+ self.pool5 = nn.MaxPool2D(
+ kernel_size=2, stride=2, padding=0, ceil_mode=True)
self.block5 = self._make_layer(channels[4], channels[5], layers[3])
- self.conv5 = nn.Conv2D(channels[5], channels[5], kernel_size=3, stride=1, padding=1)
+ self.conv5 = nn.Conv2D(
+ channels[5], channels[5], kernel_size=3, stride=1, padding=1)
self.bn5 = nn.BatchNorm2D(channels[5])
self.relu5 = nn.ReLU()
self.out_channels = channels[-1]
-
+
def _make_layer(self, input_channels, output_channels, blocks):
layers = []
for _ in range(blocks):
@@ -130,19 +164,19 @@ class ResNet31(nn.Layer):
if input_channels != output_channels:
downsample = nn.Sequential(
nn.Conv2D(
- input_channels,
- output_channels,
- kernel_size=1,
- stride=1,
+ input_channels,
+ output_channels,
+ kernel_size=1,
+ stride=1,
bias_attr=False),
- nn.BatchNorm2D(output_channels),
- )
-
- layers.append(BasicBlock(input_channels, output_channels, downsample=downsample))
+ nn.BatchNorm2D(output_channels), )
+
+ layers.append(
+ BasicBlock(
+ input_channels, output_channels, downsample=downsample))
input_channels = output_channels
return nn.Sequential(*layers)
-
def forward(self, x):
x = self.conv1_1(x)
x = self.bn1_1(x)
@@ -166,11 +200,11 @@ class ResNet31(nn.Layer):
x = block_layer(x)
x = conv_layer(x)
x = bn_layer(x)
- x= relu_layer(x)
+ x = relu_layer(x)
outs.append(x)
-
+
if self.out_indices is not None:
return tuple([outs[i] for i in self.out_indices])
-
+
return x
diff --git a/ppocr/modeling/backbones/rec_resnet_aster.py b/ppocr/modeling/backbones/rec_resnet_aster.py
index bdecaf46af98f9b967d9a339f82d4e938abdc6d9..6a2710dfa079b4d910146c10ca2cff31321b2513 100644
--- a/ppocr/modeling/backbones/rec_resnet_aster.py
+++ b/ppocr/modeling/backbones/rec_resnet_aster.py
@@ -11,7 +11,10 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
-
+"""
+This code is refer from:
+https://github.com/ayumiymk/aster.pytorch/blob/master/lib/models/resnet_aster.py
+"""
import paddle
import paddle.nn as nn
diff --git a/ppocr/modeling/heads/det_pse_head.py b/ppocr/modeling/heads/det_pse_head.py
index db800f57a216ab437b724988ce692a9ac0c545d9..32a5b48e190b7566411b19841b6aa14455b5d41d 100644
--- a/ppocr/modeling/heads/det_pse_head.py
+++ b/ppocr/modeling/heads/det_pse_head.py
@@ -1,4 +1,4 @@
-# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -11,22 +11,24 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+"""
+This code is refer from:
+https://github.com/whai362/PSENet/blob/python3/models/head/psenet_head.py
+"""
+
from paddle import nn
class PSEHead(nn.Layer):
- def __init__(self,
- in_channels,
- hidden_dim=256,
- out_channels=7,
- **kwargs):
+ def __init__(self, in_channels, hidden_dim=256, out_channels=7, **kwargs):
super(PSEHead, self).__init__()
- self.conv1 = nn.Conv2D(in_channels, hidden_dim, kernel_size=3, stride=1, padding=1)
+ self.conv1 = nn.Conv2D(
+ in_channels, hidden_dim, kernel_size=3, stride=1, padding=1)
self.bn1 = nn.BatchNorm2D(hidden_dim)
self.relu1 = nn.ReLU()
- self.conv2 = nn.Conv2D(hidden_dim, out_channels, kernel_size=1, stride=1, padding=0)
-
+ self.conv2 = nn.Conv2D(
+ hidden_dim, out_channels, kernel_size=1, stride=1, padding=0)
def forward(self, x, **kwargs):
out = self.conv1(x)
diff --git a/ppocr/modeling/heads/rec_aster_head.py b/ppocr/modeling/heads/rec_aster_head.py
index 4961897b409020fe6cff72eb96f3257156fa33ac..9240f002d3a8bcbde517142be6b45559430de610 100644
--- a/ppocr/modeling/heads/rec_aster_head.py
+++ b/ppocr/modeling/heads/rec_aster_head.py
@@ -11,6 +11,10 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+"""
+This code is refer from:
+https://github.com/ayumiymk/aster.pytorch/blob/master/lib/models/attention_recognition_head.py
+"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
diff --git a/ppocr/modeling/heads/rec_att_head.py b/ppocr/modeling/heads/rec_att_head.py
index 4286d7691d1abcf80c283d1c1ab76f8cd1f4a634..6d77e42eb5def579052687ab6fdc265159311884 100644
--- a/ppocr/modeling/heads/rec_att_head.py
+++ b/ppocr/modeling/heads/rec_att_head.py
@@ -75,7 +75,7 @@ class AttentionHead(nn.Layer):
probs_step, axis=1)], axis=1)
next_input = probs_step.argmax(axis=1)
targets = next_input
-
+ probs = paddle.nn.functional.softmax(probs, axis=2)
return probs
diff --git a/ppocr/modeling/heads/rec_sar_head.py b/ppocr/modeling/heads/rec_sar_head.py
index 7107788d9ef3b49ac6d4dcd4a8133a9603ada19b..a46cce7de2c8e59cf797db96fc6fcb7e25fa549a 100644
--- a/ppocr/modeling/heads/rec_sar_head.py
+++ b/ppocr/modeling/heads/rec_sar_head.py
@@ -1,3 +1,22 @@
+# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+This code is refer from:
+https://github.com/open-mmlab/mmocr/blob/main/mmocr/models/textrecog/encoders/sar_encoder.py
+https://github.com/open-mmlab/mmocr/blob/main/mmocr/models/textrecog/decoders/sar_decoder.py
+"""
+
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
@@ -275,7 +294,6 @@ class ParallelSARDecoder(BaseDecoder):
if img_metas is not None and self.mask:
valid_ratios = img_metas[-1]
- label = label.cuda()
lab_embedding = self.embedding(label)
# bsz * seq_len * emb_dim
out_enc = out_enc.unsqueeze(1)
diff --git a/ppocr/modeling/necks/fpn.py b/ppocr/modeling/necks/fpn.py
index 8728a5c9ded5b9c174fd34f088d8012961f65ec0..48c85b1e53bd889bc887e8fedcd33b1b12cb734b 100644
--- a/ppocr/modeling/necks/fpn.py
+++ b/ppocr/modeling/necks/fpn.py
@@ -11,64 +11,102 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+"""
+This code is refer from:
+https://github.com/whai362/PSENet/blob/python3/models/neck/fpn.py
+"""
import paddle.nn as nn
import paddle
import math
import paddle.nn.functional as F
+
class Conv_BN_ReLU(nn.Layer):
- def __init__(self, in_planes, out_planes, kernel_size=1, stride=1, padding=0):
+ def __init__(self,
+ in_planes,
+ out_planes,
+ kernel_size=1,
+ stride=1,
+ padding=0):
super(Conv_BN_ReLU, self).__init__()
- self.conv = nn.Conv2D(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=padding,
- bias_attr=False)
+ self.conv = nn.Conv2D(
+ in_planes,
+ out_planes,
+ kernel_size=kernel_size,
+ stride=stride,
+ padding=padding,
+ bias_attr=False)
self.bn = nn.BatchNorm2D(out_planes, momentum=0.1)
self.relu = nn.ReLU()
for m in self.sublayers():
if isinstance(m, nn.Conv2D):
n = m._kernel_size[0] * m._kernel_size[1] * m._out_channels
- m.weight = paddle.create_parameter(shape=m.weight.shape, dtype='float32', default_initializer=paddle.nn.initializer.Normal(0, math.sqrt(2. / n)))
+ m.weight = paddle.create_parameter(
+ shape=m.weight.shape,
+ dtype='float32',
+ default_initializer=paddle.nn.initializer.Normal(
+ 0, math.sqrt(2. / n)))
elif isinstance(m, nn.BatchNorm2D):
- m.weight = paddle.create_parameter(shape=m.weight.shape, dtype='float32', default_initializer=paddle.nn.initializer.Constant(1.0))
- m.bias = paddle.create_parameter(shape=m.bias.shape, dtype='float32', default_initializer=paddle.nn.initializer.Constant(0.0))
+ m.weight = paddle.create_parameter(
+ shape=m.weight.shape,
+ dtype='float32',
+ default_initializer=paddle.nn.initializer.Constant(1.0))
+ m.bias = paddle.create_parameter(
+ shape=m.bias.shape,
+ dtype='float32',
+ default_initializer=paddle.nn.initializer.Constant(0.0))
def forward(self, x):
return self.relu(self.bn(self.conv(x)))
+
class FPN(nn.Layer):
def __init__(self, in_channels, out_channels):
super(FPN, self).__init__()
# Top layer
- self.toplayer_ = Conv_BN_ReLU(in_channels[3], out_channels, kernel_size=1, stride=1, padding=0)
+ self.toplayer_ = Conv_BN_ReLU(
+ in_channels[3], out_channels, kernel_size=1, stride=1, padding=0)
# Lateral layers
- self.latlayer1_ = Conv_BN_ReLU(in_channels[2], out_channels, kernel_size=1, stride=1, padding=0)
+ self.latlayer1_ = Conv_BN_ReLU(
+ in_channels[2], out_channels, kernel_size=1, stride=1, padding=0)
- self.latlayer2_ = Conv_BN_ReLU(in_channels[1], out_channels, kernel_size=1, stride=1, padding=0)
+ self.latlayer2_ = Conv_BN_ReLU(
+ in_channels[1], out_channels, kernel_size=1, stride=1, padding=0)
- self.latlayer3_ = Conv_BN_ReLU(in_channels[0], out_channels, kernel_size=1, stride=1, padding=0)
+ self.latlayer3_ = Conv_BN_ReLU(
+ in_channels[0], out_channels, kernel_size=1, stride=1, padding=0)
# Smooth layers
- self.smooth1_ = Conv_BN_ReLU(out_channels, out_channels, kernel_size=3, stride=1, padding=1)
-
- self.smooth2_ = Conv_BN_ReLU(out_channels, out_channels, kernel_size=3, stride=1, padding=1)
+ self.smooth1_ = Conv_BN_ReLU(
+ out_channels, out_channels, kernel_size=3, stride=1, padding=1)
- self.smooth3_ = Conv_BN_ReLU(out_channels, out_channels, kernel_size=3, stride=1, padding=1)
+ self.smooth2_ = Conv_BN_ReLU(
+ out_channels, out_channels, kernel_size=3, stride=1, padding=1)
+ self.smooth3_ = Conv_BN_ReLU(
+ out_channels, out_channels, kernel_size=3, stride=1, padding=1)
self.out_channels = out_channels * 4
for m in self.sublayers():
if isinstance(m, nn.Conv2D):
n = m._kernel_size[0] * m._kernel_size[1] * m._out_channels
- m.weight = paddle.create_parameter(shape=m.weight.shape, dtype='float32',
- default_initializer=paddle.nn.initializer.Normal(0,
- math.sqrt(2. / n)))
+ m.weight = paddle.create_parameter(
+ shape=m.weight.shape,
+ dtype='float32',
+ default_initializer=paddle.nn.initializer.Normal(
+ 0, math.sqrt(2. / n)))
elif isinstance(m, nn.BatchNorm2D):
- m.weight = paddle.create_parameter(shape=m.weight.shape, dtype='float32',
- default_initializer=paddle.nn.initializer.Constant(1.0))
- m.bias = paddle.create_parameter(shape=m.bias.shape, dtype='float32',
- default_initializer=paddle.nn.initializer.Constant(0.0))
+ m.weight = paddle.create_parameter(
+ shape=m.weight.shape,
+ dtype='float32',
+ default_initializer=paddle.nn.initializer.Constant(1.0))
+ m.bias = paddle.create_parameter(
+ shape=m.bias.shape,
+ dtype='float32',
+ default_initializer=paddle.nn.initializer.Constant(0.0))
def _upsample(self, x, scale=1):
return F.upsample(x, scale_factor=scale, mode='bilinear')
@@ -81,15 +119,15 @@ class FPN(nn.Layer):
p5 = self.toplayer_(f5)
f4 = self.latlayer1_(f4)
- p4 = self._upsample_add(p5, f4,2)
+ p4 = self._upsample_add(p5, f4, 2)
p4 = self.smooth1_(p4)
f3 = self.latlayer2_(f3)
- p3 = self._upsample_add(p4, f3,2)
+ p3 = self._upsample_add(p4, f3, 2)
p3 = self.smooth2_(p3)
f2 = self.latlayer3_(f2)
- p2 = self._upsample_add(p3, f2,2)
+ p2 = self._upsample_add(p3, f2, 2)
p2 = self.smooth3_(p2)
p3 = self._upsample(p3, 2)
@@ -97,4 +135,4 @@ class FPN(nn.Layer):
p5 = self._upsample(p5, 8)
fuse = paddle.concat([p2, p3, p4, p5], axis=1)
- return fuse
\ No newline at end of file
+ return fuse
diff --git a/ppocr/modeling/transforms/stn.py b/ppocr/modeling/transforms/stn.py
index 215895f4c4c719f407f4998f7429d965e0529ddc..6f2bdda050f217d8253740001901fbff4065782a 100644
--- a/ppocr/modeling/transforms/stn.py
+++ b/ppocr/modeling/transforms/stn.py
@@ -11,7 +11,10 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
-
+"""
+This code is refer from:
+https://github.com/ayumiymk/aster.pytorch/blob/master/lib/models/stn_head.py
+"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
diff --git a/ppocr/modeling/transforms/tps.py b/ppocr/modeling/transforms/tps.py
index 6cd68555369dd1ddbd6ccf5236688a4b957b8525..9bdab0f85112b90d8da959dce4e258188a812052 100644
--- a/ppocr/modeling/transforms/tps.py
+++ b/ppocr/modeling/transforms/tps.py
@@ -11,6 +11,10 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+"""
+This code is refer from:
+https://github.com/clovaai/deep-text-recognition-benchmark/blob/master/modules/transformation.py
+"""
from __future__ import absolute_import
from __future__ import division
diff --git a/ppocr/modeling/transforms/tps_spatial_transformer.py b/ppocr/modeling/transforms/tps_spatial_transformer.py
index b510acb0d4012c9a4d90c7ca07cac895f0bf242e..4db34f7b4833c1c9b2901c68899bfb294b5843c4 100644
--- a/ppocr/modeling/transforms/tps_spatial_transformer.py
+++ b/ppocr/modeling/transforms/tps_spatial_transformer.py
@@ -11,6 +11,10 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+"""
+This code is refer from:
+https://github.com/ayumiymk/aster.pytorch/blob/master/lib/models/tps_spatial_transformer.py
+"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
diff --git a/ppocr/postprocess/__init__.py b/ppocr/postprocess/__init__.py
index 5ca4e6bb96fc6f37ef67a2fb0b8c2496e1a83d77..c6cb0144f7efd9ff7976ad67a658a554eafce754 100644
--- a/ppocr/postprocess/__init__.py
+++ b/ppocr/postprocess/__init__.py
@@ -18,7 +18,6 @@ from __future__ import print_function
from __future__ import unicode_literals
import copy
-import platform
__all__ = ['build_post_process']
@@ -26,21 +25,24 @@ from .db_postprocess import DBPostProcess, DistillationDBPostProcess
from .east_postprocess import EASTPostProcess
from .sast_postprocess import SASTPostProcess
from .rec_postprocess import CTCLabelDecode, AttnLabelDecode, SRNLabelDecode, DistillationCTCLabelDecode, \
- TableLabelDecode, NRTRLabelDecode, SARLabelDecode , SEEDLabelDecode
+ TableLabelDecode, NRTRLabelDecode, SARLabelDecode, SEEDLabelDecode
from .cls_postprocess import ClsPostProcess
from .pg_postprocess import PGPostProcess
-from .pse_postprocess import PSEPostProcess
def build_post_process(config, global_config=None):
support_dict = [
- 'DBPostProcess', 'PSEPostProcess', 'EASTPostProcess', 'SASTPostProcess',
- 'CTCLabelDecode', 'AttnLabelDecode', 'ClsPostProcess', 'SRNLabelDecode',
- 'PGPostProcess', 'DistillationCTCLabelDecode', 'TableLabelDecode',
+ 'DBPostProcess', 'EASTPostProcess', 'SASTPostProcess', 'CTCLabelDecode',
+ 'AttnLabelDecode', 'ClsPostProcess', 'SRNLabelDecode', 'PGPostProcess',
+ 'DistillationCTCLabelDecode', 'TableLabelDecode',
'DistillationDBPostProcess', 'NRTRLabelDecode', 'SARLabelDecode',
'SEEDLabelDecode'
]
+ if config['name'] == 'PSEPostProcess':
+ from .pse_postprocess import PSEPostProcess
+ support_dict.append('PSEPostProcess')
+
config = copy.deepcopy(config)
module_name = config.pop('name')
if global_config is not None:
diff --git a/ppocr/postprocess/db_postprocess.py b/ppocr/postprocess/db_postprocess.py
index d9c9869dfcd35cb9b491db826f3bff5f766723f4..27b428ef2e73c9abf81d3881b23979343c8595b2 100755
--- a/ppocr/postprocess/db_postprocess.py
+++ b/ppocr/postprocess/db_postprocess.py
@@ -11,7 +11,10 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
-
+"""
+This code is refered from:
+https://github.com/WenmuZhou/DBNet.pytorch/blob/master/post_processing/seg_detector_representer.py
+"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
@@ -190,7 +193,8 @@ class DBPostProcess(object):
class DistillationDBPostProcess(object):
- def __init__(self, model_name=["student"],
+ def __init__(self,
+ model_name=["student"],
key=None,
thresh=0.3,
box_thresh=0.6,
@@ -201,12 +205,13 @@ class DistillationDBPostProcess(object):
**kwargs):
self.model_name = model_name
self.key = key
- self.post_process = DBPostProcess(thresh=thresh,
- box_thresh=box_thresh,
- max_candidates=max_candidates,
- unclip_ratio=unclip_ratio,
- use_dilation=use_dilation,
- score_mode=score_mode)
+ self.post_process = DBPostProcess(
+ thresh=thresh,
+ box_thresh=box_thresh,
+ max_candidates=max_candidates,
+ unclip_ratio=unclip_ratio,
+ use_dilation=use_dilation,
+ score_mode=score_mode)
def __call__(self, predicts, shape_list):
results = {}
diff --git a/ppocr/postprocess/locality_aware_nms.py b/ppocr/postprocess/locality_aware_nms.py
index 53280cc13ed7e41859e23e2517938d4f6eb07076..d305ef681882b4a393a73190bcbd20a65d1f0c15 100644
--- a/ppocr/postprocess/locality_aware_nms.py
+++ b/ppocr/postprocess/locality_aware_nms.py
@@ -1,5 +1,6 @@
"""
Locality aware nms.
+This code is refered from: https://github.com/songdejia/EAST/blob/master/locality_aware_nms.py
"""
import numpy as np
diff --git a/ppocr/postprocess/pse_postprocess/pse/README.md b/ppocr/postprocess/pse_postprocess/pse/README.md
index 9c2d9eaeaa5f93550358ebdd4d9161330b78a86f..6a19d5d1b6b1d8e6952eb054d74c6672ed10bc48 100644
--- a/ppocr/postprocess/pse_postprocess/pse/README.md
+++ b/ppocr/postprocess/pse_postprocess/pse/README.md
@@ -1,5 +1,6 @@
## 编译
-code from https://github.com/whai362/pan_pp.pytorch
+This code is refer from:
+https://github.com/whai362/PSENet/blob/python3/models/post_processing/pse
```python
python3 setup.py build_ext --inplace
```
diff --git a/ppocr/postprocess/pse_postprocess/pse/__init__.py b/ppocr/postprocess/pse_postprocess/pse/__init__.py
index 0536a32ea5614a8f1826ac2550b1f12518ac53e5..1903a9149a7703ac7ae9a66273eca620e0d77272 100644
--- a/ppocr/postprocess/pse_postprocess/pse/__init__.py
+++ b/ppocr/postprocess/pse_postprocess/pse/__init__.py
@@ -21,8 +21,9 @@ ori_path = os.getcwd()
os.chdir('ppocr/postprocess/pse_postprocess/pse')
if subprocess.call(
'{} setup.py build_ext --inplace'.format(python_path), shell=True) != 0:
- raise RuntimeError('Cannot compile pse: {}'.format(
- os.path.dirname(os.path.realpath(__file__))))
+ raise RuntimeError(
+ 'Cannot compile pse: {}, if your system is windows, you need to install all the default components of `desktop development using C++` in visual studio 2019+'.
+ format(os.path.dirname(os.path.realpath(__file__))))
os.chdir(ori_path)
from .pse import pse
diff --git a/ppocr/postprocess/pse_postprocess/pse_postprocess.py b/ppocr/postprocess/pse_postprocess/pse_postprocess.py
index 4b89d221d284602933ab3d4f21468fcae79ef310..0234d592d6dde8419b1d623e33b9ca5bb251fb97 100755
--- a/ppocr/postprocess/pse_postprocess/pse_postprocess.py
+++ b/ppocr/postprocess/pse_postprocess/pse_postprocess.py
@@ -1,16 +1,20 @@
-# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
-# http://www.apache.org/licenses/LICENSE-2.0
+# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+"""
+This code is refer from:
+https://github.com/whai362/PSENet/blob/python3/models/head/psenet_head.py
+"""
from __future__ import absolute_import
from __future__ import division
@@ -47,7 +51,8 @@ class PSEPostProcess(object):
pred = outs_dict['maps']
if not isinstance(pred, paddle.Tensor):
pred = paddle.to_tensor(pred)
- pred = F.interpolate(pred, scale_factor=4 // self.scale, mode='bilinear')
+ pred = F.interpolate(
+ pred, scale_factor=4 // self.scale, mode='bilinear')
score = F.sigmoid(pred[:, 0, :, :])
@@ -60,7 +65,9 @@ class PSEPostProcess(object):
boxes_batch = []
for batch_index in range(pred.shape[0]):
- boxes, scores = self.boxes_from_bitmap(score[batch_index], kernels[batch_index], shape_list[batch_index])
+ boxes, scores = self.boxes_from_bitmap(score[batch_index],
+ kernels[batch_index],
+ shape_list[batch_index])
boxes_batch.append({'points': boxes, 'scores': scores})
return boxes_batch
@@ -98,15 +105,14 @@ class PSEPostProcess(object):
mask = np.zeros((box_height, box_width), np.uint8)
mask[points[:, 1], points[:, 0]] = 255
- contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
+ contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
+ cv2.CHAIN_APPROX_SIMPLE)
bbox = np.squeeze(contours[0], 1)
else:
raise NotImplementedError
- bbox[:, 0] = np.clip(
- np.round(bbox[:, 0] / ratio_w), 0, src_w)
- bbox[:, 1] = np.clip(
- np.round(bbox[:, 1] / ratio_h), 0, src_h)
+ bbox[:, 0] = np.clip(np.round(bbox[:, 0] / ratio_w), 0, src_w)
+ bbox[:, 1] = np.clip(np.round(bbox[:, 1] / ratio_h), 0, src_h)
boxes.append(bbox)
scores.append(score_i)
return boxes, scores
diff --git a/ppocr/utils/iou.py b/ppocr/utils/iou.py
index 20529dee2d14083f3de4ac034668d004136c56e2..35459f5f053cde0a74f76c5652bfb723a48ca890 100644
--- a/ppocr/utils/iou.py
+++ b/ppocr/utils/iou.py
@@ -1,4 +1,4 @@
-# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -11,18 +11,23 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+"""
+This code is refer from:
+https://github.com/whai362/PSENet/blob/python3/models/loss/iou.py
+"""
import paddle
EPS = 1e-6
+
def iou_single(a, b, mask, n_class):
valid = mask == 1
a = a.masked_select(valid)
b = b.masked_select(valid)
miou = []
for i in range(n_class):
- if a.shape == [0] and a.shape==b.shape:
+ if a.shape == [0] and a.shape == b.shape:
inter = paddle.to_tensor(0.0)
union = paddle.to_tensor(0.0)
else:
@@ -32,6 +37,7 @@ def iou_single(a, b, mask, n_class):
miou = sum(miou) / len(miou)
return miou
+
def iou(a, b, mask, n_class=2, reduce=True):
batch_size = a.shape[0]
@@ -39,10 +45,10 @@ def iou(a, b, mask, n_class=2, reduce=True):
b = b.reshape([batch_size, -1])
mask = mask.reshape([batch_size, -1])
- iou = paddle.zeros((batch_size,), dtype='float32')
+ iou = paddle.zeros((batch_size, ), dtype='float32')
for i in range(batch_size):
iou[i] = iou_single(a[i], b[i], mask[i], n_class)
if reduce:
iou = paddle.mean(iou)
- return iou
\ No newline at end of file
+ return iou
diff --git a/ppocr/utils/logging.py b/ppocr/utils/logging.py
index 11896c37d9285e19a9526caa9c637d7eda7b1979..ce827e8b10c4b63b736886a2f72106c7570576b1 100644
--- a/ppocr/utils/logging.py
+++ b/ppocr/utils/logging.py
@@ -1,4 +1,4 @@
-# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -11,6 +11,10 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+"""
+This code is refer from:
+https://github.com/WenmuZhou/PytorchOCR/blob/master/torchocr/utils/logging.py
+"""
import os
import sys
diff --git a/test_tipc/compare_results.py b/test_tipc/compare_results.py
index 35af38809fe7d564707d0d538f7d0159cb6edfbd..e28410ed6cb26aab7557025c06b2541a7d27c2c1 100644
--- a/test_tipc/compare_results.py
+++ b/test_tipc/compare_results.py
@@ -32,6 +32,7 @@ def run_shell_command(cmd):
else:
return None
+
def parser_results_from_log_by_name(log_path, names_list):
if not os.path.exists(log_path):
raise ValueError("The log file {} does not exists!".format(log_path))
@@ -52,6 +53,7 @@ def parser_results_from_log_by_name(log_path, names_list):
parser_results[name] = result
return parser_results
+
def load_gt_from_file(gt_file):
if not os.path.exists(gt_file):
raise ValueError("The log file {} does not exists!".format(gt_file))
diff --git a/test_tipc/configs/amp_ppocr_det_mobile_params.txt b/test_tipc/configs/amp_ppocr_det_mobile_params.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1c9978753e663c7b466a55d70657f515c12df18b
--- /dev/null
+++ b/test_tipc/configs/amp_ppocr_det_mobile_params.txt
@@ -0,0 +1,110 @@
+===========================train_params===========================
+model_name:ocr_det
+python:python3.7
+gpu_list:0|0,1
+Global.use_gpu:True|True
+Global.auto_cast:amp
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
+null:null
+##
+trainer:norm_train|pact_train|fpgm_train
+norm_train:tools/train.py -c test_tipc/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
+pact_train:deploy/slim/quantization/quant.py -c test_tipc/configs/det_mv3_db.yml -o
+fpgm_train:deploy/slim/prune/sensitivity_anal.py -c test_tipc/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/det_mv3_db_v2.0_train/best_accuracy
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.pretrained_model:
+norm_export:tools/export_model.py -c test_tipc/configs/det_mv3_db.yml -o
+quant_export:deploy/slim/quantization/export_model.py -c test_tipc/configs/det_mv3_db.yml -o
+fpgm_export:deploy/slim/prune/export_prune_model.py -c test_tipc/configs/det_mv3_db.yml -o
+distill_export:null
+export1:null
+export2:null
+inference_dir:null
+train_model:./inference/ch_ppocr_mobile_v2.0_det_train/best_accuracy
+infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o
+infer_quant:False
+inference:tools/infer/predict_det.py
+--use_gpu:True|False
+--enable_mkldnn:True|False
+--cpu_threads:1|6
+--rec_batch_num:1
+--use_tensorrt:False|True
+--precision:fp32|fp16|int8
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+null:null
+===========================cpp_infer_params===========================
+use_opencv:True
+infer_model:./inference/ch_ppocr_mobile_v2.0_det_infer/
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr det
+--use_gpu:True|False
+--enable_mkldnn:True|False
+--cpu_threads:1|6
+--rec_batch_num:1
+--use_tensorrt:False|True
+--precision:fp32|fp16
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+===========================serving_params===========================
+model_name:ocr_det
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--dirname:./inference/ch_ppocr_mobile_v2.0_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--serving_server:./deploy/pdserving/ppocr_det_mobile_2.0_serving/
+--serving_client:./deploy/pdserving/ppocr_det_mobile_2.0_client/
+serving_dir:./deploy/pdserving
+web_service:web_service_det.py --config=config.yml --opt op.det.concurrency=1
+op.det.local_service_conf.devices:null|0
+op.det.local_service_conf.use_mkldnn:True|False
+op.det.local_service_conf.thread_num:1|6
+op.det.local_service_conf.use_trt:False|True
+op.det.local_service_conf.precision:fp32|fp16|int8
+pipline:pipeline_http_client.py --image_dir=../../doc/imgs
+===========================kl_quant_params===========================
+infer_model:./inference/ch_ppocr_mobile_v2.0_det_infer/
+infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o
+infer_quant:True
+inference:tools/infer/predict_det.py
+--use_gpu:True|False
+--enable_mkldnn:True|False
+--cpu_threads:1|6
+--rec_batch_num:1
+--use_tensorrt:False|True
+--precision:int8
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+null:null
+null:null
+===========================lite_params===========================
+inference:./ocr_db_crnn det
+infer_model:./models/ch_ppocr_mobile_v2.0_det_opt.nb|./models/ch_ppocr_mobile_v2.0_det_slim_opt.nb
+--cpu_threads:1|4
+--batch_size:1
+--power_mode:LITE_POWER_HIGH|LITE_POWER_LOW
+--image_dir:./test_data/icdar2015_lite/text_localization/ch4_test_images/|./test_data/icdar2015_lite/text_localization/ch4_test_images/img_233.jpg
+--config_dir:./config.txt
+--rec_dict_dir:./ppocr_keys_v1.txt
+--benchmark:True
diff --git a/test_tipc/configs/fleet_ppocr_det_mobile_params.txt b/test_tipc/configs/fleet_ppocr_det_mobile_params.txt
new file mode 100644
index 0000000000000000000000000000000000000000..99278845e43f1a56239b508e49c1670f5bc77922
--- /dev/null
+++ b/test_tipc/configs/fleet_ppocr_det_mobile_params.txt
@@ -0,0 +1,110 @@
+===========================train_params===========================
+model_name:ocr_det
+python:python3.7
+gpu_list:xx.xx.xx.xx,xx.xx.xx.xx;0,1
+Global.use_gpu:True|True
+Global.auto_cast:null|amp
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
+null:null
+##
+trainer:norm_train|pact_train|fpgm_train
+norm_train:tools/train.py -c test_tipc/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
+pact_train:deploy/slim/quantization/quant.py -c test_tipc/configs/det_mv3_db.yml -o
+fpgm_train:deploy/slim/prune/sensitivity_anal.py -c test_tipc/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/det_mv3_db_v2.0_train/best_accuracy
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.pretrained_model:
+norm_export:tools/export_model.py -c test_tipc/configs/det_mv3_db.yml -o
+quant_export:deploy/slim/quantization/export_model.py -c test_tipc/configs/det_mv3_db.yml -o
+fpgm_export:deploy/slim/prune/export_prune_model.py -c test_tipc/configs/det_mv3_db.yml -o
+distill_export:null
+export1:null
+export2:null
+inference_dir:null
+train_model:./inference/ch_ppocr_mobile_v2.0_det_train/best_accuracy
+infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o
+infer_quant:False
+inference:tools/infer/predict_det.py
+--use_gpu:True|False
+--enable_mkldnn:True|False
+--cpu_threads:1|6
+--rec_batch_num:1
+--use_tensorrt:False|True
+--precision:fp32|fp16|int8
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+null:null
+===========================cpp_infer_params===========================
+use_opencv:True
+infer_model:./inference/ch_ppocr_mobile_v2.0_det_infer/
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr det
+--use_gpu:True|False
+--enable_mkldnn:True|False
+--cpu_threads:1|6
+--rec_batch_num:1
+--use_tensorrt:False|True
+--precision:fp32|fp16
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+===========================serving_params===========================
+model_name:ocr_det
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--dirname:./inference/ch_ppocr_mobile_v2.0_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--serving_server:./deploy/pdserving/ppocr_det_mobile_2.0_serving/
+--serving_client:./deploy/pdserving/ppocr_det_mobile_2.0_client/
+serving_dir:./deploy/pdserving
+web_service:web_service_det.py --config=config.yml --opt op.det.concurrency=1
+op.det.local_service_conf.devices:null|0
+op.det.local_service_conf.use_mkldnn:True|False
+op.det.local_service_conf.thread_num:1|6
+op.det.local_service_conf.use_trt:False|True
+op.det.local_service_conf.precision:fp32|fp16|int8
+pipline:pipeline_http_client.py --image_dir=../../doc/imgs
+===========================kl_quant_params===========================
+infer_model:./inference/ch_ppocr_mobile_v2.0_det_infer/
+infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o
+infer_quant:True
+inference:tools/infer/predict_det.py
+--use_gpu:True|False
+--enable_mkldnn:True|False
+--cpu_threads:1|6
+--rec_batch_num:1
+--use_tensorrt:False|True
+--precision:int8
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+null:null
+null:null
+===========================lite_params===========================
+inference:./ocr_db_crnn det
+infer_model:./models/ch_ppocr_mobile_v2.0_det_opt.nb|./models/ch_ppocr_mobile_v2.0_det_slim_opt.nb
+--cpu_threads:1|4
+--batch_size:1
+--power_mode:LITE_POWER_HIGH|LITE_POWER_LOW
+--image_dir:./test_data/icdar2015_lite/text_localization/ch4_test_images/|./test_data/icdar2015_lite/text_localization/ch4_test_images/img_233.jpg
+--config_dir:./config.txt
+--rec_dict_dir:./ppocr_keys_v1.txt
+--benchmark:True
diff --git a/test_tipc/configs/mac_ppocr_det_mobile_params.txt b/test_tipc/configs/mac_ppocr_det_mobile_params.txt
new file mode 100644
index 0000000000000000000000000000000000000000..b0415c9a1f79837866812d1e545ad8fd09fb681d
--- /dev/null
+++ b/test_tipc/configs/mac_ppocr_det_mobile_params.txt
@@ -0,0 +1,100 @@
+===========================train_params===========================
+model_name:ocr_det
+python:python
+gpu_list:-1
+Global.use_gpu:False
+Global.auto_cast:null
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
+null:null
+##
+trainer:norm_train|pact_train|fpgm_train
+norm_train:tools/train.py -c test_tipc/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
+pact_train:deploy/slim/quantization/quant.py -c test_tipc/configs/det_mv3_db.yml -o
+fpgm_train:deploy/slim/prune/sensitivity_anal.py -c test_tipc/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/det_mv3_db_v2.0_train/best_accuracy
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.pretrained_model:
+norm_export:tools/export_model.py -c test_tipc/configs/det_mv3_db.yml -o
+quant_export:deploy/slim/quantization/export_model.py -c test_tipc/configs/det_mv3_db.yml -o
+fpgm_export:deploy/slim/prune/export_prune_model.py -c test_tipc/configs/det_mv3_db.yml -o
+distill_export:null
+export1:null
+export2:null
+inference_dir:null
+train_model:./inference/ch_ppocr_mobile_v2.0_det_train/best_accuracy
+infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o
+infer_quant:False
+inference:tools/infer/predict_det.py
+--use_gpu:False
+--enable_mkldnn:False
+--cpu_threads:1|6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+null:null
+===========================cpp_infer_params===========================
+use_opencv:True
+infer_model:./inference/ch_ppocr_mobile_v2.0_det_infer/
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr det
+--use_gpu:False
+--enable_mkldnn:False
+--cpu_threads:1|6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+===========================serving_params===========================
+model_name:ocr_det
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--dirname:./inference/ch_ppocr_mobile_v2.0_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--serving_server:./deploy/pdserving/ppocr_det_mobile_2.0_serving/
+--serving_client:./deploy/pdserving/ppocr_det_mobile_2.0_client/
+serving_dir:./deploy/pdserving
+web_service:web_service_det.py --config=config.yml --opt op.det.concurrency=1
+op.det.local_service_conf.devices:null|0
+op.det.local_service_conf.use_mkldnn:True|False
+op.det.local_service_conf.thread_num:1|6
+op.det.local_service_conf.use_trt:False|True
+op.det.local_service_conf.precision:fp32|fp16|int8
+pipline:pipeline_http_client.py --image_dir=../../doc/imgs
+===========================kl_quant_params===========================
+infer_model:./inference/ch_ppocr_mobile_v2.0_det_infer/
+infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o
+infer_quant:True
+inference:tools/infer/predict_det.py
+--use_gpu:False
+--enable_mkldnn:False
+--cpu_threads:1|6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:int8
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+null:null
+null:null
diff --git a/test_tipc/configs/ppocr_det_mobile_params.txt b/test_tipc/configs/ppocr_det_mobile_params.txt
index 62f8e5dd120181267367701f5881672c9b093611..d7e9cf95c2e9b4b2e18265e5f8b4a65cd6bdf518 100644
--- a/test_tipc/configs/ppocr_det_mobile_params.txt
+++ b/test_tipc/configs/ppocr_det_mobile_params.txt
@@ -1,21 +1,21 @@
===========================train_params===========================
model_name:ocr_det
python:python3.7
-gpu_list:0|0,1|10.21.226.181,10.21.226.133;0,1
-Global.use_gpu:True|True|True
-Global.auto_cast:fp32|amp
-Global.epoch_num:lite_train_infer=1|whole_train_infer=300
+gpu_list:0|0,1
+Global.use_gpu:True|True
+Global.auto_cast:null
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300
Global.save_model_dir:./output/
-Train.loader.batch_size_per_card:lite_train_infer=2|whole_train_infer=4
+Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
Global.pretrained_model:null
train_model_name:latest
train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
null:null
##
trainer:norm_train|pact_train|fpgm_train
-norm_train:tools/train.py -c tests/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
-pact_train:deploy/slim/quantization/quant.py -c tests/configs/det_mv3_db.yml -o
-fpgm_train:deploy/slim/prune/sensitivity_anal.py -c tests/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/det_mv3_db_v2.0_train/best_accuracy
+norm_train:tools/train.py -c test_tipc/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
+pact_train:deploy/slim/quantization/quant.py -c test_tipc/configs/det_mv3_db.yml -o
+fpgm_train:deploy/slim/prune/sensitivity_anal.py -c test_tipc/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/det_mv3_db_v2.0_train/best_accuracy
distill_train:null
null:null
null:null
@@ -27,13 +27,13 @@ null:null
===========================infer_params===========================
Global.save_inference_dir:./output/
Global.pretrained_model:
-norm_export:tools/export_model.py -c tests/configs/det_mv3_db.yml -o
-quant_export:deploy/slim/quantization/export_model.py -c tests/configs/det_mv3_db.yml -o
-fpgm_export:deploy/slim/prune/export_prune_model.py -c tests/configs/det_mv3_db.yml -o
+norm_export:tools/export_model.py -c test_tipc/configs/det_mv3_db.yml -o
+quant_export:deploy/slim/quantization/export_model.py -c test_tipc/configs/det_mv3_db.yml -o
+fpgm_export:deploy/slim/prune/export_prune_model.py -c test_tipc/configs/det_mv3_db.yml -o
distill_export:null
export1:null
export2:null
-##
+inference_dir:null
train_model:./inference/ch_ppocr_mobile_v2.0_det_train/best_accuracy
infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o
infer_quant:False
diff --git a/test_tipc/configs/ppocrv2_det_mobile_params.txt b/test_tipc/configs/ppocrv2_det_mobile_params.txt
new file mode 100644
index 0000000000000000000000000000000000000000..423cb979f0d35a62324958b36dd9d115211c19d4
--- /dev/null
+++ b/test_tipc/configs/ppocrv2_det_mobile_params.txt
@@ -0,0 +1,51 @@
+===========================train_params===========================
+model_name:PPOCRv2_ocr_det
+python:python3.7
+gpu_list:0|0,1
+Global.use_gpu:True|True
+Global.auto_cast:fp32
+Global.epoch_num:lite_train_infer=1|whole_train_infer=500
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_infer=2|whole_train_infer=4
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
+null:null
+##
+trainer:norm_train|pact_train
+norm_train:tools/train.py -c configs/det/ch_PP-OCRv2/ch_PP-OCR_det_cml.yml -o
+pact_train:deploy/slim/quantization/quant.py -c configs/det/ch_PP-OCRv2/ch_PP-OCR_det_cml.yml -o
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.pretrained_model:
+norm_export:tools/export_model.py -c configs/det/ch_PP-OCRv2/ch_PP-OCR_det_cml.yml -o
+quant_export:deploy/slim/quantization/export_model.py -c configs/det/ch_PP-OCRv2/ch_PP-OCR_det_cml.yml -o
+fpgm_export:
+distill_export:null
+export1:null
+export2:null
+inference_dir:Student
+infer_model:./inference/ch_PP-OCRv2_det_infer/
+infer_export:null
+infer_quant:False
+inference:tools/infer/predict_det.py
+--use_gpu:True|False
+--enable_mkldnn:True|False
+--cpu_threads:1|6
+--rec_batch_num:1
+--use_tensorrt:False|True
+--precision:fp32|fp16|int8
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+null:null
diff --git a/test_tipc/configs/win_ppocr_det_mobile_params.txt b/test_tipc/configs/win_ppocr_det_mobile_params.txt
new file mode 100644
index 0000000000000000000000000000000000000000..5a532ceb307fe87174dc6b46fbde236405f59ff5
--- /dev/null
+++ b/test_tipc/configs/win_ppocr_det_mobile_params.txt
@@ -0,0 +1,110 @@
+===========================train_params===========================
+model_name:ocr_det
+python:python
+gpu_list:0
+Global.use_gpu:True
+Global.auto_cast:fp32|amp
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
+null:null
+##
+trainer:norm_train|pact_train|fpgm_train
+norm_train:tools/train.py -c test_tipc/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
+pact_train:deploy/slim/quantization/quant.py -c test_tipc/configs/det_mv3_db.yml -o
+fpgm_train:deploy/slim/prune/sensitivity_anal.py -c test_tipc/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/det_mv3_db_v2.0_train/best_accuracy
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.pretrained_model:
+norm_export:tools/export_model.py -c test_tipc/configs/det_mv3_db.yml -o
+quant_export:deploy/slim/quantization/export_model.py -c test_tipc/configs/det_mv3_db.yml -o
+fpgm_export:deploy/slim/prune/export_prune_model.py -c test_tipc/configs/det_mv3_db.yml -o
+distill_export:null
+export1:null
+export2:null
+inference_dir:null
+train_model:./inference/ch_ppocr_mobile_v2.0_det_train/best_accuracy
+infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o
+infer_quant:False
+inference:tools/infer/predict_det.py
+--use_gpu:True|False
+--enable_mkldnn:True|False
+--cpu_threads:1|6
+--rec_batch_num:1
+--use_tensorrt:False|True
+--precision:fp32|fp16|int8
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+null:null
+===========================cpp_infer_params===========================
+use_opencv:True
+infer_model:./inference/ch_ppocr_mobile_v2.0_det_infer/
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr det
+--use_gpu:True|False
+--enable_mkldnn:True|False
+--cpu_threads:1|6
+--rec_batch_num:1
+--use_tensorrt:False|True
+--precision:fp32|fp16
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+===========================serving_params===========================
+model_name:ocr_det
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--dirname:./inference/ch_ppocr_mobile_v2.0_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--serving_server:./deploy/pdserving/ppocr_det_mobile_2.0_serving/
+--serving_client:./deploy/pdserving/ppocr_det_mobile_2.0_client/
+serving_dir:./deploy/pdserving
+web_service:web_service_det.py --config=config.yml --opt op.det.concurrency=1
+op.det.local_service_conf.devices:null|0
+op.det.local_service_conf.use_mkldnn:True|False
+op.det.local_service_conf.thread_num:1|6
+op.det.local_service_conf.use_trt:False|True
+op.det.local_service_conf.precision:fp32|fp16|int8
+pipline:pipeline_http_client.py --image_dir=../../doc/imgs
+===========================kl_quant_params===========================
+infer_model:./inference/ch_ppocr_mobile_v2.0_det_infer/
+infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o
+infer_quant:True
+inference:tools/infer/predict_det.py
+--use_gpu:True|False
+--enable_mkldnn:True|False
+--cpu_threads:1|6
+--rec_batch_num:1
+--use_tensorrt:False|True
+--precision:int8
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+null:null
+null:null
+===========================lite_params===========================
+inference:./ocr_db_crnn det
+infer_model:./models/ch_ppocr_mobile_v2.0_det_opt.nb|./models/ch_ppocr_mobile_v2.0_det_slim_opt.nb
+--cpu_threads:1|4
+--batch_size:1
+--power_mode:LITE_POWER_HIGH|LITE_POWER_LOW
+--image_dir:./test_data/icdar2015_lite/text_localization/ch4_test_images/|./test_data/icdar2015_lite/text_localization/ch4_test_images/img_233.jpg
+--config_dir:./config.txt
+--rec_dict_dir:./ppocr_keys_v1.txt
+--benchmark:True
diff --git a/test_tipc/docs/install.md b/test_tipc/docs/install.md
index 28b92426fa04da79ce63381fffa9f52a0f42813f..f17c264f3987c8cc2a756e045ebacb8fba5c277a 100644
--- a/test_tipc/docs/install.md
+++ b/test_tipc/docs/install.md
@@ -1,13 +1,15 @@
-
-## 环境配置
+## 1. 环境准备
本教程适用于PTDN目录下基础功能测试的运行环境搭建。
推荐环境:
-- CUDA 10.1
-- CUDNN 7.6
-- TensorRT 6.1.0.5 / 7.1
+- CUDA 10.1/10.2
+- CUDNN 7.6/cudnn8.1
+- TensorRT 6.1.0.5 / 7.1 / 7.2
+
+环境配置可以选择docker镜像安装,或者在本地环境Python搭建环境。推荐使用docker镜像安装,避免不必要的环境配置。
+## 2. Docker 镜像安装
推荐docker镜像安装,按照如下命令创建镜像,当前目录映射到镜像中的`/paddle`目录下
```
@@ -16,7 +18,79 @@ cd /paddle
# 安装带TRT的paddle
pip3.7 install https://paddle-wheel.bj.bcebos.com/with-trt/2.1.3/linux-gpu-cuda10.1-cudnn7-mkl-gcc8.2-trt6-avx/paddlepaddle_gpu-2.1.3.post101-cp37-cp37m-linux_x86_64.whl
+```
+
+## 3 Python 环境构建
+
+非docker环境下,环境配置比较灵活,推荐环境组合配置:
+- CUDA10.1 + CUDNN7.6 + TensorRT 6
+- CUDA10.2 + CUDNN8.1 + TensorRT 7
+- CUDA11.1 + CUDNN8.1 + TensorRT 7
+
+下面以 CUDA10.2 + CUDNN8.1 + TensorRT 7 配置为例,介绍环境配置的流程。
+
+### 3.1 安装CUDNN
+
+如果当前环境满足CUDNN版本的要求,可以跳过此步骤。
+
+以CUDNN8.1 安装安装为例,安装步骤如下,首先下载CUDNN,从[Nvidia官网](https://developer.nvidia.com/rdp/cudnn-archive)下载CUDNN8.1版本,下载符合当前系统版本的三个deb文件,分别是:
+- cuDNN Runtime Library ,如:libcudnn8_8.1.0.77-1+cuda10.2_amd64.deb
+- cuDNN Developer Library ,如:libcudnn8-dev_8.1.0.77-1+cuda10.2_amd64.deb
+- cuDNN Code Samples,如:libcudnn8-samples_8.1.0.77-1+cuda10.2_amd64.deb
+
+deb安装可以参考[官方文档](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#installlinux-deb),安装方式如下
+```
+# x.x.x表示下载的版本号
+# $HOME为工作目录
+sudo dpkg -i libcudnn8_x.x.x-1+cudax.x_arm64.deb
+sudo dpkg -i libcudnn8-dev_8.x.x.x-1+cudax.x_arm64.deb
+sudo dpkg -i libcudnn8-samples_8.x.x.x-1+cudax.x_arm64.deb
+
+# 验证是否正确安装
+cp -r /usr/src/cudnn_samples_v8/ $HOME
+cd $HOME/cudnn_samples_v8/mnistCUDNN
+
+# 编译
+make clean && make
+./mnistCUDNN
+```
+如果运行mnistCUDNN完后提示运行成功,则表示安装成功。如果运行后出现freeimage相关的报错,需要按照提示安装freeimage库:
+```
+sudo apt-get install libfreeimage-dev
+sudo apt-get install libfreeimage
+```
+
+### 3.2 安装TensorRT
+
+首先,从[Nvidia官网TensorRT板块](https://developer.nvidia.com/tensorrt-getting-started)下载TensorRT,这里选择7.1.3.4版本的TensorRT,注意选择适合自己系统版本和CUDA版本的TensorRT,另外建议下载TAR package的安装包。
+
+以Ubuntu16.04+CUDA10.2为例,下载并解压后可以参考[官方文档](https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-713/install-guide/index.html#installing-tar)的安装步骤,按照如下步骤安装:
+```
+# 以下安装命令中 '${version}' 为下载的TensorRT版本,如7.1.3.4
+# 设置环境变量, 为解压后的TensorRT的lib目录
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:
+
+# 安装TensorRT
+cd TensorRT-${version}/python
+pip3.7 install tensorrt-*-cp3x-none-linux_x86_64.whl
+
+# 安装graphsurgeon
+cd TensorRT-${version}/graphsurgeon
+```
+
+### 3.3 安装PaddlePaddle
+
+下载支持TensorRT版本的Paddle安装包,注意安装包的TensorRT版本需要与本地TensorRT一致,下载[链接](https://paddleinference.paddlepaddle.org.cn/user_guides/download_lib.html#python)
+选择下载 linux-cuda10.2-trt7-gcc8.2 Python3.7版本的Paddle:
+```
+# 从下载链接中可以看到是paddle2.1.1-cuda10.2-cudnn8.1版本
+wget https://paddle-wheel.bj.bcebos.com/with-trt/2.1.1-gpu-cuda10.2-cudnn8.1-mkl-gcc8.2/paddlepaddle_gpu-2.1.1-cp37-cp37m-linux_x86_64.whl
+pip3.7 install -U paddlepaddle_gpu-2.1.1-cp37-cp37m-linux_x86_64.whl
+```
+
+## 4. 安装PaddleOCR依赖
+```
# 安装AutoLog
git clone https://github.com/LDOUBLEV/AutoLog
cd AutoLog
@@ -24,7 +98,6 @@ pip3.7 install -r requirements.txt
python3.7 setup.py bdist_wheel
pip3.7 install ./dist/auto_log-1.0.0-py3-none-any.whl
-
# 下载OCR代码
cd ../
git clone https://github.com/PaddlePaddle/PaddleOCR
@@ -45,4 +118,4 @@ A. 问题一般是当前安装paddle版本带TRT,但是本地环境找不到Te
```
export LD_LIBRARY_PATH=/usr/local/python3.7.0/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/paddle/package/TensorRT-6.0.1.5/lib
```
-或者问题是下载的TensorRT版本和当前paddle中编译的TRT版本不匹配,需要下载版本相符的TRT。
+或者问题是下载的TensorRT版本和当前paddle中编译的TRT版本不匹配,需要下载版本相符的TensorRT重新安装。
diff --git a/test_tipc/docs/lite_auto_log.png b/test_tipc/docs/lite_auto_log.png
new file mode 100644
index 0000000000000000000000000000000000000000..cd9256db40232d689ea67a1bbef2b768c5f98753
Binary files /dev/null and b/test_tipc/docs/lite_auto_log.png differ
diff --git a/test_tipc/docs/lite_log.png b/test_tipc/docs/lite_log.png
new file mode 100644
index 0000000000000000000000000000000000000000..24ae5abc7167049ac879428e5e105a6e67d3c36d
Binary files /dev/null and b/test_tipc/docs/lite_log.png differ
diff --git a/test_tipc/docs/mac_test_train_inference_python.md b/test_tipc/docs/mac_test_train_inference_python.md
new file mode 100644
index 0000000000000000000000000000000000000000..7f088a489ff7988d97c7cb48973b13725b49a09f
--- /dev/null
+++ b/test_tipc/docs/mac_test_train_inference_python.md
@@ -0,0 +1,147 @@
+# Mac端基础训练预测功能测试
+
+Mac端基础训练预测功能测试的主程序为`test_train_inference_python.sh`,可以测试基于Python的模型CPU训练,包括裁剪、量化、蒸馏训练,以及评估、CPU推理等基本功能。
+
+注:Mac端测试用法同linux端测试方法类似,但是无需测试需要在GPU上运行的测试。
+
+## 1. 测试结论汇总
+
+- 训练相关:
+
+| 算法名称 | 模型名称 | 单机单卡(CPU) | 单机多卡 | 多机多卡 | 模型压缩(CPU) |
+| :---- | :---- | :---- | :---- | :---- | :---- |
+| DB | ch_ppocr_mobile_v2.0_det| 正常训练 | - | - | 正常训练:FPGM裁剪、PACT量化
离线量化(无需训练) |
+
+
+- 预测相关:基于训练是否使用量化,可以将训练产出的模型可以分为`正常模型`和`量化模型`,这两类模型对应的预测功能汇总如下,
+
+| 模型类型 |device | batchsize | tensorrt | mkldnn | cpu多线程 |
+| ---- | ---- | ---- | :----: | :----: | :----: |
+| 正常模型 | CPU | 1/6 | - | fp32 | 支持 |
+| 量化模型 | CPU | 1/6 | - | int8 | 支持 |
+
+
+## 2. 测试流程
+
+Mac端无GPU,环境准备只需要Python环境即可,安装PaddlePaddle等依赖参考下述文档。
+
+### 2.1 安装依赖
+- 安装PaddlePaddle >= 2.0
+- 安装PaddleOCR依赖
+ ```
+ pip install -r ../requirements.txt
+ ```
+- 安装autolog(规范化日志输出工具)
+ ```
+ git clone https://github.com/LDOUBLEV/AutoLog
+ cd AutoLog
+ pip install -r requirements.txt
+ python setup.py bdist_wheel
+ pip install ./dist/auto_log-1.0.0-py3-none-any.whl
+ cd ../
+ ```
+- 安装PaddleSlim (可选)
+ ```
+ # 如果要测试量化、裁剪等功能,需要安装PaddleSlim
+ pip install paddleslim
+ ```
+
+
+### 2.2 功能测试
+
+先运行`prepare.sh`准备数据和模型,然后运行`test_train_inference_python.sh`进行测试,最终在```test_tipc/output```目录下生成`python_infer_*.log`格式的日志文件。
+
+`test_train_inference_python.sh`包含5种运行模式,每种模式的运行数据不同,分别用于测试速度和精度,分别是:
+
+- 模式1:lite_train_lite_infer,使用少量数据训练,用于快速验证训练到预测的走通流程,不验证精度和速度;
+```shell
+# 同linux端运行不同的是,Mac端测试使用新的配置文件mac_ppocr_det_mobile_params.txt,
+# 配置文件中默认去掉了GPU和mkldnn相关的测试链条
+bash test_tipc/prepare.sh ./test_tipc/configs/mac_ppocr_det_mobile_params.txt 'lite_train_lite_infer'
+bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/mac_ppocr_det_mobile_params.txt 'lite_train_lite_infer'
+```
+
+- 模式2:lite_train_whole_infer,使用少量数据训练,一定量数据预测,用于验证训练后的模型执行预测,预测速度是否合理;
+```shell
+bash test_tipc/prepare.sh ./test_tipc/configs/mac_ppocr_det_mobile_params.txt 'lite_train_whole_infer'
+bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/mac_ppocr_det_mobile_params.txt 'lite_train_whole_infer'
+```
+
+- 模式3:whole_infer,不训练,全量数据预测,走通开源模型评估、动转静,检查inference model预测时间和精度;
+```shell
+bash test_tipc/prepare.sh ./test_tipc/configs/mac_ppocr_det_mobile_params.txt 'whole_infer'
+# 用法1:
+bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/mac_ppocr_det_mobile_params.txt 'whole_infer'
+# 用法2: 指定GPU卡预测,第三个传入参数为GPU卡号
+bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/mac_ppocr_det_mobile_params.txt 'whole_infer' '1'
+```
+
+- 模式4:whole_train_whole_infer,CE: 全量数据训练,全量数据预测,验证模型训练精度,预测精度,预测速度;(Mac端不建议运行此模式)
+```shell
+bash test_tipc/prepare.sh ./test_tipc/configs/mac_ppocr_det_mobile_params.txt 'whole_train_whole_infer'
+bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/mac_ppocr_det_mobile_params.txt 'whole_train_whole_infer'
+```
+
+- 模式5:klquant_whole_infer,测试离线量化;
+```shell
+bash test_tipc/prepare.sh ./test_tipc/configs/mac_ppocr_det_mobile_params.txt 'klquant_whole_infer'
+bash test_tipc/test_train_inference_python.sh test_tipc/configs/mac_ppocr_det_mobile_params.txt 'klquant_whole_infer'
+```
+
+运行相应指令后,在`test_tipc/output`文件夹下自动会保存运行日志。如`lite_train_lite_infer`模式下,会运行训练+inference的链条,因此,在`test_tipc/output`文件夹有以下文件:
+```
+test_tipc/output/
+|- results_python.log # 运行指令状态的日志
+|- norm_train_gpus_-1_autocast_null/ # CPU上正常训练的训练日志和模型保存文件夹
+|- pact_train_gpus_-1_autocast_null/ # CPU上量化训练的训练日志和模型保存文件夹
+......
+|- python_infer_cpu_usemkldnn_False_threads_1_batchsize_1.log # CPU上关闭Mkldnn线程数设置为1,测试batch_size=1条件下的预测运行日志
+......
+```
+
+其中`results_python.log`中包含了每条指令的运行状态,如果运行成功会输出:
+```
+Run successfully with command - python3.7 tools/train.py -c tests/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained Global.use_gpu=False Global.save_model_dir=./tests/output/norm_train_gpus_-1_autocast_null Global.epoch_num=1 Train.loader.batch_size_per_card=2 !
+Run successfully with command - python3.7 tools/export_model.py -c tests/configs/det_mv3_db.yml -o Global.pretrained_model=./tests/output/norm_train_gpus_-1_autocast_null/latest Global.save_inference_dir=./tests/output/norm_train_gpus_-1_autocast_null!
+......
+```
+如果运行失败,会输出:
+```
+Run failed with command - python3.7 tools/train.py -c tests/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained Global.use_gpu=Faslse Global.save_model_dir=./tests/output/norm_train_gpus_-1_autocast_null Global.epoch_num=1 Train.loader.batch_size_per_card=2 !
+Run failed with command - python3.7 tools/export_model.py -c tests/configs/det_mv3_db.yml -o Global.pretrained_model=./tests/output/norm_train_gpus_0_autocast_null/latest Global.save_inference_dir=./tests/output/norm_train_gpus_-1_autocast_null!
+......
+```
+可以很方便的根据`results_python.log`中的内容判定哪一个指令运行错误。
+
+### 2.3 精度测试
+
+使用compare_results.py脚本比较模型预测的结果是否符合预期,主要步骤包括:
+- 提取日志中的预测坐标;
+- 从本地文件中提取保存好的坐标结果;
+- 比较上述两个结果是否符合精度预期,误差大于设置阈值时会报错。
+
+#### 使用方式
+运行命令:
+```shell
+python test_tipc/compare_results.py --gt_file=./test_tipc/results/python_*.txt --log_file=./test_tipc/output/python_*.log --atol=1e-3 --rtol=1e-3
+```
+
+参数介绍:
+- gt_file: 指向事先保存好的预测结果路径,支持*.txt 结尾,会自动索引*.txt格式的文件,文件默认保存在test_tipc/result/ 文件夹下
+- log_file: 指向运行test_tipc/test_train_inference_python.sh 脚本的infer模式保存的预测日志,预测日志中打印的有预测结果,比如:文本框,预测文本,类别等等,同样支持python_infer_*.log格式传入
+- atol: 设置的绝对误差
+- rtol: 设置的相对误差
+
+#### 运行结果
+
+正常运行效果如下图:
+
+
+出现不一致结果时的运行输出:
+
+
+
+## 3. 更多教程
+本文档为功能测试用,更丰富的训练预测使用教程请参考:
+[模型训练](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/training.md)
+[基于Python预测引擎推理](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/inference.md)
diff --git a/test_tipc/docs/ssh_termux_ls.png b/test_tipc/docs/ssh_termux_ls.png
new file mode 100644
index 0000000000000000000000000000000000000000..2df78026b23b2bb71ac98092d7820e5d02ad611c
Binary files /dev/null and b/test_tipc/docs/ssh_termux_ls.png differ
diff --git a/test_tipc/docs/termux.jpg b/test_tipc/docs/termux.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..cb87c4ccc21bab6411b87f61e76f03b5b5f6f557
Binary files /dev/null and b/test_tipc/docs/termux.jpg differ
diff --git a/test_tipc/docs/termux_for_android.md b/test_tipc/docs/termux_for_android.md
new file mode 100644
index 0000000000000000000000000000000000000000..73ecbb2e93be5a8fdef593fb492d3a27d33c6b52
--- /dev/null
+++ b/test_tipc/docs/termux_for_android.md
@@ -0,0 +1,128 @@
+# 安卓手机通过Termux连接电脑
+
+由于通过adb方式连接手机后,很多linux命令无法运行,自动化测试受阻,所以此处特此介绍另外一种通过Termux的连接方式,不仅可以运行大部分linux命令,方便开发者在手机上在线调试,甚至还可以多实现台机器同时连接手机。Termux不是真实的Linux环境,但是Termux可以安装真实的Linux,而且不会损失性能,与此同时,Termux不需要root。在配置Termux之前,请确保电脑已经安装adb工具,安装方式请参考[Lite端部署](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/lite/readme.md) 。在运行以下命令后确保可以显示安卓设备信息。
+
+```
+adb devices
+```
+连接成功信息提示:
+```
+List of devices attached
+744be294 device
+```
+
+## 1.安卓手机安装termux app
+
+### 1.1 下载termux apk文件
+
+由于目前该app目前各大商城暂无,所以可以直接下载如下apk文件。
+
+打开电脑终端,执行以下命令:
+
+```
+wget http://10.12.121.133:8911/cuicheng01/fullchain/termux-v1.0.3.apk
+```
+
+### 1.2 安装termux到手机上
+
+在手机端的开发者模式下,允许USB调试,允许USB安装。在电脑终端,执行如下命令,将termux app安装到手机上:
+
+```
+adb install termux-v1.0.3.apk
+```
+
+此处需要手机端确认安装,点击确认。
+
+### 1.3 验证是否安装成功
+
+打开手机,检验termux是否安装成功,如果没有,重新执行1.2,如果有相应的app,点击进入,会有如下显示。
+
+
+
+接下来的配置环境需要在手机上此终端运行相关命令。
+
+## 2.手机端配置termux
+
+首先将手机联网,最好可以连接外网,部分的配置需要外网。打开Termux终端,执行以下命令安装基础件`proot`,并使用`termux-chroot`命令可以模拟 root 环境与标准的 Linux 目录结构。
+
+```
+pkg i -y proot
+termux-chroot
+```
+
+Termux 默认只能访问自身内部的数据,如果要访问手机中其它的数据,输入下面的命令后,手机弹出对请求权限的窗口,允许即可(方便对部分运行出的结果在手机端可视化)。
+
+```
+termux-setup-storage
+```
+
+### 2.1 配置SSH
+
+作为 Linux 终端或者服务器,必须有SSH。不管你是 SSH 连接到 Termux还是使用Termux去连其它主机,都需要先安装openssh。如果安装失败,请重复执行命令。
+
+```
+pkg i -y openssh
+```
+
+启动 SSH 服务端,默认端口号为8022
+
+```
+sshd
+```
+
+
+### 2.2 电脑通过SSH方式连接手机
+
+1.保证手机和电脑处于同一局域网下
+手机端分别输入以下命令获得ip地址和当前用户:
+
+```
+# 获取ip地址
+ifconfig
+
+# 获取当前用户
+whoami
+```
+
+如获取到的ip地址和当前用户分别是`172.24.162.117`和`u0_a374`。
+
+2.电脑端通过SSH连接手机
+
+```
+#默认端口号为8022
+ssh u0_a374@172.24.162.117 -p 8022
+```
+
+3.运行ls命令后,会有如下显示:
+
+```
+ls
+```
+
+
+
+
+### 2.3 通过scp传输数据
+
+1.在当前目录上新建test目录
+
+```
+mkdir test
+```
+
+2.测试scp功能
+
+将电脑中的某个文件拷贝到手机上:
+```
+scp -P 8022 test.txt u0_a374@172.24.162.117:/home/storage/test
+```
+
+3.手机端查看
+
+打开手机终端,在`/home/storage/test`下查看是否存在`test.txt`
+
+
+## 3. 更多教程
+
+本教程可以完成Termux基本配置,更多关于Termux的用法,请参考:[Termux高级终端安装使用配置教程](https://www.sqlsec.com/2018/05/termux.html)。
+
diff --git a/test_tipc/docs/test_inference_cpp.md b/test_tipc/docs/test_inference_cpp.md
index 24655d96ba1acaadd489019ec260999c981107de..cd757a895bb957e498fda61cf52d2132d660ca8f 100644
--- a/test_tipc/docs/test_inference_cpp.md
+++ b/test_tipc/docs/test_inference_cpp.md
@@ -14,6 +14,8 @@ C++预测功能测试的主程序为`test_inference_cpp.sh`,可以测试基于
| 量化模型 | CPU | 1/6 | - | int8 | 支持 |
## 2. 测试流程
+运行环境配置请参考[文档](./install.md)的内容配置TIPC的运行环境。
+
### 2.1 功能测试
先运行`prepare.sh`准备数据和模型,然后运行`test_inference_cpp.sh`进行测试,最终在```test_tipc/output```目录下生成`cpp_infer_*.log`后缀的日志文件。
@@ -26,6 +28,32 @@ bash test_tipc/test_inference_cpp.sh ./test_tipc/configs/ppocr_det_mobile_params
bash test_tipc/test_inference_cpp.sh ./test_tipc/configs/ppocr_det_mobile_params.txt '1'
```
+运行预测指令后,在`test_tipc/output`文件夹下自动会保存运行日志,包括以下文件:
+
+```shell
+test_tipc/output/
+|- results_cpp.log # 运行指令状态的日志
+|- cpp_infer_cpu_usemkldnn_False_threads_1_precision_fp32_batchsize_1.log # CPU上不开启Mkldnn,线程数设置为1,测试batch_size=1条件下的预测运行日志
+|- cpp_infer_cpu_usemkldnn_False_threads_6_precision_fp32_batchsize_1.log # CPU上不开启Mkldnn,线程数设置为6,测试batch_size=1条件下的预测运行日志
+|- cpp_infer_gpu_usetrt_False_precision_fp32_batchsize_1.log # GPU上不开启TensorRT,测试batch_size=1的fp32精度预测日志
+|- cpp_infer_gpu_usetrt_True_precision_fp16_batchsize_1.log # GPU上开启TensorRT,测试batch_size=1的fp16精度预测日志
+......
+```
+其中results_cpp.log中包含了每条指令的运行状态,如果运行成功会输出:
+
+```
+Run successfully with command - ./deploy/cpp_infer/build/ppocr det --use_gpu=False --enable_mkldnn=False --cpu_threads=6 --det_model_dir=./inference/ch_ppocr_mobile_v2.0_det_infer/ --rec_batch_num=1 --image_dir=./inference/ch_det_data_50/all-sum-510/ --benchmar k=True > ./test_tipc/output/cpp_infer_cpu_usemkldnn_False_threads_6_precision_fp32_batchsize_1.log 2>&1 !
+Run successfully with command - ./deploy/cpp_infer/build/ppocr det --use_gpu=True --use_tensorrt=False --precision=fp32 --det_model_dir=./inference/ch_ppocr_mobile_v2.0_det_infer/ --rec_batch_num=1 --image_dir=./inference/ch_det_data_50/all-sum-510/ --benchmark =True > ./test_tipc/output/cpp_infer_gpu_usetrt_False_precision_fp32_batchsize_1.log 2>&1 !
+......
+```
+如果运行失败,会输出:
+```
+Run failed with command - ./deploy/cpp_infer/build/ppocr det --use_gpu=True --use_tensorrt=True --precision=fp32 --det_model_dir=./inference/ch_ppocr_mobile_v2.0_det_infer/ --rec_batch_num=1 --image_dir=./inference/ch_det_data_50/all-sum-510/ --benchmark=True > ./test_tipc/output/cpp_infer_gpu_usetrt_True_precision_fp32_batchsize_1.log 2>&1 !
+Run failed with command - ./deploy/cpp_infer/build/ppocr det --use_gpu=True --use_tensorrt=True --precision=fp16 --det_model_dir=./inference/ch_ppocr_mobile_v2.0_det_infer/ --rec_batch_num=1 --image_dir=./inference/ch_det_data_50/all-sum-510/ --benchmark=True > ./test_tipc/output/cpp_infer_gpu_usetrt_True_precision_fp16_batchsize_1.log 2>&1 !
+......
+```
+可以很方便的根据results_cpp.log中的内容判定哪一个指令运行错误。
+
### 2.2 精度测试
diff --git a/test_tipc/docs/test_lite.md b/test_tipc/docs/test_lite.md
new file mode 100644
index 0000000000000000000000000000000000000000..01ae0cb4b471f1219f88ffa9e2c11d50765233d3
--- /dev/null
+++ b/test_tipc/docs/test_lite.md
@@ -0,0 +1,72 @@
+# Lite预测功能测试
+
+Lite预测功能测试的主程序为`test_lite.sh`,可以测试基于Lite预测库的模型推理功能。
+
+## 1. 测试结论汇总
+
+目前Lite端的样本间支持以方式的组合:
+
+**字段说明:**
+- 输入设置:包括C++预测、python预测、java预测
+- 模型类型:包括正常模型(FP32)和量化模型(FP16)
+- batch-size:包括1和4
+- predictor数量:包括多predictor预测和单predictor预测
+- 功耗模式:包括高性能模式(LITE_POWER_HIGH)和省电模式(LITE_POWER_LOW)
+- 预测库来源:包括下载方式和编译方式,其中编译方式分为以下目标硬件:(1)ARM CPU;(2)Linux XPU;(3)OpenCL GPU;(4)Metal GPU
+
+| 模型类型 | batch-size | predictor数量 | 功耗模式 | 预测库来源 | 支持语言 |
+| :----: | :----: | :----: | :----: | :----: | :----: |
+| 正常模型/量化模型 | 1 | 1 | 高性能模式/省电模式 | 下载方式 | C++预测 |
+
+
+## 2. 测试流程
+运行环境配置请参考[文档](./install.md)的内容配置TIPC的运行环境。
+
+### 2.1 功能测试
+
+先运行`prepare.sh`准备数据和模型,模型和数据会打包到test_lite.tar中,将test_lite.tar上传到手机上,解压后进`入test_lite`目录中,然后运行`test_lite.sh`进行测试,最终在`test_lite/output`目录下生成`lite_*.log`后缀的日志文件。
+
+```shell
+
+# 数据和模型准备
+bash test_tipc/prepare.sh ./test_tipc/configs/ppocr_det_mobile_params.txt "lite_infer"
+
+# 手机端测试:
+bash test_lite.sh ppocr_det_mobile_params.txt
+
+```
+
+**注意**:由于运行该项目需要bash等命令,传统的adb方式不能很好的安装。所以此处推荐通在手机上开启虚拟终端的方式连接电脑,连接方式可以参考[安卓手机termux连接电脑](./termux_for_android.md)。
+
+#### 运行结果
+
+各测试的运行情况会打印在 `./output/` 中:
+运行成功时会输出:
+
+```
+Run successfully with command - ./ocr_db_crnn det ./models/ch_ppocr_mobile_v2.0_det_slim_opt.nb INT8 4 1 LITE_POWER_LOW ./test_data/icdar2015_lite/text_localization/ch4_test_images/img_233.jpg ./config.txt True > ./output/lite_ch_ppocr_mobile_v2.0_det_slim_opt.nb_precision_INT8_batchsize_1_threads_4_powermode_LITE_POWER_LOW_singleimg_True.log 2>&1!
+Run successfully with command xxx
+...
+```
+
+运行失败时会输出:
+
+```
+Run failed with command - ./ocr_db_crnn det ./models/ch_ppocr_mobile_v2.0_det_slim_opt.nb INT8 4 1 LITE_POWER_LOW ./test_data/icdar2015_lite/text_localization/ch4_test_images/img_233.jpg ./config.txt True > ./output/lite_ch_ppocr_mobile_v2.0_det_slim_opt.nb_precision_INT8_batchsize_1_threads_4_powermode_LITE_POWER_LOW_singleimg_True.log 2>&1!
+Run failed with command xxx
+...
+```
+
+在./output/文件夹下,会存在如下日志,每一个日志都是不同配置下的log结果:
+
+
+
+在每一个log中,都会调用autolog打印如下信息:
+
+
+
+
+
+## 3. 更多教程
+
+本文档为功能测试用,更详细的Lite端预测使用教程请参考:[Lite端部署](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/lite/readme.md)。
diff --git a/test_tipc/docs/test_serving.md b/test_tipc/docs/test_serving.md
index 6028f82a460b20c9b76743840919fed1679bc9c5..f63d6c7107ce92807c53d81a22a582b09178a712 100644
--- a/test_tipc/docs/test_serving.md
+++ b/test_tipc/docs/test_serving.md
@@ -14,6 +14,8 @@ PaddleServing预测功能测试的主程序为`test_serving.sh`,可以测试
| 量化模型 | CPU | 1/6 | - | int8 | 支持 |
## 2. 测试流程
+运行环境配置请参考[文档](./install.md)的内容配置TIPC的运行环境。
+
### 2.1 功能测试
先运行`prepare.sh`准备数据和模型,然后运行`test_serving.sh`进行测试,最终在```test_tipc/output```目录下生成`serving_infer_*.log`后缀的日志文件。
diff --git a/test_tipc/docs/test_train_inference_python.md b/test_tipc/docs/test_train_inference_python.md
index fa14863fdad02dcb9b69f45494cc18b24ceaf36f..9028e67d093112d23cc7c5d9da10d185f1db9b5b 100644
--- a/test_tipc/docs/test_train_inference_python.md
+++ b/test_tipc/docs/test_train_inference_python.md
@@ -1,6 +1,9 @@
-# 基础训练预测功能测试
+# Linux端基础训练预测功能测试
-基础训练预测功能测试的主程序为`test_train_inference_python.sh`,可以测试基于Python的模型训练、评估、推理等基本功能,包括裁剪、量化、蒸馏。
+Linux端基础训练预测功能测试的主程序为`test_train_inference_python.sh`,可以测试基于Python的模型训练、评估、推理等基本功能,包括裁剪、量化、蒸馏。
+
+- Mac端基础训练预测功能测试参考[链接](./mac_test_train_inference_python.md)
+- Windows端基础训练预测功能测试参考[链接](./win_test_train_inference_python.md)
## 1. 测试结论汇总
@@ -22,12 +25,15 @@
| 模型类型 |device | batchsize | tensorrt | mkldnn | cpu多线程 |
| ---- | ---- | ---- | :----: | :----: | :----: |
| 正常模型 | GPU | 1/6 | fp32/fp16 | - | - |
-| 正常模型 | CPU | 1/6 | - | fp32 | 支持 |
+| 正常模型 | CPU | 1/6 | - | fp32/fp16 | 支持 |
| 量化模型 | GPU | 1/6 | int8 | - | - |
| 量化模型 | CPU | 1/6 | - | int8 | 支持 |
## 2. 测试流程
+
+运行环境配置请参考[文档](./install.md)的内容配置TIPC的运行环境。
+
### 2.1 安装依赖
- 安装PaddlePaddle >= 2.0
- 安装PaddleOCR依赖
@@ -43,6 +49,11 @@
pip3 install ./dist/auto_log-1.0.0-py3-none-any.whl
cd ../
```
+- 安装PaddleSlim (可选)
+ ```
+ # 如果要测试量化、裁剪等功能,需要安装PaddleSlim
+ pip3 install paddleslim
+ ```
### 2.2 功能测试
@@ -51,38 +62,64 @@
`test_train_inference_python.sh`包含5种运行模式,每种模式的运行数据不同,分别用于测试速度和精度,分别是:
-- 模式1:lite_train_infer,使用少量数据训练,用于快速验证训练到预测的走通流程,不验证精度和速度;
+- 模式1:lite_train_lite_infer,使用少量数据训练,用于快速验证训练到预测的走通流程,不验证精度和速度;
```shell
-bash test_tipc/prepare.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'lite_train_infer'
-bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'lite_train_infer'
+bash test_tipc/prepare.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'lite_train_lite_infer'
+bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'lite_train_lite_infer'
```
-- 模式2:whole_infer,使用少量数据训练,一定量数据预测,用于验证训练后的模型执行预测,预测速度是否合理;
+- 模式2:lite_train_whole_infer,使用少量数据训练,一定量数据预测,用于验证训练后的模型执行预测,预测速度是否合理;
```shell
-bash test_tipc/prepare.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'whole_infer'
-bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'whole_infer'
+bash test_tipc/prepare.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'lite_train_whole_infer'
+bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'lite_train_whole_infer'
```
-- 模式3:infer,不训练,全量数据预测,走通开源模型评估、动转静,检查inference model预测时间和精度;
+- 模式3:whole_infer,不训练,全量数据预测,走通开源模型评估、动转静,检查inference model预测时间和精度;
```shell
-bash test_tipc/prepare.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'infer'
+bash test_tipc/prepare.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'whole_infer'
# 用法1:
-bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'infer'
+bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'whole_infer'
# 用法2: 指定GPU卡预测,第三个传入参数为GPU卡号
-bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'infer' '1'
+bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'whole_infer' '1'
```
-- 模式4:whole_train_infer,CE: 全量数据训练,全量数据预测,验证模型训练精度,预测精度,预测速度;
+- 模式4:whole_train_whole_infer,CE: 全量数据训练,全量数据预测,验证模型训练精度,预测精度,预测速度;
```shell
-bash test_tipc/prepare.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'whole_train_infer'
-bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'whole_train_infer'
+bash test_tipc/prepare.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'whole_train_whole_infer'
+bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'whole_train_whole_infer'
```
-- 模式5:klquant_infer,测试离线量化;
+- 模式5:klquant_whole_infer,测试离线量化;
```shell
-bash test_tipc/prepare.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'klquant_infer'
-bash test_tipc/test_train_inference_python.sh test_tipc/configs/ppocr_det_mobile_params.txt 'klquant_infer'
+bash test_tipc/prepare.sh ./test_tipc/configs/ppocr_det_mobile_params.txt 'klquant_whole_infer'
+bash test_tipc/test_train_inference_python.sh test_tipc/configs/ppocr_det_mobile_params.txt 'klquant_whole_infer'
+```
+
+运行相应指令后,在`test_tipc/output`文件夹下自动会保存运行日志。如'lite_train_lite_infer'模式下,会运行训练+inference的链条,因此,在`test_tipc/output`文件夹有以下文件:
+```
+test_tipc/output/
+|- results_python.log # 运行指令状态的日志
+|- norm_train_gpus_0_autocast_null/ # GPU 0号卡上正常训练的训练日志和模型保存文件夹
+|- pact_train_gpus_0_autocast_null/ # GPU 0号卡上量化训练的训练日志和模型保存文件夹
+......
+|- python_infer_cpu_usemkldnn_True_threads_1_batchsize_1.log # CPU上开启Mkldnn线程数设置为1,测试batch_size=1条件下的预测运行日志
+|- python_infer_gpu_usetrt_True_precision_fp16_batchsize_1.log # GPU上开启TensorRT,测试batch_size=1的半精度预测日志
+......
+```
+
+其中`results_python.log`中包含了每条指令的运行状态,如果运行成功会输出:
+```
+Run successfully with command - python3.7 tools/train.py -c tests/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained Global.use_gpu=True Global.save_model_dir=./tests/output/norm_train_gpus_0_autocast_null Global.epoch_num=1 Train.loader.batch_size_per_card=2 !
+Run successfully with command - python3.7 tools/export_model.py -c tests/configs/det_mv3_db.yml -o Global.pretrained_model=./tests/output/norm_train_gpus_0_autocast_null/latest Global.save_inference_dir=./tests/output/norm_train_gpus_0_autocast_null!
+......
+```
+如果运行失败,会输出:
+```
+Run failed with command - python3.7 tools/train.py -c tests/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained Global.use_gpu=True Global.save_model_dir=./tests/output/norm_train_gpus_0_autocast_null Global.epoch_num=1 Train.loader.batch_size_per_card=2 !
+Run failed with command - python3.7 tools/export_model.py -c tests/configs/det_mv3_db.yml -o Global.pretrained_model=./tests/output/norm_train_gpus_0_autocast_null/latest Global.save_inference_dir=./tests/output/norm_train_gpus_0_autocast_null!
+......
```
+可以很方便的根据`results_python.log`中的内容判定哪一个指令运行错误。
### 2.3 精度测试
diff --git a/test_tipc/docs/win_test_train_inference_python.md b/test_tipc/docs/win_test_train_inference_python.md
new file mode 100644
index 0000000000000000000000000000000000000000..b1bb5470de36d14071ed86e29bebf29b104289dd
--- /dev/null
+++ b/test_tipc/docs/win_test_train_inference_python.md
@@ -0,0 +1,151 @@
+# Windows端基础训练预测功能测试
+
+Windows端基础训练预测功能测试的主程序为`test_train_inference_python.sh`,可以测试基于Python的模型训练、评估、推理等基本功能,包括裁剪、量化、蒸馏。
+
+## 1. 测试结论汇总
+
+- 训练相关:
+
+| 算法名称 | 模型名称 | 单机单卡 | 单机多卡 | 多机多卡 | 模型压缩(单机多卡) |
+| :---- | :---- | :---- | :---- | :---- | :---- |
+| DB | ch_ppocr_mobile_v2.0_det| 正常训练
混合精度 | - | - | 正常训练:FPGM裁剪、PACT量化
离线量化(无需训练) |
+
+
+- 预测相关:基于训练是否使用量化,可以将训练产出的模型可以分为`正常模型`和`量化模型`,这两类模型对应的预测功能汇总如下:
+
+| 模型类型 |device | batchsize | tensorrt | mkldnn | cpu多线程 |
+| ---- | ---- | ---- | :----: | :----: | :----: |
+| 正常模型 | GPU | 1/6 | fp32/fp16 | - | - |
+| 正常模型 | CPU | 1/6 | - | fp32/fp16 | 支持 |
+| 量化模型 | GPU | 1/6 | int8 | - | - |
+| 量化模型 | CPU | 1/6 | - | int8 | 支持 |
+
+
+## 2. 测试流程
+
+运行环境配置请参考[文档](./install.md)的内容配置TIPC的运行环境。
+
+另外,由于Windows上和linux的路径管理方式不同,可以在win上安装gitbash终端,在gitbash中执行指令的方式和在linux端执行指令方式相同,更方便tipc测试。gitbash[下载链接](https://git-scm.com/download/win)。
+
+
+### 2.1 安装依赖
+- 安装PaddlePaddle >= 2.0
+- 安装PaddleOCR依赖
+ ```
+ pip install -r ../requirements.txt
+ ```
+- 安装autolog(规范化日志输出工具)
+ ```
+ git clone https://github.com/LDOUBLEV/AutoLog
+ cd AutoLog
+ pip install -r requirements.txt
+ python setup.py bdist_wheel
+ pip install ./dist/auto_log-1.0.0-py3-none-any.whl
+ cd ../
+ ```
+- 安装PaddleSlim (可选)
+ ```
+ # 如果要测试量化、裁剪等功能,需要安装PaddleSlim
+ pip install paddleslim
+ ```
+
+
+### 2.2 功能测试
+先运行`prepare.sh`准备数据和模型,然后运行`test_train_inference_python.sh`进行测试,最终在```test_tipc/output```目录下生成`python_infer_*.log`格式的日志文件。
+
+
+`test_train_inference_python.sh`包含5种运行模式,每种模式的运行数据不同,分别用于测试速度和精度,分别是:
+
+- 模式1:lite_train_lite_infer,使用少量数据训练,用于快速验证训练到预测的走通流程,不验证精度和速度;
+```shell
+bash test_tipc/prepare.sh ./test_tipc/configs/win_ppocr_det_mobile_params.txt 'lite_train_lite_infer'
+bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/win_ppocr_det_mobile_params.txt 'lite_train_lite_infer'
+```
+
+- 模式2:lite_train_whole_infer,使用少量数据训练,一定量数据预测,用于验证训练后的模型执行预测,预测速度是否合理;
+```shell
+bash test_tipc/prepare.sh ./test_tipc/configs/win_ppocr_det_mobile_params.txt 'lite_train_whole_infer'
+bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/win_ppocr_det_mobile_params.txt 'lite_train_whole_infer'
+```
+
+- 模式3:whole_infer,不训练,全量数据预测,走通开源模型评估、动转静,检查inference model预测时间和精度;
+```shell
+bash test_tipc/prepare.sh ./test_tipc/configs/win_ppocr_det_mobile_params.txt 'whole_infer'
+# 用法1:
+bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/win_ppocr_det_mobile_params.txt 'whole_infer'
+# 用法2: 指定GPU卡预测,第三个传入参数为GPU卡号
+bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/win_ppocr_det_mobile_params.txt 'whole_infer' '1'
+```
+
+- 模式4:whole_train_whole_infer,CE: 全量数据训练,全量数据预测,验证模型训练精度,预测精度,预测速度;
+```shell
+bash test_tipc/prepare.sh ./test_tipc/configs/win_ppocr_det_mobile_params.txt 'whole_train_whole_infer'
+bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/win_ppocr_det_mobile_params.txt 'whole_train_whole_infer'
+```
+
+- 模式5:klquant_whole_infer,测试离线量化;
+```shell
+bash test_tipc/prepare.sh ./test_tipc/configs/win_ppocr_det_mobile_params.txt 'klquant_whole_infer'
+bash test_tipc/test_train_inference_python.sh test_tipc/configs/win_ppocr_det_mobile_params.txt 'klquant_whole_infer'
+```
+
+
+运行相应指令后,在`test_tipc/output`文件夹下自动会保存运行日志。如'lite_train_lite_infer'模式下,会运行训练+inference的链条,因此,在`test_tipc/output`文件夹有以下文件:
+```
+test_tipc/output/
+|- results_python.log # 运行指令状态的日志
+|- norm_train_gpus_0_autocast_null/ # GPU 0号卡上正常训练的训练日志和模型保存文件夹
+|- pact_train_gpus_0_autocast_null/ # GPU 0号卡上量化训练的训练日志和模型保存文件夹
+......
+|- python_infer_cpu_usemkldnn_True_threads_1_batchsize_1.log # CPU上开启Mkldnn线程数设置为1,测试batch_size=1条件下的预测运行日志
+|- python_infer_gpu_usetrt_True_precision_fp16_batchsize_1.log # GPU上开启TensorRT,测试batch_size=1的半精度预测日志
+......
+```
+
+其中`results_python.log`中包含了每条指令的运行状态,如果运行成功会输出:
+```
+Run successfully with command - python3.7 tools/train.py -c tests/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained Global.use_gpu=True Global.save_model_dir=./tests/output/norm_train_gpus_0_autocast_null Global.epoch_num=1 Train.loader.batch_size_per_card=2 !
+Run successfully with command - python3.7 tools/export_model.py -c tests/configs/det_mv3_db.yml -o Global.pretrained_model=./tests/output/norm_train_gpus_0_autocast_null/latest Global.save_inference_dir=./tests/output/norm_train_gpus_0_autocast_null!
+......
+```
+如果运行失败,会输出:
+```
+Run failed with command - python3.7 tools/train.py -c tests/configs/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained Global.use_gpu=True Global.save_model_dir=./tests/output/norm_train_gpus_0_autocast_null Global.epoch_num=1 Train.loader.batch_size_per_card=2 !
+Run failed with command - python3.7 tools/export_model.py -c tests/configs/det_mv3_db.yml -o Global.pretrained_model=./tests/output/norm_train_gpus_0_autocast_null/latest Global.save_inference_dir=./tests/output/norm_train_gpus_0_autocast_null!
+......
+```
+可以很方便的根据`results_python.log`中的内容判定哪一个指令运行错误。
+
+
+### 2.3 精度测试
+
+使用compare_results.py脚本比较模型预测的结果是否符合预期,主要步骤包括:
+- 提取日志中的预测坐标;
+- 从本地文件中提取保存好的坐标结果;
+- 比较上述两个结果是否符合精度预期,误差大于设置阈值时会报错。
+
+#### 使用方式
+运行命令:
+```shell
+python test_tipc/compare_results.py --gt_file=./test_tipc/results/python_*.txt --log_file=./test_tipc/output/python_*.log --atol=1e-3 --rtol=1e-3
+```
+
+参数介绍:
+- gt_file: 指向事先保存好的预测结果路径,支持*.txt 结尾,会自动索引*.txt格式的文件,文件默认保存在test_tipc/result/ 文件夹下
+- log_file: 指向运行test_tipc/test_train_inference_python.sh 脚本的infer模式保存的预测日志,预测日志中打印的有预测结果,比如:文本框,预测文本,类别等等,同样支持python_infer_*.log格式传入
+- atol: 设置的绝对误差
+- rtol: 设置的相对误差
+
+#### 运行结果
+
+正常运行效果如下图:
+
+
+出现不一致结果时的运行输出:
+
+
+
+## 3. 更多教程
+本文档为功能测试用,更丰富的训练预测使用教程请参考:
+[模型训练](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/training.md)
+[基于Python预测引擎推理](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/inference.md)
diff --git a/test_tipc/prepare.sh b/test_tipc/prepare.sh
index e4f48e044b63a1471ec18f46afd56067fd49bddc..f3ad242538a9471af237c804eae343da06e2b9dd 100644
--- a/test_tipc/prepare.sh
+++ b/test_tipc/prepare.sh
@@ -1,8 +1,9 @@
#!/bin/bash
FILENAME=$1
-# MODE be one of ['lite_train_infer' 'whole_infer' 'whole_train_infer', 'infer',
-# 'cpp_infer', 'serving_infer', 'klquant_infer', 'lite_infer']
+# MODE be one of ['lite_train_lite_infer' 'lite_train_whole_infer' 'whole_train_whole_infer',
+# 'whole_infer', 'klquant_whole_infer',
+# 'cpp_infer', 'serving_infer', 'lite_infer']
MODE=$2
@@ -31,94 +32,124 @@ model_name=$(func_parser_value "${lines[1]}")
trainer_list=$(func_parser_value "${lines[14]}")
-# MODE be one of ['lite_train_infer' 'whole_infer' 'whole_train_infer']
+# MODE be one of ['lite_train_lite_infer' 'lite_train_whole_infer' 'whole_train_whole_infer',
+# 'whole_infer', 'klquant_whole_infer',
+# 'cpp_infer', 'serving_infer', 'lite_infer']
MODE=$2
-if [ ${MODE} = "lite_train_infer" ];then
+if [ ${MODE} = "lite_train_lite_infer" ];then
# pretrain lite train data
- wget -nc -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV3_large_x0_5_pretrained.pdparams
- wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_db_v2.0_train.tar
+ wget -nc -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV3_large_x0_5_pretrained.pdparams --no-check-certificate
+ wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_db_v2.0_train.tar --no-check-certificate
+ if [ ${model_name} == "PPOCRv2_ocr_det" ]; then
+ wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar --no-check-certificate
+ cd ./pretrain_models/ && tar xf ch_PP-OCRv2_det_distill_train.tar && cd ../
+ fi
cd ./pretrain_models/ && tar xf det_mv3_db_v2.0_train.tar && cd ../
rm -rf ./train_data/icdar2015
rm -rf ./train_data/ic15_data
- wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015_lite.tar
- wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ic15_data.tar # todo change to bcebos
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar
- wget -nc -P ./deploy/slim/prune https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/sen.pickle
+ wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015_lite.tar --no-check-certificate
+ wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ic15_data.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
+ wget -nc -P ./deploy/slim/prune https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/sen.pickle --no-check-certificate
cd ./train_data/ && tar xf icdar2015_lite.tar && tar xf ic15_data.tar
ln -s ./icdar2015_lite ./icdar2015
cd ../
cd ./inference && tar xf rec_inference.tar && cd ../
-elif [ ${MODE} = "whole_train_infer" ];then
- wget -nc -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV3_large_x0_5_pretrained.pdparams
+elif [ ${MODE} = "whole_train_whole_infer" ];then
+ wget -nc -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV3_large_x0_5_pretrained.pdparams --no-check-certificate
rm -rf ./train_data/icdar2015
rm -rf ./train_data/ic15_data
- wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015.tar
- wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ic15_data.tar
+ wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015.tar --no-check-certificate
+ wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ic15_data.tar --no-check-certificate
cd ./train_data/ && tar xf icdar2015.tar && tar xf ic15_data.tar && cd ../
-elif [ ${MODE} = "whole_infer" ];then
- wget -nc -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV3_large_x0_5_pretrained.pdparams
+ if [ ${model_name} == "PPOCRv2_ocr_det" ]; then
+ wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar --no-check-certificate
+ cd ./pretrain_models/ && tar xf ch_PP-OCRv2_det_distill_train.tar && cd ../
+ fi
+elif [ ${MODE} = "lite_train_whole_infer" ];then
+ wget -nc -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV3_large_x0_5_pretrained.pdparams --no-check-certificate
rm -rf ./train_data/icdar2015
rm -rf ./train_data/ic15_data
- wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015_infer.tar
- wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ic15_data.tar
+ wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015_infer.tar --no-check-certificate
+ wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ic15_data.tar --no-check-certificate
cd ./train_data/ && tar xf icdar2015_infer.tar && tar xf ic15_data.tar
ln -s ./icdar2015_infer ./icdar2015
cd ../
-elif [ ${MODE} = "infer" ];then
+ if [ ${model_name} == "PPOCRv2_ocr_det" ]; then
+ wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar --no-check-certificate
+ cd ./pretrain_models/ && tar xf ch_PP-OCRv2_det_distill_train.tar && cd ../
+ fi
+elif [ ${MODE} = "whole_infer" ];then
if [ ${model_name} = "ocr_det" ]; then
eval_model_name="ch_ppocr_mobile_v2.0_det_train"
rm -rf ./train_data/icdar2015
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar --no-check-certificate
cd ./inference && tar xf ${eval_model_name}.tar && tar xf ch_det_data_50.tar && cd ../
elif [ ${model_name} = "ocr_server_det" ]; then
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
cd ./inference && tar xf ch_ppocr_server_v2.0_det_train.tar && tar xf ch_det_data_50.tar && cd ../
elif [ ${model_name} = "ocr_system_mobile" ]; then
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar --no-check-certificate
cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_det_data_50.tar && cd ../
elif [ ${model_name} = "ocr_system_server" ]; then
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar --no-check-certificate
cd ./inference && tar xf ch_ppocr_server_v2.0_det_infer.tar && tar xf ch_ppocr_server_v2.0_rec_infer.tar && tar xf ch_det_data_50.tar && cd ../
elif [ ${model_name} = "ocr_rec" ]; then
rm -rf ./train_data/ic15_data
eval_model_name="ch_ppocr_mobile_v2.0_rec_infer"
- wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar --no-check-certificate
cd ./inference && tar xf ${eval_model_name}.tar && tar xf rec_inference.tar && cd ../
elif [ ${model_name} = "ocr_server_rec" ]; then
rm -rf ./train_data/ic15_data
eval_model_name="ch_ppocr_server_v2.0_rec_infer"
- wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar --no-check-certificate
cd ./inference && tar xf ${eval_model_name}.tar && tar xf rec_inference.tar && cd ../
fi
-elif [ ${MODE} = "klquant_infer" ];then
+
+ elif [ ${model_name} = "PPOCRv2_ocr_det" ]; then
+ eval_model_name="ch_PP-OCRv2_det_infer"
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar --no-check-certificate
+ cd ./inference && tar xf ${eval_model_name}.tar && tar xf ch_det_data_50.tar && cd ../
+ fi
+
+if [ ${MODE} = "klquant_whole_infer" ]; then
if [ ${model_name} = "ocr_det" ]; then
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar
- cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_det_data_50.tar && cd ../
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_det_data_50.tar && cd ../
fi
-elif [ ${MODE} = "cpp_infer" ];then
+ if [ ${model_name} = "PPOCRv2_ocr_det" ]; then
+ eval_model_name="ch_PP-OCRv2_det_infer"
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar --no-check-certificate
+ cd ./inference && tar xf ${eval_model_name}.tar && tar xf ch_det_data_50.tar && cd ../
+ fi
+fi
+
+if [ ${MODE} = "cpp_infer" ];then
if [ ${model_name} = "ocr_det" ]; then
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar --no-check-certificate
cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_det_data_50.tar && cd ../
elif [ ${model_name} = "ocr_rec" ]; then
- wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar --no-check-certificate
cd ./inference && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf rec_inference.tar && cd ../
elif [ ${model_name} = "ocr_system" ]; then
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar --no-check-certificate
cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_det_data_50.tar && cd ../
fi
fi
@@ -150,7 +181,7 @@ if [ ${MODE} = "lite_infer" ];then
export https_proxy=http://172.19.57.45:3128
paddlelite_url=https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.9/inference_lite_lib.android.armv8.gcc.c++_shared.with_extra.with_cv.tar.gz
paddlelite_zipfile=$(echo $paddlelite_url | awk -F "/" '{print $NF}')
- paddlelite_file=inference_lite_lib.android.armv8.gcc.c++_shared.with_extra.with_cv
+ paddlelite_file=${paddlelite_zipfile:0:66}
wget ${paddlelite_url}
tar -xf ${paddlelite_zipfile}
mkdir -p ${paddlelite_file}/demo/cxx/ocr/test_lite
@@ -158,7 +189,7 @@ if [ ${MODE} = "lite_infer" ];then
cp ppocr/utils/ppocr_keys_v1.txt deploy/lite/config.txt ${paddlelite_file}/demo/cxx/ocr/test_lite
cp ./deploy/lite/* ${paddlelite_file}/demo/cxx/ocr/
cp ${paddlelite_file}/cxx/lib/libpaddle_light_api_shared.so ${paddlelite_file}/demo/cxx/ocr/test_lite
- cp PTDN/configs/ppocr_det_mobile_params.txt PTDN/test_lite.sh PTDN/common_func.sh ${paddlelite_file}/demo/cxx/ocr/test_lite
+ cp test_tipc/configs/ppocr_det_mobile_params.txt test_tipc/test_lite.sh test_tipc/common_func.sh ${paddlelite_file}/demo/cxx/ocr/test_lite
cd ${paddlelite_file}/demo/cxx/ocr/
git clone https://github.com/LDOUBLEV/AutoLog.git
unset http_proxy
diff --git a/test_tipc/readme.md b/test_tipc/readme.md
index e2bf6d9e9c927aee5eb802a68bde89f7bd2a641c..1d8df7da6cf6d1319cedd329e4202fa674e8538b 100644
--- a/test_tipc/readme.md
+++ b/test_tipc/readme.md
@@ -69,7 +69,7 @@ test_tipc/
├── ppocr_sys_server_params.txt # 测试server版ppocr检测+识别模型串联的参数配置文件
├── ppocr_det_server_params.txt # 测试server版ppocr检测模型的参数配置文件
├── ppocr_rec_server_params.txt # 测试server版ppocr识别模型的参数配置文件
- ├── ...
+ ├── ...
├── results/ # 预先保存的预测结果,用于和实际预测结果进行精读比对
├── python_ppocr_det_mobile_results_fp32.txt # 预存的mobile版ppocr检测模型python预测fp32精度的结果
├── python_ppocr_det_mobile_results_fp16.txt # 预存的mobile版ppocr检测模型python预测fp16精度的结果
diff --git a/test_tipc/test_inference_cpp.sh b/test_tipc/test_inference_cpp.sh
index 124bdacb7dad04bdea07a62ba9c86b248be5a06d..3f8b54b189349aa9c011a56f6f12752b771ce43e 100644
--- a/test_tipc/test_inference_cpp.sh
+++ b/test_tipc/test_inference_cpp.sh
@@ -1,5 +1,5 @@
#!/bin/bash
-source tests/common_func.sh
+source test_tipc/common_func.sh
FILENAME=$1
dataline=$(awk 'NR==52, NR==66{print}' $FILENAME)
@@ -35,7 +35,7 @@ cpp_benchmark_key=$(func_parser_key "${lines[14]}")
cpp_benchmark_value=$(func_parser_value "${lines[14]}")
-LOG_PATH="./tests/output"
+LOG_PATH="./test_tipc/output"
mkdir -p ${LOG_PATH}
status_log="${LOG_PATH}/results_cpp.log"
diff --git a/test_tipc/test_serving.sh b/test_tipc/test_serving.sh
index af66d70d7b0a255c33d1114a3951adb92407b8d1..be7b594c3848c423937c59336ce3bf686f8f228d 100644
--- a/test_tipc/test_serving.sh
+++ b/test_tipc/test_serving.sh
@@ -1,5 +1,5 @@
#!/bin/bash
-source PTDN/common_func.sh
+source test_tipc/common_func.sh
FILENAME=$1
dataline=$(awk 'NR==67, NR==83{print}' $FILENAME)
@@ -36,8 +36,8 @@ web_precision_key=$(func_parser_key "${lines[15]}")
web_precision_list=$(func_parser_value "${lines[15]}")
pipeline_py=$(func_parser_value "${lines[16]}")
-LOG_PATH="../../PTDN/output"
-mkdir -p ./PTDN/output
+LOG_PATH="../../test_tipc/output"
+mkdir -p ./test_tipc/output
status_log="${LOG_PATH}/results_serving.log"
function func_serving(){
diff --git a/test_tipc/test_train_inference_python.sh b/test_tipc/test_train_inference_python.sh
index 756e1f89d74c1df8de50cf8e23fd3d9c95bd20c5..eaeaf9684b1fed6738149d61d3697232e105a72f 100644
--- a/test_tipc/test_train_inference_python.sh
+++ b/test_tipc/test_train_inference_python.sh
@@ -1,8 +1,8 @@
#!/bin/bash
-source tests/common_func.sh
+source test_tipc/common_func.sh
FILENAME=$1
-# MODE be one of ['lite_train_infer' 'whole_infer' 'whole_train_infer', 'infer', 'klquant_infer']
+# MODE be one of ['lite_train_lite_infer' 'lite_train_whole_infer' 'whole_train_whole_infer', 'whole_infer', 'klquant_whole_infer']
MODE=$2
dataline=$(awk 'NR==1, NR==51{print}' $FILENAME)
@@ -59,6 +59,7 @@ export_key1=$(func_parser_key "${lines[33]}")
export_value1=$(func_parser_value "${lines[33]}")
export_key2=$(func_parser_key "${lines[34]}")
export_value2=$(func_parser_value "${lines[34]}")
+inference_dir=$(func_parser_value "${lines[35]}")
# parser inference model
infer_model_dir_list=$(func_parser_value "${lines[36]}")
@@ -88,7 +89,7 @@ infer_key1=$(func_parser_key "${lines[50]}")
infer_value1=$(func_parser_value "${lines[50]}")
# parser klquant_infer
-if [ ${MODE} = "klquant_infer" ]; then
+if [ ${MODE} = "klquant_whole_infer" ]; then
dataline=$(awk 'NR==82, NR==98{print}' $FILENAME)
lines=(${dataline})
# parser inference model
@@ -119,7 +120,7 @@ if [ ${MODE} = "klquant_infer" ]; then
infer_value1=$(func_parser_value "${lines[15]}")
fi
-LOG_PATH="./tests/output"
+LOG_PATH="./test_tipc/output"
mkdir -p ${LOG_PATH}
status_log="${LOG_PATH}/results_python.log"
@@ -202,7 +203,7 @@ function func_inference(){
done
}
-if [ ${MODE} = "infer" ] || [ ${MODE} = "klquant_infer" ]; then
+if [ ${MODE} = "whole_infer" ] || [ ${MODE} = "klquant_whole_infer" ]; then
GPUID=$3
if [ ${#GPUID} -le 0 ];then
env=" "
@@ -315,7 +316,7 @@ else
elif [ ${#ips} -le 26 ];then # train with multi-gpu
cmd="${python} -m paddle.distributed.launch --gpus=${gpu} ${run_train} ${set_use_gpu} ${set_save_model} ${set_epoch} ${set_pretrain} ${set_autocast} ${set_batchsize} ${set_train_params1} ${set_amp_config}"
else # train with multi-machine
- cmd="${python} -m paddle.distributed.launch --ips=${ips} --gpus=${gpu} ${set_use_gpu} ${run_train} ${set_save_model} ${set_pretrain} ${set_epoch} ${set_autocast} ${set_batchsize} ${set_train_params1} ${set_amp_config}"
+ cmd="${python} -m paddle.distributed.launch --ips=${ips} --gpus=${gpu} ${run_train} ${set_use_gpu} ${set_save_model} ${set_pretrain} ${set_epoch} ${set_autocast} ${set_batchsize} ${set_train_params1} ${set_amp_config}"
fi
# run train
eval "unset CUDA_VISIBLE_DEVICES"
@@ -347,7 +348,13 @@ else
#run inference
eval $env
save_infer_path="${save_log}"
- func_inference "${python}" "${inference_py}" "${save_infer_path}" "${LOG_PATH}" "${train_infer_img_dir}" "${flag_quant}"
+ if [ ${inference_dir} != "null" ] && [ ${inference_dir} != '##' ]; then
+ infer_model_dir="${save_infer_path}/${inference_dir}"
+ else
+ infer_model_dir=${save_infer_path}
+ fi
+ func_inference "${python}" "${inference_py}" "${infer_model_dir}" "${LOG_PATH}" "${train_infer_img_dir}" "${flag_quant}"
+
eval "unset CUDA_VISIBLE_DEVICES"
fi
done # done with: for trainer in ${trainer_list[*]}; do
diff --git a/tools/infer/predict_rec.py b/tools/infer/predict_rec.py
index 591885c384701784724980b06787c52171f8cc35..41982e3403b11dd4a1893f89af11a9201e0e15d7 100755
--- a/tools/infer/predict_rec.py
+++ b/tools/infer/predict_rec.py
@@ -107,7 +107,6 @@ class TextRecognizer(object):
return norm_img.astype(np.float32) / 128. - 1.
assert imgC == img.shape[2]
- max_wh_ratio = max(max_wh_ratio, imgW / imgH)
imgW = int((32 * max_wh_ratio))
if self.use_onnx:
imgW = 100
diff --git a/tools/program.py b/tools/program.py
index f94ad83c532183f5a6ff458cfd8c0bfa814d5784..d110f70704028948dff2bc889e07d128e0bc94ea 100755
--- a/tools/program.py
+++ b/tools/program.py
@@ -212,15 +212,15 @@ def train(config,
for epoch in range(start_epoch, epoch_num + 1):
train_dataloader = build_dataloader(
config, 'Train', device, logger, seed=epoch)
- train_batch_cost = 0.0
train_reader_cost = 0.0
- batch_sum = 0
- batch_start = time.time()
+ train_run_cost = 0.0
+ total_samples = 0
+ reader_start = time.time()
max_iter = len(train_dataloader) - 1 if platform.system(
) == "Windows" else len(train_dataloader)
for idx, batch in enumerate(train_dataloader):
profiler.add_profiler_step(profiler_options)
- train_reader_cost += time.time() - batch_start
+ train_reader_cost += time.time() - reader_start
if idx >= max_iter:
break
lr = optimizer.get_lr()
@@ -228,6 +228,7 @@ def train(config,
if use_srn:
model_average = True
+ train_start = time.time()
# use amp
if scaler:
with paddle.amp.auto_cast():
@@ -252,8 +253,8 @@ def train(config,
optimizer.step()
optimizer.clear_grad()
- train_batch_cost += time.time() - batch_start
- batch_sum += len(images)
+ train_run_cost += time.time() - train_start
+ total_samples += len(images)
if not isinstance(lr_scheduler, float):
lr_scheduler.step()
@@ -284,12 +285,13 @@ def train(config,
logs = train_stats.log()
strs = 'epoch: [{}/{}], iter: {}, {}, reader_cost: {:.5f} s, batch_cost: {:.5f} s, samples: {}, ips: {:.5f}'.format(
epoch, epoch_num, global_step, logs, train_reader_cost /
- print_batch_step, train_batch_cost / print_batch_step,
- batch_sum, batch_sum / train_batch_cost)
+ print_batch_step, (train_reader_cost + train_run_cost) /
+ print_batch_step, total_samples,
+ total_samples / (train_reader_cost + train_run_cost))
logger.info(strs)
- train_batch_cost = 0.0
train_reader_cost = 0.0
- batch_sum = 0
+ train_run_cost = 0.0
+ total_samples = 0
# eval
if global_step > start_eval_step and \
(global_step - start_eval_step) % eval_batch_step == 0 and dist.get_rank() == 0:
@@ -342,7 +344,7 @@ def train(config,
global_step)
global_step += 1
optimizer.clear_grad()
- batch_start = time.time()
+ reader_start = time.time()
if dist.get_rank() == 0:
save_model(
model,
@@ -383,7 +385,11 @@ def eval(model,
with paddle.no_grad():
total_frame = 0.0
total_time = 0.0
- pbar = tqdm(total=len(valid_dataloader), desc='eval model:')
+ pbar = tqdm(
+ total=len(valid_dataloader),
+ desc='eval model:',
+ position=0,
+ leave=True)
max_iter = len(valid_dataloader) - 1 if platform.system(
) == "Windows" else len(valid_dataloader)
for idx, batch in enumerate(valid_dataloader):
@@ -452,8 +458,6 @@ def get_center(model, eval_dataloader, post_process_class):
batch = [item.numpy() for item in batch]
# Obtain usable results from post-processing methods
- total_time += time.time() - start
- # Evaluate the results of the current batch
post_result = post_process_class(preds, batch[1])
#update char_center