diff --git a/README.md b/README.md index 9eb2175889875d68b3dd716ca52d811057824db8..1165b10d649908e91b9dd527c1cfd86f151ec802 100644 --- a/README.md +++ b/README.md @@ -69,6 +69,7 @@ For more model downloads (including multiple languages), please refer to [PP-OCR - Model training/evaluation - [Text Detection](./doc/doc_en/detection_en.md) - [Text Recognition](./doc/doc_en/recognition_en.md) + - [Direction Classification](./doc/doc_en/angle_class_en.md) - [Yml Configuration](./doc/doc_en/config_en.md) - Inference and Deployment - [Quick inference based on pip](./doc/doc_en/whl_en.md) diff --git a/README_ch.md b/README_ch.md index b76f194b38385d4b147aaab1d0b7714908e985d8..79ac2a1b0131307b34b9f430136c08137bc4b02a 100644 --- a/README_ch.md +++ b/README_ch.md @@ -67,6 +67,7 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力 - 模型训练/评估 - [文本检测](./doc/doc_ch/detection.md) - [文本识别](./doc/doc_ch/recognition.md) + - [方向分类器](./doc/doc_ch/angle_class.md) - [yml参数配置文件介绍](./doc/doc_ch/config.md) - 预测部署 - [基于pip安装whl包快速推理](./doc/doc_ch/whl.md) diff --git a/deploy/slim/prune/README.md b/deploy/slim/prune/README.md index 9a73463a83c1593f423251c03c05e7b3fd63d46d..f2f4cda728b1575e9d15b2082b7d258ff6b70dd0 100644 --- a/deploy/slim/prune/README.md +++ b/deploy/slim/prune/README.md @@ -3,7 +3,7 @@ 复杂的模型有利于提高模型的性能,但也导致模型中存在一定冗余,模型裁剪通过移出网络模型中的子模型来减少这种冗余,达到减少模型计算复杂度,提高模型推理性能的目的。 -本教程将介绍如何使用PaddleSlim裁剪PaddleOCR的模型。 +本教程将介绍如何使用[PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim)裁剪PaddleOCR的模型。 在开始本教程之前,建议先了解: 1. [PaddleOCR模型的训练方法](../../../doc/doc_ch/quickstart.md) @@ -33,8 +33,20 @@ python setup.py install ### 3. 
敏感度分析训练 -加载预训练模型后,通过对现有模型的每个网络层进行敏感度分析,了解各网络层冗余度,从而决定每个网络层的裁剪比例。 +加载预训练模型后,通过对现有模型的每个网络层进行敏感度分析,得到敏感度文件:sensitivities_0.data,可以通过PaddleSlim提供的[接口](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/prune/sensitive.py#L221)加载文件,获得各网络层在不同裁剪比例下的精度损失。从而了解各网络层冗余度,决定每个网络层的裁剪比例。 敏感度分析的具体细节见:[敏感度分析](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md) +敏感度文件内容格式: + sensitivities_0.data(Dict){ + 'layer_weight_name_0': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss} + 'layer_weight_name_1': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss} + } + + 例子: + { + 'conv10_expand_weights': {0.1: 0.006509952684312718, 0.2: 0.01827734339798862, 0.3: 0.014528405644659832, 0.6: 0.06536008804270439, 0.8: 0.11798612250664964, 0.7: 0.12391408417493704, 0.4: 0.030615754498018757, 0.5: 0.047105205602406594} + 'conv10_linear_weights': {0.1: 0.05113190831455035, 0.2: 0.07705573833558801, 0.3: 0.12096721757739311, 0.6: 0.5135061352930738, 0.8: 0.7908166677143281, 0.7: 0.7272187676899062, 0.4: 0.1819252083008504, 0.5: 0.3728054727792405} + } +加载敏感度文件后会返回一个字典,字典中的keys为网络模型参数模型的名字,values为一个字典,里面保存了相应网络层的裁剪敏感度信息。例如在例子中,conv10_expand_weights所对应的网络层在裁掉10%的卷积核后模型性能相较原模型会下降0.65%,详细信息可见[PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/algo/algo.md#2-%E5%8D%B7%E7%A7%AF%E6%A0%B8%E5%89%AA%E8%A3%81%E5%8E%9F%E7%90%86) 进入PaddleOCR根目录,通过以下命令对模型进行敏感度分析训练: ```bash @@ -42,7 +54,7 @@ python deploy/slim/prune/sensitivity_anal.py -c configs/det/det_mv3_db.yml -o Gl ``` ### 4. 
模型裁剪训练 -裁剪时通过之前的敏感度分析文件决定每个网络层的裁剪比例。在具体实现时,为了尽可能多的保留从图像中提取的低阶特征,我们跳过了backbone中靠近输入的4个卷积层。同样,为了减少由于裁剪导致的模型性能损失,我们通过之前敏感度分析所获得的敏感度表,挑选出了一些冗余较少,对裁剪较为敏感的[网络层](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/slim/prune/pruning_and_finetune.py#L41),并在之后的裁剪过程中选择避开这些网络层。裁剪过后finetune的过程沿用OCR检测模型原始的训练策略。 +裁剪时通过之前的敏感度分析文件决定每个网络层的裁剪比例。在具体实现时,为了尽可能多的保留从图像中提取的低阶特征,我们跳过了backbone中靠近输入的4个卷积层。同样,为了减少由于裁剪导致的模型性能损失,我们通过之前敏感度分析所获得的敏感度表,人工挑选出了一些冗余较少,对裁剪较为敏感的[网络层](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/slim/prune/pruning_and_finetune.py#L41)(指在较低的裁剪比例下就导致很高性能损失的网络层),并在之后的裁剪过程中选择避开这些网络层。裁剪过后finetune的过程沿用OCR检测模型原始的训练策略。 ```bash python deploy/slim/prune/pruning_and_finetune.py -c configs/det/det_mv3_db.yml -o Global.pretrain_weights=./deploy/slim/prune/pretrain_models/det_mv3_db/best_accuracy Global.test_batch_size_per_card=1 diff --git a/deploy/slim/prune/README_en.md b/deploy/slim/prune/README_en.md index 7a93dce50063a68ddb828b87a4ced714c2f69dce..9e6c762b981b461dc649a8eccd748d23918d23c0 100644 --- a/deploy/slim/prune/README_en.md +++ b/deploy/slim/prune/README_en.md @@ -1,5 +1,4 @@ - ## Introduction Complicated models help to improve the performance of the model, but it also leads to some redundancy in the model. Model tailoring reduces this redundancy by removing the sub-models in the network model, so as to reduce model calculation complexity and improve model inference performance. . @@ -36,7 +35,20 @@ PaddleOCR also provides a series of models [../../../doc/doc_en/models_list_en.m ### 3. Pruning sensitivity analysis - After the pre-training model is loaded, sensitivity analysis is performed on each network layer of the model to understand the redundancy of each network layer, thereby determining the pruning ratio of each network layer. 
 For specific details of sensitivity analysis, see:[Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md) + After the pre-trained model is loaded, sensitivity analysis is performed on each network layer of the model to understand the redundancy of each layer, and the results are saved to a sensitivity file named sensitivities_0.data. The user can then load the sensitivity file via the [methods provided by PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/prune/sensitive.py#L221) and determine the pruning ratio of each network layer automatically. For specific details of sensitivity analysis, see: [Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md) + The data format of the sensitivity file: + sensitivities_0.data(Dict){ + 'layer_weight_name_0': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss}, + 'layer_weight_name_1': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss} + } + + Example: + { + 'conv10_expand_weights': {0.1: 0.006509952684312718, 0.2: 0.01827734339798862, 0.3: 0.014528405644659832, 0.6: 0.06536008804270439, 0.8: 0.11798612250664964, 0.7: 0.12391408417493704, 0.4: 0.030615754498018757, 0.5: 0.047105205602406594}, + 'conv10_linear_weights': {0.1: 0.05113190831455035, 0.2: 0.07705573833558801, 0.3: 0.12096721757739311, 0.6: 0.5135061352930738, 0.8: 0.7908166677143281, 0.7: 0.7272187676899062, 0.4: 0.1819252083008504, 0.5: 0.3728054727792405} + } + Loading the sensitivity file returns a dict. The keys of the dict are the names of the parameters in each layer, and each value holds the pruning sensitivity information of the corresponding layer. 
In the example, pruning 10% of the filters in the layer corresponding to conv10_expand_weights leads to a 0.65% drop in model performance. For details, see: [Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/algo/algo.md#2-%E5%8D%B7%E7%A7%AF%E6%A0%B8%E5%89%AA%E8%A3%81%E5%8E%9F%E7%90%86) + Enter the PaddleOCR root directory,perform sensitivity analysis on the model with the following command: diff --git a/deploy/slim/quantization/README.md b/deploy/slim/quantization/README.md index bf801d7133f57326556891e35cb551dc1c82ae5d..b35761c649ae5faf9e0db8663047419d991282fe 100755 --- a/deploy/slim/quantization/README.md +++ b/deploy/slim/quantization/README.md @@ -3,7 +3,8 @@ 复杂的模型有利于提高模型的性能,但也导致模型中存在一定冗余,模型量化将全精度缩减到定点数减少这种冗余,达到减少模型计算复杂度,提高模型推理性能的目的。 模型量化可以在基本不损失模型的精度的情况下,将FP32精度的模型参数转换为Int8精度,减小模型参数大小并加速计算,使用量化后的模型在移动端等部署时更具备速度优势。 -本教程将介绍如何使用PaddleSlim量化PaddleOCR的模型。 +本教程将介绍如何使用飞桨模型压缩库PaddleSlim做PaddleOCR模型的压缩。 +PaddleSlim(项目链接:https://github.com/PaddlePaddle/PaddleSlim)集成了模型剪枝、量化(包括量化训练和离线量化)、蒸馏和神经网络搜索等多种业界常用且领先的模型压缩功能,如果您感兴趣,可以关注并了解。 在开始本教程之前,建议先了解[PaddleOCR模型的训练方法](../../../doc/doc_ch/quickstart.md)以及[PaddleSlim](https://paddleslim.readthedocs.io/zh_CN/latest/index.html) diff --git a/deploy/slim/quantization/README_en.md b/deploy/slim/quantization/README_en.md index e565f13f5507bdf6eadb6909fc85d788290e5a42..49d4b60287c477525bf441529388fdb0341129df 100755 --- a/deploy/slim/quantization/README_en.md +++ b/deploy/slim/quantization/README_en.md @@ -1,6 +1,4 @@ -# Model compress tutorial (Quantization) - ## Introduction Generally, a more complex model would achive better performance in the task, but it also leads to some redundancy in the model. 
diff --git a/doc/doc_ch/config.md b/doc/doc_ch/config.md index fe8db9c893cf0e6190111de5fe7627d2fe52a4fd..f9c664d4ea38e2e52dc76bfb5b63d9a515b106a7 100644 --- a/doc/doc_ch/config.md +++ b/doc/doc_ch/config.md @@ -10,7 +10,7 @@ ## 配置文件 Global 参数介绍 -以 `rec_chinese_lite_train.yml` 为例 +以 `rec_chinese_lite_train_v1.1.yml ` 为例 | 字段 | 用途 | 默认值 | 备注 | @@ -32,6 +32,7 @@ | loss_type | 设置 loss 类型 | ctc | 支持两种loss: ctc / attention | | distort | 设置是否使用数据增强 | false | 设置为true时,将在训练时随机进行扰动,支持的扰动操作可阅读[img_tools.py](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/ppocr/data/rec/img_tools.py) | | use_space_char | 设置是否识别空格 | false | 仅在 character_type=ch 时支持空格 | +| label_list | 设置方向分类器支持的角度 | ['0','180'] | 仅在方向分类器中生效 | | average_window | ModelAverage优化器中的窗口长度计算比例 | 0.15 | 目前仅应用与SRN | | max_average_window | 平均值计算窗口长度的最大值 | 15625 | 推荐设置为一轮训练中mini-batchs的数目| | min_average_window | 平均值计算窗口长度的最小值 | 10000 | \ | diff --git a/doc/doc_ch/installation.md b/doc/doc_ch/installation.md index 381d1a9e8522c40fc4a2784024ee20537e6f11ba..d4b0a67f3cbdcb7d4f3efa6ea44ff881f7598a38 100644 --- a/doc/doc_ch/installation.md +++ b/doc/doc_ch/installation.md @@ -2,7 +2,7 @@ 经测试PaddleOCR可在glibc 2.23上运行,您也可以测试其他glibc版本或安装glic 2.23 PaddleOCR 工作环境 -- PaddlePaddle 1.7+ +- PaddlePaddle 1.8+ ,推荐使用 PaddlePaddle 2.0.0.beta - python3.7 - glibc 2.23 - cuDNN 7.6+ (GPU) @@ -47,19 +47,16 @@ docker images hub.baidubce.com/paddlepaddle/paddle latest-gpu-cuda9.0-cudnn7-dev f56310dcc829 ``` -**2. 安装PaddlePaddle Fluid v1.7** +**2. 
安装PaddlePaddle Fluid v2.0** ``` pip3 install --upgrade pip -如果您的机器安装的是CUDA9,请运行以下命令安装 -python3 -m pip install paddlepaddle-gpu==1.7.2.post97 -i https://pypi.tuna.tsinghua.edu.cn/simple - -如果您的机器安装的是CUDA10,请运行以下命令安装 -python3 -m pip install paddlepaddle-gpu==1.7.2.post107 -i https://pypi.tuna.tsinghua.edu.cn/simple +如果您的机器安装的是CUDA9或CUDA10,请运行以下命令安装 +python3 -m pip install paddlepaddle-gpu==2.0.0b0 -i https://mirror.baidu.com/pypi/simple 如果您的机器是CPU,请运行以下命令安装 -python3 -m pip install paddlepaddle==1.7.2 -i https://pypi.tuna.tsinghua.edu.cn/simple +python3 -m pip install paddlepaddle==2.0.0b0 -i https://mirror.baidu.com/pypi/simple 更多的版本需求,请参照[安装文档](https://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。 ``` diff --git a/doc/doc_ch/whl.md b/doc/doc_ch/whl.md index 46796ce64a60f12db9bbfbdd7b16ff77238c1831..1b04a9a8a967f39516db0c0f1be3e3842a87278b 100644 --- a/doc/doc_ch/whl.md +++ b/doc/doc_ch/whl.md @@ -11,8 +11,8 @@ pip install paddleocr 本地构建并安装 ```bash -python setup.py bdist_wheel -pip install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x是paddleocr的版本号 +python3 setup.py bdist_wheel +pip3 install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x是paddleocr的版本号 ``` ### 1. 代码使用 @@ -20,7 +20,7 @@ pip install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x是paddleocr的版本 ```python from paddleocr import PaddleOCR, draw_ocr # Paddleocr目前支持中英文、英文、法语、德语、韩语、日语,可以通过修改lang参数进行切换 -# 参数依次为`zh`, `en`, `french`, `german`, `korean`, `japan`。 +# 参数依次为`ch`, `en`, `french`, `german`, `korean`, `japan`。 ocr = PaddleOCR(use_angle_cls=True, lang="ch") # need to run only once to download and load model into memory img_path = 'PaddleOCR/doc/imgs/11.jpg' result = ocr.ocr(img_path, cls=True) @@ -280,7 +280,7 @@ paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_ | rec_algorithm | 使用的识别算法类型 | CRNN | | rec_model_dir | 识别模型所在文件夹。传参方式有两种,1. 
None: 自动下载内置模型到 `~/.paddleocr/rec`;2.自己转换好的inference模型路径,模型路径下必须包含model和params文件 | None | | rec_image_shape | 识别算法的输入图片尺寸 | "3,32,320" | -| rec_char_type | 识别算法的字符类型,中文(ch)或英文(en) | ch | +| rec_char_type | 识别算法的字符类型,中英文(ch)、英文(en)、法语(french)、德语(german)、韩语(korean)、日语(japan) | ch | | rec_batch_num | 进行识别时,同时前向的图片数 | 30 | | max_text_length | 识别算法能识别的最大文字长度 | 25 | | rec_char_dict_path | 识别模型字典路径,当rec_model_dir使用方式2传参时需要修改为自己的字典路径 | ./ppocr/utils/ppocr_keys_v1.txt | diff --git a/doc/doc_en/config_en.md b/doc/doc_en/config_en.md index b54def895f0758df7cdbd089253d6acd712d2b8e..722da6620fd03912c48a47679e7c13a23f15752e 100644 --- a/doc/doc_en/config_en.md +++ b/doc/doc_en/config_en.md @@ -10,7 +10,7 @@ The following list can be viewed via `--help` ## INTRODUCTION TO GLOBAL PARAMETERS OF CONFIGURATION FILE -Take `rec_chinese_lite_train.yml` as an example +Take `rec_chinese_lite_train_v1.1.yml` as an example | Parameter | Use | Default | Note | @@ -32,6 +32,7 @@ Take `rec_chinese_lite_train.yml` as an example | loss_type | Set loss type | ctc | Supports two types of loss: ctc / attention | | distort | Set use distort | false | Support distort type ,read [img_tools.py](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/ppocr/data/rec/img_tools.py) | | use_space_char | Wether to recognize space | false | Only support in character_type=ch mode | +| label_list | Set the angles supported by the direction classifier | ['0','180'] | Only valid in the direction classifier | | reader_yml | Set the reader configuration file | ./configs/rec/rec_icdar15_reader.yml | \ | | pretrain_weights | Load pre-trained model path | ./pretrain_models/CRNN/best_accuracy | \ | | checkpoints | Load saved model path | None | Used to load saved parameters to continue training after interruption | diff --git a/doc/doc_en/installation_en.md b/doc/doc_en/installation_en.md index b62d9b298dcb6f1757cb1a522565fb4c19484d6d..7608e12d979e9f26d58a8f0504d3740e9cf67014 100644 --- a/doc/doc_en/installation_en.md +++ 
b/doc/doc_en/installation_en.md @@ -3,7 +3,7 @@ After testing, paddleocr can run on glibc 2.23. You can also test other glibc versions or install glic 2.23 for the best compatibility. PaddleOCR working environment: -- PaddlePaddle1.7 +- PaddlePaddle 1.8+ (PaddlePaddle 2.0.0.beta recommended) - python3.7 - glibc 2.23 @@ -49,18 +49,15 @@ docker images hub.baidubce.com/paddlepaddle/paddle latest-gpu-cuda9.0-cudnn7-dev f56310dcc829 ``` -**2. Install PaddlePaddle Fluid v1.7 (the higher version is not supported yet, the adaptation work is in progress)** +**2. Install PaddlePaddle Fluid v2.0** ``` pip3 install --upgrade pip -# If you have cuda9 installed on your machine, please run the following command to install -python3 -m pip install paddlepaddle-gpu==1.7.2.post97 -i https://pypi.tuna.tsinghua.edu.cn/simple - -# If you have cuda10 installed on your machine, please run the following command to install -python3 -m pip install paddlepaddle-gpu==1.7.2.post107 -i https://pypi.tuna.tsinghua.edu.cn/simple +# If you have cuda9 or cuda10 installed on your machine, please run the following command to install +python3 -m pip install paddlepaddle-gpu==2.0.0b0 -i https://mirror.baidu.com/pypi/simple # If you only have cpu on your machine, please run the following command to install -python3 -m pip install paddlepaddle==1.7.2 -i https://pypi.tuna.tsinghua.edu.cn/simple +python3 -m pip install paddlepaddle==2.0.0b0 -i https://mirror.baidu.com/pypi/simple ``` For more software version requirements, please refer to the instructions in [Installation Document](https://www.paddlepaddle.org.cn/install/quick) for operation. 
diff --git a/doc/doc_en/whl_en.md b/doc/doc_en/whl_en.md index 4049d9dcb2d52eb5f610d5f02017a9d2d4f14f47..ffbced346f7a3f661f382b5f2d826c20fef2c012 100644 --- a/doc/doc_en/whl_en.md +++ b/doc/doc_en/whl_en.md @@ -9,8 +9,8 @@ pip install paddleocr build own whl package and install ```bash -python setup.py bdist_wheel -pip install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the version of paddleocr +python3 setup.py bdist_wheel +pip3 install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the version of paddleocr ``` ### 1. Use by code @@ -18,7 +18,7 @@ pip install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the version of padd ```python from paddleocr import PaddleOCR,draw_ocr # Paddleocr supports Chinese, English, French, German, Korean and Japanese. -# You can set the parameter `lang` as `zh`, `en`, `french`, `german`, `korean`, `japan` +# You can set the parameter `lang` as `ch`, `en`, `french`, `german`, `korean`, `japan` # to switch the language model in order. ocr = PaddleOCR(use_angle_cls=True, lang='en') # need to run only once to download and load model into memory img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg' @@ -302,7 +302,7 @@ paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_ | cls_batch_num | When performing classification, the batchsize of forward images | 30 | | enable_mkldnn | Whether to enable mkldnn | FALSE | | use_zero_copy_run | Whether to forward by zero_copy_run | FALSE | -| lang | The support language, now only chinese(ch) and english(en) are supported | ch | +| lang | The supported languages: Chinese (ch), English (en), French (french), German (german), Korean (korean), Japanese (japan) | ch | | det | Enable detction when `ppocr.ocr` func exec | TRUE | | rec | Enable recognition when `ppocr.ocr` func exec | TRUE | | cls | Enable classification when `ppocr.ocr` func exec | FALSE |
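
Review note on the sensitivity-file format documented in the `deploy/slim/prune` README changes above: the sketch below shows one way the per-layer dict could be turned into pruning ratios. It is a minimal illustration under stated assumptions, not the logic used by PaddleSlim or by `pruning_and_finetune.py`. The helpers `max_safe_ratio` and `pick_ratios` and the `max_loss` threshold are hypothetical names introduced here; the dict literal copies the README example, and in practice the dict would come from the PaddleSlim loader linked in the README.

```python
# Sensitivities copied from the README example: layer name -> {pruning ratio: accuracy loss}.
sensitivities = {
    "conv10_expand_weights": {
        0.1: 0.006509952684312718, 0.2: 0.01827734339798862,
        0.3: 0.014528405644659832, 0.6: 0.06536008804270439,
        0.8: 0.11798612250664964, 0.7: 0.12391408417493704,
        0.4: 0.030615754498018757, 0.5: 0.047105205602406594,
    },
    "conv10_linear_weights": {
        0.1: 0.05113190831455035, 0.2: 0.07705573833558801,
        0.3: 0.12096721757739311, 0.6: 0.5135061352930738,
        0.8: 0.7908166677143281, 0.7: 0.7272187676899062,
        0.4: 0.1819252083008504, 0.5: 0.3728054727792405,
    },
}


def max_safe_ratio(sens_of_each_ratio, max_loss):
    """Hypothetical helper: walk the ratios in ascending order and return the
    last one reached before the accuracy loss first exceeds max_loss
    (0.0 if even the smallest ratio is too costly)."""
    best = 0.0
    for ratio in sorted(sens_of_each_ratio):
        if sens_of_each_ratio[ratio] > max_loss:
            break
        best = ratio
    return best


def pick_ratios(all_sensitivities, max_loss):
    """Hypothetical helper: apply max_safe_ratio to every layer."""
    return {
        name: max_safe_ratio(sens, max_loss)
        for name, sens in all_sensitivities.items()
    }


# Assumed threshold: tolerate at most a 5% accuracy loss per layer.
ratios = pick_ratios(sensitivities, max_loss=0.05)
print(ratios)
```

With this assumed 5% threshold, `conv10_expand_weights` tolerates a ratio of 0.5, while `conv10_linear_weights` already loses about 5.1% at a ratio of 0.1 and is left unpruned, which matches the README's description of layers that are sensitive to pruning and should be skipped.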