From e797692e14cfd2b90bc694f0858b51a5d2d70e18 Mon Sep 17 00:00:00 2001 From: andyjpaddle Date: Wed, 16 Feb 2022 02:55:15 +0000 Subject: [PATCH] fix dead link --- README.md | 7 +------ deploy/slim/prune/README.md | 2 +- deploy/slim/prune/README_en.md | 6 +++--- deploy/slim/quantization/README_en.md | 4 ++-- doc/doc_ch/FAQ.md | 8 ++++---- doc/doc_ch/config.md | 10 +++++----- doc/doc_ch/serving_inference.md | 2 +- doc/doc_en/config_en.md | 2 +- doc/doc_en/tricks_en.md | 20 ++++++++++---------- ppstructure/README_ch.md | 2 +- 10 files changed, 29 insertions(+), 34 deletions(-) diff --git a/README.md b/README.md index 4a938f2f..95f35277 100644 --- a/README.md +++ b/README.md @@ -181,16 +181,11 @@ For a new language request, please refer to [Guideline for new language_requests ## Guideline for New Language Requests -If you want to request a new language support, a PR with 2 following files are needed: +If you want to request a new language support, a PR with the following file is needed: 1. In folder [ppocr/utils/dict](./ppocr/utils/dict), it is necessary to submit the dict text to this path and name it with `{language}_dict.txt` that contains a list of all characters. Please see the format example from other files in that folder. -2. In folder [ppocr/utils/corpus](./ppocr/utils/corpus), -it is necessary to submit the corpus to this path and name it with `{language}_corpus.txt` that contains a list of words in your language. -Maybe, 50000 words per language is necessary at least. -Of course, the more, the better. - If your language has unique elements, please tell me in advance within any way, such as useful links, wikipedia and so on. More details, please refer to [Multilingual OCR Development Plan](https://github.com/PaddlePaddle/PaddleOCR/issues/1048). 
diff --git a/deploy/slim/prune/README.md b/deploy/slim/prune/README.md index 7b8dd169..c4385723 100644 --- a/deploy/slim/prune/README.md +++ b/deploy/slim/prune/README.md @@ -45,7 +45,7 @@ python3 setup.py install 'conv10_expand_weights': {0.1: 0.006509952684312718, 0.2: 0.01827734339798862, 0.3: 0.014528405644659832, 0.6: 0.06536008804270439, 0.8: 0.11798612250664964, 0.7: 0.12391408417493704, 0.4: 0.030615754498018757, 0.5: 0.047105205602406594} 'conv10_linear_weights': {0.1: 0.05113190831455035, 0.2: 0.07705573833558801, 0.3: 0.12096721757739311, 0.6: 0.5135061352930738, 0.8: 0.7908166677143281, 0.7: 0.7272187676899062, 0.4: 0.1819252083008504, 0.5: 0.3728054727792405} } -加载敏感度文件后会返回一个字典,字典中的keys为网络模型参数模型的名字,values为一个字典,里面保存了相应网络层的裁剪敏感度信息。例如在例子中,conv10_expand_weights所对应的网络层在裁掉10%的卷积核后模型性能相较原模型会下降0.65%,详细信息可见[PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/algo/algo.md#2-%E5%8D%B7%E7%A7%AF%E6%A0%B8%E5%89%AA%E8%A3%81%E5%8E%9F%E7%90%86) +加载敏感度文件后会返回一个字典,字典中的keys为网络模型参数模型的名字,values为一个字典,里面保存了相应网络层的裁剪敏感度信息。例如在例子中,conv10_expand_weights所对应的网络层在裁掉10%的卷积核后模型性能相较原模型会下降0.65%,详细信息可见[PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/release/2.0-alpha/docs/zh_cn/algo/algo.md) 进入PaddleOCR根目录,通过以下命令对模型进行敏感度分析训练: ```bash diff --git a/deploy/slim/prune/README_en.md b/deploy/slim/prune/README_en.md index f0d652f2..f8fbed47 100644 --- a/deploy/slim/prune/README_en.md +++ b/deploy/slim/prune/README_en.md @@ -3,7 +3,7 @@ Generally, a more complex model would achive better performance in the task, but it also leads to some redundancy in the model. Model Pruning is a technique that reduces this redundancy by removing the sub-models in the neural network model, so as to reduce model calculation complexity and improve model inference performance. -This example uses PaddleSlim provided[APIs of Pruning](https://paddlepaddle.github.io/PaddleSlim/api/prune_api/) to compress the OCR model. 
+This example uses PaddleSlim provided[APIs of Pruning](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/docs/zh_cn/api_cn/dygraph/pruners) to compress the OCR model. [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim), an open source library which integrates model pruning, quantization (including quantization training and offline quantization), distillation, neural network architecture search, and many other commonly used and leading model compression technique in the industry. It is recommended that you could understand following pages before reading this example: @@ -35,7 +35,7 @@ PaddleOCR also provides a series of [models](../../../doc/doc_en/models_list_en. ### 3. Pruning sensitivity analysis - After the pre-trained model is loaded, sensitivity analysis is performed on each network layer of the model to understand the redundancy of each network layer, and save a sensitivity file which named: sen.pickle. After that, user could load the sensitivity file via the [methods provided by PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/prune/sensitive.py#L221) and determining the pruning ratio of each network layer automatically. For specific details of sensitivity analysis, see:[Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/tutorials/image_classification_sensitivity_analysis_tutorial.md) + After the pre-trained model is loaded, sensitivity analysis is performed on each network layer of the model to understand the redundancy of each network layer, and save a sensitivity file which named: sen.pickle. After that, user could load the sensitivity file via the [methods provided by PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/paddleslim/prune/sensitive.py#L221) and determining the pruning ratio of each network layer automatically. 
For specific details of sensitivity analysis, see:[Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/en/tutorials/image_classification_sensitivity_analysis_tutorial_en.md) The data format of sensitivity file: sen.pickle(Dict){ 'layer_weight_name_0': sens_of_each_ratio(Dict){'pruning_ratio_0': acc_loss, 'pruning_ratio_1': acc_loss} @@ -47,7 +47,7 @@ PaddleOCR also provides a series of [models](../../../doc/doc_en/models_list_en. 'conv10_expand_weights': {0.1: 0.006509952684312718, 0.2: 0.01827734339798862, 0.3: 0.014528405644659832, 0.6: 0.06536008804270439, 0.8: 0.11798612250664964, 0.7: 0.12391408417493704, 0.4: 0.030615754498018757, 0.5: 0.047105205602406594} 'conv10_linear_weights': {0.1: 0.05113190831455035, 0.2: 0.07705573833558801, 0.3: 0.12096721757739311, 0.6: 0.5135061352930738, 0.8: 0.7908166677143281, 0.7: 0.7272187676899062, 0.4: 0.1819252083008504, 0.5: 0.3728054727792405} } - The function would return a dict after loading the sensitivity file. The keys of the dict are name of parameters in each layer. And the value of key is the information about pruning sensitivity of corresponding layer. In example, pruning 10% filter of the layer corresponding to conv10_expand_weights would lead to 0.65% degradation of model performance. The details could be seen at: [Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/algo/algo.md#2-%E5%8D%B7%E7%A7%AF%E6%A0%B8%E5%89%AA%E8%A3%81%E5%8E%9F%E7%90%86) + The function would return a dict after loading the sensitivity file. The keys of the dict are name of parameters in each layer. And the value of key is the information about pruning sensitivity of corresponding layer. In example, pruning 10% filter of the layer corresponding to conv10_expand_weights would lead to 0.65% degradation of model performance. 
The details could be seen at: [Sensitivity analysis](https://github.com/PaddlePaddle/PaddleSlim/blob/release/2.0-alpha/docs/zh_cn/algo/algo.md) Enter the PaddleOCR root directory,perform sensitivity analysis on the model with the following command: diff --git a/deploy/slim/quantization/README_en.md b/deploy/slim/quantization/README_en.md index 4cafe5f4..d3bf12d6 100644 --- a/deploy/slim/quantization/README_en.md +++ b/deploy/slim/quantization/README_en.md @@ -5,11 +5,11 @@ Generally, a more complex model would achieve better performance in the task, bu Quantization is a technique that reduces this redundancy by reducing the full precision data to a fixed number, so as to reduce model calculation complexity and improve model inference performance. -This example uses PaddleSlim provided [APIs of Quantization](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/) to compress the OCR model. +This example uses PaddleSlim provided [APIs of Quantization](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/dygraph/quanter/qat.rst) to compress the OCR model. It is recommended that you could understand following pages before reading this example: - [The training strategy of OCR model](../../../doc/doc_en/quickstart_en.md) -- [PaddleSlim Document](https://paddlepaddle.github.io/PaddleSlim/api/quantization_api/) +- [PaddleSlim Document](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/docs/zh_cn/api_cn/dygraph/quanter/qat.rst) ## Quick Start Quantization is mostly suitable for the deployment of lightweight models on mobile terminals. diff --git a/doc/doc_ch/FAQ.md b/doc/doc_ch/FAQ.md index cd5369f6..41277f7d 100644 --- a/doc/doc_ch/FAQ.md +++ b/doc/doc_ch/FAQ.md @@ -11,7 +11,7 @@ PaddleOCR收集整理了自从开源以来在issues和用户群中的常见问 OCR领域大佬众多,本文档回答主要依赖有限的项目实践,难免挂一漏万,如有遗漏和不足,也**希望有识之士帮忙补充和修正**,万分感谢。 - [FAQ](#faq) - + * [1. 
通用问题](#1) + [1.1 检测](#11) + [1.2 识别](#12) @@ -20,7 +20,7 @@ OCR领域大佬众多,本文档回答主要依赖有限的项目实践,难 + [1.5 垂类场景实现思路](#15) + [1.6 训练过程与模型调优](#16) + [1.7 补充资料](#17) - + * [2. PaddleOCR实战问题](#2) + [2.1 PaddleOCR repo](#21) + [2.2 安装环境](#22) @@ -734,7 +734,7 @@ C++TensorRT预测需要使用支持TRT的预测库并在编译时打开[-DWITH_T #### Q:PaddleOCR中,对于模型预测加速,CPU加速的途径有哪些?基于TenorRT加速GPU对输入有什么要求? -**A**:(1)CPU可以使用mkldnn进行加速;对于python inference的话,可以把enable_mkldnn改为true,[参考代码](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/tools/infer/utility.py#L99),对于cpp inference的话,在配置文件里面配置use_mkldnn 1即可,[参考代码](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/deploy/cpp_infer/tools/config.txt#L6) +**A**:(1)CPU可以使用mkldnn进行加速;对于python inference的话,可以把enable_mkldnn改为true,[参考代码](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/tools/infer/utility.py#L99),对于cpp inference的话,可参考[文档](https://github.com/andyjpaddle/PaddleOCR/tree/dygraph/deploy/cpp_infer) (2)GPU需要注意变长输入问题等,TRT6 之后才支持变长输入 @@ -838,4 +838,4 @@ nvidia-smi --lock-gpu-clocks=1590 -i 0 #### Q: 预测时显存爆炸、内存泄漏问题? 
-**A**: 打开显存/内存优化开关`enable_memory_optim`可以解决该问题,相关代码已合入,[查看详情](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/tools/infer/utility.py#L153)。 \ No newline at end of file +**A**: 打开显存/内存优化开关`enable_memory_optim`可以解决该问题,相关代码已合入,[查看详情](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/tools/infer/utility.py#L153)。 diff --git a/doc/doc_ch/config.md b/doc/doc_ch/config.md index 40c63905..1668eba1 100644 --- a/doc/doc_ch/config.md +++ b/doc/doc_ch/config.md @@ -66,7 +66,7 @@ | :---------------------: | :---------------------: | :--------------: | :--------------------: | | model_type | 网络类型 | rec | 目前支持`rec`,`det`,`cls` | | algorithm | 模型名称 | CRNN | 支持列表见[algorithm_overview](./algorithm_overview.md) | -| **Transform** | 设置变换方式 | - | 目前仅rec类型的算法支持, 具体见[ppocr/modeling/transform](../../ppocr/modeling/transform) | +| **Transform** | 设置变换方式 | - | 目前仅rec类型的算法支持, 具体见[ppocr/modeling/transforms](../../ppocr/modeling/transforms) | | name | 变换方式类名 | TPS | 目前支持`TPS` | | num_fiducial | TPS控制点数 | 20 | 上下边各十个 | | loc_lr | 定位网络学习率 | 0.1 | \ | @@ -176,7 +176,7 @@ PaddleOCR目前已支持80种(除中文外)语种识别,`configs/rec/multi --dict {path/of/dict} \ # 字典文件路径 -o Global.use_gpu=False # 是否使用gpu ... - + ``` 意大利文由拉丁字母组成,因此执行完命令后会得到名为 rec_latin_lite_train.yml 的配置文件。 @@ -191,21 +191,21 @@ PaddleOCR目前已支持80种(除中文外)语种识别,`configs/rec/multi epoch_num: 500 ... character_dict_path: {path/of/dict} # 字典文件所在路径 - + Train: dataset: name: SimpleDataSet data_dir: train_data/ # 数据存放根目录 label_file_list: ["./train_data/train_list.txt"] # 训练集label路径 ... - + Eval: dataset: name: SimpleDataSet data_dir: train_data/ # 数据存放根目录 label_file_list: ["./train_data/val_list.txt"] # 验证集label路径 ... 
- + ``` 目前PaddleOCR支持的多语言算法有: diff --git a/doc/doc_ch/serving_inference.md b/doc/doc_ch/serving_inference.md index 7a53628e..fea5a245 100644 --- a/doc/doc_ch/serving_inference.md +++ b/doc/doc_ch/serving_inference.md @@ -20,7 +20,7 @@ **Python操作指南:** -目前Serving用于OCR的部分功能还在测试当中,因此在这里我们给出[Servnig latest package](https://github.com/PaddlePaddle/Serving/blob/develop/doc/LATEST_PACKAGES.md) +目前Serving用于OCR的部分功能还在测试当中,因此在这里我们给出[Servnig latest package](https://github.com/PaddlePaddle/Serving/blob/develop/doc/Latest_Packages_CN.md) 大家根据自己的环境选择需要安装的whl包即可,例如以Python 3.5为例,执行下列命令 ``` #CPU/GPU版本选择一个 diff --git a/doc/doc_en/config_en.md b/doc/doc_en/config_en.md index eda1e13d..d7bf5ead 100644 --- a/doc/doc_en/config_en.md +++ b/doc/doc_en/config_en.md @@ -66,7 +66,7 @@ In PaddleOCR, the network is divided into four stages: Transform, Backbone, Neck | :---------------------: | :---------------------: | :--------------: | :--------------------: | | model_type | Network Type | rec | Currently support`rec`,`det`,`cls` | | algorithm | Model name | CRNN | See [algorithm_overview](./algorithm_overview_en.md) for the support list | -| **Transform** | Set the transformation method | - | Currently only recognition algorithms are supported, see [ppocr/modeling/transform](../../ppocr/modeling/transform) for details | +| **Transform** | Set the transformation method | - | Currently only recognition algorithms are supported, see [ppocr/modeling/transforms](../../ppocr/modeling/transforms) for details | | name | Transformation class name | TPS | Currently supports `TPS` | | num_fiducial | Number of TPS control points | 20 | Ten on the top and bottom | | loc_lr | Localization network learning rate | 0.1 | \ | diff --git a/doc/doc_en/tricks_en.md b/doc/doc_en/tricks_en.md index eab9c892..4d59857a 100644 --- a/doc/doc_en/tricks_en.md +++ b/doc/doc_en/tricks_en.md @@ -12,25 +12,25 @@ Here we have sorted out some Chinese OCR training and prediction tricks, which a At present, ResNet_vd series and 
MobileNetV3 series are the backbone networks used in PaddleOCR, whether replacing the other backbone networks will help to improve the accuracy? What should be paid attention to when replacing? - **Tips** - - Whether text detection or text recognition, the choice of backbone network is a trade-off between prediction effect and prediction efficiency. Generally, a larger backbone network is selected, e.g. ResNet101_vd, then the performance of the detection or recognition is more accurate, but the time cost will increase accordingly. And a smaller backbone network is selected, e.g. MobileNetV3_small_x0_35, the prediction speed is faster, but the accuracy of detection or recognition will be reduced. Fortunately, the detection or recognition effect of different backbone networks is positively correlated with the performance of ImageNet 1000 classification task. [**PaddleClas**](https://github.com/PaddlePaddle/PaddleClas/blob/master/README_en.md) have sorted out the 23 series of classification network structures, such as ResNet_vd、Res2Net、HRNet、MobileNetV3、GhostNet. It provides the top1 accuracy of classification, the time cost of GPU(V100 and T4) and CPU(SD 855), and the 117 pretrained models [**download addresses**](https://paddleclas-en.readthedocs.io/en/latest/models/models_intro_en.html). - + - Whether text detection or text recognition, the choice of backbone network is a trade-off between prediction effect and prediction efficiency. Generally, a larger backbone network is selected, e.g. ResNet101_vd, then the performance of the detection or recognition is more accurate, but the time cost will increase accordingly. And a smaller backbone network is selected, e.g. MobileNetV3_small_x0_35, the prediction speed is faster, but the accuracy of detection or recognition will be reduced. Fortunately, the detection or recognition effect of different backbone networks is positively correlated with the performance of ImageNet 1000 classification task. 
[**PaddleClas**](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/en/models/models_intro_en.md) have sorted out the 23 series of classification network structures, such as ResNet_vd、Res2Net、HRNet、MobileNetV3、GhostNet. It provides the top1 accuracy of classification, the time cost of GPU(V100 and T4) and CPU(SD 855), and the 117 pretrained models [**download addresses**](https://paddleclas-en.readthedocs.io/en/latest/models/models_intro_en.html). + - Similar as the 4 stages of ResNet, the replacement of text detection backbone network is to determine those four stages to facilitate the integration of FPN like the object detection heads. In addition, for the text detection problem, the pre trained model in ImageNet1000 can accelerate the convergence and improve the accuracy. - + - In order to replace the backbone network of text recognition, we need to pay attention to the descending position of network width and height stride. Since the ratio between width and height is large in chinese text recognition, the frequency of height decrease is less and the frequency of width decrease is more. You can refer the [modifies of MobileNetV3](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/ppocr/modeling/backbones/rec_mobilenet_v3.py) in PaddleOCR. #### 2、Long Chinese Text Recognition -- **Problem Description** +- **Problem Description** The maximum resolution of Chinese recognition model during training is [3,32,320], if the text image to be recognized is too long, as shown in the figure below, how to adapt? - +
- + - **Tips** During the training, the training samples are not directly resized to [3,32,320]. At first, the height of samples are resized to 32 and keep the ratio between the width and the height. When the width is less than 320, the excess parts are padding 0. Besides, when the ratio between the width and the height of the samples is larger than 10, these samples will be ignored. When the prediction for one image, do as above, but do not limit the max ratio between the width and the height. When the prediction for an images batch, do as training, but the resized target width is the longest width of the images in the batch. [Code as following](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/tools/infer/predict_rec.py): - + ``` def resize_norm_img(self, img, max_wh_ratio): imgC, imgH, imgW = self.rec_image_shape @@ -58,11 +58,11 @@ Here we have sorted out some Chinese OCR training and prediction tricks, which a - **Problem Description** As shown in the figure below, for Chinese and English mixed scenes, in order to facilitate reading and using the recognition results, it is often necessary to recognize the spaces between words. How can this situation be adapted? - +
- + - **Tips** - + There are two possible methods for space recognition. (1) Optimize the text detection. For spliting the text at the space in detection results, it needs to divide the text line with space into many segments when label the data for detection. (2) Optimize the text recognition. The space character is introduced into the recognition dictionary. Label the blank line in the training data for text recognition. In addition, we can also concat multiple word lines to synthesize the training data with spaces. PaddleOCR currently uses the second method. diff --git a/ppstructure/README_ch.md b/ppstructure/README_ch.md index 1013c619..172a399a 100644 --- a/ppstructure/README_ch.md +++ b/ppstructure/README_ch.md @@ -116,4 +116,4 @@ PP-Structure系列模型列表(更新中) |re_LayoutXLM_xfun_zh|基于LayoutXLM在xfun中文数据集上训练的RE模型|1.4G|[推理模型 coming soon]() / [训练模型](https://paddleocr.bj.bcebos.com/pplayout/re_LayoutXLM_xfun_zh.tar) | -更多模型下载,可以参考 [PP-OCR model_list](../doc/doc_en/models_list.md) and [PP-Structure model_list](./docs/models_list.md) +更多模型下载,可以参考 [PP-OCR model_list](../doc/doc_en/models_list_en.md) and [PP-Structure model_list](./docs/models_list.md) -- GitLab