Unverified commit 8fc03fa9, authored by andyj, committed by GitHub

[CP]update visionlan (#7991)

* update visionlan

* update visionlan
Parent ef9e8702
@@ -99,9 +99,9 @@ PaddleOCR will **continue to add** support for cutting-edge algorithms and models in the OCR field, and welcomes
 |SVTR|SVTR-Tiny| 89.25% | rec_svtr_tiny_none_ctc_en | [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/rec_svtr_tiny_none_ctc_en_train.tar) |
 |ViTSTR|ViTSTR| 79.82% | rec_vitstr_none_ce | [trained model](https://paddleocr.bj.bcebos.com/rec_vitstr_none_ce_train.tar) |
 |ABINet|Resnet45| 90.75% | rec_r45_abinet | [trained model](https://paddleocr.bj.bcebos.com/rec_r45_abinet_train.tar) |
-|VisionLAN|Resnet45| 90.30% | rec_r45_visionlan | [trained model](https://paddleocr.bj.bcebos.com/rec_r45_visionlan_train.tar) |
-|SPIN|ResNet32| 90.00% | rec_r32_gaspin_bilstm_att | coming soon |
-|RobustScanner|ResNet31| 87.77% | rec_r31_robustscanner | coming soon |
+|VisionLAN|Resnet45| 90.30% | rec_r45_visionlan | [trained model](https://paddleocr.bj.bcebos.com/VisionLAN/rec_r45_visionlan_train.tar) |
+|SPIN|ResNet32| 90.00% | rec_r32_gaspin_bilstm_att | [trained model](https://paddleocr.bj.bcebos.com/contribution/rec_r32_gaspin_bilstm_att.tar) |
+|RobustScanner|ResNet31| 87.77% | rec_r31_robustscanner | [trained model](https://paddleocr.bj.bcebos.com/contribution/rec_r31_robustscanner.tar)|
 <a name="2"></a>
......
@@ -27,7 +27,7 @@
 |Model|Backbone|Config|Acc|Download link|
 | --- | --- | --- | --- | --- |
-|VisionLAN|ResNet45|[rec_r45_visionlan.yml](../../configs/rec/rec_r45_visionlan.yml)|90.3%|[pretrained and trained model](https://paddleocr.bj.bcebos.com/rec_r45_visionlan_train.tar)|
+|VisionLAN|ResNet45|[rec_r45_visionlan.yml](../../configs/rec/rec_r45_visionlan.yml)|90.3%|[pretrained and trained model](https://paddleocr.bj.bcebos.com/VisionLAN/rec_r45_visionlan_train.tar)|
 <a name="2"></a>
 ## 2. Environment
@@ -80,7 +80,7 @@ python3 tools/infer_rec.py -c configs/rec/rec_r45_visionlan.yml -o Global.infer_
 <a name="4-1"></a>
 ### 4.1 Python Inference
-First, convert the best model obtained from training into an inference model. Taking the trained model as an example ([model download link](https://paddleocr.bj.bcebos.com/rec_r45_visionlan_train.tar)), the conversion can be done with the following command:
+First, convert the best model obtained from training into an inference model. Taking the trained model as an example ([model download link](https://paddleocr.bj.bcebos.com/VisionLAN/rec_r45_visionlan_train.tar)), the conversion can be done with the following command:
```shell ```shell
 # Note: set the pretrained_model path to your local path.
......
@@ -96,9 +96,9 @@ Refer to [DTRB](https://arxiv.org/abs/1904.01906), the training and evaluation r
 |SVTR|SVTR-Tiny| 89.25% | rec_svtr_tiny_none_ctc_en | [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/rec_svtr_tiny_none_ctc_en_train.tar) |
 |ViTSTR|ViTSTR| 79.82% | rec_vitstr_none_ce | [trained model](https://paddleocr.bj.bcebos.com/rec_vitstr_none_none_train.tar) |
 |ABINet|Resnet45| 90.75% | rec_r45_abinet | [trained model](https://paddleocr.bj.bcebos.com/rec_r45_abinet_train.tar) |
-|VisionLAN|Resnet45| 90.30% | rec_r45_visionlan | [trained model](https://paddleocr.bj.bcebos.com/rec_r45_visionlan_train.tar) |
-|SPIN|ResNet32| 90.00% | rec_r32_gaspin_bilstm_att | coming soon |
-|RobustScanner|ResNet31| 87.77% | rec_r31_robustscanner | coming soon |
+|VisionLAN|Resnet45| 90.30% | rec_r45_visionlan | [trained model](https://paddleocr.bj.bcebos.com/VisionLAN/rec_r45_visionlan_train.tar) |
+|SPIN|ResNet32| 90.00% | rec_r32_gaspin_bilstm_att | [trained model](https://paddleocr.bj.bcebos.com/contribution/rec_r32_gaspin_bilstm_att.tar) |
+|RobustScanner|ResNet31| 87.77% | rec_r31_robustscanner | [trained model](https://paddleocr.bj.bcebos.com/contribution/rec_r31_robustscanner.tar)|
 <a name="2"></a>
......
@@ -25,7 +25,7 @@ Using MJSynth and SynthText two text recognition datasets for training, and eval
 |Model|Backbone|config|Acc|Download link|
 | --- | --- | --- | --- | --- |
-|VisionLAN|ResNet45|[rec_r45_visionlan.yml](../../configs/rec/rec_r45_visionlan.yml)|90.3%|[pretrained and trained model](https://paddleocr.bj.bcebos.com/rec_r45_visionlan_train.tar)|
+|VisionLAN|ResNet45|[rec_r45_visionlan.yml](../../configs/rec/rec_r45_visionlan.yml)|90.3%|[pretrained and trained model](https://paddleocr.bj.bcebos.com/VisionLAN/rec_r45_visionlan_train.tar)|
 <a name="2"></a>
 ## 2. Environment
@@ -68,7 +68,7 @@ python3 tools/infer_rec.py -c configs/rec/rec_r45_visionlan.yml -o Global.infer_
 <a name="4-1"></a>
 ### 4.1 Python Inference
-First, the model saved during the VisionLAN text recognition training process is converted into an inference model. ( [Model download link](https://paddleocr.bj.bcebos.com/rec_r45_visionlan_train.tar)) ), you can use the following command to convert:
+First, the model saved during the VisionLAN text recognition training process is converted into an inference model. ( [Model download link](https://paddleocr.bj.bcebos.com/VisionLAN/rec_r45_visionlan_train.tar)) ), you can use the following command to convert:
``` ```
 python3 tools/export_model.py -c configs/rec/rec_r45_visionlan.yml -o Global.pretrained_model=./rec_r45_visionlan_train/best_accuracy Global.save_inference_dir=./inference/rec_r45_visionlan/
......
@@ -107,7 +107,6 @@ class BaseRecLabelEncode(object):
         self.beg_str = "sos"
         self.end_str = "eos"
         self.lower = lower
-        self.use_default_dict = False
         if character_dict_path is None:
             logger = get_logger()
@@ -117,11 +116,8 @@ class BaseRecLabelEncode(object):
             self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
             dict_character = list(self.character_str)
             self.lower = True
-            self.use_default_dict = True
         else:
             self.character_str = []
-            if 'ppocr/utils/ic15_dict.txt' in character_dict_path:
-                self.use_default_dict = True
             with open(character_dict_path, "rb") as fin:
                 lines = fin.readlines()
                 for line in lines:
@@ -1348,9 +1344,6 @@ class VLLabelEncode(BaseRecLabelEncode):
                  **kwargs):
         super(VLLabelEncode, self).__init__(
             max_text_length, character_dict_path, use_space_char, lower)
-        if self.use_default_dict:
-            self.character = self.character[10:] + self.character[
-                1:10] + [self.character[0]]
         self.dict = {}
         for i, char in enumerate(self.character):
             self.dict[char] = i
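
To see what the removed `use_default_dict` branch did, the slice expression from the hunk above can be replayed on the default alphabet. This is a minimal sketch, not PaddleOCR code: `legacy_reorder` is a hypothetical helper name, and it assumes the character list holds exactly the 36 default characters, with no space or special tokens appended.

```python
# Default 36-character alphabet, copied from the diff above.
default_chars = list("0123456789abcdefghijklmnopqrstuvwxyz")


def legacy_reorder(chars):
    """Hypothetical helper replaying the removed slice expression:
    letters first, then the digits 1-9, then 0 at the very end."""
    return chars[10:] + chars[1:10] + [chars[0]]


reordered = legacy_reorder(default_chars)

# Character -> index tables, built the same way as `self.dict` in the hunk.
old_dict = {char: i for i, char in enumerate(reordered)}      # before the commit
new_dict = {char: i for i, char in enumerate(default_chars)}  # after the commit

print("".join(reordered))             # abcdefghijklmnopqrstuvwxyz1234567890
print(old_dict["a"], new_dict["a"])   # 0 10
print(old_dict["0"], new_dict["0"])   # 35 0
```

Under these assumptions, every character in the default alphabet is assigned a different index once the branch is removed.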
......
@@ -26,15 +26,11 @@ class BaseRecLabelDecode(object):
         self.end_str = "eos"
         self.reverse = False
         self.character_str = []
-        self.use_default_dict = False
         if character_dict_path is None:
             self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
             dict_character = list(self.character_str)
-            self.use_default_dict = True
         else:
-            if 'ppocr/utils/ic15_dict.txt' in character_dict_path:
-                self.use_default_dict = True
             with open(character_dict_path, "rb") as fin:
                 lines = fin.readlines()
                 for line in lines:
@@ -725,9 +721,6 @@ class VLLabelDecode(BaseRecLabelDecode):
         super(VLLabelDecode, self).__init__(character_dict_path, use_space_char)
         self.max_text_length = kwargs.get('max_text_length', 25)
         self.nclass = len(self.character) + 1
-        if self.use_default_dict:
-            self.character = self.character[10:] + self.character[
-                1:10] + [self.character[0]]
     def decode(self, text_index, text_prob=None, is_remove_duplicate=False):
         """ convert text-index into text-label. """
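
The same slice is removed from `VLLabelDecode`, which matters because the encoder and decoder have to agree on one character ordering. The sketch below illustrates that dependency in simplified form; `encode` and `decode` are hypothetical stand-ins, not the PaddleOCR methods, and they ignore the padding and end-of-sequence indices that the real classes handle.

```python
chars = list("0123456789abcdefghijklmnopqrstuvwxyz")
# Ordering produced by the slice expression removed in this commit.
legacy = chars[10:] + chars[1:10] + [chars[0]]


def encode(text, alphabet):
    """Map each character to its position in `alphabet` (simplified)."""
    table = {char: i for i, char in enumerate(alphabet)}
    return [table[char] for char in text]


def decode(indices, alphabet):
    """Map each index back to the character at that position (simplified)."""
    return "".join(alphabet[i] for i in indices)


ids = encode("abc123", legacy)

print(decode(ids, legacy))  # abc123  (same ordering on both sides)
print(decode(ids, chars))   # 012qrs  (orderings disagree)
```

In this simplified setting, indices produced under one ordering only decode correctly under that same ordering, so the change has to be applied to both classes together.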
......