The angle classification is used in the scene where the image is not 0 degrees. In this scene, it is necessary to perform a correction operation on the text line detected in the picture. In the PaddleOCR system,
The text line image obtained after text detection is sent to the recognition model after affine transformation. At this time, only a 0 and 180 degree angle classification of the text is required, so the built-in PaddleOCR text angle classifier **only supports 0 and 180 degree classification**. If you want to support more angles, you can modify the algorithm yourself to support.
-[4.1 Training engine prediction](#Training_engine_prediction)
<aname="DATA_PREPARATION"></a>
<aname="DATA_PREPARATION"></a>
### DATA PREPARATION
### DATA PREPARATION
PaddleOCR supports two data formats: `LMDB` is used to train public data and evaluation algorithms; `general data` is used to train your own data:
PaddleOCR supports two data formats:
-`LMDB` is used to train data sets stored in lmdb format;
-`general data` is used to train data sets stored in text files:
Please organize the dataset as follows:
Please organize the dataset as follows:
The default storage path for training data is `PaddleOCR/train_data`, if you already have a dataset on your disk, just create a soft link to the dataset directory:
The default storage path for training data is `PaddleOCR/train_data`, if you already have a dataset on your disk, just create a soft link to the dataset directory:
If you do not have a dataset locally, you can download it on the official website [icdar2015](http://rrc.cvc.uab.es/?ch=4&com=downloads). Also refer to [DTRB](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here),download the lmdb format dataset required for benchmark
If you want to reproduce the paper indicators of SRN, you need to download offline [augmented data](https://pan.baidu.com/s/1-HSZ-ZVdqBF2HaBZ5pRAKA), extraction code: y3ry. The augmented data is obtained by rotation and perturbation of mjsynth and synthtext. Please unzip the data to {your_path}/PaddleOCR/train_data/data_lmdb_Release/training/path.
<aname="Costom_Dataset"></a>
<aname="Costom_Dataset"></a>
* Use your own dataset:
#### 1.1 Costom dataset
If you want to use your own data for training, please refer to the following to organize your data.
If you want to use your own data for training, please refer to the following to organize your data.
- Training set
- Training set
First put the training images in the same folder (train_images), and use a txt file (rec_gt_train.txt) to store the image path and label.
It is recommended to put the training images in the same folder, and use a txt file (rec_gt_train.txt) to store the image path and label. The contents of the txt file are as follows:
* Note: by default, the image path and image label are split with \t, if you use other methods to split, it will cause training error
* Note: by default, the image path and image label are split with \t, if you use other methods to split, it will cause training error
```
```
" Image file name Image annotation "
" Image file name Image annotation "
train_data/train_0001.jpg 简单可依赖
train_data/rec/train/word_001.jpg 简单可依赖
train_data/train_0002.jpg 用科技让复杂的世界更简单
train_data/rec/train/word_002.jpg 用科技让复杂的世界更简单
```
...
PaddleOCR provides label files for training the icdar2015 dataset, which can be downloaded in the following ways:
The final training set should have the following file structure:
The final training set should have the following file structure:
```
```
|-train_data
|-train_data
|-ic15_data
|-rec
|- rec_gt_train.txt
|- rec_gt_train.txt
|- train
|- train
|- word_001.png
|- word_001.png
|- word_002.jpg
|- word_002.jpg
|- word_003.jpg
|- word_003.jpg
| ...
| ...
```
```
- Test set
- Test set
...
@@ -82,6 +73,7 @@ Similar to the training set, the test set also needs to be provided a folder con
...
@@ -82,6 +73,7 @@ Similar to the training set, the test set also needs to be provided a folder con
```
```
|-train_data
|-train_data
|-rec
|-ic15_data
|-ic15_data
|- rec_gt_test.txt
|- rec_gt_test.txt
|- test
|- test
...
@@ -90,8 +82,25 @@ Similar to the training set, the test set also needs to be provided a folder con
...
@@ -90,8 +82,25 @@ Similar to the training set, the test set also needs to be provided a folder con
|- word_003.jpg
|- word_003.jpg
| ...
| ...
```
```
<aname="Dataset_download"></a>
#### 1.2 Dataset download
If you do not have a dataset locally, you can download it on the official website [icdar2015](http://rrc.cvc.uab.es/?ch=4&com=downloads). Also refer to [DTRB](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here) ,download the lmdb format dataset required for benchmark
If you want to reproduce the paper indicators of SRN, you need to download offline [augmented data](https://pan.baidu.com/s/1-HSZ-ZVdqBF2HaBZ5pRAKA), extraction code: y3ry. The augmented data is obtained by rotation and perturbation of mjsynth and synthtext. Please unzip the data to {your_path}/PaddleOCR/train_data/data_lmdb_Release/training/path.
PaddleOCR provides label files for training the icdar2015 dataset, which can be downloaded in the following ways:
Finally, a dictionary ({word_dict_name}.txt) needs to be provided so that when the model is trained, all the characters that appear can be mapped to the dictionary index.
Finally, a dictionary ({word_dict_name}.txt) needs to be provided so that when the model is trained, all the characters that appear can be mapped to the dictionary index.
...
@@ -108,6 +117,8 @@ n
...
@@ -108,6 +117,8 @@ n
In `word_dict.txt`, there is a single word in each line, which maps characters and numeric indexes together, e.g "and" will be mapped to [2 5 1]
In `word_dict.txt`, there is a single word in each line, which maps characters and numeric indexes together, e.g "and" will be mapped to [2 5 1]
PaddleOCR has built-in dictionaries, which can be used on demand.
`ppocr/utils/ppocr_keys_v1.txt` is a Chinese dictionary with 6623 characters.
`ppocr/utils/ppocr_keys_v1.txt` is a Chinese dictionary with 6623 characters.
`ppocr/utils/ic15_dict.txt` is an English dictionary with 63 characters
`ppocr/utils/ic15_dict.txt` is an English dictionary with 63 characters
...
@@ -123,8 +134,6 @@ In `word_dict.txt`, there is a single word in each line, which maps characters a
...
@@ -123,8 +134,6 @@ In `word_dict.txt`, there is a single word in each line, which maps characters a
`ppocr/utils/dict/en_dict.txt` is a English dictionary with 63 characters
`ppocr/utils/dict/en_dict.txt` is a English dictionary with 63 characters
You can use it on demand.
The current multi-language model is still in the demo stage and will continue to optimize the model and add languages. **You are very welcome to provide us with dictionaries and fonts in other languages**,
The current multi-language model is still in the demo stage and will continue to optimize the model and add languages. **You are very welcome to provide us with dictionaries and fonts in other languages**,
If you like, you can submit the dictionary file to [dict](../../ppocr/utils/dict) and we will thank you in the Repo.
If you like, you can submit the dictionary file to [dict](../../ppocr/utils/dict) and we will thank you in the Repo.
...
@@ -136,14 +145,14 @@ To customize the dict file, please modify the `character_dict_path` field in `co
...
@@ -136,14 +145,14 @@ To customize the dict file, please modify the `character_dict_path` field in `co
If you need to customize dic file, please add character_dict_path field in configs/rec/rec_icdar15_train.yml to point to your dictionary path. And set character_type to ch.
If you need to customize dic file, please add character_dict_path field in configs/rec/rec_icdar15_train.yml to point to your dictionary path. And set character_type to ch.
<aname="Add_space_category"></a>
<aname="Add_space_category"></a>
- Add space category
#### 1.4 Add space category
If you want to support the recognition of the `space` category, please set the `use_space_char` field in the yml file to `True`.
If you want to support the recognition of the `space` category, please set the `use_space_char` field in the yml file to `True`.
**Note: use_space_char only takes effect when character_type=ch**
**Note: use_space_char only takes effect when character_type=ch**
<aname="TRAINING"></a>
<aname="TRAINING"></a>
### TRAINING
### 2 TRAINING
PaddleOCR provides training scripts, evaluation scripts, and prediction scripts. In this section, the CRNN recognition model will be used as an example:
PaddleOCR provides training scripts, evaluation scripts, and prediction scripts. In this section, the CRNN recognition model will be used as an example:
PaddleOCR provides a variety of data augmentation methods. If you want to add disturbance during training, please set `distort: true` in the configuration file.
PaddleOCR provides a variety of data augmentation methods. If you want to add disturbance during training, please set `distort: true` in the configuration file.
Each disturbance method is selected with a 50% probability during the training process. For specific code implementation, please refer to: [img_tools.py](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/ppocr/data/rec/img_tools.py)
Each disturbance method is selected with a 50% probability during the training process. For specific code implementation, please refer to: [img_tools.py](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/ppocr/data/rec/img_tools.py)
<aname="Training"></a>
<aname="Training"></a>
- Training
#### 2.2 Training
PaddleOCR supports alternating training and evaluation. You can modify `eval_batch_step` in `configs/rec/rec_icdar15_train.yml` to set the evaluation frequency. By default, it is evaluated every 500 iter and the best acc model is saved under `output/rec_CRNN/best_accuracy` during the evaluation process.
PaddleOCR supports alternating training and evaluation. You can modify `eval_batch_step` in `configs/rec/rec_icdar15_train.yml` to set the evaluation frequency. By default, it is evaluated every 500 iter and the best acc model is saved under `output/rec_CRNN/best_accuracy` during the evaluation process.
...
@@ -268,7 +277,7 @@ Eval:
...
@@ -268,7 +277,7 @@ Eval:
**Note that the configuration file for prediction/evaluation must be consistent with the training.**
**Note that the configuration file for prediction/evaluation must be consistent with the training.**
<aname="Multi_language"></a>
<aname="Multi_language"></a>
- Multi-language
#### 2.3 Multi-language
PaddleOCR currently supports 26 (except Chinese) language recognition. A multi-language configuration file template is
PaddleOCR currently supports 26 (except Chinese) language recognition. A multi-language configuration file template is
provided under the path `configs/rec/multi_languages`: [rec_multi_language_lite_train.yml](../../configs/rec/multi_language/rec_multi_language_lite_train.yml)。
provided under the path `configs/rec/multi_languages`: [rec_multi_language_lite_train.yml](../../configs/rec/multi_language/rec_multi_language_lite_train.yml)。
...
@@ -420,7 +429,7 @@ Eval:
...
@@ -420,7 +429,7 @@ Eval:
```
```
<aname="EVALUATION"></a>
<aname="EVALUATION"></a>
### EVALUATION
### 3 EVALUATION
The evaluation dataset can be set by modifying the `Eval.dataset.label_file_list` field in the `configs/rec/rec_icdar15_train.yml` file.
The evaluation dataset can be set by modifying the `Eval.dataset.label_file_list` field in the `configs/rec/rec_icdar15_train.yml` file.
pip install"paddleocr>=2.0.1"# Recommend to use version 2.0.1+
pip install"paddleocr>=2.0.1"# Recommend to use version 2.0.1+
...
@@ -12,9 +12,11 @@ build own whl package and install
...
@@ -12,9 +12,11 @@ build own whl package and install
python3 setup.py bdist_wheel
python3 setup.py bdist_wheel
pip3 install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the version of paddleocr
pip3 install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the version of paddleocr
```
```
### 1. Use by code
## 2 Use
### 2.1 Use by code
The paddleocr whl package will automatically download the ppocr lightweight model as the default model, which can be customized and replaced according to the section 3 **Custom Model**.
* detection classification and recognition
* detection angle classification and recognition
```python
```python
frompaddleocrimportPaddleOCR,draw_ocr
frompaddleocrimportPaddleOCR,draw_ocr
# Paddleocr supports Chinese, English, French, German, Korean and Japanese.
# Paddleocr supports Chinese, English, French, German, Korean and Japanese.
...
@@ -163,7 +165,7 @@ Output will be a list, each item contains classification result and confidence
...
@@ -163,7 +165,7 @@ Output will be a list, each item contains classification result and confidence
['0', 0.99999964]
['0', 0.99999964]
```
```
### Use by command line
### 2.2 Use by command line
show help information
show help information
```bash
```bash
...
@@ -239,11 +241,11 @@ Output will be a list, each item contains classification result and confidence
...
@@ -239,11 +241,11 @@ Output will be a list, each item contains classification result and confidence
['0', 0.99999964]
['0', 0.99999964]
```
```
## Use custom model
## 3 Use custom model
When the built-in model cannot meet the needs, you need to use your own trained model.
When the built-in model cannot meet the needs, you need to use your own trained model.
First, refer to the first section of [inference_en.md](./inference_en.md) to convert your det and rec model to inference model, and then use it as follows
First, refer to the first section of [inference_en.md](./inference_en.md) to convert your det and rec model to inference model, and then use it as follows