From d1f1351f9bdaa08a999649d92aa8a92a3e5f08ce Mon Sep 17 00:00:00 2001
From: grasswolfs
Date: Fri, 17 Jul 2020 13:45:10 +0800
Subject: [PATCH] test=develop

---
 doc/doc_en/benchmark_en.md                    | 36 +++++++++
 doc/doc_en/data_annotation_en.md              | 35 ++++++++
 doc/doc_en/data_synthesis_en.md               | 11 +++
 doc/doc_en/handwritten_datasets.md            | 28 +++++++
 doc/doc_en/installation_en.md                 | 11 ++-
 .../vertical_and_multilingual_datasets.md     | 79 +++++++++++++++++++
 6 files changed, 197 insertions(+), 3 deletions(-)
 create mode 100644 doc/doc_en/benchmark_en.md
 create mode 100644 doc/doc_en/data_annotation_en.md
 create mode 100644 doc/doc_en/data_synthesis_en.md
 create mode 100644 doc/doc_en/handwritten_datasets.md
 create mode 100644 doc/doc_en/vertical_and_multilingual_datasets.md

diff --git a/doc/doc_en/benchmark_en.md b/doc/doc_en/benchmark_en.md
new file mode 100644
index 00000000..9e2dadb1
--- /dev/null
+++ b/doc/doc_en/benchmark_en.md
@@ -0,0 +1,36 @@
+# BENCHMARK
+
+This document presents the prediction-time benchmark of the PaddleOCR ultra-lightweight Chinese model (8.6M) on several platforms.
+
+## TEST DATA
+* 500 images were randomly sampled from the Chinese public dataset [ICDAR2017-RCTW](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/datasets.md#ICDAR2017-RCTW-17).
+  Most of the pictures in the set were collected in the wild with mobile phone cameras, and some are screenshots.
+  The pictures cover a variety of scenes, including street views, posters, menus, indoor scenes and screenshots of mobile applications.
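The sampling step described above can be sketched as follows (an illustrative helper, not part of the PaddleOCR benchmark code; `image_paths` is a hypothetical list of file names):

```python
import random

def sample_test_set(image_paths, n=500, seed=0):
    """Reproducibly sample n images from the full dataset for benchmarking."""
    rng = random.Random(seed)  # fixed seed keeps the benchmark subset stable
    return rng.sample(image_paths, n)
```

Fixing the seed makes the 500-image subset reproducible across benchmark runs.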
+
+## MEASUREMENT
+The measured prediction times on the four platforms are as follows:
+
+| Long size(px) | T4(s) | V100(s) | Intel Xeon 6148(s) | Snapdragon 855(s) |
+| :---------: | :-----: | :-------: | :------------------: | :-----------------: |
+| 960 | 0.092 | 0.057 | 0.319 | 0.354 |
+| 640 | 0.067 | 0.045 | 0.198 | 0.236 |
+| 480 | 0.057 | 0.043 | 0.151 | 0.175 |
+
+Explanation:
+* The measured time covers the complete pipeline from image input to result output, including image
+pre-processing and post-processing.
+* ```Intel Xeon 6148``` is a server-side CPU. Intel MKL-DNN was used in the test to accelerate CPU prediction.
+To use this acceleration, you need to:
+  * Update to the latest development version of PaddlePaddle: https://www.paddlepaddle.org.cn/documentation/docs/zh/install/Tables.html#whl-dev
+    Please select the mkl wheel package matching the CUDA version and Python version of your environment.
+    For example, for CUDA 10 and Python 3.7:
+
+  ```
+  # Obtain the installation package
+  wget https://paddle-wheel.bj.bcebos.com/0.0.0-gpu-cuda10-cudnn7-mkl/paddlepaddle_gpu-0.0.0-cp37-cp37m-linux_x86_64.whl
+  # Installation
+  pip3.7 install paddlepaddle_gpu-0.0.0-cp37-cp37m-linux_x86_64.whl
+  ```
+  * Pass ```--enable_mkldnn True``` to turn on the acceleration switch when running prediction.
+* ```Snapdragon 855``` is a mobile processing platform.

diff --git a/doc/doc_en/data_annotation_en.md b/doc/doc_en/data_annotation_en.md
new file mode 100644
index 00000000..a99cdff8
--- /dev/null
+++ b/doc/doc_en/data_annotation_en.md
@@ -0,0 +1,35 @@
+# DATA ANNOTATION TOOLS
+
+Here are the commonly used data annotation tools, and the list will be updated continuously.
+You are welcome to contribute tools~
+
+1. **labelImg**
+
+* Tool description: rectangular labels
+
+* Tool address: https://github.com/tzutalin/labelImg
+
+* Sketch diagram:
+
+  ![labelimg](C:\Users\USER\Desktop\labelimg.jpg)
+
+2. **roLabelImg**
+
+* Tool description: a labeling tool rewritten based on labelImg, supporting rotated rectangular labels
+
+* Tool address: https://github.com/cgvict/roLabelImg
+
+* Sketch diagram:
+
+  ![roLabelImg](C:\Users\USER\Desktop\roLabelImg.png)
+
+3. **labelme**
+
+* Tool description: supports four-point, polygon, circle and other label shapes
+
+* Tool address: https://github.com/wkentaro/labelme
+
+* Sketch diagram:
+
+  ![labelme](C:\Users\USER\Desktop\labelme.jpg)

diff --git a/doc/doc_en/data_synthesis_en.md b/doc/doc_en/data_synthesis_en.md
new file mode 100644
index 00000000..81b19d53
--- /dev/null
+++ b/doc/doc_en/data_synthesis_en.md
@@ -0,0 +1,11 @@
+# DATA SYNTHESIS TOOLS
+
+In addition to open-source data, users can also use synthesis tools to generate their own data.
+Here are the commonly used data synthesis tools, and the list will be updated continuously. You are welcome to contribute tools~
+
+* [Text_renderer](https://github.com/Sanster/text_renderer)
+* [SynthText](https://github.com/ankush-me/SynthText)
+* [SynthText_Chinese_version](https://github.com/JarveeLee/SynthText_Chinese_version)
+* [TextRecognitionDataGenerator](https://github.com/Belval/TextRecognitionDataGenerator)
+* [SynthText3D](https://github.com/MhLiao/SynthText3D)
+* [UnrealText](https://github.com/Jyouhou/UnrealText/)

diff --git a/doc/doc_en/handwritten_datasets.md b/doc/doc_en/handwritten_datasets.md
new file mode 100644
index 00000000..da6008a2
--- /dev/null
+++ b/doc/doc_en/handwritten_datasets.md
@@ -0,0 +1,28 @@
+# Handwritten OCR dataset
+Here we have sorted out the commonly used handwritten OCR datasets, which are being updated continuously.
+We welcome you to contribute datasets~
+- [Institute of Automation, Chinese Academy of Sciences - handwritten Chinese dataset](#institute-of-automation-chinese-academy-of-sciences---handwritten-chinese-dataset)
+- [NIST handwritten single character dataset - English](#nist-handwritten-single-character-dataset---english-nist-handprinted-forms-and-characters-database)
+
+
+## Institute of Automation, Chinese Academy of Sciences - handwritten Chinese dataset
+- **Data source**: http://www.nlpr.ia.ac.cn/databases/handwriting/Download.html
+- **Data introduction**:
+  * It includes both online and offline handwritten data. `HWDB1.0~1.2` contains 3,895,135 handwritten single-character samples in 7,356 classes (7,185 Chinese characters and 171 English letters, numbers and symbols); `HWDB2.0~2.2` contains 5,091 page images, segmented into 52,230 text lines and 1,349,414 characters. All text and character samples are stored as grayscale images. Some samples are shown below.
+
+  ![](../datasets/CASIA_0.jpg)
+
+- **Download address**: http://www.nlpr.ia.ac.cn/databases/handwriting/Download.html
+- **Usage suggestion**: The data are single characters on a white background, so a large number of text lines can be composed for training. The white background can also be made transparent, which makes it easy to paste the characters onto various backgrounds. If semantics are required, it is suggested to sample single characters according to a real corpus when composing text lines.
+
+
+
+## NIST handwritten single character dataset - English (NIST Handprinted Forms and Characters Database)
+
+- **Data source**: [https://www.nist.gov/srd/nist-special-database-19](https://www.nist.gov/srd/nist-special-database-19)
+
+- **Data introduction**: The NIST19 dataset is suitable for training handwritten document and character recognition models. It was extracted from the handwriting sample forms of 3,600 writers and contains 810,000 character images in total. Nine of them are shown below.
+
+  ![](../datasets/nist_demo.png)
+
+
+- **Download address**: [https://www.nist.gov/srd/nist-special-database-19](https://www.nist.gov/srd/nist-special-database-19)

diff --git a/doc/doc_en/installation_en.md b/doc/doc_en/installation_en.md
index 1d708267..585f9d43 100644
--- a/doc/doc_en/installation_en.md
+++ b/doc/doc_en/installation_en.md
@@ -3,12 +3,14 @@
 After testing, paddleocr can run on glibc 2.23. You can also test other glibc versions or install glic 2.23 for the best compatibility.
 
 PaddleOCR working environment:
-- PaddlePaddle 1.7+
+- PaddlePaddle 1.7
 - python3
 - glibc 2.23
-- cuDNN 7.6+ (GPU)
 
 It is recommended to use the docker provided by us to run PaddleOCR, please refer to the use of docker [link](https://docs.docker.com/get-started/).
 
+*If you want to directly run the prediction code on Mac or Windows, you can start from step 2.*
+
 1. (Recommended) Prepare a docker environment. The first time you use this image, it will be downloaded automatically. Please be patient.
 ```
 # Switch to the working directory
@@ -47,7 +49,7 @@ docker images
 hub.baidubce.com/paddlepaddle/paddle   latest-gpu-cuda9.0-cudnn7-dev   f56310dcc829
 ```
 
-2. Install PaddlePaddle Fluid v1.7
+2. Install PaddlePaddle Fluid v1.7 (higher versions are not supported yet; adaptation is in progress)
 
 ```
 pip3 install --upgrade pip
@@ -56,6 +58,9 @@ python3 -m pip install paddlepaddle-gpu==1.7.2.post97 -i https://pypi.tuna.tsing
 # If you have cuda10 installed on your machine, please run the following command to install
 python3 -m pip install paddlepaddle-gpu==1.7.2.post107 -i https://pypi.tuna.tsinghua.edu.cn/simple
+
+# If your machine is CPU-only, please run the following command to install
+python3 -m pip install paddlepaddle==1.7.2 -i https://pypi.tuna.tsinghua.edu.cn/simple
 ```
 
 For more software version requirements, please refer to the instructions in [Installation Document](https://www.paddlepaddle.org.cn/install/quick) for operation.
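The version choice in step 2 above can be summarized in a small helper (illustrative only, not part of the installation scripts; the package names are taken verbatim from the pip commands above):

```python
def paddle_package(cuda_version=None):
    """Return the pip requirement string matching the install commands above."""
    if cuda_version is None:
        return "paddlepaddle==1.7.2"              # CPU-only machine
    if cuda_version == 9:
        return "paddlepaddle-gpu==1.7.2.post97"   # CUDA 9
    if cuda_version == 10:
        return "paddlepaddle-gpu==1.7.2.post107"  # CUDA 10
    raise ValueError("Only CUDA 9/10 wheels are listed for PaddlePaddle 1.7.2")
```

The returned string can be passed directly to `pip install`.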
diff --git a/doc/doc_en/vertical_and_multilingual_datasets.md b/doc/doc_en/vertical_and_multilingual_datasets.md
new file mode 100644
index 00000000..9d5ecff7
--- /dev/null
+++ b/doc/doc_en/vertical_and_multilingual_datasets.md
@@ -0,0 +1,79 @@
+# Vertical multi-language OCR dataset
+Here we have sorted out the commonly used vertical and multi-language OCR datasets, which are being updated continuously. We welcome you to contribute datasets~
+- [Chinese urban license plate dataset](#chinese-urban-license-plate-dataset)
+- [Bank credit card dataset](#bank-credit-card-dataset)
+- [Captcha dataset-Captcha](#captcha-dataset-captcha)
+- [multi-language dataset](#multi-language-datasetmulti-lingual-scene-text-detection-and-recognition)
+
+
+
+## Chinese urban license plate dataset
+
+- **Data source**: [https://github.com/detectRecog/CCPD](https://github.com/detectRecog/CCPD)
+
+- **Data introduction**: It contains more than 250,000 vehicle license plate images together with license plate detection and recognition annotations. The images cover the following scenes:
+  * CCPD-Base: general license plate pictures
+  * CCPD-DB: the license plate area is bright, dark or unevenly lit
+  * CCPD-FN: the license plate is relatively far from or close to the camera
+  * CCPD-Rotate: the license plate is rotated (20\~50 degrees horizontally, -10\~10 degrees vertically)
+  * CCPD-Tilt: the license plate is tilted (15\~45 degrees horizontally, 15\~45 degrees vertically)
+  * CCPD-Blur: the license plate is blurred due to camera shake
+  * CCPD-Weather: the license plate is photographed on rainy, snowy or foggy days
+  * CCPD-Challenge: some of the most challenging images in license plate detection and recognition tasks so far
+  * CCPD-NP: pictures of new cars without license plates
+
+  ![](../datasets/ccpd_demo.png)
+
+
+- **Download address**
+  * Baidu cloud download address (extraction code: hm0U): [https://pan.baidu.com/s/1i5AOjAbtkwb17Zy-NQGqkw](https://pan.baidu.com/s/1i5AOjAbtkwb17Zy-NQGqkw)
+  * Google Drive download address: [https://drive.google.com/file/d/1rdEsCUcIUaYOVRkx5IMTRNA7PcGMmSgc/view](https://drive.google.com/file/d/1rdEsCUcIUaYOVRkx5IMTRNA7PcGMmSgc/view)
+
+
+
+## Bank credit card dataset
+
+- **Data source**: [https://www.kesci.com/home/dataset/5954cf1372ead054a5e25870](https://www.kesci.com/home/dataset/5954cf1372ead054a5e25870)
+
+- **Data introduction**: There are three types of training data:
+  * 1. Sample card data of China Merchants Bank: card images with annotation data, 618 pictures in total
+  * 2. Single-character data: pictures with annotation data, 37 pictures in total
+  * 3. Cards of other banks, with no further details, 50 pictures in total
+
+  * A demo image is shown below. The annotation information is stored in Excel; the demo image below is annotated as:
+    * Top 8 card number: 62257583
+    * Card type: card of our bank
+    * Expiration date: 07/41
+    * Card user's name in Pinyin: MICHAEL
+
+  ![](../datasets/cmb_demo.jpg)
+
+- **Download address**: [https://cdn.kesci.com/cmb2017-2.zip](https://cdn.kesci.com/cmb2017-2.zip)
+
+
+
+## Captcha dataset-Captcha
+
+- **Data source**: [https://github.com/lepture/captcha](https://github.com/lepture/captcha)
+
+- **Data introduction**: This is a data synthesis toolkit that renders captcha images from input text. The toolkit was used to generate the demo images below.
+
+  ![](../datasets/captcha_demo.png)
+
+- **Download address**: The dataset is generated on the fly and has no download address.
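Since the dataset is generated rather than downloaded, a typical workflow first produces random label strings and then renders each one with the toolkit's `ImageCaptcha` class. A minimal label generator (our own illustrative sketch, not part of the toolkit):

```python
import random
import string

def captcha_labels(count, length=4, seed=0):
    """Generate random label strings; each would then be rendered to an image,
    e.g. with captcha.image.ImageCaptcha().write(label, f"{label}.png")."""
    rng = random.Random(seed)
    charset = string.ascii_uppercase + string.digits
    return ["".join(rng.choice(charset) for _ in range(length))
            for _ in range(count)]
```

Keeping the label as the file name makes the ground truth trivially recoverable at training time.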
+
+
+
+## multi-language dataset(Multi-lingual scene text detection and recognition)
+
+- **Data source**: [https://rrc.cvc.uab.es/?ch=15&com=downloads](https://rrc.cvc.uab.es/?ch=15&com=downloads)
+
+- **Data introduction**: The multi-language dataset (MLT) covers both text detection and language recognition tasks.
+  * In the detection task, the training set contains 10,000 images in 10 languages (1,000 training images per language), and the test set contains 10,000 images.
+  * In the recognition task, the training set contains 111,998 samples.
+
+
+- **Download address**: The training set is large and is split into two parts for download. It can only be downloaded after registering on the website:
+[https://rrc.cvc.uab.es/?ch=15&com=downloads](https://rrc.cvc.uab.es/?ch=15&com=downloads)
--
GitLab