diff --git a/PPOCRLabel/Makefile b/PPOCRLabel/Makefile
new file mode 100644
index 0000000000000000000000000000000000000000..7d72a890cbfb297de5e5329261092b8af481751a
--- /dev/null
+++ b/PPOCRLabel/Makefile
@@ -0,0 +1,35 @@
+# ex: set ts=8 noet:
+
+all: qt5 test
+
+test: testpy3
+
+testpy2:
+	python -m unittest discover tests
+
+testpy3:
+	python3 -m unittest discover tests
+
+qt4: qt4py2
+
+qt5: qt5py3
+
+qt4py2:
+	pyrcc4 -py2 -o libs/resources.py resources.qrc
+
+qt4py3:
+	pyrcc4 -py3 -o libs/resources.py resources.qrc
+
+qt5py3:
+	pyrcc5 -o libs/resources.py resources.qrc
+
+clean:
+	rm -rf ~/.labelImgSettings.pkl *.pyc dist labelImg.egg-info __pycache__ build
+
+pip_upload:
+	python3 setup.py upload
+
+long_description:
+	restview --long-description
+
+.PHONY: all
diff --git a/PPOCRLabel/README.md b/PPOCRLabel/README.md
index 93fd64ffe1d0923d3e754e3964fa7c62509d0df0..cef8fe1ff0adcafe30bd5c7591d0accb9ecded7d 100644
--- a/PPOCRLabel/README.md
+++ b/PPOCRLabel/README.md
@@ -24,11 +24,9 @@ python PPOCRLabel.py
 
 #### Ubuntu Linux
 ```
-sudo apt-get install pyqt5-dev-tools
-sudo apt-get install trash-cli
+pip3 install pyqt5
+pip3 install trash-cli
 cd ./PPOCRLabel # Change the directory to the PPOCRLabel folder
-sudo pip3 install -r requirements/requirements-linux-python3.txt
-make qt5py3
 python3 PPOCRLabel.py
 ```
 
@@ -38,7 +36,6 @@ pip3 install pyqt5
 pip3 uninstall opencv-python # The macOS build of opencv conflicts with pyqt, so uninstall opencv manually first
 pip3 install opencv-contrib-python-headless # Install the headless version of opencv
 cd ./PPOCRLabel # Change the directory to the PPOCRLabel folder
-make qt5py3
 python3 PPOCRLabel.py
 ```
 
@@ -75,6 +72,16 @@ python3 PPOCRLabel.py
 | rec_gt.txt | Recognition labels. Can be used directly to train the PPOCR recognition model. Generated after the user manually clicks "PaddleOCR" - "保存识别结果" (Save recognition result) in the menu bar. |
 | crop_img | Recognition data. Images cropped according to the detection boxes. Generated at the same time as rec_gt.txt. |
 
+## Notes
+### Built-in Models
+ - Default model: By default, PPOCRLabel uses the Chinese and English ultra-lightweight OCR model from PaddleOCR. It supports recognition of Chinese, English, and digits, and text detection in multiple languages.
+ - Switching the model language: Users can switch the built-in model language via "PaddleOCR" - "选择模型" (Choose OCR Model) in the menu bar. Currently supported languages include French, German, Korean, and Japanese. For model download links, see the [PaddleOCR model list](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/models_list.md).
+ - Custom model: Following [Using custom model code](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/whl.md#%E8%87%AA%E5%AE%9A%E4%B9%89%E6%A8%A1%E5%9E%8B), users can switch to a model they trained themselves by modifying the [instantiation of the PaddleOCR class](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/PPOCRLabel/PPOCRLabel.py#L110) in PPOCRLabel.py.
+
+### Error Notes
+- If paddleocr is also installed from the whl package, it takes priority over the PaddleOCR class called through paddleocr.py, and an outdated whl package will cause the program to fail.
+- PPOCRLabel does not support automatic labeling of images whose file names contain Chinese characters.
+
 ### References
 
 1.[Tzutalin. LabelImg. Git code (2015)](https://github.com/tzutalin/labelImg)
diff --git a/PPOCRLabel/README_en.md b/PPOCRLabel/README_en.md
index 7ebbb97c1f2693942f5949a17c68f54b5d6a6cbe..d503dd3d39d065769ef6e64a4737637b486788ac 100644
--- a/PPOCRLabel/README_en.md
+++ b/PPOCRLabel/README_en.md
@@ -26,11 +26,9 @@ python PPOCRLabel.py --lang en
 
 #### Ubuntu Linux
 ```
-sudo apt-get install pyqt5-dev-tools
-sudo apt-get install trash-cli
+pip3 install pyqt5
+pip3 install trash-cli
 cd ./PPOCRLabel # Change the directory to the PPOCRLabel folder
-sudo pip3 install -r requirements/requirements-linux-python3.txt
-make qt5py3
 python3 PPOCRLabel.py --lang en
 ```
 
@@ -40,7 +38,6 @@ pip3 install pyqt5
 pip3 uninstall opencv-python # Uninstall opencv manually as it conflicts with pyqt
 pip3 install opencv-contrib-python-headless # Install the headless version of opencv
 cd ./PPOCRLabel # Change the directory to the PPOCRLabel folder
-make qt5py3
 python3 PPOCRLabel.py --lang en
 ```
 
@@ -92,6 +89,14 @@ Therefore, if the recognition result has been manually changed before, it may ch
 | rec_gt.txt | The recognition label file, which can be directly used for PPOCR identification model training, is generated after the user clicks on the menu bar "PaddleOCR"-"Save recognition result". |
 | crop_img | The recognition data, generated at the same time with *rec_gt.txt* |
 
+
+### Built-in Model
+- Default model: By default, PPOCRLabel uses the Chinese and English ultra-lightweight OCR model from PaddleOCR, which supports recognition of Chinese, English, and digits, as well as text detection in multiple languages.
+- Model language switching: The built-in model language can be changed by clicking "PaddleOCR"-"Choose OCR Model" in the menu bar. Currently supported languages include French, German, Korean, and Japanese.
+For specific model download links, please refer to the [PaddleOCR Model List](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/models_list_en.md#multilingual-recognition-modelupdating)
+- Custom model: To use a model you trained yourself, modify the [PaddleOCR class instantiation](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/PPOCRLabel/PPOCRLabel.py#L110) in PPOCRLabel.py, following the [Custom Model Code](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/whl_en.md#use-custom-model) guide.
+
+
 ## Related
 
 1.[Tzutalin. LabelImg. Git code (2015)](https://github.com/tzutalin/labelImg)