Skip to content
体验新版
项目
组织
正在加载...
登录
切换导航
打开侧边栏
PaddlePaddle
PaddleOCR
提交
1a0848a4
P
PaddleOCR
项目概览
PaddlePaddle
/
PaddleOCR
大约 1 年 前同步成功
通知
1528
Star
32962
Fork
6643
代码
文件
提交
分支
Tags
贡献者
分支图
Diff
Issue
108
列表
看板
标记
里程碑
合并请求
7
Wiki
0
Wiki
分析
仓库
DevOps
项目成员
Pages
P
PaddleOCR
项目概览
项目概览
详情
发布
仓库
仓库
文件
提交
分支
标签
贡献者
分支图
比较
Issue
108
Issue
108
列表
看板
标记
里程碑
合并请求
7
合并请求
7
Pages
分析
分析
仓库分析
DevOps
Wiki
0
Wiki
成员
成员
收起侧边栏
关闭侧边栏
动态
分支图
创建新Issue
提交
Issue看板
未验证
提交
1a0848a4
编写于
7月 30, 2020
作者:
S
shaohua.zhang
提交者:
GitHub
7月 30, 2020
浏览文件
操作
浏览文件
下载
差异文件
Merge branch 'develop' into develop
上级
fce0a57d
f2ae24a9
变更
11
显示空白变更内容
内联
并排
Showing
11 changed file
with
65 addition
and
36 deletion
+65
-36
README.md
README.md
+1
-0
README_cn.md
README_cn.md
+3
-2
deploy/android_demo/README.md
deploy/android_demo/README.md
+1
-1
doc/doc_ch/installation.md
doc/doc_ch/installation.md
+1
-1
doc/doc_ch/recognition.md
doc/doc_ch/recognition.md
+2
-6
ppocr/data/det/db_process.py
ppocr/data/det/db_process.py
+4
-2
ppocr/data/rec/dataset_traversal.py
ppocr/data/rec/dataset_traversal.py
+1
-1
ppocr/utils/utility.py
ppocr/utils/utility.py
+15
-2
tools/infer/predict_det.py
tools/infer/predict_det.py
+4
-2
tools/infer/predict_rec.py
tools/infer/predict_rec.py
+4
-2
tools/infer/predict_system.py
tools/infer/predict_system.py
+29
-17
未找到文件。
README.md
浏览文件 @
1a0848a4
...
...
@@ -212,3 +212,4 @@ We welcome all the contributions to PaddleOCR and appreciate for your feedback v
-
Many thanks to
[
lyl120117
](
https://github.com/lyl120117
)
for contributing the code for printing the network structure.
-
Thanks
[
xiangyubo
](
https://github.com/xiangyubo
)
for contributing the handwritten Chinese OCR datasets.
-
Thanks
[
authorfu
](
https://github.com/authorfu
)
for contributing Android demo and
[
xiadeye
](
https://github.com/xiadeye
)
contributing iOS demo, respectively.
-
Thanks
[
BeyondYourself
](
https://github.com/BeyondYourself
)
for contributing many great suggestions and simplifying part of the code style.
README_cn.md
浏览文件 @
1a0848a4
...
...
@@ -205,8 +205,9 @@ PaddleOCR文本识别算法的训练和使用请参考文档教程中[模型训
## 贡献代码
我们非常欢迎你为PaddleOCR贡献代码,也十分感谢你的反馈。
-
非常感谢
[
Khanh Tran
](
https://github.com/xxxpsyduck
)
贡献了英文文档
。
-
非常感谢
[
Khanh Tran
](
https://github.com/xxxpsyduck
)
贡献了英文文档
-
非常感谢
[
zhangxin
](
https://github.com/ZhangXinNan
)(
[Blog](https://blog.csdn.net/sdlypyzq
)
) 贡献新的可视化方式、添加.gitgnore、处理手动设置PYTHONPATH环境变量的问题
-
非常感谢
[
lyl120117
](
https://github.com/lyl120117
)
贡献打印网络结构的代码
-
非常感谢
[
xiangyubo
](
https://github.com/xiangyubo
)
贡献手写中文OCR数据集
-
非常感谢
[
authorfu
](
https://github.com/authorfu
)
贡献Android和
[
xiadeye
](
https://github.com/xiadeye
)
贡献IOS的demo代码
-
非常感谢
[
BeyondYourself
](
https://github.com/BeyondYourself
)
给PaddleOCR提了很多非常棒的建议,并简化了PaddleOCR的部分代码风格。
deploy/android_demo/README.md
浏览文件 @
1a0848a4
# 如何快速测试
### 1. 安装最新版本的Android Studio
可以从https://developer.android.com/studio下载。本Demo使用是4.0版本Android Studio编写。
可以从https://developer.android.com/studio
下载。本Demo使用是4.0版本Android Studio编写。
### 2. 按照NDK 20 以上版本
Demo测试的时候使用的是NDK 20b版本,20版本以上均可以支持编译成功。
...
...
doc/doc_ch/installation.md
浏览文件 @
1a0848a4
...
...
@@ -7,7 +7,7 @@ PaddleOCR 工作环境
-
glibc 2.23
-
cuDNN 7.6+ (GPU)
建议使用我们提供的docker运行PaddleOCR,有关docker使用请参考
[
链接
](
https://docs.docker.com/get-started/
)
。
建议使用我们提供的docker运行PaddleOCR,有关docker
、nvidia-docker
使用请参考
[
链接
](
https://docs.docker.com/get-started/
)
。
*如您希望使用 mac 或 windows直接运行预测代码,可以从第2步开始执行。*
...
...
doc/doc_ch/recognition.md
浏览文件 @
1a0848a4
...
...
@@ -21,12 +21,11 @@ ln -sf <path/to/dataset> <path/to/paddle_ocr>/train_data/dataset
*
使用自己数据集:
若您希望使用自己的数据进行训练,请参考下文组织您的数据。
-
训练集
首先请将训练图片放入同一个文件夹(train_images),并用一个txt文件(rec_gt_train.txt)记录图片路径和标签。
*
注意:
默认请将图片路径和图片标签用
\t
分割,如用其他方式分割将造成训练报错
*
*注意:**
默认请将图片路径和图片标签用
\t
分割,如用其他方式分割将造成训练报错
```
" 图像文件名 图像标注信息 "
...
...
@@ -41,12 +40,9 @@ PaddleOCR 提供了一份用于训练 icdar2015 数据集的标签文件,通
wget -P ./train_data/ic15_data https://paddleocr.bj.bcebos.com/dataset/rec_gt_train.txt
# 测试集标签
wget -P ./train_data/ic15_data https://paddleocr.bj.bcebos.com/dataset/rec_gt_test.txt
```
最终训练集应有如下文件结构:
```
|-train_data
|-ic15_data
...
...
@@ -150,7 +146,7 @@ PaddleOCR支持训练和评估交替进行, 可以在 `configs/rec/rec_icdar15_t
如果验证集很大,测试将会比较耗时,建议减少评估次数,或训练完再进行评估。
*
提示:
可通过 -c 参数选择
`configs/rec/`
路径下的多种模型配置进行训练,PaddleOCR支持的识别算法有:
*
*提示:**
可通过 -c 参数选择
`configs/rec/`
路径下的多种模型配置进行训练,PaddleOCR支持的识别算法有:
| 配置文件 | 算法名称 | backbone | trans | seq | pred |
...
...
ppocr/data/det/db_process.py
浏览文件 @
1a0848a4
...
...
@@ -17,7 +17,7 @@ import cv2
import
numpy
as
np
import
json
import
sys
from
ppocr.utils.utility
import
initial_logger
from
ppocr.utils.utility
import
initial_logger
,
check_and_read_gif
logger
=
initial_logger
()
from
.data_augment
import
AugmentData
...
...
@@ -100,6 +100,8 @@ class DBProcessTrain(object):
def
__call__
(
self
,
label_infor
):
img_path
,
gt_label
=
self
.
convert_label_infor
(
label_infor
)
imgvalue
,
flag
=
check_and_read_gif
(
img_path
)
if
not
flag
:
imgvalue
=
cv2
.
imread
(
img_path
)
if
imgvalue
is
None
:
logger
.
info
(
"{} does not exist!"
.
format
(
img_path
))
...
...
ppocr/data/rec/dataset_traversal.py
浏览文件 @
1a0848a4
...
...
@@ -233,7 +233,7 @@ class SimpleReader(object):
img_num
=
len
(
label_infor_list
)
img_id_list
=
list
(
range
(
img_num
))
random
.
shuffle
(
img_id_list
)
if
sys
.
platform
==
"win32"
:
if
sys
.
platform
==
"win32"
and
self
.
num_workers
!=
1
:
print
(
"multiprocess is not fully compatible with Windows."
"num_workers will be 1."
)
self
.
num_workers
=
1
...
...
ppocr/utils/utility.py
浏览文件 @
1a0848a4
...
...
@@ -15,6 +15,8 @@
import
logging
import
os
import
imghdr
import
cv2
from
paddle
import
fluid
def
initial_logger
():
...
...
@@ -62,7 +64,7 @@ def get_image_file_list(img_file):
if
img_file
is
None
or
not
os
.
path
.
exists
(
img_file
):
raise
Exception
(
"not found any img file in {}"
.
format
(
img_file
))
img_end
=
{
'jpg'
,
'bmp'
,
'png'
,
'jpeg'
,
'rgb'
,
'tif'
,
'tiff'
}
img_end
=
{
'jpg'
,
'bmp'
,
'png'
,
'jpeg'
,
'rgb'
,
'tif'
,
'tiff'
,
'gif'
,
'GIF'
}
if
os
.
path
.
isfile
(
img_file
)
and
imghdr
.
what
(
img_file
)
in
img_end
:
imgs_lists
.
append
(
img_file
)
elif
os
.
path
.
isdir
(
img_file
):
...
...
@@ -75,7 +77,18 @@ def get_image_file_list(img_file):
return
imgs_lists
from
paddle
import
fluid
def
check_and_read_gif
(
img_path
):
if
os
.
path
.
basename
(
img_path
)[
-
3
:]
in
[
'gif'
,
'GIF'
]:
gif
=
cv2
.
VideoCapture
(
img_path
)
ret
,
frame
=
gif
.
read
()
if
not
ret
:
logging
.
info
(
"Cannot read {}. This gif image maybe corrupted."
)
return
None
,
False
if
len
(
frame
.
shape
)
==
2
or
frame
.
shape
[
-
1
]
==
1
:
frame
=
cv2
.
cvtColor
(
frame
,
cv2
.
COLOR_GRAY2RGB
)
imgvalue
=
frame
[:,
:,
::
-
1
]
return
imgvalue
,
True
return
None
,
False
def
create_multi_devices_program
(
program
,
loss_var_name
):
...
...
tools/infer/predict_det.py
浏览文件 @
1a0848a4
...
...
@@ -20,7 +20,7 @@ sys.path.append(os.path.join(__dir__, '../..'))
import
tools.infer.utility
as
utility
from
ppocr.utils.utility
import
initial_logger
logger
=
initial_logger
()
from
ppocr.utils.utility
import
get_image_file_list
from
ppocr.utils.utility
import
get_image_file_list
,
check_and_read_gif
import
cv2
from
ppocr.data.det.east_process
import
EASTProcessTest
from
ppocr.data.det.db_process
import
DBProcessTest
...
...
@@ -139,6 +139,8 @@ if __name__ == "__main__":
if
not
os
.
path
.
exists
(
draw_img_save
):
os
.
makedirs
(
draw_img_save
)
for
image_file
in
image_file_list
:
img
,
flag
=
check_and_read_gif
(
image_file
)
if
not
flag
:
img
=
cv2
.
imread
(
image_file
)
if
img
is
None
:
logger
.
info
(
"error in loading image:{}"
.
format
(
image_file
))
...
...
tools/infer/predict_rec.py
浏览文件 @
1a0848a4
...
...
@@ -20,7 +20,7 @@ sys.path.append(os.path.abspath(os.path.join(__dir__, '../..')))
import
tools.infer.utility
as
utility
from
ppocr.utils.utility
import
initial_logger
logger
=
initial_logger
()
from
ppocr.utils.utility
import
get_image_file_list
from
ppocr.utils.utility
import
get_image_file_list
,
check_and_read_gif
import
cv2
import
copy
import
numpy
as
np
...
...
@@ -153,7 +153,9 @@ def main(args):
valid_image_file_list
=
[]
img_list
=
[]
for
image_file
in
image_file_list
:
img
=
cv2
.
imread
(
image_file
,
cv2
.
IMREAD_COLOR
)
img
,
flag
=
check_and_read_gif
(
image_file
)
if
not
flag
:
img
=
cv2
.
imread
(
image_file
)
if
img
is
None
:
logger
.
info
(
"error in loading image:{}"
.
format
(
image_file
))
continue
...
...
tools/infer/predict_system.py
浏览文件 @
1a0848a4
...
...
@@ -27,7 +27,7 @@ import copy
import
numpy
as
np
import
math
import
time
from
ppocr.utils.utility
import
get_image_file_list
from
ppocr.utils.utility
import
get_image_file_list
,
check_and_read_gif
from
PIL
import
Image
from
tools.infer.utility
import
draw_ocr
from
tools.infer.utility
import
draw_ocr_box_txt
...
...
@@ -49,16 +49,21 @@ class TextSystem(object):
points[:, 0] = points[:, 0] - left
points[:, 1] = points[:, 1] - top
'''
img_crop_width
=
int
(
max
(
np
.
linalg
.
norm
(
points
[
0
]
-
points
[
1
]),
img_crop_width
=
int
(
max
(
np
.
linalg
.
norm
(
points
[
0
]
-
points
[
1
]),
np
.
linalg
.
norm
(
points
[
2
]
-
points
[
3
])))
img_crop_height
=
int
(
max
(
np
.
linalg
.
norm
(
points
[
0
]
-
points
[
3
]),
img_crop_height
=
int
(
max
(
np
.
linalg
.
norm
(
points
[
0
]
-
points
[
3
]),
np
.
linalg
.
norm
(
points
[
1
]
-
points
[
2
])))
pts_std
=
np
.
float32
([[
0
,
0
],
[
img_crop_width
,
0
],
pts_std
=
np
.
float32
([[
0
,
0
],
[
img_crop_width
,
0
],
[
img_crop_width
,
img_crop_height
],
[
0
,
img_crop_height
]])
M
=
cv2
.
getPerspectiveTransform
(
points
,
pts_std
)
dst_img
=
cv2
.
warpPerspective
(
img
,
M
,
(
img_crop_width
,
img_crop_height
),
dst_img
=
cv2
.
warpPerspective
(
img
,
M
,
(
img_crop_width
,
img_crop_height
),
borderMode
=
cv2
.
BORDER_REPLICATE
,
flags
=
cv2
.
INTER_CUBIC
)
dst_img_height
,
dst_img_width
=
dst_img
.
shape
[
0
:
2
]
...
...
@@ -119,6 +124,8 @@ def main(args):
is_visualize
=
True
tackle_img_num
=
0
for
image_file
in
image_file_list
:
img
,
flag
=
check_and_read_gif
(
image_file
)
if
not
flag
:
img
=
cv2
.
imread
(
image_file
)
if
img
is
None
:
logger
.
info
(
"error in loading image:{}"
.
format
(
image_file
))
...
...
@@ -130,14 +137,14 @@ def main(args):
dt_boxes
,
rec_res
=
text_sys
(
img
)
elapse
=
time
.
time
()
-
starttime
print
(
"Predict time of %s: %.3fs"
%
(
image_file
,
elapse
))
drop_score
=
0.5
dt_num
=
len
(
dt_boxes
)
dt_boxes_final
=
[]
for
dno
in
range
(
dt_num
):
text
,
score
=
rec_res
[
dno
]
if
score
>=
0.5
:
if
score
>=
drop_score
:
text_str
=
"%s, %.3f"
%
(
text
,
score
)
print
(
text_str
)
dt_boxes_final
.
append
(
dt_boxes
[
dno
])
if
is_visualize
:
image
=
Image
.
fromarray
(
cv2
.
cvtColor
(
img
,
cv2
.
COLOR_BGR2RGB
))
...
...
@@ -146,7 +153,12 @@ def main(args):
scores
=
[
rec_res
[
i
][
1
]
for
i
in
range
(
len
(
rec_res
))]
draw_img
=
draw_ocr
(
image
,
boxes
,
txts
,
scores
,
draw_txt
=
True
,
drop_score
=
0.5
)
image
,
boxes
,
txts
,
scores
,
draw_txt
=
True
,
drop_score
=
drop_score
)
draw_img_save
=
"./inference_results/"
if
not
os
.
path
.
exists
(
draw_img_save
):
os
.
makedirs
(
draw_img_save
)
...
...
编辑
预览
Markdown
is supported
0%
请重试
或
添加新附件
.
添加附件
取消
You are about to add
0
people
to the discussion. Proceed with caution.
先完成此消息的编辑!
取消
想要评论请
注册
或
登录