yinnxinn / chineseocr
Commit 7305add9
Authored Aug 25, 2019 by wenlihaoyu
fix error #339
Parent: 60d696c5
Showing 3 changed files with 216 additions and 109 deletions (+216 -109)
README.md +25 -19
app.py +163 -63
config.py +28 -27
README.md @ 7305add9
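## This project implements Chinese natural-scene text detection and recognition based on [yolo3](https://github.com/pjreddie/darknet.git) and [crnn](https://github.com/meijieru/crnn.pytorch.git)
The master branch will be kept for one week; after that, the app branch will replace master.
# Features
- [x] Text direction detection for 0, 90, 180 and 270 degrees (supports dnn/tensorflow)
- [x] Text detection with darknet / opencv dnn / keras; training with darknet/keras
- [x] Variable-length OCR training (English, Chinese-English); crnn\dense OCR recognition and training; added PyTorch-to-Keras model conversion code (tools/pytorch_to_keras.py)
- [x] Model conversion: darknet to keras, keras to darknet, pytorch to keras
- [x] Added structured-data recognition for ID cards and train tickets
- [ ] Add a speech model to correct OCR results
- [ ] Add a CNN+CTC model and support calling OCR through the DNN module; average time per single-line image under 0.02 s
- [ ] Optimize CPU invocation so that recognition speed approaches GPU (coming soon)
- [x] Model conversion: darknet to keras, keras to darknet, pytorch to keras
- [x] Structured-data recognition for ID cards and train tickets
- [x] Added a CNN+CTC model and support for calling OCR through the DNN module; average time per single-line image under 0.02 s
- [ ] Speed up the CPU version
- [ ] Support OCR based on a user dictionary
- [ ] Add a language model to correct OCR results
- [ ] Support a real-time recognition setup on Raspberry Pi
## Environment setup
...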
@@ -38,7 +39,6 @@ lib = CDLL(root+"chineseocr/darknet/libdarknet.so", RTLD_GLOBAL)
## Download the model files
Model file locations:
* [baidu pan](https://pan.baidu.com/s/1gTW9gwJR6hlwTuyB6nCkzQ)
* [google drive](https://drive.google.com/drive/folders/1XiT1FLFvokAdwfE9WSUSS1PnZA34WBzy?usp=sharing)
Copy all files in the folder into the models directory.
...
@@ -56,22 +56,21 @@ keras to darknet
```
python tools/keras_to_darknet.py -cfg_path models/text.cfg -weights_path models/text.weights -output_path models/text.weights
```
## Compile the language model
## Compile the language model (optional)
``` Bash
git clone --recursive https://github.com/parlance/ctcdecode.git
cd ctcdecode
pip install .
```
## Download the language model
## Download the language model (optional)
``` Bash
wget https://deepspeech.bj.bcebos.com/zh_lm/zh_giga.no_cna_cmn.prune01244.klm
mv zh_giga.no_cna_cmn.prune01244.klm chineseocr/models/
```
## Start the web service
## Model selection
``` Bash
cd chineseocr ## enter the chineseocr directory
ipython app.py 8080 ## 8080 is the port; any port can be used
See config.py
```
## Build the Docker image
``` Bash
...
@@ -83,6 +82,18 @@ docker run -d -p 8080:8080 chineseocr /root/anaconda3/bin/python app.py
```
## Start the web service
``` Bash
cd chineseocr ## enter the chineseocr directory
python app.py 8080 ## 8080 is the port; any port can be used
```
## Access the service
http://127.0.0.1:8080/ocr
<img width="500" height="300" src="https://github.com/chineseocr/chineseocr/blob/master/test/demo.png"/>
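A client request against this endpoint could be sketched as follows. The JSON field names (`imgString`, `billModel`, `textAngle`, `textLine`) and the `;base64,` separator come from the POST handler in app.py in this commit. The helper names `build_payload` and `post_ocr`, the `data:image/jpeg` prefix, and the use of `urllib` are assumptions made for illustration and are not part of the project.

```python
import base64
import json
import urllib.request


def build_payload(image_bytes, bill_model="", text_angle=False, text_line=False):
    """Build the JSON body that the POST handler in app.py reads.

    The handler keeps only the part after ';base64,', so the MIME prefix
    used here is an assumption and is discarded on the server side.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({
        "imgString": "data:image/jpeg;base64," + encoded,
        "billModel": bill_model,   # '' or one of the entries in billList
        "textAngle": text_angle,   # ask for text direction detection
        "textLine": text_line,     # treat the image as a single text line
    })


def post_ocr(image_path, url="http://127.0.0.1:8080/ocr"):
    """Send one image to a running service and return the parsed reply."""
    with open(image_path, "rb") as f:
        body = build_payload(f.read()).encode("utf-8")
    request = urllib.request.Request(url, data=body)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Judging by the handler's final `json.dumps` call, the reply should carry a `res` list and a `timeTake` value.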
## Recognition results
...
@@ -92,11 +103,6 @@ docker run -d -p 8080:8080 chineseocr /root/anaconda3/bin/python app.py
<img width="500" height="300" src="https://github.com/chineseocr/chineseocr/blob/master/test/line-demo.png"/>
## Access the service
http://127.0.0.1:8080/ocr
<img width="500" height="300" src="https://github.com/chineseocr/chineseocr/blob/master/test/demo.png"/>
## References
1. yolo3 https://github.com/pjreddie/darknet.git
...
app.py @ 7305add9
...
@@ -3,20 +3,108 @@
@author: lywen
"""
import os
import cv2
import json
import time
import uuid
import base64
import web
import numpy as np
import uuid
from PIL import Image
web.config.debug = True
import model
filelock = 'file.lock'
if os.path.exists(filelock):
    os.remove(filelock)
render = web.template.render('templates', base='base')
from config import DETECTANGLE
from apphelper.image import union_rbox,adjust_box_to_origin
from config import *
from apphelper.image import union_rbox,adjust_box_to_origin,base64_to_PIL
from application import trainTicket,idcard
if yoloTextFlag=='keras' or AngleModelFlag=='tf' or ocrFlag=='keras':
    if GPU:
        os.environ["CUDA_VISIBLE_DEVICES"] = str(GPUID)
        import tensorflow as tf
        from keras import backend as K
        config = tf.ConfigProto()
        config.gpu_options.allocator_type = 'BFC'
        config.gpu_options.per_process_gpu_memory_fraction = 0.3 ## maximum fraction of GPU memory to use
        config.gpu_options.allow_growth = True ## whether GPU memory may grow dynamically
        K.set_session(tf.Session(config=config))
        K.get_session().run(tf.global_variables_initializer())
    else:
        ## run on CPU
        os.environ["CUDA_VISIBLE_DEVICES"] = ''
if yoloTextFlag=='opencv':
    scale,maxScale = IMGSIZE
    from text.opencv_dnn_detect import text_detect
elif yoloTextFlag=='darknet':
    scale,maxScale = IMGSIZE
    from text.darknet_detect import text_detect
elif yoloTextFlag=='keras':
    scale,maxScale = IMGSIZE[0],2048
    from text.keras_detect import text_detect
else:
    print("err,text engine in keras\opencv\darknet")
from text.opencv_dnn_detect import angle_detect
if ocr_redis:
    ## concurrent multi-task recognition
    from apphelper.redisbase import redisDataBase
    ocr = redisDataBase().put_values
else:
    from crnn.keys import alphabetChinese,alphabetEnglish
    if ocrFlag=='keras':
        from crnn.network_keras import CRNN
        if chineseModel:
            alphabet = alphabetChinese
            if LSTMFLAG:
                ocrModel = ocrModelKerasLstm
            else:
                ocrModel = ocrModelKerasDense
        else:
            ocrModel = ocrModelKerasEng
            alphabet = alphabetEnglish
            LSTMFLAG = True
    elif ocrFlag=='torch':
        from crnn.network_torch import CRNN
        if chineseModel:
            alphabet = alphabetChinese
            if LSTMFLAG:
                ocrModel = ocrModelTorchLstm
            else:
                ocrModel = ocrModelTorchDense
        else:
            ocrModel = ocrModelTorchEng
            alphabet = alphabetEnglish
            LSTMFLAG = True
    elif ocrFlag=='opencv':
        from crnn.network_dnn import CRNN
        ocrModel = ocrModelOpencv
        alphabet = alphabetChinese
    else:
        print("err,ocr engine in keras\opencv\darknet")
    nclass = len(alphabet)+1
    if ocrFlag=='opencv':
        crnn = CRNN(alphabet=alphabet)
    else:
        crnn = CRNN(32, 1, nclass, 256, leakyRelu=False,lstmFlag=LSTMFLAG,GPU=GPU,alphabet=alphabet)
    if os.path.exists(ocrModel):
        crnn.load_weights(ocrModel)
    else:
        print("download model or tranform model with tools!")
    ocr = crnn.predict_job
from main import TextOcrModel
model = TextOcrModel(ocr,text_detect,angle_detect)
billList = ['通用OCR','火车票','身份证'] ## general OCR, train ticket, ID card
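The detection-engine branch in this hunk prints a message for an unrecognized `yoloTextFlag` and carries on, which would leave `text_detect` undefined when `TextOcrModel` is constructed. A table-driven variant that fails early might look like the sketch below. The module paths are the ones imported in app.py; `DETECT_MODULES` and `detector_module` are hypothetical names, and this is not the code in the commit.

```python
# Module paths as imported in app.py; the table itself is an illustration.
DETECT_MODULES = {
    "opencv": "text.opencv_dnn_detect",
    "darknet": "text.darknet_detect",
    "keras": "text.keras_detect",
}


def detector_module(flag):
    """Return the module path for a detection engine flag, or raise."""
    try:
        return DETECT_MODULES[flag]
    except KeyError:
        raise ValueError(
            "unknown text engine {!r}; expected one of {}".format(
                flag, sorted(DETECT_MODULES))
        ) from None
```

`importlib.import_module(detector_module(yoloTextFlag)).text_detect` would then resolve the function, assuming each module exposes `text_detect` as the imports in app.py suggest.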
...
@@ -30,79 +118,91 @@ class OCR:
        post['H'] = 1000
        post['width'] = 600
        post['W'] = 600
        post['uuid'] = uuid.uuid1().__str__()
        post['billList'] = billList
        return render.ocr(post)
    def POST(self):
        t = time.time()
        data = web.data()
        uidJob = uuid.uuid1().__str__()
        data = json.loads(data)
        billModel = data.get('billModel','')
        textAngle = data.get('textAngle',False) ## text direction detection
        textLine = data.get('textLine',False) ## single-line recognition only
        imgString = data['imgString'].encode().split(b';base64,')[-1]
        imgString = base64.b64decode(imgString)
        jobid = uuid.uuid1().__str__()
        path = 'test/{}.jpg'.format(jobid)
        with open(path,'wb') as f:
            f.write(imgString)
        img = cv2.imread(path) ##GBR
        img = base64_to_PIL(imgString)
        if img is not None:
            img = np.array(img)
        H,W = img.shape[:2]
        timeTake = time.time()
        if textLine:
            ## single-line recognition
            partImg = Image.fromarray(img)
            text = model.crnnOcr(partImg.convert('L'))
            res = [{'text':text,'name':'0','box':[0,0,W,0,W,H,0,H]}]
        else:
            detectAngle = textAngle
            _,result,angle = model.model(img,
                                         detectAngle=detectAngle, ## whether to run text direction detection; controlled by a web request parameter
                                         config=dict(MAX_HORIZONTAL_GAP=50, ## maximum gap between characters, used when merging text lines
                                                     MIN_V_OVERLAPS=0.6,
                                                     MIN_SIZE_SIM=0.6,
                                                     TEXT_PROPOSALS_MIN_SCORE=0.1,
                                                     TEXT_PROPOSALS_NMS_THRESH=0.3,
                                                     TEXT_LINE_NMS_THRESH=0.7, ## IoU threshold between text lines
                                                     ),
                                         leftAdjust=True, ## extend detected text lines to the left
                                         rightAdjust=True, ## extend detected text lines to the right
                                         alph=0.01, ## extension factor for the left/right extension
                                         )
            if billModel=='' or billModel=='通用OCR':
                result = union_rbox(result,0.2)
                res = [{'text':x['text'],
                        'name':str(i),
                        'box':{'cx':x['cx'],
                               'cy':x['cy'],
                               'w':x['w'],
                               'h':x['h'],
                               'angle':x['degree']
                               }
                        } for i,x in enumerate(result)]
                res = adjust_box_to_origin(img,angle,res) ## correct the boxes
            elif billModel=='火车票':
                res = trainTicket.trainTicket(result)
                res = res.res
                res = [{'text':res[key],'name':key,'box':{}} for key in res]
            elif billModel=='身份证':
                res = idcard.idcard(result)
                res = res.res
                res = [{'text':res[key],'name':key,'box':{}} for key in res]
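The three branches above shape the reply in two ways: general OCR returns one entry per detected box, while the train-ticket and ID-card branches return one entry per extracted field with an empty box. The sketch below reproduces only that shaping on plain dictionaries. `box_entries`, `field_entries` and the sample values are made up for illustration, and the `adjust_box_to_origin` step is left out because its implementation is not shown in this diff.

```python
def box_entries(result):
    """One entry per detected text box, as in the general OCR branch."""
    return [
        {
            "text": x["text"],
            "name": str(i),
            "box": {"cx": x["cx"], "cy": x["cy"], "w": x["w"],
                    "h": x["h"], "angle": x["degree"]},
        }
        for i, x in enumerate(result)
    ]


def field_entries(fields):
    """One entry per structured field, as in the ticket and ID-card branches."""
    return [{"text": fields[key], "name": key, "box": {}} for key in fields]


# Made-up sample values, only to show the two shapes.
sample_boxes = [{"text": "abc", "cx": 10, "cy": 5, "w": 20, "h": 10, "degree": 0}]
sample_fields = {"number": "G123"}
```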
        while time.time()-t<=TIMEOUT:
            if os.path.exists(filelock):
                continue
            else:
                with open(filelock,'w') as f:
                    f.write(uidJob)
                if textLine:
                    ## single-line recognition
                    partImg = Image.fromarray(img)
                    text = crnn.predict(partImg.convert('L'))
                    res = [{'text':text,'name':'0','box':[0,0,W,0,W,H,0,H]}]
                    os.remove(filelock)
                    break
                else:
                    detectAngle = textAngle
                    result,angle = model.model(img,
                                               scale=scale,
                                               maxScale=maxScale,
                                               detectAngle=detectAngle, ## whether to run text direction detection; controlled by a web request parameter
                                               MAX_HORIZONTAL_GAP=100, ## maximum gap between characters, used when merging text lines
                                               MIN_V_OVERLAPS=0.6,
                                               MIN_SIZE_SIM=0.6,
                                               TEXT_PROPOSALS_MIN_SCORE=0.1,
                                               TEXT_PROPOSALS_NMS_THRESH=0.3,
                                               TEXT_LINE_NMS_THRESH=0.99, ## IoU threshold between text lines
                                               LINE_MIN_SCORE=0.1,
                                               leftAdjustAlph=0.01, ## extend detected text lines to the left
                                               rightAdjustAlph=0.01, ## extend detected text lines to the right
                                               )
                    if billModel=='' or billModel=='通用OCR':
                        result = union_rbox(result,0.2)
                        res = [{'text':x['text'],
                                'name':str(i),
                                'box':{'cx':x['cx'],
                                       'cy':x['cy'],
                                       'w':x['w'],
                                       'h':x['h'],
                                       'angle':x['degree']
                                       }
                                } for i,x in enumerate(result)]
                        res = adjust_box_to_origin(img,angle,res) ## correct the boxes
                    elif billModel=='火车票':
                        res = trainTicket.trainTicket(result)
                        res = res.res
                        res = [{'text':res[key],'name':key,'box':{}} for key in res]
                    elif billModel=='身份证':
                        res = idcard.idcard(result)
                        res = res.res
                        res = [{'text':res[key],'name':key,'box':{}} for key in res]
                    os.remove(filelock)
                    break
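The loop above serializes requests with a lock file: it polls until `file.lock` is absent, creates it, runs recognition, and removes it. Because the `os.path.exists` check and the `open` call are separate steps, two requests could both pass the check. The same idea with an atomic create (`os.open` with `O_CREAT | O_EXCL`) and cleanup in a `finally` clause might be sketched as below. `file_lock`, the polling interval, and the `TimeoutError` are assumptions for illustration and do not appear in the commit.

```python
import contextlib
import os
import time


@contextlib.contextmanager
def file_lock(path, timeout=30.0, poll=0.05):
    """Hold an exclusive lock file for the duration of the with-block."""
    deadline = time.time() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the file already exists,
            # so only one caller can create it.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.time() >= deadline:
                raise TimeoutError("could not acquire {}".format(path))
            time.sleep(poll)
    try:
        os.close(fd)
        yield
    finally:
        os.remove(path)  # released even if the block raises
```

A handler could then wrap its recognition call in `with file_lock('file.lock', timeout=TIMEOUT):`, which also covers the case where recognition raises before the lock is removed.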
        timeTake = time.time()-timeTake
        timeTake = time.time()-t
        os.remove(path)
        return json.dumps({'res':res,'timeTake':round(timeTake,4)},ensure_ascii=False)
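The handler serializes its reply with `ensure_ascii=False`, so Chinese text is emitted as-is instead of as `\uXXXX` escapes. A small sketch of the difference, using a made-up reply value:

```python
import json

# Made-up reply in the same shape as the handler's return value.
reply = {"res": [{"text": "火车票", "name": "0", "box": {}}], "timeTake": 0.1234}

escaped = json.dumps(reply)                       # default: non-ASCII is escaped
readable = json.dumps(reply, ensure_ascii=False)  # as in the handler
```

Both strings decode to the same object; only the wire form differs.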
...
config.py @ 7305add9
import os
########################text detection########################
## text detection engine
pwd = os.getcwd()
opencvFlag = 'keras' ## keras, opencv, darknet; model performance: keras > darknet > opencv
########################text detection################################################
## text detection engine
IMGSIZE = (608,608) ## yolo3 input image size
## anchors for the keras version
yoloTextFlag = 'keras' ## keras, opencv, darknet; model performance: keras > darknet > opencv
############## keras yolo ##############
keras_anchors = '8,11, 8,16, 8,23, 8,33, 8,48, 8,97, 8,139, 8,198, 8,283'
class_names = ['none','text',]
kerasTextModel = os.path.join(pwd,"models","text.h5") ## model weights file for the keras version
############## keras yolo ##############
############## darknet yolo ##############
darknetRoot = os.path.join(os.path.curdir,"darknet") ## yolo install directory
yoloCfg = os.path.join(pwd,"models","text.cfg")
yoloWeights = os.path.join(pwd,"models","text.weights")
yoloData = os.path.join(pwd,"models","text.data")
############## darknet yolo ##############
########################text detection########################
########################text detection################################################
## GPU selection and GPU index
GPU = True ## whether OCR uses the GPU
GPUID = 0 ## index of the GPU to use
## nms selection; supports cython, gpu, python
nmsFlag = 'gpu' ## cython/gpu/python; fallback order: GPU first, then cython, then python
if not GPU:
    nmsFlag = 'cython'
## VGG text direction detection model
DETECTANGLE = True ## whether to run text direction detection
AngleModelPb = os.path.join(pwd,"models","Angle-model.pb")
AngleModelPb = os.path.join(pwd,"models","Angle-model.pb")
AngleModelPbtxt = os.path.join(pwd,"models","Angle-model.pbtxt")
AngleModelFlag = 'opencv' ## opencv or tf
######################OCR models######################
######################OCR models###################################################
ocr_redis = False ## whether to run OCR as multiple concurrent tasks for speed; if so, configure a redis database (see apphelper/redisbase.py for the account settings)
## whether to enable the LSTM crnn model
## whether the OCR model uses LSTM layers
LSTMFLAG = True
ocrFlag = 'torch' ## OCR model; keras, torch and opencv versions are supported
## model choice: True = Chinese-English model, False = English model
ocrFlag = 'torch' ## OCR model; keras and torch versions are supported
chinsesModel = True
ocrModelKeras = os.path.join(pwd,"models","ocr-dense-keras.h5") ## keras version of OCR; only dense is supported for now
if chinsesModel:
    if LSTMFLAG:
        ocrModel = os.path.join(pwd,"models","ocr-lstm.pth")
    else:
        ocrModel = os.path.join(pwd,"models","ocr-dense.pth")
else:
    ## English-only model
    LSTMFLAG = True
    ocrModel = os.path.join(pwd,"models","ocr-english.pth")
######################OCR models######################
chineseModel = True ## Chinese model or English-only model
## for converting keras models, see the tools directory
ocrModelKerasDense = os.path.join(pwd,"models","ocr-dense.h5")
ocrModelKerasLstm = os.path.join(pwd,"models","ocr-lstm.h5")
ocrModelKerasEng = os.path.join(pwd,"models","ocr-english.h5")
ocrModelTorchLstm = os.path.join(pwd,"models","ocr-lstm.pth")
ocrModelTorchDense = os.path.join(pwd,"models","ocr-dense.pth")
ocrModelTorchEng = os.path.join(pwd,"models","ocr-english.pth")
ocrModelOpencv = os.path.join(pwd,"models","ocr.pb")
######################OCR models###################################################
TIMEOUT = 30 ## timeout
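The per-engine model paths above are picked in app.py by nested if/else on `ocrFlag`, `chineseModel` and `LSTMFLAG`. The same choice can be written as a small lookup, sketched here with a hypothetical `ocr_model_file` helper that returns only the file name. It mirrors the file names in this config and is not code from the repository.

```python
def ocr_model_file(ocr_flag, chinese_model=True, lstm_flag=True):
    """Return the model file name that app.py would select for these flags."""
    if ocr_flag == "opencv":
        return "ocr.pb"
    extensions = {"keras": "h5", "torch": "pth"}
    if ocr_flag not in extensions:
        raise ValueError("unknown ocr engine {!r}".format(ocr_flag))
    if not chinese_model:
        kind = "english"  # app.py forces LSTMFLAG to True for the English model
    elif lstm_flag:
        kind = "lstm"
    else:
        kind = "dense"
    return "ocr-{}.{}".format(kind, extensions[ocr_flag])
```

With the defaults in this config (`ocrFlag = 'torch'`, `chineseModel = True`, `LSTMFLAG = True`), the helper returns `ocr-lstm.pth`, which matches `ocrModelTorchLstm`.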