未验证 提交 3df64d50 编写于 作者: Z zhoujun 提交者: GitHub

Merge pull request #5576 from WenmuZhou/android

[android demo] Fix error when run same image
# 如何快速测试 - [Android Demo](#android-demo)
### 1. 安装最新版本的Android Studio - [1. 简介](#1-简介)
- [2. 近期更新](#2-近期更新)
- [3. 快速使用](#3-快速使用)
- [3.1 安装最新版本的Android Studio](#31-安装最新版本的android-studio)
- [3.2 安装 NDK 20 以上版本](#32-安装-ndk-20-以上版本)
- [3.3 导入项目](#33-导入项目)
- [4 更多支持](#4-更多支持)
# Android Demo
## 1. 简介
此为PaddleOCR的Android Demo,目前支持文本检测,文本方向分类器和文本识别模型的使用。使用 [PaddleLite v2.10](https://github.com/PaddlePaddle/Paddle-Lite/tree/release/v2.10) 进行开发。
## 2. 近期更新
* 2022.02.27
* 预测库更新到PaddleLite v2.10
* 支持6种运行模式:
* 检测+分类+识别
* 检测+识别
* 分类+识别
* 检测
* 识别
* 分类
## 3. 快速使用
### 3.1 安装最新版本的Android Studio
可以从 https://developer.android.com/studio 下载。本Demo使用是4.0版本Android Studio编写。 可以从 https://developer.android.com/studio 下载。本Demo使用是4.0版本Android Studio编写。
### 2. 按照NDK 20 以上版本 ### 3.2 安装 NDK 20 以上版本
Demo测试的时候使用的是NDK 20b版本,20版本以上均可以支持编译成功。 Demo测试的时候使用的是NDK 20b版本,20版本以上均可以支持编译成功。
如果您是初学者,可以用以下方式安装和测试NDK编译环境。 如果您是初学者,可以用以下方式安装和测试NDK编译环境。
点击 File -> New ->New Project, 新建 "Native C++" project 点击 File -> New ->New Project, 新建 "Native C++" project
### 3. 导入项目 ### 3.3 导入项目
点击 File->New->Import Project..., 然后跟着Android Studio的引导导入 点击 File->New->Import Project..., 然后跟着Android Studio的引导导入
## 4 更多支持
# 获得更多支持 前往[Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite),获得更多开发支持
前往[端计算模型生成平台EasyEdge](https://ai.baidu.com/easyedge/app/open_source_demo?referrerUrl=paddlelite),获得更多开发支持:
- Demo APP:可使用手机扫码安装,方便手机端快速体验文字识别
- SDK:模型被封装为适配不同芯片硬件和操作系统SDK,包括完善的接口,方便进行二次开发
...@@ -8,8 +8,8 @@ android { ...@@ -8,8 +8,8 @@ android {
applicationId "com.baidu.paddle.lite.demo.ocr" applicationId "com.baidu.paddle.lite.demo.ocr"
minSdkVersion 23 minSdkVersion 23
targetSdkVersion 29 targetSdkVersion 29
versionCode 1 versionCode 2
versionName "1.0" versionName "2.0"
testInstrumentationRunner "android.support.test.runner.AndroidJUnitRunner" testInstrumentationRunner "android.support.test.runner.AndroidJUnitRunner"
externalNativeBuild { externalNativeBuild {
cmake { cmake {
...@@ -17,11 +17,6 @@ android { ...@@ -17,11 +17,6 @@ android {
arguments '-DANDROID_PLATFORM=android-23', '-DANDROID_STL=c++_shared' ,"-DANDROID_ARM_NEON=TRUE" arguments '-DANDROID_PLATFORM=android-23', '-DANDROID_STL=c++_shared' ,"-DANDROID_ARM_NEON=TRUE"
} }
} }
ndk {
// abiFilters "arm64-v8a", "armeabi-v7a"
abiFilters "arm64-v8a", "armeabi-v7a"
ldLibs "jnigraphics"
}
} }
buildTypes { buildTypes {
release { release {
...@@ -48,7 +43,7 @@ dependencies { ...@@ -48,7 +43,7 @@ dependencies {
def archives = [ def archives = [
[ [
'src' : 'https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/paddle_lite_libs_v2_9_0.tar.gz', 'src' : 'https://paddleocr.bj.bcebos.com/libs/paddle_lite_libs_v2_10.tar.gz',
'dest': 'PaddleLite' 'dest': 'PaddleLite'
], ],
[ [
...@@ -56,7 +51,7 @@ def archives = [ ...@@ -56,7 +51,7 @@ def archives = [
'dest': 'OpenCV' 'dest': 'OpenCV'
], ],
[ [
'src' : 'https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ocr_v2_for_cpu.tar.gz', 'src' : 'https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2.tar.gz',
'dest' : 'src/main/assets/models' 'dest' : 'src/main/assets/models'
], ],
[ [
......
...@@ -14,7 +14,6 @@ ...@@ -14,7 +14,6 @@
android:roundIcon="@mipmap/ic_launcher_round" android:roundIcon="@mipmap/ic_launcher_round"
android:supportsRtl="true" android:supportsRtl="true"
android:theme="@style/AppTheme"> android:theme="@style/AppTheme">
<!-- to test MiniActivity, change this to com.baidu.paddle.lite.demo.ocr.MiniActivity -->
<activity android:name="com.baidu.paddle.lite.demo.ocr.MainActivity"> <activity android:name="com.baidu.paddle.lite.demo.ocr.MainActivity">
<intent-filter> <intent-filter>
<action android:name="android.intent.action.MAIN"/> <action android:name="android.intent.action.MAIN"/>
......
...@@ -13,7 +13,7 @@ static paddle::lite_api::PowerMode str_to_cpu_mode(const std::string &cpu_mode); ...@@ -13,7 +13,7 @@ static paddle::lite_api::PowerMode str_to_cpu_mode(const std::string &cpu_mode);
extern "C" JNIEXPORT jlong JNICALL extern "C" JNIEXPORT jlong JNICALL
Java_com_baidu_paddle_lite_demo_ocr_OCRPredictorNative_init( Java_com_baidu_paddle_lite_demo_ocr_OCRPredictorNative_init(
JNIEnv *env, jobject thiz, jstring j_det_model_path, JNIEnv *env, jobject thiz, jstring j_det_model_path,
jstring j_rec_model_path, jstring j_cls_model_path, jint j_thread_num, jstring j_rec_model_path, jstring j_cls_model_path, jint j_use_opencl, jint j_thread_num,
jstring j_cpu_mode) { jstring j_cpu_mode) {
std::string det_model_path = jstring_to_cpp_string(env, j_det_model_path); std::string det_model_path = jstring_to_cpp_string(env, j_det_model_path);
std::string rec_model_path = jstring_to_cpp_string(env, j_rec_model_path); std::string rec_model_path = jstring_to_cpp_string(env, j_rec_model_path);
...@@ -21,6 +21,7 @@ Java_com_baidu_paddle_lite_demo_ocr_OCRPredictorNative_init( ...@@ -21,6 +21,7 @@ Java_com_baidu_paddle_lite_demo_ocr_OCRPredictorNative_init(
int thread_num = j_thread_num; int thread_num = j_thread_num;
std::string cpu_mode = jstring_to_cpp_string(env, j_cpu_mode); std::string cpu_mode = jstring_to_cpp_string(env, j_cpu_mode);
ppredictor::OCR_Config conf; ppredictor::OCR_Config conf;
conf.use_opencl = j_use_opencl;
conf.thread_num = thread_num; conf.thread_num = thread_num;
conf.mode = str_to_cpu_mode(cpu_mode); conf.mode = str_to_cpu_mode(cpu_mode);
ppredictor::OCR_PPredictor *orc_predictor = ppredictor::OCR_PPredictor *orc_predictor =
...@@ -57,32 +58,31 @@ str_to_cpu_mode(const std::string &cpu_mode) { ...@@ -57,32 +58,31 @@ str_to_cpu_mode(const std::string &cpu_mode) {
extern "C" JNIEXPORT jfloatArray JNICALL extern "C" JNIEXPORT jfloatArray JNICALL
Java_com_baidu_paddle_lite_demo_ocr_OCRPredictorNative_forward( Java_com_baidu_paddle_lite_demo_ocr_OCRPredictorNative_forward(
JNIEnv *env, jobject thiz, jlong java_pointer, jfloatArray buf, JNIEnv *env, jobject thiz, jlong java_pointer, jobject original_image,jint j_max_size_len, jint j_run_det, jint j_run_cls, jint j_run_rec) {
jfloatArray ddims, jobject original_image) {
LOGI("begin to run native forward"); LOGI("begin to run native forward");
if (java_pointer == 0) { if (java_pointer == 0) {
LOGE("JAVA pointer is NULL"); LOGE("JAVA pointer is NULL");
return cpp_array_to_jfloatarray(env, nullptr, 0); return cpp_array_to_jfloatarray(env, nullptr, 0);
} }
cv::Mat origin = bitmap_to_cv_mat(env, original_image); cv::Mat origin = bitmap_to_cv_mat(env, original_image);
if (origin.size == 0) { if (origin.size == 0) {
LOGE("origin bitmap cannot convert to CV Mat"); LOGE("origin bitmap cannot convert to CV Mat");
return cpp_array_to_jfloatarray(env, nullptr, 0); return cpp_array_to_jfloatarray(env, nullptr, 0);
} }
int max_size_len = j_max_size_len;
int run_det = j_run_det;
int run_cls = j_run_cls;
int run_rec = j_run_rec;
ppredictor::OCR_PPredictor *ppredictor = ppredictor::OCR_PPredictor *ppredictor =
(ppredictor::OCR_PPredictor *)java_pointer; (ppredictor::OCR_PPredictor *)java_pointer;
std::vector<float> dims_float_arr = jfloatarray_to_float_vector(env, ddims);
std::vector<int64_t> dims_arr; std::vector<int64_t> dims_arr;
dims_arr.resize(dims_float_arr.size());
std::copy(dims_float_arr.cbegin(), dims_float_arr.cend(), dims_arr.begin());
// 这里值有点大,就不调用jfloatarray_to_float_vector了
int64_t buf_len = (int64_t)env->GetArrayLength(buf);
jfloat *buf_data = env->GetFloatArrayElements(buf, JNI_FALSE);
float *data = (jfloat *)buf_data;
std::vector<ppredictor::OCRPredictResult> results = std::vector<ppredictor::OCRPredictResult> results =
ppredictor->infer_ocr(dims_arr, data, buf_len, NET_OCR, origin); ppredictor->infer_ocr(origin, max_size_len, run_det, run_cls, run_rec);
LOGI("infer_ocr finished with boxes %ld", results.size()); LOGI("infer_ocr finished with boxes %ld", results.size());
// 这里将std::vector<ppredictor::OCRPredictResult> 序列化成 // 这里将std::vector<ppredictor::OCRPredictResult> 序列化成
// float数组,传输到java层再反序列化 // float数组,传输到java层再反序列化
std::vector<float> float_arr; std::vector<float> float_arr;
...@@ -90,13 +90,18 @@ Java_com_baidu_paddle_lite_demo_ocr_OCRPredictorNative_forward( ...@@ -90,13 +90,18 @@ Java_com_baidu_paddle_lite_demo_ocr_OCRPredictorNative_forward(
float_arr.push_back(r.points.size()); float_arr.push_back(r.points.size());
float_arr.push_back(r.word_index.size()); float_arr.push_back(r.word_index.size());
float_arr.push_back(r.score); float_arr.push_back(r.score);
// add det point
for (const std::vector<int> &point : r.points) { for (const std::vector<int> &point : r.points) {
float_arr.push_back(point.at(0)); float_arr.push_back(point.at(0));
float_arr.push_back(point.at(1)); float_arr.push_back(point.at(1));
} }
// add rec word idx
for (int index : r.word_index) { for (int index : r.word_index) {
float_arr.push_back(index); float_arr.push_back(index);
} }
// add cls result
float_arr.push_back(r.cls_label);
float_arr.push_back(r.cls_score);
} }
return cpp_array_to_jfloatarray(env, float_arr.data(), float_arr.size()); return cpp_array_to_jfloatarray(env, float_arr.data(), float_arr.size());
} }
......
...@@ -17,15 +17,15 @@ int OCR_PPredictor::init(const std::string &det_model_content, ...@@ -17,15 +17,15 @@ int OCR_PPredictor::init(const std::string &det_model_content,
const std::string &rec_model_content, const std::string &rec_model_content,
const std::string &cls_model_content) { const std::string &cls_model_content) {
_det_predictor = std::unique_ptr<PPredictor>( _det_predictor = std::unique_ptr<PPredictor>(
new PPredictor{_config.thread_num, NET_OCR, _config.mode}); new PPredictor{_config.use_opencl,_config.thread_num, NET_OCR, _config.mode});
_det_predictor->init_nb(det_model_content); _det_predictor->init_nb(det_model_content);
_rec_predictor = std::unique_ptr<PPredictor>( _rec_predictor = std::unique_ptr<PPredictor>(
new PPredictor{_config.thread_num, NET_OCR_INTERNAL, _config.mode}); new PPredictor{_config.use_opencl,_config.thread_num, NET_OCR_INTERNAL, _config.mode});
_rec_predictor->init_nb(rec_model_content); _rec_predictor->init_nb(rec_model_content);
_cls_predictor = std::unique_ptr<PPredictor>( _cls_predictor = std::unique_ptr<PPredictor>(
new PPredictor{_config.thread_num, NET_OCR_INTERNAL, _config.mode}); new PPredictor{_config.use_opencl,_config.thread_num, NET_OCR_INTERNAL, _config.mode});
_cls_predictor->init_nb(cls_model_content); _cls_predictor->init_nb(cls_model_content);
return RETURN_OK; return RETURN_OK;
} }
...@@ -34,15 +34,16 @@ int OCR_PPredictor::init_from_file(const std::string &det_model_path, ...@@ -34,15 +34,16 @@ int OCR_PPredictor::init_from_file(const std::string &det_model_path,
const std::string &rec_model_path, const std::string &rec_model_path,
const std::string &cls_model_path) { const std::string &cls_model_path) {
_det_predictor = std::unique_ptr<PPredictor>( _det_predictor = std::unique_ptr<PPredictor>(
new PPredictor{_config.thread_num, NET_OCR, _config.mode}); new PPredictor{_config.use_opencl, _config.thread_num, NET_OCR, _config.mode});
_det_predictor->init_from_file(det_model_path); _det_predictor->init_from_file(det_model_path);
_rec_predictor = std::unique_ptr<PPredictor>( _rec_predictor = std::unique_ptr<PPredictor>(
new PPredictor{_config.thread_num, NET_OCR_INTERNAL, _config.mode}); new PPredictor{_config.use_opencl,_config.thread_num, NET_OCR_INTERNAL, _config.mode});
_rec_predictor->init_from_file(rec_model_path); _rec_predictor->init_from_file(rec_model_path);
_cls_predictor = std::unique_ptr<PPredictor>( _cls_predictor = std::unique_ptr<PPredictor>(
new PPredictor{_config.thread_num, NET_OCR_INTERNAL, _config.mode}); new PPredictor{_config.use_opencl,_config.thread_num, NET_OCR_INTERNAL, _config.mode});
_cls_predictor->init_from_file(cls_model_path); _cls_predictor->init_from_file(cls_model_path);
return RETURN_OK; return RETURN_OK;
} }
...@@ -77,33 +78,126 @@ visual_img(const std::vector<std::vector<std::vector<int>>> &filter_boxes, ...@@ -77,33 +78,126 @@ visual_img(const std::vector<std::vector<std::vector<int>>> &filter_boxes,
} }
std::vector<OCRPredictResult> std::vector<OCRPredictResult>
OCR_PPredictor::infer_ocr(const std::vector<int64_t> &dims, OCR_PPredictor::infer_ocr(cv::Mat &origin,int max_size_len, int run_det, int run_cls, int run_rec) {
const float *input_data, int input_len, int net_flag, LOGI("ocr cpp start *****************");
cv::Mat &origin) { LOGI("ocr cpp det: %d, cls: %d, rec: %d", run_det, run_cls, run_rec);
std::vector<OCRPredictResult> ocr_results;
if(run_det){
infer_det(origin, max_size_len, ocr_results);
}
if(run_rec){
if(ocr_results.size()==0){
OCRPredictResult res;
ocr_results.emplace_back(std::move(res));
}
for(int i = 0; i < ocr_results.size();i++) {
infer_rec(origin, run_cls, ocr_results[i]);
}
}else if(run_cls){
ClsPredictResult cls_res = infer_cls(origin);
OCRPredictResult res;
res.cls_score = cls_res.cls_score;
res.cls_label = cls_res.cls_label;
ocr_results.push_back(res);
}
LOGI("ocr cpp end *****************");
return ocr_results;
}
cv::Mat DetResizeImg(const cv::Mat img, int max_size_len,
std::vector<float> &ratio_hw) {
int w = img.cols;
int h = img.rows;
float ratio = 1.f;
int max_wh = w >= h ? w : h;
if (max_wh > max_size_len) {
if (h > w) {
ratio = static_cast<float>(max_size_len) / static_cast<float>(h);
} else {
ratio = static_cast<float>(max_size_len) / static_cast<float>(w);
}
}
int resize_h = static_cast<int>(float(h) * ratio);
int resize_w = static_cast<int>(float(w) * ratio);
if (resize_h % 32 == 0)
resize_h = resize_h;
else if (resize_h / 32 < 1 + 1e-5)
resize_h = 32;
else
resize_h = (resize_h / 32 - 1) * 32;
if (resize_w % 32 == 0)
resize_w = resize_w;
else if (resize_w / 32 < 1 + 1e-5)
resize_w = 32;
else
resize_w = (resize_w / 32 - 1) * 32;
cv::Mat resize_img;
cv::resize(img, resize_img, cv::Size(resize_w, resize_h));
ratio_hw.push_back(static_cast<float>(resize_h) / static_cast<float>(h));
ratio_hw.push_back(static_cast<float>(resize_w) / static_cast<float>(w));
return resize_img;
}
void OCR_PPredictor::infer_det(cv::Mat &origin, int max_size_len, std::vector<OCRPredictResult> &ocr_results) {
std::vector<float> mean = {0.485f, 0.456f, 0.406f};
std::vector<float> scale = {1 / 0.229f, 1 / 0.224f, 1 / 0.225f};
PredictorInput input = _det_predictor->get_first_input(); PredictorInput input = _det_predictor->get_first_input();
input.set_dims(dims);
input.set_data(input_data, input_len); std::vector<float> ratio_hw;
cv::Mat input_image = DetResizeImg(origin, max_size_len, ratio_hw);
input_image.convertTo(input_image, CV_32FC3, 1 / 255.0f);
const float *dimg = reinterpret_cast<const float *>(input_image.data);
int input_size = input_image.rows * input_image.cols;
input.set_dims({1, 3, input_image.rows, input_image.cols});
neon_mean_scale(dimg, input.get_mutable_float_data(), input_size, mean,
scale);
LOGI("ocr cpp det shape %d,%d", input_image.rows,input_image.cols);
std::vector<PredictorOutput> results = _det_predictor->infer(); std::vector<PredictorOutput> results = _det_predictor->infer();
PredictorOutput &res = results.at(0); PredictorOutput &res = results.at(0);
std::vector<std::vector<std::vector<int>>> filtered_box = calc_filtered_boxes( std::vector<std::vector<std::vector<int>>> filtered_box = calc_filtered_boxes(
res.get_float_data(), res.get_size(), (int)dims[2], (int)dims[3], origin); res.get_float_data(), res.get_size(), input_image.rows, input_image.cols, origin);
LOGI("Filter_box size %ld", filtered_box.size()); LOGI("ocr cpp det Filter_box size %ld", filtered_box.size());
return infer_rec(filtered_box, origin);
for(int i = 0;i<filtered_box.size();i++){
LOGI("ocr cpp box %d,%d,%d,%d,%d,%d,%d,%d", filtered_box[i][0][0],filtered_box[i][0][1], filtered_box[i][1][0],filtered_box[i][1][1], filtered_box[i][2][0],filtered_box[i][2][1], filtered_box[i][3][0],filtered_box[i][3][1]);
OCRPredictResult res;
res.points = filtered_box[i];
ocr_results.push_back(res);
}
} }
std::vector<OCRPredictResult> OCR_PPredictor::infer_rec( void OCR_PPredictor::infer_rec(const cv::Mat &origin_img, int run_cls, OCRPredictResult& ocr_result) {
const std::vector<std::vector<std::vector<int>>> &boxes,
const cv::Mat &origin_img) {
std::vector<float> mean = {0.5f, 0.5f, 0.5f}; std::vector<float> mean = {0.5f, 0.5f, 0.5f};
std::vector<float> scale = {1 / 0.5f, 1 / 0.5f, 1 / 0.5f}; std::vector<float> scale = {1 / 0.5f, 1 / 0.5f, 1 / 0.5f};
std::vector<int64_t> dims = {1, 3, 0, 0}; std::vector<int64_t> dims = {1, 3, 0, 0};
std::vector<OCRPredictResult> ocr_results;
PredictorInput input = _rec_predictor->get_first_input(); PredictorInput input = _rec_predictor->get_first_input();
for (auto bp = boxes.crbegin(); bp != boxes.crend(); ++bp) {
const std::vector<std::vector<int>> &box = *bp; const std::vector<std::vector<int>> &box = ocr_result.points;
cv::Mat crop_img = get_rotate_crop_image(origin_img, box); cv::Mat crop_img;
crop_img = infer_cls(crop_img); if(box.size()>0){
crop_img = get_rotate_crop_image(origin_img, box);
}
else{
crop_img = origin_img;
}
if(run_cls){
ClsPredictResult cls_res = infer_cls(crop_img);
crop_img = cls_res.img;
ocr_result.cls_score = cls_res.cls_score;
ocr_result.cls_label = cls_res.cls_label;
}
float wh_ratio = float(crop_img.cols) / float(crop_img.rows); float wh_ratio = float(crop_img.cols) / float(crop_img.rows);
cv::Mat input_image = crnn_resize_img(crop_img, wh_ratio); cv::Mat input_image = crnn_resize_img(crop_img, wh_ratio);
...@@ -122,8 +216,6 @@ std::vector<OCRPredictResult> OCR_PPredictor::infer_rec( ...@@ -122,8 +216,6 @@ std::vector<OCRPredictResult> OCR_PPredictor::infer_rec(
const float *predict_batch = results.at(0).get_float_data(); const float *predict_batch = results.at(0).get_float_data();
const std::vector<int64_t> predict_shape = results.at(0).get_shape(); const std::vector<int64_t> predict_shape = results.at(0).get_shape();
OCRPredictResult res;
// ctc decode // ctc decode
int argmax_idx; int argmax_idx;
int last_index = 0; int last_index = 0;
...@@ -140,27 +232,19 @@ std::vector<OCRPredictResult> OCR_PPredictor::infer_rec( ...@@ -140,27 +232,19 @@ std::vector<OCRPredictResult> OCR_PPredictor::infer_rec(
if (argmax_idx > 0 && (!(n > 0 && argmax_idx == last_index))) { if (argmax_idx > 0 && (!(n > 0 && argmax_idx == last_index))) {
score += max_value; score += max_value;
count += 1; count += 1;
res.word_index.push_back(argmax_idx); ocr_result.word_index.push_back(argmax_idx);
} }
last_index = argmax_idx; last_index = argmax_idx;
} }
score /= count; score /= count;
if (res.word_index.empty()) { ocr_result.score = score;
continue; LOGI("ocr cpp rec word size %ld", count);
}
res.score = score;
res.points = box;
ocr_results.emplace_back(std::move(res));
}
LOGI("ocr_results finished %lu", ocr_results.size());
return ocr_results;
} }
cv::Mat OCR_PPredictor::infer_cls(const cv::Mat &img, float thresh) { ClsPredictResult OCR_PPredictor::infer_cls(const cv::Mat &img, float thresh) {
std::vector<float> mean = {0.5f, 0.5f, 0.5f}; std::vector<float> mean = {0.5f, 0.5f, 0.5f};
std::vector<float> scale = {1 / 0.5f, 1 / 0.5f, 1 / 0.5f}; std::vector<float> scale = {1 / 0.5f, 1 / 0.5f, 1 / 0.5f};
std::vector<int64_t> dims = {1, 3, 0, 0}; std::vector<int64_t> dims = {1, 3, 0, 0};
std::vector<OCRPredictResult> ocr_results;
PredictorInput input = _cls_predictor->get_first_input(); PredictorInput input = _cls_predictor->get_first_input();
...@@ -182,7 +266,7 @@ cv::Mat OCR_PPredictor::infer_cls(const cv::Mat &img, float thresh) { ...@@ -182,7 +266,7 @@ cv::Mat OCR_PPredictor::infer_cls(const cv::Mat &img, float thresh) {
float score = 0; float score = 0;
int label = 0; int label = 0;
for (int64_t i = 0; i < results.at(0).get_size(); i++) { for (int64_t i = 0; i < results.at(0).get_size(); i++) {
LOGI("output scores [%f]", scores[i]); LOGI("ocr cpp cls output scores [%f]", scores[i]);
if (scores[i] > score) { if (scores[i] > score) {
score = scores[i]; score = scores[i];
label = i; label = i;
...@@ -193,7 +277,12 @@ cv::Mat OCR_PPredictor::infer_cls(const cv::Mat &img, float thresh) { ...@@ -193,7 +277,12 @@ cv::Mat OCR_PPredictor::infer_cls(const cv::Mat &img, float thresh) {
if (label % 2 == 1 && score > thresh) { if (label % 2 == 1 && score > thresh) {
cv::rotate(srcimg, srcimg, 1); cv::rotate(srcimg, srcimg, 1);
} }
return srcimg; ClsPredictResult res;
res.cls_label = label;
res.cls_score = score;
res.img = srcimg;
LOGI("ocr cpp cls word cls %ld, %f", label, score);
return res;
} }
std::vector<std::vector<std::vector<int>>> std::vector<std::vector<std::vector<int>>>
......
...@@ -15,6 +15,7 @@ namespace ppredictor { ...@@ -15,6 +15,7 @@ namespace ppredictor {
* Config * Config
*/ */
struct OCR_Config { struct OCR_Config {
int use_opencl = 0;
int thread_num = 4; // Thread num int thread_num = 4; // Thread num
paddle::lite_api::PowerMode mode = paddle::lite_api::PowerMode mode =
paddle::lite_api::LITE_POWER_HIGH; // PaddleLite Mode paddle::lite_api::LITE_POWER_HIGH; // PaddleLite Mode
...@@ -27,8 +28,15 @@ struct OCRPredictResult { ...@@ -27,8 +28,15 @@ struct OCRPredictResult {
std::vector<int> word_index; std::vector<int> word_index;
std::vector<std::vector<int>> points; std::vector<std::vector<int>> points;
float score; float score;
float cls_score;
int cls_label=-1;
}; };
struct ClsPredictResult {
float cls_score;
int cls_label=-1;
cv::Mat img;
};
/** /**
* OCR there are 2 models * OCR there are 2 models
* 1. First model(det),select polygones to show where are the texts * 1. First model(det),select polygones to show where are the texts
...@@ -62,8 +70,7 @@ public: ...@@ -62,8 +70,7 @@ public:
* @return * @return
*/ */
virtual std::vector<OCRPredictResult> virtual std::vector<OCRPredictResult>
infer_ocr(const std::vector<int64_t> &dims, const float *input_data, infer_ocr(cv::Mat &origin, int max_size_len, int run_det, int run_cls, int run_rec);
int input_len, int net_flag, cv::Mat &origin);
virtual NET_TYPE get_net_flag() const; virtual NET_TYPE get_net_flag() const;
...@@ -80,16 +87,17 @@ private: ...@@ -80,16 +87,17 @@ private:
calc_filtered_boxes(const float *pred, int pred_size, int output_height, calc_filtered_boxes(const float *pred, int pred_size, int output_height,
int output_width, const cv::Mat &origin); int output_width, const cv::Mat &origin);
void
infer_det(cv::Mat &origin, int max_side_len, std::vector<OCRPredictResult>& ocr_results);
/** /**
* infer for second model * infer for rec model
* *
* @param boxes * @param boxes
* @param origin * @param origin
* @return * @return
*/ */
std::vector<OCRPredictResult> void
infer_rec(const std::vector<std::vector<std::vector<int>>> &boxes, infer_rec(const cv::Mat &origin, int run_cls, OCRPredictResult& ocr_result);
const cv::Mat &origin);
/** /**
* infer for cls model * infer for cls model
...@@ -98,7 +106,7 @@ private: ...@@ -98,7 +106,7 @@ private:
* @param origin * @param origin
* @return * @return
*/ */
cv::Mat infer_cls(const cv::Mat &origin, float thresh = 0.9); ClsPredictResult infer_cls(const cv::Mat &origin, float thresh = 0.9);
/** /**
* Postprocess or sencod model to extract text * Postprocess or sencod model to extract text
......
...@@ -2,9 +2,9 @@ ...@@ -2,9 +2,9 @@
#include "common.h" #include "common.h"
namespace ppredictor { namespace ppredictor {
PPredictor::PPredictor(int thread_num, int net_flag, PPredictor::PPredictor(int use_opencl, int thread_num, int net_flag,
paddle::lite_api::PowerMode mode) paddle::lite_api::PowerMode mode)
: _thread_num(thread_num), _net_flag(net_flag), _mode(mode) {} : _use_opencl(use_opencl), _thread_num(thread_num), _net_flag(net_flag), _mode(mode) {}
int PPredictor::init_nb(const std::string &model_content) { int PPredictor::init_nb(const std::string &model_content) {
paddle::lite_api::MobileConfig config; paddle::lite_api::MobileConfig config;
...@@ -19,10 +19,40 @@ int PPredictor::init_from_file(const std::string &model_content) { ...@@ -19,10 +19,40 @@ int PPredictor::init_from_file(const std::string &model_content) {
} }
template <typename ConfigT> int PPredictor::_init(ConfigT &config) { template <typename ConfigT> int PPredictor::_init(ConfigT &config) {
bool is_opencl_backend_valid = paddle::lite_api::IsOpenCLBackendValid(/*check_fp16_valid = false*/);
if (is_opencl_backend_valid) {
if (_use_opencl != 0) {
// Make sure you have write permission of the binary path.
// We strongly recommend each model has a unique binary name.
const std::string bin_path = "/data/local/tmp/";
const std::string bin_name = "lite_opencl_kernel.bin";
config.set_opencl_binary_path_name(bin_path, bin_name);
// opencl tune option
// CL_TUNE_NONE: 0
// CL_TUNE_RAPID: 1
// CL_TUNE_NORMAL: 2
// CL_TUNE_EXHAUSTIVE: 3
const std::string tuned_path = "/data/local/tmp/";
const std::string tuned_name = "lite_opencl_tuned.bin";
config.set_opencl_tune(paddle::lite_api::CL_TUNE_NORMAL, tuned_path, tuned_name);
// opencl precision option
// CL_PRECISION_AUTO: 0, first fp16 if valid, default
// CL_PRECISION_FP32: 1, force fp32
// CL_PRECISION_FP16: 2, force fp16
config.set_opencl_precision(paddle::lite_api::CL_PRECISION_FP32);
LOGI("ocr cpp device: running on gpu.");
}
} else {
LOGI("ocr cpp device: running on cpu.");
// you can give backup cpu nb model instead
// config.set_model_from_file(cpu_nb_model_dir);
}
config.set_threads(_thread_num); config.set_threads(_thread_num);
config.set_power_mode(_mode); config.set_power_mode(_mode);
_predictor = paddle::lite_api::CreatePaddlePredictor(config); _predictor = paddle::lite_api::CreatePaddlePredictor(config);
LOGI("paddle instance created"); LOGI("ocr cpp paddle instance created");
return RETURN_OK; return RETURN_OK;
} }
...@@ -43,18 +73,18 @@ std::vector<PredictorInput> PPredictor::get_inputs(int num) { ...@@ -43,18 +73,18 @@ std::vector<PredictorInput> PPredictor::get_inputs(int num) {
PredictorInput PPredictor::get_first_input() { return get_input(0); } PredictorInput PPredictor::get_first_input() { return get_input(0); }
std::vector<PredictorOutput> PPredictor::infer() { std::vector<PredictorOutput> PPredictor::infer() {
LOGI("infer Run start %d", _net_flag); LOGI("ocr cpp infer Run start %d", _net_flag);
std::vector<PredictorOutput> results; std::vector<PredictorOutput> results;
if (!_is_input_get) { if (!_is_input_get) {
return results; return results;
} }
_predictor->Run(); _predictor->Run();
LOGI("infer Run end"); LOGI("ocr cpp infer Run end");
for (int i = 0; i < _predictor->GetOutputNames().size(); i++) { for (int i = 0; i < _predictor->GetOutputNames().size(); i++) {
std::unique_ptr<const paddle::lite_api::Tensor> output_tensor = std::unique_ptr<const paddle::lite_api::Tensor> output_tensor =
_predictor->GetOutput(i); _predictor->GetOutput(i);
LOGI("output tensor[%d] size %ld", i, product(output_tensor->shape())); LOGI("ocr cpp output tensor[%d] size %ld", i, product(output_tensor->shape()));
PredictorOutput result{std::move(output_tensor), i, _net_flag}; PredictorOutput result{std::move(output_tensor), i, _net_flag};
results.emplace_back(std::move(result)); results.emplace_back(std::move(result));
} }
......
...@@ -22,7 +22,7 @@ public: ...@@ -22,7 +22,7 @@ public:
class PPredictor : public PPredictor_Interface { class PPredictor : public PPredictor_Interface {
public: public:
PPredictor( PPredictor(
int thread_num, int net_flag = 0, int use_opencl, int thread_num, int net_flag = 0,
paddle::lite_api::PowerMode mode = paddle::lite_api::LITE_POWER_HIGH); paddle::lite_api::PowerMode mode = paddle::lite_api::LITE_POWER_HIGH);
virtual ~PPredictor() {} virtual ~PPredictor() {}
...@@ -54,6 +54,7 @@ protected: ...@@ -54,6 +54,7 @@ protected:
template <typename ConfigT> int _init(ConfigT &config); template <typename ConfigT> int _init(ConfigT &config);
private: private:
int _use_opencl;
int _thread_num; int _thread_num;
paddle::lite_api::PowerMode _mode; paddle::lite_api::PowerMode _mode;
std::shared_ptr<paddle::lite_api::PaddlePredictor> _predictor; std::shared_ptr<paddle::lite_api::PaddlePredictor> _predictor;
......
...@@ -13,6 +13,7 @@ import android.graphics.BitmapFactory; ...@@ -13,6 +13,7 @@ import android.graphics.BitmapFactory;
import android.graphics.drawable.BitmapDrawable; import android.graphics.drawable.BitmapDrawable;
import android.media.ExifInterface; import android.media.ExifInterface;
import android.content.res.AssetManager; import android.content.res.AssetManager;
import android.media.FaceDetector;
import android.net.Uri; import android.net.Uri;
import android.os.Bundle; import android.os.Bundle;
import android.os.Environment; import android.os.Environment;
...@@ -27,7 +28,9 @@ import android.view.Menu; ...@@ -27,7 +28,9 @@ import android.view.Menu;
import android.view.MenuInflater; import android.view.MenuInflater;
import android.view.MenuItem; import android.view.MenuItem;
import android.view.View; import android.view.View;
import android.widget.CheckBox;
import android.widget.ImageView; import android.widget.ImageView;
import android.widget.Spinner;
import android.widget.TextView; import android.widget.TextView;
import android.widget.Toast; import android.widget.Toast;
...@@ -68,23 +71,24 @@ public class MainActivity extends AppCompatActivity { ...@@ -68,23 +71,24 @@ public class MainActivity extends AppCompatActivity {
protected ImageView ivInputImage; protected ImageView ivInputImage;
protected TextView tvOutputResult; protected TextView tvOutputResult;
protected TextView tvInferenceTime; protected TextView tvInferenceTime;
protected CheckBox cbOpencl;
protected Spinner spRunMode;
// Model settings of object detection // Model settings of ocr
protected String modelPath = ""; protected String modelPath = "";
protected String labelPath = ""; protected String labelPath = "";
protected String imagePath = ""; protected String imagePath = "";
protected int cpuThreadNum = 1; protected int cpuThreadNum = 1;
protected String cpuPowerMode = ""; protected String cpuPowerMode = "";
protected String inputColorFormat = ""; protected int detLongSize = 960;
protected long[] inputShape = new long[]{};
protected float[] inputMean = new float[]{};
protected float[] inputStd = new float[]{};
protected float scoreThreshold = 0.1f; protected float scoreThreshold = 0.1f;
private String currentPhotoPath; private String currentPhotoPath;
private AssetManager assetManager =null; private AssetManager assetManager = null;
protected Predictor predictor = new Predictor(); protected Predictor predictor = new Predictor();
private Bitmap cur_predict_image = null;
@Override @Override
protected void onCreate(Bundle savedInstanceState) { protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState); super.onCreate(savedInstanceState);
...@@ -98,10 +102,12 @@ public class MainActivity extends AppCompatActivity { ...@@ -98,10 +102,12 @@ public class MainActivity extends AppCompatActivity {
// Setup the UI components // Setup the UI components
tvInputSetting = findViewById(R.id.tv_input_setting); tvInputSetting = findViewById(R.id.tv_input_setting);
cbOpencl = findViewById(R.id.cb_opencl);
tvStatus = findViewById(R.id.tv_model_img_status); tvStatus = findViewById(R.id.tv_model_img_status);
ivInputImage = findViewById(R.id.iv_input_image); ivInputImage = findViewById(R.id.iv_input_image);
tvInferenceTime = findViewById(R.id.tv_inference_time); tvInferenceTime = findViewById(R.id.tv_inference_time);
tvOutputResult = findViewById(R.id.tv_output_result); tvOutputResult = findViewById(R.id.tv_output_result);
spRunMode = findViewById(R.id.sp_run_mode);
tvInputSetting.setMovementMethod(ScrollingMovementMethod.getInstance()); tvInputSetting.setMovementMethod(ScrollingMovementMethod.getInstance());
tvOutputResult.setMovementMethod(ScrollingMovementMethod.getInstance()); tvOutputResult.setMovementMethod(ScrollingMovementMethod.getInstance());
...@@ -111,26 +117,26 @@ public class MainActivity extends AppCompatActivity { ...@@ -111,26 +117,26 @@ public class MainActivity extends AppCompatActivity {
public void handleMessage(Message msg) { public void handleMessage(Message msg) {
switch (msg.what) { switch (msg.what) {
case RESPONSE_LOAD_MODEL_SUCCESSED: case RESPONSE_LOAD_MODEL_SUCCESSED:
if(pbLoadModel!=null && pbLoadModel.isShowing()){ if (pbLoadModel != null && pbLoadModel.isShowing()) {
pbLoadModel.dismiss(); pbLoadModel.dismiss();
} }
onLoadModelSuccessed(); onLoadModelSuccessed();
break; break;
case RESPONSE_LOAD_MODEL_FAILED: case RESPONSE_LOAD_MODEL_FAILED:
if(pbLoadModel!=null && pbLoadModel.isShowing()){ if (pbLoadModel != null && pbLoadModel.isShowing()) {
pbLoadModel.dismiss(); pbLoadModel.dismiss();
} }
Toast.makeText(MainActivity.this, "Load model failed!", Toast.LENGTH_SHORT).show(); Toast.makeText(MainActivity.this, "Load model failed!", Toast.LENGTH_SHORT).show();
onLoadModelFailed(); onLoadModelFailed();
break; break;
case RESPONSE_RUN_MODEL_SUCCESSED: case RESPONSE_RUN_MODEL_SUCCESSED:
if(pbRunModel!=null && pbRunModel.isShowing()){ if (pbRunModel != null && pbRunModel.isShowing()) {
pbRunModel.dismiss(); pbRunModel.dismiss();
} }
onRunModelSuccessed(); onRunModelSuccessed();
break; break;
case RESPONSE_RUN_MODEL_FAILED: case RESPONSE_RUN_MODEL_FAILED:
if(pbRunModel!=null && pbRunModel.isShowing()){ if (pbRunModel != null && pbRunModel.isShowing()) {
pbRunModel.dismiss(); pbRunModel.dismiss();
} }
Toast.makeText(MainActivity.this, "Run model failed!", Toast.LENGTH_SHORT).show(); Toast.makeText(MainActivity.this, "Run model failed!", Toast.LENGTH_SHORT).show();
...@@ -175,71 +181,47 @@ public class MainActivity extends AppCompatActivity { ...@@ -175,71 +181,47 @@ public class MainActivity extends AppCompatActivity {
super.onResume(); super.onResume();
SharedPreferences sharedPreferences = PreferenceManager.getDefaultSharedPreferences(this); SharedPreferences sharedPreferences = PreferenceManager.getDefaultSharedPreferences(this);
boolean settingsChanged = false; boolean settingsChanged = false;
boolean model_settingsChanged = false;
String model_path = sharedPreferences.getString(getString(R.string.MODEL_PATH_KEY), String model_path = sharedPreferences.getString(getString(R.string.MODEL_PATH_KEY),
getString(R.string.MODEL_PATH_DEFAULT)); getString(R.string.MODEL_PATH_DEFAULT));
String label_path = sharedPreferences.getString(getString(R.string.LABEL_PATH_KEY), String label_path = sharedPreferences.getString(getString(R.string.LABEL_PATH_KEY),
getString(R.string.LABEL_PATH_DEFAULT)); getString(R.string.LABEL_PATH_DEFAULT));
String image_path = sharedPreferences.getString(getString(R.string.IMAGE_PATH_KEY), String image_path = sharedPreferences.getString(getString(R.string.IMAGE_PATH_KEY),
getString(R.string.IMAGE_PATH_DEFAULT)); getString(R.string.IMAGE_PATH_DEFAULT));
settingsChanged |= !model_path.equalsIgnoreCase(modelPath); model_settingsChanged |= !model_path.equalsIgnoreCase(modelPath);
settingsChanged |= !label_path.equalsIgnoreCase(labelPath); settingsChanged |= !label_path.equalsIgnoreCase(labelPath);
settingsChanged |= !image_path.equalsIgnoreCase(imagePath); settingsChanged |= !image_path.equalsIgnoreCase(imagePath);
int cpu_thread_num = Integer.parseInt(sharedPreferences.getString(getString(R.string.CPU_THREAD_NUM_KEY), int cpu_thread_num = Integer.parseInt(sharedPreferences.getString(getString(R.string.CPU_THREAD_NUM_KEY),
getString(R.string.CPU_THREAD_NUM_DEFAULT))); getString(R.string.CPU_THREAD_NUM_DEFAULT)));
settingsChanged |= cpu_thread_num != cpuThreadNum; model_settingsChanged |= cpu_thread_num != cpuThreadNum;
String cpu_power_mode = String cpu_power_mode =
sharedPreferences.getString(getString(R.string.CPU_POWER_MODE_KEY), sharedPreferences.getString(getString(R.string.CPU_POWER_MODE_KEY),
getString(R.string.CPU_POWER_MODE_DEFAULT)); getString(R.string.CPU_POWER_MODE_DEFAULT));
settingsChanged |= !cpu_power_mode.equalsIgnoreCase(cpuPowerMode); model_settingsChanged |= !cpu_power_mode.equalsIgnoreCase(cpuPowerMode);
String input_color_format =
sharedPreferences.getString(getString(R.string.INPUT_COLOR_FORMAT_KEY), int det_long_size = Integer.parseInt(sharedPreferences.getString(getString(R.string.DET_LONG_SIZE_KEY),
getString(R.string.INPUT_COLOR_FORMAT_DEFAULT)); getString(R.string.DET_LONG_SIZE_DEFAULT)));
settingsChanged |= !input_color_format.equalsIgnoreCase(inputColorFormat); settingsChanged |= det_long_size != detLongSize;
long[] input_shape =
Utils.parseLongsFromString(sharedPreferences.getString(getString(R.string.INPUT_SHAPE_KEY),
getString(R.string.INPUT_SHAPE_DEFAULT)), ",");
float[] input_mean =
Utils.parseFloatsFromString(sharedPreferences.getString(getString(R.string.INPUT_MEAN_KEY),
getString(R.string.INPUT_MEAN_DEFAULT)), ",");
float[] input_std =
Utils.parseFloatsFromString(sharedPreferences.getString(getString(R.string.INPUT_STD_KEY)
, getString(R.string.INPUT_STD_DEFAULT)), ",");
settingsChanged |= input_shape.length != inputShape.length;
settingsChanged |= input_mean.length != inputMean.length;
settingsChanged |= input_std.length != inputStd.length;
if (!settingsChanged) {
for (int i = 0; i < input_shape.length; i++) {
settingsChanged |= input_shape[i] != inputShape[i];
}
for (int i = 0; i < input_mean.length; i++) {
settingsChanged |= input_mean[i] != inputMean[i];
}
for (int i = 0; i < input_std.length; i++) {
settingsChanged |= input_std[i] != inputStd[i];
}
}
float score_threshold = float score_threshold =
Float.parseFloat(sharedPreferences.getString(getString(R.string.SCORE_THRESHOLD_KEY), Float.parseFloat(sharedPreferences.getString(getString(R.string.SCORE_THRESHOLD_KEY),
getString(R.string.SCORE_THRESHOLD_DEFAULT))); getString(R.string.SCORE_THRESHOLD_DEFAULT)));
settingsChanged |= scoreThreshold != score_threshold; settingsChanged |= scoreThreshold != score_threshold;
if (settingsChanged) { if (settingsChanged) {
modelPath = model_path;
labelPath = label_path; labelPath = label_path;
imagePath = image_path; imagePath = image_path;
detLongSize = det_long_size;
scoreThreshold = score_threshold;
set_img();
}
if (model_settingsChanged) {
modelPath = model_path;
cpuThreadNum = cpu_thread_num; cpuThreadNum = cpu_thread_num;
cpuPowerMode = cpu_power_mode; cpuPowerMode = cpu_power_mode;
inputColorFormat = input_color_format;
inputShape = input_shape;
inputMean = input_mean;
inputStd = input_std;
scoreThreshold = score_threshold;
// Update UI // Update UI
tvInputSetting.setText("Model: " + modelPath.substring(modelPath.lastIndexOf("/") + 1) + "\n" + "CPU" + tvInputSetting.setText("Model: " + modelPath.substring(modelPath.lastIndexOf("/") + 1) + "\nOPENCL: " + cbOpencl.isChecked() + "\nCPU Thread Num: " + cpuThreadNum + "\nCPU Power Mode: " + cpuPowerMode);
" Thread Num: " + Integer.toString(cpuThreadNum) + "\n" + "CPU Power Mode: " + cpuPowerMode);
tvInputSetting.scrollTo(0, 0); tvInputSetting.scrollTo(0, 0);
// Reload model if configure has been changed // Reload model if configure has been changed
// loadModel(); loadModel();
set_img();
} }
} }
...@@ -254,20 +236,28 @@ public class MainActivity extends AppCompatActivity { ...@@ -254,20 +236,28 @@ public class MainActivity extends AppCompatActivity {
} }
public boolean onLoadModel() { public boolean onLoadModel() {
return predictor.init(MainActivity.this, modelPath, labelPath, cpuThreadNum, if (predictor.isLoaded()) {
predictor.releaseModel();
}
return predictor.init(MainActivity.this, modelPath, labelPath, cbOpencl.isChecked() ? 1 : 0, cpuThreadNum,
cpuPowerMode, cpuPowerMode,
inputColorFormat, detLongSize, scoreThreshold);
inputShape, inputMean,
inputStd, scoreThreshold);
} }
public boolean onRunModel() { public boolean onRunModel() {
return predictor.isLoaded() && predictor.runModel(); String run_mode = spRunMode.getSelectedItem().toString();
int run_det = run_mode.contains("检测") ? 1 : 0;
int run_cls = run_mode.contains("分类") ? 1 : 0;
int run_rec = run_mode.contains("识别") ? 1 : 0;
return predictor.isLoaded() && predictor.runModel(run_det, run_cls, run_rec);
} }
public void onLoadModelSuccessed() { public void onLoadModelSuccessed() {
// Load test image from path and run model // Load test image from path and run model
tvInputSetting.setText("Model: " + modelPath.substring(modelPath.lastIndexOf("/") + 1) + "\nOPENCL: " + cbOpencl.isChecked() + "\nCPU Thread Num: " + cpuThreadNum + "\nCPU Power Mode: " + cpuPowerMode);
tvInputSetting.scrollTo(0, 0);
tvStatus.setText("STATUS: load model successed"); tvStatus.setText("STATUS: load model successed");
} }
public void onLoadModelFailed() { public void onLoadModelFailed() {
...@@ -290,20 +280,13 @@ public class MainActivity extends AppCompatActivity { ...@@ -290,20 +280,13 @@ public class MainActivity extends AppCompatActivity {
tvStatus.setText("STATUS: run model failed"); tvStatus.setText("STATUS: run model failed");
} }
public void onImageChanged(Bitmap image) {
// Rerun model if users pick test image from gallery or camera
if (image != null && predictor.isLoaded()) {
predictor.setInputImage(image);
runModel();
}
}
public void set_img() { public void set_img() {
// Load test image from path and run model // Load test image from path and run model
try { try {
assetManager= getAssets(); assetManager = getAssets();
InputStream in=assetManager.open(imagePath); InputStream in = assetManager.open(imagePath);
Bitmap bmp=BitmapFactory.decodeStream(in); Bitmap bmp = BitmapFactory.decodeStream(in);
cur_predict_image = bmp;
ivInputImage.setImageBitmap(bmp); ivInputImage.setImageBitmap(bmp);
} catch (IOException e) { } catch (IOException e) {
Toast.makeText(MainActivity.this, "Load image failed!", Toast.LENGTH_SHORT).show(); Toast.makeText(MainActivity.this, "Load image failed!", Toast.LENGTH_SHORT).show();
...@@ -430,7 +413,7 @@ public class MainActivity extends AppCompatActivity { ...@@ -430,7 +413,7 @@ public class MainActivity extends AppCompatActivity {
Cursor cursor = managedQuery(uri, proj, null, null, null); Cursor cursor = managedQuery(uri, proj, null, null, null);
cursor.moveToFirst(); cursor.moveToFirst();
if (image != null) { if (image != null) {
// onImageChanged(image); cur_predict_image = image;
ivInputImage.setImageBitmap(image); ivInputImage.setImageBitmap(image);
} }
} catch (IOException e) { } catch (IOException e) {
...@@ -451,7 +434,7 @@ public class MainActivity extends AppCompatActivity { ...@@ -451,7 +434,7 @@ public class MainActivity extends AppCompatActivity {
Bitmap image = BitmapFactory.decodeFile(currentPhotoPath); Bitmap image = BitmapFactory.decodeFile(currentPhotoPath);
image = Utils.rotateBitmap(image, orientation); image = Utils.rotateBitmap(image, orientation);
if (image != null) { if (image != null) {
// onImageChanged(image); cur_predict_image = image;
ivInputImage.setImageBitmap(image); ivInputImage.setImageBitmap(image);
} }
} else { } else {
...@@ -464,28 +447,28 @@ public class MainActivity extends AppCompatActivity { ...@@ -464,28 +447,28 @@ public class MainActivity extends AppCompatActivity {
} }
} }
public void btn_load_model_click(View view) { public void btn_reset_img_click(View view) {
if (predictor.isLoaded()){ ivInputImage.setImageBitmap(cur_predict_image);
tvStatus.setText("STATUS: model has been loaded"); }
}else{
public void cb_opencl_click(View view) {
tvStatus.setText("STATUS: load model ......"); tvStatus.setText("STATUS: load model ......");
loadModel(); loadModel();
} }
}
public void btn_run_model_click(View view) { public void btn_run_model_click(View view) {
Bitmap image =((BitmapDrawable)ivInputImage.getDrawable()).getBitmap(); Bitmap image = ((BitmapDrawable) ivInputImage.getDrawable()).getBitmap();
if(image == null) { if (image == null) {
tvStatus.setText("STATUS: image is not exists"); tvStatus.setText("STATUS: image is not exists");
} } else if (!predictor.isLoaded()) {
else if (!predictor.isLoaded()){
tvStatus.setText("STATUS: model is not loaded"); tvStatus.setText("STATUS: model is not loaded");
}else{ } else {
tvStatus.setText("STATUS: run model ...... "); tvStatus.setText("STATUS: run model ...... ");
predictor.setInputImage(image); predictor.setInputImage(image);
runModel(); runModel();
} }
} }
public void btn_choice_img_click(View view) { public void btn_choice_img_click(View view) {
if (requestAllPermissions()) { if (requestAllPermissions()) {
openGallery(); openGallery();
...@@ -506,4 +489,32 @@ public class MainActivity extends AppCompatActivity { ...@@ -506,4 +489,32 @@ public class MainActivity extends AppCompatActivity {
worker.quit(); worker.quit();
super.onDestroy(); super.onDestroy();
} }
public int get_run_mode() {
String run_mode = spRunMode.getSelectedItem().toString();
int mode;
switch (run_mode) {
case "检测+分类+识别":
mode = 1;
break;
case "检测+识别":
mode = 2;
break;
case "识别+分类":
mode = 3;
break;
case "检测":
mode = 4;
break;
case "识别":
mode = 5;
break;
case "分类":
mode = 6;
break;
default:
mode = 1;
}
return mode;
}
} }
package com.baidu.paddle.lite.demo.ocr;
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.os.Build;
import android.os.Bundle;
import android.os.Handler;
import android.os.HandlerThread;
import android.os.Message;
import android.util.Log;
import android.view.View;
import android.widget.Button;
import android.widget.ImageView;
import android.widget.TextView;
import android.widget.Toast;
import androidx.appcompat.app.AppCompatActivity;
import java.io.IOException;
import java.io.InputStream;
public class MiniActivity extends AppCompatActivity {
public static final int REQUEST_LOAD_MODEL = 0;
public static final int REQUEST_RUN_MODEL = 1;
public static final int REQUEST_UNLOAD_MODEL = 2;
public static final int RESPONSE_LOAD_MODEL_SUCCESSED = 0;
public static final int RESPONSE_LOAD_MODEL_FAILED = 1;
public static final int RESPONSE_RUN_MODEL_SUCCESSED = 2;
public static final int RESPONSE_RUN_MODEL_FAILED = 3;
private static final String TAG = "MiniActivity";
protected Handler receiver = null; // Receive messages from worker thread
protected Handler sender = null; // Send command to worker thread
protected HandlerThread worker = null; // Worker thread to load&run model
protected volatile Predictor predictor = null;
private String assetModelDirPath = "models/ocr_v2_for_cpu";
private String assetlabelFilePath = "labels/ppocr_keys_v1.txt";
private Button button;
private ImageView imageView; // image result
private TextView textView; // text result
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_mini);
Log.i(TAG, "SHOW in Logcat");
// Prepare the worker thread for mode loading and inference
worker = new HandlerThread("Predictor Worker");
worker.start();
sender = new Handler(worker.getLooper()) {
public void handleMessage(Message msg) {
switch (msg.what) {
case REQUEST_LOAD_MODEL:
// Load model and reload test image
if (!onLoadModel()) {
runOnUiThread(new Runnable() {
@Override
public void run() {
Toast.makeText(MiniActivity.this, "Load model failed!", Toast.LENGTH_SHORT).show();
}
});
}
break;
case REQUEST_RUN_MODEL:
// Run model if model is loaded
final boolean isSuccessed = onRunModel();
runOnUiThread(new Runnable() {
@Override
public void run() {
if (isSuccessed){
onRunModelSuccessed();
}else{
Toast.makeText(MiniActivity.this, "Run model failed!", Toast.LENGTH_SHORT).show();
}
}
});
break;
}
}
};
sender.sendEmptyMessage(REQUEST_LOAD_MODEL); // corresponding to REQUEST_LOAD_MODEL, to call onLoadModel()
imageView = findViewById(R.id.imageView);
textView = findViewById(R.id.sample_text);
button = findViewById(R.id.button);
button.setOnClickListener(new View.OnClickListener() {
@Override
public void onClick(View v) {
sender.sendEmptyMessage(REQUEST_RUN_MODEL);
}
});
}
@Override
protected void onDestroy() {
onUnloadModel();
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.JELLY_BEAN_MR2) {
worker.quitSafely();
} else {
worker.quit();
}
super.onDestroy();
}
/**
* call in onCreate, model init
*
* @return
*/
private boolean onLoadModel() {
if (predictor == null) {
predictor = new Predictor();
}
return predictor.init(this, assetModelDirPath, assetlabelFilePath);
}
/**
* init engine
* call in onCreate
*
* @return
*/
private boolean onRunModel() {
try {
String assetImagePath = "images/0.jpg";
InputStream imageStream = getAssets().open(assetImagePath);
Bitmap image = BitmapFactory.decodeStream(imageStream);
// Input is Bitmap
predictor.setInputImage(image);
return predictor.isLoaded() && predictor.runModel();
} catch (IOException e) {
e.printStackTrace();
return false;
}
}
private void onRunModelSuccessed() {
Log.i(TAG, "onRunModelSuccessed");
textView.setText(predictor.outputResult);
imageView.setImageBitmap(predictor.outputImage);
}
private void onUnloadModel() {
if (predictor != null) {
predictor.releaseModel();
}
}
}
...@@ -29,22 +29,22 @@ public class OCRPredictorNative { ...@@ -29,22 +29,22 @@ public class OCRPredictorNative {
public OCRPredictorNative(Config config) { public OCRPredictorNative(Config config) {
this.config = config; this.config = config;
loadLibrary(); loadLibrary();
nativePointer = init(config.detModelFilename, config.recModelFilename,config.clsModelFilename, nativePointer = init(config.detModelFilename, config.recModelFilename, config.clsModelFilename, config.useOpencl,
config.cpuThreadNum, config.cpuPower); config.cpuThreadNum, config.cpuPower);
Log.i("OCRPredictorNative", "load success " + nativePointer); Log.i("OCRPredictorNative", "load success " + nativePointer);
} }
public ArrayList<OcrResultModel> runImage(float[] inputData, int width, int height, int channels, Bitmap originalImage) { public ArrayList<OcrResultModel> runImage(Bitmap originalImage, int max_size_len, int run_det, int run_cls, int run_rec) {
Log.i("OCRPredictorNative", "begin to run image " + inputData.length + " " + width + " " + height); Log.i("OCRPredictorNative", "begin to run image ");
float[] dims = new float[]{1, channels, height, width}; float[] rawResults = forward(nativePointer, originalImage, max_size_len, run_det, run_cls, run_rec);
float[] rawResults = forward(nativePointer, inputData, dims, originalImage);
ArrayList<OcrResultModel> results = postprocess(rawResults); ArrayList<OcrResultModel> results = postprocess(rawResults);
return results; return results;
} }
public static class Config { public static class Config {
public int useOpencl;
public int cpuThreadNum; public int cpuThreadNum;
public String cpuPower; public String cpuPower;
public String detModelFilename; public String detModelFilename;
...@@ -53,16 +53,16 @@ public class OCRPredictorNative { ...@@ -53,16 +53,16 @@ public class OCRPredictorNative {
} }
public void destory(){ public void destory() {
if (nativePointer > 0) { if (nativePointer > 0) {
release(nativePointer); release(nativePointer);
nativePointer = 0; nativePointer = 0;
} }
} }
protected native long init(String detModelPath, String recModelPath,String clsModelPath, int threadNum, String cpuMode); protected native long init(String detModelPath, String recModelPath, String clsModelPath, int useOpencl, int threadNum, String cpuMode);
protected native float[] forward(long pointer, float[] buf, float[] ddims, Bitmap originalImage); protected native float[] forward(long pointer, Bitmap originalImage,int max_size_len, int run_det, int run_cls, int run_rec);
protected native void release(long pointer); protected native void release(long pointer);
...@@ -73,9 +73,9 @@ public class OCRPredictorNative { ...@@ -73,9 +73,9 @@ public class OCRPredictorNative {
while (begin < raw.length) { while (begin < raw.length) {
int point_num = Math.round(raw[begin]); int point_num = Math.round(raw[begin]);
int word_num = Math.round(raw[begin + 1]); int word_num = Math.round(raw[begin + 1]);
OcrResultModel model = parse(raw, begin + 2, point_num, word_num); OcrResultModel res = parse(raw, begin + 2, point_num, word_num);
begin += 2 + 1 + point_num * 2 + word_num; begin += 2 + 1 + point_num * 2 + word_num + 2;
results.add(model); results.add(res);
} }
return results; return results;
...@@ -83,19 +83,22 @@ public class OCRPredictorNative { ...@@ -83,19 +83,22 @@ public class OCRPredictorNative {
private OcrResultModel parse(float[] raw, int begin, int pointNum, int wordNum) { private OcrResultModel parse(float[] raw, int begin, int pointNum, int wordNum) {
int current = begin; int current = begin;
OcrResultModel model = new OcrResultModel(); OcrResultModel res = new OcrResultModel();
model.setConfidence(raw[current]); res.setConfidence(raw[current]);
current++; current++;
for (int i = 0; i < pointNum; i++) { for (int i = 0; i < pointNum; i++) {
model.addPoints(Math.round(raw[current + i * 2]), Math.round(raw[current + i * 2 + 1])); res.addPoints(Math.round(raw[current + i * 2]), Math.round(raw[current + i * 2 + 1]));
} }
current += (pointNum * 2); current += (pointNum * 2);
for (int i = 0; i < wordNum; i++) { for (int i = 0; i < wordNum; i++) {
int index = Math.round(raw[current + i]); int index = Math.round(raw[current + i]);
model.addWordIndex(index); res.addWordIndex(index);
} }
current += wordNum;
res.setClsIdx(raw[current]);
res.setClsConfidence(raw[current + 1]);
Log.i("OCRPredictorNative", "word finished " + wordNum); Log.i("OCRPredictorNative", "word finished " + wordNum);
return model; return res;
} }
......
...@@ -10,6 +10,9 @@ public class OcrResultModel { ...@@ -10,6 +10,9 @@ public class OcrResultModel {
private List<Integer> wordIndex; private List<Integer> wordIndex;
private String label; private String label;
private float confidence; private float confidence;
private float cls_idx;
private String cls_label;
private float cls_confidence;
public OcrResultModel() { public OcrResultModel() {
super(); super();
...@@ -49,4 +52,28 @@ public class OcrResultModel { ...@@ -49,4 +52,28 @@ public class OcrResultModel {
public void setConfidence(float confidence) { public void setConfidence(float confidence) {
this.confidence = confidence; this.confidence = confidence;
} }
public float getClsIdx() {
return cls_idx;
}
public void setClsIdx(float idx) {
this.cls_idx = idx;
}
public String getClsLabel() {
return cls_label;
}
public void setClsLabel(String label) {
this.cls_label = label;
}
public float getClsConfidence() {
return cls_confidence;
}
public void setClsConfidence(float confidence) {
this.cls_confidence = confidence;
}
} }
...@@ -31,23 +31,19 @@ public class Predictor { ...@@ -31,23 +31,19 @@ public class Predictor {
protected float inferenceTime = 0; protected float inferenceTime = 0;
// Only for object detection // Only for object detection
protected Vector<String> wordLabels = new Vector<String>(); protected Vector<String> wordLabels = new Vector<String>();
protected String inputColorFormat = "BGR"; protected int detLongSize = 960;
protected long[] inputShape = new long[]{1, 3, 960};
protected float[] inputMean = new float[]{0.485f, 0.456f, 0.406f};
protected float[] inputStd = new float[]{1.0f / 0.229f, 1.0f / 0.224f, 1.0f / 0.225f};
protected float scoreThreshold = 0.1f; protected float scoreThreshold = 0.1f;
protected Bitmap inputImage = null; protected Bitmap inputImage = null;
protected Bitmap outputImage = null; protected Bitmap outputImage = null;
protected volatile String outputResult = ""; protected volatile String outputResult = "";
protected float preprocessTime = 0;
protected float postprocessTime = 0; protected float postprocessTime = 0;
public Predictor() { public Predictor() {
} }
public boolean init(Context appCtx, String modelPath, String labelPath) { public boolean init(Context appCtx, String modelPath, String labelPath, int useOpencl, int cpuThreadNum, String cpuPowerMode) {
isLoaded = loadModel(appCtx, modelPath, cpuThreadNum, cpuPowerMode); isLoaded = loadModel(appCtx, modelPath, useOpencl, cpuThreadNum, cpuPowerMode);
if (!isLoaded) { if (!isLoaded) {
return false; return false;
} }
...@@ -56,49 +52,18 @@ public class Predictor { ...@@ -56,49 +52,18 @@ public class Predictor {
} }
public boolean init(Context appCtx, String modelPath, String labelPath, int cpuThreadNum, String cpuPowerMode, public boolean init(Context appCtx, String modelPath, String labelPath, int useOpencl, int cpuThreadNum, String cpuPowerMode,
String inputColorFormat, int detLongSize, float scoreThreshold) {
long[] inputShape, float[] inputMean, boolean isLoaded = init(appCtx, modelPath, labelPath, useOpencl, cpuThreadNum, cpuPowerMode);
float[] inputStd, float scoreThreshold) {
if (inputShape.length != 3) {
Log.e(TAG, "Size of input shape should be: 3");
return false;
}
if (inputMean.length != inputShape[1]) {
Log.e(TAG, "Size of input mean should be: " + Long.toString(inputShape[1]));
return false;
}
if (inputStd.length != inputShape[1]) {
Log.e(TAG, "Size of input std should be: " + Long.toString(inputShape[1]));
return false;
}
if (inputShape[0] != 1) {
Log.e(TAG, "Only one batch is supported in the image classification demo, you can use any batch size in " +
"your Apps!");
return false;
}
if (inputShape[1] != 1 && inputShape[1] != 3) {
Log.e(TAG, "Only one/three channels are supported in the image classification demo, you can use any " +
"channel size in your Apps!");
return false;
}
if (!inputColorFormat.equalsIgnoreCase("BGR")) {
Log.e(TAG, "Only BGR color format is supported.");
return false;
}
boolean isLoaded = init(appCtx, modelPath, labelPath);
if (!isLoaded) { if (!isLoaded) {
return false; return false;
} }
this.inputColorFormat = inputColorFormat; this.detLongSize = detLongSize;
this.inputShape = inputShape;
this.inputMean = inputMean;
this.inputStd = inputStd;
this.scoreThreshold = scoreThreshold; this.scoreThreshold = scoreThreshold;
return true; return true;
} }
protected boolean loadModel(Context appCtx, String modelPath, int cpuThreadNum, String cpuPowerMode) { protected boolean loadModel(Context appCtx, String modelPath, int useOpencl, int cpuThreadNum, String cpuPowerMode) {
// Release model if exists // Release model if exists
releaseModel(); releaseModel();
...@@ -118,12 +83,13 @@ public class Predictor { ...@@ -118,12 +83,13 @@ public class Predictor {
} }
OCRPredictorNative.Config config = new OCRPredictorNative.Config(); OCRPredictorNative.Config config = new OCRPredictorNative.Config();
config.useOpencl = useOpencl;
config.cpuThreadNum = cpuThreadNum; config.cpuThreadNum = cpuThreadNum;
config.detModelFilename = realPath + File.separator + "ch_ppocr_mobile_v2.0_det_opt.nb";
config.recModelFilename = realPath + File.separator + "ch_ppocr_mobile_v2.0_rec_opt.nb";
config.clsModelFilename = realPath + File.separator + "ch_ppocr_mobile_v2.0_cls_opt.nb";
Log.e("Predictor", "model path" + config.detModelFilename + " ; " + config.recModelFilename + ";" + config.clsModelFilename);
config.cpuPower = cpuPowerMode; config.cpuPower = cpuPowerMode;
config.detModelFilename = realPath + File.separator + "det_db.nb";
config.recModelFilename = realPath + File.separator + "rec_crnn.nb";
config.clsModelFilename = realPath + File.separator + "cls.nb";
Log.i("Predictor", "model path" + config.detModelFilename + " ; " + config.recModelFilename + ";" + config.clsModelFilename);
paddlePredictor = new OCRPredictorNative(config); paddlePredictor = new OCRPredictorNative(config);
this.cpuThreadNum = cpuThreadNum; this.cpuThreadNum = cpuThreadNum;
...@@ -170,82 +136,29 @@ public class Predictor { ...@@ -170,82 +136,29 @@ public class Predictor {
} }
public boolean runModel() { public boolean runModel(int run_det, int run_cls, int run_rec) {
if (inputImage == null || !isLoaded()) { if (inputImage == null || !isLoaded()) {
return false; return false;
} }
// Pre-process image, and feed input tensor with pre-processed data
Bitmap scaleImage = Utils.resizeWithStep(inputImage, Long.valueOf(inputShape[2]).intValue(), 32);
Date start = new Date();
int channels = (int) inputShape[1];
int width = scaleImage.getWidth();
int height = scaleImage.getHeight();
float[] inputData = new float[channels * width * height];
if (channels == 3) {
int[] channelIdx = null;
if (inputColorFormat.equalsIgnoreCase("RGB")) {
channelIdx = new int[]{0, 1, 2};
} else if (inputColorFormat.equalsIgnoreCase("BGR")) {
channelIdx = new int[]{2, 1, 0};
} else {
Log.i(TAG, "Unknown color format " + inputColorFormat + ", only RGB and BGR color format is " +
"supported!");
return false;
}
int[] channelStride = new int[]{width * height, width * height * 2};
int[] pixels=new int[width*height];
scaleImage.getPixels(pixels,0,scaleImage.getWidth(),0,0,scaleImage.getWidth(),scaleImage.getHeight());
for (int i = 0; i < pixels.length; i++) {
int color = pixels[i];
float[] rgb = new float[]{(float) red(color) / 255.0f, (float) green(color) / 255.0f,
(float) blue(color) / 255.0f};
inputData[i] = (rgb[channelIdx[0]] - inputMean[0]) / inputStd[0];
inputData[i + channelStride[0]] = (rgb[channelIdx[1]] - inputMean[1]) / inputStd[1];
inputData[i+ channelStride[1]] = (rgb[channelIdx[2]] - inputMean[2]) / inputStd[2];
}
} else if (channels == 1) {
int[] pixels=new int[width*height];
scaleImage.getPixels(pixels,0,scaleImage.getWidth(),0,0,scaleImage.getWidth(),scaleImage.getHeight());
for (int i = 0; i < pixels.length; i++) {
int color = pixels[i];
float gray = (float) (red(color) + green(color) + blue(color)) / 3.0f / 255.0f;
inputData[i] = (gray - inputMean[0]) / inputStd[0];
}
} else {
Log.i(TAG, "Unsupported channel size " + Integer.toString(channels) + ", only channel 1 and 3 is " +
"supported!");
return false;
}
float[] pixels = inputData;
Log.i(TAG, "pixels " + pixels[0] + " " + pixels[1] + " " + pixels[2] + " " + pixels[3]
+ " " + pixels[pixels.length / 2] + " " + pixels[pixels.length / 2 + 1] + " " + pixels[pixels.length - 2] + " " + pixels[pixels.length - 1]);
Date end = new Date();
preprocessTime = (float) (end.getTime() - start.getTime());
// Warm up // Warm up
for (int i = 0; i < warmupIterNum; i++) { for (int i = 0; i < warmupIterNum; i++) {
paddlePredictor.runImage(inputData, width, height, channels, inputImage); paddlePredictor.runImage(inputImage, detLongSize, run_det, run_cls, run_rec);
} }
warmupIterNum = 0; // do not need warm warmupIterNum = 0; // do not need warm
// Run inference // Run inference
start = new Date(); Date start = new Date();
ArrayList<OcrResultModel> results = paddlePredictor.runImage(inputData, width, height, channels, inputImage); ArrayList<OcrResultModel> results = paddlePredictor.runImage(inputImage, detLongSize, run_det, run_cls, run_rec);
end = new Date(); Date end = new Date();
inferenceTime = (end.getTime() - start.getTime()) / (float) inferIterNum; inferenceTime = (end.getTime() - start.getTime()) / (float) inferIterNum;
results = postprocess(results); results = postprocess(results);
Log.i(TAG, "[stat] Preprocess Time: " + preprocessTime Log.i(TAG, "[stat] Inference Time: " + inferenceTime + " ;Box Size " + results.size());
+ " ; Inference Time: " + inferenceTime + " ;Box Size " + results.size());
drawResults(results); drawResults(results);
return true; return true;
} }
public boolean isLoaded() { public boolean isLoaded() {
return paddlePredictor != null && isLoaded; return paddlePredictor != null && isLoaded;
} }
...@@ -282,10 +195,6 @@ public class Predictor { ...@@ -282,10 +195,6 @@ public class Predictor {
return outputResult; return outputResult;
} }
public float preprocessTime() {
return preprocessTime;
}
public float postprocessTime() { public float postprocessTime() {
return postprocessTime; return postprocessTime;
} }
...@@ -310,6 +219,7 @@ public class Predictor { ...@@ -310,6 +219,7 @@ public class Predictor {
} }
} }
r.setLabel(word.toString()); r.setLabel(word.toString());
r.setClsLabel(r.getClsIdx() == 1 ? "180" : "0");
} }
return results; return results;
} }
...@@ -319,14 +229,22 @@ public class Predictor { ...@@ -319,14 +229,22 @@ public class Predictor {
for (int i = 0; i < results.size(); i++) { for (int i = 0; i < results.size(); i++) {
OcrResultModel result = results.get(i); OcrResultModel result = results.get(i);
StringBuilder sb = new StringBuilder(""); StringBuilder sb = new StringBuilder("");
sb.append(result.getLabel()); if(result.getPoints().size()>0){
sb.append(" ").append(result.getConfidence()); sb.append("Det: ");
sb.append("; Points: ");
for (Point p : result.getPoints()) { for (Point p : result.getPoints()) {
sb.append("(").append(p.x).append(",").append(p.y).append(") "); sb.append("(").append(p.x).append(",").append(p.y).append(") ");
} }
}
if(result.getLabel().length() > 0){
sb.append("\n Rec: ").append(result.getLabel());
sb.append(",").append(result.getConfidence());
}
if(result.getClsIdx()!=-1){
sb.append(" Cls: ").append(result.getClsLabel());
sb.append(",").append(result.getClsConfidence());
}
Log.i(TAG, sb.toString()); // show LOG in Logcat panel Log.i(TAG, sb.toString()); // show LOG in Logcat panel
outputResultSb.append(i + 1).append(": ").append(result.getLabel()).append("\n"); outputResultSb.append(i + 1).append(": ").append(sb.toString()).append("\n");
} }
outputResult = outputResultSb.toString(); outputResult = outputResultSb.toString();
outputImage = inputImage; outputImage = inputImage;
...@@ -344,6 +262,9 @@ public class Predictor { ...@@ -344,6 +262,9 @@ public class Predictor {
for (OcrResultModel result : results) { for (OcrResultModel result : results) {
Path path = new Path(); Path path = new Path();
List<Point> points = result.getPoints(); List<Point> points = result.getPoints();
if(points.size()==0){
continue;
}
path.moveTo(points.get(0).x, points.get(0).y); path.moveTo(points.get(0).x, points.get(0).y);
for (int i = points.size() - 1; i >= 0; i--) { for (int i = points.size() - 1; i >= 0; i--) {
Point p = points.get(i); Point p = points.get(i);
......
...@@ -20,16 +20,13 @@ public class SettingsActivity extends AppCompatPreferenceActivity implements Sha ...@@ -20,16 +20,13 @@ public class SettingsActivity extends AppCompatPreferenceActivity implements Sha
ListPreference etImagePath = null; ListPreference etImagePath = null;
ListPreference lpCPUThreadNum = null; ListPreference lpCPUThreadNum = null;
ListPreference lpCPUPowerMode = null; ListPreference lpCPUPowerMode = null;
ListPreference lpInputColorFormat = null; EditTextPreference etDetLongSize = null;
EditTextPreference etInputShape = null;
EditTextPreference etInputMean = null;
EditTextPreference etInputStd = null;
EditTextPreference etScoreThreshold = null; EditTextPreference etScoreThreshold = null;
List<String> preInstalledModelPaths = null; List<String> preInstalledModelPaths = null;
List<String> preInstalledLabelPaths = null; List<String> preInstalledLabelPaths = null;
List<String> preInstalledImagePaths = null; List<String> preInstalledImagePaths = null;
List<String> preInstalledInputShapes = null; List<String> preInstalledDetLongSizes = null;
List<String> preInstalledCPUThreadNums = null; List<String> preInstalledCPUThreadNums = null;
List<String> preInstalledCPUPowerModes = null; List<String> preInstalledCPUPowerModes = null;
List<String> preInstalledInputColorFormats = null; List<String> preInstalledInputColorFormats = null;
...@@ -50,7 +47,7 @@ public class SettingsActivity extends AppCompatPreferenceActivity implements Sha ...@@ -50,7 +47,7 @@ public class SettingsActivity extends AppCompatPreferenceActivity implements Sha
preInstalledModelPaths = new ArrayList<String>(); preInstalledModelPaths = new ArrayList<String>();
preInstalledLabelPaths = new ArrayList<String>(); preInstalledLabelPaths = new ArrayList<String>();
preInstalledImagePaths = new ArrayList<String>(); preInstalledImagePaths = new ArrayList<String>();
preInstalledInputShapes = new ArrayList<String>(); preInstalledDetLongSizes = new ArrayList<String>();
preInstalledCPUThreadNums = new ArrayList<String>(); preInstalledCPUThreadNums = new ArrayList<String>();
preInstalledCPUPowerModes = new ArrayList<String>(); preInstalledCPUPowerModes = new ArrayList<String>();
preInstalledInputColorFormats = new ArrayList<String>(); preInstalledInputColorFormats = new ArrayList<String>();
...@@ -63,10 +60,7 @@ public class SettingsActivity extends AppCompatPreferenceActivity implements Sha ...@@ -63,10 +60,7 @@ public class SettingsActivity extends AppCompatPreferenceActivity implements Sha
preInstalledImagePaths.add(getString(R.string.IMAGE_PATH_DEFAULT)); preInstalledImagePaths.add(getString(R.string.IMAGE_PATH_DEFAULT));
preInstalledCPUThreadNums.add(getString(R.string.CPU_THREAD_NUM_DEFAULT)); preInstalledCPUThreadNums.add(getString(R.string.CPU_THREAD_NUM_DEFAULT));
preInstalledCPUPowerModes.add(getString(R.string.CPU_POWER_MODE_DEFAULT)); preInstalledCPUPowerModes.add(getString(R.string.CPU_POWER_MODE_DEFAULT));
preInstalledInputColorFormats.add(getString(R.string.INPUT_COLOR_FORMAT_DEFAULT)); preInstalledDetLongSizes.add(getString(R.string.DET_LONG_SIZE_DEFAULT));
preInstalledInputShapes.add(getString(R.string.INPUT_SHAPE_DEFAULT));
preInstalledInputMeans.add(getString(R.string.INPUT_MEAN_DEFAULT));
preInstalledInputStds.add(getString(R.string.INPUT_STD_DEFAULT));
preInstalledScoreThresholds.add(getString(R.string.SCORE_THRESHOLD_DEFAULT)); preInstalledScoreThresholds.add(getString(R.string.SCORE_THRESHOLD_DEFAULT));
// Setup UI components // Setup UI components
...@@ -89,11 +83,7 @@ public class SettingsActivity extends AppCompatPreferenceActivity implements Sha ...@@ -89,11 +83,7 @@ public class SettingsActivity extends AppCompatPreferenceActivity implements Sha
(ListPreference) findPreference(getString(R.string.CPU_THREAD_NUM_KEY)); (ListPreference) findPreference(getString(R.string.CPU_THREAD_NUM_KEY));
lpCPUPowerMode = lpCPUPowerMode =
(ListPreference) findPreference(getString(R.string.CPU_POWER_MODE_KEY)); (ListPreference) findPreference(getString(R.string.CPU_POWER_MODE_KEY));
lpInputColorFormat = etDetLongSize = (EditTextPreference) findPreference(getString(R.string.DET_LONG_SIZE_KEY));
(ListPreference) findPreference(getString(R.string.INPUT_COLOR_FORMAT_KEY));
etInputShape = (EditTextPreference) findPreference(getString(R.string.INPUT_SHAPE_KEY));
etInputMean = (EditTextPreference) findPreference(getString(R.string.INPUT_MEAN_KEY));
etInputStd = (EditTextPreference) findPreference(getString(R.string.INPUT_STD_KEY));
etScoreThreshold = (EditTextPreference) findPreference(getString(R.string.SCORE_THRESHOLD_KEY)); etScoreThreshold = (EditTextPreference) findPreference(getString(R.string.SCORE_THRESHOLD_KEY));
} }
...@@ -112,11 +102,7 @@ public class SettingsActivity extends AppCompatPreferenceActivity implements Sha ...@@ -112,11 +102,7 @@ public class SettingsActivity extends AppCompatPreferenceActivity implements Sha
editor.putString(getString(R.string.IMAGE_PATH_KEY), preInstalledImagePaths.get(modelIdx)); editor.putString(getString(R.string.IMAGE_PATH_KEY), preInstalledImagePaths.get(modelIdx));
editor.putString(getString(R.string.CPU_THREAD_NUM_KEY), preInstalledCPUThreadNums.get(modelIdx)); editor.putString(getString(R.string.CPU_THREAD_NUM_KEY), preInstalledCPUThreadNums.get(modelIdx));
editor.putString(getString(R.string.CPU_POWER_MODE_KEY), preInstalledCPUPowerModes.get(modelIdx)); editor.putString(getString(R.string.CPU_POWER_MODE_KEY), preInstalledCPUPowerModes.get(modelIdx));
editor.putString(getString(R.string.INPUT_COLOR_FORMAT_KEY), editor.putString(getString(R.string.DET_LONG_SIZE_KEY), preInstalledDetLongSizes.get(modelIdx));
preInstalledInputColorFormats.get(modelIdx));
editor.putString(getString(R.string.INPUT_SHAPE_KEY), preInstalledInputShapes.get(modelIdx));
editor.putString(getString(R.string.INPUT_MEAN_KEY), preInstalledInputMeans.get(modelIdx));
editor.putString(getString(R.string.INPUT_STD_KEY), preInstalledInputStds.get(modelIdx));
editor.putString(getString(R.string.SCORE_THRESHOLD_KEY), editor.putString(getString(R.string.SCORE_THRESHOLD_KEY),
preInstalledScoreThresholds.get(modelIdx)); preInstalledScoreThresholds.get(modelIdx));
editor.apply(); editor.apply();
...@@ -129,10 +115,7 @@ public class SettingsActivity extends AppCompatPreferenceActivity implements Sha ...@@ -129,10 +115,7 @@ public class SettingsActivity extends AppCompatPreferenceActivity implements Sha
etImagePath.setEnabled(enableCustomSettings); etImagePath.setEnabled(enableCustomSettings);
lpCPUThreadNum.setEnabled(enableCustomSettings); lpCPUThreadNum.setEnabled(enableCustomSettings);
lpCPUPowerMode.setEnabled(enableCustomSettings); lpCPUPowerMode.setEnabled(enableCustomSettings);
lpInputColorFormat.setEnabled(enableCustomSettings); etDetLongSize.setEnabled(enableCustomSettings);
etInputShape.setEnabled(enableCustomSettings);
etInputMean.setEnabled(enableCustomSettings);
etInputStd.setEnabled(enableCustomSettings);
etScoreThreshold.setEnabled(enableCustomSettings); etScoreThreshold.setEnabled(enableCustomSettings);
modelPath = sharedPreferences.getString(getString(R.string.MODEL_PATH_KEY), modelPath = sharedPreferences.getString(getString(R.string.MODEL_PATH_KEY),
getString(R.string.MODEL_PATH_DEFAULT)); getString(R.string.MODEL_PATH_DEFAULT));
...@@ -144,14 +127,8 @@ public class SettingsActivity extends AppCompatPreferenceActivity implements Sha ...@@ -144,14 +127,8 @@ public class SettingsActivity extends AppCompatPreferenceActivity implements Sha
getString(R.string.CPU_THREAD_NUM_DEFAULT)); getString(R.string.CPU_THREAD_NUM_DEFAULT));
String cpuPowerMode = sharedPreferences.getString(getString(R.string.CPU_POWER_MODE_KEY), String cpuPowerMode = sharedPreferences.getString(getString(R.string.CPU_POWER_MODE_KEY),
getString(R.string.CPU_POWER_MODE_DEFAULT)); getString(R.string.CPU_POWER_MODE_DEFAULT));
String inputColorFormat = sharedPreferences.getString(getString(R.string.INPUT_COLOR_FORMAT_KEY), String detLongSize = sharedPreferences.getString(getString(R.string.DET_LONG_SIZE_KEY),
getString(R.string.INPUT_COLOR_FORMAT_DEFAULT)); getString(R.string.DET_LONG_SIZE_DEFAULT));
String inputShape = sharedPreferences.getString(getString(R.string.INPUT_SHAPE_KEY),
getString(R.string.INPUT_SHAPE_DEFAULT));
String inputMean = sharedPreferences.getString(getString(R.string.INPUT_MEAN_KEY),
getString(R.string.INPUT_MEAN_DEFAULT));
String inputStd = sharedPreferences.getString(getString(R.string.INPUT_STD_KEY),
getString(R.string.INPUT_STD_DEFAULT));
String scoreThreshold = sharedPreferences.getString(getString(R.string.SCORE_THRESHOLD_KEY), String scoreThreshold = sharedPreferences.getString(getString(R.string.SCORE_THRESHOLD_KEY),
getString(R.string.SCORE_THRESHOLD_DEFAULT)); getString(R.string.SCORE_THRESHOLD_DEFAULT));
etModelPath.setSummary(modelPath); etModelPath.setSummary(modelPath);
...@@ -164,14 +141,8 @@ public class SettingsActivity extends AppCompatPreferenceActivity implements Sha ...@@ -164,14 +141,8 @@ public class SettingsActivity extends AppCompatPreferenceActivity implements Sha
lpCPUThreadNum.setSummary(cpuThreadNum); lpCPUThreadNum.setSummary(cpuThreadNum);
lpCPUPowerMode.setValue(cpuPowerMode); lpCPUPowerMode.setValue(cpuPowerMode);
lpCPUPowerMode.setSummary(cpuPowerMode); lpCPUPowerMode.setSummary(cpuPowerMode);
lpInputColorFormat.setValue(inputColorFormat); etDetLongSize.setSummary(detLongSize);
lpInputColorFormat.setSummary(inputColorFormat); etDetLongSize.setText(detLongSize);
etInputShape.setSummary(inputShape);
etInputShape.setText(inputShape);
etInputMean.setSummary(inputMean);
etInputMean.setText(inputMean);
etInputStd.setSummary(inputStd);
etInputStd.setText(inputStd);
etScoreThreshold.setText(scoreThreshold); etScoreThreshold.setText(scoreThreshold);
etScoreThreshold.setSummary(scoreThreshold); etScoreThreshold.setSummary(scoreThreshold);
} }
......
...@@ -23,13 +23,7 @@ ...@@ -23,13 +23,7 @@
android:layout_height="wrap_content" android:layout_height="wrap_content"
android:orientation="horizontal"> android:orientation="horizontal">
<Button
android:id="@+id/btn_load_model"
android:layout_width="0dp"
android:layout_height="wrap_content"
android:layout_weight="1"
android:onClick="btn_load_model_click"
android:text="加载模型" />
<Button <Button
android:id="@+id/btn_run_model" android:id="@+id/btn_run_model"
android:layout_width="0dp" android:layout_width="0dp"
...@@ -52,7 +46,45 @@ ...@@ -52,7 +46,45 @@
android:onClick="btn_choice_img_click" android:onClick="btn_choice_img_click"
android:text="选取图片" /> android:text="选取图片" />
<Button
android:id="@+id/btn_reset_img"
android:layout_width="0dp"
android:layout_height="wrap_content"
android:layout_weight="1"
android:onClick="btn_reset_img_click"
android:text="清空绘图" />
</LinearLayout>
<LinearLayout
android:id="@+id/run_mode_layout"
android:layout_width="fill_parent"
android:layout_height="wrap_content"
android:orientation="horizontal">
<CheckBox
android:id="@+id/cb_opencl"
android:layout_width="0dp"
android:layout_weight="1"
android:layout_height="wrap_content"
android:text="开启OPENCL"
android:onClick="cb_opencl_click"
android:visibility="gone"/>
<TextView
android:layout_width="0dp"
android:layout_weight="0.5"
android:layout_height="wrap_content"
android:text="运行模式:"/>
<Spinner
android:id="@+id/sp_run_mode"
android:layout_width="0dp"
android:layout_weight="1.5"
android:layout_height="wrap_content"
android:entries="@array/run_Model"
/>
</LinearLayout> </LinearLayout>
<TextView <TextView
android:id="@+id/tv_input_setting" android:id="@+id/tv_input_setting"
android:layout_width="wrap_content" android:layout_width="wrap_content"
...@@ -60,7 +92,7 @@ ...@@ -60,7 +92,7 @@
android:scrollbars="vertical" android:scrollbars="vertical"
android:layout_marginLeft="12dp" android:layout_marginLeft="12dp"
android:layout_marginRight="12dp" android:layout_marginRight="12dp"
android:layout_marginTop="10dp" android:layout_marginTop="5dp"
android:layout_marginBottom="5dp" android:layout_marginBottom="5dp"
android:lineSpacingExtra="4dp" android:lineSpacingExtra="4dp"
android:singleLine="false" android:singleLine="false"
......
<?xml version="1.0" encoding="utf-8"?>
<!-- for MiniActivity Use Only -->
<androidx.constraintlayout.widget.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:app="http://schemas.android.com/apk/res-auto"
xmlns:tools="http://schemas.android.com/tools"
android:layout_width="match_parent"
android:layout_height="match_parent"
app:layout_constraintLeft_toLeftOf="parent"
app:layout_constraintLeft_toRightOf="parent"
tools:context=".MainActivity">
<TextView
android:id="@+id/sample_text"
android:layout_width="0dp"
android:layout_height="wrap_content"
android:text="Hello World!"
app:layout_constraintLeft_toLeftOf="parent"
app:layout_constraintRight_toRightOf="parent"
app:layout_constraintTop_toBottomOf="@id/imageView"
android:scrollbars="vertical"
/>
<ImageView
android:id="@+id/imageView"
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:paddingTop="20dp"
android:paddingBottom="20dp"
app:layout_constraintBottom_toTopOf="@id/imageView"
app:layout_constraintLeft_toLeftOf="parent"
app:layout_constraintRight_toRightOf="parent"
app:layout_constraintTop_toTopOf="parent"
tools:srcCompat="@tools:sample/avatars" />
<Button
android:id="@+id/button"
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:layout_marginBottom="4dp"
android:text="Button"
app:layout_constraintBottom_toBottomOf="parent"
app:layout_constraintLeft_toLeftOf="parent"
app:layout_constraintRight_toRightOf="parent"
tools:layout_editor_absoluteX="161dp" />
</androidx.constraintlayout.widget.ConstraintLayout>
\ No newline at end of file
<?xml version="1.0" encoding="utf-8"?> <?xml version="1.0" encoding="utf-8"?>
<resources> <resources>
<string-array name="image_name_entries"> <string-array name="image_name_entries">
<item>0.jpg</item> <item>det_0.jpg</item>
<item>90.jpg</item> <item>det_90.jpg</item>
<item>180.jpg</item> <item>det_180.jpg</item>
<item>270.jpg</item> <item>det_270.jpg</item>
<item>rec_0.jpg</item>
<item>rec_0_180.jpg</item>
<item>rec_1.jpg</item>
<item>rec_1_180.jpg</item>
</string-array> </string-array>
<string-array name="image_name_values"> <string-array name="image_name_values">
<item>images/0.jpg</item> <item>images/det_0.jpg</item>
<item>images/90.jpg</item> <item>images/det_90.jpg</item>
<item>images/180.jpg</item> <item>images/det_180.jpg</item>
<item>images/270.jpg</item> <item>images/det_270.jpg</item>
<item>images/rec_0.jpg</item>
<item>images/rec_0_180.jpg</item>
<item>images/rec_1.jpg</item>
<item>images/rec_1_180.jpg</item>
</string-array> </string-array>
<string-array name="cpu_thread_num_entries"> <string-array name="cpu_thread_num_entries">
<item>1 threads</item> <item>1 threads</item>
...@@ -48,4 +56,12 @@ ...@@ -48,4 +56,12 @@
<item>BGR</item> <item>BGR</item>
<item>RGB</item> <item>RGB</item>
</string-array> </string-array>
<string-array name="run_Model">
<item>检测+分类+识别</item>
<item>检测+识别</item>
<item>分类+识别</item>
<item>检测</item>
<item>识别</item>
<item>分类</item>
</string-array>
</resources> </resources>
\ No newline at end of file
<resources> <resources>
<string name="app_name">OCR Chinese</string> <string name="app_name">PaddleOCR</string>
<string name="CHOOSE_PRE_INSTALLED_MODEL_KEY">CHOOSE_PRE_INSTALLED_MODEL_KEY</string> <string name="CHOOSE_PRE_INSTALLED_MODEL_KEY">CHOOSE_PRE_INSTALLED_MODEL_KEY</string>
<string name="ENABLE_CUSTOM_SETTINGS_KEY">ENABLE_CUSTOM_SETTINGS_KEY</string> <string name="ENABLE_CUSTOM_SETTINGS_KEY">ENABLE_CUSTOM_SETTINGS_KEY</string>
<string name="MODEL_PATH_KEY">MODEL_PATH_KEY</string> <string name="MODEL_PATH_KEY">MODEL_PATH_KEY</string>
...@@ -7,20 +7,14 @@ ...@@ -7,20 +7,14 @@
<string name="IMAGE_PATH_KEY">IMAGE_PATH_KEY</string> <string name="IMAGE_PATH_KEY">IMAGE_PATH_KEY</string>
<string name="CPU_THREAD_NUM_KEY">CPU_THREAD_NUM_KEY</string> <string name="CPU_THREAD_NUM_KEY">CPU_THREAD_NUM_KEY</string>
<string name="CPU_POWER_MODE_KEY">CPU_POWER_MODE_KEY</string> <string name="CPU_POWER_MODE_KEY">CPU_POWER_MODE_KEY</string>
<string name="INPUT_COLOR_FORMAT_KEY">INPUT_COLOR_FORMAT_KEY</string> <string name="DET_LONG_SIZE_KEY">DET_LONG_SIZE_KEY</string>
<string name="INPUT_SHAPE_KEY">INPUT_SHAPE_KEY</string>
<string name="INPUT_MEAN_KEY">INPUT_MEAN_KEY</string>
<string name="INPUT_STD_KEY">INPUT_STD_KEY</string>
<string name="SCORE_THRESHOLD_KEY">SCORE_THRESHOLD_KEY</string> <string name="SCORE_THRESHOLD_KEY">SCORE_THRESHOLD_KEY</string>
<string name="MODEL_PATH_DEFAULT">models/ocr_v2_for_cpu</string> <string name="MODEL_PATH_DEFAULT">models/ch_PP-OCRv2</string>
<string name="LABEL_PATH_DEFAULT">labels/ppocr_keys_v1.txt</string> <string name="LABEL_PATH_DEFAULT">labels/ppocr_keys_v1.txt</string>
<string name="IMAGE_PATH_DEFAULT">images/0.jpg</string> <string name="IMAGE_PATH_DEFAULT">images/det_0.jpg</string>
<string name="CPU_THREAD_NUM_DEFAULT">4</string> <string name="CPU_THREAD_NUM_DEFAULT">4</string>
<string name="CPU_POWER_MODE_DEFAULT">LITE_POWER_HIGH</string> <string name="CPU_POWER_MODE_DEFAULT">LITE_POWER_HIGH</string>
<string name="INPUT_COLOR_FORMAT_DEFAULT">BGR</string> <string name="DET_LONG_SIZE_DEFAULT">960</string>
<string name="INPUT_SHAPE_DEFAULT">1,3,960</string>
<string name="INPUT_MEAN_DEFAULT">0.485, 0.456, 0.406</string>
<string name="INPUT_STD_DEFAULT">0.229,0.224,0.225</string>
<string name="SCORE_THRESHOLD_DEFAULT">0.1</string> <string name="SCORE_THRESHOLD_DEFAULT">0.1</string>
</resources> </resources>
...@@ -47,26 +47,10 @@ ...@@ -47,26 +47,10 @@
android:entryValues="@array/cpu_power_mode_values"/> android:entryValues="@array/cpu_power_mode_values"/>
</PreferenceCategory> </PreferenceCategory>
<PreferenceCategory android:title="Input Settings"> <PreferenceCategory android:title="Input Settings">
<ListPreference
android:defaultValue="@string/INPUT_COLOR_FORMAT_DEFAULT"
android:key="@string/INPUT_COLOR_FORMAT_KEY"
android:negativeButtonText="@null"
android:positiveButtonText="@null"
android:title="Input Color Format: BGR or RGB"
android:entries="@array/input_color_format_entries"
android:entryValues="@array/input_color_format_values"/>
<EditTextPreference
android:key="@string/INPUT_SHAPE_KEY"
android:defaultValue="@string/INPUT_SHAPE_DEFAULT"
android:title="Input Shape: (1,1,max_width_height) or (1,3,max_width_height)" />
<EditTextPreference
android:key="@string/INPUT_MEAN_KEY"
android:defaultValue="@string/INPUT_MEAN_DEFAULT"
android:title="Input Mean: (channel/255-mean)/std" />
<EditTextPreference <EditTextPreference
android:key="@string/INPUT_STD_KEY" android:key="@string/DET_LONG_SIZE_KEY"
android:defaultValue="@string/INPUT_STD_DEFAULT" android:defaultValue="@string/DET_LONG_SIZE_DEFAULT"
android:title="Input Std: (channel/255-mean)/std" /> android:title="det long size" />
</PreferenceCategory> </PreferenceCategory>
<PreferenceCategory android:title="Output Settings"> <PreferenceCategory android:title="Output Settings">
<EditTextPreference <EditTextPreference
......
- [端侧部署](#端侧部署)
- [1. 准备环境](#1-准备环境)
- [运行准备](#运行准备)
- [1.1 准备交叉编译环境](#11-准备交叉编译环境)
- [1.2 准备预测库](#12-准备预测库)
- [2 开始运行](#2-开始运行)
- [2.1 模型优化](#21-模型优化)
- [2.2 与手机联调](#22-与手机联调)
- [注意:](#注意)
- [FAQ](#faq)
# 端侧部署 # 端侧部署
本教程将介绍基于[Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite) 在移动端部署PaddleOCR超轻量中文检测、识别模型的详细步骤。 本教程将介绍基于[Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite) 在移动端部署PaddleOCR超轻量中文检测、识别模型的详细步骤。
...@@ -26,17 +37,17 @@ Paddle Lite是飞桨轻量化推理引擎,为手机、IOT端提供高效推理 ...@@ -26,17 +37,17 @@ Paddle Lite是飞桨轻量化推理引擎,为手机、IOT端提供高效推理
| 平台 | 预测库下载链接 | | 平台 | 预测库下载链接 |
|---|---| |---|---|
|Android|[arm7](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.9/inference_lite_lib.android.armv7.gcc.c++_shared.with_extra.with_cv.tar.gz) / [arm8](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.9/inference_lite_lib.android.armv8.gcc.c++_shared.with_extra.with_cv.tar.gz)| |Android|[arm7](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10/inference_lite_lib.android.armv7.gcc.c++_shared.with_extra.with_cv.tar.gz) / [arm8](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10/inference_lite_lib.android.armv8.gcc.c++_shared.with_extra.with_cv.tar.gz)|
|IOS|[arm7](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.9/inference_lite_lib.ios.armv7.with_cv.with_extra.with_log.tiny_publish.tar.gz) / [arm8](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.9/inference_lite_lib.ios.armv8.with_cv.with_extra.with_log.tiny_publish.tar.gz)| |IOS|[arm7](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10/inference_lite_lib.ios.armv7.with_cv.with_extra.with_log.tiny_publish.tar.gz) / [arm8](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10/inference_lite_lib.ios.armv8.with_cv.with_extra.with_log.tiny_publish.tar.gz)|
注:1. 上述预测库为PaddleLite 2.9分支编译得到,有关PaddleLite 2.9 详细信息可参考 [链接](https://github.com/PaddlePaddle/Paddle-Lite/releases/tag/v2.9) 。 注:1. 上述预测库为PaddleLite 2.10分支编译得到,有关PaddleLite 2.10 详细信息可参考 [链接](https://github.com/PaddlePaddle/Paddle-Lite/releases/tag/v2.10) 。
- 2. [推荐]编译Paddle-Lite得到预测库,Paddle-Lite的编译方式如下: - 2. [推荐]编译Paddle-Lite得到预测库,Paddle-Lite的编译方式如下:
``` ```
git clone https://github.com/PaddlePaddle/Paddle-Lite.git git clone https://github.com/PaddlePaddle/Paddle-Lite.git
cd Paddle-Lite cd Paddle-Lite
# 切换到Paddle-Lite release/v2.9 稳定分支 # 切换到Paddle-Lite release/v2.10 稳定分支
git checkout release/v2.9 git checkout release/v2.10
./lite/tools/build_android.sh --arch=armv8 --with_cv=ON --with_extra=ON ./lite/tools/build_android.sh --arch=armv8 --with_cv=ON --with_extra=ON
``` ```
...@@ -85,8 +96,8 @@ Paddle-Lite 提供了多种策略来自动优化原始的模型,其中包括 ...@@ -85,8 +96,8 @@ Paddle-Lite 提供了多种策略来自动优化原始的模型,其中包括
|模型版本|模型简介|模型大小|检测模型|文本方向分类模型|识别模型|Paddle-Lite版本| |模型版本|模型简介|模型大小|检测模型|文本方向分类模型|识别模型|Paddle-Lite版本|
|---|---|---|---|---|---|---| |---|---|---|---|---|---|---|
|V2.0|超轻量中文OCR 移动端模型|7.8M|[下载地址](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_det_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_rec_opt.nb)|v2.9| |PP-OCRv2|蒸馏版超轻量中文OCR移动端模型|11M|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2_det_infer_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_ppocr_mobile_v2.0_cls_infer_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2_rec_infer_opt.nb)|v2.10|
|V2.0(slim)|超轻量中文OCR 移动端模型|3.3M|[下载地址](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_det_slim_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_slim_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_rec_slim_opt.nb)|v2.9| |PP-OCRv2(slim)|蒸馏版超轻量中文OCR移动端模型|4.6M|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2_det_slim_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_ppocr_mobile_v2.0_cls_slim_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2_rec_slim_opt.nb)|v2.10|
如果直接使用上述表格中的模型进行部署,可略过下述步骤,直接阅读 [2.2节](#2.2与手机联调) 如果直接使用上述表格中的模型进行部署,可略过下述步骤,直接阅读 [2.2节](#2.2与手机联调)
...@@ -97,7 +108,7 @@ Paddle-Lite 提供了多种策略来自动优化原始的模型,其中包括 ...@@ -97,7 +108,7 @@ Paddle-Lite 提供了多种策略来自动优化原始的模型,其中包括
# 如果准备环境时已经clone了Paddle-Lite,则不用重新clone Paddle-Lite # 如果准备环境时已经clone了Paddle-Lite,则不用重新clone Paddle-Lite
git clone https://github.com/PaddlePaddle/Paddle-Lite.git git clone https://github.com/PaddlePaddle/Paddle-Lite.git
cd Paddle-Lite cd Paddle-Lite
git checkout release/v2.9 git checkout release/v2.10
# 启动编译 # 启动编译
./lite/tools/build.sh build_optimize_tool ./lite/tools/build.sh build_optimize_tool
``` ```
...@@ -123,15 +134,15 @@ cd build.opt/lite/api/ ...@@ -123,15 +134,15 @@ cd build.opt/lite/api/
下面以PaddleOCR的超轻量中文模型为例,介绍使用编译好的opt文件完成inference模型到Paddle-Lite优化模型的转换。 下面以PaddleOCR的超轻量中文模型为例,介绍使用编译好的opt文件完成inference模型到Paddle-Lite优化模型的转换。
``` ```
# 【推荐】 下载PaddleOCR V2.0版本的中英文 inference模型 # 【推荐】 下载 PP-OCRv2版本的中英文 inference模型
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/slim/ch_ppocr_mobile_v2.0_det_slim_infer.tar && tar xf ch_ppocr_mobile_v2.0_det_slim_infer.tar wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_slim_quant_infer.tar && tar xf ch_PP-OCRv2_det_slim_quant_infer.tar
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/slim/ch_ppocr_mobile_v2.0_rec_slim_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_slim_infer.tar wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_quant_infer.tar && tar xf ch_PP-OCRv2_rec_slim_quant_infer.tar
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/slim/ch_ppocr_mobile_v2.0_cls_slim_infer.tar && tar xf ch_ppocr_mobile_v2.0_cls_slim_infer.tar wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/slim/ch_ppocr_mobile_v2.0_cls_slim_infer.tar && tar xf ch_ppocr_mobile_v2.0_cls_slim_infer.tar
# 转换V2.0检测模型 # 转换检测模型
./opt --model_file=./ch_ppocr_mobile_v2.0_det_slim_infer/inference.pdmodel --param_file=./ch_ppocr_mobile_v2.0_det_slim_infer/inference.pdiparams --optimize_out=./ch_ppocr_mobile_v2.0_det_slim_opt --valid_targets=arm --optimize_out_type=naive_buffer ./opt --model_file=./ch_PP-OCRv2_det_slim_quant_infer/inference.pdmodel --param_file=./ch_PP-OCRv2_det_slim_quant_infer/inference.pdiparams --optimize_out=./ch_PP-OCRv2_det_slim_opt --valid_targets=arm --optimize_out_type=naive_buffer
# 转换V2.0识别模型 # 转换识别模型
./opt --model_file=./ch_ppocr_mobile_v2.0_rec_slim_infer/inference.pdmodel --param_file=./ch_ppocr_mobile_v2.0_rec_slim_infer/inference.pdiparams --optimize_out=./ch_ppocr_mobile_v2.0_rec_slim_opt --valid_targets=arm --optimize_out_type=naive_buffer ./opt --model_file=./ch_PP-OCRv2_rec_slim_quant_infer/inference.pdmodel --param_file=./ch_PP-OCRv2_rec_slim_quant_infer/inference.pdiparams --optimize_out=./ch_PP-OCRv2_rec_slim_opt --valid_targets=arm --optimize_out_type=naive_buffer
# 转换V2.0方向分类器模型 # 转换方向分类器模型
./opt --model_file=./ch_ppocr_mobile_v2.0_cls_slim_infer/inference.pdmodel --param_file=./ch_ppocr_mobile_v2.0_cls_slim_infer/inference.pdiparams --optimize_out=./ch_ppocr_mobile_v2.0_cls_slim_opt --valid_targets=arm --optimize_out_type=naive_buffer ./opt --model_file=./ch_ppocr_mobile_v2.0_cls_slim_infer/inference.pdmodel --param_file=./ch_ppocr_mobile_v2.0_cls_slim_infer/inference.pdiparams --optimize_out=./ch_ppocr_mobile_v2.0_cls_slim_opt --valid_targets=arm --optimize_out_type=naive_buffer
``` ```
...@@ -186,15 +197,15 @@ wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/slim/ch_ppocr_mobile_v2.0_cls ...@@ -186,15 +197,15 @@ wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/slim/ch_ppocr_mobile_v2.0_cls
``` ```
准备测试图像,以`PaddleOCR/doc/imgs/11.jpg`为例,将测试的图像复制到`demo/cxx/ocr/debug/`文件夹下。 准备测试图像,以`PaddleOCR/doc/imgs/11.jpg`为例,将测试的图像复制到`demo/cxx/ocr/debug/`文件夹下。
准备lite opt工具优化后的模型文件,比如使用`ch_ppocr_mobile_v2.0_det_slim_opt.nb,ch_ppocr_mobile_v2.0_rec_slim_opt.nb, ch_ppocr_mobile_v2.0_cls_slim_opt.nb`,模型文件放置在`demo/cxx/ocr/debug/`文件夹下。 准备lite opt工具优化后的模型文件,比如使用`ch_PP-OCRv2_det_slim_opt.ch_PP-OCRv2_rec_slim_rec.nb, ch_ppocr_mobile_v2.0_cls_slim_opt.nb`,模型文件放置在`demo/cxx/ocr/debug/`文件夹下。
执行完成后,ocr文件夹下将有如下文件格式: 执行完成后,ocr文件夹下将有如下文件格式:
``` ```
demo/cxx/ocr/ demo/cxx/ocr/
|-- debug/ |-- debug/
| |--ch_ppocr_mobile_v2.0_det_slim_opt.nb 优化后的检测模型文件 | |--ch_PP-OCRv2_det_slim_opt.nb 优化后的检测模型文件
| |--ch_ppocr_mobile_v2.0_rec_slim_opt.nb 优化后的识别模型文件 | |--ch_PP-OCRv2_rec_slim_opt.nb 优化后的识别模型文件
| |--ch_ppocr_mobile_v2.0_cls_slim_opt.nb 优化后的文字方向分类器模型文件 | |--ch_ppocr_mobile_v2.0_cls_slim_opt.nb 优化后的文字方向分类器模型文件
| |--11.jpg 待测试图像 | |--11.jpg 待测试图像
| |--ppocr_keys_v1.txt 中文字典文件 | |--ppocr_keys_v1.txt 中文字典文件
...@@ -250,7 +261,7 @@ use_direction_classify 0 # 是否使用方向分类器,0表示不使用,1 ...@@ -250,7 +261,7 @@ use_direction_classify 0 # 是否使用方向分类器,0表示不使用,1
export LD_LIBRARY_PATH=${PWD}:$LD_LIBRARY_PATH export LD_LIBRARY_PATH=${PWD}:$LD_LIBRARY_PATH
# 开始使用,ocr_db_crnn可执行文件的使用方式为: # 开始使用,ocr_db_crnn可执行文件的使用方式为:
# ./ocr_db_crnn 检测模型文件 方向分类器模型文件 识别模型文件 测试图像路径 字典文件路径 # ./ocr_db_crnn 检测模型文件 方向分类器模型文件 识别模型文件 测试图像路径 字典文件路径
./ocr_db_crnn ch_ppocr_mobile_v2.0_det_slim_opt.nb ch_ppocr_mobile_v2.0_rec_slim_opt.nb ch_ppocr_mobile_v2.0_cls_slim_opt.nb ./11.jpg ppocr_keys_v1.txt ./ocr_db_crnn ch_PP-OCRv2_det_slim_opt.nb ch_PP-OCRv2_rec_slim_opt.nb ch_ppocr_mobile_v2.0_cls_slim_opt.nb ./11.jpg ppocr_keys_v1.txt
``` ```
如果对代码做了修改,则需要重新编译并push到手机上。 如果对代码做了修改,则需要重新编译并push到手机上。
......
- [Tutorial of PaddleOCR Mobile deployment](#tutorial-of-paddleocr-mobile-deployment)
- [1. Preparation](#1-preparation)
- [Preparation environment](#preparation-environment)
- [1.1 Prepare the cross-compilation environment](#11-prepare-the-cross-compilation-environment)
- [1.2 Prepare Paddle-Lite library](#12-prepare-paddle-lite-library)
- [2 Run](#2-run)
- [2.1 Inference Model Optimization](#21-inference-model-optimization)
- [2.2 Run optimized model on Phone](#22-run-optimized-model-on-phone)
- [注意:](#注意)
- [FAQ](#faq)
# Tutorial of PaddleOCR Mobile deployment # Tutorial of PaddleOCR Mobile deployment
This tutorial will introduce how to use [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite) to deploy PaddleOCR ultra-lightweight Chinese and English detection models on mobile phones. This tutorial will introduce how to use [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite) to deploy PaddleOCR ultra-lightweight Chinese and English detection models on mobile phones.
...@@ -28,17 +39,17 @@ There are two ways to obtain the Paddle-Lite library: ...@@ -28,17 +39,17 @@ There are two ways to obtain the Paddle-Lite library:
| Platform | Paddle-Lite library download link | | Platform | Paddle-Lite library download link |
|---|---| |---|---|
|Android|[arm7](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.9/inference_lite_lib.android.armv7.gcc.c++_shared.with_extra.with_cv.tar.gz) / [arm8](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.9/inference_lite_lib.android.armv8.gcc.c++_shared.with_extra.with_cv.tar.gz)| |Android|[arm7](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10/inference_lite_lib.android.armv7.gcc.c++_shared.with_extra.with_cv.tar.gz) / [arm8](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10/inference_lite_lib.android.armv8.gcc.c++_shared.with_extra.with_cv.tar.gz)|
|IOS|[arm7](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.9/inference_lite_lib.ios.armv7.with_cv.with_extra.with_log.tiny_publish.tar.gz) / [arm8](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.9/inference_lite_lib.ios.armv8.with_cv.with_extra.with_log.tiny_publish.tar.gz)| |IOS|[arm7](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10/inference_lite_lib.ios.armv7.with_cv.with_extra.with_log.tiny_publish.tar.gz) / [arm8](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10/inference_lite_lib.ios.armv8.with_cv.with_extra.with_log.tiny_publish.tar.gz)|
Note: 1. The above Paddle-Lite library is compiled from the Paddle-Lite 2.9 branch. For more information about Paddle-Lite 2.9, please refer to [link](https://github.com/PaddlePaddle/Paddle-Lite/releases/tag/v2.9). Note: 1. The above Paddle-Lite library is compiled from the Paddle-Lite 2.10 branch. For more information about Paddle-Lite 2.10, please refer to [link](https://github.com/PaddlePaddle/Paddle-Lite/releases/tag/v2.10).
- 2. [Recommended] Compile Paddle-Lite to get the prediction library. The compilation method of Paddle-Lite is as follows: - 2. [Recommended] Compile Paddle-Lite to get the prediction library. The compilation method of Paddle-Lite is as follows:
``` ```
git clone https://github.com/PaddlePaddle/Paddle-Lite.git git clone https://github.com/PaddlePaddle/Paddle-Lite.git
cd Paddle-Lite cd Paddle-Lite
# Switch to Paddle-Lite release/v2.8 stable branch # Switch to Paddle-Lite release/v2.10 stable branch
git checkout release/v2.8 git checkout release/v2.10
./lite/tools/build_android.sh --arch=armv8 --with_cv=ON --with_extra=ON ./lite/tools/build_android.sh --arch=armv8 --with_cv=ON --with_extra=ON
``` ```
...@@ -87,10 +98,10 @@ The following table also provides a series of models that can be deployed on mob ...@@ -87,10 +98,10 @@ The following table also provides a series of models that can be deployed on mob
|Version|Introduction|Model size|Detection model|Text Direction model|Recognition model|Paddle-Lite branch| |Version|Introduction|Model size|Detection model|Text Direction model|Recognition model|Paddle-Lite branch|
|---|---|---|---|---|---|---| |---|---|---|---|---|---|---|
|V2.0|extra-lightweight chinese OCR optimized model|7.8M|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_det_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_rec_opt.nb)|v2.9| |PP-OCRv2|extra-lightweight chinese OCR optimized model|11M|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2_det_infer_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_ppocr_mobile_v2.0_cls_infer_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2_rec_infer_opt.nb)|v2.10|
|V2.0(slim)|extra-lightweight chinese OCR optimized model|3.3M|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_det_slim_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_slim_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_rec_slim_opt.nb)|v2.9| |PP-OCRv2(slim)|extra-lightweight chinese OCR optimized model|4.6M|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2_det_slim_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_ppocr_mobile_v2.0_cls_slim_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2_rec_slim_opt.nb)|v2.10|
If you directly use the model in the above table for deployment, you can skip the following steps and directly read [Section 2.2](#2.2 Run optimized model on Phone). If you directly use the model in the above table for deployment, you can skip the following steps and directly read [Section 2.2](#2.2-Run-optimized-model-on-Phone).
If the model to be deployed is not in the above table, you need to follow the steps below to obtain the optimized model. If the model to be deployed is not in the above table, you need to follow the steps below to obtain the optimized model.
...@@ -98,7 +109,7 @@ The `opt` tool can be obtained by compiling Paddle Lite. ...@@ -98,7 +109,7 @@ The `opt` tool can be obtained by compiling Paddle Lite.
``` ```
git clone https://github.com/PaddlePaddle/Paddle-Lite.git git clone https://github.com/PaddlePaddle/Paddle-Lite.git
cd Paddle-Lite cd Paddle-Lite
git checkout release/v2.9 git checkout release/v2.10
./lite/tools/build.sh build_optimize_tool ./lite/tools/build.sh build_optimize_tool
``` ```
...@@ -124,22 +135,22 @@ cd build.opt/lite/api/ ...@@ -124,22 +135,22 @@ cd build.opt/lite/api/
The following takes the ultra-lightweight Chinese model of PaddleOCR as an example to introduce the use of the compiled opt file to complete the conversion of the inference model to the Paddle-Lite optimized model The following takes the ultra-lightweight Chinese model of PaddleOCR as an example to introduce the use of the compiled opt file to complete the conversion of the inference model to the Paddle-Lite optimized model
``` ```
# [Recommendation] Download the Chinese and English inference model of PaddleOCR V2.0 # 【[Recommendation] Download the Chinese and English inference model of PP-OCRv2
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/slim/ch_ppocr_mobile_v2.0_det_slim_infer.tar && tar xf ch_ppocr_mobile_v2.0_det_slim_infer.tar wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_slim_quant_infer.tar && tar xf ch_PP-OCRv2_det_slim_quant_infer.tar
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/slim/ch_ppocr_mobile_v2.0_rec_slim_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_slim_infer.tar wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_quant_infer.tar && tar xf ch_PP-OCRv2_rec_slim_quant_infer.tar
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/slim/ch_ppocr_mobile_v2.0_cls_slim_infer.tar && tar xf ch_ppocr_mobile_v2.0_cls_slim_infer.tar wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/slim/ch_ppocr_mobile_v2.0_cls_slim_infer.tar && tar xf ch_ppocr_mobile_v2.0_cls_slim_infer.tar
# Convert V2.0 detection model # Convert detection model
./opt --model_file=./ch_ppocr_mobile_v2.0_det_slim_infer/inference.pdmodel --param_file=./ch_ppocr_mobile_v2.0_det_slim_infer/inference.pdiparams --optimize_out=./ch_ppocr_mobile_v2.0_det_slim_opt --valid_targets=arm --optimize_out_type=naive_buffer ./opt --model_file=./ch_PP-OCRv2_det_slim_quant_infer/inference.pdmodel --param_file=./ch_PP-OCRv2_det_slim_quant_infer/inference.pdiparams --optimize_out=./ch_PP-OCRv2_det_slim_opt --valid_targets=arm --optimize_out_type=naive_buffer
# Convert V2.0 recognition model # Convert recognition model
./opt --model_file=./ch_ppocr_mobile_v2.0_rec_slim_infer/inference.pdmodel --param_file=./ch_ppocr_mobile_v2.0_rec_slim_infer/inference.pdiparams --optimize_out=./ch_ppocr_mobile_v2.0_rec_slim_opt --valid_targets=arm --optimize_out_type=naive_buffer ./opt --model_file=./ch_PP-OCRv2_rec_slim_quant_infer/inference.pdmodel --param_file=./ch_PP-OCRv2_rec_slim_quant_infer/inference.pdiparams --optimize_out=./ch_PP-OCRv2_rec_slim_opt --valid_targets=arm --optimize_out_type=naive_buffer
# Convert V2.0 angle classifier model # Convert angle classifier model
./opt --model_file=./ch_ppocr_mobile_v2.0_cls_slim_infer/inference.pdmodel --param_file=./ch_ppocr_mobile_v2.0_cls_slim_infer/inference.pdiparams --optimize_out=./ch_ppocr_mobile_v2.0_cls_slim_opt --valid_targets=arm --optimize_out_type=naive_buffer ./opt --model_file=./ch_ppocr_mobile_v2.0_cls_slim_infer/inference.pdmodel --param_file=./ch_ppocr_mobile_v2.0_cls_slim_infer/inference.pdiparams --optimize_out=./ch_ppocr_mobile_v2.0_cls_slim_opt --valid_targets=arm --optimize_out_type=naive_buffer
``` ```
After the conversion is successful, there will be more files ending with `.nb` in the inference model directory, which is the successfully converted model file. After the conversion is successful, there will be more files ending with `.nb` in the inference model directory, which is the successfully converted model file.
<a name="2.2 Run optimized model on Phone"></a> <a name="2.2-Run-optimized-model-on-Phone"></a>
### 2.2 Run optimized model on Phone ### 2.2 Run optimized model on Phone
Some preparatory work is required first. Some preparatory work is required first.
...@@ -194,8 +205,8 @@ The structure of the OCR demo is as follows after the above command is executed: ...@@ -194,8 +205,8 @@ The structure of the OCR demo is as follows after the above command is executed:
``` ```
demo/cxx/ocr/ demo/cxx/ocr/
|-- debug/ |-- debug/
| |--ch_ppocr_mobile_v2.0_det_slim_opt.nb Detection model | |--ch_PP-OCRv2_det_slim_opt.nb Detection model
| |--ch_ppocr_mobile_v2.0_rec_slim_opt.nb Recognition model | |--ch_PP-OCRv2_rec_slim_opt.nb Recognition model
| |--ch_ppocr_mobile_v2.0_cls_slim_opt.nb Text direction classification model | |--ch_ppocr_mobile_v2.0_cls_slim_opt.nb Text direction classification model
| |--11.jpg Image for OCR | |--11.jpg Image for OCR
| |--ppocr_keys_v1.txt Dictionary file | |--ppocr_keys_v1.txt Dictionary file
...@@ -249,7 +260,7 @@ After the above steps are completed, you can use adb to push the file to the pho ...@@ -249,7 +260,7 @@ After the above steps are completed, you can use adb to push the file to the pho
export LD_LIBRARY_PATH=${PWD}:$LD_LIBRARY_PATH export LD_LIBRARY_PATH=${PWD}:$LD_LIBRARY_PATH
# The use of ocr_db_crnn is: # The use of ocr_db_crnn is:
# ./ocr_db_crnn Detection model file Orientation classifier model file Recognition model file Test image path Dictionary file path # ./ocr_db_crnn Detection model file Orientation classifier model file Recognition model file Test image path Dictionary file path
./ocr_db_crnn ch_ppocr_mobile_v2.0_det_opt.nb ch_ppocr_mobile_v2.0_rec_opt.nb ch_ppocr_mobile_v2.0_cls_opt.nb ./11.jpg ppocr_keys_v1.txt ./ocr_db_crnn ch_PP-OCRv2_det_slim_opt.nb ch_PP-OCRv2_rec_slim_opt.nb ch_ppocr_mobile_v2.0_cls_opt.nb ./11.jpg ppocr_keys_v1.txt
``` ```
If you modify the code, you need to recompile and push to the phone. If you modify the code, you need to recompile and push to the phone.
......
...@@ -6,13 +6,14 @@ ...@@ -6,13 +6,14 @@
> 3. 本文档提供的是PPOCR自研模型列表,更多基于公开数据集的算法介绍与预训练模型可以参考:[算法概览文档](./algorithm_overview.md)。 > 3. 本文档提供的是PPOCR自研模型列表,更多基于公开数据集的算法介绍与预训练模型可以参考:[算法概览文档](./algorithm_overview.md)。
- [1. 文本检测模型](#文本检测模型) - [PP-OCR系列模型列表(V2.1,2021年9月6日更新)](#pp-ocr系列模型列表v212021年9月6日更新)
- [2. 文本识别模型](#文本识别模型) - [1. 文本检测模型](#1-文本检测模型)
- [2.1 中文识别模型](#中文识别模型) - [2. 文本识别模型](#2-文本识别模型)
- [2.2 英文识别模型](#英文识别模型) - [2.1 中文识别模型](#21-中文识别模型)
- [2.3 多语言识别模型](#多语言识别模型) - [2.2 英文识别模型](#22-英文识别模型)
- [3. 文本方向分类模型](#文本方向分类模型) - [2.3 多语言识别模型(更多语言持续更新中...)](#23-多语言识别模型更多语言持续更新中)
- [4. Paddle-Lite 模型](#Paddle-Lite模型) - [3. 文本方向分类模型](#3-文本方向分类模型)
- [4. Paddle-Lite 模型](#4-paddle-lite-模型)
PaddleOCR提供的可下载模型包括`推理模型``训练模型``预训练模型``slim模型`,模型区别说明如下: PaddleOCR提供的可下载模型包括`推理模型``训练模型``预训练模型``slim模型`,模型区别说明如下:
...@@ -100,6 +101,8 @@ PaddleOCR提供的可下载模型包括`推理模型`、`训练模型`、`预训 ...@@ -100,6 +101,8 @@ PaddleOCR提供的可下载模型包括`推理模型`、`训练模型`、`预训
|模型版本|模型简介|模型大小|检测模型|文本方向分类模型|识别模型|Paddle-Lite版本| |模型版本|模型简介|模型大小|检测模型|文本方向分类模型|识别模型|Paddle-Lite版本|
|---|---|---|---|---|---|---| |---|---|---|---|---|---|---|
|PP-OCRv2|蒸馏版超轻量中文OCR移动端模型|11M|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2_det_infer_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_ppocr_mobile_v2.0_cls_infer_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2_rec_infer_opt.nb)|v2.10|
|PP-OCRv2(slim)|蒸馏版超轻量中文OCR移动端模型|4.6M|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2_det_slim_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_ppocr_mobile_v2.0_cls_slim_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2_rec_slim_opt.nb)|v2.10|
|PP-OCRv2|蒸馏版超轻量中文OCR移动端模型|11M|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer_opt.nb)|v2.9| |PP-OCRv2|蒸馏版超轻量中文OCR移动端模型|11M|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer_opt.nb)|v2.9|
|PP-OCRv2(slim)|蒸馏版超轻量中文OCR移动端模型|4.9M|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_slim_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_slim_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_opt.nb)|v2.9| |PP-OCRv2(slim)|蒸馏版超轻量中文OCR移动端模型|4.9M|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_slim_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_slim_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_opt.nb)|v2.9|
|V2.0|ppocr_v2.0超轻量中文OCR移动端模型|7.8M|[下载地址](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_det_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_rec_opt.nb)|v2.9| |V2.0|ppocr_v2.0超轻量中文OCR移动端模型|7.8M|[下载地址](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_det_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_opt.nb)|[下载地址](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_rec_opt.nb)|v2.9|
......
...@@ -94,6 +94,8 @@ For more supported languages, please refer to : [Multi-language model](./multi_l ...@@ -94,6 +94,8 @@ For more supported languages, please refer to : [Multi-language model](./multi_l
## 4. Paddle-Lite Model ## 4. Paddle-Lite Model
|Version|Introduction|Model size|Detection model|Text Direction model|Recognition model|Paddle-Lite branch| |Version|Introduction|Model size|Detection model|Text Direction model|Recognition model|Paddle-Lite branch|
|---|---|---|---|---|---|---| |---|---|---|---|---|---|---|
|PP-OCRv2|extra-lightweight chinese OCR optimized model|11M|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2_det_infer_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_ppocr_mobile_v2.0_cls_infer_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2_rec_infer_opt.nb)|v2.10|
|PP-OCRv2(slim)|extra-lightweight chinese OCR optimized model|4.6M|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2_det_slim_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_ppocr_mobile_v2.0_cls_slim_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/lite/ch_PP-OCRv2_rec_slim_opt.nb)|v2.10|
|PP-OCRv2|extra-lightweight chinese OCR optimized model|11M|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer_opt.nb)|v2.9| |PP-OCRv2|extra-lightweight chinese OCR optimized model|11M|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer_opt.nb)|v2.9|
|PP-OCRv2(slim)|extra-lightweight chinese OCR optimized model|4.9M|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_slim_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_slim_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_opt.nb)|v2.9| |PP-OCRv2(slim)|extra-lightweight chinese OCR optimized model|4.9M|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_slim_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_slim_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_opt.nb)|v2.9|
|V2.0|ppocr_v2.0 extra-lightweight chinese OCR optimized model|7.8M|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_det_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_rec_opt.nb)|v2.9| |V2.0|ppocr_v2.0 extra-lightweight chinese OCR optimized model|7.8M|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_det_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_cls_opt.nb)|[download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/lite/ch_ppocr_mobile_v2.0_rec_opt.nb)|v2.9|
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册