Unverified commit a6189b12, authored by Guanghua Yu, committed by GitHub

update picodet ncnn and mnn demo (#5721)

Parent 23f982e0
...@@ -226,11 +226,16 @@ paddle2onnx --model_dir output_inference/picodet_s_320_coco_lcnet/ \ ...@@ -226,11 +226,16 @@ paddle2onnx --model_dir output_inference/picodet_s_320_coco_lcnet/ \
### Deploy ### Deploy
- OpenVINO demo [Python](../../deploy/third_engine/demo_openvino/python) | Inference Library | Python | C++ | Prediction with Postprocessing |
- [PaddleLite C++ demo](../../deploy/lite) | :-------- | :--------: | :---------------------: | :----------------: |
- [Android demo(Paddle Lite)](https://github.com/PaddlePaddle/Paddle-Lite-Demo/tree/develop/object_detection/android/app/cxx/picodet_detection_demo) | OpenVINO | [Python](../../deploy/third_engine/demo_openvino/python) | [C++](../../deploy/third_engine/demo_openvino)(postprocessing in development) | ✔︎ |
- ONNXRuntime demo [Python](../../deploy/third_engine/demo_onnxruntime) | Paddle Lite | - | [C++](../../deploy/lite) | ✔︎ |
- PaddleInference demo [Python](../../deploy/python) & [C++](../../deploy/cpp) | Android Demo | - | [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite-Demo/tree/develop/object_detection/android/app/cxx/picodet_detection_demo) | ✔︎ |
| PaddleInference | [Python](../../deploy/python) | [C++](../../deploy/cpp) | ✔︎ |
| ONNXRuntime | [Python](../../deploy/third_engine/demo_onnxruntime) | Coming soon | ✔︎ |
| NCNN | Coming soon | [C++](../../deploy/third_engine/demo_ncnn) | ✘ |
| MNN | Coming soon | [C++](../../deploy/third_engine/demo_mnn) | ✘ |
Android demo visualization: Android demo visualization:
......
...@@ -222,11 +222,15 @@ paddle2onnx --model_dir output_inference/picodet_s_320_coco_lcnet/ \ ...@@ -222,11 +222,15 @@ paddle2onnx --model_dir output_inference/picodet_s_320_coco_lcnet/ \
### Deploy ### Deploy
- OpenVINO demo [Python](../../deploy/third_engine/demo_openvino/python) | Infer Engine | Python | C++ | Predict With Postprocess |
- [PaddleLite C++ demo](../../deploy/lite) | :-------- | :--------: | :---------------------: | :----------------: |
- [Android demo(Paddle Lite)](https://github.com/PaddlePaddle/Paddle-Lite-Demo/tree/develop/object_detection/android/app/cxx/picodet_detection_demo) | OpenVINO | [Python](../../deploy/third_engine/demo_openvino/python) | [C++](../../deploy/third_engine/demo_openvino)(postprocess coming soon) | ✔︎ |
- ONNXRuntime demo [Python](../../deploy/third_engine/demo_onnxruntime) | Paddle Lite | - | [C++](../../deploy/lite) | ✔︎ |
- PaddleInference demo [Python](../../deploy/python) & [C++](../../deploy/cpp) | Android Demo | - | [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite-Demo/tree/develop/object_detection/android/app/cxx/picodet_detection_demo) | ✔︎ |
| PaddleInference | [Python](../../deploy/python) | [C++](../../deploy/cpp) | ✔︎ |
| ONNXRuntime | [Python](../../deploy/third_engine/demo_onnxruntime) | Coming soon | ✔︎ |
| NCNN | Coming soon | [C++](../../deploy/third_engine/demo_ncnn) | ✘ |
| MNN | Coming soon | [C++](../../deploy/third_engine/demo_mnn) | ✘ |
Android demo visualization: Android demo visualization:
......
...@@ -2,13 +2,14 @@ cmake_minimum_required(VERSION 3.9) ...@@ -2,13 +2,14 @@ cmake_minimum_required(VERSION 3.9)
project(picodet-mnn) project(picodet-mnn)
set(CMAKE_CXX_STANDARD 17) set(CMAKE_CXX_STANDARD 17)
set(MNN_DIR "./mnn")
# find_package(OpenCV REQUIRED PATHS "/work/dependence/opencv/opencv-3.4.3/build") # find_package(OpenCV REQUIRED PATHS "/work/dependence/opencv/opencv-3.4.3/build")
find_package(OpenCV REQUIRED) find_package(OpenCV REQUIRED)
include_directories( include_directories(
/path/to/MNN/include/MNN ${MNN_DIR}/include
/path/to/MNN/include ${MNN_DIR}/include/MNN
. ${CMAKE_SOURCE_DIR}
) )
link_directories(mnn/lib) link_directories(mnn/lib)
......
# PicoDet MNN Demo # PicoDet MNN Demo
This fold provides PicoDet inference code using This demo provides inference code built on the [Alibaba's MNN framework](https://github.com/alibaba/MNN) inference library.
[Alibaba's MNN framework](https://github.com/alibaba/MNN). Most of the implements in
this fold are same as *demo_ncnn*.
## Install MNN ## C++ Demo
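### Python library - Step 1: Build the inference library by following the [official MNN build guide](https://www.yuque.com/mnn/en/build_linux).
- Step 2: Build or download the OpenCV library (see the OpenCV website). For convenience, if your environment is gcc 8.2 on x86, you can download the following prebuilt library directly: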
Just run: ```shell
wget https://paddledet.bj.bcebos.com/data/opencv-3.4.16_gcc8.2_ffmpeg.tar.gz
``` shell tar -xf opencv-3.4.16_gcc8.2_ffmpeg.tar.gz
pip install MNN
``` ```
### C++ library - Step 3: Prepare the model
Please follow the [official document](https://www.yuque.com/mnn/en/build_linux) to build MNN engine.
- Create picodet_m_416_coco.onnx
```shell ```shell
modelName=picodet_m_416_coco modelName=picodet_s_320_coco_lcnet
# export model # export the inference model
python tools/export_model.py \ python tools/export_model.py \
-c configs/picodet/${modelName}.yml \ -c configs/picodet/${modelName}.yml \
-o weights=${modelName}.pdparams \ -o weights=${modelName}.pdparams \
--output_dir=inference_model --output_dir=inference_model
# convert to onnx # convert to ONNX
paddle2onnx --model_dir inference_model/${modelName} \ paddle2onnx --model_dir inference_model/${modelName} \
--model_filename model.pdmodel \ --model_filename model.pdmodel \
--params_filename model.pdiparams \ --params_filename model.pdiparams \
--opset_version 11 \ --opset_version 11 \
--save_file ${modelName}.onnx --save_file ${modelName}.onnx
# onnxsim # simplify the model
python -m onnxsim ${modelName}.onnx ${modelName}_processed.onnx python -m onnxsim ${modelName}.onnx ${modelName}_processed.onnx
# convert the model to MNN format
python -m MNN.tools.mnnconvert -f ONNX --modelFile ${modelName}_processed.onnx --MNNModel picodet_s_320_lcnet.mnn
``` ```
For a quick test, you can download [picodet_s_320_lcnet.mnn](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_lcnet.mnn) directly (without postprocessing).
- Convert model **Note:** MNN's Matmul operator computes incorrectly when its input shapes differ, so the demo with postprocessing is still being upgraded and will be released soon.
``` shell
python -m MNN.tools.mnnconvert -f ONNX --modelFile picodet-416.onnx --MNNModel picodet-416.mnn
```
Here are converted model [download link](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416.mnn).
## Build ## Build the executable
The python code *demo_mnn.py* can run directly and independently without main PicoDet repo.
`PicoDetONNX` and `PicoDetTorch` are two classes used to check the similarity of MNN inference results
with ONNX model and Pytorch model. They can be remove with no side effects.
For C++ code, replace `libMNN.so` under *./mnn/lib* with the one you just compiled, modify OpenCV path and MNN path at CMake file,
and run
- Step 1: Import the library files
```
mkdir -p mnn/lib
cp /path/to/MNN/build/libMNN.so mnn/lib/
cp -r /path/to/MNN/include mnn/
```
- Step 2: Set the OpenCV and MNN paths in CMakeLists.txt
- Step 3: Build
``` shell ``` shell
mkdir build && cd build mkdir build && cd build
cmake .. cmake ..
make make
``` ```
If the `picodet-mnn` executable is generated in the build directory, the build succeeded.
Note that a flag at `main.cpp` is used to control whether to show the detection result or save it into a fold. ## Run
``` c++
#define __SAVE_RESULT__ // if defined save drawed results to ../results, else show it in windows
```
## Run
### Python
`demo_mnn.py` provide an inference class `PicoDetMNN` that combines preprocess, post process, visualization.
Besides it can be used in command line with the form:
First, create a directory for the prediction results:
```shell ```shell
demo_mnn.py [-h] [--model_path MODEL_PATH] [--cfg_path CFG_PATH] cp -r ../demo_onnxruntime/imgs .
[--img_fold IMG_FOLD] [--result_fold RESULT_FOLD] cd build
[--input_shape INPUT_SHAPE INPUT_SHAPE] mkdir ../results
[--backend {MNN,ONNX,torch}]
``` ```
For example: - Run inference on a single image
``` shell ``` shell
# run MNN 416 model ./picodet-mnn 0 ../picodet_s_320_lcnet_3.mnn 320 320 ../imgs/dog.jpg
python ./demo_mnn.py --model_path ../model/picodet-416.mnn --img_fold ../imgs --result_fold ../results
# run MNN 320 model
python ./demo_mnn.py --model_path ../model/picodet-320.mnn --input_shape 320 320 --backend MNN
# run onnx model
python ./demo_mnn.py --model_path ../model/sim.onnx --backend ONNX
``` ```
### C++ - Speed benchmark
C++ inference interface is same with NCNN code, to detect images in a fold, run:
``` shell ``` shell
./picodet-mnn "1" "../imgs/test.jpg" ./picodet-mnn 1 ../picodet_s_320_lcnet.mnn 320 320
``` ```
For speed benchmark ## FAQ
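``` shell - Prediction accuracy is wrong:
./picodet-mnn "3" "0" First check that the model input shape matches and that the model output names match. The output names of the enhanced PicoDet model without postprocessing are:
```shell
# classification branch | detection branch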
{"transpose_0.tmp_0", "transpose_1.tmp_0"},
{"transpose_2.tmp_0", "transpose_3.tmp_0"},
{"transpose_4.tmp_0", "transpose_5.tmp_0"},
{"transpose_6.tmp_0", "transpose_7.tmp_0"},
``` ```
You can use [netron](https://netron.app) to inspect the actual names, then update the corresponding `non_postprocess_heads_info` array in `picodet_mnn.hpp`.
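One way to catch an output-name mismatch before running inference is to compare the head table against the names read from the model in netron. The following is a minimal sketch under that assumption; `HeadNames` and `MissingOutputs` are hypothetical names for illustration and are not part of the demo.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical mirror of the demo's head table: one entry per stride.
struct HeadNames {
  std::string cls_layer;  // classification branch output name
  std::string dis_layer;  // detection branch output name
  int stride;
};

// Returns every name listed in `heads` that is absent from `model_outputs`.
// `model_outputs` is assumed to be filled in by hand from netron.
std::vector<std::string> MissingOutputs(
    const std::vector<HeadNames>& heads,
    const std::set<std::string>& model_outputs) {
  std::vector<std::string> missing;
  for (const auto& head : heads) {
    if (model_outputs.count(head.cls_layer) == 0) {
      missing.push_back(head.cls_layer);
    }
    if (model_outputs.count(head.dis_layer) == 0) {
      missing.push_back(head.dis_layer);
    }
  }
  return missing;
}
```

An empty result means every name in the table exists in the model; any returned name is a candidate for correction in the array.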
## Reference ## Reference
[MNN](https://github.com/alibaba/MNN) [MNN](https://github.com/alibaba/MNN)
...@@ -11,7 +11,6 @@ ...@@ -11,7 +11,6 @@
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// reference from https://github.com/RangiLyu/nanodet/tree/main/demo_mnn
#include "picodet_mnn.hpp" #include "picodet_mnn.hpp"
#include <iostream> #include <iostream>
...@@ -19,354 +18,186 @@ ...@@ -19,354 +18,186 @@
#include <opencv2/highgui/highgui.hpp> #include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp> #include <opencv2/imgproc/imgproc.hpp>
#define __SAVE_RESULT__ // if defined save drawed results to ../results, else show it in windows #define __SAVE_RESULT__ // if defined save drawed results to ../results, else
// show it in windows
struct object_rect { struct object_rect {
int x; int x;
int y; int y;
int width; int width;
int height; int height;
}; };
int resize_uniform(cv::Mat& src, cv::Mat& dst, cv::Size dst_size, object_rect& effect_area) std::vector<int> GenerateColorMap(int num_class) {
{ auto colormap = std::vector<int>(3 * num_class, 0);
int w = src.cols; for (int i = 0; i < num_class; ++i) {
int h = src.rows; int j = 0;
int dst_w = dst_size.width; int lab = i;
int dst_h = dst_size.height; while (lab) {
dst = cv::Mat(cv::Size(dst_w, dst_h), CV_8UC3, cv::Scalar(0)); colormap[i * 3] |= (((lab >> 0) & 1) << (7 - j));
colormap[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j));
float ratio_src = w * 1.0 / h; colormap[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j));
float ratio_dst = dst_w * 1.0 / dst_h; ++j;
lab >>= 3;
int tmp_w = 0;
int tmp_h = 0;
if (ratio_src > ratio_dst) {
tmp_w = dst_w;
tmp_h = floor((dst_w * 1.0 / w) * h);
}
else if (ratio_src < ratio_dst) {
tmp_h = dst_h;
tmp_w = floor((dst_h * 1.0 / h) * w);
}
else {
cv::resize(src, dst, dst_size);
effect_area.x = 0;
effect_area.y = 0;
effect_area.width = dst_w;
effect_area.height = dst_h;
return 0;
}
cv::Mat tmp;
cv::resize(src, tmp, cv::Size(tmp_w, tmp_h));
if (tmp_w != dst_w) {
int index_w = floor((dst_w - tmp_w) / 2.0);
for (int i = 0; i < dst_h; i++) {
memcpy(dst.data + i * dst_w * 3 + index_w * 3, tmp.data + i * tmp_w * 3, tmp_w * 3);
}
effect_area.x = index_w;
effect_area.y = 0;
effect_area.width = tmp_w;
effect_area.height = tmp_h;
}
else if (tmp_h != dst_h) {
int index_h = floor((dst_h - tmp_h) / 2.0);
memcpy(dst.data + index_h * dst_w * 3, tmp.data, tmp_w * tmp_h * 3);
effect_area.x = 0;
effect_area.y = index_h;
effect_area.width = tmp_w;
effect_area.height = tmp_h;
}
else {
printf("error\n");
} }
return 0; }
return colormap;
} }
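The new `GenerateColorMap` replaces the hard-coded color table by spreading the bits of each class index across three channels, most significant bit first. The sketch below restates that idea as a standalone function so the mapping can be checked by hand; `MakeColorMap` is a hypothetical name, and the layout (three consecutive ints per class) is taken from the code above.

```cpp
#include <vector>

// Standalone restatement of the bit-interleaving color map.
// Every group of three bits in the class index contributes one bit to each
// channel, starting from bit 7 and moving downward.
std::vector<int> MakeColorMap(int num_class) {
  std::vector<int> colormap(3 * num_class, 0);
  for (int i = 0; i < num_class; ++i) {
    int lab = i;
    for (int j = 0; lab != 0; ++j, lab >>= 3) {
      colormap[i * 3 + 0] |= ((lab >> 0) & 1) << (7 - j);
      colormap[i * 3 + 1] |= ((lab >> 1) & 1) << (7 - j);
      colormap[i * 3 + 2] |= ((lab >> 2) & 1) << (7 - j);
    }
  }
  return colormap;
}
```

With this scheme class 0 maps to (0, 0, 0), class 1 to (128, 0, 0), and class 8 to (64, 0, 0), so neighbouring class indices get visibly different colors without a lookup table.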
const int color_list[80][3] = void draw_bboxes(const cv::Mat &im, const std::vector<BoxInfo> &bboxes,
{ std::string save_path = "None") {
{216 , 82 , 24}, static const char *class_names[] = {
{236 ,176 , 31}, "person", "bicycle", "car",
{125 , 46 ,141}, "motorcycle", "airplane", "bus",
{118 ,171 , 47}, "train", "truck", "boat",
{ 76 ,189 ,237}, "traffic light", "fire hydrant", "stop sign",
{238 , 19 , 46}, "parking meter", "bench", "bird",
{ 76 , 76 , 76}, "cat", "dog", "horse",
{153 ,153 ,153}, "sheep", "cow", "elephant",
{255 , 0 , 0}, "bear", "zebra", "giraffe",
{255 ,127 , 0}, "backpack", "umbrella", "handbag",
{190 ,190 , 0}, "tie", "suitcase", "frisbee",
{ 0 ,255 , 0}, "skis", "snowboard", "sports ball",
{ 0 , 0 ,255}, "kite", "baseball bat", "baseball glove",
{170 , 0 ,255}, "skateboard", "surfboard", "tennis racket",
{ 84 , 84 , 0}, "bottle", "wine glass", "cup",
{ 84 ,170 , 0}, "fork", "knife", "spoon",
{ 84 ,255 , 0}, "bowl", "banana", "apple",
{170 , 84 , 0}, "sandwich", "orange", "broccoli",
{170 ,170 , 0}, "carrot", "hot dog", "pizza",
{170 ,255 , 0}, "donut", "cake", "chair",
{255 , 84 , 0}, "couch", "potted plant", "bed",
{255 ,170 , 0}, "dining table", "toilet", "tv",
{255 ,255 , 0}, "laptop", "mouse", "remote",
{ 0 , 84 ,127}, "keyboard", "cell phone", "microwave",
{ 0 ,170 ,127}, "oven", "toaster", "sink",
{ 0 ,255 ,127}, "refrigerator", "book", "clock",
{ 84 , 0 ,127}, "vase", "scissors", "teddy bear",
{ 84 , 84 ,127}, "hair drier", "toothbrush"};
{ 84 ,170 ,127},
{ 84 ,255 ,127}, cv::Mat image = im.clone();
{170 , 0 ,127}, int src_w = image.cols;
{170 , 84 ,127}, int src_h = image.rows;
{170 ,170 ,127}, int thickness = 2;
{170 ,255 ,127}, auto colormap = GenerateColorMap(sizeof(class_names) / sizeof(class_names[0]));
{255 , 0 ,127},
{255 , 84 ,127}, for (size_t i = 0; i < bboxes.size(); i++) {
{255 ,170 ,127}, const BoxInfo &bbox = bboxes[i];
{255 ,255 ,127}, std::cout << bbox.x1 << ". " << bbox.y1 << ". " << bbox.x2 << ". "
{ 0 , 84 ,255}, << bbox.y2 << ". " << std::endl;
{ 0 ,170 ,255}, int c1 = colormap[3 * bbox.label + 0];
{ 0 ,255 ,255}, int c2 = colormap[3 * bbox.label + 1];
{ 84 , 0 ,255}, int c3 = colormap[3 * bbox.label + 2];
{ 84 , 84 ,255}, cv::Scalar color = cv::Scalar(c1, c2, c3);
{ 84 ,170 ,255}, // cv::Scalar color = cv::Scalar(0, 0, 255);
{ 84 ,255 ,255}, cv::rectangle(image, cv::Rect(cv::Point(bbox.x1, bbox.y1),
{170 , 0 ,255}, cv::Point(bbox.x2, bbox.y2)),
{170 , 84 ,255}, color, 1, cv::LINE_AA);
{170 ,170 ,255},
{170 ,255 ,255}, char text[256];
{255 , 0 ,255}, sprintf(text, "%s %.1f%%", class_names[bbox.label], bbox.score * 100);
{255 , 84 ,255},
{255 ,170 ,255}, int baseLine = 0;
{ 42 , 0 , 0}, cv::Size label_size =
{ 84 , 0 , 0}, cv::getTextSize(text, cv::FONT_HERSHEY_SIMPLEX, 0.4, 1, &baseLine);
{127 , 0 , 0},
{170 , 0 , 0}, int x = bbox.x1;
{212 , 0 , 0}, int y = bbox.y1 - label_size.height - baseLine;
{255 , 0 , 0}, if (y < 0)
{ 0 , 42 , 0}, y = 0;
{ 0 , 84 , 0}, if (x + label_size.width > image.cols)
{ 0 ,127 , 0}, x = image.cols - label_size.width;
{ 0 ,170 , 0},
{ 0 ,212 , 0}, cv::rectangle(image, cv::Rect(cv::Point(x, y),
{ 0 ,255 , 0}, cv::Size(label_size.width,
{ 0 , 0 , 42}, label_size.height + baseLine)),
{ 0 , 0 , 84}, color, -1);
{ 0 , 0 ,127},
{ 0 , 0 ,170}, cv::putText(image, text, cv::Point(x, y + label_size.height),
{ 0 , 0 ,212}, cv::FONT_HERSHEY_SIMPLEX, 0.4, cv::Scalar(255, 255, 255), 1,
{ 0 , 0 ,255}, cv::LINE_AA);
{ 0 , 0 , 0}, }
{ 36 , 36 , 36},
{ 72 , 72 , 72}, if (save_path == "None") {
{109 ,109 ,109}, cv::imshow("image", image);
{145 ,145 ,145}, } else {
{182 ,182 ,182}, cv::imwrite(save_path, image);
{218 ,218 ,218}, std::cout << save_path << std::endl;
{ 0 ,113 ,188}, }
{ 80 ,182 ,188},
{127 ,127 , 0},
};
void draw_bboxes(const cv::Mat& bgr, const std::vector<BoxInfo>& bboxes, object_rect effect_roi, std::string save_path="None")
{
static const char* class_names[] = { "person", "bicycle", "car", "motorcycle", "airplane", "bus",
"train", "truck", "boat", "traffic light", "fire hydrant",
"stop sign", "parking meter", "bench", "bird", "cat", "dog",
"horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe",
"backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
"skis", "snowboard", "sports ball", "kite", "baseball bat",
"baseball glove", "skateboard", "surfboard", "tennis racket",
"bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl",
"banana", "apple", "sandwich", "orange", "broccoli", "carrot",
"hot dog", "pizza", "donut", "cake", "chair", "couch",
"potted plant", "bed", "dining table", "toilet", "tv", "laptop",
"mouse", "remote", "keyboard", "cell phone", "microwave", "oven",
"toaster", "sink", "refrigerator", "book", "clock", "vase",
"scissors", "teddy bear", "hair drier", "toothbrush"
};
cv::Mat image = bgr.clone();
int src_w = image.cols;
int src_h = image.rows;
int dst_w = effect_roi.width;
int dst_h = effect_roi.height;
float width_ratio = (float)src_w / (float)dst_w;
float height_ratio = (float)src_h / (float)dst_h;
for (size_t i = 0; i < bboxes.size(); i++)
{
const BoxInfo& bbox = bboxes[i];
cv::Scalar color = cv::Scalar(color_list[bbox.label][0], color_list[bbox.label][1], color_list[bbox.label][2]);
cv::rectangle(image, cv::Rect(cv::Point((bbox.x1 - effect_roi.x) * width_ratio, (bbox.y1 - effect_roi.y) * height_ratio),
cv::Point((bbox.x2 - effect_roi.x) * width_ratio, (bbox.y2 - effect_roi.y) * height_ratio)), color);
char text[256];
sprintf(text, "%s %.1f%%", class_names[bbox.label], bbox.score * 100);
int baseLine = 0;
cv::Size label_size = cv::getTextSize(text, cv::FONT_HERSHEY_SIMPLEX, 0.4, 1, &baseLine);
int x = (bbox.x1 - effect_roi.x) * width_ratio;
int y = (bbox.y1 - effect_roi.y) * height_ratio - label_size.height - baseLine;
if (y < 0)
y = 0;
if (x + label_size.width > image.cols)
x = image.cols - label_size.width;
cv::rectangle(image, cv::Rect(cv::Point(x, y), cv::Size(label_size.width, label_size.height + baseLine)),
color, -1);
cv::putText(image, text, cv::Point(x, y + label_size.height),
cv::FONT_HERSHEY_SIMPLEX, 0.4, cv::Scalar(255, 255, 255));
}
if (save_path == "None")
{
cv::imshow("image", image);
}
else
{
cv::imwrite(save_path, image);
std::cout << save_path << std::endl;
}
}
int image_demo(PicoDet &detector, const char* imagepath)
{
std::vector<cv::String> filenames;
cv::glob(imagepath, filenames, false);
for (auto img_name : filenames)
{
cv::Mat image = cv::imread(img_name);
if (image.empty())
{
fprintf(stderr, "cv::imread %s failed\n", img_name.c_str());
return -1;
}
object_rect effect_roi;
cv::Mat resized_img;
resize_uniform(image, resized_img, cv::Size(320, 320), effect_roi);
std::vector<BoxInfo> results;
detector.detect(resized_img, results);
#ifdef __SAVE_RESULT__
std::string save_path = img_name;
draw_bboxes(image, results, effect_roi, save_path.replace(3, 4, "results"));
#else
draw_bboxes(image, results, effect_roi);
cv::waitKey(0);
#endif
}
return 0;
} }
int webcam_demo(PicoDet& detector, int cam_id) int image_demo(PicoDet &detector, const char *imagepath) {
{ std::vector<cv::String> filenames;
cv::Mat image; cv::glob(imagepath, filenames, false);
cv::VideoCapture cap(cam_id);
while (true) for (auto img_name : filenames) {
{ cv::Mat image = cv::imread(img_name, cv::IMREAD_COLOR);
cap >> image; if (image.empty()) {
object_rect effect_roi; fprintf(stderr, "cv::imread %s failed\n", img_name.c_str());
cv::Mat resized_img; return -1;
resize_uniform(image, resized_img, cv::Size(320, 320), effect_roi);
std::vector<BoxInfo> results;
detector.detect(resized_img, results);
draw_bboxes(image, results, effect_roi);
cv::waitKey(1);
} }
return 0; std::vector<BoxInfo> results;
detector.detect(image, results, false);
std::cout << "detect done." << std::endl;
#ifdef __SAVE_RESULT__
std::string save_path = img_name;
draw_bboxes(image, results, save_path.replace(3, 4, "results"));
#else
draw_bboxes(image, results);
cv::waitKey(0);
#endif
}
return 0;
} }
int video_demo(PicoDet& detector, const char* path) int benchmark(PicoDet &detector, int width, int height) {
{ int loop_num = 100;
cv::Mat image; int warm_up = 8;
cv::VideoCapture cap(path);
double time_min = DBL_MAX;
while (true) double time_max = -DBL_MAX;
{ double time_avg = 0;
cap >> image; cv::Mat image(height, width, CV_8UC3, cv::Scalar(1, 1, 1));
object_rect effect_roi; for (int i = 0; i < warm_up + loop_num; i++) {
cv::Mat resized_img; auto start = std::chrono::steady_clock::now();
resize_uniform(image, resized_img, cv::Size(320, 320), effect_roi); std::vector<BoxInfo> results;
std::vector<BoxInfo> results; detector.detect(image, results, false);
detector.detect(resized_img, results); auto end = std::chrono::steady_clock::now();
draw_bboxes(image, results, effect_roi);
cv::waitKey(1); std::chrono::duration<double> elapsed = end - start;
double time = elapsed.count();
if (i >= warm_up) {
time_min = (std::min)(time_min, time);
time_max = (std::max)(time_max, time);
time_avg += time;
} }
return 0; }
time_avg /= loop_num;
fprintf(stderr, "%20s min = %7.2f max = %7.2f avg = %7.2f\n", "picodet",
time_min, time_max, time_avg);
return 0;
} }
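The benchmark above discards a few warm-up iterations and then tracks the minimum, maximum and average latency. A minimal sketch of the same pattern, detached from MNN and OpenCV, is shown here; `TimingStats` and `TimeCalls` are hypothetical names, and unlike the demo (which reports seconds) this version reports milliseconds.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <limits>

struct TimingStats {
  double min_ms;
  double max_ms;
  double avg_ms;
};

// Calls `fn` warm_up + loop_num times and measures only the last loop_num
// calls. loop_num is assumed to be positive.
TimingStats TimeCalls(const std::function<void()>& fn, int warm_up,
                      int loop_num) {
  double min_ms = std::numeric_limits<double>::max();
  double max_ms = std::numeric_limits<double>::lowest();
  double total_ms = 0.0;
  for (int i = 0; i < warm_up + loop_num; ++i) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto end = std::chrono::steady_clock::now();
    if (i < warm_up) continue;  // warm-up runs are not counted
    const double ms =
        std::chrono::duration<double, std::milli>(end - start).count();
    min_ms = std::min(min_ms, ms);
    max_ms = std::max(max_ms, ms);
    total_ms += ms;
  }
  return {min_ms, max_ms, total_ms / loop_num};
}
```

In the demo, the callable would wrap `detector.detect(image, results, false)`; passing it as a `std::function` keeps the timing loop reusable for other backends.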
int benchmark(PicoDet& detector) int main(int argc, char **argv) {
{ int mode = atoi(argv[1]);
int loop_num = 100; std::string model_path = argv[2];
int warm_up = 8; int height = 320;
int width = 320;
double time_min = DBL_MAX; if (argc >= 5) {
double time_max = -DBL_MAX; height = atoi(argv[3]);
double time_avg = 0; width = atoi(argv[4]);
cv::Mat image(320, 320, CV_8UC3, cv::Scalar(1, 1, 1)); }
for (int i = 0; i < warm_up + loop_num; i++) PicoDet detector = PicoDet(model_path, width, height, 4, 0.45, 0.3);
{ if (mode == 1) {
auto start = std::chrono::steady_clock::now(); benchmark(detector, width, height);
std::vector<BoxInfo> results; } else {
detector.detect(image, results); if (argc != 6) {
auto end = std::chrono::steady_clock::now(); std::cout << "Must set image file, such as ./picodet-mnn 0 "
"../picodet_s_320_lcnet.mnn 320 320 img.jpg"
std::chrono::duration<double> elapsed = end - start; << std::endl;
return -1;
double time = elapsed.count();
if (i >= warm_up)
{
time_min = (std::min)(time_min, time);
time_max = (std::max)(time_max, time);
time_avg += time;
}
}
time_avg /= loop_num;
fprintf(stderr, "%20s min = %7.2f max = %7.2f avg = %7.2f\n", "picodet", time_min, time_max, time_avg);
return 0;
}
int main(int argc, char** argv)
{
if (argc != 3)
{
fprintf(stderr, "usage: %s [mode] [path]. \n For webcam mode=0, path is cam id; \n For image demo, mode=1, path=xxx/xxx/*.jpg; \n For video, mode=2; \n For benchmark, mode=3 path=0.\n", argv[0]);
return -1;
}
PicoDet detector = PicoDet("../weight/picodet-416.mnn", 416, 416, 4, 0.45, 0.3);
int mode = atoi(argv[1]);
switch (mode)
{
case 0:{
int cam_id = atoi(argv[2]);
webcam_demo(detector, cam_id);
break;
}
case 1:{
const char* images = argv[2];
image_demo(detector, images);
break;
}
case 2:{
const char* path = argv[2];
video_demo(detector, path);
break;
}
case 3:{
benchmark(detector);
break;
}
default:{
fprintf(stderr, "usage: %s [mode] [path]. \n For webcam mode=0, path is cam id; \n For image demo, mode=1, path=xxx/xxx/*.jpg; \n For video, mode=2; \n For benchmark, mode=3 path=0.\n", argv[0]);
break;
}
} }
const char *images = argv[5];
image_demo(detector, images);
}
} }
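The new `main` takes its arguments positionally: mode, model path, height, width, and an image path when the mode is not benchmark. A minimal sketch of validating that layout before touching `argv` is given below; `DemoArgs` and `ParseDemoArgs` are hypothetical names, and the defaults of 320 for height and width are taken from the code above.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

struct DemoArgs {
  int mode = 0;            // 1 = benchmark, anything else = image demo
  std::string model_path;  // path to the .mnn file
  int height = 320;
  int width = 320;
  std::string image_path;  // only used by the image demo
};

// Expects: program mode model [height width [image]].
// Returns false when a required argument is missing; *out is untouched then.
bool ParseDemoArgs(const std::vector<std::string>& argv, DemoArgs* out) {
  if (argv.size() < 3) return false;
  DemoArgs args;
  args.mode = std::atoi(argv[1].c_str());
  args.model_path = argv[2];
  if (argv.size() >= 5) {
    args.height = std::atoi(argv[3].c_str());
    args.width = std::atoi(argv[4].c_str());
  }
  if (args.mode != 1) {
    if (argv.size() < 6) return false;  // image demo needs an image path
    args.image_path = argv[5];
  }
  *out = args;
  return true;
}
```

A caller could build the vector with `std::vector<std::string>(argv, argv + argc)` and print the usage message whenever the function returns false.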
...@@ -44,7 +44,8 @@ PicoDet::~PicoDet() { ...@@ -44,7 +44,8 @@ PicoDet::~PicoDet() {
PicoDet_interpreter->releaseSession(PicoDet_session); PicoDet_interpreter->releaseSession(PicoDet_session);
} }
int PicoDet::detect(cv::Mat &raw_image, std::vector<BoxInfo> &result_list) { int PicoDet::detect(cv::Mat &raw_image, std::vector<BoxInfo> &result_list,
bool has_postprocess) {
if (raw_image.empty()) { if (raw_image.empty()) {
std::cout << "image is empty ,please check!" << std::endl; std::cout << "image is empty ,please check!" << std::endl;
return -1; return -1;
...@@ -70,22 +71,57 @@ int PicoDet::detect(cv::Mat &raw_image, std::vector<BoxInfo> &result_list) { ...@@ -70,22 +71,57 @@ int PicoDet::detect(cv::Mat &raw_image, std::vector<BoxInfo> &result_list) {
std::vector<std::vector<BoxInfo>> results; std::vector<std::vector<BoxInfo>> results;
results.resize(num_class); results.resize(num_class);
for (const auto &head_info : heads_info) { if (has_postprocess) {
MNN::Tensor *tensor_scores = PicoDet_interpreter->getSessionOutput( auto bbox_out_tensor = PicoDet_interpreter->getSessionOutput(
PicoDet_session, head_info.cls_layer.c_str()); PicoDet_session, nms_heads_info[0].c_str());
MNN::Tensor *tensor_boxes = PicoDet_interpreter->getSessionOutput( auto class_out_tensor = PicoDet_interpreter->getSessionOutput(
PicoDet_session, head_info.dis_layer.c_str()); PicoDet_session, nms_heads_info[1].c_str());
// bbox branch
MNN::Tensor tensor_scores_host(tensor_scores, auto tensor_bbox_host =
tensor_scores->getDimensionType()); new MNN::Tensor(bbox_out_tensor, MNN::Tensor::CAFFE);
tensor_scores->copyToHostTensor(&tensor_scores_host); bbox_out_tensor->copyToHostTensor(tensor_bbox_host);
auto bbox_output_shape = tensor_bbox_host->shape();
MNN::Tensor tensor_boxes_host(tensor_boxes, int output_size = 1;
tensor_boxes->getDimensionType()); for (int j = 0; j < bbox_output_shape.size(); ++j) {
tensor_boxes->copyToHostTensor(&tensor_boxes_host); output_size *= bbox_output_shape[j];
}
decode_infer(&tensor_scores_host, &tensor_boxes_host, head_info.stride, std::cout << "output_size:" << output_size << std::endl;
score_threshold, results); bbox_output_data_.resize(output_size);
std::copy_n(tensor_bbox_host->host<float>(), output_size,
bbox_output_data_.data());
delete tensor_bbox_host;
// class branch
auto tensor_class_host =
new MNN::Tensor(class_out_tensor, MNN::Tensor::CAFFE);
class_out_tensor->copyToHostTensor(tensor_class_host);
auto class_output_shape = tensor_class_host->shape();
output_size = 1;
for (int j = 0; j < class_output_shape.size(); ++j) {
output_size *= class_output_shape[j];
}
std::cout << "output_size:" << output_size << std::endl;
class_output_data_.resize(output_size);
std::copy_n(tensor_class_host->host<float>(), output_size,
class_output_data_.data());
delete tensor_class_host;
} else {
for (const auto &head_info : non_postprocess_heads_info) {
MNN::Tensor *tensor_scores = PicoDet_interpreter->getSessionOutput(
PicoDet_session, head_info.cls_layer.c_str());
MNN::Tensor *tensor_boxes = PicoDet_interpreter->getSessionOutput(
PicoDet_session, head_info.dis_layer.c_str());
MNN::Tensor tensor_scores_host(tensor_scores,
tensor_scores->getDimensionType());
tensor_scores->copyToHostTensor(&tensor_scores_host);
MNN::Tensor tensor_boxes_host(tensor_boxes,
tensor_boxes->getDimensionType());
tensor_boxes->copyToHostTensor(&tensor_boxes_host);
decode_infer(&tensor_scores_host, &tensor_boxes_host, head_info.stride,
score_threshold, results);
}
} }
auto end = chrono::steady_clock::now(); auto end = chrono::steady_clock::now();
...@@ -188,8 +224,6 @@ void PicoDet::nms(std::vector<BoxInfo> &input_boxes, float NMS_THRESH) { ...@@ -188,8 +224,6 @@ void PicoDet::nms(std::vector<BoxInfo> &input_boxes, float NMS_THRESH) {
} }
} }
string PicoDet::get_label_str(int label) { return labels[label]; }
inline float fast_exp(float x) { inline float fast_exp(float x) {
union { union {
uint32_t i; uint32_t i;
......
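In the postprocess branch of `detect`, the output size is the product of the tensor's dimensions, and the host data is then copied into a flat `std::vector<float>`. The sketch below isolates those two steps without MNN, assuming the shape is available as a `std::vector<int>` as the loops above suggest; `ElementCount` and `CopyFlat` are hypothetical helper names.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Product of all dimensions; an empty shape yields 1 (a scalar).
std::size_t ElementCount(const std::vector<int>& shape) {
  return std::accumulate(
      shape.begin(), shape.end(), std::size_t{1},
      [](std::size_t acc, int dim) {
        return acc * static_cast<std::size_t>(dim);
      });
}

// Copies ElementCount(shape) floats starting at `data` into a new vector.
// The caller must guarantee that `data` points to at least that many floats.
std::vector<float> CopyFlat(const float* data, const std::vector<int>& shape) {
  const std::size_t count = ElementCount(shape);
  return std::vector<float>(data, data + count);
}
```

Returning the vector by value avoids the separate `resize` and `std::copy_n` calls, at the cost of not reusing a preallocated member buffer between frames.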
...@@ -11,7 +11,6 @@ ...@@ -11,7 +11,6 @@
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// reference from https://github.com/RangiLyu/nanodet/tree/main/demo_mnn
#ifndef __PicoDet_H__ #ifndef __PicoDet_H__
#define __PicoDet_H__ #define __PicoDet_H__
...@@ -20,90 +19,84 @@ ...@@ -20,90 +19,84 @@
#include "Interpreter.hpp" #include "Interpreter.hpp"
#include "ImageProcess.hpp"
#include "MNNDefine.h" #include "MNNDefine.h"
#include "Tensor.hpp" #include "Tensor.hpp"
#include "ImageProcess.hpp"
#include <opencv2/opencv.hpp>
#include <algorithm> #include <algorithm>
#include <chrono>
#include <iostream> #include <iostream>
#include <memory>
#include <opencv2/opencv.hpp>
#include <string> #include <string>
#include <vector> #include <vector>
#include <memory>
#include <chrono>
typedef struct NonPostProcessHeadInfo_ {
typedef struct HeadInfo_ std::string cls_layer;
{ std::string dis_layer;
std::string cls_layer; int stride;
std::string dis_layer; } NonPostProcessHeadInfo;
int stride;
} HeadInfo; typedef struct BoxInfo_ {
float x1;
typedef struct BoxInfo_ float y1;
{ float x2;
float x1; float y2;
float y1; float score;
float x2; int label;
float y2;
float score;
int label;
} BoxInfo; } BoxInfo;
class PicoDet { class PicoDet {
public: public:
PicoDet(const std::string &mnn_path, PicoDet(const std::string &mnn_path, int input_width, int input_length,
int input_width, int input_length, int num_thread_ = 4, float score_threshold_ = 0.5, float nms_threshold_ = 0.3); int num_thread_ = 4, float score_threshold_ = 0.5,
float nms_threshold_ = 0.3);
~PicoDet(); ~PicoDet();
int detect(cv::Mat &img, std::vector<BoxInfo> &result_list); int detect(cv::Mat &img, std::vector<BoxInfo> &result_list,
std::string get_label_str(int label); bool has_postprocess);
private: private:
void decode_infer(MNN::Tensor *cls_pred, MNN::Tensor *dis_pred, int stride, float threshold, std::vector<std::vector<BoxInfo>> &results); void decode_infer(MNN::Tensor *cls_pred, MNN::Tensor *dis_pred, int stride,
BoxInfo disPred2Bbox(const float *&dfl_det, int label, float score, int x, int y, int stride); float threshold,
void nms(std::vector<BoxInfo> &input_boxes, float NMS_THRESH); std::vector<std::vector<BoxInfo>> &results);
BoxInfo disPred2Bbox(const float *&dfl_det, int label, float score, int x,
int y, int stride);
void nms(std::vector<BoxInfo> &input_boxes, float NMS_THRESH);
private: private:
std::shared_ptr<MNN::Interpreter> PicoDet_interpreter;
std::shared_ptr<MNN::Interpreter> PicoDet_interpreter; MNN::Session *PicoDet_session = nullptr;
MNN::Session *PicoDet_session = nullptr; MNN::Tensor *input_tensor = nullptr;
MNN::Tensor *input_tensor = nullptr;
int num_thread;
int num_thread; int image_w;
int image_w; int image_h;
int image_h;
int in_w = 320;
int in_w = 320; int in_h = 320;
int in_h = 320;
float score_threshold;
float score_threshold; float nms_threshold;
float nms_threshold;
const float mean_vals[3] = {103.53f, 116.28f, 123.675f};
const float mean_vals[3] = { 103.53f, 116.28f, 123.675f }; const float norm_vals[3] = {0.017429f, 0.017507f, 0.017125f};
const float norm_vals[3] = { 0.017429f, 0.017507f, 0.017125f };
const int num_class = 80;
const int num_class = 80; const int reg_max = 7;
const int reg_max = 7;
std::vector<float> bbox_output_data_;
std::vector<HeadInfo> heads_info{ std::vector<float> class_output_data_;
// cls_pred|dis_pred|stride
{"save_infer_model/scale_0.tmp_1", "save_infer_model/scale_4.tmp_1", 8}, std::vector<std::string> nms_heads_info{"tmp_16", "concat_4.tmp_0"};
{"save_infer_model/scale_1.tmp_1", "save_infer_model/scale_5.tmp_1", 16}, // If not export post-process, will use non_postprocess_heads_info
{"save_infer_model/scale_2.tmp_1", "save_infer_model/scale_6.tmp_1", 32}, std::vector<NonPostProcessHeadInfo> non_postprocess_heads_info{
{"save_infer_model/scale_3.tmp_1", "save_infer_model/scale_7.tmp_1", 64}, // cls_pred|dis_pred|stride
}; {"transpose_0.tmp_0", "transpose_1.tmp_0", 8},
{"transpose_2.tmp_0", "transpose_3.tmp_0", 16},
std::vector<std::string> {"transpose_4.tmp_0", "transpose_5.tmp_0", 32},
labels{"person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light", {"transpose_6.tmp_0", "transpose_7.tmp_0", 64},
"fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow", };
"elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
"skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard",
"tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
"sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
"potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone",
"microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
"hair drier", "toothbrush"};
}; };
template <typename _Tp> template <typename _Tp>
......
cmake_minimum_required(VERSION 3.4.1) cmake_minimum_required(VERSION 3.9)
set(CMAKE_CXX_STANDARD 17) set(CMAKE_CXX_STANDARD 17)
project(picodet_demo) project(picodet_demo)
...@@ -11,9 +11,11 @@ if(OPENMP_FOUND) ...@@ -11,9 +11,11 @@ if(OPENMP_FOUND)
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${OpenMP_EXE_LINKER_FLAGS}") set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${OpenMP_EXE_LINKER_FLAGS}")
endif() endif()
find_package(OpenCV REQUIRED) # find_package(OpenCV REQUIRED)
find_package(OpenCV REQUIRED PATHS "/path/to/opencv-3.4.16_gcc8.2_ffmpeg")
find_package(ncnn REQUIRED) # find_package(ncnn REQUIRED)
find_package(ncnn REQUIRED PATHS "/path/to/ncnn/build/install/lib/cmake/ncnn")
if(NOT TARGET ncnn) if(NOT TARGET ncnn)
message(WARNING "ncnn NOT FOUND! Please set ncnn_DIR environment variable") message(WARNING "ncnn NOT FOUND! Please set ncnn_DIR environment variable")
else() else()
......
# PicoDet NCNN Demo # PicoDet NCNN Demo
This project provides PicoDet image inference, webcam inference and benchmark using The inference code in this demo is based on [Tencent's NCNN framework](https://github.com/Tencent/ncnn).
[Tencent's NCNN framework](https://github.com/Tencent/ncnn).
# How to build
# Step 1: Build
## Windows ## Windows
### Step1. ### Step1.
Download and Install Visual Studio from https://visualstudio.microsoft.com/vs/community/ Download and Install Visual Studio from https://visualstudio.microsoft.com/vs/community/
...@@ -12,11 +10,16 @@ Download and Install Visual Studio from https://visualstudio.microsoft.com/vs/co ...@@ -12,11 +10,16 @@ Download and Install Visual Studio from https://visualstudio.microsoft.com/vs/co
### Step2. ### Step2.
Download and install OpenCV from https://github.com/opencv/opencv/releases Download and install OpenCV from https://github.com/opencv/opencv/releases
### Step3(Optional). For convenience, if your environment is gcc 8.2 on x86, you can download the following prebuilt library directly:
```shell
wget https://paddledet.bj.bcebos.com/data/opencv-3.4.16_gcc8.2_ffmpeg.tar.gz
tar -xf opencv-3.4.16_gcc8.2_ffmpeg.tar.gz
```
### Step3 (Optional).
Download and install Vulkan SDK from https://vulkan.lunarg.com/sdk/home Download and install Vulkan SDK from https://vulkan.lunarg.com/sdk/home
### Step4. ### Step4: Build NCNN
Clone NCNN repository
``` shell script ``` shell script
git clone --recursive https://github.com/Tencent/ncnn.git git clone --recursive https://github.com/Tencent/ncnn.git
...@@ -25,7 +28,7 @@ Build NCNN following this tutorial: [Build for Windows x64 using VS2017](https:/ ...@@ -25,7 +28,7 @@ Build NCNN following this tutorial: [Build for Windows x64 using VS2017](https:/
### Step5. ### Step5.
Add `ncnn_DIR` = `YOUR_NCNN_PATH/build/install/lib/cmake/ncnn` to system environment variables. Add `ncnn_DIR` = `YOUR_NCNN_PATH/build/install/lib/cmake/ncnn` to the system environment variables.
Build project: Open x64 Native Tools Command Prompt for VS 2019 or 2017 Build project: Open x64 Native Tools Command Prompt for VS 2019 or 2017
...@@ -42,10 +45,10 @@ msbuild picodet_demo.vcxproj /p:configuration=release /p:platform=x64 ...@@ -42,10 +45,10 @@ msbuild picodet_demo.vcxproj /p:configuration=release /p:platform=x64
### Step1. ### Step1.
Build and install OpenCV from https://github.com/opencv/opencv Build and install OpenCV from https://github.com/opencv/opencv
### Step2(Optional). ### Step2 (Optional).
Download Vulkan SDK from https://vulkan.lunarg.com/sdk/home Download Vulkan SDK from https://vulkan.lunarg.com/sdk/home
### Step3. ### Step3: Build NCNN
Clone NCNN repository Clone NCNN repository
``` shell script ``` shell script
...@@ -54,15 +57,7 @@ git clone --recursive https://github.com/Tencent/ncnn.git ...@@ -54,15 +57,7 @@ git clone --recursive https://github.com/Tencent/ncnn.git
Build NCNN following this tutorial: [Build for Linux / NVIDIA Jetson / Raspberry Pi](https://github.com/Tencent/ncnn/wiki/how-to-build#build-for-linux) Build NCNN following this tutorial: [Build for Linux / NVIDIA Jetson / Raspberry Pi](https://github.com/Tencent/ncnn/wiki/how-to-build#build-for-linux)
### Step4. ### Step4: Build the executable
Set environment variables. Run:
``` shell script
export ncnn_DIR=YOUR_NCNN_PATH/build/install/lib/cmake/ncnn
```
Build project
``` shell script ``` shell script
cd <this-folder> cd <this-folder>
...@@ -71,47 +66,64 @@ cd build ...@@ -71,47 +66,64 @@ cd build
cmake .. cmake ..
make make
``` ```
# Run demo # Run demo
Download PicoDet ncnn model. - Prepare the model
* [PicoDet ncnn model download link](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_ncnn.zip) ```shell
modelName=picodet_s_320_coco_lcnet
# Export the inference model
## Webcam python tools/export_model.py \
-c configs/picodet/${modelName}.yml \
```shell script -o weights=${modelName}.pdparams \
picodet_demo 0 0 --output_dir=inference_model
# Convert to ONNX
paddle2onnx --model_dir inference_model/${modelName} \
--model_filename model.pdmodel \
--params_filename model.pdiparams \
--opset_version 11 \
--save_file ${modelName}.onnx
# Simplify the model
python -m onnxsim ${modelName}.onnx ${modelName}_processed.onnx
# Convert the model to NCNN format:
# run onnx2ncnn from the ncnn tools to generate the ncnn .param and .bin files.
```
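The model can also be converted to NCNN with the online conversion tool [https://convertmodel.com](https://convertmodel.com/).
For a quick test, you can download [picodet_s_320_coco_lcnet-opt.bin](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_coco_lcnet-opt.bin) / [picodet_s_320_coco_lcnet-opt.param](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_coco_lcnet-opt.param) directly (without post-processing).
**Note:** With post-processing included, NCNN inference currently produces NaN, so use the demo without post-processing for now. The demo with post-processing is being upgraded and will be released soon.
## Run
First, create a directory for the prediction results: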
```shell
cp -r ../demo_onnxruntime/imgs .
cd build
mkdir ../results
``` ```
## Inference images - Run inference on one image
``` shell
```shell script ./picodet_demo 0 ../picodet_s_320_coco_lcnet.bin ../picodet_s_320_coco_lcnet.param 320 320 ../imgs/dog.jpg 0
picodet_demo 1 IMAGE_FOLDER/*.jpg
``` ```
See `main.cpp` for how the arguments are parsed.
## Inference video - Speed benchmark
```shell script ``` shell
picodet_demo 2 VIDEO_PATH ./picodet_demo 1 ../picodet_s_320_lcnet.bin ../picodet_s_320_lcnet.param 320 320 0
``` ```
## Benchmark ## FAQ
```shell script
picodet_demo 3 0
result: picodet min = 17.74 max = 22.71 avg = 18.16
```
****
Notice:
If benchmark speed is slow, try to limit omp thread num.
Linux:
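```shell script - Prediction accuracy is wrong:
export OMP_THREAD_LIMIT=4 First check that the model input shape matches and that the model output names match. The output names of the enhanced PicoDet model without post-processing are: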
```shell
# classification branch | detection branch
{"transpose_0.tmp_0", "transpose_1.tmp_0"},
{"transpose_2.tmp_0", "transpose_3.tmp_0"},
{"transpose_4.tmp_0", "transpose_5.tmp_0"},
{"transpose_6.tmp_0", "transpose_7.tmp_0"},
``` ```
Use [netron](https://netron.app) to inspect the actual names, then update the `non_postprocess_heads_info` array in `picodet.h` accordingly.
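As an alternative to opening the model in netron, the output names can be checked by scanning the text `.param` file. The sketch below assumes the usual ncnn text layout: a magic-number line, a `layer_count blob_count` line, then one layer per line written as `type name input_count output_count inputs... outputs... params...`. `CollectBlobNames` and `MissingHeads` are hypothetical helpers and are not part of the demo.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect every output blob name found in the text of an
// ncnn .param file. Assumes the line layout described above.
std::set<std::string> CollectBlobNames(const std::string &param_text) {
  std::set<std::string> blobs;
  std::istringstream in(param_text);
  std::string line;
  int line_no = 0;
  while (std::getline(in, line)) {
    if (++line_no <= 2) continue;  // skip the magic number and the counts
    std::istringstream ls(line);
    std::string type, name, blob;
    int input_count = 0, output_count = 0;
    if (!(ls >> type >> name >> input_count >> output_count)) continue;
    for (int i = 0; i < input_count; ++i) ls >> blob;  // skip input blobs
    for (int i = 0; i < output_count && (ls >> blob); ++i) blobs.insert(blob);
  }
  return blobs;
}

// Hypothetical helper: return the head names that the model does not produce.
std::vector<std::string> MissingHeads(const std::set<std::string> &blobs,
                                      const std::vector<std::string> &heads) {
  std::vector<std::string> missing;
  for (const auto &head : heads) {
    if (blobs.count(head) == 0) missing.push_back(head);
  }
  return missing;
}
```

If `MissingHeads` returns anything for the eight `transpose_*.tmp_0` names, the head table in the header probably needs to be updated for that model.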
...@@ -13,353 +13,198 @@ ...@@ -13,353 +13,198 @@
// limitations under the License. // limitations under the License.
// reference from https://github.com/RangiLyu/nanodet/tree/main/demo_ncnn // reference from https://github.com/RangiLyu/nanodet/tree/main/demo_ncnn
#include "picodet.h"
#include <benchmark.h>
#include <iostream>
#include <net.h>
#include <opencv2/core/core.hpp> #include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp> #include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp> #include <opencv2/imgproc/imgproc.hpp>
#include <iostream>
#include <net.h>
#include "picodet.h"
#include <benchmark.h>
#define __SAVE_RESULT__ // if defined, save drawn results to ../results; otherwise
// show them in a window
struct object_rect { struct object_rect {
int x; int x;
int y; int y;
int width; int width;
int height; int height;
};
int resize_uniform(cv::Mat& src, cv::Mat& dst, cv::Size dst_size, object_rect& effect_area)
{
int w = src.cols;
int h = src.rows;
int dst_w = dst_size.width;
int dst_h = dst_size.height;
dst = cv::Mat(cv::Size(dst_w, dst_h), CV_8UC3, cv::Scalar(0));
float ratio_src = w * 1.0 / h;
float ratio_dst = dst_w * 1.0 / dst_h;
int tmp_w = 0;
int tmp_h = 0;
if (ratio_src > ratio_dst) {
tmp_w = dst_w;
tmp_h = floor((dst_w * 1.0 / w) * h);
}
else if (ratio_src < ratio_dst) {
tmp_h = dst_h;
tmp_w = floor((dst_h * 1.0 / h) * w);
}
else {
cv::resize(src, dst, dst_size);
effect_area.x = 0;
effect_area.y = 0;
effect_area.width = dst_w;
effect_area.height = dst_h;
return 0;
}
cv::Mat tmp;
cv::resize(src, tmp, cv::Size(tmp_w, tmp_h));
if (tmp_w != dst_w) {
int index_w = floor((dst_w - tmp_w) / 2.0);
for (int i = 0; i < dst_h; i++) {
memcpy(dst.data + i * dst_w * 3 + index_w * 3, tmp.data + i * tmp_w * 3, tmp_w * 3);
}
effect_area.x = index_w;
effect_area.y = 0;
effect_area.width = tmp_w;
effect_area.height = tmp_h;
}
else if (tmp_h != dst_h) {
int index_h = floor((dst_h - tmp_h) / 2.0);
memcpy(dst.data + index_h * dst_w * 3, tmp.data, tmp_w * tmp_h * 3);
effect_area.x = 0;
effect_area.y = index_h;
effect_area.width = tmp_w;
effect_area.height = tmp_h;
}
else {
printf("error\n");
}
return 0;
}
const int color_list[80][3] =
{
{216 , 82 , 24},
{236 ,176 , 31},
{125 , 46 ,141},
{118 ,171 , 47},
{ 76 ,189 ,237},
{238 , 19 , 46},
{ 76 , 76 , 76},
{153 ,153 ,153},
{255 , 0 , 0},
{255 ,127 , 0},
{190 ,190 , 0},
{ 0 ,255 , 0},
{ 0 , 0 ,255},
{170 , 0 ,255},
{ 84 , 84 , 0},
{ 84 ,170 , 0},
{ 84 ,255 , 0},
{170 , 84 , 0},
{170 ,170 , 0},
{170 ,255 , 0},
{255 , 84 , 0},
{255 ,170 , 0},
{255 ,255 , 0},
{ 0 , 84 ,127},
{ 0 ,170 ,127},
{ 0 ,255 ,127},
{ 84 , 0 ,127},
{ 84 , 84 ,127},
{ 84 ,170 ,127},
{ 84 ,255 ,127},
{170 , 0 ,127},
{170 , 84 ,127},
{170 ,170 ,127},
{170 ,255 ,127},
{255 , 0 ,127},
{255 , 84 ,127},
{255 ,170 ,127},
{255 ,255 ,127},
{ 0 , 84 ,255},
{ 0 ,170 ,255},
{ 0 ,255 ,255},
{ 84 , 0 ,255},
{ 84 , 84 ,255},
{ 84 ,170 ,255},
{ 84 ,255 ,255},
{170 , 0 ,255},
{170 , 84 ,255},
{170 ,170 ,255},
{170 ,255 ,255},
{255 , 0 ,255},
{255 , 84 ,255},
{255 ,170 ,255},
{ 42 , 0 , 0},
{ 84 , 0 , 0},
{127 , 0 , 0},
{170 , 0 , 0},
{212 , 0 , 0},
{255 , 0 , 0},
{ 0 , 42 , 0},
{ 0 , 84 , 0},
{ 0 ,127 , 0},
{ 0 ,170 , 0},
{ 0 ,212 , 0},
{ 0 ,255 , 0},
{ 0 , 0 , 42},
{ 0 , 0 , 84},
{ 0 , 0 ,127},
{ 0 , 0 ,170},
{ 0 , 0 ,212},
{ 0 , 0 ,255},
{ 0 , 0 , 0},
{ 36 , 36 , 36},
{ 72 , 72 , 72},
{109 ,109 ,109},
{145 ,145 ,145},
{182 ,182 ,182},
{218 ,218 ,218},
{ 0 ,113 ,188},
{ 80 ,182 ,188},
{127 ,127 , 0},
}; };
void draw_bboxes(const cv::Mat& bgr, const std::vector<BoxInfo>& bboxes, object_rect effect_roi) std::vector<int> GenerateColorMap(int num_class) {
{ auto colormap = std::vector<int>(3 * num_class, 0);
static const char* class_names[] = { "person", "bicycle", "car", "motorcycle", "airplane", "bus", for (int i = 0; i < num_class; ++i) {
"train", "truck", "boat", "traffic light", "fire hydrant", int j = 0;
"stop sign", "parking meter", "bench", "bird", "cat", "dog", int lab = i;
"horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe", while (lab) {
"backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee", colormap[i * 3] |= (((lab >> 0) & 1) << (7 - j));
"skis", "snowboard", "sports ball", "kite", "baseball bat", colormap[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j));
"baseball glove", "skateboard", "surfboard", "tennis racket", colormap[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j));
"bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", ++j;
"banana", "apple", "sandwich", "orange", "broccoli", "carrot", lab >>= 3;
"hot dog", "pizza", "donut", "cake", "chair", "couch",
"potted plant", "bed", "dining table", "toilet", "tv", "laptop",
"mouse", "remote", "keyboard", "cell phone", "microwave", "oven",
"toaster", "sink", "refrigerator", "book", "clock", "vase",
"scissors", "teddy bear", "hair drier", "toothbrush"
};
cv::Mat image = bgr.clone();
int src_w = image.cols;
int src_h = image.rows;
int dst_w = effect_roi.width;
int dst_h = effect_roi.height;
float width_ratio = (float)src_w / (float)dst_w;
float height_ratio = (float)src_h / (float)dst_h;
for (size_t i = 0; i < bboxes.size(); i++)
{
const BoxInfo& bbox = bboxes[i];
cv::Scalar color = cv::Scalar(color_list[bbox.label][0], color_list[bbox.label][1], color_list[bbox.label][2]);
cv::rectangle(image, cv::Rect(cv::Point((bbox.x1 - effect_roi.x) * width_ratio, (bbox.y1 - effect_roi.y) * height_ratio),
cv::Point((bbox.x2 - effect_roi.x) * width_ratio, (bbox.y2 - effect_roi.y) * height_ratio)), color);
char text[256];
sprintf(text, "%s %.1f%%", class_names[bbox.label], bbox.score * 100);
int baseLine = 0;
cv::Size label_size = cv::getTextSize(text, cv::FONT_HERSHEY_SIMPLEX, 0.4, 1, &baseLine);
int x = (bbox.x1 - effect_roi.x) * width_ratio;
int y = (bbox.y1 - effect_roi.y) * height_ratio - label_size.height - baseLine;
if (y < 0)
y = 0;
if (x + label_size.width > image.cols)
x = image.cols - label_size.width;
cv::rectangle(image, cv::Rect(cv::Point(x, y), cv::Size(label_size.width, label_size.height + baseLine)),
color, -1);
cv::putText(image, text, cv::Point(x, y + label_size.height),
cv::FONT_HERSHEY_SIMPLEX, 0.4, cv::Scalar(255, 255, 255));
}
cv::imwrite("../result/test_picodet.jpg", image);
printf("************infer image success!!!**********\n");
}
int image_demo(PicoDet &detector, const char* imagepath)
{
std::vector<std::string> filenames;
cv::glob(imagepath, filenames, false);
for (auto img_name : filenames)
{
cv::Mat image = cv::imread(img_name);
if (image.empty())
{
fprintf(stderr, "cv::imread %s failed\n", img_name);
return -1;
}
object_rect effect_roi;
cv::Mat resized_img;
resize_uniform(image, resized_img, cv::Size(320, 320), effect_roi);
auto results = detector.detect(resized_img, 0.4, 0.5);
char imgName[20] = {};
draw_bboxes(image, results, effect_roi);
cv::waitKey(0);
} }
return 0; }
return colormap;
} }
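Review note: `GenerateColorMap` replaces the hard-coded `color_list` table by spreading the bits of each class index across three 8-bit channels, three bits per pass, starting from the most significant bit of each channel. The standalone sketch below reproduces that loop for a single label so the first few colors can be checked by hand. `ColorForLabel` is a hypothetical name used only for illustration.

```cpp
#include <array>
#include <cassert>

// Hypothetical single-label version of the loop in GenerateColorMap.
// Returns the three channel values in the same order as the colormap.
std::array<int, 3> ColorForLabel(int label) {
  // More than 8 passes would shift by a negative amount, so bound the input.
  assert(label >= 0 && label < (1 << 24));
  std::array<int, 3> color{{0, 0, 0}};
  int j = 0;
  while (label) {
    color[0] |= ((label >> 0) & 1) << (7 - j);
    color[1] |= ((label >> 1) & 1) << (7 - j);
    color[2] |= ((label >> 2) & 1) << (7 - j);
    ++j;
    label >>= 3;  // move on to the next group of three bits
  }
  return color;
}
```

Labels 1 to 7 only set the top bit of each channel; label 8 is the first one to reach the second pass and therefore the second-highest bit.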
int webcam_demo(PicoDet& detector, int cam_id) void draw_bboxes(const cv::Mat &im, const std::vector<BoxInfo> &bboxes,
{ std::string save_path = "None") {
cv::Mat image; static const char *class_names[] = {
cv::VideoCapture cap(cam_id); "person", "bicycle", "car",
"motorcycle", "airplane", "bus",
while (true) "train", "truck", "boat",
{ "traffic light", "fire hydrant", "stop sign",
cap >> image; "parking meter", "bench", "bird",
object_rect effect_roi; "cat", "dog", "horse",
cv::Mat resized_img; "sheep", "cow", "elephant",
resize_uniform(image, resized_img, cv::Size(320, 320), effect_roi); "bear", "zebra", "giraffe",
auto results = detector.detect(resized_img, 0.4, 0.5); "backpack", "umbrella", "handbag",
draw_bboxes(image, results, effect_roi); "tie", "suitcase", "frisbee",
cv::waitKey(1); "skis", "snowboard", "sports ball",
} "kite", "baseball bat", "baseball glove",
return 0; "skateboard", "surfboard", "tennis racket",
"bottle", "wine glass", "cup",
"fork", "knife", "spoon",
"bowl", "banana", "apple",
"sandwich", "orange", "broccoli",
"carrot", "hot dog", "pizza",
"donut", "cake", "chair",
"couch", "potted plant", "bed",
"dining table", "toilet", "tv",
"laptop", "mouse", "remote",
"keyboard", "cell phone", "microwave",
"oven", "toaster", "sink",
"refrigerator", "book", "clock",
"vase", "scissors", "teddy bear",
"hair drier", "toothbrush"};
cv::Mat image = im.clone();
int src_w = image.cols;
int src_h = image.rows;
int thickness = 2;
  auto colormap = GenerateColorMap(sizeof(class_names) / sizeof(class_names[0]));
for (size_t i = 0; i < bboxes.size(); i++) {
const BoxInfo &bbox = bboxes[i];
std::cout << bbox.x1 << ". " << bbox.y1 << ". " << bbox.x2 << ". "
<< bbox.y2 << ". " << std::endl;
int c1 = colormap[3 * bbox.label + 0];
int c2 = colormap[3 * bbox.label + 1];
int c3 = colormap[3 * bbox.label + 2];
cv::Scalar color = cv::Scalar(c1, c2, c3);
// cv::Scalar color = cv::Scalar(0, 0, 255);
cv::rectangle(image, cv::Rect(cv::Point(bbox.x1, bbox.y1),
cv::Point(bbox.x2, bbox.y2)),
color, 1);
char text[256];
sprintf(text, "%s %.1f%%", class_names[bbox.label], bbox.score * 100);
int baseLine = 0;
cv::Size label_size =
cv::getTextSize(text, cv::FONT_HERSHEY_SIMPLEX, 0.4, 1, &baseLine);
int x = bbox.x1;
int y = bbox.y1 - label_size.height - baseLine;
if (y < 0)
y = 0;
if (x + label_size.width > image.cols)
x = image.cols - label_size.width;
cv::rectangle(image, cv::Rect(cv::Point(x, y),
cv::Size(label_size.width,
label_size.height + baseLine)),
color, -1);
cv::putText(image, text, cv::Point(x, y + label_size.height),
cv::FONT_HERSHEY_SIMPLEX, 0.4, cv::Scalar(255, 255, 255), 1);
}
if (save_path == "None") {
cv::imshow("image", image);
} else {
cv::imwrite(save_path, image);
    std::cout << "Result saved in: " << save_path << std::endl;
}
} }
int video_demo(PicoDet& detector, const char* path) int image_demo(PicoDet &detector, const char *imagepath,
{ int has_postprocess = 0) {
cv::Mat image; std::vector<cv::String> filenames;
cv::VideoCapture cap(path); cv::glob(imagepath, filenames, false);
bool is_postprocess = has_postprocess > 0 ? true : false;
while (true) for (auto img_name : filenames) {
{ cv::Mat image = cv::imread(img_name, cv::IMREAD_COLOR);
cap >> image; if (image.empty()) {
object_rect effect_roi; fprintf(stderr, "cv::imread %s failed\n", img_name.c_str());
cv::Mat resized_img; return -1;
resize_uniform(image, resized_img, cv::Size(320, 320), effect_roi);
auto results = detector.detect(resized_img, 0.4, 0.5);
draw_bboxes(image, results, effect_roi);
cv::waitKey(1);
} }
return 0; std::vector<BoxInfo> results;
detector.detect(image, results, is_postprocess);
std::cout << "detect done." << std::endl;
#ifdef __SAVE_RESULT__
std::string save_path = img_name;
draw_bboxes(image, results, save_path.replace(3, 4, "results"));
#else
draw_bboxes(image, results);
cv::waitKey(0);
#endif
}
return 0;
} }
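Review note: `save_path.replace(3, 4, "results")` overwrites four characters starting at index 3, which turns `../imgs/dog.jpg` into `../results/dog.jpg` but depends on the input path having exactly that prefix. A sketch of a less position-dependent alternative follows. `MakeResultPath` is a hypothetical helper, and the result directory is passed in rather than inferred.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: keep only the file name of img_path and place it
// under result_dir. Handles both '/' and '\\' separators.
std::string MakeResultPath(const std::string &img_path,
                           const std::string &result_dir) {
  assert(!img_path.empty());
  const std::size_t slash = img_path.find_last_of("/\\");
  const std::string file_name =
      (slash == std::string::npos) ? img_path : img_path.substr(slash + 1);
  return result_dir + "/" + file_name;
}
```

With this shape, `image_demo` could call `MakeResultPath(img_name, "../results")` regardless of where the input images live.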
int benchmark(PicoDet& detector) int benchmark(PicoDet &detector, int width, int height,
{ int has_postprocess = 0) {
int loop_num = 100; int loop_num = 100;
int warm_up = 8; int warm_up = 8;
double time_min = DBL_MAX; double time_min = DBL_MAX;
double time_max = -DBL_MAX; double time_max = -DBL_MAX;
double time_avg = 0; double time_avg = 0;
    ncnn::Mat input = ncnn::Mat(320, 320, 3); cv::Mat image(height, width, CV_8UC3, cv::Scalar(1, 1, 1)); // cv::Mat takes (rows, cols)
input.fill(0.01f); bool is_postprocess = has_postprocess > 0 ? true : false;
for (int i = 0; i < warm_up + loop_num; i++) for (int i = 0; i < warm_up + loop_num; i++) {
{ double start = ncnn::get_current_time();
double start = ncnn::get_current_time(); std::vector<BoxInfo> results;
ncnn::Extractor ex = detector.Net->create_extractor(); detector.detect(image, results, is_postprocess);
ex.input("image", input); // picodet double end = ncnn::get_current_time();
for (const auto& head_info : detector.heads_info)
{ double time = end - start;
ncnn::Mat dis_pred; if (i >= warm_up) {
ncnn::Mat cls_pred; time_min = (std::min)(time_min, time);
ex.extract(head_info.dis_layer.c_str(), dis_pred); time_max = (std::max)(time_max, time);
ex.extract(head_info.cls_layer.c_str(), cls_pred); time_avg += time;
}
double end = ncnn::get_current_time();
double time = end - start;
if (i >= warm_up)
{
time_min = (std::min)(time_min, time);
time_max = (std::max)(time_max, time);
time_avg += time;
}
} }
time_avg /= loop_num; }
fprintf(stderr, "%20s min = %7.2f max = %7.2f avg = %7.2f\n", "picodet", time_min, time_max, time_avg); time_avg /= loop_num;
return 0; fprintf(stderr, "%20s min = %7.2f max = %7.2f avg = %7.2f\n", "picodet",
time_min, time_max, time_avg);
return 0;
} }
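Review note: the benchmark now times the full `detector.detect` call, discards the first `warm_up` iterations, and reports min, max and average over `loop_num` runs. The same pattern can be sketched without ncnn by timing an arbitrary callable with `std::chrono`. `TimingStats` and `TimeLoop` are hypothetical names, and `std::chrono::steady_clock` stands in for `ncnn::get_current_time`.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <limits>

struct TimingStats {
  double min_ms;
  double max_ms;
  double avg_ms;
};

// Hypothetical helper: call fn warm_up + loop_num times and keep statistics
// for the last loop_num calls only.
TimingStats TimeLoop(const std::function<void()> &fn, int warm_up,
                     int loop_num) {
  assert(warm_up >= 0 && loop_num > 0);
  TimingStats stats{std::numeric_limits<double>::max(), 0.0, 0.0};
  for (int i = 0; i < warm_up + loop_num; ++i) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto end = std::chrono::steady_clock::now();
    if (i < warm_up) continue;  // warm-up runs are not counted
    const double ms =
        std::chrono::duration<double, std::milli>(end - start).count();
    stats.min_ms = std::min(stats.min_ms, ms);
    stats.max_ms = std::max(stats.max_ms, ms);
    stats.avg_ms += ms;
  }
  stats.avg_ms /= loop_num;
  return stats;
}
```

In the demo the callable would wrap `detector.detect(image, results, is_postprocess)`.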
int main(int argc, char **argv) {
int main(int argc, char** argv) int mode = atoi(argv[1]);
{ char *bin_model_path = argv[2];
if (argc != 3) char *param_model_path = argv[3];
{ int height = 320;
fprintf(stderr, "usage: %s [mode] [path]. \n For webcam mode=0, path is cam id; \n For image demo, mode=1, path=xxx/xxx/*.jpg; \n For video, mode=2; \n For benchmark, mode=3 path=0.\n", argv[0]); int width = 320;
        return -1; if (argc >= 6) {
} height = atoi(argv[4]);
PicoDet detector = PicoDet("../weight/picodet_m_416.param", "../weight/picodet_m_416.bin", true); width = atoi(argv[5]);
int mode = atoi(argv[1]); }
switch (mode) PicoDet detector =
{ PicoDet(param_model_path, bin_model_path, width, height, true, 0.45, 0.3);
case 0:{ if (mode == 1) {
int cam_id = atoi(argv[2]);
webcam_demo(detector, cam_id); benchmark(detector, width, height, atoi(argv[6]));
break; } else {
    } if (argc != 8) {
    case 1:{ std::cout << "Must set image file, such as ./picodet_demo 0 "
        const char* images = argv[2]; "../picodet_s_320_lcnet.bin ../picodet_s_320_lcnet.param "
        image_demo(detector, images); "320 320 img.jpg 0"
        break; << std::endl;
      return -1;
}
case 2:{
const char* path = argv[2];
video_demo(detector, path);
break;
}
case 3:{
benchmark(detector);
break;
}
default:{
fprintf(stderr, "usage: %s [mode] [path]. \n For webcam mode=0, path is cam id; \n For image demo, mode=1, path=xxx/xxx/*.jpg; \n For video, mode=2; \n For benchmark, mode=3 path=0.\n", argv[0]);
break;
}
} }
const char *images = argv[6];
image_demo(detector, images, atoi(argv[7]));
}
} }
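Review note: the new `main` reads `argv[4]` through `argv[7]`, so each index should be covered by an `argc` check before it is used. The sketch below shows one way to validate the two command lines from the README (eight tokens for image mode, seven for benchmark mode). `DemoArgs` and `ParseArgs` are hypothetical names, and the height-then-width order follows the assignments in `main`.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical container for the README command lines:
//   image:     picodet_demo 0 <bin> <param> <height> <width> <image> <post>
//   benchmark: picodet_demo 1 <bin> <param> <height> <width> <post>
struct DemoArgs {
  int mode = 0;
  std::string bin_path;
  std::string param_path;
  std::string image_path;
  int height = 320;
  int width = 320;
  bool has_postprocess = false;
};

// Returns false (leaving *out untouched) when the argument count does not
// match the selected mode.
bool ParseArgs(int argc, const char *const *argv, DemoArgs *out) {
  assert(argv != nullptr && out != nullptr);
  if (argc < 2) return false;
  const int mode = std::atoi(argv[1]);
  const int expected_argc = (mode == 1) ? 7 : 8;
  if (argc != expected_argc) return false;
  out->mode = mode;
  out->bin_path = argv[2];
  out->param_path = argv[3];
  out->height = std::atoi(argv[4]);
  out->width = std::atoi(argv[5]);
  if (mode == 1) {
    out->has_postprocess = std::atoi(argv[6]) > 0;
  } else {
    out->image_path = argv[6];
    out->has_postprocess = std::atoi(argv[7]) > 0;
  }
  return true;
}
```

A caller would print the usage string and return a non-zero exit code when `ParseArgs` returns false.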
...@@ -48,7 +48,9 @@ int activation_function_softmax(const _Tp *src, _Tp *dst, int length) { ...@@ -48,7 +48,9 @@ int activation_function_softmax(const _Tp *src, _Tp *dst, int length) {
bool PicoDet::hasGPU = false; bool PicoDet::hasGPU = false;
PicoDet *PicoDet::detector = nullptr; PicoDet *PicoDet::detector = nullptr;
PicoDet::PicoDet(const char *param, const char *bin, bool useGPU) { PicoDet::PicoDet(const char *param, const char *bin, int input_width,
int input_hight, bool useGPU, float score_threshold_ = 0.5,
float nms_threshold_ = 0.3) {
this->Net = new ncnn::Net(); this->Net = new ncnn::Net();
#if NCNN_VULKAN #if NCNN_VULKAN
this->hasGPU = ncnn::get_gpu_count() > 0; this->hasGPU = ncnn::get_gpu_count() > 0;
...@@ -57,21 +59,28 @@ PicoDet::PicoDet(const char *param, const char *bin, bool useGPU) { ...@@ -57,21 +59,28 @@ PicoDet::PicoDet(const char *param, const char *bin, bool useGPU) {
this->Net->opt.use_fp16_arithmetic = true; this->Net->opt.use_fp16_arithmetic = true;
this->Net->load_param(param); this->Net->load_param(param);
this->Net->load_model(bin); this->Net->load_model(bin);
this->in_w = input_width;
this->in_h = input_hight;
this->score_threshold = score_threshold_;
this->nms_threshold = nms_threshold_;
} }
PicoDet::~PicoDet() { delete this->Net; } PicoDet::~PicoDet() { delete this->Net; }
void PicoDet::preprocess(cv::Mat &image, ncnn::Mat &in) { void PicoDet::preprocess(cv::Mat &image, ncnn::Mat &in) {
// cv::resize(image, image, cv::Size(this->in_w, this->in_h), 0.f, 0.f);
int img_w = image.cols; int img_w = image.cols;
int img_h = image.rows; int img_h = image.rows;
in = ncnn::Mat::from_pixels(image.data, ncnn::Mat::PIXEL_BGR, img_w, img_h); in = ncnn::Mat::from_pixels_resize(image.data, ncnn::Mat::PIXEL_BGR, img_w,
img_h, this->in_w, this->in_h);
const float mean_vals[3] = {103.53f, 116.28f, 123.675f}; const float mean_vals[3] = {103.53f, 116.28f, 123.675f};
const float norm_vals[3] = {0.017429f, 0.017507f, 0.017125f}; const float norm_vals[3] = {0.017429f, 0.017507f, 0.017125f};
in.substract_mean_normalize(mean_vals, norm_vals); in.substract_mean_normalize(mean_vals, norm_vals);
} }
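Review note: `substract_mean_normalize` computes `(pixel - mean) * norm` per channel. The constants here appear to be the ImageNet statistics scaled to the 0..255 range in BGR order, with each `norm_vals` entry being the reciprocal of a standard deviation (for example 1 / 57.375 ≈ 0.017429). That reading is an observation about the numbers and is not stated in the source. A scalar sketch, with the hypothetical name `NormalizePixel`:

```cpp
#include <cassert>

// Values copied from PicoDet::preprocess (BGR channel order).
const float kMeanVals[3] = {103.53f, 116.28f, 123.675f};
const float kNormVals[3] = {0.017429f, 0.017507f, 0.017125f};

// Hypothetical scalar version of the per-pixel operation: subtract the
// channel mean, then multiply by the channel norm factor.
float NormalizePixel(float pixel, int channel) {
  assert(channel >= 0 && channel < 3);
  return (pixel - kMeanVals[channel]) * kNormVals[channel];
}
```

A pixel equal to the channel mean maps to 0, and a pixel one standard deviation above it maps to roughly 1.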
std::vector<BoxInfo> PicoDet::detect(cv::Mat image, float score_threshold, int PicoDet::detect(cv::Mat image, std::vector<BoxInfo> &result_list,
float nms_threshold) { bool has_postprocess) {
ncnn::Mat input; ncnn::Mat input;
preprocess(image, input); preprocess(image, input);
auto ex = this->Net->create_extractor(); auto ex = this->Net->create_extractor();
...@@ -82,34 +91,76 @@ std::vector<BoxInfo> PicoDet::detect(cv::Mat image, float score_threshold, ...@@ -82,34 +91,76 @@ std::vector<BoxInfo> PicoDet::detect(cv::Mat image, float score_threshold,
#endif #endif
ex.input("image", input); // picodet ex.input("image", input); // picodet
this->image_h = image.rows;
this->image_w = image.cols;
std::vector<std::vector<BoxInfo>> results; std::vector<std::vector<BoxInfo>> results;
results.resize(this->num_class); results.resize(this->num_class);
for (const auto &head_info : this->heads_info) { if (has_postprocess) {
ncnn::Mat dis_pred; ncnn::Mat dis_pred;
ncnn::Mat cls_pred; ncnn::Mat cls_pred;
ex.extract(head_info.dis_layer.c_str(), dis_pred); ex.extract(this->nms_heads_info[0].c_str(), dis_pred);
ex.extract(head_info.cls_layer.c_str(), cls_pred); ex.extract(this->nms_heads_info[1].c_str(), cls_pred);
this->decode_infer(cls_pred, dis_pred, head_info.stride, score_threshold, std::cout << dis_pred.h << " " << dis_pred.w << std::endl;
results); std::cout << cls_pred.h << " " << cls_pred.w << std::endl;
this->nms_boxes(cls_pred, dis_pred, this->score_threshold, results);
} else {
for (const auto &head_info : this->non_postprocess_heads_info) {
ncnn::Mat dis_pred;
ncnn::Mat cls_pred;
ex.extract(head_info.dis_layer.c_str(), dis_pred);
ex.extract(head_info.cls_layer.c_str(), cls_pred);
this->decode_infer(cls_pred, dis_pred, head_info.stride,
this->score_threshold, results);
}
} }
std::vector<BoxInfo> dets;
for (int i = 0; i < (int)results.size(); i++) { for (int i = 0; i < (int)results.size(); i++) {
this->nms(results[i], nms_threshold); this->nms(results[i], this->nms_threshold);
for (auto box : results[i]) { for (auto box : results[i]) {
dets.push_back(box); box.x1 = box.x1 / this->in_w * this->image_w;
box.x2 = box.x2 / this->in_w * this->image_w;
box.y1 = box.y1 / this->in_h * this->image_h;
box.y2 = box.y2 / this->in_h * this->image_h;
result_list.push_back(box);
}
}
return 0;
}
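Review note: because `preprocess` now resizes inside `from_pixels_resize`, `detect` has to map boxes from network-input coordinates back to the source image, which it does by scaling each axis independently. The sketch below isolates that step. `Box` and `RescaleToImage` are hypothetical names, and the mapping assumes the image was stretched to `in_w` x `in_h` without padding.

```cpp
#include <cassert>

struct Box {
  float x1, y1, x2, y2;
};

// Hypothetical helper mirroring the scaling loop at the end of
// PicoDet::detect: x is scaled by image_w / in_w, y by image_h / in_h.
Box RescaleToImage(Box box, int in_w, int in_h, int image_w, int image_h) {
  assert(in_w > 0 && in_h > 0);
  box.x1 = box.x1 / in_w * image_w;
  box.x2 = box.x2 / in_w * image_w;
  box.y1 = box.y1 / in_h * image_h;
  box.y2 = box.y2 / in_h * image_h;
  return box;
}
```

If a letterbox resize were used instead, the padding offset would have to be subtracted before scaling, as the removed `effect_roi` logic did.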
void PicoDet::nms_boxes(ncnn::Mat &cls_pred, ncnn::Mat &dis_pred,
float score_threshold,
std::vector<std::vector<BoxInfo>> &result_list) {
BoxInfo bbox;
int i, j;
for (i = 0; i < dis_pred.h; i++) {
bbox.x1 = dis_pred.row(i)[0];
bbox.y1 = dis_pred.row(i)[1];
bbox.x2 = dis_pred.row(i)[2];
bbox.y2 = dis_pred.row(i)[3];
const float *scores = cls_pred.row(i);
float score = 0;
int cur_label = 0;
for (int label = 0; label < this->num_class; label++) {
float score_ = cls_pred.row(label)[i];
if (score_ > score) {
score = score_;
cur_label = label;
}
} }
bbox.score = score;
bbox.label = cur_label;
result_list[cur_label].push_back(bbox);
} }
return dets;
} }
void PicoDet::decode_infer(ncnn::Mat &cls_pred, ncnn::Mat &dis_pred, int stride, void PicoDet::decode_infer(ncnn::Mat &cls_pred, ncnn::Mat &dis_pred, int stride,
float threshold, float threshold,
std::vector<std::vector<BoxInfo>> &results) { std::vector<std::vector<BoxInfo>> &results) {
  int feature_h = ceil((float)this->input_size[1] / stride); int feature_h = ceil((float)this->in_h / stride);
  int feature_w = ceil((float)this->input_size[0] / stride); int feature_w = ceil((float)this->in_w / stride);
for (int idx = 0; idx < feature_h * feature_w; idx++) { for (int idx = 0; idx < feature_h * feature_w; idx++) {
const float *scores = cls_pred.row(idx); const float *scores = cls_pred.row(idx);
...@@ -151,8 +202,8 @@ BoxInfo PicoDet::disPred2Bbox(const float *&dfl_det, int label, float score, ...@@ -151,8 +202,8 @@ BoxInfo PicoDet::disPred2Bbox(const float *&dfl_det, int label, float score,
} }
float xmin = (std::max)(ct_x - dis_pred[0], .0f); float xmin = (std::max)(ct_x - dis_pred[0], .0f);
float ymin = (std::max)(ct_y - dis_pred[1], .0f); float ymin = (std::max)(ct_y - dis_pred[1], .0f);
float xmax = (std::min)(ct_x + dis_pred[2], (float)this->input_size[0]); float xmax = (std::min)(ct_x + dis_pred[2], (float)this->in_w);
  float ymax = (std::min)(ct_y + dis_pred[3], (float)this->input_size[1]); float ymax = (std::min)(ct_y + dis_pred[3], (float)this->in_h);
return BoxInfo{xmin, ymin, xmax, ymax, score, label}; return BoxInfo{xmin, ymin, xmax, ymax, score, label};
} }
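Review note: the body of `disPred2Bbox` is elided in this hunk. Assuming it follows the NanoDet-style decoding that the file header credits, each of the four box sides is predicted as `reg_max + 1` logits, turned into a distribution with softmax, and decoded as the expected bin index multiplied by the stride. The sketch below shows that computation for one side. `DecodeSide` is a hypothetical name and this is not the elided code itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: decode one box side from reg_max + 1 logits.
// Returns the softmax-weighted mean bin index scaled by the stride.
float DecodeSide(const float *logits, int reg_max, int stride) {
  assert(logits != nullptr && reg_max >= 0);
  const int bins = reg_max + 1;
  // Subtract the largest logit first so exp() cannot overflow.
  const float max_logit = *std::max_element(logits, logits + bins);
  std::vector<float> prob(bins);
  float sum = 0.f;
  for (int i = 0; i < bins; ++i) {
    prob[i] = std::exp(logits[i] - max_logit);
    sum += prob[i];
  }
  float expectation = 0.f;
  for (int i = 0; i < bins; ++i) {
    expectation += static_cast<float>(i) * prob[i] / sum;
  }
  return expectation * static_cast<float>(stride);
}
```

With `reg_max = 7`, a side can extend at most 7 strides from the anchor center, which is why the four strides 8, 16, 32 and 64 cover different object sizes.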
......
...@@ -16,66 +16,72 @@ ...@@ -16,66 +16,72 @@
#ifndef PICODET_H #ifndef PICODET_H
#define PICODET_H #define PICODET_H
#include <opencv2/core/core.hpp>
#include <net.h> #include <net.h>
#include <opencv2/core/core.hpp>
typedef struct HeadInfo typedef struct NonPostProcessHeadInfo {
{ std::string cls_layer;
std::string cls_layer; std::string dis_layer;
std::string dis_layer; int stride;
int stride; } NonPostProcessHeadInfo;
};
typedef struct BoxInfo typedef struct BoxInfo {
{ float x1;
float x1; float y1;
float y1; float x2;
float x2; float y2;
float y2; float score;
float score; int label;
int label;
} BoxInfo; } BoxInfo;
class PicoDet class PicoDet {
{
public: public:
PicoDet(const char* param, const char* bin, bool useGPU); PicoDet(const char *param, const char *bin, int input_width, int input_hight,
bool useGPU, float score_threshold_, float nms_threshold_);
~PicoDet();
static PicoDet* detector; ~PicoDet();
ncnn::Net* Net;
static bool hasGPU;
std::vector<HeadInfo> heads_info{ static PicoDet *detector;
// cls_pred|dis_pred|stride ncnn::Net *Net;
{"save_infer_model/scale_0.tmp_1", "save_infer_model/scale_4.tmp_1", 8}, static bool hasGPU;
{"save_infer_model/scale_1.tmp_1", "save_infer_model/scale_5.tmp_1", 16},
{"save_infer_model/scale_2.tmp_1", "save_infer_model/scale_6.tmp_1", 32},
{"save_infer_model/scale_3.tmp_1", "save_infer_model/scale_7.tmp_1", 64},
};
std::vector<BoxInfo> detect(cv::Mat image, float score_threshold, float nms_threshold); int detect(cv::Mat image, std::vector<BoxInfo> &result_list,
bool has_postprocess);
std::vector<std::string> labels{ "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light",
"fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow",
"elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
"skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard",
"tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
"sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
"potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone",
"microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
"hair drier", "toothbrush" };
private: private:
void preprocess(cv::Mat& image, ncnn::Mat& in); void preprocess(cv::Mat &image, ncnn::Mat &in);
void decode_infer(ncnn::Mat& cls_pred, ncnn::Mat& dis_pred, int stride, float threshold, std::vector<std::vector<BoxInfo>>& results); void decode_infer(ncnn::Mat &cls_pred, ncnn::Mat &dis_pred, int stride,
BoxInfo disPred2Bbox(const float*& dfl_det, int label, float score, int x, int y, int stride); float threshold,
static void nms(std::vector<BoxInfo>& result, float nms_threshold); std::vector<std::vector<BoxInfo>> &results);
int input_size[2] = {320, 320}; BoxInfo disPred2Bbox(const float *&dfl_det, int label, float score, int x,
int num_class = 80; int y, int stride);
int reg_max = 7; static void nms(std::vector<BoxInfo> &result, float nms_threshold);
void nms_boxes(ncnn::Mat &cls_pred, ncnn::Mat &dis_pred,
float score_threshold,
std::vector<std::vector<BoxInfo>> &result_list);
}; int image_w;
int image_h;
int in_w = 320;
int in_h = 320;
int num_class = 80;
int reg_max = 7;
float score_threshold;
float nms_threshold;
std::vector<float> bbox_output_data_;
std::vector<float> class_output_data_;
std::vector<std::string> nms_heads_info{"tmp_16", "concat_4.tmp_0"};
  // If the model was exported without post-processing, non_postprocess_heads_info is used
std::vector<NonPostProcessHeadInfo> non_postprocess_heads_info{
// cls_pred|dis_pred|stride
{"transpose_0.tmp_0", "transpose_1.tmp_0", 8},
{"transpose_2.tmp_0", "transpose_3.tmp_0", 16},
{"transpose_4.tmp_0", "transpose_5.tmp_0", 32},
{"transpose_6.tmp_0", "transpose_7.tmp_0", 64},
};
};
#endif #endif