Unverified commit a6189b12 authored by Guanghua Yu, committed by GitHub

update picodet ncnn and mnn demo (#5721)

Parent 23f982e0
......@@ -226,11 +226,16 @@ paddle2onnx --model_dir output_inference/picodet_s_320_coco_lcnet/ \
### Deploy
- OpenVINO demo [Python](../../deploy/third_engine/demo_openvino/python)
- [PaddleLite C++ demo](../../deploy/lite)
- [Android demo(Paddle Lite)](https://github.com/PaddlePaddle/Paddle-Lite-Demo/tree/develop/object_detection/android/app/cxx/picodet_detection_demo)
- ONNXRuntime demo [Python](../../deploy/third_engine/demo_onnxruntime)
- PaddleInference demo [Python](../../deploy/python) & [C++](../../deploy/cpp)
| Infer Engine | Python | C++ | Predict with Postprocess |
| :-------- | :--------: | :---------------------: | :----------------: |
| OpenVINO | [Python](../../deploy/third_engine/demo_openvino/python) | [C++](../../deploy/third_engine/demo_openvino) (postprocess in development) | ✔︎ |
| Paddle Lite | - | [C++](../../deploy/lite) | ✔︎ |
| Android Demo | - | [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite-Demo/tree/develop/object_detection/android/app/cxx/picodet_detection_demo) | ✔︎ |
| PaddleInference | [Python](../../deploy/python) | [C++](../../deploy/cpp) | ✔︎ |
| ONNXRuntime | [Python](../../deploy/third_engine/demo_onnxruntime) | Coming soon | ✔︎ |
| NCNN | Coming soon | [C++](../../deploy/third_engine/demo_ncnn) | ✘ |
| MNN | Coming soon | [C++](../../deploy/third_engine/demo_mnn) | ✘ |
Android demo visualization:
......
......@@ -222,11 +222,15 @@ paddle2onnx --model_dir output_inference/picodet_s_320_coco_lcnet/ \
### Deploy
- OpenVINO demo [Python](../../deploy/third_engine/demo_openvino/python)
- [PaddleLite C++ demo](../../deploy/lite)
- [Android demo(Paddle Lite)](https://github.com/PaddlePaddle/Paddle-Lite-Demo/tree/develop/object_detection/android/app/cxx/picodet_detection_demo)
- ONNXRuntime demo [Python](../../deploy/third_engine/demo_onnxruntime)
- PaddleInference demo [Python](../../deploy/python) & [C++](../../deploy/cpp)
| Infer Engine | Python | C++ | Predict With Postprocess |
| :-------- | :--------: | :---------------------: | :----------------: |
| OpenVINO | [Python](../../deploy/third_engine/demo_openvino/python) | [C++](../../deploy/third_engine/demo_openvino) (postprocess coming soon) | ✔︎ |
| Paddle Lite | - | [C++](../../deploy/lite) | ✔︎ |
| Android Demo | - | [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite-Demo/tree/develop/object_detection/android/app/cxx/picodet_detection_demo) | ✔︎ |
| PaddleInference | [Python](../../deploy/python) | [C++](../../deploy/cpp) | ✔︎ |
| ONNXRuntime | [Python](../../deploy/third_engine/demo_onnxruntime) | Coming soon | ✔︎ |
| NCNN | Coming soon | [C++](../../deploy/third_engine/demo_ncnn) | ✘ |
| MNN | Coming soon | [C++](../../deploy/third_engine/demo_mnn) | ✘ |
Android demo visualization:
......
......@@ -2,13 +2,14 @@ cmake_minimum_required(VERSION 3.9)
project(picodet-mnn)
set(CMAKE_CXX_STANDARD 17)
set(MNN_DIR "./mnn")
# find_package(OpenCV REQUIRED PATHS "/work/dependence/opencv/opencv-3.4.3/build")
find_package(OpenCV REQUIRED)
include_directories(
/path/to/MNN/include/MNN
/path/to/MNN/include
.
${MNN_DIR}/include
${MNN_DIR}/include/MNN
${CMAKE_SOURCE_DIR}
)
link_directories(mnn/lib)
......
# PicoDet MNN Demo
This folder provides PicoDet inference code using
[Alibaba's MNN framework](https://github.com/alibaba/MNN). Most of the implementation in
this folder is the same as in *demo_ncnn*.
The prediction code in this demo runs inference with the [MNN](https://github.com/alibaba/MNN) inference engine.
## Install MNN
### Python library
Just run:
``` shell
pip install MNN
```
## C++ Demo
- Step 1: Build the MNN inference library following the [official MNN build documentation](https://www.yuque.com/mnn/en/build_linux).
- Step 2: Build or download the OpenCV library (see the OpenCV website). For convenience, if your environment is gcc 8.2 x86, you can directly download the following prebuilt library:
```shell
wget https://paddledet.bj.bcebos.com/data/opencv-3.4.16_gcc8.2_ffmpeg.tar.gz
tar -xf opencv-3.4.16_gcc8.2_ffmpeg.tar.gz
```
### C++ library
Please follow the [official document](https://www.yuque.com/mnn/en/build_linux) to build the MNN engine.
- Create picodet_m_416_coco.onnx
- Step 3: Prepare the model
```shell
modelName=picodet_m_416_coco
# export model
modelName=picodet_s_320_coco_lcnet
# export the inference model
python tools/export_model.py \
-c configs/picodet/${modelName}.yml \
-o weights=${modelName}.pdparams \
--output_dir=inference_model
# convert to onnx
# convert to ONNX
paddle2onnx --model_dir inference_model/${modelName} \
--model_filename model.pdmodel \
--params_filename model.pdiparams \
--opset_version 11 \
--save_file ${modelName}.onnx
# onnxsim
# simplify the model
python -m onnxsim ${modelName}.onnx ${modelName}_processed.onnx
# convert the model to MNN format
python -m MNN.tools.mnnconvert -f ONNX --modelFile picodet_s_320_lcnet_processed.onnx --MNNModel picodet_s_320_lcnet.mnn
```
For a quick test, you can directly download [picodet_s_320_lcnet.mnn](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_lcnet.mnn) (without post-processing).
- Convert model
``` shell
python -m MNN.tools.mnnconvert -f ONNX --modelFile picodet-416.onnx --MNNModel picodet-416.mnn
```
Here is the converted model: [download link](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416.mnn).
**Note:** Because MNN's MatMul operator computes incorrectly when its input shapes are inconsistent, the demo with post-processing is being upgraded and will be released soon.
## Build
The Python script *demo_mnn.py* runs directly and independently of the main PicoDet repo.
`PicoDetONNX` and `PicoDetTorch` are two classes used to check the similarity of MNN inference results
with the ONNX and PyTorch models. They can be removed with no side effects.
For the C++ code, replace `libMNN.so` under *./mnn/lib* with the one you just compiled, modify the OpenCV and MNN paths in the CMake file,
and run:
## Build the executable
- Step 1: Import the library files
```
mkdir mnn && cd mnn && mkdir lib
cp /path/to/MNN/build/libMNN.so .
cd ..
cp -r /path/to/MNN/include .
```
- Step 2: Modify the OpenCV and MNN paths in `CMakeLists.txt`
- Step 3: Build
``` shell
mkdir build && cd build
cmake ..
make
```
If the `picodet-mnn` executable is generated under the `build` directory, the build succeeded.
Note that a flag in `main.cpp` is used to control whether to show the detection result or save it into a folder.
``` c++
#define __SAVE_RESULT__ // if defined, save drawn results to ../results; else show them in a window
```
## Run
### Python
`demo_mnn.py` provides an inference class `PicoDetMNN` that combines preprocessing, post-processing, and visualization.
It can also be used from the command line in the form:
## Run
First, create a directory to store the prediction results:
```shell
demo_mnn.py [-h] [--model_path MODEL_PATH] [--cfg_path CFG_PATH]
[--img_fold IMG_FOLD] [--result_fold RESULT_FOLD]
[--input_shape INPUT_SHAPE INPUT_SHAPE]
[--backend {MNN,ONNX,torch}]
cp -r ../demo_onnxruntime/imgs .
cd build
mkdir ../results
```
For example:
- Predict one image
``` shell
# run MNN 416 model
python ./demo_mnn.py --model_path ../model/picodet-416.mnn --img_fold ../imgs --result_fold ../results
# run MNN 320 model
python ./demo_mnn.py --model_path ../model/picodet-320.mnn --input_shape 320 320 --backend MNN
# run onnx model
python ./demo_mnn.py --model_path ../model/sim.onnx --backend ONNX
./picodet-mnn 0 ../picodet_s_320_lcnet_3.mnn 320 320 ../imgs/dog.jpg
```
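For reference, the positional-argument layout the demo binary expects (mode, model path, input height and width, then the image path) can be sketched in plain Python; `parse_args` is a hypothetical helper for illustration, not part of the demo:

```python
def parse_args(argv):
    # Mirrors the argument order used by main.cpp:
    # mode 0 = image demo, mode 1 = speed benchmark.
    args = {"mode": int(argv[1]), "model": argv[2],
            "height": 320, "width": 320, "image": None}
    if len(argv) >= 5:
        args["height"], args["width"] = int(argv[3]), int(argv[4])
    if len(argv) >= 6:
        args["image"] = argv[5]  # only the image demo uses this
    return args

print(parse_args(["./picodet-mnn", "0", "../picodet_s_320_lcnet.mnn",
                  "320", "320", "../imgs/dog.jpg"]))
```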
### C++
The C++ inference interface is the same as in the NCNN demo.
- Speed benchmark
``` shell
./picodet-mnn "1" "../imgs/test.jpg"
./picodet-mnn 1 ../picodet_s_320_lcnet.mnn 320 320
```
For speed benchmark:
``` shell
./picodet-mnn "3" "0"
```
## FAQ
- Incorrect prediction results:
First check that the model input shape matches, and that the model output names match. For the enhanced PicoDet model without post-processing, the output names are:
```shell
# cls branch | det branch
{"transpose_0.tmp_0", "transpose_1.tmp_0"},
{"transpose_2.tmp_0", "transpose_3.tmp_0"},
{"transpose_4.tmp_0", "transpose_5.tmp_0"},
{"transpose_6.tmp_0", "transpose_7.tmp_0"},
```
You can use [netron](https://netron.app) to inspect the exact names, and modify the corresponding `non_postprocess_heads_info` array in `picodet_mnn.hpp`.
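To double-check the head layout, here is a short plain-Python sketch (the stride-to-name mapping is taken from the list above; the 320×320 input size is an assumption) that counts how many grid cells the four heads predict over:

```python
# cls/det output name per FPN stride, for the model without post-processing.
non_postprocess_heads = {
    8:  ("transpose_0.tmp_0", "transpose_1.tmp_0"),
    16: ("transpose_2.tmp_0", "transpose_3.tmp_0"),
    32: ("transpose_4.tmp_0", "transpose_5.tmp_0"),
    64: ("transpose_6.tmp_0", "transpose_7.tmp_0"),
}

def num_priors(input_size, strides):
    # Each stride-s head predicts one box per cell of an (input_size//s)^2 grid.
    return sum((input_size // s) ** 2 for s in strides)

print(num_priors(320, non_postprocess_heads))  # 40^2 + 20^2 + 10^2 + 5^2 = 2125
```

If the names shown by netron differ from this mapping, inference still runs but decodes garbage, which is why aligning the names is the first thing to verify.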
## Reference
[MNN](https://github.com/alibaba/MNN)
......@@ -11,7 +11,6 @@
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// reference from https://github.com/RangiLyu/nanodet/tree/main/demo_mnn
#include "picodet_mnn.hpp"
#include <iostream>
......@@ -19,354 +18,186 @@
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#define __SAVE_RESULT__ // if defined save drawn results to ../results, else show them in a window
#define __SAVE_RESULT__ // if defined save drawn results to ../results, else
// show them in a window
struct object_rect {
int x;
int y;
int width;
int height;
int x;
int y;
int width;
int height;
};
int resize_uniform(cv::Mat& src, cv::Mat& dst, cv::Size dst_size, object_rect& effect_area)
{
int w = src.cols;
int h = src.rows;
int dst_w = dst_size.width;
int dst_h = dst_size.height;
dst = cv::Mat(cv::Size(dst_w, dst_h), CV_8UC3, cv::Scalar(0));
float ratio_src = w * 1.0 / h;
float ratio_dst = dst_w * 1.0 / dst_h;
int tmp_w = 0;
int tmp_h = 0;
if (ratio_src > ratio_dst) {
tmp_w = dst_w;
tmp_h = floor((dst_w * 1.0 / w) * h);
}
else if (ratio_src < ratio_dst) {
tmp_h = dst_h;
tmp_w = floor((dst_h * 1.0 / h) * w);
}
else {
cv::resize(src, dst, dst_size);
effect_area.x = 0;
effect_area.y = 0;
effect_area.width = dst_w;
effect_area.height = dst_h;
return 0;
}
cv::Mat tmp;
cv::resize(src, tmp, cv::Size(tmp_w, tmp_h));
if (tmp_w != dst_w) {
int index_w = floor((dst_w - tmp_w) / 2.0);
for (int i = 0; i < dst_h; i++) {
memcpy(dst.data + i * dst_w * 3 + index_w * 3, tmp.data + i * tmp_w * 3, tmp_w * 3);
}
effect_area.x = index_w;
effect_area.y = 0;
effect_area.width = tmp_w;
effect_area.height = tmp_h;
}
else if (tmp_h != dst_h) {
int index_h = floor((dst_h - tmp_h) / 2.0);
memcpy(dst.data + index_h * dst_w * 3, tmp.data, tmp_w * tmp_h * 3);
effect_area.x = 0;
effect_area.y = index_h;
effect_area.width = tmp_w;
effect_area.height = tmp_h;
}
else {
printf("error\n");
std::vector<int> GenerateColorMap(int num_class) {
auto colormap = std::vector<int>(3 * num_class, 0);
for (int i = 0; i < num_class; ++i) {
int j = 0;
int lab = i;
while (lab) {
colormap[i * 3] |= (((lab >> 0) & 1) << (7 - j));
colormap[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j));
colormap[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j));
++j;
lab >>= 3;
}
return 0;
}
return colormap;
}
const int color_list[80][3] =
{
{216 , 82 , 24},
{236 ,176 , 31},
{125 , 46 ,141},
{118 ,171 , 47},
{ 76 ,189 ,237},
{238 , 19 , 46},
{ 76 , 76 , 76},
{153 ,153 ,153},
{255 , 0 , 0},
{255 ,127 , 0},
{190 ,190 , 0},
{ 0 ,255 , 0},
{ 0 , 0 ,255},
{170 , 0 ,255},
{ 84 , 84 , 0},
{ 84 ,170 , 0},
{ 84 ,255 , 0},
{170 , 84 , 0},
{170 ,170 , 0},
{170 ,255 , 0},
{255 , 84 , 0},
{255 ,170 , 0},
{255 ,255 , 0},
{ 0 , 84 ,127},
{ 0 ,170 ,127},
{ 0 ,255 ,127},
{ 84 , 0 ,127},
{ 84 , 84 ,127},
{ 84 ,170 ,127},
{ 84 ,255 ,127},
{170 , 0 ,127},
{170 , 84 ,127},
{170 ,170 ,127},
{170 ,255 ,127},
{255 , 0 ,127},
{255 , 84 ,127},
{255 ,170 ,127},
{255 ,255 ,127},
{ 0 , 84 ,255},
{ 0 ,170 ,255},
{ 0 ,255 ,255},
{ 84 , 0 ,255},
{ 84 , 84 ,255},
{ 84 ,170 ,255},
{ 84 ,255 ,255},
{170 , 0 ,255},
{170 , 84 ,255},
{170 ,170 ,255},
{170 ,255 ,255},
{255 , 0 ,255},
{255 , 84 ,255},
{255 ,170 ,255},
{ 42 , 0 , 0},
{ 84 , 0 , 0},
{127 , 0 , 0},
{170 , 0 , 0},
{212 , 0 , 0},
{255 , 0 , 0},
{ 0 , 42 , 0},
{ 0 , 84 , 0},
{ 0 ,127 , 0},
{ 0 ,170 , 0},
{ 0 ,212 , 0},
{ 0 ,255 , 0},
{ 0 , 0 , 42},
{ 0 , 0 , 84},
{ 0 , 0 ,127},
{ 0 , 0 ,170},
{ 0 , 0 ,212},
{ 0 , 0 ,255},
{ 0 , 0 , 0},
{ 36 , 36 , 36},
{ 72 , 72 , 72},
{109 ,109 ,109},
{145 ,145 ,145},
{182 ,182 ,182},
{218 ,218 ,218},
{ 0 ,113 ,188},
{ 80 ,182 ,188},
{127 ,127 , 0},
};
void draw_bboxes(const cv::Mat& bgr, const std::vector<BoxInfo>& bboxes, object_rect effect_roi, std::string save_path="None")
{
static const char* class_names[] = { "person", "bicycle", "car", "motorcycle", "airplane", "bus",
"train", "truck", "boat", "traffic light", "fire hydrant",
"stop sign", "parking meter", "bench", "bird", "cat", "dog",
"horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe",
"backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
"skis", "snowboard", "sports ball", "kite", "baseball bat",
"baseball glove", "skateboard", "surfboard", "tennis racket",
"bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl",
"banana", "apple", "sandwich", "orange", "broccoli", "carrot",
"hot dog", "pizza", "donut", "cake", "chair", "couch",
"potted plant", "bed", "dining table", "toilet", "tv", "laptop",
"mouse", "remote", "keyboard", "cell phone", "microwave", "oven",
"toaster", "sink", "refrigerator", "book", "clock", "vase",
"scissors", "teddy bear", "hair drier", "toothbrush"
};
cv::Mat image = bgr.clone();
int src_w = image.cols;
int src_h = image.rows;
int dst_w = effect_roi.width;
int dst_h = effect_roi.height;
float width_ratio = (float)src_w / (float)dst_w;
float height_ratio = (float)src_h / (float)dst_h;
for (size_t i = 0; i < bboxes.size(); i++)
{
const BoxInfo& bbox = bboxes[i];
cv::Scalar color = cv::Scalar(color_list[bbox.label][0], color_list[bbox.label][1], color_list[bbox.label][2]);
cv::rectangle(image, cv::Rect(cv::Point((bbox.x1 - effect_roi.x) * width_ratio, (bbox.y1 - effect_roi.y) * height_ratio),
cv::Point((bbox.x2 - effect_roi.x) * width_ratio, (bbox.y2 - effect_roi.y) * height_ratio)), color);
char text[256];
sprintf(text, "%s %.1f%%", class_names[bbox.label], bbox.score * 100);
int baseLine = 0;
cv::Size label_size = cv::getTextSize(text, cv::FONT_HERSHEY_SIMPLEX, 0.4, 1, &baseLine);
int x = (bbox.x1 - effect_roi.x) * width_ratio;
int y = (bbox.y1 - effect_roi.y) * height_ratio - label_size.height - baseLine;
if (y < 0)
y = 0;
if (x + label_size.width > image.cols)
x = image.cols - label_size.width;
cv::rectangle(image, cv::Rect(cv::Point(x, y), cv::Size(label_size.width, label_size.height + baseLine)),
color, -1);
cv::putText(image, text, cv::Point(x, y + label_size.height),
cv::FONT_HERSHEY_SIMPLEX, 0.4, cv::Scalar(255, 255, 255));
}
if (save_path == "None")
{
cv::imshow("image", image);
}
else
{
cv::imwrite(save_path, image);
std::cout << save_path << std::endl;
}
}
int image_demo(PicoDet &detector, const char* imagepath)
{
std::vector<cv::String> filenames;
cv::glob(imagepath, filenames, false);
for (auto img_name : filenames)
{
cv::Mat image = cv::imread(img_name);
if (image.empty())
{
fprintf(stderr, "cv::imread %s failed\n", img_name.c_str());
return -1;
}
object_rect effect_roi;
cv::Mat resized_img;
resize_uniform(image, resized_img, cv::Size(320, 320), effect_roi);
std::vector<BoxInfo> results;
detector.detect(resized_img, results);
#ifdef __SAVE_RESULT__
std::string save_path = img_name;
draw_bboxes(image, results, effect_roi, save_path.replace(3, 4, "results"));
#else
draw_bboxes(image, results, effect_roi);
cv::waitKey(0);
#endif
}
return 0;
void draw_bboxes(const cv::Mat &im, const std::vector<BoxInfo> &bboxes,
std::string save_path = "None") {
static const char *class_names[] = {
"person", "bicycle", "car",
"motorcycle", "airplane", "bus",
"train", "truck", "boat",
"traffic light", "fire hydrant", "stop sign",
"parking meter", "bench", "bird",
"cat", "dog", "horse",
"sheep", "cow", "elephant",
"bear", "zebra", "giraffe",
"backpack", "umbrella", "handbag",
"tie", "suitcase", "frisbee",
"skis", "snowboard", "sports ball",
"kite", "baseball bat", "baseball glove",
"skateboard", "surfboard", "tennis racket",
"bottle", "wine glass", "cup",
"fork", "knife", "spoon",
"bowl", "banana", "apple",
"sandwich", "orange", "broccoli",
"carrot", "hot dog", "pizza",
"donut", "cake", "chair",
"couch", "potted plant", "bed",
"dining table", "toilet", "tv",
"laptop", "mouse", "remote",
"keyboard", "cell phone", "microwave",
"oven", "toaster", "sink",
"refrigerator", "book", "clock",
"vase", "scissors", "teddy bear",
"hair drier", "toothbrush"};
cv::Mat image = im.clone();
int src_w = image.cols;
int src_h = image.rows;
int thickness = 2;
auto colormap = GenerateColorMap(sizeof(class_names) / sizeof(class_names[0]));
for (size_t i = 0; i < bboxes.size(); i++) {
const BoxInfo &bbox = bboxes[i];
std::cout << bbox.x1 << ". " << bbox.y1 << ". " << bbox.x2 << ". "
<< bbox.y2 << ". " << std::endl;
int c1 = colormap[3 * bbox.label + 0];
int c2 = colormap[3 * bbox.label + 1];
int c3 = colormap[3 * bbox.label + 2];
cv::Scalar color = cv::Scalar(c1, c2, c3);
// cv::Scalar color = cv::Scalar(0, 0, 255);
cv::rectangle(image, cv::Rect(cv::Point(bbox.x1, bbox.y1),
cv::Point(bbox.x2, bbox.y2)),
color, 1, cv::LINE_AA);
char text[256];
sprintf(text, "%s %.1f%%", class_names[bbox.label], bbox.score * 100);
int baseLine = 0;
cv::Size label_size =
cv::getTextSize(text, cv::FONT_HERSHEY_SIMPLEX, 0.4, 1, &baseLine);
int x = bbox.x1;
int y = bbox.y1 - label_size.height - baseLine;
if (y < 0)
y = 0;
if (x + label_size.width > image.cols)
x = image.cols - label_size.width;
cv::rectangle(image, cv::Rect(cv::Point(x, y),
cv::Size(label_size.width,
label_size.height + baseLine)),
color, -1);
cv::putText(image, text, cv::Point(x, y + label_size.height),
cv::FONT_HERSHEY_SIMPLEX, 0.4, cv::Scalar(255, 255, 255), 1,
cv::LINE_AA);
}
if (save_path == "None") {
cv::imshow("image", image);
} else {
cv::imwrite(save_path, image);
std::cout << save_path << std::endl;
}
}
int webcam_demo(PicoDet& detector, int cam_id)
{
cv::Mat image;
cv::VideoCapture cap(cam_id);
int image_demo(PicoDet &detector, const char *imagepath) {
std::vector<cv::String> filenames;
cv::glob(imagepath, filenames, false);
while (true)
{
cap >> image;
object_rect effect_roi;
cv::Mat resized_img;
resize_uniform(image, resized_img, cv::Size(320, 320), effect_roi);
std::vector<BoxInfo> results;
detector.detect(resized_img, results);
draw_bboxes(image, results, effect_roi);
cv::waitKey(1);
for (auto img_name : filenames) {
cv::Mat image = cv::imread(img_name, cv::IMREAD_COLOR);
if (image.empty()) {
fprintf(stderr, "cv::imread %s failed\n", img_name.c_str());
return -1;
}
return 0;
std::vector<BoxInfo> results;
detector.detect(image, results, false);
std::cout << "detect done." << std::endl;
#ifdef __SAVE_RESULT__
std::string save_path = img_name;
draw_bboxes(image, results, save_path.replace(3, 4, "results"));
#else
draw_bboxes(image, results);
cv::waitKey(0);
#endif
}
return 0;
}
int video_demo(PicoDet& detector, const char* path)
{
cv::Mat image;
cv::VideoCapture cap(path);
while (true)
{
cap >> image;
object_rect effect_roi;
cv::Mat resized_img;
resize_uniform(image, resized_img, cv::Size(320, 320), effect_roi);
std::vector<BoxInfo> results;
detector.detect(resized_img, results);
draw_bboxes(image, results, effect_roi);
cv::waitKey(1);
int benchmark(PicoDet &detector, int width, int height) {
int loop_num = 100;
int warm_up = 8;
double time_min = DBL_MAX;
double time_max = -DBL_MAX;
double time_avg = 0;
cv::Mat image(width, height, CV_8UC3, cv::Scalar(1, 1, 1));
for (int i = 0; i < warm_up + loop_num; i++) {
auto start = std::chrono::steady_clock::now();
std::vector<BoxInfo> results;
detector.detect(image, results, false);
auto end = std::chrono::steady_clock::now();
std::chrono::duration<double> elapsed = end - start;
double time = elapsed.count();
if (i >= warm_up) {
time_min = (std::min)(time_min, time);
time_max = (std::max)(time_max, time);
time_avg += time;
}
return 0;
}
time_avg /= loop_num;
fprintf(stderr, "%20s min = %7.2f max = %7.2f avg = %7.2f\n", "picodet",
time_min, time_max, time_avg);
return 0;
}
int benchmark(PicoDet& detector)
{
int loop_num = 100;
int warm_up = 8;
double time_min = DBL_MAX;
double time_max = -DBL_MAX;
double time_avg = 0;
cv::Mat image(320, 320, CV_8UC3, cv::Scalar(1, 1, 1));
for (int i = 0; i < warm_up + loop_num; i++)
{
auto start = std::chrono::steady_clock::now();
std::vector<BoxInfo> results;
detector.detect(image, results);
auto end = std::chrono::steady_clock::now();
std::chrono::duration<double> elapsed = end - start;
double time = elapsed.count();
if (i >= warm_up)
{
time_min = (std::min)(time_min, time);
time_max = (std::max)(time_max, time);
time_avg += time;
}
}
time_avg /= loop_num;
fprintf(stderr, "%20s min = %7.2f max = %7.2f avg = %7.2f\n", "picodet", time_min, time_max, time_avg);
return 0;
}
int main(int argc, char** argv)
{
if (argc != 3)
{
fprintf(stderr, "usage: %s [mode] [path]. \n For webcam mode=0, path is cam id; \n For image demo, mode=1, path=xxx/xxx/*.jpg; \n For video, mode=2; \n For benchmark, mode=3 path=0.\n", argv[0]);
return -1;
}
PicoDet detector = PicoDet("../weight/picodet-416.mnn", 416, 416, 4, 0.45, 0.3);
int mode = atoi(argv[1]);
switch (mode)
{
case 0:{
int cam_id = atoi(argv[2]);
webcam_demo(detector, cam_id);
break;
}
case 1:{
const char* images = argv[2];
image_demo(detector, images);
break;
}
case 2:{
const char* path = argv[2];
video_demo(detector, path);
break;
}
case 3:{
benchmark(detector);
break;
}
default:{
fprintf(stderr, "usage: %s [mode] [path]. \n For webcam mode=0, path is cam id; \n For image demo, mode=1, path=xxx/xxx/*.jpg; \n For video, mode=2; \n For benchmark, mode=3 path=0.\n", argv[0]);
break;
}
int main(int argc, char **argv) {
int mode = atoi(argv[1]);
std::string model_path = argv[2];
int height = 320;
int width = 320;
if (argc >= 5) {
height = atoi(argv[3]);
width = atoi(argv[4]);
}
PicoDet detector = PicoDet(model_path, width, height, 4, 0.45, 0.3);
if (mode == 1) {
benchmark(detector, width, height);
} else {
if (argc != 6) {
std::cout << "Must set image file, such as ./picodet-mnn 0 "
"../picodet_s_320_lcnet.mnn 320 320 img.jpg"
<< std::endl;
return -1;
}
const char *images = argv[5];
image_demo(detector, images);
}
}
......@@ -44,7 +44,8 @@ PicoDet::~PicoDet() {
PicoDet_interpreter->releaseSession(PicoDet_session);
}
int PicoDet::detect(cv::Mat &raw_image, std::vector<BoxInfo> &result_list) {
int PicoDet::detect(cv::Mat &raw_image, std::vector<BoxInfo> &result_list,
bool has_postprocess) {
if (raw_image.empty()) {
std::cout << "image is empty ,please check!" << std::endl;
return -1;
......@@ -70,22 +71,57 @@ int PicoDet::detect(cv::Mat &raw_image, std::vector<BoxInfo> &result_list) {
std::vector<std::vector<BoxInfo>> results;
results.resize(num_class);
for (const auto &head_info : heads_info) {
MNN::Tensor *tensor_scores = PicoDet_interpreter->getSessionOutput(
PicoDet_session, head_info.cls_layer.c_str());
MNN::Tensor *tensor_boxes = PicoDet_interpreter->getSessionOutput(
PicoDet_session, head_info.dis_layer.c_str());
MNN::Tensor tensor_scores_host(tensor_scores,
tensor_scores->getDimensionType());
tensor_scores->copyToHostTensor(&tensor_scores_host);
MNN::Tensor tensor_boxes_host(tensor_boxes,
tensor_boxes->getDimensionType());
tensor_boxes->copyToHostTensor(&tensor_boxes_host);
decode_infer(&tensor_scores_host, &tensor_boxes_host, head_info.stride,
score_threshold, results);
if (has_postprocess) {
auto bbox_out_tensor = PicoDet_interpreter->getSessionOutput(
PicoDet_session, nms_heads_info[0].c_str());
auto class_out_tensor = PicoDet_interpreter->getSessionOutput(
PicoDet_session, nms_heads_info[1].c_str());
// bbox branch
auto tensor_bbox_host =
new MNN::Tensor(bbox_out_tensor, MNN::Tensor::CAFFE);
bbox_out_tensor->copyToHostTensor(tensor_bbox_host);
auto bbox_output_shape = tensor_bbox_host->shape();
int output_size = 1;
for (int j = 0; j < bbox_output_shape.size(); ++j) {
output_size *= bbox_output_shape[j];
}
std::cout << "output_size:" << output_size << std::endl;
bbox_output_data_.resize(output_size);
std::copy_n(tensor_bbox_host->host<float>(), output_size,
bbox_output_data_.data());
delete tensor_bbox_host;
// class branch
auto tensor_class_host =
new MNN::Tensor(class_out_tensor, MNN::Tensor::CAFFE);
class_out_tensor->copyToHostTensor(tensor_class_host);
auto class_output_shape = tensor_class_host->shape();
output_size = 1;
for (int j = 0; j < class_output_shape.size(); ++j) {
output_size *= class_output_shape[j];
}
std::cout << "output_size:" << output_size << std::endl;
class_output_data_.resize(output_size);
std::copy_n(tensor_class_host->host<float>(), output_size,
class_output_data_.data());
delete tensor_class_host;
} else {
for (const auto &head_info : non_postprocess_heads_info) {
MNN::Tensor *tensor_scores = PicoDet_interpreter->getSessionOutput(
PicoDet_session, head_info.cls_layer.c_str());
MNN::Tensor *tensor_boxes = PicoDet_interpreter->getSessionOutput(
PicoDet_session, head_info.dis_layer.c_str());
MNN::Tensor tensor_scores_host(tensor_scores,
tensor_scores->getDimensionType());
tensor_scores->copyToHostTensor(&tensor_scores_host);
MNN::Tensor tensor_boxes_host(tensor_boxes,
tensor_boxes->getDimensionType());
tensor_boxes->copyToHostTensor(&tensor_boxes_host);
decode_infer(&tensor_scores_host, &tensor_boxes_host, head_info.stride,
score_threshold, results);
}
}
auto end = chrono::steady_clock::now();
......@@ -188,8 +224,6 @@ void PicoDet::nms(std::vector<BoxInfo> &input_boxes, float NMS_THRESH) {
}
}
string PicoDet::get_label_str(int label) { return labels[label]; }
inline float fast_exp(float x) {
union {
uint32_t i;
......
......@@ -11,7 +11,6 @@
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// reference from https://github.com/RangiLyu/nanodet/tree/main/demo_mnn
#ifndef __PicoDet_H__
#define __PicoDet_H__
......@@ -20,90 +19,84 @@
#include "Interpreter.hpp"
#include "ImageProcess.hpp"
#include "MNNDefine.h"
#include "Tensor.hpp"
#include "ImageProcess.hpp"
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <chrono>
#include <iostream>
#include <memory>
#include <opencv2/opencv.hpp>
#include <string>
#include <vector>
#include <memory>
#include <chrono>
typedef struct HeadInfo_
{
std::string cls_layer;
std::string dis_layer;
int stride;
} HeadInfo;
typedef struct BoxInfo_
{
float x1;
float y1;
float x2;
float y2;
float score;
int label;
typedef struct NonPostProcessHeadInfo_ {
std::string cls_layer;
std::string dis_layer;
int stride;
} NonPostProcessHeadInfo;
typedef struct BoxInfo_ {
float x1;
float y1;
float x2;
float y2;
float score;
int label;
} BoxInfo;
class PicoDet {
public:
PicoDet(const std::string &mnn_path,
int input_width, int input_length, int num_thread_ = 4, float score_threshold_ = 0.5, float nms_threshold_ = 0.3);
PicoDet(const std::string &mnn_path, int input_width, int input_length,
int num_thread_ = 4, float score_threshold_ = 0.5,
float nms_threshold_ = 0.3);
~PicoDet();
~PicoDet();
int detect(cv::Mat &img, std::vector<BoxInfo> &result_list);
std::string get_label_str(int label);
int detect(cv::Mat &img, std::vector<BoxInfo> &result_list,
bool has_postprocess);
private:
void decode_infer(MNN::Tensor *cls_pred, MNN::Tensor *dis_pred, int stride, float threshold, std::vector<std::vector<BoxInfo>> &results);
BoxInfo disPred2Bbox(const float *&dfl_det, int label, float score, int x, int y, int stride);
void nms(std::vector<BoxInfo> &input_boxes, float NMS_THRESH);
void decode_infer(MNN::Tensor *cls_pred, MNN::Tensor *dis_pred, int stride,
float threshold,
std::vector<std::vector<BoxInfo>> &results);
BoxInfo disPred2Bbox(const float *&dfl_det, int label, float score, int x,
int y, int stride);
void nms(std::vector<BoxInfo> &input_boxes, float NMS_THRESH);
private:
std::shared_ptr<MNN::Interpreter> PicoDet_interpreter;
MNN::Session *PicoDet_session = nullptr;
MNN::Tensor *input_tensor = nullptr;
int num_thread;
int image_w;
int image_h;
int in_w = 320;
int in_h = 320;
float score_threshold;
float nms_threshold;
const float mean_vals[3] = { 103.53f, 116.28f, 123.675f };
const float norm_vals[3] = { 0.017429f, 0.017507f, 0.017125f };
const int num_class = 80;
const int reg_max = 7;
std::vector<HeadInfo> heads_info{
// cls_pred|dis_pred|stride
{"save_infer_model/scale_0.tmp_1", "save_infer_model/scale_4.tmp_1", 8},
{"save_infer_model/scale_1.tmp_1", "save_infer_model/scale_5.tmp_1", 16},
{"save_infer_model/scale_2.tmp_1", "save_infer_model/scale_6.tmp_1", 32},
{"save_infer_model/scale_3.tmp_1", "save_infer_model/scale_7.tmp_1", 64},
};
std::vector<std::string>
labels{"person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light",
"fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow",
"elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
"skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard",
"tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
"sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
"potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone",
"microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
"hair drier", "toothbrush"};
std::shared_ptr<MNN::Interpreter> PicoDet_interpreter;
MNN::Session *PicoDet_session = nullptr;
MNN::Tensor *input_tensor = nullptr;
int num_thread;
int image_w;
int image_h;
int in_w = 320;
int in_h = 320;
float score_threshold;
float nms_threshold;
const float mean_vals[3] = {103.53f, 116.28f, 123.675f};
const float norm_vals[3] = {0.017429f, 0.017507f, 0.017125f};
const int num_class = 80;
const int reg_max = 7;
std::vector<float> bbox_output_data_;
std::vector<float> class_output_data_;
std::vector<std::string> nms_heads_info{"tmp_16", "concat_4.tmp_0"};
// If not export post-process, will use non_postprocess_heads_info
std::vector<NonPostProcessHeadInfo> non_postprocess_heads_info{
// cls_pred|dis_pred|stride
{"transpose_0.tmp_0", "transpose_1.tmp_0", 8},
{"transpose_2.tmp_0", "transpose_3.tmp_0", 16},
{"transpose_4.tmp_0", "transpose_5.tmp_0", 32},
{"transpose_6.tmp_0", "transpose_7.tmp_0", 64},
};
};
template <typename _Tp>
......
cmake_minimum_required(VERSION 3.4.1)
cmake_minimum_required(VERSION 3.9)
set(CMAKE_CXX_STANDARD 17)
project(picodet_demo)
......@@ -11,9 +11,11 @@ if(OPENMP_FOUND)
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${OpenMP_EXE_LINKER_FLAGS}")
endif()
find_package(OpenCV REQUIRED)
# find_package(OpenCV REQUIRED)
find_package(OpenCV REQUIRED PATHS "/path/to/opencv-3.4.16_gcc8.2_ffmpeg")
find_package(ncnn REQUIRED)
# find_package(ncnn REQUIRED)
find_package(ncnn REQUIRED PATHS "/path/to/ncnn/build/install/lib/cmake/ncnn")
if(NOT TARGET ncnn)
message(WARNING "ncnn NOT FOUND! Please set ncnn_DIR environment variable")
else()
......
# PicoDet NCNN Demo
This project provides PicoDet image inference, webcam inference and benchmark using
[Tencent's NCNN framework](https://github.com/Tencent/ncnn).
# How to build
This demo runs inference with [Tencent's NCNN framework](https://github.com/Tencent/ncnn).
# Step 1: Build
## Windows
### Step1.
Download and Install Visual Studio from https://visualstudio.microsoft.com/vs/community/
......@@ -12,11 +10,16 @@ Download and Install Visual Studio from https://visualstudio.microsoft.com/vs/co
### Step2.
Download and install OpenCV from https://github.com/opencv/opencv/releases
### Step3(Optional).
For convenience, if your environment is gcc 8.2 x86, you can directly download the following prebuilt library:
```shell
wget https://paddledet.bj.bcebos.com/data/opencv-3.4.16_gcc8.2_ffmpeg.tar.gz
tar -xf opencv-3.4.16_gcc8.2_ffmpeg.tar.gz
```
### Step 3 (Optional).
Download and install Vulkan SDK from https://vulkan.lunarg.com/sdk/home
### Step4.
Clone NCNN repository
### Step 4: Build NCNN
``` shell script
git clone --recursive https://github.com/Tencent/ncnn.git
......@@ -25,7 +28,7 @@ Build NCNN following this tutorial: [Build for Windows x64 using VS2017](https:/
### Step5.
Add `ncnn_DIR` = `YOUR_NCNN_PATH/build/install/lib/cmake/ncnn` to system environment variables.
Add `ncnn_DIR` = `YOUR_NCNN_PATH/build/install/lib/cmake/ncnn` to the system environment variables.
Build project: Open x64 Native Tools Command Prompt for VS 2019 or 2017
......@@ -42,10 +45,10 @@ msbuild picodet_demo.vcxproj /p:configuration=release /p:platform=x64
### Step1.
Build and install OpenCV from https://github.com/opencv/opencv
### Step2(Optional).
### Step 2 (Optional).
Download Vulkan SDK from https://vulkan.lunarg.com/sdk/home
### Step3.
### Step 3: Build NCNN
Clone NCNN repository
``` shell script
......@@ -54,15 +57,7 @@ git clone --recursive https://github.com/Tencent/ncnn.git
Build NCNN following this tutorial: [Build for Linux / NVIDIA Jetson / Raspberry Pi](https://github.com/Tencent/ncnn/wiki/how-to-build#build-for-linux)
### Step4.
Set environment variables. Run:
``` shell script
export ncnn_DIR=YOUR_NCNN_PATH/build/install/lib/cmake/ncnn
```
Build project
### Step4: Build the executable
``` shell script
cd <this-folder>
......@@ -71,47 +66,64 @@ cd build
cmake ..
make
```
# Run demo
Download PicoDet ncnn model.
* [PicoDet ncnn model download link](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_ncnn.zip)
## Webcam
```shell script
picodet_demo 0 0
- Prepare the model
```shell
modelName=picodet_s_320_coco_lcnet
# Export the inference model
python tools/export_model.py \
-c configs/picodet/${modelName}.yml \
-o weights=${modelName}.pdparams \
--output_dir=inference_model
# Convert to ONNX
paddle2onnx --model_dir inference_model/${modelName} \
--model_filename model.pdmodel \
--params_filename model.pdiparams \
--opset_version 11 \
--save_file ${modelName}.onnx
# Simplify the model
python -m onnxsim ${modelName}.onnx ${modelName}_processed.onnx
# Convert the model to NCNN format:
# run onnx2ncnn from the NCNN tools to generate the .param and .bin files
```
You can also convert to an NCNN model with the online conversion tool [https://convertmodel.com](https://convertmodel.com/).
For a quick test, you can directly download [picodet_s_320_coco_lcnet-opt.bin](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_coco_lcnet-opt.bin) / [picodet_s_320_coco_lcnet-opt.param](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_coco_lcnet-opt.param) (without post-processing).
**Note:** With post-processing enabled, NCNN inference currently produces NaN values. Use the demo without post-processing for now; the demo with post-processing is being upgraded and will be released soon.
## Run
First, create a directory to store the prediction results:
```shell
cp -r ../demo_onnxruntime/imgs .
cd build
mkdir ../results
```
## Inference images
```shell script
picodet_demo 1 IMAGE_FOLDER/*.jpg
- Predict a single image
``` shell
./picodet_demo 0 ../picodet_s_320_coco_lcnet.bin ../picodet_s_320_coco_lcnet.param 320 320 ../imgs/dog.jpg 0
```
For details of the arguments, see `main.cpp`.
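As a quick illustration of the argument layout this demo expects (a sketch based on `main.cpp`; the struct and helper names here are hypothetical, not part of the demo):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper mirroring how main.cpp reads argv:
// argv[1]=mode, argv[2]=bin model, argv[3]=param model, argv[4]=height,
// argv[5]=width, argv[6]=image path (or benchmark flag), argv[7]=has_postprocess.
struct DemoArgs {
  int mode = 0;
  std::string bin_path, param_path;
  int height = 320, width = 320;
  std::string image_path;
  int has_postprocess = 0;
};

DemoArgs parse_demo_args(const std::vector<std::string> &argv) {
  DemoArgs a;
  a.mode = std::stoi(argv[1]);
  a.bin_path = argv[2];
  a.param_path = argv[3];
  if (argv.size() > 5) {
    a.height = std::stoi(argv[4]);
    a.width = std::stoi(argv[5]);
  }
  if (argv.size() > 6) a.image_path = argv[6];
  if (argv.size() > 7) a.has_postprocess = std::stoi(argv[7]);
  return a;
}
```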
## Inference video
- Benchmark inference speed
```shell script
picodet_demo 2 VIDEO_PATH
``` shell
./picodet_demo 1 ../picodet_s_320_lcnet.bin ../picodet_s_320_lcnet.param 320 320 0
```
## Benchmark
```shell script
picodet_demo 3 0
result: picodet min = 17.74 max = 22.71 avg = 18.16
```
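The min/max/avg numbers above come from timing repeated forward passes after a few warm-up iterations. A minimal self-contained sketch of that measurement loop (illustrative; it uses `std::chrono` instead of ncnn's timer, and the forward pass is passed in as a callable):

```cpp
#include <algorithm>
#include <cassert>
#include <cfloat>
#include <chrono>

struct BenchStats { double min_ms, max_ms, avg_ms; };

// Time `forward` loop_num times, discarding the first warm_up iterations.
template <typename Fn>
BenchStats run_benchmark(Fn &&forward, int warm_up = 8, int loop_num = 100) {
  BenchStats s{DBL_MAX, -DBL_MAX, 0.0};
  for (int i = 0; i < warm_up + loop_num; i++) {
    auto t0 = std::chrono::steady_clock::now();
    forward();  // one inference pass
    auto t1 = std::chrono::steady_clock::now();
    double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
    if (i >= warm_up) {  // warm-up iterations are not counted
      s.min_ms = std::min(s.min_ms, ms);
      s.max_ms = std::max(s.max_ms, ms);
      s.avg_ms += ms;
    }
  }
  s.avg_ms /= loop_num;
  return s;
}
```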
****
Notice:
If benchmark speed is slow, try to limit omp thread num.
Linux:
## FAQ
```shell script
export OMP_THREAD_LIMIT=4
- Incorrect prediction results:
First check that the model input shape matches and that the model output names match. For the enhanced PicoDet model exported without post-processing, the output names are as follows:
```shell
# classification branch | detection branch
{"transpose_0.tmp_0", "transpose_1.tmp_0"},
{"transpose_2.tmp_0", "transpose_3.tmp_0"},
{"transpose_4.tmp_0", "transpose_5.tmp_0"},
{"transpose_6.tmp_0", "transpose_7.tmp_0"},
```
You can inspect the exact names with [netron](https://netron.app) and modify the corresponding `non_postprocess_heads_info` array in `picodet_mnn.hpp` accordingly.
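Each pair of names above belongs to one detection head, with strides 8/16/32/64; for a given input size, each head's feature map has `ceil(input / stride)` cells per side (this mirrors the `decode_infer` logic in the demo). A quick sanity check for the default 320×320 input:

```cpp
#include <cassert>
#include <cmath>

// Strides of the four PicoDet heads, matching the output-name table above.
const int kStrides[4] = {8, 16, 32, 64};

// Number of cells along one side of the feature map for a head.
int feature_cells(int input_size, int stride) {
  return (int)std::ceil((float)input_size / stride);
}
```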
......@@ -13,353 +13,198 @@
// limitations under the License.
// reference from https://github.com/RangiLyu/nanodet/tree/main/demo_ncnn
#include "picodet.h"
#include <benchmark.h>
#include <iostream>
#include <net.h>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <iostream>
#include <net.h>
#include "picodet.h"
#include <benchmark.h>
#define __SAVE_RESULT__ // if defined, save drawn results to ../results;
                        // otherwise show them in a window
struct object_rect {
int x;
int y;
int width;
int height;
};
int resize_uniform(cv::Mat& src, cv::Mat& dst, cv::Size dst_size, object_rect& effect_area)
{
int w = src.cols;
int h = src.rows;
int dst_w = dst_size.width;
int dst_h = dst_size.height;
dst = cv::Mat(cv::Size(dst_w, dst_h), CV_8UC3, cv::Scalar(0));
float ratio_src = w * 1.0 / h;
float ratio_dst = dst_w * 1.0 / dst_h;
int tmp_w = 0;
int tmp_h = 0;
if (ratio_src > ratio_dst) {
tmp_w = dst_w;
tmp_h = floor((dst_w * 1.0 / w) * h);
}
else if (ratio_src < ratio_dst) {
tmp_h = dst_h;
tmp_w = floor((dst_h * 1.0 / h) * w);
}
else {
cv::resize(src, dst, dst_size);
effect_area.x = 0;
effect_area.y = 0;
effect_area.width = dst_w;
effect_area.height = dst_h;
return 0;
}
cv::Mat tmp;
cv::resize(src, tmp, cv::Size(tmp_w, tmp_h));
if (tmp_w != dst_w) {
int index_w = floor((dst_w - tmp_w) / 2.0);
for (int i = 0; i < dst_h; i++) {
memcpy(dst.data + i * dst_w * 3 + index_w * 3, tmp.data + i * tmp_w * 3, tmp_w * 3);
}
effect_area.x = index_w;
effect_area.y = 0;
effect_area.width = tmp_w;
effect_area.height = tmp_h;
}
else if (tmp_h != dst_h) {
int index_h = floor((dst_h - tmp_h) / 2.0);
memcpy(dst.data + index_h * dst_w * 3, tmp.data, tmp_w * tmp_h * 3);
effect_area.x = 0;
effect_area.y = index_h;
effect_area.width = tmp_w;
effect_area.height = tmp_h;
}
else {
printf("error\n");
}
return 0;
}
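// The letterbox geometry computed by resize_uniform above can be checked in
// isolation: the image is scaled to preserve its aspect ratio, then centered
// on the destination canvas. An illustrative stand-alone sketch of just that
// arithmetic (names here are illustrative, not part of the demo):

```cpp
#include <cassert>
#include <cmath>

struct LetterboxGeom { int w, h, off_x, off_y; };

// Scaled size and top-left padding offset when fitting src (w,h) into
// a (dst_w, dst_h) canvas while preserving aspect ratio.
LetterboxGeom letterbox(int w, int h, int dst_w, int dst_h) {
  float ratio_src = (float)w / h;
  float ratio_dst = (float)dst_w / dst_h;
  LetterboxGeom g{dst_w, dst_h, 0, 0};
  if (ratio_src > ratio_dst) {        // limited by width: pad top/bottom
    g.w = dst_w;
    g.h = (int)std::floor((float)dst_w / w * h);
    g.off_y = (int)std::floor((dst_h - g.h) / 2.0);
  } else if (ratio_src < ratio_dst) { // limited by height: pad left/right
    g.h = dst_h;
    g.w = (int)std::floor((float)dst_h / h * w);
    g.off_x = (int)std::floor((dst_w - g.w) / 2.0);
  }
  return g;
}
```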
const int color_list[80][3] =
{
{216 , 82 , 24},
{236 ,176 , 31},
{125 , 46 ,141},
{118 ,171 , 47},
{ 76 ,189 ,237},
{238 , 19 , 46},
{ 76 , 76 , 76},
{153 ,153 ,153},
{255 , 0 , 0},
{255 ,127 , 0},
{190 ,190 , 0},
{ 0 ,255 , 0},
{ 0 , 0 ,255},
{170 , 0 ,255},
{ 84 , 84 , 0},
{ 84 ,170 , 0},
{ 84 ,255 , 0},
{170 , 84 , 0},
{170 ,170 , 0},
{170 ,255 , 0},
{255 , 84 , 0},
{255 ,170 , 0},
{255 ,255 , 0},
{ 0 , 84 ,127},
{ 0 ,170 ,127},
{ 0 ,255 ,127},
{ 84 , 0 ,127},
{ 84 , 84 ,127},
{ 84 ,170 ,127},
{ 84 ,255 ,127},
{170 , 0 ,127},
{170 , 84 ,127},
{170 ,170 ,127},
{170 ,255 ,127},
{255 , 0 ,127},
{255 , 84 ,127},
{255 ,170 ,127},
{255 ,255 ,127},
{ 0 , 84 ,255},
{ 0 ,170 ,255},
{ 0 ,255 ,255},
{ 84 , 0 ,255},
{ 84 , 84 ,255},
{ 84 ,170 ,255},
{ 84 ,255 ,255},
{170 , 0 ,255},
{170 , 84 ,255},
{170 ,170 ,255},
{170 ,255 ,255},
{255 , 0 ,255},
{255 , 84 ,255},
{255 ,170 ,255},
{ 42 , 0 , 0},
{ 84 , 0 , 0},
{127 , 0 , 0},
{170 , 0 , 0},
{212 , 0 , 0},
{255 , 0 , 0},
{ 0 , 42 , 0},
{ 0 , 84 , 0},
{ 0 ,127 , 0},
{ 0 ,170 , 0},
{ 0 ,212 , 0},
{ 0 ,255 , 0},
{ 0 , 0 , 42},
{ 0 , 0 , 84},
{ 0 , 0 ,127},
{ 0 , 0 ,170},
{ 0 , 0 ,212},
{ 0 , 0 ,255},
{ 0 , 0 , 0},
{ 36 , 36 , 36},
{ 72 , 72 , 72},
{109 ,109 ,109},
{145 ,145 ,145},
{182 ,182 ,182},
{218 ,218 ,218},
{ 0 ,113 ,188},
{ 80 ,182 ,188},
{127 ,127 , 0},
int x;
int y;
int width;
int height;
};
void draw_bboxes(const cv::Mat& bgr, const std::vector<BoxInfo>& bboxes, object_rect effect_roi)
{
static const char* class_names[] = { "person", "bicycle", "car", "motorcycle", "airplane", "bus",
"train", "truck", "boat", "traffic light", "fire hydrant",
"stop sign", "parking meter", "bench", "bird", "cat", "dog",
"horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe",
"backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
"skis", "snowboard", "sports ball", "kite", "baseball bat",
"baseball glove", "skateboard", "surfboard", "tennis racket",
"bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl",
"banana", "apple", "sandwich", "orange", "broccoli", "carrot",
"hot dog", "pizza", "donut", "cake", "chair", "couch",
"potted plant", "bed", "dining table", "toilet", "tv", "laptop",
"mouse", "remote", "keyboard", "cell phone", "microwave", "oven",
"toaster", "sink", "refrigerator", "book", "clock", "vase",
"scissors", "teddy bear", "hair drier", "toothbrush"
};
cv::Mat image = bgr.clone();
int src_w = image.cols;
int src_h = image.rows;
int dst_w = effect_roi.width;
int dst_h = effect_roi.height;
float width_ratio = (float)src_w / (float)dst_w;
float height_ratio = (float)src_h / (float)dst_h;
for (size_t i = 0; i < bboxes.size(); i++)
{
const BoxInfo& bbox = bboxes[i];
cv::Scalar color = cv::Scalar(color_list[bbox.label][0], color_list[bbox.label][1], color_list[bbox.label][2]);
cv::rectangle(image, cv::Rect(cv::Point((bbox.x1 - effect_roi.x) * width_ratio, (bbox.y1 - effect_roi.y) * height_ratio),
cv::Point((bbox.x2 - effect_roi.x) * width_ratio, (bbox.y2 - effect_roi.y) * height_ratio)), color);
char text[256];
sprintf(text, "%s %.1f%%", class_names[bbox.label], bbox.score * 100);
int baseLine = 0;
cv::Size label_size = cv::getTextSize(text, cv::FONT_HERSHEY_SIMPLEX, 0.4, 1, &baseLine);
int x = (bbox.x1 - effect_roi.x) * width_ratio;
int y = (bbox.y1 - effect_roi.y) * height_ratio - label_size.height - baseLine;
if (y < 0)
y = 0;
if (x + label_size.width > image.cols)
x = image.cols - label_size.width;
cv::rectangle(image, cv::Rect(cv::Point(x, y), cv::Size(label_size.width, label_size.height + baseLine)),
color, -1);
cv::putText(image, text, cv::Point(x, y + label_size.height),
cv::FONT_HERSHEY_SIMPLEX, 0.4, cv::Scalar(255, 255, 255));
}
cv::imwrite("../result/test_picodet.jpg", image);
printf("************infer image success!!!**********\n");
}
int image_demo(PicoDet &detector, const char* imagepath)
{
std::vector<std::string> filenames;
cv::glob(imagepath, filenames, false);
for (auto img_name : filenames)
{
cv::Mat image = cv::imread(img_name);
if (image.empty())
{
fprintf(stderr, "cv::imread %s failed\n", img_name);
return -1;
}
object_rect effect_roi;
cv::Mat resized_img;
resize_uniform(image, resized_img, cv::Size(320, 320), effect_roi);
auto results = detector.detect(resized_img, 0.4, 0.5);
char imgName[20] = {};
draw_bboxes(image, results, effect_roi);
cv::waitKey(0);
std::vector<int> GenerateColorMap(int num_class) {
auto colormap = std::vector<int>(3 * num_class, 0);
for (int i = 0; i < num_class; ++i) {
int j = 0;
int lab = i;
while (lab) {
colormap[i * 3] |= (((lab >> 0) & 1) << (7 - j));
colormap[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j));
colormap[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j));
++j;
lab >>= 3;
}
return 0;
}
return colormap;
}
int webcam_demo(PicoDet& detector, int cam_id)
{
cv::Mat image;
cv::VideoCapture cap(cam_id);
while (true)
{
cap >> image;
object_rect effect_roi;
cv::Mat resized_img;
resize_uniform(image, resized_img, cv::Size(320, 320), effect_roi);
auto results = detector.detect(resized_img, 0.4, 0.5);
draw_bboxes(image, results, effect_roi);
cv::waitKey(1);
}
return 0;
void draw_bboxes(const cv::Mat &im, const std::vector<BoxInfo> &bboxes,
std::string save_path = "None") {
static const char *class_names[] = {
"person", "bicycle", "car",
"motorcycle", "airplane", "bus",
"train", "truck", "boat",
"traffic light", "fire hydrant", "stop sign",
"parking meter", "bench", "bird",
"cat", "dog", "horse",
"sheep", "cow", "elephant",
"bear", "zebra", "giraffe",
"backpack", "umbrella", "handbag",
"tie", "suitcase", "frisbee",
"skis", "snowboard", "sports ball",
"kite", "baseball bat", "baseball glove",
"skateboard", "surfboard", "tennis racket",
"bottle", "wine glass", "cup",
"fork", "knife", "spoon",
"bowl", "banana", "apple",
"sandwich", "orange", "broccoli",
"carrot", "hot dog", "pizza",
"donut", "cake", "chair",
"couch", "potted plant", "bed",
"dining table", "toilet", "tv",
"laptop", "mouse", "remote",
"keyboard", "cell phone", "microwave",
"oven", "toaster", "sink",
"refrigerator", "book", "clock",
"vase", "scissors", "teddy bear",
"hair drier", "toothbrush"};
cv::Mat image = im.clone();
int src_w = image.cols;
int src_h = image.rows;
int thickness = 2;
  auto colormap = GenerateColorMap(sizeof(class_names) / sizeof(class_names[0]));
for (size_t i = 0; i < bboxes.size(); i++) {
const BoxInfo &bbox = bboxes[i];
    std::cout << "box: " << bbox.x1 << ", " << bbox.y1 << ", " << bbox.x2
              << ", " << bbox.y2 << std::endl;
int c1 = colormap[3 * bbox.label + 0];
int c2 = colormap[3 * bbox.label + 1];
int c3 = colormap[3 * bbox.label + 2];
cv::Scalar color = cv::Scalar(c1, c2, c3);
// cv::Scalar color = cv::Scalar(0, 0, 255);
cv::rectangle(image, cv::Rect(cv::Point(bbox.x1, bbox.y1),
cv::Point(bbox.x2, bbox.y2)),
color, 1);
char text[256];
sprintf(text, "%s %.1f%%", class_names[bbox.label], bbox.score * 100);
int baseLine = 0;
cv::Size label_size =
cv::getTextSize(text, cv::FONT_HERSHEY_SIMPLEX, 0.4, 1, &baseLine);
int x = bbox.x1;
int y = bbox.y1 - label_size.height - baseLine;
if (y < 0)
y = 0;
if (x + label_size.width > image.cols)
x = image.cols - label_size.width;
cv::rectangle(image, cv::Rect(cv::Point(x, y),
cv::Size(label_size.width,
label_size.height + baseLine)),
color, -1);
cv::putText(image, text, cv::Point(x, y + label_size.height),
cv::FONT_HERSHEY_SIMPLEX, 0.4, cv::Scalar(255, 255, 255), 1);
}
if (save_path == "None") {
cv::imshow("image", image);
} else {
cv::imwrite(save_path, image);
std::cout << "Result save in: " << save_path << std::endl;
}
}
int video_demo(PicoDet& detector, const char* path)
{
cv::Mat image;
cv::VideoCapture cap(path);
while (true)
{
cap >> image;
object_rect effect_roi;
cv::Mat resized_img;
resize_uniform(image, resized_img, cv::Size(320, 320), effect_roi);
auto results = detector.detect(resized_img, 0.4, 0.5);
draw_bboxes(image, results, effect_roi);
cv::waitKey(1);
int image_demo(PicoDet &detector, const char *imagepath,
int has_postprocess = 0) {
std::vector<cv::String> filenames;
cv::glob(imagepath, filenames, false);
bool is_postprocess = has_postprocess > 0 ? true : false;
for (auto img_name : filenames) {
cv::Mat image = cv::imread(img_name, cv::IMREAD_COLOR);
if (image.empty()) {
fprintf(stderr, "cv::imread %s failed\n", img_name.c_str());
return -1;
}
return 0;
std::vector<BoxInfo> results;
detector.detect(image, results, is_postprocess);
std::cout << "detect done." << std::endl;
#ifdef __SAVE_RESULT__
std::string save_path = img_name;
draw_bboxes(image, results, save_path.replace(3, 4, "results"));
#else
draw_bboxes(image, results);
cv::waitKey(0);
#endif
}
return 0;
}
int benchmark(PicoDet& detector)
{
int loop_num = 100;
int warm_up = 8;
double time_min = DBL_MAX;
double time_max = -DBL_MAX;
double time_avg = 0;
ncnn::Mat input = ncnn::Mat(320, 320, 3);
input.fill(0.01f);
for (int i = 0; i < warm_up + loop_num; i++)
{
double start = ncnn::get_current_time();
ncnn::Extractor ex = detector.Net->create_extractor();
ex.input("image", input); // picodet
for (const auto& head_info : detector.heads_info)
{
ncnn::Mat dis_pred;
ncnn::Mat cls_pred;
ex.extract(head_info.dis_layer.c_str(), dis_pred);
ex.extract(head_info.cls_layer.c_str(), cls_pred);
}
double end = ncnn::get_current_time();
double time = end - start;
if (i >= warm_up)
{
time_min = (std::min)(time_min, time);
time_max = (std::max)(time_max, time);
time_avg += time;
}
int benchmark(PicoDet &detector, int width, int height,
int has_postprocess = 0) {
int loop_num = 100;
int warm_up = 8;
double time_min = DBL_MAX;
double time_max = -DBL_MAX;
double time_avg = 0;
  cv::Mat image(height, width, CV_8UC3, cv::Scalar(1, 1, 1));
bool is_postprocess = has_postprocess > 0 ? true : false;
for (int i = 0; i < warm_up + loop_num; i++) {
double start = ncnn::get_current_time();
std::vector<BoxInfo> results;
detector.detect(image, results, is_postprocess);
double end = ncnn::get_current_time();
double time = end - start;
if (i >= warm_up) {
time_min = (std::min)(time_min, time);
time_max = (std::max)(time_max, time);
time_avg += time;
}
time_avg /= loop_num;
fprintf(stderr, "%20s min = %7.2f max = %7.2f avg = %7.2f\n", "picodet", time_min, time_max, time_avg);
return 0;
}
time_avg /= loop_num;
fprintf(stderr, "%20s min = %7.2f max = %7.2f avg = %7.2f\n", "picodet",
time_min, time_max, time_avg);
return 0;
}
int main(int argc, char** argv)
{
if (argc != 3)
{
fprintf(stderr, "usage: %s [mode] [path]. \n For webcam mode=0, path is cam id; \n For image demo, mode=1, path=xxx/xxx/*.jpg; \n For video, mode=2; \n For benchmark, mode=3 path=0.\n", argv[0]);
return -1;
}
PicoDet detector = PicoDet("../weight/picodet_m_416.param", "../weight/picodet_m_416.bin", true);
int mode = atoi(argv[1]);
switch (mode)
{
case 0:{
int cam_id = atoi(argv[2]);
webcam_demo(detector, cam_id);
break;
}
case 1:{
const char* images = argv[2];
image_demo(detector, images);
break;
}
case 2:{
const char* path = argv[2];
video_demo(detector, path);
break;
}
case 3:{
benchmark(detector);
break;
}
default:{
fprintf(stderr, "usage: %s [mode] [path]. \n For webcam mode=0, path is cam id; \n For image demo, mode=1, path=xxx/xxx/*.jpg; \n For video, mode=2; \n For benchmark, mode=3 path=0.\n", argv[0]);
break;
}
int main(int argc, char **argv) {
  if (argc < 4) {
    fprintf(stderr,
            "usage: %s mode bin_model param_model [height width] [image_path] "
            "[has_postprocess]\n",
            argv[0]);
    return -1;
  }
  int mode = atoi(argv[1]);
  char *bin_model_path = argv[2];
  char *param_model_path = argv[3];
int height = 320;
int width = 320;
  if (argc >= 6) {
height = atoi(argv[4]);
width = atoi(argv[5]);
}
PicoDet detector =
PicoDet(param_model_path, bin_model_path, width, height, true, 0.45, 0.3);
if (mode == 1) {
benchmark(detector, width, height, atoi(argv[6]));
} else {
    if (argc < 8) {
      std::cout << "Must set image file and postprocess flag, such as "
                   "./picodet_demo 0 ../picodet_s_320_lcnet.bin "
                   "../picodet_s_320_lcnet.param 320 320 ../imgs/dog.jpg 0"
                << std::endl;
      return -1;
    }
const char *images = argv[6];
image_demo(detector, images, atoi(argv[7]));
}
}
......@@ -48,7 +48,9 @@ int activation_function_softmax(const _Tp *src, _Tp *dst, int length) {
bool PicoDet::hasGPU = false;
PicoDet *PicoDet::detector = nullptr;
PicoDet::PicoDet(const char *param, const char *bin, bool useGPU) {
PicoDet::PicoDet(const char *param, const char *bin, int input_width,
int input_hight, bool useGPU, float score_threshold_ = 0.5,
float nms_threshold_ = 0.3) {
this->Net = new ncnn::Net();
#if NCNN_VULKAN
this->hasGPU = ncnn::get_gpu_count() > 0;
......@@ -57,21 +59,28 @@ PicoDet::PicoDet(const char *param, const char *bin, bool useGPU) {
this->Net->opt.use_fp16_arithmetic = true;
this->Net->load_param(param);
this->Net->load_model(bin);
this->in_w = input_width;
this->in_h = input_hight;
this->score_threshold = score_threshold_;
this->nms_threshold = nms_threshold_;
}
PicoDet::~PicoDet() { delete this->Net; }
void PicoDet::preprocess(cv::Mat &image, ncnn::Mat &in) {
// cv::resize(image, image, cv::Size(this->in_w, this->in_h), 0.f, 0.f);
int img_w = image.cols;
int img_h = image.rows;
in = ncnn::Mat::from_pixels(image.data, ncnn::Mat::PIXEL_BGR, img_w, img_h);
in = ncnn::Mat::from_pixels_resize(image.data, ncnn::Mat::PIXEL_BGR, img_w,
img_h, this->in_w, this->in_h);
const float mean_vals[3] = {103.53f, 116.28f, 123.675f};
const float norm_vals[3] = {0.017429f, 0.017507f, 0.017125f};
in.substract_mean_normalize(mean_vals, norm_vals);
}
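// The mean/norm constants used in preprocess above are the standard ImageNet
// statistics expressed for 0-255 BGR pixels: mean = channel_mean * 255 and
// norm = 1 / (channel_std * 255), so (pixel - mean) * norm equals
// (pixel/255 - channel_mean) / channel_std. A quick stand-alone check:

```cpp
#include <cassert>
#include <cmath>

// ImageNet per-channel statistics, in BGR order to match PIXEL_BGR.
const float kBgrMean[3] = {0.406f, 0.456f, 0.485f};
const float kBgrStd[3]  = {0.225f, 0.224f, 0.229f};

float mean_val(int c) { return kBgrMean[c] * 255.0f; }          // e.g. 103.53
float norm_val(int c) { return 1.0f / (kBgrStd[c] * 255.0f); }  // e.g. 0.017429
```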
std::vector<BoxInfo> PicoDet::detect(cv::Mat image, float score_threshold,
float nms_threshold) {
int PicoDet::detect(cv::Mat image, std::vector<BoxInfo> &result_list,
bool has_postprocess) {
ncnn::Mat input;
preprocess(image, input);
auto ex = this->Net->create_extractor();
......@@ -82,34 +91,76 @@ std::vector<BoxInfo> PicoDet::detect(cv::Mat image, float score_threshold,
#endif
ex.input("image", input); // picodet
this->image_h = image.rows;
this->image_w = image.cols;
std::vector<std::vector<BoxInfo>> results;
results.resize(this->num_class);
for (const auto &head_info : this->heads_info) {
if (has_postprocess) {
ncnn::Mat dis_pred;
ncnn::Mat cls_pred;
ex.extract(head_info.dis_layer.c_str(), dis_pred);
ex.extract(head_info.cls_layer.c_str(), cls_pred);
this->decode_infer(cls_pred, dis_pred, head_info.stride, score_threshold,
results);
ex.extract(this->nms_heads_info[0].c_str(), dis_pred);
ex.extract(this->nms_heads_info[1].c_str(), cls_pred);
std::cout << dis_pred.h << " " << dis_pred.w << std::endl;
std::cout << cls_pred.h << " " << cls_pred.w << std::endl;
this->nms_boxes(cls_pred, dis_pred, this->score_threshold, results);
} else {
for (const auto &head_info : this->non_postprocess_heads_info) {
ncnn::Mat dis_pred;
ncnn::Mat cls_pred;
ex.extract(head_info.dis_layer.c_str(), dis_pred);
ex.extract(head_info.cls_layer.c_str(), cls_pred);
this->decode_infer(cls_pred, dis_pred, head_info.stride,
this->score_threshold, results);
}
}
std::vector<BoxInfo> dets;
for (int i = 0; i < (int)results.size(); i++) {
this->nms(results[i], nms_threshold);
this->nms(results[i], this->nms_threshold);
for (auto box : results[i]) {
dets.push_back(box);
box.x1 = box.x1 / this->in_w * this->image_w;
box.x2 = box.x2 / this->in_w * this->image_w;
box.y1 = box.y1 / this->in_h * this->image_h;
box.y2 = box.y2 / this->in_h * this->image_h;
result_list.push_back(box);
}
}
return 0;
}
void PicoDet::nms_boxes(ncnn::Mat &cls_pred, ncnn::Mat &dis_pred,
float score_threshold,
std::vector<std::vector<BoxInfo>> &result_list) {
BoxInfo bbox;
int i, j;
for (i = 0; i < dis_pred.h; i++) {
bbox.x1 = dis_pred.row(i)[0];
bbox.y1 = dis_pred.row(i)[1];
bbox.x2 = dis_pred.row(i)[2];
bbox.y2 = dis_pred.row(i)[3];
const float *scores = cls_pred.row(i);
float score = 0;
int cur_label = 0;
for (int label = 0; label < this->num_class; label++) {
float score_ = cls_pred.row(label)[i];
if (score_ > score) {
score = score_;
cur_label = label;
}
}
bbox.score = score;
bbox.label = cur_label;
result_list[cur_label].push_back(bbox);
}
return dets;
}
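// The per-class NMS applied to these candidate boxes keeps the
// highest-scoring box and drops any box whose IoU with an already-kept box
// exceeds the threshold. An illustrative stand-alone sketch of that rule
// (not the demo's implementation):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Box { float x1, y1, x2, y2, score; };

// Intersection-over-union of two axis-aligned boxes.
static float iou(const Box &a, const Box &b) {
  float ix = std::max(0.f, std::min(a.x2, b.x2) - std::max(a.x1, b.x1));
  float iy = std::max(0.f, std::min(a.y2, b.y2) - std::max(a.y1, b.y1));
  float inter = ix * iy;
  float uni = (a.x2 - a.x1) * (a.y2 - a.y1) +
              (b.x2 - b.x1) * (b.y2 - b.y1) - inter;
  return uni > 0.f ? inter / uni : 0.f;
}

// Greedy NMS over one class: returns the kept boxes, best score first.
std::vector<Box> nms_sketch(std::vector<Box> boxes, float thresh) {
  std::sort(boxes.begin(), boxes.end(),
            [](const Box &a, const Box &b) { return a.score > b.score; });
  std::vector<Box> kept;
  for (const Box &b : boxes) {
    bool keep = true;
    for (const Box &k : kept)
      if (iou(b, k) > thresh) { keep = false; break; }
    if (keep) kept.push_back(b);
  }
  return kept;
}
```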
void PicoDet::decode_infer(ncnn::Mat &cls_pred, ncnn::Mat &dis_pred, int stride,
float threshold,
std::vector<std::vector<BoxInfo>> &results) {
int feature_h = ceil((float)this->input_size[1] / stride);
int feature_w = ceil((float)this->input_size[0] / stride);
  int feature_h = ceil((float)this->in_h / stride);
  int feature_w = ceil((float)this->in_w / stride);
for (int idx = 0; idx < feature_h * feature_w; idx++) {
const float *scores = cls_pred.row(idx);
......@@ -151,8 +202,8 @@ BoxInfo PicoDet::disPred2Bbox(const float *&dfl_det, int label, float score,
}
float xmin = (std::max)(ct_x - dis_pred[0], .0f);
float ymin = (std::max)(ct_y - dis_pred[1], .0f);
float xmax = (std::min)(ct_x + dis_pred[2], (float)this->input_size[0]);
float ymax = (std::min)(ct_y + dis_pred[3], (float)this->input_size[1]);
float xmax = (std::min)(ct_x + dis_pred[2], (float)this->in_w);
  float ymax = (std::min)(ct_y + dis_pred[3], (float)this->in_h);
return BoxInfo{xmin, ymin, xmax, ymax, score, label};
}
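// disPred2Bbox above turns each side's discretized distance distribution
// (reg_max + 1 bins) into a scalar by taking a softmax expectation over the
// bin indices and scaling by the stride. The core computation, sketched
// stand-alone (assumes reg_max <= 31):

```cpp
#include <cassert>
#include <cmath>

// Expected distance (in input pixels) from reg_max+1 bin logits, as used by
// distribution-focal-loss style heads.
float dfl_distance(const float *logits, int reg_max, int stride) {
  float e[32];
  float denom = 0.f;
  for (int i = 0; i <= reg_max; i++) {
    e[i] = std::exp(logits[i]);  // softmax numerator
    denom += e[i];
  }
  float dist = 0.f;
  for (int i = 0; i <= reg_max; i++)
    dist += i * (e[i] / denom);  // expectation over bin index
  return dist * stride;
}
```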
......
......@@ -16,66 +16,72 @@
#ifndef PICODET_H
#define PICODET_H
#include <opencv2/core/core.hpp>
#include <net.h>
#include <opencv2/core/core.hpp>
typedef struct HeadInfo
{
std::string cls_layer;
std::string dis_layer;
int stride;
};
typedef struct NonPostProcessHeadInfo {
std::string cls_layer;
std::string dis_layer;
int stride;
} NonPostProcessHeadInfo;
typedef struct BoxInfo
{
float x1;
float y1;
float x2;
float y2;
float score;
int label;
typedef struct BoxInfo {
float x1;
float y1;
float x2;
float y2;
float score;
int label;
} BoxInfo;
class PicoDet
{
class PicoDet {
public:
PicoDet(const char* param, const char* bin, bool useGPU);
~PicoDet();
PicoDet(const char *param, const char *bin, int input_width, int input_hight,
bool useGPU, float score_threshold_, float nms_threshold_);
static PicoDet* detector;
ncnn::Net* Net;
static bool hasGPU;
~PicoDet();
std::vector<HeadInfo> heads_info{
// cls_pred|dis_pred|stride
{"save_infer_model/scale_0.tmp_1", "save_infer_model/scale_4.tmp_1", 8},
{"save_infer_model/scale_1.tmp_1", "save_infer_model/scale_5.tmp_1", 16},
{"save_infer_model/scale_2.tmp_1", "save_infer_model/scale_6.tmp_1", 32},
{"save_infer_model/scale_3.tmp_1", "save_infer_model/scale_7.tmp_1", 64},
};
static PicoDet *detector;
ncnn::Net *Net;
static bool hasGPU;
std::vector<BoxInfo> detect(cv::Mat image, float score_threshold, float nms_threshold);
int detect(cv::Mat image, std::vector<BoxInfo> &result_list,
bool has_postprocess);
std::vector<std::string> labels{ "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light",
"fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow",
"elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
"skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard",
"tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
"sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
"potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone",
"microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
"hair drier", "toothbrush" };
private:
void preprocess(cv::Mat& image, ncnn::Mat& in);
void decode_infer(ncnn::Mat& cls_pred, ncnn::Mat& dis_pred, int stride, float threshold, std::vector<std::vector<BoxInfo>>& results);
BoxInfo disPred2Bbox(const float*& dfl_det, int label, float score, int x, int y, int stride);
static void nms(std::vector<BoxInfo>& result, float nms_threshold);
int input_size[2] = {320, 320};
int num_class = 80;
int reg_max = 7;
void preprocess(cv::Mat &image, ncnn::Mat &in);
void decode_infer(ncnn::Mat &cls_pred, ncnn::Mat &dis_pred, int stride,
float threshold,
std::vector<std::vector<BoxInfo>> &results);
BoxInfo disPred2Bbox(const float *&dfl_det, int label, float score, int x,
int y, int stride);
static void nms(std::vector<BoxInfo> &result, float nms_threshold);
void nms_boxes(ncnn::Mat &cls_pred, ncnn::Mat &dis_pred,
float score_threshold,
std::vector<std::vector<BoxInfo>> &result_list);
};
int image_w;
int image_h;
int in_w = 320;
int in_h = 320;
int num_class = 80;
int reg_max = 7;
float score_threshold;
float nms_threshold;
std::vector<float> bbox_output_data_;
std::vector<float> class_output_data_;
std::vector<std::string> nms_heads_info{"tmp_16", "concat_4.tmp_0"};
// If not export post-process, will use non_postprocess_heads_info
std::vector<NonPostProcessHeadInfo> non_postprocess_heads_info{
// cls_pred|dis_pred|stride
{"transpose_0.tmp_0", "transpose_1.tmp_0", 8},
{"transpose_2.tmp_0", "transpose_3.tmp_0", 16},
{"transpose_4.tmp_0", "transpose_5.tmp_0", 32},
{"transpose_6.tmp_0", "transpose_7.tmp_0", 64},
};
};
#endif