Unverified commit 28510251, authored by Guanghua Yu, committed by GitHub

add christmas application (#1958)

* add christmas application

* fix element source
Parent 4511b879
@@ -236,6 +236,9 @@ PaddleDetection implements a variety of mainstream object detection algorithms in a modular way, providing
- [Objects365 2019 Challenge champion model](docs/featured_model/champion_model/CACascadeRCNN.md)
- [Best single model of the Open Images 2019-Object Detction competition](docs/featured_model/champion_model/OIDV5_BASELINE_MODEL.md)
## Application cases
- [Automatic Christmas portrait effects generation tool](application/christmas)
## Recommended third-party tutorials
......
@@ -254,6 +254,9 @@ All these models can be get in [Model Zoo](#ModelZoo)
- [Objects365 2019 Challenge champion model](docs/featured_model/champion_model/CACascadeRCNN.md)
- [Best single model of Open Images 2019-Object Detction](docs/featured_model/champion_model/OIDV5_BASELINE_MODEL.md)
## Applications
- [Christmas portrait automatic generation tool](application/christmas)
## Updates
......
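# Automatic Christmas portrait effects generation tool
Portraits are segmented with the SOLOv2 instance segmentation model, and facial keypoints are detected with the BlazeFace keypoint model. Based on the outputs of the two models, the background is replaced with a Christmas-style one and effects such as a Santa beard, Christmas glasses, and a Santa hat are added to the face. The project can be published directly as a Server service through PaddleHub, for local debugging and for front ends to call the interface directly. You can try it through the WeChat mini program in the QR code below: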
<div align="center">
<img src="demo_images/wechat_app.jpeg" width='400'/>
</div>
## Environment setup
### Dependencies
- paddlepaddle >= 2.0.0rc0
- paddlehub >= 2.0.0b1
### Model preparation
- First obtain the models: configure `solov2` and `blazeface_keypoint` in the [model configuration files](../../configs), train the models, and [export them](../../docs/advanced_tutorials/deploy/EXPORT_MODEL.md). Alternatively, download the models we have prepared:
[blazeface_keypoint model](https://paddlemodels.bj.bcebos.com/object_detection/application/blazeface_keypoint.tar)
[solov2 model](https://paddlemodels.bj.bcebos.com/object_detection/application/solov2_r101_vd_fpn_3x.tar)
- Then copy the models into the `blazeface/blazeface_keypoint/` and `solov2/solov2_r101_vd_fpn_3x/` folders respectively.
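Before installing the modules it can help to confirm that each model folder contains the files the predictor loads. The sketch below assumes the exported models use the `__model__` and `__params__` file names that `load_predictor` in this commit reads; the helper name `missing_model_files` is hypothetical and not part of the repository.

```python
import os


def missing_model_files(model_dir, required=("__model__", "__params__")):
    """Return the required file names that are absent from model_dir."""
    return [
        name for name in required
        if not os.path.isfile(os.path.join(model_dir, name))
    ]


# Folder names taken from the step above; run this from application/christmas.
for model_dir in ("blazeface/blazeface_keypoint",
                  "solov2/solov2_r101_vd_fpn_3x"):
    missing = missing_model_files(model_dir)
    status = "ok" if not missing else "missing " + ", ".join(missing)
    print("{}: {}".format(model_dir, status))
```

If a folder is reported as missing files, the archive was probably extracted one level too deep or copied to the wrong place.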
### Install the blazeface and solov2 models with hub
```shell
hub install solov2
hub install blazeface
```
### Install the solov2_blazeface chained model for automatic Christmas effects with hub
```shell
hub install solov2_blazeface
```
## Running the tests
### Local test
```shell
python test_main.py
```
After a successful run, the prediction result is saved to `chrismas_final.png`.
### Serving test
- step1: start the service
```shell
export CUDA_VISIBLE_DEVICES=0
hub serving start -m solov2_blazeface -p 8880
```
- step2: send a prediction request to the server
```shell
python test_server.py
```
After a successful run, the prediction result is saved to `chrismas_final.png`.
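For readers who want to call the service from their own client, the following is a minimal sketch of how a request could be assembled. It assumes the usual PaddleHub Serving URL pattern `/predict/<module name>` and a JSON body carrying base64-encoded images under an `images` key; both the key name and the helper `build_request` are assumptions for illustration, and `test_server.py` in the repository remains the reference for the actual payload.

```python
import base64
import json


def build_request(image_bytes, host="127.0.0.1", port=8880,
                  module="solov2_blazeface"):
    """Return (url, headers, body) for a prediction call.

    The port and module name match the `hub serving start` command above.
    The "images" key is an assumed payload layout, not confirmed by the source.
    """
    url = "http://{}:{}/predict/{}".format(host, port, module)
    headers = {"Content-type": "application/json"}
    payload = {"images": [base64.b64encode(image_bytes).decode("utf8")]}
    return url, headers, json.dumps(payload)


# Placeholder bytes stand in for the contents of a real JPEG file.
url, headers, body = build_request(b"placeholder image bytes")
print(url)
```

The returned triple can be handed to any HTTP client, for example `requests.post(url=url, headers=headers, data=body)` if the third-party `requests` package is installed.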
## Results
<div align="center">
<img src="demo_images/test.jpg" height="600px" ><img src="demo_images/result.png" height="600px" >
</div>
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import base64
import cv2
import numpy as np
from PIL import Image, ImageDraw
import paddle.fluid as fluid
def create_inputs(im, im_info):
    """generate input for different model type
    Args:
        im (np.ndarray): image (np.ndarray)
        im_info (dict): info of image
    Returns:
        inputs (dict): input of model
    """
    inputs = {}
    inputs['image'] = im
    origin_shape = list(im_info['origin_shape'])
    resize_shape = list(im_info['resize_shape'])
    pad_shape = list(im_info['pad_shape']) if im_info[
        'pad_shape'] is not None else list(im_info['resize_shape'])
    scale_x, scale_y = im_info['scale']
    scale = scale_x
    im_info = np.array([resize_shape + [scale]]).astype('float32')
    inputs['im_info'] = im_info
    return inputs
def visualize_box_mask(im,
results,
labels=None,
mask_resolution=14,
threshold=0.5):
"""
Args:
im (str/np.ndarray): path of image/np.ndarray read by cv2
results (dict): include 'boxes': np.ndarray: shape:[N,6], N: number of box,
matix element:[class, score, x_min, y_min, x_max, y_max]
MaskRCNN's results include 'masks': np.ndarray:
shape:[N, class_num, mask_resolution, mask_resolution]
labels (list): labels:['class1', ..., 'classn']
mask_resolution (int): shape of a mask is:[mask_resolution, mask_resolution]
threshold (float): Threshold of score.
Returns:
im (PIL.Image.Image): visualized image
"""
if not labels:
labels = ['background', 'person']
if isinstance(im, str):
im = Image.open(im).convert('RGB')
else:
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
im = Image.fromarray(im)
if 'masks' in results and 'boxes' in results:
im = draw_mask(
im,
results['boxes'],
results['masks'],
labels,
resolution=mask_resolution)
if 'boxes' in results:
im = draw_box(im, results['boxes'], labels)
if 'segm' in results:
im = draw_segm(
im,
results['segm'],
results['label'],
results['score'],
labels,
threshold=threshold)
if 'landmark' in results:
im = draw_lmk(im, results['landmark'])
return im
def get_color_map_list(num_classes):
    """
    Args:
        num_classes (int): number of class
    Returns:
        color_map (list): RGB color list
    """
    color_map = num_classes * [0, 0, 0]
    for i in range(0, num_classes):
        j = 0
        lab = i
        while lab:
            color_map[i * 3] |= (((lab >> 0) & 1) << (7 - j))
            color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j))
            color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j))
            j += 1
            lab >>= 3
    color_map = [color_map[i:i + 3] for i in range(0, len(color_map), 3)]
    return color_map
def expand_boxes(boxes, scale=0.0):
    """
    Args:
        boxes (np.ndarray): shape:[N,4], N:number of box,
                            matrix element:[x_min, y_min, x_max, y_max]
        scale (float): scale of boxes
    Returns:
        boxes_exp (np.ndarray): expanded boxes
    """
    w_half = (boxes[:, 2] - boxes[:, 0]) * .5
    h_half = (boxes[:, 3] - boxes[:, 1]) * .5
    x_c = (boxes[:, 2] + boxes[:, 0]) * .5
    y_c = (boxes[:, 3] + boxes[:, 1]) * .5
    w_half *= scale
    h_half *= scale
    boxes_exp = np.zeros(boxes.shape)
    boxes_exp[:, 0] = x_c - w_half
    boxes_exp[:, 2] = x_c + w_half
    boxes_exp[:, 1] = y_c - h_half
    boxes_exp[:, 3] = y_c + h_half
    return boxes_exp
def draw_mask(im, np_boxes, np_masks, labels, resolution=14, threshold=0.5):
"""
Args:
im (PIL.Image.Image): PIL image
np_boxes (np.ndarray): shape:[N,6], N: number of box,
matix element:[class, score, x_min, y_min, x_max, y_max]
np_masks (np.ndarray): shape:[N, class_num, resolution, resolution]
labels (list): labels:['class1', ..., 'classn']
resolution (int): shape of a mask is:[resolution, resolution]
threshold (float): threshold of mask
Returns:
im (PIL.Image.Image): visualized image
"""
color_list = get_color_map_list(len(labels))
scale = (resolution + 2.0) / resolution
im_w, im_h = im.size
w_ratio = 0.4
alpha = 0.7
im = np.array(im).astype('float32')
rects = np_boxes[:, 2:]
expand_rects = expand_boxes(rects, scale)
expand_rects = expand_rects.astype(np.int32)
clsid_scores = np_boxes[:, 0:2]
padded_mask = np.zeros((resolution + 2, resolution + 2), dtype=np.float32)
clsid2color = {}
for idx in range(len(np_boxes)):
clsid, score = clsid_scores[idx].tolist()
clsid = int(clsid)
xmin, ymin, xmax, ymax = expand_rects[idx].tolist()
w = xmax - xmin + 1
h = ymax - ymin + 1
w = np.maximum(w, 1)
h = np.maximum(h, 1)
padded_mask[1:-1, 1:-1] = np_masks[idx, int(clsid), :, :]
resized_mask = cv2.resize(padded_mask, (w, h))
resized_mask = np.array(resized_mask > threshold, dtype=np.uint8)
x0 = min(max(xmin, 0), im_w)
x1 = min(max(xmax + 1, 0), im_w)
y0 = min(max(ymin, 0), im_h)
y1 = min(max(ymax + 1, 0), im_h)
im_mask = np.zeros((im_h, im_w), dtype=np.uint8)
im_mask[y0:y1, x0:x1] = resized_mask[(y0 - ymin):(y1 - ymin), (
x0 - xmin):(x1 - xmin)]
if clsid not in clsid2color:
clsid2color[clsid] = color_list[clsid]
color_mask = clsid2color[clsid]
for c in range(3):
color_mask[c] = color_mask[c] * (1 - w_ratio) + w_ratio * 255
idx = np.nonzero(im_mask)
color_mask = np.array(color_mask)
im[idx[0], idx[1], :] *= 1.0 - alpha
im[idx[0], idx[1], :] += alpha * color_mask
return Image.fromarray(im.astype('uint8'))
def draw_box(im, np_boxes, labels):
"""
Args:
im (PIL.Image.Image): PIL image
np_boxes (np.ndarray): shape:[N,6], N: number of box,
matix element:[class, score, x_min, y_min, x_max, y_max]
labels (list): labels:['class1', ..., 'classn']
Returns:
im (PIL.Image.Image): visualized image
"""
draw_thickness = min(im.size) // 320
draw = ImageDraw.Draw(im)
clsid2color = {}
color_list = get_color_map_list(len(labels))
for dt in np_boxes:
clsid, bbox, score = int(dt[0]), dt[2:], dt[1]
xmin, ymin, xmax, ymax = bbox
w = xmax - xmin
h = ymax - ymin
if clsid not in clsid2color:
clsid2color[clsid] = color_list[clsid]
color = tuple(clsid2color[clsid])
# draw bbox
draw.line(
[(xmin, ymin), (xmin, ymax), (xmax, ymax), (xmax, ymin),
(xmin, ymin)],
width=draw_thickness,
fill=color)
# draw label
text = "{} {:.4f}".format(labels[clsid], score)
tw, th = draw.textsize(text)
draw.rectangle(
[(xmin + 1, ymin - th), (xmin + tw + 1, ymin)], fill=color)
draw.text((xmin + 1, ymin - th), text, fill=(255, 255, 255))
return im
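def draw_segm(im,
              np_segms,
              np_label,
              np_score,
              labels,
              threshold=0.5,
              alpha=0.7):
    """
    Keep only the person pixels of the image: every pixel outside the
    union of the confident person masks (label 0) is set to zero.
    """
    mask_color_id = 0
    w_ratio = .4
    color_list = get_color_map_list(len(labels))
    im = np.array(im).astype('float32')
    clsid2color = {}
    np_segms = np_segms.astype(np.uint8)
    # Indices of person instances in the full result arrays.
    person_index = np.where(np_label == 0)[0]
    # Positions within person_index whose score passes the threshold; they
    # are mapped back through person_index so that np_segms is indexed with
    # positions in the full array rather than in the filtered subset.
    keep = np.where(np_score[person_index] > threshold)[0]
    person_segms = np_segms[person_index[keep]]
    person_mask = np.sum(person_segms, axis=0)
    person_mask[person_mask > 1] = 1
    person_mask = np.expand_dims(person_mask, axis=2)
    person_mask = np.repeat(person_mask, 3, axis=2)
    im = im * person_mask
    return Image.fromarray(im.astype('uint8'))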
def load_predictor(model_dir,
                   run_mode='fluid',
                   batch_size=1,
                   use_gpu=False,
                   min_subgraph_size=3):
    """set AnalysisConfig, generate AnalysisPredictor
    Args:
        model_dir (str): root path of __model__ and __params__
        use_gpu (bool): whether use gpu
    Returns:
        predictor (PaddlePredictor): AnalysisPredictor
    Raises:
        ValueError: predict by TensorRT need use_gpu == True.
    """
    if not use_gpu and not run_mode == 'fluid':
        raise ValueError(
            "Predict by TensorRT mode: {}, expect use_gpu==True, but use_gpu == {}"
            .format(run_mode, use_gpu))
    if run_mode == 'trt_int8':
        raise ValueError("TensorRT int8 mode is not supported now, "
                         "please use trt_fp32 or trt_fp16 instead.")
    precision_map = {
        'trt_int8': fluid.core.AnalysisConfig.Precision.Int8,
        'trt_fp32': fluid.core.AnalysisConfig.Precision.Float32,
        'trt_fp16': fluid.core.AnalysisConfig.Precision.Half
    }
    config = fluid.core.AnalysisConfig(
        os.path.join(model_dir, '__model__'),
        os.path.join(model_dir, '__params__'))
    if use_gpu:
        # initial GPU memory(M), device ID
        config.enable_use_gpu(100, 0)
        # optimize graph and fuse op
        config.switch_ir_optim(True)
    else:
        config.disable_gpu()
    if run_mode in precision_map.keys():
        config.enable_tensorrt_engine(
            workspace_size=1 << 10,
            max_batch_size=batch_size,
            min_subgraph_size=min_subgraph_size,
            precision_mode=precision_map[run_mode],
            use_static=False,
            use_calib_mode=False)
    # disable print log when predict
    config.disable_glog_info()
    # enable shared memory
    config.enable_memory_optim()
    # disable feed, fetch OP, needed by zero_copy_run
    config.switch_use_feed_fetch_ops(False)
    predictor = fluid.core.create_paddle_predictor(config)
    return predictor
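def cv2_to_base64(image):
    # Encode the image as JPEG, then as a base64 string.
    data = cv2.imencode('.jpg', image)[1]
    # tobytes() replaces the deprecated ndarray.tostring() alias.
    return base64.b64encode(data.tobytes()).decode('utf8')


def base64_to_cv2(b64str):
    # Decode a base64 string back into a BGR image array.
    data = base64.b64decode(b64str.encode('utf8'))
    # np.frombuffer replaces the deprecated binary mode of np.fromstring.
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data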
def lmk2out(bboxes, np_lmk, im_info, threshold=0.5, is_bbox_normalized=True):
    image_w, image_h = im_info['origin_shape']
    scale = im_info['scale']
    face_index, landmark, prior_box = np_lmk[:]
    xywh_res = []
    # Test for None first; the original order read bboxes.shape before it.
    if bboxes is None or bboxes.shape == (1, 1):
        return np.array([])
    prior = np.reshape(prior_box, (-1, 4))
    predict_lmk = np.reshape(landmark, (-1, 10))
    k = 0
    for i in range(bboxes.shape[0]):
        score = bboxes[i][1]
        if score < threshold:
            continue
        theindex = face_index[i][0]
        me_prior = prior[theindex, :]
        lmk_pred = predict_lmk[theindex, :]
        prior_h = me_prior[2] - me_prior[0]
        prior_w = me_prior[3] - me_prior[1]
        prior_h_center = (me_prior[2] + me_prior[0]) / 2
        prior_w_center = (me_prior[3] + me_prior[1]) / 2
        lmk_decode = np.zeros((10))
        for j in [0, 2, 4, 6, 8]:
            lmk_decode[j] = lmk_pred[j] * 0.1 * prior_w + prior_h_center
        for j in [1, 3, 5, 7, 9]:
            lmk_decode[j] = lmk_pred[j] * 0.1 * prior_h + prior_w_center
        if is_bbox_normalized:
            lmk_decode = lmk_decode * np.array([
                image_h, image_w, image_h, image_w, image_h, image_w, image_h,
                image_w, image_h, image_w
            ])
        xywh_res.append(lmk_decode)
    return np.asarray(xywh_res)
def draw_lmk(image, lmk_results):
    draw = ImageDraw.Draw(image)
    for lmk_decode in lmk_results:
        for j in range(5):
            x1 = int(round(lmk_decode[2 * j]))
            y1 = int(round(lmk_decode[2 * j + 1]))
            draw.ellipse(
                (x1 - 2, y1 - 2, x1 + 3, y1 + 3), fill='green', outline='green')
    return image
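To make the bit manipulation in `get_color_map_list` easier to follow, here is a small standalone run of the same logic. It is a re-typed copy for illustration, not an import from the repository. Each class id is consumed three bits at a time, and the bits are written into the R, G and B channels from the most significant bit downwards, which produces the familiar PASCAL VOC style palette.

```python
def get_color_map_list(num_classes):
    """Build one [R, G, B] colour per class id (same logic as data_feed.py)."""
    color_map = num_classes * [0, 0, 0]
    for i in range(num_classes):
        j = 0
        lab = i
        while lab:
            # Bits 0, 1 and 2 of the id go to R, G and B at bit position 7 - j.
            color_map[i * 3] |= ((lab >> 0) & 1) << (7 - j)
            color_map[i * 3 + 1] |= ((lab >> 1) & 1) << (7 - j)
            color_map[i * 3 + 2] |= ((lab >> 2) & 1) << (7 - j)
            j += 1
            lab >>= 3
    return [color_map[i:i + 3] for i in range(0, len(color_map), 3)]


print(get_color_map_list(4))
# [[0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0]]
```

Class 0 stays black, and ids above 7 start filling the next lower bit, so class 8 becomes `[64, 0, 0]`.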
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import time
from functools import reduce
import base64
import cv2
import numpy as np
from paddlehub.module.module import moduleinfo, serving
import blazeface.data_feed as D
@moduleinfo(
    name="blazeface",
    type="CV/image_editing",
    author="paddlepaddle",
    author_email="",
    summary="blazeface is a face key point detection model.",
    version="1.0.0")
class Detector(object):
    """
    Args:
        config (object): config of model, defined by `Config(model_dir)`
        model_dir (str): root path of __model__, __params__ and infer_cfg.yml
        use_gpu (bool): whether use gpu
        run_mode (str): mode of running(fluid/trt_fp32/trt_fp16)
        threshold (float): threshold to reserve the result for output.
    """

    def __init__(self,
                 min_subgraph_size=60,
                 use_gpu=False,
                 run_mode='fluid',
                 threshold=0.5):
        model_dir = os.path.join(self.directory, 'blazeface_keypoint')
        self.predictor = D.load_predictor(
            model_dir,
            run_mode=run_mode,
            min_subgraph_size=min_subgraph_size,
            use_gpu=use_gpu)

    def face_img_process(self,
                         image,
                         mean=[104., 117., 123.],
                         std=[127.502231, 127.502231, 127.502231]):
        image = np.array(image)
        # HWC to CHW
        if len(image.shape) == 3:
            image = np.swapaxes(image, 1, 2)
            image = np.swapaxes(image, 1, 0)
        # RGB to BGR
        image = image[[2, 1, 0], :, :]
        image = image.astype('float32')
        image -= np.array(mean)[:, np.newaxis, np.newaxis].astype('float32')
        image /= np.array(std)[:, np.newaxis, np.newaxis].astype('float32')
        image = [image]
        image = np.array(image)
        return image
    def transform(self, image, shrink):
        im_info = {
            'scale': [1., 1.],
            'origin_shape': None,
            'resize_shape': None,
            'pad_shape': None,
        }
        if isinstance(image, str):
            with open(image, 'rb') as f:
                im_read = f.read()
            image = np.frombuffer(im_read, dtype='uint8')
            image = cv2.imdecode(image, 1)  # BGR mode, but need RGB mode
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            im_info['origin_shape'] = image.shape[:2]
        else:
            im_info['origin_shape'] = image.shape[:2]
        image_shape = [3, image.shape[0], image.shape[1]]
        h, w = shrink, shrink
        image = cv2.resize(image, (w, h))
        im_info['resize_shape'] = image.shape[:2]
        image = self.face_img_process(image)
        inputs = D.create_inputs(image, im_info)
        return inputs, im_info

    def postprocess(self, boxes_list, lmks_list, im_info, threshold=0.5):
        assert len(boxes_list) == len(lmks_list)
        best_np_boxes, best_np_lmk = boxes_list[0], lmks_list[0]
        for i in range(1, len(boxes_list)):
            # judgment detection score
            if boxes_list[i][0][1] > 0.9:
                break
            face_width = boxes_list[i][0][4] - boxes_list[i][0][2]
            if boxes_list[i][0][1] - best_np_boxes[0][
                    1] > 0.01 and face_width > 0.2:
                best_np_boxes, best_np_lmk = boxes_list[i], lmks_list[i]
        # postprocess output of predictor
        results = {}
        results['landmark'] = D.lmk2out(best_np_boxes, best_np_lmk, im_info,
                                        threshold)
        w, h = im_info['origin_shape']
        best_np_boxes[:, 2] *= h
        best_np_boxes[:, 3] *= w
        best_np_boxes[:, 4] *= h
        best_np_boxes[:, 5] *= w
        expect_boxes = (best_np_boxes[:, 1] > threshold) & (
            best_np_boxes[:, 0] > -1)
        best_np_boxes = best_np_boxes[expect_boxes, :]
        for box in best_np_boxes:
            print('class_id:{:d}, confidence:{:.4f},'
                  'left_top:[{:.2f},{:.2f}],'
                  ' right_bottom:[{:.2f},{:.2f}]'.format(
                      int(box[0]), box[1], box[2], box[3], box[4], box[5]))
        results['boxes'] = best_np_boxes
        return results
def predict(self,
image,
threshold=0.5,
repeats=1,
visualization=False,
with_lmk=True,
save_dir='blaze_result'):
'''
Args:
image (str/np.ndarray): path of image/ np.ndarray read by cv2
threshold (float): threshold of predicted box' score
Returns:
results (dict): include 'boxes': np.ndarray: shape:[N,6], N: number of box,
matix element:[class, score, x_min, y_min, x_max, y_max]
'''
shrink = [960, 640, 480, 320, 180]
boxes_list = []
lmks_list = []
for sh in shrink:
inputs, im_info = self.transform(image, shrink=sh)
np_boxes, np_lmk = None, None
input_names = self.predictor.get_input_names()
for i in range(len(input_names)):
input_tensor = self.predictor.get_input_tensor(input_names[i])
input_tensor.copy_from_cpu(inputs[input_names[i]])
t1 = time.time()
for i in range(repeats):
self.predictor.zero_copy_run()
output_names = self.predictor.get_output_names()
boxes_tensor = self.predictor.get_output_tensor(output_names[0])
np_boxes = boxes_tensor.copy_to_cpu()
if with_lmk:
face_index = self.predictor.get_output_tensor(output_names[
1])
landmark = self.predictor.get_output_tensor(output_names[2])
prior_boxes = self.predictor.get_output_tensor(output_names[
3])
np_face_index = face_index.copy_to_cpu()
np_prior_boxes = prior_boxes.copy_to_cpu()
np_landmark = landmark.copy_to_cpu()
np_lmk = [np_face_index, np_landmark, np_prior_boxes]
t2 = time.time()
ms = (t2 - t1) * 1000.0 / repeats
print("Inference: {} ms per batch image".format(ms))
# do not perform postprocess in benchmark mode
results = []
if reduce(lambda x, y: x * y, np_boxes.shape) < 6:
print('[WARNING] No object detected.')
results = {'boxes': np.array([])}
else:
boxes_list.append(np_boxes)
lmks_list.append(np_lmk)
results = self.postprocess(
boxes_list, lmks_list, im_info, threshold=threshold)
if visualization:
if not os.path.exists(save_dir):
os.makedirs(save_dir)
output = D.visualize_box_mask(
im=image, results=results, labels=["background", "face"])
name = str(time.time()) + '.png'
save_path = os.path.join(save_dir, name)
output.save(save_path)
img = cv2.cvtColor(np.array(output), cv2.COLOR_RGB2BGR)
results['image'] = img
return results
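The scale selection rule inside `Detector.postprocess` above is compact, so a plain-Python sketch of it may help. It assumes each entry is the top detection row `[class, score, x_min, y_min, x_max, y_max]` for one input scale, with normalized coordinates. The function name `pick_best_scale` and its keyword arguments are hypothetical; the constants 0.9, 0.01 and 0.2 are the ones used in the module.

```python
def pick_best_scale(top_boxes, high_score=0.9, min_gain=0.01, min_width=0.2):
    """Return the index of the scale whose top detection is kept."""
    best = 0
    for i in range(1, len(top_boxes)):
        score = top_boxes[i][1]
        # As in the module, the scan stops at the first later scale whose
        # score exceeds high_score, without adopting that scale.
        if score > high_score:
            break
        face_width = top_boxes[i][4] - top_boxes[i][2]
        # Switch only if the score improves and the face is wide enough.
        if score - top_boxes[best][1] > min_gain and face_width > min_width:
            best = i
    return best


top_boxes = [
    [0, 0.50, 0.1, 0.1, 0.5, 0.5],  # first scale: starting candidate
    [0, 0.60, 0.1, 0.1, 0.2, 0.5],  # better score, but the face is too narrow
    [0, 0.70, 0.1, 0.1, 0.6, 0.5],  # better score and wide enough
    [0, 0.95, 0.1, 0.1, 0.6, 0.5],  # above high_score: the scan stops here
]
print(pick_best_scale(top_boxes))  # 2
```

Written this way, the early `break` stands out: a very confident detection at a later scale ends the search without being selected, which mirrors the module code as shown.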
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import base64
import cv2
import numpy as np
from PIL import Image, ImageDraw
import paddle.fluid as fluid
def create_inputs(im, im_info):
"""generate input for different model type
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
inputs (dict): input of model
"""
inputs = {}
inputs['image'] = im
origin_shape = list(im_info['origin_shape'])
resize_shape = list(im_info['resize_shape'])
pad_shape = list(im_info['pad_shape']) if im_info[
'pad_shape'] is not None else list(im_info['resize_shape'])
scale_x, scale_y = im_info['scale']
scale = scale_x
im_info = np.array([resize_shape + [scale]]).astype('float32')
inputs['im_info'] = im_info
return inputs
def visualize_box_mask(im,
results,
labels=None,
mask_resolution=14,
threshold=0.5):
"""
Args:
im (str/np.ndarray): path of image/np.ndarray read by cv2
results (dict): include 'boxes': np.ndarray: shape:[N,6], N: number of box,
matix element:[class, score, x_min, y_min, x_max, y_max]
MaskRCNN's results include 'masks': np.ndarray:
shape:[N, class_num, mask_resolution, mask_resolution]
labels (list): labels:['class1', ..., 'classn']
mask_resolution (int): shape of a mask is:[mask_resolution, mask_resolution]
threshold (float): Threshold of score.
Returns:
im (PIL.Image.Image): visualized image
"""
if not labels:
labels = [
'background', 'person', 'bicycle', 'car', 'motorcycle', 'airplane',
'bus', 'train', 'truck', 'boat', 'traffic light', 'fire', 'hydrant',
'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog',
'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe',
'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat',
'baseball glove', 'skateboard', 'surfboard', 'tennis racket',
'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl',
'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot',
'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop',
'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven',
'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase',
'scissors', 'teddy bear', 'hair drier', 'toothbrush'
]
if isinstance(im, str):
im = Image.open(im).convert('RGB')
else:
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
im = Image.fromarray(im)
if 'masks' in results and 'boxes' in results:
im = draw_mask(
im,
results['boxes'],
results['masks'],
labels,
resolution=mask_resolution)
if 'boxes' in results:
im = draw_box(im, results['boxes'], labels)
if 'segm' in results:
im = draw_segm(
im,
results['segm'],
results['label'],
results['score'],
labels,
threshold=threshold)
return im
def get_color_map_list(num_classes):
"""
Args:
num_classes (int): number of class
Returns:
color_map (list): RGB color list
"""
color_map = num_classes * [0, 0, 0]
for i in range(0, num_classes):
j = 0
lab = i
while lab:
color_map[i * 3] |= (((lab >> 0) & 1) << (7 - j))
color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j))
color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j))
j += 1
lab >>= 3
color_map = [color_map[i:i + 3] for i in range(0, len(color_map), 3)]
return color_map
def expand_boxes(boxes, scale=0.0):
"""
Args:
boxes (np.ndarray): shape:[N,4], N:number of box,
matix element:[x_min, y_min, x_max, y_max]
scale (float): scale of boxes
Returns:
boxes_exp (np.ndarray): expanded boxes
"""
w_half = (boxes[:, 2] - boxes[:, 0]) * .5
h_half = (boxes[:, 3] - boxes[:, 1]) * .5
x_c = (boxes[:, 2] + boxes[:, 0]) * .5
y_c = (boxes[:, 3] + boxes[:, 1]) * .5
w_half *= scale
h_half *= scale
boxes_exp = np.zeros(boxes.shape)
boxes_exp[:, 0] = x_c - w_half
boxes_exp[:, 2] = x_c + w_half
boxes_exp[:, 1] = y_c - h_half
boxes_exp[:, 3] = y_c + h_half
return boxes_exp
def draw_mask(im, np_boxes, np_masks, labels, resolution=14, threshold=0.5):
"""
Args:
im (PIL.Image.Image): PIL image
np_boxes (np.ndarray): shape:[N,6], N: number of box,
matix element:[class, score, x_min, y_min, x_max, y_max]
np_masks (np.ndarray): shape:[N, class_num, resolution, resolution]
labels (list): labels:['class1', ..., 'classn']
resolution (int): shape of a mask is:[resolution, resolution]
threshold (float): threshold of mask
Returns:
im (PIL.Image.Image): visualized image
"""
color_list = get_color_map_list(len(labels))
scale = (resolution + 2.0) / resolution
im_w, im_h = im.size
w_ratio = 0.4
alpha = 0.7
im = np.array(im).astype('float32')
rects = np_boxes[:, 2:]
expand_rects = expand_boxes(rects, scale)
expand_rects = expand_rects.astype(np.int32)
clsid_scores = np_boxes[:, 0:2]
padded_mask = np.zeros((resolution + 2, resolution + 2), dtype=np.float32)
clsid2color = {}
for idx in range(len(np_boxes)):
clsid, score = clsid_scores[idx].tolist()
clsid = int(clsid)
xmin, ymin, xmax, ymax = expand_rects[idx].tolist()
w = xmax - xmin + 1
h = ymax - ymin + 1
w = np.maximum(w, 1)
h = np.maximum(h, 1)
padded_mask[1:-1, 1:-1] = np_masks[idx, int(clsid), :, :]
resized_mask = cv2.resize(padded_mask, (w, h))
resized_mask = np.array(resized_mask > threshold, dtype=np.uint8)
x0 = min(max(xmin, 0), im_w)
x1 = min(max(xmax + 1, 0), im_w)
y0 = min(max(ymin, 0), im_h)
y1 = min(max(ymax + 1, 0), im_h)
im_mask = np.zeros((im_h, im_w), dtype=np.uint8)
im_mask[y0:y1, x0:x1] = resized_mask[(y0 - ymin):(y1 - ymin), (
x0 - xmin):(x1 - xmin)]
if clsid not in clsid2color:
clsid2color[clsid] = color_list[clsid]
color_mask = clsid2color[clsid]
for c in range(3):
color_mask[c] = color_mask[c] * (1 - w_ratio) + w_ratio * 255
idx = np.nonzero(im_mask)
color_mask = np.array(color_mask)
im[idx[0], idx[1], :] *= 1.0 - alpha
im[idx[0], idx[1], :] += alpha * color_mask
return Image.fromarray(im.astype('uint8'))
def draw_box(im, np_boxes, labels):
"""
Args:
im (PIL.Image.Image): PIL image
np_boxes (np.ndarray): shape:[N,6], N: number of box,
matix element:[class, score, x_min, y_min, x_max, y_max]
labels (list): labels:['class1', ..., 'classn']
Returns:
im (PIL.Image.Image): visualized image
"""
draw_thickness = min(im.size) // 320
draw = ImageDraw.Draw(im)
clsid2color = {}
color_list = get_color_map_list(len(labels))
for dt in np_boxes:
clsid, bbox, score = int(dt[0]), dt[2:], dt[1]
xmin, ymin, xmax, ymax = bbox
w = xmax - xmin
h = ymax - ymin
if clsid not in clsid2color:
clsid2color[clsid] = color_list[clsid]
color = tuple(clsid2color[clsid])
# draw bbox
draw.line(
[(xmin, ymin), (xmin, ymax), (xmax, ymax), (xmax, ymin),
(xmin, ymin)],
width=draw_thickness,
fill=color)
# draw label
text = "{} {:.4f}".format(labels[clsid], score)
tw, th = draw.textsize(text)
draw.rectangle(
[(xmin + 1, ymin - th), (xmin + tw + 1, ymin)], fill=color)
draw.text((xmin + 1, ymin - th), text, fill=(255, 255, 255))
return im
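def draw_segm(im,
              np_segms,
              np_label,
              np_score,
              labels,
              threshold=0.5,
              alpha=0.7):
    """
    Keep only the person pixels of the image: every pixel outside the
    union of the confident person masks (label 0) is set to zero.
    """
    mask_color_id = 0
    w_ratio = .4
    color_list = get_color_map_list(len(labels))
    im = np.array(im).astype('float32')
    clsid2color = {}
    np_segms = np_segms.astype(np.uint8)
    # Indices of person instances in the full result arrays.
    person_index = np.where(np_label == 0)[0]
    # Positions within person_index whose score passes the threshold; they
    # are mapped back through person_index so that np_segms is indexed with
    # positions in the full array rather than in the filtered subset.
    keep = np.where(np_score[person_index] > threshold)[0]
    person_segms = np_segms[person_index[keep]]
    person_mask = np.sum(person_segms, axis=0)
    person_mask[person_mask > 1] = 1
    person_mask = np.expand_dims(person_mask, axis=2)
    person_mask = np.repeat(person_mask, 3, axis=2)
    im = im * person_mask
    return Image.fromarray(im.astype('uint8'))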
def load_predictor(model_dir,
run_mode='fluid',
batch_size=1,
use_gpu=False,
min_subgraph_size=3):
"""set AnalysisConfig, generate AnalysisPredictor
Args:
model_dir (str): root path of __model__ and __params__
use_gpu (bool): whether use gpu
Returns:
predictor (PaddlePredictor): AnalysisPredictor
Raises:
ValueError: predict by TensorRT need use_gpu == True.
"""
if not use_gpu and not run_mode == 'fluid':
raise ValueError(
"Predict by TensorRT mode: {}, expect use_gpu==True, but use_gpu == {}"
.format(run_mode, use_gpu))
if run_mode == 'trt_int8':
raise ValueError("TensorRT int8 mode is not supported now, "
"please use trt_fp32 or trt_fp16 instead.")
precision_map = {
'trt_int8': fluid.core.AnalysisConfig.Precision.Int8,
'trt_fp32': fluid.core.AnalysisConfig.Precision.Float32,
'trt_fp16': fluid.core.AnalysisConfig.Precision.Half
}
config = fluid.core.AnalysisConfig(
os.path.join(model_dir, '__model__'),
os.path.join(model_dir, '__params__'))
if use_gpu:
# initial GPU memory(M), device ID
config.enable_use_gpu(100, 0)
# optimize graph and fuse op
config.switch_ir_optim(True)
else:
config.disable_gpu()
if run_mode in precision_map.keys():
config.enable_tensorrt_engine(
workspace_size=1 << 10,
max_batch_size=batch_size,
min_subgraph_size=min_subgraph_size,
precision_mode=precision_map[run_mode],
use_static=False,
use_calib_mode=False)
# disable print log when predict
config.disable_glog_info()
# enable shared memory
config.enable_memory_optim()
# disable feed, fetch OP, needed by zero_copy_run
config.switch_use_feed_fetch_ops(False)
predictor = fluid.core.create_paddle_predictor(config)
return predictor
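def cv2_to_base64(image):
    # Encode the image as JPEG, then as a base64 string.
    data = cv2.imencode('.jpg', image)[1]
    # tobytes() replaces the deprecated ndarray.tostring() alias.
    return base64.b64encode(data.tobytes()).decode('utf8')


def base64_to_cv2(b64str):
    # Decode a base64 string back into a BGR image array.
    data = base64.b64decode(b64str.encode('utf8'))
    # np.frombuffer replaces the deprecated binary mode of np.fromstring.
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data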
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import time
from functools import reduce
import base64
import cv2
import numpy as np
from paddlehub.module.module import moduleinfo, serving
import solov2.processor as P
import solov2.data_feed as D
class Detector(object):
"""
Args:
model_dir (str): root path of __model__, __params__ and infer_cfg.yml
use_gpu (bool): whether use gpu
run_mode (str): mode of running(fluid/trt_fp32/trt_fp16)
threshold (float): threshold to reserve the result for output.
"""
def __init__(self,
min_subgraph_size=60,
use_gpu=False,
run_mode='fluid',
threshold=0.5):
model_dir = os.path.join(self.directory, 'solov2_r101_vd_fpn_3x')
self.predictor = D.load_predictor(
model_dir,
run_mode=run_mode,
min_subgraph_size=min_subgraph_size,
use_gpu=use_gpu)
self.compose = [
P.Resize(max_size=1333), P.Normalize(
mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
P.Permute(), P.PadStride(stride=32)
]
def transform(self, im):
im, im_info = P.preprocess(im, self.compose)
inputs = D.create_inputs(im, im_info)
return inputs, im_info
def postprocess(self, np_boxes, np_masks, im_info, threshold=0.5):
# postprocess output of predictor
results = {}
expect_boxes = (np_boxes[:, 1] > threshold) & (np_boxes[:, 0] > -1)
np_boxes = np_boxes[expect_boxes, :]
for box in np_boxes:
print('class_id:{:d}, confidence:{:.4f},'
'left_top:[{:.2f},{:.2f}],'
' right_bottom:[{:.2f},{:.2f}]'.format(
int(box[0]), box[1], box[2], box[3], box[4], box[5]))
results['boxes'] = np_boxes
if np_masks is not None:
np_masks = np_masks[expect_boxes, :, :, :]
results['masks'] = np_masks
return results
def predict(self, image, threshold=0.5, warmup=0, repeats=1):
'''
Args:
image (str/np.ndarray): path of image/ np.ndarray read by cv2
threshold (float): threshold of predicted box' score
Returns:
results (dict): include 'boxes': np.ndarray: shape:[N,6], N: number of box,
matix element:[class, score, x_min, y_min, x_max, y_max]
MaskRCNN's results include 'masks': np.ndarray:
shape:[N, class_num, mask_resolution, mask_resolution]
'''
inputs, im_info = self.transform(image)
np_boxes, np_masks = None, None
input_names = self.predictor.get_input_names()
for i in range(len(input_names)):
input_tensor = self.predictor.get_input_tensor(input_names[i])
input_tensor.copy_from_cpu(inputs[input_names[i]])
for i in range(warmup):
self.predictor.zero_copy_run()
output_names = self.predictor.get_output_names()
boxes_tensor = self.predictor.get_output_tensor(output_names[0])
np_boxes = boxes_tensor.copy_to_cpu()
for i in range(repeats):
self.predictor.zero_copy_run()
output_names = self.predictor.get_output_names()
boxes_tensor = self.predictor.get_output_tensor(output_names[0])
np_boxes = boxes_tensor.copy_to_cpu()
# do not perform postprocess in benchmark mode
results = []
if reduce(lambda x, y: x * y, np_boxes.shape) < 6:
print('[WARNING] No object detected.')
results = {'boxes': np.array([])}
else:
results = self.postprocess(
np_boxes, np_masks, im_info, threshold=threshold)
return results
@moduleinfo(
name="solov2",
type="CV/image_editing",
author="paddlepaddle",
author_email="",
summary="SOLOv2 is an instance segmentation model; this module is trained on the COCO dataset.",
version="1.0.0")
class DetectorSOLOv2(Detector):
def __init__(self, use_gpu=False, run_mode='fluid', threshold=0.5):
super(DetectorSOLOv2, self).__init__(
use_gpu=use_gpu, run_mode=run_mode, threshold=threshold)
def predict(self,
image,
threshold=0.5,
warmup=0,
repeats=1,
visualization=False,
save_dir='solov2_result'):
inputs, im_info = self.transform(image)
np_label, np_score, np_segms = None, None, None
input_names = self.predictor.get_input_names()
for i in range(len(input_names)):
input_tensor = self.predictor.get_input_tensor(input_names[i])
input_tensor.copy_from_cpu(inputs[input_names[i]])
for i in range(warmup):
self.predictor.zero_copy_run()
output_names = self.predictor.get_output_names()
np_label = self.predictor.get_output_tensor(output_names[
0]).copy_to_cpu()
np_score = self.predictor.get_output_tensor(output_names[
1]).copy_to_cpu()
np_segms = self.predictor.get_output_tensor(output_names[
2]).copy_to_cpu()
for i in range(repeats):
self.predictor.zero_copy_run()
output_names = self.predictor.get_output_names()
np_label = self.predictor.get_output_tensor(output_names[
0]).copy_to_cpu()
np_score = self.predictor.get_output_tensor(output_names[
1]).copy_to_cpu()
np_segms = self.predictor.get_output_tensor(output_names[
2]).copy_to_cpu()
output = dict(segm=np_segms, label=np_label, score=np_score)
if visualization:
if not os.path.exists(save_dir):
os.makedirs(save_dir)
image = D.visualize_box_mask(im=image, results=output)
name = str(time.time()) + '.png'
save_path = os.path.join(save_dir, name)
image.save(save_path)
img = cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR)
output['image'] = img
return output
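The score filtering done in `Detector.postprocess` above can be illustrated without the predictor. The following is a minimal, dependency-free sketch, assuming each row has the `[class, score, x_min, y_min, x_max, y_max]` layout described in the `predict` docstring; `filter_boxes` is a hypothetical helper name, not part of the module.

```python
def filter_boxes(boxes, threshold=0.5):
    """Keep rows whose score exceeds the threshold and whose class id is valid (> -1).

    Mirrors the boolean mask in Detector.postprocess using plain lists
    instead of NumPy arrays (hypothetical helper for illustration only).
    """
    return [row for row in boxes if row[1] > threshold and row[0] > -1]


boxes = [
    [0, 0.92, 10.0, 20.0, 110.0, 220.0],  # confident detection: kept
    [0, 0.30, 5.0, 5.0, 50.0, 50.0],      # below the default threshold: dropped
    [-1, 0.99, 0.0, 0.0, 1.0, 1.0],       # invalid class id: dropped
]
kept = filter_boxes(boxes)
```

Lowering `threshold` keeps more rows, which is the same knob exposed by the `threshold` argument of `predict`.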
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from PIL import Image
import cv2
import numpy as np
def decode_image(im_file, im_info):
"""read rgb image
Args:
im_file (str/np.ndarray): path of image/ np.ndarray read by cv2
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
if isinstance(im_file, str):
with open(im_file, 'rb') as f:
im_read = f.read()
data = np.frombuffer(im_read, dtype='uint8')
im = cv2.imdecode(data, 1) # BGR mode, but need RGB mode
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
im_info['origin_shape'] = im.shape[:2]
im_info['resize_shape'] = im.shape[:2]
else:
im = im_file
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
im_info['origin_shape'] = im.shape[:2]
im_info['resize_shape'] = im.shape[:2]
return im, im_info
class Resize(object):
"""resize image by target_size and max_size
Args:
arch (str): model type
target_size (int): the target size of image
max_size (int): the max size of image
use_cv2 (bool): whether to use cv2
image_shape (list): input shape of model
interp (int): method of resize
"""
def __init__(self,
target_size=800,
max_size=1333,
use_cv2=True,
image_shape=None,
interp=cv2.INTER_LINEAR,
resize_box=False):
self.target_size = target_size
self.max_size = max_size
self.image_shape = image_shape
self.use_cv2 = use_cv2
self.interp = interp
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
im_channel = im.shape[2]
im_scale_x, im_scale_y = self.generate_scale(im)
im_info['resize_shape'] = [
im_scale_y * float(im.shape[0]), im_scale_x * float(im.shape[1])
]
if self.use_cv2:
im = cv2.resize(
im,
None,
None,
fx=im_scale_x,
fy=im_scale_y,
interpolation=self.interp)
else:
resize_w = int(im_scale_x * float(im.shape[1]))
resize_h = int(im_scale_y * float(im.shape[0]))
if self.max_size != 0:
raise TypeError(
'If you set max_size to cap the maximum size of image, '
'please set use_cv2 to True to resize the image.')
im = im.astype('uint8')
im = Image.fromarray(im)
im = im.resize((int(resize_w), int(resize_h)), self.interp)
im = np.array(im)
# padding im when image_shape fixed by infer_cfg.yml
if self.max_size != 0 and self.image_shape is not None:
padding_im = np.zeros(
(self.max_size, self.max_size, im_channel), dtype=np.float32)
im_h, im_w = im.shape[:2]
padding_im[:im_h, :im_w, :] = im
im = padding_im
im_info['scale'] = [im_scale_x, im_scale_y]
return im, im_info
def generate_scale(self, im):
"""
Args:
im (np.ndarray): image (np.ndarray)
Returns:
im_scale_x: the resize ratio of X
im_scale_y: the resize ratio of Y
"""
origin_shape = im.shape[:2]
im_c = im.shape[2]
if self.max_size != 0:
im_size_min = np.min(origin_shape[0:2])
im_size_max = np.max(origin_shape[0:2])
im_scale = float(self.target_size) / float(im_size_min)
if np.round(im_scale * im_size_max) > self.max_size:
im_scale = float(self.max_size) / float(im_size_max)
im_scale_x = im_scale
im_scale_y = im_scale
else:
im_scale_x = float(self.target_size) / float(origin_shape[1])
im_scale_y = float(self.target_size) / float(origin_shape[0])
return im_scale_x, im_scale_y
class Normalize(object):
"""normalize image
Args:
mean (list): im - mean
std (list): im / std
is_scale (bool): whether need im / 255
is_channel_first (bool): if True: image shape is CHW, else: HWC
"""
def __init__(self, mean, std, is_scale=True, is_channel_first=False):
self.mean = mean
self.std = std
self.is_scale = is_scale
self.is_channel_first = is_channel_first
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
im = im.astype(np.float32, copy=False)
if self.is_channel_first:
mean = np.array(self.mean)[:, np.newaxis, np.newaxis]
std = np.array(self.std)[:, np.newaxis, np.newaxis]
else:
mean = np.array(self.mean)[np.newaxis, np.newaxis, :]
std = np.array(self.std)[np.newaxis, np.newaxis, :]
if self.is_scale:
im = im / 255.0
im -= mean
im /= std
return im, im_info
class Permute(object):
"""permute image
Args:
to_bgr (bool): whether convert RGB to BGR
channel_first (bool): whether convert HWC to CHW
"""
def __init__(self, to_bgr=False, channel_first=True):
self.to_bgr = to_bgr
self.channel_first = channel_first
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
if self.channel_first:
im = im.transpose((2, 0, 1)).copy()
if self.to_bgr:
im = im[[2, 1, 0], :, :]
return im, im_info
class PadStride(object):
""" padding image for model with FPN
Args:
stride (int): models with FPN need image shape % stride == 0
"""
def __init__(self, stride=0):
self.coarsest_stride = stride
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
coarsest_stride = self.coarsest_stride
if coarsest_stride == 0:
return im, im_info
im_c, im_h, im_w = im.shape
pad_h = int(np.ceil(float(im_h) / coarsest_stride) * coarsest_stride)
pad_w = int(np.ceil(float(im_w) / coarsest_stride) * coarsest_stride)
padding_im = np.zeros((im_c, pad_h, pad_w), dtype=np.float32)
padding_im[:, :im_h, :im_w] = im
im_info['pad_shape'] = padding_im.shape[1:]
return padding_im, im_info
def preprocess(im, preprocess_ops):
# process image by preprocess_ops
im_info = {
'scale': [1., 1.],
'origin_shape': None,
'resize_shape': None,
'pad_shape': None,
}
im, im_info = decode_image(im, im_info)
count = 0
for operator in preprocess_ops:
count += 1
im, im_info = operator(im, im_info)
im = np.array((im, )).astype('float32')
return im, im_info
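The size arithmetic in `Resize.generate_scale` and `PadStride` above can be checked in isolation. This is a minimal sketch in plain Python, assuming the defaults used by the detector (`target_size=800`, `max_size=1333`, `stride=32`); `keep_ratio_scale` and `padded_size` are hypothetical names introduced only for illustration.

```python
import math


def keep_ratio_scale(height, width, target_size=800, max_size=1333):
    """Scale the short side to target_size unless the long side would exceed max_size."""
    short_side, long_side = min(height, width), max(height, width)
    scale = target_size / short_side
    if round(scale * long_side) > max_size:
        # Long side would overflow, so cap it at max_size instead.
        scale = max_size / long_side
    return scale


def padded_size(height, width, stride=32):
    """Round both dimensions up to the next multiple of stride (0 disables padding)."""
    if stride == 0:
        return height, width
    return (math.ceil(height / stride) * stride,
            math.ceil(width / stride) * stride)


scale_small = keep_ratio_scale(480, 640)   # short side drives the scale
scale_wide = keep_ratio_scale(500, 2000)   # long side is capped at max_size
padded = padded_size(800, 1067)            # width rounded up to a multiple of 32
```

For a 480x640 input the short side is scaled to 800, while a 500x2000 input is limited by the 1333 cap on its long side.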
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import cv2
import json
import math
import numpy as np
import argparse
HAT_SCALES = {
'1.png': [3.0, 0.9, .0],
'2.png': [3.0, 1.3, .5],
'3.png': [2.2, 1.5, .8],
'4.png': [2.2, 1.8, .0],
'5.png': [1.8, 1.2, .0],
}
GLASSES_SCALES = {
'1.png': [0.65, 2.5],
'2.png': [0.65, 2.5],
}
BEARD_SCALES = {'1.png': [700, 0.3], '2.png': [220, 0.2]}
def rotate(image, angle):
"""
angle is in degrees, not radians
"""
(h, w) = image.shape[:2]
(cx, cy) = (w / 2, h / 2)
M = cv2.getRotationMatrix2D((cx, cy), -angle, 1.0)
cos = np.abs(M[0, 0])
sin = np.abs(M[0, 1])
nW = int((h * sin) + (w * cos))
nH = int((h * cos) + (w * sin))
M[0, 2] += (nW / 2) - cx
M[1, 2] += (nH / 2) - cy
return cv2.warpAffine(image, M, (nW, nH))
def n_rotate_coord(angle, x, y):
"""
angle is in radians, not degrees
"""
rotatex = math.cos(angle) * x - math.sin(angle) * y
rotatey = math.cos(angle) * y + math.sin(angle) * x
return rotatex, rotatey
def r_rotate_coord(angle, x, y):
"""
angle is in radians, not degrees
"""
rotatex = math.cos(angle) * x + math.sin(angle) * y
rotatey = math.cos(angle) * y - math.sin(angle) * x
return rotatex, rotatey
def add_beard(person, kypoint, element_path):
beard_file_name = os.path.split(element_path)[1]
# element_len: top width of beard
# loc_offset_scale: scale relative to nose
element_len, loc_offset_scale = BEARD_SCALES[beard_file_name][:]
x1, y1, x2, y2, x3, y3, x4, y4, x5, y5 = kypoint[:]
mouth_len = np.sqrt(np.square(np.abs(y4 - y5)) + np.square(x4 - x5))
element = cv2.imread(element_path)
h, w, _ = element.shape
resize_scale = mouth_len / float(element_len)
h, w = round(h * resize_scale + 0.5), round(w * resize_scale + 0.5)
resized_element = cv2.resize(element, (w, h))
resized_ele_h, resized_ele_w, _ = resized_element.shape
# First find the keypoint of mouth in front face
m_center_x = (x4 + x5) / 2.
m_center_y = (y4 + y5) / 2.
# compute the rotation angle from the mouth coordinates only
degree = np.arccos((x4 - x5) / mouth_len)
# coordinate of RoI in front face
half_w = int(resized_ele_w // 2)
scale = loc_offset_scale
roi_top_left_y = int(y3 + (((y5 + y4) // 2) - y3) * scale)
roi_top_left_x = int(x3 - half_w)
roi_top_right_y = roi_top_left_y
roi_top_right_x = int(x3 + half_w)
roi_bottom_left_y = roi_top_left_y + resized_ele_h
roi_bottom_left_x = roi_top_left_x
roi_bottom_right_y = roi_bottom_left_y
roi_bottom_right_x = roi_top_right_x
r_x11, r_y11 = roi_top_left_x - x3, roi_top_left_y - y3
r_x12, r_y12 = roi_top_right_x - x3, roi_top_right_y - y3
r_x21, r_y21 = roi_bottom_left_x - x3, roi_bottom_left_y - y3
r_x22, r_y22 = roi_bottom_right_x - x3, roi_bottom_right_y - y3
# coordinate of RoI in raw face
if m_center_x > x3:
x11, y11 = r_rotate_coord(degree, r_x11, r_y11)
x12, y12 = r_rotate_coord(degree, r_x12, r_y12)
x21, y21 = r_rotate_coord(degree, r_x21, r_y21)
x22, y22 = r_rotate_coord(degree, r_x22, r_y22)
else:
x11, y11 = n_rotate_coord(degree, r_x11, r_y11)
x12, y12 = n_rotate_coord(degree, r_x12, r_y12)
x21, y21 = n_rotate_coord(degree, r_x21, r_y21)
x22, y22 = n_rotate_coord(degree, r_x22, r_y22)
x11, y11 = x11 + x3, y11 + y3
x12, y12 = x12 + x3, y12 + y3
x21, y21 = x21 + x3, y21 + y3
x22, y22 = x22 + x3, y22 + y3
min_x = int(min(x11, x12, x21, x22))
max_x = int(max(x11, x12, x21, x22))
min_y = int(min(y11, y12, y21, y22))
max_y = int(max(y11, y12, y21, y22))
angle = np.degrees(degree)
if y4 < y5:
angle = -angle
rotated_element = rotate(resized_element, angle)
rotated_ele_h, rotated_ele_w, _ = rotated_element.shape
max_x = min_x + int(rotated_ele_w)
max_y = min_y + int(rotated_ele_h)
e2gray = cv2.cvtColor(rotated_element, cv2.COLOR_BGR2GRAY)
ret, mask = cv2.threshold(e2gray, 238, 255, cv2.THRESH_BINARY_INV)
mask_inv = cv2.bitwise_not(mask)
roi = person[min_y:max_y, min_x:max_x]
person_bg = cv2.bitwise_and(roi, roi, mask=mask)
element_fg = cv2.bitwise_and(
rotated_element, rotated_element, mask=mask_inv)
dst = cv2.add(person_bg, element_fg)
person[min_y:max_y, min_x:max_x] = dst
return person
def add_hat(person, kypoint, element_path):
x1, y1, x2, y2, x3, y3, x4, y4, x5, y5 = kypoint[:]
eye_len = np.sqrt(np.square(np.abs(y1 - y2)) + np.square(np.abs(x1 - x2)))
# compute the rotation angle from the eye coordinates only
degree = np.arccos((x2 - x1) / eye_len)
angle = np.degrees(degree)
if y2 < y1:
angle = -angle
element = cv2.imread(element_path)
hat_file_name = os.path.split(element_path)[1]
# head_scale: size scale of hat
# high_scale: height scale above the eyes
# offect_scale: horizontal offset of the hat relative to the face
head_scale, high_scale, offect_scale = HAT_SCALES[hat_file_name][:]
h, w, _ = element.shape
element_len = w
resize_scale = eye_len * head_scale / float(w)
h, w = round(h * resize_scale + 0.5), round(w * resize_scale + 0.5)
resized_element = cv2.resize(element, (w, h))
resized_ele_h, resized_ele_w, _ = resized_element.shape
m_center_x = (x1 + x2) / 2.
m_center_y = (y1 + y2) / 2.
head_len = int(eye_len * high_scale)
# Both signs of the angle use the same offset, so no branch is needed.
head_center_x = int(m_center_x + head_len * math.sin(degree))
head_center_y = int(m_center_y - head_len * math.cos(degree))
rotated_element = rotate(resized_element, angle)
rotated_ele_h, rotated_ele_w, _ = rotated_element.shape
max_x = int(head_center_x + (resized_ele_w // 2) * math.cos(degree)) + int(
angle * head_scale) + int(eye_len * offect_scale)
min_y = int(head_center_y - (resized_ele_w // 2) * math.cos(degree))
pad_ele_x0 = 0 if (max_x - int(rotated_ele_w)) > 0 else -(
max_x - int(rotated_ele_w))
pad_ele_y0 = 0 if min_y > 0 else -(min_y)
min_x = int(max(max_x - int(rotated_ele_w), 0))
min_y = int(max(min_y, 0))
max_y = min_y + int(rotated_ele_h)
pad_y1 = max(max_y - int(person.shape[0]), 0)
pad_x1 = max(max_x - int(person.shape[1]), 0)
pad_w = pad_ele_x0 + pad_x1
pad_h = pad_ele_y0 + pad_y1
max_x += pad_w
pad_person = np.zeros(
(person.shape[0] + pad_h, person.shape[1] + pad_w, 3)).astype(np.uint8)
pad_person[pad_ele_y0:pad_ele_y0 + person.shape[0], pad_ele_x0:pad_ele_x0 +
person.shape[1], :] = person
e2gray = cv2.cvtColor(rotated_element, cv2.COLOR_BGR2GRAY)
ret, mask = cv2.threshold(e2gray, 1, 255, cv2.THRESH_BINARY_INV)
mask_inv = cv2.bitwise_not(mask)
roi = pad_person[min_y:max_y, min_x:max_x]
person_bg = cv2.bitwise_and(roi, roi, mask=mask)
element_fg = cv2.bitwise_and(
rotated_element, rotated_element, mask=mask_inv)
dst = cv2.add(person_bg, element_fg)
pad_person[min_y:max_y, min_x:max_x] = dst
return pad_person, pad_ele_x0, pad_x1, pad_ele_y0, pad_y1, min_x, min_y, max_x, max_y
def add_glasses(person, kypoint, element_path):
x1, y1, x2, y2, x3, y3, x4, y4, x5, y5 = kypoint[:]
eye_len = np.sqrt(np.square(np.abs(y1 - y2)) + np.square(np.abs(x1 - x2)))
# compute the rotation angle from the eye coordinates only
degree = np.arccos((x2 - x1) / eye_len)
angle = np.degrees(degree)
if y2 < y1:
angle = -angle
element = cv2.imread(element_path)
glasses_file_name = os.path.split(element_path)[1]
# height_scale: height scale above the eyes
# glasses_scale: size ratio of glasses
height_scale, glasses_scale = GLASSES_SCALES[glasses_file_name][:]
h, w, _ = element.shape
element_len = w
resize_scale = eye_len * glasses_scale / float(element_len)
h, w = round(h * resize_scale + 0.5), round(w * resize_scale + 0.5)
resized_element = cv2.resize(element, (w, h))
resized_ele_h, resized_ele_w, _ = resized_element.shape
rotated_element = rotate(resized_element, angle)
rotated_ele_h, rotated_ele_w, _ = rotated_element.shape
eye_center_x = (x1 + x2) / 2.
eye_center_y = (y1 + y2) / 2.
min_x = int(eye_center_x) - int(rotated_ele_w * 0.5) + int(
angle * glasses_scale * person.shape[1] / 2000)
min_y = int(eye_center_y) - int(rotated_ele_h * height_scale)
max_x = min_x + rotated_ele_w
max_y = min_y + rotated_ele_h
e2gray = cv2.cvtColor(rotated_element, cv2.COLOR_BGR2GRAY)
ret, mask = cv2.threshold(e2gray, 1, 255, cv2.THRESH_BINARY_INV)
mask_inv = cv2.bitwise_not(mask)
roi = person[min_y:max_y, min_x:max_x]
person_bg = cv2.bitwise_and(roi, roi, mask=mask)
element_fg = cv2.bitwise_and(
rotated_element, rotated_element, mask=mask_inv)
dst = cv2.add(person_bg, element_fg)
person[min_y:max_y, min_x:max_x] = dst
return person
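The head-tilt angle used by `add_hat` and `add_glasses`, and the two coordinate rotation helpers, can be exercised without OpenCV. The sketch below assumes keypoints are given in image coordinates with y pointing down; `eye_angle_degrees` is a hypothetical name, and the clamp before `acos` is an added safeguard that is not in the original code.

```python
import math


def eye_angle_degrees(x1, y1, x2, y2):
    """Angle of the eye line in degrees, negated when the second eye is higher (y2 < y1)."""
    eye_len = math.hypot(x2 - x1, y2 - y1)
    ratio = max(-1.0, min(1.0, (x2 - x1) / eye_len))  # guard against rounding error
    angle = math.degrees(math.acos(ratio))
    return -angle if y2 < y1 else angle


def n_rotate_coord(angle, x, y):
    """Rotate (x, y) by +angle radians, same formula as in the file above."""
    return (math.cos(angle) * x - math.sin(angle) * y,
            math.cos(angle) * y + math.sin(angle) * x)


def r_rotate_coord(angle, x, y):
    """Rotate (x, y) by -angle radians, the inverse of n_rotate_coord."""
    return (math.cos(angle) * x + math.sin(angle) * y,
            math.cos(angle) * y - math.sin(angle) * x)


level = eye_angle_degrees(0, 0, 10, 0)
tilted_down = eye_angle_degrees(0, 0, 10, 10)
tilted_up = eye_angle_degrees(0, 10, 10, 0)
back_x, back_y = r_rotate_coord(0.3, *n_rotate_coord(0.3, 2.0, 5.0))
```

Applying `r_rotate_coord` after `n_rotate_coord` with the same angle returns the starting point, which is why `add_beard` can pick one or the other depending on which way the face is turned.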
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import time
import base64
import json
import cv2
import numpy as np
import paddle.nn as nn
import paddlehub as hub
from paddlehub.module.module import moduleinfo, serving, Module
import solov2_blazeface.processor as P
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.frombuffer(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
@moduleinfo(
name="solov2_blazeface",
type="CV/image_editing",
author="paddlepaddle",
author_email="",
summary="solov2_blazeface is a segmentation and face detection model based on solov2 and blazeface.",
version="1.0.0")
class SoloV2BlazeFaceModel(nn.Layer):
"""
SoloV2BlazeFaceModel
"""
def __init__(self, use_gpu=True):
super(SoloV2BlazeFaceModel, self).__init__()
self.solov2 = hub.Module(name='solov2', use_gpu=use_gpu)
self.blaceface = hub.Module(name='blazeface', use_gpu=use_gpu)
def predict(self,
image,
background,
beard_file=None,
glasses_file=None,
hat_file=None,
visualization=False,
threshold=0.5):
# instance segmentation
solov2_output = self.solov2.predict(
image=image, threshold=threshold, visualization=visualization)
# Set background pixel to 0
im_segm, x0, x1, y0, y1, _, _, _, _, flag_seg = P.visualize_box_mask(
image, solov2_output, threshold=threshold)
if flag_seg == 0:
return im_segm
h, w = y1 - y0, x1 - x0
back_json = background[:-3] + 'json'
stand_box = json.load(open(back_json))
stand_box = stand_box['outputs']['object'][0]['bndbox']
stand_xmin, stand_xmax, stand_ymin, stand_ymax = stand_box[
'xmin'], stand_box['xmax'], stand_box['ymin'], stand_box['ymax']
im_path = np.asarray(im_segm)
# face detection
blaceface_output = self.blaceface.predict(
image=im_path, threshold=threshold, visualization=visualization)
im_face_kp, p_left, p_right, p_up, p_bottom, h_xmin, h_ymin, h_xmax, h_ymax, flag_face = P.visualize_box_mask(
im_path,
blaceface_output,
threshold=threshold,
beard_file=beard_file,
glasses_file=glasses_file,
hat_file=hat_file)
if flag_face == 1:
if x0 > h_xmin:
shift_x_ = x0 - h_xmin
else:
shift_x_ = 0
if y0 > h_ymin:
shift_y_ = y0 - h_ymin
else:
shift_y_ = 0
h += p_up + p_bottom + shift_y_
w += p_left + p_right + shift_x_
x0 = min(x0, h_xmin)
y0 = min(y0, h_ymin)
x1 = max(x1, h_xmax) + shift_x_ + p_left + p_right
y1 = max(y1, h_ymax) + shift_y_ + p_up + p_bottom
# Fill the background image
cropped = im_face_kp.crop((x0, y0, x1, y1))
resize_scale = min((stand_xmax - stand_xmin) / (x1 - x0),
(stand_ymax - stand_ymin) / (y1 - y0))
h, w = int(h * resize_scale), int(w * resize_scale)
cropped = cropped.resize((w, h), cv2.INTER_LINEAR)
cropped = cv2.cvtColor(np.asarray(cropped), cv2.COLOR_RGB2BGR)
shift_x = int((stand_xmax - stand_xmin - cropped.shape[1]) / 2)
shift_y = int((stand_ymax - stand_ymin - cropped.shape[0]) / 2)
out_image = cv2.imread(background)
e2gray = cv2.cvtColor(cropped, cv2.COLOR_BGR2GRAY)
ret, mask = cv2.threshold(e2gray, 1, 255, cv2.THRESH_BINARY_INV)
mask_inv = cv2.bitwise_not(mask)
roi = out_image[stand_ymin + shift_y:stand_ymin + cropped.shape[
0] + shift_y, stand_xmin + shift_x:stand_xmin + cropped.shape[1] +
shift_x]
person_bg = cv2.bitwise_and(roi, roi, mask=mask)
element_fg = cv2.bitwise_and(cropped, cropped, mask=mask_inv)
dst = cv2.add(person_bg, element_fg)
out_image[stand_ymin + shift_y:stand_ymin + cropped.shape[
0] + shift_y, stand_xmin + shift_x:stand_xmin + cropped.shape[1] +
shift_x] = dst
return out_image
@serving
def serving_method(self, images, background, beard, glasses, hat, **kwargs):
"""
Run as a service.
"""
final = {}
background_path = os.path.join(
self.directory,
'element_source/background/{}.png'.format(background))
beard_path = os.path.join(self.directory,
'element_source/beard/{}.png'.format(beard))
glasses_path = os.path.join(
self.directory, 'element_source/glasses/{}.png'.format(glasses))
hat_path = os.path.join(self.directory,
'element_source/hat/{}.png'.format(hat))
images_decode = base64_to_cv2(images[0])
output = self.predict(
image=images_decode,
background=background_path,
hat_file=hat_path,
beard_file=beard_path,
glasses_file=glasses_path,
**kwargs)
final['image'] = cv2_to_base64(output)
return final
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import division
import cv2
import numpy as np
from PIL import Image, ImageDraw
import solov2_blazeface.face_makeup_main as face_makeup_main
def visualize_box_mask(im,
results,
threshold=0.5,
beard_file=None,
glasses_file=None,
hat_file=None):
if isinstance(im, str):
im = Image.open(im).convert('RGB')
else:
im = Image.fromarray(im)
if 'segm' in results:
im, x0, x1, y0, y1, flag_seg = draw_segm(
im,
results['segm'],
results['label'],
results['score'],
threshold=threshold)
return im, x0, x1, y0, y1, 0, 0, 0, 0, flag_seg
if 'landmark' in results:
im, left, right, up, bottom, h_xmin, h_ymin, h_xmax, h_ymax, flag_face = trans_lmk(
im, results['landmark'], beard_file, glasses_file, hat_file)
return im, left, right, up, bottom, h_xmin, h_ymin, h_xmax, h_ymax, flag_face
else:
return im, 0, 0, 0, 0, 0, 0, 0, 0, 0
def draw_segm(im, np_segms, np_label, np_score, threshold=0.5, alpha=0.7):
"""
Draw segmentation on image
"""
im = np.array(im).astype('float32')
np_segms = np_segms.astype(np.uint8)
index_label = np.where(np_label == 0)[0]
index = np.where(np_score[index_label] > threshold)[0]
index = index_label[index]
if index.size == 0:
im = Image.fromarray(im.astype('uint8'))
return im, 0, 0, 0, 0, 0
person_segms = np_segms[index]
person_mask_single_channel = np.sum(person_segms, axis=0)
person_mask_single_channel[person_mask_single_channel > 1] = 1
person_mask = np.expand_dims(person_mask_single_channel, axis=2)
person_mask = np.repeat(person_mask, 3, axis=2)
im = im * person_mask
sum_x = np.sum(person_mask_single_channel, axis=0)
x = np.where(sum_x > 0.5)[0]
sum_y = np.sum(person_mask_single_channel, axis=1)
y = np.where(sum_y > 0.5)[0]
x0, x1, y0, y1 = x[0], x[-1], y[0], y[-1]
return Image.fromarray(im.astype('uint8')), x0, x1, y0, y1, 1
def lmk2out(bboxes, np_lmk, im_info, threshold=0.5, is_bbox_normalized=True):
image_w, image_h = im_info['origin_shape']
scale = im_info['scale']
face_index, landmark, prior_box = np_lmk[:]
xywh_res = []
if bboxes is None or bboxes.shape == (1, 1):
return np.array([])
prior = np.reshape(prior_box, (-1, 4))
predict_lmk = np.reshape(landmark, (-1, 10))
k = 0
for i in range(bboxes.shape[0]):
score = bboxes[i][1]
if score < threshold:
continue
theindex = face_index[i][0]
me_prior = prior[theindex, :]
lmk_pred = predict_lmk[theindex, :]
prior_h = me_prior[2] - me_prior[0]
prior_w = me_prior[3] - me_prior[1]
prior_h_center = (me_prior[2] + me_prior[0]) / 2
prior_w_center = (me_prior[3] + me_prior[1]) / 2
lmk_decode = np.zeros((10))
for j in [0, 2, 4, 6, 8]:
lmk_decode[j] = lmk_pred[j] * 0.1 * prior_w + prior_h_center
for j in [1, 3, 5, 7, 9]:
lmk_decode[j] = lmk_pred[j] * 0.1 * prior_h + prior_w_center
if is_bbox_normalized:
lmk_decode = lmk_decode * np.array([
image_h, image_w, image_h, image_w, image_h, image_w, image_h,
image_w, image_h, image_w
])
xywh_res.append(lmk_decode)
return np.asarray(xywh_res)
def post_processing(image, lmk_decode, hat_path, beard_path, glasses_path):
image = cv2.cvtColor(np.asarray(image), cv2.COLOR_RGB2BGR)
p_left, p_right, p_up, p_bottom, h_xmax, h_ymax = [0] * 6
h_xmin, h_ymin = 10000, 10000
# Add beard on the face
if beard_path is not None:
image = face_makeup_main.add_beard(image, lmk_decode, beard_path)
# Add glasses on the face
if glasses_path is not None:
image = face_makeup_main.add_glasses(image, lmk_decode, glasses_path)
# Add hat on the face
if hat_path is not None:
image, p_left, p_right, p_up, p_bottom, h_xmin, h_ymin, h_xmax, h_ymax = face_makeup_main.add_hat(
image, lmk_decode, hat_path)
image = Image.fromarray(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
print('----------- Post Processing Success -----------')
return image, p_left, p_right, p_up, p_bottom, h_xmin, h_ymin, h_xmax, h_ymax
def trans_lmk(image, lmk_results, beard_file, glasses_file, hat_file):
p_left, p_right, p_up, p_bottom, h_xmax, h_ymax = [0] * 6
h_xmin, h_ymin = 10000, 10000
if lmk_results.shape[0] == 0:
return image, p_left, p_right, p_up, p_bottom, h_xmin, h_ymin, h_xmax, h_ymax, 0
for lmk_decode in lmk_results:
x1, y1, x2, y2 = lmk_decode[0], lmk_decode[1], lmk_decode[
2], lmk_decode[3]
x4, y4, x5, y5 = lmk_decode[6], lmk_decode[7], lmk_decode[
8], lmk_decode[9]
# Refine the order of keypoint
if x1 > x2:
lmk_decode[0], lmk_decode[1], lmk_decode[2], lmk_decode[
3] = lmk_decode[2], lmk_decode[3], lmk_decode[0], lmk_decode[1]
if x4 < x5:
lmk_decode[6], lmk_decode[7], lmk_decode[8], lmk_decode[
9] = lmk_decode[8], lmk_decode[9], lmk_decode[6], lmk_decode[7]
# Add decoration to the face
image, p_left_temp, p_right_temp, p_up_temp, p_bottom_temp, h_xmin_temp, h_ymin_temp, h_xmax_temp, h_ymax_temp = post_processing(
image, lmk_decode, hat_file, beard_file, glasses_file)
p_left = max(p_left, p_left_temp)
p_right = max(p_right, p_right_temp)
p_up = max(p_up, p_up_temp)
p_bottom = max(p_bottom, p_bottom_temp)
h_xmin = min(h_xmin, h_xmin_temp)
h_ymin = min(h_ymin, h_ymin_temp)
h_xmax = max(h_xmax, h_xmax_temp)
h_ymax = max(h_ymax, h_ymax_temp)
return image, p_left, p_right, p_up, p_bottom, h_xmin, h_ymin, h_xmax, h_ymax, 1
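The mask handling in `draw_segm` above merges all person masks and then takes the first and last non-empty column and row as the crop box. The following is a minimal sketch of that idea using nested lists instead of NumPy arrays; `union_masks` and `mask_bounds` are hypothetical helpers, and returning `None` for an empty mask is an assumption made for this example.

```python
def union_masks(masks):
    """Merge several binary masks of equal size into one (pixel is 1 if any mask has 1)."""
    height, width = len(masks[0]), len(masks[0][0])
    return [[1 if any(m[i][j] for m in masks) else 0 for j in range(width)]
            for i in range(height)]


def mask_bounds(mask):
    """Return (x0, x1, y0, y1) of the non-zero region, or None when the mask is empty."""
    rows = [i for i, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [j for j in range(len(mask[0])) if any(row[j] for row in mask)]
    return cols[0], cols[-1], rows[0], rows[-1]


person_a = [[0, 0, 0, 0],
            [0, 1, 1, 0],
            [0, 0, 0, 0]]
person_b = [[0, 0, 0, 0],
            [0, 0, 1, 1],
            [0, 0, 1, 0]]
merged = union_masks([person_a, person_b])
bounds = mask_bounds(merged)
```

Overlapping pixels stay at 1 after merging, matching the clipping step applied to the summed masks in `draw_segm`.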
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import paddle
import paddlehub as hub
import cv2
from PIL import Image
import numpy as np
import base64
img_file = 'demo_images/test.jpg'
background = 'element_source/background/1.png'
beard_file = 'element_source/beard/1.png'
glasses_file = 'element_source/glasses/1.png'
hat_file = 'element_source/hat/1.png'
model = hub.Module(name='solov2_blazeface', use_gpu=True)
output = model.predict(
image=img_file,
background=background,
hat_file=hat_file,
beard_file=beard_file,
glasses_file=glasses_file,
visualization=True)
cv2.imwrite("christmas_final.png", output)
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import requests
import json
import cv2
import base64
import time
import numpy as np
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.frombuffer(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
# Send HTTP request
org_im = cv2.cvtColor(cv2.imread('demo_images/test.jpg'), cv2.COLOR_BGR2RGB)
h, w, c = org_im.shape
hat_ids = 1
data = {
'images': [cv2_to_base64(org_im)],
'background': 3,
"beard": 2,
"glasses": 1,
"hat": 3
}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8880/predict/solov2_blazeface"
start = time.time()
r = requests.post(url=url, headers=headers, data=json.dumps(data))
end = time.time()
print('cost:', end - start)
result = base64_to_cv2(r.json()["results"]['image'])
cv2.imwrite("christmas_final.png", result)
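The request and response bodies exchanged with `serving_method` can be built and checked offline. This is a minimal sketch using only the standard library, assuming the image has already been encoded to JPEG bytes and that the response carries the result under `results.image` as in the client script above; `build_payload` and `decode_image_field` are hypothetical helper names.

```python
import base64
import json


def build_payload(image_bytes, background, beard, glasses, hat):
    """Build the JSON-serializable body expected by serving_method."""
    return {
        'images': [base64.b64encode(image_bytes).decode('utf8')],
        'background': background,
        'beard': beard,
        'glasses': glasses,
        'hat': hat,
    }


def decode_image_field(response_body):
    """Extract the raw encoded image bytes from a parsed response body."""
    return base64.b64decode(response_body['results']['image'].encode('utf8'))


fake_jpeg = b'\xff\xd8\xff\xe0 placeholder bytes, not a real image'
payload = build_payload(fake_jpeg, background=1, beard=1, glasses=1, hat=1)
body = json.dumps(payload)                      # what would be sent with requests.post
echoed = {'results': {'image': json.loads(body)['images'][0]}}
restored = decode_image_field(echoed)
```

The element ids are plain integers because the server formats them into paths such as `element_source/hat/{}.png`.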