## Model Overview
HumanSeg_lite is a portrait segmentation model optimized on top of the ShuffleNetV2 architecture. The network is slimmed down to only 541 KB (187 KB after quantization), making it suitable for real-time segmentation scenarios such as selfie portrait segmentation on mobile phones.
## Command-Line Prediction
```
hub run humanseg_lite --input_path "/PATH/TO/IMAGE"
```
## API
```python
def segment(images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_lite_output')
```
Prediction API for portrait segmentation.
**Parameters**
* images (list\[numpy.ndarray\]): image data, with ndarray.shape \[H, W, C\], in BGR format;
* paths (list\[str\]): paths of the images;
* batch\_size (int): batch size;
* use\_gpu (bool): whether to predict on GPU. If so, set the CUDA_VISIBLE_DEVICES environment variable before predicting; otherwise it need not be set;
* visualization (bool): whether to save the segmentation result as an image file;
* output\_dir (str): directory in which to save the images.
**Returns**
* res (list\[dict\]): list of segmentation results; each element is a dict with keys 'save\_path' and 'data':
    * save\_path (str, optional): path of the visualized image (present only when visualization=True);
    * data (numpy.ndarray): portrait segmentation result containing only the alpha channel, with values 0-255 (0 fully transparent, 255 opaque); the higher the value, the more likely the pixel belongs to a person, and the lower, the more likely it is background (see the sketch below).
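As a minimal usage sketch (the image path is a placeholder), the returned alpha mask can be normalized to \[0, 1\] and used to composite the portrait onto a plain white background:
```python
import cv2
import numpy as np
import paddlehub as hub

human_seg = hub.Module('humanseg_lite')
im = cv2.imread('/PATH/TO/IMAGE')
alpha = human_seg.segment(images=[im])[0]['data'] / 255.0  # uint8 alpha -> float in [0, 1]
alpha = np.repeat(alpha[:, :, np.newaxis], 3, axis=2)      # broadcast to 3 channels
white_bg = np.ones_like(im, dtype=np.float32) * 255
comb = (alpha * im + (1 - alpha) * white_bg).astype(np.uint8)
cv2.imwrite('composited.png', comb)
```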
```python
def video_stream_segment(self,
frame_org,
frame_id,
prev_gray,
prev_cfd,
use_gpu=False):
```
Prediction API for frame-by-frame portrait segmentation of a video.
**Parameters**
* frame_org (numpy.ndarray): single-frame image data, with ndarray.shape \[H, W, C\], in BGR format;
* frame_id (int): index of the current frame;
* prev_gray (numpy.ndarray): grayscale version of the previous frame fed to the network;
* prev_cfd (numpy.ndarray): fusion of the previous frame's optical-flow tracking map and prediction result;
* use\_gpu (bool): whether to predict on GPU. If so, set the CUDA_VISIBLE_DEVICES environment variable before predicting; otherwise it need not be set.
**Returns**
* img_matting (numpy.ndarray): portrait segmentation result containing only the alpha channel, with values 0-1 (0 fully transparent, 1 opaque);
* cur_gray (numpy.ndarray): grayscale version of the current frame fed to the network;
* optflow_map (numpy.ndarray): fusion of the current frame's optical-flow tracking map and prediction result.
```python
def video_segment(self,
video_path=None,
use_gpu=False,
save_dir='humanseg_lite_video_result'):
```
Prediction API for video portrait segmentation.
**Parameters**
* video\_path (str): path of the video to segment. If None, video is captured from the local camera and a pop-up window shows the live segmentation result;
* use\_gpu (bool): whether to predict on GPU. If so, set the CUDA_VISIBLE_DEVICES environment variable before predicting; otherwise it need not be set;
* save\_dir (str): directory in which to save the processed video; used only when video\_path is not None.
```python
def save_inference_model(dirname='humanseg_lite_model',
model_filename=None,
params_filename=None,
combined=True)
```
Saves the model to the specified path.
**Parameters**
* dirname: directory in which the model is saved
* model\_filename: model file name, defaults to \_\_model\_\_
* params\_filename: parameter file name, defaults to \_\_params\_\_ (effective only when `combined` is True)
* combined: whether to save all parameters into a single file (see the sketch below)
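For illustration (a sketch, not the only possible layout): with the defaults the target directory holds a combined `__model__`/`__params__` pair, while `combined=False` writes one file per parameter:
```python
import paddlehub as hub

human_seg = hub.Module('humanseg_lite')
# Combined save: dirname ends up containing __model__ and __params__.
human_seg.save_inference_model(dirname='humanseg_lite_model')
# Uncombined save: parameters are written as separate files.
human_seg.save_inference_model(dirname='humanseg_lite_model_raw', combined=False)
```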
## Code Example
Example of image and video segmentation:
```python
import cv2
import paddlehub as hub
human_seg = hub.Module('humanseg_lite')
im = cv2.imread('/PATH/TO/IMAGE')
# visualization=True saves the segmentation result for inspection; set it to False for faster runs.
res = human_seg.segment(images=[im], visualization=True)
print(res[0]['data'])
human_seg.video_segment('/PATH/TO/VIDEO')
human_seg.save_inference_model('/PATH/TO/SAVE/MODEL')
```
Example of video-stream prediction:
```python
import cv2
import numpy as np
import paddlehub as hub
human_seg = hub.Module('humanseg_lite')
cap_video = cv2.VideoCapture('/PATH/TO/VIDEO')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'humanseg_lite_video.avi'
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap_out = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps, (width, height))
prev_gray = None
prev_cfd = None
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
        [img_matting, prev_gray, prev_cfd] = human_seg.video_stream_segment(frame_org=frame_org, frame_id=int(cap_video.get(cv2.CAP_PROP_POS_FRAMES)), prev_gray=prev_gray, prev_cfd=prev_cfd)
img_matting = np.repeat(img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org + (1 - img_matting) * bg_im).astype(np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
```
## Service Deployment
PaddleHub Serving can deploy an online portrait segmentation service.
## Step 1: Start PaddleHub Serving
Run the start command:
```shell
$ hub serving start -m humanseg_lite
```
This deploys the portrait segmentation API service; the default port is 8866.
**NOTE:** To predict on GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it need not be set.
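If the default port is occupied, the serving command also accepts a port flag (assuming the standard PaddleHub CLI options), for example:
```shell
$ hub serving start -m humanseg_lite -p 8867
```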
## Step 2: Send a Prediction Request
With the server configured, the few lines of code below send a prediction request and fetch the result:
```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
# Send the HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/humanseg_lite"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# Save the image
mask = cv2.cvtColor(base64_to_cv2(r.json()["results"][0]['data']), cv2.COLOR_BGR2GRAY)
rgba = np.concatenate((org_im, np.expand_dims(mask, axis=2)), axis=2)
cv2.imwrite("segment_human_lite.png", rgba)
```
### Source Code
https://github.com/PaddlePaddle/PaddleSeg/tree/develop/contrib/HumanSeg
### Dependencies
paddlepaddle >= 1.8.0
paddlehub >= 1.7.1
# -*- coding:utf-8 -*-
import os
import time
from collections import OrderedDict
import cv2
import numpy as np
from PIL import Image
__all__ = ['reader', 'preprocess_v']
def preprocess_v(img, w, h):
    img = cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR).astype(np.float32)
img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img = img.transpose((2, 0, 1)) / 255
img -= img_mean
img /= img_std
return img
def reader(images=None, paths=None):
"""
Preprocess to yield image.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C]
paths (list[str]): paths to images.
Yield:
each (collections.OrderedDict): info of original image, preprocessed image.
"""
component = list()
if paths:
for im_path in paths:
each = OrderedDict()
assert os.path.isfile(
im_path), "The {} isn't a valid file path.".format(im_path)
im = cv2.imread(im_path).astype('float32')
each['org_im'] = im
each['org_im_path'] = im_path
each['org_im_shape'] = im.shape
component.append(each)
if images is not None:
assert type(images) is list, "images should be a list."
for im in images:
each = OrderedDict()
each['org_im'] = im
each['org_im_path'] = 'ndarray_time={}'.format(
round(time.time(), 6) * 1e6)
each['org_im_shape'] = im.shape
component.append(each)
for element in component:
img = element['org_im'].copy()
img = cv2.resize(img, (192, 192)).astype(np.float32)
img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img = img.transpose((2, 0, 1)) / 255
img -= img_mean
img /= img_std
element['image'] = img
yield element
# -*- coding:utf-8 -*-
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import ast
import os
import os.path as osp
import argparse
import cv2
import numpy as np
import paddle.fluid as fluid
import paddlehub as hub
from paddle.fluid.core import PaddleTensor, AnalysisConfig, create_paddle_predictor
from paddlehub.module.module import moduleinfo, runnable, serving
from humanseg_lite.processor import postprocess, base64_to_cv2, cv2_to_base64, check_dir
from humanseg_lite.data_feed import reader, preprocess_v
from humanseg_lite.optimal import postprocess_v, threshold_mask
@moduleinfo(
name="humanseg_lite",
type="CV/semantic_segmentation",
author="paddlepaddle",
author_email="",
summary="humanseg_lite is a semantic segmentation model.",
version="1.1.0")
class ShufflenetHumanSeg(hub.Module):
def _initialize(self):
self.default_pretrained_model_path = os.path.join(
self.directory, "humanseg_lite_inference")
self._set_config()
def _set_config(self):
"""
predictor config setting
"""
self.model_file_path = os.path.join(self.default_pretrained_model_path,
'__model__')
self.params_file_path = os.path.join(self.default_pretrained_model_path,
'__params__')
cpu_config = AnalysisConfig(self.model_file_path, self.params_file_path)
cpu_config.disable_glog_info()
cpu_config.disable_gpu()
self.cpu_predictor = create_paddle_predictor(cpu_config)
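        # When CUDA_VISIBLE_DEVICES names a valid device id, also build a GPU predictor.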
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
use_gpu = True
except:
use_gpu = False
if use_gpu:
gpu_config = AnalysisConfig(self.model_file_path,
self.params_file_path)
gpu_config.disable_glog_info()
gpu_config.enable_use_gpu(
memory_pool_init_size_mb=1000, device_id=0)
self.gpu_predictor = create_paddle_predictor(gpu_config)
def segment(self,
images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_lite_output'):
"""
API for human segmentation.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C], the color space is BGR.
paths (list[str]): The paths of images.
batch_size (int): batch size.
use_gpu (bool): Whether to use gpu.
visualization (bool): Whether to save image or not.
output_dir (str): The path to store output images.
Returns:
res (list[dict]): each element in the list is a dict, the keys and values are:
save_path (str, optional): the path to save images. (Exists only if visualization is True)
data (numpy.ndarray): data of post processed image.
"""
if use_gpu:
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
except:
raise RuntimeError(
"Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
)
all_data = list()
for yield_data in reader(images, paths):
all_data.append(yield_data)
total_num = len(all_data)
loop_num = int(np.ceil(total_num / batch_size))
res = list()
for iter_id in range(loop_num):
batch_data = list()
handle_id = iter_id * batch_size
for image_id in range(batch_size):
try:
batch_data.append(all_data[handle_id + image_id])
except:
pass
# feed batch image
batch_image = np.array([data['image'] for data in batch_data])
batch_image = PaddleTensor(batch_image.copy())
output = self.gpu_predictor.run([
batch_image
]) if use_gpu else self.cpu_predictor.run([batch_image])
output = output[1].as_ndarray()
output = np.expand_dims(output[:, 1, :, :], axis=1)
# postprocess one by one
for i in range(len(batch_data)):
out = postprocess(
data_out=output[i],
org_im=batch_data[i]['org_im'],
org_im_shape=batch_data[i]['org_im_shape'],
org_im_path=batch_data[i]['org_im_path'],
output_dir=output_dir,
visualization=visualization)
res.append(out)
return res
def video_stream_segment(self,
frame_org,
frame_id,
prev_gray,
prev_cfd,
use_gpu=False):
"""
API for human video segmentation.
Args:
frame_org (numpy.ndarray): frame data, shape of each is [H, W, C], the color space is BGR.
frame_id (int): index of the frame to be decoded.
prev_gray (numpy.ndarray): gray scale image of last frame, shape of each is [H, W]
prev_cfd (numpy.ndarray): fusion image from optical flow image and segment result, shape of each is [H, W]
use_gpu (bool): Whether to use gpu.
Returns:
img_matting (numpy.ndarray): data of segmentation mask.
cur_gray (numpy.ndarray): gray scale image of current frame, shape of each is [H, W]
optflow_map (numpy.ndarray): optical flow image of current frame, shape of each is [H, W]
"""
resize_h = 192
resize_w = 192
is_init = True
width = int(frame_org.shape[0])
height = int(frame_org.shape[1])
disflow = cv2.DISOpticalFlow_create(
cv2.DISOPTICAL_FLOW_PRESET_ULTRAFAST)
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
if frame_id == 1:
prev_gray = np.zeros((resize_h, resize_w), np.uint8)
prev_cfd = np.zeros((resize_h, resize_w), np.float32)
optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
prev_cfd, disflow, is_init)
else:
optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
prev_cfd, disflow, is_init)
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(optflow_map, thresh_bg=0.2, thresh_fg=0.8)
        img_matting = cv2.resize(optflow_map, (height, width), interpolation=cv2.INTER_LINEAR)
return [img_matting, cur_gray, optflow_map]
def video_segment(self,
video_path=None,
use_gpu=False,
save_dir='humanseg_lite_video_result'):
"""
API for human video segmentation.
Args:
            video_path (str): The path of the video to preprocess. If video_path is None, it will capture
                the video from your camera.
use_gpu (bool): Whether to use gpu.
save_dir (str): The path to store output video.
"""
if use_gpu:
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
except:
raise RuntimeError(
"Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. "
"If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
)
resize_h = 192
resize_w = 192
if not video_path:
cap_video = cv2.VideoCapture(0)
else:
cap_video = cv2.VideoCapture(video_path)
if not cap_video.isOpened():
raise IOError("Error opening video stream or file, "
"--video_path whether existing: {}"
" or camera whether working".format(video_path))
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
disflow = cv2.DISOpticalFlow_create(
cv2.DISOPTICAL_FLOW_PRESET_ULTRAFAST)
prev_gray = np.zeros((resize_h, resize_w), np.uint8)
prev_cfd = np.zeros((resize_h, resize_w), np.float32)
is_init = True
fps = cap_video.get(cv2.CAP_PROP_FPS)
if video_path is not None:
            print('Please wait, the video is being processed...')
if not osp.exists(save_dir):
os.makedirs(save_dir)
save_path = osp.join(save_dir, 'result' + '.avi')
cap_out = cv2.VideoWriter(
save_path, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps,
(width, height))
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(
np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
prev_cfd, disflow, is_init)
prev_gray = cur_gray.copy()
prev_cfd = optflow_map.copy()
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(
optflow_map, thresh_bg=0.2, thresh_fg=0.8)
                    img_matting = cv2.resize(
                        optflow_map, (width, height), interpolation=cv2.INTER_LINEAR)
img_matting = np.repeat(
img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org +
(1 - img_matting) * bg_im).astype(np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
else:
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(
np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
prev_cfd, disflow, is_init)
prev_gray = cur_gray.copy()
prev_cfd = optflow_map.copy()
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(
optflow_map, thresh_bg=0.2, thresh_fg=0.8)
                    img_matting = cv2.resize(
                        optflow_map, (width, height), interpolation=cv2.INTER_LINEAR)
img_matting = np.repeat(
img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org +
(1 - img_matting) * bg_im).astype(np.uint8)
cv2.imshow('HumanSegmentation', comb)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
else:
break
cap_video.release()
def save_inference_model(self,
dirname='humanseg_lite_model',
model_filename=None,
params_filename=None,
combined=True):
if combined:
model_filename = "__model__" if not model_filename else model_filename
params_filename = "__params__" if not params_filename else params_filename
place = fluid.CPUPlace()
exe = fluid.Executor(place)
program, feeded_var_names, target_vars = fluid.io.load_inference_model(
dirname=self.default_pretrained_model_path,
model_filename=model_filename,
params_filename=params_filename,
executor=exe)
fluid.io.save_inference_model(
dirname=dirname,
main_program=program,
executor=exe,
feeded_var_names=feeded_var_names,
target_vars=target_vars,
model_filename=model_filename,
params_filename=params_filename)
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.segment(images=images_decode, **kwargs)
results = [{
'data': cv2_to_base64(result['data'])
} for result in results]
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command.
"""
self.parser = argparse.ArgumentParser(
description="Run the {} module.".format(self.name),
prog='hub run {}'.format(self.name),
usage='%(prog)s',
add_help=True)
self.arg_input_group = self.parser.add_argument_group(
title="Input options", description="Input data. Required")
self.arg_config_group = self.parser.add_argument_group(
title="Config options",
description=
"Run configuration for controlling module behavior, not required.")
self.add_module_config_arg()
self.add_module_input_arg()
args = self.parser.parse_args(argvs)
results = self.segment(
paths=[args.input_path],
batch_size=args.batch_size,
use_gpu=args.use_gpu,
output_dir=args.output_dir,
visualization=args.visualization)
if args.save_dir is not None:
check_dir(args.save_dir)
self.save_inference_model(args.save_dir)
return results
def add_module_config_arg(self):
"""
Add the command config options.
"""
self.arg_config_group.add_argument(
'--use_gpu',
type=ast.literal_eval,
default=False,
help="whether use GPU or not")
self.arg_config_group.add_argument(
'--output_dir',
type=str,
default='humanseg_lite_output',
help="The directory to save output images.")
self.arg_config_group.add_argument(
'--save_dir',
type=str,
default='humanseg_lite_model',
help="The directory to save model.")
self.arg_config_group.add_argument(
'--visualization',
type=ast.literal_eval,
default=False,
help="whether to save output as images.")
self.arg_config_group.add_argument(
'--batch_size',
type=ast.literal_eval,
default=1,
help="batch size.")
def add_module_input_arg(self):
"""
Add the command input options.
"""
self.arg_input_group.add_argument(
'--input_path', type=str, help="path to image.")
if __name__ == "__main__":
m = ShufflenetHumanSeg()
img = cv2.imread('photo.jpg')
# res = m.segment(images=[img], visualization=True)
# print(res[0]['data'])
# m.video_segment('')
cap_video = cv2.VideoCapture('video_test.mp4')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'result_frame.avi'
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap_out = cv2.VideoWriter(save_path,
cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps,
(width, height))
prev_gray = None
prev_cfd = None
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
[img_matting, prev_gray, prev_cfd] = m.video_stream_segment(
frame_org=frame_org,
                frame_id=int(cap_video.get(cv2.CAP_PROP_POS_FRAMES)),
prev_gray=prev_gray,
prev_cfd=prev_cfd)
img_matting = np.repeat(img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org + (1 - img_matting) * bg_im).astype(
np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
# -*- coding:utf-8 -*-
import numpy as np
def human_seg_tracking(pre_gray, cur_gray, prev_cfd, dl_weights, disflow):
"""计算光流跟踪匹配点和光流图
输入参数:
pre_gray: 上一帧灰度图
cur_gray: 当前帧灰度图
prev_cfd: 上一帧光流图
dl_weights: 融合权重图
disflow: 光流数据结构
返回值:
is_track: 光流点跟踪二值图,即是否具有光流点匹配
track_cfd: 光流跟踪图
"""
check_thres = 8
h, w = pre_gray.shape[:2]
track_cfd = np.zeros_like(prev_cfd)
is_track = np.zeros_like(pre_gray)
flow_fw = disflow.calc(pre_gray, cur_gray, None)
flow_bw = disflow.calc(cur_gray, pre_gray, None)
    flow_fw = np.round(flow_fw).astype(int)
    flow_bw = np.round(flow_bw).astype(int)
y_list = np.array(range(h))
x_list = np.array(range(w))
yv, xv = np.meshgrid(y_list, x_list)
yv, xv = yv.T, xv.T
cur_x = xv + flow_fw[:, :, 0]
cur_y = yv + flow_fw[:, :, 1]
    # Do not track points that move out of the frame boundary.
not_track = (cur_x < 0) + (cur_x >= w) + (cur_y < 0) + (cur_y >= h)
flow_bw[~not_track] = flow_bw[cur_y[~not_track], cur_x[~not_track]]
not_track += (np.square(flow_fw[:, :, 0] + flow_bw[:, :, 0]) +
np.square(flow_fw[:, :, 1] + flow_bw[:, :, 1])) >= check_thres
track_cfd[cur_y[~not_track], cur_x[~not_track]] = prev_cfd[~not_track]
is_track[cur_y[~not_track], cur_x[~not_track]] = 1
not_flow = np.all(
np.abs(flow_fw) == 0, axis=-1) * np.all(
np.abs(flow_bw) == 0, axis=-1)
dl_weights[cur_y[not_flow], cur_x[not_flow]] = 0.05
return track_cfd, is_track, dl_weights
def human_seg_track_fuse(track_cfd, dl_cfd, dl_weights, is_track):
"""光流追踪图和人像分割结构融合
输入参数:
track_cfd: 光流追踪图
dl_cfd: 当前帧分割结果
dl_weights: 融合权重图
is_track: 光流点匹配二值图
返回
cur_cfd: 光流跟踪图和人像分割结果融合图
"""
fusion_cfd = dl_cfd.copy()
    is_track = is_track.astype(bool)
fusion_cfd[is_track] = dl_weights[is_track] * dl_cfd[is_track] + (
1 - dl_weights[is_track]) * track_cfd[is_track]
    # High-confidence ("certain") regions of the segmentation result.
index_certain = ((dl_cfd > 0.9) + (dl_cfd < 0.1)) * is_track
index_less01 = (dl_weights < 0.1) * index_certain
fusion_cfd[index_less01] = 0.3 * dl_cfd[index_less01] + 0.7 * track_cfd[
index_less01]
index_larger09 = (dl_weights >= 0.1) * index_certain
fusion_cfd[index_larger09] = 0.4 * dl_cfd[index_larger09] + 0.6 * track_cfd[
index_larger09]
return fusion_cfd
def threshold_mask(img, thresh_bg, thresh_fg):
dst = (img / 255.0 - thresh_bg) / (thresh_fg - thresh_bg)
dst[np.where(dst > 1)] = 1
dst[np.where(dst < 0)] = 0
return dst.astype(np.float32)
def postprocess_v(cur_gray, scoremap, prev_gray, pre_cfd, disflow, is_init):
"""光流优化
Args:
cur_gray : 当前帧灰度图
pre_gray : 前一帧灰度图
pre_cfd :前一帧融合结果
scoremap : 当前帧分割结果
difflow : 光流
is_init : 是否第一帧
Returns:
fusion_cfd : 光流追踪图和预测结果融合图
"""
h, w = scoremap.shape
cur_cfd = scoremap.copy()
if is_init:
if h <= 64 or w <= 64:
disflow.setFinestScale(1)
elif h <= 160 or w <= 160:
disflow.setFinestScale(2)
else:
disflow.setFinestScale(3)
fusion_cfd = cur_cfd
else:
weights = np.ones((h, w), np.float32) * 0.3
track_cfd, is_track, weights = human_seg_tracking(
prev_gray, cur_gray, pre_cfd, weights, disflow)
fusion_cfd = human_seg_track_fuse(track_cfd, cur_cfd, weights, is_track)
return fusion_cfd
# -*- coding:utf-8 -*-
import os
import time
import base64
import cv2
import numpy as np
__all__ = ['cv2_to_base64', 'base64_to_cv2', 'postprocess']
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
def postprocess(data_out, org_im, org_im_shape, org_im_path, output_dir,
visualization):
"""
Postprocess output of network. one image at a time.
Args:
data_out (numpy.ndarray): output of network.
org_im (numpy.ndarray): original image.
        org_im_shape (list): shape of original image.
        org_im_path (list): path of original image.
output_dir (str): output directory to store image.
visualization (bool): whether to save image or not.
Returns:
result (dict): The data of processed image.
"""
result = dict()
for logit in data_out:
logit = (logit * 255).astype(np.uint8)
logit = cv2.resize(logit, (org_im_shape[1], org_im_shape[0]))
rgba = np.concatenate((org_im, np.expand_dims(logit, axis=2)), axis=2)
if visualization:
check_dir(output_dir)
save_im_path = get_save_image_name(org_im, org_im_path, output_dir)
cv2.imwrite(save_im_path, rgba)
result['save_path'] = save_im_path
result['data'] = logit
else:
result['data'] = logit
print("result['data'] shape", result['data'].shape)
return result
def check_dir(dir_path):
if not os.path.exists(dir_path):
os.makedirs(dir_path)
elif os.path.isfile(dir_path):
os.remove(dir_path)
os.makedirs(dir_path)
def get_save_image_name(org_im, org_im_path, output_dir):
"""
Get save image name from source image path.
"""
    # name prefix of original image
org_im_name = os.path.split(org_im_path)[-1]
im_prefix = os.path.splitext(org_im_name)[0]
ext = '.png'
# save image path
save_im_path = os.path.join(output_dir, im_prefix + ext)
if os.path.exists(save_im_path):
save_im_path = os.path.join(
output_dir, im_prefix + 'time={}'.format(int(time.time())) + ext)
return save_im_path
## Model Overview
HumanSeg-mobile is a portrait segmentation network based on HRNet (Deep High-Resolution Representation Learning for Visual Recognition). HRNet keeps high-resolution information throughout feature extraction, preserving object detail, and the model size can be tuned by controlling the channel count of each branch. HumanSeg-mobile uses the HRNet_w18_small_v1 structure; the model is only 5.8 MB, suitable for front-camera scenarios on mobile devices or server CPUs.
## Command-Line Prediction
```
hub run humanseg_mobile --input_path "/PATH/TO/IMAGE"
```
## API
```python
def segment(images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_mobile_output')
```
Prediction API for portrait segmentation.
**Parameters**
* images (list\[numpy.ndarray\]): image data, with ndarray.shape \[H, W, C\], in BGR format;
* paths (list\[str\]): paths of the images;
* batch\_size (int): batch size;
* use\_gpu (bool): whether to predict on GPU. If so, set the CUDA_VISIBLE_DEVICES environment variable before predicting; otherwise it need not be set;
* visualization (bool): whether to save the segmentation result as an image file;
* output\_dir (str): directory in which to save the images.
**Returns**
* res (list\[dict\]): list of segmentation results; each element is a dict with keys 'save\_path' and 'data':
    * save\_path (str, optional): path of the visualized image (present only when visualization=True);
    * data (numpy.ndarray): portrait segmentation result containing only the alpha channel, with values 0-255 (0 fully transparent, 255 opaque); the higher the value, the more likely the pixel belongs to a person, and the lower, the more likely it is background (see the sketch below).
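As a small illustrative sketch (the path is a placeholder), the alpha channel in `data` can be attached to the original image to write a four-channel PNG, mirroring what the serving example below does:
```python
import cv2
import numpy as np
import paddlehub as hub

human_seg = hub.Module('humanseg_mobile')
im = cv2.imread('/PATH/TO/IMAGE')
alpha = human_seg.segment(images=[im])[0]['data']             # uint8 alpha mask, 0-255
rgba = np.concatenate((im, alpha[:, :, np.newaxis]), axis=2)  # BGR + alpha channel
cv2.imwrite('portrait_rgba.png', rgba)
```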
```python
def video_stream_segment(self,
frame_org,
frame_id,
prev_gray,
prev_cfd,
use_gpu=False):
```
Prediction API for frame-by-frame portrait segmentation of a video.
**Parameters**
* frame_org (numpy.ndarray): single-frame image data, with ndarray.shape \[H, W, C\], in BGR format;
* frame_id (int): index of the current frame;
* prev_gray (numpy.ndarray): grayscale version of the previous frame fed to the network;
* prev_cfd (numpy.ndarray): fusion of the previous frame's optical-flow tracking map and prediction result;
* use\_gpu (bool): whether to predict on GPU. If so, set the CUDA_VISIBLE_DEVICES environment variable before predicting; otherwise it need not be set.
**Returns**
* img_matting (numpy.ndarray): portrait segmentation result containing only the alpha channel, with values 0-1 (0 fully transparent, 1 opaque);
* cur_gray (numpy.ndarray): grayscale version of the current frame fed to the network;
* optflow_map (numpy.ndarray): fusion of the current frame's optical-flow tracking map and prediction result.
```python
def video_segment(self,
video_path=None,
use_gpu=False,
save_dir='humanseg_mobile_video_result'):
```
Prediction API for video portrait segmentation.
**Parameters**
* video\_path (str): path of the video to segment. If None, video is captured from the local camera and a pop-up window shows the live segmentation result;
* use\_gpu (bool): whether to predict on GPU. If so, set the CUDA_VISIBLE_DEVICES environment variable before predicting; otherwise it need not be set;
* save\_dir (str): directory in which to save the processed video; used only when video\_path is not None.
```python
def save_inference_model(dirname='humanseg_mobile_model',
model_filename=None,
params_filename=None,
combined=True)
```
Saves the model to the specified path.
**Parameters**
* dirname: directory in which the model is saved
* model\_filename: model file name, defaults to \_\_model\_\_
* params\_filename: parameter file name, defaults to \_\_params\_\_ (effective only when `combined` is True)
* combined: whether to save all parameters into a single file
## Code Example
Example of image and video segmentation:
```python
import cv2
import paddlehub as hub
human_seg = hub.Module('humanseg_mobile')
im = cv2.imread('/PATH/TO/IMAGE')
# visualization=True saves the segmentation result for inspection; set it to False for faster runs.
res = human_seg.segment(images=[im], visualization=True)
print(res[0]['data'])
human_seg.video_segment('/PATH/TO/VIDEO')
human_seg.save_inference_model('/PATH/TO/SAVE/MODEL')
```
Example of video-stream prediction:
```python
import cv2
import numpy as np
import paddlehub as hub
human_seg = hub.Module('humanseg_mobile')
cap_video = cv2.VideoCapture('/PATH/TO/VIDEO')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'humanseg_mobile_video.avi'
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap_out = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps, (width, height))
prev_gray = None
prev_cfd = None
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
        [img_matting, prev_gray, prev_cfd] = human_seg.video_stream_segment(frame_org=frame_org, frame_id=int(cap_video.get(cv2.CAP_PROP_POS_FRAMES)), prev_gray=prev_gray, prev_cfd=prev_cfd)
img_matting = np.repeat(img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org + (1 - img_matting) * bg_im).astype(np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
```
## Service Deployment
PaddleHub Serving can deploy an online portrait segmentation service.
## Step 1: Start PaddleHub Serving
Run the start command:
```shell
$ hub serving start -m humanseg_mobile
```
This deploys the portrait segmentation API service; the default port is 8866.
**NOTE:** To predict on GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it need not be set.
## Step 2: Send a Prediction Request
With the server configured, the few lines of code below send a prediction request and fetch the result:
```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
# Send the HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/humanseg_mobile"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# Save the image
mask = cv2.cvtColor(base64_to_cv2(r.json()["results"][0]['data']), cv2.COLOR_BGR2GRAY)
rgba = np.concatenate((org_im, np.expand_dims(mask, axis=2)), axis=2)
cv2.imwrite("segment_human_mobile.png", rgba)
```
### Source Code
<https://github.com/PaddlePaddle/PaddleSeg/tree/develop/contrib/HumanSeg>
### Dependencies
paddlepaddle >= 1.8.0
paddlehub >= 1.7.1
# -*- coding:utf-8 -*-
import os
import time
from collections import OrderedDict
import cv2
import numpy as np
__all__ = ['reader', 'preprocess_v']
def preprocess_v(img, w, h):
    img = cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR).astype(np.float32)
img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img = img.transpose((2, 0, 1)) / 255
img -= img_mean
img /= img_std
return img
def reader(images=None, paths=None):
"""
Preprocess to yield image.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C]
paths (list[str]): paths to images.
Yield:
each (collections.OrderedDict): info of original image, preprocessed image.
"""
component = list()
if paths:
for im_path in paths:
each = OrderedDict()
assert os.path.isfile(
im_path), "The {} isn't a valid file path.".format(im_path)
im = cv2.imread(im_path).astype('float32')
each['org_im'] = im
each['org_im_path'] = im_path
each['org_im_shape'] = im.shape
component.append(each)
if images is not None:
assert type(images) is list, "images should be a list."
for im in images:
each = OrderedDict()
each['org_im'] = im
each['org_im_path'] = 'ndarray_time={}'.format(
round(time.time(), 6) * 1e6)
each['org_im_shape'] = im.shape
component.append(each)
for element in component:
img = element['org_im'].copy()
img = cv2.resize(img, (192, 192)).astype(np.float32)
img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img = img.transpose((2, 0, 1)) / 255
img -= img_mean
img /= img_std
element['image'] = img
yield element
# -*- coding:utf-8 -*-
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import ast
import os
import os.path as osp
import argparse
import cv2
import numpy as np
import paddle.fluid as fluid
import paddlehub as hub
from paddle.fluid.core import PaddleTensor, AnalysisConfig, create_paddle_predictor
from paddlehub.module.module import moduleinfo, runnable, serving
from humanseg_mobile.processor import postprocess, base64_to_cv2, cv2_to_base64, check_dir
from humanseg_mobile.data_feed import reader, preprocess_v
from humanseg_mobile.optimal import postprocess_v, threshold_mask
@moduleinfo(
name="humanseg_mobile",
type="CV/semantic_segmentation",
author="paddlepaddle",
author_email="",
summary="HRNet_w18_samll_v1 is a semantic segmentation model.",
version="1.1.0")
class HRNetw18samllv1humanseg(hub.Module):
def _initialize(self):
self.default_pretrained_model_path = os.path.join(
self.directory, "humanseg_mobile_inference")
self._set_config()
def _set_config(self):
"""
predictor config setting
"""
self.model_file_path = os.path.join(self.default_pretrained_model_path,
'__model__')
self.params_file_path = os.path.join(self.default_pretrained_model_path,
'__params__')
cpu_config = AnalysisConfig(self.model_file_path, self.params_file_path)
cpu_config.disable_glog_info()
cpu_config.disable_gpu()
self.cpu_predictor = create_paddle_predictor(cpu_config)
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
use_gpu = True
except:
use_gpu = False
if use_gpu:
gpu_config = AnalysisConfig(self.model_file_path,
self.params_file_path)
gpu_config.disable_glog_info()
gpu_config.enable_use_gpu(
memory_pool_init_size_mb=1000, device_id=0)
self.gpu_predictor = create_paddle_predictor(gpu_config)
def segment(self,
images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_mobile_output'):
"""
API for human segmentation.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C], the color space is BGR.
paths (list[str]): The paths of images.
batch_size (int): batch size.
use_gpu (bool): Whether to use gpu.
visualization (bool): Whether to save image or not.
output_dir (str): The path to store output images.
Returns:
res (list[dict]): each element in the list is a dict, the keys and values are:
save_path (str, optional): the path to save images. (Exists only if visualization is True)
data (numpy.ndarray): data of post processed image.
"""
if use_gpu:
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
except:
raise RuntimeError(
"Environment Variable CUDA_VISIBLE_DEVICES is not set correctly."
"If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
)
# compatibility with older versions
all_data = list()
for yield_data in reader(images, paths):
all_data.append(yield_data)
total_num = len(all_data)
loop_num = int(np.ceil(total_num / batch_size))
res = list()
for iter_id in range(loop_num):
batch_data = list()
handle_id = iter_id * batch_size
for image_id in range(batch_size):
try:
batch_data.append(all_data[handle_id + image_id])
except:
pass
# feed batch image
batch_image = np.array([data['image'] for data in batch_data])
batch_image = PaddleTensor(batch_image.copy())
output = self.gpu_predictor.run([
batch_image
]) if use_gpu else self.cpu_predictor.run([batch_image])
output = output[1].as_ndarray()
output = np.expand_dims(output[:, 1, :, :], axis=1)
# postprocess one by one
for i in range(len(batch_data)):
out = postprocess(
data_out=output[i],
org_im=batch_data[i]['org_im'],
org_im_shape=batch_data[i]['org_im_shape'],
org_im_path=batch_data[i]['org_im_path'],
output_dir=output_dir,
visualization=visualization)
res.append(out)
return res
def video_stream_segment(self,
frame_org,
frame_id,
prev_gray,
prev_cfd,
use_gpu=False):
"""
API for human video segmentation.
Args:
frame_org (numpy.ndarray): frame data, shape of each is [H, W, C], the color space is BGR.
frame_id (int): index of the frame to be decoded.
prev_gray (numpy.ndarray): gray scale image of last frame, shape of each is [H, W]
prev_cfd (numpy.ndarray): fusion image from optical flow image and segment result, shape of each is [H, W]
use_gpu (bool): Whether to use gpu.
Returns:
img_matting (numpy.ndarray): data of segmentation mask.
cur_gray (numpy.ndarray): gray scale image of current frame, shape of each is [H, W]
optflow_map (numpy.ndarray): optical flow image of current frame, shape of each is [H, W]
"""
resize_h = 192
resize_w = 192
is_init = True
width = int(frame_org.shape[0])
height = int(frame_org.shape[1])
disflow = cv2.DISOpticalFlow_create(
cv2.DISOPTICAL_FLOW_PRESET_ULTRAFAST)
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
if frame_id == 1:
prev_gray = np.zeros((resize_h, resize_w), np.uint8)
prev_cfd = np.zeros((resize_h, resize_w), np.float32)
optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
prev_cfd, disflow, is_init)
else:
optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
prev_cfd, disflow, is_init)
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(optflow_map, thresh_bg=0.2, thresh_fg=0.8)
        img_matting = cv2.resize(optflow_map, (height, width), interpolation=cv2.INTER_LINEAR)
return [img_matting, cur_gray, optflow_map]
def video_segment(self,
video_path=None,
use_gpu=False,
save_dir='humanseg_mobile_video_result'):
"""
API for human video segmentation.
Args:
            video_path (str): The path of the video to preprocess. If video_path is None, it will capture
                the video from your camera.
use_gpu (bool): Whether to use gpu.
save_dir (str): The path to store output video.
"""
if use_gpu:
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
except:
raise RuntimeError(
"Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. "
"If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
)
resize_h = 192
resize_w = 192
if not video_path:
cap_video = cv2.VideoCapture(0)
else:
cap_video = cv2.VideoCapture(video_path)
if not cap_video.isOpened():
raise IOError("Error opening video stream or file, "
"--video_path whether existing: {}"
" or camera whether working".format(video_path))
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
disflow = cv2.DISOpticalFlow_create(
cv2.DISOPTICAL_FLOW_PRESET_ULTRAFAST)
prev_gray = np.zeros((resize_h, resize_w), np.uint8)
prev_cfd = np.zeros((resize_h, resize_w), np.float32)
is_init = True
fps = cap_video.get(cv2.CAP_PROP_FPS)
if video_path is not None:
            print('Please wait, the video is being processed...')
if not osp.exists(save_dir):
os.makedirs(save_dir)
save_path = osp.join(save_dir, 'result' + '.avi')
cap_out = cv2.VideoWriter(
save_path, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps,
(width, height))
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(
np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
prev_cfd, disflow, is_init)
prev_gray = cur_gray.copy()
prev_cfd = optflow_map.copy()
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(
optflow_map, thresh_bg=0.2, thresh_fg=0.8)
                    img_matting = cv2.resize(
                        optflow_map, (width, height), interpolation=cv2.INTER_LINEAR)
img_matting = np.repeat(
img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org +
(1 - img_matting) * bg_im).astype(np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
else:
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(
np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
prev_cfd, disflow, is_init)
prev_gray = cur_gray.copy()
prev_cfd = optflow_map.copy()
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(
optflow_map, thresh_bg=0.2, thresh_fg=0.8)
                    img_matting = cv2.resize(
                        optflow_map, (width, height), interpolation=cv2.INTER_LINEAR)
img_matting = np.repeat(
img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org +
(1 - img_matting) * bg_im).astype(np.uint8)
cv2.imshow('HumanSegmentation', comb)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
else:
break
cap_video.release()
def save_inference_model(self,
dirname='humanseg_mobile_model',
model_filename=None,
params_filename=None,
combined=True):
if combined:
model_filename = "__model__" if not model_filename else model_filename
params_filename = "__params__" if not params_filename else params_filename
place = fluid.CPUPlace()
exe = fluid.Executor(place)
program, feeded_var_names, target_vars = fluid.io.load_inference_model(
dirname=self.default_pretrained_model_path,
model_filename=model_filename,
params_filename=params_filename,
executor=exe)
fluid.io.save_inference_model(
dirname=dirname,
main_program=program,
executor=exe,
feeded_var_names=feeded_var_names,
target_vars=target_vars,
model_filename=model_filename,
params_filename=params_filename)
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.segment(images=images_decode, **kwargs)
results = [{
'data': cv2_to_base64(result['data'])
} for result in results]
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command.
"""
self.parser = argparse.ArgumentParser(
description="Run the {} module.".format(self.name),
prog='hub run {}'.format(self.name),
usage='%(prog)s',
add_help=True)
self.arg_input_group = self.parser.add_argument_group(
title="Input options", description="Input data. Required")
self.arg_config_group = self.parser.add_argument_group(
title="Config options",
description=
"Run configuration for controlling module behavior, not required.")
self.add_module_config_arg()
self.add_module_input_arg()
args = self.parser.parse_args(argvs)
results = self.segment(
paths=[args.input_path],
batch_size=args.batch_size,
use_gpu=args.use_gpu,
output_dir=args.output_dir,
visualization=args.visualization)
if args.save_dir is not None:
check_dir(args.save_dir)
self.save_inference_model(args.save_dir)
return results
def add_module_config_arg(self):
"""
Add the command config options.
"""
self.arg_config_group.add_argument(
'--use_gpu',
type=ast.literal_eval,
default=False,
help="whether use GPU or not")
self.arg_config_group.add_argument(
'--output_dir',
type=str,
default='humanseg_mobile_output',
help="The directory to save output images.")
self.arg_config_group.add_argument(
'--save_dir',
type=str,
default='humanseg_mobile_model',
help="The directory to save model.")
self.arg_config_group.add_argument(
'--visualization',
type=ast.literal_eval,
default=False,
help="whether to save output as images.")
self.arg_config_group.add_argument(
'--batch_size',
type=ast.literal_eval,
default=1,
help="batch size.")
def add_module_input_arg(self):
"""
Add the command input options.
"""
self.arg_input_group.add_argument(
'--input_path', type=str, help="path to image.")
if __name__ == "__main__":
m = HRNetw18samllv1humanseg()
img = cv2.imread('photo.jpg')
#res = m.segment(images=[img], visualization=True)
#print(res[0]['data'])
#m.video_segment('')
cap_video = cv2.VideoCapture('video_test.mp4')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'result_frame.avi'
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap_out = cv2.VideoWriter(save_path,
cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps,
(width, height))
prev_gray = None
prev_cfd = None
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
[img_matting, prev_gray, prev_cfd] = m.video_stream_segment(
frame_org=frame_org,
                frame_id=int(cap_video.get(cv2.CAP_PROP_POS_FRAMES)),
prev_gray=prev_gray,
prev_cfd=prev_cfd)
img_matting = np.repeat(img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org + (1 - img_matting) * bg_im).astype(
np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
# -*- coding:utf-8 -*-
import numpy as np
def human_seg_tracking(pre_gray, cur_gray, prev_cfd, dl_weights, disflow):
"""计算光流跟踪匹配点和光流图
输入参数:
pre_gray: 上一帧灰度图
cur_gray: 当前帧灰度图
prev_cfd: 上一帧光流图
dl_weights: 融合权重图
disflow: 光流数据结构
返回值:
is_track: 光流点跟踪二值图,即是否具有光流点匹配
track_cfd: 光流跟踪图
"""
check_thres = 8
h, w = pre_gray.shape[:2]
track_cfd = np.zeros_like(prev_cfd)
is_track = np.zeros_like(pre_gray)
flow_fw = disflow.calc(pre_gray, cur_gray, None)
flow_bw = disflow.calc(cur_gray, pre_gray, None)
    flow_fw = np.round(flow_fw).astype(int)
    flow_bw = np.round(flow_bw).astype(int)
y_list = np.array(range(h))
x_list = np.array(range(w))
yv, xv = np.meshgrid(y_list, x_list)
yv, xv = yv.T, xv.T
cur_x = xv + flow_fw[:, :, 0]
cur_y = yv + flow_fw[:, :, 1]
    # Do not track points that move out of the frame boundary.
not_track = (cur_x < 0) + (cur_x >= w) + (cur_y < 0) + (cur_y >= h)
flow_bw[~not_track] = flow_bw[cur_y[~not_track], cur_x[~not_track]]
not_track += (np.square(flow_fw[:, :, 0] + flow_bw[:, :, 0]) +
np.square(flow_fw[:, :, 1] + flow_bw[:, :, 1])) >= check_thres
track_cfd[cur_y[~not_track], cur_x[~not_track]] = prev_cfd[~not_track]
is_track[cur_y[~not_track], cur_x[~not_track]] = 1
not_flow = np.all(
np.abs(flow_fw) == 0, axis=-1) * np.all(
np.abs(flow_bw) == 0, axis=-1)
dl_weights[cur_y[not_flow], cur_x[not_flow]] = 0.05
return track_cfd, is_track, dl_weights
def human_seg_track_fuse(track_cfd, dl_cfd, dl_weights, is_track):
"""光流追踪图和人像分割结构融合
输入参数:
track_cfd: 光流追踪图
dl_cfd: 当前帧分割结果
dl_weights: 融合权重图
is_track: 光流点匹配二值图
返回
cur_cfd: 光流跟踪图和人像分割结果融合图
"""
fusion_cfd = dl_cfd.copy()
    is_track = is_track.astype(bool)
fusion_cfd[is_track] = dl_weights[is_track] * dl_cfd[is_track] + (
1 - dl_weights[is_track]) * track_cfd[is_track]
    # High-confidence ("certain") regions of the segmentation result.
index_certain = ((dl_cfd > 0.9) + (dl_cfd < 0.1)) * is_track
index_less01 = (dl_weights < 0.1) * index_certain
fusion_cfd[index_less01] = 0.3 * dl_cfd[index_less01] + 0.7 * track_cfd[
index_less01]
index_larger09 = (dl_weights >= 0.1) * index_certain
fusion_cfd[index_larger09] = 0.4 * dl_cfd[index_larger09] + 0.6 * track_cfd[
index_larger09]
return fusion_cfd
def threshold_mask(img, thresh_bg, thresh_fg):
dst = (img / 255.0 - thresh_bg) / (thresh_fg - thresh_bg)
dst[np.where(dst > 1)] = 1
dst[np.where(dst < 0)] = 0
return dst.astype(np.float32)
def postprocess_v(cur_gray, scoremap, prev_gray, pre_cfd, disflow, is_init):
"""光流优化
Args:
cur_gray : 当前帧灰度图
pre_gray : 前一帧灰度图
pre_cfd :前一帧融合结果
scoremap : 当前帧分割结果
difflow : 光流
is_init : 是否第一帧
Returns:
fusion_cfd : 光流追踪图和预测结果融合图
"""
h, w = scoremap.shape
cur_cfd = scoremap.copy()
if is_init:
if h <= 64 or w <= 64:
disflow.setFinestScale(1)
elif h <= 160 or w <= 160:
disflow.setFinestScale(2)
else:
disflow.setFinestScale(3)
fusion_cfd = cur_cfd
else:
weights = np.ones((h, w), np.float32) * 0.3
track_cfd, is_track, weights = human_seg_tracking(
prev_gray, cur_gray, pre_cfd, weights, disflow)
fusion_cfd = human_seg_track_fuse(track_cfd, cur_cfd, weights, is_track)
return fusion_cfd
# -*- coding:utf-8 -*-
import os
import time
import base64
import cv2
import numpy as np
__all__ = ['cv2_to_base64', 'base64_to_cv2', 'postprocess']
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
def postprocess(data_out,
org_im,
org_im_shape,
org_im_path,
output_dir,
visualization,
thresh=120):
"""
Postprocess output of network. one image at a time.
Args:
data_out (numpy.ndarray): output of network.
org_im (numpy.ndarray): original image.
        org_im_shape (list): shape of original image.
        org_im_path (list): path of original image.
output_dir (str): output directory to store image.
visualization (bool): whether to save image or not.
thresh (float): threshold.
Returns:
result (dict): The data of processed image.
"""
result = dict()
for logit in data_out:
logit = (logit * 255).astype(np.uint8)
logit = cv2.resize(logit, (org_im_shape[1], org_im_shape[0]))
rgba = np.concatenate((org_im, np.expand_dims(logit, axis=2)), axis=2)
if visualization:
check_dir(output_dir)
save_im_path = get_save_image_name(org_im, org_im_path, output_dir)
cv2.imwrite(save_im_path, rgba)
result['save_path'] = save_im_path
result['data'] = logit
else:
result['data'] = logit
return result
def check_dir(dir_path):
if not os.path.exists(dir_path):
os.makedirs(dir_path)
elif os.path.isfile(dir_path):
os.remove(dir_path)
os.makedirs(dir_path)
def get_save_image_name(org_im, org_im_path, output_dir):
"""
Get save image name from source image path.
"""
    # name prefix of original image
org_im_name = os.path.split(org_im_path)[-1]
im_prefix = os.path.splitext(org_im_name)[0]
ext = '.png'
# save image path
save_im_path = os.path.join(output_dir, im_prefix + ext)
if os.path.exists(save_im_path):
save_im_path = os.path.join(
output_dir, im_prefix + 'time={}'.format(int(time.time())) + ext)
return save_im_path
## Model Overview
A high-accuracy portrait segmentation model for server-side GPUs and scenes with complex backgrounds. The architecture is DeepLabv3+ with an Xception65 backbone and the model size is 158 MB. The network structure is shown below:
<p align="center">
<img src="https://paddlehub.bj.bcebos.com/paddlehub-img/deeplabv3plus.png" hspace='10'/> <br />
</p>
## Command-Line Prediction
```
hub run humanseg_server --input_path "/PATH/TO/IMAGE"
```
## API
```python
def segment(self,
images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_server_output'):
```
Prediction API for portrait segmentation.
**Parameters**
* images (list\[numpy.ndarray\]): image data, with ndarray.shape \[H, W, C\], in BGR format;
* paths (list\[str\]): paths of the images;
* batch\_size (int): batch size;
* use\_gpu (bool): whether to use the GPU;
* visualization (bool): whether to save the segmentation result as an image file;
* output\_dir (str): directory in which to save the images.
**Returns**
* res (list\[dict\]): list of segmentation results; each element is a dict with keys 'save\_path' and 'data':
    * save\_path (str, optional): path of the visualized image (present only when visualization=True);
    * data (numpy.ndarray): portrait segmentation result containing only the alpha channel, with values 0-255 (0 fully transparent, 255 opaque); the higher the value, the more likely the pixel belongs to a person, and the lower, the more likely it is background (see the sketch below).
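As a brief sketch (the paths are placeholders), `segment` also accepts file paths plus a batch size, which is convenient for offline processing of many images:
```python
import paddlehub as hub

human_seg = hub.Module('humanseg_server')
# Batch prediction over image paths; results are appended in input order.
res = human_seg.segment(
    paths=['/PATH/TO/IMAGE_1', '/PATH/TO/IMAGE_2'],
    batch_size=2,
    visualization=True,
    output_dir='humanseg_server_output')
for r in res:
    print(r['save_path'], r['data'].shape)
```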
```python
def video_stream_segment(self,
frame_org,
frame_id,
prev_gray,
prev_cfd,
use_gpu=False):
```
Prediction API for frame-by-frame portrait segmentation of a video.
**Parameters**
* frame_org (numpy.ndarray): single-frame image data, with ndarray.shape \[H, W, C\], in BGR format;
* frame_id (int): index of the current frame;
* prev_gray (numpy.ndarray): grayscale version of the previous frame fed to the network;
* prev_cfd (numpy.ndarray): fusion of the previous frame's optical-flow tracking map and prediction result;
* use\_gpu (bool): whether to predict on GPU. If so, set the CUDA_VISIBLE_DEVICES environment variable before predicting; otherwise it need not be set.
**Returns**
* img_matting (numpy.ndarray): portrait segmentation result containing only the alpha channel, with values 0-1 (0 fully transparent, 1 opaque);
* cur_gray (numpy.ndarray): grayscale version of the current frame fed to the segmentation network;
* optflow_map (numpy.ndarray): fusion of the current frame's optical-flow tracking map and prediction result.
```python
def video_segment(self,
video_path=None,
use_gpu=False,
save_dir='humanseg_server_video'):
```
Prediction API for video portrait segmentation.
**Parameters**
* video\_path (str): path of the video to segment. If None, video is captured from the local camera and a pop-up window shows the live segmentation result;
* use\_gpu (bool): whether to predict on GPU. If so, set the CUDA_VISIBLE_DEVICES environment variable before predicting; otherwise it need not be set;
* save\_dir (str): directory in which to save the processed video; used only when video\_path is not None.
```python
def save_inference_model(dirname,
model_filename=None,
params_filename=None,
combined=True):
```
Saves the model to the specified path.
**Parameters**
* dirname: directory in which the model is saved
* model\_filename: model file name, defaults to \_\_model\_\_
* params\_filename: parameter file name, defaults to \_\_params\_\_ (effective only when `combined` is True)
* combined: whether to save all parameters into a single file (see the sketch below)
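As an illustrative sketch only (it uses the same `fluid` API the module itself relies on), a saved combined model can be reloaded for inference like this:
```python
import paddle.fluid as fluid

place = fluid.CPUPlace()
exe = fluid.Executor(place)
# Load the combined __model__/__params__ pair written by save_inference_model.
program, feed_names, fetch_targets = fluid.io.load_inference_model(
    dirname='humanseg_server_model',
    model_filename='__model__',
    params_filename='__params__',
    executor=exe)
print(feed_names, [t.name for t in fetch_targets])
```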
## Code Example
Example of image and video segmentation:
```python
import cv2
import paddlehub as hub
human_seg = hub.Module('humanseg_server')
im = cv2.imread('/PATH/TO/IMAGE')
# visualization=True saves the segmentation result for inspection; set it to False for faster runs.
res = human_seg.segment(images=[im], visualization=True)
print(res[0]['data'])
human_seg.video_segment('/PATH/TO/VIDEO')
human_seg.save_inference_model('/PATH/TO/SAVE/MODEL')
```
Example of video-stream prediction:
```python
import cv2
import numpy as np
import paddlehub as hub
human_seg = hub.Module('humanseg_server')
cap_video = cv2.VideoCapture('/PATH/TO/VIDEO')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'humanseg_server_video.avi'
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap_out = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps, (width, height))
prev_gray = None
prev_cfd = None
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
        [img_matting, prev_gray, prev_cfd] = human_seg.video_stream_segment(frame_org=frame_org, frame_id=int(cap_video.get(cv2.CAP_PROP_POS_FRAMES)), prev_gray=prev_gray, prev_cfd=prev_cfd)
img_matting = np.repeat(img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org + (1 - img_matting) * bg_im).astype(np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
```
## Service Deployment
PaddleHub Serving can deploy an online portrait segmentation service.
## Step 1: Start PaddleHub Serving
Run the start command:
```shell
$ hub serving start -m humanseg_server
```
This deploys the portrait segmentation API service; the default port is 8866.
**NOTE:** To predict on GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it need not be set.
## Step 2: Send a Prediction Request
With the server configured, the few lines of code below send a prediction request and fetch the result:
```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
# Send the HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/humanseg_server"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# Save the image
mask = cv2.cvtColor(base64_to_cv2(r.json()["results"][0]['data']), cv2.COLOR_BGR2GRAY)
rgba = np.concatenate((org_im, np.expand_dims(mask, axis=2)), axis=2)
cv2.imwrite("segment_human_server.png", rgba)
```
### Source Code
https://github.com/PaddlePaddle/PaddleSeg/tree/develop/contrib/HumanSeg
### Dependencies
paddlepaddle >= 1.8.0
paddlehub >= 1.7.1
# coding=utf-8
import os
import time
from collections import OrderedDict
import cv2
import numpy as np
from PIL import Image
__all__ = ['reader', 'preprocess_v']
def preprocess_v(img, w, h):
    img = cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR).astype(np.float32)
img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img = img.transpose((2, 0, 1)) / 255
img -= img_mean
img /= img_std
return img
def reader(images=None, paths=None):
"""
Preprocess to yield image.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C]
paths (list[str]): paths to images.
Yield:
each (collections.OrderedDict): info of original image, preprocessed image.
"""
component = list()
if paths:
for im_path in paths:
each = OrderedDict()
assert os.path.isfile(
im_path), "The {} isn't a valid file path.".format(im_path)
im = cv2.imread(im_path).astype('float32')
each['org_im'] = im
each['org_im_path'] = im_path
each['org_im_shape'] = im.shape
component.append(each)
if images is not None:
assert type(images) is list, "images should be a list."
for im in images:
each = OrderedDict()
each['org_im'] = im
each['org_im_path'] = 'ndarray_time={}'.format(
round(time.time(), 6) * 1e6)
each['org_im_shape'] = im.shape
component.append(each)
for element in component:
img = element['org_im'].copy()
img = cv2.resize(img, (513, 513)).astype(np.float32)
img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img = img.transpose((2, 0, 1)) / 255
img -= img_mean
img /= img_std
element['image'] = img
yield element
# -*- coding:utf-8 -*-
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import ast
import os
import os.path as osp
import argparse
import cv2
import numpy as np
import paddle.fluid as fluid
import paddlehub as hub
from paddle.fluid.core import PaddleTensor, AnalysisConfig, create_paddle_predictor
from paddlehub.module.module import moduleinfo, runnable, serving
from humanseg_server.processor import postprocess, base64_to_cv2, cv2_to_base64, check_dir
from humanseg_server.data_feed import reader, preprocess_v
from humanseg_server.optimal import postprocess_v, threshold_mask
@moduleinfo(
name="humanseg_server",
type="CV/semantic_segmentation",
author="baidu-vis",
author_email="",
summary="DeepLabv3+ is a semantic segmentation model.",
version="1.1.0")
class DeeplabV3pXception65HumanSeg(hub.Module):
def _initialize(self):
self.default_pretrained_model_path = os.path.join(
self.directory, "humanseg_server_inference")
self._set_config()
def _set_config(self):
"""
predictor config setting
"""
self.model_file_path = os.path.join(self.default_pretrained_model_path,
'__model__')
self.params_file_path = os.path.join(self.default_pretrained_model_path,
'__params__')
cpu_config = AnalysisConfig(self.model_file_path, self.params_file_path)
cpu_config.disable_glog_info()
cpu_config.disable_gpu()
self.cpu_predictor = create_paddle_predictor(cpu_config)
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
use_gpu = True
except:
use_gpu = False
if use_gpu:
gpu_config = AnalysisConfig(self.model_file_path,
self.params_file_path)
gpu_config.disable_glog_info()
gpu_config.enable_use_gpu(
memory_pool_init_size_mb=1000, device_id=0)
self.gpu_predictor = create_paddle_predictor(gpu_config)
def segment(self,
images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_server_output'):
"""
API for human segmentation.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C], the color space is BGR.
paths (list[str]): The paths of images.
batch_size (int): batch size.
use_gpu (bool): Whether to use gpu.
visualization (bool): Whether to save image or not.
output_dir (str): The path to store output images.
Returns:
res (list[dict]): each element in the list is a dict, the keys and values are:
save_path (str, optional): the path to save images. (Exists only if visualization is True)
data (numpy.ndarray): data of post processed image.
"""
if use_gpu:
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
except:
raise RuntimeError(
"Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
)
# compatibility with older versions
all_data = list()
for yield_data in reader(images, paths):
all_data.append(yield_data)
total_num = len(all_data)
loop_num = int(np.ceil(total_num / batch_size))
res = list()
for iter_id in range(loop_num):
batch_data = list()
handle_id = iter_id * batch_size
for image_id in range(batch_size):
try:
batch_data.append(all_data[handle_id + image_id])
except:
pass
# feed batch image
batch_image = np.array([data['image'] for data in batch_data])
batch_image = PaddleTensor(batch_image.copy())
output = self.gpu_predictor.run([
batch_image
]) if use_gpu else self.cpu_predictor.run([batch_image])
output = output[1].as_ndarray()
output = np.expand_dims(output[:, 1, :, :], axis=1)
# postprocess one by one
for i in range(len(batch_data)):
out = postprocess(
data_out=output[i],
org_im=batch_data[i]['org_im'],
org_im_shape=batch_data[i]['org_im_shape'],
org_im_path=batch_data[i]['org_im_path'],
output_dir=output_dir,
visualization=visualization)
res.append(out)
return res
def video_stream_segment(self,
frame_org,
frame_id,
prev_gray,
prev_cfd,
use_gpu=False):
"""
API for human video segmentation.
Args:
frame_org (numpy.ndarray): frame data, shape of each is [H, W, C], the color space is BGR.
frame_id (int): index of the frame to be decoded.
prev_gray (numpy.ndarray): gray scale image of last frame, shape of each is [H, W]
prev_cfd (numpy.ndarray): fusion image from optical flow image and segment result, shape of each is [H, W]
use_gpu (bool): Whether to use gpu.
Returns:
img_matting (numpy.ndarray): data of segmentation mask.
cur_gray (numpy.ndarray): gray scale image of current frame, shape of each is [H, W]
optflow_map (numpy.ndarray): optical flow image of current frame, shape of each is [H, W]
"""
resize_h = 512
resize_w = 512
is_init = True
        height = int(frame_org.shape[0])
        width = int(frame_org.shape[1])
disflow = cv2.DISOpticalFlow_create(
cv2.DISOPTICAL_FLOW_PRESET_ULTRAFAST)
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
if frame_id == 1:
prev_gray = np.zeros((resize_h, resize_w), np.uint8)
prev_cfd = np.zeros((resize_h, resize_w), np.float32)
optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
prev_cfd, disflow, is_init)
else:
optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
prev_cfd, disflow, is_init)
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(optflow_map, thresh_bg=0.2, thresh_fg=0.8)
        img_matting = cv2.resize(optflow_map, (width, height), interpolation=cv2.INTER_LINEAR)
return [img_matting, cur_gray, optflow_map]
def video_segment(self,
video_path=None,
use_gpu=False,
save_dir='humanseg_server_video'):
resize_h = 512
resize_w = 512
if not video_path:
cap_video = cv2.VideoCapture(0)
else:
cap_video = cv2.VideoCapture(video_path)
if not cap_video.isOpened():
            raise IOError("Error opening video stream or file: check whether "
                          "--video_path {} exists or the camera is working.".format(video_path))
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
disflow = cv2.DISOpticalFlow_create(
cv2.DISOPTICAL_FLOW_PRESET_ULTRAFAST)
prev_gray = np.zeros((resize_h, resize_w), np.uint8)
prev_cfd = np.zeros((resize_h, resize_w), np.float32)
is_init = True
fps = cap_video.get(cv2.CAP_PROP_FPS)
if video_path is not None:
            print('Please wait. The video is being processed...')
if not osp.exists(save_dir):
os.makedirs(save_dir)
save_path = osp.join(save_dir, 'result' + '.avi')
cap_out = cv2.VideoWriter(
save_path, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps,
(width, height))
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(
np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
prev_cfd, disflow, is_init)
prev_gray = cur_gray.copy()
prev_cfd = optflow_map.copy()
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(
optflow_map, thresh_bg=0.2, thresh_fg=0.8)
                    img_matting = cv2.resize(optflow_map, (width, height), interpolation=cv2.INTER_LINEAR)
img_matting = np.repeat(
img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org +
(1 - img_matting) * bg_im).astype(np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
else:
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(
np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
prev_cfd, disflow, is_init)
prev_gray = cur_gray.copy()
prev_cfd = optflow_map.copy()
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(
optflow_map, thresh_bg=0.2, thresh_fg=0.8)
                    img_matting = cv2.resize(optflow_map, (width, height), interpolation=cv2.INTER_LINEAR)
img_matting = np.repeat(
img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org +
(1 - img_matting) * bg_im).astype(np.uint8)
cv2.imshow('HumanSegmentation', comb)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
else:
break
cap_video.release()
def save_inference_model(self,
dirname='humanseg_server_model',
model_filename=None,
params_filename=None,
combined=True):
if combined:
model_filename = "__model__" if not model_filename else model_filename
params_filename = "__params__" if not params_filename else params_filename
place = fluid.CPUPlace()
exe = fluid.Executor(place)
program, feeded_var_names, target_vars = fluid.io.load_inference_model(
dirname=self.default_pretrained_model_path,
model_filename=model_filename,
params_filename=params_filename,
executor=exe)
fluid.io.save_inference_model(
dirname=dirname,
main_program=program,
executor=exe,
feeded_var_names=feeded_var_names,
target_vars=target_vars,
model_filename=model_filename,
params_filename=params_filename)
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.segment(images=images_decode, **kwargs)
results = [{
'data': cv2_to_base64(result['data'])
} for result in results]
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command.
"""
self.parser = argparse.ArgumentParser(
description="Run the {} module.".format(self.name),
prog='hub run {}'.format(self.name),
usage='%(prog)s',
add_help=True)
self.arg_input_group = self.parser.add_argument_group(
title="Input options", description="Input data. Required")
self.arg_config_group = self.parser.add_argument_group(
title="Config options",
description=
"Run configuration for controlling module behavior, not required.")
self.add_module_config_arg()
self.add_module_input_arg()
args = self.parser.parse_args(argvs)
results = self.segment(
paths=[args.input_path],
batch_size=args.batch_size,
use_gpu=args.use_gpu,
output_dir=args.output_dir,
visualization=args.visualization)
if args.save_dir is not None:
check_dir(args.save_dir)
self.save_inference_model(args.save_dir)
return results
def add_module_config_arg(self):
"""
Add the command config options.
"""
self.arg_config_group.add_argument(
'--use_gpu',
type=ast.literal_eval,
default=False,
help="whether use GPU or not")
self.arg_config_group.add_argument(
'--output_dir',
type=str,
default='humanseg_server_output',
help="The directory to save output images.")
self.arg_config_group.add_argument(
'--save_dir',
type=str,
default='humanseg_server_model',
help="The directory to save model.")
self.arg_config_group.add_argument(
'--visualization',
type=ast.literal_eval,
default=False,
help="whether to save output as images.")
self.arg_config_group.add_argument(
'--batch_size',
type=ast.literal_eval,
default=1,
help="batch size.")
def add_module_input_arg(self):
"""
Add the command input options.
"""
self.arg_input_group.add_argument(
'--input_path', type=str, help="path to image.")
if __name__ == "__main__":
m = DeeplabV3pXception65HumanSeg()
# img = cv2.imread('photo.jpg')
# res = m.segment(images=[img])
# print(res[0]['data'])
# m.save_inference_model()
#m.video_segment(video_path='video_test.mp4')
img = cv2.imread('photo.jpg')
# res = m.segment(images=[img], visualization=True)
# print(res[0]['data'])
# m.video_segment('')
cap_video = cv2.VideoCapture('video_test.mp4')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'result_frame.avi'
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap_out = cv2.VideoWriter(save_path,
cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps,
(width, height))
prev_gray = None
prev_cfd = None
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
[img_matting, prev_gray, prev_cfd] = m.video_stream_segment(
frame_org=frame_org,
                frame_id=cap_video.get(cv2.CAP_PROP_POS_FRAMES),
prev_gray=prev_gray,
prev_cfd=prev_cfd)
img_matting = np.repeat(img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org + (1 - img_matting) * bg_im).astype(
np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
# -*- coding:utf-8 -*-
import numpy as np
def human_seg_tracking(pre_gray, cur_gray, prev_cfd, dl_weights, disflow):
"""计算光流跟踪匹配点和光流图
输入参数:
pre_gray: 上一帧灰度图
cur_gray: 当前帧灰度图
prev_cfd: 上一帧光流图
dl_weights: 融合权重图
disflow: 光流数据结构
返回值:
is_track: 光流点跟踪二值图,即是否具有光流点匹配
track_cfd: 光流跟踪图
"""
check_thres = 8
h, w = pre_gray.shape[:2]
track_cfd = np.zeros_like(prev_cfd)
is_track = np.zeros_like(pre_gray)
flow_fw = disflow.calc(pre_gray, cur_gray, None)
flow_bw = disflow.calc(cur_gray, pre_gray, None)
flow_fw = np.round(flow_fw).astype(np.int)
flow_bw = np.round(flow_bw).astype(np.int)
y_list = np.array(range(h))
x_list = np.array(range(w))
yv, xv = np.meshgrid(y_list, x_list)
yv, xv = yv.T, xv.T
cur_x = xv + flow_fw[:, :, 0]
cur_y = yv + flow_fw[:, :, 1]
    # do not track pixels that flow out of the frame
not_track = (cur_x < 0) + (cur_x >= w) + (cur_y < 0) + (cur_y >= h)
flow_bw[~not_track] = flow_bw[cur_y[~not_track], cur_x[~not_track]]
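    # forward-backward consistency: keep a match only if the forward and backward flows nearly cancel out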
not_track += (np.square(flow_fw[:, :, 0] + flow_bw[:, :, 0]) +
np.square(flow_fw[:, :, 1] + flow_bw[:, :, 1])) >= check_thres
track_cfd[cur_y[~not_track], cur_x[~not_track]] = prev_cfd[~not_track]
is_track[cur_y[~not_track], cur_x[~not_track]] = 1
not_flow = np.all(
np.abs(flow_fw) == 0, axis=-1) * np.all(
np.abs(flow_bw) == 0, axis=-1)
dl_weights[cur_y[not_flow], cur_x[not_flow]] = 0.05
return track_cfd, is_track, dl_weights
def human_seg_track_fuse(track_cfd, dl_cfd, dl_weights, is_track):
"""光流追踪图和人像分割结构融合
输入参数:
track_cfd: 光流追踪图
dl_cfd: 当前帧分割结果
dl_weights: 融合权重图
is_track: 光流点匹配二值图
返回
cur_cfd: 光流跟踪图和人像分割结果融合图
"""
fusion_cfd = dl_cfd.copy()
is_track = is_track.astype(np.bool)
fusion_cfd[is_track] = dl_weights[is_track] * dl_cfd[is_track] + (
1 - dl_weights[is_track]) * track_cfd[is_track]
    # high-confidence regions
index_certain = ((dl_cfd > 0.9) + (dl_cfd < 0.1)) * is_track
index_less01 = (dl_weights < 0.1) * index_certain
fusion_cfd[index_less01] = 0.3 * dl_cfd[index_less01] + 0.7 * track_cfd[
index_less01]
index_larger09 = (dl_weights >= 0.1) * index_certain
fusion_cfd[index_larger09] = 0.4 * dl_cfd[index_larger09] + 0.6 * track_cfd[
index_larger09]
return fusion_cfd
def threshold_mask(img, thresh_bg, thresh_fg):
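    # Linearly map [thresh_bg, thresh_fg] (on a 0-255 input scale) to [0, 1], clipping values outside the band.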
dst = (img / 255.0 - thresh_bg) / (thresh_fg - thresh_bg)
dst[np.where(dst > 1)] = 1
dst[np.where(dst < 0)] = 0
return dst.astype(np.float32)
def postprocess_v(cur_gray, scoremap, prev_gray, pre_cfd, disflow, is_init):
"""光流优化
Args:
cur_gray : 当前帧灰度图
pre_gray : 前一帧灰度图
pre_cfd :前一帧融合结果
scoremap : 当前帧分割结果
difflow : 光流
is_init : 是否第一帧
Returns:
fusion_cfd : 光流追踪图和预测结果融合图
"""
h, w = scoremap.shape
cur_cfd = scoremap.copy()
if is_init:
if h <= 64 or w <= 64:
disflow.setFinestScale(1)
elif h <= 160 or w <= 160:
disflow.setFinestScale(2)
else:
disflow.setFinestScale(3)
fusion_cfd = cur_cfd
else:
weights = np.ones((h, w), np.float32) * 0.3
track_cfd, is_track, weights = human_seg_tracking(
prev_gray, cur_gray, pre_cfd, weights, disflow)
fusion_cfd = human_seg_track_fuse(track_cfd, cur_cfd, weights, is_track)
return fusion_cfd
# -*- coding:utf-8 -*-
import os
import time
import base64
import cv2
import numpy as np
__all__ = ['cv2_to_base64', 'base64_to_cv2', 'postprocess']
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.fromstring(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
def postprocess(data_out, org_im, org_im_shape, org_im_path, output_dir,
visualization):
"""
Postprocess output of network. one image at a time.
Args:
data_out (numpy.ndarray): output of network.
org_im (numpy.ndarray): original image.
        org_im_shape (list): shape of the original image.
        org_im_path (str): path of the original image.
output_dir (str): output directory to store image.
visualization (bool): whether to save image or not.
Returns:
result (dict): The data of processed image.
"""
result = dict()
for logit in data_out:
logit = (logit * 255).astype(np.uint8)
logit = cv2.resize(logit, (org_im_shape[1], org_im_shape[0]))
rgba = np.concatenate((org_im, np.expand_dims(logit, axis=2)), axis=2)
if visualization:
check_dir(output_dir)
save_im_path = get_save_image_name(org_im, org_im_path, output_dir)
cv2.imwrite(save_im_path, rgba)
result['save_path'] = save_im_path
result['data'] = rgba[:, :, 3]
else:
result['data'] = rgba[:, :, 3]
return result
def check_dir(dir_path):
if not os.path.exists(dir_path):
os.makedirs(dir_path)
elif os.path.isfile(dir_path):
os.remove(dir_path)
os.makedirs(dir_path)
def get_save_image_name(org_im, org_im_path, output_dir):
"""
Get save image name from source image path.
"""
    # name prefix of the original image
org_im_name = os.path.split(org_im_path)[-1]
im_prefix = os.path.splitext(org_im_name)[0]
ext = '.png'
# save image path
save_im_path = os.path.join(output_dir, im_prefix + ext)
if os.path.exists(save_im_path):
save_im_path = os.path.join(
output_dir, im_prefix + 'time={}'.format(int(time.time())) + ext)
return save_im_path
## Model Overview
DCSCN is a lightweight super-resolution model based on "Fast and Accurate Image Super Resolution by Deep CNN with Skip Connection and Network in Network". It builds its network with residual structures and skip connections to extract local and global features, and uses parallel 1x1 convolutions to learn fine detail and improve performance. The model performs 2x super resolution.
## Command-Line Prediction
```
$ hub run dcscn --input_path "/PATH/TO/IMAGE"
```
## API
```python
def reconstruct(self,
images=None,
paths=None,
use_gpu=False,
visualization=False,
output_dir="dcscn_output")
```
Prediction API for image super resolution.
**Parameters**
* images (list\[numpy.ndarray\]): image data with ndarray.shape \[H, W, C\], in BGR format;
* paths (list\[str\]): paths to the images;
* use\_gpu (bool): whether to predict on GPU; if so, set the CUDA_VISIBLE_DEVICES environment variable before prediction, otherwise it need not be set;
* visualization (bool): whether to save the results as image files;
* output\_dir (str): directory in which to save the images.
**Returns**
* res (list\[dict\]): list of results; each element is a dict whose keys are 'save\_path' and 'data', with values:
* save\_path (str, optional): path of the saved visualization (present only when visualization=True);
* data (numpy.ndarray): the super-resolved image.
```python
def save_inference_model(self,
dirname='dcscn_save_model',
model_filename=None,
params_filename=None,
combined=False)
```
Save the model to the specified path.
**Parameters**
* dirname: directory in which to save the model
* model\_filename: name of the model file, defaults to \_\_model\_\_
* params\_filename: name of the parameter file, defaults to \_\_params\_\_ (effective only when `combined` is True)
* combined: whether to save all parameters into a single file
## Code Example
```python
import cv2
import paddlehub as hub
sr_model = hub.Module('dcscn')
im = cv2.imread('/PATH/TO/IMAGE').astype('float32')
# visualization=True saves the super-resolved image for inspection; set it to False for faster runs.
res = sr_model.reconstruct(images=[im], visualization=True)
print(res[0]['data'])
sr_model.save_inference_model()
```
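As a quick sanity check (a hedged sketch, assuming the usual \[H, W, ...\] ndarray layout for the result), the reconstructed output should be twice the input resolution:
```python
h, w = im.shape[:2]
sr = res[0]['data']
print(sr.shape)  # spatial dims are expected to be (2 * h, 2 * w)
```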
## Service Deployment
PaddleHub Serving can deploy an online image super-resolution service.
## Step 1: Start PaddleHub Serving
Run the start command:
```shell
$ hub serving start -m dcscn
```
This deploys the super-resolution API as a service; the default port is 8866.
**NOTE:** To predict on GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it need not be set.
## Step 2: Send a Prediction Request
With the server configured, the following few lines of code send a prediction request and retrieve the result.
```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.fromstring(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
# Send an HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/dcscn"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
sr = np.expand_dims(cv2.cvtColor(base64_to_cv2(r.json()["results"][0]['data']), cv2.COLOR_BGR2GRAY), axis=2)
shape = sr.shape
org_im = cv2.cvtColor(org_im, cv2.COLOR_BGR2YUV)
uv = cv2.resize(org_im[...,1:], (shape[1], shape[0]), interpolation=cv2.INTER_CUBIC)
combine_im = cv2.cvtColor(np.concatenate((sr, uv), axis=2), cv2.COLOR_YUV2BGR)
cv2.imwrite('dcscn_X2.png', combine_im)
print("save image as dcscn_X2.png")
```
### View Code
https://github.com/jiny2001/dcscn-super-resolution
### Dependencies
paddlepaddle >= 1.8.0
paddlehub >= 1.7.1
# -*- coding:utf-8 -*-
import os
import time
from collections import OrderedDict
import cv2
import numpy as np
from PIL import Image
__all__ = ['reader']
def reader(images=None, paths=None):
"""
Preprocess to yield image.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C]
paths (list[str]): paths to images.
Yield:
each (collections.OrderedDict): info of original image, preprocessed image.
"""
component = list()
if paths:
for im_path in paths:
each = OrderedDict()
assert os.path.isfile(
im_path), "The {} isn't a valid file path.".format(im_path)
im = cv2.imread(im_path).astype('float32')
each['org_im'] = im
each['org_im_path'] = im_path
each['org_im_shape'] = im.shape
component.append(each)
if images is not None:
assert type(images) is list, "images should be a list."
for im in images:
im = im.astype(np.float32)
each = OrderedDict()
each['org_im'] = im
each['org_im_path'] = 'ndarray_time={}'.format(
round(time.time(), 6) * 1e6)
each['org_im_shape'] = im.shape
component.append(each)
for element in component:
img = element['org_im'].copy()
img = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
shape = img.shape
img_x = np.expand_dims(img[:, :, 0], axis=2)
img_x2 = np.expand_dims(
cv2.resize(
img_x, (shape[1] * 2, shape[0] * 2),
interpolation=cv2.INTER_CUBIC),
axis=2)
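        # img_x is the original-resolution Y channel; img_x2 is its bicubically 2x-upscaled version, fed to the network alongside it.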
img_x = img_x.transpose((2, 0, 1)) / 255
img_x2 = img_x2.transpose(2, 0, 1) / 255
img_x = img_x.astype(np.float32)
img_x2 = img_x2.astype(np.float32)
element['img_x'] = img_x
element['img_x2'] = img_x2
yield element
if __name__ == "__main__":
path = ['photo.jpg']
reader(paths=path)
# -*- coding:utf-8 -*-
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import ast
import os
import argparse
import numpy as np
import paddle.fluid as fluid
import paddlehub as hub
from paddle.fluid.core import PaddleTensor, AnalysisConfig, create_paddle_predictor
from paddlehub.module.module import moduleinfo, runnable, serving
from dcscn.data_feed import reader
from dcscn.processor import postprocess, base64_to_cv2, cv2_to_base64, check_dir
@moduleinfo(
name="dcscn",
type="CV/image_editing",
author="paddlepaddle",
author_email="",
summary="dcscn is a super resolution model.",
version="1.0.0")
class Dcscn(hub.Module):
def _initialize(self):
self.default_pretrained_model_path = os.path.join(
self.directory, "dcscn_model")
self._set_config()
def _set_config(self):
"""
predictor config setting
"""
self.model_file_path = self.default_pretrained_model_path
cpu_config = AnalysisConfig(self.model_file_path)
cpu_config.disable_glog_info()
cpu_config.disable_gpu()
self.cpu_predictor = create_paddle_predictor(cpu_config)
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
use_gpu = True
except:
use_gpu = False
if use_gpu:
gpu_config = AnalysisConfig(self.model_file_path)
gpu_config.disable_glog_info()
gpu_config.enable_use_gpu(
memory_pool_init_size_mb=1000, device_id=0)
self.gpu_predictor = create_paddle_predictor(gpu_config)
def reconstruct(self,
images=None,
paths=None,
use_gpu=False,
visualization=False,
output_dir="dcscn_output"):
"""
API for super resolution.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C], the color space is BGR.
paths (list[str]): The paths of images.
use_gpu (bool): Whether to use gpu.
visualization (bool): Whether to save image or not.
output_dir (str): The path to store output images.
Returns:
res (list[dict]): each element in the list is a dict, the keys and values are:
save_path (str, optional): the path to save images. (Exists only if visualization is True)
data (numpy.ndarray): data of post processed image.
"""
if use_gpu:
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
except:
raise RuntimeError(
"Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
)
all_data = list()
for yield_data in reader(images, paths):
all_data.append(yield_data)
total_num = len(all_data)
res = list()
for i in range(total_num):
image_x = np.array([all_data[i]['img_x']])
image_x2 = np.array([all_data[i]['img_x2']])
dropout = np.array([0])
image_x = PaddleTensor(image_x.copy())
image_x2 = PaddleTensor(image_x2.copy())
drop_out = PaddleTensor(dropout.copy())
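            # NOTE: drop_out is constructed here but not fed to the predictor below.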
output = self.gpu_predictor.run([
image_x, image_x2
]) if use_gpu else self.cpu_predictor.run([image_x, image_x2])
output = np.expand_dims(output[0].as_ndarray(), axis=1)
out = postprocess(
data_out=output,
org_im=all_data[i]['org_im'],
org_im_shape=all_data[i]['org_im_shape'],
org_im_path=all_data[i]['org_im_path'],
output_dir=output_dir,
visualization=visualization)
res.append(out)
return res
def save_inference_model(self,
dirname='dcscn_save_model',
model_filename=None,
params_filename=None,
combined=False):
if combined:
model_filename = "__model__" if not model_filename else model_filename
params_filename = "__params__" if not params_filename else params_filename
place = fluid.CPUPlace()
exe = fluid.Executor(place)
program, feeded_var_names, target_vars = fluid.io.load_inference_model(
dirname=self.default_pretrained_model_path, executor=exe)
fluid.io.save_inference_model(
dirname=dirname,
main_program=program,
executor=exe,
feeded_var_names=feeded_var_names,
target_vars=target_vars,
model_filename=model_filename,
params_filename=params_filename)
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.reconstruct(images=images_decode, **kwargs)
results = [{
'data': cv2_to_base64(result['data'])
} for result in results]
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command.
"""
self.parser = argparse.ArgumentParser(
description="Run the {} module.".format(self.name),
prog='hub run {}'.format(self.name),
usage='%(prog)s',
add_help=True)
self.arg_input_group = self.parser.add_argument_group(
title="Input options", description="Input data. Required")
self.arg_config_group = self.parser.add_argument_group(
title="Config options",
description=
"Run configuration for controlling module behavior, not required.")
self.add_module_config_arg()
self.add_module_input_arg()
args = self.parser.parse_args(argvs)
results = self.reconstruct(
paths=[args.input_path],
use_gpu=args.use_gpu,
output_dir=args.output_dir,
visualization=args.visualization)
if args.save_dir is not None:
check_dir(args.save_dir)
self.save_inference_model(args.save_dir)
return results
def add_module_config_arg(self):
"""
Add the command config options.
"""
self.arg_config_group.add_argument(
'--use_gpu',
type=ast.literal_eval,
default=False,
help="whether use GPU or not")
self.arg_config_group.add_argument(
'--output_dir',
type=str,
default='dcscn_output',
help="The directory to save output images.")
self.arg_config_group.add_argument(
'--save_dir',
type=str,
default='dcscn_save_model',
help="The directory to save model.")
self.arg_config_group.add_argument(
'--visualization',
type=ast.literal_eval,
default=True,
help="whether to save output as images.")
def add_module_input_arg(self):
"""
Add the command input options.
"""
self.arg_input_group.add_argument(
'--input_path', type=str, help="path to image.")
if __name__ == "__main__":
module = Dcscn()
#module.reconstruct(paths=["BSD100_001.png","BSD100_002.png"])
import cv2
img = cv2.imread("BSD100_001.png").astype('float32')
res = module.reconstruct(images=[img])
module.save_inference_model()
# -*- coding:utf-8 -*-
import os
import time
import base64
import cv2
import numpy as np
__all__ = ['cv2_to_base64', 'base64_to_cv2', 'postprocess']
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.fromstring(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
def postprocess(data_out, org_im, org_im_shape, org_im_path, output_dir,
visualization):
"""
Postprocess output of network. one image at a time.
Args:
data_out (numpy.ndarray): output of network.
org_im (numpy.ndarray): original image.
        org_im_shape (list): shape of the original image.
        org_im_path (str): path of the original image.
output_dir (str): output directory to store image.
visualization (bool): whether to save image or not.
Returns:
result (dict): The data of processed image.
"""
result = dict()
for sr in data_out:
sr = np.squeeze(sr, 0)
sr = np.clip(sr * 255, 0, 255)
sr = sr.astype(np.uint8)
shape = sr.shape
if visualization:
org_im = cv2.cvtColor(org_im, cv2.COLOR_BGR2YUV)
uv = cv2.resize(
org_im[..., 1:], (shape[1], shape[0]),
interpolation=cv2.INTER_CUBIC)
combine_im = cv2.cvtColor(
np.concatenate((sr, uv), axis=2), cv2.COLOR_YUV2BGR)
check_dir(output_dir)
save_im_path = get_save_image_name(org_im, org_im_path, output_dir)
cv2.imwrite(save_im_path, combine_im)
print("save image at: ", save_im_path)
result['save_path'] = save_im_path
result['data'] = sr
else:
result['data'] = sr
return result
def check_dir(dir_path):
if not os.path.exists(dir_path):
os.makedirs(dir_path)
elif os.path.isfile(dir_path):
os.remove(dir_path)
os.makedirs(dir_path)
def get_save_image_name(org_im, org_im_path, output_dir):
"""
Get save image name from source image path.
"""
    # name prefix of the original image
org_im_name = os.path.split(org_im_path)[-1]
im_prefix = os.path.splitext(org_im_name)[0]
ext = '.png'
# save image path
save_im_path = os.path.join(output_dir, im_prefix + ext)
if os.path.exists(save_im_path):
save_im_path = os.path.join(
output_dir, im_prefix + 'time={}'.format(int(time.time())) + ext)
return save_im_path
## Model Overview
falsr_a is a lightweight super-resolution model based on "Fast, Accurate and Lightweight Super-Resolution with Neural Architecture Search". It treats super resolution as a multi-objective problem and uses an elastic search strategy with a hybrid controller to improve performance. The model performs 2x super resolution.
## Command-Line Prediction
```
$ hub run falsr_a --input_path "/PATH/TO/IMAGE"
```
## API
```python
def reconstruct(self,
images=None,
paths=None,
use_gpu=False,
visualization=False,
output_dir="falsr_a_output")
```
Prediction API for image super resolution.
**Parameters**
* images (list\[numpy.ndarray\]): image data with ndarray.shape \[H, W, C\], in BGR format;
* paths (list\[str\]): paths to the images;
* use\_gpu (bool): whether to predict on GPU; if so, set the CUDA_VISIBLE_DEVICES environment variable before prediction, otherwise it need not be set;
* visualization (bool): whether to save the results as image files;
* output\_dir (str): directory in which to save the images.
**Returns**
* res (list\[dict\]): list of results; each element is a dict whose keys are 'save\_path' and 'data', with values:
* save\_path (str, optional): path of the saved visualization (present only when visualization=True);
* data (numpy.ndarray): the super-resolved image.
```python
def save_inference_model(self,
dirname='falsr_a_save_model',
model_filename=None,
params_filename=None,
combined=False)
```
Save the model to the specified path.
**Parameters**
* dirname: directory in which to save the model
* model\_filename: name of the model file, defaults to \_\_model\_\_
* params\_filename: name of the parameter file, defaults to \_\_params\_\_ (effective only when `combined` is True)
* combined: whether to save all parameters into a single file
## Code Example
```python
import cv2
import paddlehub as hub
sr_model = hub.Module('falsr_a')
im = cv2.imread('/PATH/TO/IMAGE').astype('float32')
# visualization=True saves the super-resolved image for inspection; set it to False for faster runs.
res = sr_model.reconstruct(images=[im], visualization=True)
print(res[0]['data'])
sr_model.save_inference_model()
```
## Service Deployment
PaddleHub Serving can deploy an online image super-resolution service.
## Step 1: Start PaddleHub Serving
Run the start command:
```shell
$ hub serving start -m falsr_a
```
This deploys the super-resolution API as a service; the default port is 8866.
**NOTE:** To predict on GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it need not be set.
## Step 2: Send a Prediction Request
With the server configured, the following few lines of code send a prediction request and retrieve the result.
```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.fromstring(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
# Send an HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/falsr_a"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
sr = base64_to_cv2(r.json()["results"][0]['data'])
cv2.imwrite('falsr_a_X2.png', sr)
print("save image as falsr_a_X2.png")
```
### View Code
https://github.com/xiaomi-automl/FALSR
### Dependencies
paddlepaddle >= 1.8.0
paddlehub >= 1.7.1
# -*- coding:utf-8 -*-
import os
import time
from collections import OrderedDict
import cv2
import numpy as np
from PIL import Image
__all__ = ['reader']
def reader(images=None, paths=None):
"""
Preprocess to yield image.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C]
paths (list[str]): paths to images.
Yield:
each (collections.OrderedDict): info of original image, preprocessed image.
"""
component = list()
if paths:
for im_path in paths:
each = OrderedDict()
assert os.path.isfile(
im_path), "The {} isn't a valid file path.".format(im_path)
im = cv2.imread(im_path).astype('float32')
each['org_im'] = im
each['org_im_path'] = im_path
each['org_im_shape'] = im.shape
component.append(each)
if images is not None:
assert type(images) is list, "images should be a list."
for im in images:
im = im.astype(np.float32)
each = OrderedDict()
each['org_im'] = im
each['org_im_path'] = 'ndarray_time={}'.format(
round(time.time(), 6) * 1e6)
each['org_im_shape'] = im.shape
component.append(each)
for element in component:
img = element['org_im'].copy()
img = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
shape = img.shape
img_scale = cv2.resize(
img, (shape[1] * 2, shape[0] * 2), interpolation=cv2.INTER_CUBIC)
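        # The network consumes the original-resolution Y channel plus bicubically 2x-upscaled chroma (PbPr) channels.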
img_y = np.expand_dims(img[:, :, 0], axis=2)
img_scale_pbpr = img_scale[..., 1:]
img_y = img_y.transpose((2, 0, 1)) / 255
img_scale_pbpr = img_scale_pbpr.transpose(2, 0, 1) / 255
element['img_y'] = img_y
element['img_scale_pbpr'] = img_scale_pbpr
yield element
if __name__ == "__main__":
path = ['BSD100_001.png']
reader(paths=path)
# -*- coding:utf-8 -*-
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import ast
import os
import argparse
import numpy as np
import paddle.fluid as fluid
import paddlehub as hub
from paddle.fluid.core import PaddleTensor, AnalysisConfig, create_paddle_predictor
from paddlehub.module.module import moduleinfo, runnable, serving
from falsr_a.data_feed import reader
from falsr_a.processor import postprocess, base64_to_cv2, cv2_to_base64, check_dir
@moduleinfo(
name="falsr_a",
type="CV/image_editing",
author="paddlepaddle",
author_email="",
summary="falsr_a is a super resolution model.",
version="1.0.0")
class Falsr_A(hub.Module):
def _initialize(self):
self.default_pretrained_model_path = os.path.join(
self.directory, "falsr_a_model")
self._set_config()
def _set_config(self):
"""
predictor config setting
"""
self.model_file_path = self.default_pretrained_model_path
cpu_config = AnalysisConfig(self.model_file_path)
cpu_config.disable_glog_info()
cpu_config.disable_gpu()
self.cpu_predictor = create_paddle_predictor(cpu_config)
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
use_gpu = True
except:
use_gpu = False
if use_gpu:
gpu_config = AnalysisConfig(self.model_file_path)
gpu_config.disable_glog_info()
gpu_config.enable_use_gpu(
memory_pool_init_size_mb=1000, device_id=0)
self.gpu_predictor = create_paddle_predictor(gpu_config)
def reconstruct(self,
images=None,
paths=None,
use_gpu=False,
visualization=False,
output_dir="falsr_a_output"):
"""
API for super resolution.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C], the color space is BGR.
paths (list[str]): The paths of images.
use_gpu (bool): Whether to use gpu.
visualization (bool): Whether to save image or not.
output_dir (str): The path to store output images.
Returns:
res (list[dict]): each element in the list is a dict, the keys and values are:
save_path (str, optional): the path to save images. (Exists only if visualization is True)
data (numpy.ndarray): data of post processed image.
"""
if use_gpu:
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
except:
raise RuntimeError(
"Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
)
all_data = list()
for yield_data in reader(images, paths):
all_data.append(yield_data)
total_num = len(all_data)
res = list()
for i in range(total_num):
image_y = np.array([all_data[i]['img_y']])
image_scale_pbpr = np.array([all_data[i]['img_scale_pbpr']])
image_y = PaddleTensor(image_y.copy())
image_scale_pbpr = PaddleTensor(image_scale_pbpr.copy())
output = self.gpu_predictor.run([
image_y, image_scale_pbpr
]) if use_gpu else self.cpu_predictor.run(
[image_y, image_scale_pbpr])
output = np.expand_dims(output[0].as_ndarray(), axis=1)
out = postprocess(
data_out=output,
org_im=all_data[i]['org_im'],
org_im_shape=all_data[i]['org_im_shape'],
org_im_path=all_data[i]['org_im_path'],
output_dir=output_dir,
visualization=visualization)
res.append(out)
return res
def save_inference_model(self,
dirname='falsr_a_save_model',
model_filename=None,
params_filename=None,
combined=False):
if combined:
model_filename = "__model__" if not model_filename else model_filename
params_filename = "__params__" if not params_filename else params_filename
place = fluid.CPUPlace()
exe = fluid.Executor(place)
program, feeded_var_names, target_vars = fluid.io.load_inference_model(
dirname=self.default_pretrained_model_path, executor=exe)
fluid.io.save_inference_model(
dirname=dirname,
main_program=program,
executor=exe,
feeded_var_names=feeded_var_names,
target_vars=target_vars,
model_filename=model_filename,
params_filename=params_filename)
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.reconstruct(images=images_decode, **kwargs)
results = [{
'data': cv2_to_base64(result['data'])
} for result in results]
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command.
"""
self.parser = argparse.ArgumentParser(
description="Run the {} module.".format(self.name),
prog='hub run {}'.format(self.name),
usage='%(prog)s',
add_help=True)
self.arg_input_group = self.parser.add_argument_group(
title="Input options", description="Input data. Required")
self.arg_config_group = self.parser.add_argument_group(
title="Config options",
description=
"Run configuration for controlling module behavior, not required.")
self.add_module_config_arg()
self.add_module_input_arg()
args = self.parser.parse_args(argvs)
results = self.reconstruct(
paths=[args.input_path],
use_gpu=args.use_gpu,
output_dir=args.output_dir,
visualization=args.visualization)
if args.save_dir is not None:
check_dir(args.save_dir)
self.save_inference_model(args.save_dir)
return results
def add_module_config_arg(self):
"""
Add the command config options.
"""
self.arg_config_group.add_argument(
'--use_gpu',
type=ast.literal_eval,
default=False,
help="whether use GPU or not")
self.arg_config_group.add_argument(
'--output_dir',
type=str,
default='falsr_a_output',
help="The directory to save output images.")
self.arg_config_group.add_argument(
'--save_dir',
type=str,
default='falsr_a_save_model',
help="The directory to save model.")
self.arg_config_group.add_argument(
'--visualization',
type=ast.literal_eval,
default=True,
help="whether to save output as images.")
def add_module_input_arg(self):
"""
Add the command input options.
"""
self.arg_input_group.add_argument(
'--input_path', type=str, help="path to image.")
if __name__ == "__main__":
module = Falsr_A()
module.reconstruct(
paths=["BSD100_001.png", "BSD100_002.png", "Set5_003.png"])
module.save_inference_model()
# -*- coding:utf-8 -*-
import os
import time
import base64
import cv2
import numpy as np
__all__ = ['cv2_to_base64', 'base64_to_cv2', 'postprocess']
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.fromstring(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
def postprocess(data_out, org_im, org_im_shape, org_im_path, output_dir,
visualization):
"""
Postprocess output of network. one image at a time.
Args:
data_out (numpy.ndarray): output of network.
org_im (numpy.ndarray): original image.
        org_im_shape (list): shape of the original image.
        org_im_path (str): path of the original image.
output_dir (str): output directory to store image.
visualization (bool): whether to save image or not.
Returns:
result (dict): The data of processed image.
"""
result = dict()
for sr in data_out:
sr = np.squeeze(sr, 0)
sr = np.clip(sr * 255, 0, 255)
sr = sr.astype(np.uint8)
sr = cv2.cvtColor(sr, cv2.COLOR_RGB2BGR)
if visualization:
check_dir(output_dir)
save_im_path = get_save_image_name(org_im, org_im_path, output_dir)
cv2.imwrite(save_im_path, sr)
print("save image at: ", save_im_path)
result['save_path'] = save_im_path
result['data'] = sr
else:
result['data'] = sr
return result
def check_dir(dir_path):
if not os.path.exists(dir_path):
os.makedirs(dir_path)
elif os.path.isfile(dir_path):
os.remove(dir_path)
os.makedirs(dir_path)
def get_save_image_name(org_im, org_im_path, output_dir):
"""
Get save image name from source image path.
"""
# name prefix of orginal image
org_im_name = os.path.split(org_im_path)[-1]
im_prefix = os.path.splitext(org_im_name)[0]
ext = '.png'
# save image path
save_im_path = os.path.join(output_dir, im_prefix + ext)
if os.path.exists(save_im_path):
save_im_path = os.path.join(
output_dir, im_prefix + 'time={}'.format(int(time.time())) + ext)
return save_im_path
## Model Overview
falsr_b is a lightweight super-resolution model based on "Fast, Accurate and Lightweight Super-Resolution with Neural Architecture Search"; it is even more lightweight than falsr_a. It treats super resolution as a multi-objective problem and uses an elastic search strategy with a hybrid controller to improve performance. The model performs 2x super resolution.
## Command-Line Prediction
```
$ hub run falsr_b --input_path "/PATH/TO/IMAGE"
```
## API
```python
def reconstruct(self,
images=None,
paths=None,
use_gpu=False,
                visualization=False,
output_dir="falsr_b_output")
```
Prediction API for image super resolution.
**Parameters**
* images (list\[numpy.ndarray\]): image data with ndarray.shape \[H, W, C\], in BGR format;
* paths (list\[str\]): paths to the images;
* use\_gpu (bool): whether to predict on GPU; if so, set the CUDA_VISIBLE_DEVICES environment variable before prediction, otherwise it need not be set;
* visualization (bool): whether to save the results as image files;
* output\_dir (str): directory in which to save the images.
**Returns**
* res (list\[dict\]): list of results; each element is a dict whose keys are 'save\_path' and 'data', with values:
* save\_path (str, optional): path of the saved visualization (present only when visualization=True);
* data (numpy.ndarray): the super-resolved image.
```python
def save_inference_model(self,
dirname='falsr_b_save_model',
model_filename=None,
params_filename=None,
combined=False)
```
Save the model to the specified path.
**Parameters**
* dirname: directory in which to save the model
* model\_filename: name of the model file, defaults to \_\_model\_\_
* params\_filename: name of the parameter file, defaults to \_\_params\_\_ (effective only when `combined` is True)
* combined: whether to save all parameters into a single file
## Code Example
```python
import cv2
import paddlehub as hub
sr_model = hub.Module('falsr_b')
im = cv2.imread('/PATH/TO/IMAGE').astype('float32')
# visualization=True saves the super-resolved image for inspection; set it to False for faster runs.
res = sr_model.reconstruct(images=[im], visualization=True)
print(res[0]['data'])
sr_model.save_inference_model()
```
## Service Deployment
PaddleHub Serving can deploy an online image super-resolution service.
## Step 1: Start PaddleHub Serving
Run the start command:
```shell
$ hub serving start -m falsr_b
```
This deploys the super-resolution API as a service; the default port is 8866.
**NOTE:** To predict on GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it need not be set.
## Step 2: Send a Prediction Request
With the server configured, the following few lines of code send a prediction request and retrieve the result.
```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.fromstring(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
# Send an HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/falsr_b"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
sr = base64_to_cv2(r.json()["results"][0]['data'])
cv2.imwrite('falsr_b_X2.png', sr)
print("save image as falsr_b_X2.png")
```
### View Code
https://github.com/xiaomi-automl/FALSR
### Dependencies
paddlepaddle >= 1.8.0
paddlehub >= 1.7.1
# -*- coding:utf-8 -*-
import os
import time
from collections import OrderedDict
import cv2
import numpy as np
from PIL import Image
__all__ = ['reader']
def reader(images=None, paths=None):
"""
Preprocess to yield image.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C]
paths (list[str]): paths to images.
Yield:
each (collections.OrderedDict): info of original image, preprocessed image.
"""
component = list()
if paths:
for im_path in paths:
each = OrderedDict()
assert os.path.isfile(
im_path), "The {} isn't a valid file path.".format(im_path)
im = cv2.imread(im_path).astype('float32')
each['org_im'] = im
each['org_im_path'] = im_path
each['org_im_shape'] = im.shape
component.append(each)
if images is not None:
assert type(images) is list, "images should be a list."
for im in images:
im = im.astype(np.float32)
each = OrderedDict()
each['org_im'] = im
each['org_im_path'] = 'ndarray_time={}'.format(
round(time.time(), 6) * 1e6)
each['org_im_shape'] = im.shape
component.append(each)
for element in component:
img = element['org_im'].copy()
img = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
shape = img.shape
img_scale = cv2.resize(
img, (shape[1] * 2, shape[0] * 2), interpolation=cv2.INTER_CUBIC)
img_y = np.expand_dims(img[:, :, 0], axis=2)
img_scale_pbpr = img_scale[..., 1:]
img_y = img_y.transpose((2, 0, 1)) / 255
img_scale_pbpr = img_scale_pbpr.transpose(2, 0, 1) / 255
element['img_y'] = img_y
element['img_scale_pbpr'] = img_scale_pbpr
yield element
if __name__ == "__main__":
path = ['BSD100_001.png']
reader(paths=path)
# -*- coding:utf-8 -*-
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import ast
import os
import argparse
import numpy as np
import paddle.fluid as fluid
import paddlehub as hub
from paddle.fluid.core import PaddleTensor, AnalysisConfig, create_paddle_predictor
from paddlehub.module.module import moduleinfo, runnable, serving
from falsr_b.data_feed import reader
from falsr_b.processor import postprocess, base64_to_cv2, cv2_to_base64, check_dir
@moduleinfo(
name="falsr_b",
type="CV/image_editing",
author="paddlepaddle",
author_email="",
summary="falsr_b is a super resolution model.",
version="1.0.0")
class Falsr_B(hub.Module):
def _initialize(self):
self.default_pretrained_model_path = os.path.join(
self.directory, "falsr_b_model")
self._set_config()
def _set_config(self):
"""
predictor config setting
"""
self.model_file_path = self.default_pretrained_model_path
cpu_config = AnalysisConfig(self.model_file_path)
cpu_config.disable_glog_info()
cpu_config.disable_gpu()
self.cpu_predictor = create_paddle_predictor(cpu_config)
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
use_gpu = True
except:
use_gpu = False
if use_gpu:
gpu_config = AnalysisConfig(self.model_file_path)
gpu_config.disable_glog_info()
gpu_config.enable_use_gpu(
memory_pool_init_size_mb=1000, device_id=0)
self.gpu_predictor = create_paddle_predictor(gpu_config)
def reconstruct(self,
images=None,
paths=None,
use_gpu=False,
visualization=False,
output_dir="falsr_b_output"):
"""
API for super resolution.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C], the color space is BGR.
paths (list[str]): The paths of images.
use_gpu (bool): Whether to use gpu.
visualization (bool): Whether to save image or not.
output_dir (str): The path to store output images.
Returns:
res (list[dict]): each element in the list is a dict, the keys and values are:
save_path (str, optional): the path to save images. (Exists only if visualization is True)
data (numpy.ndarray): data of post processed image.
"""
if use_gpu:
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
except:
raise RuntimeError(
"Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
)
all_data = list()
for yield_data in reader(images, paths):
all_data.append(yield_data)
total_num = len(all_data)
res = list()
for i in range(total_num):
image_y = np.array([all_data[i]['img_y']])
image_scale_pbpr = np.array([all_data[i]['img_scale_pbpr']])
image_y = PaddleTensor(image_y.copy())
image_scale_pbpr = PaddleTensor(image_scale_pbpr.copy())
output = self.gpu_predictor.run([
image_y, image_scale_pbpr
]) if use_gpu else self.cpu_predictor.run(
[image_y, image_scale_pbpr])
output = np.expand_dims(output[0].as_ndarray(), axis=1)
out = postprocess(
data_out=output,
org_im=all_data[i]['org_im'],
org_im_shape=all_data[i]['org_im_shape'],
org_im_path=all_data[i]['org_im_path'],
output_dir=output_dir,
visualization=visualization)
res.append(out)
return res
def save_inference_model(self,
dirname='falsr_b_save_model',
model_filename=None,
params_filename=None,
combined=False):
if combined:
model_filename = "__model__" if not model_filename else model_filename
params_filename = "__params__" if not params_filename else params_filename
place = fluid.CPUPlace()
exe = fluid.Executor(place)
program, feeded_var_names, target_vars = fluid.io.load_inference_model(
dirname=self.default_pretrained_model_path, executor=exe)
fluid.io.save_inference_model(
dirname=dirname,
main_program=program,
executor=exe,
feeded_var_names=feeded_var_names,
target_vars=target_vars,
model_filename=model_filename,
params_filename=params_filename)
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.reconstruct(images=images_decode, **kwargs)
results = [{
'data': cv2_to_base64(result['data'])
} for result in results]
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command.
"""
self.parser = argparse.ArgumentParser(
description="Run the {} module.".format(self.name),
prog='hub run {}'.format(self.name),
usage='%(prog)s',
add_help=True)
self.arg_input_group = self.parser.add_argument_group(
title="Input options", description="Input data. Required")
self.arg_config_group = self.parser.add_argument_group(
title="Config options",
description=
"Run configuration for controlling module behavior, not required.")
self.add_module_config_arg()
self.add_module_input_arg()
args = self.parser.parse_args(argvs)
results = self.reconstruct(
paths=[args.input_path],
use_gpu=args.use_gpu,
output_dir=args.output_dir,
visualization=args.visualization)
if args.save_dir is not None:
check_dir(args.save_dir)
self.save_inference_model(args.save_dir)
return results
def add_module_config_arg(self):
"""
Add the command config options.
"""
self.arg_config_group.add_argument(
'--use_gpu',
type=ast.literal_eval,
default=False,
help="whether use GPU or not")
self.arg_config_group.add_argument(
'--output_dir',
type=str,
default='falsr_b_output',
help="The directory to save output images.")
self.arg_config_group.add_argument(
'--save_dir',
type=str,
default='falsr_b_save_model',
help="The directory to save model.")
self.arg_config_group.add_argument(
'--visualization',
type=ast.literal_eval,
default=True,
help="whether to save output as images.")
def add_module_input_arg(self):
"""
Add the command input options.
"""
self.arg_input_group.add_argument(
'--input_path', type=str, help="path to image.")
if __name__ == "__main__":
module = Falsr_B()
module.reconstruct(
paths=["BSD100_001.png", "BSD100_002.png", "Set5_003.png"])
module.save_inference_model()
# -*- coding:utf-8 -*-
import os
import time
import base64
import cv2
import numpy as np
__all__ = ['cv2_to_base64', 'base64_to_cv2', 'postprocess']
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.fromstring(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
def postprocess(data_out, org_im, org_im_shape, org_im_path, output_dir,
visualization):
"""
Postprocess output of network. one image at a time.
Args:
data_out (numpy.ndarray): output of network.
org_im (numpy.ndarray): original image.
        org_im_shape (list): shape of the original image.
        org_im_path (str): path of the original image.
output_dir (str): output directory to store image.
visualization (bool): whether to save image or not.
Returns:
result (dict): The data of processed image.
"""
result = dict()
for sr in data_out:
sr = np.squeeze(sr, 0)
sr = np.clip(sr * 255, 0, 255)
sr = sr.astype(np.uint8)
sr = cv2.cvtColor(sr, cv2.COLOR_RGB2BGR)
if visualization:
check_dir(output_dir)
save_im_path = get_save_image_name(org_im, org_im_path, output_dir)
cv2.imwrite(save_im_path, sr)
print("save image at: ", save_im_path)
result['save_path'] = save_im_path
result['data'] = sr
else:
result['data'] = sr
return result
def check_dir(dir_path):
if not os.path.exists(dir_path):
os.makedirs(dir_path)
elif os.path.isfile(dir_path):
os.remove(dir_path)
os.makedirs(dir_path)
def get_save_image_name(org_im, org_im_path, output_dir):
"""
Get save image name from source image path.
"""
    # name prefix of the original image
org_im_name = os.path.split(org_im_path)[-1]
im_prefix = os.path.splitext(org_im_name)[0]
ext = '.png'
# save image path
save_im_path = os.path.join(output_dir, im_prefix + ext)
if os.path.exists(save_im_path):
save_im_path = os.path.join(
output_dir, im_prefix + 'time={}'.format(int(time.time())) + ext)
return save_im_path
## Model Overview
falsr_c is a lightweight super-resolution model based on "Fast, Accurate and Lightweight Super-Resolution with Neural Architecture Search". It treats super resolution as a multi-objective problem and uses an elastic search strategy with a hybrid controller to improve performance. The model performs 2x super resolution.
## Command-Line Prediction
```
$ hub run falsr_c --input_path "/PATH/TO/IMAGE"
```
## API
```python
def reconstruct(self,
images=None,
paths=None,
use_gpu=False,
visualization=False,
output_dir="falsr_c_output")
```
Prediction API for image super resolution.
**Parameters**
* images (list\[numpy.ndarray\]): image data with ndarray.shape \[H, W, C\], in BGR format;
* paths (list\[str\]): paths to the images;
* use\_gpu (bool): whether to predict on GPU; if so, set the CUDA_VISIBLE_DEVICES environment variable before prediction, otherwise it need not be set;
* visualization (bool): whether to save the results as image files;
* output\_dir (str): directory in which to save the images.
**Returns**
* res (list\[dict\]): list of results; each element is a dict whose keys are 'save\_path' and 'data', with values:
* save\_path (str, optional): path of the saved visualization (present only when visualization=True);
* data (numpy.ndarray): the super-resolved image.
```python
def save_inference_model(self,
dirname='falsr_c_save_model',
model_filename=None,
params_filename=None,
combined=False)
```
Saves the model to the specified directory.
**Parameters**
* dirname: directory in which to save the model
* model\_filename: name of the model file, defaults to \_\_model\_\_
* params\_filename: name of the parameters file, defaults to \_\_params\_\_ (effective only when `combined` is True)
* combined: whether to save the parameters in a single unified file
## Code Example
```python
import cv2
import paddlehub as hub
sr_model = hub.Module('falsr_c')
im = cv2.imread('/PATH/TO/IMAGE').astype('float32')
# visualization=True saves the super-resolved image so you can inspect the result; set it to False for faster runs.
res = sr_model.reconstruct(images=[im], visualization=True)
print(res[0]['data'])
sr_model.save_inference_model()
```
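To consume the exported model without PaddleHub, you can load it back with the same fluid API that `save_inference_model` relies on (a minimal sketch, assuming the default `falsr_c_save_model` directory produced above):
```python
import paddle.fluid as fluid

place = fluid.CPUPlace()
exe = fluid.Executor(place)
# returns the inference program, the names of its feed variables, and the fetch targets
program, feed_names, fetch_targets = fluid.io.load_inference_model(
    dirname='falsr_c_save_model', executor=exe)
print(feed_names)
```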
## Service Deployment
PaddleHub Serving can deploy an online image super-resolution service.
## Step 1: Start PaddleHub Serving
Run the start command:
```shell
$ hub serving start -m falsr_c
```
This deploys a super-resolution service API; the default port is 8866.
**NOTE:** To predict on the GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it need not be set.
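For example, with a single visible GPU (device id 0 is an assumed setup):
```shell
$ export CUDA_VISIBLE_DEVICES=0
$ hub serving start -m falsr_c
```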
## Step 2: Send a Prediction Request
With the server up, the few lines of code below send a prediction request and retrieve the result:
```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
# send the HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/falsr_c"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
sr = base64_to_cv2(r.json()["results"][0]['data'])
cv2.imwrite('falsr_c_X2.png', sr)
print("save image as falsr_c_X2.png")
```
### View Code
https://github.com/xiaomi-automl/FALSR
### Dependencies
paddlepaddle >= 1.8.0
paddlehub >= 1.7.1
# -*- coding:utf-8 -*-
import os
import time
from collections import OrderedDict
import cv2
import numpy as np
from PIL import Image
__all__ = ['reader']
def reader(images=None, paths=None):
"""
Preprocess to yield image.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C]
paths (list[str]): paths to images.
Yield:
each (collections.OrderedDict): info of original image, preprocessed image.
"""
component = list()
if paths:
for im_path in paths:
each = OrderedDict()
assert os.path.isfile(
im_path), "The {} isn't a valid file path.".format(im_path)
im = cv2.imread(im_path).astype('float32')
each['org_im'] = im
each['org_im_path'] = im_path
each['org_im_shape'] = im.shape
component.append(each)
if images is not None:
assert type(images) is list, "images should be a list."
for im in images:
im = im.astype(np.float32)
each = OrderedDict()
each['org_im'] = im
each['org_im_path'] = 'ndarray_time={}'.format(
round(time.time(), 6) * 1e6)
each['org_im_shape'] = im.shape
component.append(each)
for element in component:
img = element['org_im'].copy()
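        # FALSR works in YUV space: the network super-resolves only the Y (luma) channel,
        # while the chroma channels are bicubic-upscaled directly to the 2x target size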
img = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
shape = img.shape
img_scale = cv2.resize(
img, (shape[1] * 2, shape[0] * 2), interpolation=cv2.INTER_CUBIC)
img_y = np.expand_dims(img[:, :, 0], axis=2)
img_scale_pbpr = img_scale[..., 1:]
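        # normalize to [0, 1] and switch to channel-first (CHW) layout for the predictor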
img_y = img_y.transpose((2, 0, 1)) / 255
img_scale_pbpr = img_scale_pbpr.transpose(2, 0, 1) / 255
element['img_y'] = img_y
element['img_scale_pbpr'] = img_scale_pbpr
yield element
if __name__ == "__main__":
    path = ['BSD100_001.png']
    # reader() is a generator, so iterate to actually run the preprocessing
    for each in reader(paths=path):
        print(each['org_im_path'], each['org_im_shape'])
# -*- coding:utf-8 -*-
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import ast
import os
import argparse
import numpy as np
import paddle.fluid as fluid
import paddlehub as hub
from paddle.fluid.core import PaddleTensor, AnalysisConfig, create_paddle_predictor
from paddlehub.module.module import moduleinfo, runnable, serving
from falsr_c.data_feed import reader
from falsr_c.processor import postprocess, base64_to_cv2, cv2_to_base64, check_dir
@moduleinfo(
name="falsr_c",
type="CV/image_editing",
author="paddlepaddle",
author_email="",
summary="falsr_c is a super resolution model.",
version="1.0.0")
class Falsr_C(hub.Module):
def _initialize(self):
self.default_pretrained_model_path = os.path.join(
self.directory, "falsr_c_model")
self._set_config()
def _set_config(self):
"""
predictor config setting
"""
self.model_file_path = self.default_pretrained_model_path
cpu_config = AnalysisConfig(self.model_file_path)
cpu_config.disable_glog_info()
cpu_config.disable_gpu()
self.cpu_predictor = create_paddle_predictor(cpu_config)
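        # build a GPU predictor only when CUDA_VISIBLE_DEVICES points at a usable device id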
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
use_gpu = True
        except Exception:
use_gpu = False
if use_gpu:
gpu_config = AnalysisConfig(self.model_file_path)
gpu_config.disable_glog_info()
gpu_config.enable_use_gpu(
memory_pool_init_size_mb=1000, device_id=0)
self.gpu_predictor = create_paddle_predictor(gpu_config)
def reconstruct(self,
images=None,
paths=None,
use_gpu=False,
visualization=False,
output_dir="falsr_c_output"):
"""
API for super resolution.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C], the color space is BGR.
paths (list[str]): The paths of images.
use_gpu (bool): Whether to use gpu.
visualization (bool): Whether to save image or not.
output_dir (str): The path to store output images.
Returns:
res (list[dict]): each element in the list is a dict, the keys and values are:
save_path (str, optional): the path to save images. (Exists only if visualization is True)
data (numpy.ndarray): data of post processed image.
"""
if use_gpu:
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
            except Exception:
                raise RuntimeError(
                    "Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you want to use the GPU, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
                )
all_data = list()
for yield_data in reader(images, paths):
all_data.append(yield_data)
total_num = len(all_data)
res = list()
for i in range(total_num):
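            # feed the luma channel and the bicubic-upscaled chroma channels prepared by reader()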
image_y = np.array([all_data[i]['img_y']])
image_scale_pbpr = np.array([all_data[i]['img_scale_pbpr']])
image_y = PaddleTensor(image_y.copy())
image_scale_pbpr = PaddleTensor(image_scale_pbpr.copy())
output = self.gpu_predictor.run([
image_y, image_scale_pbpr
]) if use_gpu else self.cpu_predictor.run(
[image_y, image_scale_pbpr])
output = np.expand_dims(output[0].as_ndarray(), axis=1)
out = postprocess(
data_out=output,
org_im=all_data[i]['org_im'],
org_im_shape=all_data[i]['org_im_shape'],
org_im_path=all_data[i]['org_im_path'],
output_dir=output_dir,
visualization=visualization)
res.append(out)
return res
def save_inference_model(self,
dirname='falsr_c_save_model',
model_filename=None,
params_filename=None,
combined=False):
if combined:
model_filename = "__model__" if not model_filename else model_filename
params_filename = "__params__" if not params_filename else params_filename
place = fluid.CPUPlace()
exe = fluid.Executor(place)
program, feeded_var_names, target_vars = fluid.io.load_inference_model(
dirname=self.default_pretrained_model_path, executor=exe)
fluid.io.save_inference_model(
dirname=dirname,
main_program=program,
executor=exe,
feeded_var_names=feeded_var_names,
target_vars=target_vars,
model_filename=model_filename,
params_filename=params_filename)
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.reconstruct(images=images_decode, **kwargs)
results = [{
'data': cv2_to_base64(result['data'])
} for result in results]
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command.
"""
self.parser = argparse.ArgumentParser(
description="Run the {} module.".format(self.name),
prog='hub run {}'.format(self.name),
usage='%(prog)s',
add_help=True)
self.arg_input_group = self.parser.add_argument_group(
title="Input options", description="Input data. Required")
self.arg_config_group = self.parser.add_argument_group(
title="Config options",
description=
"Run configuration for controlling module behavior, not required.")
self.add_module_config_arg()
self.add_module_input_arg()
args = self.parser.parse_args(argvs)
results = self.reconstruct(
paths=[args.input_path],
use_gpu=args.use_gpu,
output_dir=args.output_dir,
visualization=args.visualization)
if args.save_dir is not None:
check_dir(args.save_dir)
self.save_inference_model(args.save_dir)
return results
def add_module_config_arg(self):
"""
Add the command config options.
"""
self.arg_config_group.add_argument(
'--use_gpu',
type=ast.literal_eval,
default=False,
help="whether use GPU or not")
self.arg_config_group.add_argument(
'--output_dir',
type=str,
default='falsr_c_output',
help="The directory to save output images.")
self.arg_config_group.add_argument(
'--save_dir',
type=str,
default='falsr_c_save_model',
help="The directory to save model.")
self.arg_config_group.add_argument(
'--visualization',
type=ast.literal_eval,
default=True,
help="whether to save output as images.")
def add_module_input_arg(self):
"""
Add the command input options.
"""
self.arg_input_group.add_argument(
'--input_path', type=str, help="path to image.")
if __name__ == "__main__":
module = Falsr_C()
    import cv2
img = cv2.imread("BSD100_001.png").astype('float32')
res = module.reconstruct(images=[img])
module.save_inference_model()
# -*- coding:utf-8 -*-
import os
import time
import base64
import cv2
import numpy as np
__all__ = ['cv2_to_base64', 'base64_to_cv2', 'postprocess']
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
def postprocess(data_out, org_im, org_im_shape, org_im_path, output_dir,
visualization):
"""
Postprocess output of network. one image at a time.
Args:
data_out (numpy.ndarray): output of network.
org_im (numpy.ndarray): original image.
        org_im_shape (list): shape of original image.
        org_im_path (list): path of original image.
output_dir (str): output directory to store image.
visualization (bool): whether to save image or not.
Returns:
result (dict): The data of processed image.
"""
result = dict()
for sr in data_out:
sr = np.squeeze(sr, 0)
sr = np.clip(sr * 255, 0, 255)
sr = sr.astype(np.uint8)
sr = cv2.cvtColor(sr, cv2.COLOR_RGB2BGR)
if visualization:
check_dir(output_dir)
save_im_path = get_save_image_name(org_im, org_im_path, output_dir)
cv2.imwrite(save_im_path, sr)
print("save image at: ", save_im_path)
result['save_path'] = save_im_path
result['data'] = sr
        else:
            result['data'] = sr
return result
def check_dir(dir_path):
if not os.path.exists(dir_path):
os.makedirs(dir_path)
elif os.path.isfile(dir_path):
os.remove(dir_path)
os.makedirs(dir_path)
def get_save_image_name(org_im, org_im_path, output_dir):
"""
Get save image name from source image path.
"""
    # name prefix of the original image
org_im_name = os.path.split(org_im_path)[-1]
im_prefix = os.path.splitext(org_im_name)[0]
ext = '.png'
# save image path
save_im_path = os.path.join(output_dir, im_prefix + ext)
if os.path.exists(save_im_path):
save_im_path = os.path.join(
output_dir, im_prefix + 'time={}'.format(int(time.time())) + ext)
return save_im_path
name: dcscn
dir: "modules/image/super_resolution/dcscn"
exclude:
- README.md
resources:
-
url: https://bj.bcebos.com/paddlehub/model/image/image_editing/dcscn_model.tar.gz
dest: .
uncompress: True
name: falsr_b
dir: "modules/image/super_resolution/falsr_b"
exclude:
- README.md
resources:
-
url: https://bj.bcebos.com/paddlehub/model/image/image_editing/falsr_B_model.tar.gz
dest: .
uncompress: True
# coding=utf-8
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import time
import unittest
import cv2
import numpy as np
import paddle.fluid as fluid
import paddlehub as hub
imgpath = [
'../image_dataset/super_resolution/BSD100_001.png',
'../image_dataset/super_resolution/BSD100_002.png',
'../image_dataset/super_resolution/BSD100_003.png',
]
class TestDCSCN(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        """Prepare the environment once before execution of all tests."""
        cls.sr_model = hub.Module(name="dcscn")
    @classmethod
    def tearDownClass(cls):
        """Clean up the environment after the execution of all tests."""
        cls.sr_model = None
    def setUp(self):
        """Call setUp() to prepare the environment."""
        self.test_prog = fluid.Program()
    def tearDown(self):
        """Call tearDown() to restore the environment."""
        self.test_prog = None
def test_single_pic(self):
with fluid.program_guard(self.test_prog):
img = cv2.imread(imgpath[0])
            result = self.sr_model.reconstruct(
images=[img], use_gpu=False, visualization=True)
print(result[0]['data'])
def test_ndarray(self):
with fluid.program_guard(self.test_prog):
for pic_path in imgpath:
img = cv2.imread(pic_path)
                result = self.sr_model.reconstruct(
images=[img],
output_dir='test_dcscn_model_output',
use_gpu=False,
visualization=True)
def test_save_inference_model(self):
with fluid.program_guard(self.test_prog):
self.sr_model.save_inference_model(
dirname='test_dcscn_model', combined=True)
if __name__ == "__main__":
suite = unittest.TestSuite()
    suite.addTest(TestDCSCN('test_single_pic'))
    suite.addTest(TestDCSCN('test_ndarray'))
    suite.addTest(TestDCSCN('test_save_inference_model'))
runner = unittest.TextTestRunner(verbosity=2)
runner.run(suite)
# coding=utf-8
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import time
import unittest
import cv2
import numpy as np
import paddle.fluid as fluid
import paddlehub as hub
imgpath = [
'../image_dataset/super_resolution/BSD100_001.png',
'../image_dataset/super_resolution/BSD100_002.png',
'../image_dataset/super_resolution/BSD100_003.png',
]
class TestFalsrA(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        """Prepare the environment once before execution of all tests."""
        cls.sr_model = hub.Module(name="falsr_A")
    @classmethod
    def tearDownClass(cls):
        """Clean up the environment after the execution of all tests."""
        cls.sr_model = None
    def setUp(self):
        """Call setUp() to prepare the environment."""
        self.test_prog = fluid.Program()
    def tearDown(self):
        """Call tearDown() to restore the environment."""
        self.test_prog = None
def test_single_pic(self):
with fluid.program_guard(self.test_prog):
img = cv2.imread(imgpath[0])
            result = self.sr_model.reconstruct(
images=[img], use_gpu=False, visualization=True)
print(result[0]['data'])
def test_ndarray(self):
with fluid.program_guard(self.test_prog):
for pic_path in imgpath:
img = cv2.imread(pic_path)
                result = self.sr_model.reconstruct(
images=[img],
output_dir='test_falsr_A_model_output',
use_gpu=False,
visualization=True)
def test_save_inference_model(self):
with fluid.program_guard(self.test_prog):
self.sr_model.save_inference_model(
dirname='test_falsr_A_model', combined=True)
if __name__ == "__main__":
suite = unittest.TestSuite()
    suite.addTest(TestFalsrA('test_single_pic'))
    suite.addTest(TestFalsrA('test_ndarray'))
    suite.addTest(TestFalsrA('test_save_inference_model'))
runner = unittest.TextTestRunner(verbosity=2)
runner.run(suite)
# coding=utf-8
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import time
import unittest
import cv2
import numpy as np
import paddle.fluid as fluid
import paddlehub as hub
imgpath = [
'../image_dataset/super_resolution/BSD100_001.png',
'../image_dataset/super_resolution/BSD100_002.png',
'../image_dataset/super_resolution/BSD100_003.png',
]
class TestFalsrB(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        """Prepare the environment once before execution of all tests."""
        cls.sr_model = hub.Module(name="falsr_B")
    @classmethod
    def tearDownClass(cls):
        """Clean up the environment after the execution of all tests."""
        cls.sr_model = None
    def setUp(self):
        """Call setUp() to prepare the environment."""
        self.test_prog = fluid.Program()
    def tearDown(self):
        """Call tearDown() to restore the environment."""
        self.test_prog = None
def test_single_pic(self):
with fluid.program_guard(self.test_prog):
img = cv2.imread(imgpath[0])
            result = self.sr_model.reconstruct(
images=[img], use_gpu=False, visualization=True)
print(result[0]['data'])
def test_ndarray(self):
with fluid.program_guard(self.test_prog):
for pic_path in imgpath:
img = cv2.imread(pic_path)
                result = self.sr_model.reconstruct(
images=[img],
output_dir='test_falsr_B_model_output',
use_gpu=False,
visualization=True)
def test_save_inference_model(self):
with fluid.program_guard(self.test_prog):
self.sr_model.save_inference_model(
dirname='test_falsr_B_model', combined=True)
if __name__ == "__main__":
suite = unittest.TestSuite()
    suite.addTest(TestFalsrB('test_single_pic'))
    suite.addTest(TestFalsrB('test_ndarray'))
    suite.addTest(TestFalsrB('test_save_inference_model'))
runner = unittest.TextTestRunner(verbosity=2)
runner.run(suite)