Running the sample programs in this directory requires PaddlePaddle v0.10.0. If your installed version of PaddlePaddle is older, please update it following the instructions in the [installation documentation](http://www.paddlepaddle.org/docs/develop/documentation/en/build_and_install/pip_install_en.html).
---
# Deep Factorization Machine for Click-Through Rate Prediction
## Introduction
This model implements the DeepFM model proposed in the following paper:
```text
@inproceedings{guo2017deepfm,
title={DeepFM: A Factorization-Machine based Neural Network for CTR Prediction},
author={Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li and Xiuqiang He},
booktitle={the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI)},
pages={1725--1731},
year={2017}
}
```
The DeepFM model combines the low-order feature interactions captured by a factorization machine with the high-order interactions learned by a deep neural network. For details on factorization machines, see the paper [Factorization Machines](https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf).
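For reference, the FM component scores an input vector $x$ as (following Rendle's formulation):

$$\hat{y}_{FM}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle x_i x_j$$

where the linear weights $w$ capture first-order feature importance and the latent vectors $v_i$ (whose dimension is the `factor_size` used below) capture the pairwise, second-order interactions.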
## Dataset
This example uses the Criteo dataset from the Kaggle [Display Advertising Challenge](https://www.kaggle.com/c/criteo-display-ad-challenge/).
Each row describes one ad impression: the first column is a label indicating whether the impression was clicked, followed by 39 features, of which 13 take integer values and the other 26 are categorical. The test set has no labels.
Download the dataset:
```bash
cd data && ./download.sh && cd ..
```
## Model
The DeepFM model consists of a factorization machine (FM) and a deep neural network (DNN). Every input feature is fed to both the FM and the DNN, and their outputs are combined to form the final prediction. The embedding layer that the DNN builds from the sparse features shares its parameters with the latent vectors (factors) of the FM layer.
The factorization machine layer in PaddlePaddle computes the second-order feature interactions. The following example combines the factorization machine layer with a fully connected layer to form a complete factorization machine:
```python
def fm_layer(input, factor_size):
first_order = paddle.layer.fc(input=input, size=1, act=paddle.activation.Linear())
second_order = paddle.layer.factorization_machine(input=input, factor_size=factor_size)
fm = paddle.layer.addto(input=[first_order, second_order],
act=paddle.activation.Linear(),
bias_attr=False)
return fm
```
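A hypothetical way to wire this layer up; the input name and the 13-dimensional dense vector below are illustrative assumptions, not taken from this example's `train.py`:
```python
# Hypothetical usage sketch; 'dense_input' and its width are assumptions.
dense_input = paddle.layer.data(
    name='dense_input', type=paddle.data_type.dense_vector(13))
fm = fm_layer(dense_input, factor_size=10)
```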
## Data preparation
Preprocess the raw dataset: integer features are scaled into [0, 1] with min-max normalization, and categorical features are one-hot encoded. The raw dataset is split into two parts: 90% for training and the remaining 10% for validation during training.
```bash
python preprocess.py --datadir ./data/raw --outdir ./data
```
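As a rough sketch of the two transformations described above (illustrative helpers only; `preprocess.py` may organize this differently):
```python
# Illustrative sketch of the preprocessing, not the actual preprocess.py.
def min_max_normalize(value, col_min, col_max):
    # scale an integer feature into [0, 1]
    if col_max == col_min:
        return 0.0
    return (value - col_min) / float(col_max - col_min)


def one_hot(index, size):
    # encode a categorical feature id as a one-hot vector
    vec = [0.0] * size
    vec[index] = 1.0
    return vec
```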
## Training
The command-line options for training can be listed with `python train.py -h`.
Train the model:
```bash
python train.py \
--train_data_path data/train.txt \
--test_data_path data/valid.txt \
2>&1 | tee train.log
```
After training for 40,000 batches in the 9th pass, the test AUC reaches 0.807178 and the cost is 0.445196.
## Inference
The command-line options for inference can be listed with `python infer.py -h`.
Run inference on the test set:
```bash
python infer.py \
--model_gz_path models/model-pass-9-batch-10000.tar.gz \
--data_path data/test.txt \
--prediction_output_path ./predict.txt
```
The code sample in this directory requires the latest develop branch of PaddlePaddle. If you are on an earlier version, [please update your installation](http://www.paddlepaddle.org/docs/develop/documentation/en/build_and_install/pip_install_en.html).
---
### TODO
## Deep Automatic Speech Recognition
This project is still under active development.
### Introduction
TBD
### Installation
#### Kaldi
The decoder depends on [kaldi](https://github.com/kaldi-asr/kaldi); install it by following its instructions. Then
```shell
export KALDI_ROOT=<absolute path to kaldi>
```
#### Decoder
```shell
git clone https://github.com/PaddlePaddle/models.git
cd models/fluid/DeepASR/decoder
sh setup.sh
```
### Data preprocessing
TBD
### Training
TBD
### Inference & Decoding
TBD
### Question and Contribution
TBD
@@ -8,6 +8,7 @@ import numpy as np
import data_utils.augmentor.trans_mean_variance_norm as trans_mean_variance_norm
import data_utils.augmentor.trans_add_delta as trans_add_delta
import data_utils.augmentor.trans_splice as trans_splice
import data_utils.augmentor.trans_delay as trans_delay
class TestTransMeanVarianceNorm(unittest.TestCase):
@@ -112,5 +113,24 @@ class TestTransSplict(unittest.TestCase):
self.assertAlmostEqual(feature[i][j * 10 + k], cur_val)
class TestTransDelay(unittest.TestCase):
"""unittest TransDelay
"""
def test_perform(self):
label = np.zeros((10, 1), dtype="int64")
for i in xrange(10):
label[i][0] = i
trans = trans_delay.TransDelay(5)
(_, label, _) = trans.perform_trans((None, label, None))
for i in xrange(5):
self.assertAlmostEqual(label[i + 5][0], i)
for i in xrange(5):
self.assertAlmostEqual(label[i][0], 0)
if __name__ == '__main__':
unittest.main()
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import math
class TransDelay(object):
""" Delay label, and copy first label value in the front.
Attributes:
_delay_time : the delay frame num of label
"""
def __init__(self, delay_time):
"""init construction
Args:
delay_time : the delay frame num of label
"""
self._delay_time = delay_time
def perform_trans(self, sample):
"""
Args:
sample(object): input sample, containing the feature array, the label array, and the sample name list
Returns:
(feature, label, name)
"""
(feature, label, name) = sample
shape = label.shape
assert len(shape) == 2
label[self._delay_time:shape[0]] = label[0:shape[0] - self._delay_time]
for i in xrange(self._delay_time):
label[i][0] = label[self._delay_time][0]
return (feature, label, name)
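A quick sanity check of the transformation above, mirroring the unit test earlier in this diff:
```python
import numpy as np
import data_utils.augmentor.trans_delay as trans_delay

label = np.arange(10, dtype="int64").reshape(10, 1)  # labels 0..9
trans = trans_delay.TransDelay(5)
(_, label, _) = trans.perform_trans((None, label, None))
# Frames 5..9 now hold the original labels 0..4, and frames 0..4 are
# back-filled with the original first label (0).
print(label.flatten())  # [0 0 0 0 0 0 1 2 3 4]
```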
data_dir=~/.cache/paddle/dataset/speech/deep_asr_data/aishell
data_url='http://deep-asr-data.gz.bcebos.com/aishell_data.tar.gz'
lst_url='http://deep-asr-data.gz.bcebos.com/aishell_lst.tar.gz'
-md5=e017d858d9e509c8a84b73f673f08b9a
md5=17669b8d63331c9326f4a9393d289bfb
if [ ! -e $data_dir ]; then
mkdir -p $data_dir
......
-export CUDA_VISIBLE_DEVICES=2,3,4,5
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -u ../../tools/profile.py --feature_lst data/train_feature.lst \
--label_lst data/train_label.lst \
--mean_var data/aishell/global_mean_var \
--parallel \
-                --frame_dim 2640 \
-                --class_num 101 \
--frame_dim 80 \
--class_num 3040 \
-export CUDA_VISIBLE_DEVICES=2,3,4,5
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -u ../../train.py --train_feature_lst data/train_feature.lst \
--train_label_lst data/train_label.lst \
--val_feature_lst data/val_feature.lst \
--val_label_lst data/val_label.lst \
--mean_var data/aishell/global_mean_var \
--checkpoints checkpoints \
-                --frame_dim 2640 \
-                --class_num 101 \
--frame_dim 80 \
--class_num 3040 \
--infer_models '' \
-                --batch_size 128 \
-                --learning_rate 0.00016 \
--batch_size 64 \
--learning_rate 6.4e-5 \
--parallel
@@ -12,6 +12,7 @@ import paddle.fluid as fluid
import data_utils.augmentor.trans_mean_variance_norm as trans_mean_variance_norm
import data_utils.augmentor.trans_add_delta as trans_add_delta
import data_utils.augmentor.trans_splice as trans_splice
import data_utils.augmentor.trans_delay as trans_delay
import data_utils.async_data_reader as reader
from decoder.post_decode_faster import Decoder
from data_utils.util import lodtensor_to_ndarray
@@ -36,7 +37,7 @@ def parse_args():
parser.add_argument(
'--frame_dim',
type=int,
-        default=120 * 11,
default=80,
help='Frame dimension of feature data. (default: %(default)d)')
parser.add_argument(
'--stacked_num',
@@ -179,7 +180,7 @@ def infer_from_ckpt(args):
ltrans = [
trans_add_delta.TransAddDelta(2, 2),
trans_mean_variance_norm.TransMeanVarianceNorm(args.mean_var),
-        trans_splice.TransSplice()
trans_splice.TransSplice(), trans_delay.TransDelay(5)
]
feature_t = fluid.LoDTensor()
......
@@ -32,25 +32,23 @@ def stacked_lstmp_model(frame_dim,
# network configuration
def _net_conf(feature, label):
-        seq_conv1 = fluid.layers.sequence_conv(
conv1 = fluid.layers.conv2d(
input=feature,
-            num_filters=1024,
num_filters=32,
filter_size=3,
-            filter_stride=1,
-            bias_attr=True)
-        bn1 = fluid.layers.batch_norm(
-            input=seq_conv1,
-            act="sigmoid",
-            is_test=not is_train,
-            momentum=0.9,
-            epsilon=1e-05,
-            data_layout='NCHW')
stride=1,
padding=1,
bias_attr=True,
act="relu")
-        stack_input = bn1
pool1 = fluid.layers.pool2d(
conv1, pool_size=3, pool_type="max", pool_stride=2, pool_padding=0)
stack_input = pool1
for i in range(stacked_num):
fc = fluid.layers.fc(input=stack_input,
size=hidden_dim * 4,
bias_attr=True)
bias_attr=None)
proj, cell = fluid.layers.dynamic_lstmp(
input=fc,
size=hidden_dim * 4,
@@ -62,7 +60,6 @@
proj_activation="tanh")
bn = fluid.layers.batch_norm(
input=proj,
act="sigmoid",
is_test=not is_train,
momentum=0.9,
epsilon=1e-05,
@@ -80,7 +77,10 @@
# data feeder
feature = fluid.layers.data(
name="feature", shape=[-1, frame_dim], dtype="float32", lod_level=1)
name="feature",
shape=[-1, 3, 11, frame_dim],
dtype="float32",
lod_level=1)
label = fluid.layers.data(
name="label", shape=[-1, 1], dtype="int64", lod_level=1)
......
@@ -13,6 +13,7 @@ import _init_paths
import data_utils.augmentor.trans_mean_variance_norm as trans_mean_variance_norm
import data_utils.augmentor.trans_add_delta as trans_add_delta
import data_utils.augmentor.trans_splice as trans_splice
import data_utils.augmentor.trans_delay as trans_delay
import data_utils.async_data_reader as reader
from model_utils.model import stacked_lstmp_model
from data_utils.util import lodtensor_to_ndarray
@@ -87,7 +88,7 @@ def parse_args():
parser.add_argument(
'--max_batch_num',
type=int,
-        default=10,
default=11,
help='Maximum number of batches for profiling. (default: %(default)d)')
parser.add_argument(
'--first_batches_to_skip',
@@ -146,10 +147,10 @@ def profile(args):
ltrans = [
trans_add_delta.TransAddDelta(2, 2),
trans_mean_variance_norm.TransMeanVarianceNorm(args.mean_var),
-        trans_splice.TransSplice()
trans_splice.TransSplice(5, 5), trans_delay.TransDelay(5)
]
-    data_reader = reader.AsyncDataReader(args.feature_lst, args.label_lst)
data_reader = reader.AsyncDataReader(args.feature_lst, args.label_lst, -1)
data_reader.set_transformers(ltrans)
feature_t = fluid.LoDTensor()
@@ -169,6 +170,8 @@
frames_seen = 0
# load_data
(features, labels, lod, _) = batch_data
features = np.reshape(features, (-1, 11, 3, args.frame_dim))
features = np.transpose(features, (0, 2, 1, 3))
feature_t.set(features, place)
feature_t.set_lod([lod])
label_t.set(labels, place)
......
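A small shape check for the reshape/transpose added above; reading the axes as 3 delta streams by 11 spliced frames is an assumption based on `TransAddDelta(2, 2)` and `TransSplice(5, 5)`:
```python
# Shape bookkeeping only; the axis semantics are an assumption.
import numpy as np

frame_dim = 80
features = np.zeros((128, 11 * 3 * frame_dim), dtype='float32')
features = np.reshape(features, (-1, 11, 3, frame_dim))  # (N, 11, 3, 80)
features = np.transpose(features, (0, 2, 1, 3))          # (N, 3, 11, 80), NCHW
```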
@@ -12,6 +12,7 @@ import paddle.fluid as fluid
import data_utils.augmentor.trans_mean_variance_norm as trans_mean_variance_norm
import data_utils.augmentor.trans_add_delta as trans_add_delta
import data_utils.augmentor.trans_splice as trans_splice
import data_utils.augmentor.trans_delay as trans_delay
import data_utils.async_data_reader as reader
from data_utils.util import lodtensor_to_ndarray
from model_utils.model import stacked_lstmp_model
@@ -33,7 +34,7 @@ def parse_args():
parser.add_argument(
'--frame_dim',
type=int,
-        default=120 * 11,
default=80,
help='Frame dimension of feature data. (default: %(default)d)')
parser.add_argument(
'--stacked_num',
@@ -53,7 +54,7 @@ def parse_args():
parser.add_argument(
'--class_num',
type=int,
-        default=1749,
default=3040,
help='Number of classes in label. (default: %(default)d)')
parser.add_argument(
'--pass_num',
@@ -157,6 +158,7 @@ def train(args):
# program for test
test_program = fluid.default_main_program().clone()
#optimizer = fluid.optimizer.Momentum(learning_rate=args.learning_rate, momentum=0.9)
optimizer = fluid.optimizer.Adam(learning_rate=args.learning_rate)
optimizer.minimize(avg_cost)
@@ -171,7 +173,7 @@ def train(args):
ltrans = [
trans_add_delta.TransAddDelta(2, 2),
trans_mean_variance_norm.TransMeanVarianceNorm(args.mean_var),
-        trans_splice.TransSplice()
trans_splice.TransSplice(5, 5), trans_delay.TransDelay(5)
]
feature_t = fluid.LoDTensor()
@@ -193,6 +195,8 @@
args.minimum_batch_size)):
# load_data
(features, labels, lod, _) = batch_data
features = np.reshape(features, (-1, 11, 3, args.frame_dim))
features = np.transpose(features, (0, 2, 1, 3))
feature_t.set(features, place)
feature_t.set_lod([lod])
label_t.set(labels, place)
@@ -220,6 +224,8 @@
args.minimum_batch_size)):
# load_data
(features, labels, lod, name_lst) = batch_data
features = np.reshape(features, (-1, 11, 3, args.frame_dim))
features = np.transpose(features, (0, 2, 1, 3))
feature_t.set(features, place)
feature_t.set_lod([lod])
label_t.set(labels, place)
......
#-*- coding: utf-8 -*-
#File: DQN.py
from agent import Model
import gym
import argparse
from tqdm import tqdm
from expreplay import ReplayMemory, Experience
import numpy as np
import os
UPDATE_FREQ = 4
MEMORY_WARMUP_SIZE = 1000
def run_episode(agent, env, exp, train_or_test):
assert train_or_test in ['train', 'test'], train_or_test
total_reward = 0
state = env.reset()
for step in range(200):
action = agent.act(state, train_or_test)
next_state, reward, isOver, _ = env.step(action)
if train_or_test == 'train':
exp.append(Experience(state, action, reward, isOver))
# train model
# start training
if len(exp) > MEMORY_WARMUP_SIZE:
batch_idx = np.random.randint(
len(exp) - 1, size=(args.batch_size))
if step % UPDATE_FREQ == 0:
batch_state, batch_action, batch_reward, \
batch_next_state, batch_isOver = exp.sample(batch_idx)
agent.train(batch_state, batch_action, batch_reward, \
batch_next_state, batch_isOver)
total_reward += reward
state = next_state
if isOver:
break
return total_reward
def train_agent():
env = gym.make(args.env)
state_shape = env.observation_space.shape
exp = ReplayMemory(args.mem_size, state_shape)
action_dim = env.action_space.n
    agent = Model(state_shape[0], action_dim, gamma=args.gamma)  # honor the --gamma flag
while len(exp) < MEMORY_WARMUP_SIZE:
run_episode(agent, env, exp, train_or_test='train')
max_episode = 4000
# train
total_episode = 0
pbar = tqdm(total=max_episode)
recent_100_reward = []
for episode in xrange(max_episode):
# start epoch
total_reward = run_episode(agent, env, exp, train_or_test='train')
pbar.set_description('[train]exploration:{}'.format(agent.exploration))
pbar.update()
# recent 100 reward
total_reward = run_episode(agent, env, exp, train_or_test='test')
recent_100_reward.append(total_reward)
if len(recent_100_reward) > 100:
recent_100_reward = recent_100_reward[1:]
pbar.write("episode:{} test_reward:{}".format(\
episode, np.mean(recent_100_reward)))
pbar.close()
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('--env', type=str, default='MountainCar-v0', \
help='environment to train DQN model, e.g. CartPole-v0')
parser.add_argument('--gamma', type=float, default=0.99, \
help='discount factor for accumulated reward computation')
parser.add_argument('--mem_size', type=int, default=500000, \
help='memory size for experience replay')
parser.add_argument('--batch_size', type=int, default=192, \
help='batch size for training')
args = parser.parse_args()
train_agent()
#-*- coding: utf-8 -*-
import paddle.fluid as fluid
from paddle.fluid.param_attr import ParamAttr
import numpy as np
import math
from tqdm import tqdm
from utils import fluid_flatten
class DQNModel(object):
def __init__(self, state_dim, action_dim, gamma, hist_len, use_cuda=False):
self.img_height = state_dim[0]
self.img_width = state_dim[1]
self.action_dim = action_dim
self.gamma = gamma
self.exploration = 1.1
self.update_target_steps = 10000 // 4
self.hist_len = hist_len
self.use_cuda = use_cuda
self.global_step = 0
self._build_net()
def _get_inputs(self):
return fluid.layers.data(
name='state',
shape=[self.hist_len, self.img_height, self.img_width],
dtype='float32'), \
fluid.layers.data(
name='action', shape=[1], dtype='int32'), \
fluid.layers.data(
name='reward', shape=[], dtype='float32'), \
fluid.layers.data(
name='next_s',
shape=[self.hist_len, self.img_height, self.img_width],
dtype='float32'), \
fluid.layers.data(
name='isOver', shape=[], dtype='bool')
def _build_net(self):
state, action, reward, next_s, isOver = self._get_inputs()
self.pred_value = self.get_DQN_prediction(state)
self.predict_program = fluid.default_main_program().clone()
reward = fluid.layers.clip(reward, min=-1.0, max=1.0)
action_onehot = fluid.layers.one_hot(action, self.action_dim)
action_onehot = fluid.layers.cast(action_onehot, dtype='float32')
pred_action_value = fluid.layers.reduce_sum(
fluid.layers.elementwise_mul(action_onehot, self.pred_value), dim=1)
targetQ_predict_value = self.get_DQN_prediction(next_s, target=True)
best_v = fluid.layers.reduce_max(targetQ_predict_value, dim=1)
best_v.stop_gradient = True
target = reward + (1.0 - fluid.layers.cast(
isOver, dtype='float32')) * self.gamma * best_v
cost = fluid.layers.square_error_cost(pred_action_value, target)
cost = fluid.layers.reduce_mean(cost)
self._sync_program = self._build_sync_target_network()
optimizer = fluid.optimizer.Adam(1e-3 * 0.5, epsilon=1e-3)
optimizer.minimize(cost)
# define program
self.train_program = fluid.default_main_program()
# fluid exe
place = fluid.CUDAPlace(0) if self.use_cuda else fluid.CPUPlace()
self.exe = fluid.Executor(place)
self.exe.run(fluid.default_startup_program())
def get_DQN_prediction(self, image, target=False):
image = image / 255.0
variable_field = 'target' if target else 'policy'
conv1 = fluid.layers.conv2d(
input=image,
num_filters=32,
filter_size=[5, 5],
stride=[1, 1],
padding=[2, 2],
act='relu',
param_attr=ParamAttr(name='{}_conv1'.format(variable_field)),
bias_attr=ParamAttr(name='{}_conv1_b'.format(variable_field)))
max_pool1 = fluid.layers.pool2d(
input=conv1, pool_size=[2, 2], pool_stride=[2, 2], pool_type='max')
conv2 = fluid.layers.conv2d(
input=max_pool1,
num_filters=32,
filter_size=[5, 5],
stride=[1, 1],
padding=[2, 2],
act='relu',
param_attr=ParamAttr(name='{}_conv2'.format(variable_field)),
bias_attr=ParamAttr(name='{}_conv2_b'.format(variable_field)))
max_pool2 = fluid.layers.pool2d(
input=conv2, pool_size=[2, 2], pool_stride=[2, 2], pool_type='max')
conv3 = fluid.layers.conv2d(
input=max_pool2,
num_filters=64,
filter_size=[4, 4],
stride=[1, 1],
padding=[1, 1],
act='relu',
param_attr=ParamAttr(name='{}_conv3'.format(variable_field)),
bias_attr=ParamAttr(name='{}_conv3_b'.format(variable_field)))
max_pool3 = fluid.layers.pool2d(
input=conv3, pool_size=[2, 2], pool_stride=[2, 2], pool_type='max')
conv4 = fluid.layers.conv2d(
input=max_pool3,
num_filters=64,
filter_size=[3, 3],
stride=[1, 1],
padding=[1, 1],
act='relu',
param_attr=ParamAttr(name='{}_conv4'.format(variable_field)),
bias_attr=ParamAttr(name='{}_conv4_b'.format(variable_field)))
flatten = fluid_flatten(conv4)
out = fluid.layers.fc(
input=flatten,
size=self.action_dim,
param_attr=ParamAttr(name='{}_fc1'.format(variable_field)),
bias_attr=ParamAttr(name='{}_fc1_b'.format(variable_field)))
return out
def _build_sync_target_network(self):
vars = list(fluid.default_main_program().list_vars())
policy_vars = filter(
lambda x: 'GRAD' not in x.name and 'policy' in x.name, vars)
target_vars = filter(
lambda x: 'GRAD' not in x.name and 'target' in x.name, vars)
policy_vars.sort(key=lambda x: x.name)
target_vars.sort(key=lambda x: x.name)
sync_program = fluid.default_main_program().clone()
with fluid.program_guard(sync_program):
sync_ops = []
for i, var in enumerate(policy_vars):
sync_op = fluid.layers.assign(policy_vars[i], target_vars[i])
sync_ops.append(sync_op)
sync_program = sync_program.prune(sync_ops)
return sync_program
def act(self, state, train_or_test):
sample = np.random.random()
if train_or_test == 'train' and sample < self.exploration:
act = np.random.randint(self.action_dim)
else:
if np.random.random() < 0.01:
act = np.random.randint(self.action_dim)
else:
state = np.expand_dims(state, axis=0)
pred_Q = self.exe.run(self.predict_program,
feed={'state': state.astype('float32')},
fetch_list=[self.pred_value])[0]
pred_Q = np.squeeze(pred_Q, axis=0)
act = np.argmax(pred_Q)
if train_or_test == 'train':
self.exploration = max(0.1, self.exploration - 1e-6)
return act
def train(self, state, action, reward, next_state, isOver):
if self.global_step % self.update_target_steps == 0:
self.sync_target_network()
self.global_step += 1
action = np.expand_dims(action, -1)
self.exe.run(self.train_program,
feed={
'state': state.astype('float32'),
'action': action.astype('int32'),
'reward': reward,
'next_s': next_state.astype('float32'),
'isOver': isOver
})
def sync_target_network(self):
self.exe.run(self._sync_program)
#-*- coding: utf-8 -*-
import paddle.fluid as fluid
from paddle.fluid.param_attr import ParamAttr
import numpy as np
from tqdm import tqdm
import math
from utils import fluid_argmax, fluid_flatten
class DoubleDQNModel(object):
def __init__(self, state_dim, action_dim, gamma, hist_len, use_cuda=False):
self.img_height = state_dim[0]
self.img_width = state_dim[1]
self.action_dim = action_dim
self.gamma = gamma
self.exploration = 1.1
self.update_target_steps = 10000 // 4
self.hist_len = hist_len
self.use_cuda = use_cuda
self.global_step = 0
self._build_net()
def _get_inputs(self):
return fluid.layers.data(
name='state',
shape=[self.hist_len, self.img_height, self.img_width],
dtype='float32'), \
fluid.layers.data(
name='action', shape=[1], dtype='int32'), \
fluid.layers.data(
name='reward', shape=[], dtype='float32'), \
fluid.layers.data(
name='next_s',
shape=[self.hist_len, self.img_height, self.img_width],
dtype='float32'), \
fluid.layers.data(
name='isOver', shape=[], dtype='bool')
def _build_net(self):
state, action, reward, next_s, isOver = self._get_inputs()
self.pred_value = self.get_DQN_prediction(state)
self.predict_program = fluid.default_main_program().clone()
reward = fluid.layers.clip(reward, min=-1.0, max=1.0)
action_onehot = fluid.layers.one_hot(action, self.action_dim)
action_onehot = fluid.layers.cast(action_onehot, dtype='float32')
pred_action_value = fluid.layers.reduce_sum(
fluid.layers.elementwise_mul(action_onehot, self.pred_value), dim=1)
targetQ_predict_value = self.get_DQN_prediction(next_s, target=True)
        next_s_predict_value = self.get_DQN_prediction(next_s)
        greedy_action = fluid_argmax(next_s_predict_value)
predict_onehot = fluid.layers.one_hot(greedy_action, self.action_dim)
best_v = fluid.layers.reduce_sum(
fluid.layers.elementwise_mul(predict_onehot, targetQ_predict_value),
dim=1)
best_v.stop_gradient = True
target = reward + (1.0 - fluid.layers.cast(
isOver, dtype='float32')) * self.gamma * best_v
cost = fluid.layers.square_error_cost(pred_action_value, target)
cost = fluid.layers.reduce_mean(cost)
self._sync_program = self._build_sync_target_network()
optimizer = fluid.optimizer.Adam(1e-3 * 0.5, epsilon=1e-3)
optimizer.minimize(cost)
# define program
self.train_program = fluid.default_main_program()
# fluid exe
place = fluid.CUDAPlace(0) if self.use_cuda else fluid.CPUPlace()
self.exe = fluid.Executor(place)
self.exe.run(fluid.default_startup_program())
def get_DQN_prediction(self, image, target=False):
image = image / 255.0
variable_field = 'target' if target else 'policy'
conv1 = fluid.layers.conv2d(
input=image,
num_filters=32,
filter_size=[5, 5],
stride=[1, 1],
padding=[2, 2],
act='relu',
param_attr=ParamAttr(name='{}_conv1'.format(variable_field)),
bias_attr=ParamAttr(name='{}_conv1_b'.format(variable_field)))
max_pool1 = fluid.layers.pool2d(
input=conv1, pool_size=[2, 2], pool_stride=[2, 2], pool_type='max')
conv2 = fluid.layers.conv2d(
input=max_pool1,
num_filters=32,
filter_size=[5, 5],
stride=[1, 1],
padding=[2, 2],
act='relu',
param_attr=ParamAttr(name='{}_conv2'.format(variable_field)),
bias_attr=ParamAttr(name='{}_conv2_b'.format(variable_field)))
max_pool2 = fluid.layers.pool2d(
input=conv2, pool_size=[2, 2], pool_stride=[2, 2], pool_type='max')
conv3 = fluid.layers.conv2d(
input=max_pool2,
num_filters=64,
filter_size=[4, 4],
stride=[1, 1],
padding=[1, 1],
act='relu',
param_attr=ParamAttr(name='{}_conv3'.format(variable_field)),
bias_attr=ParamAttr(name='{}_conv3_b'.format(variable_field)))
max_pool3 = fluid.layers.pool2d(
input=conv3, pool_size=[2, 2], pool_stride=[2, 2], pool_type='max')
conv4 = fluid.layers.conv2d(
input=max_pool3,
num_filters=64,
filter_size=[3, 3],
stride=[1, 1],
padding=[1, 1],
act='relu',
param_attr=ParamAttr(name='{}_conv4'.format(variable_field)),
bias_attr=ParamAttr(name='{}_conv4_b'.format(variable_field)))
flatten = fluid_flatten(conv4)
out = fluid.layers.fc(
input=flatten,
size=self.action_dim,
param_attr=ParamAttr(name='{}_fc1'.format(variable_field)),
bias_attr=ParamAttr(name='{}_fc1_b'.format(variable_field)))
return out
def _build_sync_target_network(self):
vars = list(fluid.default_main_program().list_vars())
policy_vars = filter(
lambda x: 'GRAD' not in x.name and 'policy' in x.name, vars)
target_vars = filter(
lambda x: 'GRAD' not in x.name and 'target' in x.name, vars)
policy_vars.sort(key=lambda x: x.name)
target_vars.sort(key=lambda x: x.name)
sync_program = fluid.default_main_program().clone()
with fluid.program_guard(sync_program):
sync_ops = []
for i, var in enumerate(policy_vars):
sync_op = fluid.layers.assign(policy_vars[i], target_vars[i])
sync_ops.append(sync_op)
sync_program = sync_program.prune(sync_ops)
return sync_program
def act(self, state, train_or_test):
sample = np.random.random()
if train_or_test == 'train' and sample < self.exploration:
act = np.random.randint(self.action_dim)
else:
if np.random.random() < 0.01:
act = np.random.randint(self.action_dim)
else:
state = np.expand_dims(state, axis=0)
pred_Q = self.exe.run(self.predict_program,
feed={'state': state.astype('float32')},
fetch_list=[self.pred_value])[0]
pred_Q = np.squeeze(pred_Q, axis=0)
act = np.argmax(pred_Q)
if train_or_test == 'train':
self.exploration = max(0.1, self.exploration - 1e-6)
return act
def train(self, state, action, reward, next_state, isOver):
if self.global_step % self.update_target_steps == 0:
self.sync_target_network()
self.global_step += 1
action = np.expand_dims(action, -1)
self.exe.run(self.train_program,
feed={
'state': state.astype('float32'),
'action': action.astype('int32'),
'reward': reward,
'next_s': next_state.astype('float32'),
'isOver': isOver
})
def sync_target_network(self):
self.exe.run(self._sync_program)
#-*- coding: utf-8 -*-
#File: agent.py
import paddle.fluid as fluid
from paddle.fluid.param_attr import ParamAttr
import numpy as np
from tqdm import tqdm
import math
from utils import fluid_flatten
-UPDATE_TARGET_STEPS = 200
-class Model(object):
-    def __init__(self, state_dim, action_dim, gamma):
-        self.global_step = 0
-        self.state_dim = state_dim
class DuelingDQNModel(object):
def __init__(self, state_dim, action_dim, gamma, hist_len, use_cuda=False):
self.img_height = state_dim[0]
self.img_width = state_dim[1]
self.action_dim = action_dim
self.gamma = gamma
-        self.exploration = 1.0
self.exploration = 1.1
self.update_target_steps = 10000 // 4
self.hist_len = hist_len
self.use_cuda = use_cuda
self.global_step = 0
self._build_net()
def _get_inputs(self):
-        return [fluid.layers.data(\
-            name='state', shape=[self.state_dim], dtype='float32'),
-            fluid.layers.data(\
-            name='action', shape=[1], dtype='int32'),
-            fluid.layers.data(\
-            name='reward', shape=[], dtype='float32'),
-            fluid.layers.data(\
-            name='next_s', shape=[self.state_dim], dtype='float32'),
-            fluid.layers.data(\
-            name='isOver', shape=[], dtype='bool')]
return fluid.layers.data(
name='state',
shape=[self.hist_len, self.img_height, self.img_width],
dtype='float32'), \
fluid.layers.data(
name='action', shape=[1], dtype='int32'), \
fluid.layers.data(
name='reward', shape=[], dtype='float32'), \
fluid.layers.data(
name='next_s',
shape=[self.hist_len, self.img_height, self.img_width],
dtype='float32'), \
fluid.layers.data(
name='isOver', shape=[], dtype='bool')
def _build_net(self):
state, action, reward, next_s, isOver = self._get_inputs()
self.pred_value = self.get_DQN_prediction(state)
self.predict_program = fluid.default_main_program().clone()
reward = fluid.layers.clip(reward, min=-1.0, max=1.0)
action_onehot = fluid.layers.one_hot(action, self.action_dim)
action_onehot = fluid.layers.cast(action_onehot, dtype='float32')
-        pred_action_value = fluid.layers.reduce_sum(\
-            fluid.layers.elementwise_mul(action_onehot, self.pred_value), dim=1)
pred_action_value = fluid.layers.reduce_sum(
fluid.layers.elementwise_mul(action_onehot, self.pred_value), dim=1)
targetQ_predict_value = self.get_DQN_prediction(next_s, target=True)
best_v = fluid.layers.reduce_max(targetQ_predict_value, dim=1)
best_v.stop_gradient = True
-        target = reward + (1.0 - fluid.layers.cast(\
target = reward + (1.0 - fluid.layers.cast(
isOver, dtype='float32')) * self.gamma * best_v
-        cost = fluid.layers.square_error_cost(\
-            input=pred_action_value, label=target)
cost = fluid.layers.square_error_cost(pred_action_value, target)
cost = fluid.layers.reduce_mean(cost)
self._sync_program = self._build_sync_target_network()
-        optimizer = fluid.optimizer.Adam(1e-3)
optimizer = fluid.optimizer.Adam(1e-3 * 0.5, epsilon=1e-3)
optimizer.minimize(cost)
# define program
self.train_program = fluid.default_main_program()
# fluid exe
-        place = fluid.CUDAPlace(0)
place = fluid.CUDAPlace(0) if self.use_cuda else fluid.CPUPlace()
self.exe = fluid.Executor(place)
self.exe.run(fluid.default_startup_program())
-    def get_DQN_prediction(self, state, target=False):
def get_DQN_prediction(self, image, target=False):
image = image / 255.0
variable_field = 'target' if target else 'policy'
-        # layer fc1
-        param_attr = ParamAttr(name='{}_fc1'.format(variable_field))
-        bias_attr = ParamAttr(name='{}_fc1_b'.format(variable_field))
-        fc1 = fluid.layers.fc(input=state,
-                              size=256,
-                              act='relu',
-                              param_attr=param_attr,
-                              bias_attr=bias_attr)
-        param_attr = ParamAttr(name='{}_fc2'.format(variable_field))
-        bias_attr = ParamAttr(name='{}_fc2_b'.format(variable_field))
-        fc2 = fluid.layers.fc(input=fc1,
-                              size=128,
-                              act='tanh',
-                              param_attr=param_attr,
-                              bias_attr=bias_attr)
-        param_attr = ParamAttr(name='{}_fc3'.format(variable_field))
-        bias_attr = ParamAttr(name='{}_fc3_b'.format(variable_field))
-        value = fluid.layers.fc(input=fc2,
-                                size=self.action_dim,
-                                param_attr=param_attr,
-                                bias_attr=bias_attr)
-        return value
conv1 = fluid.layers.conv2d(
input=image,
num_filters=32,
filter_size=[5, 5],
stride=[1, 1],
padding=[2, 2],
act='relu',
param_attr=ParamAttr(name='{}_conv1'.format(variable_field)),
bias_attr=ParamAttr(name='{}_conv1_b'.format(variable_field)))
max_pool1 = fluid.layers.pool2d(
input=conv1, pool_size=[2, 2], pool_stride=[2, 2], pool_type='max')
conv2 = fluid.layers.conv2d(
input=max_pool1,
num_filters=32,
filter_size=[5, 5],
stride=[1, 1],
padding=[2, 2],
act='relu',
param_attr=ParamAttr(name='{}_conv2'.format(variable_field)),
bias_attr=ParamAttr(name='{}_conv2_b'.format(variable_field)))
max_pool2 = fluid.layers.pool2d(
input=conv2, pool_size=[2, 2], pool_stride=[2, 2], pool_type='max')
conv3 = fluid.layers.conv2d(
input=max_pool2,
num_filters=64,
filter_size=[4, 4],
stride=[1, 1],
padding=[1, 1],
act='relu',
param_attr=ParamAttr(name='{}_conv3'.format(variable_field)),
bias_attr=ParamAttr(name='{}_conv3_b'.format(variable_field)))
max_pool3 = fluid.layers.pool2d(
input=conv3, pool_size=[2, 2], pool_stride=[2, 2], pool_type='max')
conv4 = fluid.layers.conv2d(
input=max_pool3,
num_filters=64,
filter_size=[3, 3],
stride=[1, 1],
padding=[1, 1],
act='relu',
param_attr=ParamAttr(name='{}_conv4'.format(variable_field)),
bias_attr=ParamAttr(name='{}_conv4_b'.format(variable_field)))
flatten = fluid_flatten(conv4)
value = fluid.layers.fc(
input=flatten,
size=1,
param_attr=ParamAttr(name='{}_value_fc'.format(variable_field)),
bias_attr=ParamAttr(name='{}_value_fc_b'.format(variable_field)))
advantage = fluid.layers.fc(
input=flatten,
size=self.action_dim,
param_attr=ParamAttr(name='{}_advantage_fc'.format(variable_field)),
bias_attr=ParamAttr(
name='{}_advantage_fc_b'.format(variable_field)))
Q = advantage + (value - fluid.layers.reduce_mean(
advantage, dim=1, keep_dim=True))
return Q
def _build_sync_target_network(self):
-        vars = fluid.default_main_program().list_vars()
-        policy_vars = []
-        target_vars = []
-        for var in vars:
-            if 'GRAD' in var.name: continue
-            if 'policy' in var.name:
-                policy_vars.append(var)
-            elif 'target' in var.name:
-                target_vars.append(var)
-        policy_vars.sort(key=lambda x: x.name.split('policy_')[1])
-        target_vars.sort(key=lambda x: x.name.split('target_')[1])
vars = list(fluid.default_main_program().list_vars())
policy_vars = filter(
lambda x: 'GRAD' not in x.name and 'policy' in x.name, vars)
target_vars = filter(
lambda x: 'GRAD' not in x.name and 'target' in x.name, vars)
policy_vars.sort(key=lambda x: x.name)
target_vars.sort(key=lambda x: x.name)
sync_program = fluid.default_main_program().clone()
with fluid.program_guard(sync_program):
@@ -122,26 +166,30 @@ class Model(object):
if train_or_test == 'train' and sample < self.exploration:
act = np.random.randint(self.action_dim)
else:
-            state = np.expand_dims(state, axis=0)
-            pred_Q = self.exe.run(self.predict_program,
-                                  feed={'state': state.astype('float32')},
-                                  fetch_list=[self.pred_value])[0]
-            pred_Q = np.squeeze(pred_Q, axis=0)
-            act = np.argmax(pred_Q)
-        self.exploration = max(0.1, self.exploration - 1e-6)
if np.random.random() < 0.01:
act = np.random.randint(self.action_dim)
else:
state = np.expand_dims(state, axis=0)
pred_Q = self.exe.run(self.predict_program,
feed={'state': state.astype('float32')},
fetch_list=[self.pred_value])[0]
pred_Q = np.squeeze(pred_Q, axis=0)
act = np.argmax(pred_Q)
if train_or_test == 'train':
self.exploration = max(0.1, self.exploration - 1e-6)
return act
def train(self, state, action, reward, next_state, isOver):
-        if self.global_step % UPDATE_TARGET_STEPS == 0:
if self.global_step % self.update_target_steps == 0:
self.sync_target_network()
self.global_step += 1
action = np.expand_dims(action, -1)
self.exe.run(self.train_program, \
-                     feed={'state': state, \
-                           'action': action, \
feed={'state': state.astype('float32'), \
'action': action.astype('int32'), \
'reward': reward, \
-                           'next_s': next_state, \
'next_s': next_state.astype('float32'), \
'isOver': isOver})
def sync_target_network(self):
......
<img src="mountain_car.gif" width="300" height="200">
# Reproduce DQN, DoubleDQN, DuelingDQN model with fluid version of PaddlePaddle
-# Reproduce DQN model
+ DQN in:
[Human-level Control Through Deep Reinforcement Learning](http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html)
+ DoubleDQN in:
[Deep Reinforcement Learning with Double Q-Learning](https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/viewPaper/12389)
+ DuelingDQN in:
[Dueling Network Architectures for Deep Reinforcement Learning](http://proceedings.mlr.press/v48/wangf16.html)
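For reference, the update targets the three variants optimize, matching the loss construction in the corresponding `*_agent.py` files:

$$y^{\text{DQN}} = r + \gamma \max_{a'} Q_{\text{target}}(s', a')$$

$$y^{\text{DoubleDQN}} = r + \gamma \, Q_{\text{target}}\big(s', \arg\max_{a'} Q(s', a')\big)$$

$$Q^{\text{Dueling}}(s, a) = V(s) + A(s, a) - \frac{1}{|\mathcal{A}|} \sum_{a'} A(s, a')$$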
-# Mountain-CAR benchmark & performance
-[MountainCar-v0](https://gym.openai.com/envs/MountainCar-v0/)
# Atari benchmark & performance
## [Atari games introduction](https://gym.openai.com/envs/#atari)
-A car is on a one-dimensional track, positioned between two "mountains". The goal is to drive up the mountain on the right; however, the car's engine is not strong enough to scale the mountain in a single pass. Therefore, the only way to succeed is to drive back and forth to build up momentum.
+ Pong game result
![DQN result](assets/dqn.png)
# How to use
+ Dependencies:
+ python2.7
+ gym
+ tqdm
+ paddlepaddle-gpu==0.12.0
+ Start Training:
```
# To train a model for Pong game with gpu (use DQN model as default)
python train.py --rom ./rom_files/pong.bin --use_cuda
<img src="curve.png" >
# To train a model for Pong with DoubleDQN
python train.py --rom ./rom_files/pong.bin --use_cuda --alg DoubleDQN
# To train a model for Pong with DuelingDQN
python train.py --rom ./rom_files/pong.bin --use_cuda --alg DuelingDQN
```
To train on more games, you can install more ROM files from [here](https://github.com/openai/atari-py/tree/master/atari_py/atari_roms)
-# How to use
-+ Dependencies:
-    + python2.7
-    + gym
-    + tqdm
-    + paddle-fluid
-+ Start Training:
-```
-# use mountain-car enviroment as default
-python DQN.py
+ Start Testing:
```
# Play the game with saved model and calculate the average rewards
python play.py --rom ./rom_files/pong.bin --use_cuda --model_path ./saved_model/DQN-pong/stepXXXXX
-# use other enviorment
-python DQN.py --env CartPole-v0
-```
# Play the game with visualization
python play.py --rom ./rom_files/pong.bin --use_cuda --model_path ./saved_model/DQN-pong/stepXXXXX --viz 0.01
```
# -*- coding: utf-8 -*-
import numpy as np
import os
import cv2
import threading
import gym
from gym import spaces
from gym.envs.atari.atari_env import ACTION_MEANING
from ale_python_interface import ALEInterface
__all__ = ['AtariPlayer']
ROM_URL = "https://github.com/openai/atari-py/tree/master/atari_py/atari_roms"
_ALE_LOCK = threading.Lock()
"""
The following AtariPlayer are copied or modified from tensorpack/tensorpack:
https://github.com/tensorpack/tensorpack/blob/master/examples/DeepQNetwork/atari.py
"""
class AtariPlayer(gym.Env):
"""
A wrapper for ALE emulator, with configurations to mimic DeepMind DQN settings.
Info:
score: the accumulated reward in the current game
gameOver: True when the current game is Over
"""
def __init__(self,
rom_file,
viz=0,
frame_skip=4,
nullop_start=30,
live_lost_as_eoe=True,
max_num_frames=0):
"""
Args:
rom_file: path to the rom
frame_skip: skip every k frames and repeat the action
viz: visualization to be done.
Set to 0 to disable.
Set to a positive number to be the delay between frames to show.
Set to a string to be a directory to store frames.
nullop_start: start with random number of null ops.
live_lost_as_eoe: consider a loss of lives as the end of an episode. Useful for training.
max_num_frames: maximum number of frames per episode.
"""
super(AtariPlayer, self).__init__()
assert os.path.isfile(rom_file), \
"rom {} not found. Please download at {}".format(rom_file, ROM_URL)
try:
ALEInterface.setLoggerMode(ALEInterface.Logger.Error)
except AttributeError:
print "You're not using latest ALE"
# avoid simulator bugs: https://github.com/mgbellemare/Arcade-Learning-Environment/issues/86
with _ALE_LOCK:
self.ale = ALEInterface()
self.ale.setInt(b"random_seed", np.random.randint(0, 30000))
self.ale.setInt(b"max_num_frames_per_episode", max_num_frames)
self.ale.setBool(b"showinfo", False)
self.ale.setInt(b"frame_skip", 1)
self.ale.setBool(b'color_averaging', False)
# manual.pdf suggests otherwise.
self.ale.setFloat(b'repeat_action_probability', 0.0)
# viz setup
if isinstance(viz, str):
assert os.path.isdir(viz), viz
self.ale.setString(b'record_screen_dir', viz)
viz = 0
if isinstance(viz, int):
viz = float(viz)
self.viz = viz
if self.viz and isinstance(self.viz, float):
self.windowname = os.path.basename(rom_file)
cv2.startWindowThread()
cv2.namedWindow(self.windowname)
self.ale.loadROM(rom_file.encode('utf-8'))
self.width, self.height = self.ale.getScreenDims()
self.actions = self.ale.getMinimalActionSet()
self.live_lost_as_eoe = live_lost_as_eoe
self.frame_skip = frame_skip
self.nullop_start = nullop_start
self.action_space = spaces.Discrete(len(self.actions))
self.observation_space = spaces.Box(low=0,
high=255,
shape=(self.height, self.width),
dtype=np.uint8)
self._restart_episode()
def get_action_meanings(self):
return [ACTION_MEANING[i] for i in self.actions]
def _grab_raw_image(self):
"""
:returns: the current 3-channel image
"""
m = self.ale.getScreenRGB()
return m.reshape((self.height, self.width, 3))
def _current_state(self):
"""
returns: a gray-scale (h, w) uint8 image
"""
ret = self._grab_raw_image()
# avoid missing frame issue: max-pooled over the last screen
ret = np.maximum(ret, self.last_raw_screen)
if self.viz:
if isinstance(self.viz, float):
cv2.imshow(self.windowname, ret)
cv2.waitKey(int(self.viz * 1000))
ret = ret.astype('float32')
# 0.299, 0.587, 0.114. same as rgb2y in torch/image
ret = cv2.cvtColor(ret, cv2.COLOR_RGB2GRAY)
return ret.astype('uint8') # to save some memory
def _restart_episode(self):
with _ALE_LOCK:
self.ale.reset_game()
# random null-ops start
n = np.random.randint(self.nullop_start)
self.last_raw_screen = self._grab_raw_image()
for k in range(n):
if k == n - 1:
self.last_raw_screen = self._grab_raw_image()
self.ale.act(0)
def reset(self):
if self.ale.game_over():
self._restart_episode()
return self._current_state()
def step(self, act):
oldlives = self.ale.lives()
r = 0
for k in range(self.frame_skip):
if k == self.frame_skip - 1:
self.last_raw_screen = self._grab_raw_image()
r += self.ale.act(self.actions[act])
newlives = self.ale.lives()
if self.ale.game_over() or \
(self.live_lost_as_eoe and newlives < oldlives):
break
isOver = self.ale.game_over()
if self.live_lost_as_eoe:
isOver = isOver or newlives < oldlives
info = {'ale.lives': newlives}
return self._current_state(), r, isOver, info
# -*- coding: utf-8 -*-
import numpy as np
from collections import deque
import gym
from gym import spaces
_v0, _v1 = gym.__version__.split('.')[:2]
assert int(_v0) > 0 or int(_v1) >= 10, gym.__version__
"""
The following wrappers are copied or modified from openai/baselines:
https://github.com/openai/baselines/blob/master/baselines/common/atari_wrappers.py
"""
class MapState(gym.ObservationWrapper):
def __init__(self, env, map_func):
gym.ObservationWrapper.__init__(self, env)
self._func = map_func
def observation(self, obs):
return self._func(obs)
class FrameStack(gym.Wrapper):
def __init__(self, env, k):
"""Buffer observations and stack across channels (last axis)."""
gym.Wrapper.__init__(self, env)
self.k = k
self.frames = deque([], maxlen=k)
shp = env.observation_space.shape
chan = 1 if len(shp) == 2 else shp[2]
self.observation_space = spaces.Box(low=0,
high=255,
shape=(shp[0], shp[1], chan * k),
dtype=np.uint8)
def reset(self):
"""Clear buffer and re-fill by duplicating the first observation."""
ob = self.env.reset()
for _ in range(self.k - 1):
self.frames.append(np.zeros_like(ob))
self.frames.append(ob)
return self.observation()
def step(self, action):
ob, reward, done, info = self.env.step(action)
self.frames.append(ob)
return self.observation(), reward, done, info
def observation(self):
assert len(self.frames) == self.k
return np.stack(self.frames, axis=0)
class _FireResetEnv(gym.Wrapper):
def __init__(self, env):
"""Take action on reset for environments that are fixed until firing."""
gym.Wrapper.__init__(self, env)
assert env.unwrapped.get_action_meanings()[1] == 'FIRE'
assert len(env.unwrapped.get_action_meanings()) >= 3
def reset(self):
self.env.reset()
obs, _, done, _ = self.env.step(1)
if done:
self.env.reset()
obs, _, done, _ = self.env.step(2)
if done:
self.env.reset()
return obs
def step(self, action):
return self.env.step(action)
def FireResetEnv(env):
if isinstance(env, gym.Wrapper):
baseenv = env.unwrapped
else:
baseenv = env
if 'FIRE' in baseenv.get_action_meanings():
return _FireResetEnv(env)
return env
class LimitLength(gym.Wrapper):
def __init__(self, env, k):
gym.Wrapper.__init__(self, env)
self.k = k
def reset(self):
# This assumes that reset() will really reset the env.
# If the underlying env tries to be smart about reset
# (e.g. end-of-life), the assumption doesn't hold.
ob = self.env.reset()
self.cnt = 0
return ob
def step(self, action):
ob, r, done, info = self.env.step(action)
self.cnt += 1
if self.cnt == self.k:
done = True
return ob, r, done, info
#-*- coding: utf-8 -*-
#File: expreplay.py
# -*- coding: utf-8 -*-
from collections import namedtuple
import numpy as np
import copy
from collections import deque, namedtuple
Experience = namedtuple('Experience', ['state', 'action', 'reward', 'isOver'])
class ReplayMemory(object):
-    def __init__(self, max_size, state_shape):
def __init__(self, max_size, state_shape, context_len):
self.max_size = int(max_size)
self.state_shape = state_shape
self.context_len = int(context_len)
-        self.state = np.zeros((self.max_size, ) + state_shape, dtype='float32')
self.state = np.zeros((self.max_size, ) + state_shape, dtype='uint8')
self.action = np.zeros((self.max_size, ), dtype='int32')
self.reward = np.zeros((self.max_size, ), dtype='float32')
self.isOver = np.zeros((self.max_size, ), dtype='bool')
self._curr_size = 0
self._curr_pos = 0
self._context = deque(maxlen=context_len - 1)
def append(self, exp):
"""append a new experience into replay memory
"""
if self._curr_size < self.max_size:
self._assign(self._curr_pos, exp)
self._curr_size += 1
else:
self._assign(self._curr_pos, exp)
self._curr_pos = (self._curr_pos + 1) % self.max_size
if exp.isOver:
self._context.clear()
else:
self._context.append(exp)
def recent_state(self):
""" maintain recent state for training"""
lst = list(self._context)
states = [np.zeros(self.state_shape, dtype='uint8')] * \
(self._context.maxlen - len(lst))
states.extend([k.state for k in lst])
return states
def sample(self, idx):
""" return state, action, reward, isOver,
note that some frames in state may be generated from last episode,
they should be removed from state
"""
state = np.zeros(
(self.context_len + 1, ) + self.state_shape, dtype=np.uint8)
state_idx = np.arange(idx, idx + self.context_len + 1) % self._curr_size
# confirm that no frame was generated from last episode
has_last_episode = False
for k in range(self.context_len - 2, -1, -1):
to_check_idx = state_idx[k]
if self.isOver[to_check_idx]:
has_last_episode = True
state_idx = state_idx[k + 1:]
state[k + 1:] = self.state[state_idx]
break
if not has_last_episode:
state = self.state[state_idx]
real_idx = (idx + self.context_len - 1) % self._curr_size
action = self.action[real_idx]
reward = self.reward[real_idx]
isOver = self.isOver[real_idx]
return state, reward, action, isOver
def __len__(self):
return self._curr_size
def _assign(self, pos, exp):
self.state[pos] = exp.state
self.action[pos] = exp.action
self.reward[pos] = exp.reward
-        self.action[pos] = exp.action
self.isOver[pos] = exp.isOver
-    def __len__(self):
-        return self._curr_size
-    def sample(self, batch_idx):
-        # index mapping to avoid sampling lastest state
def sample_batch(self, batch_size):
"""sample a batch from replay memory for training
"""
batch_idx = np.random.randint(
self._curr_size - self.context_len - 1, size=batch_size)
batch_idx = (self._curr_pos + batch_idx) % self._curr_size
-        next_idx = (batch_idx + 1) % self._curr_size
-        state = self.state[batch_idx]
-        reward = self.reward[batch_idx]
-        action = self.action[batch_idx]
-        next_state = self.state[next_idx]
-        isOver = self.isOver[batch_idx]
-        return (state, action, reward, next_state, isOver)
batch_exp = [self.sample(i) for i in batch_idx]
return self._process_batch(batch_exp)
def _process_batch(self, batch_exp):
state = np.asarray([e[0] for e in batch_exp], dtype='uint8')
reward = np.asarray([e[1] for e in batch_exp], dtype='float32')
action = np.asarray([e[2] for e in batch_exp], dtype='int8')
isOver = np.asarray([e[3] for e in batch_exp], dtype='bool')
return [state, action, reward, isOver]
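A minimal usage sketch of this replay buffer (toy episode boundaries; real training uses `IMAGE_SIZE` and `CONTEXT_LEN` from train.py):
```python
import numpy as np
from expreplay import ReplayMemory, Experience

mem = ReplayMemory(max_size=1000, state_shape=(84, 84), context_len=4)
for t in range(100):
    frame = np.random.randint(0, 256, size=(84, 84)).astype('uint8')
    mem.append(Experience(frame, 0, 0.0, t % 50 == 49))  # episodes end at t=49, 99

context = mem.recent_state()  # list of context_len - 1 frames, zero-padded
state, action, reward, isOver = mem.sample_batch(32)
print(state.shape)  # (32, 5, 84, 84): context_len + 1 stacked frames
```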
#-*- coding: utf-8 -*-
import argparse
import os
import numpy as np
import paddle.fluid as fluid
from train import get_player
from tqdm import tqdm
def predict_action(exe, state, predict_program, feed_names, fetch_targets,
action_dim):
if np.random.randint(100) == 0:
act = np.random.randint(action_dim)
else:
state = np.expand_dims(state, axis=0)
pred_Q = exe.run(predict_program,
feed={feed_names[0]: state.astype('float32')},
fetch_list=fetch_targets)[0]
pred_Q = np.squeeze(pred_Q, axis=0)
act = np.argmax(pred_Q)
return act
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument(
'--use_cuda', action='store_true', help='if set, use cuda')
parser.add_argument('--rom', type=str, required=True, help='atari rom')
parser.add_argument(
'--model_path', type=str, required=True, help='dirname to load model')
parser.add_argument(
'--viz',
type=float,
default=0,
help='''viz: visualization setting:
Set to 0 to disable;
Set to a positive number to be the delay between frames to show.
''')
args = parser.parse_args()
env = get_player(args.rom, viz=args.viz)
place = fluid.CUDAPlace(0) if args.use_cuda else fluid.CPUPlace()
exe = fluid.Executor(place)
inference_scope = fluid.core.Scope()
with fluid.scope_guard(inference_scope):
[predict_program, feed_names,
fetch_targets] = fluid.io.load_inference_model(args.model_path, exe)
episode_reward = []
for _ in tqdm(xrange(30), desc='eval agent'):
state = env.reset()
total_reward = 0
while True:
action = predict_action(exe, state, predict_program, feed_names,
fetch_targets, env.action_space.n)
state, reward, isOver, info = env.step(action)
total_reward += reward
if isOver:
break
episode_reward.append(total_reward)
eval_reward = np.mean(episode_reward)
print('Average reward of 30 episodes: {}'.format(eval_reward))
#-*- coding: utf-8 -*-
from DQN_agent import DQNModel
from DoubleDQN_agent import DoubleDQNModel
from DuelingDQN_agent import DuelingDQNModel
from atari import AtariPlayer
import paddle.fluid as fluid
import gym
import argparse
import cv2
from tqdm import tqdm
from expreplay import ReplayMemory, Experience
import numpy as np
import os
from datetime import datetime
from atari_wrapper import FrameStack, MapState, FireResetEnv, LimitLength
from collections import deque
-UPDATE_FREQ = 4
-#MEMORY_WARMUP_SIZE = 2000
MEMORY_SIZE = 1e6
MEMORY_WARMUP_SIZE = MEMORY_SIZE // 20
IMAGE_SIZE = (84, 84)
CONTEXT_LEN = 4
ACTION_REPEAT = 4 # aka FRAME_SKIP
UPDATE_FREQ = 4
def run_train_episode(agent, env, exp):
total_reward = 0
state = env.reset()
step = 0
while True:
step += 1
context = exp.recent_state()
context.append(state)
context = np.stack(context, axis=0)
action = agent.act(context, train_or_test='train')
next_state, reward, isOver, _ = env.step(action)
exp.append(Experience(state, action, reward, isOver))
# train model
# start training
if len(exp) > MEMORY_WARMUP_SIZE:
if step % UPDATE_FREQ == 0:
batch_all_state, batch_action, batch_reward, batch_isOver = exp.sample_batch(
args.batch_size)
batch_state = batch_all_state[:, :CONTEXT_LEN, :, :]
batch_next_state = batch_all_state[:, 1:, :, :]
agent.train(batch_state, batch_action, batch_reward,
batch_next_state, batch_isOver)
total_reward += reward
state = next_state
if isOver:
break
return total_reward, step
def get_player(rom, viz=False, train=False):
env = AtariPlayer(
rom,
frame_skip=ACTION_REPEAT,
viz=viz,
live_lost_as_eoe=train,
max_num_frames=60000)
env = FireResetEnv(env)
env = MapState(env, lambda im: cv2.resize(im, IMAGE_SIZE))
if not train:
# in training, context is taken care of in expreplay buffer
env = FrameStack(env, CONTEXT_LEN)
return env
def eval_agent(agent, env):
episode_reward = []
for _ in tqdm(xrange(30), desc='eval agent'):
state = env.reset()
total_reward = 0
step = 0
while True:
step += 1
action = agent.act(state, train_or_test='test')
state, reward, isOver, info = env.step(action)
total_reward += reward
if isOver:
break
episode_reward.append(total_reward)
eval_reward = np.mean(episode_reward)
return eval_reward
def train_agent():
env = get_player(args.rom, train=True)
test_env = get_player(args.rom)
exp = ReplayMemory(args.mem_size, IMAGE_SIZE, CONTEXT_LEN)
action_dim = env.action_space.n
if args.alg == 'DQN':
agent = DQNModel(IMAGE_SIZE, action_dim, args.gamma, CONTEXT_LEN,
args.use_cuda)
elif args.alg == 'DoubleDQN':
agent = DoubleDQNModel(IMAGE_SIZE, action_dim, args.gamma, CONTEXT_LEN,
args.use_cuda)
elif args.alg == 'DuelingDQN':
agent = DuelingDQNModel(IMAGE_SIZE, action_dim, args.gamma, CONTEXT_LEN,
args.use_cuda)
else:
print('Input algorithm name error!')
return
with tqdm(total=MEMORY_WARMUP_SIZE) as pbar:
while len(exp) < MEMORY_WARMUP_SIZE:
total_reward, step = run_train_episode(agent, env, exp)
pbar.update(step)
# train
test_flag = 0
save_flag = 0
pbar = tqdm(total=1e8)
recent_100_reward = []
total_step = 0
while True:
# start epoch
total_reward, step = run_train_episode(agent, env, exp)
total_step += step
pbar.set_description('[train]exploration:{}'.format(agent.exploration))
pbar.update(step)
if total_step // args.test_every_steps == test_flag:
pbar.write("testing")
eval_reward = eval_agent(agent, test_env)
test_flag += 1
print("eval_agent done, (steps, eval_reward): ({}, {})".format(
total_step, eval_reward))
if total_step // args.save_every_steps == save_flag:
save_flag += 1
save_path = os.path.join(args.model_dirname, '{}-{}'.format(
args.alg, os.path.basename(args.rom).split('.')[0]),
'step{}'.format(total_step))
fluid.io.save_inference_model(save_path, ['state'],
agent.pred_value, agent.exe,
agent.predict_program)
pbar.close()
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument(
'--alg',
type=str,
default='DQN',
help='Reinforcement learning algorithm, support: DQN, DoubleDQN, DuelingDQN'
)
parser.add_argument(
'--use_cuda', action='store_true', help='if set, use cuda')
parser.add_argument(
'--gamma',
type=float,
default=0.99,
help='discount factor for accumulated reward computation')
parser.add_argument(
'--mem_size',
type=int,
default=1000000,
help='memory size for experience replay')
parser.add_argument(
'--batch_size', type=int, default=64, help='batch size for training')
parser.add_argument('--rom', help='atari rom', required=True)
parser.add_argument(
'--model_dirname',
type=str,
default='saved_model',
help='dirname to save model')
parser.add_argument(
'--save_every_steps',
type=int,
default=100000,
help='every steps number to save model')
parser.add_argument(
'--test_every_steps',
type=int,
default=100000,
help='every steps number to run test')
args = parser.parse_args()
train_agent()
#-*- coding: utf-8 -*-
#File: utils.py
import paddle.fluid as fluid
import numpy as np
def fluid_argmax(x):
"""
Get index of max value for the last dimension
"""
_, max_index = fluid.layers.topk(x, k=1)
return max_index
def fluid_flatten(x):
"""
Flatten fluid variable along the first dimension
"""
return fluid.layers.reshape(x, shape=[-1, np.prod(x.shape[1:])])
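For intuition, a NumPy analogue of `fluid_flatten` (illustration only):
```python
import numpy as np

x = np.zeros((8, 64, 7, 7), dtype='float32')
flat = x.reshape(-1, int(np.prod(x.shape[1:])))
print(flat.shape)  # (8, 3136)
```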
@@ -211,13 +211,12 @@ def main(train_data_file, test_data_file, model_save_dir, num_passes):
avg_cost, feature_out, word, mention, target = ner_net(word_dict_len,
label_dict_len)
-    crf_decode = fluid.layers.crf_decoding(
-        input=feature_out, param_attr=fluid.ParamAttr(name='crfw'))
sgd_optimizer = fluid.optimizer.SGD(learning_rate=1e-3)
sgd_optimizer.minimize(avg_cost)
crf_decode = fluid.layers.crf_decoding(
input=feature_out, param_attr=fluid.ParamAttr(
name='crfw', ))
(precision, recall, f1_score, num_infer_chunks, num_label_chunks,
num_correct_chunks) = fluid.layers.chunk_eval(
input=crf_decode,
@@ -289,8 +288,8 @@
+ str(f1))
save_dirname = os.path.join(model_save_dir,
"params_pass_%d" % pass_id)
-            fluid.io.save_inference_model(
-                save_dirname, ['word', 'mention', 'target'], [crf_decode], exe)
fluid.io.save_inference_model(save_dirname, ['word', 'mention'],
[crf_decode], exe)
if __name__ == "__main__":
......
model/
pretrained/
data/
label/
*.swp
*.log
infer_results/
from PIL import Image, ImageEnhance, ImageDraw
from PIL import ImageFile
import numpy as np
import random
import math
ImageFile.LOAD_TRUNCATED_IMAGES = True  # otherwise an IOError is raised when an image file is truncated
class sampler():
def __init__(self,
max_sample,
max_trial,
min_scale,
max_scale,
min_aspect_ratio,
max_aspect_ratio,
min_jaccard_overlap,
max_jaccard_overlap,
min_object_coverage,
max_object_coverage,
use_square=False):
self.max_sample = max_sample
self.max_trial = max_trial
self.min_scale = min_scale
self.max_scale = max_scale
self.min_aspect_ratio = min_aspect_ratio
self.max_aspect_ratio = max_aspect_ratio
self.min_jaccard_overlap = min_jaccard_overlap
self.max_jaccard_overlap = max_jaccard_overlap
self.min_object_coverage = min_object_coverage
self.max_object_coverage = max_object_coverage
self.use_square = use_square
class bbox():
def __init__(self, xmin, ymin, xmax, ymax):
self.xmin = xmin
self.ymin = ymin
self.xmax = xmax
self.ymax = ymax
def intersect_bbox(bbox1, bbox2):
if bbox2.xmin > bbox1.xmax or bbox2.xmax < bbox1.xmin or \
bbox2.ymin > bbox1.ymax or bbox2.ymax < bbox1.ymin:
intersection_box = bbox(0.0, 0.0, 0.0, 0.0)
else:
intersection_box = bbox(
max(bbox1.xmin, bbox2.xmin),
max(bbox1.ymin, bbox2.ymin),
min(bbox1.xmax, bbox2.xmax), min(bbox1.ymax, bbox2.ymax))
return intersection_box
def bbox_coverage(bbox1, bbox2):
inter_box = intersect_bbox(bbox1, bbox2)
intersect_size = bbox_area(inter_box)
if intersect_size > 0:
bbox1_size = bbox_area(bbox1)
return intersect_size / bbox1_size
else:
return 0.
def bbox_area(src_bbox):
if src_bbox.xmax < src_bbox.xmin or src_bbox.ymax < src_bbox.ymin:
return 0.
else:
width = src_bbox.xmax - src_bbox.xmin
height = src_bbox.ymax - src_bbox.ymin
return width * height
def generate_sample(sampler, image_width, image_height):
scale = random.uniform(sampler.min_scale, sampler.max_scale)
aspect_ratio = random.uniform(sampler.min_aspect_ratio,
sampler.max_aspect_ratio)
aspect_ratio = max(aspect_ratio, (scale**2.0))
aspect_ratio = min(aspect_ratio, 1 / (scale**2.0))
bbox_width = scale * (aspect_ratio**0.5)
bbox_height = scale / (aspect_ratio**0.5)
# guarantee a squared image patch after cropping
if sampler.use_square:
if image_height < image_width:
bbox_width = bbox_height * image_height / image_width
else:
bbox_height = bbox_width * image_width / image_height
xmin_bound = 1 - bbox_width
ymin_bound = 1 - bbox_height
xmin = random.uniform(0, xmin_bound)
ymin = random.uniform(0, ymin_bound)
xmax = xmin + bbox_width
ymax = ymin + bbox_height
sampled_bbox = bbox(xmin, ymin, xmax, ymax)
return sampled_bbox
def data_anchor_sampling(sampler, bbox_labels, image_width, image_height,
scale_array, resize_width, resize_height):
num_gt = len(bbox_labels)
# np.random.randint range: [low, high)
rand_idx = np.random.randint(0, num_gt) if num_gt != 0 else 0
if num_gt != 0:
norm_xmin = bbox_labels[rand_idx][0]
norm_ymin = bbox_labels[rand_idx][1]
norm_xmax = bbox_labels[rand_idx][2]
norm_ymax = bbox_labels[rand_idx][3]
xmin = norm_xmin * image_width
ymin = norm_ymin * image_height
wid = image_width * (norm_xmax - norm_xmin)
hei = image_height * (norm_ymax - norm_ymin)
range_size = 0
for scale_ind in range(0, len(scale_array) - 1):
area = wid * hei
if area > scale_array[scale_ind] ** 2 and area < \
scale_array[scale_ind + 1] ** 2:
range_size = scale_ind + 1
break
scale_choose = 0.0
if range_size == 0:
rand_idx_size = range_size + 1
else:
# np.random.randint range: [low, high)
rng_rand_size = np.random.randint(0, range_size)
rand_idx_size = rng_rand_size % range_size
scale_choose = random.uniform(scale_array[rand_idx_size] / 2.0,
2.0 * scale_array[rand_idx_size])
sample_bbox_size = wid * resize_width / scale_choose
w_off_orig = 0.0
h_off_orig = 0.0
if sample_bbox_size < max(image_height, image_width):
if wid <= sample_bbox_size:
w_off_orig = random.uniform(xmin + wid - sample_bbox_size, xmin)
else:
w_off_orig = random.uniform(xmin, xmin + wid - sample_bbox_size)
if hei <= sample_bbox_size:
h_off_orig = random.uniform(ymin + hei - sample_bbox_size, ymin)
else:
h_off_orig = random.uniform(ymin, ymin + hei - sample_bbox_size)
else:
w_off_orig = random.uniform(image_width - sample_bbox_size, 0.0)
h_off_orig = random.uniform(image_height - sample_bbox_size, 0.0)
w_off_orig = math.floor(w_off_orig)
h_off_orig = math.floor(h_off_orig)
# Figure out top left coordinates.
w_off = 0.0
h_off = 0.0
w_off = float(w_off_orig / image_width)
h_off = float(h_off_orig / image_height)
sampled_bbox = bbox(w_off, h_off,
w_off + float(sample_bbox_size / image_width),
h_off + float(sample_bbox_size / image_height))
return sampled_bbox
def jaccard_overlap(sample_bbox, object_bbox):
if sample_bbox.xmin >= object_bbox.xmax or \
sample_bbox.xmax <= object_bbox.xmin or \
sample_bbox.ymin >= object_bbox.ymax or \
sample_bbox.ymax <= object_bbox.ymin:
return 0
intersect_xmin = max(sample_bbox.xmin, object_bbox.xmin)
intersect_ymin = max(sample_bbox.ymin, object_bbox.ymin)
intersect_xmax = min(sample_bbox.xmax, object_bbox.xmax)
intersect_ymax = min(sample_bbox.ymax, object_bbox.ymax)
intersect_size = (intersect_xmax - intersect_xmin) * (
intersect_ymax - intersect_ymin)
sample_bbox_size = bbox_area(sample_bbox)
object_bbox_size = bbox_area(object_bbox)
overlap = intersect_size / (
sample_bbox_size + object_bbox_size - intersect_size)
return overlap
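# Worked example (added for illustration):
#   A = bbox(0.0, 0.0, 0.5, 0.5), B = bbox(0.25, 0.25, 0.75, 0.75)
#   intersection = 0.25 * 0.25 = 0.0625, each box has area 0.25
#   jaccard_overlap(A, B) = 0.0625 / (0.25 + 0.25 - 0.0625) ~= 0.143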
def satisfy_sample_constraint(sampler, sample_bbox, bbox_labels):
if sampler.min_jaccard_overlap == 0 and sampler.max_jaccard_overlap == 0:
has_jaccard_overlap = False
else:
has_jaccard_overlap = True
if sampler.min_object_coverage == 0 and sampler.max_object_coverage == 0:
has_object_coverage = False
else:
has_object_coverage = True
if not has_jaccard_overlap and not has_object_coverage:
return True
found = False
for i in range(len(bbox_labels)):
object_bbox = bbox(bbox_labels[i][1], bbox_labels[i][2],
bbox_labels[i][3], bbox_labels[i][4])
if has_jaccard_overlap:
overlap = jaccard_overlap(sample_bbox, object_bbox)
if sampler.min_jaccard_overlap != 0 and \
overlap < sampler.min_jaccard_overlap:
continue
if sampler.max_jaccard_overlap != 0 and \
overlap > sampler.max_jaccard_overlap:
continue
found = True
if has_object_coverage:
object_coverage = bbox_coverage(object_bbox, sample_bbox)
if sampler.min_object_coverage != 0 and \
object_coverage < sampler.min_object_coverage:
continue
if sampler.max_object_coverage != 0 and \
object_coverage > sampler.max_object_coverage:
continue
found = True
if found:
return True
return found
def generate_batch_samples(batch_sampler, bbox_labels, image_width,
image_height):
sampled_bbox = []
for sampler in batch_sampler:
found = 0
for i in range(sampler.max_trial):
if found >= sampler.max_sample:
break
sample_bbox = generate_sample(sampler, image_width, image_height)
if satisfy_sample_constraint(sampler, sample_bbox, bbox_labels):
sampled_bbox.append(sample_bbox)
found = found + 1
return sampled_bbox
def generate_batch_random_samples(batch_sampler, bbox_labels, image_width,
image_height, scale_array, resize_width,
resize_height):
sampled_bbox = []
for sampler in batch_sampler:
found = 0
for i in range(sampler.max_trial):
if found >= sampler.max_sample:
break
sample_bbox = data_anchor_sampling(
sampler, bbox_labels, image_width, image_height, scale_array,
resize_width, resize_height)
if satisfy_sample_constraint(sampler, sample_bbox, bbox_labels):
sampled_bbox.append(sample_bbox)
found = found + 1
return sampled_bbox
def clip_bbox(src_bbox):
src_bbox.xmin = max(min(src_bbox.xmin, 1.0), 0.0)
src_bbox.ymin = max(min(src_bbox.ymin, 1.0), 0.0)
src_bbox.xmax = max(min(src_bbox.xmax, 1.0), 0.0)
src_bbox.ymax = max(min(src_bbox.ymax, 1.0), 0.0)
return src_bbox
def meet_emit_constraint(src_bbox, sample_bbox):
center_x = (src_bbox.xmax + src_bbox.xmin) / 2
center_y = (src_bbox.ymax + src_bbox.ymin) / 2
if center_x >= sample_bbox.xmin and \
center_x <= sample_bbox.xmax and \
center_y >= sample_bbox.ymin and \
center_y <= sample_bbox.ymax:
return True
return False
def project_bbox(object_bbox, sample_bbox):
if object_bbox.xmin >= sample_bbox.xmax or \
object_bbox.xmax <= sample_bbox.xmin or \
object_bbox.ymin >= sample_bbox.ymax or \
object_bbox.ymax <= sample_bbox.ymin:
return False
else:
proj_bbox = bbox(0, 0, 0, 0)
sample_width = sample_bbox.xmax - sample_bbox.xmin
sample_height = sample_bbox.ymax - sample_bbox.ymin
proj_bbox.xmin = (object_bbox.xmin - sample_bbox.xmin) / sample_width
proj_bbox.ymin = (object_bbox.ymin - sample_bbox.ymin) / sample_height
proj_bbox.xmax = (object_bbox.xmax - sample_bbox.xmin) / sample_width
proj_bbox.ymax = (object_bbox.ymax - sample_bbox.ymin) / sample_height
proj_bbox = clip_bbox(proj_bbox)
if bbox_area(proj_bbox) > 0:
return proj_bbox
else:
return False
def transform_labels(bbox_labels, sample_bbox):
sample_labels = []
for i in range(len(bbox_labels)):
sample_label = []
object_bbox = bbox(bbox_labels[i][1], bbox_labels[i][2],
bbox_labels[i][3], bbox_labels[i][4])
if not meet_emit_constraint(object_bbox, sample_bbox):
continue
proj_bbox = project_bbox(object_bbox, sample_bbox)
if proj_bbox:
sample_label.append(bbox_labels[i][0])
sample_label.append(float(proj_bbox.xmin))
sample_label.append(float(proj_bbox.ymin))
sample_label.append(float(proj_bbox.xmax))
sample_label.append(float(proj_bbox.ymax))
sample_label = sample_label + bbox_labels[i][5:]
sample_labels.append(sample_label)
return sample_labels
def crop_image(img, bbox_labels, sample_bbox, image_width, image_height):
sample_bbox = clip_bbox(sample_bbox)
xmin = int(sample_bbox.xmin * image_width)
xmax = int(sample_bbox.xmax * image_width)
ymin = int(sample_bbox.ymin * image_height)
ymax = int(sample_bbox.ymax * image_height)
sample_img = img[ymin:ymax, xmin:xmax]
sample_labels = transform_labels(bbox_labels, sample_bbox)
return sample_img, sample_labels
def crop_image_sampling(img, bbox_labels, sample_bbox, image_width,
image_height, resize_width, resize_height):
# no clipping here
xmin = int(sample_bbox.xmin * image_width)
xmax = int(sample_bbox.xmax * image_width)
ymin = int(sample_bbox.ymin * image_height)
ymax = int(sample_bbox.ymax * image_height)
w_off = xmin
h_off = ymin
width = xmax - xmin
height = ymax - ymin
cross_xmin = max(0.0, float(w_off))
cross_ymin = max(0.0, float(h_off))
cross_xmax = min(float(w_off + width - 1.0), float(image_width))
cross_ymax = min(float(h_off + height - 1.0), float(image_height))
cross_width = cross_xmax - cross_xmin
cross_height = cross_ymax - cross_ymin
roi_xmin = 0 if w_off >= 0 else abs(w_off)
roi_ymin = 0 if h_off >= 0 else abs(h_off)
roi_width = cross_width
roi_height = cross_height
    sample_img = np.zeros((height, width, 3))
    # numpy images are indexed as [row (y), col (x)]; index with the vertical
    # coordinates first and cast the float offsets to int before slicing
    sample_img[int(roi_ymin):int(roi_ymin + roi_height),
               int(roi_xmin):int(roi_xmin + roi_width)] = \
        img[int(cross_ymin):int(cross_ymin + cross_height),
            int(cross_xmin):int(cross_xmin + cross_width)]
sample_img = cv2.resize(
sample_img, (resize_width, resize_height), interpolation=cv2.INTER_AREA)
sample_labels = transform_labels(bbox_labels, sample_bbox)
return sample_img, sample_labels
def random_brightness(img, settings):
prob = random.uniform(0, 1)
if prob < settings.brightness_prob:
delta = random.uniform(-settings.brightness_delta,
settings.brightness_delta) + 1
img = ImageEnhance.Brightness(img).enhance(delta)
return img
def random_contrast(img, settings):
prob = random.uniform(0, 1)
if prob < settings.contrast_prob:
delta = random.uniform(-settings.contrast_delta,
settings.contrast_delta) + 1
img = ImageEnhance.Contrast(img).enhance(delta)
return img
def random_saturation(img, settings):
prob = random.uniform(0, 1)
if prob < settings.saturation_prob:
delta = random.uniform(-settings.saturation_delta,
settings.saturation_delta) + 1
img = ImageEnhance.Color(img).enhance(delta)
return img
def random_hue(img, settings):
prob = random.uniform(0, 1)
if prob < settings.hue_prob:
delta = random.uniform(-settings.hue_delta, settings.hue_delta)
img_hsv = np.array(img.convert('HSV'))
img_hsv[:, :, 0] = img_hsv[:, :, 0] + delta
img = Image.fromarray(img_hsv, mode='HSV').convert('RGB')
return img
def distort_image(img, settings):
prob = random.uniform(0, 1)
# Apply different distort order
if prob > 0.5:
img = random_brightness(img, settings)
img = random_contrast(img, settings)
img = random_saturation(img, settings)
img = random_hue(img, settings)
else:
img = random_brightness(img, settings)
img = random_saturation(img, settings)
img = random_hue(img, settings)
img = random_contrast(img, settings)
return img
def expand_image(img, bbox_labels, img_width, img_height, settings):
prob = random.uniform(0, 1)
if prob < settings.expand_prob:
if settings.expand_max_ratio - 1 >= 0.01:
expand_ratio = random.uniform(1, settings.expand_max_ratio)
height = int(img_height * expand_ratio)
width = int(img_width * expand_ratio)
h_off = math.floor(random.uniform(0, height - img_height))
w_off = math.floor(random.uniform(0, width - img_width))
expand_bbox = bbox(-w_off / img_width, -h_off / img_height,
(width - w_off) / img_width,
(height - h_off) / img_height)
expand_img = np.ones((height, width, 3))
expand_img = np.uint8(expand_img * np.squeeze(settings.img_mean))
expand_img = Image.fromarray(expand_img)
expand_img.paste(img, (int(w_off), int(h_off)))
bbox_labels = transform_labels(bbox_labels, expand_bbox)
return expand_img, bbox_labels, width, height
return img, bbox_labels, img_width, img_height
import os
import time
import numpy as np
import argparse
import functools
from PIL import Image
from PIL import ImageDraw
import paddle
import paddle.fluid as fluid
import reader
from pyramidbox import PyramidBox
from utility import add_arguments, print_arguments
parser = argparse.ArgumentParser(description=__doc__)
add_arg = functools.partial(add_arguments, argparser=parser)
# yapf: disable
add_arg('use_gpu', bool, True, "Whether use GPU.")
add_arg('use_pyramidbox', bool, False, "Whether use PyramidBox model.")
add_arg('confs_threshold', float, 0.25, "Confidence threshold to draw bbox.")
add_arg('image_path', str, '', "The data root path.")
add_arg('model_dir', str, '', "The model path.")
# yapf: enable
def draw_bounding_box_on_image(image_path, nms_out, confs_threshold):
image = Image.open(image_path)
draw = ImageDraw.Draw(image)
for dt in nms_out:
xmin, ymin, xmax, ymax, score = dt
if score < confs_threshold:
continue
(left, right, top, bottom) = (xmin, xmax, ymin, ymax)
draw.line(
[(left, top), (left, bottom), (right, bottom), (right, top),
(left, top)],
width=4,
fill='red')
image_name = image_path.split('/')[-1]
image_class = image_path.split('/')[-2]
print("image with bbox drawed saved as {}".format(image_name))
image.save('./infer_results/' + image_class.encode('utf-8') + '/' +
image_name.encode('utf-8'))
def write_to_txt(image_path, f, nms_out):
image_name = image_path.split('/')[-1]
image_class = image_path.split('/')[-2]
f.write('{:s}\n'.format(
image_class.encode('utf-8') + '/' + image_name.encode('utf-8')))
f.write('{:d}\n'.format(nms_out.shape[0]))
for dt in nms_out:
xmin, ymin, xmax, ymax, score = dt
f.write('{:.1f} {:.1f} {:.1f} {:.1f} {:.3f}\n'.format(xmin, ymin, (
xmax - xmin + 1), (ymax - ymin + 1), score))
print("image infer result saved {}".format(image_name[:-4]))
def get_round(x, loc):
str_x = str(x)
if '.' in str_x:
len_after = len(str_x.split('.')[1])
str_before = str_x.split('.')[0]
str_after = str_x.split('.')[1]
if len_after >= 3:
str_final = str_before + '.' + str_after[0:loc]
return float(str_final)
else:
return x
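# Truncation behavior (added for illustration):
#   get_round(1.8792, 2) -> 1.87  (at least three decimals: truncate to `loc`)
#   get_round(1.87, 2)   -> 1.87  (fewer than three decimals: returned as-is)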
def bbox_vote(det):
order = det[:, 4].ravel().argsort()[::-1]
det = det[order, :]
if det.shape[0] == 0:
dets = np.array([[10, 10, 20, 20, 0.002]])
det = np.empty(shape=[0, 5])
while det.shape[0] > 0:
# IOU
area = (det[:, 2] - det[:, 0] + 1) * (det[:, 3] - det[:, 1] + 1)
xx1 = np.maximum(det[0, 0], det[:, 0])
yy1 = np.maximum(det[0, 1], det[:, 1])
xx2 = np.minimum(det[0, 2], det[:, 2])
yy2 = np.minimum(det[0, 3], det[:, 3])
w = np.maximum(0.0, xx2 - xx1 + 1)
h = np.maximum(0.0, yy2 - yy1 + 1)
inter = w * h
o = inter / (area[0] + area[:] - inter)
# get needed merge det and delete these det
merge_index = np.where(o >= 0.3)[0]
det_accu = det[merge_index, :]
det = np.delete(det, merge_index, 0)
if merge_index.shape[0] <= 1:
if det.shape[0] == 0:
try:
dets = np.row_stack((dets, det_accu))
except:
dets = det_accu
continue
det_accu[:, 0:4] = det_accu[:, 0:4] * np.tile(det_accu[:, -1:], (1, 4))
max_score = np.max(det_accu[:, 4])
det_accu_sum = np.zeros((1, 5))
det_accu_sum[:, 0:4] = np.sum(det_accu[:, 0:4],
axis=0) / np.sum(det_accu[:, -1:])
det_accu_sum[:, 4] = max_score
try:
dets = np.row_stack((dets, det_accu_sum))
except:
dets = det_accu_sum
dets = dets[0:750, :]
return dets
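# Note on bbox_vote (added comment): detections are visited in descending
# score order; every remaining box with IoU >= 0.3 against the current top
# box is merged into one box whose coordinates are the score-weighted average
# of the merged group and whose score is the group's maximum. At most the
# top 750 merged detections are returned.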
def image_preprocess(image):
img = np.array(image)
# HWC to CHW
if len(img.shape) == 3:
img = np.swapaxes(img, 1, 2)
img = np.swapaxes(img, 1, 0)
    # RGB to BGR
img = img[[2, 1, 0], :, :]
img = img.astype('float32')
img -= np.array(
[104., 117., 123.])[:, np.newaxis, np.newaxis].astype('float32')
    img = img * 0.007843  # 0.007843 ~= 1 / 127.5, matching settings.scale in reader.py
img = [img]
img = np.array(img)
return img
def detect_face(image, shrink):
image_shape = [3, image.size[1], image.size[0]]
num_classes = 2
place = fluid.CUDAPlace(0) if args.use_gpu else fluid.CPUPlace()
exe = fluid.Executor(place)
if shrink != 1:
image = image.resize((int(image_shape[2] * shrink),
int(image_shape[1] * shrink)), Image.ANTIALIAS)
image_shape = [
image_shape[0], int(image_shape[1] * shrink),
int(image_shape[2] * shrink)
]
print "image_shape:", image_shape
img = image_preprocess(image)
scope = fluid.core.Scope()
main_program = fluid.Program()
startup_program = fluid.Program()
with fluid.scope_guard(scope):
with fluid.unique_name.guard():
with fluid.program_guard(main_program, startup_program):
fetches = []
network = PyramidBox(
image_shape,
num_classes,
sub_network=args.use_pyramidbox,
is_infer=True)
infer_program, nmsed_out = network.infer(main_program)
fetches = [nmsed_out]
fluid.io.load_persistables(
exe, args.model_dir, main_program=main_program)
detection, = exe.run(infer_program,
feed={'image': img},
fetch_list=fetches,
return_numpy=False)
detection = np.array(detection)
    # layout: xmin, ymin, xmax, ymax, score
det_conf = detection[:, 1]
det_xmin = image_shape[2] * detection[:, 2] / shrink
det_ymin = image_shape[1] * detection[:, 3] / shrink
det_xmax = image_shape[2] * detection[:, 4] / shrink
det_ymax = image_shape[1] * detection[:, 5] / shrink
det = np.column_stack((det_xmin, det_ymin, det_xmax, det_ymax, det_conf))
keep_index = np.where(det[:, 4] >= 0)[0]
det = det[keep_index, :]
return det
def flip_test(image, shrink):
img = image.transpose(Image.FLIP_LEFT_RIGHT)
det_f = detect_face(img, shrink)
det_t = np.zeros(det_f.shape)
# image.size: [width, height]
det_t[:, 0] = image.size[0] - det_f[:, 2]
det_t[:, 1] = det_f[:, 1]
det_t[:, 2] = image.size[0] - det_f[:, 0]
det_t[:, 3] = det_f[:, 3]
det_t[:, 4] = det_f[:, 4]
return det_t
def multi_scale_test(image, max_shrink):
    # shrink detection: the shrunk image is used to detect only big faces
st = 0.5 if max_shrink >= 0.75 else 0.5 * max_shrink
det_s = detect_face(image, st)
index = np.where(
np.maximum(det_s[:, 2] - det_s[:, 0] + 1, det_s[:, 3] - det_s[:, 1] + 1)
> 30)[0]
det_s = det_s[index, :]
    # first enlargement step (up to 2x)
bt = min(2, max_shrink) if max_shrink > 1 else (st + max_shrink) / 2
det_b = detect_face(image, bt)
    # keep enlarging the image to detect small faces
if max_shrink > 2:
bt *= 2
while bt < max_shrink:
det_b = np.row_stack((det_b, detect_face(image, bt)))
bt *= 2
det_b = np.row_stack((det_b, detect_face(image, max_shrink)))
    # the enlarged scales should only detect small faces
if bt > 1:
index = np.where(
np.minimum(det_b[:, 2] - det_b[:, 0] + 1,
det_b[:, 3] - det_b[:, 1] + 1) < 100)[0]
det_b = det_b[index, :]
else:
index = np.where(
np.maximum(det_b[:, 2] - det_b[:, 0] + 1,
det_b[:, 3] - det_b[:, 1] + 1) > 30)[0]
det_b = det_b[index, :]
return det_s, det_b
def get_im_shrink(image_shape):
max_shrink_v1 = (0x7fffffff / 577.0 /
(image_shape[1] * image_shape[2]))**0.5
max_shrink_v2 = (
(678 * 1024 * 2.0 * 2.0) / (image_shape[1] * image_shape[2]))**0.5
max_shrink = get_round(min(max_shrink_v1, max_shrink_v2), 2) - 0.3
if max_shrink >= 1.5 and max_shrink < 2:
max_shrink = max_shrink - 0.1
elif max_shrink >= 2 and max_shrink < 3:
max_shrink = max_shrink - 0.2
elif max_shrink >= 3 and max_shrink < 4:
max_shrink = max_shrink - 0.3
elif max_shrink >= 4 and max_shrink < 5:
max_shrink = max_shrink - 0.4
elif max_shrink >= 5:
max_shrink = max_shrink - 0.5
print 'max_shrink = ', max_shrink
shrink = max_shrink if max_shrink < 1 else 1
print "shrink = ", shrink
return shrink, max_shrink
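# Illustrative numeric walk-through (added, not part of the original file),
# assuming a 1024x768 input, i.e. image_shape = [3, 768, 1024]:
#   max_shrink_v1 = (0x7fffffff / 577.0 / 786432) ** 0.5 ~= 2.175
#   max_shrink_v2 = ((678 * 1024 * 2.0 * 2.0) / 786432) ** 0.5 ~= 1.879
#   min(v1, v2) ~= 1.879 -> get_round(..., 2) = 1.87 -> 1.87 - 0.3 = 1.57
#   1.5 <= 1.57 < 2, so another 0.1 is subtracted: max_shrink = 1.47
#   shrink = 1, since max_shrink >= 1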
def infer(args, batch_size, data_args):
if not os.path.exists(args.model_dir):
raise ValueError("The model path [%s] does not exist." %
(args.model_dir))
infer_reader = paddle.batch(
reader.test(data_args, file_list), batch_size=batch_size)
for batch_id, img in enumerate(infer_reader()):
image = img[0][0]
image_path = img[0][1]
# image.size: [width, height]
image_shape = [3, image.size[1], image.size[0]]
shrink, max_shrink = get_im_shrink(image_shape)
det0 = detect_face(image, shrink)
det1 = flip_test(image, shrink)
[det2, det3] = multi_scale_test(image, max_shrink)
det = np.row_stack((det0, det1, det2, det3))
dets = bbox_vote(det)
image_name = image_path.split('/')[-1]
image_class = image_path.split('/')[-2]
if not os.path.exists('./infer_results/' + image_class.encode('utf-8')):
os.makedirs('./infer_results/' + image_class.encode('utf-8'))
f = open('./infer_results/' + image_class.encode('utf-8') + '/' +
image_name.encode('utf-8')[:-4] + '.txt', 'w')
write_to_txt(image_path, f, dets)
# draw_bounding_box_on_image(image_path, dets, args.confs_threshold)
print "Done"
if __name__ == '__main__':
args = parser.parse_args()
print_arguments(args)
data_dir = 'data/WIDERFACE/WIDER_val/images/'
file_list = 'label/val_gt_widerface.res'
data_args = reader.Settings(
data_dir=data_dir,
mean_value=[104., 117., 123],
apply_distort=False,
apply_expand=False,
ap_version='11point')
infer(args, batch_size=1, data_args=data_args)
import numpy as np
import paddle.fluid as fluid
from paddle.fluid.param_attr import ParamAttr
from paddle.fluid.initializer import Xavier
from paddle.fluid.initializer import Constant
from paddle.fluid.initializer import Bilinear
from paddle.fluid.regularizer import L2Decay
def conv_bn(input, filter, ksize, stride, padding, act='relu', bias_attr=False):
conv = fluid.layers.conv2d(
input=input,
filter_size=ksize,
num_filters=filter,
stride=stride,
padding=padding,
act=None,
bias_attr=bias_attr)
return fluid.layers.batch_norm(input=conv, act=act)
def conv_block(input, groups, filters, ksizes, strides=None, with_pool=True):
assert len(filters) == groups
assert len(ksizes) == groups
strides = [1] * groups if strides is None else strides
w_attr = ParamAttr(learning_rate=1., initializer=Xavier())
b_attr = ParamAttr(learning_rate=2., regularizer=L2Decay(0.))
conv = input
for i in xrange(groups):
conv = fluid.layers.conv2d(
input=conv,
num_filters=filters[i],
filter_size=ksizes[i],
stride=strides[i],
padding=(ksizes[i] - 1) / 2,
param_attr=w_attr,
bias_attr=b_attr,
act='relu')
if with_pool:
pool = fluid.layers.pool2d(
input=conv,
pool_size=2,
pool_type='max',
pool_stride=2,
ceil_mode=True)
return conv, pool
else:
return conv
class PyramidBox(object):
def __init__(self,
data_shape,
num_classes,
use_transposed_conv2d=True,
is_infer=False,
sub_network=False):
"""
TODO(qingqing): add comments.
"""
self.data_shape = data_shape
self.min_sizes = [16., 32., 64., 128., 256., 512.]
self.steps = [4., 8., 16., 32., 64., 128.]
self.num_classes = num_classes
self.use_transposed_conv2d = use_transposed_conv2d
self.is_infer = is_infer
self.sub_network = sub_network
# the base network is VGG with atrous layers
self._input()
self._vgg()
if sub_network:
self._low_level_fpn()
self._cpm_module()
self._pyramidbox()
else:
self._vgg_ssd()
def feeds(self):
if self.is_infer:
return [self.image]
else:
return [
self.image, self.face_box, self.head_box, self.gt_label,
self.difficult
]
def _input(self):
self.image = fluid.layers.data(
name='image', shape=self.data_shape, dtype='float32')
if not self.is_infer:
self.face_box = fluid.layers.data(
name='face_box', shape=[4], dtype='float32', lod_level=1)
self.head_box = fluid.layers.data(
name='head_box', shape=[4], dtype='float32', lod_level=1)
self.gt_label = fluid.layers.data(
name='gt_label', shape=[1], dtype='int32', lod_level=1)
self.difficult = fluid.layers.data(
name='gt_difficult', shape=[1], dtype='int32', lod_level=1)
def _vgg(self):
self.conv1, self.pool1 = conv_block(self.image, 2, [64] * 2, [3] * 2)
self.conv2, self.pool2 = conv_block(self.pool1, 2, [128] * 2, [3] * 2)
#priorbox min_size is 16
self.conv3, self.pool3 = conv_block(self.pool2, 3, [256] * 3, [3] * 3)
#priorbox min_size is 32
self.conv4, self.pool4 = conv_block(self.pool3, 3, [512] * 3, [3] * 3)
#priorbox min_size is 64
self.conv5, self.pool5 = conv_block(self.pool4, 3, [512] * 3, [3] * 3)
# fc6 and fc7 in paper, priorbox min_size is 128
self.conv6 = conv_block(
self.pool5, 2, [1024, 1024], [3, 1], with_pool=False)
# conv6_1 and conv6_2 in paper, priorbox min_size is 256
self.conv7 = conv_block(
self.conv6, 2, [256, 512], [1, 3], [1, 2], with_pool=False)
        # conv7_1 and conv7_2 in paper, priorbox min_size is 512
self.conv8 = conv_block(
self.conv7, 2, [128, 256], [1, 3], [1, 2], with_pool=False)
def _low_level_fpn(self):
"""
Low-level feature pyramid network.
"""
def fpn(up_from, up_to):
ch = up_to.shape[1]
b_attr = ParamAttr(learning_rate=2., regularizer=L2Decay(0.))
conv1 = fluid.layers.conv2d(
up_from, ch, 1, act='relu', bias_attr=b_attr)
if self.use_transposed_conv2d:
w_attr = ParamAttr(
learning_rate=0.,
regularizer=L2Decay(0.),
initializer=Bilinear())
upsampling = fluid.layers.conv2d_transpose(
conv1,
ch,
output_size=None,
filter_size=4,
padding=1,
stride=2,
groups=ch,
param_attr=w_attr,
bias_attr=False)
else:
upsampling = fluid.layers.resize_bilinear(
conv1, out_shape=up_to.shape[2:])
b_attr = ParamAttr(learning_rate=2., regularizer=L2Decay(0.))
conv2 = fluid.layers.conv2d(
up_to, ch, 1, act='relu', bias_attr=b_attr)
if self.is_infer:
upsampling = fluid.layers.crop(upsampling, shape=conv2)
# eltwise mul
conv_fuse = upsampling * conv2
return conv_fuse
self.lfpn2_on_conv5 = fpn(self.conv6, self.conv5)
self.lfpn1_on_conv4 = fpn(self.lfpn2_on_conv5, self.conv4)
self.lfpn0_on_conv3 = fpn(self.lfpn1_on_conv4, self.conv3)
def _cpm_module(self):
"""
Context-sensitive Prediction Module
"""
def cpm(input):
# residual
branch1 = conv_bn(input, 1024, 1, 1, 0, None)
branch2a = conv_bn(input, 256, 1, 1, 0, act='relu')
branch2b = conv_bn(branch2a, 256, 3, 1, 1, act='relu')
branch2c = conv_bn(branch2b, 1024, 1, 1, 0, None)
sum = branch1 + branch2c
rescomb = fluid.layers.relu(x=sum)
# ssh
b_attr = ParamAttr(learning_rate=2., regularizer=L2Decay(0.))
ssh_1 = fluid.layers.conv2d(rescomb, 256, 3, 1, 1, bias_attr=b_attr)
ssh_dimred = fluid.layers.conv2d(
rescomb, 128, 3, 1, 1, act='relu', bias_attr=b_attr)
ssh_2 = fluid.layers.conv2d(
ssh_dimred, 128, 3, 1, 1, bias_attr=b_attr)
ssh_3a = fluid.layers.conv2d(
ssh_dimred, 128, 3, 1, 1, act='relu', bias_attr=b_attr)
ssh_3b = fluid.layers.conv2d(ssh_3a, 128, 3, 1, 1, bias_attr=b_attr)
ssh_concat = fluid.layers.concat([ssh_1, ssh_2, ssh_3b], axis=1)
ssh_out = fluid.layers.relu(x=ssh_concat)
return ssh_out
self.ssh_conv3 = cpm(self.lfpn0_on_conv3)
self.ssh_conv4 = cpm(self.lfpn1_on_conv4)
self.ssh_conv5 = cpm(self.lfpn2_on_conv5)
self.ssh_conv6 = cpm(self.conv6)
self.ssh_conv7 = cpm(self.conv7)
self.ssh_conv8 = cpm(self.conv8)
def _l2_norm_scale(self, input, init_scale=1.0, channel_shared=False):
from paddle.fluid.layer_helper import LayerHelper
helper = LayerHelper("Scale")
l2_norm = fluid.layers.l2_normalize(
input, axis=1) # l2 norm along channel
shape = [1] if channel_shared else [input.shape[1]]
scale = helper.create_parameter(
attr=helper.param_attr,
shape=shape,
dtype=input.dtype,
default_initializer=Constant(init_scale))
out = fluid.layers.elementwise_mul(
x=l2_norm, y=scale, axis=-1 if channel_shared else 1)
return out
def _pyramidbox(self):
"""
Get prior-boxes and pyramid-box
"""
self.ssh_conv3_norm = self._l2_norm_scale(
self.ssh_conv3, init_scale=10.)
self.ssh_conv4_norm = self._l2_norm_scale(self.ssh_conv4, init_scale=8.)
self.ssh_conv5_norm = self._l2_norm_scale(self.ssh_conv5, init_scale=5.)
def permute_and_reshape(input, last_dim):
trans = fluid.layers.transpose(input, perm=[0, 2, 3, 1])
new_shape = [
trans.shape[0], np.prod(trans.shape[1:]) / last_dim, last_dim
]
return fluid.layers.reshape(trans, shape=new_shape)
face_locs, face_confs = [], []
head_locs, head_confs = [], []
boxes, vars = [], []
inputs = [
self.ssh_conv3_norm, self.ssh_conv4_norm, self.ssh_conv5_norm,
self.ssh_conv6, self.ssh_conv7, self.ssh_conv8
]
b_attr = ParamAttr(learning_rate=2., regularizer=L2Decay(0.))
for i, input in enumerate(inputs):
mbox_loc = fluid.layers.conv2d(input, 8, 3, 1, 1, bias_attr=b_attr)
face_loc, head_loc = fluid.layers.split(
mbox_loc, num_or_sections=2, dim=1)
face_loc = permute_and_reshape(face_loc, 4)
head_loc = permute_and_reshape(head_loc, 4)
mbox_conf = fluid.layers.conv2d(input, 6, 3, 1, 1, bias_attr=b_attr)
face_conf1, face_conf3, head_conf = fluid.layers.split(
mbox_conf, num_or_sections=[1, 3, 2], dim=1)
face_conf3_maxin = fluid.layers.reduce_max(
face_conf3, dim=1, keep_dim=True)
face_conf = fluid.layers.concat(
[face_conf1, face_conf3_maxin], axis=1)
face_conf = permute_and_reshape(face_conf, 2)
head_conf = permute_and_reshape(head_conf, 2)
face_locs.append(face_loc)
face_confs.append(face_conf)
head_locs.append(head_loc)
head_confs.append(head_conf)
box, var = fluid.layers.prior_box(
input,
self.image,
min_sizes=[self.min_sizes[i]],
steps=[self.steps[i]] * 2,
aspect_ratios=[1.],
clip=False,
flip=True,
offset=0.5)
box = fluid.layers.reshape(box, shape=[-1, 4])
var = fluid.layers.reshape(var, shape=[-1, 4])
boxes.append(box)
vars.append(var)
self.face_mbox_loc = fluid.layers.concat(face_locs, axis=1)
self.face_mbox_conf = fluid.layers.concat(face_confs, axis=1)
self.head_mbox_loc = fluid.layers.concat(head_locs, axis=1)
self.head_mbox_conf = fluid.layers.concat(head_confs, axis=1)
self.prior_boxes = fluid.layers.concat(boxes)
self.box_vars = fluid.layers.concat(vars)
def _vgg_ssd(self):
self.conv3_norm = self._l2_norm_scale(self.conv3, init_scale=10.)
self.conv4_norm = self._l2_norm_scale(self.conv4, init_scale=8.)
self.conv5_norm = self._l2_norm_scale(self.conv5, init_scale=5.)
def permute_and_reshape(input, last_dim):
trans = fluid.layers.transpose(input, perm=[0, 2, 3, 1])
new_shape = [
trans.shape[0], np.prod(trans.shape[1:]) / last_dim, last_dim
]
return fluid.layers.reshape(trans, shape=new_shape)
locs, confs = [], []
boxes, vars = [], []
b_attr = ParamAttr(learning_rate=2., regularizer=L2Decay(0.))
# conv3
mbox_loc = fluid.layers.conv2d(
self.conv3_norm, 4, 3, 1, 1, bias_attr=b_attr)
loc = permute_and_reshape(mbox_loc, 4)
mbox_conf = fluid.layers.conv2d(
self.conv3_norm, 4, 3, 1, 1, bias_attr=b_attr)
conf1, conf3 = fluid.layers.split(
mbox_conf, num_or_sections=[1, 3], dim=1)
conf3_maxin = fluid.layers.reduce_max(conf3, dim=1, keep_dim=True)
conf = fluid.layers.concat([conf1, conf3_maxin], axis=1)
conf = permute_and_reshape(conf, 2)
box, var = fluid.layers.prior_box(
self.conv3_norm,
self.image,
min_sizes=[16.],
steps=[4, 4],
aspect_ratios=[1.],
clip=False,
flip=True,
offset=0.5)
box = fluid.layers.reshape(box, shape=[-1, 4])
var = fluid.layers.reshape(var, shape=[-1, 4])
locs.append(loc)
confs.append(conf)
boxes.append(box)
vars.append(var)
min_sizes = [32., 64., 128., 256., 512.]
steps = [8., 16., 32., 64., 128.]
inputs = [
self.conv4_norm, self.conv5_norm, self.conv6, self.conv7, self.conv8
]
for i, input in enumerate(inputs):
mbox_loc = fluid.layers.conv2d(input, 4, 3, 1, 1, bias_attr=b_attr)
loc = permute_and_reshape(mbox_loc, 4)
mbox_conf = fluid.layers.conv2d(input, 2, 3, 1, 1, bias_attr=b_attr)
conf = permute_and_reshape(mbox_conf, 2)
box, var = fluid.layers.prior_box(
input,
self.image,
min_sizes=[min_sizes[i]],
steps=[steps[i]] * 2,
aspect_ratios=[1.],
clip=False,
flip=True,
offset=0.5)
box = fluid.layers.reshape(box, shape=[-1, 4])
var = fluid.layers.reshape(var, shape=[-1, 4])
locs.append(loc)
confs.append(conf)
boxes.append(box)
vars.append(var)
self.face_mbox_loc = fluid.layers.concat(locs, axis=1)
self.face_mbox_conf = fluid.layers.concat(confs, axis=1)
self.prior_boxes = fluid.layers.concat(boxes)
self.box_vars = fluid.layers.concat(vars)
def vgg_ssd_loss(self):
loss = fluid.layers.ssd_loss(
self.face_mbox_loc,
self.face_mbox_conf,
self.face_box,
self.gt_label,
self.prior_boxes,
self.box_vars,
overlap_threshold=0.35,
neg_overlap=0.35)
loss = fluid.layers.reduce_sum(loss)
return loss
def train(self):
face_loss = fluid.layers.ssd_loss(
self.face_mbox_loc,
self.face_mbox_conf,
self.face_box,
self.gt_label,
self.prior_boxes,
self.box_vars,
overlap_threshold=0.35,
neg_overlap=0.35)
head_loss = fluid.layers.ssd_loss(
self.head_mbox_loc,
self.head_mbox_conf,
self.head_box,
self.gt_label,
self.prior_boxes,
self.box_vars,
overlap_threshold=0.35,
neg_overlap=0.35)
face_loss = fluid.layers.reduce_sum(face_loss)
head_loss = fluid.layers.reduce_sum(head_loss)
total_loss = face_loss + head_loss
return face_loss, head_loss, total_loss
def infer(self, main_program=None):
if main_program is None:
test_program = fluid.default_main_program().clone(for_test=True)
else:
test_program = main_program.clone(for_test=True)
with fluid.program_guard(test_program):
face_nmsed_out = fluid.layers.detection_output(
self.face_mbox_loc,
self.face_mbox_conf,
self.prior_boxes,
self.box_vars,
nms_threshold=0.45)
return test_program, face_nmsed_out
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import image_util
from paddle.utils.image_util import *
import random
from PIL import Image
from PIL import ImageDraw
import numpy as np
import xml.etree.ElementTree
import os
import time
import copy
class Settings(object):
def __init__(self,
dataset=None,
data_dir=None,
label_file=None,
resize_h=None,
resize_w=None,
mean_value=[104., 117., 123.],
apply_distort=True,
apply_expand=True,
ap_version='11point',
toy=0):
self.dataset = dataset
self.ap_version = ap_version
self.toy = toy
self.data_dir = data_dir
self.apply_distort = apply_distort
self.apply_expand = apply_expand
self.resize_height = resize_h
self.resize_width = resize_w
self.img_mean = np.array(mean_value)[:, np.newaxis, np.newaxis].astype(
'float32')
self.expand_prob = 0.5
self.expand_max_ratio = 4
self.hue_prob = 0.5
self.hue_delta = 18
self.contrast_prob = 0.5
self.contrast_delta = 0.5
self.saturation_prob = 0.5
self.saturation_delta = 0.5
self.brightness_prob = 0.5
        # brightness_delta is 32 normalized by 256: 32 / 256 = 0.125
        self.brightness_delta = 0.125
self.scale = 0.007843 # 1 / 127.5
self.data_anchor_sampling_prob = 0.5
def preprocess(img, bbox_labels, mode, settings):
img_width, img_height = img.size
sampled_labels = bbox_labels
if mode == 'train':
if settings.apply_distort:
img = image_util.distort_image(img, settings)
if settings.apply_expand:
img, bbox_labels, img_width, img_height = image_util.expand_image(
img, bbox_labels, img_width, img_height, settings)
# sampling
batch_sampler = []
prob = random.uniform(0., 1.)
if prob > settings.data_anchor_sampling_prob:
scale_array = np.array([16, 32, 64, 128, 256, 512])
batch_sampler.append(
image_util.sampler(1, 10, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.2,
0.0, True))
sampled_bbox = image_util.generate_batch_random_samples(
batch_sampler, bbox_labels, img_width, img_height, scale_array,
settings.resize_width, settings.resize_height)
img = np.array(img)
if len(sampled_bbox) > 0:
idx = int(random.uniform(0, len(sampled_bbox)))
                img, sampled_labels = image_util.crop_image_sampling(
                    img, bbox_labels, sampled_bbox[idx], img_width, img_height,
                    settings.resize_width, settings.resize_height)
img = Image.fromarray(img)
else:
            # hard-coded sampler settings
batch_sampler.append(
image_util.sampler(1, 50, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0,
0.0, True))
batch_sampler.append(
image_util.sampler(1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0,
0.0, True))
batch_sampler.append(
image_util.sampler(1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0,
0.0, True))
batch_sampler.append(
image_util.sampler(1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0,
0.0, True))
batch_sampler.append(
image_util.sampler(1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0,
0.0, True))
sampled_bbox = image_util.generate_batch_samples(
batch_sampler, bbox_labels, img_width, img_height)
img = np.array(img)
if len(sampled_bbox) > 0:
idx = int(random.uniform(0, len(sampled_bbox)))
img, sampled_labels = image_util.crop_image(
img, bbox_labels, sampled_bbox[idx], img_width, img_height)
img = Image.fromarray(img)
img = img.resize((settings.resize_width, settings.resize_height),
Image.ANTIALIAS)
img = np.array(img)
if mode == 'train':
mirror = int(random.uniform(0, 2))
if mirror == 1:
img = img[:, ::-1, :]
for i in xrange(len(sampled_labels)):
tmp = sampled_labels[i][1]
sampled_labels[i][1] = 1 - sampled_labels[i][3]
sampled_labels[i][3] = 1 - tmp
# HWC to CHW
if len(img.shape) == 3:
img = np.swapaxes(img, 1, 2)
img = np.swapaxes(img, 1, 0)
    # RGB to BGR
img = img[[2, 1, 0], :, :]
img = img.astype('float32')
img -= settings.img_mean
img = img * settings.scale
return img, sampled_labels
def put_txt_in_dict(input_txt):
with open(input_txt, 'r') as f_dir:
lines_input_txt = f_dir.readlines()
dict_input_txt = {}
num_class = 0
for i in range(len(lines_input_txt)):
tmp_line_txt = lines_input_txt[i].strip('\n\t\r')
if '--' in tmp_line_txt:
if i != 0:
num_class += 1
dict_input_txt[num_class] = []
dict_name = tmp_line_txt
dict_input_txt[num_class].append(tmp_line_txt)
if '--' not in tmp_line_txt:
if len(tmp_line_txt) > 6:
split_str = tmp_line_txt.split(' ')
x1_min = float(split_str[0])
y1_min = float(split_str[1])
x2_max = float(split_str[2])
y2_max = float(split_str[3])
tmp_line_txt = str(x1_min) + ' ' + str(y1_min) + ' ' + str(
x2_max) + ' ' + str(y2_max)
dict_input_txt[num_class].append(tmp_line_txt)
else:
dict_input_txt[num_class].append(tmp_line_txt)
return dict_input_txt
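# Illustrative layout of the annotation file parsed above (assumed WIDER
# FACE style .res file, e.g. label/train_gt_widerface.res):
#   0--Parade/0_Parade_marchingband_1_849   <- image name, contains '--'
#   1                                       <- number of faces (short line)
#   449.0 330.0 122.0 149.0                 <- xmin ymin width height
# Lines containing '--' start a new image entry; lines longer than six
# characters are truncated to their first four numbers (x, y, w, h).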
def expand_bboxes(bboxes,
expand_left=2.,
expand_up=2.,
expand_right=2.,
expand_down=2.):
"""
    Expand bboxes, expanding 2 times by default.
"""
expand_boxes = []
for bbox in bboxes:
xmin = bbox[0]
ymin = bbox[1]
xmax = bbox[2]
ymax = bbox[3]
w = xmax - xmin
h = ymax - ymin
ex_xmin = max(xmin - w / expand_left, 0.)
ex_ymin = max(ymin - h / expand_up, 0.)
ex_xmax = min(xmax + w / expand_right, 1.)
ex_ymax = min(ymax + h / expand_down, 1.)
expand_boxes.append([ex_xmin, ex_ymin, ex_xmax, ex_ymax])
return expand_boxes
def pyramidbox(settings, file_list, mode, shuffle):
dict_input_txt = {}
dict_input_txt = put_txt_in_dict(file_list)
def reader():
if mode == 'train' and shuffle:
            # dict keys are the consecutive ints 0..n-1, so shuffle swaps values
            random.shuffle(dict_input_txt)
for index_image in range(len(dict_input_txt)):
image_name = dict_input_txt[index_image][0] + '.jpg'
image_path = os.path.join(settings.data_dir, image_name)
im = Image.open(image_path)
if im.mode == 'L':
im = im.convert('RGB')
im_width, im_height = im.size
# layout: label | xmin | ymin | xmax | ymax
if mode == 'train':
bbox_labels = []
for index_box in range(len(dict_input_txt[index_image])):
if index_box >= 2:
bbox_sample = []
temp_info_box = dict_input_txt[index_image][
index_box].split(' ')
xmin = float(temp_info_box[0])
ymin = float(temp_info_box[1])
w = float(temp_info_box[2])
h = float(temp_info_box[3])
xmax = xmin + w
ymax = ymin + h
bbox_sample.append(1)
bbox_sample.append(float(xmin) / im_width)
bbox_sample.append(float(ymin) / im_height)
bbox_sample.append(float(xmax) / im_width)
bbox_sample.append(float(ymax) / im_height)
bbox_labels.append(bbox_sample)
im, sample_labels = preprocess(im, bbox_labels, mode, settings)
sample_labels = np.array(sample_labels)
if len(sample_labels) == 0: continue
im = im.astype('float32')
boxes = sample_labels[:, 1:5]
lbls = [1] * len(boxes)
difficults = [1] * len(boxes)
yield im, boxes, expand_bboxes(boxes), lbls, difficults
if mode == 'test':
yield im, image_path
return reader
def train(settings, file_list, shuffle=True):
return pyramidbox(settings, file_list, 'train', shuffle)
def test(settings, file_list):
return pyramidbox(settings, file_list, 'test', False)
def infer(settings, image_path):
def batch_reader():
img = Image.open(image_path)
if img.mode == 'L':
            img = img.convert('RGB')
im_width, im_height = img.size
if settings.resize_width and settings.resize_height:
img = img.resize((settings.resize_width, settings.resize_height),
Image.ANTIALIAS)
img = np.array(img)
# HWC to CHW
if len(img.shape) == 3:
img = np.swapaxes(img, 1, 2)
img = np.swapaxes(img, 1, 0)
# RBG to BGR
img = img[[2, 1, 0], :, :]
img = img.astype('float32')
img -= settings.img_mean
img = img * settings.scale
return np.array([img])
return batch_reader
import os
import shutil
import numpy as np
import time
import argparse
import functools
import reader
import paddle
import paddle.fluid as fluid
from pyramidbox import PyramidBox
from utility import add_arguments, print_arguments
parser = argparse.ArgumentParser(description=__doc__)
add_arg = functools.partial(add_arguments, argparser=parser)
# yapf: disable
add_arg('parallel', bool, True, "Whether to use ParallelExecutor for training.")
add_arg('learning_rate', float, 0.001, "Learning rate.")
add_arg('batch_size', int, 12, "Minibatch size.")
add_arg('num_passes', int, 120, "Epoch number.")
add_arg('use_gpu', bool, True, "Whether use GPU.")
add_arg('use_pyramidbox', bool, True, "Whether use PyramidBox model.")
add_arg('model_save_dir', str, 'output', "The path to save model.")
add_arg('pretrained_model', str, './pretrained/', "The init model path.")
add_arg('resize_h', int, 640, "The resized image height.")
add_arg('resize_w', int, 640, "The resized image width.")
# yapf: enable
def train(args, config, train_file_list, optimizer_method):
learning_rate = args.learning_rate
batch_size = args.batch_size
num_passes = args.num_passes
height = args.resize_h
width = args.resize_w
use_gpu = args.use_gpu
use_pyramidbox = args.use_pyramidbox
model_save_dir = args.model_save_dir
pretrained_model = args.pretrained_model
num_classes = 2
image_shape = [3, height, width]
devices = os.getenv("CUDA_VISIBLE_DEVICES") or ""
devices_num = len(devices.split(","))
fetches = []
network = PyramidBox(image_shape, num_classes,
sub_network=use_pyramidbox)
if use_pyramidbox:
face_loss, head_loss, loss = network.train()
fetches = [face_loss, head_loss]
else:
loss = network.vgg_ssd_loss()
fetches = [loss]
    epocs = 12880 / batch_size  # 12880 images in the WIDER FACE training set
boundaries = [epocs * 40, epocs * 60, epocs * 80, epocs * 100]
values = [
learning_rate, learning_rate * 0.5, learning_rate * 0.25,
learning_rate * 0.1, learning_rate * 0.01
]
if optimizer_method == "momentum":
optimizer = fluid.optimizer.Momentum(
learning_rate=fluid.layers.piecewise_decay(
boundaries=boundaries, values=values),
momentum=0.9,
regularization=fluid.regularizer.L2Decay(0.0005),
)
else:
optimizer = fluid.optimizer.RMSProp(
learning_rate=fluid.layers.piecewise_decay(boundaries, values),
regularization=fluid.regularizer.L2Decay(0.0005),
)
optimizer.minimize(loss)
#fluid.memory_optimize(fluid.default_main_program())
place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
start_pass = 0
if pretrained_model:
if pretrained_model.isdigit():
start_pass = int(pretrained_model) + 1
pretrained_model = os.path.join(model_save_dir, pretrained_model)
print("Resume from %s " %(pretrained_model))
if not os.path.exists(pretrained_model):
raise ValueError("The pre-trained model path [%s] does not exist." %
(pretrained_model))
def if_exist(var):
return os.path.exists(os.path.join(pretrained_model, var.name))
fluid.io.load_vars(exe, pretrained_model, predicate=if_exist)
if args.parallel:
train_exe = fluid.ParallelExecutor(
use_cuda=use_gpu, loss_name=loss.name)
train_reader = paddle.batch(
reader.train(config, train_file_list), batch_size=batch_size)
feeder = fluid.DataFeeder(place=place, feed_list=network.feeds())
def save_model(postfix):
model_path = os.path.join(model_save_dir, postfix)
if os.path.isdir(model_path):
shutil.rmtree(model_path)
print 'save models to %s' % (model_path)
fluid.io.save_persistables(exe, model_path)
for pass_id in range(start_pass, num_passes):
start_time = time.time()
prev_start_time = start_time
end_time = 0
for batch_id, data in enumerate(train_reader()):
prev_start_time = start_time
start_time = time.time()
if len(data) < 2 * devices_num: continue
if args.parallel:
fetch_vars = train_exe.run(fetch_list=[v.name for v in fetches],
feed=feeder.feed(data))
else:
fetch_vars = exe.run(fluid.default_main_program(),
feed=feeder.feed(data),
fetch_list=fetches)
end_time = time.time()
fetch_vars = [np.mean(np.array(v)) for v in fetch_vars]
if batch_id % 1 == 0:
if not args.use_pyramidbox:
print("Pass {0}, batch {1}, loss {2}, time {3}".format(
pass_id, batch_id, fetch_vars[0],
start_time - prev_start_time))
else:
print("Pass {0}, batch {1}, face loss {2}, head loss {3}, " \
"time {4}".format(pass_id,
batch_id, fetch_vars[0], fetch_vars[1],
start_time - prev_start_time))
if pass_id % 1 == 0 or pass_id == num_passes - 1:
save_model(str(pass_id))
if __name__ == '__main__':
args = parser.parse_args()
print_arguments(args)
data_dir = 'data/WIDERFACE/WIDER_train/images/'
train_file_list = 'label/train_gt_widerface.res'
config = reader.Settings(
data_dir=data_dir,
resize_h=args.resize_h,
resize_w=args.resize_w,
apply_expand=False,
mean_value=[104., 117., 123],
ap_version='11point')
train(args, config, train_file_list, optimizer_method="momentum")
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import distutils.util
def print_arguments(args):
"""Print argparse's arguments.
Usage:
.. code-block:: python
parser = argparse.ArgumentParser()
parser.add_argument("name", default="Jonh", type=str, help="User name.")
args = parser.parse_args()
print_arguments(args)
:param args: Input argparse.Namespace for printing.
:type args: argparse.Namespace
"""
print("----------- Configuration Arguments -----------")
for arg, value in sorted(vars(args).iteritems()):
print("%s: %s" % (arg, value))
print("------------------------------------------------")
def add_arguments(argname, type, default, help, argparser, **kwargs):
"""Add argparse's argument.
Usage:
.. code-block:: python
parser = argparse.ArgumentParser()
add_argument("name", str, "Jonh", "User name.", parser)
args = parser.parse_args()
"""
type = distutils.util.strtobool if type == bool else type
argparser.add_argument(
"--" + argname,
default=default,
type=type,
help=help + ' Default: %(default)s.',
**kwargs)
Running the program samples in this directory requires the latest develop version of PaddlePaddle. If your installed version of PaddlePaddle is lower than this requirement, please update your installation following the instructions in the [installation documentation](http://www.paddlepaddle.org/docs/develop/documentation/zh/build_and_install/pip_install_cn.html).
## Code structure
```
├── network.py    # network architecture definition
├── train.py      # training script
├── eval.py       # evaluation script
├── infer.py      # inference script
├── cityscape.py  # data preprocessing script
└── utils.py      # common utility functions
```
## Introduction
Image Cascade Network (ICNet) is designed for real-time semantic segmentation of images. Compared with other approaches that compress computation, ICNet takes both speed and accuracy into account.
The main idea of ICNet is to transform the input image into several resolutions, process each resolution with a sub-network of matching computational complexity, and then merge the results. ICNet consists of three sub-networks: the computationally heavy network processes the low-resolution input and the lightweight network processes the high-resolution input, striking a balance between the accuracy available from high-resolution images and the efficiency of low-complexity networks. A minimal sketch of this cascade is given after the figure below.
The overall network structure is shown below:
<p align="center">
<img src="images/icnet.png" width="620" hspace='10'/> <br/>
<strong>Figure 1</strong>
</p>
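A minimal sketch of the cascade idea, assuming the fixed `(3, 720, 720)` training shape defined in `cityscape.py`; the real definition lives in `network.py`, and the helper name `cascade_inputs` is hypothetical:

```python
import paddle.fluid as fluid

# Hypothetical helper sketching the multi-resolution cascade; this is not
# the actual icnet() implementation from network.py.
def cascade_inputs(image):
    # branch inputs at 1/2 and 1/4 of the original 720x720 resolution
    img_sub2 = fluid.layers.resize_bilinear(image, out_shape=[360, 360])
    img_sub4 = fluid.layers.resize_bilinear(img_sub2, out_shape=[180, 180])
    # the deepest sub-network runs on img_sub4, a medium one on img_sub2,
    # and a shallow one on the full image; features are fused coarse-to-fine
    return img_sub2, img_sub4

image = fluid.layers.data(name='image', shape=[3, 720, 720], dtype='float32')
img_sub2, img_sub4 = cascade_inputs(image)
```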
## Data preparation
This example uses the Cityscape dataset. Please register at the [Cityscape website](https://www.cityscapes-dataset.com) to download it. After downloading, process the data with the instructions and tools given [here](https://github.com/mcordts/cityscapesScripts/blob/master/cityscapesscripts/preparation/createTrainIdLabelImgs.py#L3).
The processed data is organized as follows:
```
data/cityscape/
|-- gtFine
| |-- test
| |-- train
| `-- val
|-- leftImg8bit
| |-- test
| |-- train
| `-- val
|-- train.list
`-- val.list
```
Here, train.list and val.list are the list files used for training and testing respectively; the first column is the input image and the second column is the annotation, separated by a space. For example:
```
leftImg8bit/train/stuttgart/stuttgart_000021_000019_leftImg8bit.png gtFine/train/stuttgart/stuttgart_000021_000019_gtFine_labelTrainIds.png
leftImg8bit/train/stuttgart/stuttgart_000072_000019_leftImg8bit.png gtFine/train/stuttgart/stuttgart_000072_000019_gtFine_labelTrainIds.png
```
After downloading and preparing the data, update the corresponding data paths in the `cityscape.py` script; the relevant constants are shown below.
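For reference, these are the path constants defined at the top of `cityscape.py`:

```python
DATA_PATH = "./data/cityscape"
TRAIN_LIST = DATA_PATH + "/train.list"
TEST_LIST = DATA_PATH + "/val.list"
```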
## Model training and inference
### Training
Run the following command to start training, specifying the checkpoint save path at the same time:
```
python train.py --batch_size=16 --use_gpu=True --checkpoint_path="./chkpnt/"
```
Use the following command to get more usage information:
```
python train.py --help
```
During training, the `loss` of each network branch on the training set is printed according to the user's settings. For example:
```
Iter[0]; train loss: 2.338; sub4_loss: 3.367; sub24_loss: 4.120; sub124_loss: 0.151
```
### Testing
Run the following command to evaluate on the `Cityscape` test dataset:
```
python eval.py --model_path="./model/" --use_gpu=True
```
The model files must be specified with the `--model_path` option.
The evaluation metric reported by the test script is [mean IoU]().
### Inference
Run the following command to run inference on the specified data:
```
python infer.py \
--model_path="./model" \
--images_path="./data/cityscape/" \
--images_list="./data/cityscape/infer.list"
```
Use the `--images_list` option to specify the list file; each line of the list file is the path of one image to be predicted.
By default, the prediction results are saved to the `output` folder under the current path.
## Experimental results
Figure 2 shows the training loss curve on the `CityScape` training set:
<p align="center">
<img src="images/train_loss.png" width="620" hspace='10'/> <br/>
<strong>Figure 2</strong>
</p>
Training on the training set and evaluating on the validation set gives mean_IoU = 67.0% (67.7% in the paper).
Figure 3 shows sample results produced by the `infer.py` script: the first row is the original input image, the second row is the human annotation, and the third row is the output of our model.
<p align="center">
<img src="images/result.png" width="620" hspace='10'/> <br/>
<strong>Figure 3</strong>
</p>
## Other information
|Dataset | Pretrained model |
|---|---|
|CityScape | [Model]()[md: ] |
## References
- [ICNet for Real-Time Semantic Segmentation on High-Resolution Images](https://arxiv.org/abs/1704.08545)
"""Reader for Cityscape dataset.
"""
import os
import cv2
import numpy as np
import paddle.v2 as paddle
DATA_PATH = "./data/cityscape"
TRAIN_LIST = DATA_PATH + "/train.list"
TEST_LIST = DATA_PATH + "/val.list"
IGNORE_LABEL = 255
NUM_CLASSES = 19
TRAIN_DATA_SHAPE = (3, 720, 720)
TEST_DATA_SHAPE = (3, 1024, 2048)
IMG_MEAN = np.array((103.939, 116.779, 123.68), dtype=np.float32)
def train_data_shape():
return TRAIN_DATA_SHAPE
def test_data_shape():
return TEST_DATA_SHAPE
def num_classes():
return NUM_CLASSES
class DataGenerater:
def __init__(self, data_list, mode="train", flip=True, scaling=True):
self.flip = flip
self.scaling = scaling
self.image_label = []
with open(data_list, 'r') as f:
for line in f:
image_file, label_file = line.strip().split(' ')
self.image_label.append((image_file, label_file))
def create_train_reader(self, batch_size):
"""
Create a reader for train dataset.
"""
def reader():
np.random.shuffle(self.image_label)
images = []
labels_sub1 = []
labels_sub2 = []
labels_sub4 = []
count = 0
for image, label in self.image_label:
image, label_sub1, label_sub2, label_sub4 = self.process_train_data(
image, label)
count += 1
images.append(image)
labels_sub1.append(label_sub1)
labels_sub2.append(label_sub2)
labels_sub4.append(label_sub4)
if count == batch_size:
yield self.mask(
np.array(images),
np.array(labels_sub1),
np.array(labels_sub2), np.array(labels_sub4))
images = []
labels_sub1 = []
labels_sub2 = []
labels_sub4 = []
count = 0
if images:
yield self.mask(
np.array(images),
np.array(labels_sub1),
np.array(labels_sub2), np.array(labels_sub4))
return reader
def create_test_reader(self):
"""
Create a reader for test dataset.
"""
def reader():
for image, label in self.image_label:
image, label = self.load(image, label)
image = paddle.image.to_chw(image)[np.newaxis, :]
label = label[np.newaxis, :, :, np.newaxis].astype("float32")
label_mask = np.where((label != IGNORE_LABEL).flatten())[
0].astype("int32")
yield image, label, label_mask
return reader
def process_train_data(self, image, label):
"""
Process training data.
"""
image, label = self.load(image, label)
if self.flip:
image, label = self.random_flip(image, label)
if self.scaling:
image, label = self.random_scaling(image, label)
image, label = self.resize(image, label, out_size=TRAIN_DATA_SHAPE[1:])
label = label.astype("float32")
label_sub1 = paddle.image.to_chw(self.scale_label(label, factor=4))
label_sub2 = paddle.image.to_chw(self.scale_label(label, factor=8))
label_sub4 = paddle.image.to_chw(self.scale_label(label, factor=16))
image = paddle.image.to_chw(image)
return image, label_sub1, label_sub2, label_sub4
def load(self, image, label):
"""
Load image from file.
"""
image = paddle.image.load_image(
DATA_PATH + "/" + image, is_color=True).astype("float32")
image -= IMG_MEAN
label = paddle.image.load_image(
DATA_PATH + "/" + label, is_color=False).astype("float32")
return image, label
def random_flip(self, image, label):
"""
Flip image and label randomly.
"""
r = np.random.rand(1)
if r > 0.5:
image = paddle.image.left_right_flip(image, is_color=True)
label = paddle.image.left_right_flip(label, is_color=False)
return image, label
def random_scaling(self, image, label):
"""
Scale image and label randomly.
"""
scale = np.random.uniform(0.5, 2.0, 1)[0]
h_new = int(image.shape[0] * scale)
w_new = int(image.shape[1] * scale)
image = cv2.resize(image, (w_new, h_new))
label = cv2.resize(
label, (w_new, h_new), interpolation=cv2.INTER_NEAREST)
return image, label
def padding_as(self, image, h, w, is_color):
"""
Padding image.
"""
pad_h = max(image.shape[0], h) - image.shape[0]
pad_w = max(image.shape[1], w) - image.shape[1]
if is_color:
return np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), 'constant')
else:
return np.pad(image, ((0, pad_h), (0, pad_w)), 'constant')
def resize(self, image, label, out_size):
"""
Resize image and label by padding or cropping.
"""
ignore_label = IGNORE_LABEL
label = label - ignore_label
if len(label.shape) == 2:
label = label[:, :, np.newaxis]
combined = np.concatenate((image, label), axis=2)
combined = self.padding_as(
combined, out_size[0], out_size[1], is_color=True)
combined = paddle.image.random_crop(
combined, out_size[0], is_color=True)
image = combined[:, :, 0:3]
label = combined[:, :, 3:4] + ignore_label
return image, label
def scale_label(self, label, factor):
"""
Scale label according to factor.
"""
        h = label.shape[0] / factor
        w = label.shape[1] / factor
        # cv2.resize expects the destination size as (width, height)
        return cv2.resize(
            label, (w, h), interpolation=cv2.INTER_NEAREST)[:, :, np.newaxis]
def mask(self, image, label0, label1, label2):
"""
Get mask for valid pixels.
"""
mask_sub1 = np.where(((label0 < (NUM_CLASSES + 1)) & (
label0 != IGNORE_LABEL)).flatten())[0].astype("int32")
mask_sub2 = np.where(((label1 < (NUM_CLASSES + 1)) & (
label1 != IGNORE_LABEL)).flatten())[0].astype("int32")
mask_sub4 = np.where(((label2 < (NUM_CLASSES + 1)) & (
label2 != IGNORE_LABEL)).flatten())[0].astype("int32")
return image.astype(
"float32"), label0, mask_sub1, label1, mask_sub2, label2, mask_sub4
def train(batch_size=32, flip=True, scaling=True):
"""
Cityscape training set reader.
It returns a reader, in which each result is a batch with batch_size samples.
:param batch_size: The batch size of each result return by the reader.
:type batch_size: int
    :param flip: Whether to flip images randomly.
    :type flip: bool
    :param scaling: Whether to scale images randomly.
    :type scaling: bool
:return: Training reader.
:rtype: callable
"""
reader = DataGenerater(
TRAIN_LIST, flip=flip, scaling=scaling).create_train_reader(batch_size)
return reader
def test():
"""
Cityscape validation set reader.
It returns a reader, in which each result is a sample.
    :return: Test reader.
:rtype: callable
"""
reader = DataGenerater(TEST_LIST).create_test_reader()
return reader
def infer(image_list=TEST_LIST):
"""
Infer set reader.
It returns a reader, in which each result is a sample.
    :param image_list: The image list file in which each line is the path of an image to be inferred.
    :type image_list: str
:return: Infer reader.
:rtype: callable
"""
    reader = DataGenerater(image_list).create_test_reader()
    return reader
"""Evaluator for ICNet model."""
import paddle.fluid as fluid
import numpy as np
from utils import add_arguments, print_arguments, get_feeder_data
from paddle.fluid.layers.learning_rate_scheduler import _decay_step_counter
from paddle.fluid.initializer import init_on_cpu
from icnet import icnet
import cityscape
import argparse
import functools
import sys
import os
parser = argparse.ArgumentParser(description=__doc__)
add_arg = functools.partial(add_arguments, argparser=parser)
# yapf: disable
add_arg('model_path', str, None, "Model path.")
add_arg('use_gpu', bool, True, "Whether use GPU to test.")
# yapf: enable
def cal_mean_iou(wrong, correct):
    total = wrong + correct
    true_num = (total != 0).sum()
    for i in range(len(total)):
        if total[i] == 0:
            total[i] = 1  # avoid division by zero for classes never seen
    return (correct.astype("float64") / total).sum() / true_num
def create_iou(predict, label, mask, num_classes, image_shape):
predict = fluid.layers.resize_bilinear(predict, out_shape=image_shape[1:3])
predict = fluid.layers.transpose(predict, perm=[0, 2, 3, 1])
predict = fluid.layers.reshape(predict, shape=[-1, num_classes])
label = fluid.layers.reshape(label, shape=[-1, 1])
_, predict = fluid.layers.topk(predict, k=1)
predict = fluid.layers.cast(predict, dtype="float32")
predict = fluid.layers.gather(predict, mask)
label = fluid.layers.gather(label, mask)
label = fluid.layers.cast(label, dtype="int32")
predict = fluid.layers.cast(predict, dtype="int32")
iou, out_w, out_r = fluid.layers.mean_iou(predict, label, num_classes)
return iou, out_w, out_r
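# Note (added): fluid.layers.mean_iou returns the mean IoU of the current
# batch together with per-class wrong/correct pixel counts; eval() below
# accumulates those counts over the whole dataset and recomputes the final
# mean IoU with cal_mean_iou().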
def eval(args):
data_shape = cityscape.test_data_shape()
num_classes = cityscape.num_classes()
# define network
images = fluid.layers.data(name='image', shape=data_shape, dtype='float32')
label = fluid.layers.data(name='label', shape=[1], dtype='int32')
mask = fluid.layers.data(name='mask', shape=[-1], dtype='int32')
_, _, sub124_out = icnet(images, num_classes,
np.array(data_shape[1:]).astype("float32"))
iou, out_w, out_r = create_iou(sub124_out, label, mask, num_classes,
data_shape)
inference_program = fluid.default_main_program().clone(for_test=True)
# prepare environment
place = fluid.CPUPlace()
if args.use_gpu:
place = fluid.CUDAPlace(0)
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
assert os.path.exists(args.model_path)
fluid.io.load_params(exe, args.model_path)
print "loaded model from: %s" % args.model_path
sys.stdout.flush()
fetch_vars = [iou, out_w, out_r]
out_wrong = np.zeros([num_classes]).astype("int64")
out_right = np.zeros([num_classes]).astype("int64")
count = 0
test_reader = cityscape.test()
for data in test_reader():
count += 1
result = exe.run(inference_program,
feed=get_feeder_data(
data, place, for_test=True),
fetch_list=fetch_vars)
out_wrong += result[1]
out_right += result[2]
print "count: %s; current iou: %.3f;\r" % (count, result[0]),
sys.stdout.flush()
iou = cal_mean_iou(out_wrong, out_right)
print "\nmean iou: %.3f" % iou
def main():
args = parser.parse_args()
print_arguments(args)
eval(args)
if __name__ == "__main__":
main()
import paddle.fluid as fluid
import numpy as np
import sys
def conv(input,
k_h,
k_w,
c_o,
s_h,
s_w,
relu=False,
padding="VALID",
biased=False,
name=None):
act = None
tmp = input
if relu:
act = "relu"
if padding == "SAME":
padding_h = max(k_h - s_h, 0)
padding_w = max(k_w - s_w, 0)
padding_top = padding_h / 2
padding_left = padding_w / 2
padding_bottom = padding_h - padding_top
padding_right = padding_w - padding_left
padding = [
0, 0, 0, 0, padding_top, padding_bottom, padding_left, padding_right
]
tmp = fluid.layers.pad(tmp, padding)
tmp = fluid.layers.conv2d(
tmp,
num_filters=c_o,
filter_size=[k_h, k_w],
stride=[s_h, s_w],
groups=1,
act=act,
bias_attr=biased,
use_cudnn=False,
name=name)
return tmp
def atrous_conv(input,
k_h,
k_w,
c_o,
dilation,
relu=False,
padding="VALID",
biased=False,
name=None):
act = None
if relu:
act = "relu"
    tmp = input
    if padding == "SAME":
        # the atrous convs in this network always use stride 1, so "SAME"
        # padding depends on the effective (dilated) kernel size alone:
        # k_eff = k + (k - 1) * (dilation - 1)
        k_h_eff = k_h + (k_h - 1) * (dilation - 1)
        k_w_eff = k_w + (k_w - 1) * (dilation - 1)
        padding_h = max(k_h_eff - 1, 0)
        padding_w = max(k_w_eff - 1, 0)
        padding_top = padding_h / 2
        padding_left = padding_w / 2
        padding_bottom = padding_h - padding_top
        padding_right = padding_w - padding_left
        padding = [
            0, 0, 0, 0, padding_top, padding_bottom, padding_left, padding_right
        ]
        tmp = fluid.layers.pad(tmp, padding)
    tmp = fluid.layers.conv2d(
        tmp,
num_filters=c_o,
filter_size=[k_h, k_w],
dilation=dilation,
groups=1,
act=act,
bias_attr=biased,
use_cudnn=False,
name=name)
return tmp
def zero_padding(input, padding):
return fluid.layers.pad(input,
[0, 0, 0, 0, padding, padding, padding, padding])
def bn(input, relu=False, name=None, is_test=False):
act = None
if relu:
act = 'relu'
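    # note: any passed-in name is ignored; the bn layer name is derived
    # from the name of the input layer below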
name = input.name.split(".")[0] + "_bn"
tmp = fluid.layers.batch_norm(
input, act=act, momentum=0.95, epsilon=1e-5, name=name)
return tmp
def avg_pool(input, k_h, k_w, s_h, s_w, name=None, padding=0):
temp = fluid.layers.pool2d(
input,
pool_size=[k_h, k_w],
pool_type="avg",
pool_stride=[s_h, s_w],
pool_padding=padding,
name=name)
return temp
def max_pool(input, k_h, k_w, s_h, s_w, name=None, padding=0):
temp = fluid.layers.pool2d(
input,
pool_size=[k_h, k_w],
pool_type="max",
pool_stride=[s_h, s_w],
pool_padding=padding,
name=name)
return temp
def interp(input, out_shape):
out_shape = list(out_shape.astype("int32"))
return fluid.layers.resize_bilinear(input, out_shape=out_shape)
def dilation_convs(input):
tmp = res_block(input, filter_num=256, padding=1, name="conv3_2")
tmp = res_block(tmp, filter_num=256, padding=1, name="conv3_3")
tmp = res_block(tmp, filter_num=256, padding=1, name="conv3_4")
tmp = proj_block(tmp, filter_num=512, padding=2, dilation=2, name="conv4_1")
tmp = res_block(tmp, filter_num=512, padding=2, dilation=2, name="conv4_2")
tmp = res_block(tmp, filter_num=512, padding=2, dilation=2, name="conv4_3")
tmp = res_block(tmp, filter_num=512, padding=2, dilation=2, name="conv4_4")
tmp = res_block(tmp, filter_num=512, padding=2, dilation=2, name="conv4_5")
tmp = res_block(tmp, filter_num=512, padding=2, dilation=2, name="conv4_6")
tmp = proj_block(
tmp, filter_num=1024, padding=4, dilation=4, name="conv5_1")
tmp = res_block(tmp, filter_num=1024, padding=4, dilation=4, name="conv5_2")
tmp = res_block(tmp, filter_num=1024, padding=4, dilation=4, name="conv5_3")
return tmp
def pyramis_pooling(input, input_shape):
shape = np.ceil(input_shape / 32).astype("int32")
h, w = shape
pool1 = avg_pool(input, h, w, h, w)
pool1_interp = interp(pool1, shape)
pool2 = avg_pool(input, h / 2, w / 2, h / 2, w / 2)
pool2_interp = interp(pool2, shape)
pool3 = avg_pool(input, h / 3, w / 3, h / 3, w / 3)
pool3_interp = interp(pool3, shape)
pool4 = avg_pool(input, h / 4, w / 4, h / 4, w / 4)
pool4_interp = interp(pool4, shape)
conv5_3_sum = input + pool4_interp + pool3_interp + pool2_interp + pool1_interp
return conv5_3_sum
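# Example (illustrative): for a 1024x2048 input the conv5_3 feature map is
# 32x64; the four average pools span the whole map, halves, thirds and
# quarters, and each is bilinearly resized back to 32x64 before summation.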
def shared_convs(image):
tmp = conv(image, 3, 3, 32, 2, 2, padding='SAME', name="conv1_1_3_3_s2")
tmp = bn(tmp, relu=True)
tmp = conv(tmp, 3, 3, 32, 1, 1, padding='SAME', name="conv1_2_3_3")
tmp = bn(tmp, relu=True)
tmp = conv(tmp, 3, 3, 64, 1, 1, padding='SAME', name="conv1_3_3_3")
tmp = bn(tmp, relu=True)
tmp = max_pool(tmp, 3, 3, 2, 2, padding=[1, 1])
tmp = proj_block(tmp, filter_num=128, padding=0, name="conv2_1")
tmp = res_block(tmp, filter_num=128, padding=1, name="conv2_2")
tmp = res_block(tmp, filter_num=128, padding=1, name="conv2_3")
tmp = proj_block(tmp, filter_num=256, padding=1, stride=2, name="conv3_1")
return tmp
def res_block(input, filter_num, padding=0, dilation=None, name=None):
tmp = conv(input, 1, 1, filter_num / 4, 1, 1, name=name + "_1_1_reduce")
tmp = bn(tmp, relu=True)
tmp = zero_padding(tmp, padding=padding)
if dilation is None:
tmp = conv(tmp, 3, 3, filter_num / 4, 1, 1, name=name + "_3_3")
else:
tmp = atrous_conv(
tmp, 3, 3, filter_num / 4, dilation, name=name + "_3_3")
tmp = bn(tmp, relu=True)
tmp = conv(tmp, 1, 1, filter_num, 1, 1, name=name + "_1_1_increase")
tmp = bn(tmp, relu=False)
tmp = input + tmp
tmp = fluid.layers.relu(tmp, name=name + "_relu")
return tmp
def proj_block(input, filter_num, padding=0, dilation=None, stride=1,
name=None):
proj = conv(
input, 1, 1, filter_num, stride, stride, name=name + "_1_1_proj")
proj_bn = bn(proj, relu=False)
tmp = conv(
input, 1, 1, filter_num / 4, stride, stride, name=name + "_1_1_reduce")
tmp = bn(tmp, relu=True)
tmp = zero_padding(tmp, padding=padding)
if padding == 0:
padding = 'SAME'
else:
padding = 'VALID'
if dilation is None:
tmp = conv(
tmp,
3,
3,
filter_num / 4,
1,
1,
padding=padding,
name=name + "_3_3")
else:
tmp = atrous_conv(
tmp,
3,
3,
filter_num / 4,
dilation,
padding=padding,
name=name + "_3_3")
tmp = bn(tmp, relu=True)
tmp = conv(tmp, 1, 1, filter_num, 1, 1, name=name + "_1_1_increase")
tmp = bn(tmp, relu=False)
tmp = proj_bn + tmp
tmp = fluid.layers.relu(tmp, name=name + "_relu")
return tmp
def sub_net_4(input, input_shape):
tmp = interp(input, out_shape=np.ceil(input_shape / 32))
tmp = dilation_convs(tmp)
tmp = pyramis_pooling(tmp, input_shape)
tmp = conv(tmp, 1, 1, 256, 1, 1, name="conv5_4_k1")
tmp = bn(tmp, relu=True)
tmp = interp(tmp, input_shape / 16)
return tmp
def sub_net_2(input):
tmp = conv(input, 1, 1, 128, 1, 1, name="conv3_1_sub2_proj")
tmp = bn(tmp, relu=False)
return tmp
def sub_net_1(input):
tmp = conv(input, 3, 3, 32, 2, 2, padding='SAME', name="conv1_sub1")
tmp = bn(tmp, relu=True)
tmp = conv(tmp, 3, 3, 32, 2, 2, padding='SAME', name="conv2_sub1")
tmp = bn(tmp, relu=True)
tmp = conv(tmp, 3, 3, 64, 2, 2, padding='SAME', name="conv3_sub1")
tmp = bn(tmp, relu=True)
tmp = conv(tmp, 1, 1, 128, 1, 1, name="conv3_sub1_proj")
tmp = bn(tmp, relu=False)
return tmp
def CCF24(sub2_out, sub4_out, input_shape):
tmp = zero_padding(sub4_out, padding=2)
tmp = atrous_conv(tmp, 3, 3, 128, 2, name="conv_sub4")
tmp = bn(tmp, relu=False)
tmp = tmp + sub2_out
tmp = fluid.layers.relu(tmp)
tmp = interp(tmp, input_shape / 8)
return tmp
def CCF124(sub1_out, sub24_out, input_shape):
tmp = zero_padding(sub24_out, padding=2)
tmp = atrous_conv(tmp, 3, 3, 128, 2, name="conv_sub2")
tmp = bn(tmp, relu=False)
tmp = tmp + sub1_out
tmp = fluid.layers.relu(tmp)
tmp = interp(tmp, input_shape / 4)
return tmp
def icnet(data, num_classes, input_shape):
image_sub1 = data
image_sub2 = interp(data, out_shape=input_shape * 0.5)
s_convs = shared_convs(image_sub2)
sub4_out = sub_net_4(s_convs, input_shape)
sub2_out = sub_net_2(s_convs)
sub1_out = sub_net_1(image_sub1)
sub24_out = CCF24(sub2_out, sub4_out, input_shape)
sub124_out = CCF124(sub1_out, sub24_out, input_shape)
conv6_cls = conv(
sub124_out, 1, 1, num_classes, 1, 1, biased=True, name="conv6_cls")
sub4_out = conv(
sub4_out, 1, 1, num_classes, 1, 1, biased=True, name="sub4_out")
sub24_out = conv(
sub24_out, 1, 1, num_classes, 1, 1, biased=True, name="sub24_out")
return sub4_out, sub24_out, conv6_cls
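# Note: the three returned branches feed the cascade losses in train.py
# (sub4/sub24/sub124); eval.py and infer.py consume only the final sub124 output.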
"""Infer for ICNet model."""
import cityscape
import argparse
import functools
import sys
import os
import cv2
import paddle.fluid as fluid
import paddle.v2 as paddle
from icnet import icnet
from utils import add_arguments, print_arguments, get_feeder_data
from paddle.fluid.layers.learning_rate_scheduler import _decay_step_counter
from paddle.fluid.initializer import init_on_cpu
import numpy as np
IMG_MEAN = np.array((103.939, 116.779, 123.68), dtype=np.float32)
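# (assumption: the means above are the standard ImageNet channel means in
# BGR order, matching the cv2-based paddle.image loader used below)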
parser = argparse.ArgumentParser(description=__doc__)
add_arg = functools.partial(add_arguments, argparser=parser)
# yapf: disable
add_arg('model_path', str, None, "Model path.")
add_arg('images_list', str, None, "List file with images to be inferred.")
add_arg('images_path', str, None, "The images path.")
add_arg('out_path', str, "./output", "Output path.")
add_arg('use_gpu', bool, True, "Whether use GPU to test.")
# yapf: enable
data_shape = [3, 1024, 2048]
num_classes = 19
label_colours = [
    [128, 64, 128], [244, 35, 231], [69, 69, 69],       # 0 = road, 1 = sidewalk, 2 = building
    [102, 102, 156], [190, 153, 153], [153, 153, 153],  # 3 = wall, 4 = fence, 5 = pole
    [250, 170, 29], [219, 219, 0], [106, 142, 35],      # 6 = traffic light, 7 = traffic sign, 8 = vegetation
    [152, 250, 152], [69, 129, 180], [219, 19, 60],     # 9 = terrain, 10 = sky, 11 = person
    [255, 0, 0], [0, 0, 142], [0, 0, 69],               # 12 = rider, 13 = car, 14 = truck
    [0, 60, 100], [0, 79, 100], [0, 0, 230],            # 15 = bus, 16 = train, 17 = motorcycle
    [119, 10, 32],                                      # 18 = bicycle
]
def color(input):
"""
Convert infered result to color image.
"""
result = []
for i in input.flatten():
result.append(
[label_colours[i][2], label_colours[i][1], label_colours[i][0]])
result = np.array(result).reshape([input.shape[0], input.shape[1], 3])
return result
def infer(args):
data_shape = cityscape.test_data_shape()
num_classes = cityscape.num_classes()
# define network
images = fluid.layers.data(name='image', shape=data_shape, dtype='float32')
_, _, sub124_out = icnet(images, num_classes,
np.array(data_shape[1:]).astype("float32"))
predict = fluid.layers.resize_bilinear(
sub124_out, out_shape=data_shape[1:3])
predict = fluid.layers.transpose(predict, perm=[0, 2, 3, 1])
predict = fluid.layers.reshape(predict, shape=[-1, num_classes])
_, predict = fluid.layers.topk(predict, k=1)
predict = fluid.layers.reshape(
predict,
shape=[data_shape[1], data_shape[2], -1]) # batch_size should be 1
inference_program = fluid.default_main_program().clone(for_test=True)
# prepare environment
place = fluid.CPUPlace()
if args.use_gpu:
place = fluid.CUDAPlace(0)
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
assert os.path.exists(args.model_path)
fluid.io.load_params(exe, args.model_path)
print "loaded model from: %s" % args.model_path
sys.stdout.flush()
if not os.path.isdir(args.out_path):
os.makedirs(args.out_path)
for line in open(args.images_list):
image_file = args.images_path + "/" + line.strip()
filename = os.path.basename(image_file)
image = paddle.image.load_image(
image_file, is_color=True).astype("float32")
image -= IMG_MEAN
img = paddle.image.to_chw(image)[np.newaxis, :]
image_t = fluid.core.LoDTensor()
image_t.set(img, place)
result = exe.run(inference_program,
feed={"image": image_t},
fetch_list=[predict])
cv2.imwrite(args.out_path + "/" + filename + "_result.png",
color(result[0]))
def main():
args = parser.parse_args()
print_arguments(args)
infer(args)
if __name__ == "__main__":
main()
"""Trainer for ICNet model."""
from icnet import icnet
import cityscape
import argparse
import functools
import sys
import time
import paddle.fluid as fluid
import numpy as np
from utils import add_arguments, print_arguments, get_feeder_data
from paddle.fluid.layers.learning_rate_scheduler import _decay_step_counter
from paddle.fluid.initializer import init_on_cpu
parser = argparse.ArgumentParser(description=__doc__)
add_arg = functools.partial(add_arguments, argparser=parser)
# yapf: disable
add_arg('batch_size', int, 16, "Minibatch size.")
add_arg('checkpoint_path', str, None, "Checkpoint save path.")
add_arg('init_model', str, None, "Pretrain model path.")
add_arg('use_gpu', bool, True, "Whether use GPU to train.")
add_arg('random_mirror', bool, True, "Whether prepare by random mirror.")
add_arg('random_scaling', bool, True, "Whether prepare by random scaling.")
# yapf: enable
LAMBDA1 = 0.16
LAMBDA2 = 0.4
LAMBDA3 = 1.0
LEARNING_RATE = 0.003
POWER = 0.9
LOG_PERIOD = 1
CHECKPOINT_PERIOD = 1000
TOTAL_STEP = 60000
no_grad_set = []
def create_loss(predict, label, mask, num_classes):
predict = fluid.layers.transpose(predict, perm=[0, 2, 3, 1])
predict = fluid.layers.reshape(predict, shape=[-1, num_classes])
label = fluid.layers.reshape(label, shape=[-1, 1])
predict = fluid.layers.gather(predict, mask)
label = fluid.layers.gather(label, mask)
label = fluid.layers.cast(label, dtype="int64")
loss = fluid.layers.softmax_with_cross_entropy(predict, label)
no_grad_set.append(label.name)
return fluid.layers.reduce_mean(loss)
def poly_decay():
global_step = _decay_step_counter()
with init_on_cpu():
decayed_lr = LEARNING_RATE * (fluid.layers.pow(
(1 - global_step / TOTAL_STEP), POWER))
return decayed_lr
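# Example (illustrative): halfway through training (step 30000 of the 60000
# TOTAL_STEP) the polynomial schedule gives lr = 0.003 * 0.5 ** 0.9 ≈ 0.00161.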
def train(args):
data_shape = cityscape.train_data_shape()
num_classes = cityscape.num_classes()
# define network
images = fluid.layers.data(name='image', shape=data_shape, dtype='float32')
label_sub1 = fluid.layers.data(name='label_sub1', shape=[1], dtype='int32')
label_sub2 = fluid.layers.data(name='label_sub2', shape=[1], dtype='int32')
label_sub4 = fluid.layers.data(name='label_sub4', shape=[1], dtype='int32')
mask_sub1 = fluid.layers.data(name='mask_sub1', shape=[-1], dtype='int32')
mask_sub2 = fluid.layers.data(name='mask_sub2', shape=[-1], dtype='int32')
mask_sub4 = fluid.layers.data(name='mask_sub4', shape=[-1], dtype='int32')
sub4_out, sub24_out, sub124_out = icnet(
images, num_classes, np.array(data_shape[1:]).astype("float32"))
loss_sub4 = create_loss(sub4_out, label_sub4, mask_sub4, num_classes)
loss_sub24 = create_loss(sub24_out, label_sub2, mask_sub2, num_classes)
loss_sub124 = create_loss(sub124_out, label_sub1, mask_sub1, num_classes)
reduced_loss = LAMBDA1 * loss_sub4 + LAMBDA2 * loss_sub24 + LAMBDA3 * loss_sub124
regularizer = fluid.regularizer.L2Decay(0.0001)
optimizer = fluid.optimizer.Momentum(
learning_rate=poly_decay(), momentum=0.9, regularization=regularizer)
_, params_grads = optimizer.minimize(reduced_loss, no_grad_set=no_grad_set)
# prepare environment
place = fluid.CPUPlace()
if args.use_gpu:
place = fluid.CUDAPlace(0)
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
if args.init_model is not None:
print "load model from: %s" % args.init_model
sys.stdout.flush()
fluid.io.load_params(exe, args.init_model)
iter_id = 0
t_loss = 0.
sub4_loss = 0.
sub24_loss = 0.
sub124_loss = 0.
train_reader = cityscape.train(
args.batch_size, flip=args.random_mirror, scaling=args.random_scaling)
while True:
# train a pass
for data in train_reader():
if iter_id > TOTAL_STEP:
return
iter_id += 1
results = exe.run(
feed=get_feeder_data(data, place),
fetch_list=[reduced_loss, loss_sub4, loss_sub24, loss_sub124])
t_loss += results[0]
sub4_loss += results[1]
sub24_loss += results[2]
sub124_loss += results[3]
# training log
if iter_id % LOG_PERIOD == 0:
print "Iter[%d]; train loss: %.3f; sub4_loss: %.3f; sub24_loss: %.3f; sub124_loss: %.3f" % (
iter_id, t_loss / LOG_PERIOD, sub4_loss / LOG_PERIOD,
sub24_loss / LOG_PERIOD, sub124_loss / LOG_PERIOD)
t_loss = 0.
sub4_loss = 0.
sub24_loss = 0.
sub124_loss = 0.
sys.stdout.flush()
if iter_id % CHECKPOINT_PERIOD == 0:
dir_name = args.checkpoint_path + "/" + str(iter_id)
fluid.io.save_persistables(exe, dirname=dir_name)
print "Saved checkpoint: %s" % (dir_name)
def main():
args = parser.parse_args()
print_arguments(args)
train(args)
if __name__ == "__main__":
main()
"""Contains common utility functions."""
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import distutils.util
import numpy as np
from paddle.fluid import core
def print_arguments(args):
"""Print argparse's arguments.
Usage:
.. code-block:: python
parser = argparse.ArgumentParser()
parser.add_argument("name", default="Jonh", type=str, help="User name.")
args = parser.parse_args()
print_arguments(args)
:param args: Input argparse.Namespace for printing.
:type args: argparse.Namespace
"""
print("----------- Configuration Arguments -----------")
for arg, value in sorted(vars(args).iteritems()):
print("%s: %s" % (arg, value))
print("------------------------------------------------")
def add_arguments(argname, type, default, help, argparser, **kwargs):
"""Add argparse's argument.
Usage:
.. code-block:: python
parser = argparse.ArgumentParser()
add_argument("name", str, "Jonh", "User name.", parser)
args = parser.parse_args()
"""
type = distutils.util.strtobool if type == bool else type
argparser.add_argument(
"--" + argname,
default=default,
type=type,
help=help + ' Default: %(default)s.',
**kwargs)
def to_lodtensor(data, place):
seq_lens = [len(seq) for seq in data]
cur_len = 0
lod = [cur_len]
for l in seq_lens:
cur_len += l
lod.append(cur_len)
flattened_data = np.concatenate(data, axis=0).astype("int32")
flattened_data = flattened_data.reshape([len(flattened_data), 1])
res = core.LoDTensor()
res.set(flattened_data, place)
res.set_lod([lod])
return res
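# Example (illustrative): two sequences of lengths 2 and 3 flatten to 5 rows
# with lod = [0, 2, 5] marking the sequence boundaries.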
def get_feeder_data(data, place, for_test=False):
feed_dict = {}
image_t = core.LoDTensor()
image_t.set(data[0], place)
feed_dict["image"] = image_t
if not for_test:
labels_sub1_t = core.LoDTensor()
labels_sub2_t = core.LoDTensor()
labels_sub4_t = core.LoDTensor()
mask_sub1_t = core.LoDTensor()
mask_sub2_t = core.LoDTensor()
mask_sub4_t = core.LoDTensor()
labels_sub1_t.set(data[1], place)
labels_sub2_t.set(data[3], place)
mask_sub1_t.set(data[2], place)
mask_sub2_t.set(data[4], place)
labels_sub4_t.set(data[5], place)
mask_sub4_t.set(data[6], place)
feed_dict["label_sub1"] = labels_sub1_t
feed_dict["label_sub2"] = labels_sub2_t
feed_dict["mask_sub1"] = mask_sub1_t
feed_dict["mask_sub2"] = mask_sub2_t
feed_dict["label_sub4"] = labels_sub4_t
feed_dict["mask_sub4"] = mask_sub4_t
else:
label_t = core.LoDTensor()
mask_t = core.LoDTensor()
label_t.set(data[1], place)
mask_t.set(data[2], place)
feed_dict["label"] = label_t
feed_dict["mask"] = mask_t
return feed_dict
The minimum PaddlePaddle version needed for the code sample in this directory is the latest develop branch. If you are on a version of PaddlePaddle earlier than this, [please update your installation](http://www.paddlepaddle.org/docs/develop/documentation/en/build_and_install/pip_install_en.html).
# Image Classification and Model Zoo
Image classification, an important field of computer vision, assigns an image to one of a set of pre-defined labels. Recently, many researchers have developed different kinds of neural networks that greatly improve classification performance. This page introduces how to do image classification with PaddlePaddle Fluid, including [data preparation](#data-preparation), [training](#training-a-model), [finetuning](#finetuning), [evaluation](#evaluation) and [inference](#inference).
---
## Table of Contents
- [Installation](#installation)
- [Data preparation](#data-preparation)
- [Training a model with flexible parameters](#training-a-model)
- [Finetuning](#finetuning)
- [Evaluation](#evaluation)
- [Inference](#inference)
- [Supported models and performances](#supported-models)
## Installation
This model, built with Paddle Fluid, is still under active development and is not the final version. We welcome feedback.
Running the sample code in this directory requires PaddlePaddle Fluid v0.13.0 or later. If the PaddlePaddle on your device is lower than this version, please follow the instructions in the [installation document](http://www.paddlepaddle.org/docs/develop/documentation/zh/build_and_install/pip_install_cn.html) and make an update.
## Introduction
The current code supports the training of [SE-ResNeXt](https://arxiv.org/abs/1709.01507) (50/152 layers).
## Data preparation
An example for ImageNet classification is as follows. First of all, preparation of ImageNet data can be done as:
```
cd data/ILSVRC2012/
sh download_imagenet2012.sh
```
In the shell script ```download_imagenet2012.sh```, there are three steps to prepare data:

**step-1:** Register at ```image-net.org``` first in order to get a pair of ```Username``` and ```AccessKey```, which are used to download ImageNet data.

**step-2:** Download ImageNet-2012 dataset from the website. The training and validation data will be downloaded into folder "train" and "val" respectively. Please note that the size of the data is more than 40 GB; it will take quite some time to download. Users who have already downloaded the ImageNet data can organize it into ```data/ILSVRC2012``` directly.
```
cd data/
mkdir -p ILSVRC2012/
cd ILSVRC2012/
# get training set
wget http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_train.tar
# get validation set
wget http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_val.tar
# prepare directory
tar xf ILSVRC2012_img_train.tar
tar xf ILSVRC2012_img_val.tar
# unzip all classes data using unzip.sh
sh unzip.sh
```

**step-3:** Download training and validation label files from [ImageNet2012 url](https://pan.baidu.com/s/1Y6BCo0nmxsm_FsEqmx2hKQ) (password: ```wx99```) and untar them into workspace ```ILSVRC2012/```. The files include:

* *train_list.txt*: label file of the imagenet-2012 training set, with each line separated by ```SPACE```, like:
```
train/n02483708/n02483708_2436.jpeg 369
train/n03998194/n03998194_7015.jpeg 741
train/n04596742/n04596742_3032.jpeg 909
train/n03208938/n03208938_7065.jpeg 535
...
```
* *val_list.txt*: label file of the imagenet-2012 validation set, with each line separated by ```SPACE```, like:
```
val/ILSVRC2012_val_00000001.jpeg 65
val/ILSVRC2012_val_00000002.jpeg 970
val/ILSVRC2012_val_00000004.jpeg 809
val/ILSVRC2012_val_00000005.jpeg 516
...
```
* *synset_words.txt*: the semantic label of each class.
## Training a model with flexible parameters
After data preparation, one can start the training step by:
```
python train.py \
       --model=SE_ResNeXt50_32x4d \
       --batch_size=32 \
       --total_images=1281167 \
       --class_dim=1000 \
--image_shape=3,224,224 \
--model_save_dir=output/ \
--with_mem_opt=False \
--lr_strategy=piecewise_decay \
--lr=0.1
```
**parameter introduction:**
* **model**: name of the model to use. Default: "SE_ResNeXt50_32x4d".
* **num_epochs**: the number of epochs. Default: 120.
* **batch_size**: the size of each mini-batch. Default: 256.
* **use_gpu**: whether to use GPU or not. Default: True.
* **total_images**: total number of images in the training set. Default: 1281167.
* **class_dim**: the class number of the classification task. Default: 1000.
* **image_shape**: input size of the network. Default: "3,224,224".
* **model_save_dir**: the directory to save trained model. Default: "output".
* **with_mem_opt**: whether to use memory optimization or not. Default: False.
* **lr_strategy**: learning rate changing strategy. Default: "piecewise_decay".
* **lr**: initialized learning rate. Default: 0.1.
* **pretrained_model**: model path for pretraining. Default: None.
* **checkpoint**: the checkpoint path to resume. Default: None.
**data reader introduction:** The data reader is defined in ```reader.py```. In the [training stage](#training-a-model), random crop and flipping are used, while center crop is used in the [evaluation](#evaluation) and [inference](#inference) stages. Supported data augmentation includes (see the sketch after this list):
* rotation
* color jitter
* random crop
* center crop
* resize
* flipping
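A minimal sketch of the two preprocessing paths described above (illustrative only; the function names and crop logic here are assumptions, not the actual ```reader.py``` implementation):
```python
import random
from PIL import Image

def train_mapper(path, crop_size=224):
    """Training path: random crop plus random horizontal flip."""
    img = Image.open(path).convert('RGB')
    w, h = img.size
    # random crop: pick a random top-left corner inside the image
    x = random.randint(0, w - crop_size)
    y = random.randint(0, h - crop_size)
    img = img.crop((x, y, x + crop_size, y + crop_size))
    # random horizontal flip with probability 0.5
    if random.random() < 0.5:
        img = img.transpose(Image.FLIP_LEFT_RIGHT)
    return img

def test_mapper(path, crop_size=224):
    """Evaluation/inference path: deterministic center crop."""
    img = Image.open(path).convert('RGB')
    w, h = img.size
    x = (w - crop_size) // 2
    y = (h - crop_size) // 2
    return img.crop((x, y, x + crop_size, y + crop_size))
```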
**training curve:** The training curve can be drawn from the training log. For example, the log from training AlexNet looks like this (a parsing sketch follows the log):
```
End pass 1, train_loss 6.23153877258, train_acc1 0.0150696625933, train_acc5 0.0552518665791, test_loss 5.41981744766, test_acc1 0.0519132651389, test_acc5 0.156150355935
End pass 2, train_loss 5.15442800522, train_acc1 0.0784279331565, train_acc5 0.211050540209, test_loss 4.45795249939, test_acc1 0.140469551086, test_acc5 0.333163291216
End pass 3, train_loss 4.51505613327, train_acc1 0.145300447941, train_acc5 0.331567406654, test_loss 3.86548018456, test_acc1 0.219443559647, test_acc5 0.446448504925
End pass 4, train_loss 4.12735557556, train_acc1 0.19437250495, train_acc5 0.405713528395, test_loss 3.56990146637, test_acc1 0.264536827803, test_acc5 0.507190704346
End pass 5, train_loss 3.87505435944, train_acc1 0.229518383741, train_acc5 0.453582793474, test_loss 3.35345435143, test_acc1 0.297349333763, test_acc5 0.54753267765
End pass 6, train_loss 3.6929500103, train_acc1 0.255628824234, train_acc5 0.487188398838, test_loss 3.17112898827, test_acc1 0.326953113079, test_acc5 0.581780135632
End pass 7, train_loss 3.55882954597, train_acc1 0.275381118059, train_acc5 0.511990904808, test_loss 3.03736782074, test_acc1 0.349035382271, test_acc5 0.606293857098
End pass 8, train_loss 3.45595097542, train_acc1 0.291462600231, train_acc5 0.530815005302, test_loss 2.96034455299, test_acc1 0.362228929996, test_acc5 0.617390751839
End pass 9, train_loss 3.3745200634, train_acc1 0.303871691227, train_acc5 0.545210540295, test_loss 2.93932366371, test_acc1 0.37129303813, test_acc5 0.623573005199
...
```
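As a rough illustration, the curve can be produced by parsing these lines; the log filename and the regular expression below are assumptions based on the format shown above:
```python
import re
import matplotlib.pyplot as plt

pattern = re.compile(
    r"End pass (\d+), train_loss ([\d.]+), train_acc1 ([\d.]+), "
    r"train_acc5 ([\d.]+), test_loss ([\d.]+), test_acc1 ([\d.]+), "
    r"test_acc5 ([\d.]+)")

passes, train_err1, test_err1 = [], [], []
with open("train.log") as f:  # assumed log path
    for line in f:
        m = pattern.search(line)
        if m:
            passes.append(int(m.group(1)))
            # error rate = 1 - accuracy
            train_err1.append(1.0 - float(m.group(3)))
            test_err1.append(1.0 - float(m.group(6)))

plt.plot(passes, train_err1, label="train top-1 error")
plt.plot(passes, test_err1, label="test top-1 error")
plt.xlabel("pass")
plt.ylabel("error rate")
plt.legend()
plt.savefig("curve.jpg")
```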
The error rate curves of AlexNet, ResNet50 and SE-ResNeXt-50 are shown in the figure below.
<p align="center">
<img src="images/curve.jpg" height=480 width=640 hspace='10'/> <br />
Training and validation curves
</p>
## Finetuning
Finetuning adapts pretrained model weights to a specific task. After setting ```path_to_pretrain_model```, one can finetune a model as:
```
python train.py \
--model=SE_ResNeXt50_32x4d \
--pretrained_model=${path_to_pretrain_model} \
--batch_size=32 \
--total_images=1281167 \
--class_dim=1000 \
--image_shape=3,224,224 \
--model_save_dir=output/ \
--with_mem_opt=True \
--lr_strategy=piecewise_decay \
--lr=0.1
```
## Evaluation
Evaluation measures the performance of a trained model on the validation set. One can download [pretrained models](#supported-models) and set their path in ```path_to_pretrain_model```. Then top1/top5 accuracy can be obtained by running the following command:
```
python eval.py \
--model=SE_ResNeXt50_32x4d \
--batch_size=32 \
--class_dim=1000 \
--image_shape=3,224,224 \
--with_mem_opt=True \
--pretrained_model=${path_to_pretrain_model}
```
With the above evaluation configuration, the output log is like:
```
Testbatch 0,loss 2.1786134243, acc1 0.625,acc5 0.8125,time 0.48 sec
Testbatch 10,loss 0.898496925831, acc1 0.75,acc5 0.9375,time 0.51 sec
Testbatch 20,loss 1.32524681091, acc1 0.6875,acc5 0.9375,time 0.37 sec
Testbatch 30,loss 1.46830511093, acc1 0.5,acc5 0.9375,time 0.51 sec
Testbatch 40,loss 1.12802267075, acc1 0.625,acc5 0.9375,time 0.35 sec
Testbatch 50,loss 0.881597697735, acc1 0.8125,acc5 1.0,time 0.32 sec
Testbatch 60,loss 0.300163716078, acc1 0.875,acc5 1.0,time 0.48 sec
Testbatch 70,loss 0.692037761211, acc1 0.875,acc5 1.0,time 0.35 sec
Testbatch 80,loss 0.0969972759485, acc1 1.0,acc5 1.0,time 0.41 sec
...
```
The SE-ResNeXt-50 model is trained with an initial learning rate of ```0.1``` decayed by ```0.1``` every ```10``` epochs. Its top-1/top-5 validation accuracy on ImageNet 2012 is listed below:

|model | [original paper(Fig.5)](https://arxiv.org/abs/1709.01507) | Pytorch | Paddle fluid
|- | :-: |:-: | -:
|SE-ResNeXt-50 | 77.6%/- | 77.71%/93.63% | 77.42%/93.50%
## Inference
Inference is used to get prediction scores or image features from a trained model.
```
python infer.py \
--model=SE_ResNeXt50_32x4d \
--batch_size=32 \
--class_dim=1000 \
--image_shape=3,224,224 \
--with_mem_opt=True \
--pretrained_model=${path_to_pretrain_model}
```
The output contains prediction results, including the maximum score (before softmax) and the corresponding predicted label.
```
Test-0-score: [13.168352], class [491]
Test-1-score: [7.913302], class [975]
Test-2-score: [16.959702], class [21]
Test-3-score: [14.197695], class [383]
Test-4-score: [12.607652], class [878]
Test-5-score: [17.725458], class [15]
Test-6-score: [12.678599], class [118]
Test-7-score: [12.353498], class [505]
Test-8-score: [20.828007], class [747]
Test-9-score: [15.135801], class [315]
Test-10-score: [14.585114], class [920]
Test-11-score: [13.739927], class [679]
Test-12-score: [15.040644], class [386]
...
```
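To turn a predicted class index into a human-readable label, one can look it up in the ```synset_words.txt``` file from the data preparation step. A hypothetical snippet; the file path and the one-label-per-line layout in class-index order are assumptions:
```python
# assumed path and layout: one "wnid description" line per class index
with open("data/ILSVRC2012/synset_words.txt") as f:
    labels = [line.strip() for line in f]

# e.g. for the first sample above: "Test-0-score: [13.168352], class [491]"
print(labels[491])
```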
## Supported models and performances
Models are trained with an initial learning rate of ```0.1``` decayed by ```0.1``` after each pre-defined number of epochs, unless otherwise specified. The available top-1/top-5 validation accuracy on ImageNet 2012 is listed in the table below. Pretrained models can be downloaded by clicking the related model names.
|model | top-1/top-5 accuracy
|- | -:
|[AlexNet](http://paddle-imagenet-models.bj.bcebos.com/alexnet_model.tar) | 57.21%/79.72%
|VGG11 | -
|VGG13 | -
|VGG16 | -
|VGG19 | -
|GoogleNet | -
|InceptionV4 | -
|MobileNet | -
|[ResNet50](http://paddle-imagenet-models.bj.bcebos.com/resnet_50_model.tar) | 76.63%/93.10%
|ResNet101 | -
|ResNet152 | -
|[SE_ResNeXt50_32x4d](http://paddle-imagenet-models.bj.bcebos.com/se_resnext_50_model.tar) | 78.33%/93.96%
|SE_ResNeXt101_32x4d | -
|SE_ResNeXt152_32x4d | -
|DPN68 | -
|DPN92 | -
|DPN98 | -
|DPN107 | -
|DPN131 | -
......@@ -20,8 +20,8 @@ def calc_diff(f1, f2):
d1 = np.load(f1)
d2 = np.load(f2)
print d1.shape
print d2.shape
#print d1.shape
#print d2.shape
#print d1[0, 0, 0:10, 0:10]
#print d2[0, 0, 0:10, 0:10]
#d1 = d1[:, :, 1:-2, 1:-2]
......
......@@ -19,4 +19,6 @@ if [[ $# -eq 3 ]];then
else
caffe_file="./results/${model_name}.caffe/${2}.npy"
fi
python ./compare.py $paddle_file $caffe_file
cmd="python ./compare.py $paddle_file $caffe_file"
echo $cmd
eval $cmd
......@@ -3,7 +3,7 @@
#function:
# a tool used to compare all layers' results
#
#set -x
if [[ $# -ne 1 ]];then
echo "usage:"
echo " bash $0 [model_name]"
......@@ -13,11 +13,20 @@ fi
model_name=$1
prototxt="models.caffe/$model_name/${model_name}.prototxt"
layers=$(cat $prototxt | perl -ne 'if(/^\s+name\s*:\s*\"([^\"]+)/){print $1."\n";}')
cat $prototxt | grep name | perl -ne 'if(/^\s*name\s*:\s+\"([^\"]+)/){ print $1."\n";}' >.layer_names
final_layer=$(cat $prototxt | perl -ne 'if(/^\s*top\s*:\s+\"([^\"]+)/){ print $1."\n";}' | tail -n1)
ret=$(grep "^$final_layer$" .layer_names | wc -l)
if [[ $ret -eq 0 ]];then
echo $final_layer >>.layer_names
fi
for i in $layers;do
for i in $(cat .layer_names);do
i=${i//\//_}
cf_npy="results/${model_name}.caffe/${i}.npy"
pd_npy="results/${model_name}.paddle/${i}.npy"
#pd_npy="results/${model_name}.paddle/${i}.npy"
#pd_npy=$(find results/${model_name}.paddle -iname "${i}*.npy" | head -n1)
pd_npy=$(find results/${model_name}.paddle -iname "${i}.*npy" | grep deleted -v | head -n1)
if [[ ! -e $cf_npy ]];then
echo "caffe's result not exist[$cf_npy]"
......
......@@ -71,7 +71,9 @@ if [[ -z $only_convert ]];then
if [[ -z $net_name ]];then
net_name="MyNet"
fi
$PYTHON ./infer.py dump $net_file $weight_file $imgfile $net_name
cmd="$PYTHON ./infer.py dump $net_file $weight_file $imgfile $net_name"
echo $cmd
eval $cmd
ret=$?
fi
exit $ret
#!/bin/bash
#
#script to test all models
#
models="alexnet vgg16 googlenet resnet152 resnet101 resnet50"
for i in $models;do
echo "begin to process $i"
bash ./tools/diff.sh $i 2>&1
echo "finished to process $i with ret[$?]"
done
......@@ -43,7 +43,7 @@ def axpy_layer(inputs, name):
x = inputs[1]
y = inputs[2]
output = fluid.layers.elementwise_mul(x, alpha, axis=0)
output = fluid.layers.elementwise_add(output, y)
output = fluid.layers.elementwise_add(output, y, name=name)
return output
......