Commit 7df53c9a authored by W wanghaoshuang

Merge branch 'develop' of https://github.com/PaddlePaddle/models into model_avg

The minimum PaddlePaddle version needed for the code sample in this directory is v0.11.0. If you are on a version of PaddlePaddle earlier than v0.11.0, [please update your installation](http://www.paddlepaddle.org/docs/develop/documentation/en/build_and_install/pip_install_en.html).
---
# Convolutional Sequence to Sequence Learning
This model implements the work in the following paper:
......
The code samples in this directory require PaddlePaddle v0.10.0. If your installed version of PaddlePaddle is older than this, please follow the instructions in the [installation documentation](http://www.paddlepaddle.org/docs/develop/documentation/zh/build_and_install/pip_install_cn.html) to update your installation.
---
# Click-Through Rate Prediction
The following are the files contained in this example's directory, with corresponding descriptions:
......
The minimum PaddlePaddle version needed for the code sample in this directory is v0.10.0. If you are on a version of PaddlePaddle earlier than v0.10.0, [please update your installation](http://www.paddlepaddle.org/docs/develop/documentation/en/build_and_install/pip_install_en.html).
---
# Click-Through Rate Prediction
## Introduction
......
The minimum PaddlePaddle version needed for the code sample in this directory is v0.11.0. If you are on a version of PaddlePaddle earlier than v0.11.0, [please update your installation](http://www.paddlepaddle.org/docs/develop/documentation/en/build_and_install/pip_install_en.html).
---
# Deep Factorization Machine for Click-Through Rate prediction
## Introduction
......
The code samples in this directory require PaddlePaddle v0.10.0. If your installed version of PaddlePaddle is older than this, please follow the instructions in the [installation documentation](http://www.paddlepaddle.org/docs/develop/documentation/zh/build_and_install/pip_install_cn.html) to update your installation.
---
# Deep Structured Semantic Models (DSSM)
DSSM uses a DNN to learn low-dimensional representation vectors for text in a continuous semantic space and to model the semantic similarity between two sentences. This example demonstrates how to use PaddlePaddle to implement a generic DSSM model for modeling the semantic similarity between two strings. The implementation supports a generic data format, so users can apply the model to real-world scenarios simply by substituting their own data.
......
The minimum PaddlePaddle version needed for the code sample in this directory is v0.10.0. If you are on a version of PaddlePaddle earlier than v0.10.0, [please update your installation](http://www.paddlepaddle.org/docs/develop/documentation/en/build_and_install/pip_install_en.html).
---
# Deep Structured Semantic Models (DSSM)
Deep Structured Semantic Models (DSSM) is a simple but powerful DNN-based model for matching web search queries and URL-based documents. This example demonstrates how to use PaddlePaddle to implement a generic DSSM model for modeling the semantic similarity between two strings.
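To make the modeling idea concrete, here is a minimal, hypothetical sketch (not this example's actual network): encode both strings with a shared encoder and score the pair with cosine similarity, which is how DSSM-style models measure semantic closeness.

# Illustrative sketch of the DSSM scoring idea (hypothetical code,
# not part of this example): a shared encoder plus cosine similarity.
import numpy as np

def embed(tokens, table):
    # Average word vectors as a crude stand-in for the shared DNN encoder.
    return np.mean([table[t] for t in tokens], axis=0)

def cosine(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)

table = {w: np.random.rand(8) for w in ["deep", "learning", "paddle"]}
score = cosine(embed(["deep", "learning"], table),
               embed(["paddle", "learning"], table))
print(score)  # semantic similarity score in [-1, 1]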
......
Deep ASR Kickoff
The minimum PaddlePaddle version needed for the code sample in this directory is the latest develop branch. If you are on a version of PaddlePaddle earlier than this, [please update your installation](http://www.paddlepaddle.org/docs/develop/documentation/en/build_and_install/pip_install_en.html).
---
### TODO
This project is still under active development.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import data_utils.augmentor.trans_mean_variance_norm as trans_mean_variance_norm
import data_utils.augmentor.trans_add_delta as trans_add_delta
import data_utils.augmentor.trans_splice as trans_splice
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import sys
import sys, time
from six import reraise
from tblib import Traceback
from multiprocessing import Manager, Process
import posix_ipc, mmap
import numpy as np
@@ -35,21 +37,177 @@ def lodtensor_to_ndarray(lod_tensor):
return ret, lod_tensor.lod()
def batch_to_ndarray(batch_samples, lod):
frame_dim = batch_samples[0][0].shape[1]
batch_feature = np.zeros((lod[-1], frame_dim), dtype="float32")
batch_label = np.zeros((lod[-1], 1), dtype="int64")
start = 0
for sample in batch_samples:
frame_num = sample[0].shape[0]
batch_feature[start:start + frame_num, :] = sample[0]
batch_label[start:start + frame_num, :] = sample[1]
start += frame_num
return (batch_feature, batch_label)
def split_infer_result(infer_seq, lod):
infer_batch = []
    for i in range(0, len(lod[0]) - 1):
infer_batch.append(infer_seq[lod[0][i]:lod[0][i + 1]])
return infer_batch
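A small usage sketch for the helper above (the data is made up): `lod[0]` holds cumulative frame offsets, so sample `i` occupies rows `lod[0][i]:lod[0][i + 1]` of the flattened batch.

import numpy as np

infer_seq = np.arange(10).reshape(5, 2)  # 5 frames, 2-dim outputs
lod = [[0, 2, 5]]                        # two utterances: 2 and 3 frames
batch = split_infer_result(infer_seq, lod)
print(batch[0].shape, batch[1].shape)    # (2, 2) (3, 2)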
class DaemonProcessGroup(object):
def __init__(self, proc_num, target, args):
self._proc_num = proc_num
self._workers = [
Process(
            target=target, args=args) for _ in range(self._proc_num)
]
def start_all(self):
for w in self._workers:
w.daemon = True
w.start()
@property
def proc_num(self):
return self._proc_num
class EpochEndSignal(object):
pass
class CriticalException(Exception):
pass
class SharedNDArray(object):
"""SharedNDArray utilizes shared memory to avoid data serialization when
data object shared among different processes. We can reconstruct the
`ndarray` when memory address, shape and dtype provided.
Args:
name (str): Address name of shared memory.
whether_verify (bool): Whether to validate the writing operation.
"""
def __init__(self, name, whether_verify=False):
self._name = name
self._shm = None
self._buf = None
self._array = np.zeros(1, dtype=np.float32)
self._inited = False
self._whether_verify = whether_verify
def zeros_like(self, shape, dtype):
size = int(np.prod(shape)) * np.dtype(dtype).itemsize
if self._inited:
self._shm = posix_ipc.SharedMemory(self._name)
else:
self._shm = posix_ipc.SharedMemory(
self._name, posix_ipc.O_CREAT, size=size)
self._buf = mmap.mmap(self._shm.fd, size)
self._array = np.ndarray(shape, dtype, self._buf, order='C')
def copy(self, ndarray):
size = int(np.prod(ndarray.shape)) * np.dtype(ndarray.dtype).itemsize
self.zeros_like(ndarray.shape, ndarray.dtype)
self._array[:] = ndarray
self._buf.flush()
self._inited = True
if self._whether_verify:
shm = posix_ipc.SharedMemory(self._name)
buf = mmap.mmap(shm.fd, size)
array = np.ndarray(ndarray.shape, ndarray.dtype, buf, order='C')
np.testing.assert_array_equal(array, ndarray)
@property
def ndarray(self):
return self._array
def recycle(self, pool):
self._buf.close()
self._shm.close_fd()
self._inited = False
pool[self._name] = self
def __getstate__(self):
return (self._name, self._array.shape, self._array.dtype, self._inited,
self._whether_verify)
def __setstate__(self, state):
self._name = state[0]
self._inited = state[3]
self.zeros_like(state[1], state[2])
self._whether_verify = state[4]
class SharedMemoryPoolManager(object):
"""SharedMemoryPoolManager maintains a multiprocessing.Manager.dict object.
All available addresses are allocated once and will be reused. Though this
class is not process-safe, the pool can be shared between processes. All
    shared memory must be unlinked before the main process exits.
Args:
pool_size (int): Size of shared memory pool.
manager (dict): A multiprocessing.Manager object, the pool is
maintained by the proxy process.
name_prefix (str): Address prefix of shared memory.
"""
def __init__(self, pool_size, manager, name_prefix='/deep_asr'):
self._names = []
self._dict = manager.dict()
self._time_prefix = time.strftime('%Y%m%d%H%M%S')
        for i in range(pool_size):
name = name_prefix + '_' + self._time_prefix + '_' + str(i)
self._dict[name] = SharedNDArray(name)
self._names.append(name)
@property
def pool(self):
return self._dict
def __del__(self):
for name in self._names:
# have to unlink the shared memory
posix_ipc.unlink_shared_memory(name)
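A minimal usage sketch for SharedNDArray (assumes posix_ipc is installed; the segment name is illustrative). SharedMemoryPoolManager pre-allocates a set of such named slots and hands them between processes through the manager dict in the same way:

if __name__ == '__main__':
    src = np.arange(6, dtype=np.float32).reshape(3, 2)
    shared = SharedNDArray('/deep_asr_demo', whether_verify=True)
    shared.copy(src)       # allocates the segment and writes the data
    print(shared.ndarray)  # reconstructed view over shared memory
    posix_ipc.unlink_shared_memory('/deep_asr_demo')  # cleanup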
def suppress_signal(signo, stack_frame):
pass
def suppress_complaints(verbose, notify=None):
    def decorator_maker(func):
        def suppress_wrapper(*args, **kwargs):
            try:
                func(*args, **kwargs)
            except:
                et, ev, tb = sys.exc_info()
                if notify is not None:
                    notify(except_type=et, except_value=ev, traceback=tb)
                if verbose == 1 or isinstance(ev, CriticalException):
                    reraise(et, ev, Traceback(tb).as_traceback())

        return suppress_wrapper

    return decorator_maker
class ForceExitWrapper(object):
def __init__(self, exit_flag):
self._exit_flag = exit_flag
@suppress_complaints(verbose=0)
def __call__(self, *args, **kwargs):
self._exit_flag.value = True
def __eq__(self, flag):
return self._exit_flag.value == flag
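A brief usage sketch for `suppress_complaints` (the worker is hypothetical): with `verbose=0`, ordinary exceptions are swallowed so daemon workers exit quietly, while a `CriticalException` is always re-raised.

@suppress_complaints(verbose=0)
def flaky_worker():
    raise RuntimeError("swallowed when verbose=0 and not critical")

flaky_worker()  # returns silently instead of propagating the error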
/* Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "post_decode_faster.h"
typedef kaldi::int32 int32;
using fst::SymbolTable;
using fst::VectorFst;
using fst::StdArc;
Decoder::Decoder(std::string word_syms_filename,
std::string fst_in_filename,
std::string logprior_rxfilename) {
const char* usage =
"Decode, reading log-likelihoods (of transition-ids or whatever symbol "
"is on the graph) as matrices.";
kaldi::ParseOptions po(usage);
binary = true;
acoustic_scale = 1.5;
allow_partial = true;
kaldi::FasterDecoderOptions decoder_opts;
decoder_opts.Register(&po, true); // true == include obscure settings.
po.Register("binary", &binary, "Write output in binary mode");
po.Register("allow-partial",
&allow_partial,
"Produce output even when final state was not reached");
po.Register("acoustic-scale",
&acoustic_scale,
"Scaling factor for acoustic likelihoods");
word_syms = NULL;
if (word_syms_filename != "") {
word_syms = fst::SymbolTable::ReadText(word_syms_filename);
if (!word_syms)
KALDI_ERR << "Could not read symbol table from file "
<< word_syms_filename;
}
std::ifstream is_logprior(logprior_rxfilename);
logprior.Read(is_logprior, false);
// It's important that we initialize decode_fst after loglikes_reader, as it
// can prevent crashes on systems installed without enough virtual memory.
// It has to do with what happens on UNIX systems if you call fork() on a
// large process: the page-table entries are duplicated, which requires a
// lot of virtual memory.
decode_fst = fst::ReadFstKaldi(fst_in_filename);
decoder = new kaldi::FasterDecoder(*decode_fst, decoder_opts);
}
Decoder::~Decoder() {
  if (word_syms) delete word_syms;
delete decode_fst;
delete decoder;
}
std::string Decoder::decode(
std::string key,
const std::vector<std::vector<kaldi::BaseFloat>>& log_probs) {
size_t num_frames = log_probs.size();
size_t dim_label = log_probs[0].size();
kaldi::Matrix<kaldi::BaseFloat> loglikes(
num_frames, dim_label, kaldi::kSetZero, kaldi::kStrideEqualNumCols);
for (size_t i = 0; i < num_frames; ++i) {
memcpy(loglikes.Data() + i * dim_label,
log_probs[i].data(),
sizeof(kaldi::BaseFloat) * dim_label);
}
return decode(key, loglikes);
}
std::vector<std::string> Decoder::decode(std::string posterior_rspecifier) {
kaldi::SequentialBaseFloatMatrixReader posterior_reader(posterior_rspecifier);
std::vector<std::string> decoding_results;
for (; !posterior_reader.Done(); posterior_reader.Next()) {
std::string key = posterior_reader.Key();
kaldi::Matrix<kaldi::BaseFloat> loglikes(posterior_reader.Value());
decoding_results.push_back(decode(key, loglikes));
}
return decoding_results;
}
std::string Decoder::decode(std::string key,
kaldi::Matrix<kaldi::BaseFloat>& loglikes) {
std::string decoding_result;
if (loglikes.NumRows() == 0) {
KALDI_WARN << "Zero-length utterance: " << key;
}
KALDI_ASSERT(loglikes.NumCols() == logprior.Dim());
loglikes.ApplyLog();
loglikes.AddVecToRows(-1.0, logprior);
kaldi::DecodableMatrixScaled decodable(loglikes, acoustic_scale);
decoder->Decode(&decodable);
VectorFst<kaldi::LatticeArc> decoded; // linear FST.
if ((allow_partial || decoder->ReachedFinal()) &&
decoder->GetBestPath(&decoded)) {
if (!decoder->ReachedFinal())
KALDI_WARN << "Decoder did not reach end-state, outputting partial "
"traceback.";
std::vector<int32> alignment;
std::vector<int32> words;
kaldi::LatticeWeight weight;
GetLinearSymbolSequence(decoded, &alignment, &words, &weight);
if (word_syms != NULL) {
for (size_t i = 0; i < words.size(); i++) {
std::string s = word_syms->Find(words[i]);
decoding_result += s;
if (s == "")
KALDI_ERR << "Word-id " << words[i] << " not in symbol table.";
}
}
}
return decoding_result;
}
/* Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include <string>
#include <vector>
#include "base/kaldi-common.h"
#include "base/timer.h"
#include "decoder/decodable-matrix.h"
#include "decoder/faster-decoder.h"
#include "fstext/fstext-lib.h"
#include "hmm/transition-model.h"
#include "lat/kaldi-lattice.h" // for {Compact}LatticeArc
#include "tree/context-dep.h"
#include "util/common-utils.h"
class Decoder {
public:
Decoder(std::string word_syms_filename,
std::string fst_in_filename,
std::string logprior_rxfilename);
~Decoder();
// Interface to accept the scores read from specifier and return
// the batch decoding results
std::vector<std::string> decode(std::string posterior_rspecifier);
// Accept the scores of one utterance and return the decoding result
std::string decode(
std::string key,
const std::vector<std::vector<kaldi::BaseFloat>> &log_probs);
private:
// For decoding one utterance
std::string decode(std::string key,
kaldi::Matrix<kaldi::BaseFloat> &loglikes);
fst::SymbolTable *word_syms;
fst::VectorFst<fst::StdArc> *decode_fst;
kaldi::FasterDecoder *decoder;
kaldi::Vector<kaldi::BaseFloat> logprior;
bool binary;
kaldi::BaseFloat acoustic_scale;
bool allow_partial;
};
/* Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>
#include "post_decode_faster.h"
namespace py = pybind11;
PYBIND11_MODULE(post_decode_faster, m) {
m.doc() = "Decoder for Deep ASR model";
py::class_<Decoder>(m, "Decoder")
.def(py::init<std::string, std::string, std::string>())
.def("decode",
(std::vector<std::string> (Decoder::*)(std::string)) &
Decoder::decode,
"Decode for the probability matrices in specifier "
"and return the transcriptions.")
.def(
"decode",
(std::string (Decoder::*)(
std::string, const std::vector<std::vector<kaldi::BaseFloat>>&)) &
Decoder::decode,
"Decode one input probability matrix "
"and return the transcription.");
}
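Once the extension is built (see setup.py and the build script below), it can be driven from Python roughly as follows. This is a hedged sketch: the asset paths are placeholders for the decoding graph, vocabulary and log-prior files, and the dummy scores only stand in for real model outputs.

from post_decode_faster import Decoder

# Placeholder paths; the real files ship with the Deep ASR example's data.
decoder = Decoder("decoder/graph/words.txt", "decoder/graph/TLG.fst",
                  "decoder/logprior")
log_probs = [[1.0 / 1749] * 1749] * 50  # dummy per-frame scores
print(decoder.decode("utter#0", log_probs))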
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import glob
from distutils.core import setup, Extension
from distutils.sysconfig import get_config_vars
try:
kaldi_root = os.environ['KALDI_ROOT']
except KeyError:
    raise ValueError("Environment variable 'KALDI_ROOT' is not defined. Please "
"install kaldi and export KALDI_ROOT=<kaldi's root dir> .")
args = [
'-std=c++11', '-Wno-sign-compare', '-Wno-unused-variable',
'-Wno-unused-local-typedefs', '-Wno-unused-but-set-variable',
'-Wno-deprecated-declarations', '-Wno-unused-function'
]
# remove warning about -Wstrict-prototypes
(opt, ) = get_config_vars('OPT')
os.environ['OPT'] = " ".join(flag for flag in opt.split()
if flag != '-Wstrict-prototypes')
os.environ['CC'] = 'g++'
LIBS = [
'fst', 'kaldi-base', 'kaldi-util', 'kaldi-matrix', 'kaldi-tree',
'kaldi-hmm', 'kaldi-fstext', 'kaldi-decoder', 'kaldi-lat'
]
LIB_DIRS = [
'tools/openfst/lib', 'src/base', 'src/matrix', 'src/util', 'src/tree',
'src/hmm', 'src/fstext', 'src/decoder', 'src/lat'
]
LIB_DIRS = [os.path.join(kaldi_root, path) for path in LIB_DIRS]
LIB_DIRS = [os.path.abspath(path) for path in LIB_DIRS]
ext_modules = [
Extension(
'post_decode_faster',
['pybind.cc', 'post_decode_faster.cc'],
include_dirs=[
'pybind11/include', '.', os.path.join(kaldi_root, 'src'),
os.path.join(kaldi_root, 'tools/openfst/src/include')
],
language='c++',
libraries=LIBS,
library_dirs=LIB_DIRS,
runtime_library_dirs=LIB_DIRS,
extra_compile_args=args, ),
]
setup(
name='post_decode_faster',
version='0.0.1',
author='Paddle',
author_email='',
description='Decoder for Deep ASR model',
ext_modules=ext_modules, )
set -e
if [ ! -d pybind11 ]; then
git clone https://github.com/pybind/pybind11.git
fi
python setup.py build_ext -i
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import argparse
import paddle.fluid as fluid
import data_utils.augmentor.trans_mean_variance_norm as trans_mean_variance_norm
import data_utils.augmentor.trans_add_delta as trans_add_delta
import data_utils.augmentor.trans_splice as trans_splice
import data_utils.data_reader as reader
from data_utils.util import lodtensor_to_ndarray
from data_utils.util import split_infer_result
def parse_args():
parser = argparse.ArgumentParser("Inference for stacked LSTMP model.")
parser.add_argument(
'--batch_size',
type=int,
default=32,
        help='The number of sequences in a batch. (default: %(default)d)')
parser.add_argument(
'--device',
type=str,
default='GPU',
choices=['CPU', 'GPU'],
help='The device type. (default: %(default)s)')
parser.add_argument(
'--mean_var',
type=str,
default='data/global_mean_var_search26kHr',
help="The path for feature's global mean and variance. "
"(default: %(default)s)")
parser.add_argument(
'--infer_feature_lst',
type=str,
default='data/infer_feature.lst',
help='The feature list path for inference. (default: %(default)s)')
parser.add_argument(
'--infer_label_lst',
type=str,
default='data/infer_label.lst',
help='The label list path for inference. (default: %(default)s)')
parser.add_argument(
'--infer_model_path',
type=str,
default='./infer_models/deep_asr.pass_0.infer.model/',
help='The directory for loading inference model. '
'(default: %(default)s)')
args = parser.parse_args()
return args
def print_arguments(args):
print('----------- Configuration Arguments -----------')
    for arg, value in sorted(vars(args).items()):
print('%s: %s' % (arg, value))
print('------------------------------------------------')
def infer(args):
""" Gets one batch of feature data and predicts labels for each sample.
"""
if not os.path.exists(args.infer_model_path):
raise IOError("Invalid inference model path!")
place = fluid.CUDAPlace(0) if args.device == 'GPU' else fluid.CPUPlace()
exe = fluid.Executor(place)
# load model
[infer_program, feed_dict,
fetch_targets] = fluid.io.load_inference_model(args.infer_model_path, exe)
ltrans = [
trans_add_delta.TransAddDelta(2, 2),
trans_mean_variance_norm.TransMeanVarianceNorm(args.mean_var),
trans_splice.TransSplice()
]
infer_data_reader = reader.DataReader(args.infer_feature_lst,
args.infer_label_lst)
infer_data_reader.set_transformers(ltrans)
feature_t = fluid.LoDTensor()
    one_batch = next(infer_data_reader.batch_iterator(args.batch_size, 1))
(features, labels, lod) = one_batch
feature_t.set(features, place)
feature_t.set_lod([lod])
results = exe.run(infer_program,
feed={feed_dict[0]: feature_t},
fetch_list=fetch_targets,
return_numpy=False)
probs, lod = lodtensor_to_ndarray(results[0])
preds = probs.argmax(axis=1)
infer_batch = split_infer_result(preds, lod)
for index, sample in enumerate(infer_batch):
print("result %d: " % index, sample, '\n')
if __name__ == '__main__':
args = parse_args()
print_arguments(args)
infer(args)
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import sys
import os
import numpy as np
import argparse
import time
import paddle.fluid as fluid
import data_utils.augmentor.trans_mean_variance_norm as trans_mean_variance_norm
import data_utils.augmentor.trans_add_delta as trans_add_delta
import data_utils.augmentor.trans_splice as trans_splice
import data_utils.async_data_reader as reader
from decoder.post_decode_faster import Decoder
from data_utils.util import lodtensor_to_ndarray
from model_utils.model import stacked_lstmp_model
from data_utils.util import split_infer_result
def parse_args():
parser = argparse.ArgumentParser("Run inference by using checkpoint.")
parser.add_argument(
'--batch_size',
type=int,
default=32,
        help='The number of sequences in a batch. (default: %(default)d)')
parser.add_argument(
'--minimum_batch_size',
type=int,
default=1,
        help='The minimum number of sequences in a batch. '
'(default: %(default)d)')
parser.add_argument(
'--frame_dim',
type=int,
default=120 * 11,
help='Frame dimension of feature data. (default: %(default)d)')
parser.add_argument(
'--stacked_num',
type=int,
default=5,
help='Number of lstmp layers to stack. (default: %(default)d)')
parser.add_argument(
'--proj_dim',
type=int,
default=512,
help='Project size of lstmp unit. (default: %(default)d)')
parser.add_argument(
'--hidden_dim',
type=int,
default=1024,
help='Hidden size of lstmp unit. (default: %(default)d)')
parser.add_argument(
'--class_num',
type=int,
default=1749,
help='Number of classes in label. (default: %(default)d)')
parser.add_argument(
'--learning_rate',
type=float,
default=0.00016,
help='Learning rate used to train. (default: %(default)f)')
parser.add_argument(
'--device',
type=str,
default='GPU',
choices=['CPU', 'GPU'],
help='The device type. (default: %(default)s)')
parser.add_argument(
'--parallel', action='store_true', help='If set, run in parallel.')
parser.add_argument(
'--mean_var',
type=str,
default='data/global_mean_var_search26kHr',
help="The path for feature's global mean and variance. "
"(default: %(default)s)")
parser.add_argument(
'--infer_feature_lst',
type=str,
default='data/infer_feature.lst',
help='The feature list path for inference. (default: %(default)s)')
parser.add_argument(
'--infer_label_lst',
type=str,
default='data/infer_label.lst',
help='The label list path for inference. (default: %(default)s)')
parser.add_argument(
'--checkpoint',
type=str,
default='./checkpoint',
help="The checkpoint path to init model. (default: %(default)s)")
parser.add_argument(
'--vocabulary',
type=str,
default='./decoder/graph/words.txt',
help="The path to vocabulary. (default: %(default)s)")
parser.add_argument(
'--graphs',
type=str,
default='./decoder/graph/TLG.fst',
help="The path to TLG graphs for decoding. (default: %(default)s)")
parser.add_argument(
'--log_prior',
type=str,
default="./decoder/logprior",
help="The log prior probs for training data. (default: %(default)s)")
args = parser.parse_args()
return args
def print_arguments(args):
print('----------- Configuration Arguments -----------')
    for arg, value in sorted(vars(args).items()):
print('%s: %s' % (arg, value))
print('------------------------------------------------')
def infer_from_ckpt(args):
"""Inference by using checkpoint."""
if not os.path.exists(args.checkpoint):
raise IOError("Invalid checkpoint!")
prediction, avg_cost, accuracy = stacked_lstmp_model(
frame_dim=args.frame_dim,
hidden_dim=args.hidden_dim,
proj_dim=args.proj_dim,
stacked_num=args.stacked_num,
class_num=args.class_num,
parallel=args.parallel)
infer_program = fluid.default_main_program().clone()
optimizer = fluid.optimizer.Adam(learning_rate=args.learning_rate)
optimizer.minimize(avg_cost)
place = fluid.CPUPlace() if args.device == 'CPU' else fluid.CUDAPlace(0)
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
# load checkpoint.
fluid.io.load_persistables(exe, args.checkpoint)
ltrans = [
trans_add_delta.TransAddDelta(2, 2),
trans_mean_variance_norm.TransMeanVarianceNorm(args.mean_var),
trans_splice.TransSplice()
]
feature_t = fluid.LoDTensor()
label_t = fluid.LoDTensor()
# infer data reader
infer_data_reader = reader.AsyncDataReader(args.infer_feature_lst,
args.infer_label_lst)
infer_data_reader.set_transformers(ltrans)
infer_costs, infer_accs = [], []
for batch_id, batch_data in enumerate(
infer_data_reader.batch_iterator(args.batch_size,
args.minimum_batch_size)):
# load_data
(features, labels, lod) = batch_data
feature_t.set(features.ndarray, place)
feature_t.set_lod([lod.ndarray])
label_t.set(labels.ndarray, place)
label_t.set_lod([lod.ndarray])
infer_data_reader.recycle(features, labels, lod)
results = exe.run(infer_program,
feed={"feature": feature_t,
"label": label_t},
fetch_list=[prediction, avg_cost, accuracy],
return_numpy=False)
infer_costs.append(lodtensor_to_ndarray(results[1])[0])
infer_accs.append(lodtensor_to_ndarray(results[2])[0])
probs, lod = lodtensor_to_ndarray(results[0])
infer_batch = split_infer_result(probs, lod)
for index, sample in enumerate(infer_batch):
key = "utter#%d" % (batch_id * args.batch_size + index)
print(key, ": ", decoder.decode(key, sample), "\n")
print(np.mean(infer_costs), np.mean(infer_accs))
if __name__ == '__main__':
args = parse_args()
print_arguments(args)
infer_from_ckpt(args)
@@ -3,10 +3,11 @@ from __future__ import division
from __future__ import print_function
import paddle.fluid as fluid
def stacked_lstmp_model(frame_dim,
                        hidden_dim,
proj_dim,
stacked_num,
class_num,
@@ -20,12 +21,13 @@ def stacked_lstmp_model(hidden_dim,
label data respectively. And in inference, only `feature` is needed.
Args:
        frame_dim(int): The frame dimension of feature data.
        hidden_dim(int): The hidden state's dimension of the LSTMP layer.
        proj_dim(int): The projection size of the LSTMP layer.
        stacked_num(int): The number of stacked LSTMP layers.
        parallel(bool): Run in parallel or not, default `False`.
        is_train(bool): Run in training phase or not, default `True`.
        class_num(int): The number of output classes.
"""
# network configuration
@@ -78,7 +80,7 @@ def stacked_lstmp_model(hidden_dim,
# data feeder
feature = fluid.layers.data(
name="feature", shape=[-1, 120 * 11], dtype="float32", lod_level=1)
name="feature", shape=[-1, frame_dim], dtype="float32", lod_level=1)
label = fluid.layers.data(
name="label", shape=[-1, 1], dtype="int64", lod_level=1)
@@ -92,11 +94,12 @@ def stacked_lstmp_model(hidden_dim,
feat_ = pd.read_input(feature)
label_ = pd.read_input(label)
prediction, avg_cost, acc = _net_conf(feat_, label_)
            for out in [prediction, avg_cost, acc]:
pd.write_output(out)
        # get the mean loss and acc over all devices
        prediction, avg_cost, acc = pd()
prediction.stop_gradient = True
avg_cost = fluid.layers.mean(x=avg_cost)
acc = fluid.layers.mean(x=acc)
else:
......
@@ -7,13 +7,13 @@ import numpy as np
import argparse
import time
import paddle.fluid as fluid
import paddle.fluid.profiler as profiler
import _init_paths
import data_utils.augmentor.trans_mean_variance_norm as trans_mean_variance_norm
import data_utils.augmentor.trans_add_delta as trans_add_delta
import data_utils.augmentor.trans_splice as trans_splice
import data_utils.async_data_reader as reader
from model_utils.model import stacked_lstmp_model
from data_utils.util import lodtensor_to_ndarray
@@ -31,6 +31,11 @@ def parse_args():
default=1,
        help='The minimum number of sequences in a batch. '
'(default: %(default)d)')
parser.add_argument(
'--frame_dim',
type=int,
default=120 * 11,
help='Frame dimension of feature data. (default: %(default)d)')
parser.add_argument(
'--stacked_num',
type=int,
@@ -46,10 +51,15 @@ def parse_args():
type=int,
default=1024,
help='Hidden size of lstmp unit. (default: %(default)d)')
parser.add_argument(
'--class_num',
type=int,
default=1749,
help='Number of classes in label. (default: %(default)d)')
parser.add_argument(
'--learning_rate',
type=float,
        default=0.00016,
help='Learning rate used to train. (default: %(default)f)')
parser.add_argument(
'--device',
@@ -119,14 +129,15 @@ def profile(args):
"arg 'first_batches_to_skip' must not be smaller than 0.")
_, avg_cost, accuracy = stacked_lstmp_model(
frame_dim=args.frame_dim,
hidden_dim=args.hidden_dim,
proj_dim=args.proj_dim,
stacked_num=args.stacked_num,
        class_num=args.class_num,
parallel=args.parallel)
    optimizer = fluid.optimizer.Adam(learning_rate=args.learning_rate)
    optimizer.minimize(avg_cost)
place = fluid.CPUPlace() if args.device == 'CPU' else fluid.CUDAPlace(0)
exe = fluid.Executor(place)
@@ -138,7 +149,7 @@ def profile(args):
trans_splice.TransSplice()
]
    data_reader = reader.AsyncDataReader(args.feature_lst, args.label_lst)
data_reader.set_transformers(ltrans)
feature_t = fluid.LoDTensor()
@@ -158,17 +169,20 @@ def profile(args):
frames_seen = 0
# load_data
(features, labels, lod) = batch_data
        feature_t.set(features.ndarray, place)
        feature_t.set_lod([lod.ndarray])
        label_t.set(labels.ndarray, place)
        label_t.set_lod([lod.ndarray])
        frames_seen += lod.ndarray[-1]
data_reader.recycle(features, labels, lod)
outs = exe.run(fluid.default_main_program(),
feed={"feature": feature_t,
"label": label_t},
fetch_list=[avg_cost, accuracy]
if args.print_train_acc else [],
return_numpy=False)
if args.print_train_acc:
......
@@ -8,11 +8,11 @@ import numpy as np
import argparse
import time
import paddle.fluid as fluid
import data_utils.augmentor.trans_mean_variance_norm as trans_mean_variance_norm
import data_utils.augmentor.trans_add_delta as trans_add_delta
import data_utils.augmentor.trans_splice as trans_splice
import data_utils.async_data_reader as reader
from data_utils.util import lodtensor_to_ndarray
from model_utils.model import stacked_lstmp_model
@@ -30,21 +30,31 @@ def parse_args():
default=1,
        help='The minimum number of sequences in a batch. '
'(default: %(default)d)')
parser.add_argument(
'--frame_dim',
type=int,
default=120 * 11,
help='Frame dimension of feature data. (default: %(default)d)')
parser.add_argument(
'--stacked_num',
type=int,
default=5,
        help='Number of lstmp layers to stack. (default: %(default)d)')
parser.add_argument(
'--proj_dim',
type=int,
default=512,
        help='Project size of lstmp unit. (default: %(default)d)')
parser.add_argument(
'--hidden_dim',
type=int,
default=1024,
        help='Hidden size of lstmp unit. (default: %(default)d)')
parser.add_argument(
'--class_num',
type=int,
default=1749,
help='Number of classes in label. (default: %(default)d)')
parser.add_argument(
'--pass_num',
type=int,
@@ -58,7 +68,7 @@ def parse_args():
parser.add_argument(
'--learning_rate',
type=float,
        default=0.00016,
help='Learning rate used to train. (default: %(default)f)')
parser.add_argument(
'--device',
@@ -72,33 +82,46 @@ def parse_args():
'--mean_var',
type=str,
default='data/global_mean_var_search26kHr',
        help="The path for feature's global mean and variance. "
        "(default: %(default)s)")
parser.add_argument(
'--train_feature_lst',
type=str,
default='data/feature.lst',
        help='The feature list path for training. (default: %(default)s)')
parser.add_argument(
'--train_label_lst',
type=str,
default='data/label.lst',
        help='The label list path for training. (default: %(default)s)')
parser.add_argument(
'--val_feature_lst',
type=str,
default='data/val_feature.lst',
        help='The feature list path for validation. (default: %(default)s)')
parser.add_argument(
'--val_label_lst',
type=str,
default='data/val_label.lst',
        help='The label list path for validation. (default: %(default)s)')
parser.add_argument(
'--init_model_path',
type=str,
default=None,
help="The model (checkpoint) path which the training resumes from. "
"If None, train the model from scratch. (default: %(default)s)")
parser.add_argument(
        '--checkpoints',
        type=str,
        default='./checkpoints',
        help="The directory for saving checkpoints. Do not save checkpoints "
        "if set to ''. (default: %(default)s)")
parser.add_argument(
'--infer_models',
type=str,
default='./infer_models',
help="The directory for saving inference models. Do not save inference "
"models if set to ''. (default: %(default)s)")
args = parser.parse_args()
return args
@@ -114,27 +137,37 @@ def train(args):
"""train in loop.
"""
# paths check
if args.init_model_path is not None and \
not os.path.exists(args.init_model_path):
raise IOError("Invalid initial model path!")
if args.checkpoints != '' and not os.path.exists(args.checkpoints):
os.mkdir(args.checkpoints)
if args.infer_models != '' and not os.path.exists(args.infer_models):
os.mkdir(args.infer_models)
prediction, avg_cost, accuracy = stacked_lstmp_model(
frame_dim=args.frame_dim,
hidden_dim=args.hidden_dim,
proj_dim=args.proj_dim,
stacked_num=args.stacked_num,
class_num=1749,
class_num=args.class_num,
parallel=args.parallel)
    # program for test
    test_program = fluid.default_main_program().clone()

    optimizer = fluid.optimizer.Adam(learning_rate=args.learning_rate)
    optimizer.minimize(avg_cost)
place = fluid.CPUPlace() if args.device == 'CPU' else fluid.CUDAPlace(0)
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
# resume training if initial model provided.
if args.init_model_path is not None:
fluid.io.load_persistables(exe, args.init_model_path)
ltrans = [
trans_add_delta.TransAddDelta(2, 2),
trans_mean_variance_norm.TransMeanVarianceNorm(args.mean_var),
@@ -151,8 +184,8 @@ def train(args):
os.path.exists(args.val_label_lst)):
return -1.0, -1.0
# test data reader
        test_data_reader = reader.AsyncDataReader(args.val_feature_lst,
                                                  args.val_label_lst)
test_data_reader.set_transformers(ltrans)
test_costs, test_accs = [], []
for batch_id, batch_data in enumerate(
@@ -160,10 +193,12 @@ def train(args):
args.minimum_batch_size)):
# load_data
(features, labels, lod) = batch_data
            feature_t.set(features.ndarray, place)
            feature_t.set_lod([lod.ndarray])
            label_t.set(labels.ndarray, place)
            label_t.set_lod([lod.ndarray])
test_data_reader.recycle(features, labels, lod)
cost, acc = exe.run(test_program,
feed={"feature": feature_t,
@@ -175,8 +210,8 @@ def train(args):
return np.mean(test_costs), np.mean(test_accs)
# train data reader
    train_data_reader = reader.AsyncDataReader(args.train_feature_lst,
                                               args.train_label_lst, -1)
train_data_reader.set_transformers(ltrans)
# train
    for pass_id in range(args.pass_num):
@@ -186,30 +221,46 @@ def train(args):
args.minimum_batch_size)):
# load_data
(features, labels, lod) = batch_data
            feature_t.set(features.ndarray, place)
            feature_t.set_lod([lod.ndarray])
            label_t.set(labels.ndarray, place)
            label_t.set_lod([lod.ndarray])
train_data_reader.recycle(features, labels, lod)
to_print = batch_id > 0 and (batch_id % args.print_per_batches == 0)
outs = exe.run(fluid.default_main_program(),
feed={"feature": feature_t,
"label": label_t},
fetch_list=[avg_cost, accuracy] if to_print else [],
return_numpy=False)
            if to_print:
print("\nBatch %d, train cost: %f, train acc: %f" %
                      (batch_id, lodtensor_to_ndarray(outs[0])[0],
                       lodtensor_to_ndarray(outs[1])[0]))
# save the latest checkpoint
if args.checkpoints != '':
model_path = os.path.join(args.checkpoints,
"deep_asr.latest.checkpoint")
fluid.io.save_persistables(exe, model_path)
else:
sys.stdout.write('.')
sys.stdout.flush()
# run test
val_cost, val_acc = test(exe)
        # save checkpoint per pass
        if args.checkpoints != '':
            model_path = os.path.join(
                args.checkpoints,
                "deep_asr.pass_" + str(pass_id) + ".checkpoint")
fluid.io.save_persistables(exe, model_path)
# save inference model
if args.infer_models != '':
model_path = os.path.join(
args.infer_models,
"deep_asr.pass_" + str(pass_id) + ".infer.model")
fluid.io.save_inference_model(model_path, ["feature"],
[prediction], exe)
# cal pass time
@@ -224,7 +275,4 @@ if __name__ == '__main__':
args = parse_args()
print_arguments(args)
train(args)
# Paddle Fluid Models
---
The Paddle Fluid models are a collection of example models that use Paddle Fluid APIs. Currently, example codes in this directory are still under active development.
The minimum PaddlePaddle version needed for the code sample in this directory is the latest develop branch. If you are on a version of PaddlePaddle earlier than this, [please update your installation](http://www.paddlepaddle.org/docs/develop/documentation/en/build_and_install/pip_install_en.html).
---
# Advbox
Advbox is a Python toolbox to create adversarial examples that fool neural networks. It requires Python and PaddlePaddle.
......
"""
A set of tools for generating adversarial examples on the PaddlePaddle platform
"""
from . import attacks
from . import models
from .adversary import Adversary
@@ -18,13 +18,15 @@ class Adversary(object):
"""
assert original is not None
        self.original_label = original_label
        self.target_label = None
        self.adversarial_label = None

        self.__original = original
        self.__target = None
        self.__is_targeted_attack = False
        self.__adversarial_example = None
        self.__bad_adversarial_example = None
def set_target(self, is_targeted_attack, target=None, target_label=None):
"""
@@ -38,10 +40,10 @@ class Adversary(object):
"""
assert (target_label is None) or is_targeted_attack
self.__is_targeted_attack = is_targeted_attack
        self.target_label = target_label
self.__target = target
if not is_targeted_attack:
            self.target_label = None
self.__target = None
def set_original(self, original, original_label=None):
@@ -53,10 +55,11 @@ class Adversary(object):
"""
if original != self.__original:
self.__original = original
            self.original_label = original_label
self.__adversarial_example = None
self.__bad_adversarial_example = None
if original is None:
                self.original_label = None
def _is_successful(self, adversarial_label):
"""
@@ -65,11 +68,11 @@ class Adversary(object):
:param adversarial_label: adversarial label.
:return: bool
"""
        if self.target_label is not None:
            return adversarial_label == self.target_label
else:
return (adversarial_label is not None) and \
                   (adversarial_label != self.original_label)
def is_successful(self):
"""
@@ -77,7 +80,7 @@ class Adversary(object):
:return: bool
"""
        return self._is_successful(self.adversarial_label)
def try_accept_the_example(self, adversarial_example, adversarial_label):
"""
@@ -93,7 +96,9 @@ class Adversary(object):
ok = self._is_successful(adversarial_label)
if ok:
self.__adversarial_example = adversarial_example
            self.adversarial_label = adversarial_label
else:
self.__bad_adversarial_example = adversarial_example
return ok
def perturbation(self, multiplying_factor=1.0):
@@ -104,9 +109,14 @@ class Adversary(object):
:return: The perturbation that is multiplied by multiplying_factor.
"""
assert self.__original is not None
assert (self.__adversarial_example is not None) or \
(self.__bad_adversarial_example is not None)
if self.__adversarial_example is not None:
return multiplying_factor * (
self.__adversarial_example - self.__original)
else:
return multiplying_factor * (
self.__bad_adversarial_example - self.__original)
@property
def is_targeted_attack(self):
@@ -115,20 +125,6 @@ class Adversary(object):
"""
return self.__is_targeted_attack
@property
def target(self):
"""
@@ -143,20 +139,6 @@ class Adversary(object):
"""
return self.__original
@property
def adversarial_example(self):
"""
@@ -164,23 +146,9 @@ class Adversary(object):
"""
return self.__adversarial_example
    @property
    def bad_adversarial_example(self):
        """
        :property: bad_adversarial_example
        """
        return self.__bad_adversarial_example
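To tie the pieces together, a hedged end-to-end sketch (the `model`, `image` and `label` objects are assumed inputs; `model` stands for any Advbox model wrapper exposing predict/gradient/bounds, and module paths may differ in your checkout):

from advbox.adversary import Adversary
from advbox.attacks.gradientsign import FGSM

attack = FGSM(model)                 # model: an Advbox model wrapper (assumed)
adversary = Adversary(image, label)  # image/label: one test sample (assumed)
adversary = attack(adversary)        # Attack.__call__ runs _apply
if adversary.is_successful():
    adv_image = adversary.adversarial_example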
"""
Attack methods __init__.py
"""
from .base import Attack
from .deepfool import DeepFoolAttack
from .gradientsign import FGSM
from .gradientsign import GradientSignAttack
from .iterator_gradientsign import IFGSM
from .iterator_gradientsign import IteratorGradientSignAttack
@@ -52,21 +52,23 @@ class Attack(object):
:param adversary: adversary
:return: None
"""
        if adversary.original_label is None:
            adversary.original_label = np.argmax(
                self.model.predict(adversary.original))
        if adversary.is_targeted_attack and adversary.target_label is None:
            if adversary.target is None:
                raise ValueError(
                    'When adversary.is_targeted_attack is True, '
                    'adversary.target_label or adversary.target must be set.')
            else:
                adversary.target_label = np.argmax(
                    self.model.predict(adversary.target))

        logging.info('adversary:'
                     '\n original_label: {}'
                     '\n target_label: {}'
                     '\n is_targeted_attack: {}'
''.format(adversary.original_label, adversary.target_label,
adversary.is_targeted_attack))
@@ -10,6 +10,8 @@ import numpy as np
from .base import Attack
__all__ = ['DeepFoolAttack']
class DeepFoolAttack(Attack):
"""
@@ -56,7 +58,7 @@ class DeepFoolAttack(Attack):
gradient_k = self.model.gradient(x, k)
w_k = gradient_k - gradient
f_k = f[k] - f[pre_label]
                w_k_norm = np.linalg.norm(w_k.flatten()) + 1e-8
pert_k = (np.abs(f_k) + 1e-8) / w_k_norm
if pert_k < pert:
pert = pert_k
@@ -70,9 +72,12 @@ class DeepFoolAttack(Attack):
f = self.model.predict(x)
gradient = self.model.gradient(x, pre_label)
adv_label = np.argmax(f)
            logging.info(
                'iteration={}, f[pre_label]={}, f[target_label]={}, '
                'f[adv_label]={}, pre_label={}, adv_label={}'.format(
                    iteration, f[pre_label],
                    f[adversary.target_label]
                    if adversary.is_targeted_attack else 'NaN',
                    f[adv_label], pre_label, adv_label))
if adversary.try_accept_the_example(x, adv_label):
return adversary
......
"""
This module provides gradient-based attack methods, including FGSM and its
iterative variants.
"""
from __future__ import division
import logging
from collections import Iterable
import numpy as np
from .base import Attack
__all__ = [
'GradientMethodAttack', 'FastGradientSignMethodAttack', 'FGSM',
'FastGradientSignMethodTargetedAttack', 'FGSMT',
'BasicIterativeMethodAttack', 'BIM',
'IterativeLeastLikelyClassMethodAttack', 'ILCM'
]
class GradientMethodAttack(Attack):
"""
This class implements gradient attack method, and is the base of FGSM, BIM,
ILCM, etc.
"""
def __init__(self, model, support_targeted=True):
"""
:param model(model): The model to be attacked.
        :param support_targeted(bool): Whether this attack method supports targeted attack.
"""
super(GradientMethodAttack, self).__init__(model)
self.support_targeted = support_targeted
def _apply(self, adversary, norm_ord=np.inf, epsilons=0.01, steps=100):
"""
Apply the gradient attack method.
:param adversary(Adversary):
The Adversary object.
:param norm_ord(int):
Order of the norm, such as np.inf, 1, 2, etc. It can't be 0.
:param epsilons(list|tuple|int):
Attack step size (input variation).
:param steps:
            The number of iteration steps.
:return:
adversary(Adversary): The Adversary object.
"""
if norm_ord == 0:
raise ValueError("L0 norm is not supported!")
if not self.support_targeted:
if adversary.is_targeted_attack:
raise ValueError(
"This attack method doesn't support targeted attack!")
if not isinstance(epsilons, Iterable):
epsilons = np.linspace(epsilons, epsilons + 1e-10, num=steps)
pre_label = adversary.original_label
min_, max_ = self.model.bounds()
assert self.model.channel_axis() == adversary.original.ndim
assert (self.model.channel_axis() == 1 or
self.model.channel_axis() == adversary.original.shape[0] or
self.model.channel_axis() == adversary.original.shape[-1])
step = 1
adv_img = adversary.original
for epsilon in epsilons[:steps]:
if epsilon == 0.0:
continue
if adversary.is_targeted_attack:
gradient = -self.model.gradient(adv_img, adversary.target_label)
else:
gradient = self.model.gradient(adv_img,
adversary.original_label)
if norm_ord == np.inf:
gradient_norm = np.sign(gradient)
else:
gradient_norm = gradient / self._norm(gradient, ord=norm_ord)
adv_img = adv_img + epsilon * gradient_norm * (max_ - min_)
adv_img = np.clip(adv_img, min_, max_)
adv_label = np.argmax(self.model.predict(adv_img))
logging.info('step={}, epsilon = {:.5f}, pre_label = {}, '
'adv_label={}'.format(step, epsilon, pre_label,
adv_label))
if adversary.try_accept_the_example(adv_img, adv_label):
return adversary
step += 1
return adversary
@staticmethod
def _norm(a, ord):
if a.ndim == 1:
return np.linalg.norm(a, ord=ord)
if a.ndim == a.shape[0]:
norm_shape = (a.ndim, reduce(np.dot, a.shape[1:]))
norm_axis = 1
else:
norm_shape = (reduce(np.dot, a.shape[:-1]), a.ndim)
norm_axis = 0
return np.linalg.norm(a.reshape(norm_shape), ord=ord, axis=norm_axis)
class FastGradientSignMethodTargetedAttack(GradientMethodAttack):
"""
"Fast Gradient Sign Method" is extended to support targeted attack.
"Fast Gradient Sign Method" was originally implemented by Goodfellow et
al. (2015) with the infinity norm.
Paper link: https://arxiv.org/abs/1412.6572
"""
def _apply(self, adversary, epsilons=0.03):
return GradientMethodAttack._apply(
self,
adversary=adversary,
norm_ord=np.inf,
epsilons=epsilons,
steps=1)
class FastGradientSignMethodAttack(FastGradientSignMethodTargetedAttack):
"""
This attack was originally implemented by Goodfellow et al. (2015) with the
infinity norm, and is known as the "Fast Gradient Sign Method".
Paper link: https://arxiv.org/abs/1412.6572
"""
def __init__(self, model):
super(FastGradientSignMethodAttack, self).__init__(model, False)
class IterativeLeastLikelyClassMethodAttack(GradientMethodAttack):
"""
"Iterative Least-likely Class Method (ILCM)" extends "BIM" to support
targeted attack.
"The Basic Iterative Method (BIM)" is to extend "FSGM". "BIM" iteratively
take multiple small steps while adjusting the direction after each step.
Paper link: https://arxiv.org/abs/1607.02533
"""
def _apply(self, adversary, epsilons=0.001, steps=1000):
return GradientMethodAttack._apply(
self,
adversary=adversary,
norm_ord=np.inf,
epsilons=epsilons,
steps=steps)
class BasicIterativeMethodAttack(IterativeLeastLikelyClassMethodAttack):
"""
    FGSM is a one-step method. "The Basic Iterative Method (BIM)" iteratively
    takes multiple small steps, adjusting the direction after each step.
Paper link: https://arxiv.org/abs/1607.02533
"""
def __init__(self, model):
super(BasicIterativeMethodAttack, self).__init__(model, False)
FGSM = FastGradientSignMethodAttack
FGSMT = FastGradientSignMethodTargetedAttack
BIM = BasicIterativeMethodAttack
ILCM = IterativeLeastLikelyClassMethodAttack
"""
This module provides the FGSM attack method.
"""
from __future__ import division
import logging
from collections import Iterable
import numpy as np
from .base import Attack
class GradientSignAttack(Attack):
"""
This attack was originally implemented by Goodfellow et al. (2015) with the
    infinity norm, and is known as the "Fast Gradient Sign Method".
Paper link: https://arxiv.org/abs/1412.6572
"""
def _apply(self, adversary, epsilons=1000):
"""
Apply the gradient sign attack.
Args:
adversary(Adversary): The Adversary object.
epsilons(list|tuple|int): The epsilon (input variation parameter).
Return:
adversary: The Adversary object.
"""
assert adversary is not None
if not isinstance(epsilons, Iterable):
epsilons = np.linspace(0, 1, num=epsilons + 1)[1:]
pre_label = adversary.original_label
min_, max_ = self.model.bounds()
if adversary.is_targeted_attack:
gradient = self.model.gradient(adversary.original,
adversary.target_label)
gradient_sign = -np.sign(gradient) * (max_ - min_)
else:
gradient = self.model.gradient(adversary.original,
adversary.original_label)
gradient_sign = np.sign(gradient) * (max_ - min_)
for epsilon in epsilons:
adv_img = adversary.original + epsilon * gradient_sign
adv_img = np.clip(adv_img, min_, max_)
adv_label = np.argmax(self.model.predict(adv_img))
logging.info('epsilon = {:.3f}, pre_label = {}, adv_label={}'.
format(epsilon, pre_label, adv_label))
if adversary.try_accept_the_example(adv_img, adv_label):
return adversary
return adversary
FGSM = GradientSignAttack
"""
This module provides the iterative FGSM attack method.
"""
from __future__ import division
import logging
from collections import Iterable
import numpy as np
from .base import Attack
class IteratorGradientSignAttack(Attack):
"""
    This attack was originally implemented by Alexey Kurakin (Google Brain).
Paper link: https://arxiv.org/pdf/1607.02533.pdf
"""
def _apply(self, adversary, epsilons=100, steps=10):
"""
Apply the iterative gradient sign attack.
Args:
adversary(Adversary): The Adversary object.
epsilons(list|tuple|int): The epsilon (input variation parameter).
            steps(int): The number of iteration steps.
Return:
adversary(Adversary): The Adversary object.
"""
if not isinstance(epsilons, Iterable):
epsilons = np.linspace(0, 1 / steps, num=epsilons + 1)[1:]
pre_label = adversary.original_label
min_, max_ = self.model.bounds()
for epsilon in epsilons:
adv_img = adversary.original
for _ in range(steps):
                if adversary.is_targeted_attack:
                    # Re-evaluate the gradient at the current iterate, as the
                    # iterative method prescribes, rather than at the original.
                    gradient = self.model.gradient(adv_img,
                                                   adversary.target_label)
                    gradient_sign = -np.sign(gradient) * (max_ - min_)
                else:
                    gradient = self.model.gradient(adv_img,
                                                   adversary.original_label)
                    gradient_sign = np.sign(gradient) * (max_ - min_)
adv_img = adv_img + gradient_sign * epsilon
adv_img = np.clip(adv_img, min_, max_)
adv_label = np.argmax(self.model.predict(adv_img))
logging.info('epsilon = {:.3f}, pre_label = {}, adv_label={}'.
format(epsilon, pre_label, adv_label))
if adversary.try_accept_the_example(adv_img, adv_label):
return adversary
return adversary
IFGSM = IteratorGradientSignAttack
"""
This module provides the "LBFGS" attack method.
"""
from __future__ import division
import logging
import numpy as np
from scipy.optimize import fmin_l_bfgs_b
from .base import Attack
__all__ = ['LBFGSAttack', 'LBFGS']
class LBFGSAttack(Attack):
"""
Uses L-BFGS-B to minimize the cross-entropy and the distance between the
original and the adversary.
Paper link: https://arxiv.org/abs/1510.05328
"""
def __init__(self, model):
super(LBFGSAttack, self).__init__(model)
self._predicts_normalized = None
self._adversary = None # type: Adversary
def _apply(self, adversary, epsilon=0.001, steps=10):
self._adversary = adversary
if not adversary.is_targeted_attack:
raise ValueError("This attack method only support targeted attack!")
# finding initial c
logging.info('finding initial c...')
c = epsilon
x0 = adversary.original.flatten()
for i in range(30):
c = 2 * c
logging.info('c={}'.format(c))
is_adversary = self._lbfgsb(x0, c, steps)
if is_adversary:
break
if not is_adversary:
logging.info('Failed!')
return adversary
# binary search c
logging.info('binary search c...')
c_low = 0
c_high = c
while c_high - c_low >= epsilon:
logging.info('c_high={}, c_low={}, diff={}, epsilon={}'
.format(c_high, c_low, c_high - c_low, epsilon))
c_half = (c_low + c_high) / 2
is_adversary = self._lbfgsb(x0, c_half, steps)
if is_adversary:
c_high = c_half
else:
c_low = c_half
return adversary
def _is_predicts_normalized(self, predicts):
"""
        Determine whether the model's predictions are normalized probabilities.
:param predicts(np.array): the output of the model.
:return: bool
"""
if self._predicts_normalized is None:
if self.model.predict_name().lower() in [
'softmax', 'probabilities', 'probs'
]:
self._predicts_normalized = True
else:
if np.any(predicts < 0.0):
self._predicts_normalized = False
else:
s = np.sum(predicts.flatten())
if 0.999 <= s <= 1.001:
self._predicts_normalized = True
else:
self._predicts_normalized = False
assert self._predicts_normalized is not None
return self._predicts_normalized
def _loss(self, adv_x, c):
"""
To get the loss and gradient.
:param adv_x: the candidate adversarial example
:param c: parameter 'C' in the paper
:return: (loss, gradient)
"""
x = adv_x.reshape(self._adversary.original.shape)
# cross_entropy
logits = self.model.predict(x)
if not self._is_predicts_normalized(logits): # to softmax
e = np.exp(logits)
logits = e / np.sum(e)
e = np.exp(logits)
s = np.sum(e)
ce = np.log(s) - logits[self._adversary.target_label]
# L2 distance
min_, max_ = self.model.bounds()
d = np.sum((x - self._adversary.original).flatten() ** 2) \
/ ((max_ - min_) ** 2) / len(adv_x)
# gradient
gradient = self.model.gradient(x, self._adversary.target_label)
result = (c * ce + d).astype(float), gradient.flatten().astype(float)
return result
def _lbfgsb(self, x0, c, maxiter):
min_, max_ = self.model.bounds()
bounds = [(min_, max_)] * len(x0)
approx_grad_eps = (max_ - min_) / 100.0
x, f, d = fmin_l_bfgs_b(
self._loss,
x0,
args=(c, ),
bounds=bounds,
maxiter=maxiter,
epsilon=approx_grad_eps)
if np.amax(x) > max_ or np.amin(x) < min_:
x = np.clip(x, min_, max_)
shape = self._adversary.original.shape
adv_label = np.argmax(self.model.predict(x.reshape(shape)))
logging.info('pre_label = {}, adv_label={}'.format(
self._adversary.target_label, adv_label))
return self._adversary.try_accept_the_example(
x.reshape(shape), adv_label)
LBFGS = LBFGSAttack
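# A minimal usage sketch (illustrative only). LBFGS supports targeted attacks
# exclusively, so the adversary must carry a target label first; the
# set_target call mirrors the commented-out usage in the JSMA demo below, and
# `image`, `label` and the wrapped `model` are assumptions:
#
#   adversary = Adversary(image, label)
#   adversary.set_target(True, target_label=7)  # hypothetical target class
#   attack = LBFGS(model)
#   adversary = attack(adversary, epsilon=0.001, steps=10)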
"""
This module provides the implementation of the JSMA attack method.
"""
from __future__ import division
import logging
import random
import numpy as np
from .base import Attack
class SaliencyMapAttack(Attack):
"""
Implements the Saliency Map Attack.
The Jacobian-based Saliency Map Approach (Papernot et al. 2016).
Paper link: https://arxiv.org/pdf/1511.07528.pdf
"""
def _apply(self,
adversary,
max_iter=2000,
fast=True,
theta=0.1,
max_perturbations_per_pixel=7):
"""
Apply the JSMA attack.
Args:
adversary(Adversary): The Adversary object.
max_iter(int): The max iterations.
fast(bool): If True, skip evaluating the pixel influence on the sum of the residual classes.
theta(float): Perturbation per pixel relative to [min, max] range.
max_perturbations_per_pixel(int): The max count of perturbation per pixel.
Return:
adversary: The Adversary object.
"""
assert adversary is not None
if not adversary.is_targeted_attack or (adversary.target_label is None):
target_labels = self._generate_random_target(
adversary.original_label)
else:
target_labels = [adversary.target_label]
for target in target_labels:
original_image = adversary.original
# the mask defines the search domain
# each modified pixel with border value is set to zero in mask
mask = np.ones_like(original_image)
# count tracks how often each pixel was changed
counts = np.zeros_like(original_image)
labels = range(self.model.num_classes())
adv_img = original_image.copy()
min_, max_ = self.model.bounds()
for step in range(max_iter):
adv_img = np.clip(adv_img, min_, max_)
adv_label = np.argmax(self.model.predict(adv_img))
if adversary.try_accept_the_example(adv_img, adv_label):
return adversary
# stop if mask is all zero
if not any(mask.flatten()):
return adversary
logging.info('step = {}, original_label = {}, adv_label={}'.
format(step, adversary.original_label, adv_label))
# get pixel location with highest influence on class
idx, p_sign = self._saliency_map(
adv_img, target, labels, mask, fast=fast)
# apply perturbation
adv_img[idx] += -p_sign * theta * (max_ - min_)
# tracks number of updates for each pixel
counts[idx] += 1
# remove pixel from search domain if it hits the bound
if adv_img[idx] <= min_ or adv_img[idx] >= max_:
mask[idx] = 0
# remove pixel if it was changed too often
if counts[idx] >= max_perturbations_per_pixel:
mask[idx] = 0
adv_img = np.clip(adv_img, min_, max_)
return adversary
def _generate_random_target(self, original_label):
"""
Draw random target labels, all distinct and different from the original label.
Args:
original_label(int): Original label.
Return:
target_labels(list): random target labels
"""
num_random_target = 1
num_classes = self.model.num_classes()
assert num_random_target <= num_classes - 1
target_labels = random.sample(range(num_classes), num_random_target + 1)
target_labels = [t for t in target_labels if t != original_label]
target_labels = target_labels[:num_random_target]
return target_labels
def _saliency_map(self, image, target, labels, mask, fast=False):
"""
Get pixel location with highest influence on class.
Args:
image(numpy.ndarray): Image with shape (height, width, channels).
target(int): The target label.
labels(list): The candidate output labels over which influence is evaluated.
mask(numpy.ndarray): Mask of the search domain; pixels that hit a bound are zeroed.
fast(bool): If True, skip evaluating the pixel influence on the sum of the residual classes.
Return:
idx: The index of the optimal pixel.
pix_sign: The direction of the perturbation.
"""
# pixel influence on target class
alphas = self.model.gradient(image, target) * mask
# pixel influence on sum of residual classes(don't evaluate if fast == True)
if fast:
betas = -np.ones_like(alphas)
else:
betas = np.sum([
self.model.gradient(image, label) * mask - alphas
for label in labels
], 0)
# compute saliency map (take into account both pos. & neg. perturbations)
sal_map = np.abs(alphas) * np.abs(betas) * np.sign(alphas * betas)
# find optimal pixel & direction of perturbation
idx = np.argmin(sal_map)
idx = np.unravel_index(idx, mask.shape)
pix_sign = np.sign(alphas)[idx]
return idx, pix_sign
JSMA = SaliencyMapAttack
"""
Paddle model for target of attack
"""
from .base import Model
from .paddle import PaddleModel
"""
Models __init__.py
"""
......@@ -24,11 +24,21 @@ class Model(object):
assert len(bounds) == 2
assert channel_axis in [0, 1, 2, 3]
if preprocess is None:
preprocess = (0, 1)
self._bounds = bounds
self._channel_axis = channel_axis
self._preprocess = preprocess
# Make self._preprocess (0, 1) when possible, so that we don't need
# to subtract or divide.
if preprocess is not None:
sub, div = np.array(preprocess)
if not np.any(sub):
sub = 0
if np.all(div == 1):
div = 1
assert (div is None) or np.all(div)
self._preprocess = (sub, div)
else:
self._preprocess = (0, 1)
def bounds(self):
"""
......@@ -47,8 +57,7 @@ class Model(object):
sub, div = self._preprocess
if np.any(sub != 0):
res = input_ - sub
assert np.any(div != 0)
if np.any(div != 1):
if not np.all(div == 1):
if res is None: # "res = input_ - sub" is not executed!
res = input_ / div
else:
......@@ -97,3 +106,11 @@ class Model(object):
with the shape (height, width, channel).
"""
raise NotImplementedError
@abstractmethod
def predict_name(self):
"""
Get the predict name, such as "softmax", etc.
:return: string
"""
raise NotImplementedError
......@@ -4,7 +4,7 @@ Paddle model
from __future__ import absolute_import
import numpy as np
import paddle.v2.fluid as fluid
import paddle.fluid as fluid
from .base import Model
......@@ -16,7 +16,7 @@ class PaddleModel(Model):
instance of PaddleModel.
Args:
program(paddle.v2.fluid.framework.Program): The program of the model
program(paddle.fluid.framework.Program): The program of the model
which generate the adversarial sample.
input_name(string): The name of the input.
logits_name(string): The name of the logits.
......@@ -114,3 +114,10 @@ class PaddleModel(Model):
feed=feeder.feed([(scaled_data, label)]),
fetch_list=[self._gradient])
return grad.reshape(data.shape)
def predict_name(self):
"""
Get the predict name, such as "softmax", etc.
:return: string
"""
return self._program.block(0).var(self._predict_name).op.type
......@@ -2,7 +2,7 @@
CNN on mnist data using fluid api of paddlepaddle
"""
import paddle.v2 as paddle
import paddle.v2.fluid as fluid
import paddle.fluid as fluid
def mnist_cnn_model(img):
......@@ -47,7 +47,9 @@ def main():
optimizer = fluid.optimizer.Adam(learning_rate=0.01)
optimizer.minimize(avg_cost)
accuracy = fluid.evaluator.Accuracy(input=logits, label=label)
batch_size = fluid.layers.create_tensor(dtype='int64')
batch_acc = fluid.layers.accuracy(
input=logits, label=label, total=batch_size)
BATCH_SIZE = 50
PASS_NUM = 3
......@@ -63,20 +65,22 @@ def main():
feeder = fluid.DataFeeder(feed_list=[img, label], place=place)
exe.run(fluid.default_startup_program())
pass_acc = fluid.average.WeightedAverage()
for pass_id in range(PASS_NUM):
accuracy.reset(exe)
pass_acc.reset()
for data in train_reader():
loss, acc = exe.run(fluid.default_main_program(),
feed=feeder.feed(data),
fetch_list=[avg_cost] + accuracy.metrics)
pass_acc = accuracy.eval(exe)
print("pass_id=" + str(pass_id) + " acc=" + str(acc) + " pass_acc="
+ str(pass_acc))
loss, acc, b_size = exe.run(
fluid.default_main_program(),
feed=feeder.feed(data),
fetch_list=[avg_cost, batch_acc, batch_size])
pass_acc.add(value=acc, weight=b_size)
print("pass_id=" + str(pass_id) + " acc=" + str(acc[0]) +
" pass_acc=" + str(pass_acc.eval()[0]))
if loss < LOSS_THRESHOLD and pass_acc > ACC_THRESHOLD:
break
pass_acc = accuracy.eval(exe)
print("pass_id=" + str(pass_id) + " pass_acc=" + str(pass_acc))
print("pass_id=" + str(pass_id) + " pass_acc=" + str(pass_acc.eval()[
0]))
fluid.io.save_params(
exe, dirname='./mnist', main_program=fluid.default_main_program())
print('train mnist done')
......
......@@ -3,10 +3,10 @@ FGSM demos on mnist using advbox tool.
"""
import matplotlib.pyplot as plt
import paddle.v2 as paddle
import paddle.v2.fluid as fluid
import paddle.fluid as fluid
from advbox import Adversary
from advbox.attacks.gradientsign import GradientSignAttack
from advbox.adversary import Adversary
from advbox.attacks.gradient_method import FGSM
from advbox.models.paddle import PaddleModel
......@@ -73,7 +73,7 @@ def main():
# advbox demo
m = PaddleModel(fluid.default_main_program(), IMG_NAME, LABEL_NAME,
logits.name, avg_cost.name, (-1, 1))
att = GradientSignAttack(m)
att = FGSM(m)
for data in train_reader():
# fgsm attack
adversary = att(Adversary(data[0][0], data[0][1]))
......
"""
FGSM demos on mnist using advbox tool.
"""
import matplotlib.pyplot as plt
import paddle.v2 as paddle
import paddle.fluid as fluid
import numpy as np
from advbox import Adversary
from advbox.attacks.saliency import SaliencyMapAttack
from advbox.models.paddle import PaddleModel
def cnn_model(img):
"""
Mnist cnn model
Args:
img(Variable): the input image to be recognized
Returns:
Variable: the label prediction
"""
# conv1 = fluid.nets.conv2d()
conv_pool_1 = fluid.nets.simple_img_conv_pool(
input=img,
num_filters=20,
filter_size=5,
pool_size=2,
pool_stride=2,
act='relu')
conv_pool_2 = fluid.nets.simple_img_conv_pool(
input=conv_pool_1,
num_filters=50,
filter_size=5,
pool_size=2,
pool_stride=2,
act='relu')
logits = fluid.layers.fc(input=conv_pool_2, size=10, act='softmax')
return logits
def main():
"""
Advbox demo which demonstrate how to use advbox.
"""
IMG_NAME = 'img'
LABEL_NAME = 'label'
img = fluid.layers.data(name=IMG_NAME, shape=[1, 28, 28], dtype='float32')
# gradient should flow
img.stop_gradient = False
label = fluid.layers.data(name=LABEL_NAME, shape=[1], dtype='int64')
logits = cnn_model(img)
cost = fluid.layers.cross_entropy(input=logits, label=label)
avg_cost = fluid.layers.mean(x=cost)
place = fluid.CPUPlace()
exe = fluid.Executor(place)
BATCH_SIZE = 1
train_reader = paddle.batch(
paddle.reader.shuffle(
paddle.dataset.mnist.train(), buf_size=500),
batch_size=BATCH_SIZE)
feeder = fluid.DataFeeder(
feed_list=[IMG_NAME, LABEL_NAME],
place=place,
program=fluid.default_main_program())
fluid.io.load_params(
exe, "./mnist/", main_program=fluid.default_main_program())
# advbox demo
m = PaddleModel(fluid.default_main_program(), IMG_NAME, LABEL_NAME,
logits.name, avg_cost.name, (-1, 1))
attack = SaliencyMapAttack(m)
total_num = 0
success_num = 0
for data in train_reader():
total_num += 1
# adversary.set_target(True, target_label=target_label)
jsma_attack = attack(Adversary(data[0][0], data[0][1]))
if jsma_attack is not None and jsma_attack.is_successful():
# plt.imshow(jsma_attack.target, cmap='Greys_r')
# plt.show()
success_num += 1
print('original_label=%d, adversarial example label=%d' %
(data[0][1], jsma_attack.adversarial_label))
# np.save('adv_img', jsma_attack.adversarial_example)
print('total num = %d, success num = %d ' % (total_num, success_num))
if total_num == 100:
break
if __name__ == '__main__':
main()
The minimum PaddlePaddle version needed for the code sample in this directory is the latest develop branch. If you are on a version of PaddlePaddle earlier than this, [please update your installation](http://www.paddlepaddle.org/docs/develop/documentation/en/build_and_install/pip_install_en.html).
---
# SE-ResNeXt for image classification
This model, built with PaddlePaddle Fluid, is still under active development and is not
......
### Caffe2Fluid
This tool converts a Caffe model to a Fluid model.
### Howto
1. Prepare caffepb.py in ./proto if your python has no 'pycaffe' module. Two options are provided:
1) Generate it from caffe.proto using protoc:
bash ./proto/compile.sh
2) Download one from github directly:
cd proto/ && wget https://github.com/ethereon/caffe-tensorflow/blob/master/kaffe/caffe/caffepb.py
2. Convert the caffe model using 'convert.py', which will generate a python script and a weight (.npy) file.
3. Use the converted model to predict; see more detailed info in 'examples/xxx'.
### Tested models
- Lenet on mnist dataset
- ResNets: (ResNet-50, ResNet-101, ResNet-152)
model addr: https://onedrive.live.com/?authkey=%21AAFW2-FVoxeVRck&id=4006CBB8476FF777%2117887&cid=4006CBB8476FF777
- GoogleNet:
model addr: https://gist.github.com/jimmie33/7ea9f8ac0da259866b854460f4526034
- VGG:
model addr: https://gist.github.com/ksimonyan/211839e770f7b538e2d8
- AlexNet:
model addr: https://github.com/BVLC/caffe/tree/master/models/bvlc_alexnet
### Notes
Some of this code comes from here: https://github.com/ethereon/caffe-tensorflow
#!/usr/bin/env python
import os
import sys
import numpy as np
import argparse
from kaffe import KaffeError, print_stderr
from kaffe.paddle import Transformer
def fatal_error(msg):
""" fatal error encounted
"""
print_stderr(msg)
exit(-1)
def validate_arguments(args):
""" validate args
"""
if (args.data_output_path is not None) and (args.caffemodel is None):
fatal_error('No input data path provided.')
if (args.caffemodel is not None) and (args.data_output_path is None):
fatal_error('No output data path provided.')
if (args.code_output_path is None) and (args.data_output_path is None):
fatal_error('No output path specified.')
def convert(def_path, caffemodel_path, data_output_path, code_output_path,
phase):
""" convert caffe model to tf/paddle models
"""
try:
transformer = Transformer(def_path, caffemodel_path, phase=phase)
print_stderr('Converting data...')
if caffemodel_path is not None:
data = transformer.transform_data()
print_stderr('Saving data...')
with open(data_output_path, 'wb') as data_out:
np.save(data_out, data)
if code_output_path:
print_stderr('Saving source...')
with open(code_output_path, 'wb') as src_out:
src_out.write(transformer.transform_source())
print_stderr('Done.')
except KaffeError as err:
fatal_error('Error encountered: {}'.format(err))
return 0
def main():
""" main
"""
parser = argparse.ArgumentParser()
parser.add_argument('def_path', help='Model definition (.prototxt) path')
parser.add_argument('--caffemodel', help='Model data (.caffemodel) path')
parser.add_argument('--data-output-path', help='Converted data output path')
parser.add_argument(
'--code-output-path', help='Save generated source to this path')
parser.add_argument(
'-p',
'--phase',
default='test',
help='The phase to convert: test (default) or train')
args = parser.parse_args()
validate_arguments(args)
return convert(args.def_path, args.caffemodel, args.data_output_path,
args.code_output_path, args.phase)
if __name__ == '__main__':
ret = main()
sys.exit(ret)
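# Illustrative programmatic use of convert() above (a hedged sketch; the file
# paths are hypothetical and the module is assumed importable as `convert`):
#
#   from convert import convert
#   convert('lenet.prototxt', 'lenet.caffemodel',
#           data_output_path='lenet.npy',
#           code_output_path='lenet.py',
#           phase='test')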
A demo showing how to convert Caffe models trained on 'imagenet' using caffe2fluid
---
# How to use
1. prepare python environment
2. download caffe model to "models.caffe/xxx" which contains "xxx.caffemodel" and "xxx.prototxt"
3. run the tool
eg: bash ./run.sh resnet50 ./models.caffe/resnet50 ./models/resnet50
#!/bin/env python
#function:
# a demo to show how to use the converted model generated by caffe2fluid
#
#notes:
# only support imagenet data
import os
import sys
import inspect
import numpy as np
import paddle.v2 as paddle
import paddle.v2.fluid as fluid
def load_data(imgfile, shape):
h, w = shape[1:]
from PIL import Image
im = Image.open(imgfile)
# The storage order of the loaded image is W(width),
# H(height), C(channel). PaddlePaddle requires
# the CHW order, so transpose them.
im = im.resize((w, h), Image.ANTIALIAS)
im = np.array(im).astype(np.float32)
im = im.transpose((2, 0, 1)) # CHW
im = im[(2, 1, 0), :, :] # BGR
# The mean to be subtracted from each image.
# By default, the per-channel ImageNet mean.
mean = np.array([104., 117., 124.], dtype=np.float32)
mean = mean.reshape([3, 1, 1])
im = im - mean
return im.reshape([1] + shape)
def build_model(net_file, net_name):
print('build model with net_file[%s] and net_name[%s]' %
(net_file, net_name))
net_path = os.path.dirname(net_file)
module_name = os.path.splitext(os.path.basename(net_file))[0]
if net_path not in sys.path:
sys.path.insert(0, net_path)
try:
m = __import__(module_name, fromlist=[net_name])
MyNet = getattr(m, net_name)
except Exception as e:
print('failed to load module[%s]' % (module_name))
print(e)
return None
input_name = 'data'
input_shape = MyNet.input_shapes()[input_name]
images = fluid.layers.data(name='image', shape=input_shape, dtype='float32')
#label = fluid.layers.data(name='label', shape=[1], dtype='int64')
net = MyNet({input_name: images})
input_shape = MyNet.input_shapes()[input_name]
return net, input_shape
def dump_results(results, names, root):
if os.path.exists(root) is False:
os.mkdir(root)
for i in range(len(names)):
n = names[i]
res = results[i]
filename = os.path.join(root, n)
np.save(filename + '.npy', res)
def infer(net_file, net_name, model_file, imgfile, debug=False):
""" do inference using a model which consist 'xxx.py' and 'xxx.npy'
"""
#1, build model
net, input_shape = build_model(net_file, net_name)
prediction = net.get_output()
#2, load weights for this model
place = fluid.CPUPlace()
exe = fluid.Executor(place)
startup_program = fluid.default_startup_program()
exe.run(startup_program)
if model_file.find('.npy') > 0:
net.load(data_path=model_file, exe=exe, place=place)
else:
net.load(data_path=model_file, exe=exe)
#3, test this model
test_program = fluid.default_main_program().clone()
fetch_list_var = []
fetch_list_name = []
if debug is False:
fetch_list_var.append(prediction)
else:
for k, v in net.layers.items():
fetch_list_var.append(v)
fetch_list_name.append(k)
np_images = load_data(imgfile, input_shape)
results = exe.run(program=test_program,
feed={'image': np_images},
fetch_list=fetch_list_var)
if debug is True:
dump_path = 'results.layers'
dump_results(results, fetch_list_name, dump_path)
print('all results dumped to [%s]' % (dump_path))
else:
result = results[0]
print('predicted class:', np.argmax(result))
if __name__ == "__main__":
""" maybe more convenient to use 'run.sh' to call this tool
"""
net_file = 'models/resnet50/resnet50.py'
weight_file = 'models/resnet50/resnet50.npy'
imgfile = 'data/65.jpeg'
net_name = 'ResNet50'
argc = len(sys.argv)
if argc == 5:
net_file = sys.argv[1]
weight_file = sys.argv[2]
imgfile = sys.argv[3]
net_name = sys.argv[4]
elif argc > 1:
print('usage:')
print('\tpython %s [net_file] [weight_file] [imgfile] [net_name]' %
(sys.argv[0]))
print('\teg:python %s %s %s %s %s' % (sys.argv[0], net_file,
weight_file, imgfile, net_name))
sys.exit(1)
infer(net_file, net_name, weight_file, imgfile)
#!/bin/bash
#function:
# a tool used to:
# 1, convert a caffe model
# 2, do inference using this model
#
#usage:
# bash run.sh resnet50 ./models.caffe/resnet50 ./models/resnet50
#
#set -x
if [[ $# -lt 3 ]];then
echo "usage:"
echo " bash $0 [model_name] [cf_model_path] [pd_model_path] [only_convert]"
echo " eg: bash $0 resnet50 ./models.caffe/resnet50 ./models/resnet50"
exit 1
else
model_name=$1
cf_model_path=$2
pd_model_path=$3
only_convert=$4
fi
proto_file=$cf_model_path/${model_name}.prototxt
caffemodel_file=$cf_model_path/${model_name}.caffemodel
weight_file=$pd_model_path/${model_name}.npy
net_file=$pd_model_path/${model_name}.py
if [[ ! -e $proto_file ]];then
echo "not found prototxt[$proto_file]"
exit 1
fi
if [[ ! -e $caffemodel_file ]];then
echo "not found caffemodel[$caffemodel_file]"
exit 1
fi
if [[ ! -e $pd_model_path ]];then
mkdir $pd_model_path
fi
PYTHON=`which cfpython`
if [[ -z $PYTHON ]];then
PYTHON=`which python`
fi
$PYTHON ../../convert.py \
$proto_file \
--caffemodel $caffemodel_file \
--data-output-path $weight_file\
--code-output-path $net_file
ret=$?
if [[ $ret -ne 0 ]];then
echo "failed to convert caffe model[$cf_model_path]"
exit $ret
else
echo "succeed to convert caffe model[$cf_model_path] to fluid model[$pd_model_path]"
fi
if [[ -z $only_convert ]];then
PYTHON=`which pdpython`
if [[ -z $PYTHON ]];then
PYTHON=`which python`
fi
imgfile="data/65.jpeg"
net_name=`grep "name" $proto_file | head -n1 | perl -ne 'if(/\"([^\"]+)\"/){ print $1."\n";}'`
$PYTHON ./infer.py $net_file $weight_file $imgfile $net_name
ret=$?
fi
exit $ret
A demo showing how to convert a Caffe model trained on 'mnist' using caffe2fluid
---
# How to use
1. prepare python environment
2. download caffe model to "models.caffe/lenet" which contains "lenet.caffemodel" and "lenet.prototxt"
3. run the tool
eg: bash ./run.sh lenet ./models.caffe/lenet ./models/lenet
#!/bin/env python
#function:
# a demo to show how to use a model converted by caffe2fluid
#
import sys
import os
import numpy as np
import paddle.v2 as paddle
import paddle.v2.fluid as fluid
def test_model(exe, test_program, fetch_list, test_reader, feeder):
acc_set = []
for data in test_reader():
acc_np, pred = exe.run(program=test_program,
feed=feeder.feed(data),
fetch_list=fetch_list)
acc_set.append(float(acc_np))
acc_val = np.array(acc_set).mean()
return float(acc_val)
def evaluate(net_file, model_file):
""" main
"""
#1, build model
net_path = os.path.dirname(net_file)
if net_path not in sys.path:
sys.path.insert(0, net_path)
from lenet import LeNet as MyNet
with_gpu = False
paddle.init(use_gpu=with_gpu)
#1, define network topology
images = fluid.layers.data(name='image', shape=[1, 28, 28], dtype='float32')
label = fluid.layers.data(name='label', shape=[1], dtype='int64')
net = MyNet({'data': images})
prediction = net.layers['prob']
acc = fluid.layers.accuracy(input=prediction, label=label)
place = fluid.CUDAPlace(0) if with_gpu is True else fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
#2, load weights
if model_file.find('.npy') > 0:
net.load(data_path=model_file, exe=exe, place=place)
else:
net.load(data_path=model_file, exe=exe)
#3, test this model
test_program = fluid.default_main_program().clone()
test_reader = paddle.batch(paddle.dataset.mnist.test(), batch_size=128)
feeder = fluid.DataFeeder(feed_list=[images, label], place=place)
fetch_list = [acc, prediction]
print('go to test model using test set')
acc_val = test_model(exe, test_program, \
fetch_list, test_reader, feeder)
print('test accuracy is [%.4f], expected value[0.919]' % (acc_val))
if __name__ == "__main__":
net_file = 'models/lenet/lenet.py'
weight_file = 'models/lenet/lenet.npy'
argc = len(sys.argv)
if argc == 3:
net_file = sys.argv[1]
weight_file = sys.argv[2]
elif argc > 1:
print('usage:')
print('\tpython %s [net_file] [weight_file]' % (sys.argv[0]))
print('\teg:python %s %s %s' % (sys.argv[0], net_file, weight_file))
sys.exit(1)
evaluate(net_file, weight_file)
#!/bin/bash
#function:
# a tool used to:
# 1, convert a caffe model
# 2, do inference using this model
#
#usage:
# bash run.sh lenet ./models.caffe/lenet ./models/lenet
#
#set -x
if [[ $# -lt 3 ]];then
echo "usage:"
echo " bash $0 [model_name] [cf_model_path] [pd_model_path] [only_convert]"
echo " eg: bash $0 lenet ./models.caffe/lenet ./models/lenet"
exit 1
else
model_name=$1
cf_model_path=$2
pd_model_path=$3
only_convert=$4
fi
proto_file=$cf_model_path/${model_name}.prototxt
caffemodel_file=$cf_model_path/${model_name}.caffemodel
weight_file=$pd_model_path/${model_name}.npy
net_file=$pd_model_path/${model_name}.py
if [[ ! -e $proto_file ]];then
echo "not found prototxt[$proto_file]"
exit 1
fi
if [[ ! -e $caffemodel_file ]];then
echo "not found caffemodel[$caffemodel_file]"
exit 1
fi
if [[ ! -e $pd_model_path ]];then
mkdir $pd_model_path
fi
PYTHON=`which cfpython`
if [[ -z $PYTHON ]];then
PYTHON=`which python`
fi
$PYTHON ../../convert.py \
$proto_file \
--caffemodel $caffemodel_file \
--data-output-path $weight_file\
--code-output-path $net_file
ret=$?
if [[ $ret -ne 0 ]];then
echo "failed to convert caffe model[$cf_model_path]"
exit $ret
else
echo "succeed to convert caffe model[$cf_model_path] to fluid model[$pd_model_path]"
fi
if [[ -z $only_convert ]];then
PYTHON=`which pdpython`
if [[ -z $PYTHON ]];then
PYTHON=`which python`
fi
net_name=`grep "name" $proto_file | head -n1 | perl -ne 'if(/\"([^\"]+)\"/){ print $1."\n";}'`
if [[ $net_name != "LeNet" ]];then
echo "only support LeNet"
exit 1
fi
$PYTHON ./evaluate.py $net_file $weight_file
ret=$?
fi
exit $ret
from .graph import GraphBuilder, NodeMapper
from .errors import KaffeError, print_stderr
import os
from . import paddle
from .resolver import get_caffe_resolver, has_pycaffe
import os
import sys
SHARED_CAFFE_RESOLVER = None
def import_caffepb():
p = os.path.realpath(__file__)
p = os.path.dirname(p)
p = os.path.join(p, '../../proto')
sys.path.insert(0, p)
import caffepb
return caffepb
class CaffeResolver(object):
def __init__(self):
self.import_caffe()
def import_caffe(self):
self.caffe = None
try:
# Try to import PyCaffe first
import caffe
self.caffe = caffe
except ImportError:
# Fall back to the protobuf implementation
self.caffepb = import_caffepb()
show_fallback_warning()
if self.caffe:
# Use the protobuf code from the imported distribution.
# This way, Caffe variants with custom layers will work.
self.caffepb = self.caffe.proto.caffe_pb2
self.NetParameter = self.caffepb.NetParameter
def has_pycaffe(self):
return self.caffe is not None
def get_caffe_resolver():
global SHARED_CAFFE_RESOLVER
if SHARED_CAFFE_RESOLVER is None:
SHARED_CAFFE_RESOLVER = CaffeResolver()
return SHARED_CAFFE_RESOLVER
def has_pycaffe():
return get_caffe_resolver().has_pycaffe()
def show_fallback_warning():
msg = '''
------------------------------------------------------------
WARNING: PyCaffe not found!
Falling back to a pure protocol buffer implementation.
* Conversions will be drastically slower.
------------------------------------------------------------
'''
sys.stderr.write(msg)
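# Illustrative usage sketch: the resolver is a process-wide singleton, so
# repeated calls return the same instance, and NetParameter comes from pycaffe
# when available or from the bundled caffepb fallback otherwise:
#
#   resolver = get_caffe_resolver()
#   net_params = resolver.NetParameter()  # a protobuf NetParameter message
#   print(has_pycaffe())                  # True when the real pycaffe was found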
import sys
#debug level, can be 'warn', 'verbose'
log_level = 'warn'
class KaffeError(Exception):
pass
def print_stderr(msg):
sys.stderr.write('%s\n' % msg)
def debug(msg):
if log_level == 'verbose':
print_stderr('[DEBUG]' + msg)
def notice(msg):
print_stderr('[NOTICE]' + msg)
def warn(msg):
print_stderr('[WARNING]' + msg)
def set_loglevel(level):
global log_level
if 'warn' != level and 'verbose' != level:
raise Exception('unsupported log level[%s]' % (level))
log_level = level
from google.protobuf import text_format
from .caffe import get_caffe_resolver
from .errors import KaffeError, print_stderr
from .layers import LayerAdapter, LayerType, NodeKind, NodeDispatch
from .shapes import TensorShape
class Node(object):
def __init__(self, name, kind, layer=None):
self.name = name
self.kind = kind
self.layer = LayerAdapter(layer, kind) if layer else None
self.parents = []
self.children = []
self.data = None
self.output_shape = None
self.metadata = {}
def add_parent(self, parent_node):
assert parent_node not in self.parents
self.parents.append(parent_node)
if self not in parent_node.children:
parent_node.children.append(self)
def add_child(self, child_node):
assert child_node not in self.children
self.children.append(child_node)
if self not in child_node.parents:
child_node.parents.append(self)
def get_only_parent(self):
if len(self.parents) != 1:
raise KaffeError('Node (%s) expected to have 1 parent. Found %s.' %
(self, len(self.parents)))
return self.parents[0]
@property
def parameters(self):
if self.layer is not None:
return self.layer.parameters
return None
def __str__(self):
return '[%s] %s' % (self.kind, self.name)
def __repr__(self):
return '%s (0x%x)' % (self.name, id(self))
class Graph(object):
def __init__(self, nodes=None, name=None):
self.nodes = nodes or []
self.node_lut = {node.name: node for node in self.nodes}
self.name = name
def add_node(self, node):
self.nodes.append(node)
self.node_lut[node.name] = node
def get_node(self, name):
try:
return self.node_lut[name]
except KeyError:
raise KaffeError('Layer not found: %s' % name)
def get_input_nodes(self):
return [node for node in self.nodes if len(node.parents) == 0]
def get_output_nodes(self):
return [node for node in self.nodes if len(node.children) == 0]
def topologically_sorted(self):
sorted_nodes = []
unsorted_nodes = list(self.nodes)
temp_marked = set()
perm_marked = set()
def visit(node):
if node in temp_marked:
raise KaffeError('Graph is not a DAG.')
if node in perm_marked:
return
temp_marked.add(node)
for child in node.children:
visit(child)
perm_marked.add(node)
temp_marked.remove(node)
sorted_nodes.insert(0, node)
while len(unsorted_nodes):
visit(unsorted_nodes.pop())
return sorted_nodes
def compute_output_shapes(self):
sorted_nodes = self.topologically_sorted()
for node in sorted_nodes:
node.output_shape = TensorShape(
*NodeKind.compute_output_shape(node))
def replaced(self, new_nodes):
return Graph(nodes=new_nodes, name=self.name)
def transformed(self, transformers):
graph = self
for transformer in transformers:
graph = transformer(graph)
if graph is None:
raise KaffeError('Transformer failed: {}'.format(transformer))
assert isinstance(graph, Graph)
return graph
def __contains__(self, key):
return key in self.node_lut
def __str__(self):
hdr = '{:<20} {:<30} {:>20} {:>20}'.format('Type', 'Name', 'Param',
'Output')
s = [hdr, '-' * 94]
for node in self.topologically_sorted():
# If the node has learned parameters, display the first one's shape.
# In case of convolutions, this corresponds to the weights.
data_shape = node.data[0].shape if node.data else '--'
out_shape = node.output_shape or '--'
s.append('{:<20} {:<30} {:>20} {:>20}'.format(
node.kind, node.name, data_shape, tuple(out_shape)))
return '\n'.join(s)
class GraphBuilder(object):
'''Constructs a model graph from a Caffe protocol buffer definition.'''
def __init__(self, def_path, phase='test'):
'''
def_path: Path to the model definition (.prototxt)
data_path: Path to the model data (.caffemodel)
phase: Either 'test' or 'train'. Used for filtering phase-specific nodes.
'''
self.def_path = def_path
self.phase = phase
self.load()
def load(self):
'''Load the layer definitions from the prototxt.'''
self.params = get_caffe_resolver().NetParameter()
with open(self.def_path, 'rb') as def_file:
text_format.Merge(def_file.read(), self.params)
def filter_layers(self, layers):
'''Filter out layers based on the current phase.'''
phase_map = {0: 'train', 1: 'test'}
filtered_layer_names = set()
filtered_layers = []
for layer in layers:
phase = self.phase
if len(layer.include):
phase = phase_map[layer.include[0].phase]
if len(layer.exclude):
phase = phase_map[1 - layer.include[0].phase]
exclude = (phase != self.phase)
# Dropout layers appear in a fair number of Caffe
# test-time networks. These are just ignored. We'll
# filter them out here.
if (not exclude) and (phase == 'test'):
exclude = (layer.type == LayerType.Dropout)
if not exclude:
filtered_layers.append(layer)
# Guard against dupes.
assert layer.name not in filtered_layer_names
filtered_layer_names.add(layer.name)
return filtered_layers
def make_node(self, layer):
'''Create a graph node for the given layer.'''
kind = NodeKind.map_raw_kind(layer.type)
if kind is None:
raise KaffeError('Unknown layer type encountered: %s' % layer.type)
# We want to use the layer's top names (the "output" names), rather than the
# name attribute, which is more of a readability thing than a functional one.
# Other layers will refer to a node by its "top name".
return Node(layer.name, kind, layer=layer)
def make_input_nodes(self):
'''
Create data input nodes.
This method is for old-style inputs, where the input specification
was not treated as a first-class layer in the prototxt.
Newer models use the "Input layer" type.
'''
nodes = [Node(name, NodeKind.Data) for name in self.params.input]
if len(nodes):
input_dim = map(int, self.params.input_dim)
if not input_dim:
if len(self.params.input_shape) > 0:
input_dim = map(int, self.params.input_shape[0].dim)
else:
raise KaffeError('Dimensions for input not specified.')
for node in nodes:
node.output_shape = tuple(input_dim)
return nodes
def build(self):
'''
Builds the graph from the Caffe layer definitions.
'''
# Get the layers
layers = self.params.layers or self.params.layer
# Filter out phase-excluded layers
layers = self.filter_layers(layers)
# Get any separately-specified input layers
nodes = self.make_input_nodes()
nodes += [self.make_node(layer) for layer in layers]
# Initialize the graph
graph = Graph(nodes=nodes, name=self.params.name)
# Connect the nodes
#
# A note on layers and outputs:
# In Caffe, each layer can produce multiple outputs ("tops") from a set of inputs
# ("bottoms"). The bottoms refer to other layers' tops. The top can rewrite a bottom
# (in case of in-place operations). Note that the layer's name is not used for establishing
# any connectivity. It's only used for data association. By convention, a layer with a
# single top will often use the same name (although this is not required).
#
# The current implementation only supports single-output nodes (note that a node can still
# have multiple children, since multiple child nodes can refer to the single top's name).
node_outputs = {}
for layer in layers:
node = graph.get_node(layer.name)
for input_name in layer.bottom:
assert input_name != layer.name
parent_node = node_outputs.get(input_name)
if (parent_node is None) or (parent_node == node):
parent_node = graph.get_node(input_name)
node.add_parent(parent_node)
if len(layer.top) > 1:
raise KaffeError('Multiple top nodes are not supported.')
for output_name in layer.top:
if output_name == layer.name:
# Output is named the same as the node. No further action required.
continue
# There are two possibilities here:
#
# Case 1: output_name refers to another node in the graph.
# This is an "in-place operation" that overwrites an existing node.
# This would create a cycle in the graph. We'll undo the in-placing
# by substituting this node wherever the overwritten node is referenced.
#
# Case 2: output_name violates the convention layer.name == output_name.
# Since we are working in the single-output regime, we can rename it to
# match the layer name.
#
# In both cases, future references to this top are re-routed to this node.
node_outputs[output_name] = node
graph.compute_output_shapes()
return graph
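# Illustrative usage (the prototxt path is hypothetical): build and inspect a
# graph the same way Transformer.load does further below:
#
#   graph = GraphBuilder('deploy.prototxt', phase='test').build()
#   print(graph)  # tabular summary of type / name / param / output shape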
class NodeMapper(NodeDispatch):
def __init__(self, graph):
self.graph = graph
def map(self):
nodes = self.graph.topologically_sorted()
# Remove input nodes - we'll handle them separately.
input_nodes = self.graph.get_input_nodes()
nodes = [t for t in nodes if t not in input_nodes]
# Decompose DAG into chains.
chains = []
for node in nodes:
attach_to_chain = None
if len(node.parents) == 1:
parent = node.get_only_parent()
for chain in chains:
if chain[-1] == parent:
# Node is part of an existing chain.
attach_to_chain = chain
break
if attach_to_chain is None:
# Start a new chain for this node.
attach_to_chain = []
chains.append(attach_to_chain)
attach_to_chain.append(node)
# Map each chain.
mapped_chains = []
for chain in chains:
mapped_chains.append(self.map_chain(chain))
return self.commit(mapped_chains)
def map_chain(self, chain):
return [self.map_node(node) for node in chain]
def map_node(self, node):
map_func = self.get_handler(node.kind, 'map')
mapped_node = map_func(node)
assert mapped_node is not None
mapped_node.node = node
return mapped_node
def commit(self, mapped_chains):
raise NotImplementedError('Must be implemented by subclass.')
import re
import numbers
from collections import namedtuple
from .shapes import *
LAYER_DESCRIPTORS = {
# Caffe Types
'AbsVal': shape_identity,
'Accuracy': shape_scalar,
'ArgMax': shape_not_implemented,
'BatchNorm': shape_identity,
'BNLL': shape_not_implemented,
'Concat': shape_concat,
'ContrastiveLoss': shape_scalar,
'Convolution': shape_convolution,
'Deconvolution': shape_not_implemented,
'Data': shape_data,
'Dropout': shape_identity,
'DummyData': shape_data,
'EuclideanLoss': shape_scalar,
'Eltwise': shape_identity,
'Exp': shape_identity,
'Flatten': shape_not_implemented,
'HDF5Data': shape_data,
'HDF5Output': shape_identity,
'HingeLoss': shape_scalar,
'Im2col': shape_not_implemented,
'ImageData': shape_data,
'InfogainLoss': shape_scalar,
'InnerProduct': shape_inner_product,
'Input': shape_data,
'LRN': shape_identity,
'MemoryData': shape_mem_data,
'MultinomialLogisticLoss': shape_scalar,
'MVN': shape_not_implemented,
'Pooling': shape_pool,
'Power': shape_identity,
'ReLU': shape_identity,
'Scale': shape_identity,
'Sigmoid': shape_identity,
'SigmoidCrossEntropyLoss': shape_scalar,
'Silence': shape_not_implemented,
'Softmax': shape_identity,
'SoftmaxWithLoss': shape_scalar,
'Split': shape_not_implemented,
'Slice': shape_not_implemented,
'TanH': shape_identity,
'WindowData': shape_not_implemented,
'Threshold': shape_identity,
}
# layer types in 'V1LayerParameter'
# (v1layertype name, enum value, mapped to layer type)
v1_layertypes = [
('ABSVAL', 35),
('ACCURACY', 1),
('ARGMAX', 30),
('BNLL', 2),
('CONCAT', 3),
('CONVOLUTION', 4),
('DATA', 5),
('DECONVOLUTION', 39),
('DROPOUT', 6),
('ELTWISE', 25),
('EXP', 38),
('FLATTEN', 8),
('IM2COL', 11),
('INNERPRODUCT', 14),
('LRN', 15),
('MEMORYDATA', 29),
('MULTINOMIALLOGISTICLOSS', 16),
('MVN', 34),
('POOLING', 17),
('POWER', 26),
('RELU', 18),
('SIGMOID', 19),
('SIGMOIDCROSSENTROPYLOSS', 27),
('SILENCE', 36),
('SOFTMAX', 20),
('SPLIT', 22),
('SLICE', 33),
('TANH', 23),
('WINDOWDATA', 24),
('THRESHOLD', 31),
]
LAYER_TYPES = LAYER_DESCRIPTORS.keys()
LayerType = type('LayerType', (), {t: t for t in LAYER_TYPES})
#map the layer name in V1 to standard name
V1_LAYER_MAP = {'_not_init_': True}
def get_v1_layer_map():
global V1_LAYER_MAP
if '_not_init_' not in V1_LAYER_MAP:
return V1_LAYER_MAP
else:
del V1_LAYER_MAP['_not_init_']
name2layer = {}
for n in LAYER_TYPES:
name2layer[n.upper()] = n
for l in v1_layertypes:
n, v = l
if n in name2layer and v not in V1_LAYER_MAP:
V1_LAYER_MAP[v] = name2layer[n]
else:
raise KaffeError('v1 layer type %s not found' % n)
return V1_LAYER_MAP
class NodeKind(LayerType):
@staticmethod
def map_raw_kind(kind):
if kind in LAYER_TYPES:
return kind
v1_layers = get_v1_layer_map()
if kind in v1_layers:
return v1_layers[kind]
else:
return None
@staticmethod
def compute_output_shape(node):
try:
val = LAYER_DESCRIPTORS[node.kind](node)
return val
except NotImplementedError:
raise KaffeError(
'Output shape computation not implemented for type: %s' %
node.kind)
class NodeDispatchError(KaffeError):
pass
class NodeDispatch(object):
@staticmethod
def get_handler_name(node_kind):
if len(node_kind) <= 4:
# A catch-all for things like ReLU and tanh
return node_kind.lower()
# Convert from CamelCase to under_scored
name = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', node_kind)
return re.sub('([a-z0-9])([A-Z])', r'\1_\2', name).lower()
def get_handler(self, node_kind, prefix):
name = self.get_handler_name(node_kind)
name = '_'.join((prefix, name))
try:
return getattr(self, name)
except AttributeError:
raise NodeDispatchError(
'No handler found for node kind: %s (expected: %s)' %
(node_kind, name))
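# Worked examples of the CamelCase -> under_scored handler naming (illustrative):
#
#   NodeDispatch.get_handler_name('InnerProduct')  # -> 'inner_product'
#   NodeDispatch.get_handler_name('ReLU')          # -> 'relu' (short-name catch-all)
#
# so a mapper resolves get_handler('InnerProduct', 'map') to its
# map_inner_product method, as in TensorFlowMapper further below.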
class LayerAdapter(object):
def __init__(self, layer, kind):
self.layer = layer
self.kind = kind
@property
def parameters(self):
name = NodeDispatch.get_handler_name(self.kind)
name = '_'.join((name, 'param'))
try:
return getattr(self.layer, name)
except AttributeError:
raise NodeDispatchError(
'Caffe parameters not found for layer kind: %s' % (self.kind))
@staticmethod
def get_kernel_value(scalar, repeated, idx, default=None):
if scalar:
return scalar
if repeated:
if isinstance(repeated, numbers.Number):
return repeated
if len(repeated) == 1:
# Same value applies to all spatial dimensions
return int(repeated[0])
assert idx < len(repeated)
# Extract the value for the given spatial dimension
return repeated[idx]
if default is None:
raise ValueError('Unable to determine kernel parameter!')
return default
@property
def kernel_parameters(self):
assert self.kind in (NodeKind.Convolution, NodeKind.Pooling)
params = self.parameters
k_h = self.get_kernel_value(params.kernel_h, params.kernel_size, 0)
k_w = self.get_kernel_value(params.kernel_w, params.kernel_size, 1)
s_h = self.get_kernel_value(
params.stride_h, params.stride, 0, default=1)
s_w = self.get_kernel_value(
params.stride_w, params.stride, 1, default=1)
p_h = self.get_kernel_value(params.pad_h, params.pad, 0, default=0)
p_w = self.get_kernel_value(params.pad_w, params.pad, 1, default=0)
return KernelParameters(k_h, k_w, s_h, s_w, p_h, p_w)
KernelParameters = namedtuple('KernelParameters', [
'kernel_h', 'kernel_w', 'stride_h', 'stride_w', 'pad_h', 'pad_w'
])
from .transformer import Transformer
from .network import Network
import math
import os
import numpy as np
def import_fluid():
import paddle.v2.fluid as fluid
return fluid
def layer(op):
'''Decorator for composable network layers.'''
def layer_decorated(self, *args, **kwargs):
# Automatically set a name if not provided.
name = kwargs.setdefault('name', self.get_unique_name(op.__name__))
# Figure out the layer inputs.
if len(self.terminals) == 0:
raise RuntimeError('No input variables found for layer %s.' % name)
elif len(self.terminals) == 1:
layer_input = self.terminals[0]
else:
layer_input = list(self.terminals)
# Perform the operation and get the output.
layer_output = op(self, layer_input, *args, **kwargs)
# Add to layer LUT.
self.layers[name] = layer_output
# This output is now the input for the next layer.
self.feed(layer_output)
#print('output shape of %s:' % (name))
#print layer_output.shape
# Return self for chained calls.
return self
return layer_decorated
class Network(object):
def __init__(self, inputs, trainable=True):
# The input nodes for this network
self.inputs = inputs
# The current list of terminal nodes
self.terminals = []
# Mapping from layer names to layers
self.layers = dict(inputs)
# If true, the resulting variables are set as trainable
self.trainable = trainable
# Switch variable for dropout
self.paddle_env = None
self.setup()
def setup(self):
'''Construct the network. '''
raise NotImplementedError('Must be implemented by the subclass.')
def load(self, data_path, exe=None, place=None, ignore_missing=False):
'''Load network weights.
data_path: The path to the numpy-serialized network weights
ignore_missing: If true, serialized weights for missing layers are ignored.
'''
fluid = import_fluid()
#load fluid mode directly
if os.path.isdir(data_path):
assert (exe is not None), \
'must provide an executor to load fluid model'
fluid.io.load_persistables_if_exist(executor=exe, dirname=data_path)
return True
#load model from a npy file
if exe is None or place is None:
if self.paddle_env is None:
place = fluid.CPUPlace()
exe = fluid.Executor(place)
self.paddle_env = {'place': place, 'exe': exe}
exe.run(fluid.default_startup_program())
else:
place = self.paddle_env['place']
exe = self.paddle_env['exe']
data_dict = np.load(data_path).item()
for op_name in data_dict:
layer = self.layers[op_name]
for param_name, data in data_dict[op_name].iteritems():
try:
name = '%s_%s' % (op_name, param_name)
v = fluid.global_scope().find_var(name)
w = v.get_tensor()
w.set(data, place)
except ValueError:
if not ignore_missing:
raise
return True
def feed(self, *args):
'''Set the input(s) for the next operation by replacing the terminal nodes.
The arguments can be either layer names or the actual layers.
'''
assert len(args) != 0
self.terminals = []
for fed_layer in args:
if isinstance(fed_layer, basestring):
try:
fed_layer = self.layers[fed_layer]
except KeyError:
raise KeyError('Unknown layer name fed: %s' % fed_layer)
self.terminals.append(fed_layer)
return self
def get_output(self):
'''Returns the current network output.'''
return self.terminals[-1]
def get_unique_name(self, prefix):
'''Returns an index-suffixed unique name for the given prefix.
This is used for auto-generating layer names based on the type-prefix.
'''
ident = sum(t.startswith(prefix) for t, _ in self.layers.items()) + 1
return '%s_%d' % (prefix, ident)
@layer
def conv(self,
input,
k_h,
k_w,
c_o,
s_h,
s_w,
name,
relu=True,
padding=None,
group=1,
biased=True):
if padding is None:
padding = [0, 0]
# Get the number of channels in the input
c_i, h_i, w_i = input.shape[1:]
# Verify that the grouping parameter is valid
assert c_i % group == 0
assert c_o % group == 0
fluid = import_fluid()
prefix = name + '_'
output = fluid.layers.conv2d(
input=input,
filter_size=[k_h, k_w],
num_filters=c_o,
stride=[s_h, s_w],
padding=padding,
groups=group,
param_attr=fluid.ParamAttr(name=prefix + "weights"),
bias_attr=fluid.ParamAttr(name=prefix + "biases"),
act="relu" if relu is True else None)
return output
@layer
def relu(self, input, name):
fluid = import_fluid()
output = fluid.layers.relu(x=input)
return output
def _adjust_pad_if_needed(self, i_hw, k_hw, s_hw, p_hw):
#adjust the padding if needed
i_h, i_w = i_hw
k_h, k_w = k_hw
s_h, s_w = s_hw
p_h, p_w = p_hw
def is_consistent(i, k, s, p):
o = i + 2 * p - k
if o % s == 0:
return True
else:
return False
real_p_h = 0
real_p_w = 0
if is_consistent(i_h, k_h, s_h, p_h) is False:
real_p_h = int(k_h / 2)
if is_consistent(i_w, k_w, s_w, p_w) is False:
real_p_w = int(k_w / 2)
return [real_p_h, real_p_w]
def pool(self, pool_type, input, k_h, k_w, s_h, s_w, name, padding):
# Get the number of channels in the input
in_hw = input.shape[2:]
k_hw = [k_h, k_w]
s_hw = [s_h, s_w]
if padding is None:
#work around the padding-rounding difference between conv and pool
#more info: https://github.com/BVLC/caffe/issues/1318
padding = self._adjust_pad_if_needed(in_hw, k_hw, s_hw, [0, 0])
fluid = import_fluid()
output = fluid.layers.pool2d(
input=input,
pool_size=k_hw,
pool_stride=s_hw,
pool_padding=padding,
pool_type=pool_type)
return output
@layer
def max_pool(self, input, k_h, k_w, s_h, s_w, name, padding=None):
return self.pool('max', input, k_h, k_w, s_h, s_w, name, padding)
@layer
def avg_pool(self, input, k_h, k_w, s_h, s_w, name, padding=None):
return self.pool('avg', input, k_h, k_w, s_h, s_w, name, padding)
@layer
def lrn(self, input, radius, alpha, beta, name, bias=1.0):
fluid = import_fluid()
output = fluid.layers.lrn(input=input, \
n=radius, k=bias, alpha=alpha, beta=beta, name=name)
return output
@layer
def concat(self, inputs, axis, name):
fluid = import_fluid()
output = fluid.layers.concat(input=inputs, axis=axis)
return output
@layer
def add(self, inputs, name):
fluid = import_fluid()
output = inputs[0]
for i in inputs[1:]:
output = fluid.layers.elementwise_add(x=output, y=i)
return output
@layer
def fc(self, input, num_out, name, relu=True, act=None):
fluid = import_fluid()
if act is None:
act = 'relu' if relu is True else None
prefix = name + '_'
output = fluid.layers.fc(
name=name,
input=input,
size=num_out,
act=act,
param_attr=fluid.ParamAttr(name=prefix + 'weights'),
bias_attr=fluid.ParamAttr(name=prefix + 'biases'))
return output
@layer
def softmax(self, input, name):
fluid = import_fluid()
output = fluid.layers.softmax(input)
return output
@layer
def batch_normalization(self, input, name, scale_offset=True, relu=False):
# NOTE: Currently, only inference is supported
fluid = import_fluid()
prefix = name + '_'
param_attr = None if scale_offset is False else fluid.ParamAttr(
name=prefix + 'scale')
bias_attr = None if scale_offset is False else fluid.ParamAttr(
name=prefix + 'offset')
mean_name = prefix + 'mean'
variance_name = prefix + 'variance'
output = fluid.layers.batch_norm(
name=name,
input=input,
is_test=True,
param_attr=param_attr,
bias_attr=bias_attr,
moving_mean_name=mean_name,
moving_variance_name=variance_name,
epsilon=1e-5,
act='relu' if relu is True else None)
return output
@layer
def dropout(self, input, drop_prob, name, is_test=True):
fluid = import_fluid()
output = fluid.layers.dropout(
input, dropout_prob=drop_prob, is_test=is_test, name=name)
return output
import numpy as np
from ..errors import KaffeError, print_stderr
from ..graph import GraphBuilder, NodeMapper
from ..layers import NodeKind
from ..transformers import (DataInjector, DataReshaper, NodeRenamer, ReLUFuser,
BatchNormScaleBiasFuser, BatchNormPreprocessor,
ParameterNamer)
from . import network
def get_padding_type(kernel_params, input_shape, output_shape):
'''Translates Caffe's numeric padding into the explicit [p_h, p_w] used by
the fluid emitter, returning None when the padding is zero.
This routine is inherited from caffe-tensorflow, where Caffe's arbitrary
padding values had to be mapped onto TensorFlow's 'SAME'/'VALID' modes; the
subtleties of those edge-cases are described here:
https://github.com/Yangqing/caffe2/blob/master/caffe2/proto/caffe2_legacy.proto
'''
k_h, k_w, s_h, s_w, p_h, p_w = kernel_params
if p_h * p_w > 0:
return [p_h, p_w]
else:
return None
class TensorFlowNode(object):
'''An intermediate representation for TensorFlow operations.'''
def __init__(self, op, *args, **kwargs):
# A string corresponding to the TensorFlow operation
self.op = op
# Positional arguments for the operation
self.args = args
# Keyword arguments for the operation
self.kwargs = list(kwargs.items())
# The source Caffe node
self.node = None
def format(self, arg):
'''Returns a string representation for the given value.'''
return "'%s'" % arg if isinstance(arg, basestring) else str(arg)
def pair(self, key, value):
'''Returns key=formatted(value).'''
return '%s=%s' % (key, self.format(value))
def emit(self):
'''Emits the Python source for this node.'''
# Format positional arguments
args = map(self.format, self.args)
# Format any keyword arguments
if self.kwargs:
args += [self.pair(k, v) for k, v in self.kwargs]
# Set the node name
args.append(self.pair('name', self.node.name))
args = ', '.join(args)
return '%s(%s)' % (self.op, args)
class MaybeActivated(object):
def __init__(self, node, default=True):
self.inject_kwargs = {}
if node.metadata.get('relu', False) != default:
self.inject_kwargs['relu'] = not default
def __call__(self, *args, **kwargs):
kwargs.update(self.inject_kwargs)
return TensorFlowNode(*args, **kwargs)
class TensorFlowMapper(NodeMapper):
def get_kernel_params(self, node):
kernel_params = node.layer.kernel_parameters
input_shape = node.get_only_parent().output_shape
padding = get_padding_type(kernel_params, input_shape,
node.output_shape)
# Only emit the padding if it's not the default value.
padding = {'padding': padding} if padding is not None else {}
return (kernel_params, padding)
def map_convolution(self, node):
(kernel_params, kwargs) = self.get_kernel_params(node)
h = kernel_params.kernel_h
w = kernel_params.kernel_w
c_o = node.output_shape[1]
c_i = node.parents[0].output_shape[1]
group = node.parameters.group
if group != 1:
kwargs['group'] = group
if not node.parameters.bias_term:
kwargs['biased'] = False
assert kernel_params.kernel_h == h
assert kernel_params.kernel_w == w
return MaybeActivated(node)(
'conv', kernel_params.kernel_h, kernel_params.kernel_w, c_o,
kernel_params.stride_h, kernel_params.stride_w, **kwargs)
def map_relu(self, node):
return TensorFlowNode('relu')
def map_pooling(self, node):
pool_type = node.parameters.pool
if pool_type == 0:
pool_op = 'max_pool'
elif pool_type == 1:
pool_op = 'avg_pool'
else:
# Stochastic pooling, for instance.
raise KaffeError('Unsupported pooling type.')
(kernel_params, padding) = self.get_kernel_params(node)
return TensorFlowNode(pool_op, kernel_params.kernel_h,
kernel_params.kernel_w, kernel_params.stride_h,
kernel_params.stride_w, **padding)
def map_inner_product(self, node):
#TODO: Axis
assert node.parameters.axis == 1
#TODO: Unbiased
assert node.parameters.bias_term == True
return MaybeActivated(node)('fc', node.parameters.num_output)
def map_softmax(self, node):
return TensorFlowNode('softmax')
def map_lrn(self, node):
params = node.parameters
# The window size must be an odd value. For a window
# size of (2*n+1), TensorFlow defines depth_radius = n.
assert params.local_size % 2 == 1
# Caffe scales by (alpha/(2*n+1)), whereas TensorFlow
# just scales by alpha (as does Krizhevsky's paper).
# We'll account for that here.
alpha = params.alpha / float(params.local_size)
return TensorFlowNode('lrn', params.local_size, alpha, params.beta)
def map_concat(self, node):
return TensorFlowNode('concat', node.parameters.axis)
def map_dropout(self, node):
return TensorFlowNode('dropout', node.parameters.dropout_ratio)
def map_batch_norm(self, node):
scale_offset = len(node.data) == 4
kwargs = {} if scale_offset else {'scale_offset': False}
return MaybeActivated(
node, default=False)('batch_normalization', **kwargs)
def map_eltwise(self, node):
operations = {0: 'multiply', 1: 'add', 2: 'max'}
op_code = node.parameters.operation
try:
return TensorFlowNode(operations[op_code])
except KeyError:
raise KaffeError('Unknown elementwise operation: {}'.format(
op_code))
def commit(self, chains):
return chains
class TensorFlowEmitter(object):
def __init__(self, tab=None):
self.tab = tab or ' ' * 4
self.prefix = ''
self.net_name = ''
def indent(self):
self.prefix += self.tab
def outdent(self):
self.prefix = self.prefix[:-len(self.tab)]
def statement(self, s):
return self.prefix + s + '\n'
def emit_imports(self):
import inspect
codes = []
codes.append(
'### generated by caffe2fluid, your net is in class "%s" ###\n' %
(self.net_name))
network_source = inspect.getsource(network)
codes.append(network_source + '\n')
return self.statement('\n'.join(codes))
def emit_class_def(self, name):
return self.statement('class %s(Network):' % (name))
def emit_setup_def(self):
return self.statement('def setup(self):')
def emit_shape_def(self, input_nodes):
self.outdent()
func_def = self.statement('@classmethod')
func_def += self.statement('def input_shapes(cls):')
self.indent()
input_shapes = {}
for n in input_nodes:
name = n.name
output_shape = n.output_shape
shape = [str(s) for s in output_shape[1:]]
input_shapes[name] = ', '.join(shape)
input_shapes = ['"%s": [%s]' % (n, l) for n, l in input_shapes.items()]
shape_str = ','.join(input_shapes)
func_def += self.statement('return {%s}' % (shape_str))
return '\n\n' + func_def
def emit_convert_def(self, input_nodes):
codes = []
inputs = {}
codes.append('shapes = cls.input_shapes()')
for n in input_nodes:
name = n.name
layer_var = name + '_layer'
layer_def = '%s = fluid.layers.data(name="%s", shape=shapes["%s"],'\
' dtype="float32")' % (layer_var, name, name)
#layer_var, layer_def = data_layer_def(n.name, n.output_shape)
codes.append(layer_def)
inputs[name] = layer_var
input_dict = ','.join(['"%s": %s' % (n, l) for n, l in inputs.items()])
codes.append('feed_data = {' + input_dict + '}')
codes.append('net = cls(feed_data)')
codes.append("place = fluid.CPUPlace()")
codes.append("exe = fluid.Executor(place)")
codes.append("exe.run(fluid.default_startup_program())")
codes.append("net.load(data_path=npy_model, exe=exe, place=place)")
codes.append(
"fluid.io.save_persistables(executor=exe, dirname=fluid_path)")
self.outdent()
func_def = self.statement('@classmethod')
func_def += self.statement('def convert(cls, npy_model, fluid_path):')
self.indent()
func_def += self.statement('import paddle.v2.fluid as fluid')
for l in codes:
func_def += self.statement(l)
return '\n' + func_def
def emit_main_def(self, name):
if name is None:
return ''
self.prefix = ''
main_def = self.statement('if __name__ == "__main__":')
self.indent()
main_def += self.statement("#usage: python xxxnet.py xxx.npy ./model\n")
main_def += self.statement("import sys")
main_def += self.statement("npy_weight = sys.argv[1]")
main_def += self.statement("fluid_model = sys.argv[2]")
main_def += self.statement("%s.convert(npy_weight, fluid_model)" %
(name))
main_def += self.statement("exit(0)")
return '\n\n' + main_def
def emit_parents(self, chain):
assert len(chain)
s = 'self.feed('
sep = ', \n' + self.prefix + (' ' * len(s))
s += sep.join(
["'%s'" % parent.name for parent in chain[0].node.parents])
return self.statement(s + ')')
def emit_node(self, node):
return self.statement('self.' + node.emit())
def emit(self, name, chains, input_nodes=None):
self.net_name = name
s = self.emit_imports()
s += self.emit_class_def(name)
self.indent()
s += self.emit_setup_def()
self.indent()
blocks = []
for chain in chains:
b = ''
b += self.emit_parents(chain)
for node in chain:
b += self.emit_node(node)
blocks.append(b[:-1])
s = s + '\n\n'.join(blocks)
s += self.emit_shape_def(input_nodes)
s += self.emit_convert_def(input_nodes)
s += self.emit_main_def(name)
return s
class Transformer(object):
def __init__(self, def_path, data_path, verbose=True, phase='test'):
self.verbose = verbose
self.phase = phase
self.load(def_path, data_path, phase)
self.params = None
self.source = None
def load(self, def_path, data_path, phase):
# Build the graph
graph = GraphBuilder(def_path, phase).build()
if data_path is not None:
# Load and associate learned parameters
graph = DataInjector(def_path, data_path)(graph)
# Transform the graph
transformers = [
# Fuse split batch normalization layers
BatchNormScaleBiasFuser(),
# Fuse ReLUs
# TODO: Move non-linearity application to layer wrapper, allowing
# any arbitrary operation to be optionally activated.
ReLUFuser(allowed_parent_types=[
NodeKind.Convolution, NodeKind.InnerProduct, NodeKind.BatchNorm
]),
# Rename nodes
# Slashes are used for scoping in TensorFlow. Replace slashes
# in node names with underscores.
# (Caffe's GoogLeNet implementation uses slashes)
NodeRenamer(lambda node: node.name.replace('/', '_'))
]
self.graph = graph.transformed(transformers)
# Display the graph
if self.verbose:
print_stderr(self.graph)
def transform_data(self):
if self.params is None:
transformers = [
# Reshape the parameters to the ordering expected by the target framework
DataReshaper({
# (c_o, c_i, h, w) is kept as-is here (an identity transpose)
NodeKind.Convolution: (0, 1, 2, 3),
# (c_o, c_i) -> (c_i, c_o)
NodeKind.InnerProduct: (1, 0)
}),
# Pre-process batch normalization data
BatchNormPreprocessor(),
# Convert parameters to dictionaries
ParameterNamer(),
]
self.graph = self.graph.transformed(transformers)
self.params = {
node.name: node.data
for node in self.graph.nodes if node.data
}
return self.params
def transform_source(self):
if self.source is None:
mapper = TensorFlowMapper(self.graph)
chains = mapper.map()
emitter = TensorFlowEmitter()
input_nodes = self.graph.get_input_nodes()
self.source = emitter.emit(self.graph.name, chains, input_nodes)
return self.source
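# Minimal usage sketch (file names are hypothetical): load a Caffe model,
# dump its parameters as a .npy file, and write the generated network source.
#
#   t = Transformer('alexnet.prototxt', 'alexnet.caffemodel')
#   np.save('alexnet.npy', t.transform_data())
#   with open('alexnet.py', 'w') as f:
#       f.write(t.transform_source())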
import math
from collections import namedtuple
from .errors import KaffeError
TensorShape = namedtuple('TensorShape',
['batch_size', 'channels', 'height', 'width'])
def get_filter_output_shape(i_h, i_w, params, round_func):
o_h = (i_h + 2 * params.pad_h - params.kernel_h
) / float(params.stride_h) + 1
o_w = (i_w + 2 * params.pad_w - params.kernel_w
) / float(params.stride_w) + 1
return (int(round_func(o_h)), int(round_func(o_w)))
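# Worked example (values chosen for illustration): a 224x224 input with a
# 7x7 kernel, stride 2 and padding 3 gives
#   o = (224 + 2*3 - 7) / 2 + 1 = 112.5
# which math.floor (used for convolution below) rounds to 112, while
# math.ceil (used for pooling) would round it up to 113.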
def get_strided_kernel_output_shape(node, round_func):
assert node.layer is not None
input_shape = node.get_only_parent().output_shape
o_h, o_w = get_filter_output_shape(input_shape.height, input_shape.width,
node.layer.kernel_parameters, round_func)
params = node.layer.parameters
has_c_o = hasattr(params, 'num_output')
c = params.num_output if has_c_o else input_shape.channels
return TensorShape(input_shape.batch_size, c, o_h, o_w)
def shape_not_implemented(node):
raise NotImplementedError
def shape_identity(node):
assert len(node.parents) > 0
return node.parents[0].output_shape
def shape_scalar(node):
return TensorShape(1, 1, 1, 1)
def shape_data(node):
if node.output_shape:
# Old-style input specification
return node.output_shape
try:
# New-style input specification
return list(map(int, node.parameters.shape[0].dim))
except (AttributeError, IndexError):
# We most likely have a data layer on our hands. The problem is,
# Caffe infers the dimensions of the data from the source (eg: LMDB).
# We want to avoid reading datasets here. Fail for now.
# This can be temporarily fixed by transforming the data layer to
# Caffe's "input" layer (as is usually used in the "deploy" version).
# TODO: Find a better solution for this.
raise KaffeError('Cannot determine dimensions of data layer.\n'
'See comments in function shape_data for more info.')
def shape_mem_data(node):
params = node.parameters
return TensorShape(params.batch_size, params.channels, params.height,
params.width)
def shape_concat(node):
axis = node.layer.parameters.axis
output_shape = None
for parent in node.parents:
if output_shape is None:
output_shape = list(parent.output_shape)
else:
output_shape[axis] += parent.output_shape[axis]
return TensorShape(*output_shape)
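# For example (illustrative shapes): concatenating parents with output
# shapes (1, 64, 56, 56) and (1, 32, 56, 56) along axis=1 yields
# (1, 96, 56, 56); every dimension other than `axis` must already agree.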
def shape_convolution(node):
return get_strided_kernel_output_shape(node, math.floor)
def shape_pool(node):
return get_strided_kernel_output_shape(node, math.ceil)
def shape_inner_product(node):
input_shape = node.get_only_parent().output_shape
return TensorShape(input_shape.batch_size, node.layer.parameters.num_output,
1, 1)
'''
A collection of graph transforms.
A transformer is a callable that accepts a graph and returns a transformed version.
'''
import os
import numpy as np
from .caffe import get_caffe_resolver, has_pycaffe
from .errors import KaffeError, debug, notice, warn
from .layers import NodeKind
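# A minimal sketch of the transformer protocol described in the module
# docstring above (the class name is hypothetical): any callable that maps a
# graph to a graph qualifies, including one that returns it unchanged.
#
#   class NoOpTransformer(object):
#       def __call__(self, graph):
#           return graph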
class DataInjector(object):
'''
Associates parameters loaded from a .caffemodel file with their corresponding nodes.
'''
def __init__(self, def_path, data_path):
# The .prototxt file defining the graph
self.def_path = def_path
# The .caffemodel file containing the learned parameters
self.data_path = data_path
# Set to true if the fallback protocol-buffer based backend was used
self.did_use_pb = False
# A list containing (layer name, parameters) tuples
self.params = None
# Load the parameters
self.load()
def load(self):
if has_pycaffe():
self.load_using_caffe()
else:
self.load_using_pb()
def load_using_caffe(self):
caffe = get_caffe_resolver().caffe
net = caffe.Net(self.def_path, self.data_path, caffe.TEST)
data = lambda blob: blob.data
self.params = [(k, map(data, v)) for k, v in net.params.items()]
def load_using_pb(self):
data = get_caffe_resolver().NetParameter()
with open(self.data_path, 'rb') as f:
data.MergeFromString(f.read())
pair = lambda layer: (layer.name, self.normalize_pb_data(layer))
layers = data.layers or data.layer
self.params = [pair(layer) for layer in layers if layer.blobs]
self.did_use_pb = True
def normalize_pb_data(self, layer):
transformed = []
for blob in layer.blobs:
if len(blob.shape.dim):
dims = blob.shape.dim
c_o, c_i, h, w = map(int, [1] * (4 - len(dims)) + list(dims))
else:
c_o = blob.num
c_i = blob.channels
h = blob.height
w = blob.width
data = np.array(blob.data, dtype=np.float32).reshape(c_o, c_i, h, w)
transformed.append(data)
return transformed
def adjust_parameters(self, node, data):
if not self.did_use_pb:
return data
# When using the protobuf-backend, each parameter initially has four dimensions.
# In certain cases (like FC layers), we want to eliminate the singleton dimensions.
# This implementation takes care of the common cases. However, it does leave the
# potential for future issues.
# The Caffe-backend does not suffer from this problem.
data = list(data)
squeeze_indices = [1] # Squeeze biases.
if node.kind == NodeKind.InnerProduct:
squeeze_indices.append(0) # Squeeze FC.
for idx in squeeze_indices:
if idx >= len(data):
continue
shape_old = data[idx].shape
data[idx] = np.squeeze(data[idx])
shape_new = data[idx].shape
if len(shape_old) != len(shape_new):
debug('squeeze idx:%d, with kind:%s,name:%s' % \
(idx, node.kind, node.name))
return data
def __call__(self, graph):
for layer_name, data in self.params:
if layer_name in graph:
node = graph.get_node(layer_name)
node.data = self.adjust_parameters(node, data)
else:
notice('Ignoring parameters for non-existent layer: %s' % \
layer_name)
return graph
class DataReshaper(object):
def __init__(self, mapping, replace=True):
# A dictionary mapping NodeKind to the transposed order.
self.mapping = mapping
# The node kinds eligible for reshaping
self.reshaped_node_types = self.mapping.keys()
# If true, the reshaped data will replace the old one.
# Otherwise, it's set to the reshaped_data attribute.
self.replace = replace
def has_spatial_parent(self, node):
try:
parent = node.get_only_parent()
s = parent.output_shape
return s.height > 1 or s.width > 1
except KaffeError:
return False
def map(self, node_kind):
try:
return self.mapping[node_kind]
except KeyError:
raise KaffeError('Ordering not found for node kind: {}'.format(node_kind))
def __call__(self, graph):
for node in graph.nodes:
if node.data is None:
continue
if node.kind not in self.reshaped_node_types:
# Check for 2+ dimensional data
if any(len(tensor.shape) > 1 for tensor in node.data):
notice('parameters not reshaped for node: {}'.format(node))
continue
transpose_order = self.map(node.kind)
weights = node.data[0]
if (node.kind == NodeKind.InnerProduct
) and self.has_spatial_parent(node):
# The FC layer connected to the spatial layer needs to be
# re-wired to match the new spatial ordering.
in_shape = node.get_only_parent().output_shape
fc_shape = weights.shape
output_channels = fc_shape[0]
weights = weights.reshape((output_channels, -1))
weights = weights.transpose(transpose_order)
node.reshaped_data = weights
else:
node.reshaped_data = weights.transpose(transpose_order)
if self.replace:
for node in graph.nodes:
if hasattr(node, 'reshaped_data'):
# Set the weights
node.data[0] = node.reshaped_data
del node.reshaped_data
return graph
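# Worked example (illustrative shapes): Caffe stores fully connected weights
# as (c_o, c_i), e.g. (4096, 9216); the (1, 0) order above transposes them
# to (9216, 4096), i.e. (input_dim, output_dim). With the (0, 1, 2, 3)
# identity order, convolution weights keep Caffe's (c_o, c_i, h, w) layout.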
class SubNodeFuser(object):
'''
An abstract helper for merging a single-child with its single-parent.
'''
def __call__(self, graph):
nodes = graph.nodes
fused_nodes = []
for node in nodes:
if len(node.parents) != 1:
# We're only fusing nodes with single parents
continue
parent = node.get_only_parent()
if len(parent.children) != 1:
# We can only fuse a node if its parent's
# value isn't used by any other node.
continue
if not self.is_eligible_pair(parent, node):
continue
# Rewrite the fused node's children to its parent.
for child in node.children:
child.parents.remove(node)
parent.add_child(child)
# Disconnect the fused node from the graph.
parent.children.remove(node)
fused_nodes.append(node)
# Let the sub-class merge the fused node in any arbitrary way.
self.merge(parent, node)
transformed_nodes = [node for node in nodes if node not in fused_nodes]
return graph.replaced(transformed_nodes)
def is_eligible_pair(self, parent, child):
'''Returns true if this parent/child pair is eligible for fusion.'''
raise NotImplementedError('Must be implemented by subclass.')
def merge(self, parent, child):
'''Merge the child node into the parent.'''
raise NotImplementedError('Must be implemented by subclass')
class ReLUFuser(SubNodeFuser):
'''
Fuses rectified linear units with their parent nodes.
'''
def __init__(self, allowed_parent_types=None):
# Fuse ReLUs when the parent node is one of the given types.
# If None, all node types are eligible.
self.allowed_parent_types = allowed_parent_types
def is_eligible_pair(self, parent, child):
return ((self.allowed_parent_types is None or \
parent.kind in self.allowed_parent_types) and \
child.kind == NodeKind.ReLU)
def merge(self, parent, _):
parent.metadata['relu'] = True
class BatchNormScaleBiasFuser(SubNodeFuser):
'''
The original batch normalization paper includes two learned
parameters: a scaling factor \gamma and a bias \beta.
Caffe's implementation does not include these two. However, it is commonly
replicated by adding a scaling+bias layer immediately after the batch norm.
This fuser merges the scaling+bias layer with the batch norm.
'''
def is_eligible_pair(self, parent, child):
return (parent.kind == NodeKind.BatchNorm and \
child.kind == NodeKind.Scale and \
child.parameters.axis == 1 and \
child.parameters.bias_term)
def merge(self, parent, child):
parent.scale_bias_node = child
class BatchNormPreprocessor(object):
'''
Prescale batch normalization parameters.
Concatenate gamma (scale) and beta (bias) terms if set.
'''
def __call__(self, graph):
for node in graph.nodes:
if node.kind != NodeKind.BatchNorm:
continue
assert node.data is not None
assert len(node.data) == 3
node.data = [np.squeeze(i) for i in node.data]
mean, variance, scale = node.data
# Prescale the stats
scaling_factor = 1.0 / scale if scale != 0 else 0
mean *= scaling_factor
variance *= scaling_factor
# Replace with the updated values
node.data = [mean, variance]
if hasattr(node, 'scale_bias_node'):
# Include the scale and bias terms
gamma, beta = node.scale_bias_node.data
node.data += [np.squeeze(i) for i in [gamma, beta]]
return graph
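# Numeric sketch of the prescaling above (illustrative values): Caffe keeps
# a moving-average scale factor alongside the accumulated statistics, so raw
# mean=2.0, variance=8.0 with scale=4.0 become mean = 2.0 * (1/4.0) = 0.5
# and variance = 8.0 * (1/4.0) = 2.0 after this pass.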
class NodeRenamer(object):
'''
Renames nodes in the graph using a given unary function that
accepts a node and returns its new name.
'''
def __init__(self, renamer):
self.renamer = renamer
def __call__(self, graph):
for node in graph.nodes:
node.name = self.renamer(node)
return graph
class ParameterNamer(object):
'''
Convert layer data arrays to a dictionary mapping parameter names to their values.
'''
def __call__(self, graph):
for node in graph.nodes:
if node.data is None:
continue
if node.kind in (NodeKind.Convolution, NodeKind.InnerProduct):
names = ('weights', )
if node.parameters.bias_term:
names += ('biases', )
elif node.kind == NodeKind.BatchNorm:
names = ('mean', 'variance')
if len(node.data) == 4:
names += ('scale', 'offset')
else:
warn('Unhandled parameters: {}'.format(node.kind))
continue
assert len(names) == len(node.data)
node.data = dict(zip(names, node.data))
return graph
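# For example (illustrative data): after this pass a Convolution node whose
# data list was [W, b] carries {'weights': W, 'biases': b}, and a BatchNorm
# node fused with its scale/bias layer carries
# {'mean': m, 'variance': v, 'scale': gamma, 'offset': beta}.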
#!/bin/bash
#function:
# script used to generate caffepb.py from caffe.proto using protoc
#
PROTOC=$(which protoc)
if [[ -z "$PROTOC" ]]; then
echo "protoc not found; please install it first (see https://github.com/google/protobuf/releases)"
exit 1
fi
WORK_ROOT=$(dirname "$(readlink -f "${BASH_SOURCE[0]}")")
PY_NAME="$WORK_ROOT/caffepb.py"
$PROTOC --proto_path=$WORK_ROOT --python_out=$WORK_ROOT $WORK_ROOT/caffe.proto
ret=$?
if [ $ret -eq 0 ];then
mv $WORK_ROOT/caffe_pb2.py $PY_NAME
fi
if [ -e "$PY_NAME" ];then
echo "succeed to generate [$PY_NAME]"
exit 0
else
echo "failed to generate [$PY_NAME]"
fi
exit $ret
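# Example run (assuming this script sits next to caffe.proto; the script
# name is hypothetical):
#   bash ./compile_caffe_proto.sh
#   # -> succeeded in generating [<WORK_ROOT>/caffepb.py]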
The minimum PaddlePaddle version needed for the code sample in this directory is the latest develop branch. If you are on a version of PaddlePaddle earlier than this, [please update your installation](http://www.paddlepaddle.org/docs/develop/documentation/en/build_and_install/pip_install_en.html).
---
This is a collection of example models for neural machine translation and neural sequence modeling.
### TODO
This project is still under active development.
The minimum PaddlePaddle version needed for the code sample in this directory is the latest develop branch. If you are on a version of PaddlePaddle earlier than this, [please update your installation](http://www.paddlepaddle.org/docs/develop/documentation/en/build_and_install/pip_install_en.html).
---
# MobileNet-SSD
This model, built with PaddlePaddle Fluid, is still under active development and is not yet the final version. Feedback is welcome.