Skip to content
体验新版
项目
组织
正在加载...
登录
切换导航
打开侧边栏
毕竟曾有刹那
Mace
提交
ced4a49d
Mace
项目概览
毕竟曾有刹那
/
Mace
与 Fork 源项目一致
Fork自
Xiaomi / Mace
通知
1
Star
0
Fork
0
代码
文件
提交
分支
Tags
贡献者
分支图
Diff
Issue
0
列表
看板
标记
里程碑
合并请求
0
DevOps
流水线
流水线任务
计划
Wiki
0
Wiki
分析
仓库
DevOps
项目成员
Pages
Mace
项目概览
项目概览
详情
发布
仓库
仓库
文件
提交
分支
标签
贡献者
分支图
比较
Issue
0
Issue
0
列表
看板
标记
里程碑
合并请求
0
合并请求
0
Pages
DevOps
DevOps
流水线
流水线任务
计划
分析
分析
仓库分析
DevOps
Wiki
0
Wiki
成员
成员
收起侧边栏
关闭侧边栏
动态
分支图
创建新Issue
流水线任务
提交
Issue看板
体验新版 GitCode,发现更多精彩内容 >>
提交
ced4a49d
编写于
5月 28, 2018
作者:
L
liuqi
浏览文件
操作
浏览文件
下载
电子邮件补丁
差异文件
Refactor mace_tools and yaml format for better usage.
上级
fe0cdf27
变更
30
展开全部
隐藏空白更改
内联
并排
Showing
30 changed file
with
1922 addition
and
1344 deletion
+1922
-1344
.gitlab-ci.yml
.gitlab-ci.yml
+1
-0
docs/getting_started/create_a_model_deployment.rst
docs/getting_started/create_a_model_deployment.rst
+6
-6
docs/getting_started/how_to_build.rst
docs/getting_started/how_to_build.rst
+114
-28
docs/getting_started/models/demo_app_models.yaml
docs/getting_started/models/demo_app_models.yaml
+28
-22
mace/benchmark/BUILD
mace/benchmark/BUILD
+0
-2
mace/core/BUILD
mace/core/BUILD
+1
-7
mace/core/mace.cc
mace/core/mace.cc
+5
-4
mace/kernels/opencl/concat.cc
mace/kernels/opencl/concat.cc
+2
-2
mace/kernels/opencl/helper.cc
mace/kernels/opencl/helper.cc
+3
-2
mace/kernels/opencl/slice.cc
mace/kernels/opencl/slice.cc
+2
-2
mace/mace.bzl
mace/mace.bzl
+0
-12
mace/python/tools/BUILD
mace/python/tools/BUILD
+3
-4
mace/python/tools/converter.py
mace/python/tools/converter.py
+43
-60
mace/python/tools/converter_tool/tensorflow_converter.py
mace/python/tools/converter_tool/tensorflow_converter.py
+0
-6
mace/python/tools/converter_tool/transformer.py
mace/python/tools/converter_tool/transformer.py
+4
-3
mace/python/tools/encrypt_opencl_codegen.py
mace/python/tools/encrypt_opencl_codegen.py
+0
-2
mace/python/tools/mace_engine_factory.h.jinja2
mace/python/tools/mace_engine_factory.h.jinja2
+1
-1
mace/python/tools/mace_engine_factory_codegen.py
mace/python/tools/mace_engine_factory_codegen.py
+0
-1
mace/python/tools/model_saver.py
mace/python/tools/model_saver.py
+315
-0
mace/python/tools/operator.jinja2
mace/python/tools/operator.jinja2
+4
-2
mace/python/tools/source_converter_lib.py
mace/python/tools/source_converter_lib.py
+0
-99
mace/utils/BUILD
mace/utils/BUILD
+1
-23
mace/utils/tuner.h
mace/utils/tuner.h
+10
-10
tools/bazel.rc
tools/bazel.rc
+57
-0
tools/bazel_adb_run.py
tools/bazel_adb_run.py
+20
-41
tools/common.py
tools/common.py
+86
-10
tools/example.yaml
tools/example.yaml
+0
-46
tools/mace_tools.py
tools/mace_tools.py
+985
-709
tools/sh_commands.py
tools/sh_commands.py
+210
-227
tools/validate.py
tools/validate.py
+21
-13
未找到文件。
.gitlab-ci.yml
浏览文件 @
ced4a49d
...
...
@@ -68,4 +68,5 @@ platform_compitable_tests:
stage
:
platform_compitable_tests
script
:
-
mkdir -p mace/codegen/version && bash mace/tools/git/gen_version_source.sh mace/codegen/version/version.cc
-
mkdir -p mace/codegen/tuning && python mace/python/tools/binary_codegen.py --output_path=mace/codegen/tuning/tuning_params.cc
-
bazel build mace/core:core
docs/getting_started/create_a_model_deployment.rst
浏览文件 @
ced4a49d
...
...
@@ -46,14 +46,14 @@ Configurations
- The SHA256 checksum of the model file
* - weight_sha256_checksum
- The SHA256 checksum of the weight file, used by Caffe model
* - input_
node
s
- The input
node names,
one or more strings
* - output_
node
s
- The output
node names,
one or more strings
* - input_
tensor
s
- The input
tensor names (tensorflow), top name of inputs' layer (caffe).
one or more strings
* - output_
tensor
s
- The output
tensor names (tensorflow), top name of outputs' layer (caffe).
one or more strings
* - input_shapes
- The shapes of the input
node
s, in NHWC order
- The shapes of the input
tensor
s, in NHWC order
* - output_shapes
- The shapes of the output
node
s, in NHWC order
- The shapes of the output
tensor
s, in NHWC order
* - runtime
- The running device, one of CPU, GPU or DSP
* - limit_opencl_kernel_time
...
...
docs/getting_started/how_to_build.rst
浏览文件 @
ced4a49d
...
...
@@ -126,6 +126,7 @@ Tool <https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/grap
strip_unused_nodes(type=float, shape="1,64,64,3")
remove_nodes(op=Identity, op=CheckNumerics)
fold_constants(ignore_errors=true)
flatten_atrous_conv
fold_batch_norms
fold_old_batch_norms
strip_unused_nodes
...
...
@@ -171,38 +172,110 @@ Caffe目前只支持最新版本,旧版本请使用Caffe的工具进行升级
3.2 运行\ ``tools/mace_tools.py``\ 脚本
**Commands**
**build**
.. note::
build模型静态库以及测试工具。
* *--config* (type=str, default="", required):模型配置yaml文件路径.
* *--tuning* (optional):是否为特定SOC调制GPU参数.
* *--enable_openmp* (optional):是否启用openmp.
**run**
.. note::
命令行运行模型
* *--config* (type=str, default="", required):模型配置yaml文件路径.
* *--round* (type=int, default=1, optional):模型运行次数。
* *--validate* (optional): 是否需要验证运行结果与框架运行结果是否一致。
* *--caffe_env* (type=local/docker, default=docker, optional):当vaildate时,可以选择指定caffe环境,local表示本地,docker表示使用docker容器.
* *--restart_round* (type=int, default=1, optional):模型重启次数。
* *--check_gpu_out_of_memory* (optional): 是否需要检查gpu内存越界。
* *--vlog_level* (type=int[0-5], default=0, optional):详细日志级别.
.. warning::
run依赖于build命令.build完成以后才可以执行run命令
**benchmark**
* *--config* (type=str, default="", required):模型配置yaml文件路径.
.. warning::
benchmark依赖于build命令.
**通用参数**
.. list-table::
:widths: auto
:header-rows: 1
:align: left
* - argument(key)
- argument(value)
- default
- required
- commands
- explanation
* - --omp_num_threads
- int
- -1
- N
- run/benchmark
- number of threads
* - --cpu_affinity_policy
- int
- 1
- N
- run/benchmark
- 0:AFFINITY_NONE/1:AFFINITY_BIG_ONLY/2:AFFINITY_LITTLE_ONLY
* - --gpu_perf_hint
- int
- 3
- N
- run/benchmark
- 0:DEFAULT/1:LOW/2:NORMAL/3:HIGH
* - --gpu_perf_hint
- int
- 3
- N
- run/benchmark
- 0:DEFAULT/1:LOW/2:NORMAL/3:HIGH
* - --gpu_priority_hint
- int
- 3
- N
- run/benchmark
- 0:DEFAULT/1:LOW/2:NORMAL/3:HIGH
.. code:: sh
# print help message
# python tools/mace_tools.py --help
# --config 配置文件的路径
# --output_dir 编译结果的输出文件目录,默认为`./build`
# --round 调用`examples/mace_run`运行模型的次数,默认为`1`
# --tuning 对opencl的参数调参,该项通常只有开发人员用到,默认为`true`
# --mode 运行模式,包含build/run/validate/merge/all/benchmark,默认为`all`
python tools/mace_tools.py -h
python tools/mace_tools.py build -h
python tools/mace_tools.py run -h
python tools/mace_tools.py benchmark -h
# 仅编译模型和生成静态库
python tools/mace_tools.py
--config=models/config.yaml --mode=build
python tools/mace_tools.py
build --config=models/config.yaml
# 测试模型的运行时间
python tools/mace_tools.py
--config=models/config.yaml --mode=run --round=10
00
python tools/mace_tools.py
run --config=models/config.yaml --round=1
00
# 对比编译好的模型在mace上与直接使用tensorflow或者caffe运行的结果,相似度使用`余弦距离表示`
# 其中使用OpenCL设备,默认相似度大于等于`0.995`为通过;DSP设备下,相似度需要达到`0.930`。
python tools/mace_tools.py --config=models/config.yaml --mode=run --round=1000
# 将已编译好的多个模型合并成静态库
# 比如编译了8个模型,决定使用其中2个模型,这时候可以不重新build,直接修改全局配置文件,合并生成静态库
python tools/mace_tools.py --config=models/config.yaml --mode=merge
# 运行以上所有项(可用于测试速度,建议 round=20)
python tools/mace_tools.py --config=models/config.yaml --mode=all --round=1000
python tools/mace_tools.py run --config=models/config.yaml --validate
# 模型Benchmark:查看每个Op的运行时间
python tools/mace_tools.py
--config=models/config.yaml --mode=benchmark
python tools/mace_tools.py
benchmark --config=models/config.yaml
# 查看模型运行时占用内存(如果有多个模型,可能需要注释掉一部分配置,只剩一个模型的配置)
python tools/mace_tools.py
--config=models/config.yaml --mode=run
--round=10000 &
python tools/mace_tools.py
run --config=models/config.yaml
--round=10000 &
adb shell dumpsys meminfo | grep mace_run
sleep 10
kill %1
...
...
@@ -211,21 +284,34 @@ Caffe目前只支持最新版本,旧版本请使用Caffe的工具进行升级
通过前面的步骤,我们得到了包含业务模型的库文件。在业务代码中,我们只需要引入下面3组文件(\ ``./build/``\ 是默认的编译结果输出目录):
头文件(包含mace.h和各个模型的头文件): \*
``./build/${project_name}/${target_abi}/include/mace/public/*.h``
**头文件**
* ``./build/${library_name}/include/mace/public/*.h``
**静态库**
* ``./build/${library_name}/library/${target_abi}/*.a``
**动态库**
* ``./build/${library_name}/library/${target_abi}/libhexagon_controller.so``
.. note::
仅编译的模型中包含dsp模式时用到
**模型文件**
* ``./build/${library_name}/model/${MODEL_TAG}.pb``
* ``./build/${library_name}/model/${MODEL_TAG}.data``
.. note::
静态库(包含mace engine、opencl和模型相关库): \*
``./build/${project_name}/${target_abi}/*.a``
pb文件紧当模型build_type设置为proto时才会产生。
动态库(仅编译的模型中包含dsp模式时用到): \*
``./build/${project_name}/${target_abi}/libhexagon_controller.so``
模型数据文件(仅在EMBED\_MODEL\_DATA=0时产生): \
*
``./build/${project_name}/data/${MODEL_TAG}.data
``
**库文件tar包*
*
* ``./build/${library_name}/libmace_${library_name}.tar.gz
``
编译过程中间文件: \* ``./build/${project_name}/build/``
.. note::
库文件tar包: \* ``./build/${project_name}/${project_name}.tar.gz``
该文件包含了上述所有文件,可以发布使用。
5. 使用
...
...
docs/getting_started/models/demo_app_models.yaml
浏览文件 @
ced4a49d
# 配置文件名会被用作生成库的名称:libmace-${filename}.a
# 库的名字
library_name
:
library_name
# 配置文件名会被用作生成库的名称:libmace-${library_name}.a
target_abis
:
[
armeabi-v7a
,
arm64-v8a
]
# 具体机型的soc编号,可以使用`adb shell getprop | grep ro.board.platform | cut -d [ -f3 | cut -d ] -f1`获取
target_socs
:
[
msm8998
]
embed_model_data
:
1
build_type
:
code
# 模型build类型。code表示将模型转为代码,proto表示将模型转为protobuf文件
models
:
# 一个配置文件可以包含多个模型的配置信息,最终生成的库中包含多个模型
first_net
:
# 模型的标签,在调度模型的时候,会用这个变量
first_net
:
# 模型的标签,在调度模型的时候,会用这个变量
,必须唯一
platform
:
tensorflow
model_file_path
:
path/to/model64.pb
# also support http:// and https://
model_sha256_checksum
:
7f7462333406e7dea87222737590ebb7d94490194d2f21a7d72bafa87e64e9f9
input_nodes
:
input_node
output_nodes
:
output_node
input_shapes
:
1,64,64,3
output_shapes
:
1,64,64,2
subgraphs
:
-
input_tensors
:
input_node
input_shapes
:
1,64,64,3
output_tensors
:
output_node
output_shapes
:
1,64,64,2
runtime
:
gpu
data_type
:
fp16_fp32
limit_opencl_kernel_time
:
0
dsp
_mode
:
0
nnlib_graph
_mode
:
0
obfuscate
:
1
fast_conv
:
0
winograd
:
0
input_files
:
-
path/to/input_files
# support http://
second_net
:
...
...
@@ -25,22 +30,23 @@ models: # 一个配置文件可以包含多个模型的配置信息,最终生
weight_file_path
:
path/to/weight.caffemodel
model_sha256_checksum
:
05d92625809dc9edd6484882335c48c043397aed450a168d75eb8b538e86881a
weight_sha256_checksum
:
05d92625809dc9edd6484882335c48c043397aed450a168d75eb8b538e86881a
input_nodes
:
-
input_node0
-
input_node1
output_nodes
:
-
output_node0
-
output_node1
input_shapes
:
-
1,256,256,3
-
1,128,128,3
output_shapes
:
-
1,256,256,2
-
1,1,1,2
subgraphs
:
-
input_tensors
:
-
input_node0
-
input_node1
input_shapes
:
-
1,256,256,3
-
1,128,128,3
output_tensors
:
-
output_node0
-
output_node1
output_shapes
:
-
1,256,256,2
-
1,1,1,2
runtime
:
cpu
limit_opencl_kernel_time
:
1
dsp
_mode
:
0
nnlib_graph
_mode
:
0
obfuscate
:
1
fast_conv
:
0
winograd
:
0
input_files
:
-
path/to/input_files
# support http://
mace/benchmark/BUILD
浏览文件 @
ced4a49d
...
...
@@ -2,8 +2,6 @@
# Examples
load
(
"//mace:mace.bzl"
,
"if_production_mode"
,
"if_not_production_mode"
,
"if_hexagon_enabled"
,
"if_openmp_enabled"
,
"if_android"
,
...
...
mace/core/BUILD
浏览文件 @
ced4a49d
...
...
@@ -12,8 +12,6 @@ load(
"if_android"
,
"if_hexagon_enabled"
,
"if_not_hexagon_enabled"
,
"if_production_mode"
,
"if_not_production_mode"
,
"if_openmp_enabled"
,
"if_neon_enabled"
,
)
...
...
@@ -63,17 +61,13 @@ cc_library(
]),
deps
=
[
"//mace/codegen:generated_version"
,
"//mace/codegen:generated_tuning_params"
,
"//mace/proto:mace_cc"
,
"//mace/utils"
,
]
+
if_android
([
":opencl_headers"
,
"//mace/codegen:generated_opencl"
,
"@half//:half"
,
])
+
if_production_mode
([
"//mace/codegen:generated_tuning_params"
,
"//mace/utils:utils_prod"
,
])
+
if_not_production_mode
([
"//mace/utils:utils_dev"
,
])
+
if_hexagon_enabled
([
"//third_party/nnlib:libhexagon"
,
]),
...
...
mace/core/mace.cc
浏览文件 @
ced4a49d
...
...
@@ -284,16 +284,16 @@ const unsigned char *LoadModelData(const std::string &model_data_file,
const
size_t
&
data_size
)
{
int
fd
=
open
(
model_data_file
.
c_str
(),
O_RDONLY
);
MACE_CHECK
(
fd
>=
0
,
"Failed to open model data file "
,
model_data_file
,
", error code: "
,
errno
);
model_data_file
,
", error code: "
,
strerror
(
errno
)
);
const
unsigned
char
*
model_data
=
static_cast
<
const
unsigned
char
*>
(
mmap
(
nullptr
,
data_size
,
PROT_READ
,
MAP_PRIVATE
,
fd
,
0
));
MACE_CHECK
(
model_data
!=
MAP_FAILED
,
"Failed to map model data file "
,
model_data_file
,
", error code: "
,
errno
);
model_data_file
,
", error code: "
,
strerror
(
errno
)
);
int
ret
=
close
(
fd
);
MACE_CHECK
(
ret
==
0
,
"Failed to close model data file "
,
model_data_file
,
", error code: "
,
errno
);
model_data_file
,
", error code: "
,
strerror
(
errno
)
);
return
model_data
;
}
...
...
@@ -302,7 +302,8 @@ void UnloadModelData(const unsigned char *model_data,
const
size_t
&
data_size
)
{
int
ret
=
munmap
(
const_cast
<
unsigned
char
*>
(
model_data
),
data_size
);
MACE_CHECK
(
ret
==
0
,
"Failed to unmap model data file, error code: "
,
errno
);
MACE_CHECK
(
ret
==
0
,
"Failed to unmap model data file, error code: "
,
strerror
(
errno
));
}
MaceStatus
CreateMaceEngineFromProto
(
...
...
mace/kernels/opencl/concat.cc
浏览文件 @
ced4a49d
...
...
@@ -215,6 +215,7 @@ static MaceStatus ConcatN(cl::Kernel *kernel,
(
*
kernel_error
)
->
UnMap
();
}
if
(
runtime
->
is_profiling_enabled
())
{
event
.
wait
();
CallStats
tmp_stats
;
runtime
->
GetCallStats
(
event
,
&
tmp_stats
);
call_stats
.
start_micros
=
...
...
@@ -223,8 +224,7 @@ static MaceStatus ConcatN(cl::Kernel *kernel,
}
}
if
(
future
!=
nullptr
)
{
future
->
wait_fn
=
[
runtime
,
event
,
call_stats
](
CallStats
*
stats
)
{
event
.
wait
();
future
->
wait_fn
=
[
runtime
,
call_stats
](
CallStats
*
stats
)
{
if
(
stats
!=
nullptr
)
{
stats
->
start_micros
=
call_stats
.
start_micros
;
stats
->
end_micros
=
stats
->
start_micros
+
call_stats
.
end_micros
;
...
...
mace/kernels/opencl/helper.cc
浏览文件 @
ced4a49d
...
...
@@ -209,8 +209,9 @@ std::string DtToUpstreamCLCMDDt(const DataType dt) {
std
::
vector
<
uint32_t
>
Default3DLocalWS
(
const
uint32_t
*
gws
,
const
uint32_t
kwg_size
)
{
std
::
vector
<
uint32_t
>
lws
(
4
,
0
);
uint64_t
cache_size
=
OpenCLRuntime
::
Global
()
->
device_global_mem_cache_size
();
uint32_t
base
=
cache_size
/
kBaseGPUMemCacheSize
;
uint64_t
cache_size
=
OpenCLRuntime
::
Global
()
->
device_global_mem_cache_size
();
uint32_t
base
=
std
::
max
<
uint32_t
>
(
cache_size
/
kBaseGPUMemCacheSize
,
1
);
lws
[
1
]
=
std
::
min
<
uint32_t
>
(
gws
[
1
],
kwg_size
);
lws
[
2
]
=
std
::
min
<
uint32_t
>
(
std
::
min
<
uint32_t
>
(
gws
[
2
],
base
),
kwg_size
/
lws
[
1
]);
...
...
mace/kernels/opencl/slice.cc
浏览文件 @
ced4a49d
...
...
@@ -115,6 +115,7 @@ MaceStatus SliceFunctor<DeviceType::GPU, T>::operator()(
kernel_error_
->
UnMap
();
}
if
(
runtime
->
is_profiling_enabled
())
{
event
.
wait
();
CallStats
tmp_stats
;
runtime
->
GetCallStats
(
event
,
&
tmp_stats
);
call_stats
.
start_micros
=
...
...
@@ -123,8 +124,7 @@ MaceStatus SliceFunctor<DeviceType::GPU, T>::operator()(
}
}
if
(
future
!=
nullptr
)
{
future
->
wait_fn
=
[
runtime
,
event
,
call_stats
](
CallStats
*
stats
)
{
event
.
wait
();
future
->
wait_fn
=
[
runtime
,
call_stats
](
CallStats
*
stats
)
{
if
(
stats
!=
nullptr
)
{
stats
->
start_micros
=
call_stats
.
start_micros
;
stats
->
end_micros
=
stats
->
start_micros
+
call_stats
.
end_micros
;
...
...
mace/mace.bzl
浏览文件 @
ced4a49d
...
...
@@ -24,18 +24,6 @@ def if_android_arm64(a):
"//conditions:default"
:
[],
})
def
if_production_mode
(
a
):
return
select
({
"//mace:production_mode"
:
a
,
"//conditions:default"
:
[],
})
def
if_not_production_mode
(
a
):
return
select
({
"//mace:production_mode"
:
[],
"//conditions:default"
:
a
,
})
def
if_neon_enabled
(
a
):
return
select
({
"//mace:neon_enabled"
:
a
,
...
...
mace/python/tools/BUILD
浏览文件 @
ced4a49d
...
...
@@ -3,7 +3,6 @@ py_library(
srcs
=
[
"convert_util.py"
,
"graph_util.py"
,
"tensor_util.py"
,
"tf_dsp_converter_lib.py"
,
"converter_tool/base_converter.py"
,
"converter_tool/shape_inference.py"
,
...
...
@@ -20,9 +19,9 @@ py_library(
)
py_library
(
name
=
"
source_convert
er_lib"
,
name
=
"
model_sav
er_lib"
,
srcs
=
[
"
source_converter_lib
.py"
,
"
model_saver
.py"
,
],
srcs_version
=
"PY2AND3"
,
deps
=
[
...
...
@@ -45,7 +44,7 @@ py_binary(
srcs_version
=
"PY2AND3"
,
deps
=
[
":converter_lib"
,
":
source_convert
er_lib"
,
":
model_sav
er_lib"
,
"@six_archive//:six"
,
],
)
mace/python/tools/converter.py
浏览文件 @
ced4a49d
...
...
@@ -21,8 +21,7 @@ import copy
from
mace.proto
import
mace_pb2
from
mace.python.tools
import
tf_dsp_converter_lib
from
mace.python.tools
import
memory_optimizer
from
mace.python.tools
import
source_converter_lib
from
mace.python.tools
import
tensor_util
from
mace.python.tools
import
model_saver
from
mace.python.tools.converter_tool
import
base_converter
as
cvt
from
mace.python.tools.converter_tool
import
tensorflow_converter
from
mace.python.tools.converter_tool
import
caffe_converter
...
...
@@ -42,6 +41,20 @@ device_type_map = {'cpu': cvt.DeviceType.CPU.value,
'dsp'
:
cvt
.
DeviceType
.
HEXAGON
.
value
}
def
parse_data_type
(
data_type
,
device_type
):
if
device_type
==
cvt
.
DeviceType
.
GPU
.
value
:
if
data_type
==
'fp32_fp32'
:
return
mace_pb2
.
DT_FLOAT
else
:
return
mace_pb2
.
DT_HALF
elif
device_type
==
cvt
.
DeviceType
.
CPU
.
value
:
return
mace_pb2
.
DT_FLOAT
elif
device_type
==
cvt
.
DeviceType
.
HEXAGON
.
value
:
return
mace_pb2
.
DT_UINT8
else
:
print
(
"Invalid device type: "
+
device_type
)
def
file_checksum
(
fname
):
hash_func
=
hashlib
.
sha256
()
with
open
(
fname
,
"rb"
)
as
f
:
...
...
@@ -82,7 +95,7 @@ def main(unused_args):
if
FLAGS
.
platform
not
in
[
'tensorflow'
,
'caffe'
]:
print
(
"platform %s is not supported."
%
FLAGS
.
platform
)
sys
.
exit
(
-
1
)
if
FLAGS
.
runtime
not
in
[
'cpu'
,
'gpu'
,
'dsp'
,
''
]:
if
FLAGS
.
runtime
not
in
[
'cpu'
,
'gpu'
,
'dsp'
,
'
cpu+gpu
'
]:
print
(
"runtime %s is not supported."
%
FLAGS
.
runtime
)
sys
.
exit
(
-
1
)
...
...
@@ -114,7 +127,6 @@ def main(unused_args):
output_node
.
name
=
output_node_names
[
i
]
option
.
add_output_node
(
output_node
)
print
(
"Convert model to mace model."
)
if
FLAGS
.
platform
==
'tensorflow'
:
converter
=
tensorflow_converter
.
TensorflowConverter
(
option
,
FLAGS
.
model_file
)
...
...
@@ -122,24 +134,18 @@ def main(unused_args):
converter
=
caffe_converter
.
CaffeConverter
(
option
,
FLAGS
.
model_file
,
FLAGS
.
weight_file
)
else
:
print
(
"Mace do not support platorm %s yet."
&
FLAGS
.
platform
)
exit
(
1
)
output_graph_def
=
converter
.
run
()
if
FLAGS
.
gpu_data_type
==
'half'
:
gpu_data_type
=
mace_pb2
.
DT_HALF
else
:
gpu_data_type
=
mace_pb2
.
DT_FLOAT
device_data_type_map
=
{
cvt
.
DeviceType
.
CPU
.
value
:
mace_pb2
.
DT_FLOAT
,
cvt
.
DeviceType
.
GPU
.
value
:
gpu_data_type
,
cvt
.
DeviceType
.
HEXAGON
.
value
:
mace_pb2
.
DT_UINT8
}
print
(
"Transform model to one that can better run on device"
)
if
not
FLAGS
.
runtime
:
if
FLAGS
.
runtime
==
'cpu+gpu'
:
cpu_graph_def
=
copy
.
deepcopy
(
output_graph_def
)
option
.
device
=
cvt
.
DeviceType
.
CPU
.
value
option
.
data_type
=
device_data_type_map
[
cvt
.
DeviceType
.
CPU
.
value
]
option
.
data_type
=
parse_data_type
(
FLAGS
.
data_type
,
cvt
.
DeviceType
.
CPU
.
value
)
option
.
disable_transpose_filters
()
mace_cpu_transformer
=
transformer
.
Transformer
(
option
,
cpu_graph_def
)
...
...
@@ -149,7 +155,8 @@ def main(unused_args):
print
"CPU memory optimization done."
option
.
device
=
cvt
.
DeviceType
.
GPU
.
value
option
.
data_type
=
device_data_type_map
[
cvt
.
DeviceType
.
GPU
.
value
]
option
.
data_type
=
parse_data_type
(
FLAGS
.
data_type
,
cvt
.
DeviceType
.
GPU
.
value
)
option
.
enable_transpose_filters
()
mace_gpu_transformer
=
transformer
.
Transformer
(
option
,
output_graph_def
)
...
...
@@ -165,7 +172,8 @@ def main(unused_args):
print
"Merge done"
else
:
option
.
device
=
device_type_map
[
FLAGS
.
runtime
]
option
.
data_type
=
device_data_type_map
[
option
.
device
]
option
.
data_type
=
parse_data_type
(
FLAGS
.
data_type
,
option
.
device
)
mace_transformer
=
transformer
.
Transformer
(
option
,
output_graph_def
)
output_graph_def
=
mace_transformer
.
run
()
...
...
@@ -180,36 +188,13 @@ def main(unused_args):
print
"Memory optimization done."
if
FLAGS
.
obfuscate
:
tensor_util
.
obfuscate_name
(
output_graph_def
)
else
:
tensor_util
.
rename_tensor
(
output_graph_def
)
tensor_infos
,
model_data
=
tensor_util
.
get_tensor_info_and_model_data
(
output_graph_def
,
FLAGS
.
runtime
,
FLAGS
.
gpu_data_type
)
source_converter_lib
.
convert_to_source
(
output_graph_def
,
model_checksum
,
weight_checksum
,
FLAGS
.
template
,
FLAGS
.
obfuscate
,
FLAGS
.
model_tag
,
FLAGS
.
codegen_output
,
FLAGS
.
runtime
,
FLAGS
.
embed_model_data
,
FLAGS
.
winograd
,
FLAGS
.
model_load_type
,
tensor_infos
,
model_data
)
if
not
FLAGS
.
embed_model_data
:
output_dir
=
os
.
path
.
dirname
(
FLAGS
.
codegen_output
)
+
'/'
with
open
(
output_dir
+
FLAGS
.
model_tag
+
'.data'
,
"wb"
)
as
f
:
f
.
write
(
bytearray
(
model_data
))
if
FLAGS
.
model_load_type
==
'pb'
:
tensor_util
.
del_tensor_data
(
output_graph_def
,
FLAGS
.
runtime
,
FLAGS
.
gpu_data_type
)
tensor_util
.
update_tensor_data_type
(
output_graph_def
,
FLAGS
.
runtime
,
FLAGS
.
gpu_data_type
)
with
open
(
FLAGS
.
pb_output
,
"wb"
)
as
f
:
f
.
write
(
output_graph_def
.
SerializeToString
())
# with open(FLAGS.pb_output + '_txt', "wb") as f:
# # output_graph_def.ClearField('tensors')
# f.write(str(output_graph_def))
print
(
"Model conversion is completed."
)
model_saver
.
save_model
(
output_graph_def
,
model_checksum
,
weight_checksum
,
FLAGS
.
template_dir
,
FLAGS
.
obfuscate
,
FLAGS
.
model_tag
,
FLAGS
.
output_dir
,
FLAGS
.
runtime
,
FLAGS
.
embed_model_data
,
FLAGS
.
winograd
,
FLAGS
.
data_type
,
FLAGS
.
model_build_type
)
def
str2bool
(
v
):
...
...
@@ -244,15 +229,10 @@ def parse_args():
default
=
""
,
help
=
"Weight file sha256 checksum"
)
parser
.
add_argument
(
"--
codegen_output
"
,
"--
output_dir
"
,
type
=
str
,
default
=
""
,
help
=
"File to save the output graph to."
)
parser
.
add_argument
(
"--pb_output"
,
type
=
str
,
default
=
""
,
help
=
"File to save the mace model to."
)
parser
.
add_argument
(
"--runtime"
,
type
=
str
,
default
=
""
,
help
=
"Runtime: cpu/gpu/dsp"
)
parser
.
add_argument
(
...
...
@@ -263,7 +243,7 @@ def parse_args():
parser
.
add_argument
(
"--output_node"
,
type
=
str
,
default
=
"softmax"
,
help
=
"e.g., softmax"
)
parser
.
add_argument
(
"--template"
,
type
=
str
,
default
=
""
,
help
=
"template path"
)
"--template
_dir
"
,
type
=
str
,
default
=
""
,
help
=
"template path"
)
parser
.
add_argument
(
"--obfuscate"
,
type
=
str2bool
,
...
...
@@ -295,13 +275,16 @@ def parse_args():
default
=
True
,
help
=
"embed model data."
)
parser
.
add_argument
(
"--model_
loa
d_type"
,
"--model_
buil
d_type"
,
type
=
str
,
default
=
"
sourc
e"
,
help
=
"[
source|pb] Load models in generated `source`
code"
+
"or `
pb
` file."
)
default
=
"
cod
e"
,
help
=
"[
proto|code] build models to
code"
+
"or `
Protobuf
` file."
)
parser
.
add_argument
(
"--gpu_data_type"
,
type
=
str
,
default
=
"half"
,
help
=
"half/float"
)
"--data_type"
,
type
=
str
,
default
=
"fp16_fp32"
,
help
=
"fp16_fp32/fp32_fp32"
)
return
parser
.
parse_known_args
()
...
...
mace/python/tools/converter_tool/tensorflow_converter.py
浏览文件 @
ced4a49d
...
...
@@ -395,12 +395,6 @@ class TensorflowConverter(base_converter.ConverterInterface):
align_corners_arg
.
i
=
tf_op
.
get_attr
(
tf_align_corners
)
def
convert_space_batch
(
self
,
tf_op
):
print
"""You might want to try 'flatten_atrous_conv' in
transform graph to turn atrous conv2d into regular conv2d.
This may give you performance benefit on GPU.
(see https://github.com/tensorflow/tensorflow/blob/master/
tensorflow/tools/graph_transforms/README.md#flatten_atrous_conv)
"""
op
=
self
.
convert_general_op
(
tf_op
)
del
op
.
input
[
1
:]
...
...
mace/python/tools/converter_tool/transformer.py
浏览文件 @
ced4a49d
...
...
@@ -52,7 +52,7 @@ class Transformer(base_converter.ConverterInterface):
"""
def
__init__
(
self
,
option
,
model
):
# DO NOT reorder the following transformers
# DO NOT reorder the following transformers
' order
self
.
_registered_transformers_order
=
[
TransformerRule
.
REMOVE_USELESS_RESHAPE_OP
,
TransformerRule
.
REMOVE_IDENTITY_OP
,
...
...
@@ -940,8 +940,9 @@ class Transformer(base_converter.ConverterInterface):
op_def
.
type
=
MaceKeyword
.
mace_image_to_buffer
op_def
.
input
.
extend
([
output_node
.
name
])
op_def
.
output
.
extend
([
output_name
])
output_shape
=
op_def
.
output_shape
.
add
()
output_shape
.
dims
.
extend
(
output_node
.
shape
)
if
output_node
.
shape
:
output_shape
=
op_def
.
output_shape
.
add
()
output_shape
.
dims
.
extend
(
output_node
.
shape
)
arg
=
op_def
.
arg
.
add
()
arg
.
name
=
MaceKeyword
.
mace_buffer_type
...
...
mace/python/tools/encrypt_opencl_codegen.py
浏览文件 @
ced4a49d
...
...
@@ -73,8 +73,6 @@ def encrypt_opencl_codegen(cl_kernel_dir, output_path):
with
open
(
output_path
,
"w"
)
as
w_file
:
w_file
.
write
(
cpp_cl_encrypted_kernel
)
print
(
"Generate encrypted opencl source done!"
)
def
parse_args
():
"""Parses command line arguments."""
...
...
mace/python/tools/mace_engine_factory.h.jinja2
浏览文件 @
ced4a49d
...
...
@@ -25,7 +25,7 @@
namespace mace {
{% if model_type == '
sourc
e' %}
{% if model_type == '
cod
e' %}
{% for tag in model_tags %}
namespace {{tag}} {
...
...
mace/python/tools/mace_engine_factory_codegen.py
浏览文件 @
ced4a49d
...
...
@@ -25,7 +25,6 @@ def gen_mace_engine_factory(model_tags, template_dir, model_type, output_dir):
j2_env
=
Environment
(
loader
=
FileSystemLoader
(
template_dir
),
trim_blocks
=
True
)
# generate mace_run BUILD file
print
model_tags
template_name
=
'mace_engine_factory.h.jinja2'
source
=
j2_env
.
get_template
(
template_name
).
render
(
model_tags
=
model_tags
,
...
...
mace/python/tools/
tensor_util
.py
→
mace/python/tools/
model_saver
.py
浏览文件 @
ced4a49d
...
...
@@ -12,13 +12,26 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import
hashlib
import
datetime
import
os
import
uuid
import
numpy
as
np
import
hashlib
from
enum
import
Enum
from
mace.proto
import
mace_pb2
from
jinja2
import
Environment
,
FileSystemLoader
GENERATED_NAME
=
set
()
GPUDataTypeStrs
=
[
"fp16_fp32"
,
"fp32_fp32"
,
]
GPUDataType
=
\
Enum
(
'GPUDataType'
,
[(
ele
,
ele
)
for
ele
in
GPUDataTypeStrs
],
type
=
str
)
def
generate_obfuscated_name
(
namespace
,
name
):
md5
=
hashlib
.
md5
()
...
...
@@ -104,70 +117,199 @@ def rename_tensor(net_def):
op
.
output
[
i
]
=
tensor_map
[
op
.
output
[
i
]]
def
stringfy
(
value
):
return
', '
.
join
(
'"{0}"'
.
format
(
w
)
for
w
in
value
)
class
TensorInfo
:
def
__init__
(
self
,
id
,
t
,
runtime
,
gpu_data_type
):
def
__init__
(
self
,
id
,
t
ensor
):
self
.
id
=
id
self
.
data_type
=
mace_pb2
.
DataType
.
Name
(
t
.
data_type
)
if
t
.
data_type
==
mace_pb2
.
DT_FLOAT
:
if
runtime
==
'gpu'
and
gpu_data_type
==
'half'
:
self
.
data_type
=
mace_pb2
.
DT_HALF
self
.
data
=
bytearray
(
np
.
array
(
t
.
float_data
).
astype
(
np
.
float16
).
tobytes
())
else
:
self
.
data_type
=
mace_pb2
.
DT_FLOAT
self
.
data
=
bytearray
(
np
.
array
(
t
.
float_data
).
astype
(
np
.
float32
).
tobytes
())
elif
t
.
data_type
==
mace_pb2
.
DT_INT32
:
self
.
data_type
=
tensor
.
data_type
if
tensor
.
data_type
==
mace_pb2
.
DT_HALF
:
self
.
data_type
=
mace_pb2
.
DT_HALF
self
.
data
=
bytearray
(
np
.
array
(
t
.
int32_data
).
astype
(
np
.
int32
).
tobytes
())
elif
t
.
data_type
==
mace_pb2
.
DT_UINT8
:
np
.
array
(
tensor
.
float_data
).
astype
(
np
.
float16
).
tobytes
())
elif
tensor
.
data_type
==
mace_pb2
.
DT_FLOAT
:
self
.
data_type
=
mace_pb2
.
DT_FLOAT
self
.
data
=
bytearray
(
np
.
array
(
t
.
int32_data
).
astype
(
np
.
uint8
).
tolist
())
np
.
array
(
tensor
.
float_data
).
astype
(
np
.
float32
).
tobytes
())
elif
tensor
.
data_type
==
mace_pb2
.
DT_INT32
:
self
.
data
=
bytearray
(
np
.
array
(
tensor
.
int32_data
).
astype
(
np
.
int32
).
tobytes
())
elif
tensor
.
data_type
==
mace_pb2
.
DT_UINT8
:
self
.
data
=
bytearray
(
np
.
array
(
tensor
.
int32_data
).
astype
(
np
.
uint8
).
tolist
())
else
:
raise
Exception
(
'Tensor data type %s not supported'
%
t
.
data_type
)
raise
Exception
(
'Tensor data type %s not supported'
%
tensor
.
data_type
)
def
get_tensor_info_and_model_data
(
net_def
,
runtime
,
gpu_data_type
):
model_data
=
[]
def
update_tensor_infos
(
net_def
,
runtime
,
data_type
):
offset
=
0
counter
=
0
tensor_infos
=
[]
for
t
in
net_def
.
tensors
:
tensor_info
=
TensorInfo
(
counter
,
t
,
runtime
,
gpu_data_type
)
for
tensor
in
net_def
.
tensors
:
# update data_type
if
tensor
.
data_type
==
mace_pb2
.
DT_FLOAT
and
runtime
==
'gpu'
\
and
data_type
==
GPUDataType
.
fp16_fp32
:
tensor
.
data_type
=
mace_pb2
.
DT_HALF
# Add offset and data_size
tensor_info
=
TensorInfo
(
counter
,
tensor
)
tensor_infos
.
append
(
tensor_info
)
# align
if
tensor_info
.
data_type
!=
'DT_UINT8'
and
offset
%
4
!=
0
:
padding
=
4
-
offset
%
4
model_data
.
extend
(
bytearray
([
0
]
*
padding
))
offset
+=
padding
if
t
.
data_type
==
mace_pb2
.
DT_FLOAT
:
t
.
data_size
=
len
(
t
.
float_data
)
elif
t
.
data_type
==
mace_pb2
.
DT_INT32
:
t
.
data_size
=
len
(
t
.
int32_data
)
elif
t
.
data_type
==
mace_pb2
.
DT_UINT8
:
t
.
data_size
=
len
(
t
.
int32_data
)
t
.
offset
=
offset
if
tensor
.
data_type
==
mace_pb2
.
DT_FLOAT
\
or
tensor
.
data_type
==
mace_pb2
.
DT_HALF
:
tensor
.
data_size
=
len
(
tensor
.
float_data
)
elif
tensor
.
data_type
==
mace_pb2
.
DT_INT32
:
tensor
.
data_size
=
len
(
tensor
.
int32_data
)
elif
tensor
.
data_type
==
mace_pb2
.
DT_UINT8
:
tensor
.
data_size
=
len
(
tensor
.
int32_data
)
tensor
.
offset
=
offset
offset
+=
len
(
tensor_info
.
data
)
counter
+=
1
def
extract_model_data
(
net_def
):
model_data
=
[]
offset
=
0
counter
=
0
for
tensor
in
net_def
.
tensors
:
tensor_info
=
TensorInfo
(
counter
,
tensor
)
# align
if
tensor_info
.
data_type
!=
mace_pb2
.
DT_UINT8
and
offset
%
4
!=
0
:
padding
=
4
-
offset
%
4
model_data
.
extend
(
bytearray
([
0
]
*
padding
))
offset
+=
padding
model_data
.
extend
(
tensor_info
.
data
)
offset
+=
len
(
tensor_info
.
data
)
counter
+=
1
return
model_data
return
tensor_infos
,
model_data
def
save_model_data
(
net_def
,
model_tag
,
output_dir
):
model_data
=
extract_model_data
(
net_def
)
# generate tensor data
with
open
(
output_dir
+
model_tag
+
'.data'
,
"wb"
)
as
f
:
f
.
write
(
bytearray
(
model_data
))
def
del_tensor_data
(
net_def
,
runtime
,
gpu_data_type
):
for
t
in
net_def
.
tensors
:
if
t
.
data_type
==
mace_pb2
.
DT_FLOAT
:
del
t
.
float_data
[:]
elif
t
.
data_type
==
mace_pb2
.
DT_INT32
:
del
t
.
int32_data
[:]
elif
t
.
data_type
==
mace_pb2
.
DT_UINT8
:
del
t
.
int32_data
[:]
def
save_model_to_proto
(
net_def
,
model_tag
,
output_dir
):
for
tensor
in
net_def
.
tensors
:
if
tensor
.
data_type
==
mace_pb2
.
DT_FLOAT
\
or
tensor
.
data_type
==
mace_pb2
.
DT_HALF
:
del
tensor
.
float_data
[:]
elif
tensor
.
data_type
==
mace_pb2
.
DT_INT32
:
del
tensor
.
int32_data
[:]
elif
tensor
.
data_type
==
mace_pb2
.
DT_UINT8
:
del
tensor
.
int32_data
[:]
proto_file_path
=
output_dir
+
model_tag
+
'.pb'
with
open
(
proto_file_path
,
"wb"
)
as
f
:
f
.
write
(
net_def
.
SerializeToString
())
with
open
(
proto_file_path
+
'_txt'
,
"wb"
)
as
f
:
f
.
write
(
str
(
net_def
))
def
update_tensor_data_type
(
net_def
,
runtime
,
gpu_data_type
):
for
t
in
net_def
.
tensors
:
if
t
.
data_type
==
mace_pb2
.
DT_FLOAT
and
runtime
==
'gpu'
\
and
gpu_data_type
==
'half'
:
t
.
data_type
=
mace_pb2
.
DT_HALF
def
save_model_to_code
(
net_def
,
model_tag
,
runtime
,
template_dir
,
output_dir
,
embed_model_data
,
model_checksum
,
weight_checksum
,
obfuscate
,
winograd_conv
):
# Create the jinja2 environment.
j2_env
=
Environment
(
loader
=
FileSystemLoader
(
template_dir
),
trim_blocks
=
True
)
j2_env
.
filters
[
'stringfy'
]
=
stringfy
# generate tensor source files
template_name
=
'tensor_source.jinja2'
counter
=
0
for
tensor
in
net_def
.
tensors
:
tensor_info
=
TensorInfo
(
counter
,
tensor
)
# convert tensor
source
=
j2_env
.
get_template
(
template_name
).
render
(
tensor_info
=
tensor_info
,
tensor
=
tensor
,
tag
=
model_tag
,
)
with
open
(
output_dir
+
'tensor'
+
str
(
counter
)
+
'.cc'
,
"wb"
)
as
f
:
f
.
write
(
source
)
counter
+=
1
# generate tensor data
model_data
=
extract_model_data
(
net_def
)
template_name
=
'tensor_data.jinja2'
source
=
j2_env
.
get_template
(
template_name
).
render
(
tag
=
model_tag
,
embed_model_data
=
embed_model_data
,
model_data_size
=
len
(
model_data
),
model_data
=
model_data
)
with
open
(
output_dir
+
'tensor_data'
+
'.cc'
,
"wb"
)
as
f
:
f
.
write
(
source
)
# generate op source files
template_name
=
'operator.jinja2'
counter
=
0
op_size
=
len
(
net_def
.
op
)
for
start
in
range
(
0
,
op_size
,
10
):
source
=
j2_env
.
get_template
(
template_name
).
render
(
start
=
start
,
end
=
min
(
start
+
10
,
op_size
),
net
=
net_def
,
tag
=
model_tag
,
runtime
=
runtime
,
)
with
open
(
output_dir
+
'op'
+
str
(
counter
)
+
'.cc'
,
"wb"
)
as
f
:
f
.
write
(
source
)
counter
+=
1
# generate model source files
build_time
=
datetime
.
datetime
.
now
().
strftime
(
'%Y-%m-%d %H:%M:%S'
)
template_name
=
'model.jinja2'
checksum
=
model_checksum
if
weight_checksum
is
not
None
:
checksum
=
"{},{}"
.
format
(
model_checksum
,
weight_checksum
)
source
=
j2_env
.
get_template
(
template_name
).
render
(
net
=
net_def
,
tag
=
model_tag
,
runtime
=
runtime
,
obfuscate
=
obfuscate
,
embed_model_data
=
embed_model_data
,
winograd_conv
=
winograd_conv
,
checksum
=
checksum
,
build_time
=
build_time
)
with
open
(
output_dir
+
'model.cc'
,
"wb"
)
as
f
:
f
.
write
(
source
)
# generate model header file
template_name
=
'model_header.jinja2'
source
=
j2_env
.
get_template
(
template_name
).
render
(
tag
=
model_tag
,
)
with
open
(
output_dir
+
model_tag
+
'.h'
,
"wb"
)
as
f
:
f
.
write
(
source
)
def
save_model
(
net_def
,
model_checksum
,
weight_checksum
,
template_dir
,
obfuscate
,
model_tag
,
output_dir
,
runtime
,
embed_model_data
,
winograd_conv
,
data_type
,
model_build_type
):
if
obfuscate
:
obfuscate_name
(
net_def
)
else
:
rename_tensor
(
net_def
)
output_dir
=
output_dir
+
'/'
# update tensor type
update_tensor_infos
(
net_def
,
runtime
,
data_type
)
if
model_build_type
==
'proto'
or
not
embed_model_data
:
save_model_data
(
net_def
,
model_tag
,
output_dir
)
if
model_build_type
==
'proto'
:
save_model_to_proto
(
net_def
,
model_tag
,
output_dir
)
else
:
save_model_to_code
(
net_def
,
model_tag
,
runtime
,
template_dir
,
output_dir
,
embed_model_data
,
model_checksum
,
weight_checksum
,
obfuscate
,
winograd_conv
)
mace/python/tools/operator.jinja2
浏览文件 @
ced4a49d
...
...
@@ -94,10 +94,11 @@ void CreateOperator{{i}}(mace::OperatorDef *op) {
{% endfor %}
{% if net.op[i].output_shape|length > 0 %}
op->mutable_output_shape()->Reserve({{ net.op[i].output_shape|length }});
mace::OutputShape * output_shape = nullptr;
{% for shape in net.op[i].output_shape %}
{% if shape.dims|length > 0 %}
{% if shape.dims|length > 0 %}
output_shape = op->add_output_shape();
output_shape->mutable_dims()->Reserve({{ shape.dims|length }});
...
...
@@ -105,8 +106,9 @@ void CreateOperator{{i}}(mace::OperatorDef *op) {
output_shape->add_dims({{ dim }});
{% endfor %}
{% endif %}
{% endif %}
{% endfor %}
{% endif %}
std::vector<int> output_types_int({ {{ net.op[i].output_type | join(', ') }} });
std::vector<mace::DataType> output_types({{ net.op[i].output_type | length }});
...
...
mace/python/tools/source_converter_lib.py
已删除
100644 → 0
浏览文件 @
fe0cdf27
# Copyright 2018 Xiaomi, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import
datetime
import
os
from
mace.proto
import
mace_pb2
from
jinja2
import
Environment
,
FileSystemLoader
def
stringfy
(
value
):
return
', '
.
join
(
'"{0}"'
.
format
(
w
)
for
w
in
value
)
def
convert_to_source
(
net_def
,
model_checksum
,
weight_checksum
,
template_dir
,
obfuscate
,
model_tag
,
output
,
runtime
,
embed_model_data
,
winograd_conv
,
model_load_type
,
tensor_infos
,
model_data
):
# Capture our current directory
print
template_dir
# Create the jinja2 environment.
j2_env
=
Environment
(
loader
=
FileSystemLoader
(
template_dir
),
trim_blocks
=
True
)
j2_env
.
filters
[
'stringfy'
]
=
stringfy
output_dir
=
os
.
path
.
dirname
(
output
)
+
'/'
# generate tensor source files
template_name
=
'tensor_source.jinja2'
for
i
in
range
(
len
(
net_def
.
tensors
)):
if
model_load_type
==
'source'
:
source
=
j2_env
.
get_template
(
template_name
).
render
(
tensor_info
=
tensor_infos
[
i
],
tensor
=
net_def
.
tensors
[
i
],
tag
=
model_tag
,
)
with
open
(
output_dir
+
'tensor'
+
str
(
i
)
+
'.cc'
,
"wb"
)
as
f
:
f
.
write
(
source
)
if
model_load_type
==
'source'
:
# generate tensor data
template_name
=
'tensor_data.jinja2'
source
=
j2_env
.
get_template
(
template_name
).
render
(
tag
=
model_tag
,
embed_model_data
=
embed_model_data
,
model_data_size
=
len
(
model_data
),
model_data
=
model_data
)
with
open
(
output_dir
+
'tensor_data'
+
'.cc'
,
"wb"
)
as
f
:
f
.
write
(
source
)
# generate op source files
template_name
=
'operator.jinja2'
counter
=
0
op_size
=
len
(
net_def
.
op
)
for
start
in
range
(
0
,
op_size
,
10
):
source
=
j2_env
.
get_template
(
template_name
).
render
(
start
=
start
,
end
=
min
(
start
+
10
,
op_size
),
net
=
net_def
,
tag
=
model_tag
,
runtime
=
runtime
,
)
with
open
(
output_dir
+
'op'
+
str
(
counter
)
+
'.cc'
,
"wb"
)
as
f
:
f
.
write
(
source
)
counter
+=
1
# generate model source files
build_time
=
datetime
.
datetime
.
now
().
strftime
(
'%Y-%m-%d %H:%M:%S'
)
template_name
=
'model.jinja2'
checksum
=
model_checksum
if
weight_checksum
is
not
None
:
checksum
=
"{},{}"
.
format
(
model_checksum
,
weight_checksum
)
source
=
j2_env
.
get_template
(
template_name
).
render
(
net
=
net_def
,
tag
=
model_tag
,
runtime
=
runtime
,
obfuscate
=
obfuscate
,
embed_model_data
=
embed_model_data
,
winograd_conv
=
winograd_conv
,
checksum
=
checksum
,
build_time
=
build_time
)
with
open
(
output
,
"wb"
)
as
f
:
f
.
write
(
source
)
# generate model header file
template_name
=
'model_header.jinja2'
source
=
j2_env
.
get_template
(
template_name
).
render
(
tag
=
model_tag
,
)
with
open
(
output_dir
+
model_tag
+
'.h'
,
"wb"
)
as
f
:
f
.
write
(
source
)
mace/utils/BUILD
浏览文件 @
ced4a49d
...
...
@@ -31,28 +31,6 @@ cc_library(
copts
=
[
"-Werror"
,
"-Wextra"
,
"-Wno-missing-field-initializers"
],
deps
=
[
"//mace/public"
,
],
)
cc_library
(
name
=
"utils_dev"
,
srcs
=
[
"tuner_development.cc"
,
],
copts
=
[
"-Werror"
,
"-Wextra"
,
"-Wno-missing-field-initializers"
],
deps
=
[
":utils"
,
],
)
cc_library
(
name
=
"utils_prod"
,
srcs
=
[
"tuner_production.cc"
,
],
copts
=
[
"-Werror"
,
"-Wextra"
,
"-Wno-missing-field-initializers"
],
deps
=
[
":utils"
,
"//mace/codegen:generated_tuning_params"
,
],
)
...
...
@@ -70,7 +48,7 @@ cc_test(
]),
linkstatic
=
1
,
deps
=
[
":utils
_dev
"
,
":utils"
,
"@gtest//:gtest"
,
"@gtest//:gtest_main"
,
],
...
...
mace/utils/tuner.h
浏览文件 @
ced4a49d
...
...
@@ -29,10 +29,6 @@
namespace
mace
{
extern
bool
GetTuningParams
(
const
char
*
path
,
std
::
unordered_map
<
std
::
string
,
std
::
vector
<
unsigned
int
>>
*
param_table
);
template
<
typename
param_type
>
class
Tuner
{
public:
...
...
@@ -74,9 +70,6 @@ class Tuner {
:
""
);
return
func
(
param_table_
[
obfucated_param_key
],
nullptr
,
nullptr
);
}
else
{
#ifndef MACE_DISABLE_NO_TUNING_WARNING
LOG
(
WARNING
)
<<
"Fallback to default parameter: "
<<
param_key
;
#endif
return
func
(
default_param
,
nullptr
,
nullptr
);
}
}
...
...
@@ -124,9 +117,16 @@ class Tuner {
}
inline
void
ReadRunParamters
()
{
bool
success
=
GetTuningParams
(
path_
,
&
param_table_
);
if
(
!
success
)
{
LOG
(
WARNING
)
<<
"Get run parameter failed."
;
extern
const
std
::
map
<
std
::
string
,
std
::
vector
<
unsigned
int
>>
kTuningParamsData
;
if
(
!
kTuningParamsData
.
empty
())
{
for
(
auto
it
=
kTuningParamsData
.
begin
();
it
!=
kTuningParamsData
.
end
();
++
it
)
{
param_table_
.
emplace
(
it
->
first
,
std
::
vector
<
unsigned
int
>
(
it
->
second
.
begin
(),
it
->
second
.
end
()));
}
}
else
{
LOG
(
INFO
)
<<
"There is no tuned parameters."
;
}
}
...
...
tools/bazel.rc
0 → 100644
浏览文件 @
ced4a49d
# Partially borrowed from tensorflow tools/bazel.rc
# By default, we don't distinct target and host platfroms.
# When doing cross compilation, use --config=cross_compile to distinct them.
build --distinct_host_configuration=false
build:cross_compile --distinct_host_configuration=true
build --verbose_failures
build --copt=-std=c++11
build --copt=-D_GLIBCXX_USE_C99_MATH_TR1
build --copt=-DMACE_OBFUSCATE_LITERALS
# Usage example: bazel build --config android
build:android --crosstool_top=//external:android/crosstool
build:android --host_crosstool_top=@bazel_tools//tools/cpp:toolchain
build:android --config=cross_compile
# Usage example: bazel build --config optimization
build:optimization -c opt
build:optimization --copt=-O3
build:optimization --strip=always
# Address sanitizer
build:asan --strip=never
build:asan --copt -fsanitize=address
build:asan --copt -D_FORTIFY_SOURCE
build:asan --copt -DADDRESS_SANITIZER
build:asan --copt -O0
build:asan --copt -g
build:asan --copt -fno-omit-frame-pointer
build:asan --linkopt -fsanitize=address
# Thread sanitizer
build:tsan --strip=never
build:tsan --copt -fsanitize=thread
build:tsan --copt -DTHREAD_SANITIZER
build:tsan --copt -DDYNAMIC_ANNOTATIONS_ENABLED=1
build:tsan --copt -DDYNAMIC_ANNOTATIONS_EXTERNAL_IMPL=1
build:tsan --copt -O0
build:tsan --copt -fno-omit-frame-pointer
build:tsan --linkopt -fsanitize=thread
# Memory sanitizer
build:msan --strip=never
build:msan --copt -fsanitize=memory
build:msan --copt -DADDRESS_SANITIZER
build:msan --copt -O0
build:msan --copt -fno-omit-frame-pointer
build:msan --linkopt -fsanitize=memory
# Undefined Behavior Sanitizer
build:ubsan --strip=never
build:ubsan --copt -fsanitize=undefined
build:ubsan --copt -O0
build:ubsan --copt -fno-omit-frame-pointer
build:ubsan --linkopt -fsanitize=undefined
build:ubsan --linkopt -lubsan
tools/bazel_adb_run.py
浏览文件 @
ced4a49d
...
...
@@ -95,21 +95,6 @@ def parse_args():
type
=
str2bool
,
default
=
False
,
help
=
"Whether to run the target"
)
parser
.
add_argument
(
"--valgrind"
,
type
=
bool
,
default
=
False
,
help
=
"Whether to use valgrind to check memory error."
)
parser
.
add_argument
(
"--valgrind_path"
,
type
=
str
,
default
=
"/data/local/tmp/valgrind"
,
help
=
"Valgrind install path."
)
parser
.
add_argument
(
"--valgrind_args"
,
type
=
str
,
default
=
""
,
help
=
"Valgrind command args."
)
parser
.
add_argument
(
"--args"
,
type
=
str
,
default
=
""
,
help
=
"Command args"
)
parser
.
add_argument
(
"--stdout_processor"
,
...
...
@@ -121,6 +106,10 @@ def parse_args():
type
=
str2bool
,
default
=
True
,
help
=
"Whether to use neon optimization"
)
parser
.
add_argument
(
'--address_sanitizer'
,
action
=
"store_true"
,
help
=
"Whether to enable AddressSanitizer"
)
return
parser
.
parse_known_args
()
...
...
@@ -145,16 +134,17 @@ def main(unused_args):
sh_commands
.
gen_encrypted_opencl_source
()
sh_commands
.
gen_compiled_opencl_source
()
sh_commands
.
gen_mace_version
()
sh_commands
.
gen_tuning_param_code
([])
strip
=
"always"
debug
=
False
if
FLAGS
.
valgrind
:
if
FLAGS
.
address_sanitizer
:
strip
=
"never"
debug
=
True
for
target_abi
in
target_abis
:
sh_commands
.
bazel_build
(
target
,
strip
=
strip
,
abi
=
target_abi
,
disable_no_tuning_warning
=
True
,
debug
=
debug
,
enable_neon
=
FLAGS
.
enable_neon
)
sh_commands
.
bazel_build
(
target
,
abi
=
target_abi
,
enable_neon
=
FLAGS
.
enable_neon
,
address_sanitizer
=
FLAGS
.
address_sanitizer
)
if
FLAGS
.
run_target
:
for
serialno
in
target_devices
:
if
target_abi
not
in
set
(
...
...
@@ -162,28 +152,17 @@ def main(unused_args):
print
(
"Skip device %s which does not support ABI %s"
%
(
serialno
,
target_abi
))
continue
if
FLAGS
.
valgrind
:
stdouts
=
sh_commands
.
adb_run_valgrind
(
serialno
,
host_bin_path
,
bin_name
,
valgrind_path
=
FLAGS
.
valgrind_path
,
valgrind_args
=
FLAGS
.
valgrind_args
,
args
=
FLAGS
.
args
,
opencl_profiling
=
1
,
vlog_level
=
0
,
device_bin_path
=
"/data/local/tmp/mace"
,
out_of_range_check
=
1
)
else
:
stdouts
=
sh_commands
.
adb_run
(
serialno
,
host_bin_path
,
bin_name
,
args
=
FLAGS
.
args
,
opencl_profiling
=
1
,
vlog_level
=
0
,
device_bin_path
=
"/data/local/tmp/mace"
,
out_of_range_check
=
1
)
stdouts
=
sh_commands
.
adb_run
(
target_abi
,
serialno
,
host_bin_path
,
bin_name
,
args
=
FLAGS
.
args
,
opencl_profiling
=
1
,
vlog_level
=
0
,
device_bin_path
=
"/data/local/tmp/mace"
,
out_of_range_check
=
1
,
address_sanitizer
=
FLAGS
.
address_sanitizer
)
device_properties
=
sh_commands
.
adb_getprop_by_serialno
(
serialno
)
globals
()[
FLAGS
.
stdout_processor
](
stdouts
,
device_properties
,
...
...
tools/common.py
浏览文件 @
ced4a49d
...
...
@@ -13,23 +13,99 @@
# limitations under the License.
import
enum
import
logging
import
re
################################
# log
################################
def
init_logging
():
logger
=
logging
.
getLogger
(
'MACE'
)
logger
.
setLevel
(
logging
.
INFO
)
class
CMDColors
:
PURPLE
=
'
\033
[95m'
BLUE
=
'
\033
[94m'
GREEN
=
'
\033
[92m'
YELLOW
=
'
\033
[93m'
RED
=
'
\033
[91m'
ENDC
=
'
\033
[0m'
BOLD
=
'
\033
[1m'
UNDERLINE
=
'
\033
[4m'
ch
=
logging
.
StreamHandler
()
ch
.
setLevel
(
logging
.
INFO
)
formatter
=
logging
.
Formatter
(
'%(asctime)s [%(name)s] [%(levelname)s]: %(message)s'
)
ch
.
setFormatter
(
formatter
)
logger
.
addHandler
(
ch
)
class
MaceLogger
:
@
staticmethod
def
header
(
message
):
print
CMDColors
.
PURPLE
+
message
+
CMDColors
.
ENDC
@
staticmethod
def
summary
(
message
):
print
CMDColors
.
GREEN
+
message
+
CMDColors
.
ENDC
@
staticmethod
def
info
(
message
):
print
message
@
staticmethod
def
warning
(
message
):
print
CMDColors
.
YELLOW
+
'WARNING:'
+
message
+
CMDColors
.
ENDC
@
staticmethod
def
error
(
module
,
message
):
print
CMDColors
.
RED
+
'ERROR: ['
+
module
+
'] '
\
+
message
+
CMDColors
.
ENDC
exit
(
1
)
def
mace_check
(
condition
,
module
,
message
):
if
not
condition
:
MaceLogger
.
error
(
module
,
message
)
################################
# String Formatter
################################
class
StringFormatter
:
@
staticmethod
def
table
(
header
,
data
,
title
,
align
=
"R"
):
data_size
=
len
(
data
)
column_size
=
len
(
header
)
column_length
=
[
len
(
str
(
ele
))
+
1
for
ele
in
header
]
for
row_idx
in
range
(
data_size
):
data_tuple
=
data
[
row_idx
]
ele_size
=
len
(
data_tuple
)
assert
(
ele_size
==
column_size
)
for
i
in
range
(
ele_size
):
column_length
[
i
]
=
max
(
column_length
[
i
],
len
(
str
(
data_tuple
[
i
]))
+
1
)
table_column_length
=
sum
(
column_length
)
+
column_size
+
1
dash_line
=
'-'
*
table_column_length
+
'
\n
'
header_line
=
'='
*
table_column_length
+
'
\n
'
output
=
""
output
+=
dash_line
output
+=
str
(
title
).
center
(
table_column_length
)
+
'
\n
'
output
+=
dash_line
output
+=
'|'
+
'|'
.
join
([
str
(
header
[
i
]).
center
(
column_length
[
i
])
for
i
in
range
(
column_size
)])
+
'|
\n
'
output
+=
header_line
for
data_tuple
in
data
:
ele_size
=
len
(
data_tuple
)
row_list
=
[]
for
i
in
range
(
ele_size
):
if
align
==
"R"
:
row_list
.
append
(
str
(
data_tuple
[
i
]).
rjust
(
column_length
[
i
]))
elif
align
==
"L"
:
row_list
.
append
(
str
(
data_tuple
[
i
]).
ljust
(
column_length
[
i
]))
elif
align
==
"C"
:
row_list
.
append
(
str
(
data_tuple
[
i
])
.
center
(
column_length
[
i
]))
output
+=
'|'
+
'|'
.
join
(
row_list
)
+
"|
\n
"
+
dash_line
return
output
@
staticmethod
def
block
(
message
):
line_length
=
10
+
len
(
str
(
message
))
+
10
star_line
=
'*'
*
line_length
+
'
\n
'
return
star_line
+
str
(
message
).
center
(
line_length
)
+
'
\n
'
+
star_line
################################
...
...
tools/example.yaml
已删除
100644 → 0
浏览文件 @
fe0cdf27
# example.yaml
# Each yaml file describes a exported library (could be named [target_abi]/libmace-${filename}.a),
# which can contains more than one models
# target_soc can get by `adb shell getprop | grep ro.board.platform | cut -d [ -f3 | cut -d ] -f1`
target_abis
:
[
armeabi-v7a
,
arm64-v8a
]
target_socs
:
[
MSM8953
]
embed_model_data
:
1
models
:
preview_net
:
platform
:
tensorflow
model_file_path
:
path/to/model64.pb
# also support http:// and https://
model_sha256_checksum
:
05d92625809dc9edd6484882335c48c043397aed450a168d75eb8b538e86881a
input_nodes
:
input_node
output_nodes
:
output_node
input_shapes
:
1,64,64,3
output_shapes
:
1,64,64,2
runtime
:
gpu
limit_opencl_kernel_time
:
0
dsp_mode
:
0
obfuscate
:
1
fast_conv
:
0
validation_inputs_data
:
-
path/to/input_files
capture_net
:
platform
:
caffe
model_file_path
:
path/to/model.prototxt
weight_file_path
:
path/to/weight.caffemodel
model_sha256_checksum
:
05d92625809dc9edd6484882335c48c043397aed450a168d75eb8b538e86881a
weight_sha256_checksum
:
05d92625809dc9edd6484882335c48c043397aed450a168d75eb8b538e86881a
input_nodes
:
-
input_node0
-
input_node1
output_nodes
:
-
output_node0
-
output_node1
input_shapes
:
-
1,256,256,3
-
1,128,128,3
output_shapes
:
-
1,256,256,2
-
1,1,1,2
runtime
:
cpu
limit_opencl_kernel_time
:
1
dsp_mode
:
0
obfuscate
:
1
fast_conv
:
0
tools/mace_tools.py
浏览文件 @
ced4a49d
此差异已折叠。
点击以展开。
tools/sh_commands.py
浏览文件 @
ced4a49d
此差异已折叠。
点击以展开。
tools/validate.py
浏览文件 @
ced4a49d
...
...
@@ -36,6 +36,8 @@ import common
# --input_shape 1,64,64,3 \
# --output_shape 1,64,64,2
VALIDATION_MODULE
=
'VALIDATION'
def
load_data
(
file
):
if
os
.
path
.
isfile
(
file
):
...
...
@@ -51,18 +53,21 @@ def compare_output(platform, device_type, output_name, mace_out_value,
mace_out_value
=
mace_out_value
.
reshape
(
-
1
)
assert
len
(
out_value
)
==
len
(
mace_out_value
)
similarity
=
(
1
-
spatial
.
distance
.
cosine
(
out_value
,
mace_out_value
))
print
output_name
,
'MACE VS'
,
platform
.
upper
(
),
'similarity: '
,
similarity
common
.
MaceLogger
.
summary
(
output_name
+
' MACE VS '
+
platform
.
upper
()
+
' similarity: '
+
str
(
similarity
))
if
(
device_type
==
"CPU"
and
similarity
>
0.999
)
or
\
(
device_type
==
"GPU"
and
similarity
>
0.995
)
or
\
(
device_type
==
"HEXAGON"
and
similarity
>
0.930
):
print
'===================Similarity Test Passed=================='
common
.
MaceLogger
.
summary
(
common
.
StringFormatter
.
block
(
"Similarity Test Passed"
))
else
:
print
'===================Similarity Test Failed=================='
sys
.
exit
(
-
1
)
common
.
MaceLogger
.
error
(
""
,
common
.
StringFormatter
.
block
(
"Similarity Test Failed"
)
)
else
:
print
'=======================Skip empty node==================='
sys
.
exit
(
-
1
)
common
.
MaceLogger
.
error
(
""
,
common
.
StringFormatter
.
block
(
"Similarity Test failed because of empty output"
))
def
normalize_tf_tensor_name
(
name
):
...
...
@@ -76,8 +81,9 @@ def validate_tf_model(platform, device_type, model_file, input_file,
mace_out_file
,
input_names
,
input_shapes
,
output_names
):
import
tensorflow
as
tf
if
not
os
.
path
.
isfile
(
model_file
):
print
(
"Input graph file '"
+
model_file
+
"' does not exist!"
)
sys
.
exit
(
-
1
)
common
.
MaceLogger
.
error
(
VALIDATION_MODULE
,
"Input graph file '"
+
model_file
+
"' does not exist!"
)
tf
.
reset_default_graph
()
input_graph_def
=
tf
.
GraphDef
()
...
...
@@ -118,11 +124,13 @@ def validate_caffe_model(platform, device_type, model_file, input_file,
os
.
environ
[
'GLOG_minloglevel'
]
=
'1'
# suprress Caffe verbose prints
import
caffe
if
not
os
.
path
.
isfile
(
model_file
):
print
(
"Input graph file '"
+
model_file
+
"' does not exist!"
)
sys
.
exit
(
-
1
)
common
.
MaceLogger
.
error
(
VALIDATION_MODULE
,
"Input graph file '"
+
model_file
+
"' does not exist!"
)
if
not
os
.
path
.
isfile
(
weight_file
):
print
(
"Input weight file '"
+
weight_file
+
"' does not exist!"
)
sys
.
exit
(
-
1
)
common
.
MaceLogger
.
error
(
VALIDATION_MODULE
,
"Input weight file '"
+
weight_file
+
"' does not exist!"
)
caffe
.
set_mode_cpu
()
...
...
编辑
预览
Markdown
is supported
0%
请重试
或
添加新附件
.
添加附件
取消
You are about to add
0
people
to the discussion. Proceed with caution.
先完成此消息的编辑!
取消
想要评论请
注册
或
登录