Commit 12c4dace
Authored on Dec 11, 2018
Author: liuqi
Update the document about usage of ARM Linux
Parent: 51b14100
Showing 12 changed files with 112 additions and 784 deletions
Changed files:
docs/user_guide/advanced_usage.rst  +34 -43
docs/user_guide/basic_usage.rst  +4 -3
docs/user_guide/devices/demo_device_nanopi.yml  +0 -3
mace/core/runtime/cpu/cpu_runtime.cc  +1 -31
mace/python/tools/memory_optimizer.py  +0 -350
tools/bazel.rc  +3 -5
tools/bazel_adb_run.py  +8 -8
tools/build-standalone-lib.sh  +4 -4
tools/common.py  +6 -6
tools/converter.py  +17 -47
tools/device.py  +27 -34
tools/sh_commands.py  +8 -250
docs/user_guide/advanced_usage.rst
View file @ 12c4dace
@@ -114,69 +114,60 @@ Advanced usage
 --------------

 There are three common advanced use cases:
-  - run your model on the embedded device
+  - run your model on the embedded device(ARM LINUX)
   - converting model to C++ code.
   - tuning GPU kernels for a specific SoC.

-Run you model on the embedded device
-------------------
+Run you model on the embedded device(ARM Linux)
+-----------------------------------------------

-MACE use ssh to connect embedded device, in this case we recommend you to push ``$HOME/.ssh/id_rsa.pub``
-to your device ``$HOME/.ssh/authorized_keys``
+The way to run your model on the ARM Linux is nearly same as with android, except you need specify a device config file.

 .. code:: bash

-    cat ~/.ssh/id_rsa.pub | ssh -q {user}@{ip} "cat >> ~/.ssh/authorized_keys"
-
-This part will show you how to write your own device yaml config file.
-
-**Device yaml config file**
-
-The way to run your model on the embedded device is nearly the same as run on android, except you need give a device yaml config file.
-
-MACE get this yaml config via ``--device_yml`` argument, default config value is ``devices.yml``
-, when the yaml config file is not found. we treat as there is no available arm linux device, give a message
-and continue on other device such as plugged android phone.
-
-* **Example**
-
-    Here is an device yaml config demo.
-
-    .. literalinclude:: devices/demo_device_nanopi.yml
-        :language: yaml
-
-* **Configuration**
-    .. list-table::
-        :header-rows: 1
-
-        * - Options
-          - Usage
-        * - target_abis
-          - Device supported abis, you can get it via ``dpkg --print-architecture`` and
-            ``dpkg --print-foreign-architectures`` command, if more than one abi is supported,
-            separate them by commas.
-        * - target_socs
-          - device soc, you can get it from device manual, we haven't found a way to get it in shell.
-        * - models
-          - device models full name, you can get via get ``lshw`` command (third party package, install it via your package manager).
-            see it's product value.
-        * - address
-          - Since we use ssh to connect device, ip address is required.
-        * - username
-          - login username, required.
-        * - password
-          - login password, optional when you can login into device without password
-
-.. note::
-
-    Some command tools:
-
-    .. code:: bash
-
-        # specify device yaml config file via --device_yml argument or put the file under working directory
-        python tools/converter.py run --config=/path/to/mace-models/mobilenet-v2/mobilenet-v2.yml --device_yml=/path/to/devices.yml
+    python tools/converter.py run --config=/path/to/your/model_deployment_file.yml --device_yml=/path/to/devices.yml
+
+There are two steps to do before run:
+
+1. configure login without password
+
+    MACE use ssh to connect embedded device, you should copy your public key to embedded device with the blow command.
+
+    .. code:: bash
+
+        cat ~/.ssh/id_rsa.pub | ssh -q {user}@{ip} "cat >> ~/.ssh/authorized_keys"
+
+2. write your own device yaml configuration file.
+
+    * **Example**
+
+        Here is an device yaml config demo.
+
+        .. literalinclude:: devices/demo_device_nanopi.yml
+            :language: yaml
+
+    * **Configuration**
+        The detailed explanation is listed in the blow table.
+
+        .. list-table::
+            :header-rows: 1
+
+            * - Options
+              - Usage
+            * - target_abis
+              - Device supported abis, you can get it via ``dpkg --print-architecture`` and
+                ``dpkg --print-foreign-architectures`` command, if more than one abi is supported,
+                separate them by commas.
+            * - target_socs
+              - device soc, you can get it from device manual, we haven't found a way to get it in shell.
+            * - models
+              - device models full name, you can get via get ``lshw`` command (third party package, install it via your package manager).
+                see it's product value.
+            * - address
+              - Since we use ssh to connect device, ip address is required.
+            * - username
+              - login username, required.

 Convert model(s) to C++ code
 --------------------------------
...
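The two setup steps in the updated documentation can be scripted. The sketch below only assembles the command strings quoted in the diff; the helper names `ssh_copy_key_command` and `converter_run_command`, and the sample user, address and paths, are assumptions made for illustration and are not part of MACE.

```python
def ssh_copy_key_command(user, ip, pubkey="~/.ssh/id_rsa.pub"):
    """Return the shell command from the docs that appends a public key
    to the device's authorized_keys (step 1: login without password)."""
    return 'cat {} | ssh -q {}@{} "cat >> ~/.ssh/authorized_keys"'.format(
        pubkey, user, ip)


def converter_run_command(model_yml, device_yml):
    """Return the converter invocation with an explicit device config."""
    return [
        "python", "tools/converter.py", "run",
        "--config={}".format(model_yml),
        "--device_yml={}".format(device_yml),
    ]


if __name__ == "__main__":
    # Hypothetical user and address, matching the style of the demo YAML.
    print(ssh_copy_key_command("user", "10.0.0.1"))
    print(" ".join(converter_run_command(
        "/path/to/your/model_deployment_file.yml", "/path/to/devices.yml")))
```

Returning the converter command as a list rather than a single string keeps it safe to pass to `subprocess.run` without shell quoting; the ssh command stays a string because it relies on a shell pipe.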
docs/user_guide/basic_usage.rst
View file @ 12c4dace
@@ -246,13 +246,14 @@ to run and validate your model.
     # Test model run time
     python tools/converter.py run --config=/path/to/your/model_deployment_file.yml --round=100

-    # If you want to run model on specified arm linux device, you should put device config file in the working directory or run with flag `--device_yml`
-    python tools/converter.py run --config=/path/to/mace-models/mobilenet-v2/mobilenet-v2.yml --device_yml=/path/to/devices.yml --example
-
     # Validate the correctness by comparing the results against the
     # original model and framework, measured with cosine distance for similarity.
     python tools/converter.py run --config=/path/to/your/model_deployment_file.yml --validate

+    # If you want to run model on specified arm linux device, you should put device config file in the working directory or run with flag `--device_yml`
+    python tools/converter.py run --config=/path/to/your/model_deployment_file.yml --device_yml=/path/to/devices.yml
+
 * **benchmark**
     benchmark and profile the model.
...
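The `--validate` comment above says results are compared with a cosine measure. As a rough illustration of that idea, the sketch below computes cosine similarity between two flat output vectors and checks it against a threshold. The function names are hypothetical, and the 0.999 default mirrors the CPU value that appears in `tools/converter.py` in this commit; the actual MACE validation code is not shown on this page and may differ.

```python
import math

CPU_THRESHOLD = 0.999  # CPU value listed in tools/converter.py in this commit


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: treat an all-zero vector as "not similar".
        return 0.0
    return dot / (norm_a * norm_b)


def is_valid(reference, output, threshold=CPU_THRESHOLD):
    """True when the output is close enough to the reference."""
    return cosine_similarity(reference, output) >= threshold
```

For example, `is_valid([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])` holds, while two orthogonal vectors such as `[1.0, 0.0]` and `[0.0, 1.0]` fail the check.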
docs/user_guide/devices/demo_device_nanopi.yml
View file @ 12c4dace
@@ -12,12 +12,9 @@ devices:
     address: 10.0.0.0
     # login username
     username: user
-    # login password, is required when you can login into device without password
-    password: 1234567
   raspberry:
     target_abis: [armv7l]
     target_socs: BCM2837
     models: Raspberry Pi 3 Model B Plus Rev 1.3
     address: 10.0.0.1
     username: user
-    password: 123456
mace/core/runtime/cpu/cpu_runtime.cc
View file @ 12c4dace
@@ -42,7 +42,7 @@ struct CPUFreq {
 };

 namespace {
-#if defined(__ANDROID__)
+
 int GetCPUCount() {
   int cpu_count = 0;
   std::string cpu_sys_conf = "/proc/cpuinfo";
...
@@ -69,10 +69,8 @@ int GetCPUCount() {
   VLOG(2) << "CPU cores: " << cpu_count;
   return cpu_count;
 }
-#endif

 int GetCPUMaxFreq(std::vector<float> *max_freqs) {
-#if defined(__ANDROID__)
   int cpu_count = GetCPUCount();
   for (int cpu_id = 0; cpu_id < cpu_count; ++cpu_id) {
     std::string cpuinfo_max_freq_sys_conf = MakeString(
...
@@ -94,34 +92,6 @@ int GetCPUMaxFreq(std::vector<float> *max_freqs) {
     }
     f.close();
   }
-#else
-  std::string cpu_sys_conf = "/proc/cpuinfo";
-  std::ifstream f(cpu_sys_conf);
-  if (!f.is_open()) {
-    LOG(ERROR) << "failed to open " << cpu_sys_conf;
-    return -1;
-  }
-  std::string line;
-  const std::string freq_key = "cpu MHz";
-  while (std::getline(f, line)) {
-    if (line.size() >= freq_key.size()
-        && line.compare(0, freq_key.size(), freq_key) == 0) {
-      size_t pos = line.find(":");
-      if (pos != std::string::npos) {
-        std::string freq_str = line.substr(pos + 1);
-        float freq = atof(freq_str.c_str());
-        max_freqs->push_back(freq);
-      }
-    }
-  }
-  if (f.bad()) {
-    LOG(ERROR) << "failed to read " << cpu_sys_conf;
-  }
-  if (!f.eof()) {
-    LOG(ERROR) << "failed to read end of " << cpu_sys_conf;
-  }
-  f.close();
-#endif

   for (float freq : *max_freqs) {
     VLOG(2) << "CPU freq: " << freq;
...
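The removed `#else` branch read per-core frequencies from the `cpu MHz` lines of `/proc/cpuinfo`. The following Python sketch restates that parsing logic for readers who want to see it in isolation. It is an illustration rather than MACE code: the function name is hypothetical, it takes the file contents as a string so it can be tried on any machine, and unlike `atof` (which yields 0 for unparsable text) it skips values that are not numbers.

```python
def parse_cpu_mhz(cpuinfo_text):
    """Collect the values of all 'cpu MHz' lines, in order of appearance."""
    freq_key = "cpu MHz"
    freqs = []
    for line in cpuinfo_text.splitlines():
        if not line.startswith(freq_key):
            continue
        _, sep, value = line.partition(":")
        if not sep:
            continue  # no ':' on the line, same as the npos check in C++
        try:
            freqs.append(float(value))  # float() tolerates surrounding spaces
        except ValueError:
            pass  # assumption: ignore malformed values instead of storing 0
    return freqs


if __name__ == "__main__":
    sample = (
        "processor\t: 0\n"
        "cpu MHz\t\t: 2400.000\n"
        "processor\t: 1\n"
        "cpu MHz\t\t: 1800.500\n"
    )
    print(parse_cpu_mhz(sample))
```

Note that many ARM kernels do not emit a `cpu MHz` line at all, in which case this approach returns an empty list.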
mace/python/tools/memory_optimizer.py
Deleted, mode 100644 → 0
View file @ 51b14100
# Copyright 2018 Xiaomi, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import sys
import operator

import six
from six.moves import reduce

from mace.proto import mace_pb2
from mace.python.tools.converter_tool import base_converter as cvt
from mace.python.tools.converter_tool.base_converter import DeviceType
from mace.python.tools.converter_tool.base_converter import ConverterUtil
from mace.python.tools.converter_tool.base_converter import MaceKeyword
from mace.python.tools.convert_util import calculate_image_shape
from mace.python.tools.convert_util import OpenCLBufferType


def MemoryTypeToStr(mem_type):
    if mem_type == mace_pb2.CPU_BUFFER:
        return 'CPU_BUFFER'
    elif mem_type == mace_pb2.GPU_BUFFER:
        return 'GPU_BUFFER'
    elif mem_type == mace_pb2.GPU_IMAGE:
        return 'GPU_IMAGE'
    else:
        return 'UNKNOWN'


class MemoryBlock(object):
    def __init__(self, mem_type, block):
        self._mem_type = mem_type
        self._block = block

    @property
    def mem_type(self):
        return self._mem_type

    @property
    def block(self):
        return self._block


class MemoryOptimizer(object):
    def __init__(self, net_def):
        self.net_def = net_def
        self.idle_mem = set()
        self.op_mem = {}  # op_name->mem_id
        self.mem_block = {}  # mem_id->[size] or mem_id->[x, y]
        self.total_mem_count = 0
        self.input_ref_counter = {}
        self.mem_ref_counter = {}
        ocl_mem_type_arg = ConverterUtil.get_arg(
            net_def, MaceKeyword.mace_opencl_mem_type)
        self.cl_mem_type = ocl_mem_type_arg.i if ocl_mem_type_arg is not None \
            else None

        consumers = {}
        for op in net_def.op:
            if not self.op_need_optimize_memory(op):
                continue
            for ipt in op.input:
                if ipt not in consumers:
                    consumers[ipt] = []
                consumers[ipt].append(op)
        # only ref op's output tensor
        for op in net_def.op:
            if not self.op_need_optimize_memory(op):
                continue
            for output in op.output:
                tensor_name = output
                if tensor_name in consumers:
                    self.input_ref_counter[tensor_name] = \
                        len(consumers[tensor_name])
                else:
                    self.input_ref_counter[tensor_name] = 0

    def op_need_optimize_memory(self, op):
        return True

    def get_op_mem_block(self, op_type, output_shape, output_type):
        data_type_size = 4
        if output_type == mace_pb2.DT_UINT8:
            data_type_size = 1
        return MemoryBlock(mace_pb2.CPU_BUFFER,
                           [reduce(operator.mul, output_shape, 1) *
                            data_type_size])

    def mem_size(self, memory_block):
        return memory_block.block[0]

    def sub_mem_block(self, mem_block1, mem_block2):
        return self.mem_size(mem_block1) - self.mem_size(mem_block2)

    def resize_mem_block(self, old_mem_block, op_mem_block):
        return MemoryBlock(
            old_mem_block.mem_type,
            [max(old_mem_block.block[0], op_mem_block.block[0])])

    def add_net_mem_blocks(self):
        for mem in self.mem_block:
            arena = self.net_def.mem_arena
            block = arena.mem_block.add()
            block.mem_id = mem
            block.device_type = DeviceType.CPU.value
            block.mem_type = self.mem_block[mem].mem_type
            block.x = self.mem_block[mem].block[0]
            block.y = 1

    def get_total_origin_mem_size(self):
        origin_mem_size = 0
        for op in self.net_def.op:
            if not self.op_need_optimize_memory(op):
                continue
            origin_mem_size += reduce(operator.mul,
                                      op.output_shape[0].dims,
                                      1)
        return origin_mem_size

    def get_total_optimized_mem_size(self):
        optimized_mem_size = 0
        for mem in self.mem_block:
            print(mem, MemoryTypeToStr(self.mem_block[mem].mem_type),
                  self.mem_block[mem].block)
            optimized_mem_size += self.mem_size(self.mem_block[mem])
        return optimized_mem_size

    @staticmethod
    def is_memory_reuse_op(op):
        return op.type == 'Reshape' or op.type == 'Identity' \
            or op.type == 'Squeeze' or op.type == 'ExpandDims'

    def optimize(self):
        for op in self.net_def.op:
            if not self.op_need_optimize_memory(op):
                continue
            if not op.output_shape:
                six.print_("WARNING: There is no output shape information to "
                           "do memory optimization. %s (%s)" %
                           (op.name, op.type), file=sys.stderr)
                return
            if len(op.output_shape) != len(op.output):
                six.print_('WARNING: the number of output shape is '
                           'not equal to the number of output.',
                           file=sys.stderr)
                return
            for i in range(len(op.output)):
                if self.is_memory_reuse_op(op):
                    # make these ops reuse memory of input tensor
                    mem_id = self.op_mem.get(op.input[0], -1)
                else:
                    output_type = mace_pb2.DT_FLOAT
                    for arg in op.arg:
                        if arg.name == 'T':
                            output_type = arg.i
                    if len(op.output_type) > i:
                        output_type = op.output_type[i]
                    op_mem_block = self.get_op_mem_block(
                        op.type,
                        op.output_shape[i].dims,
                        output_type)
                    mem_id = -1
                    if len(self.idle_mem) > 0:
                        best_mem_add_size = six.MAXSIZE
                        best_mem_waste_size = six.MAXSIZE
                        for mid in self.idle_mem:
                            old_mem_block = self.mem_block[mid]
                            if old_mem_block.mem_type != op_mem_block.mem_type:
                                continue
                            new_mem_block = self.resize_mem_block(
                                old_mem_block, op_mem_block)
                            add_mem_size = self.sub_mem_block(new_mem_block,
                                                              old_mem_block)
                            waste_mem_size = self.sub_mem_block(new_mem_block,
                                                                op_mem_block)

                            # minimize add_mem_size; if best_mem_add_size is 0,
                            # then minimize waste_mem_size
                            if (best_mem_add_size > 0 and
                                    add_mem_size < best_mem_add_size) \
                                    or (best_mem_add_size == 0 and
                                        waste_mem_size < best_mem_waste_size):
                                best_mem_id = mid
                                best_mem_add_size = add_mem_size
                                best_mem_waste_size = waste_mem_size
                                best_mem_block = new_mem_block

                        # if add mem size < op mem size, then reuse it
                        if best_mem_add_size <= self.mem_size(op_mem_block):
                            self.mem_block[best_mem_id] = best_mem_block
                            mem_id = best_mem_id
                            self.idle_mem.remove(mem_id)

                    if mem_id == -1:
                        mem_id = self.total_mem_count
                        self.total_mem_count += 1
                        self.mem_block[mem_id] = op_mem_block

                if mem_id != -1:
                    op.mem_id.extend([mem_id])
                    self.op_mem[op.output[i]] = mem_id
                    if mem_id not in self.mem_ref_counter:
                        self.mem_ref_counter[mem_id] = 1
                    else:
                        self.mem_ref_counter[mem_id] += 1

            # de-ref input tensor mem
            for idx in six.moves.range(len(op.input)):
                ipt = op.input[idx]
                if ipt in self.input_ref_counter:
                    self.input_ref_counter[ipt] -= 1
                    if self.input_ref_counter[ipt] == 0 \
                            and ipt in self.op_mem:
                        mem_id = self.op_mem[ipt]
                        self.mem_ref_counter[mem_id] -= 1
                        if self.mem_ref_counter[mem_id] == 0:
                            self.idle_mem.add(self.op_mem[ipt])
                    elif self.input_ref_counter[ipt] < 0:
                        raise Exception('ref count is less than 0')

        self.add_net_mem_blocks()

        print("total op: %d" % len(self.net_def.op))
        print("origin mem: %d, optimized mem: %d" % (
            self.get_total_origin_mem_size(),
            self.get_total_optimized_mem_size()))


class GPUMemoryOptimizer(MemoryOptimizer):
    def op_need_optimize_memory(self, op):
        if op.type == MaceKeyword.mace_buffer_transform:
            for arg in op.arg:
                if arg.name == 'mode' and arg.i == 0:
                    return False
        return op.type != MaceKeyword.mace_buffer_inverse_transform

    def get_op_image_mem_block(self, op_type, output_shape):
        if op_type == 'WinogradTransform' or op_type == 'MatMul':
            buffer_shape = list(output_shape) + [1]
            mem_block = MemoryBlock(
                mace_pb2.GPU_IMAGE,
                calculate_image_shape(OpenCLBufferType.IN_OUT_HEIGHT,
                                      buffer_shape))
        elif op_type in ['Shape',
                         'InferConv2dShape',
                         'StridedSlice',
                         'Stack',
                         'ScalarMath']:
            if len(output_shape) == 1:
                mem_block = MemoryBlock(mace_pb2.CPU_BUFFER,
                                        [output_shape[0], 1])
            elif len(output_shape) == 0:
                mem_block = MemoryBlock(mace_pb2.CPU_BUFFER,
                                        [1, 1])
            else:
                raise Exception('%s output shape dim size is not 0 or 1.' %
                                op_type)
        else:
            if len(output_shape) == 2:  # only support fc/softmax
                buffer_shape = [output_shape[0], output_shape[1]]
            elif len(output_shape) == 4:
                buffer_shape = output_shape
            else:
                raise Exception('%s output shape dim size is not 2 or 4.' %
                                op_type)
            mem_block = MemoryBlock(
                mace_pb2.GPU_IMAGE,
                calculate_image_shape(OpenCLBufferType.IN_OUT_CHANNEL,
                                      buffer_shape))
        return mem_block

    def get_op_buffer_mem_block(self, output_shape):
        return MemoryBlock(mace_pb2.GPU_BUFFER,
                           [reduce(operator.mul, output_shape, 1), 1])

    def get_op_mem_block(self, op_type, output_shape, output_type):
        if self.cl_mem_type == mace_pb2.GPU_IMAGE:
            return self.get_op_image_mem_block(op_type, output_shape)
        else:
            return self.get_op_buffer_mem_block(output_shape)

    def mem_size(self, memory_block):
        if memory_block.mem_type == mace_pb2.GPU_IMAGE:
            return memory_block.block[0] * memory_block.block[1] * 4
        else:
            return memory_block.block[0]

    def resize_mem_block(self, old_mem_block, op_mem_block):
        resize_mem_block = MemoryBlock(
            old_mem_block.mem_type,
            [
                max(old_mem_block.block[0], op_mem_block.block[0]),
                max(old_mem_block.block[1], op_mem_block.block[1])
            ])

        return resize_mem_block

    def add_net_mem_blocks(self):
        max_image_size_x = 0
        max_image_size_y = 0
        for mem in self.mem_block:
            arena = self.net_def.mem_arena
            block = arena.mem_block.add()
            block.mem_id = mem
            block.device_type = DeviceType.GPU.value
            block.mem_type = self.mem_block[mem].mem_type
            block.x = self.mem_block[mem].block[0]
            block.y = self.mem_block[mem].block[1]
            if self.mem_block[mem].mem_type == mace_pb2.GPU_IMAGE:
                max_image_size_x = max(max_image_size_x, block.x)
                max_image_size_y = max(max_image_size_y, block.y)

        if self.cl_mem_type == mace_pb2.GPU_IMAGE:
            # Update OpenCL max image size
            net_ocl_max_img_size_arg = None
            for arg in self.net_def.arg:
                if arg.name == cvt.MaceKeyword.mace_opencl_max_image_size:
                    net_ocl_max_img_size_arg = arg
                    max_image_size_x = max(arg.ints[0], max_image_size_x)
                    max_image_size_y = max(arg.ints[1], max_image_size_y)
                    break
            if net_ocl_max_img_size_arg is None:
                net_ocl_max_img_size_arg = self.net_def.arg.add()
                net_ocl_max_img_size_arg.name = \
                    cvt.MaceKeyword.mace_opencl_max_image_size

            net_ocl_max_img_size_arg.ints[:] = [max_image_size_x,
                                                max_image_size_y]


def optimize_gpu_memory(net_def):
    mem_optimizer = GPUMemoryOptimizer(net_def)
    mem_optimizer.optimize()


def optimize_cpu_memory(net_def):
    mem_optimizer = MemoryOptimizer(net_def)
    mem_optimizer.optimize()
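The deleted optimizer assigns each op output to a memory block, counts how many ops still need each tensor, and returns a block to an idle pool once its last consumer has run. The sketch below reproduces that greedy idea on plain Python data so it can be run without MACE. It is a simplification under stated assumptions: every op has exactly one output named after the op, sizes are plain integers, and candidates are ranked lexicographically by (growth, waste), which is close to but not identical to the selection rule in the deleted file. The function name and input format are hypothetical.

```python
def assign_buffers(ops):
    """Greedy buffer reuse.

    ops: list of (name, inputs, output_size) tuples in execution order.
    Returns (name -> buffer id, buffer id -> final buffer size).
    """
    # Number of ops that still have to read each tensor.
    remaining = {}
    for _, inputs, _ in ops:
        for ipt in inputs:
            remaining[ipt] = remaining.get(ipt, 0) + 1

    idle = set()      # buffers whose contents are no longer needed
    assignment = {}   # tensor name -> buffer id
    sizes = {}        # buffer id -> size

    for name, inputs, size in ops:
        best = None  # (growth, waste, buffer id)
        for buf in idle:
            new_size = max(sizes[buf], size)
            candidate = (new_size - sizes[buf], new_size - size, buf)
            if best is None or candidate[:2] < best[:2]:
                best = candidate

        # Reuse only if the buffer grows by no more than the tensor size,
        # mirroring the "add mem size <= op mem size" check in the original.
        if best is not None and best[0] <= size:
            buf = best[2]
            idle.remove(buf)
            sizes[buf] = max(sizes[buf], size)
        else:
            buf = len(sizes)
            sizes[buf] = size
        assignment[name] = buf

        # De-reference inputs; free a buffer after its last consumer.
        for ipt in inputs:
            if ipt in assignment:
                remaining[ipt] -= 1
                if remaining[ipt] == 0:
                    idle.add(assignment[ipt])

    return assignment, sizes


if __name__ == "__main__":
    chain = [("a", [], 100), ("b", ["a"], 50),
             ("c", ["b"], 80), ("d", ["c"], 100)]
    print(assign_buffers(chain))
```

On the four-op chain in the example, two buffers of size 100 are enough, compared with 330 units if every output had its own allocation.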
tools/bazel.rc
View file @ 12c4dace
 # Partially borrowed from tensorflow tools/bazel.rc

 # By default, we don't distinct target and host platfroms.
-# When doing cross compilation, use --config=cross_compile to distinct them.
 build --distinct_host_configuration=false
-build:cross_compile --distinct_host_configuration=true

 build --verbose_failures
 build --copt=-std=c++11
...
@@ -17,12 +15,12 @@ build --copt=-DMACE_USE_NNLIB_CAF
 build:symbol_hidden --copt=-fvisibility=hidden

 # Usage example: bazel build --config android
-build:android --config=cross_compile
+build:android --distinct_host_configuration=true
 build:android --crosstool_top=//external:android/crosstool
 build:android --host_crosstool_top=@bazel_tools//tools/cpp:toolchain

 # Usage example: bazel build --config arm_linux_gnueabihf
-build:arm_linux_gnueabihf --config=cross_compile
+build:arm_linux_gnueabihf --distinct_host_configuration=true
 build:arm_linux_gnueabihf --crosstool_top=//tools/arm_compiler:toolchain
 build:arm_linux_gnueabihf --host_crosstool_top=@bazel_tools//tools/cpp:toolchain
 build:arm_linux_gnueabihf --cpu=armeabi-v7a
...
@@ -34,7 +32,7 @@ build:arm_linux_gnueabihf --copt -Wno-sequence-point
 build:arm_linux_gnueabihf --copt -Wno-implicit-fallthrough

 # Usage example: bazel build --config aarch64_linux_gnu
-build:aarch64_linux_gnu --config=cross_compile
+build:aarch64_linux_gnu --distinct_host_configuration=true
 build:aarch64_linux_gnu --crosstool_top=//tools/aarch64_compiler:toolchain
 build:aarch64_linux_gnu --host_crosstool_top=@bazel_tools//tools/cpp:toolchain
 build:aarch64_linux_gnu --cpu=aarch64
...
tools/bazel_adb_run.py
View file @ 12c4dace
@@ -52,13 +52,13 @@ def ops_benchmark_stdout_processor(stdout, dev, abi):
             metrics["%s.input_mb_per_sec" % parts[0]] = parts[3]
             metrics["%s.gmacc_per_sec" % parts[0]] = parts[4]

-    platform = dev[YAMLKeyword.target_socs]
-    model = dev[YAMLKeyword.models]
-    tags = {
-        "ro.board.platform": platform,
-        "ro.product.model": model,
-        "abi": abi
-    }
+    # platform = dev[YAMLKeyword.target_socs]
+    # model = dev[YAMLKeyword.device_name]
+    # tags = {
+    #     "ro.board.platform": platform,
+    #     "ro.product.model": model,
+    #     "abi": abi
+    # }
     # sh_commands.falcon_push_metrics(server,
     #    metrics, tags=tags, endpoint="mace_ops_benchmark")
...
@@ -99,7 +99,7 @@ def parse_args():
     parser.add_argument(
         "--stdout_processor",
         type=str,
-        default="stdout_processor",
+        default="unittest_stdout_processor",
         help="Stdout processing function, default: stdout_processor")
     parser.add_argument(
         "--enable_neon",
...
tools/build-standalone-lib.sh
View file @ 12c4dace
@@ -45,11 +45,11 @@ bazel build --config android --config optimization mace/libmace:libmace_dynamic
 cp bazel-bin/mace/libmace/libmace.so $LIB_DIR/arm64-v8a/cpu_gpu/

 echo "build shared lib for arm_linux_gnueabihf + cpu_gpu"
-bazel build --config arm_linux_gnueabihf --config optimization mace/libmace:libmace_dynamic --define neon=true --define openmp=true --define opencl=true
+bazel build --config arm_linux_gnueabihf --config optimization mace/libmace:libmace_dynamic --define neon=true --define openmp=true --define opencl=true --define quantize=true
 cp bazel-bin/mace/libmace/libmace.so $LIB_DIR/arm_linux_gnueabihf/cpu_gpu/

 echo "build shared lib for aarch64_linux_gnu + cpu_gpu"
-bazel build --config aarch64_linux_gnu --config optimization mace/libmace:libmace_dynamic --define neon=true --define openmp=true --define opencl=true
+bazel build --config aarch64_linux_gnu --config optimization mace/libmace:libmace_dynamic --define neon=true --define openmp=true --define opencl=true --define quantize=true
 cp bazel-bin/mace/libmace/libmace.so $LIB_DIR/aarch64_linux_gnu/cpu_gpu/

 if [[ "$OSTYPE" != "darwin"* ]]; then
...
@@ -73,11 +73,11 @@ bazel build --config android --config optimization mace/libmace:libmace_static -
 cp bazel-genfiles/mace/libmace/libmace.a $LIB_DIR/arm64-v8a/cpu_gpu/

 echo "build static lib for arm_linux_gnueabihf + cpu_gpu"
-bazel build --config arm_linux_gnueabihf --config optimization mace/libmace:libmace_static --config symbol_hidden --define neon=true --define openmp=true --define opencl=true
+bazel build --config arm_linux_gnueabihf --config optimization mace/libmace:libmace_static --config symbol_hidden --define neon=true --define openmp=true --define opencl=true --define quantize=true
 cp bazel-genfiles/mace/libmace/libmace.a $LIB_DIR/arm_linux_gnueabihf/cpu_gpu/

 echo "build static lib for aarch64_linux_gnu + cpu_gpu"
-bazel build --config aarch64_linux_gnu --config optimization mace/libmace:libmace_static --config symbol_hidden --define neon=true --define openmp=true --define opencl=true
+bazel build --config aarch64_linux_gnu --config optimization mace/libmace:libmace_static --config symbol_hidden --define neon=true --define openmp=true --define opencl=true --define quantize=true
 cp bazel-genfiles/mace/libmace/libmace.a $LIB_DIR/aarch64_linux_gnu/cpu_gpu/

 if [[ "$OSTYPE" != "darwin"* ]]; then
...
tools/common.py
View file @ 12c4dace
@@ -240,7 +240,7 @@ def get_model_files(model_file_path,

 def get_opencl_binary_output_path(library_name, target_abi, device):
     target_soc = device.target_socs
-    device_model = device.models
+    device_name = device.device_name
     return '%s/%s/%s/%s/%s_%s.%s.%s.bin' % \
            (BUILD_OUTPUT_DIR,
             library_name,
...
@@ -248,13 +248,13 @@ def get_opencl_binary_output_path(library_name, target_abi, device):
             target_abi,
             library_name,
             OUTPUT_OPENCL_BINARY_FILE_NAME,
-            device_model,
+            device_name,
             target_soc)


 def get_opencl_parameter_output_path(library_name, target_abi, device):
     target_soc = device.target_socs
-    device_model = device.models
+    device_name = device.device_name
     return '%s/%s/%s/%s/%s_%s.%s.%s.bin' % \
            (BUILD_OUTPUT_DIR,
             library_name,
...
@@ -262,7 +262,7 @@ def get_opencl_parameter_output_path(library_name, target_abi, device):
             target_abi,
             library_name,
             OUTPUT_OPENCL_PARAMETER_FILE_NAME,
-            device_model,
+            device_name,
             target_soc)

...
@@ -271,7 +271,7 @@ def get_build_model_dirs(library_name,
                          target_abi,
                          device,
                          model_file_path):
-    models = device.models
+    device_name = device.device_name
     target_socs = device.target_socs
     model_path_digest = md5sum(model_file_path)
     model_output_base_dir = '{}/{}/{}/{}/{}'.format(
...
@@ -287,7 +287,7 @@ def get_build_model_dirs(library_name,
     else:
         model_output_dir = '{}/{}_{}/{}'.format(
             model_output_base_dir,
-            models,
+            device_name,
             target_socs,
             target_abi
         )
...
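The last hunk swaps `models` for `device_name` in the `'{}/{}_{}/{}'` directory template. The sketch below applies that same template to the Raspberry Pi values from the demo device file to show how the resulting path changes. The helper name, the `build/out` base directory, and the idea that `device_name` would be the short YAML key `raspberry` are assumptions for illustration; the commit itself does not state the motivation or where `device_name` comes from.

```python
def model_output_dir(base_dir, device_label, target_socs, target_abi):
    """Apply the '{}/{}_{}/{}' template shown in get_build_model_dirs."""
    return '{}/{}_{}/{}'.format(base_dir, device_label, target_socs,
                                target_abi)


if __name__ == "__main__":
    base = "build/out"  # hypothetical base directory
    # Before: the full model string from the demo YAML ends up in the path.
    print(model_output_dir(base, "Raspberry Pi 3 Model B Plus Rev 1.3",
                           "BCM2837", "armv7l"))
    # After: a short device name (assumed here to be "raspberry").
    print(model_output_dir(base, "raspberry", "BCM2837", "armv7l"))
```

With the full model string the directory name contains spaces, which usually needs quoting in shell commands; a short identifier avoids that.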
tools/converter.py
View file @ 12c4dace
@@ -111,6 +111,13 @@ class DefaultValues(object):
     gpu_priority_hint = 3,


+class ValidationThreshold(object):
+    cpu_threshold = 0.999,
+    gpu_threshold = 0.995,
+    hexagon_threshold = 0.930,
+    cpu_quantize_threshold = 0.980,
+
+
 CPP_KEYWORDS = [
     'alignas', 'alignof', 'and', 'and_eq', 'asm', 'atomic_cancel',
     'atomic_commit', 'atomic_noexcept', 'auto', 'bitand', 'bitor',
...
@@ -435,10 +442,11 @@ def format_model_config(flags):
                            'similarity threshold must be a dict.')

             threshold_dict = {
-                DeviceType.CPU: 0.999,
-                DeviceType.GPU: 0.995,
-                DeviceType.HEXAGON: 0.930,
-                DeviceType.CPU + "_QUANTIZE": 0.980,
+                DeviceType.CPU: ValidationThreshold.cpu_threshold,
+                DeviceType.GPU: ValidationThreshold.gpu_threshold,
+                DeviceType.HEXAGON: ValidationThreshold.hexagon_threshold,
+                DeviceType.CPU + "_QUANTIZE":
+                    ValidationThreshold.cpu_quantize_threshold,
             }
             for k, v in six.iteritems(validation_threshold):
                 if k.upper() == 'DSP':
...
@@ -838,39 +846,6 @@ def build_mace_run(configs, target_abi, toolchain, enable_openmp,
                                        mace_lib_type == MACELibType.dynamic)


-def build_quantize_stat(configs):
-    library_name = configs[YAMLKeyword.library_name]
-
-    build_tmp_binary_dir = get_build_binary_dir(library_name, ABIType.host)
-    if os.path.exists(build_tmp_binary_dir):
-        sh.rm("-rf", build_tmp_binary_dir)
-    os.makedirs(build_tmp_binary_dir)
-
-    quantize_stat_target = QUANTIZE_STAT_TARGET
-    build_arg = ""
-    six.print_(configs[YAMLKeyword.model_graph_format])
-    if configs[YAMLKeyword.model_graph_format] == ModelFormat.code:
-        mace_check(os.path.exists(ENGINE_CODEGEN_DIR),
-                   ModuleName.RUN,
-                   "You should convert model first.")
-        build_arg = "--per_file_copt=mace/tools/quantization/quantize_stat.cc@-DMODEL_GRAPH_FORMAT_CODE"  # noqa
-
-    sh_commands.bazel_build(
-        quantize_stat_target,
-        abi=ABIType.host,
-        toolchain=flags.toolchain,
-        enable_openmp=True,
-        symbol_hidden=True,
-        extra_args=build_arg
-    )
-
-    quantize_stat_filepath = build_tmp_binary_dir + "/quantize_stat"
-    if os.path.exists(quantize_stat_filepath):
-        sh.rm("-rf", quantize_stat_filepath)
-    sh.cp("-f", "bazel-bin/mace/tools/quantization/quantize_stat",
-          build_tmp_binary_dir)
-
-
 def build_example(configs, target_abi, toolchain,
                   enable_openmp, mace_lib_type):
     library_name = configs[YAMLKeyword.library_name]
...
@@ -951,10 +926,8 @@ def run_mace(flags):
     clear_build_dirs(configs[YAMLKeyword.library_name])
     target_socs = configs[YAMLKeyword.target_socs]
-    if not target_socs or ALL_SOC_TAG in target_socs:
-        device_list = DeviceManager.list_devices(flags.device_yml)
-    else:
-        device_list = DeviceManager.list_devices(flags.device_yml)
+    device_list = DeviceManager.list_devices(flags.device_yml)
+    if target_socs and ALL_SOC_TAG not in target_socs:
         device_list = [dev for dev in device_list
                        if dev[YAMLKeyword.target_socs].lower() in target_socs]

     for target_abi in configs[YAMLKeyword.target_abis]:
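The device selection in the `run_mace` hunk above (list every device, then filter only when specific SoCs are requested) can be sketched in isolation. This is a minimal illustration under stated assumptions: the function name `select_devices`, the plain-dict device records, the string key `"target_socs"`, and the value `"all"` for `ALL_SOC_TAG` are all hypothetical stand-ins, not the MACE definitions.

```python
ALL_SOC_TAG = "all"  # assumed value; the real constant is defined in the MACE tools


def select_devices(devices, target_socs, all_tag=ALL_SOC_TAG):
    """Return all devices, or only those whose SoC appears in target_socs."""
    if target_socs and all_tag not in target_socs:
        # The SoC name is lower-cased before the membership test, so the
        # entries in target_socs are expected to be lower case.
        return [dev for dev in devices
                if dev["target_socs"].lower() in target_socs]
    return list(devices)


devices = [
    {"device_name": "boardA", "target_socs": "RK3399"},
    {"device_name": "phoneB", "target_socs": "sdm845"},
]
print([d["device_name"] for d in select_devices(devices, ["rk3399"])])
print([d["device_name"] for d in select_devices(devices, ["all"])])
print([d["device_name"] for d in select_devices(devices, [])])
```

An empty `target_socs` and a list containing the "all" tag both fall through to the unfiltered list, which matches the two branches that the old code spelled out separately.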
...
@@ -1042,13 +1015,10 @@ def benchmark_model(flags):
     clear_build_dirs(configs[YAMLKeyword.library_name])

     target_socs = configs[YAMLKeyword.target_socs]
-    if not target_socs or ALL_SOC_TAG in target_socs:
-        device_list = DeviceManager.list_devices(flags.device_yml)
-        # target_socs = sh_commands.adb_get_all_socs()
-    else:
-        device_list = DeviceManager.list_devices(flags.device_yml)
+    device_list = DeviceManager.list_devices(flags.device_yml)
+    if target_socs and ALL_SOC_TAG not in target_socs:
         device_list = [dev for dev in device_list
-                       if dev[YAMLKeyword.target_socs] in target_socs]
+                       if dev[YAMLKeyword.target_socs].lower() in target_socs]

     for target_abi in configs[YAMLKeyword.target_abis]:
         # build benchmark_model binary
...
tools/device.py
View file @
12c4dace
...
@@ -37,8 +37,8 @@ class DeviceWrapper:
         :type device_dict: Device
         :param device_dict: a key-value dict that holds the device information,
                which attribute has:
-               target_abis, target_socs, models, system, address
-               password, username
+               device_name, target_abis, target_socs, system,
+               address, username
         """
         diff = set(device_dict.keys()) - set(YAMLKeyword.__dict__.keys())
         if len(diff) > 0:
...
@@ -111,6 +111,7 @@ class DeviceWrapper:
     def push(self, src_path, dst_path):
         mace_check(os.path.exists(src_path), "Device",
                    '{} not found'.format(src_path))
+        six.print_("Push %s to %s" % (src_path, dst_path))
         if self.system == SystemType.android:
             sh_commands.adb_push(src_path, dst_path, self.address)
         elif self.system == SystemType.arm_linux:
...
@@ -129,6 +130,7 @@ class DeviceWrapper:
         dst_file = "%s/%s" % (dst_path, file_name)
         if os.path.exists(dst_file):
             sh.rm('-f', dst_file)
+        six.print_("Pull %s to %s" % (src_path, dst_path))
         if self.system == SystemType.android:
             sh_commands.adb_pull(
                 src_file, dst_file, self.address)
...
@@ -138,7 +140,6 @@ class DeviceWrapper:
                                            self.address,
                                            src_file),
                        dst_file)
-                print("pull file ", src_path, dst_path)
             except sh.ErrorReturnCode_1 as e:
                 six.print_("Pull Failed !", file=sys.stderr)
                 raise e
...
@@ -256,10 +257,13 @@ class DeviceWrapper:
             if model_graph_format == ModelFormat.file:
                 mace_model_phone_path = "%s/%s.pb" % (self.data_dir,
                                                       model_tag)
                 self.push(mace_model_path,
                           mace_model_phone_path)
             if link_dynamic:
                 self.push(libmace_dynamic_library_path, self.data_dir)
+                if self.system == SystemType.android:
+                    sh_commands.push_depended_so_libs(
+                        libmace_dynamic_library_path, abi, self.data_dir,
+                        self.address)
             self.push("%s/%s" % (target_dir, target_name), self.data_dir)

             stdout_buff = []
...
@@ -430,14 +434,11 @@ class DeviceWrapper:
                                   configs[YAMLKeyword.model_graph_format],
                                   configs[YAMLKeyword.model_data_format],
                                   target_abi)
-            if target_abi == ABIType.host:
-                device_model = ABIType.host
-            else:
-                device_model = self.models
+            if target_abi != ABIType.host:
                 self.clear_data_dir()
             MaceLogger.header(
                 StringFormatter.block(
-                    'Run model {} on {}'.format(model_name, device_model)))
+                    'Run model {} on {}'.format(model_name, self.device_name)))

             model_config = configs[YAMLKeyword.models][model_name]
             model_runtime = model_config[YAMLKeyword.runtime]
...
@@ -631,7 +632,7 @@ class DeviceWrapper:
         data_str = '{model_name},{device_name},{soc},{abi},{device_type},' \
                    '{init},{warmup},{run_avg},{tuned}\n'.format(
                        model_name=model_name,
-                       device_name=self.models,
+                       device_name=self.device_name,
                        soc=self.target_socs,
                        abi=target_abi,
                        device_type=device_type,
...
@@ -671,7 +672,7 @@ class DeviceWrapper:
         mace_model_path = ''
         if model_graph_format == ModelFormat.file:
             mace_model_path = '%s/%s.pb' % (mace_model_dir, model_tag)
-        if abi == 'host':
+        if abi == ABIType.host:
             libmace_dynamic_lib_dir_path = \
                 os.path.dirname(libmace_dynamic_library_path)
             p = subprocess.Popen(
...
@@ -719,6 +720,10 @@ class DeviceWrapper:
                 self.push(mace_model_path, mace_model_device_path)
             if link_dynamic:
                 self.push(libmace_dynamic_library_path, self.data_dir)
+                if self.system == SystemType.android:
+                    sh_commands.push_depended_so_libs(
+                        libmace_dynamic_library_path, abi, self.data_dir,
+                        self.address)
             self.rm('%s/%s' % (self.data_dir, benchmark_binary_name))
             self.push('%s/%s' % (benchmark_binary_dir, benchmark_binary_name),
                       self.data_dir)
...
@@ -761,19 +766,11 @@ class DeviceWrapper:
             os.remove(tmp_cmd_file)

             if self.system == SystemType.android:
-                sh.adb(
-                    '-s',
-                    self.address,
-                    'shell',
-                    'sh',
-                    cmd_file_path,
-                    _fg=True)
+                sh.adb('-s', self.address, 'shell', 'sh', cmd_file_path,
+                       _fg=True)
             elif self.system == SystemType.arm_linux:
-                sh.ssh(
-                    '%s@%s' % (self.username, self.address),
-                    'sh',
-                    cmd_file_path,
-                    _fg=True)
+                sh.ssh('%s@%s' % (self.username, self.address),
+                       'sh', cmd_file_path, _fg=True)

             self.rm(cmd_file_path)
             six.print_('Benchmark done!\n')
...
@@ -804,13 +801,10 @@ class DeviceWrapper:
                                   configs[YAMLKeyword.model_graph_format],
                                   configs[YAMLKeyword.model_data_format],
                                   target_abi)
-            if target_abi == ABIType.host:
-                device_name = ABIType.host
-            else:
-                device_name = self.models
             MaceLogger.header(
                 StringFormatter.block(
-                    'Benchmark model %s on %s' % (model_name, device_name)))
+                    'Benchmark model %s on %s' % (model_name,
+                                                  self.device_name)))
             model_config = configs[YAMLKeyword.models][model_name]
             model_runtime = model_config[YAMLKeyword.runtime]
             subgraphs = model_config[YAMLKeyword.subgraphs]
...
@@ -885,7 +879,7 @@ class DeviceWrapper:
         print('Trying to lock device %s' % self.address)
         with self.lock():
             print('Run on device: %s, %s, %s' %
-                  (self.address, self.target_socs, self.models))
+                  (self.address, self.target_socs, self.device_name))
             self.rm(self.data_dir)
             self.exec_command('mkdir -p %s' % self.data_dir)
             self.push(host_bin_full_path, device_bin_full_path)
...
@@ -949,11 +943,11 @@ class DeviceManager:
         for adb in adb_list:
             prop = sh_commands.adb_getprop_by_serialno(adb[0])
             android = {
-                YAMLKeyword.device_name: adb[1],
+                YAMLKeyword.device_name:
+                    prop['ro.product.model'].replace(' ', ''),
                 YAMLKeyword.target_abis:
                     prop['ro.product.cpu.abilist'].split(','),
                 YAMLKeyword.target_socs: prop['ro.board.platform'],
-                YAMLKeyword.models: prop['ro.product.model'].replace(' ', '_'),
                 YAMLKeyword.system: SystemType.android,
                 YAMLKeyword.address: adb[0],
                 YAMLKeyword.username: '',
...
@@ -968,9 +962,9 @@ class DeviceManager:
         devices = devices['devices']
         device_list = []
         for name, dev in six.iteritems(devices):
-            dev[YAMLKeyword.device_name] = name
+            dev[YAMLKeyword.device_name] = \
+                dev[YAMLKeyword.models].replace(' ', '')
             dev[YAMLKeyword.system] = SystemType.arm_linux
-            dev[YAMLKeyword.models] = dev[YAMLKeyword.models].replace(' ', '_')
             device_list.append(dev)

         return device_list
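In the hunk above, the device name for an ARM Linux entry is taken from its `models` field with spaces removed, instead of from the YAML key. A minimal sketch of that normalisation follows. The function name `name_devices`, the plain string keys, the `"arm_linux"` value, and the sample entry are illustrative assumptions rather than the MACE definitions.

```python
def name_devices(devices):
    """Derive device_name from each entry's models field, dropping spaces."""
    named = []
    for _yaml_key, dev in devices.items():
        entry = dict(dev)  # copy so the caller's data is left untouched
        entry["device_name"] = entry["models"].replace(" ", "")
        entry["system"] = "arm_linux"  # stand-in for SystemType.arm_linux
        named.append(entry)
    return named


sample = {
    "nanopi": {"models": "Nanopi M4", "target_socs": "RK3399"},
}
for dev in name_devices(sample):
    print(dev["device_name"], dev["system"])
```

One consequence of this choice is that the name no longer depends on the YAML key, so two entries with the same `models` string would share a device name.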
...
@@ -992,7 +986,6 @@ class DeviceManager:
             YAMLKeyword.target_abis: [ABIType.host],
             YAMLKeyword.target_socs: '',
             YAMLKeyword.system: SystemType.host,
-            YAMLKeyword.models: None,
             YAMLKeyword.address: None,
         }
...
tools/sh_commands.py
View file @
12c4dace
...
@@ -20,7 +20,6 @@ import os
 import re
 import sh
 import struct
-import subprocess
 import sys
 import time
 import platform
...
@@ -28,10 +27,6 @@ import platform
 import six

 import common
-from common import ModelFormat
-from common import ABIType
-from common import SystemType
-from common import YAMLKeyword
 from common import abi_to_internal

 sys.path.insert(0, "mace/python/tools")
...
@@ -179,99 +174,16 @@ def adb_get_all_socs():
 def adb_push(src_path, dst_path, serialno):
-    six.print_("Push %s to %s" % (src_path, dst_path))
     sh.adb("-s", serialno, "push", src_path, dst_path)


 def adb_pull(src_path, dst_path, serialno):
-    six.print_("Pull %s to %s" % (src_path, dst_path))
     try:
         sh.adb("-s", serialno, "pull", src_path, dst_path)
     except Exception as e:
         six.print_("Error msg: %s" % e, file=sys.stderr)


-def adb_run(abi,
-            serialno,
-            host_bin_path,
-            bin_name,
-            args="",
-            opencl_profiling=True,
-            vlog_level=0,
-            device_bin_path="/data/local/tmp/mace",
-            out_of_range_check=True,
-            address_sanitizer=False,
-            simpleperf=False):
-    host_bin_full_path = "%s/%s" % (host_bin_path, bin_name)
-    device_bin_full_path = "%s/%s" % (device_bin_path, bin_name)
-    props = adb_getprop_by_serialno(serialno)
-    six.print_(
-        "====================================================================="
-    )
-    six.print_("Trying to lock device %s" % serialno)
-    with device_lock(serialno):
-        six.print_("Run on device: %s, %s, %s" %
-                   (serialno, props["ro.board.platform"],
-                    props["ro.product.model"]))
-        sh.adb("-s", serialno, "shell", "rm -rf %s" % device_bin_path)
-        sh.adb("-s", serialno, "shell", "mkdir -p %s" % device_bin_path)
-        adb_push(host_bin_full_path, device_bin_full_path, serialno)
-        ld_preload = ""
-        if address_sanitizer:
-            adb_push(find_asan_rt_library(abi), device_bin_path, serialno)
-            ld_preload = "LD_PRELOAD=%s/%s" % (device_bin_path,
-                                               asan_rt_library_names(abi)),
-        opencl_profiling = 1 if opencl_profiling else 0
-        out_of_range_check = 1 if out_of_range_check else 0
-        six.print_("Run %s" % device_bin_full_path)
-
-        stdout_buff = []
-        process_output = make_output_processor(stdout_buff)
-
-        if simpleperf:
-            adb_push(find_simpleperf_library(abi), device_bin_path, serialno)
-            simpleperf_cmd = "%s/simpleperf" % device_bin_path
-            sh.adb(
-                "-s",
-                serialno,
-                "shell",
-                ld_preload,
-                "MACE_OUT_OF_RANGE_CHECK=%d" % out_of_range_check,
-                "MACE_OPENCL_PROFILING=%d" % opencl_profiling,
-                "MACE_CPP_MIN_VLOG_LEVEL=%d" % vlog_level,
-                simpleperf_cmd,
-                "stat",
-                "--group",
-                "raw-l1-dcache,raw-l1-dcache-refill",
-                "--group",
-                "raw-l2-dcache,raw-l2-dcache-refill",
-                "--group",
-                "raw-l1-dtlb,raw-l1-dtlb-refill",
-                "--group",
-                "raw-l2-dtlb,raw-l2-dtlb-refill",
-                device_bin_full_path,
-                args,
-                _tty_in=True,
-                _out=process_output,
-                _err_to_out=True)
-        else:
-            sh.adb(
-                "-s",
-                serialno,
-                "shell",
-                ld_preload,
-                "MACE_OUT_OF_RANGE_CHECK=%d" % out_of_range_check,
-                "MACE_OPENCL_PROFILING=%d" % opencl_profiling,
-                "MACE_CPP_MIN_VLOG_LEVEL=%d" % vlog_level,
-                device_bin_full_path,
-                args,
-                _tty_in=True,
-                _out=process_output,
-                _err_to_out=True)
-        return "".join(stdout_buff)
-
-
 ################################
 # Toolchain
 ################################
...
@@ -433,15 +345,6 @@ def gen_mace_engine_factory_source(model_tags,
     six.print_("Generate mace engine creator source done!\n")


-def pull_file_from_device(serial_num, file_path, file_name, output_dir):
-    if not os.path.exists(output_dir):
-        sh.mkdir("-p", output_dir)
-    output_path = "%s/%s" % (output_dir, file_path)
-    if os.path.exists(output_path):
-        sh.rm('-rf', output_path)
-    adb_pull(file_path + '/' + file_name, output_dir, serial_num)
-
-
 def merge_opencl_binaries(binaries_dirs,
                           cl_compiled_program_file_name,
                           output_file_path):
...
@@ -690,19 +593,17 @@ def push_depended_so_libs(libmace_dynamic_library_path,
                           abi, phone_data_dir, serialno):
     dep_so_libs = sh.bash(os.environ["ANDROID_NDK_HOME"] + "/ndk-depends",
                           libmace_dynamic_library_path)
+    src_file = ""
     for dep in split_stdout(dep_so_libs):
         if dep == "libgnustl_shared.so":
-            adb_push(
-                "%s/sources/cxx-stl/gnu-libstdc++/4.9/libs/%s/libgnustl_shared.so"  # noqa
-                % (os.environ["ANDROID_NDK_HOME"], abi),
-                phone_data_dir,
-                serialno)
+            src_file = "%s/sources/cxx-stl/gnu-libstdc++/4.9/libs/" \
+                       "%s/libgnustl_shared.so" \
+                       % (os.environ["ANDROID_NDK_HOME"], abi)
         elif dep == "libc++_shared.so":
-            adb_push(
-                "%s/sources/cxx-stl/llvm-libc++/libs/%s/libc++_shared.so"  # noqa
-                % (os.environ["ANDROID_NDK_HOME"], abi),
-                phone_data_dir,
-                serialno)
+            src_file = "%s/sources/cxx-stl/llvm-libc++/libs/" \
+                       "%s/libc++_shared.so" % (os.environ["ANDROID_NDK_HOME"], abi)
+    print("push %s to %s" % (src_file, phone_data_dir))
+    adb_push(src_file, phone_data_dir, serialno)


 def validate_model(abi,
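The `push_depended_so_libs` hunk above builds the host-side path of the shared C++ runtime from the NDK root, the ABI and the library name. The lookup can be sketched as a small table-driven helper. The names `STL_DIRS` and `stl_library_path` are hypothetical; only the two relative directories and the two library names are taken from the diff.

```python
# Relative directories of the shared C++ runtimes inside the NDK,
# copied from the paths shown in the diff.
STL_DIRS = {
    "libgnustl_shared.so": "sources/cxx-stl/gnu-libstdc++/4.9/libs",
    "libc++_shared.so": "sources/cxx-stl/llvm-libc++/libs",
}


def stl_library_path(ndk_home, abi, dep):
    """Return the NDK path of a known STL dependency, or None if unknown."""
    base = STL_DIRS.get(dep)
    if base is None:
        return None
    return "%s/%s/%s/%s" % (ndk_home, base, abi, dep)


print(stl_library_path("/opt/ndk", "arm64-v8a", "libc++_shared.so"))
print(stl_library_path("/opt/ndk", "armeabi-v7a", "libgnustl_shared.so"))
print(stl_library_path("/opt/ndk", "arm64-v8a", "libm.so"))
```

Returning `None` for an unrecognised dependency gives a caller an explicit signal to skip the push, as an alternative to carrying an empty-string path.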
...
@@ -861,149 +762,6 @@ def packaging_lib(libmace_output_dir, project_name):
 ################################
 # benchmark
 ################################
-def benchmark_model(abi,
-                    serialno,
-                    benchmark_binary_dir,
-                    benchmark_binary_name,
-                    vlog_level,
-                    embed_model_data,
-                    model_output_dir,
-                    mace_model_dir,
-                    input_nodes,
-                    output_nodes,
-                    input_shapes,
-                    output_shapes,
-                    model_tag,
-                    device_type,
-                    phone_data_dir,
-                    model_graph_format,
-                    opencl_binary_file,
-                    opencl_parameter_file,
-                    libmace_dynamic_library_path,
-                    omp_num_threads=-1,
-                    cpu_affinity_policy=1,
-                    gpu_perf_hint=3,
-                    gpu_priority_hint=3,
-                    input_file_name="model_input",
-                    link_dynamic=False):
-    six.print_("* Benchmark for %s" % model_tag)
-
-    mace_model_path = ""
-    if model_graph_format == ModelFormat.file:
-        mace_model_path = "%s/%s.pb" % (mace_model_dir, model_tag)
-    if abi == "host":
-        libmace_dynamic_lib_dir_path = \
-            os.path.dirname(libmace_dynamic_library_path)
-        p = subprocess.Popen(
-            [
-                "env",
-                "LD_LIBRARY_PATH=%s" % libmace_dynamic_lib_dir_path,
-                "MACE_CPP_MIN_VLOG_LEVEL=%s" % vlog_level,
-                "%s/%s" % (benchmark_binary_dir, benchmark_binary_name),
-                "--model_name=%s" % model_tag,
-                "--input_node=%s" % ",".join(input_nodes),
-                "--output_node=%s" % ",".join(output_nodes),
-                "--input_shape=%s" % ":".join(input_shapes),
-                "--output_shape=%s" % ":".join(output_shapes),
-                "--input_file=%s/%s" % (model_output_dir, input_file_name),
-                "--model_data_file=%s/%s.data" % (mace_model_dir, model_tag),
-                "--device=%s" % device_type,
-                "--omp_num_threads=%s" % omp_num_threads,
-                "--cpu_affinity_policy=%s" % cpu_affinity_policy,
-                "--gpu_perf_hint=%s" % gpu_perf_hint,
-                "--gpu_priority_hint=%s" % gpu_priority_hint,
-                "--model_file=%s" % mace_model_path,
-            ])
-        p.wait()
-    else:
-        sh.adb("-s", serialno, "shell", "mkdir", "-p", phone_data_dir)
-        internal_storage_dir = create_internal_storage_dir(
-            serialno, phone_data_dir)
-
-        for input_name in input_nodes:
-            formatted_name = common.formatted_file_name(input_file_name,
-                                                        input_name)
-            adb_push("%s/%s" % (model_output_dir, formatted_name),
-                     phone_data_dir, serialno)
-        if not embed_model_data:
-            adb_push("%s/%s.data" % (mace_model_dir, model_tag),
-                     phone_data_dir, serialno)
-        if device_type == common.DeviceType.GPU:
-            if os.path.exists(opencl_binary_file):
-                adb_push(opencl_binary_file, phone_data_dir, serialno)
-            if os.path.exists(opencl_parameter_file):
-                adb_push(opencl_parameter_file, phone_data_dir, serialno)
-        mace_model_phone_path = ""
-        if model_graph_format == ModelFormat.file:
-            mace_model_phone_path = "%s/%s.pb" % (phone_data_dir, model_tag)
-            adb_push(mace_model_path,
-                     mace_model_phone_path,
-                     serialno)
-
-        if link_dynamic:
-            adb_push(libmace_dynamic_library_path, phone_data_dir,
-                     serialno)
-            push_depended_so_libs(libmace_dynamic_library_path, abi,
-                                  phone_data_dir, serialno)
-        adb_push("%s/%s" % (benchmark_binary_dir, benchmark_binary_name),
-                 phone_data_dir,
-                 serialno)
-
-        adb_cmd = [
-            "LD_LIBRARY_PATH=%s" % phone_data_dir,
-            "MACE_CPP_MIN_VLOG_LEVEL=%s" % vlog_level,
-            "MACE_RUN_PARAMETER_PATH=%s/mace_run.config" % phone_data_dir,
-            "MACE_INTERNAL_STORAGE_PATH=%s" % internal_storage_dir,
-            "MACE_OPENCL_PROFILING=1",
-            "%s/%s" % (phone_data_dir, benchmark_binary_name),
-            "--model_name=%s" % model_tag,
-            "--input_node=%s" % ",".join(input_nodes),
-            "--output_node=%s" % ",".join(output_nodes),
-            "--input_shape=%s" % ":".join(input_shapes),
-            "--output_shape=%s" % ":".join(output_shapes),
-            "--input_file=%s/%s" % (phone_data_dir, input_file_name),
-            "--model_data_file=%s/%s.data" % (phone_data_dir, model_tag),
-            "--device=%s" % device_type,
-            "--omp_num_threads=%s" % omp_num_threads,
-            "--cpu_affinity_policy=%s" % cpu_affinity_policy,
-            "--gpu_perf_hint=%s" % gpu_perf_hint,
-            "--gpu_priority_hint=%s" % gpu_priority_hint,
-            "--model_file=%s" % mace_model_phone_path,
-            "--opencl_binary_file=%s/%s" %
-            (phone_data_dir, os.path.basename(opencl_binary_file)),
-            "--opencl_parameter_file=%s/%s" %
-            (phone_data_dir, os.path.basename(opencl_parameter_file)),
-        ]
-        adb_cmd = ' '.join(adb_cmd)
-        cmd_file_name = "%s-%s-%s" % ('cmd_file', model_tag, str(time.time()))
-        adb_cmd_file = "%s/%s" % (phone_data_dir, cmd_file_name)
-        tmp_cmd_file = "%s/%s" % ('/tmp', cmd_file_name)
-        with open(tmp_cmd_file, 'w') as cmd_file:
-            cmd_file.write(adb_cmd)
-        adb_push(tmp_cmd_file, adb_cmd_file, serialno)
-        os.remove(tmp_cmd_file)
-
-        sh.adb(
-            "-s",
-            serialno,
-            "shell",
-            "sh",
-            adb_cmd_file,
-            _fg=True)
-
-        sh.adb(
-            "-s",
-            serialno,
-            "shell",
-            "rm",
-            adb_cmd_file,
-            _fg=True)
-
-    six.print_("Benchmark done!\n")
-
-
 def build_run_throughput_test(abi,
                               serialno,
                               vlog_level,
...