Commit 3c4edfcc, authored by 李寅

Merge branch 'hexagon_nn' into 'master'

Open hexagon_nn

See merge request !921
@@ -178,16 +178,18 @@ quantization_tests:
     - pwd
     - rm -rf mace-models
     - GIT_SSH_COMMAND="ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no" git clone git@github.com:XiaoMi/mace-models.git
-    - CONF_FILE=mace-models/mobilenet-v1/mobilenet-v1-quantize-retrain.yml
     - >
       if ping -c 1 v9.git.n.xiaomi.com 1>/dev/null 2>&1; then
         GIT_SSH_COMMAND="ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no" git clone git@v9.git.n.xiaomi.com:deep-computing/generic-mobile-devices.git
         DEVICE_CONF_FILE=generic-mobile-devices/devices.yml
       fi
     - >
-      python tools/converter.py convert --config=${CONF_FILE} --model_graph_format=file --model_data_format=file --cl_mem_type=buffer || exit 1;
-      python tools/converter.py run --config=${CONF_FILE} --device_yml=${DEVICE_CONF_FILE} --round=1 --target_abis=armeabi-v7a,arm64 --validate --model_graph_format=file --model_data_format=file || exit 1;
-      python tools/converter.py run --config=${CONF_FILE} --device_yml=${DEVICE_CONF_FILE} --example --target_abis=armeabi-v7a,arm64 --round=1 --validate --model_graph_format=file --model_data_format=file || exit 1;
+      for CONF_FILE in mace-models/mobilenet-v1/mobilenet-v1-quantize-retrain.yml mace-models/mobilenet-v1/mobilenet-v1-quantize-retrain-for-check-only.yml mace-models/mobilenet-v1/mobilenet-v1-quantize-retrain-dsp.yml;
+      do
+        python tools/converter.py convert --config=${CONF_FILE} --model_graph_format=file --model_data_format=file || exit 1;
+        python tools/converter.py run --config=${CONF_FILE} --device_yml=${DEVICE_CONF_FILE} --round=1 --validate --model_graph_format=file --model_data_format=file || exit 1;
+        python tools/converter.py run --config=${CONF_FILE} --device_yml=${DEVICE_CONF_FILE} --example --round=1 --validate --model_graph_format=file --model_data_format=file || exit 1;
+      done
     - rm -rf mace-models
 build_android_demo:
build_android_demo: build_android_demo:
......
@@ -76,7 +76,7 @@ please refer to [the contribution guide](https://mace.readthedocs.io/en/latest/d
 MACE depends on several open source projects located in the
 [third_party](third_party) directory. Particularly, we learned a lot from
 the following projects during the development:
-* [Qualcomm Hexagon NN Offload Framework](https://source.codeaurora.org/quic/hexagon_nn/nnlib): the Hexagon DSP runtime
+* [Qualcomm Hexagon NN Offload Framework](https://developer.qualcomm.com/software/hexagon-dsp-sdk): the Hexagon DSP runtime
   depends on this library.
 * [TensorFlow](https://github.com/tensorflow/tensorflow),
   [Caffe](https://github.com/BVLC/caffe),
......
@@ -59,9 +59,9 @@ Why is MACE not working on DSP?
 ------------------------------------------------------------------------------
 Running models on Hexagon DSP need a few prerequisites for DSP developers:
-* You need make sure SOCs of your phone is manufactured by Qualcomm and has HVX supported.
+* You need to make sure SOCs of your phone is manufactured by Qualcomm and has HVX supported.
 * You need a phone that disables secure boot (once enabled, cannot be reversed, so you probably can only get that type phones from manufacturers)
-* You need sign your phone by using testsig provided by Qualcomm. (Download Qualcomm Hexagon SDK first, plugin your phone to PC, run scripts/testsig.py)
-* You need install Hexagon nnlib backend by following nnlib README (https://github.com/XiaoMi/nnlib).
+* You need to sign your phone by using testsig provided by Qualcomm. (Download Qualcomm Hexagon SDK first, plugin your phone to PC, run scripts/testsig.py)
+* You need to push `third_party/nnlib/v6x/libhexagon_nn_skel.so` to `/system/vendor/lib/rfsa/adsp/`.
 Then, there you go. You can run Mace on Hexagon DSP.
@@ -99,7 +99,6 @@ MACE now supports models from TensorFlow and Caffe (more frameworks will be supp
     Prepare your pre-trained TensorFlow model.pb file.
 - Caffe
     Caffe 1.0+ models are supported in MACE converter tool.
@@ -253,7 +252,13 @@ However, there are some differences in different devices.
 * **DSP**
-    MACE only support Qualcomm DSP.
+    MACE only supports Qualcomm DSP. And you need to push the hexagon nn library to the device.
+    .. code:: sh
+        # For Android device
+        adb root; adb remount
+        adb push third_party/nnlib/v6x/libhexagon_nn_skel.so /system/vendor/lib/rfsa/adsp/
 In the converting and building steps, you've got the static/shared library, model files and
 header files.
......
@@ -22,9 +22,6 @@ models, e.g., MobileNet. The only thing you need to make it run using MACE is to
 2. `quantize`: set `quantize` to be 1.
-.. note::
-    You need set `runtime` to be `cpu` because we only support this quantization method to run on CPU for now (soon DSP will be supported).
 Post training quantization
 ---------------------------
......
@@ -107,11 +107,7 @@ bool HexagonControlWrapper::Config() {
 bool HexagonControlWrapper::Init() {
   LOG(INFO) << "Hexagon init";
-#ifdef MACE_USE_NNLIB_OLD
-  nn_id_ = hexagon_nn_init();
-#else
   MACE_CHECK(hexagon_nn_init(&nn_id_) == 0, "hexagon_nn_init failed");
-#endif
   ResetPerfInfo();
   return true;
 }
...@@ -128,138 +124,116 @@ bool HexagonControlWrapper::SetupGraph(const NetDef &net_def, ...@@ -128,138 +124,116 @@ bool HexagonControlWrapper::SetupGraph(const NetDef &net_def,
int64_t t0 = NowMicros(); int64_t t0 = NowMicros();
// const node // const node
#if defined(MACE_USE_NNLIB_CAF) || defined(MACE_USE_NNLIB_OLD) std::vector<hexagon_nn_const_node> const_node_list;
std::thread const_thread([&]() for (const ConstTensor &const_tensor : net_def.tensors()) {
#endif std::vector<int> tensor_shape(const_tensor.dims().begin(),
{ const_tensor.dims().end());
std::vector<hexagon_nn_const_node> const_node_list; while (tensor_shape.size() < 4) {
for (const ConstTensor &const_tensor : net_def.tensors()) { tensor_shape.insert(tensor_shape.begin(), 1);
std::vector<int> tensor_shape(const_tensor.dims().begin(),
const_tensor.dims().end());
while (tensor_shape.size() < 4) {
tensor_shape.insert(tensor_shape.begin(), 1);
}
hexagon_nn_const_node const_node;
const_node.node_id = node_id(const_tensor.node_id());
const_node.tensor.batches = tensor_shape[0];
const_node.tensor.height = tensor_shape[1];
const_node.tensor.width = tensor_shape[2];
const_node.tensor.depth = tensor_shape[3];
if (const_tensor.data_type() == DataType::DT_INT32 &&
const_tensor.data_size() == 0) {
const_node.tensor.data = NULL;
const_node.tensor.dataLen = 0;
} else {
const_node.tensor.data =
const_cast<unsigned char *>(model_data + const_tensor.offset());
const_node.tensor.dataLen = const_tensor.data_size() *
GetEnumTypeSize(const_tensor.data_type());
}
const_node_list.push_back(const_node);
// 255 is magic number: why fastrpc limits sequence length to that?
if (const_node_list.size() >= 250) {
MACE_CHECK(
hexagon_nn_append_const_node_list(nn_id_, const_node_list.data(),
const_node_list.size()) == 0,
"append const node error");
const_node_list.clear();
}
} }
if (!const_node_list.empty()) { hexagon_nn_const_node const_node;
const_node.node_id = node_id(const_tensor.node_id());
const_node.tensor.batches = tensor_shape[0];
const_node.tensor.height = tensor_shape[1];
const_node.tensor.width = tensor_shape[2];
const_node.tensor.depth = tensor_shape[3];
if (const_tensor.data_type() == DataType::DT_INT32 &&
const_tensor.data_size() == 0) {
const_node.tensor.data = NULL;
const_node.tensor.dataLen = 0;
} else {
const_node.tensor.data =
const_cast<unsigned char *>(model_data + const_tensor.offset());
const_node.tensor.dataLen = const_tensor.data_size() *
GetEnumTypeSize(const_tensor.data_type());
}
const_node_list.push_back(const_node);
// 255 is magic number: why fastrpc limits sequence length to that?
if (const_node_list.size() >= 250) {
MACE_CHECK( MACE_CHECK(
hexagon_nn_append_const_node_list(nn_id_, const_node_list.data(), hexagon_nn_append_const_node_list(nn_id_, const_node_list.data(),
const_node_list.size()) == 0, const_node_list.size()) == 0,
"append const node error"); "append const node error");
const_node_list.clear();
} }
const_node_list.clear();
} }
#if defined(MACE_USE_NNLIB_CAF) || defined(MACE_USE_NNLIB_OLD)
); // NOLINT if (!const_node_list.empty()) {
#endif MACE_CHECK(
hexagon_nn_append_const_node_list(nn_id_, const_node_list.data(),
const_node_list.size()) == 0,
"append const node error");
}
const_node_list.clear();
// op node // op node
#if defined(MACE_USE_NNLIB_CAF) || defined(MACE_USE_NNLIB_OLD) OpMap op_map;
std::thread op_thread([&]() op_map.Init();
#endif std::vector<hexagon_nn_op_node> op_node_list;
{ std::vector<std::vector<hexagon_nn_input>> cached_inputs;
OpMap op_map; std::vector<std::vector<hexagon_nn_output>> cached_outputs;
op_map.Init(); std::vector<hexagon_nn_input> inputs;
std::vector<hexagon_nn_op_node> op_node_list; std::vector<hexagon_nn_output> outputs;
std::vector<std::vector<hexagon_nn_input>> cached_inputs;
std::vector<std::vector<hexagon_nn_output>> cached_outputs; for (const OperatorDef &op : net_def.op()) {
std::vector<hexagon_nn_input> inputs; int op_id = op_map.GetOpId(op.type());
std::vector<hexagon_nn_output> outputs; inputs.resize(op.node_input().size());
for (int i = 0; i < op.node_input().size(); ++i) {
for (const OperatorDef &op : net_def.op()) { inputs[i].src_id = node_id(op.node_input()[i].node_id());
int op_id = op_map.GetOpId(op.type()); inputs[i].output_idx = op.node_input()[i].output_port();
inputs.resize(op.node_input().size()); }
for (int i = 0; i < op.node_input().size(); ++i) { outputs.resize(op.output_shape().size());
inputs[i].src_id = node_id(op.node_input()[i].node_id()); for (int i = 0; i < op.output_shape().size(); ++i) {
inputs[i].output_idx = op.node_input()[i].output_port(); outputs[i].rank = op.output_shape()[i].dims().size();
} for (size_t j = 0; j < outputs[i].rank; ++j) {
outputs.resize(op.output_shape().size()); outputs[i].max_sizes[j] = op.output_shape()[i].dims()[j];
for (int i = 0; i < op.output_shape().size(); ++i) {
#ifdef MACE_USE_NNLIB_OLD
outputs[i].max_size = op.out_max_byte_size()[i];
#else
outputs[i].rank = op.output_shape()[i].dims().size();
for (size_t j = 0; j < outputs[i].rank; ++j) {
outputs[i].max_sizes[j] = op.output_shape()[i].dims()[j];
}
if (outputs[i].rank == 0) {
outputs[i].rank = 1;
outputs[i].max_sizes[0] = 1;
}
outputs[i].max_sizes[outputs[i].rank] = 0;
outputs[i].elementsize = GetEnumTypeSize(
static_cast<DataType>(op.output_type()[i]));
outputs[i].zero_offset = 0;
outputs[i].stepsize = 0;
#endif
} }
cached_inputs.push_back(inputs); if (outputs[i].rank == 0) {
cached_outputs.push_back(outputs); outputs[i].rank = 1;
outputs[i].max_sizes[0] = 1;
hexagon_nn_padding_type padding_type =
static_cast<hexagon_nn_padding_type>(op.padding());
hexagon_nn_op_node op_node;
op_node.node_id = node_id(op.node_id());
op_node.operation = op_id;
op_node.padding = padding_type;
op_node.inputs = cached_inputs.back().data();
op_node.inputsLen = inputs.size();
op_node.outputs = cached_outputs.back().data();
op_node.outputsLen = outputs.size();
op_node_list.push_back(op_node);
if (op_node_list.size() >= 125) {
MACE_CHECK(hexagon_nn_append_node_list(nn_id_, op_node_list.data(),
op_node_list.size()) == 0,
"append node error");
op_node_list.clear();
cached_inputs.clear();
cached_outputs.clear();
} }
outputs[i].max_sizes[outputs[i].rank] = 0;
outputs[i].elementsize = GetEnumTypeSize(
static_cast<DataType>(op.output_type()[i]));
outputs[i].zero_offset = 0;
outputs[i].stepsize = 0;
} }
cached_inputs.push_back(inputs);
if (!op_node_list.empty()) { cached_outputs.push_back(outputs);
hexagon_nn_padding_type padding_type =
static_cast<hexagon_nn_padding_type>(op.padding());
hexagon_nn_op_node op_node;
op_node.node_id = node_id(op.node_id());
op_node.operation = op_id;
op_node.padding = padding_type;
op_node.inputs = cached_inputs.back().data();
op_node.inputsLen = inputs.size();
op_node.outputs = cached_outputs.back().data();
op_node.outputsLen = outputs.size();
op_node_list.push_back(op_node);
if (op_node_list.size() >= 125) {
MACE_CHECK(hexagon_nn_append_node_list(nn_id_, op_node_list.data(), MACE_CHECK(hexagon_nn_append_node_list(nn_id_, op_node_list.data(),
op_node_list.size()) == 0, op_node_list.size()) == 0,
"append node error"); "append node error");
op_node_list.clear();
cached_inputs.clear();
cached_outputs.clear();
} }
op_node_list.clear();
cached_inputs.clear();
cached_outputs.clear();
} }
#if defined(MACE_USE_NNLIB_CAF) || defined(MACE_USE_NNLIB_OLD)
); // NOLINT if (!op_node_list.empty()) {
const_thread.join(); MACE_CHECK(hexagon_nn_append_node_list(nn_id_, op_node_list.data(),
op_thread.join(); op_node_list.size()) == 0,
#endif "append node error");
}
op_node_list.clear();
cached_inputs.clear();
cached_outputs.clear();
// input info // input info
num_inputs_ = 0; num_inputs_ = 0;
@@ -460,7 +434,7 @@ bool HexagonControlWrapper::ExecuteGraph(const Tensor &input_tensor,
 bool HexagonControlWrapper::ExecuteGraphNew(
     const std::vector<Tensor *> &input_tensors,
     std::vector<Tensor *> *output_tensors) {
-  LOG(INFO) << "Execute graph new: " << nn_id_;
+  VLOG(2) << "Execute graph new: " << nn_id_;
   uint32_t num_inputs = static_cast<uint32_t>(input_tensors.size());
   uint32_t num_outputs = static_cast<uint32_t>(output_tensors->size());
   MACE_ASSERT(num_inputs_ == num_inputs, "Wrong inputs num");
......
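The `SetupGraph` hunk above keeps two small patterns from the old code: constant tensor shapes are left-padded with 1s to four dimensions, and nodes are appended in batches (250 const nodes or 125 op nodes per call) with a final flush for whatever remains. A minimal Python sketch of those patterns follows. `pad_to_4d`, `append_in_batches`, and the `append_fn` callback are hypothetical names used only for illustration; they are not part of MACE or nnlib.

```python
def pad_to_4d(dims):
    """Left-pad a shape with 1s until it has four dimensions."""
    shape = list(dims)
    while len(shape) < 4:
        shape.insert(0, 1)
    return shape


def append_in_batches(nodes, batch_size, append_fn):
    """Pass nodes to append_fn in lists of at most batch_size items."""
    pending = []
    for node in nodes:
        pending.append(node)
        if len(pending) >= batch_size:
            append_fn(list(pending))
            pending = []
    # Flush the remainder, mirroring the "if (!list.empty())" step in the diff.
    if pending:
        append_fn(list(pending))


if __name__ == "__main__":
    sent = []
    append_in_batches(range(600), 250, sent.append)
    print([len(batch) for batch in sent])
    print(pad_to_4d([1001]))
```

The batch size is a parameter here because the diff uses different limits for const nodes and op nodes.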
// Copyright 2018 Xiaomi, Inc. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#ifndef MACE_CORE_RUNTIME_HEXAGON_HEXAGON_DEVICE_H_
#define MACE_CORE_RUNTIME_HEXAGON_HEXAGON_DEVICE_H_

#include "mace/core/device.h"

namespace mace {

class HexagonDevice : public CPUDevice {
 public:
  HexagonDevice() : CPUDevice(0, AFFINITY_NONE, false) {}

  DeviceType device_type() const override {
    return DeviceType::HEXAGON;
  };
};

}  // namespace mace
#endif  // MACE_CORE_RUNTIME_HEXAGON_HEXAGON_DEVICE_H_
@@ -18,7 +18,6 @@ models:
       - 1,1001
     runtime: cpu+gpu
     limit_opencl_kernel_time: 0
-    nnlib_graph_mode: 0
     obfuscate: 0
     winograd: 0
   mobilenet_v2:
@@ -36,7 +35,6 @@ models:
       - 1,1001
     runtime: cpu+gpu
     limit_opencl_kernel_time: 0
-    nnlib_graph_mode: 0
     obfuscate: 0
     winograd: 0
   mobilenet_v1_quant:
@@ -56,7 +54,6 @@ models:
       - 1,1001
     runtime: cpu
     limit_opencl_kernel_time: 0
-    nnlib_graph_mode: 0
     obfuscate: 0
     winograd: 0
     quantize: 1
@@ -77,7 +74,6 @@ models:
       - 1,1001
     runtime: cpu
     limit_opencl_kernel_time: 0
-    nnlib_graph_mode: 0
     obfuscate: 0
     winograd: 0
     quantize: 1
@@ -34,6 +34,7 @@
 #ifdef MACE_ENABLE_HEXAGON
 #include "mace/core/runtime/hexagon/hexagon_control_wrapper.h"
+#include "mace/core/runtime/hexagon/hexagon_device.h"
 #endif  // MACE_ENABLE_HEXAGON
 namespace mace {
@@ -387,7 +388,7 @@ MaceEngine::Impl::Impl(const MaceEngineConfig &config)
 #endif
 {
   LOG(INFO) << "Creating MaceEngine, MACE version: " << MaceVersion();
-  if (device_type_ == DeviceType::CPU || device_type_ == DeviceType::HEXAGON) {
+  if (device_type_ == DeviceType::CPU) {
     device_.reset(new CPUDevice(config.impl_->num_threads(),
                                 config.impl_->cpu_affinity_policy(),
                                 config.impl_->use_gemmlowp()));
@@ -405,6 +406,12 @@ MaceEngine::Impl::Impl(const MaceEngineConfig &config)
         config.impl_->use_gemmlowp()));
   }
 #endif
+#ifdef MACE_ENABLE_HEXAGON
+  if (device_type_ == DeviceType::HEXAGON) {
+    device_.reset(new HexagonDevice());
+  }
+#endif
+  MACE_CHECK_NOTNULL(device_);
 }
 MaceStatus MaceEngine::Impl::Init(
@@ -443,6 +450,7 @@ MaceStatus MaceEngine::Impl::Init(
                  << "' does not belong to model's outputs "
                  << MakeString(MapKeys(output_info_map_));
     }
+    ws_->CreateTensor(output_name, device_->allocator(), DT_FLOAT);
   }
 #ifdef MACE_ENABLE_HEXAGON
   if (device_type_ == HEXAGON) {
......
@@ -16,7 +16,6 @@ py_library(
         "converter_tool/onnx_converter.py",
         "converter_tool/shape_inference.py",
         "converter_tool/tensorflow_converter.py",
-        "converter_tool/tf_dsp_converter.py",
         "converter_tool/transformer.py",
         "graph_util.py",
     ],
......
@@ -45,14 +45,14 @@ data_format_map = {
 def parse_data_type(data_type, device_type):
-    if device_type == cvt.DeviceType.CPU.value or\
+    if device_type == cvt.DeviceType.CPU.value or \
             device_type == cvt.DeviceType.GPU.value:
         if data_type == 'fp32_fp32':
             return mace_pb2.DT_FLOAT
         else:
             return mace_pb2.DT_HALF
     elif device_type == cvt.DeviceType.HEXAGON.value:
-        return mace_pb2.DT_UINT8
+        return mace_pb2.DT_FLOAT
     else:
         print("Invalid device type: " + device_type)
@@ -167,45 +167,39 @@ def main(unused_args):
             check_node.name = check_node_names[i]
             check_node.shape = parse_int_array_from_str(check_node_shapes[i])
             option.add_check_node(check_node)
+    else:
+        option.check_nodes = option.output_nodes
     option.build()
     print("Transform model to one that can better run on device")
-    if FLAGS.runtime == 'dsp' and not option.quantize:
-        mace_check(FLAGS.platform == 'tensorflow',
-                   'DSP only supports tensorflow')
-        from mace.python.tools.converter_tool import tf_dsp_converter
-        converter = tf_dsp_converter.TensorflowDspConverter(
+    if FLAGS.platform == 'tensorflow':
+        from mace.python.tools.converter_tool import tensorflow_converter
+        converter = tensorflow_converter.TensorflowConverter(
             option, FLAGS.model_file)
-        output_graph_def = converter.run()
+    elif FLAGS.platform == 'caffe':
+        from mace.python.tools.converter_tool import caffe_converter
+        converter = caffe_converter.CaffeConverter(option,
+                                                   FLAGS.model_file,
+                                                   FLAGS.weight_file)
+    elif FLAGS.platform == 'onnx':
+        from mace.python.tools.converter_tool import onnx_converter
+        converter = onnx_converter.OnnxConverter(option, FLAGS.model_file)
     else:
-        if FLAGS.platform == 'tensorflow':
-            from mace.python.tools.converter_tool import tensorflow_converter
-            converter = tensorflow_converter.TensorflowConverter(
-                option, FLAGS.model_file)
-        elif FLAGS.platform == 'caffe':
-            from mace.python.tools.converter_tool import caffe_converter
-            converter = caffe_converter.CaffeConverter(option,
-                                                       FLAGS.model_file,
-                                                       FLAGS.weight_file)
-        elif FLAGS.platform == 'onnx':
-            from mace.python.tools.converter_tool import onnx_converter
-            converter = onnx_converter.OnnxConverter(option, FLAGS.model_file)
-        else:
-            six.print_("Mace do not support platorm %s yet." % FLAGS.platform,
-                       file=sys.stderr)
-            exit(1)
+        six.print_("Mace do not support platorm %s yet." % FLAGS.platform,
+                   file=sys.stderr)
+        exit(1)
+    output_graph_def = converter.run()
+    mace_transformer = transformer.Transformer(
+        option, output_graph_def)
+    output_graph_def, quantize_activation_info = mace_transformer.run()
+    if FLAGS.runtime == 'dsp':
+        from mace.python.tools.converter_tool import hexagon_converter
+        converter = hexagon_converter.HexagonConverter(
+            option, output_graph_def, quantize_activation_info)
         output_graph_def = converter.run()
-        mace_transformer = transformer.Transformer(
-            option, output_graph_def)
-        output_graph_def, quantize_activation_info = mace_transformer.run()
-        if FLAGS.runtime == 'dsp':
-            from mace.python.tools.converter_tool import hexagon_converter
-            converter = hexagon_converter.HexagonConverter(
-                option, output_graph_def, quantize_activation_info)
-            output_graph_def = converter.run()
     model_saver.save_model(
         option, output_graph_def, model_checksum, weight_checksum,
......
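Read as a whole, the `main` hunk above removes the separate `tf_dsp_converter` branch: every platform now goes through its front-end converter, then the shared `Transformer`, and only then through `HexagonConverter` when the runtime is `dsp`. The sketch below restates that control flow with plain callables. The `convert` function and all of its arguments are hypothetical stand-ins, not the real MACE classes.

```python
def convert(platform, runtime, frontends, transform, to_hexagon):
    """Run front-end conversion, the shared transform, then the optional DSP pass.

    Hypothetical stand-ins: `frontends` maps a platform name to a callable
    that returns a graph, `transform` returns (graph, quantize_info), and
    `to_hexagon` post-processes the graph for the DSP runtime.
    """
    if platform not in frontends:
        raise ValueError("unsupported platform: %s" % platform)
    graph = frontends[platform]()
    graph, quantize_info = transform(graph)
    if runtime == "dsp":
        graph = to_hexagon(graph, quantize_info)
    return graph


if __name__ == "__main__":
    frontends = {"tensorflow": lambda: ["tf_graph"]}

    def transform(graph):
        return graph + ["transformed"], {}

    def to_hexagon(graph, quantize_info):
        return graph + ["hexagon"]

    print(convert("tensorflow", "dsp", frontends, transform, to_hexagon))
    print(convert("tensorflow", "cpu", frontends, transform, to_hexagon))
```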
@@ -373,7 +373,7 @@ class ConverterOption(object):
     @input_nodes.setter
     def input_nodes(self, input_nodes):
-        for node in input_nodes:
+        for node in input_nodes.values():
             self._input_nodes[node.name] = node
     def add_input_node(self, input_node):
@@ -381,7 +381,7 @@ class ConverterOption(object):
     @output_nodes.setter
     def output_nodes(self, output_nodes):
-        for node in output_nodes:
+        for node in output_nodes.values():
             self.output_nodes[node.name] = node
     def add_output_node(self, output_node):
@@ -389,7 +389,7 @@ class ConverterOption(object):
     @check_nodes.setter
     def check_nodes(self, check_nodes):
-        for node in check_nodes:
+        for node in check_nodes.values():
             self.check_nodes[node.name] = node
     def add_check_node(self, check_node):
......
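The three setter changes above switch from iterating the argument directly to iterating `.values()`, which suggests the setters receive a name-to-node mapping rather than a list. Iterating a dict yields its keys (strings here), and a string has no `.name` attribute. A small sketch of the difference, with a hypothetical `Node` class and `copy_nodes` helper standing in for the converter's own types:

```python
from collections import OrderedDict


class Node(object):
    """Hypothetical stand-in for a converter node; only `name` matters here."""

    def __init__(self, name):
        self.name = name


def copy_nodes(nodes_by_name):
    """Re-key a name -> node mapping by each node's own name."""
    copied = OrderedDict()
    # Iterating the mapping itself would yield the string keys, not the nodes.
    for node in nodes_by_name.values():
        copied[node.name] = node
    return copied


if __name__ == "__main__":
    source = OrderedDict([("input", Node("input")), ("aux", Node("aux"))])
    print(list(copy_nodes(source).keys()))
```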
@@ -104,7 +104,6 @@ class HexagonConverter(base_converter.ConverterInterface):
             output_name = self._option.output_nodes.values()[0].name
         else:
             output_name = self._option.check_nodes.values()[0].name
-        output_name = MaceKeyword.mace_output_node_name + '_' + output_name
         output_name = normalize_name(output_name)
         self._model = graph_util.sort_mace_graph(self._model, output_name)
@@ -311,9 +310,8 @@ class HexagonConverter(base_converter.ConverterInterface):
         return tensor.name
     def add_input_output_node(self):
-        input_node = self._option.input_nodes.values()[0]
         for op in self._model.op:
-            if op.name == input_node.name:
+            if op.name.startswith(MaceKeyword.mace_input_node_name):
                 del op.input[0]
                 break
@@ -324,8 +322,7 @@ class HexagonConverter(base_converter.ConverterInterface):
             output_name = self._option.check_nodes.values()[0].name
         output_name = normalize_name(output_name)
         for op in self._model.op:
-            if op.name.startswith(MaceKeyword.mace_output_node_name) \
-                    and op.name.find(output_name) != -1:
+            if op.name == output_name:
                 output_node = op
                 break
         mace_check(output_node is not None,
@@ -348,8 +345,6 @@ class HexagonConverter(base_converter.ConverterInterface):
             node_id_counter += 1
             node_id_map[op.name] = op.node_id
             for ipt in op.input:
-                if ipt.startswith(MaceKeyword.mace_input_node_name):
-                    ipt = ipt[len(MaceKeyword.mace_input_node_name + '_'):]
                 op_name, port = get_op_and_port_from_tensor(ipt)
                 node_id = node_id_map[op_name]
                 node_input = op.node_input.add()
......
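The last hunk above resolves each input tensor name of the form `op_name:port` to the id of the op that produces it, now without stripping an input-node prefix first. The sketch below shows that lookup in isolation. `split_tensor_name` and `resolve_inputs` are hypothetical names, and the sequential ids from `enumerate` are an assumption made for the example; the real converter assigns ids with its own counter.

```python
def split_tensor_name(tensor_name):
    """Split 'op_name:port' into (op_name, port)."""
    op_name, port = tensor_name.rsplit(":", 1)
    return op_name, int(port)


def resolve_inputs(ops):
    """Map each op's input tensors to (producer_node_id, output_port) pairs.

    `ops` is a list of (op_name, [input_tensor_names]) in topological order,
    so every producer already has an id when a consumer looks it up.
    """
    node_id_map = {}
    resolved = {}
    for node_id, (op_name, inputs) in enumerate(ops):
        node_id_map[op_name] = node_id
        resolved[op_name] = []
        for tensor_name in inputs:
            src_op, port = split_tensor_name(tensor_name)
            resolved[op_name].append((node_id_map[src_op], port))
    return resolved


if __name__ == "__main__":
    graph = [("input", []), ("conv", ["input:0"]), ("add", ["conv:0", "input:0"])]
    print(resolve_inputs(graph)["add"])
```

A `KeyError` from `node_id_map` in this sketch would mean an op consumes a tensor whose producer has not been visited yet, which is why the ops are expected in sorted order.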
# Copyright 2018 Xiaomi, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from mace.proto import mace_pb2
from mace.python.tools.converter_tool import base_converter
from mace.python.tools import graph_util
from mace.python.tools.convert_util import mace_check

import six
import tensorflow as tf
from tensorflow.core.framework import tensor_shape_pb2
from operator import mul
import numpy as np


class DspOps(object):
    def __init__(self):
        self.dsp_ops = {
            'INPUT': 'INPUT"',
            'OUTPUT': 'OUTPUT',
            'NoOp': 'Nop',
            'FLATTEN': 'Flatten',
            'Identity': 'Nop',
            'Placeholder': 'INPUT',
            'Const': 'Const',
            'QuantizedConv2D': 'QuantizedConv2d_8x8to32',
            'QuantizedMatMul': 'QuantizedMatMul_8x8to32',
            'QuantizeDownAndShrinkRange': 'QuantizeDownAndShrinkRange_32to8',
            'QuantizedRelu': 'QuantizedRelu_8',
            'QuantizedReluX': 'QuantizedReluX_8',
            'QuantizedMaxPool': 'QuantizedMaxPool_8',
            'QuantizedAvgPool': 'QuantizedAvgPool_8',
            'QuantizedConcat': 'QuantizedConcat_8',
            'QuantizedBiasAdd': 'QuantizedBiasAdd_8p8to32',
            'QuantizedResizeBilinear': 'QuantizedResizeBilinear_8',
            'QuantizedSpaceToBatchND': 'QuantizedSpaceToBatchND_8',
            'QuantizedBatchToSpaceND': 'QuantizedBatchToSpaceND_8',
            'QuantizedSoftmax': 'QuantizedSoftmax_8',
            'QuantizedTanh': 'QuantizedTanh_8',
            'Min': 'Min_f',
            'Max': 'Max_f',
            'QuantizeV2': 'Quantize',
            'Dequantize': 'Dequantize',
            'Softmax': 'Softmax_f',
            'Reshape': 'Reshape',
            'QuantizedReshape': 'QuantizedReshape',
            'Sigmoid': 'Sigmoid_f',
            'Slice': 'Slice_f',
            'Add': 'Add_f',
            'Mul': 'Mul_f',
            'Requantize': 'Requantize_32to8',
            'RequantizationRange': 'RequantizationRange_32',
            'Sub': 'Sub_f',
            'Pack': 'Pack_int32',
            'StridedSlice': 'StridedSlice_f',
            'ExpandDims': 'ExpandDims_f',
            'QuantizedMul': 'QuantizedMul_8x8to32',
            'QuantizedAdd': 'QuantizedAdd_8p8to32',
            'Pad': 'Pad_f',
            'SpaceToBatchND': 'SpaceToBatchND_f',
            'BatchToSpaceND': 'BatchToSpaceND_f',
            'ResizeBilinear': 'ResizeBilinear_f',
            'ConcatV2': 'ConcatV2_f',
            'Conv2DBackpropInput': 'Deconv_f',
            'Tanh': 'Tanh_f',
            'Split': 'Split_f',
            'Transpose': 'Transpose_f',
            'Concat': 'Concat_f',
            'AddN': 'AddN_f',
        }

    def has_op(self, tf_op):
        return tf_op in self.dsp_ops

    def map_nn_op(self, tf_op):
        if tf_op not in self.dsp_ops:
            raise Exception('Could not map nn op for: ', tf_op)
        return self.dsp_ops[tf_op]


TF_DTYPE_2_MACE_DTYPE_MAP = {
    tf.float32: mace_pb2.DT_FLOAT,
    tf.half: mace_pb2.DT_HALF,
    tf.int32: mace_pb2.DT_INT32,
    tf.qint32: mace_pb2.DT_INT32,
    tf.quint8: mace_pb2.DT_UINT8,
    tf.uint8: mace_pb2.DT_UINT8,
}


def tf_dtype_2_mace_dtype(tf_dtype):
    mace_dtype = TF_DTYPE_2_MACE_DTYPE_MAP.get(tf_dtype, None)
    if not mace_dtype:
        raise Exception("Not supported tensorflow dtype: " + tf_dtype)
    return mace_dtype


padding_mode = {
    'NA': 0,
    'SAME': 1,
    'VALID': 2,
    'MIRROR_REFLECT': 3,
    'MIRROR_SYMMETRIC': 4,
    'SAME_CAFFE': 5
}


def get_tensor_name_from_op(op_name, port):
    return op_name + ':' + str(port)


def get_node_from_map(op_map, op_or_tensor_name):
    op_name = op_or_tensor_name.split(':')[0]
    return op_map[op_name]


def get_op_and_port_from_tensor(tensor_name):
    op, port = tensor_name.split(':')
    port = int(port)
    return op, port


def max_elem_size(tensor):
    if len(tensor.shape.as_list()) == 0:
        return tensor.dtype.size
    else:
        return reduce(mul, tensor.shape.as_list()) * tensor.dtype.size


def find_dtype(tensor_dtype):
    if tensor_dtype == tf.float32:
        return mace_pb2.DT_FLOAT
    elif tensor_dtype == tf.uint8 or tensor_dtype == tf.quint8:
        return mace_pb2.DT_UINT8
    elif tensor_dtype == tf.int32 or tensor_dtype == tf.qint32:
        return mace_pb2.DT_INT32
    else:
        raise Exception('Unsupported data type: ', tensor_dtype)


def has_padding_and_strides(op):
    return 'padding' in op.node_def.attr and 'strides' in op.node_def.attr


def is_node_flatten_reshape(op):
    return op.type == 'Reshape' and len(op.outputs[0].shape) == 1


def get_input_tensor(op, index):
    input_tensor = op.inputs[index]
    if input_tensor.op.type == 'Reshape':
        input_tensor = get_input_tensor(input_tensor.op, 0)
    return input_tensor


def add_shape_const_node(net_def, op, values, name):
    tensor = net_def.tensors.add()
    node_name = op.name + '/' + name
    tensor.name = node_name + ':0'
    tensor.data_type = mace_pb2.DT_INT32
    tensor.dims.extend(values)
    return tensor.name
def convert_op_outputs(mace_op_def, tf_op):
mace_op_def.out_max_byte_size.extend(
[max_elem_size(output) for output in tf_op.outputs])
mace_op_def.output_type.extend(
[tf_dtype_2_mace_dtype(output.dtype) for output in tf_op.outputs])
output_shapes = []
for output in tf_op.outputs:
output_shape = mace_pb2.OutputShape()
shape_list = output.shape.as_list()
if not shape_list:
shape_list = [1]
elif len(shape_list) == 2:
shape_list = [1, 1, shape_list[0], shape_list[1]]
output_shape.dims.extend(shape_list)
output_shapes.append(output_shape)
mace_op_def.output_shape.extend(output_shapes)


def convert_ops(unresolved_ops, resolved_ops, net_def, dsp_ops):
    first_op = unresolved_ops[0]
    print('Op: ', first_op.name, first_op.type, first_op.outputs[0].shape)

    if first_op.name in resolved_ops:
        pass

    elif first_op.type == 'Const':
        print('Add const node: ', first_op.name)
        tf_tensor = first_op.outputs[0].eval()
        tensor = net_def.tensors.add()
        tensor.name = first_op.outputs[0].name
        tensor.data_type = find_dtype(first_op.outputs[0].dtype)
        shape = list(tf_tensor.shape)
        if len(shape) > 0:
            tensor.dims.extend(shape)
        if first_op.outputs[0].dtype == tf.float32:
            tensor.float_data.extend(tf_tensor.astype(float).flat)
        elif first_op.outputs[0].dtype == tf.int32 or \
                first_op.outputs[0].dtype == tf.int8 or \
                first_op.outputs[0].dtype == tf.int16 or \
                first_op.outputs[0].dtype == tf.quint8 or \
                first_op.outputs[0].dtype == tf.quint16:
            tensor.int32_data.extend(tf_tensor.astype(int).flat)

    elif first_op.type == 'Shape':
        resolved_ops.add(first_op.name)

    else:
        op_def = net_def.op.add()
        op_def.name = first_op.name
        op_def.type = dsp_ops.map_nn_op(first_op.type)
        op_def.padding = padding_mode['NA']

        if len(first_op.outputs) > 0 and first_op.type == 'Dequantize' \
                and len(first_op.outputs[0].consumers()) > 0 \
                and (first_op.outputs[0].consumers()[0].type == 'SpaceToBatchND' or
                     first_op.outputs[0].consumers()[0].type == 'BatchToSpaceND'):  # noqa
            input_tensor = first_op.inputs[0]
            min_tensor = first_op.inputs[1]
            max_tensor = first_op.inputs[2]
            s2b_op = first_op.outputs[0].consumers()[0]
            reshape_op = s2b_op.outputs[0].consumers()[0]
            min_op = reshape_op.outputs[0].consumers()[0]
            max_op = reshape_op.outputs[0].consumers()[1]
            quantize_op = min_op.outputs[0].consumers()[0]
            resolved_ops.add(s2b_op.name)
            resolved_ops.add(reshape_op.name)
            resolved_ops.add(min_op.name)
            resolved_ops.add(max_op.name)
            resolved_ops.add(quantize_op.name)

            op_def.name = quantize_op.name
            op_def.type = dsp_ops.map_nn_op('Quantized' + s2b_op.type)
            op_def.input.append(input_tensor.name)
            op_def.input.extend([t.name for t in s2b_op.inputs[1:]])
            op_def.input.extend([min_tensor.name, max_tensor.name])
            convert_op_outputs(op_def, quantize_op)
        elif (len(first_op.outputs) > 0 and
              first_op.type == 'QuantizedReshape' and
              len(first_op.outputs[0].consumers()) > 0 and
              first_op.outputs[0].consumers()[0].type == 'Dequantize' and
              len(first_op.outputs[0].consumers()[0].outputs[0].consumers()) > 0 and  # noqa
              first_op.outputs[0].consumers()[0].outputs[0].consumers()[0].type == 'Softmax'):  # noqa
            input_tensor = first_op.inputs[0]
            min_tensor = first_op.inputs[2]
            max_tensor = first_op.inputs[3]
            dequantize_op = first_op.outputs[0].consumers()[0]
            softmax_op = dequantize_op.outputs[0].consumers()[0]
            reshape_op = softmax_op.outputs[0].consumers()[0]
            min_op = reshape_op.outputs[0].consumers()[0]
            max_op = reshape_op.outputs[0].consumers()[1]
            quantize_op = min_op.outputs[0].consumers()[0]
            quantize_reshape_op = quantize_op.outputs[0].consumers()[0]

            resolved_ops.add(dequantize_op.name)
            resolved_ops.add(softmax_op.name)
            resolved_ops.add(reshape_op.name)
            resolved_ops.add(min_op.name)
            resolved_ops.add(max_op.name)
            resolved_ops.add(quantize_op.name)
            resolved_ops.add(quantize_reshape_op.name)

            op_def.name = quantize_reshape_op.name
            op_def.type = dsp_ops.map_nn_op('QuantizedSoftmax')
            op_def.input.extend(
                [input_tensor.name, min_tensor.name, max_tensor.name])
            convert_op_outputs(op_def, quantize_reshape_op)
        # remove Squeeze
        elif (len(first_op.outputs) > 0 and
              first_op.type == 'Requantize' and
              len(first_op.outputs[0].consumers()) > 0 and
              first_op.outputs[0].consumers()[0].type == 'Dequantize' and
              len(first_op.outputs[0].consumers()[0].outputs[0].consumers()) > 0 and  # noqa
              first_op.outputs[0].consumers()[0].outputs[0].consumers()[0].type == 'Squeeze'):  # noqa
            dequantize_op = first_op.outputs[0].consumers()[0]
            squeeze_op = dequantize_op.outputs[0].consumers()[0]
            reshape_op = squeeze_op.outputs[0].consumers()[0]
            if reshape_op.type == 'Shape':
                reshape_op = squeeze_op.outputs[0].consumers()[1]
            min_op = reshape_op.outputs[0].consumers()[0]
            max_op = reshape_op.outputs[0].consumers()[1]
            quantize_op = min_op.outputs[0].consumers()[0]

            resolved_ops.add(dequantize_op.name)
            resolved_ops.add(squeeze_op.name)
            resolved_ops.add(reshape_op.name)
            resolved_ops.add(min_op.name)
            resolved_ops.add(max_op.name)
            resolved_ops.add(quantize_op.name)

            op_def.name = quantize_op.name
            op_def.input.extend([t.name for t in first_op.inputs])
            convert_op_outputs(op_def, quantize_op)

            # Squeeze -> Softmax
            next_op = quantize_op.outputs[0].consumers()[0] \
                if len(quantize_op.outputs) > 0 else None
            dequantize_op = next_op.outputs[0].consumers()[0] \
                if next_op and len(next_op.outputs) > 0 and \
                next_op.type == 'QuantizedReshape' and \
                len(next_op.outputs[0].consumers()) > 0 else None
            softmax_op = dequantize_op.outputs[0].consumers()[0] \
                if dequantize_op and len(dequantize_op.outputs) > 0 and \
                dequantize_op.type == 'Dequantize' and \
                len(dequantize_op.outputs[0].consumers()) > 0 else None
            if softmax_op and softmax_op.type == 'Softmax':
                reshape_op = softmax_op.outputs[0].consumers()[0]
                min_op = reshape_op.outputs[0].consumers()[0]
                max_op = reshape_op.outputs[0].consumers()[1]
                quantize_op = min_op.outputs[0].consumers()[0]
                quantize_reshape_op = quantize_op.outputs[0].consumers()[0]

                resolved_ops.add(next_op.name)
                resolved_ops.add(dequantize_op.name)
                resolved_ops.add(softmax_op.name)
                resolved_ops.add(reshape_op.name)
                resolved_ops.add(min_op.name)
                resolved_ops.add(max_op.name)
                resolved_ops.add(quantize_op.name)
                resolved_ops.add(quantize_reshape_op.name)

                softmax_op_def = net_def.op.add()
                softmax_op_def.padding = padding_mode['NA']
                softmax_op_def.name = quantize_reshape_op.name
                softmax_op_def.type = dsp_ops.map_nn_op('QuantizedSoftmax')
                softmax_op_def.input.extend([
                    get_tensor_name_from_op(op_def.name, 0),
                    get_tensor_name_from_op(op_def.name, 1),
                    get_tensor_name_from_op(op_def.name, 2)])
                convert_op_outputs(softmax_op_def, quantize_reshape_op)
        elif len(first_op.outputs) > 0 and first_op.type == 'Dequantize' and \
                len(first_op.outputs[0].consumers()) > 0 and \
                first_op.outputs[0].consumers()[0].type == 'Tanh':
            input_tensor = first_op.inputs[0]
            min_tensor = first_op.inputs[1]
            max_tensor = first_op.inputs[2]
            tanh_op = first_op.outputs[0].consumers()[0]

            # if not last op
            resolved_ops.add(tanh_op.name)
            if tanh_op.outputs[0].consumers():
                reshape_op = tanh_op.outputs[0].consumers()[0]
                min_op = reshape_op.outputs[0].consumers()[0]
                max_op = reshape_op.outputs[0].consumers()[1]
                quantize_op = min_op.outputs[0].consumers()[0]
                resolved_ops.add(reshape_op.name)
                resolved_ops.add(min_op.name)
                resolved_ops.add(max_op.name)
                resolved_ops.add(quantize_op.name)

                op_def.name = quantize_op.name
                op_def.type = dsp_ops.map_nn_op('Quantized' + tanh_op.type)
                op_def.input.extend(
                    [input_tensor.name, min_tensor.name, max_tensor.name])
                convert_op_outputs(op_def, quantize_op)
            # tanh is last op
            else:
                op_def.name = tanh_op.name + '/QuantizedTanh'
                op_def.type = dsp_ops.map_nn_op('Quantized' + tanh_op.type)
                op_def.input.extend(
                    [input_tensor.name, min_tensor.name, max_tensor.name])
                op_def.out_max_byte_size.extend([
                    max_elem_size(input_tensor),
                    max_elem_size(min_tensor),
                    max_elem_size(max_tensor)
                ])
                op_def.output_type.extend(
                    [mace_pb2.DT_UINT8, mace_pb2.DT_FLOAT, mace_pb2.DT_FLOAT])
                output_shapes = []
                for output in first_op.inputs:
                    output_shape = mace_pb2.OutputShape()
                    output_shape.dims.extend(output.shape.as_list())
                    output_shapes.append(output_shape)
                op_def.output_shape.extend(output_shapes)

                new_tanh_op_def = net_def.op.add()
                new_tanh_op_def.name = tanh_op.name
                new_tanh_op_def.type = dsp_ops.map_nn_op('Dequantize')
                new_tanh_op_def.input.extend([
                    get_tensor_name_from_op(op_def.name, 0),
                    get_tensor_name_from_op(op_def.name, 1),
                    get_tensor_name_from_op(op_def.name, 2)
                ])
                convert_op_outputs(new_tanh_op_def, tanh_op)
        elif has_padding_and_strides(first_op):
            op_def.padding = padding_mode[first_op.get_attr('padding')]
            op_def.input.extend([t.name for t in first_op.inputs])
            if 'ksize' in first_op.node_def.attr:
                ksize = first_op.get_attr('ksize')
                ksize_tensor = add_shape_const_node(net_def, first_op, ksize,
                                                    'ksize')
                op_def.input.extend([ksize_tensor])
            strides = first_op.get_attr('strides')
            strides_tensor = add_shape_const_node(net_def, first_op, strides,
                                                  'strides')
            op_def.input.extend([strides_tensor])
            convert_op_outputs(op_def, first_op)
        elif is_node_flatten_reshape(first_op):
            op_def.type = 'Flatten'
            op_def.input.extend([first_op.inputs[0].name])
            convert_op_outputs(op_def, first_op)
        elif dsp_ops.has_op(first_op.type):
            op_def.input.extend([t.name for t in first_op.inputs])
            convert_op_outputs(op_def, first_op)
        else:
            raise Exception('Unsupported op: ', first_op)

        resolved_ops.add(first_op.name)

    del unresolved_ops[0]


def add_output_node(net_def, output_node):
    op_def = net_def.op.add()
    op_def.name = '__output__'
    op_def.type = 'OUTPUT'
    op_def.input.extend([get_tensor_name_from_op(output_node, 0)])


def reverse_batch_to_space_and_biasadd(net_def):
    tensor_map = {}
    for tensor in net_def.tensors:
        tensor_map[tensor.name] = tensor
    op_map = {}
    for op in net_def.op:
        op_map[op.name] = op
    consumers = {}
    for op in net_def.op:
        for ipt in op.input:
            if ipt not in consumers:
                consumers[ipt] = []
            consumers[ipt].append(op)

    new_ops = []
    skip_ops = set()
    visited_ops = set()

    for op in net_def.op:
        if op.name in visited_ops:
            pass
        # pattern: QConv -> RR -> R -> QB2S -> QBiasAdd -> RR -> R
        success = False
        if op.type == 'Requantize_32to8':
            biasadd_requantize_op = op
            biasadd_op = get_node_from_map(op_map,
                                           biasadd_requantize_op.input[0])
            if biasadd_op.type == 'QuantizedBiasAdd_8p8to32':
                b2s_op = get_node_from_map(op_map, biasadd_op.input[0])
                if b2s_op.type == 'QuantizedBatchToSpaceND_8':
                    conv_requantize_op = get_node_from_map(
                        op_map, b2s_op.input[0])
                    conv_op = get_node_from_map(op_map,
                                                conv_requantize_op.input[0])
                    if conv_op.type == 'QuantizedConv2d_8x8to32':
                        new_biasadd_op = mace_pb2.OperatorDef()
                        new_biasadd_op.CopyFrom(biasadd_op)
                        new_biasadd_op.input[0] = get_tensor_name_from_op(
                            conv_requantize_op.name, 0)
                        new_biasadd_op.input[2] = get_tensor_name_from_op(
                            conv_requantize_op.name, 1)
                        new_biasadd_op.input[3] = get_tensor_name_from_op(
                            conv_requantize_op.name, 2)
                        new_biasadd_op.out_max_byte_size[
                            0] = conv_requantize_op.out_max_byte_size[0] * 4

                        new_biasadd_requantize_op = mace_pb2.OperatorDef()
                        new_biasadd_requantize_op.CopyFrom(
                            biasadd_requantize_op)
                        # integer division: the protobuf field holds an
                        # integer, and `/` yields a float on Python 3
                        new_biasadd_requantize_op.out_max_byte_size[
                            0] = new_biasadd_op.out_max_byte_size[0] // 4

                        new_b2s_op = mace_pb2.OperatorDef()
                        new_b2s_op.CopyFrom(b2s_op)
                        new_b2s_op.input[0] = get_tensor_name_from_op(
                            biasadd_requantize_op.name, 0)
                        new_b2s_op.input[3] = get_tensor_name_from_op(
                            biasadd_requantize_op.name, 1)
                        new_b2s_op.input[4] = get_tensor_name_from_op(
                            biasadd_requantize_op.name, 2)

                        new_ops.extend([
                            new_biasadd_op, new_biasadd_requantize_op,
                            new_b2s_op
                        ])
                        skip_ops = skip_ops.union([
                            biasadd_op.name, biasadd_requantize_op.name,
                            b2s_op.name
                        ])
                        visited_ops.add(op.name)

                        follow_ops = consumers[get_tensor_name_from_op(
                            biasadd_requantize_op.name, 0)]
                        for follow_op in follow_ops:
                            new_follow_op = mace_pb2.OperatorDef()
                            new_follow_op.CopyFrom(follow_op)
                            for i in six.moves.range(len(follow_op.input)):
                                for k in six.moves.range(3):
                                    if new_follow_op.input[i] == get_tensor_name_from_op(  # noqa
                                            biasadd_requantize_op.name, k):
                                        new_follow_op.input[i] = get_tensor_name_from_op(  # noqa
                                            b2s_op.name, k)
                            new_ops.append(new_follow_op)
                            skip_ops.add(follow_op.name)
                            visited_ops.add(follow_op.name)

        visited_ops.add(op.name)

    new_net_def = mace_pb2.NetDef()
    new_net_def.tensors.extend(tensor_map.values())
    new_net_def.op.extend([op for op in net_def.op if op.name not in skip_ops])
    new_net_def.op.extend(new_ops)

    return new_net_def


def add_node_id(net_def):
    node_id_counter = 0
    node_id_map = {}
    for tensor in net_def.tensors:
        tensor.node_id = node_id_counter
        node_id_counter += 1
        tensor_op, port = get_op_and_port_from_tensor(tensor.name)
        node_id_map[tensor_op] = tensor.node_id

    for op in net_def.op:
        op.node_id = node_id_counter
        node_id_counter += 1
        node_id_map[op.name] = op.node_id
        for ipt in op.input:
            op_name, port = get_op_and_port_from_tensor(ipt)
            node_id = node_id_map[op_name]
            node_input = op.node_input.add()
            node_input.node_id = node_id
            node_input.output_port = int(port)

    return net_def


def add_input_output_info(net_def, input_node, output_node, graph, dtype):
    input_tensor = graph.get_tensor_by_name(
        get_tensor_name_from_op(input_node, 0))
    output_tensor = graph.get_tensor_by_name(
        get_tensor_name_from_op(output_node, 0))

    input_info = net_def.input_info.add()
    input_info.name = input_node
    input_info.dims.extend(input_tensor.shape.as_list())
    input_info.data_type = dtype
    if dtype == mace_pb2.DT_UINT8:
        for i in six.moves.range(2):
            input_info = net_def.input_info.add()
            input_info.dims.extend([1, 1, 1, 1])
            input_info.data_type = mace_pb2.DT_FLOAT

    output_info = net_def.output_info.add()
    output_info.name = output_node
    output_info.dims.extend(output_tensor.shape.as_list())
    output_info.data_type = dtype
    if dtype == mace_pb2.DT_UINT8:
        for i in six.moves.range(2):
            output_info = net_def.output_info.add()
            output_info.dims.extend([1, 1, 1, 1])
            output_info.data_type = mace_pb2.DT_FLOAT

    return net_def


def fuse_quantize(net_def):
    tensor_map = {}
    for tensor in net_def.tensors:
        tensor_map[tensor.name] = tensor
    op_map = {}
    for op in net_def.op:
        op_map[op.name] = op
    consumers = {}
    for op in net_def.op:
        for ipt in op.input:
            if ipt not in consumers:
                consumers[ipt] = []
            consumers[ipt].append(op)

    skip_ops = set()
    new_ops = []
    skip_tensors = set()

    # INPUT->Flatten->Minf, Maxf->Quantize
    for op in net_def.op:
        if op.type == 'INPUT':
            input_op = op
            flatten_op = None
            quantize_op = None
            for o in consumers[get_tensor_name_from_op(input_op.name, 0)]:
                if o.type == 'Flatten':
                    flatten_op = o
                elif o.type == 'Quantize':
                    quantize_op = o
            if quantize_op is not None:
                minf_op, maxf_op = consumers[get_tensor_name_from_op(
                    flatten_op.name, 0)]
                skip_ops = skip_ops.union(
                    [flatten_op.name, minf_op.name, maxf_op.name])
                skip_tensors = skip_tensors.union(
                    [minf_op.input[0], maxf_op.input[0],
                     quantize_op.input[1], quantize_op.input[2]])
                quantize_op.type = 'AutoQuantize'
                del quantize_op.input[1:]

    new_net_def = mace_pb2.NetDef()
    new_net_def.tensors.extend([
        tensor for tensor in net_def.tensors if tensor.name not in skip_tensors
    ])
    new_net_def.op.extend([op for op in net_def.op if op.name not in skip_ops])
    new_net_def.op.extend(new_ops)
    return new_net_def


class TensorflowDspConverter(base_converter.ConverterInterface):
    def __init__(self, option, src_model_file):
        self._option = option
        self._mace_net_def = mace_pb2.NetDef()

        # import tensorflow graph
        tf_graph_def = tf.GraphDef()
        with tf.gfile.Open(src_model_file, 'rb') as f:
            tf_graph_def.ParseFromString(f.read())

        self._placeholders = {}
        self.add_shape_info(tf_graph_def)

        with tf.Session() as session:
            with session.graph.as_default() as graph:
                tf.import_graph_def(tf_graph_def, name='')
                self._tf_graph = graph

    def run(self):
        ops = self._tf_graph.get_operations()
        dsp_ops = DspOps()
        resolved_ops = set()

        mace_check(len(self._option.input_nodes) == 1
                   and len(self._option.output_nodes) == 1,
                   'dsp only support single input and output')
        # dict.values() is a view on Python 3, so materialize it before
        # indexing
        input_node = list(self._option.input_nodes.values())[0].name
        output_node = list(self._option.output_nodes.values())[0].name

        # convert const node
        unresolved_ops = [op for op in ops if op.type == 'Const']
        with tf.Session() as session:
            while len(unresolved_ops) > 0:
                convert_ops(unresolved_ops, resolved_ops, self._mace_net_def,
                            dsp_ops)

            # convert op node
            unresolved_ops = [op for op in ops if op.type != 'Const']
            while len(unresolved_ops) > 0:
                convert_ops(unresolved_ops, resolved_ops, self._mace_net_def,
                            dsp_ops)

            add_output_node(self._mace_net_def, output_node)
            net_def = reverse_batch_to_space_and_biasadd(self._mace_net_def)
            net_def = fuse_quantize(net_def)

            sorted_net_def = graph_util.sort_mace_graph(net_def, '__output__')
            net_def_with_node_id = add_node_id(sorted_net_def)

            dtype = mace_pb2.DT_FLOAT
            final_net_def = add_input_output_info(
                net_def_with_node_id, input_node, output_node,
                self._tf_graph, dtype)

        return final_net_def

    def add_shape_info(self, tf_graph_def):
        for node in tf_graph_def.node:
            for input_node in self._option.input_nodes.values():
                if node.name == input_node.name or \
                        node.name + ':0' == input_node.name:
                    del node.attr['shape'].shape.dim[:]
                    node.attr['shape'].shape.dim.extend([
                        tensor_shape_pb2.TensorShapeProto.Dim(size=i) for i in
                        input_node.shape
                    ])
                    self._placeholders[node.name + ':0'] = \
                        np.zeros(shape=input_node.shape, dtype=float)
@@ -122,8 +122,7 @@ class Transformer(base_converter.ConverterInterface):
                 changed = transformer()
                 if not changed:
                     break
-        self.add_check_nodes()
+        self.delete_after_check_nodes()
         return self._model, self._quantize_activation_info
     def filter_format(self):
@@ -278,7 +277,8 @@ class Transformer(base_converter.ConverterInterface):
             input_info.dims.extend(input_node.shape)
             input_info.data_type = mace_pb2.DT_FLOAT
-        for output_node in self._option.output_nodes.values():
+        output_nodes = self._option.check_nodes.values()
+        for output_node in output_nodes:
             output_info = net.output_info.add()
             output_info.name = output_node.name
             output_info.data_format = output_node.data_format.value
@@ -1367,7 +1367,8 @@ class Transformer(base_converter.ConverterInterface):
                 + '_' + input_node.name
             input_name_map[input_node.name] = new_input_name
-        for output_node in self._option.output_nodes.values():
+        output_nodes = self._option.check_nodes.values()
+        for output_node in output_nodes:
             new_output_name = MaceKeyword.mace_output_node_name \
                 + '_' + output_node.name
             output_name_map[output_node.name] = new_output_name
@@ -1378,7 +1379,12 @@ class Transformer(base_converter.ConverterInterface):
                     op.input[i] = input_name_map[op.input[i]]
             for i in range(len(op.output)):
                 if op.output[i] in output_name_map:
-                    op.output[i] = output_name_map[op.output[i]]
+                    op.name = MaceKeyword.mace_output_node_name \
+                        + '_' + op.name
+                    new_output_name = output_name_map[op.output[i]]
+                    self._quantize_activation_info[new_output_name] = \
+                        self._quantize_activation_info[op.output[i]]
+                    op.output[i] = new_output_name
             data_type_arg = ConverterUtil.get_arg(
                 op, MaceKeyword.mace_op_data_type_str)
@@ -1399,7 +1405,8 @@ class Transformer(base_converter.ConverterInterface):
         for input_node in self._option.input_nodes.values():
             op_def = self._model.op.add()
-            op_def.name = self.normalize_op_name(input_node.name)
+            op_def.name = \
+                self.normalize_op_name(input_name_map[input_node.name])
             op_def.type = MaceOp.Quantize.name
             op_def.input.extend([input_node.name])
             op_def.output.extend([input_name_map[input_node.name]])
@@ -1409,10 +1416,9 @@ class Transformer(base_converter.ConverterInterface):
             ConverterUtil.add_data_type_arg(op_def, mace_pb2.DT_UINT8)
             ConverterUtil.add_data_format_arg(op_def, DataFormat.NHWC)
-        for output_node in self._option.output_nodes.values():
+        for output_node in output_nodes:
             op_def = self._model.op.add()
-            op_def.name = self.normalize_op_name(
-                output_name_map[output_node.name])
+            op_def.name = self.normalize_op_name(output_node.name)
             op_def.type = MaceOp.Dequantize.name
             op_def.input.extend([output_name_map[output_node.name]])
             op_def.output.extend([output_node.name])
@@ -1721,34 +1727,17 @@ class Transformer(base_converter.ConverterInterface):
             arg.i = mace_pb2.GPU_IMAGE if self._option.cl_mem_type == "image"\
                 else mace_pb2.GPU_BUFFER
-    def add_check_nodes(self):
-        if self._option.check_nodes:
+    def delete_after_check_nodes(self):
+        if self._option.check_nodes != self._option.output_nodes:
             mace_check(len(self._option.check_nodes) == 1,
                        "Only support one check node now.")
             check_node = None
             for i in six.moves.range(len(self._model.op)):
-                if self._model.op[i].name in self._option.check_nodes:
+                if self._model.op[i].output[0] in self._option.check_nodes:
                     check_node = self._model.op[i]
                     del self._model.op[i+1:]
                     break
             mace_check(check_node is not None, "check node not found.")
-            output_name = \
-                MaceKeyword.mace_output_node_name + '_' + check_node.name
-            op_def = self._model.op.add()
-            op_def.name = self.normalize_op_name(output_name)
-            op_def.type = MaceOp.Dequantize.name
-            op_def.input.extend([check_node.output[0]])
-            op_def.output.extend([output_name])
-            output_shape = op_def.output_shape.add()
-            output_shape.dims.extend(check_node.output_shape[0].dims)
-            ConverterUtil.add_data_type_arg(op_def, mace_pb2.DT_UINT8)
-            op_def.output_type.extend([mace_pb2.DT_FLOAT])
-            del self._model.output_info[:]
-            output_info = self._model.output_info.add()
-            output_info.name = check_node.name
-            output_info.dims.extend(check_node.output_shape[0].dims)
-            output_info.data_type = mace_pb2.DT_FLOAT
     def transform_caffe_reshape_and_flatten(self):
         net = self._model
......
@@ -36,197 +36,6 @@
#ifndef THIRD_PARTY_NNLIB_HEXAGON_NN_H_
#define THIRD_PARTY_NNLIB_HEXAGON_NN_H_
#ifdef MACE_USE_NNLIB_OLD
#ifndef __QAIC_HEADER
#define __QAIC_HEADER(ff) ff
#endif // __QAIC_HEADER
#ifndef __QAIC_HEADER_EXPORT
#define __QAIC_HEADER_EXPORT
#endif // __QAIC_HEADER_EXPORT
#ifndef __QAIC_HEADER_ATTRIBUTE
#define __QAIC_HEADER_ATTRIBUTE
#endif // __QAIC_HEADER_ATTRIBUTE
#ifndef __QAIC_IMPL
#define __QAIC_IMPL(ff) ff
#endif // __QAIC_IMPL
#ifndef __QAIC_IMPL_EXPORT
#define __QAIC_IMPL_EXPORT
#endif // __QAIC_IMPL_EXPORT
#ifndef __QAIC_IMPL_ATTRIBUTE
#define __QAIC_IMPL_ATTRIBUTE
#endif // __QAIC_IMPL_ATTRIBUTE
#ifdef __cplusplus
extern "C" {
#endif
#if !defined(__QAIC_STRING1_OBJECT_DEFINED__) && !defined(__STRING1_OBJECT__)
#define __QAIC_STRING1_OBJECT_DEFINED__
#define __STRING1_OBJECT__
typedef struct _cstring1_s {
char *data;
int dataLen;
} _cstring1_t;
#endif /* __QAIC_STRING1_OBJECT_DEFINED__ */
typedef struct hexagon_nn_input hexagon_nn_input;
struct hexagon_nn_input {
unsigned int src_id;
unsigned int output_idx;
};
typedef struct hexagon_nn_output hexagon_nn_output;
struct hexagon_nn_output {
unsigned int max_size;
unsigned int unused;
};
typedef struct hexagon_nn_perfinfo hexagon_nn_perfinfo;
struct hexagon_nn_perfinfo {
unsigned int node_id;
unsigned int node_type;
unsigned int executions;
unsigned int unused;
unsigned int counter_lo;
unsigned int counter_hi;
};
typedef int hexagon_nn_nn_id;
enum hexagon_nn_padding_type {
NN_PAD_NA,
NN_PAD_SAME,
NN_PAD_VALID,
NN_PAD_MIRROR_REFLECT,
NN_PAD_MIRROR_SYMMETRIC,
NN_PAD_SAME_CAFFE,
_32BIT_PLACEHOLDER_hexagon_nn_padding_type = 0x7fffffff
};
typedef enum hexagon_nn_padding_type hexagon_nn_padding_type;
typedef struct hexagon_nn_tensordef hexagon_nn_tensordef;
struct hexagon_nn_tensordef {
unsigned int batches;
unsigned int height;
unsigned int width;
unsigned int depth;
unsigned char *data;
int dataLen;
unsigned int data_valid_len;
unsigned int unused;
};
typedef struct hexagon_nn_op_node hexagon_nn_op_node;
struct hexagon_nn_op_node {
unsigned int node_id;
unsigned int operation;
hexagon_nn_padding_type padding;
hexagon_nn_input *inputs;
int inputsLen;
hexagon_nn_output *outputs;
int outputsLen;
};
typedef struct hexagon_nn_const_node hexagon_nn_const_node;
struct hexagon_nn_const_node {
unsigned int node_id;
hexagon_nn_tensordef tensor;
};
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_config)(void)
__QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_init)(void)
__QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_set_debug_level)(
hexagon_nn_nn_id id, int level) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_set_graph_mode)(
hexagon_nn_nn_id id, int mode) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_snpprint)(hexagon_nn_nn_id id,
unsigned char *buf,
int bufLen)
__QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_getlog)(hexagon_nn_nn_id id,
unsigned char *buf,
int bufLen)
__QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_append_node)(
hexagon_nn_nn_id id,
unsigned int node_id,
unsigned int operation,
hexagon_nn_padding_type padding,
const hexagon_nn_input *inputs,
int inputsLen,
const hexagon_nn_output *outputs,
int outputsLen) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_append_node_list)(
hexagon_nn_nn_id id,
const hexagon_nn_op_node *ops,
int opsLen) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_append_const_node)(
hexagon_nn_nn_id id,
unsigned int node_id,
unsigned int batches,
unsigned int height,
unsigned int width,
unsigned int depth,
const unsigned char *data,
int dataLen) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_append_const_node_list)(
hexagon_nn_nn_id id,
const hexagon_nn_const_node *consts,
int constsLen) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_prepare)(hexagon_nn_nn_id id)
__QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_execute)(
hexagon_nn_nn_id id,
unsigned int batches_in,
unsigned int height_in,
unsigned int width_in,
unsigned int depth_in,
const unsigned char *data_in,
int data_inLen,
unsigned int *batches_out,
unsigned int *height_out,
unsigned int *width_out,
unsigned int *depth_out,
unsigned char *data_out,
int data_outLen,
unsigned int *data_len_out) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_teardown)(hexagon_nn_nn_id id)
__QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_set_powersave_level)(
unsigned int level) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_get_perfinfo)(
hexagon_nn_nn_id id,
hexagon_nn_perfinfo *info_out,
int info_outLen,
unsigned int *n_items) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_reset_perfinfo)(
hexagon_nn_nn_id id, unsigned int event) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_last_execution_cycles)(
hexagon_nn_nn_id id,
unsigned int *cycles_lo,
unsigned int *cycles_hi) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_version)(int *ver)
__QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_op_name_to_id)(
const char *name, unsigned int *node_id) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_op_id_to_name)(
unsigned int node_id, char *name, int nameLen) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_disable_dcvs)(void)
__QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_GetHexagonBinaryVersion)(
int *ver) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_PrintLog)(
const unsigned char *buf, int bufLen) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_execute_new)(
hexagon_nn_nn_id id,
const hexagon_nn_tensordef *inputs,
int inputsLen,
hexagon_nn_tensordef *outputs,
int outputsLen) __QAIC_HEADER_ATTRIBUTE;
#ifdef __cplusplus
}
#endif
#elif defined(MACE_USE_NNLIB_2_1) // nnlib version
#ifndef __QAIC_HEADER
#define __QAIC_HEADER(ff) ff
#endif //__QAIC_HEADER
@@ -370,200 +179,4 @@ __QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_execute_new)(hexagon_nn_nn_id
}
#endif
#else // nnlib version : MACE_USE_NNLIB_CAF
#ifndef __QAIC_HEADER
#define __QAIC_HEADER(ff) ff
#endif //__QAIC_HEADER
#ifndef __QAIC_HEADER_EXPORT
#define __QAIC_HEADER_EXPORT
#endif // __QAIC_HEADER_EXPORT
#ifndef __QAIC_HEADER_ATTRIBUTE
#define __QAIC_HEADER_ATTRIBUTE
#endif // __QAIC_HEADER_ATTRIBUTE
#ifndef __QAIC_IMPL
#define __QAIC_IMPL(ff) ff
#endif //__QAIC_IMPL
#ifndef __QAIC_IMPL_EXPORT
#define __QAIC_IMPL_EXPORT
#endif // __QAIC_IMPL_EXPORT
#ifndef __QAIC_IMPL_ATTRIBUTE
#define __QAIC_IMPL_ATTRIBUTE
#endif // __QAIC_IMPL_ATTRIBUTE
#ifdef __cplusplus
extern "C" {
#endif
#if !defined(__QAIC_STRING1_OBJECT_DEFINED__) && !defined(__STRING1_OBJECT__)
#define __QAIC_STRING1_OBJECT_DEFINED__
#define __STRING1_OBJECT__
typedef struct _cstring1_s {
char *data;
int dataLen;
} _cstring1_t;
#endif /* __QAIC_STRING1_OBJECT_DEFINED__ */
typedef struct hexagon_nn_input hexagon_nn_input;
struct hexagon_nn_input {
unsigned int src_id;
unsigned int output_idx;
};
typedef struct hexagon_nn_output hexagon_nn_output;
struct hexagon_nn_output {
unsigned int rank;
unsigned int max_sizes[8];
unsigned int elementsize;
int zero_offset;
float stepsize;
};
typedef struct hexagon_nn_perfinfo hexagon_nn_perfinfo;
struct hexagon_nn_perfinfo {
unsigned int node_id;
unsigned int node_type;
unsigned int executions;
unsigned int unused;
unsigned int counter_lo;
unsigned int counter_hi;
};
typedef int hexagon_nn_nn_id;
enum hexagon_nn_padding_type {
NN_PAD_NA,
NN_PAD_SAME,
NN_PAD_VALID,
NN_PAD_MIRROR_REFLECT,
NN_PAD_MIRROR_SYMMETRIC,
NN_PAD_SAME_CAFFE,
_32BIT_PLACEHOLDER_hexagon_nn_padding_type = 0x7fffffff
};
typedef enum hexagon_nn_padding_type hexagon_nn_padding_type;
typedef struct hexagon_nn_tensordef hexagon_nn_tensordef;
struct hexagon_nn_tensordef {
unsigned int batches;
unsigned int height;
unsigned int width;
unsigned int depth;
unsigned char *data;
int dataLen;
unsigned int data_valid_len;
unsigned int unused;
};
typedef struct hexagon_nn_op_node hexagon_nn_op_node;
struct hexagon_nn_op_node {
unsigned int node_id;
unsigned int operation;
hexagon_nn_padding_type padding;
hexagon_nn_input *inputs;
int inputsLen;
hexagon_nn_output *outputs;
int outputsLen;
};
typedef struct hexagon_nn_const_node hexagon_nn_const_node;
struct hexagon_nn_const_node {
unsigned int node_id;
hexagon_nn_tensordef tensor;
};
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_config)(void)
__QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_init)(hexagon_nn_nn_id *g)
__QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_set_debug_level)(
hexagon_nn_nn_id id, int level) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_set_graph_mode)(
hexagon_nn_nn_id id, int mode) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_snpprint)(hexagon_nn_nn_id id,
unsigned char *buf,
int bufLen)
__QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_getlog)(hexagon_nn_nn_id id,
unsigned char *buf,
int bufLen)
__QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_append_node)(
hexagon_nn_nn_id id,
unsigned int node_id,
unsigned int operation,
hexagon_nn_padding_type padding,
const hexagon_nn_input *inputs,
int inputsLen,
const hexagon_nn_output *outputs,
int outputsLen) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_append_node_list)(
hexagon_nn_nn_id id,
const hexagon_nn_op_node *ops,
int opsLen) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_append_const_node)(
hexagon_nn_nn_id id,
unsigned int node_id,
unsigned int batches,
unsigned int height,
unsigned int width,
unsigned int depth,
const unsigned char *data,
int dataLen) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_append_const_node_list)(
hexagon_nn_nn_id id,
const hexagon_nn_const_node *consts,
int constsLen) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_prepare)(hexagon_nn_nn_id id)
__QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_execute)(
hexagon_nn_nn_id id,
unsigned int batches_in,
unsigned int height_in,
unsigned int width_in,
unsigned int depth_in,
const unsigned char *data_in,
int data_inLen,
unsigned int *batches_out,
unsigned int *height_out,
unsigned int *width_out,
unsigned int *depth_out,
unsigned char *data_out,
int data_outLen,
unsigned int *data_len_out) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_teardown)(hexagon_nn_nn_id id)
__QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_set_powersave_level)(
unsigned int level) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_get_perfinfo)(
hexagon_nn_nn_id id,
hexagon_nn_perfinfo *info_out,
int info_outLen,
unsigned int *n_items) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_reset_perfinfo)(
hexagon_nn_nn_id id, unsigned int event) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_last_execution_cycles)(
hexagon_nn_nn_id id,
unsigned int *cycles_lo,
unsigned int *cycles_hi) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_version)(int *ver)
__QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_op_name_to_id)(
const char *name, unsigned int *node_id) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_op_id_to_name)(
unsigned int node_id, char *name, int nameLen) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_disable_dcvs)(void)
__QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_GetHexagonBinaryVersion)(
int *ver) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_PrintLog)(
const unsigned char *buf, int bufLen) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(hexagon_nn_execute_new)(
hexagon_nn_nn_id id,
const hexagon_nn_tensordef *inputs,
int inputsLen,
hexagon_nn_tensordef *outputs,
int outputsLen) __QAIC_HEADER_ATTRIBUTE;
__QAIC_HEADER_EXPORT unsigned int __QAIC_HEADER(hexagon_nn_get_dsp_offset)(void)
__QAIC_HEADER_ATTRIBUTE;
#ifdef __cplusplus
}
#endif
#endif // nnlib version
#endif // THIRD_PARTY_NNLIB_HEXAGON_NN_H_
@@ -79,147 +79,6 @@
*/
// NOLINT(build/header_guard)
#ifdef MACE_USE_NNLIB_OLD
DEF_OP(INPUT)
DEF_OP(OUTPUT)
DEF_OP(Nop)
DEF_OP(Const)
DEF_OP(Check)
DEF_OP(Close_f)
DEF_OP(Close_quint8)
DEF_OP(Close_q_quint8)
DEF_OP(Close_int32)
DEF_OP(Close_qint32)
DEF_OP(PPrint_8)
DEF_OP(PPrint_32)
DEF_OP(PPrint_f)
DEF_OP(PreFree)
DEF_OP(Flatten)
#ifndef DEF_OP_WREF
#define DEF_OP_WREF(NAME) DEF_OP(NAME) DEF_OP(NAME##_ref)
#define __SELF_DEF_OP_WREF
#endif
DEF_OP_WREF(QuantizedConv2d_8x8to32)
DEF_OP_WREF(QuantizedMatMul_8x8to32)
DEF_OP_WREF(QuantizeDownAndShrinkRange_32to8)
DEF_OP_WREF(QuantizedRelu_8)
DEF_OP_WREF(QuantizedReluX_8)
DEF_OP_WREF(QuantizedMaxPool_8)
DEF_OP_WREF(QuantizedAvgPool_8)
DEF_OP_WREF(QuantizedConcat_8)
DEF_OP_WREF(QuantizedBiasAdd_8p8to32)
DEF_OP_WREF(Min_f)
DEF_OP_WREF(Max_f)
DEF_OP_WREF(Quantize)
DEF_OP_WREF(Dequantize)
DEF_OP_WREF(Supernode_8x8p8to8)
DEF_OP(QuantizedFlatten)
DEF_OP(Softmax_f)
DEF_OP(Conv2d_f)
DEF_OP(MatMul_f)
DEF_OP(Relu_f)
DEF_OP(ReluX_f)
DEF_OP(AvgPool_f)
DEF_OP(MaxPool_f)
DEF_OP(Concat_f)
DEF_OP(BiasAdd_f)
DEF_OP(LRN_f)
DEF_OP(Variable)
DEF_OP(Assign)
DEF_OP(Reshape)
DEF_OP(QuantizedReshape)
DEF_OP(Tanh_f)
DEF_OP(Sigmoid_f)
DEF_OP(Slice_8)
DEF_OP(Slice_f)
DEF_OP(QuantizedSlice_8)
DEF_OP(Add_f)
DEF_OP(Mul_f)
DEF_OP(Minimum_f)
DEF_OP(Maximum_f)
DEF_OP_WREF(Requantize_32to8)
DEF_OP_WREF(RequantizationRange_32)
DEF_OP(Neg_f)
DEF_OP(Sub_f)
DEF_OP(AddN_f)
DEF_OP(Range_int32)
DEF_OP(Rank_int32)
DEF_OP(Transpose_int32)
DEF_OP(Transpose_f)
DEF_OP(InstanceNorm_f)
DEF_OP_WREF(QuantizedInstanceNorm_8)
DEF_OP(Sub_int32)
DEF_OP(Add_int32)
DEF_OP(Split_f)
DEF_OP(Dequantize_qint32_f)
DEF_OP(PRelu_f)
DEF_OP_WREF(QuantizedPRelu_8)
DEF_OP(Sum_f)
DEF_OP(Prod_f)
DEF_OP(Mul_int32)
DEF_OP(LogicalAnd_int32)
DEF_OP(LogicalOr_int32)
DEF_OP(LogicalXor_int32)
DEF_OP(Shape_int32)
DEF_OP(Pack_int32)
DEF_OP(MirrorPad_f)
DEF_OP(ResizeNearestNeighbor_f)
DEF_OP(StridedSlice_int32)
DEF_OP(StridedSlice_f)
DEF_OP(ExpandDims_int32)
DEF_OP(ExpandDims_f)
DEF_OP(LogSoftmax_f)
DEF_OP(Split_int32)
DEF_OP(QuantizedSplit_8)
DEF_OP(Deconv_f)
DEF_OP_WREF(QuantizedDeconv_8x8to32)
DEF_OP_WREF(QuantizedMul_8x8to32)
DEF_OP_WREF(QuantizedAdd_8p8to32)
DEF_OP_WREF(QuantizedSigmoid_8)
DEF_OP_WREF(QuantizedTanh_8)
DEF_OP_WREF(QuantizedSoftmax_8)
DEF_OP_WREF(QuantizedLRN_8)
DEF_OP_WREF(QuantizedSub_8p8to32)
DEF_OP_WREF(QuantizedMaximum_8)
DEF_OP_WREF(QuantizedMinimum_8)
DEF_OP(Pad_f)
DEF_OP(SpaceToBatchND_f)
DEF_OP(BatchToSpaceND_f)
DEF_OP(QuantizedSpaceToBatchND_8)
DEF_OP(QuantizedBatchToSpaceND_8)
DEF_OP(QuantizedPad_8)
DEF_OP(ResizeBilinear_f)
DEF_OP(QuantizedResizeBilinear_8)
DEF_OP(ConcatV2_f)
DEF_OP(ConcatV2_int32)
DEF_OP(Prod_int32)
DEF_OP(Slice_int32)
DEF_OP(QuantizedAdd_8p8to8)
DEF_OP_WREF(AutoQuantize)
DEF_OP_WREF(QuantizedDepthwiseConv2d_8x8to32)
DEF_OP(DepthwiseConv2d_f)
DEF_OP(QuantizedBiasAdd_8p8to8)
#ifdef __SELF_DEF_OP_WREF
#undef __SELF_DEF_OP_WREF
#undef DEF_OP_WREF
#endif
#elif defined(MACE_USE_NNLIB_2_1) // nnlib version
DEF_OP(INPUT)
DEF_OP(OUTPUT)
DEF_OP(Nop)
@@ -441,214 +300,3 @@ DEF_OP(QuantizedChannelShuffle_8)
#undef __SELF_DEF_OP_WREF
#undef DEF_OP_WREF
#endif
#else // nnlib version : MACE_USE_NNLIB_CAF
DEF_OP(INPUT)
DEF_OP(OUTPUT)
DEF_OP(Nop)
DEF_OP(Const)
DEF_OP(Check)
DEF_OP(Close_f)
DEF_OP(Close_quint8)
DEF_OP(Close_q_quint8)
DEF_OP(Close_int32)
DEF_OP(Close_qint32)
DEF_OP(PPrint_8)
DEF_OP(PPrint_32)
DEF_OP(PPrint_f)
DEF_OP(PreFree)
DEF_OP(Flatten)
#ifndef DEF_OP_WREF
#define DEF_OP_WREF(NAME) DEF_OP(NAME) DEF_OP(NAME##_ref)
#define __SELF_DEF_OP_WREF
#endif
DEF_OP_WREF(QuantizedConv2d_8x8to32)
DEF_OP_WREF(QuantizedMatMul_8x8to32)
DEF_OP_WREF(QuantizeDownAndShrinkRange_32to8)
DEF_OP_WREF(QuantizedRelu_8)
DEF_OP_WREF(QuantizedReluX_8)
DEF_OP_WREF(QuantizedMaxPool_8)
DEF_OP_WREF(QuantizedAvgPool_8)
DEF_OP_WREF(QuantizedL2Pool_8)
DEF_OP_WREF(QuantizedConcat_8)
DEF_OP_WREF(QuantizedBiasAdd_8p8to32)
DEF_OP_WREF(Min_f)
DEF_OP_WREF(Max_f)
DEF_OP_WREF(Quantize)
DEF_OP_WREF(Dequantize)
DEF_OP_WREF(Supernode_8x8p8to8)
DEF_OP(QuantizedFlatten)
DEF_OP(Softmax_f)
DEF_OP(Conv2d_f)
DEF_OP(MatMul_f)
DEF_OP(Relu_f)
DEF_OP(ReluX_f)
DEF_OP(AvgPool_f)
DEF_OP(L2Pool_f)
DEF_OP(MaxPool_f)
DEF_OP(Concat_f)
DEF_OP(BiasAdd_f)
DEF_OP(LRN_f)
DEF_OP(Variable)
DEF_OP(Assign)
DEF_OP(Reshape)
DEF_OP(QuantizedReshape)
DEF_OP(Tanh_f)
DEF_OP(Sigmoid_f)
DEF_OP(Slice_8)
DEF_OP(Slice_f)
DEF_OP(QuantizedSlice_8)
DEF_OP(Add_f)
DEF_OP(Mul_f)
DEF_OP(Minimum_f)
DEF_OP(Maximum_f)
DEF_OP_WREF(Requantize_32to8)
DEF_OP_WREF(RequantizationRange_32)
DEF_OP(Neg_f)
DEF_OP(Sub_f)
DEF_OP(AddN_f)
DEF_OP(Range_int32)
DEF_OP(Rank_int32)
DEF_OP(Transpose_int32)
DEF_OP(Transpose_f)
DEF_OP(InstanceNorm_f)
DEF_OP_WREF(QuantizedInstanceNorm_8)
DEF_OP(Sub_int32)
DEF_OP(Add_int32)
DEF_OP(Split_f)
DEF_OP(Dequantize_qint32_f)
DEF_OP(PRelu_f)
DEF_OP_WREF(QuantizedPRelu_8)
DEF_OP(Sum_f)
DEF_OP(Prod_f)
DEF_OP(Mul_int32)
DEF_OP(LogicalAnd_int32)
DEF_OP(LogicalOr_int32)
DEF_OP(LogicalXor_int32)
DEF_OP(Shape_int32)
DEF_OP(Pack_int32)
DEF_OP(MirrorPad_f)
DEF_OP(ResizeNearestNeighbor_f)
DEF_OP(StridedSlice_int32)
DEF_OP(StridedSlice_f)
DEF_OP(ExpandDims_int32)
DEF_OP(ExpandDims_f)
DEF_OP(LogSoftmax_f)
DEF_OP(Split_int32)
DEF_OP(QuantizedSplit_8)
DEF_OP(Deconv_f)
DEF_OP_WREF(QuantizedDeconv_8x8to32)
DEF_OP_WREF(QuantizedMul_8x8to32)
DEF_OP_WREF(QuantizedAdd_8p8to32)
DEF_OP_WREF(QuantizedSigmoid_8)
DEF_OP_WREF(QuantizedTanh_8)
DEF_OP_WREF(QuantizedSoftmax_8)
DEF_OP_WREF(QuantizedLRN_8)
DEF_OP_WREF(Quantizedpad2d_frame_8p)
DEF_OP_WREF(QuantizedSub_8p8to32)
DEF_OP_WREF(QuantizedMaximum_8)
DEF_OP_WREF(QuantizedMinimum_8)
DEF_OP(Pad_f)
DEF_OP(SpaceToBatchND_f)
DEF_OP(BatchToSpaceND_f)
DEF_OP(QuantizedPad_8)
DEF_OP(ResizeBilinear_f)
DEF_OP(ConcatV2_f)
DEF_OP(ConcatV2_int32)
DEF_OP(Prod_int32)
DEF_OP(Slice_int32)
DEF_OP(QuantizedAdd_8p8to8)
DEF_OP(QuantizedResizeBilinear_8)
DEF_OP(Supernode_8x8p8to8_d32)
DEF_OP(Convert_to_d32)
DEF_OP(Convert_from_d32)
DEF_OP_WREF(QuantizedMaxPool_8_d32)
DEF_OP_WREF(QuantizedConcat_8_d32)
DEF_OP_WREF(QuantizedAvgPool_8_d32)
DEF_OP(Sink)
DEF_OP_WREF(QuantizedPRelu_8_d32)
DEF_OP_WREF(AutoQuantize)
DEF_OP_WREF(QuantizedDepthwiseConv2d_8x8to32)
DEF_OP_WREF(DepthwiseConv2d_f)
DEF_OP(DepthwiseSupernode_8x8p8to8)
DEF_OP(DepthwiseSupernode_8x8p8to8_d32)
DEF_OP_WREF(QuantizedMul_8x8to8_d32)
DEF_OP(FullyConnected_u8)
#if 0
DEF_OP_WREF(QuantizedFC_8x8p8to8)
#endif
DEF_OP_WREF(QuantizedAdd_8p8to8_d32)
DEF_OP_WREF(QuantizedClamp_8)
DEF_OP(Clamp_f)
DEF_OP(QuantizeForTest_d32)
DEF_OP(Close_d32)
DEF_OP_WREF(QuantizedSub_8p8to8_d32)
DEF_OP(InputSupernode_8x8p8to8_outd32)
DEF_OP(QuantizedLRN_8_d32)
DEF_OP_WREF(QuantizedBiasAdd_32p32to32)
DEF_OP_WREF(Quantize_int32)
DEF_OP(Supernode_8x8p32to8)
DEF_OP(DepthwiseSupernode_8x8p32to8)
DEF_OP(Supernode_8x8p32to8_d32)
DEF_OP(DepthwiseSupernode_8x8p32to8_d32)
DEF_OP(InputSupernode_8x8p32to8_outd32)
DEF_OP(PPrint_8_d32)
DEF_OP(PPrintWithPadding_8_d32)
DEF_OP_WREF(AutoQuantize_d32)
DEF_OP_WREF(QuantizedTanh_8_d32)
DEF_OP_WREF(QuantizedSigmoid_8_d32)
DEF_OP_WREF(QuantizedSoftmax_8_d32)
DEF_OP_WREF(QuantizedL2Pool_8_d32)
DEF_OP(Gather_f)
DEF_OP(Gather_int32)
DEF_OP(Gather_8)
DEF_OP(Table_f)
DEF_OP(Table_int32)
DEF_OP(Table_8)
DEF_OP(FillPadding_8_d32)
DEF_OP(QuantizedResizeBilinear_8_d32)
DEF_OP(QuantizeINPUT_f_to_8)
DEF_OP_WREF(DeconvBias_8x8to32)
DEF_OP(SpaceToBatchND_8)
DEF_OP(BatchToSpaceND_8)
DEF_OP(SpaceToDepth_f)
DEF_OP(DepthToSpace_f)
DEF_OP(SpaceToDepth_8)
DEF_OP(DepthToSpace_8)
#ifdef __SELF_DEF_OP_WREF
#undef __SELF_DEF_OP_WREF
#undef DEF_OP_WREF
#endif
#endif // nnlib version
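Note on the op lists removed above: the old and CAF lists differ in length and order (for example, the CAF list has `QuantizedL2Pool_8` between `QuantizedAvgPool_8` and `QuantizedConcat_8`), so an op's position depends on which list is compiled in. The following Python sketch illustrates that, under an assumption not shown in this diff: that the including file expands each `DEF_OP(NAME)` to one entry in order of appearance. `expand_ops` and the two short excerpts are hypothetical and only for illustration; the positions printed are relative to the excerpts, not real nnlib op ids. The sketch skips preprocessor lines and does not evaluate `#if` blocks.

```python
import re

# Matches DEF_OP(NAME) or DEF_OP_WREF(NAME) and captures macro and name.
_OP_RE = re.compile(r"(DEF_OP(?:_WREF)?)\((\w+)\)")


def expand_ops(text):
    """Return op names in expansion order (hypothetical helper).

    DEF_OP_WREF(NAME) yields NAME followed by NAME_ref, matching the
    macro definition in the list above. Lines starting with '#' are
    skipped, so #define/#if directives are ignored, not evaluated.
    """
    names = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        for macro, name in _OP_RE.findall(line):
            names.append(name)
            if macro == "DEF_OP_WREF":
                names.append(name + "_ref")
    return names


# Short excerpts in the same order as the lists in the diff.
OLD_EXCERPT = """
DEF_OP(Flatten)
DEF_OP_WREF(QuantizedAvgPool_8)
DEF_OP_WREF(QuantizedConcat_8)
"""

CAF_EXCERPT = """
DEF_OP(Flatten)
DEF_OP_WREF(QuantizedAvgPool_8)
DEF_OP_WREF(QuantizedL2Pool_8)
DEF_OP_WREF(QuantizedConcat_8)
"""

old_pos = expand_ops(OLD_EXCERPT).index("QuantizedConcat_8")
caf_pos = expand_ops(CAF_EXCERPT).index("QuantizedConcat_8")
print(old_pos, caf_pos)
```

If the assumption holds, a graph built against one list could not be run against a library built from another without remapping, which may be one reason to keep a single list.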
@@ -9,7 +9,6 @@ build --copt=-fPIC
 build --copt=-D_GLIBCXX_USE_C99_MATH_TR1
 build --copt=-DMACE_OBFUSCATE_LITERALS
 build --copt=-DGEMMLOWP_USE_OPENMP
-build --copt=-DMACE_USE_NNLIB_CAF
 # Usage example: bazel build --config symbol_hidden
 build:symbol_hidden --copt=-fvisibility=hidden
......
@@ -445,7 +445,8 @@ def format_model_config(flags):
     threshold_dict = {
         DeviceType.CPU: ValidationThreshold.cpu_threshold,
         DeviceType.GPU: ValidationThreshold.gpu_threshold,
-        DeviceType.HEXAGON: ValidationThreshold.hexagon_threshold,
+        DeviceType.HEXAGON + "_QUANTIZE":
+            ValidationThreshold.hexagon_threshold,
         DeviceType.CPU + "_QUANTIZE":
             ValidationThreshold.cpu_quantize_threshold,
     }
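Note on the hunk above: the Hexagon threshold is now stored under `DeviceType.HEXAGON + "_QUANTIZE"`, so a lookup with the plain `DeviceType.HEXAGON` key would no longer find it. A minimal Python sketch of such a lookup follows. It assumes `DeviceType` members are plain strings (the diff concatenates them with `"_QUANTIZE"`); the numeric thresholds are placeholders, and `lookup_threshold` is a hypothetical helper, not a function from the tool.

```python
class DeviceType:
    # Assumed string values; only the concatenation with "_QUANTIZE"
    # is visible in the diff.
    CPU = "CPU"
    GPU = "GPU"
    HEXAGON = "HEXAGON"


# Placeholder numbers for illustration only.
threshold_dict = {
    DeviceType.CPU: 0.999,
    DeviceType.GPU: 0.995,
    DeviceType.HEXAGON + "_QUANTIZE": 0.930,
    DeviceType.CPU + "_QUANTIZE": 0.980,
}


def lookup_threshold(device_type, quantize):
    """Return the threshold for a runtime, or None if no key matches."""
    key = device_type + "_QUANTIZE" if quantize else device_type
    return threshold_dict.get(key)


print(lookup_threshold(DeviceType.HEXAGON, quantize=True))
print(lookup_threshold(DeviceType.HEXAGON, quantize=False))
```

With this layout a Hexagon model has a threshold only when it is treated as quantized, which fits the quantized-only op set a DSP runtime would be expected to use.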
......
@@ -515,6 +515,12 @@ class DeviceWrapper:
 for runtime in runtime_list:
     device_type = parse_device_type(runtime)
     # run for specified soc
+    if not subgraphs[0][YAMLKeyword.check_tensors]:
+        output_nodes = subgraphs[0][YAMLKeyword.output_tensors]
+        output_shapes = subgraphs[0][YAMLKeyword.output_shapes]
+    else:
+        output_nodes = subgraphs[0][YAMLKeyword.check_tensors]
+        output_shapes = subgraphs[0][YAMLKeyword.check_shapes]
     run_output = self.tuning_run(
         abi=target_abi,
         target_dir=build_tmp_binary_dir,
@@ -523,9 +529,9 @@ class DeviceWrapper:
         embed_model_data=embed_model_data,
         model_output_dir=model_output_dir,
         input_nodes=subgraphs[0][YAMLKeyword.input_tensors],
-        output_nodes=subgraphs[0][YAMLKeyword.output_tensors],
+        output_nodes=output_nodes,
         input_shapes=subgraphs[0][YAMLKeyword.input_shapes],
-        output_shapes=subgraphs[0][YAMLKeyword.output_shapes],
+        output_shapes=output_shapes,
         mace_model_dir=mace_model_dir,
         model_tag=model_name,
         device_type=device_type,
@@ -568,9 +574,9 @@ class DeviceWrapper:
         platform=model_config[YAMLKeyword.platform],
         device_type=device_type,
         input_nodes=subgraphs[0][YAMLKeyword.input_tensors],
-        output_nodes=subgraphs[0][YAMLKeyword.output_tensors],
+        output_nodes=output_nodes,
         input_shapes=subgraphs[0][YAMLKeyword.input_shapes],
-        output_shapes=subgraphs[0][YAMLKeyword.output_shapes],
+        output_shapes=output_shapes,
         model_output_dir=model_output_dir,
         input_data_types=subgraphs[0][
             YAMLKeyword.input_data_types],
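Note on the `DeviceWrapper` hunks above: the run and validation steps now use `check_tensors`/`check_shapes` when they are set, and fall back to `output_tensors`/`output_shapes` otherwise. The selection can be sketched in isolation as follows. This is a simplified illustration, not the tool's code: `select_outputs` is a hypothetical helper, plain string keys stand in for the `YAMLKeyword` constants, and `.get` is used so the sketch also accepts a subgraph with no `check_tensors` entry at all.

```python
def select_outputs(subgraph):
    """Return (nodes, shapes) to run and validate against.

    Uses the check_* entries when check_tensors is non-empty,
    otherwise the regular output_* entries.
    """
    if not subgraph.get("check_tensors"):
        return subgraph["output_tensors"], subgraph["output_shapes"]
    return subgraph["check_tensors"], subgraph["check_shapes"]


# Hypothetical subgraph configs; tensor names are made up.
plain = {
    "output_tensors": ["prob"],
    "output_shapes": ["1,1001"],
    "check_tensors": [],
}
with_check = {
    "output_tensors": ["prob"],
    "output_shapes": ["1,1001"],
    "check_tensors": ["logits"],
    "check_shapes": ["1,1,1,1001"],
}

print(select_outputs(plain))
print(select_outputs(with_check))
```

Computing the pair once and passing it to both calls keeps the run step and the validation step looking at the same tensors.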
@@ -961,7 +967,8 @@ class DeviceManager:
             YAMLKeyword.address: adb[0],
             YAMLKeyword.username: '',
         }
-        devices.append(android)
+        if android not in devices:
+            devices.append(android)
     return devices
 @classmethod
......
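Note on the `DeviceManager` hunk above: the new `if android not in devices` guard relies on Python comparing dicts by value, so a device whose fields are all identical to an existing entry is skipped. A minimal sketch, assuming each device is a flat dict; `add_unique` and the field names and values are hypothetical.

```python
def add_unique(devices, device):
    """Append device unless an equal dict is already in the list."""
    if device not in devices:  # dict equality compares keys and values
        devices.append(device)
    return devices


devices = []
phone = {"address": "abc123", "username": ""}

add_unique(devices, phone)
add_unique(devices, dict(phone))  # equal copy: skipped
add_unique(devices, {"address": "def456", "username": ""})

print(len(devices))
```

The membership test is a linear scan, which is fine for a handful of attached devices; any differing field makes the entry count as a new device.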