to improve inference efficiency. You can build it from TensorFlow source,
or download `a pre-compiled x86-64 binary <http://cnbj1-inner-fds.api.xiaomi.net/mace/tool/transform_graph>`__.
The MiAI Compute Engine docker image has this tool pre-installed.
The following commands show the suggested graph transformations and
optimizations for CPU, GPU and DSP runtime.
.. code:: sh
...
...
strip_unused_nodes
sort_by_execution_order'
.. code:: sh
# DSP:
./transform_graph \
--in_graph=tf_model.pb \
...
...
- Caffe
The converter only supports Caffe 1.0+; please upgrade your models with Caffe's
built-in tools when necessary.
.. code:: bash
...
...
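If your model predates Caffe 1.0, the upgrade can typically be done with the
tools shipped in Caffe's ``build/tools`` directory. The paths below are
illustrative and depend on your Caffe installation.

.. code:: bash

    # Upgrade a pre-1.0 prototxt / caffemodel to the Caffe 1.0 format.
    # $CAFFE_ROOT and the file names are placeholders; adjust for your setup.
    $CAFFE_ROOT/build/tools/upgrade_net_proto_text \
        old_model.prototxt new_model.prototxt
    $CAFFE_ROOT/build/tools/upgrade_net_proto_binary \
        old_model.caffemodel new_model.caffemodel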
-----------------
3.1 Overview
-----------------
MiAI Compute Engine only builds static libraries. The following are two use cases.
* **Build a well-tuned library for specific SoCs**
When ``target_socs`` is specified in YAML model deployment file, the build
tool will enable automatic tuning for GPU kernels. This usually takes some
time to finish depending on the complexity of your model.
.. note::
You should plug in device(s) with the corresponding SoC(s).
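As a sketch, specifying the target SoCs in the YAML model deployment file might
look like the following; only the ``target_socs`` key is documented here, and
the SoC names are examples.

.. code:: yaml

    # Illustrative deployment file fragment.
    # The SoC names below are examples; use the SoCs of your target devices.
    target_socs: [sdm845, sdm660]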
* **Build a generic library for all SoCs**
When ``target_soc`` is not specified, the generated library is compatible
with general devices.
.. note::
There will be around a 1 ~ 10% performance drop for the GPU
runtime compared to the well-tuned library.
MiAI Compute Engine provides a command line tool (``tools/converter.py``) for
model conversion, compiling, test runs, benchmarking and correctness validation.
.. note::
``tools/converter.py`` should be run from the root directory of this project.
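For example, a build invocation might look like the following; the YAML path is
a placeholder for your own model deployment file.

.. code:: sh

    # Run from the project root; builds the static library and test tools.
    python tools/converter.py build --config=path/to/your_model.yml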
------------------------------------------
3.2 \ ``tools/converter.py``\ usage
------------------------------------------
**Commands**
...
...
.. note::
build static library and test tools.
* *--config* (type=str, default="", required): the path of model yaml configuration file.
* *--tuning* (default=false, optional): whether to tune the parameters for the GPU of the specified SoC.
* *--enable_openmp* (default=true, optional): whether to use OpenMP.
* **run**
.. note::
run the model(s).
* *--config* (type=str, default="", required): the path of model yaml configuration file.
* *--round* (type=int, default=1, optional): the number of rounds to run the model.
* *--validate* (default=false, optional): whether to verify that the results are consistent with those of the original frameworks.
* *--caffe_env* (type=local/docker, default=docker, optional): the Caffe environment used for validation, either the local environment or the Caffe docker image.
* *--restart_round* (type=int, default=1, optional): the number of restart rounds between runs.
* *--gpu_out_of_range_check* (default=false, optional): whether to check for GPU out-of-memory errors.
* *--vlog_level* (type=int[0-5], default=0, optional): verbose log level for debugging.
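Putting the options above together, a typical ``run`` invocation might look
like the following; the YAML path is a placeholder.

.. code:: sh

    # Run the model for 100 rounds and validate the results against the
    # original framework; flags are taken from the option list above.
    python tools/converter.py run --config=path/to/your_model.yml \
        --round=100 --validate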
.. warning::
...
...