Basic usage
============

Build and run an example model
-------------------------------

First, make sure the environment has been set up correctly (refer to :doc:`../installation/env_requirement`).

The following instructions show how to quickly build and run a provided model from
`MACE Model Zoo <https://github.com/XiaoMi/mace-models>`__.

Here we use the mobilenet-v2 model as an example.

**Commands**

    1. Pull `MACE <https://github.com/XiaoMi/mace>`__ project.

    .. code:: sh

        git clone https://github.com/XiaoMi/mace.git
        git fetch --all --tags --prune

        # Checkout the latest tag (i.e. release version)
        tag_name=`git describe --abbrev=0 --tags`
        git checkout tags/${tag_name}

    .. note::

        It's highly recommended to use a release version instead of the master branch.


    2. Pull `MACE Model Zoo <https://github.com/XiaoMi/mace-models>`__ project.

    .. code:: sh

        git clone https://github.com/XiaoMi/mace-models.git


    3. Build a generic MACE library.

    .. code:: sh

        cd path/to/mace
        # Build library
        # output lib path: builds/lib
        bash tools/build-standalone-lib.sh

    .. note::

        - Libraries in ``builds/lib/armeabi-v7a/cpu_gpu/`` can run on ``cpu`` or ``gpu`` devices.

        - The libraries in ``builds/lib/armeabi-v7a/cpu_gpu_dsp/`` require HVX support.


    4. Convert the pre-trained mobilenet-v2 model to MACE format.

    .. code:: sh

        cd path/to/mace
        # Convert model
        python tools/converter.py convert --config=/path/to/mace-models/mobilenet-v2/mobilenet-v2.yml

    5. Run the model.

    .. note::

        If you want to run on a device/phone, please plug in at least one device/phone.

    .. code:: sh

        # Run example
        python tools/converter.py run --config=/path/to/mace-models/mobilenet-v2/mobilenet-v2.yml --example

        # Test model run time
        python tools/converter.py run --config=/path/to/mace-models/mobilenet-v2/mobilenet-v2.yml --round=100

        # Validate the correctness by comparing the results against the
        # original model and framework, measured with cosine distance for similarity.
        python tools/converter.py run --config=/path/to/mace-models/mobilenet-v2/mobilenet-v2.yml --validate


Build your own model
---------------------

This part shows how to use your own pre-trained model in MACE.

======================
1. Prepare your model
======================

MACE now supports models from TensorFlow and Caffe (more frameworks will be supported).

-  TensorFlow

   Prepare your pre-trained TensorFlow ``model.pb`` file.

   Use the `Graph Transform Tool <https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/graph_transforms/README.md>`__
   to optimize your model for inference.
   This tool will improve the efficiency of inference by applying several optimizations, such as operator
   folding and redundant-node removal. We strongly recommend MACE users run it before building.

   Usage for CPU/GPU:

   .. code:: bash

       # CPU/GPU:
       ./transform_graph \
           --in_graph=/path/to/your/tf_model.pb \
           --out_graph=/path/to/your/output/tf_model_opt.pb \
           --inputs='input node name' \
           --outputs='output node name' \
           --transforms='strip_unused_nodes(type=float, shape="1,64,64,3")
               strip_unused_nodes(type=float, shape="1,64,64,3")
               remove_nodes(op=Identity, op=CheckNumerics)
               fold_constants(ignore_errors=true)
               flatten_atrous_conv
               fold_batch_norms
               fold_old_batch_norms
               strip_unused_nodes
               sort_by_execution_order'

   Usage for DSP:

   .. code:: bash

       # DSP:
       ./transform_graph \
           --in_graph=/path/to/your/tf_model.pb \
           --out_graph=/path/to/your/output/tf_model_opt.pb \
           --inputs='input node name' \
           --outputs='output node name' \
           --transforms='strip_unused_nodes(type=float, shape="1,64,64,3")
               strip_unused_nodes(type=float, shape="1,64,64,3")
               remove_nodes(op=Identity, op=CheckNumerics)
               fold_constants(ignore_errors=true)
               fold_batch_norms
               fold_old_batch_norms
               backport_concatv2
               quantize_weights(minimum_size=2)
               quantize_nodes
               strip_unused_nodes
               sort_by_execution_order'

-  Caffe

   Caffe 1.0+ models are supported by the MACE converter tool.

   If your model is from a lower Caffe version, you need to upgrade it with the Caffe built-in tools before converting.

   .. code:: bash

       # Upgrade prototxt
       $CAFFE_ROOT/build/tools/upgrade_net_proto_text MODEL.prototxt MODEL.new.prototxt

       # Upgrade caffemodel
       $CAFFE_ROOT/build/tools/upgrade_net_proto_binary MODEL.caffemodel MODEL.new.caffemodel


===========================================
2. Create a deployment file for your model
===========================================

When converting a model or building a library, MACE needs to read a YAML file, which is referred to here as the model deployment file.

A model deployment file contains all the information about your model(s) and the build options. There are several example
deployment files in the *MACE Model Zoo* project.

The following shows two basic examples of deployment files, one for a TensorFlow model and one for a Caffe model.
Modify one of them and use it for your own case.

-  TensorFlow

   .. literalinclude:: models/demo_models_tf.yml
      :language: yaml

-  Caffe

   .. literalinclude:: models/demo_models_caffe.yml
      :language: yaml

More details about the model deployment file are in :doc:`advanced_usage`.

======================
3. Convert your model
======================

When the deployment file is ready, you can use the MACE converter tool to convert your model(s).

.. code:: bash

    python tools/converter.py convert --config=/path/to/your/model_deployment_file.yml

This command will download or load your pre-trained model and convert it to a MACE model proto file and a weights data file.
The generated model files will be stored in the ``build/${library_name}/model`` folder.

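For illustration (a sketch only; the exact directory and file names follow the ``library_name`` and the model name defined in your deployment file), the converted output might look like this:

.. code:: sh

    ls build/mobilenet-v2/model
    # mobilenet_v2.pb    MACE model graph proto
    # mobilenet_v2.data  weights data
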
.. warning::

    Please set ``model_graph_format: file`` and ``model_data_format: file`` in your deployment file before converting.
    The usage of ``model_graph_format: code`` will be demonstrated in :doc:`advanced_usage`.

=============================
4. Build MACE into a library
=============================

You can download the prebuilt MACE library from the `GitHub MACE release page <https://github.com/XiaoMi/mace/releases>`__.

Or use Bazel to build the MACE source code into a library.

    .. code:: sh

        cd path/to/mace
        # Build library
        # output lib path: builds/lib
        bash tools/build-standalone-lib.sh

The above command will generate the dynamic library ``builds/lib/${ABI}/${DEVICES}/libmace.so`` and the static library ``builds/lib/${ABI}/${DEVICES}/libmace.a``.

    .. warning::

        Please verify that the ``target_abis`` parameter in the above command and in your deployment file are the same.


==================
5. Run your model
==================

With the converted model, the static or shared library, and the header files, you can use the following commands
to run and validate your model.

    .. warning::

        If you want to run on a device/phone, please plug in at least one device/phone.

* **run**

    Run the model.

    .. code:: sh

        # Test model run time
        python tools/converter.py run --config=/path/to/your/model_deployment_file.yml --round=100

        # Validate the correctness by comparing the results against the
        # original model and framework, measured with cosine distance for similarity.
        python tools/converter.py run --config=/path/to/your/model_deployment_file.yml --validate

* **benchmark**

    Benchmark and profile the model.

    .. code:: sh

        # Benchmark model, get detailed statistics of each Op.
        python tools/converter.py benchmark --config=/path/to/your/model_deployment_file.yml


=======================================
6. Deploy your model into applications
=======================================

You can run the model on CPU, GPU, or DSP (based on the ``runtime`` set in your model deployment file).
However, there are some differences between devices.

* **CPU**

    Almost all mobile SoCs use an ARM-based CPU architecture, so in theory your model can run on different SoCs.

* **GPU**

    Although most GPUs use the OpenCL standard, some SoCs do not fully comply with the standard,
    or the GPU is too low-end to use, so you should have a fallback strategy for when the GPU run fails
    (a minimal sketch is shown after this list).

* **DSP**

    MACE only supports the Qualcomm DSP.

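As an illustration of such a fallback strategy, the following minimal sketch reuses the ``CreateMaceEngineFromProto`` API shown at the end of this section; it assumes ``model_pb_data``, ``model_data_file``, ``input_names`` and ``output_names`` have already been prepared as in the full example below:

.. code:: cpp

    // Try to create the engine on the GPU first and fall back to the CPU
    // when creation fails (minimal sketch, error handling elided).
    std::shared_ptr<mace::MaceEngine> engine;
    MaceStatus status = CreateMaceEngineFromProto(model_pb_data,
                                                  model_data_file.c_str(),
                                                  input_names,
                                                  output_names,
                                                  DeviceType::GPU,
                                                  &engine);
    if (status != MaceStatus::MACE_SUCCESS) {
      // GPU is unavailable or initialization failed; retry on CPU.
      status = CreateMaceEngineFromProto(model_pb_data,
                                         model_data_file.c_str(),
                                         input_names,
                                         output_names,
                                         DeviceType::CPU,
                                         &engine);
    }
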
In the converting and building steps, you've got the static/shared library, model files and
header files.

``${library_name}`` is the name you defined in the first line of your deployment YAML file.

.. note::

    When linking the generated ``libmace.a`` into a shared library, a
    `version script <ftp://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_25.html>`__
    is helpful for reducing a specified set of symbols to local scope; a minimal sketch is given below.

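The following is only an illustrative sketch (the file name and the exported-symbol pattern are assumptions for your own project, not part of the MACE build scripts):

.. code:: sh

    # Create a version script that keeps only MACE-related symbols global
    # (hypothetical file name and symbol pattern).
    cat > mace_version_script.lds <<'EOF'
    {
      global:
        *mace*;
      local:
        *;
    };
    EOF

    # Then pass it to the linker when building your own shared library, e.g.
    #   -Wl,--version-script=mace_version_script.lds
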
-  The generated ``static`` library files are organized as follows:

.. code::

L
liuqi 已提交
295 296 297 298 299 300 301 302
    builds
    ├── include
    │   └── mace
    │       └── public
    │           ├── mace.h
    │           └── mace_runtime.h
    ├── lib
    │   ├── arm64-v8a
    │   │   └── cpu_gpu
    │   │       ├── libmace.a
    │   │       └── libmace.so
    │   ├── armeabi-v7a
    │   │   ├── cpu_gpu
    │   │   │   ├── libmace.a
    │   │   │   └── libmace.so
    │   │   └── cpu_gpu_dsp
    │   │       ├── libhexagon_controller.so
    │   │       ├── libmace.a
    │   │       └── libmace.so
    │   └── linux-x86-64
    │       ├── libmace.a
    │       └── libmace.so
    └── mobilenet-v1
        ├── model
        │   ├── mobilenet_v1.data
        │   └── mobilenet_v1.pb
        └── _tmp
            └── arm64-v8a
                └── mace_run_static

Please refer to ``mace/examples/example.cc`` for full usage. The following lists the key steps.

.. code:: cpp

    // Include the headers
    #include "mace/public/mace.h"
    #include "mace/public/mace_runtime.h"

    // 0. Set the compiled OpenCL kernel cache. This is used to reduce the
    // initialization time, since compiling is slow. It's suggested to set
    // this even when a pre-compiled OpenCL program file is provided,
    // because an OpenCL version upgrade may also lead to kernel
    // recompilations.
    const std::string file_path = "path/to/opencl_cache_file";
    std::shared_ptr<KVStorageFactory> storage_factory(
        new FileStorageFactory(file_path));
    ConfigKVStorageFactory(storage_factory);

    // 1. Declare the device type (must be the same as ``runtime`` in the configuration file)
    DeviceType device_type = DeviceType::GPU;

    // 2. Define the input and output tensor names.
    std::vector<std::string> input_names = {...};
    std::vector<std::string> output_names = {...};

    // 3. Create MaceEngine instance
    std::shared_ptr<mace::MaceEngine> engine;
    MaceStatus create_engine_status;

    // Create Engine from model file
    create_engine_status =
        CreateMaceEngineFromProto(model_pb_data,
                                  model_data_file.c_str(),
                                  input_names,
                                  output_names,
                                  device_type,
                                  &engine);
    if (create_engine_status != MaceStatus::MACE_SUCCESS) {
      // fall back to other strategy.
    }

    // 4. Create Input and Output tensor buffers
    std::map<std::string, mace::MaceTensor> inputs;
    std::map<std::string, mace::MaceTensor> outputs;
    for (size_t i = 0; i < input_count; ++i) {
      // Allocate input and output
      int64_t input_size =
          std::accumulate(input_shapes[i].begin(), input_shapes[i].end(), 1,
                          std::multiplies<int64_t>());
      auto buffer_in = std::shared_ptr<float>(new float[input_size],
                                              std::default_delete<float[]>());
      // Load input here
      // ...

      inputs[input_names[i]] = mace::MaceTensor(input_shapes[i], buffer_in);
    }

    for (size_t i = 0; i < output_count; ++i) {
      int64_t output_size =
          std::accumulate(output_shapes[i].begin(), output_shapes[i].end(), 1,
                          std::multiplies<int64_t>());
      auto buffer_out = std::shared_ptr<float>(new float[output_size],
                                               std::default_delete<float[]>());
      outputs[output_names[i]] = mace::MaceTensor(output_shapes[i], buffer_out);
    }

    // 5. Run the model
    MaceStatus status = engine->Run(inputs, &outputs);
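    // 6. Read the results: each output MaceTensor holds the shared_ptr<float>
    //    buffer allocated in step 4, so the inference results can be read
    //    from those buffers once Run() returns MACE_SUCCESS.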

More details are in :doc:`advanced_usage`.