Commit b2441b02 authored by liuqi

temp commit

Parent 9c408a6c
......@@ -3,11 +3,6 @@ Advanced usage
This part contains the full usage of MACE.
=========
Overview
=========
......@@ -104,99 +99,216 @@ in one deployment file.
.. code:: bash
# Get device's soc info.
adb shell getprop | grep platform
# command for generating sha256_sum
sha256sum /path/to/your/file
==============
Advanced Usage
==============
There are two common advanced use cases: 1. converting model(s) to CPP code; 2. tuning for a specific SoC when using the GPU. The two use cases are described below.
* **Convert model(s) to CPP code**
For this use case, only the static MACE library can be used.
* **1. Change the model configuration file (.yml)**
If you want to protect your model, you can convert it to CPP code. There are two options:
* Convert the model graph to code and keep the model weights in a file, with the model configuration below.
.. code:: sh
model_graph_format: code
model_data_format: file
* Convert both the model graph and the model weights to code, with the model configuration below.
.. code:: sh
model_graph_format: code
model_data_format: code
.. note::
Another model protection method is using ``obfuscate`` to obfuscate the model operator names.
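For example, a model entry in the deployment file can enable it as below (a minimal sketch; ``obfuscate`` takes ``0`` or ``1``, and the model tag shown is illustrative).

.. code:: yaml

   models:
     mobilenet_v1:
       obfuscate: 1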
* **2. Convert model(s) to code**
.. code:: sh
python tools/converter.py convert --config=/path/to/model_deployment_file.yml
The command will generate **${library_name}.a** in the **builds/${library_name}/model** directory and
the ``*.h`` headers in **builds/${library_name}/include**, as in the directory tree below.
.. code::
builds
├── include
│   └── mace
│       └── public
│           ├── mace_engine_factory.h
│           └── mobilenet_v1.h
└── model
    ├── mobilenet-v1.a
    └── mobilenet_v1.data
* **3. Deployment**
* Link ``libmace.a`` and ``${library_name}.a`` to your target (a linking sketch follows the code below).
* Refer to ``mace/examples/example.cc`` for full usage. The following lists the key steps.
.. code:: cpp
// Include the headers
#include "mace/public/mace.h"
#include "mace/public/mace_runtime.h"
// If the model_graph_format is code
#include "mace/public/${model_name}.h"
#include "mace/public/mace_engine_factory.h"
// ... Same with the code in basic usage
// 4. Create MaceEngine instance
std::shared_ptr<mace::MaceEngine> engine;
MaceStatus create_engine_status;
// Create Engine from compiled code
create_engine_status =
CreateMaceEngineFromCode(model_name.c_str(),
nullptr,
input_names,
output_names,
device_type,
&engine);
if (create_engine_status != MaceStatus::MACE_SUCCESS) {
// Report error
}
// ... Same with the code in basic usage
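For the linking step mentioned above, here is a minimal sketch (``${CXX}`` stands for your NDK C++ compiler; the source file, flags and output name are illustrative, and real projects usually wire this into their build system):

.. code:: sh

   # Hypothetical manual link step; adjust toolchain, ABI and paths to your
   # project. The static MACE build uses OpenMP, hence -fopenmp.
   ${CXX} example.cc \
       -Ibuilds/${library_name}/include \
       builds/${library_name}/model/${library_name}.a \
       path/to/libmace.a \
       -fopenmp -llog -o example_bin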
* **Tuning for a specific SoC's GPU**
If you want to use the GPU of a specific device, you can specify ``target_socs`` and
tune for that specific SoC. This may bring a 1~10% performance improvement.
* **1. Change the model configuration file(.yml)**
Specify ``target_socs`` in your model configuration file (.yml):
.. code:: sh
target_socs: [sdm845]
.. note::
Get the device's SoC info: ``adb shell getprop | grep platform``
* **2. Convert model(s)**
.. code:: sh
python tools/converter.py convert --config=/path/to/model_deployment_file.yml
* **3. Tuning**
``tools/converter.py`` will enable automatic tuning for GPU kernels. This usually takes some
time to finish, depending on the complexity of your model.
.. note::
You should plug in device(s) with the specific SoC(s).
.. code:: sh
python tools/converter.py run --config=/path/to/model_deployment_file.yml --validate
The command will generate two files in ``builds/${library_name}/opencl``, as shown below.
.. code::
builds
└── mobilenet-v2
    ├── model
    │   ├── mobilenet_v2.data
    │   └── mobilenet_v2.pb
    └── opencl
        └── arm64-v8a
            ├── mobilenet-v2_compiled_opencl_kernel.MiNote3.sdm660.bin
            └── mobilenet-v2_tuned_opencl_parameter.MiNote3.sdm660.bin
* **mobilenet-v2_compiled_opencl_kernel.MiNote3.sdm660.bin** stands for the OpenCL binaries
used for your models, which can accelerate the initialization stage.
For details, please refer to the `OpenCL Specification <https://www.khronos.org/registry/OpenCL/sdk/1.0/docs/man/xhtml/clCreateProgramWithBinary.html>`__.
* **mobilenet-v2_tuned_opencl_parameter.MiNote3.sdm660.bin** stands for the tuned OpenCL parameters
for the SoC.
* **4. Deployment**
* Rename the files generated above to avoid name collisions, and push them to **your own device's directory**.
* Usage is like the previous procedure; the key steps that differ are listed below.
.. code:: cpp
// Include the headers
#include "mace/public/mace.h"
#include "mace/public/mace_runtime.h"
// 0. Set the pre-compiled OpenCL binary program file paths and the OpenCL parameter file path when available
if (device_type == DeviceType::GPU) {
std::vector<std::string> opencl_binary_paths = {"path/to/opencl_binary_file"};
mace::SetOpenCLBinaryPaths(opencl_binary_paths);
mace::SetOpenCLParameterPath("path/to/opencl_parameter_file");
}
// ... Same with the code in basic usage.
===============
Useful Commands
===============
* **run the model**
.. code:: sh
# Test model run time
python tools/converter.py run --config=/path/to/model_deployment_file.yml --round=100
# Validate the correctness by comparing the results against the
# original model and framework, measured with cosine distance for similarity.
python tools/converter.py run --config=/path/to/model_deployment_file.yml --validate
# Check the memory usage of the model (keep only one model in the configuration file)
python tools/converter.py run --config=/path/to/model_deployment_file.yml --round=10000 &
sleep 5
adb shell dumpsys meminfo | grep mace_run
kill %1
.. warning::
``run`` relies on the ``convert`` command, so you should ``run`` after ``convert``.
* **benchmark and profiling model**
.. code:: sh
# Benchmark model, get detailed statistics of each Op.
python tools/converter.py benchmark --config=/path/to/model_deployment_file.yml
.. warning::
``benchmark`` relies on the ``convert`` command, so you should ``benchmark`` after ``convert``.
**Common arguments**
......@@ -242,183 +354,3 @@ Use ``-h`` to get detailed help.
python tools/converter.py build -h
python tools/converter.py run -h
python tools/converter.py benchmark -h
How to deploy
--------------
=========
Overview
=========
The ``build`` command will generate the static/shared library, model files and
header files, and package them as
``build/${library_name}/libmace_${library_name}.tar.gz``.
- The generated ``static`` libraries are organized as follows,
.. code::
build/
└── mobilenet-v2-gpu
    ├── include
    │   └── mace
    │       └── public
    │           ├── mace.h
    │           ├── mace_runtime.h
    │           └── mace_engine_factory.h (only exists if ``build_type`` is set to ``code``)
    ├── libmace_mobilenet-v2-gpu.tar.gz
    ├── lib
    │   ├── arm64-v8a
    │   │   └── libmace_mobilenet-v2-gpu.MI6.msm8998.a
    │   └── armeabi-v7a
    │       └── libmace_mobilenet-v2-gpu.MI6.msm8998.a
    ├── model
    │   ├── mobilenet_v2.data
    │   └── mobilenet_v2.pb
    └── opencl
        ├── arm64-v8a
        │   └── mobilenet-v2-gpu_compiled_opencl_kernel.MI6.msm8998.bin
        └── armeabi-v7a
            └── mobilenet-v2-gpu_compiled_opencl_kernel.MI6.msm8998.bin
- The generated ``shared`` libraries are organized as follows,
.. code::
build
└── mobilenet-v2-gpu
    ├── include
    │   └── mace
    │       └── public
    │           ├── mace.h
    │           ├── mace_runtime.h
    │           └── mace_engine_factory.h (only exists if ``build_type`` is set to ``code``)
    ├── lib
    │   ├── arm64-v8a
    │   │   ├── libgnustl_shared.so
    │   │   └── libmace.so
    │   └── armeabi-v7a
    │       ├── libgnustl_shared.so
    │       └── libmace.so
    ├── model
    │   ├── mobilenet_v2.data
    │   └── mobilenet_v2.pb
    └── opencl
        ├── arm64-v8a
        │   └── mobilenet-v2-gpu_compiled_opencl_kernel.MI6.msm8998.bin
        └── armeabi-v7a
            └── mobilenet-v2-gpu_compiled_opencl_kernel.MI6.msm8998.bin
.. note::
1. DSP runtime depends on ``libhexagon_controller.so``.
2. ``${MODEL_TAG}.pb`` file will be generated only when ``build_type`` is ``proto``.
3. ``${library_name}_compiled_opencl_kernel.${device_name}.${soc}.bin`` will
be generated only when ``target_socs`` and ``gpu`` runtime are specified.
4. Generated shared library depends on ``libgnustl_shared.so``.
5. Files in the opencl folder will be generated only if
``target_socs`` was set and ``runtime`` contains ``gpu`` in the deployment file.
6. When ``build_type`` has been set to ``code``, ``${library_name}.h`` and ``mace_engine_factory.h``
will be generated in the ``include`` folder. These header files will be used to create the MaceEngine for your model.
.. warning::
``${library_name}_compiled_opencl_kernel.${device_name}.${soc}.bin`` depends
on the OpenCL version of the device; you should maintain the compatibility or
configure the compiling cache store with ``ConfigKVStorageFactory``.
===========
Deployment
===========
Unpack the generated libmace_${library_name}.tar.gz file and copy all of the uncompressed files into your project.
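For example (a sketch; the destination directory is illustrative and depends on your project layout):

.. code:: sh

   # Unpack the packaged library into your project tree.
   mkdir -p third_party/mace
   tar -xzvf build/${library_name}/libmace_${library_name}.tar.gz -C third_party/mace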
Please refer to ``mace/examples/example.cc`` for full usage. The following lists the key steps.
.. code:: cpp
// Include the headers
#include "mace/public/mace.h"
#include "mace/public/mace_runtime.h"
// If the build_type is code
#include "mace/public/mace_engine_factory.h"
// 0. Set pre-compiled OpenCL binary program file paths when available
if (device_type == DeviceType::GPU) {
mace::SetOpenCLBinaryPaths(opencl_binary_paths);
}
// 1. Set compiled OpenCL kernel cache. This is used to reduce the
// initialization time, since compiling kernels is slow. It's suggested
// to set this even when a pre-compiled OpenCL program file is provided,
// because an OpenCL version upgrade may also lead to kernel
// recompilation.
const std::string file_path = "path/to/opencl_cache_file";
std::shared_ptr<KVStorageFactory> storage_factory(
new FileStorageFactory(file_path));
ConfigKVStorageFactory(storage_factory);
// 2. Declare the device type (must match ``runtime`` in the configuration file)
DeviceType device_type = DeviceType::GPU;
// 3. Define the input and output tensor names.
std::vector<std::string> input_names = {...};
std::vector<std::string> output_names = {...};
// 4. Create MaceEngine instance
std::shared_ptr<mace::MaceEngine> engine;
MaceStatus create_engine_status;
// Create Engine from compiled code
create_engine_status =
CreateMaceEngineFromCode(model_name.c_str(),
nullptr,
input_names,
output_names,
device_type,
&engine);
// Or, create Engine from a model file
create_engine_status =
CreateMaceEngineFromProto(model_pb_data,
model_data_file.c_str(),
input_names,
output_names,
device_type,
&engine);
if (create_engine_status != MaceStatus::MACE_SUCCESS) {
// Report error
}
// 5. Create Input and Output tensor buffers
std::map<std::string, mace::MaceTensor> inputs;
std::map<std::string, mace::MaceTensor> outputs;
for (size_t i = 0; i < input_count; ++i) {
// Allocate input and output
int64_t input_size =
std::accumulate(input_shapes[i].begin(), input_shapes[i].end(), 1,
std::multiplies<int64_t>());
auto buffer_in = std::shared_ptr<float>(new float[input_size],
std::default_delete<float[]>());
// Load input here
// ...
inputs[input_names[i]] = mace::MaceTensor(input_shapes[i], buffer_in);
}
for (size_t i = 0; i < output_count; ++i) {
int64_t output_size =
std::accumulate(output_shapes[i].begin(), output_shapes[i].end(), 1,
std::multiplies<int64_t>());
auto buffer_out = std::shared_ptr<float>(new float[output_size],
std::default_delete<float[]>());
outputs[output_names[i]] = mace::MaceTensor(output_shapes[i], buffer_out);
}
// 6. Run the model
MaceStatus status = engine->Run(inputs, &outputs);
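After ``Run`` returns successfully, the output buffers created in step 5 hold the results. The sketch below reads them back for a classification-style model; the arg-max loop is illustrative and assumes the first output is a flat score vector reachable through ``MaceTensor::data()`` (the buffer passed in above).

.. code:: cpp

   // 7. (Sketch) Consume the first output; assumes a flat score vector.
   if (status == MaceStatus::MACE_SUCCESS) {
     const float *scores = outputs[output_names[0]].data().get();
     int64_t num_scores =
         std::accumulate(output_shapes[0].begin(), output_shapes[0].end(),
                         static_cast<int64_t>(1), std::multiplies<int64_t>());
     int64_t best = 0;
     for (int64_t i = 1; i < num_scores; ++i) {
       if (scores[i] > scores[best]) best = i;  // arg-max over the scores
     }
     // `best` now holds the predicted class index.
   }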
......@@ -42,7 +42,8 @@ Here we use the mobilenet-v2 model as an example.
cd path/to/mace
# Build library
# output lib path: builds/lib
bash tools/build-standalone-lib.sh
4. Convert the model to MACE format model.
......@@ -51,11 +52,15 @@ Here we use the mobilenet-v2 model as an example.
cd path/to/mace
# Convert model
python tools/converter.py convert --config=/path/to/mace-models/mobilenet-v2/mobilenet-v2.yml
5. Run the model.
.. warning::
If you want to run on device/phone, please plug in at least one device/phone.
.. code:: sh
# Test model run time
......@@ -160,8 +165,8 @@ The generated model files will be stored in ``build/${library_name}/model`` fold
.. warning::
Please set ``model_graph_format: file`` and ``model_data_format: file`` in your deployment file before converting.
The usage of ``model_graph_format: code`` will be demonstrated in :doc:`advanced_usage`.
=============================
4. Build MACE into a library
......@@ -173,14 +178,14 @@ Use bazel to build MACE source code into a library.
cd path/to/mace
# Build library
# output lib path: builds/lib
bash tools/build-standalone-lib.sh
The above command will generate the dynamic library ``builds/lib/${ABI}/libmace.so`` and the static library ``builds/lib/${ABI}/libmace.a``.
.. warning::
1. Please verify that the target_abis param in the above command and your deployment file are the same.
2. If you want to build a library for a specific soc, please refer to :doc:`advanced_usage`.
==================
......@@ -190,6 +195,10 @@ The above command will generate a library as ``bazel-bin/mace/libmace.so``.
With the converted model, the static or shared library and header files, you can use the following commands
to run and validate your model.
.. warning::
If you want to run on device/phone, please plug in at least one device/phone.
* **run**
run the model.
......@@ -218,8 +227,7 @@ to run and validate your model.
=======================================
In the converting and building steps, you've got the static/shared library, model files and
header files.
``${library_name}`` is the name you defined in the first line of your deployment YAML file.
......@@ -227,57 +235,31 @@ header files. All of these generated files have been packaged into
.. code::
builds
├── include
│   └── mace
│       └── public
│           ├── mace.h
│           └── mace_runtime.h
├── lib
│   ├── arm64-v8a
│   │   ├── libmace.a
│   │   └── libmace.so
│   ├── armeabi-v7a
│   │   ├── libhexagon_controller.so
│   │   ├── libmace.a
│   │   └── libmace.so
│   └── linux-x86-64
│       ├── libmace.a
│       └── libmace.so
└── mobilenet-v1
    ├── model
    │   ├── mobilenet_v1.data
    │   └── mobilenet_v1.pb
    └── _tmp
        └── arm64-v8a
            └── mace_run_static
......
......@@ -16,9 +16,6 @@ Example
----------
Here is an example deployment file used by an Android demo application.
.. literalinclude:: models/demo_app_models.yml
:language: yaml
......@@ -34,12 +31,10 @@ Configurations
- The target ABI to build, can be one or more of 'host', 'armeabi-v7a' or 'arm64-v8a'.
* - target_socs
- [optional] Build for the specified SoCs if you only want to use the model on those SoCs.
* - model_graph_format
- MACE model graph type, could be ['file', 'code']. 'file' for converting model to ProtoBuf(`.pb`) file and 'code' for converting model to c++ code.
* - model_data_format
- MACE model data type, could be ['file', 'code']. 'file' for converting model to `.data` file and 'code' for converting model to c++ code.
* - model_name
- model name, should be unique if there are multiple models.
**LIMIT: if model_graph_format or model_data_format is code, model_name will be used in C++ code, so model_name must be a valid C++ identifier (e.g. mobilenet_v1 works, while mobilenet-v1 does not).**
......
......@@ -2,13 +2,11 @@
library_name: mobile_squeeze
# host, armeabi-v7a or arm64-v8a
target_abis: [arm64-v8a]
# 'code' for transferring model(s) into cpp code, 'file' for keeping model(s) in protobuf file(s) (.pb).
model_graph_format: code
# 'code' for transferring model data(s) into cpp code, 'file' for keeping model data(s) in file(s) (.data).
model_data_format: code
# One yaml config file can contain multiple models' deployment info.
models:
mobilenet_v1:
......
# The name of library
library_name: squeezenet-v10
target_abis: [arm64-v8a]
model_graph_format: file
model_data_format: file
models:
squeezenet-v10: # model tag, which will be used in model loading and must be unique.
platform: caffe
......@@ -15,10 +14,28 @@ models:
model_sha256_checksum: db680cf18bb0387ded9c8e9401b1bbcf5dc09bf704ef1e3d3dbd1937e772cae0
weight_sha256_checksum: 9ff8035aada1f9ffa880b35252680d971434b141ec9fbacbe88309f0f9a675ce
# define your model's interface
# if there are multiple inputs or outputs, write them as below:
# subgraphs:
# - input_tensors:
# - input0
# - input1
# input_shapes:
# - 1,224,224,3
# - 1,224,224,3
# output_tensors:
# - output0
# - output1
# output_shapes:
# - 1,1001
# - 1,1001
subgraphs:
- input_tensors:
- data
input_shapes:
- 1,227,227,3
output_tensors:
- prob
output_shapes:
- 1,1,1,1000
runtime: cpu+gpu
winograd: 0
# The name of library
library_name: mobilenet
target_abis: [arm64-v8a]
model_graph_format: file
model_data_format: file
models:
mobilenet_v1: # model tag, which will be used in model loading and must be unique.
platform: tensorflow
......@@ -13,11 +12,29 @@ models:
# use this command to get the sha256_checksum: sha256sum path/to/your/pb/file
model_sha256_checksum: 71b10f540ece33c49a7b51f5d4095fc9bd78ce46ebf0300487b2ee23d71294e6
# define your model's interface
# if there are multiple inputs or outputs, write them as below:
# subgraphs:
# - input_tensors:
# - input0
# - input1
# input_shapes:
# - 1,224,224,3
# - 1,224,224,3
# output_tensors:
# - output0
# - output1
# output_shapes:
# - 1,1001
# - 1,1001
subgraphs:
- input_tensors:
- input
input_shapes:
- 1,224,224,3
output_tensors:
- MobilenetV1/Predictions/Reshape_1
output_shapes:
- 1,1001
# cpu, gpu or cpu+gpu
runtime: cpu+gpu
winograd: 0