# TimeProfiler Tool

<!-- TOC -->

- [TimeProfiler Tool](#timeprofiler-tool)
    - [Overview](#overview)
    - [Environment Preparation](#environment-preparation)
    - [Parameter Description](#parameter-description)
    - [Example](#example)

<!-- /TOC -->

<a href="https://gitee.com/mindspore/docs/blob/r0.7/lite/tutorials/source_en/use/timeprofiler_tool.md" target="_blank"><img src="../_static/logo_source.png"></a>

## Overview

The TimeProfiler tool analyzes how much time each network layer of a MindSpore Lite model consumes during forward inference. The tool is implemented in C++.

## Environment Preparation

To use the TimeProfiler tool, you need to prepare the environment as follows:

- Compilation: Install the compilation dependencies and build the tool. The code of the TimeProfiler tool is stored in the `mindspore/lite/tools/time_profiler` directory of the MindSpore source code. For details about the compilation operations, see [Environment Requirements](https://www.mindspore.cn/lite/docs/en/r0.7/deploy.html#id2) and [Compilation Example](https://www.mindspore.cn/lite/docs/en/r0.7/deploy.html#id5) in the deployment document.

- Run: Obtain the `time_profiler` tool and configure environment variables by referring to [Output Description](https://www.mindspore.cn/lite/docs/zh-CN/r0.7/deploy.html#id4) in the deployment document.

## Parameter Description

After the TimeProfiler tool is compiled, use the following command to analyze the time consumption of forward inference at the network layer:

```bash
./timeprofiler --modelPath=<MODELPATH> [--help] [--loopCount=<LOOPCOUNT>] [--numThreads=<NUMTHREADS>] [--cpuBindMode=<CPUBINDMODE>] [--inDataPath=<INDATAPATH>] [--fp16Priority=<FP16PRIORITY>]
```

The following describes the parameters in detail.

| Parameter            | Attribute | Function                                                     | Parameter Type | Default Value | Value Range |
| ----------------- | ---- | ------------------------------------------------------------ | ------ | -------- | ---------------------------------- |
| `--help` | Optional | Displays the help information about the `timeprofiler` command. | - | - | - |
| `--modelPath=<MODELPATH>` | Mandatory | Specifies the file path of the MindSpore Lite model for time consumption analysis. | String | Null  | -        |
| `--loopCount=<LOOPCOUNT>` | Optional | Specifies the number of times that model inference is executed when the TimeProfiler tool is used for time consumption analysis. The value is a positive integer. | Integer | 100 | - |
| `--numThreads=<NUMTHREADS>` | Optional | Specifies the number of threads for running the model inference program. | Integer | 4 | - |
| `--cpuBindMode=<CPUBINDMODE>` | Optional | Specifies the type of the CPU core bound to the model inference program. | Integer | 1      | -1: medium core<br/>1: large core<br/>0: not bound |
| `--inDataPath=<INDATAPATH>` | Optional | Specifies the file path of the input data of the specified model. If this parameter is not set, a random value will be used. | String | Null  | -        |
| `--fp16Priority=<FP16PRIORITY>` | Optional | Specifies whether the float16 operator is preferred. | Bool | false | true, false |
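
The defaults and value ranges in the table can be checked before the tool is launched. The following is a minimal sketch, not part of MindSpore Lite: `build_timeprofiler_cmd` is a hypothetical helper that validates `--loopCount` and `--cpuBindMode` against the table and prints the resulting command line. The model path is only an illustrative placeholder.

```bash
# Sketch: validate the documented value ranges, then print a timeprofiler command.
# build_timeprofiler_cmd is a hypothetical helper, not part of MindSpore Lite.
build_timeprofiler_cmd() {
    model_path="$1"
    loop_count="${2:-100}"     # documented default: 100
    num_threads="${3:-4}"      # documented default: 4
    cpu_bind_mode="${4:-1}"    # documented default: 1 (large core)

    # --modelPath is mandatory.
    if [ -z "$model_path" ]; then
        echo "modelPath is required" >&2
        return 1
    fi

    # --loopCount must be a positive integer.
    case "$loop_count" in
        *[!0-9]*) echo "loopCount must be a positive integer" >&2; return 1 ;;
    esac
    if [ "$loop_count" -le 0 ]; then
        echo "loopCount must be a positive integer" >&2
        return 1
    fi

    # --cpuBindMode accepts only -1 (medium core), 0 (not bound) or 1 (large core).
    case "$cpu_bind_mode" in
        -1|0|1) ;;
        *) echo "cpuBindMode must be -1, 0 or 1" >&2; return 1 ;;
    esac

    printf './timeprofiler --modelPath=%s --loopCount=%s --numThreads=%s --cpuBindMode=%s\n' \
        "$model_path" "$loop_count" "$num_threads" "$cpu_bind_mode"
}

# Print a command that runs 10 loops on 2 threads without core binding.
build_timeprofiler_cmd ./models/tcpclassify.ms 10 2 0
```

Omitted arguments fall back to the defaults listed in the table, so `build_timeprofiler_cmd ./models/tcpclassify.ms` prints a command with `--loopCount=100 --numThreads=4 --cpuBindMode=1`.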

## Example

Take the `tcpclassify.ms` model as an example and set the number of model inference cycles to 10. The command for using TimeProfiler to analyze the time consumption at the network layer is as follows:

```bash
./timeprofiler --modelPath=./models/tcpclassify.ms --loopCount=10
```

After this command is executed, the TimeProfiler tool outputs statistics on the running time of the model at the network layer. The statistics are displayed first by `opName` (the operator name) and then by `opType` (the operator type). The columns have the following meanings:

- `avg`: average running time of the operator in a single run, in milliseconds.
- `percent`: ratio of the operator running time to the total running time of all operators.
- `calledTimess`: number of times that the operator is run.
- `opTotalTime`: total time that the operator runs over the specified number of runs.

Finally, `total time` shows the average time consumed by a single inference of the model, and `kernel cost` shows the sum of the average time consumed by all operators in the model inference. In this example, the command output is as follows:

```
-----------------------------------------------------------------------------------------
opName                                                          avg(ms)         percent         calledTimess    opTotalTime
conv2d_1/convolution                                            2.264800        0.824012        10              22.648003
conv2d_2/convolution                                            0.223700        0.081390        10              2.237000
dense_1/BiasAdd                                                 0.007500        0.002729        10              0.075000
dense_1/MatMul                                                  0.126000        0.045843        10              1.260000
dense_1/Relu                                                    0.006900        0.002510        10              0.069000
max_pooling2d_1/MaxPool                                         0.035100        0.012771        10              0.351000
max_pooling2d_2/MaxPool                                         0.014300        0.005203        10              0.143000
max_pooling2d_2/MaxPool_nchw2nhwc_reshape_1/Reshape_0           0.006500        0.002365        10              0.065000
max_pooling2d_2/MaxPool_nchw2nhwc_reshape_1/Shape_0             0.010900        0.003966        10              0.109000
output/BiasAdd                                                  0.005300        0.001928        10              0.053000
output/MatMul                                                   0.011400        0.004148        10              0.114000
output/Softmax                                                  0.013300        0.004839        10              0.133000
reshape_1/Reshape                                               0.000900        0.000327        10              0.009000
reshape_1/Reshape/shape                                         0.009900        0.003602        10              0.099000
reshape_1/Shape                                                 0.002300        0.000837        10              0.023000
reshape_1/strided_slice                                         0.009700        0.003529        10              0.097000
-----------------------------------------------------------------------------------------
opType          avg(ms)         percent         calledTimess    opTotalTime
Activation      0.006900        0.002510        10              0.069000
BiasAdd         0.012800        0.004657        20              0.128000
Conv2D          2.488500        0.905401        20              24.885004
MatMul          0.137400        0.049991        20              1.374000
Nchw2Nhwc       0.017400        0.006331        20              0.174000
Pooling         0.049400        0.017973        20              0.494000
Reshape         0.000900        0.000327        10              0.009000
Shape           0.002300        0.000837        10              0.023000
SoftMax         0.013300        0.004839        10              0.133000
Stack           0.009900        0.003602        10              0.099000
StridedSlice    0.009700        0.003529        10              0.097000

total time :     2.90800 ms,    kernel cost : 2.74851 ms

-----------------------------------------------------------------------------------------
```
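
The relationship between the columns can be checked by hand. The sketch below copies the `opName` rows from the example output into a shell variable and uses `awk` to sum the `avg(ms)` column, which should reproduce `kernel cost`, and to divide the first row's `avg(ms)` by that sum, which should reproduce its `percent`. The variable names are illustrative, and the script reads the copied rows rather than the tool itself.

```bash
# Rows copied from the opName table above.
# Columns: opName  avg(ms)  percent  calledTimess  opTotalTime
rows='conv2d_1/convolution 2.264800 0.824012 10 22.648003
conv2d_2/convolution 0.223700 0.081390 10 2.237000
dense_1/BiasAdd 0.007500 0.002729 10 0.075000
dense_1/MatMul 0.126000 0.045843 10 1.260000
dense_1/Relu 0.006900 0.002510 10 0.069000
max_pooling2d_1/MaxPool 0.035100 0.012771 10 0.351000
max_pooling2d_2/MaxPool 0.014300 0.005203 10 0.143000
max_pooling2d_2/MaxPool_nchw2nhwc_reshape_1/Reshape_0 0.006500 0.002365 10 0.065000
max_pooling2d_2/MaxPool_nchw2nhwc_reshape_1/Shape_0 0.010900 0.003966 10 0.109000
output/BiasAdd 0.005300 0.001928 10 0.053000
output/MatMul 0.011400 0.004148 10 0.114000
output/Softmax 0.013300 0.004839 10 0.133000
reshape_1/Reshape 0.000900 0.000327 10 0.009000
reshape_1/Reshape/shape 0.009900 0.003602 10 0.099000
reshape_1/Shape 0.002300 0.000837 10 0.023000
reshape_1/strided_slice 0.009700 0.003529 10 0.097000'

# Sum of the avg(ms) column: the quantity reported as "kernel cost".
kernel_cost=$(printf '%s\n' "$rows" | LC_ALL=C awk '{ sum += $2 } END { printf "%.4f", sum }')

# avg(ms) of the first row divided by that sum: the row's "percent" value.
conv1_percent=$(printf '%s\n' "$rows" | LC_ALL=C awk 'NR == 1 { first = $2 } { sum += $2 } END { printf "%.3f", first / sum }')

echo "sum of avg(ms): $kernel_cost ms"
echo "conv2d_1/convolution percent: $conv1_percent"
```

The sum comes to 2.7485 ms, matching the reported `kernel cost` of 2.74851 ms up to rounding, and the ratio for `conv2d_1/convolution` comes to 0.824, matching its `percent` column. The remaining difference between `total time` (2.90800 ms) and `kernel cost` is time spent outside the operators themselves.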