提交 c98d031f 编写于 作者: A AGroupofProbiotocs

Add code and description of the Lite classes to tutorials and fix some text errors.

上级 39f4bb91
......@@ -11,5 +11,6 @@ MindSpore端侧文档
:maxdepth: 1
architecture
apicc/apicc
operator_list
glossary
......@@ -57,6 +57,16 @@ In MindSpore Lite, a model file is an `.ms` file converted using the model conve
A model is created based on memory data using the static `Import` method of the Model class. The `Model` instance returned by the function is a pointer, which is created by using `new`. If the pointer is not required, you need to release it by using `delete`.
```cpp
/// \brief Static method to create a Model pointer.
///
/// \param[in] model_buf Define the buffer read from a model file.
/// \param[in] size Define bytes number of model buffer.
///
/// \return Pointer of MindSpore Lite Model.
static Model *Import(const char *model_buf, size_t size);
```
## Session Creation
When MindSpore Lite is used for inference, sessions are the main entrance of inference. You can compile and execute graphs through sessions.
......@@ -67,14 +77,66 @@ Contexts save some basic configuration parameters required by sessions to guide
MindSpore Lite supports heterogeneous inference. The preferred backend for inference is specified by `device_ctx_` in `Context` and is CPU by default. During graph compilation, operator selection and scheduling are performed based on the preferred backend.
```cpp
/// \brief DeviceType defined for holding user's preferred backend.
typedef enum {
DT_CPU, /**< CPU device type */
DT_GPU, /**< GPU device type */
DT_NPU /**< NPU device type, not supported yet */
} DeviceType;
/// \brief DeviceContext defined for holding DeviceType.
typedef struct {
DeviceType type; /**< device type */
} DeviceContext;
DeviceContext device_ctx_{DT_CPU};
```
MindSpore Lite has a built-in thread pool shared by processes. During inference, `thread_num_` is used to specify the maximum number of threads in the thread pool. The default maximum number is 2. It is recommended that the maximum number be no more than 4. Otherwise, the performance may be affected.
```c++
int thread_num_ = 2; /**< thread number config for thread pool */
```
MindSpore Lite supports dynamic memory allocation and release. If `allocator` is not specified, a default `allocator` is generated during inference. You can also use the `Context` method to allow multiple `Context` to share the memory allocator.
If users create the `Context` by using `new`, it should be released by using `delete` once it's not required. Usually the `Context` is released after finishing the session creation.
```cpp
/// \brief Allocator defined a memory pool for malloc memory and free memory dynamically.
///
/// \note List public class and interface for reference.
class Allocator;
/// \brief Context defined for holding environment variables during runtime.
class MS_API Context {
public:
/// \brief Constructor of MindSpore Lite Context using input value for parameters.
///
/// \param[in] thread_num Define the work thread number during the runtime.
/// \param[in] allocator Define the allocator for malloc.
/// \param[in] device_ctx Define device information during the runtime.
Context(int thread_num, std::shared_ptr<Allocator> allocator, DeviceContext device_ctx);
public:
std::shared_ptr<Allocator> allocator = nullptr;
}
```
### Creating Sessions
Use the `Context` created in the previous step to call the static `CreateSession` method of LiteSession to create `LiteSession`. The `LiteSession` instance returned by the function is a pointer, which is created by using `new`. If the pointer is not required, you need to release it by using `delete`.
```cpp
/// \brief Static method to create a LiteSession pointer.
///
/// \param[in] context Define the context of session to be created.
///
/// \return Pointer of MindSpore Lite LiteSession.
static LiteSession *CreateSession(lite::Context *context);
```
### Example
The following sample code demonstrates how to create a `Context` and how to allow two `LiteSession` to share a memory pool.
......@@ -117,6 +179,20 @@ if (session == nullptr) {
When using MindSpore Lite for inference, after the session creation and graph compilation have been completed, if you need to resize the input shape, you can reset the shape of the input tensor, and then call the session's Resize() interface.
```cpp
/// \brief Get input MindSpore Lite MSTensors of model.
///
/// \return The vector of MindSpore Lite MSTensor.
virtual std::vector<tensor::MSTensor *> GetInputs() const = 0;
/// \brief Resize inputs shape.
///
/// \param[in] inputs Define the new inputs shape.
///
/// \return STATUS as an error code of resize inputs, STATUS is defined in errorcode.h.
virtual int Resize(const std::vector<tensor::MSTensor *> &inputs) = 0;
```
### Example
The following code demonstrates how to resize the input of MindSpore Lite:
......@@ -134,6 +210,17 @@ session->Resize(inputs);
Before graph execution, call the `CompileGraph` API of the `LiteSession` to compile graphs and further parse the Model instance loaded from the file, mainly for subgraph split and operator selection and scheduling. This process takes a long time. Therefore, it is recommended that `LiteSession` achieve multiple executions with one creation and one compilation.
```cpp
/// \brief Compile MindSpore Lite model.
///
/// \note CompileGraph should be called before RunGraph.
///
/// \param[in] model Define the model to be compiled.
///
/// \return STATUS as an error code of compiling graph, STATUS is defined in errorcode.h.
virtual int CompileGraph(lite::Model *model) = 0;
```
### Example
The following code demonstrates how to compile graph of MindSpore Lite:
......@@ -160,12 +247,43 @@ Before graph execution, you need to copy the input data to model input tensors.
MindSpore Lite provides the following methods to obtain model input tensors.
1. Use the `GetInputsByName` method to obtain vectors of the model input tensors that are connected to the model input node based on the node name.
```cpp
/// \brief Get input MindSpore Lite MSTensors of model by node name.
///
/// \param[in] node_name Define node name.
///
/// \return The vector of MindSpore Lite MSTensor.
virtual std::vector<tensor::MSTensor *> GetInputsByName(const std::string &node_name) const = 0;
```
2. Use the `GetInputs` method to directly obtain the vectors of all model input tensors.
```cpp
/// \brief Get input MindSpore Lite MSTensors of model.
///
/// \return The vector of MindSpore Lite MSTensor.
virtual std::vector<tensor::MSTensor *> GetInputs() const = 0;
```
### Copying Data
After model input tensors are obtained, you need to enter data into the tensors. Use the `Size` method of `MSTensor` to obtain the size of the data to be entered into tensors, use the `data_type` method to obtain the data type of tensors, and use the `MutableData` method of `MSTensor` to obtain the writable pointer.
```cpp
/// \brief Get byte size of data in MSTensor.
///
/// \return Byte size of data in MSTensor.
virtual size_t Size() const = 0;
/// \brief Get the pointer of data in MSTensor.
///
/// \note The data pointer can be used to both write and read data in MSTensor.
///
/// \return The pointer points to data in MSTensor.
virtual void *MutableData() const = 0;
```
### Example
The following sample code shows how to obtain the entire graph input `MSTensor` from `LiteSession` and enter the model input data to `MSTensor`.
......@@ -205,10 +323,29 @@ Note:
After a MindSpore Lite session performs graph compilation, you can use `RunGraph` of `LiteSession` for model inference.
```cpp
/// \brief Run session with callback.
///
/// \param[in] before Define a call_back_function to be called before running each node.
/// \param[in] after Define a call_back_function to be called after running each node.
///
/// \note RunGraph should be called after CompileGraph.
///
/// \return STATUS as an error code of running graph, STATUS is defined in errorcode.h.
virtual int RunGraph(const KernelCallBack &before = nullptr, const KernelCallBack &after = nullptr) = 0;
```
### Core Binding
The built-in thread pool of MindSpore Lite supports core binding and unbinding. By calling the `BindThread` API, you can bind working threads in the thread pool to specified CPU cores for performance analysis. The core binding operation is related to the context specified when `LiteSession` is created. The core binding operation sets the affinity between a thread and CPU based on the core binding policy in the context.
```cpp
/// \brief Attempt to bind or unbind threads in the thread pool to or from the specified cpu core.
///
/// \param[in] if_bind Define whether to bind or unbind threads.
virtual void BindThread(bool if_bind) = 0;
```
Note that core binding is an affinity operation, which is affected by system scheduling. Therefore, successful binding to the specified CPU core cannot be ensured. After executing the code of core binding, you need to perform the unbinding operation. The following is an example:
```cpp
......@@ -235,6 +372,17 @@ MindSpore Lite can transfer two `KernelCallBack` function pointers to call back
- Input and output tensors before inference of the current node
- Input and output tensors after inference of the current node
```cpp
/// \brief CallBackParam defines input arguments for callback function.
struct CallBackParam {
std::string name_callback_param; /**< node name argument */
std::string type_callback_param; /**< node type argument */
};
/// \brief KernelCallBack defines the function pointer for callback.
using KernelCallBack = std::function<bool(std::vector<tensor::MSTensor *> inputs, std::vector<tensor::MSTensor *> outputs, const CallBackParam &opInfo)>;
```
### Example
The following sample code demonstrates how to use `LiteSession` to compile a graph, defines two callback functions as the before-callback pointer and after-callback pointer, transfers them to the `RunGraph` API for callback inference, and demonstrates the scenario of multiple graph executions with one graph compilation.
......@@ -301,13 +449,70 @@ delete (model);
After performing inference, MindSpore Lite can obtain the model inference result.
MindSpore Lite provides the following methods to obtain the model output `MSTensor`.
1. Use the `GetOutputsByName` method to obtain vectors of the model output `MSTensor` that is connected to the model output node based on the node name.
1. Use the `GetOutputsByNodeName` method to obtain vectors of the model output `MSTensor` that is connected to the model output node based on the node name.
```cpp
/// \brief Get output MindSpore Lite MSTensors of model by node name.
///
/// \param[in] node_name Define node name.
///
/// \return The vector of MindSpore Lite MSTensor.
virtual std::vector<tensor::MSTensor *> GetOutputsByNodeName(const std::string &node_name) const = 0;
```
2. Use the `GetOutputMapByNode` method to directly obtain the mapping between the names of all model output nodes and the model output `MSTensor` connected to the nodes.
```cpp
/// \brief Get output MindSpore Lite MSTensors of model mapped by node name.
///
/// \return The map of output node name and MindSpore Lite MSTensor.
virtual std::unordered_map<std::string, std::vector<mindspore::tensor::MSTensor *>> GetOutputMapByNode() const = 0;
```
3. Use the `GetOutputByTensorName` method to obtain the model output `MSTensor` based on the tensor name.
```cpp
/// \brief Get output MindSpore Lite MSTensors of model by tensor name.
///
/// \param[in] tensor_name Define tensor name.
///
/// \return Pointer of MindSpore Lite MSTensor.
virtual mindspore::tensor::MSTensor *GetOutputByTensorName(const std::string &tensor_name) const = 0;
```
4. Use the `GetOutputMapByTensor` method to directly obtain the mapping between the names of all model output tensors and the model output `MSTensor`.
```cpp
/// \brief Get output MindSpore Lite MSTensors of model mapped by tensor name.
///
/// \return The map of output tensor name and MindSpore Lite MSTensor.
virtual std::unordered_map<std::string, mindspore::tensor::MSTensor *> GetOutputMapByTensor() const = 0;
```
After model output tensors are obtained, you need to enter data into the tensors. Use the `Size` method of `MSTensor` to obtain the size of the data to be entered into tensors, use the `data_type` method to obtain the data type of `MSTensor`, and use the `MutableData` method of `MSTensor` to obtain the writable pointer.
```cpp
/// \brief Get byte size of data in MSTensor.
///
/// \return Byte size of data in MSTensor.
virtual size_t Size() const = 0;
/// \brief Get data type of the MindSpore Lite MSTensor.
///
/// \note TypeId is defined in mindspore/mindspore/core/ir/dtype/type_id.h. Only number types in TypeId enum are
/// suitable for MSTensor.
///
/// \return MindSpore Lite TypeId of the MindSpore Lite MSTensor.
virtual TypeId data_type() const = 0;
/// \brief Get the pointer of data in MSTensor.
///
/// \note The data pointer can be used to both write and read data in MSTensor.
///
/// \return The pointer points to data in MSTensor.
virtual void *MutableData() const = 0;
```
### Example
The following sample code shows how to obtain the output `MSTensor` from `LiteSession` using the `GetOutputMapByNode` method and print the first ten data or all data records of each output `MSTensor`.
......
......@@ -56,6 +56,16 @@ Runtime总体使用流程如下图所示:
模型通过Model类的静态`Import`方法从内存数据中创建。函数返回的`Model`实例是一个指针,通过`new`创建,不再需要时,需要用户通过`delete`释放。
```cpp
/// \brief Static method to create a Model pointer.
///
/// \param[in] model_buf Define the buffer read from a model file.
/// \param[in] size Define bytes number of model buffer.
///
/// \return Pointer of MindSpore Lite Model.
static Model *Import(const char *model_buf, size_t size);
```
## 创建会话
使用MindSpore Lite执行推理时,Session是推理的主入口,通过Session我们可以进行图编译、图执行。
......@@ -66,16 +76,66 @@ Runtime总体使用流程如下图所示:
MindSpore Lite支持异构推理,推理时的主选后端由`Context`中的`device_ctx_`指定,默认为CPU。在进行图编译时,会根据主选后端进行算子选型调度。
```cpp
/// \brief DeviceType defined for holding user's preferred backend.
typedef enum {
DT_CPU, /**< CPU device type */
DT_GPU, /**< GPU device type */
DT_NPU /**< NPU device type, not supported yet */
} DeviceType;
/// \brief DeviceContext defined for holding DeviceType.
typedef struct {
DeviceType type; /**< device type */
} DeviceContext;
DeviceContext device_ctx_{DT_CPU};
```
MindSpore Lite内置一个进程共享的线程池,推理时通过`thread_num_`指定线程池的最大线程数,默认为2线程,推荐最多不超过4个线程,否则可能会影响性能。
```cpp
int thread_num_ = 2; /**< thread number config for thread pool */
```
MindSpore Lite支持动态内存分配和释放,如果没有指定`allocator`,推理时会生成一个默认的`allocator`,也可以通过`Context`方法在多个`Context`中共享内存分配器。
如果用户通过`new`创建`Context`,不再需要时,需要用户通过`delete`释放。一般在创建完Session后,Context即可释放。
```cpp
/// \brief Allocator defined a memory pool for malloc memory and free memory dynamically.
///
/// \note List public class and interface for reference.
class Allocator;
/// \brief Context defined for holding environment variables during runtime.
class MS_API Context {
public:
/// \brief Constructor of MindSpore Lite Context using input value for parameters.
///
/// \param[in] thread_num Define the work thread number during the runtime.
/// \param[in] allocator Define the allocator for malloc.
/// \param[in] device_ctx Define device information during the runtime.
Context(int thread_num, std::shared_ptr<Allocator> allocator, DeviceContext device_ctx);
public:
std::shared_ptr<Allocator> allocator = nullptr;
}
```
### 创建会话
用上一步创建得到的`Context`,调用LiteSession的静态`CreateSession`方法来创建`LiteSession`。函数返回的`LiteSession`实例是一个指针,通过`new`创建,不再需要时,需要用户通过`delete`释放。
```cpp
/// \brief Static method to create a LiteSession pointer.
///
/// \param[in] context Define the context of session to be created.
///
/// \return Pointer of MindSpore Lite LiteSession.
static LiteSession *CreateSession(lite::Context *context);
```
### 使用示例
下面示例代码演示了`Context`的创建,以及在两个`LiteSession`间共享内存池的功能:
......@@ -118,6 +178,20 @@ if (session == nullptr) {
使用MindSpore Lite进行推理时,在已完成会话创建与图编译之后,如果需要对输入的shape进行Resize,则可以通过对输入的tensor重新设置shape,然后调用session的Resize()接口。
```cpp
/// \brief Get input MindSpore Lite MSTensors of model.
///
/// \return The vector of MindSpore Lite MSTensor.
virtual std::vector<tensor::MSTensor *> GetInputs() const = 0;
/// \brief Resize inputs shape.
///
/// \param[in] inputs Define the new inputs shape.
///
/// \return STATUS as an error code of resize inputs, STATUS is defined in errorcode.h.
virtual int Resize(const std::vector<tensor::MSTensor *> &inputs) = 0;
```
### 使用示例
下面代码演示如何对MindSpore Lite的输入进行Resize:
......@@ -134,6 +208,17 @@ session->Resize(inputs);
在图执行前,需要调用`LiteSession``CompileGraph`接口进行图编译,进一步解析从文件中加载的Model实例,主要进行子图切分、算子选型调度。这部分会耗费较多时间,所以建议`LiteSession`创建一次,编译一次,多次执行。
```cpp
/// \brief Compile MindSpore Lite model.
///
/// \note CompileGraph should be called before RunGraph.
///
/// \param[in] model Define the model to be compiled.
///
/// \return STATUS as an error code of compiling graph, STATUS is defined in errorcode.h.
virtual int CompileGraph(lite::Model *model) = 0;
```
### 使用示例
下面代码演示如何进行图编译:
......@@ -159,12 +244,43 @@ if (ret != RET_OK) {
MindSpore Lite提供两种方法来获取模型的输入Tensor。
1. 使用`GetInputsByName`方法,根据模型输入节点的名称来获取模型输入Tensor中连接到该节点的Tensor的vector。
```cpp
/// \brief Get input MindSpore Lite MSTensors of model by node name.
///
/// \param[in] node_name Define node name.
///
/// \return The vector of MindSpore Lite MSTensor.
virtual std::vector<tensor::MSTensor *> GetInputsByName(const std::string &node_name) const = 0;
```
2. 使用`GetInputs`方法,直接获取所有的模型输入Tensor的vector。
```cpp
/// \brief Get input MindSpore Lite MSTensors of model.
///
/// \return The vector of MindSpore Lite MSTensor.
virtual std::vector<tensor::MSTensor *> GetInputs() const = 0;
```
### 数据拷贝
当获取到模型的输入,就需要向Tensor中填入数据。通过`MSTensor``Size`方法来获取Tensor应该填入的数据大小,通过`data_type`方法来获取Tensor的数据类型,通过`MSTensor``MutableData`方法来获取可写的指针。
```cpp
/// \brief Get byte size of data in MSTensor.
///
/// \return Byte size of data in MSTensor.
virtual size_t Size() const = 0;
/// \brief Get the pointer of data in MSTensor.
///
/// \note The data pointer can be used to both write and read data in MSTensor.
///
/// \return The pointer points to data in MSTensor.
virtual void *MutableData() const = 0;
```
### 使用示例
下面示例代码演示了从`LiteSession`中获取整图输入`MSTensor`,并且向其中灌入模型输入数据的过程:
......@@ -204,11 +320,30 @@ memcpy(in_data, input_buf, data_size);
MindSpore Lite会话在进行图编译以后,即可使用`LiteSession``RunGraph`进行模型推理。
```cpp
/// \brief Run session with callback.
///
/// \param[in] before Define a call_back_function to be called before running each node.
/// \param[in] after Define a call_back_function to be called after running each node.
///
/// \note RunGraph should be called after CompileGraph.
///
/// \return STATUS as an error code of running graph, STATUS is defined in errorcode.h.
virtual int RunGraph(const KernelCallBack &before = nullptr, const KernelCallBack &after = nullptr) = 0;
```
### 绑核
MindSpore Lite内置线程池支持绑核、解绑操作,通过调用`BindThread`接口,可以将线程池中的工作线程绑定到指定CPU核,用于性能分析。绑核操作与创建`LiteSession`时用户指定的上下文有关,绑核操作会根据上下文中的绑核策略进行线程与CPU的亲和性设置。
需要注意的是,绑核是一个亲和性操作,不保证一定能绑定到指定的CPU核,会受到系统调度的影响。而且绑核后,需要在执行完代码后进行解绑操作,示例如下:
```cpp
/// \brief Attempt to bind or unbind threads in the thread pool to or from the specified cpu core.
///
/// \param[in] if_bind Define whether to bind or unbind threads.
virtual void BindThread(bool if_bind) = 0;
```
需要注意的是,绑核是一个亲和性操作,不保证一定能绑定到指定的CPU核,会受到系统调度的影响。而且绑核后,需要在执行完代码后进行解绑操作。示例如下:
```cpp
// Assume we have created a LiteSession instance named session.
......@@ -234,6 +369,17 @@ Mindspore Lite可以在调用`RunGraph`时,传入两个`KernelCallBack`函数
- 推理当前节点前的输入输出Tensor
- 推理当前节点后的输入输出Tensor
```cpp
/// \brief callbackParam defines input arguments for callback function.
struct CallBackParam {
std::string name_callback_param; /**< node name argument */
std::string type_callback_param; /**< node type argument */
};
/// \brief Kernelcallback defines the function pointer for callback.
using KernelCallBack = std::function<bool(std::vector<tensor::MSTensor *> inputs, std::vector<tensor::MSTensor *> outputs, const CallBackParam &opInfo)>;
```
### 使用示例
下面示例代码演示了使用`LiteSession`进行图编译,并定义了两个回调函数作为前置回调指针和后置回调指针,传入到`RunGraph`接口进行回调推理,并演示了一次图编译,多次图执行的使用场景:
......@@ -301,12 +447,69 @@ MindSpore Lite在执行完推理后,就可以获取模型的推理结果。
MindSpore Lite提供四种方法来获取模型的输出`MSTensor`
1. 使用`GetOutputsByNodeName`方法,根据模型输出节点的名称来获取模型输出`MSTensor`中连接到该节点的Tensor的vector。
```cpp
/// \brief Get output MindSpore Lite MSTensors of model by node name.
///
/// \param[in] node_name Define node name.
///
/// \return The vector of MindSpore Lite MSTensor.
virtual std::vector<tensor::MSTensor *> GetOutputsByNodeName(const std::string &node_name) const = 0;
```
2. 使用`GetOutputMapByNode`方法,直接获取所有的模型输出节点的名称和连接到该节点的模型输出`MSTensor`的一个map。
```cpp
/// \brief Get output MindSpore Lite MSTensors of model mapped by node name.
///
/// \return The map of output node name and MindSpore Lite MSTensor.
virtual std::unordered_map<std::string, std::vector<mindspore::tensor::MSTensor *>> GetOutputMapByNode() const = 0;
```
3. 使用`GetOutputByTensorName`方法,根据模型输出Tensor的名称来获取对应的模型输出`MSTensor`
```cpp
/// \brief Get output MindSpore Lite MSTensors of model by tensor name.
///
/// \param[in] tensor_name Define tensor name.
///
/// \return Pointer of MindSpore Lite MSTensor.
virtual mindspore::tensor::MSTensor *GetOutputByTensorName(const std::string &tensor_name) const = 0;
```
4. 使用`GetOutputMapByTensor`方法,直接获取所有的模型输出`MSTensor`的名称和`MSTensor`指针的一个map。
```cpp
/// \brief Get output MindSpore Lite MSTensors of model mapped by tensor name.
///
/// \return The map of output tensor name and MindSpore Lite MSTensor.
virtual std::unordered_map<std::string, mindspore::tensor::MSTensor *> GetOutputMapByTensor() const = 0;
```
当获取到模型的输出Tensor,就需要向Tensor中填入数据。通过`MSTensor``Size`方法来获取Tensor应该填入的数据大小,通过`data_type`方法来获取`MSTensor`的数据类型,通过`MSTensor``MutableData`方法来获取可读写的内存指针。
```c++
/// \brief Get byte size of data in MSTensor.
///
/// \return Byte size of data in MSTensor.
virtual size_t Size() const = 0;
/// \brief Get data type of the MindSpore Lite MSTensor.
///
/// \note TypeId is defined in mindspore/mindspore/core/ir/dtype/type_id.h. Only number types in TypeId enum are
/// suitable for MSTensor.
///
/// \return MindSpore Lite TypeId of the MindSpore Lite MSTensor.
virtual TypeId data_type() const = 0;
/// \brief Get the pointer of data in MSTensor.
///
/// \note The data pointer can be used to both write and read data in MSTensor.
///
/// \return The pointer points to data in MSTensor.
virtual void *MutableData() const = 0;
```
### 使用示例
下面示例代码演示了使用`GetOutputMapByNode`接口获取输出`MSTensor`,并打印了每个输出`MSTensor`的前十个数据或所有数据:
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册