From 2cb84effe24f374d3166e7c7ddb7e6956b1d1231 Mon Sep 17 00:00:00 2001 From: Ting Wang Date: Fri, 4 Sep 2020 18:43:37 +0800 Subject: [PATCH] update index of docs Signed-off-by: Ting Wang --- lite/docs/source_en/api/class_list.md | 12 - lite/docs/source_en/api/context.md | 117 --- lite/docs/source_en/api/vision.md | 11 - lite/docs/source_en/apicc/apicc.rst | 11 + lite/docs/source_en/apicc/class_list.md | 14 + .../{api => apicc}/errorcode_and_metatype.md | 6 +- .../source_en/{api/model.md => apicc/lite.md} | 390 ++++++--- .../{api/lite_session.md => apicc/session.md} | 380 ++++----- .../{api/ms_tensor.md => apicc/tensor.md} | 0 lite/docs/source_en/index.rst | 1 + ...md => mobilenetv2_incremental_learning.md} | 792 +++++++++--------- tutorials/source_zh_cn/index.rst | 1 + 12 files changed, 876 insertions(+), 859 deletions(-) delete mode 100644 lite/docs/source_en/api/class_list.md delete mode 100644 lite/docs/source_en/api/context.md delete mode 100644 lite/docs/source_en/api/vision.md create mode 100644 lite/docs/source_en/apicc/apicc.rst create mode 100644 lite/docs/source_en/apicc/class_list.md rename lite/docs/source_en/{api => apicc}/errorcode_and_metatype.md (97%) rename lite/docs/source_en/{api/model.md => apicc/lite.md} (52%) rename lite/docs/source_en/{api/lite_session.md => apicc/session.md} (94%) rename lite/docs/source_en/{api/ms_tensor.md => apicc/tensor.md} (100%) rename tutorials/source_zh_cn/advanced_use/{mobilenetv2_incremental_learn.md => mobilenetv2_incremental_learning.md} (98%) diff --git a/lite/docs/source_en/api/class_list.md b/lite/docs/source_en/api/class_list.md deleted file mode 100644 index da4efb18..00000000 --- a/lite/docs/source_en/api/class_list.md +++ /dev/null @@ -1,12 +0,0 @@ -Here is a list of all namespace members with links to the namespace documentation for each member: - -| Namespace | Class Name | Description | -| --- | --- | --- | -| mindspore::lite | 
[Allocator](https://www.mindspore.cn/lite/docs/en/master/api/context.html#allocator) | Allocator defines a memory pool for dynamic memory malloc and memory free. | -| mindspore::lite | [Context](https://www.mindspore.cn/lite/docs/en/master/api/context.html#context) | Context defines for holding environment variables during runtime. | -| mindspore::lite | [ModelImpl](https://www.mindspore.cn/lite/docs/en/master/api/model.html#modelimpl) | ModelImpl defines the implement class of Model in MindSpore Lite. | -| mindspore::lite | [PrimitiveC](https://www.mindspore.cn/lite/docs/en/master/api/model.html#primitivec) | Primitive defines as prototype of operator. | -| mindspore::lite | [Model](https://www.mindspore.cn/lite/docs/en/master/api/model.html#model) | Model defines model in MindSpore Lite for managing graph. | -| mindspore::lite | [ModelBuilder](https://www.mindspore.cn/lite/docs/en/master/api/model.html#modelbuilder) | ModelBuilder is defined by MindSpore Lite. | -| mindspore::session | [LiteSession](https://www.mindspore.cn/lite/docs/en/master/api/lite_session.html#litesession) | LiteSession defines session in MindSpore Lite for compiling Model and forwarding model. | -| mindspore::tensor | [MSTensor](https://www.mindspore.cn/lite/docs/en/master/api/ms_tensor.html#mstensor) | MSTensor defines tensor in MindSpore Lite. | \ No newline at end of file diff --git a/lite/docs/source_en/api/context.md b/lite/docs/source_en/api/context.md deleted file mode 100644 index 0e2d0e14..00000000 --- a/lite/docs/source_en/api/context.md +++ /dev/null @@ -1,117 +0,0 @@ -# mindspore::lite - -## Allocator - -Allocator defines a memory pool for dynamic memory malloc and memory free. - -## Context - -Context is defined for holding environment variables during runtime. - -**Constructors & Destructors** - -``` -Context() -``` - -Constructor of MindSpore Lite Context using default value for parameters. 
- -``` -Context(int thread_num, std::shared_ptr< Allocator > allocator, DeviceContext device_ctx) -``` -Constructor of MindSpore Lite Context using input value for parameters. - -- Parameters - - - `thread_num`: Define the work thread number during the runtime. - - - `allocator`: Define the allocator for malloc. - - - `device_ctx`: Define device information during the runtime. - -- Returns - - The instance of MindSpore Lite Context. - -``` - ~Context() -``` -Destructor of MindSpore Lite Context. - -**Public Attributes** - -``` -float16_priority -``` -A **bool** value. Defaults to **false**. Prior enable float16 inference. - -``` -device_ctx_{DT_CPU} -``` -A **DeviceContext** struct. - -``` -thread_num_ -``` - -An **int** value. Defaults to **2**. Thread number config for thread pool. - -``` -allocator -``` - -A **std::shared_ptr** pointer. - -``` -cpu_bind_mode_ -``` - -A **CpuBindMode** enum variable. Defaults to **MID_CPU**. - -## CpuBindMode -An **enum** type. CpuBindMode defined for holding bind cpu strategy argument. - -**Attributes** -``` -MID_CPU = -1 -``` -Bind middle cpu first. - -``` -HIGHER_CPU = 1 -``` -Bind higher cpu first. - -``` -NO_BIND = 0 -``` -No bind. - -## DeviceType -An **enum** type. DeviceType defined for holding user's preferred backend. - -**Attributes** -``` -DT_CPU = -1 -``` -CPU device type. - -``` -DT_GPU = 1 -``` -GPU device type. - -``` -DT_NPU = 0 -``` -NPU device type, not supported yet. - -## DeviceContext - -A **struct** . DeviceContext defined for holding DeviceType. - -**Attributes** -``` -type -``` -A **DeviceType** variable. The device type. \ No newline at end of file diff --git a/lite/docs/source_en/api/vision.md b/lite/docs/source_en/api/vision.md deleted file mode 100644 index e9ba66be..00000000 --- a/lite/docs/source_en/api/vision.md +++ /dev/null @@ -1,11 +0,0 @@ -# mindspore::lite - -**Functions** -``` -std::string Version() -``` -Global method to get a version string. 
-
-- Returns
-
-  The version string of MindSpore Lite.
diff --git a/lite/docs/source_en/apicc/apicc.rst b/lite/docs/source_en/apicc/apicc.rst
new file mode 100644
index 00000000..f0f55504
--- /dev/null
+++ b/lite/docs/source_en/apicc/apicc.rst
@@ -0,0 +1,11 @@
+C++ API
+=======
+
+.. toctree::
+   :maxdepth: 1
+
+   class_list
+   lite
+   session
+   tensor
+   errorcode_and_metatype
\ No newline at end of file
diff --git a/lite/docs/source_en/apicc/class_list.md b/lite/docs/source_en/apicc/class_list.md
new file mode 100644
index 00000000..b2646cfa
--- /dev/null
+++ b/lite/docs/source_en/apicc/class_list.md
@@ -0,0 +1,14 @@
+# Class List
+
+Here is a list of all classes, with links to the namespace documentation for each member:
+
+| Namespace | Class Name | Description |
+| --- | --- | --- |
+| mindspore::lite | [Allocator](https://www.mindspore.cn/lite/docs/en/master/apicc/lite.html#allocator) | Allocator defines a memory pool for dynamic memory allocation and freeing. |
+| mindspore::lite | [Context](https://www.mindspore.cn/lite/docs/en/master/apicc/lite.html#context) | Context is defined for holding environment variables during runtime. |
+| mindspore::lite | [ModelImpl](https://www.mindspore.cn/lite/docs/en/master/apicc/lite.html#modelimpl) | ModelImpl defines the implementation class of Model in MindSpore Lite. |
+| mindspore::lite | [PrimitiveC](https://www.mindspore.cn/lite/docs/en/master/apicc/lite.html#primitivec) | PrimitiveC is defined as the prototype of an operator. |
+| mindspore::lite | [Model](https://www.mindspore.cn/lite/docs/en/master/apicc/lite.html#model) | Model defines the model in MindSpore Lite for managing the computation graph. |
+| mindspore::lite | [ModelBuilder](https://www.mindspore.cn/lite/docs/en/master/apicc/lite.html#modelbuilder) | ModelBuilder is defined by MindSpore Lite for building models. |
+| mindspore::session | [LiteSession](https://www.mindspore.cn/lite/docs/en/master/apicc/session.html#litesession) | LiteSession defines a session in MindSpore Lite for compiling and running a Model.
| +| mindspore::tensor | [MSTensor](https://www.mindspore.cn/lite/docs/en/master/apicc/tensor.html#mstensor) | MSTensor defines tensor in MindSpore Lite. | \ No newline at end of file diff --git a/lite/docs/source_en/api/errorcode_and_metatype.md b/lite/docs/source_en/apicc/errorcode_and_metatype.md similarity index 97% rename from lite/docs/source_en/api/errorcode_and_metatype.md rename to lite/docs/source_en/apicc/errorcode_and_metatype.md index 59081a20..96c3294e 100644 --- a/lite/docs/source_en/api/errorcode_and_metatype.md +++ b/lite/docs/source_en/apicc/errorcode_and_metatype.md @@ -1,6 +1,8 @@ +# ErrorCode and MetaType + Description of error code and meta type supported in MindSpore Lite. -# ErrorCode +## ErrorCode | Definition | Value | Description | | --- | --- | --- | @@ -23,7 +25,7 @@ Description of error code and meta type supported in MindSpore Lite. | RET_INFER_ERR | -501 | Failed to infer shape. | | RET_INFER_INVALID | -502 | Invalid infer shape before runtime. | -# MetaType +## MetaType An **enum** type. | Type Name | Definition | Value | Description | diff --git a/lite/docs/source_en/api/model.md b/lite/docs/source_en/apicc/lite.md similarity index 52% rename from lite/docs/source_en/api/model.md rename to lite/docs/source_en/apicc/lite.md index abc00be2..5be6eed3 100644 --- a/lite/docs/source_en/api/model.md +++ b/lite/docs/source_en/apicc/lite.md @@ -1,131 +1,259 @@ -# mindspore::lite -## ModelImpl -ModelImpl defines the implement class of Model in MindSpore Lite. - -## PrimitiveC -Primitive is defined as prototype of operator. - -## Model -Model defines model in MindSpore Lite for managing graph. - -**Constructors & Destructors** -``` -Model() -``` - -Constructor of MindSpore Lite Model using default value for parameters. - -``` -virtual ~Model() -``` - -Destructor of MindSpore Lite Model. - -**Public Member Functions** -``` -PrimitiveC* GetOp(const std::string &name) const -``` -Get MindSpore Lite Primitive by name. 
- -- Parameters - - - `name`: Define name of primitive to be returned. - -- Returns - - The pointer of MindSpore Lite Primitive. - -``` -const schema::MetaGraph* GetMetaGraph() const -``` -Get graph defined in flatbuffers. - -- Returns - - The pointer of graph defined in flatbuffers. - -``` -void FreeMetaGraph() -``` -Free MetaGraph in MindSpore Lite Model. - -**Static Public Member Functions** -``` -static Model *Import(const char *model_buf, size_t size) -``` -Static method to create a Model pointer. - -- Parameters - - - `model_buf`: Define the buffer read from a model file. - - - `size`: variable. Define bytes number of model buffer. - -- Returns - - Pointer of MindSpore Lite Model. - -**Public Attributes** -``` - model_impl_ -``` -The **pointer** of implement of model in MindSpore Lite. Defaults to **nullptr**. - -## ModelBuilder -ModelBuilder is defined by MindSpore Lite. - -**Constructors & Destructors** -``` -ModelBuilder() -``` - -Constructor of MindSpore Lite ModelBuilder using default value for parameters. - -``` -virtual ~ModelBuilder() -``` - -Destructor of MindSpore Lite ModelBuilder. - -**Public Member Functions** -``` -virtual std::string AddOp(const PrimitiveC &op, const std::vector &inputs) -``` - -Add primitive into model builder for model building. - -- Parameters - - - `op`: Define the primitive to be added. - - - `inputs`: Define input edge of primitive to be added. - -- Returns - - ID of the added primitive. - -``` -const schema::MetaGraph* GetMetaGraph() const -``` -Get graph defined in flatbuffers. - -- Returns - - The pointer of graph defined in flatbuffers. - -``` -virtual Model *Construct() -``` -Finish constructing the model. - -## OutEdge -**Attributes** -``` -nodeId -``` -A **string** variable. ID of a node linked by this edge. - -``` -outEdgeIndex -``` -A **size_t** variable. Index of this edge. 
\ No newline at end of file +# mindspore::lite context + +## Allocator + +Allocator defines a memory pool for dynamic memory malloc and memory free. + +## Context + +Context is defined for holding environment variables during runtime. + +**Constructors & Destructors** + +``` +Context() +``` + +Constructor of MindSpore Lite Context using default value for parameters. + +``` +Context(int thread_num, std::shared_ptr< Allocator > allocator, DeviceContext device_ctx) +``` +Constructor of MindSpore Lite Context using input value for parameters. + +- Parameters + + - `thread_num`: Define the work thread number during the runtime. + + - `allocator`: Define the allocator for malloc. + + - `device_ctx`: Define device information during the runtime. + +- Returns + + The instance of MindSpore Lite Context. + +``` + ~Context() +``` +Destructor of MindSpore Lite Context. + +**Public Attributes** + +``` +float16_priority +``` +A **bool** value. Defaults to **false**. Prior enable float16 inference. + +``` +device_ctx_{DT_CPU} +``` +A **DeviceContext** struct. + +``` +thread_num_ +``` + +An **int** value. Defaults to **2**. Thread number config for thread pool. + +``` +allocator +``` + +A **std::shared_ptr** pointer. + +``` +cpu_bind_mode_ +``` + +A **CpuBindMode** enum variable. Defaults to **MID_CPU**. + +## ModelImpl +ModelImpl defines the implement class of Model in MindSpore Lite. + +## PrimitiveC +Primitive is defined as prototype of operator. + +## Model +Model defines model in MindSpore Lite for managing graph. + +**Constructors & Destructors** +``` +Model() +``` + +Constructor of MindSpore Lite Model using default value for parameters. + +``` +virtual ~Model() +``` + +Destructor of MindSpore Lite Model. + +**Public Member Functions** +``` +PrimitiveC* GetOp(const std::string &name) const +``` +Get MindSpore Lite Primitive by name. + +- Parameters + + - `name`: Define name of primitive to be returned. + +- Returns + + The pointer of MindSpore Lite Primitive. 
+ +``` +const schema::MetaGraph* GetMetaGraph() const +``` +Get graph defined in flatbuffers. + +- Returns + + The pointer of graph defined in flatbuffers. + +``` +void FreeMetaGraph() +``` +Free MetaGraph in MindSpore Lite Model. + +**Static Public Member Functions** +``` +static Model *Import(const char *model_buf, size_t size) +``` +Static method to create a Model pointer. + +- Parameters + + - `model_buf`: Define the buffer read from a model file. + + - `size`: variable. Define bytes number of model buffer. + +- Returns + + Pointer of MindSpore Lite Model. + +**Public Attributes** +``` + model_impl_ +``` +The **pointer** of implement of model in MindSpore Lite. Defaults to **nullptr**. + +## ModelBuilder +ModelBuilder is defined by MindSpore Lite. + +**Constructors & Destructors** +``` +ModelBuilder() +``` + +Constructor of MindSpore Lite ModelBuilder using default value for parameters. + +``` +virtual ~ModelBuilder() +``` + +Destructor of MindSpore Lite ModelBuilder. + +**Public Member Functions** +``` +virtual std::string AddOp(const PrimitiveC &op, const std::vector &inputs) +``` + +Add primitive into model builder for model building. + +- Parameters + + - `op`: Define the primitive to be added. + + - `inputs`: Define input edge of primitive to be added. + +- Returns + + ID of the added primitive. + +``` +const schema::MetaGraph* GetMetaGraph() const +``` +Get graph defined in flatbuffers. + +- Returns + + The pointer of graph defined in flatbuffers. + +``` +virtual Model *Construct() +``` +Finish constructing the model. + +## OutEdge +**Attributes** +``` +nodeId +``` +A **string** variable. ID of a node linked by this edge. + +``` +outEdgeIndex +``` +A **size_t** variable. Index of this edge. + +## CpuBindMode +An **enum** type. CpuBindMode defined for holding bind cpu strategy argument. + +**Attributes** +``` +MID_CPU = -1 +``` +Bind middle cpu first. + +``` +HIGHER_CPU = 1 +``` +Bind higher cpu first. + +``` +NO_BIND = 0 +``` +No bind. 
+ +## DeviceType +An **enum** type. DeviceType defined for holding user's preferred backend. + +**Attributes** +``` +DT_CPU = -1 +``` +CPU device type. + +``` +DT_GPU = 1 +``` +GPU device type. + +``` +DT_NPU = 0 +``` +NPU device type, not supported yet. + +## DeviceContext + +A **struct** . DeviceContext defined for holding DeviceType. + +**Attributes** +``` +type +``` +A **DeviceType** variable. The device type. + +## Version + +``` +std::string Version() +``` +Global method to get a version string. + +- Returns + + The version string of MindSpore Lite. \ No newline at end of file diff --git a/lite/docs/source_en/api/lite_session.md b/lite/docs/source_en/apicc/session.md similarity index 94% rename from lite/docs/source_en/api/lite_session.md rename to lite/docs/source_en/apicc/session.md index 321e2b5b..c6a22964 100644 --- a/lite/docs/source_en/api/lite_session.md +++ b/lite/docs/source_en/apicc/session.md @@ -1,191 +1,191 @@ -# mindspore::session - -## LiteSession - -LiteSession defines session in MindSpore Lite for compiling Model and forwarding model. - -**Constructors & Destructors** -``` -LiteSession() -``` -Constructor of MindSpore Lite LiteSession using default value for parameters. - -``` -~LiteSession() -``` -Destructor of MindSpore Lite LiteSession. - -**Public Member Functions** -``` -virtual void BindThread(bool if_bind) -``` -Attempt to bind or unbind threads in the thread pool to or from the specified cpu core. - -- Parameters - - - `if_bind`: Define whether to bind or unbind threads. - -``` -virtual int CompileGraph(lite::Model *model) -``` -Compile MindSpore Lite model. - -> Note: CompileGraph should be called before RunGraph. - -- Parameters - - - `model`: Define the model to be compiled. - -- Returns - - STATUS as an error code of compiling graph, STATUS is defined in errorcode.h. - -``` -virtual std::vector GetInputs() const -``` -Get input MindSpore Lite MSTensors of model. - -- Returns - - The vector of MindSpore Lite MSTensor. 
- -``` -std::vector GetInputsByName(const std::string &node_name) const -``` -Get input MindSpore Lite MSTensors of model by node name. - -- Parameters - - - `node_name`: Define node name. - -- Returns - - The vector of MindSpore Lite MSTensor. - -``` -virtual int RunGraph(const KernelCallBack &before = nullptr, const KernelCallBack &after = nullptr) -``` -Run session with callback. - -> Note: RunGraph should be called after CompileGraph. - -- Parameters - - - `before`: Define a call_back_function to be called before running each node. - - - `after`: Define a call_back_function called after running each node. - -- Returns - - STATUS as an error code of running graph, STATUS is defined in errorcode.h. - -``` -virtual std::unordered_map> GetOutputMapByNode() const -``` -Get output MindSpore Lite MSTensors of model mapped by node name. - -- Returns - - The map of output node name and MindSpore Lite MSTensor. - -``` -virtual std::vector GetOutputsByNodeName(const std::string &node_name) const -``` -Get output MindSpore Lite MSTensors of model by node name. - -- Parameters - - - `node_name`: Define node name. - -- Returns - - The vector of MindSpore Lite MSTensor. - -``` -virtual std::unordered_map GetOutputMapByTensor() const -``` -Get output MindSpore Lite MSTensors of model mapped by tensor name. - -- Returns - - The map of output tensor name and MindSpore Lite MSTensor. - -``` -virtual std::vector GetOutputTensorNames() const -``` -Get name of output tensors of model compiled by this session. - -- Returns - - The vector of string as output tensor names in order. - -``` -virtual mindspore::tensor::MSTensor *GetOutputByTensorName(const std::string &tensor_name) const -``` -Get output MindSpore Lite MSTensors of model by tensor name. - -- Parameters - - - `tensor_name`: Define tensor name. - -- Returns - - Pointer of MindSpore Lite MSTensor. 
- -``` -virtual mindspore::tensor::MSTensor *GetOutputByTensorName(const std::string &tensor_name) const -``` -Get output MindSpore Lite MSTensors of model by tensor name. - -- Parameters - - - `tensor_name`: Define tensor name. - -- Returns - - Pointer of MindSpore Lite MSTensor. - -``` -virtual int Resize(const std::vector &inputs) - -``` -Resize inputs shape. - -- Parameters - - - `inputs`: Define the new inputs shape. - -- Returns - - STATUS as an error code of resize inputs, STATUS is defined in errorcode.h. - -**Static Public Member Functions** - -``` -static LiteSession *CreateSession(lite::Context *context) -``` -Static method to create a LiteSession pointer. - -- Parameters - - - `context`: Define the context of session to be created. - -- Returns - - Pointer of MindSpore Lite LiteSession. - - -## CallBackParam - -CallBackParam defines input arguments for callBack function. - -**Attributes** -``` -name_callback_param -``` -A **string** variable. Node name argument. - -``` -type_callback_param -``` +# mindspore::session + +## LiteSession + +LiteSession defines session in MindSpore Lite for compiling Model and forwarding model. + +**Constructors & Destructors** +``` +LiteSession() +``` +Constructor of MindSpore Lite LiteSession using default value for parameters. + +``` +~LiteSession() +``` +Destructor of MindSpore Lite LiteSession. + +**Public Member Functions** +``` +virtual void BindThread(bool if_bind) +``` +Attempt to bind or unbind threads in the thread pool to or from the specified cpu core. + +- Parameters + + - `if_bind`: Define whether to bind or unbind threads. + +``` +virtual int CompileGraph(lite::Model *model) +``` +Compile MindSpore Lite model. + +> Note: CompileGraph should be called before RunGraph. + +- Parameters + + - `model`: Define the model to be compiled. + +- Returns + + STATUS as an error code of compiling graph, STATUS is defined in errorcode.h. 
+
+```
+virtual std::vector<tensor::MSTensor *> GetInputs() const
+```
+Get input MindSpore Lite MSTensors of the model.
+
+- Returns
+
+  The vector of MindSpore Lite MSTensor.
+
+```
+std::vector<tensor::MSTensor *> GetInputsByName(const std::string &node_name) const
+```
+Get input MindSpore Lite MSTensors of the model by node name.
+
+- Parameters
+
+  - `node_name`: Define node name.
+
+- Returns
+
+  The vector of MindSpore Lite MSTensor.
+
+```
+virtual int RunGraph(const KernelCallBack &before = nullptr, const KernelCallBack &after = nullptr)
+```
+Run the session with callbacks.
+
+> Note: RunGraph should be called after CompileGraph.
+
+- Parameters
+
+  - `before`: Define a callback function to be called before running each node.
+
+  - `after`: Define a callback function to be called after running each node.
+
+- Returns
+
+  STATUS as an error code of running the graph, STATUS is defined in errorcode.h.
+
+```
+virtual std::unordered_map<std::string, std::vector<mindspore::tensor::MSTensor *>> GetOutputMapByNode() const
+```
+Get output MindSpore Lite MSTensors of the model mapped by node name.
+
+- Returns
+
+  The map of output node name and MindSpore Lite MSTensor.
+
+```
+virtual std::vector<tensor::MSTensor *> GetOutputsByNodeName(const std::string &node_name) const
+```
+Get output MindSpore Lite MSTensors of the model by node name.
+
+- Parameters
+
+  - `node_name`: Define node name.
+
+- Returns
+
+  The vector of MindSpore Lite MSTensor.
+
+```
+virtual std::unordered_map<std::string, mindspore::tensor::MSTensor *> GetOutputMapByTensor() const
+```
+Get output MindSpore Lite MSTensors of the model mapped by tensor name.
+
+- Returns
+
+  The map of output tensor name and MindSpore Lite MSTensor.
+
+```
+virtual std::vector<std::string> GetOutputTensorNames() const
+```
+Get the names of the output tensors of the model compiled by this session.
+
+- Returns
+
+  The vector of strings as output tensor names in order.
+
+```
+virtual mindspore::tensor::MSTensor *GetOutputByTensorName(const std::string &tensor_name) const
+```
+Get an output MindSpore Lite MSTensor of the model by tensor name.
+
+- Parameters
+
+  - `tensor_name`: Define tensor name.
+
+- Returns
+
+  Pointer of MindSpore Lite MSTensor.
+
+```
+virtual int Resize(const std::vector<tensor::MSTensor *> &inputs)
+```
+Resize the shape of inputs.
+
+- Parameters
+
+  - `inputs`: Define the inputs with the new shape.
+
+- Returns
+
+  STATUS as an error code of resizing inputs, STATUS is defined in errorcode.h.
+
+**Static Public Member Functions**
+
+```
+static LiteSession *CreateSession(lite::Context *context)
+```
+Static method to create a LiteSession pointer.
+
+- Parameters
+
+  - `context`: Define the context of the session to be created.
+
+- Returns
+
+  Pointer of MindSpore Lite LiteSession.
+
+
+## CallBackParam
+
+CallBackParam defines the input arguments for the callback function.
+
+**Attributes**
+```
+name_callback_param
+```
+A **string** variable. Node name argument.
+
+```
+type_callback_param
+```
+A **string** variable. Node type argument.
\ No newline at end of file
diff --git a/lite/docs/source_en/api/ms_tensor.md b/lite/docs/source_en/apicc/tensor.md
similarity index 100%
rename from lite/docs/source_en/api/ms_tensor.md
rename to lite/docs/source_en/apicc/tensor.md
diff --git a/lite/docs/source_en/index.rst b/lite/docs/source_en/index.rst
index 3c2526cf..abecfe95 100644
--- a/lite/docs/source_en/index.rst
+++ b/lite/docs/source_en/index.rst
@@ -11,5 +11,6 @@ MindSpore Lite Documentation
    :maxdepth: 1
 
    architecture
+   apicc/apicc
    operator_list
    glossary
diff --git a/tutorials/source_zh_cn/advanced_use/mobilenetv2_incremental_learn.md b/tutorials/source_zh_cn/advanced_use/mobilenetv2_incremental_learning.md
similarity index 98%
rename from tutorials/source_zh_cn/advanced_use/mobilenetv2_incremental_learn.md
rename to tutorials/source_zh_cn/advanced_use/mobilenetv2_incremental_learning.md
index f47fd4ad..acebf4c6 100644
--- a/tutorials/source_zh_cn/advanced_use/mobilenetv2_incremental_learn.md
+++ b/tutorials/source_zh_cn/advanced_use/mobilenetv2_incremental_learning.md
@@ -1,396 +1,396 @@
-# MobileNetV2 Incremental Learning
-`CPU` `Ascend` `GPU` `Model Development` `Intermediate` `Advanced`
-
-
-- [Incremental Learning](#mobilenetv2-incremental-learning)
-    - [Overview](#overview)
-    - [Task Description and Preparation](#task-description-and-preparation)
-        - [Configuring the Environment](#configuring-the-environment)
-        - [Downloading the Code](#downloading-the-code)
-        - [Preparing the Pre-trained Model](#preparing-the-pre-trained-model)
-        - [Preparing Data](#preparing-data)
-    - [Pre-trained Model Loading Code Explained](#pre-trained-model-loading-code-explained)
-    - [Parameter Description](#parameter-description)
-        - [Running the Python Files](#running-the-python-files)
-        - [Running the Shell Scripts](#running-the-shell-scripts)
-    - [Starting Incremental Training](#starting-incremental-training)
-        - [Training on CPU](#training-on-cpu)
-        - [Training on GPU](#training-on-gpu)
-        - [Training on Ascend](#training-on-ascend)
-        - [Incremental Training Results](#incremental-training-results)
-    - [Validating the Incrementally Trained Model](#validating-the-incrementally-trained-model)
-        - [Validating the Model](#validating-the-model)
-        - [Validation Results](#validation-results)
-
-
-## Overview
-
-In computer vision tasks, training a network from scratch is extremely time-consuming and requires massive computing power. Pre-trained models are usually trained on large public datasets such as OpenImage, ImageNet, VOC, and COCO, which contain hundreds of thousands or even more than a million images. Most tasks also involve large amounts of data; if a network is trained from scratch without a pre-trained model, a great deal of time and computing power is consumed, and the model easily falls into local minima or overfits. Therefore, most tasks start from a pre-trained model and perform incremental learning on top of it.
-
-MindSpore is a diverse machine learning framework. It can run both on device-side hardware such as mobile phones and PCs and on server clusters in the cloud. Currently MobileNetV2 supports incremental learning with a single-core CPU on Windows, and with one or more Ascend AI processors or GPUs on EulerOS and Ubuntu. This tutorial introduces how to train and validate incremental learning with the MindSpore framework on the different systems and processors.
-
-Currently, only the CPU is supported on Windows, while CPU, GPU, and Ascend AI processors are all supported on Ubuntu and EulerOS.
-
->You can find the complete runnable sample code here: https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/mobilenetv2
-
-## Task Description and Preparation
-
-### Configuring the Environment
-
-To run in a local environment, the MindSpore framework must be installed and a CPU, GPU, or Ascend AI processor configured. To run in the Huawei Cloud environment, there is no need to install the MindSpore framework or to configure an Ascend AI processor, CPU, or GPU, and this section can be skipped.
-
-1. Install the MindSpore framework
-    On systems such as EulerOS, Ubuntu, or Windows, [install the MindSpore version matching the system and processor architecture](https://www.mindspore.cn/install).
-
-2. Configure the CPU environment
-    When using the CPU, set the following in the code before calling the CPU to start training or testing:
-
-    ```Python
-    if config.platform == "CPU":
-        context.set_context(mode=context.GRAPH_MODE, device_target=config.platform, \
-            save_graphs=False)
-    ```
-
-3. Configure the GPU environment
-    When using the GPU, set the following in the code before calling the GPU to start training or testing:
-
-    ```Python
-    elif config.platform == "GPU":
-        context.set_context(mode=context.GRAPH_MODE, device_target=config.platform, \
-            save_graphs=False)
-        init("nccl")
-        context.set_auto_parallel_context(device_num=get_group_size(),
-                                          parallel_mode=ParallelMode.DATA_PARALLEL,
-                                          mirror_mean=True)
-    ```
-
-4. Configure the Ascend environment
-    Taking the Ascend 910 AI processor as an example, a JSON configuration file `hccl_config.json` for an environment with 1 server and 8 processors is shown below. Single- and multi-processor environments can adjust `"server_count"` and `device` based on this example:
-
-    ```json
-    {
-        "version": "1.0",
-        "server_count": "1",
-        "server_list": [
-            {
-                "server_id": "10.155.111.140",
-                "device": [
-                    {"device_id": "0","device_ip": "192.1.27.6","rank_id": "0"},
-                    {"device_id": "1","device_ip": "192.2.27.6","rank_id": "1"},
-                    {"device_id": "2","device_ip": "192.3.27.6","rank_id": "2"},
-                    {"device_id": "3","device_ip": "192.4.27.6","rank_id": "3"},
-                    {"device_id": "4","device_ip": "192.1.27.7","rank_id": "4"},
-                    {"device_id": "5","device_ip": "192.2.27.7","rank_id": "5"},
-                    {"device_id": "6","device_ip": "192.3.27.7","rank_id": "6"},
-                    {"device_id": "7","device_ip": "192.4.27.7","rank_id": "7"}],
-                "host_nic_ip": "reserve"
-            }
-        ],
-        "status": "completed"
-    }
-    ```
-
-    When using Ascend AI processors, set the following in the code before calling the Ascend AI processor to start training or testing:
-
-    ```Python
-    elif config.platform == "Ascend":
-        context.set_context(mode=context.GRAPH_MODE, device_target=config.platform, \
-                            device_id=config.device_id, save_graphs=False)
-        if config.run_distribute:
-            context.set_auto_parallel_context(device_num=config.rank_size,
-                                              parallel_mode=ParallelMode.DATA_PARALLEL,
-                                              parameter_broadcast=True, mirror_mean=True)
-            auto_parallel_context().set_all_reduce_fusion_split_indices([140])
-            init()
-    ...
-    ```
-
-### Downloading the Code
-
-Clone the [MindSpore open source repository](https://gitee.com/mindspore/mindspore.git) from Gitee and enter `./model_zoo/official/cv/mobilenetv2/`.
-
-```bash
-git clone https://gitee.com/mindspore/mindspore.git
-cd ./mindspore/model_zoo/official/cv/mobilenetv2
-```
-
-The code structure is as follows:
-
-```
-├─MobileNetV2
-    ├─README.md     # descriptions about MobileNetV2
-    ├─scripts
-    │   run_train.sh   # Shell script for train with Ascend or GPU
-    │   run_eval.sh    # Shell script for evaluation with Ascend or GPU
-    ├─src
-    │   config.py      # parameter configuration
-    │   dataset.py     # creating dataset
-    │   launch.py      # start Python script
-    │   lr_generator.py     # learning rate config
-    │   mobilenetV2.py      # MobileNetV2 architecture
-    │   models.py      # net utils to load ckpt_file, define_net...
-    │   utils.py       # net utils to switch precision, set_context and so on
-    ├─train.py      # training script
-    └─eval.py       # evaluation script
-```
-
-When running incremental training and evaluation, the Python files `train.py` and `eval.py` can be used on Windows, Ubuntu, and EulerOS; on Ubuntu and EulerOS the Shell scripts `run_train.sh` and `run_eval.sh` can also be used.
-
-When the script `run_train.sh` is used, it runs `launch.py` and passes its arguments to it. Based on the number of allocated CPUs, GPUs, or Ascend AI processors, `launch.py` starts one or more processes to run `train.py`, with one corresponding processor assigned to each process.
-
-### Preparing the Pre-trained Model
-
-[Download the pre-trained model](https://download.mindspore.cn/model_zoo/official/lite/mobilenetv2_openimage_lite/mobilenetV2.ckpt) to the following directory:
-`./pretrain_checkpoint/[pretrain_checkpoint_file]`
-
-```bash
-mkdir pretrain_checkpoint
-wget -P ./pretrain_checkpoint https://download.mindspore.cn/model_zoo/official/lite/mobilenetv2_openimage_lite/mobilenetV2.ckpt
-```
-
-### Preparing Data
-
-Prepare a dataset organized in ImageFolder format. Add the `[dataset_path]` argument when running `run_train.sh`, and the `--dataset_path [dataset_path]` argument when running `train.py`:
-
-The dataset structure is as follows:
-
-```
-└─ImageFolder
-    ├─train
-    │   class1Folder
-    │   class2Folder
-    │   ......
-    └─eval
-        class1Folder
-        class2Folder
-        ......
-```
-
-## Pre-trained Model Loading Code Explained
-
-During incremental learning, the pre-trained model must be loaded. Across different datasets and tasks the distribution of the feature extraction layers (convolutional layers) tends to be similar, but the combination of the feature vectors (fully connected layers) differs, and the number of classes (the output_size of the fully connected layer) is usually different as well. During incremental learning, only the pre-trained feature extraction parameters are loaded, not the pre-trained fully connected parameters; during fine-tuning and initial training, both the pre-trained feature extraction parameters and the fully connected parameters are loaded.
-
-Before training and testing, line 1 of the code first builds the MobileNetV2 backbone and head networks, and then the MobileNetV2 network containing both subnetworks. Lines 4-11 show how, in the `fine_tune` training mode, the pre-trained model is loaded into `net` (MobileNetV2), and how, in the `incremental_learn` mode, it is loaded into the backbone_net subnetwork, whose parameters are then frozen and excluded from training. Lines 22-24 show how to freeze network parameters.
-
-```Python
- 1: backbone_net, head_net, net = define_net(args_opt, config)
- 2: ...
- 3: def define_net(args, config):
- 4:     backbone_net = MobileNetV2Backbone(platform=args.platform)
- 5:     head_net = MobileNetV2Head(input_channel=backbone_net.out_channels, num_classes=config.num_classes)
- 6:     net = mobilenet_v2(backbone_net, head_net)
- 7:     if args.pretrain_ckpt:
- 8:         if args.train_method == "fine_tune":
- 9:             load_ckpt(net, args.pretrain_ckpt)
-10:         elif args.train_method == "incremental_learn":
-11:             load_ckpt(backbone_net, args.pretrain_ckpt, trainable=False)
-12:         elif args.train_method == "train":
-13:             pass
-14:         else:
-15:             raise ValueError("must input the usage of pretrain_ckpt when the pretrain_ckpt isn't None")
-16:     return backbone_net, head_net, net
-17: ...
-18: def load_ckpt(network, pretrain_ckpt_path, trainable=True):
-19:     """load the pretrain checkpoint and with the param trainable or not"""
-20:     param_dict = load_checkpoint(pretrain_ckpt_path)
-21:     load_param_into_net(network, param_dict)
-22:     if not trainable:
-23:         for param in network.get_parameters():
-24:             param.requires_grad = False
-```
-
-## Parameter Description
-
-### Running the Python Files
-When training on Windows or Linux, run `train.py` with the four arguments `dataset_path`, `platform`, `train_method`, and `pretrain_ckpt`. For validation, run `eval.py` with the four arguments `dataset_path`, `platform`, `pretrain_ckpt`, and `head_ckpt`.
-
-```Shell
-# Windows/Linux train with Python file
-python train.py --dataset_path [dataset_path] --platform [platform] --pretrain_ckpt [pretrain_checkpoint_path] --train_method [("train", "fine_tune", "incremental_learn")]
-
-# Windows/Linux eval with Python file
-python eval.py --dataset_path [dataset_path] --platform [platform] --pretrain_ckpt [pretrain_checkpoint_path] --head_ckpt [head_ckpt_path]
-```
-
-- `--dataset_path`: path of the training or validation dataset. There is no default value; it must be provided for training or validation.
-- `--platform`: processor type. Defaults to "Ascend"; can also be set to "CPU" or "GPU".
-- `--train_method`: training method. Must be one of "train", "fine_tune", and "incremental_learn".
-- `--pretrain_ckpt`: for incremental training or fine-tuning, the path of the pretrain_checkpoint file from which the pre-trained model weights are loaded.
-- `--head_ckpt`: when validating an incrementally trained model, the path of the head_net checkpoint from which the pre-trained head weights are loaded.
-
-
-### Running the Shell Scripts
-On Linux, the Shell scripts `./scripts/run_train.sh` and `./scripts/run_eval.sh` can be used instead. The arguments are passed on the command line when invoking them.
-
-```Shell
-# Windows doesn't support Shell
-# Linux train with Shell script
-sh run_train.sh [PLATFORM] [DEVICE_NUM] [VISIABLE_DEVICES(0,1,2,3,4,5,6,7)] [RANK_TABLE_FILE] [DATASET_PATH] [TRAIN_METHOD] [CKPT_PATH]
-
-# Linux eval with Shell script for incremental learn
-sh run_eval.sh [PLATFORM] [DATASET_PATH] [PRETRAIN_CKPT_PATH] [HEAD_CKPT_PATH]
-```
-
-- `[PLATFORM]`: processor type. Defaults to "Ascend"; can be set to "GPU".
-- `[DEVICE_NUM]`: number of processes on each node (one server or PC corresponds to one node). It is recommended to set it to the number of Ascend AI processors or GPUs on the machine.
-- `[VISIABLE_DEVICES(0,1,2,3,4,5,6,7)]`: device IDs in string format. Training binds each process to the device with the corresponding ID; multiple device IDs are separated by ','. It is recommended that the number of IDs equals the number of processes.
-- `[RANK_TABLE_FILE]`: when the platform is Ascend, the path of the Ascend JSON configuration file.
-- `[DATASET_PATH]`: path of the training or validation dataset. There is no default value; it must be provided.
-- `[CKPT_PATH]`: for incremental training or fine-tuning, the path of the checkpoint file from which the pre-trained model weights are loaded.
-- `[TRAIN_METHOD]`: training method. Must be one of `train`, `fine_tune`, and `incremental_learn`.
-- `[PRETRAIN_CKPT_PATH]`: when validating an incrementally learned model, the path of the saved backbone network checkpoint.
-- `[HEAD_CKPT_PATH]`: when validating an incrementally learned model, the path of the saved fully connected layer checkpoint.
-
-## Starting Incremental Training
-
-On Windows, MobileNetV2 incremental training can only be run with `train.py`. On Linux, you can also choose to run `run_train.sh` and pass the [arguments](#parameter-description) when invoking the Shell script.
-
-Windows prints the output to the interactive command line; on Linux, when running `run_train.sh`, end the command with `&> [log_file_path]` to write standard output and error output to a log file. Once incremental training starts successfully, the training time and loss of each epoch are written continuously to `./train/device*/log*.log`. On failure, the error message is written to the same log file.
-
-### Training on CPU
-
-- Setting the number of nodes
-
-    `train.py` currently supports only a single processor, so the processor count does not need to be adjusted. When running `run_train.sh`, the `CPU` device defaults to a single processor, and changing the number of CPUs is not supported yet.
-
-- Starting incremental training
-
-    Example 1: call 1 CPU processor via the Python file.
-
-    ```Shell
-    # Windows or Linux with Python
-    python train.py --platform CPU --dataset_path /store/dataset/OpenImage/train/ --train_method incremental_learn --pretrain_ckpt ./pretrain_checkpoint/mobilenetV2.ckpt
-    ```
-
-    Example 2: call 1 CPU processor via the Shell script.
-
-    ```Shell
-    # Linux with Shell
-    sh run_train.sh CPU /store/dataset/OpenImage/train/ incremental_learn ../pretrain_checkpoint/mobilenetV2.ckpt
-    ```
-
-### Training on GPU
-
-- Setting the number of nodes
-
-    `train.py` currently supports only a single processor, so the node count does not need to be adjusted. When running `run_train.sh`, set `[nproc_per_node]` to the number of GPUs and `[visible_devices]` to the IDs of the usable processors, i.e. GPU IDs; one or more device IDs can be chosen, separated by `,`.
-
-- Starting incremental training
-
-    - Example 1: call 1 GPU processor via the Python file.
-
-      ```Shell
-      # Windows or Linux with Python
-      python train.py --platform GPU --dataset_path /store/dataset/OpenImage/train/ --pretrain_ckpt ./pretrain_checkpoint/mobilenetV2.ckpt --train_method incremental_learn
-      ```
-
-    - Example 2: call 1 GPU processor via the Shell script, with device ID `"0"`.
-
-      ```Shell
-      # Linux with Shell
-      sh run_train.sh GPU 1 0 /store/dataset/OpenImage/train/ incremental_learn ../pretrain_checkpoint/mobilenetV2.ckpt
-      ```
-
-    - Example 3: call 8 GPU processors via the Shell script, with device IDs `"0,1,2,3,4,5,6,7"`.
-
-      ```Shell
-      # Linux with Shell
-      sh run_train.sh GPU 8 0,1,2,3,4,5,6,7 /store/dataset/OpenImage/train/ incremental_learn ../pretrain_checkpoint/mobilenetV2.ckpt
-      ```
-
-### Training on Ascend
-
-- Setting the number of nodes
-
-    `train.py` currently supports only a single processor, so the node count does not need to be adjusted. When running `run_train.sh`, set `[nproc_per_node]` to the number of Ascend AI processors and `[visible_devices]` to the IDs of the usable Ascend AI processors; on an 8-card server, one or more device IDs can be chosen from 0-7, separated by `,`. The number of Ascend processors per node can currently only be set to 1 or 8.
-
-- Starting incremental training
-
-    - Example 1: call 1 Ascend processor via the Python file.
-
-      ```Shell
-      # Windows or Linux with Python
-      python train.py --platform Ascend --dataset_path /store/dataset/OpenImage/train/ --train_method incremental_learn --pretrain_ckpt ./pretrain_checkpoint/mobilenetV2.ckpt
-      ```
-
-    - Example 2: call 1 Ascend AI processor via the Shell script, with device ID "0".
-
-      ```Shell
-      # Linux with Shell
-      sh run_train.sh Ascend 1 0 ~/rank_table.json /store/dataset/OpenImage/train/ incremental_learn ../pretrain_checkpoint/mobilenetV2.ckpt
-      ```
-
-    - Example 3: call 8 Ascend AI processors via the Shell script, with device IDs "0,1,2,3,4,5,6,7".
-
-      ```Shell
-      # Linux with Shell
-      sh run_train.sh Ascend 8 0,1,2,3,4,5,6,7 ~/rank_table.json /store/dataset/OpenImage/train/ incremental_learn ../pretrain_checkpoint/mobilenetV2.ckpt
-      ```
-
-### Incremental Training Results
-
-- Check the run results.
-
-    - When running the Python file, check the printed information on the interactive command line; on `Linux`, after running the Shell script, check the printed information with `cat ./train/device0/log0.log`, which outputs the following:
-
-      ```Shell
-      train args:
Namespace(dataset_path='.\\dataset\\train', platform='CPU', \
-        pretrain_ckpt='.\\pretrain_checkpoint\\mobilenetV2.ckpt', train_method='incremental_learn')
-        cfg: {'num_classes': 26, 'image_height': 224, 'image_width': 224, 'batch_size': 150, \
-        'epoch_size': 15, 'warmup_epochs': 0, 'lr_max': 0.03, 'lr_end': 0.03, 'momentum': 0.9, \
-        'weight_decay': 4e-05, 'label_smooth': 0.1, 'loss_scale': 1024, 'save_checkpoint': True, \
-        'save_checkpoint_epochs': 1, 'keep_checkpoint_max': 20, 'save_checkpoint_path': './checkpoint', \
-        'platform': 'CPU'}
-        Processing batch: 16: 100%|███████████████████████████████████████████ █████████████████████| 16/16 [00:00
+
+- [增量学习](#增量学习)
+    - [概述](#概述)
+    - [任务描述及准备](#任务描述及准备)
+        - [环境配置](#环境配置)
+        - [下载代码](#下载代码)
+        - [准备预训练模型](#准备预训练模型)
+        - [准备数据](#准备数据)
+    - [预训练模型加载代码详解](#预训练模型加载代码详解)
+    - [参数简介](#参数简介)
+        - [运行Python文件](#运行python文件)
+        - [运行Shell脚本](#运行shell脚本)
+    - [加载增量学习训练](#加载增量学习训练)
+        - [CPU加载训练](#cpu加载训练)
+        - [GPU加载训练](#gpu加载训练)
+        - [Ascend加载训练](#ascend加载训练)
+        - [增量学习训练结果](#增量学习训练结果)
+    - [验证增量学习训练模型](#验证增量学习训练模型)
+        - [验证模型](#验证模型)
+        - [验证结果](#验证结果)
+
+
+   
+
+## 概述
+
+计算机视觉任务中,从头开始训练一个网络耗时巨大,需要大量计算能力。预训练模型通常选择OpenImage、ImageNet、VOC、COCO等公开大型数据集,其规模达到几十万甚至上百万张。大部分任务数据规模较大,训练网络模型时,如果不使用预训练模型而从头开始训练,不仅需要消耗大量的时间与计算能力,模型还容易陷入局部极小值和过拟合。因此大部分任务都会选择预训练模型,在其上做增量学习。
+
+MindSpore是一个多元化的机器学习框架。既可以在手机等端侧和PC等设备上运行,也可以在云上的服务器集群上运行。目前MobileNetV2支持在Windows系统中使用单核CPU做增量学习,在EulerOS、Ubuntu系统中使用单个或者多个Ascend AI处理器或GPU做增量学习,本教程将会介绍如何在不同系统与处理器下的MindSpore框架中做增量学习的训练与验证。
+
+目前,Windows上暂时只支持CPU,Ubuntu与EulerOS上支持CPU、GPU与Ascend AI处理器三种处理器。
+
+> 你可以在这里找到完整可运行的样例代码:https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/mobilenetv2
+
+## 任务描述及准备
+
+### 环境配置
+
+若在本地环境运行,需要安装MindSpore框架,配置CPU、GPU或Ascend AI处理器。若在华为云环境上运行,不需要安装MindSpore框架,也不需要配置Ascend AI处理器、CPU与GPU,可以跳过本小节。
+
+1. 安装MindSpore框架
+    在EulerOS、Ubuntu或者Windows等系统上需要根据系统和处理器架构[安装对应版本MindSpore框架](https://www.mindspore.cn/install)。
+
+2. 
配置CPU环境
+    使用CPU时,需要在调用CPU开始训练或测试前,在代码中按照如下方式设置:
+
+    ```Python
+    if config.platform == "CPU":
+        context.set_context(mode=context.GRAPH_MODE, device_target=config.platform, \
+                            save_graphs=False)
+    ```
+
+3. 配置GPU环境
+    使用GPU时,需要在调用GPU开始训练或测试前,在代码中按照如下方式设置:
+
+    ```Python
+    elif config.platform == "GPU":
+        context.set_context(mode=context.GRAPH_MODE, device_target=config.platform, \
+                            save_graphs=False)
+        init("nccl")
+        context.set_auto_parallel_context(device_num=get_group_size(),
+                                          parallel_mode=ParallelMode.DATA_PARALLEL,
+                                          mirror_mean=True)
+    ```
+
+4. 配置Ascend环境
+    以Ascend 910 AI处理器为例,单机8处理器环境的json配置文件`hccl_config.json`示例如下。单/多处理器环境可以根据以下示例调整`"server_count"`与`device`:
+
+    ```json
+    {
+        "version": "1.0",
+        "server_count": "1",
+        "server_list": [
+            {
+                "server_id": "10.155.111.140",
+                "device": [
+                    {"device_id": "0","device_ip": "192.1.27.6","rank_id": "0"},
+                    {"device_id": "1","device_ip": "192.2.27.6","rank_id": "1"},
+                    {"device_id": "2","device_ip": "192.3.27.6","rank_id": "2"},
+                    {"device_id": "3","device_ip": "192.4.27.6","rank_id": "3"},
+                    {"device_id": "4","device_ip": "192.1.27.7","rank_id": "4"},
+                    {"device_id": "5","device_ip": "192.2.27.7","rank_id": "5"},
+                    {"device_id": "6","device_ip": "192.3.27.7","rank_id": "6"},
+                    {"device_id": "7","device_ip": "192.4.27.7","rank_id": "7"}],
+                "host_nic_ip": "reserve"
+            }
+        ],
+        "status": "completed"
+    }
+    ```
+
+    使用Ascend AI处理器时,需要在调用Ascend AI处理器开始训练或测试前,在代码中按照如下方式设置:
+
+    ```Python
+    elif config.platform == "Ascend":
+        context.set_context(mode=context.GRAPH_MODE, device_target=config.platform, \
+                            device_id=config.device_id, save_graphs=False)
+        if config.run_distribute:
+            context.set_auto_parallel_context(device_num=config.rank_size,
+                                              parallel_mode=ParallelMode.DATA_PARALLEL,
+                                              parameter_broadcast=True, mirror_mean=True)
+            auto_parallel_context().set_all_reduce_fusion_split_indices([140])
+        init()
+        ...
+    ```
+
+### 下载代码
+
+在Gitee中克隆[MindSpore开源项目仓库](https://gitee.com/mindspore/mindspore.git),进入`./model_zoo/official/cv/mobilenetv2/`。
+
+```bash
+git clone https://gitee.com/mindspore/mindspore.git
+cd ./mindspore/model_zoo/official/cv/mobilenetv2
+```
+
+代码结构如下:
+
+```
+├─MobileNetV2
+    ├─README.md     # descriptions about MobileNetV2
+    ├─scripts
+    │   run_train.sh   # Shell script for train with Ascend or GPU
+    │   run_eval.sh    # Shell script for evaluation with Ascend or GPU
+    ├─src
+    │   config.py      # parameter configuration
+    │   dataset.py     # creating dataset
+    │   launch.py      # start Python script
+    │   lr_generator.py     # learning rate config
+    │   mobilenetV2.py      # MobileNetV2 architecture
+    │   models.py      # net utils to load ckpt_file, define_net...
+    │   utils.py       # net utils to switch precision, set_context and so on
+    ├─train.py      # training script
+    └─eval.py       # evaluation script
+```
+
+运行增量学习训练与测试时,Windows、Ubuntu与EulerOS上可以使用Python文件`train.py`与`eval.py`,Ubuntu与EulerOS上还可以使用Shell脚本文件`run_train.sh`与`run_eval.sh`。
+
+使用脚本文件`run_train.sh`时,该文件会运行`launch.py`并将参数传入`launch.py`,`launch.py`根据分配的CPU、GPU或Ascend AI处理器数量,启动单个/多个进程运行`train.py`,每一个进程分配对应的一个处理器。
+
+### 准备预训练模型
+
+[下载预训练模型](https://download.mindspore.cn/model_zoo/official/lite/mobilenetv2_openimage_lite/mobilenetV2.ckpt)到以下目录:
+`./pretrain_checkpoint/[pretrain_checkpoint_file]`
+
+```bash
+mkdir pretrain_checkpoint
+wget -P ./pretrain_checkpoint https://download.mindspore.cn/model_zoo/official/lite/mobilenetv2_openimage_lite/mobilenetV2.ckpt
+```
+
+### 准备数据
+
+准备ImageFolder格式管理的数据集,运行`run_train.sh`时加入`[dataset_path]`参数,运行`train.py`时加入`--dataset_path [dataset_path]`参数。
+
+数据集结构如下:
+
+```
+└─ImageFolder
+    ├─train
+    │   class1Folder
+    │   class2Folder
+    │   ......
+    └─eval
+        class1Folder
+        class2Folder
+        ......
+```
+
+## 预训练模型加载代码详解
+
+在增量学习时,需要加载预训练模型。不同数据集和任务中特征提取层(卷积层)分布趋于一致,但是特征向量的组合(全连接层)不相同,分类数量(全连接层output_size)通常也不一致。在增量学习时,只加载预训练的特征提取层参数,不加载预训练的全连接层参数;在微调与初始训练时,同时加载预训练的特征提取层参数与全连接层参数。
+
+在训练与测试之前,首先按照代码第1行,构建MobileNetV2的backbone网络与head网络,并且构建包含这两个子网络的MobileNetV2网络。代码第4-11行展示了如何在`fine_tune`训练模式下,将预训练模型加载入`net`(MobileNetV2);在`incremental_learn`训练模式下,将预训练模型加载入backbone_net子网络,并且冻结backbone_net中的参数,不参与训练。代码第22-24行展示了如何冻结网络参数。
+
+```Python
+ 1: backbone_net, head_net, net = define_net(args_opt, config)
+ 2: ...
+ 3: def define_net(args, config):
+ 4:     backbone_net = MobileNetV2Backbone(platform=args.platform)
+ 5:     head_net = MobileNetV2Head(input_channel=backbone_net.out_channels, num_classes=config.num_classes)
+ 6:     net = mobilenet_v2(backbone_net, head_net)
+ 7:     if args.pretrain_ckpt:
+ 8:         if args.train_method == "fine_tune":
+ 9:             load_ckpt(net, args.pretrain_ckpt)
+10:         elif args.train_method == "incremental_learn":
+11:             load_ckpt(backbone_net, args.pretrain_ckpt, trainable=False)
+12:         elif args.train_method == "train":
+13:             pass
+14:         else:
+15:             raise ValueError("must input the usage of pretrain_ckpt when the pretrain_ckpt isn't None")
+16:     return backbone_net, head_net, net
+17: ...
+18: def load_ckpt(network, pretrain_ckpt_path, trainable=True):
+19:     """Load the pretrained checkpoint and set whether its params are trainable."""
+20:     param_dict = load_checkpoint(pretrain_ckpt_path)
+21:     load_param_into_net(network, param_dict)
+22:     if not trainable:
+23:         for param in network.get_parameters():
+24:             param.requires_grad = False
+```
+
+## 参数简介
+
+### 运行Python文件
+在Windows与Linux系统上训练时,运行`train.py`需要传入`dataset_path`、`platform`、`train_method`与`pretrain_ckpt`四个参数。验证时,运行`eval.py`并且传入`dataset_path`、`platform`、`pretrain_ckpt`与`head_ckpt`四个参数。
+
+```Shell
+# Windows/Linux train with Python file
+python train.py --dataset_path [dataset_path] --platform [platform] --pretrain_ckpt [pretrain_checkpoint_path] --train_method [("train", "fine_tune", "incremental_learn")]
+
+# Windows/Linux eval with Python file
+python eval.py --dataset_path [dataset_path] --platform [platform] --pretrain_ckpt [pretrain_checkpoint_path] --head_ckpt [head_ckpt_path]
+```
+
+- `--dataset_path`:训练与验证数据集地址,无默认值,用户训练/验证时必须输入。
+- `--platform`:处理器类型,默认为“Ascend”,可以设置为“CPU”或“GPU”。
+- `--train_method`:训练方法,必须输入`train`、`fine_tune`与`incremental_learn`其中之一。
+- `--pretrain_ckpt`:增量训练或调优时,需要传入pretrain_checkpoint文件路径以加载预训练好的模型参数权重。
+- `--head_ckpt`:增量训练模型验证时,需要传入head_net预训练模型路径以加载预训练好的模型参数权重。
+
+
+### 运行Shell脚本
+在Linux系统上,可以选择运行Shell脚本文件`./scripts/run_train.sh`与`./scripts/run_eval.sh`。运行时需要在交互界面中同时传入参数。
+
+```Shell
+# Windows doesn't support Shell
+# Linux train with Shell script
+sh run_train.sh [PLATFORM] [DEVICE_NUM] [VISIABLE_DEVICES(0,1,2,3,4,5,6,7)] [RANK_TABLE_FILE] [DATASET_PATH] [TRAIN_METHOD] [CKPT_PATH]
+
+# Linux eval with Shell script for incremental learn
+sh run_eval.sh [PLATFORM] [DATASET_PATH] [PRETRAIN_CKPT_PATH] [HEAD_CKPT_PATH]
+```
+
+- `[PLATFORM]`:处理器类型,默认为“Ascend”,可以设置为“GPU”。
+- `[DEVICE_NUM]`:每个节点(一台服务器/PC相当于一个节点)进程数量,建议设置为机器上Ascend AI处理器数量或GPU数量。
+- `[VISIABLE_DEVICES(0,1,2,3,4,5,6,7)]`:字符串格式的设备ID,训练将会根据`[VISIABLE_DEVICES]`将进程绑定到对应ID的设备上,多个设备ID之间使用','分隔,建议ID数量与进程数量相同。
+- 
`[RANK_TABLE_FILE]`:platform选择Ascend时,需要传入Ascend配置的Json文件路径。
+- `[DATASET_PATH]`:训练与验证数据集地址,无默认值,用户训练/验证时必须输入。
+- `[CKPT_PATH]`:增量训练或调优时,需要传入checkpoint文件路径以加载预训练好的模型参数权重。
+- `[TRAIN_METHOD]`:训练方法,必须输入`train`、`fine_tune`与`incremental_learn`其中之一。
+- `[PRETRAIN_CKPT_PATH]`:针对增量学习的模型做验证时,需要输入主干网络层保存模型路径。
+- `[HEAD_CKPT_PATH]`:针对增量学习的模型做验证时,需要输入全连接层保存模型路径。
+
+## 加载增量学习训练
+
+Windows系统上,MobileNetV2做增量学习训练时,只能运行`train.py`。Linux系统上,使用MobileNetV2做增量学习训练时,可以选择运行`run_train.sh`,并在运行Shell脚本文件时传入[参数](#参数简介)。
+
+Windows系统将输出信息打印到交互式命令行;Linux系统环境下运行`run_train.sh`时,命令行结尾使用`&> [log_file_path]`将标准输出与错误输出写入log文件。增量学习成功开始训练后,`./train/device*/log*.log`中会持续写入每一个epoch的训练时间与Loss等信息。若未成功,上述log文件会写入失败报错信息。
+
+### CPU加载训练
+
+- 设置节点数量
+
+    目前运行`train.py`时仅支持单处理器,不需要调整处理器数量。运行`run_train.sh`文件时,`CPU`设备默认为单处理器,目前暂不支持修改CPU数量。
+
+- 开始增量训练
+
+    使用样例1:通过Python文件调用1个CPU处理器。
+
+    ```Shell
+    # Windows or Linux with Python
+    python train.py --platform CPU --dataset_path /store/dataset/OpenImage/train/ --train_method incremental_learn --pretrain_ckpt ./pretrain_checkpoint/mobilenetV2.ckpt
+    ```
+
+    使用样例2:通过Shell文件调用1个CPU处理器。
+
+    ```Shell
+    # Linux with Shell
+    sh run_train.sh CPU /store/dataset/OpenImage/train/ incremental_learn ../pretrain_checkpoint/mobilenetV2.ckpt
+    ```
+
+### GPU加载训练
+
+- 设置节点数量
+
+    目前运行`train.py`时仅支持单处理器,不需要调整节点数量。运行`run_train.sh`文件时,设置`[nproc_per_node]`为GPU数量,`[visible_devices]`为可使用的处理器编号,即GPU的ID,可以选择一个或多个设备ID,使用`,`隔开。
+
+- 开始增量训练
+
+    - 使用样例1:通过Python文件调用1个GPU处理器。
+
+        ```Shell
+        # Windows or Linux with Python
+        python train.py --platform GPU --dataset_path /store/dataset/OpenImage/train/ --pretrain_ckpt ./pretrain_checkpoint/mobilenetV2.ckpt --train_method incremental_learn
+        ```
+
+    - 使用样例2:通过Shell脚本调用1个GPU处理器,设备ID为`“0”`。
+
+        ```Shell
+        # Linux with Shell
+        sh run_train.sh GPU 1 0 /store/dataset/OpenImage/train/ incremental_learn ../pretrain_checkpoint/mobilenetV2.ckpt
+        ```
+
+    - 使用样例3:通过Shell脚本调用8个GPU处理器,设备ID为`“0,1,2,3,4,5,6,7”`。
+
+        ```Shell
+        # Linux with Shell
+        sh run_train.sh 
GPU 8 0,1,2,3,4,5,6,7 /store/dataset/OpenImage/train/ incremental_learn ../pretrain_checkpoint/mobilenetV2.ckpt
+        ```
+
+### Ascend加载训练
+
+- 设置节点数量
+
+    目前运行`train.py`时仅支持单处理器,不需要调整节点数量。运行`run_train.sh`文件时,设置`[nproc_per_node]`为Ascend AI处理器数量,`[visible_devices]`为可使用的处理器编号,即Ascend AI处理器的ID,8卡服务器可以选择0-7中一个或多个设备ID,使用`,`隔开。Ascend节点处理器数量目前只能设置为1或者8。
+
+- 开始增量训练
+
+    - 使用样例1:通过Python文件调用1个Ascend处理器。
+
+        ```Shell
+        # Windows or Linux with Python
+        python train.py --platform Ascend --dataset_path /store/dataset/OpenImage/train/ --train_method incremental_learn --pretrain_ckpt ./pretrain_checkpoint/mobilenetV2.ckpt
+        ```
+
+    - 使用样例2:通过Shell脚本调用1个Ascend AI处理器,设备ID为“0”。
+
+        ```Shell
+        # Linux with Shell
+        sh run_train.sh Ascend 1 0 ~/rank_table.json /store/dataset/OpenImage/train/ incremental_learn ../pretrain_checkpoint/mobilenetV2.ckpt
+        ```
+
+    - 使用样例3:通过Shell脚本调用8个Ascend AI处理器,设备ID为“0,1,2,3,4,5,6,7”。
+
+        ```Shell
+        # Linux with Shell
+        sh run_train.sh Ascend 8 0,1,2,3,4,5,6,7 ~/rank_table.json /store/dataset/OpenImage/train/ incremental_learn ../pretrain_checkpoint/mobilenetV2.ckpt
+        ```
+
+### 增量学习训练结果
+
+- 查看运行结果。
+
+    - 运行Python文件时,在交互式命令行中查看打印信息;Linux上运行Shell脚本后,使用`cat ./train/device0/log0.log`查看打印信息,输出结果如下:
+
+        ```Shell
+        train args: Namespace(dataset_path='.\\dataset\\train', platform='CPU', \
+        pretrain_ckpt='.\\pretrain_checkpoint\\mobilenetV2.ckpt', train_method='incremental_learn')
+        cfg: {'num_classes': 26, 'image_height': 224, 'image_width': 224, 'batch_size': 150, \
+        'epoch_size': 15, 'warmup_epochs': 0, 'lr_max': 0.03, 'lr_end': 0.03, 'momentum': 0.9, \
+        'weight_decay': 4e-05, 'label_smooth': 0.1, 'loss_scale': 1024, 'save_checkpoint': True, \
+        'save_checkpoint_epochs': 1, 'keep_checkpoint_max': 20, 'save_checkpoint_path': './checkpoint', \
+        'platform': 'CPU'}
+        Processing batch: 16: 100%|███████████████████████████████████████████ █████████████████████| 16/16 [00:00
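
前文提到,`run_train.sh`会通过`launch.py`按处理器数量启动单个/多个`train.py`进程,并把每个进程绑定到一个设备上。下面用一个纯Python草图概括这一"解析设备ID字符串、为每个进程构造环境变量"的思路(假设性示例:`build_worker_envs`等名称均为示意,并非仓库中`launch.py`的实际实现):

```python
import os

def build_worker_envs(visible_devices, device_num, platform="GPU"):
    """根据设备ID字符串与进程数量,为每个训练进程构造环境变量(示意)。"""
    # 解析形如 "0,1,2,3,4,5,6,7" 的设备ID字符串
    device_ids = [d.strip() for d in visible_devices.split(",") if d.strip()]
    assert len(device_ids) >= device_num, "设备ID数量应不少于进程数量"
    envs = []
    for rank_id, device_id in enumerate(device_ids[:device_num]):
        env = dict(os.environ)          # 继承当前环境
        env["RANK_ID"] = str(rank_id)   # 进程在集合通信中的编号
        env["DEVICE_ID"] = device_id    # 绑定的设备ID
        if platform == "GPU":
            env["CUDA_VISIBLE_DEVICES"] = device_id  # 每个进程只看见自己的GPU
        envs.append(env)
    return envs

if __name__ == "__main__":
    # 8个进程绑定8个设备;实际启动时每个env可配合subprocess运行一个train.py
    for env in build_worker_envs("0,1,2,3,4,5,6,7", 8):
        print(env["RANK_ID"], env["DEVICE_ID"])
```

实际的多进程启动还需为每个环境变量字典调用一次`subprocess.Popen`运行`train.py`,并将输出重定向到各自的log文件,此处从略。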