English | [简体中文](../../zh_CN/inference_deployment/paddle_hub_serving_deploy.md)

# Service deployment based on PaddleHub Serving

PaddleClas supports rapid service deployment through PaddleHub. Currently, the deployment of image classification is supported. Please look forward to the deployment of image recognition.

## Catalogue

- [1. Introduction](#1-introduction)
- [2. Prepare the environment](#2-prepare-the-environment)
- [3. Download the inference model](#3-download-the-inference-model)
- [4. Install the service module](#4-install-the-service-module)
- [5. Start service](#5-start-service)
  - [5.1 Start with command line parameters](#51-start-with-command-line-parameters)
  - [5.2 Start with configuration file](#52-start-with-configuration-file)
- [6. Send prediction requests](#6-send-prediction-requests)
- [7. User defined service module modification](#7-user-defined-service-module-modification)

<a name="1"></a>
## 1. Introduction

The HubServing service pack contains three required files and one optional configuration file; the directory is as follows:

```
hubserving/clas/
├── __init__.py    # Empty file, required
├── config.json    # Configuration file, optional; passed in as a parameter when starting the service with a configuration file
├── module.py      # Main module file, required; contains the complete logic of the service
└── params.py      # Parameter file, required; includes parameters such as the model path and pre- and post-processing parameters
```
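<a name="2"></a>
## 2. Prepare the environment

Serving deployment depends on PaddleHub. A minimal setup sketch is shown below; the Python version, pinned release, and mirror index are assumptions, so adjust them to your environment:

```shell
# Install PaddleHub (a 2.x release is assumed)
python3.7 -m pip install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
```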
<a name="3"></a>
## 3. Download the inference model

Before installing the service module, you need to prepare the inference model and put it in the correct path. The default model path is:

* Classification inference model structure file: `PaddleClas/inference/inference.pdmodel`
* Classification inference model weight file: `PaddleClas/inference/inference.pdiparams`
**Notice**:
* Model file paths can be viewed and modified in `PaddleClas/deploy/hubserving/clas/params.py`:
```python
"inference_model_dir":"../inference/"
```
* Model files (including `.pdmodel` and `.pdiparams`) must be named `inference`.
* We provide a large number of pre-trained models based on the ImageNet-1k dataset. For the model list and download addresses, see the [Model Library Overview](../../docs/en/algorithm_introduction/ImageNet_models_en.md), or you can use models trained and converted by yourself, as sketched below.
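For the last point, a conversion sketch is shown below; the config file, weight path, and output directory are placeholders for your own training artifacts:

```shell
# Export a trained model to the inference format expected by the service;
# the config file and paths here are illustrative assumptions.
cd PaddleClas
python3.7 tools/export_model.py \
    -c ppcls/configs/ImageNet/ResNet/ResNet50.yaml \
    -o Global.pretrained_model=./output/ResNet50/best_model \
    -o Global.save_inference_dir=./inference
```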
<a name="4"></a>
## 4. Install the service module
* In the Linux environment, the installation example is as follows:
```shell
cd PaddleClas/deploy
# Install the service module:
hub install hubserving/clas/
```
* In the Windows environment (the folder separator is `\`), the installation example is as follows:

```shell
cd PaddleClas\deploy
# Install the service module:
hub install hubserving\clas\
```
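To confirm that the module was installed, you can list the locally installed modules (the `hub list` command ships with the PaddleHub CLI):

```shell
hub list
```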
<a name="5"></a>
## 5. Start service

<a name="5.1"></a>
### 5.1 Start with command line parameters

This method only supports prediction using the CPU. Start command:

```shell
hub serving start --modules Module1==Version1 \
                  --port XXXX \
                  --use_multiprocess \
                  --workers
```

**Parameter Description**:

| parameters | usage |
| ------------------ | ------------------- |
| --modules/-m | [**required**] PaddleHub Serving pre-installed models, listed as multiple `Module==Version` key-value pairs<br>*`When Version is not specified, the latest version is selected by default`* |
| --port/-p | [**optional**] Service port, default is 8866 |
| --use_multiprocess | [**optional**] Whether to enable concurrent mode; the default is single-process mode. Concurrent mode is recommended for multi-core CPU machines<br>*`The Windows operating system only supports single-process mode`* |
| --workers | [**optional**] The number of concurrent tasks specified in concurrent mode; the default is `2*cpu_count-1`, where `cpu_count` is the number of CPU cores |

For example, start the classification service:

```shell
hub serving start -m clas_system
```

This completes the deployment of a service API, using the default port number 8866.
For more deployment details, see [PaddleHub Serving Model One-Click Service Deployment](https://paddlehub.readthedocs.io/zh_CN/release-v2.1/tutorial/serving.html)
<a name="5.2"></a>
### 5.2 Start with configuration file
This method only supports prediction using CPU or GPU. Start command:
```shell
hub serving start -c config.json
```
The format of `config.json` is as follows:

```json
{
    "modules_info": {
        ...
    },
    ...
    "workers": 2
}
```
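The elided fields vary with the module; a complete example consistent with the parameter description below might look like this (the module name `clas_system` matches the module installed above, and the field values are illustrative):

```json
{
    "modules_info": {
        "clas_system": {
            "init_args": {
                "version": "1.0.0",
                "use_gpu": true,
                "enable_mkldnn": false
            },
            "predict_args": {}
        }
    },
    "port": 8866,
    "use_multiprocess": false,
    "workers": 2
}
```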
**Parameter Description**:

* The configurable parameters in `init_args` are consistent with the `_initialize` function interface in `module.py`. Among them:
  - When `use_gpu` is `true`, the GPU is used to start the service.
  - When `enable_mkldnn` is `true`, MKL-DNN acceleration is used.
* The configurable parameters in `predict_args` are consistent with the `predict` function interface in `module.py`.

**Notice**:

* When using the configuration file to start the service, the parameter settings in the configuration file are used, and other command line parameters are ignored.
* If you use GPU prediction (that is, `use_gpu` is set to `true`), you need to set the `CUDA_VISIBLE_DEVICES` environment variable to specify the GPU card number before starting the service, such as `export CUDA_VISIBLE_DEVICES=0`; otherwise it does not need to be set.
* **`use_gpu` and `use_multiprocess` cannot be `true` at the same time.**
* **When both `use_gpu` and `enable_mkldnn` are `true`, `enable_mkldnn` is ignored and the GPU is used.**

For example, use GPU card No. 3 to start the classification service:
```shell
cd PaddleClas/deploy
export CUDA_VISIBLE_DEVICES=3
hub serving start -c hubserving/clas/config.json
```
<a name="6"></a>
## 6. Send prediction requests

After the service starts, you can use the following command to send a prediction request to obtain the prediction result:
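The invocation below is a sketch built from the script parameters documented in this section; the server address and the test image path are placeholders to replace with your own:

```shell
cd PaddleClas/deploy
# --server_url and --image_path values are placeholders; see the parameter description below
python3.7 hubserving/test_hubserving.py \
    --server_url http://127.0.0.1:8866/predict/clas_system \
    --image_path ./hubserving/test.jpg \
    --batch_size 1
```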
**Script parameter description**:

* **server_url**: Service address, in the format `http://[ip_address]:[port]/predict/[module_name]`.
* **image_path**: Test image path, which can be a single image path or an image directory path.
* **batch_size**: [**optional**] Make predictions in batches of `batch_size`, default is `1`.
* **resize_short**: [**optional**] During preprocessing, resize by the short edge, default is `256`.
* **crop_size**: [**optional**] The size of the center crop during preprocessing, default is `224`.
* **normalize**: [**optional**] Whether to perform `normalize` during preprocessing, default is `True`.
* **to_chw**: [**optional**] Whether to transpose to `CHW` order during preprocessing, default is `True`.

**Notice**: If you want to use Transformer series models, such as `DeiT_***_384`, `ViT_***_384`, etc., please pay attention to the input size of the model, and you need to specify `--resize_short=384 --crop_size=384`.

**Return result format description**:

The returned result is a list, including the top-k classification results, the corresponding scores, and the prediction time for each image, as follows:

```
list: return result
└── list: first image result
    ├── list: the top-k classification results, sorted in descending order of score
    ├── list: the scores corresponding to the top-k classification results, sorted in descending order of score
    └── float: the image classification time, in seconds
```

The script also prints summary statistics at the end of a run, for example:

```
The average time of prediction cost: 2.970 s/image
The average time cost: 3.014 s/image
The average top-1 score: 0.110
```

**Note**: If you need to add, delete or modify the returned fields, you can modify the corresponding `module.py`. For details, refer to the user-defined service module modification in the next section.
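A minimal sketch of consuming this structure, assuming `results` already holds the parsed return value for a batch of images:

```python
# results: list of per-image results, each being
# [top-k labels, top-k scores, prediction time in seconds]
for labels, scores, cost in results:
    print(f"top-1: {labels[0]} (score {scores[0]:.4f}), cost {cost:.3f}s")
```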
<a name="7"></a>
## 7. User defined service module modification

If you need to modify the service logic, the following steps are generally required:

1. Stop the service:
```shell
hub serving stop --port/-p XXXX
```
2. Modify the code in the corresponding files, such as `module.py` and `params.py`, according to actual needs. `module.py` needs to be re-installed after modification (`hub install hubserving/clas/`) and re-deployed. Before deploying, you can use the `python3.7 hubserving/clas/module.py` command to quickly test the code ready for deployment.

   For example, if you need to replace the model used by the deployed service, you need to modify the model path parameter `"inference_model_dir"` in `params.py`. Of course, other related parameters may need to be modified at the same time. Please modify and debug according to the actual situation.
3. Uninstall the old service module:
```shell
hub uninstall clas_system
```
4. Install the modified service module:
```shell
hub install hubserving/clas/
```
5. Restart the service:
```shell
hub serving start -m clas_system
```
**Notice**:

Common parameters can be modified in `PaddleClas/deploy/hubserving/clas/params.py`:

* To replace the model, modify the model file path parameter:
```python
"inference_model_dir":
```
* To change the number of top-k results returned during post-processing:
```python
'topk':
```
* To change the mapping file between labels and class IDs used during post-processing:
```python
'class_id_map_file':
```
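Taken together, the relevant part of `params.py` might look like the following sketch; the `Config` container and the label file path are assumptions for illustration:

```python
class Config(object):
    pass

def read_params():
    cfg = Config()
    # Directory holding inference.pdmodel / inference.pdiparams
    cfg.inference_model_dir = "../inference/"
    # Number of top-k results returned during post-processing
    cfg.topk = 5
    # Mapping file between labels and class IDs (path is an assumption)
    cfg.class_id_map_file = "../ppcls/utils/imagenet1k_label_list.txt"
    return cfg
```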
In order to avoid unnecessary delay and to be able to predict in batches, data preprocessing logic (including `resize`, `crop` and other operations) is completed on the client side, so the related code in [test_hubserving.py](./test_hubserving.py#L41-L47) and [test_hubserving.py](./test_hubserving.py#L51-L76) needs to be modified if necessary, as sketched below.
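A sketch of the client-side preprocessing this document describes (resize by short edge, center crop, normalize, transpose to CHW); the defaults follow the script parameters in Section 6, and the ImageNet mean/std values are assumptions:

```python
import cv2
import numpy as np

def preprocess(img, resize_short=256, crop_size=224, normalize=True, to_chw=True):
    # Resize so the short edge equals `resize_short`, keeping the aspect ratio
    h, w = img.shape[:2]
    scale = resize_short / min(h, w)
    img = cv2.resize(img, (int(round(w * scale)), int(round(h * scale))))
    # Center crop to `crop_size` x `crop_size`
    h, w = img.shape[:2]
    top, left = (h - crop_size) // 2, (w - crop_size) // 2
    img = img[top:top + crop_size, left:left + crop_size, :]
    if normalize:
        # Standard ImageNet mean/std (an assumption, not taken from this document)
        mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
        std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
        img = (img.astype(np.float32) / 255.0 - mean) / std
    if to_chw:
        # HWC -> CHW
        img = img.transpose((2, 0, 1))
    return img
```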
In order to avoid unnecessary delay and be able to predict with batch_size, data preprocessing logic (including `resize`, `crop` and other operations) is completed on the client side, so it needs to be in [PaddleClas/deploy/hubserving/test_hubserving.py# L41-L47](../../../deploy/hubserving/test_hubserving.py#L41-L47) and [PaddleClas/deploy/hubserving/test_hubserving.py#L51-L76](../../../deploy/hubserving/test_hubserving.py#L51-L76) Modify the data preprocessing logic related code.