English | [简体中文](readme.md)

PaddleOCR provides two service deployment methods:
- Based on **PaddleHub Serving**: the code is in `./deploy/hubserving`. This tutorial covers this method.
- Based on **PaddleServing**: the code is in `./deploy/pdserving`. See the [tutorial](../pdserving/readme_en.md) for usage.

# Service deployment based on PaddleHub Serving  

The hubserving deployment directory contains three service packages: detection, recognition, and a two-stage series service that chains the two. Install and start whichever package fits your needs. The directory layout is as follows:  
```
deploy/hubserving/
  └─  ocr_det     detection module service package
  └─  ocr_rec     recognition module service package
  └─  ocr_system  two-stage series connection service package
```

Each service package contains four files (three required, one optional). Taking the 2-stage series service package as an example, the directory is as follows:  
```
deploy/hubserving/ocr_system/
  └─  __init__.py    Empty file, required
  └─  config.json    Configuration file, optional, passed in as a parameter when using configuration to start the service
  └─  module.py      Main module file, required, contains the complete logic of the service
  └─  params.py      Parameter file, required, including parameters such as model path, pre- and post-processing parameters
```

## Quick start service
The following steps take the 2-stage series service as an example. If only the detection service or recognition service is needed, replace the corresponding file path.

### 1. Prepare the environment
```shell
# Install paddlehub  
pip3 install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple

# Set environment variables on Linux
export PYTHONPATH=.

# Set environment variables on Windows
SET PYTHONPATH=.
```

### 2. Download inference model
Before installing the service module, prepare the inference models and put them in the correct paths. By default, the ultra-lightweight v1.1 models are used, and the default model paths are:  
```
detection model: ./inference/ch_ppocr_mobile_v1.1_det_infer/
recognition model: ./inference/ch_ppocr_mobile_v1.1_rec_infer/
text direction classifier: ./inference/ch_ppocr_mobile_v1.1_cls_infer/
```  

**The model path can be found and modified in `params.py`.** More models provided by PaddleOCR can be obtained from the [model library](../../doc/doc_en/models_list_en.md). You can also use models trained by yourself.
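
Before installing a module, it can help to confirm that the model directories are in place. The following is a minimal sketch, assuming the default paths listed above are unchanged in `params.py`; the helper name `missing_model_dirs` is hypothetical and not part of PaddleOCR.

```python
from pathlib import Path

# Default model directories quoted in this README.
# Adjust these if the paths in params.py were edited.
DEFAULT_MODEL_DIRS = {
    "detection": "./inference/ch_ppocr_mobile_v1.1_det_infer/",
    "recognition": "./inference/ch_ppocr_mobile_v1.1_rec_infer/",
    "text direction classifier": "./inference/ch_ppocr_mobile_v1.1_cls_infer/",
}


def missing_model_dirs(root=".", model_dirs=DEFAULT_MODEL_DIRS):
    """Return the names of models whose directory does not exist under root."""
    root = Path(root)
    return [name for name, rel in model_dirs.items() if not (root / rel).is_dir()]
```

Run from the PaddleOCR repository root, `missing_model_dirs()` returns an empty list when all three directories exist.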

### 3. Install Service Module
PaddleOCR provides three service modules. Install the ones you need.

* On Linux, for example:
```shell
# Install the detection service module:
hub install deploy/hubserving/ocr_det/

# Or, install the recognition service module:
hub install deploy/hubserving/ocr_rec/

# Or, install the 2-stage series service module:
hub install deploy/hubserving/ocr_system/
```

* On Windows, for example:
```shell
# Install the detection service module:
hub install deploy\hubserving\ocr_det\

# Or, install the recognition service module:
hub install deploy\hubserving\ocr_rec\

# Or, install the 2-stage series service module:
hub install deploy\hubserving\ocr_system\
```

### 4. Start service
#### Way 1. Start with command line parameters (CPU only)

**start command:**  
```shell
hub serving start --modules [Module1==Version1, Module2==Version2, ...] \
                  --port XXXX \
                  --use_multiprocess \
                  --workers XXXX
```  
**parameters:**  

|parameter|usage|  
|-|-|  
|--modules/-m|Modules pre-installed for PaddleHub Serving, listed as one or more `Module==Version` pairs<br>*`When Version is not specified, the latest version is selected by default`*|
|--port/-p|Service port; the default is 8866|  
|--use_multiprocess|Enable concurrent (multi-process) mode; the default is single-process mode. Recommended for multi-core CPU machines<br>*`Windows operating system only supports single-process mode`*|
|--workers|Number of concurrent tasks in concurrent mode; the default is `2*cpu_count-1`, where `cpu_count` is the number of CPU cores|  
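
To see what the default `--workers` value works out to on a given machine, the formula from the table can be evaluated directly. This is only an illustration of the arithmetic; the helper name `default_workers` is hypothetical, and PaddleHub may compute `cpu_count` differently internally.

```python
import os


def default_workers(cpu_count=None):
    """Default worker count from the table above: 2 * cpu_count - 1."""
    if cpu_count is None:
        # os.cpu_count() can return None on unusual platforms; assume 1 core then.
        cpu_count = os.cpu_count() or 1
    return 2 * cpu_count - 1
```

For example, a 4-core machine gives `default_workers(4) == 7`.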

For example, start the 2-stage series service:  
```shell
hub serving start -m ocr_system
```  

This completes the deployment of a service API, using the default port number 8866.  

#### Way 2. Start with configuration file (CPU, GPU)
**start command:**  
```shell
hub serving start --config/-c config.json
```  
The format of `config.json` is as follows:
```json
{
    "modules_info": {
        "ocr_system": {
            "init_args": {
                "version": "1.0.0",
                "use_gpu": true
            },
            "predict_args": {
            }
        }
    },
    "port": 8868,
    "use_multiprocess": false,
    "workers": 2
}
```
- The configurable parameters in `init_args` are consistent with the `_initialize` function interface in `module.py`. Among them, **when `use_gpu` is `true`, it means that the GPU is used to start the service**.
- The configurable parameters in `predict_args` are consistent with the `predict` function interface in `module.py`.

**Note:**  
- When the service is started with a configuration file, other command-line parameters are ignored.
- If you use GPU prediction (that is, `use_gpu` is set to `true`), set the environment variable `CUDA_VISIBLE_DEVICES` before starting the service, for example `export CUDA_VISIBLE_DEVICES=0`. Otherwise you do not need to set it.
- **`use_gpu` and `use_multiprocess` cannot be `true` at the same time.**  
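
The configuration format and the `use_gpu`/`use_multiprocess` constraint above can be sketched as a small generator that refuses invalid combinations. This is a minimal sketch, not part of PaddleOCR or PaddleHub: the function names `build_config` and `dump_config` are hypothetical, and the default values simply mirror the sample `config.json` shown earlier.

```python
import json


def build_config(module="ocr_system", port=8868, use_gpu=False,
                 use_multiprocess=False, workers=2, version="1.0.0"):
    """Build a config dict in the format of the sample config.json."""
    # The README states these two options cannot both be true.
    if use_gpu and use_multiprocess:
        raise ValueError("use_gpu and use_multiprocess cannot both be true")
    return {
        "modules_info": {
            module: {
                "init_args": {"version": version, "use_gpu": use_gpu},
                "predict_args": {},
            }
        },
        "port": port,
        "use_multiprocess": use_multiprocess,
        "workers": workers,
    }


def dump_config(config):
    """Serialize the config as JSON text, ready to write to config.json."""
    return json.dumps(config, indent=4)
```

Writing `dump_config(build_config(use_gpu=True))` to a file produces a configuration equivalent to the sample above.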

For example, to start the 2-stage series service on GPU card No. 3:
```shell
export CUDA_VISIBLE_DEVICES=3
hub serving start -c deploy/hubserving/ocr_system/config.json
```  

## Send prediction requests
After the service starts, you can use the following command to send a prediction request to obtain the prediction result:  
```shell
python tools/test_hubserving.py server_url image_path
```  

Two parameters need to be passed to the script:
- **server_url**: the service address, in the format  
`http://[ip_address]:[port]/predict/[module_name]`  
For example, if the detection, recognition and 2-stage series services are started with the provided configuration files, the respective `server_url` values are:  
`http://127.0.0.1:8866/predict/ocr_det`  
`http://127.0.0.1:8867/predict/ocr_rec`  
`http://127.0.0.1:8868/predict/ocr_system`  
- **image_path**: the test image path, either a single image file or a directory of images
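
The URL format described above can be assembled programmatically. This is a minimal sketch; the helper name `build_server_url` is hypothetical, and the port mapping only holds when the services are started with the provided configuration files.

```python
# Ports listed in this README for services started with the provided config files.
PROVIDED_PORTS = {"ocr_det": 8866, "ocr_rec": 8867, "ocr_system": 8868}


def build_server_url(module_name, port=None, ip_address="127.0.0.1"):
    """Return http://[ip_address]:[port]/predict/[module_name]."""
    if port is None:
        port = PROVIDED_PORTS[module_name]
    return f"http://{ip_address}:{port}/predict/{module_name}"
```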

**Example:**
```shell
python tools/test_hubserving.py http://127.0.0.1:8868/predict/ocr_system ./doc/imgs/
```
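
If you prefer to call the service from your own code rather than through `tools/test_hubserving.py`, a request might be built as sketched below. This is an assumption-laden sketch, not the script's actual implementation: it assumes the service accepts a JSON body of the form `{"images": [<base64 string>]}`, which this README does not specify. Check `tools/test_hubserving.py` for the authoritative request format. The function names are hypothetical.

```python
import base64
import json
import urllib.request


def build_request(server_url, image_bytes):
    """Build (but do not send) a POST request carrying one base64-encoded image."""
    # Assumed payload schema: {"images": [base64-encoded image]}.
    payload = {"images": [base64.b64encode(image_bytes).decode("utf8")]}
    return urllib.request.Request(
        server_url,
        data=json.dumps(payload).encode("utf8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(server_url, image_path, timeout=30):
    """Send one image to a running service and return the decoded JSON response."""
    with open(image_path, "rb") as f:
        request = build_request(server_url, f.read())
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf8"))
```

`predict` requires a running service; `build_request` can be inspected offline to check the payload before sending anything.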

## Returned result format
The returned result is a list of dicts. Each dict may contain up to three fields:

|field name|data type|description|
|-|-|-|
|text|str|text content|
|confidence|float|text recognition confidence|
|text_region|list|text location coordinates|

The fields returned by different modules are different. For example, the results returned by the text recognition service module do not contain `text_region`. The details are as follows:

|field name/module name|ocr_det|ocr_rec|ocr_system|
|-|-|-|-|  
|text||✔|✔|
|confidence||✔|✔|
|text_region|✔||✔|
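
Because the fields differ by module, client code should not assume every key is present. The table above can be encoded and checked as in the following sketch; the names `EXPECTED_FIELDS`, `missing_fields` and `describe_result` are hypothetical, and the sample values in the usage note are invented for illustration.

```python
# Fields each module returns, transcribed from the table above.
EXPECTED_FIELDS = {
    "ocr_det": {"text_region"},
    "ocr_rec": {"text", "confidence"},
    "ocr_system": {"text", "confidence", "text_region"},
}


def missing_fields(module_name, item):
    """Return the expected fields that are absent from one result dict."""
    return sorted(EXPECTED_FIELDS[module_name] - set(item))


def describe_result(item):
    """Format one result dict, skipping fields the module did not return."""
    parts = []
    if "text" in item:
        parts.append(f"text={item['text']!r}")
    if "confidence" in item:
        parts.append(f"confidence={item['confidence']:.2f}")
    if "text_region" in item:
        parts.append(f"region={item['text_region']}")
    return ", ".join(parts)
```

For instance, `missing_fields("ocr_system", {"text": "hi", "confidence": 0.9})` reports `["text_region"]`, which would indicate a response from the wrong module or a modified `module.py`.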

**Note:** To add, delete or modify the returned fields, edit `module.py` of the corresponding module. For the complete process, see the next section on modifying a service module.

## User-defined service module modification
If you need to modify the service logic, the following steps are generally required (take the modification of `ocr_system` for example):

- 1. Stop service
```shell
hub serving stop --port/-p XXXX
```
- 2. Modify the code in the corresponding files, like `module.py` and `params.py`, according to the actual needs.  
For example, to replace the model used by the deployed service, modify the model path parameters `det_model_dir` and `rec_model_dir` in `params.py`. To turn off the text direction classifier, set the parameter `use_angle_cls` to `False`. Other related parameters may need to change at the same time, so modify and debug according to your situation. It is recommended to run `module.py` directly to debug your changes before starting the service.  
- 3. Uninstall old service module
```shell
hub uninstall ocr_system
```
- 4. Install modified service module
```shell
hub install deploy/hubserving/ocr_system/
```
- 5. Restart service
```shell
hub serving start -m ocr_system
```
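
The command-line steps above (everything except editing the code in step 2) can be collected into a list for scripting. This is a minimal sketch that only assembles the commands shown in this section and does not run them; the helper name `redeploy_commands` is hypothetical, and the default port assumes the service was started without a configuration file.

```python
def redeploy_commands(module_name="ocr_system", port=8866,
                      module_dir="deploy/hubserving/ocr_system/"):
    """Return the stop, uninstall, install and start commands in order."""
    return [
        ["hub", "serving", "stop", "--port", str(port)],   # step 1
        ["hub", "uninstall", module_name],                 # step 3
        ["hub", "install", module_dir],                    # step 4
        ["hub", "serving", "start", "-m", module_name],    # step 5
    ]
```

Each entry can be passed to `subprocess.run(cmd, check=True)`. Make the code changes from step 2 before running the install command, and note that the final start command is expected to keep running while the service is up.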