# OCR 

(English|[简体中文](./README_CN.md))

## Get Model
```
wget https://paddle-serving.bj.bcebos.com/paddle_hub_models/image/OCR/ocr_rec.tar.gz
tar -xzvf ocr_rec.tar.gz
wget https://paddle-serving.bj.bcebos.com/ocr/ocr_det.tar.gz
tar -xzvf ocr_det.tar.gz
```

## Get Dataset (Optional)
```
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/ocr/test_imgs.tar
tar xf test_imgs.tar
```

## Web Service

### Start Service

```
# Choose one of the CPU/GPU command sets below
# For CPU users
python3 -m paddle_serving_server.serve --model ocr_det_model --port 9293
python3 ocr_web_server.py cpu
# For GPU users
python3 -m paddle_serving_server.serve --model ocr_det_model --port 9293 --gpu_ids 0
python3 ocr_web_server.py gpu
```

### Client Prediction
```
python3 ocr_web_client.py
```
If you want a faster web service, try the Web LocalPredictor Service described below.

## Web LocalPredictor Service
```
# Choose one of the CPU/GPU commands below
# For CPU users
python3 ocr_debugger_server.py cpu
# For GPU users
python3 ocr_debugger_server.py gpu
```

## Web LocalPredictor Client Prediction
```
python3 ocr_web_client.py
```

## Benchmark

CPU: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz * 40

GPU: Nvidia Tesla V100 * 1

Dataset: RCTW 500 sample images

| engine                       | client read image(ms) | client-server trans time(ms) | server read image(ms) | det pre(ms) | det infer(ms) | det post(ms) | rec pre(ms) | rec infer(ms) | rec post(ms) | server-client trans time(ms) | server side time consumption(ms) | server side overhead(ms) | total time(ms) |
|------------------------------|----------------|----------------------------|------------------|--------------------|------------------|--------------------|--------------------|------------------|--------------------|--------------------------|--------------------|--------------|---------------|
| Serving web service          | 8.69         | 13.41                      | 109.97           | 2.82               | 87.76            | 4.29               | 3.98               | 78.51            | 3.66               | 4.12                     | 181.02             | 136.49       | 317.51        |
| Serving LocalPredictor web service |  8.73        | 16.42                      | 115.27           | 2.93               | 20.63            | 3.97               | 4.48               | 13.84            | 3.60               | 6.91                     | 49.45              | 147.33       | 196.78        |
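
As a reading aid, the short sketch below checks how some columns of the table relate. It assumes, based only on the numbers themselves, that "server side time consumption" is the sum of the six det/rec stage columns and that "total time" is that value plus "server side overhead". The dictionary keys (`stages`, `server_time`, `overhead`, `total`) are illustrative names, not part of any Paddle Serving API.

```python
# Timings in ms, copied from the benchmark table above.
# "stages" holds: det pre, det infer, det post, rec pre, rec infer, rec post.
rows = {
    "Serving web service": {
        "stages": [2.82, 87.76, 4.29, 3.98, 78.51, 3.66],
        "server_time": 181.02,
        "overhead": 136.49,
        "total": 317.51,
    },
    "Serving LocalPredictor web service": {
        "stages": [2.93, 20.63, 3.97, 4.48, 13.84, 3.60],
        "server_time": 49.45,
        "overhead": 147.33,
        "total": 196.78,
    },
}


def check_row(row, tol=0.005):
    """Return True if the stage sum and the total match the table within tol ms."""
    stage_sum_ok = abs(sum(row["stages"]) - row["server_time"]) < tol
    total_ok = abs(row["server_time"] + row["overhead"] - row["total"]) < tol
    return stage_sum_ok and total_ok


for name, row in rows.items():
    print(name, "consistent:", check_row(row))

# End-to-end speedup of the LocalPredictor service over the plain web service.
speedup = (rows["Serving web service"]["total"]
           / rows["Serving LocalPredictor web service"]["total"])
print(f"total-time speedup: {speedup:.2f}x")
```

On these numbers, the LocalPredictor service is faster end to end mainly because its det and rec inference stages are shorter; the overhead column is of similar size in both rows.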

## Appendix: For Users who want to launch Det or Rec only
If you only want to detect text regions without recognizing them, or to recognize text directly from images, separate Det and Rec servers are also provided.

### Det Server 

```
python3 det_web_server.py cpu #for cpu user
python3 det_web_server.py gpu #for gpu user
#or
python3 det_debugger_server.py cpu #for cpu user
python3 det_debugger_server.py gpu #for gpu user
```

### Det Client

```
# also use ocr_web_client.py
python3 ocr_web_client.py
```

### Rec Server

```
python3 rec_web_server.py cpu #for cpu user
python3 rec_web_server.py gpu #for gpu user
#or
python3 rec_debugger_server.py cpu #for cpu user
python3 rec_debugger_server.py gpu #for gpu user
```

### Rec Client

```
python3 rec_web_client.py
```

## C++ OCR Service

**Notice:** To chain the det and rec models and run pre-processing and post-processing inside the Paddle Serving C++ framework, you need a C++ server compiled with the WITH_OPENCV option; see [COMPILE.md](../../../../doc/Compile_EN.md).

### Start Service

When one service chains two models, pass the relative paths of the model folders, in order, after `--model`, and the custom C++ OP class names after `--op`. The order of the models after `--model` must match the order of the class names after `--op`. Here we assume the two OPs are defined as GeneralDetectionOp and GeneralRecOp. The command is as follows:
```
# One service chains the two models
python3 -m paddle_serving_server.serve --model ocr_det_model ocr_rec_model --op GeneralDetectionOp GeneralRecOp --port 9293
# ocr_det_model corresponds to GeneralDetectionOp, ocr_rec_model corresponds to GeneralRecOp
```
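
Because the models and OP class names are matched purely by position, it can help to keep them as pairs and derive the command from that. The following is a minimal sketch of the idea; `build_serve_command` is a hypothetical helper written for this README, not part of Paddle Serving, and it uses only the `--model`, `--op`, and `--port` flags shown above.

```python
def build_serve_command(pairs, port=9293):
    """Build the serve command from (model_dir, op_class) pairs.

    Hypothetical helper: keeping each model next to its OP makes it
    impossible for the two argument lists to drift out of order.
    """
    if not pairs:
        raise ValueError("at least one (model_dir, op_class) pair is required")
    models = [model for model, _ in pairs]
    ops = [op for _, op in pairs]
    return ["python3", "-m", "paddle_serving_server.serve",
            "--model", *models,
            "--op", *ops,
            "--port", str(port)]


cmd = build_serve_command([
    ("ocr_det_model", "GeneralDetectionOp"),  # detection model and its OP
    ("ocr_rec_model", "GeneralRecOp"),        # recognition model and its OP
])
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` if you prefer launching the server from a script.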

### Client Prediction
Pre-processing and post-processing run in the C++ server, so the client passes the image to the server as a Base64-encoded string.

Because the input is now a string, the value of the `feed_var` parameter in the file `ocr_det_client/serving_client_conf.prototxt` must be changed.

In this case, `feed_type` should be 20 (which means the data type is string) and `shape` should be 1.
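
The Base64 encoding step described above can be sketched with the Python standard library alone. The function names here are illustrative and are not taken from `ocr_cpp_client.py`; how the resulting string is then fed to the server (for example, the feed key name) depends on your client configuration and is not shown.

```python
import base64


def encode_image_bytes(raw: bytes) -> str:
    """Return the Base64 text form of raw image bytes."""
    return base64.b64encode(raw).decode("utf8")


def encode_image_file(path: str) -> str:
    """Read an image file in binary mode and Base64-encode its contents."""
    with open(path, "rb") as f:
        return encode_image_bytes(f.read())


# Round-trip check on a few bytes (the PNG signature), so no file is needed.
sample = b"\x89PNG\r\n\x1a\n"
encoded = encode_image_bytes(sample)
print(type(encoded).__name__, base64.b64decode(encoded) == sample)
```

The encoded value is a plain `str`, which is consistent with the string `feed_type` setting mentioned above.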

By passing in multiple client folder paths, the client can be started for multi-model prediction.
```
python3 ocr_cpp_client.py ocr_det_client ocr_rec_client
```