English | [简体中文](readme.md)

PaddleOCR provides 2 service deployment methods:
- Based on **PaddleHub Serving**: Code path is "`./deploy/hubserving`". Please refer to the [tutorial](../hubserving/readme_en.md) for usage.
- Based on **PaddleServing**: Code path is "`./deploy/pdserving`". Please follow this tutorial.

# Service deployment based on Paddle Serving

This tutorial introduces the detailed steps for deploying the PaddleOCR online prediction service based on [Paddle Serving](https://github.com/PaddlePaddle/Serving).

## Quick start service

### 1. Prepare the environment

Let's first install the relevant components of Paddle Serving. GPU is recommended for service deployment with Paddle Serving.

**Requirements:**
- **CUDA version: 9.0**
- **CUDNN version: 7.0**
- **Operating system version: >= CentOS 6**
- **Python version: 2.7/3.6/3.7**

**Installation:**
```
# install GPU server
python -m pip install paddle_serving_server_gpu

# or, install CPU server
python -m pip install paddle_serving_server

# install client and App package (CPU/GPU)
python -m pip install paddle_serving_app paddle_serving_client
```

### 2. Model transformation

For convenience, you can directly use the converted models provided by `paddle_serving_app`. Execute the following commands to obtain them:
```
python -m paddle_serving_app.package --get_model ocr_rec
tar -xzvf ocr_rec.tar.gz
python -m paddle_serving_app.package --get_model ocr_det
tar -xzvf ocr_det.tar.gz
```
Executing the above commands downloads the `db_crnn_mobile` model, which is in a different format from the inference model. If you want to deploy other models, you can refer to the [tutorial](https://github.com/PaddlePaddle/Serving/blob/develop/doc/INFERENCE_TO_SERVING_CN.md) to convert your inference model into a model that is deployable with Paddle Serving.

We take the `ch_rec_r34_vd_crnn` model as an example.
Download the inference model by executing the following commands:
```
wget --no-check-certificate https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar
tar xf ch_rec_r34_vd_crnn_infer.tar
```

Convert the downloaded model by executing the following Python script:
```
from paddle_serving_client.io import inference_model_to_serving
inference_model_dir = "ch_rec_r34_vd_crnn"
serving_client_dir = "serving_client_dir"
serving_server_dir = "serving_server_dir"
feed_var_names, fetch_var_names = inference_model_to_serving(
    inference_model_dir, serving_client_dir, serving_server_dir, model_filename="model", params_filename="params")
```

Finally, the model configurations for the client and the server are generated in `serving_client_dir` and `serving_server_dir`, respectively.

### 3. Start service

Start the standard version or the fast version of the service according to your actual needs. The two versions are compared in the table below:

|version|characteristics|recommended scenarios|
|-|-|-|
|standard version|High stability, suitable for distributed deployment|Large throughput and cross-regional deployment|
|fast version|Easy to deploy and fast to predict|Scenarios that require high prediction speed and fast iteration|

#### Mode 1. Start the standard mode service

```
# start with CPU
python -m paddle_serving_server.serve --model ocr_det_model --port 9293
python ocr_web_server.py cpu

# or, with GPU
python -m paddle_serving_server_gpu.serve --model ocr_det_model --port 9293 --gpu_id 0
python ocr_web_server.py gpu
```

#### Mode 2. Start the fast mode service

```
# start with CPU
python ocr_local_server.py cpu

# or, with GPU
python ocr_local_server.py gpu
```

## Send prediction requests

```
python ocr_web_client.py
```

## Returned result format

The returned result is a JSON string, e.g.:
```
{u'result': {u'res': [u'\u571f\u5730\u6574\u6cbb\u4e0e\u571f\u58e4\u4fee\u590d\u7814\u7a76\u4e2d\u5fc3', u'\u534e\u5357\u519c\u4e1a\u5927\u5b661\u7d20\u56fe']}}
```
You can also print the readable result in `res`:
```
土地整治与土壤修复研究中心
华南农业大学1素图
```

## User defined service module modification

The pre-processing and post-processing logic can be found in the `preprocess` and `postprocess` functions in `ocr_web_server.py` or `ocr_local_server.py`. These functions call the pre-processing/post-processing library for common CV models provided by `paddle_serving_app`. You can modify the corresponding code according to your actual needs.

If you only want to start the detection service or the recognition service, execute the corresponding script by referring to the following table. Specify whether the CPU or the GPU is used in the start command parameters.

| task | standard | fast |
| ---- | ----------------- | ------------------- |
| detection | det_web_server.py | det_local_server.py |
| recognition | rec_web_server.py | rec_local_server.py |

More info can be found in [Paddle Serving](https://github.com/PaddlePaddle/Serving).
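
As a small illustration of the returned result format described above, the following sketch decodes a JSON response string and prints the readable text lines in `res`. It assumes Python 3 and that the response body is a JSON string with the `result` / `res` structure shown in the example; `raw_response` and `readable_lines` are hypothetical names used only for this illustration and are not part of `ocr_web_client.py`.

```python
import json

# Hypothetical response body, built here to mimic the example in
# "Returned result format". json.dumps escapes non-ASCII characters
# as \uXXXX by default, like the sample output.
raw_response = json.dumps(
    {"result": {"res": ["土地整治与土壤修复研究中心", "华南农业大学1素图"]}}
)


def readable_lines(response_text):
    """Return the recognized text lines from a JSON response string."""
    payload = json.loads(response_text)  # \uXXXX escapes become characters
    return payload["result"]["res"]


# Print one recognized line per row.
for line in readable_lines(raw_response):
    print(line)
```

If the actual service returns a different structure, only the key lookup in `readable_lines` would need to change.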