paddle_serving_app provides a variety of data preprocessing methods for prediction tasks in the CV and NLP fields.
- class ChineseBertReader

  Preprocessing for Chinese semantic representation tasks.
  - `__init__(vocab_file, max_seq_len=20)`
    - vocab_file(str): Path of the dictionary file.
    - max_seq_len(int, optional): Length of each sample after processing. The excess part will be truncated, and the insufficient part will be padded with 0. Default 20.
  - `process(line)`
    - line(str): Text input.

  [example](../examples/bert/bert_client.py)
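  A minimal usage sketch (the vocabulary file path and input sentence are illustrative, not taken from the linked example):

  ```python
  # Hedged sketch: turn raw Chinese text into a BERT feed dict.
  # "vocab.txt" is an illustrative path to a BERT vocabulary file.
  from paddle_serving_app import ChineseBertReader

  reader = ChineseBertReader(vocab_file="vocab.txt", max_seq_len=128)
  # process() returns the feed fields expected by the model
  # (e.g. input ids, position ids, segment ids and input mask).
  feed_dict = reader.process("飞桨是一个深度学习框架")
  ```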
- class LACReader

  Preprocessing for Chinese word segmentation tasks.
  - `__init__(dict_folder)`
    - dict_folder(str): Path of the dictionary folder.
  - `process(sent)`
    - sent(str): Text input.
  - `parse_result(words, crf_decode)`
    - words(str): Original text input.
    - crf_decode(np.array): CRF decoding result predicted by the model.

  [example](../examples/lac/lac_web_service.py)
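  A minimal usage sketch (the dictionary folder path and the `crf_decode` fetch name are illustrative; the model call itself is elided):

  ```python
  # Hedged sketch: preprocess a sentence for LAC word segmentation.
  # "lac_dict" is an illustrative dictionary folder path.
  from paddle_serving_app import LACReader

  reader = LACReader("lac_dict")
  line = "我爱北京天安门"
  feed_ids = reader.process(line)  # token ids to feed to the model

  # After the model returns its CRF decoding result (e.g. in a client
  # fetch_map), recover the segmented words:
  # words = reader.parse_result(line, fetch_map["crf_decode"])
  ```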
- class SentaReader

  Preprocessing for Chinese sentiment analysis tasks.
  - `__init__(vocab_path)`
    - vocab_path(str): Path of the dictionary file.
  - `process(cols)`
    - cols(str): Word segmentation result.

  [example](../examples/senta/senta_web_service.py)
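  A minimal usage sketch (the vocabulary path and the pre-segmented input are illustrative):

  ```python
  # Hedged sketch: convert a word segmentation result into model inputs.
  # "senta_vocab.txt" is an illustrative vocabulary path; the input is
  # the kind of segmentation result produced by LACReader.parse_result.
  from paddle_serving_app import SentaReader

  reader = SentaReader(vocab_path="senta_vocab.txt")
  feed_ids = reader.process("我 爱 北京 天安门")
  ```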
- Image preprocessing is more flexible than the methods above and can be composed from the following classes (see the composition sketch after the class list). [example](../examples/imagenet/image_rpc_client.py)
- class Sequential
  - `__init__(transforms)`
    - transforms(list): List of image preprocessing classes.
  - `__call__(img)`
    - img: Input of the image preprocessing pipeline. Its data type depends on the first preprocessing method in transforms.
- class Resize
  - `__init__(size)`
    - size(list/int): Expected image size. When a list is given, it must contain the expected height and width. When an int is given, the short side is set to size and the long side is scaled proportionally.
  - `__call__(img)`
    - img(numpy array): Image data.
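A composition sketch following the linked imagenet example. `File2Image`, `CenterCrop`, `RGB2BGR`, `Transpose`, `Div` and `Normalize` are assumed from that example and are not documented above:

```python
# Hedged sketch: chain image transforms with Sequential, mirroring the
# imagenet example. All classes besides Sequential and Resize are
# assumptions taken from that example.
from paddle_serving_app.reader import (Sequential, File2Image, Resize,
                                       CenterCrop, RGB2BGR, Transpose,
                                       Div, Normalize)

seq = Sequential([
    File2Image(),           # read an image file into a numpy array
    Resize(256),            # short side -> 256, long side scaled
    CenterCrop(224),        # crop the central 224x224 region
    RGB2BGR(),
    Transpose((2, 0, 1)),   # HWC -> CHW
    Div(255),               # scale pixel values to [0, 1]
    Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225], True),
])
img = seq("daisy.jpg")      # "daisy.jpg" is an illustrative file name
```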
## Timeline tools
The Timeline tool visualizes the start and end time of each stage of a prediction service, such as data preparation, client-side waiting, and the server-side ops.
It makes it easy to analyze what proportion of time each stage takes, so the prediction service can be optimized in a targeted manner.
### How to use
1. Before making predictions on the client side, turn on the timeline function for each stage of the Paddle Serving framework through environment variables. Timeline information will then be printed to the log.
```shell
export FLAGS_profile_client=1 # Turn on timeline function of client
export FLAGS_profile_server=1 # Turn on timeline function of server
```
2. Perform predictions and redirect the client-side log to a file, for example one named profile.
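For example (the client script name is illustrative):
```shell
python bert_client.py > profile 2>&1
```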
3. Export the information in the log file into a trace file.
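A sketch of this step, assuming the helper script shipped under `python/examples/util` in the Serving repository:
```shell
# convert the profile log into a chrome://tracing file named "trace"
python timeline_trace.py profile trace
```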
4. Open the `chrome://tracing/` URL in the Chrome browser.
Load the trace file generated in the previous step through the Load button; you can then visualize the time information of each stage of the prediction service.
The figure below shows the timeline of a GPU prediction service using the [bert example](https://github.com/PaddlePaddle/Serving/tree/develop/python/examples/bert).
The server starts the service on 4 GPU cards, the client starts 4 processes to send requests, and the batch size is 1.
In the figure, bert_pre represents the client-side data preprocessing stage, and client_infer represents the stage from the client sending the prediction request to receiving the result.
Each process row corresponds to one client process, and the second line of each process shows the timeline of the server-side ops.
![timeline](../../doc/timeline-example.png)
## Debug tools
The inference op of Paddle Serving is implemented based on the Paddle inference library.
Before deploying a prediction service, you may need to check its input and output or its resource consumption.
Therefore, a local prediction tool is built into paddle_serving_app, and it is used in the same way as sending a request to the server through the client.
Taking the [fit_a_line prediction service](../examples/fit_a_line) as an example, the following code can be used to run a local prediction.
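A minimal sketch, assuming the `Debugger` class from `paddle_serving_app.local_predict` and the uci_housing model downloaded by the fit_a_line example (class and variable names may differ across versions):

```python
# Hedged sketch: run a local prediction with the fit_a_line model.
# Assumes ./uci_housing_model was downloaded by the fit_a_line example
# and that the model feeds "x" and fetches "price".
import numpy as np
from paddle_serving_app.local_predict import Debugger

debugger = Debugger()
debugger.load_model_config("./uci_housing_model", gpu=False)  # CPU prediction

# One normalized sample from the UCI housing dataset (13 features).
data = np.array([0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
                 -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795,
                 -0.0332])

# Same feed/fetch interface as a remote client request.
fetch_map = debugger.predict(feed={"x": data}, fetch=["price"])
print(fetch_map["price"])
```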