Unverified commit 4b431359, authored by MRXLT, committed by GitHub

Merge branch 'develop' into 0.3.0-bug-fix

......@@ -88,7 +88,7 @@ with open('processed.data') as f:
In the code, the function `client.add_variant(tag, clusters, variant_weight)` is to add a variant with label `tag` and flow weight `variant_weight`. In this example, a BOW variant with label of `bow` and flow weight of `10`, and an LSTM variant with label of `lstm` and a flow weight of `90` are added. The flow on the client side will be distributed to two variants according to the ratio of `10:90`.
When making prediction on the client side, if the parameter `need_variant_tag=True` is specified, the response will contains the variant tag corresponding to the distribution flow.
When making prediction on the client side, if the parameter `need_variant_tag=True` is specified, the response will contain the variant tag corresponding to the distribution flow.
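As an illustration of what a `10:90` flow split means, the following is a minimal sketch of weighted routing. It is not Paddle Serving's implementation: the helper `split_traffic` and its parameters are hypothetical, and it only assumes that each request is assigned to a variant with probability proportional to that variant's weight.

```python
import random
from collections import Counter


def split_traffic(variant_weights, num_requests, seed=0):
    """Assign each request to a variant tag, with probability
    proportional to the variant's weight (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    tags = list(variant_weights)
    weights = [variant_weights[tag] for tag in tags]
    picks = rng.choices(tags, weights=weights, k=num_requests)
    return Counter(picks)


# Same labels and weights as in the text: bow=10, lstm=90.
counts = split_traffic({"bow": 10, "lstm": 90}, num_requests=10000)
print(counts)
```

Over many requests, roughly one in ten should be tagged `bow` and the rest `lstm`; the exact counts vary with the random seed.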
### Expected Results
......
......@@ -2,9 +2,9 @@
([简体中文](./PERFORMANCE_OPTIM_CN.md)|English)
Due to different model structures, different prediction services consume different computing resources when performing predictions. For online prediction services, models that require less computing resources will have a higher proportion of communication time cost, which is called communication-intensive service. Models that require more computing resources have a higher time cost for inference calculations, which is called computationa-intensive services.
Due to different model structures, different prediction services consume different computing resources when performing predictions. For online prediction services, models that require less computing resources will have a higher proportion of communication time cost, which is called communication-intensive service. Models that require more computing resources have a higher time cost for inference calculations, which is called computation-intensive services.
For a prediction service, the easiest way to determine what type it is is to look at the time ratio. Paddle Serving provides [Timeline tool](../python/examples/util/README_CN.md), which can intuitively display the time spent in each stage of the prediction service.
For a prediction service, the easiest way to determine the type of service is to look at the time ratio. Paddle Serving provides [Timeline tool](../python/examples/util/README_CN.md), which can intuitively display the time spent in each stage of the prediction service.
For communication-intensive prediction services, requests can be aggregated, and within a limit that can tolerate delay, multiple prediction requests can be combined into a batch for prediction.
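The aggregation idea can be sketched as follows. This is only an illustration under simplifying assumptions: the helper `make_batches` is hypothetical, it groups requests by size alone, and it does not model the delay limit, which a real service would also have to enforce (for example with a timeout).

```python
def make_batches(requests, max_batch_size):
    """Group pending requests into lists of at most max_batch_size items
    (hypothetical helper; size-based only, no delay limit)."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [
        requests[i:i + max_batch_size]
        for i in range(0, len(requests), max_batch_size)
    ]


# Five pending requests combined into batches of up to two.
pending = ["req-0", "req-1", "req-2", "req-3", "req-4"]
batches = make_batches(pending, max_batch_size=2)
print(batches)
```

Each batch would then be sent as a single prediction call, so the communication cost is paid once per batch rather than once per request.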
......
......@@ -34,7 +34,7 @@ for line in sys.stdin:
## Export from saved model files
If you have saved model files using Paddle's `save_inference_model` API, you can use Paddle Serving's `inference_model_to_serving` API to convert them into model files that can be used for Paddle Serving.
```
```python
import paddle_serving_client.io as serving_io
serving_io.inference_model_to_serving(dirname, serving_server="serving_server", serving_client="serving_client", model_filename=None, params_filename=None)
```
......
......@@ -35,7 +35,7 @@ for line in sys.stdin:
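## Export from saved model files
If you have already saved an inference model with Paddle's `save_inference_model` API, you can use Paddle Serving's `inference_model_to_serving` API to convert it into model files that can be used for Paddle Serving.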
```
```python
import paddle_serving_client.io as serving_io
serving_io.inference_model_to_serving(dirname, serving_server="serving_server", serving_client="serving_client", model_filename=None, params_filename=None)
```
......
......@@ -18,7 +18,7 @@ http://10.127.3.150:9393/uci/prediction
Here you will be prompted that the HTTP service started is in development mode and cannot be used for production deployment.
The prediction service started by Flask is not stable enough to withstand the concurrency of a large number of requests. In the actual deployment process, WSGI (Web Server Gateway Interface) is used.
Next, we will show how to use the [uWSGI] (https://github.com/unbit/uwsgi) module to deploy HTTP prediction services for production environments.
Next, we will show how to use the [uWSGI](https://github.com/unbit/uwsgi) module to deploy HTTP prediction services for production environments.
```python
......
......@@ -21,7 +21,7 @@ python -m paddle_serving_app.package --list_model
python -m paddle_serving_app.package --get_model senta_bilstm
```
10 pre-trained models are built into paddle_serving_app, covering 6 kinds of prediction tasks.
11 pre-trained models are built into paddle_serving_app, covering 6 kinds of prediction tasks.
The model files can be directly used for deployment, and the `--tutorial` argument can be added to obtain the deployment method.
| Prediction task | Model name |
......@@ -30,7 +30,7 @@ The model files can be directly used for deployment, and the `--tutorial` argume
| SemanticRepresentation | 'ernie' |
| ChineseWordSegmentation | 'lac' |
| ObjectDetection | 'faster_rcnn' |
| ImageSegmentation | 'unet', 'deeplabv3' |
| ImageSegmentation | 'unet', 'deeplabv3', 'deeplabv3+cityscapes' |
| ImageClassification | 'resnet_v2_50_imagenet', 'mobilenet_v2_imagenet' |
## Data preprocess API
......@@ -38,7 +38,8 @@ The model files can be directly used for deployment, and the `--tutorial` argume
paddle_serving_app provides a variety of data preprocessing methods for prediction tasks in the field of CV and NLP.
- class ChineseBertReader
Preprocessing for Chinese semantic representation task.
- `__init__(vocab_file, max_seq_len=20)`
......@@ -54,7 +55,8 @@ Preprocessing for Chinese semantic representation task.
[example](../examples/bert/bert_client.py)
- class LACReader
Preprocessing for Chinese word segmentation task.
- `__init__(dict_floder)`
......@@ -65,7 +67,7 @@ Preprocessing for Chinese word segmentation task.
- words(str): Original text input.
- crf_decode(np.array): CRF code predicted by the model.
[example](../examples/bert/lac_web_service.py)
[example](../examples/lac/lac_web_service.py)
- class SentaReader
......
......@@ -20,7 +20,7 @@ python -m paddle_serving_app.package --list_model
python -m paddle_serving_app.package --get_model senta_bilstm
```
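paddle_serving_app has 10 built-in pre-trained models covering 6 kinds of prediction tasks. The downloaded model files can be used directly for deployment, and adding the `--tutorial` argument shows the corresponding deployment method.
paddle_serving_app has 11 built-in pre-trained models covering 6 kinds of prediction tasks. The downloaded model files can be used directly for deployment, and adding the `--tutorial` argument shows the corresponding deployment method.
| Prediction task type | Model name |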
| ------------ | ------------------------------------------------ |
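......@@ -36,7 +36,7 @@ paddle_serving_app has 10 built-in pre-trained models covering 6 kinds of prediction tasks
paddle_serving_app provides a variety of common data preprocessing methods for model tasks in the CV and NLP fields.
- class ChineseBertReader
Preprocessing for Chinese semantic understanding models.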
- `__init__(vocab_file, max_seq_len=20)`
......