Merge pull request #923 from wangjiawei04/0.4_readme

[WIP] 0.4 index page readme

21e6f5fb · Jiawei Wang · wangjiawei04 · f2691e95 · 21e6f5fb · 21e6f5fb
9 changed files
--- a/README.md
+++ b/README.md
@@ -45,10 +45,11 @@ nvidia-docker exec -it test bash
 ```

 ```shell
-pip install paddle-serving-client==0.3.2 
-pip install paddle-serving-server==0.3.2 # CPU
-pip install paddle-serving-server-gpu==0.3.2.post9 # GPU with CUDA9.0
-pip install paddle-serving-server-gpu==0.3.2.post10 # GPU with CUDA10.0
+pip install paddle-serving-client==0.4.0 
+pip install paddle-serving-server==0.4.0 # CPU
+pip install paddle-serving-server-gpu==0.4.0.post9 # GPU with CUDA9.0
+pip install paddle-serving-server-gpu==0.4.0.post10 # GPU with CUDA10.0
+pip install paddle-serving-server-gpu==0.4.0.trt # GPU with CUDA10.1+TensorRT
 ```

 You may need to use a domestic mirror source (in China, you can use the Tsinghua mirror source, add `-i https://pypi.tuna.tsinghua.edu.cn/simple` to pip command) to speed up the download.
@@ -57,7 +58,7 @@ If you need install modules compiled with develop branch, please download packag

 Packages of paddle-serving-server and paddle-serving-server-gpu support CentOS 6/7, Ubuntu 16/18, Windows 10.

-Packages of paddle-serving-client and paddle-serving-app support Linux and Windows, but paddle-serving-client only support python2.7/3.6/3.7.
+Packages of paddle-serving-client and paddle-serving-app support Linux and Windows, but paddle-serving-client only supports Python 2.7/3.5/3.6/3.7.

 We recommend installing paddle >= 1.8.4.

@@ -113,11 +114,11 @@ tar -xzf uci_housing.tar.gz

 Paddle Serving provides HTTP-based and RPC-based services for users to access.

-### HTTP service
+### RPC service

-Paddle Serving provides a built-in python module called `paddle_serving_server.serve` that can start a RPC service or a http service with one-line command. If we specify the argument `--name uci`, it means that we will have a HTTP service with a url of `$IP:$PORT/uci/prediction`
+A user can also start an RPC service with `paddle_serving_server.serve`. An RPC service is usually faster than an HTTP service, although the user needs to write some code against Paddle Serving's Python client API. Note that we do not specify `--name` here.
 ``` shell
-python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292 --name uci
+python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
 ```
 <center>

@@ -125,39 +126,24 @@ python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --po
 |--------------|------|-----------|--------------------------------|
 | `thread` | int | `4` | Concurrency of current service |
 | `port` | int | `9292` | Exposed port of current service to users|
-| `name` | str | `""` | Service name, can be used to generate HTTP request url |
 | `model` | str | `""` | Path of paddle model directory to be served |
 | `mem_optim_off` | - | - | Disable memory / graphic memory optimization |
 | `ir_optim` | - | - | Enable analysis and optimization of calculation graph |
 | `use_mkl` (Only for cpu version) | - | - | Run inference with MKL |
 | `use_trt` (Only for trt version) | - | - | Run inference with TensorRT  |

-Here, we use `curl` to send a HTTP POST request to the service we just started. Users can use any python library to send HTTP POST as well, e.g, [requests](https://requests.readthedocs.io/en/master/).
 </center>
-
-``` shell
-curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
-```
-
-### RPC service
-
-A user can also start a RPC service with `paddle_serving_server.serve`. RPC service is usually faster than HTTP service, although a user needs to do some coding based on Paddle Serving's python client API. Note that we do not specify `--name` here. 
-``` shell
-python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
-```
-
 ``` python
 # A user can visit rpc service through paddle_serving_client API
 from paddle_serving_client import Client
-
+import numpy as np
 client = Client()
 client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
 client.connect(["127.0.0.1:9292"])
 data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
        -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
-fetch_map = client.predict(feed={"x": data}, fetch=["price"])
+fetch_map = client.predict(feed={"x": np.array(data).reshape(1,13,1)}, fetch=["price"])
 print(fetch_map)
-
 ```
 Here, the `client.predict` function takes two arguments. `feed` is a Python `dict` that maps model input variable alias names to values. `fetch` specifies the prediction variables to be returned from the server. In this example, the names `"x"` and `"price"` are assigned when the servable model is saved during training.

@@ -169,6 +155,40 @@ Here, `client.predict` function has two arguments. `feed` is a `python dict` wit
 - **Highly concurrent and efficient communication** between clients and servers supported.
 - **Multiple programming languages** supported on client side, such as Golang, C++ and python.

+### WEB service
+
+Users can also put the data format processing logic on the server side so that they can access the service directly with curl. Refer to the following example, located in `python/examples/fit_a_line`:
+
+```python
+from paddle_serving_server.web_service import WebService
+import numpy as np
+
+class UciService(WebService):
+    def preprocess(self, feed=[], fetch=[]):
+        feed_batch = []
+        is_batch = True
+        new_data = np.zeros((len(feed), 1, 13)).astype("float32")
+        for i, ins in enumerate(feed):
+            nums = np.array(ins["x"]).reshape(1, 1, 13)
+            new_data[i] = nums
+        feed = {"x": new_data}
+        return feed, fetch, is_batch
+
+uci_service = UciService(name="uci")
+uci_service.load_model_config("uci_housing_model")
+uci_service.prepare_server(workdir="workdir", port=9292)
+uci_service.run_rpc_service()
+uci_service.run_web_service()
+```
+On the client side:
+```
+curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
+```
+The response is:
+```
+{"result":{"price":[[18.901151657104492]]}}
+```
+
 <h2 align="center">Document</h2>

 ### New to Paddle Serving
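
Reviewer sketch: the curl request in the WEB service example can also be built from Python. The snippet below uses only the standard library (`urllib.request`) and assumes the `UciService` example is listening at `http://127.0.0.1:9292/uci/prediction`. The helper name `build_prediction_request` is illustrative and is not part of Paddle Serving.

```python
import json
import urllib.request

# Assumed endpoint: the UciService example, running locally on port 9292.
URL = "http://127.0.0.1:9292/uci/prediction"


def build_prediction_request(features, url=URL):
    """Build (but do not send) the same POST request that the curl example issues."""
    payload = {"feed": [{"x": features}], "fetch": ["price"]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
        -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
request = build_prediction_request(data)

# Sending needs a running server, so it is left commented out here:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

Building the request separately from sending it makes the payload easy to inspect before a server is available.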

--- a/README_CN.md
+++ b/README_CN.md
@@ -47,10 +47,11 @@ nvidia-docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/se
 nvidia-docker exec -it test bash
 ```
 ```shell
-pip install paddle-serving-client==0.3.2
-pip install paddle-serving-server==0.3.2 # CPU
-pip install paddle-serving-server-gpu==0.3.2.post9 # GPU with CUDA9.0
-pip install paddle-serving-server-gpu==0.3.2.post10 # GPU with CUDA10.0
+pip install paddle-serving-client==0.4.0
+pip install paddle-serving-server==0.4.0 # CPU
+pip install paddle-serving-server-gpu==0.4.0.post9 # GPU with CUDA9.0
+pip install paddle-serving-server-gpu==0.4.0.post10 # GPU with CUDA10.0
+pip install paddle-serving-server-gpu==0.4.0.trt # GPU with CUDA10.1+TensorRT
 ```

 You may need to use a domestic mirror source (for example the Tsinghua mirror; add `-i https://pypi.tuna.tsinghua.edu.cn/simple` to the pip command) to speed up the download.
@@ -107,13 +108,12 @@ tar -xzf uci_housing.tar.gz

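 Paddle Serving provides HTTP-based and RPC-based services for users.

+<h3 align="center">RPC service</h3>

-<h3 align="center">HTTP service</h3>
-
-Paddle Serving provides a built-in Python module called `paddle_serving_server.serve` that can start an RPC service or an HTTP service with a one-line command. If we specify the argument `--name uci`, we will have an HTTP service with the URL `$IP:$PORT/uci/prediction`.
+Users can also start an RPC service with `paddle_serving_server.serve`. Although users need to do some development against Paddle Serving's Python client API, an RPC service is usually faster than an HTTP service. Note that we do not specify `--name` here.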

 ``` shell
-python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292 --name uci
+python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
 ```
 <center>

@@ -128,21 +128,10 @@ python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --po
 | `use_mkl` (Only for cpu version) | - | - | Run inference with MKL |
 | `use_trt` (Only for trt version) | - | - | Run inference with TensorRT  |

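-We use the `curl` command to send an HTTP POST request to the service we just started. Users can also call a Python library to send HTTP POST requests; see the English documentation for [requests](https://requests.readthedocs.io/en/master/).
+We use the `curl` command to send an HTTP POST request to the service we just started. Users can also call a Python library to send HTTP POST requests; see the English
+documentation for [requests](https://requests.readthedocs.io/en/master/).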
 </center>

-``` shell
-curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
-```
-
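-<h3 align="center">RPC service</h3>
-
-Users can also start an RPC service with `paddle_serving_server.serve`. Although users need to do some development against Paddle Serving's Python client API, an RPC service is usually faster than an HTTP service. Note that we do not specify `--name` here.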
-
-``` shell
-python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
-```
-
 ``` python
 # A user can visit rpc service through paddle_serving_client API
 from paddle_serving_client import Client
@@ -152,12 +141,45 @@ client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
 client.connect(["127.0.0.1:9292"])
 data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
        -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
-fetch_map = client.predict(feed={"x": data}, fetch=["price"])
+fetch_map = client.predict(feed={"x": np.array(data).reshape(1,13,1)}, fetch=["price"])
 print(fetch_map)

 ```
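 Here, the `client.predict` function takes two arguments. `feed` is a Python `dict` that maps model input variable alias names to values. `fetch` specifies the prediction variables to be returned from the server. In this example, the tensor names `"x"` and `"price"` are assigned when the servable model is saved during training.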

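+<h3 align="center">HTTP service</h3>
+Users can also put the data format processing logic on the server side so that the service can be accessed directly with curl. See the following example in the directory `python/examples/fit_a_line`: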
+
+```python
+from paddle_serving_server.web_service import WebService
+import numpy as np
+
+class UciService(WebService):
+    def preprocess(self, feed=[], fetch=[]):
+        feed_batch = []
+        is_batch = True
+        new_data = np.zeros((len(feed), 1, 13)).astype("float32")
+        for i, ins in enumerate(feed):
+            nums = np.array(ins["x"]).reshape(1, 1, 13)
+            new_data[i] = nums
+        feed = {"x": new_data}
+        return feed, fetch, is_batch
+
+uci_service = UciService(name="uci")
+uci_service.load_model_config("uci_housing_model")
+uci_service.prepare_server(workdir="workdir", port=9292)
+uci_service.run_rpc_service()
+uci_service.run_web_service()
+```
+On the client side:
+```
+curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
+```
+The response is:
+```
+{"result":{"price":[[18.901151657104492]]}}
+```
+
 <h2 align="center">Key Features of Paddle Serving</h2>

 - Tightly integrated with Paddle training; the vast majority of Paddle models can be **deployed with one command**.

--- a/doc/WINDOWS_TUTORIAL.md
+++ b/doc/WINDOWS_TUTORIAL.md
@@ -42,7 +42,7 @@ python -m paddle_serving_app.package --get_model ocr_rec
 tar -xzvf ocr_rec.tar.gz
 python -m paddle_serving_app.package --get_model ocr_det
 tar -xzvf ocr_det.tar.gz
-python ocr_debugger_server.py &
+python ocr_debugger_server.py cpu &
 python ocr_web_client.py
 ```


--- a/doc/WINDOWS_TUTORIAL_CN.md
+++ b/doc/WINDOWS_TUTORIAL_CN.md
@@ -42,7 +42,7 @@ python -m paddle_serving_app.package --get_model ocr_rec
 tar -xzvf ocr_rec.tar.gz
 python -m paddle_serving_app.package --get_model ocr_det
 tar -xzvf ocr_det.tar.gz
-python ocr_debugger_server.py &
+python ocr_debugger_server.py cpu &
 python ocr_web_client.py
 ```


--- a/python/examples/fit_a_line/README.md
+++ b/python/examples/fit_a_line/README.md
@@ -14,12 +14,6 @@ sh get_data.sh

 ### Start server

-``` shell
-python test_server.py uci_housing_model/
-```
-
-You can also start the default RPC service with the following line of code:
-
 ```shell
 python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9393
 ```
@@ -40,7 +34,7 @@ python test_client.py uci_housing_client/serving_client_conf.prototxt

 Start a web service with default web service hosting modules:
 ``` shell
-python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9393 --name uci
+python test_server.py
 ```

 ### Client prediction

--- a/python/examples/fit_a_line/README_CN.md
+++ b/python/examples/fit_a_line/README_CN.md
@@ -41,7 +41,7 @@ python test_client.py uci_housing_client/serving_client_conf.prototxt
 Start the default web service with the following line of code:

 ``` shell
-python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9393 --name uci
+python test_server.py
 ```

 ### Client prediction

--- a/python/examples/fit_a_line/test_client.py
+++ b/python/examples/fit_a_line/test_client.py
@@ -15,6 +15,7 @@

 from paddle_serving_client import Client
 import sys
+import numpy as np

 client = Client()
 client.load_client_config(sys.argv[1])
@@ -27,7 +28,6 @@ test_reader = paddle.batch(
    batch_size=1)

 for data in test_reader():
-    import numpy as np
    new_data = np.zeros((1, 1, 13)).astype("float32")
    new_data[0] = data[0][0]
    fetch_map = client.predict(

--- a/python/examples/fit_a_line/test_server.py
+++ b/python/examples/fit_a_line/test_server.py
@@ -13,24 +13,24 @@
 # limitations under the License.
 # pylint: disable=doc-string-missing

-import os
-import sys
-from paddle_serving_server import OpMaker
-from paddle_serving_server import OpSeqMaker
-from paddle_serving_server import Server
+from paddle_serving_server.web_service import WebService
+import numpy as np

-op_maker = OpMaker()
-read_op = op_maker.create('general_reader')
-general_infer_op = op_maker.create('general_infer')
-response_op = op_maker.create('general_response')

-op_seq_maker = OpSeqMaker()
-op_seq_maker.add_op(read_op)
-op_seq_maker.add_op(general_infer_op)
-op_seq_maker.add_op(response_op)
+class UciService(WebService):
+    def preprocess(self, feed=[], fetch=[]):
+        feed_batch = []
+        is_batch = True
+        new_data = np.zeros((len(feed), 1, 13)).astype("float32")
+        for i, ins in enumerate(feed):
+            nums = np.array(ins["x"]).reshape(1, 1, 13)
+            new_data[i] = nums
+        feed = {"x": new_data}
+        return feed, fetch, is_batch

-server = Server()
-server.set_op_sequence(op_seq_maker.get_op_sequence())
-server.load_model_config(sys.argv[1])
-server.prepare_server(workdir="work_dir1", port=9393, device="cpu")
-server.run_server()
+
+uci_service = UciService(name="uci")
+uci_service.load_model_config("uci_housing_model")
+uci_service.prepare_server(workdir="workdir", port=9292)
+uci_service.run_rpc_service()
+uci_service.run_web_service()
--- a/python/paddle_serving_server_gpu/__init__.py
+++ b/python/paddle_serving_server_gpu/__init__.py
@@ -25,7 +25,9 @@ from .version import serving_server_version
 from contextlib import closing
 import argparse
 import collections
-import fcntl
+import sys
+if not sys.platform.startswith('win'):
+    import fcntl
 import shutil
 import numpy as np
 import grpc
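
Reviewer sketch: the guarded import above keeps the module importable on Windows, where `fcntl` does not exist, but any later `fcntl` call still needs a guard. One possible pattern, assuming the module is used for file locking, is shown below. The helper `try_lock` is hypothetical and is not taken from the package.

```python
import sys
import tempfile

# fcntl is a POSIX-only module, so import it only where it exists.
if not sys.platform.startswith("win"):
    import fcntl
else:
    fcntl = None  # Callers must check for None before locking on Windows.


def try_lock(file_obj):
    """Take an exclusive, non-blocking lock; return False where fcntl is unavailable."""
    if fcntl is None:
        return False
    fcntl.flock(file_obj, fcntl.LOCK_EX | fcntl.LOCK_NB)
    return True


# Usage: lock a scratch file; the lock is released when the file is closed.
with tempfile.TemporaryFile() as scratch:
    locked = try_lock(scratch)
```

Binding `fcntl = None` on Windows gives every call site a single, explicit check instead of repeating the platform test.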