Commit f49f2f3d, authored by TeslaZhao

Merge branch 'develop' of https://github.com/bjjwwang/serving into develop

@@ -59,7 +59,7 @@ message SimpleResponse { required int32 err_code = 1; }
 message GetClientConfigRequest {}
-message GetClientConfigResponse { repeated string client_config_str_list = 1; }
+message GetClientConfigResponse { required string client_config_str = 1; }
 service MultiLangGeneralModelService {
   rpc Inference(InferenceRequest) returns (InferenceResponse) {}
......
@@ -153,7 +153,7 @@ cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR/ \
     -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
     -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
     -DOPENCV_DIR=${OPENCV_DIR} \
-    -DWITH_OPENCV=ON
+    -DWITH_OPENCV=ON \
     -DSERVER=ON ..
 make -j10
 ```
......
@@ -152,7 +152,7 @@ cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR/ \
     -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
     -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
     -DOPENCV_DIR=${OPENCV_DIR} \
-    -DWITH_OPENCV=ON
+    -DWITH_OPENCV=ON \
     -DSERVER=ON ..
 make -j10
 ```
......
@@ -25,10 +25,10 @@ kubectl apply -f https://bit.ly/kong-ingress-dbless
 The script is located at `tools/generate_runtime_docker.sh`, and it is used as follows:
 ```bash
-bash tool/generate_runtime_docker.sh --env cuda10.1 --python 2.7 --serving 0.5.0 --paddle 2.0.0 --name serving_runtime:cuda10.1-py27
+bash tool/generate_runtime_docker.sh --env cuda10.1 --python 3.6 --serving 0.6.0 --paddle 2.0.1 --name serving_runtime:cuda10.1-py36
 ```
-This generates a runtime image with CUDA 10.1, Python 2.7, Serving 0.5.0, and Paddle 2.0.0. If you have further questions, run the following command for help information.
+This generates a runtime image with CUDA 10.1, Python 3.6, Serving 0.6.0, and Paddle 2.0.1. If you have further questions, run the following command for help information.
 ```
 bash tools/generate_runtime_docker.sh --help
@@ -39,7 +39,7 @@ bash tools/generate_runtime_docker.sh --help
 - paddle-serving-server, paddle-serving-client, paddle-serving-app, paddlepaddle; the exact versions can be checked in tools/runtime.dockerfile, and that file can also be customized if you have custom requirements.
 - the paddle-serving-server binary executable
-In other words, once the runtime image has been generated, we only need to move the code we run (if any) and the model into the image. The generated image is named `paddle_serving:cuda10.2-py37`.
+In other words, once the runtime image has been generated, we only need to move the code we run (if any) and the model into the image. The generated image is named `paddle_serving:cuda10.2-py36`.
 ### Add your code and model
@@ -50,8 +50,8 @@ bash tools/generate_runtime_docker.sh --help
 For pipeline mode, we need to make sure that the model, program files, configuration files, and other dependencies can all run inside the image, so we place our executables under `/home/project`. Here we take `Serving/python/example/pipeline/ocr`, an OCR text-recognition task, as the example.
 ```bash
-# Assume you already have a Serving runtime image named paddle_serving:cuda10.2-py37
-docker run --rm -dit --name pipeline_serving_demo paddle_serving:cuda10.2-py37 bash
+# Assume you already have a Serving runtime image named paddle_serving:cuda10.2-py36
+docker run --rm -dit --name pipeline_serving_demo paddle_serving:cuda10.2-py36 bash
 cd Serving/python/example/pipeline/ocr
 # get models
 python -m paddle_serving_app.package --get_model ocr_rec
@@ -71,7 +71,7 @@ docker commit pipeline_serving_demo ocr_serving:latest
 ```
 docker exec -it pipeline_serving_demo bash
 cd /home/ocr
-python3.7 web_service.py
+python3.6 web_service.py
 ```
 After entering the container and changing to the project directory, the remaining steps are the same as when debugging code normally.
@@ -83,8 +83,8 @@ python3.7 web_service.py
 Web service mode is essentially similar to pipeline mode, so we take `Serving/python/examples/bert` as the example.
 ```bash
-# Assume you already have a Serving runtime image named registry.baidubce.com/paddlepaddle/serving:0.6.0-cuda10.2-py37
-docker run --rm -dit --name webservice_serving_demo registry.baidubce.com/paddlepaddle/serving:0.6.0-cpu-py27 bash
+# Assume you already have a Serving runtime image named registry.baidubce.com/paddlepaddle/serving:0.6.0-cuda10.2-py36
+docker run --rm -dit --name webservice_serving_demo registry.baidubce.com/paddlepaddle/serving:0.6.0-cpu-py36 bash
 cd Serving/python/examples/bert
 ### download model
 wget https://paddle-serving.bj.bcebos.com/paddle_hub_models/text/SemanticModel/bert_chinese_L-12_H-768_A-12.tar.gz
@@ -102,7 +102,7 @@ docker commit webservice_serving_demo bert_serving:latest
 ```bash
 docker exec -it webservice_serving_demo bash
 cd /home/bert
-python3.7 bert_web_service.py 9292
+python3.6 bert_web_service.py bert_seq128_model 9292
 ```
 After entering the container and changing to the project directory, the remaining steps are the same as when debugging code normally.
@@ -118,14 +118,15 @@ Kubernetes cluster operations require `kubectl` to manipulate yaml files. Here we provide
 - pipeline OCR example
 ```bash
-sh tools/generate_k8s_yamls.sh --app_name ocr --image_name registry.baidubce.com/paddlepaddle/serving:k8s-pipeline-demo --workdir /home/ocr --command "python2.7 web_service.py" --port 9999
+sh tools/generate_k8s_yamls.sh --app_name ocr --image_name registry.baidubce.com/paddlepaddle/serving:k8s-pipeline-demo --workdir /home/ocr --command "python3.6 web_service.py" --port 9999
 ```
 - web service BERT example
 ```bash
-sh tools/generate_k8s_yamls.sh --app_name bert --image_name registry.baidubce.com/paddlepaddle/serving:k8s-web-demo --workdir /home/bert --command "python2.7 bert_web_service.py 9292" --port 9292
+sh tools/generate_k8s_yamls.sh --app_name bert --image_name registry.baidubce.com/paddlepaddle/serving:k8s-web-demo --workdir /home/bert --command "python3.6 bert_web_service.py bert_seq128_model 9292" --port 9292
 ```
+**Note that app_name must match the function name in the URL. For example, the access URL of bert in this example is `https://127.0.0.1:9292/bert/prediction`, so app_name should be bert.**
 Next we will see two yaml files, `k8s_serving.yaml` and `k8s_ingress.yaml`.
@@ -174,7 +175,7 @@ spec:
         workingDir: /home/ocr
         name: ocr
         command: ['/bin/bash', '-c']
-        args: ["python3.7 web_service.py"]
+        args: ["python3.6 bert_web_service.py bert_seq128_model 9292"]
         env:
         - name: NODE_NAME
           valueFrom:
@@ -216,7 +217,8 @@ spec:
 Finally, we can run the following to start the related containers and the API gateway.
 ```
-kubectl apply -f k8s_serving.yaml k8s_ingress.yaml
+kubectl apply -f k8s_serving.yaml
+kubectl apply -f k8s_ingress.yaml
 ```
 Then enter
......
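To make the app_name rule in the hunk above concrete: a request against the BERT web service targets the `/bert/prediction` path on the service port. A hedged sketch (the `fetch` field name is an assumption taken from the usual bert demo config; adjust it to your client config):

```bash
# Hypothetical request; host/port follow the bert example above, the fetch name is assumed
curl -H "Content-Type:application/json" -X POST \
     -d '{"feed":[{"words": "hello"}], "fetch":["pooled_output"]}' \
     http://127.0.0.1:9292/bert/prediction
```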
@@ -94,4 +94,7 @@ curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"words": "hello"}]
 bash benchmark.sh bert_seq128_model bert_seq128_client
 ```
 The log file of the performance test is profile_log_bert_seq128_model.
 To change the parameters of the performance test cases, edit the configuration in benchmark.sh.
+Note: do not append a '/' to the bert_seq128_model and bert_seq128_client paths; this example needs to run on a GPU machine.
@@ -17,27 +17,30 @@ sleep 5
 #warm up
 $PYTHONROOT/bin/python3 benchmark.py --thread 4 --batch_size 1 --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
-echo -e "import psutil\ncpu_utilization=psutil.cpu_percent(1,False)\nprint('CPU_UTILIZATION:', cpu_utilization)\n" > cpu_utilization.py
+echo -e "import psutil\nimport time\nwhile True:\n\tcpu_res = psutil.cpu_percent()\n\twith open('cpu.txt', 'a+') as f:\n\t\tf.write(f'{cpu_res}\\\n')\n\ttime.sleep(0.1)" > cpu.py
 for thread_num in 1 4 8 16
 do
 for batch_size in 1 4 16 64
 do
 job_bt=`date '+%Y%m%d%H%M%S'`
-nvidia-smi --id=0 --query-compute-apps=used_memory --format=csv -lms 100 > gpu_use.log 2>&1 &
+nvidia-smi --id=0 --query-compute-apps=used_memory --format=csv -lms 100 > gpu_memory_use.log 2>&1 &
 nvidia-smi --id=0 --query-gpu=utilization.gpu --format=csv -lms 100 > gpu_utilization.log 2>&1 &
+rm -rf cpu.txt
+$PYTHONROOT/bin/python3 cpu.py &
 gpu_memory_pid=$!
 $PYTHONROOT/bin/python3 benchmark.py --thread $thread_num --batch_size $batch_size --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
-kill ${gpu_memory_pid}
-kill `ps -ef|grep used_memory|awk '{print $2}'`
+kill `ps -ef|grep used_memory|awk '{print $2}'` > /dev/null
+kill `ps -ef|grep utilization.gpu|awk '{print $2}'` > /dev/null
+kill `ps -ef|grep cpu.py|awk '{print $2}'` > /dev/null
 echo "model_name:" $1
 echo "thread_num:" $thread_num
 echo "batch_size:" $batch_size
 echo "=================Done===================="
 echo "model_name:$1" >> profile_log_$1
 echo "batch_size:$batch_size" >> profile_log_$1
-$PYTHONROOT/bin/python3 cpu_utilization.py >> profile_log_$1
 job_et=`date '+%Y%m%d%H%M%S'`
-awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "MAX_GPU_MEMORY:", max}' gpu_use.log >> profile_log_$1
+awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "CPU_UTILIZATION:", max}' cpu.txt >> profile_log_$1
+awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "MAX_GPU_MEMORY:", max}' gpu_memory_use.log >> profile_log_$1
 awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "GPU_UTILIZATION:", max}' gpu_utilization.log >> profile_log_$1
 rm -rf gpu_use.log gpu_utilization.log
 $PYTHONROOT/bin/python3 ../util/show_profile.py profile $thread_num >> profile_log_$1
......
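For readability, the Python one-liner that the new `echo -e` in the hunk above writes to `cpu.py` expands to roughly the following sampler; it appends one system-wide CPU-utilization reading per line to `cpu.txt` every 0.1 s, and the awk line later reduces that file to its maximum:

```python
# Expanded form of the cpu.py generated by the echo -e line above (sketch)
import time

import psutil

while True:
    cpu_res = psutil.cpu_percent()      # system-wide CPU utilization since the previous call
    with open('cpu.txt', 'a+') as f:
        f.write(f'{cpu_res}\n')         # one sample per line
    time.sleep(0.1)
```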
@@ -49,4 +49,7 @@ curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1
 bash benchmark.sh uci_housing_model uci_housing_client
 ```
 The log file of the performance test is profile_log_uci_housing_model.
 To change the parameters of the performance test cases, edit the configuration in benchmark.sh.
+Note: do not append a '/' to the uci_housing_model and uci_housing_client paths; this example needs to run on a GPU machine.
@@ -30,6 +30,7 @@ def single_func(idx, resource):
             paddle.dataset.uci_housing.train(), buf_size=500),
         batch_size=1)
     total_number = sum(1 for _ in train_reader())
+    latency_list = []
     if args.request == "rpc":
         client = Client()
@@ -37,9 +38,12 @@ def single_func(idx, resource):
         client.connect([args.endpoint])
         start = time.time()
         for data in train_reader():
+            l_start = time.time()
             fetch_map = client.predict(feed={"x": data[0][0]}, fetch=["price"])
+            l_end = time.time()
+            latency_list.append(l_end * 1000 - l_start * 1000)
         end = time.time()
-        return [[end - start], [total_number]]
+        return [[end - start], latency_list, [total_number]]
     elif args.request == "http":
         train_reader = paddle.batch(
             paddle.reader.shuffle(
@@ -47,11 +51,14 @@ def single_func(idx, resource):
             batch_size=1)
         start = time.time()
         for data in train_reader():
+            l_start = time.time()
             r = requests.post(
                 'http://{}/uci/prediction'.format(args.endpoint),
                 data={"x": data[0]})
+            l_end = time.time()
+            latency_list.append(l_end * 1000 - l_start * 1000)
         end = time.time()
-        return [[end - start], [total_number]]
+        return [[end - start], latency_list, [total_number]]
     start = time.time()
......
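Each element of the `latency_list` returned above is one per-request latency in milliseconds; the benchmark driver that calls `single_func` presumably merges the lists from all threads. Purely as an illustration (a hypothetical helper, not part of this diff), percentiles could be derived from such a merged list like this:

```python
# Hypothetical post-processing of a merged latency_list (illustrative only)
import numpy as np

def summarize_latency(latency_ms):
    """Return mean and tail percentiles for a list of per-request latencies in ms."""
    arr = np.asarray(latency_ms, dtype=float)
    return {
        "mean_ms": float(arr.mean()),
        "p90_ms": float(np.percentile(arr, 90)),
        "p99_ms": float(np.percentile(arr, 99)),
    }
```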
rm profile_log*
export CUDA_VISIBLE_DEVICES=0,1
export FLAGS_profile_server=1
export FLAGS_profile_client=1
export FLAGS_serving_latency=1
gpu_id=0
#save cpu and gpu utilization log
if [ -d utilization ];then
rm -rf utilization
else
mkdir utilization
fi
#start server
$PYTHONROOT/bin/python3 -m paddle_serving_server.serve --model $1 --port 9292 --thread 4 --gpu_ids 0,1 --mem_optim --ir_optim > elog 2>&1 &
sleep 5
#warm up
$PYTHONROOT/bin/python3 benchmark.py --thread 4 --batch_size 1 --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
echo -e "import psutil\nimport time\nwhile True:\n\tcpu_res = psutil.cpu_percent()\n\twith open('cpu.txt', 'a+') as f:\n\t\tf.write(f'{cpu_res}\\\n')\n\ttime.sleep(0.1)" > cpu.py
for thread_num in 1 4 8 16
do
for batch_size in 1 4 16 64
do
job_bt=`date '+%Y%m%d%H%M%S'`
nvidia-smi --id=0 --query-compute-apps=used_memory --format=csv -lms 100 > gpu_memory_use.log 2>&1 &
nvidia-smi --id=0 --query-gpu=utilization.gpu --format=csv -lms 100 > gpu_utilization.log 2>&1 &
rm -rf cpu.txt
$PYTHONROOT/bin/python3 cpu.py &
gpu_memory_pid=$!
$PYTHONROOT/bin/python3 benchmark.py --thread $thread_num --batch_size $batch_size --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
kill `ps -ef|grep used_memory|awk '{print $2}'` > /dev/null
kill `ps -ef|grep utilization.gpu|awk '{print $2}'` > /dev/null
kill `ps -ef|grep cpu.py|awk '{print $2}'` > /dev/null
echo "model_name:" $1
echo "thread_num:" $thread_num
echo "batch_size:" $batch_size
echo "=================Done===================="
echo "model_name:$1" >> profile_log_$1
echo "batch_size:$batch_size" >> profile_log_$1
job_et=`date '+%Y%m%d%H%M%S'`
awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "CPU_UTILIZATION:", max}' cpu.txt >> profile_log_$1
awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "MAX_GPU_MEMORY:", max}' gpu_memory_use.log >> profile_log_$1
awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "GPU_UTILIZATION:", max}' gpu_utilization.log >> profile_log_$1
rm -rf gpu_use.log gpu_utilization.log
$PYTHONROOT/bin/python3 ../util/show_profile.py profile $thread_num >> profile_log_$1
tail -n 8 profile >> profile_log_$1
echo "" >> profile_log_$1
done
done
#Divided log
awk 'BEGIN{RS="\n\n"}{i++}{print > "bert_log_"i}' profile_log_$1
mkdir bert_log && mv bert_log_* bert_log
ps -ef|grep 'serving'|grep -v grep|cut -c 9-15 | xargs kill -9
@@ -554,15 +554,8 @@ class MultiLangClient(object):
         get_client_config_req = multi_lang_general_model_service_pb2.GetClientConfigRequest(
         )
         resp = self.stub_.GetClientConfig(get_client_config_req)
-        model_config_path_list = resp.client_config_str_list
-        file_path_list = []
-        for single_model_config in model_config_path_list:
-            if os.path.isdir(single_model_config):
-                file_path_list.append("{}/serving_server_conf.prototxt".format(
-                    single_model_config))
-            elif os.path.isfile(single_model_config):
-                file_path_list.append(single_model_config)
-        self._parse_model_config(file_path_list)
+        model_config_str = resp.client_config_str
+        self._parse_model_config(model_config_str)
     def _flatten_list(self, nested_list):
         for item in nested_list:
@@ -572,23 +565,10 @@ class MultiLangClient(object):
             else:
                 yield item
-    def _parse_model_config(self, model_config_path_list):
-        if isinstance(model_config_path_list, str):
-            model_config_path_list = [model_config_path_list]
-        elif isinstance(model_config_path_list, list):
-            pass
-        file_path_list = []
-        for single_model_config in model_config_path_list:
-            if os.path.isdir(single_model_config):
-                file_path_list.append("{}/serving_client_conf.prototxt".format(
-                    single_model_config))
-            elif os.path.isfile(single_model_config):
-                file_path_list.append(single_model_config)
+    def _parse_model_config(self, model_config_str):
         model_conf = m_config.GeneralModelConfig()
-        f = open(file_path_list[0], 'r')
-        model_conf = google.protobuf.text_format.Merge(
-            str(f.read()), model_conf)
+        model_conf = google.protobuf.text_format.Merge(model_config_str,
+                                                       model_conf)
         self.feed_names_ = [var.alias_name for var in model_conf.feed_var]
         self.feed_types_ = {}
         self.feed_shapes_ = {}
@@ -598,11 +578,6 @@ class MultiLangClient(object):
             self.feed_shapes_[var.alias_name] = var.shape
             if var.is_lod_tensor:
                 self.lod_tensor_set_.add(var.alias_name)
-        if len(file_path_list) > 1:
-            model_conf = m_config.GeneralModelConfig()
-            f = open(file_path_list[-1], 'r')
-            model_conf = google.protobuf.text_format.Merge(
-                str(f.read()), model_conf)
         self.fetch_names_ = [var.alias_name for var in model_conf.fetch_var]
         self.fetch_types_ = {}
         for i, var in enumerate(model_conf.fetch_var):
......
@@ -198,5 +198,14 @@ class MultiLangServerServiceServicer(multi_lang_general_model_service_pb2_grpc.
         #model_config_path_list is list right now.
         #dict should be added when graphMaker is used.
         resp = multi_lang_general_model_service_pb2.GetClientConfigResponse()
-        resp.client_config_str_list[:] = self.model_config_path_list
+        model_config_str = []
+        for single_model_config in self.model_config_path_list:
+            if os.path.isdir(single_model_config):
+                with open("{}/serving_server_conf.prototxt".format(
+                        single_model_config)) as f:
+                    model_config_str.append(str(f.read()))
+            elif os.path.isfile(single_model_config):
+                with open(single_model_config) as f:
+                    model_config_str.append(str(f.read()))
+        resp.client_config_str = model_config_str[0]
         return resp
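Taken together with the proto change at the top of this commit, these two Python hunks swap a path-based exchange for a content-based one: the server now reads the prototxt file itself and returns its text in `client_config_str`, so the client no longer needs filesystem access to the server's config. A condensed client-side sketch of the new exchange (names taken from the hunks above; error handling omitted):

```python
# Client-side view of the new GetClientConfig exchange (sketch based on the hunks above)
import google.protobuf.text_format

req = multi_lang_general_model_service_pb2.GetClientConfigRequest()
resp = stub.GetClientConfig(req)   # resp.client_config_str holds prototxt *content*, not a path
model_conf = m_config.GeneralModelConfig()
google.protobuf.text_format.Merge(resp.client_config_str, model_conf)
feed_names = [var.alias_name for var in model_conf.feed_var]
fetch_names = [var.alias_name for var in model_conf.fetch_var]
```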
@@ -104,7 +104,7 @@ ENV PATH=usr/local/go/bin:/root/go/bin:${PATH}
 # Downgrade TensorRT
 COPY tools/dockerfiles/build_scripts /build_scripts
-RUN bash /build_scripts/install_trt.sh
+RUN bash /build_scripts/install_trt.sh cuda10.1
 RUN rm -rf /build_scripts
 # git credential to skip password typing
@@ -132,9 +132,9 @@ RUN wget https://paddle-ci.gz.bcebos.com/ccache-3.7.9.tar.gz && \
     make -j8 && make install && \
     ln -s /usr/local/ccache-3.7.9/bin/ccache /usr/local/bin/ccache
-RUN python3.8 -m pip install --upgrade pip requests && \
-    python3.7 -m pip install --upgrade pip requests && \
-    python3.6 -m pip install --upgrade pip requests
+RUN python3.8 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.7 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.6 -m pip install --upgrade pip==21.1.1 requests
 RUN wget https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar && \
     tar xf centos_ssl.tar && rm -rf centos_ssl.tar && \
......
@@ -104,7 +104,7 @@ ENV PATH=usr/local/go/bin:/root/go/bin:${PATH}
 # Downgrade TensorRT
 COPY tools/dockerfiles/build_scripts /build_scripts
-RUN bash /build_scripts/install_trt.sh
+RUN bash /build_scripts/install_trt.sh cuda10.2
 RUN rm -rf /build_scripts
 # git credential to skip password typing
@@ -132,9 +132,9 @@ RUN wget https://paddle-ci.gz.bcebos.com/ccache-3.7.9.tar.gz && \
     make -j8 && make install && \
     ln -s /usr/local/ccache-3.7.9/bin/ccache /usr/local/bin/ccache
-RUN python3.8 -m pip install --upgrade pip requests && \
-    python3.7 -m pip install --upgrade pip requests && \
-    python3.6 -m pip install --upgrade pip requests
+RUN python3.8 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.7 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.6 -m pip install --upgrade pip==21.1.1 requests
 RUN wget https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar && \
     tar xf centos_ssl.tar && rm -rf centos_ssl.tar && \
......
 # A image for building paddle binaries
 # Use cuda devel base image for both cpu and gpu environment
 # When you modify it, please be aware of cudnn-runtime version
-FROM nvidia/cuda:11.2.0-cudnn8-devel-ubuntu16.04
+FROM nvidia/cuda:11.0.3-cudnn8-devel-ubuntu16.04
 MAINTAINER PaddlePaddle Authors <paddle-dev@baidu.com>
 # ENV variables
@@ -104,7 +104,7 @@ ENV PATH=usr/local/go/bin:/root/go/bin:${PATH}
 # Downgrade TensorRT
 COPY tools/dockerfiles/build_scripts /build_scripts
-RUN bash /build_scripts/install_trt.sh
+RUN bash /build_scripts/install_trt.sh cuda11
 RUN rm -rf /build_scripts
 # git credential to skip password typing
@@ -132,9 +132,9 @@ RUN wget https://paddle-ci.gz.bcebos.com/ccache-3.7.9.tar.gz && \
     make -j8 && make install && \
     ln -s /usr/local/ccache-3.7.9/bin/ccache /usr/local/bin/ccache
-RUN python3.8 -m pip install --upgrade pip requests && \
-    python3.7 -m pip install --upgrade pip requests && \
-    python3.6 -m pip install --upgrade pip requests
+RUN python3.8 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.7 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.6 -m pip install --upgrade pip==21.1.1 requests
 RUN wget https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar && \
     tar xf centos_ssl.tar && rm -rf centos_ssl.tar && \
......
@@ -132,9 +132,9 @@ RUN wget https://paddle-ci.gz.bcebos.com/ccache-3.7.9.tar.gz && \
     make -j8 && make install && \
     ln -s /usr/local/ccache-3.7.9/bin/ccache /usr/local/bin/ccache
-RUN python3.8 -m pip install --upgrade pip requests && \
-    python3.7 -m pip install --upgrade pip requests && \
-    python3.6 -m pip install --upgrade pip requests
+RUN python3.8 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.7 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.6 -m pip install --upgrade pip==21.1.1 requests
 RUN wget https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar && \
     tar xf centos_ssl.tar && rm -rf centos_ssl.tar && \
......
@@ -28,12 +28,12 @@ WORKDIR /home
 # install whl and bin
 WORKDIR /home
 COPY tools/dockerfiles/build_scripts /build_scripts
-RUN bash /build_scripts/install_whl.sh 0.5.0 2.0.0 <<run_env>> <<python_version>> && rm -rf /build_scripts
+RUN bash /build_scripts/install_whl.sh <<serving_version>> <<paddle_version>> <<run_env>> <<python_version>> && rm -rf /build_scripts
 # install tensorrt
 WORKDIR /home
 COPY tools/dockerfiles/build_scripts /build_scripts
-RUN bash /build_scripts/install_trt.sh && rm -rf /build_scripts
+RUN bash /build_scripts/install_trt.sh <<run_env>> && rm -rf /build_scripts
 # install go
 RUN wget -qO- https://dl.google.com/go/go1.14.linux-amd64.tar.gz | \
......
@@ -14,31 +14,21 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
-VERSION=$(nvcc --version | grep release | grep -oEi "release ([0-9]+)\.([0-9])"| sed "s/release //")
-if [[ "$VERSION" == "10.1" ]];then
+VERSION=$1
+if [[ "$VERSION" == "cuda10.1" ]];then
     wget -q https://paddle-ci.gz.bcebos.com/TRT/TensorRT6-cuda10.1-cudnn7.tar.gz --no-check-certificate
     tar -zxf TensorRT6-cuda10.1-cudnn7.tar.gz -C /usr/local
     cp -rf /usr/local/TensorRT6-cuda10.1-cudnn7/include/* /usr/include/ && cp -rf /usr/local/TensorRT6-cuda10.1-cudnn7/lib/* /usr/lib/
+    echo "cuda10.1 trt install ==============>>>>>>>>>>>>"
     rm TensorRT6-cuda10.1-cudnn7.tar.gz
-elif [[ "$VERSION" == "11.0" ]];then
+elif [[ "$VERSION" == "cuda11" ]];then
     wget -q https://paddle-ci.cdn.bcebos.com/TRT/TensorRT-7.1.3.4.Ubuntu-16.04.x86_64-gnu.cuda-11.0.cudnn8.0.tar.gz --no-check-certificate
     tar -zxf TensorRT-7.1.3.4.Ubuntu-16.04.x86_64-gnu.cuda-11.0.cudnn8.0.tar.gz -C /usr/local
     cp -rf /usr/local/TensorRT-7.1.3.4/include/* /usr/include/ && cp -rf /usr/local/TensorRT-7.1.3.4/lib/* /usr/lib/
     rm TensorRT-7.1.3.4.Ubuntu-16.04.x86_64-gnu.cuda-11.0.cudnn8.0.tar.gz
-elif [[ "$VERSION" == "10.2" ]];then
+elif [[ "$VERSION" == "cuda10.2" ]];then
     wget https://paddle-ci.gz.bcebos.com/TRT/TensorRT7-cuda10.2-cudnn8.tar.gz --no-check-certificate
     tar -zxf TensorRT7-cuda10.2-cudnn8.tar.gz -C /usr/local
     cp -rf /usr/local/TensorRT-7.1.3.4/include/* /usr/include/ && cp -rf /usr/local/TensorRT-7.1.3.4/lib/* /usr/lib/
     rm TensorRT7-cuda10.2-cudnn8.tar.gz
-elif [[ "$VERSION" == "10.0" ]];then
-    wget -q https://paddle-ci.gz.bcebos.com/TRT/TensorRT6-cuda10.0-cudnn7.tar.gz --no-check-certificate
-    tar -zxf TensorRT6-cuda10.0-cudnn7.tar.gz -C /usr/local
-    cp -rf /usr/local/TensorRT6-cuda10.0-cudnn7/include/* /usr/include/ && cp -rf /usr/local/TensorRT6-cuda10.0-cudnn7/lib/* /usr/lib/
-    rm TensorRT6-cuda10.0-cudnn7.tar.gz
-elif [[ "$VERSION" == "9.0" ]];then
-    wget -q https://paddle-ci.gz.bcebos.com/TRT/TensorRT6-cuda9.0-cudnn7.tar.gz --no-check-certificate
-    tar -zxf TensorRT6-cuda9.0-cudnn7.tar.gz -C /usr/local
-    cp -rf /usr/local/TensorRT6-cuda9.0-cudnn7/include/* /usr/include/ && cp -rf /usr/local/TensorRT6-cuda9.0-cudnn7/lib/* /usr/lib/
-    rm TensorRT6-cuda9.0-cudnn7.tar.gz
 fi
@@ -40,6 +40,9 @@ if [[ $SERVING_VERSION == "0.5.0" ]]; then
 elif [[ "$RUN_ENV" == "cuda10.2" ]];then
     server_release="paddle-serving-server-gpu==$SERVING_VERSION.post102"
     serving_bin="https://paddle-serving.bj.bcebos.com/bin/serving-gpu-102-${SERVING_VERSION}.tar.gz"
+elif [[ "$RUN_ENV" == "cuda11" ]];then
+    server_release="paddle-serving-server-gpu==$SERVING_VERSION.post11"
+    serving_bin="https://paddle-serving.bj.bcebos.com/bin/serving-gpu-11-${SERVING_VERSION}.tar.gz"
 fi
 client_release="paddle-serving-client==$SERVING_VERSION"
 app_release="paddle-serving-app==0.3.1"
......
@@ -60,8 +60,8 @@ function run
     echo "named arg: command: $start_command"
     echo "named arg: port: $port"
-    sed -e "s/<< APP_NAME >>/$app_name/g" -e "s/<< IMAGE_NAME >>/$(echo $image_name | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g" -e "s/<< WORKDIR >>/$(echo $workdir | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g" -e "s/<< COMMAND >>/\"$start_command\"/g" -e "s/<< PORT >>/$port/g" tools/k8s_serving.yaml_template > k8s_serving.yaml
-    sed -e "s/<< APP_NAME >>/$app_name/g" -e "s/<< IMAGE_NAME >>/$(echo $image_name | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g" -e "s/<< WORKDIR >>/$(echo $workdir | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g" -e "s/<< COMMAND >>/\"$start_command\"/g" -e "s/<< PORT >>/$port/g" tools/k8s_ingress.yaml_template > k8s_ingress.yaml
+    sed -e "s/<< APP_NAME >>/$app_name/g" -e "s/<< IMAGE_NAME >>/$(echo $image_name | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g" -e "s/<< WORKDIR >>/$(echo $workdir | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g" -e "s/<< COMMAND >>/\"$(echo $start_command | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')\"/g" -e "s/<< PORT >>/$port/g" tools/k8s_serving.yaml_template > k8s_serving.yaml
+    sed -e "s/<< APP_NAME >>/$app_name/g" -e "s/<< IMAGE_NAME >>/$(echo $image_name | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g" -e "s/<< WORKDIR >>/$(echo $workdir | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g" -e "s/<< COMMAND >>/\"$(echo $start_command | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')\"/g" -e "s/<< PORT >>/$port/g" tools/k8s_ingress.yaml_template > k8s_ingress.yaml
     echo "check k8s_serving.yaml and k8s_ingress.yaml please."
 }
......
@@ -66,6 +66,8 @@ function run
         base_image="nvidia\/cuda:10.1-cudnn7-runtime-ubuntu16.04"
     elif [ $env == "cuda10.2" ]; then
         base_image="nvidia\/cuda:10.2-cudnn8-runtime-ubuntu16.04"
+    elif [ $env == "cuda11" ]; then
+        base_image="nvidia\/cuda:11.0.3-cudnn8-runtime-ubuntu16.04"
     fi
     echo "base image: $base_image"
     echo "named arg: python: $python"
......
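With the cuda11 branch added above, the runtime-image script should accept the new environment name in the same way as the cuda10.1 example earlier in this commit; a hedged invocation (the version numbers and image tag here are assumptions):

```bash
# Hypothetical invocation for the newly added cuda11 environment
bash tools/generate_runtime_docker.sh --env cuda11 --python 3.6 --serving 0.6.0 --paddle 2.0.1 --name serving_runtime:cuda11-py36
```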
@@ -34,6 +34,7 @@ spec:
       containers:
       - image: << IMAGE_NAME >>
         name: << APP_NAME >>
+        imagePullPolicy: Always
         ports:
         - containerPort: << PORT >>
         workingDir: << WORKDIR >>
@@ -41,6 +42,8 @@ spec:
         command: ['/bin/bash', '-c']
         args: [<< COMMAND >>]
         env:
+        - name: SERVING_BIN
+          value: "/usr/local/serving_bin/serving"
         - name: NODE_NAME
           valueFrom:
             fieldRef:
......