PaddlePaddle/Serving, commit fa52ca27 (unverified)
Authored May 10, 2021 by TeslaZhao; committed by GitHub on May 10, 2021
Merge pull request #46 from PaddlePaddle/develop
Sync repo
Parents: 6be73cbf, 50bbbbee
Showing 21 changed files with 158 additions and 90 deletions.
core/configure/proto/multi_lang_general_model_service.proto (+1 -1)
doc/COMPILE.md (+1 -1)
doc/COMPILE_CN.md (+1 -1)
doc/PADDLE_SERVING_ON_KUBERNETES.md (+15 -13)
python/examples/bert/README_CN.md (+3 -0)
python/examples/bert/benchmark.sh (+9 -6)
python/examples/fit_a_line/README_CN.md (+3 -0)
python/examples/fit_a_line/benchmark.py (+9 -2)
python/examples/fit_a_line/benchmark.sh (+55 -0)
python/paddle_serving_client/client.py (+5 -30)
python/paddle_serving_server/rpc_service.py (+10 -1)
tools/Dockerfile.cuda10.1-cudnn7.devel (+4 -4)
tools/Dockerfile.cuda10.2-cudnn8.devel (+4 -4)
tools/Dockerfile.cuda11-cudnn8.devel (+5 -5)
tools/Dockerfile.devel (+3 -3)
tools/Dockerfile.runtime_template (+2 -2)
tools/dockerfiles/build_scripts/install_trt.sh (+5 -15)
tools/dockerfiles/build_scripts/install_whl.sh (+16 -0)
tools/generate_k8s_yamls.sh (+2 -2)
tools/generate_runtime_docker.sh (+2 -0)
tools/k8s_serving.yaml_template (+3 -0)
core/configure/proto/multi_lang_general_model_service.proto
...
@@ -59,7 +59,7 @@ message SimpleResponse { required int32 err_code = 1; }
 message GetClientConfigRequest {}
-message GetClientConfigResponse { repeated string client_config_str_list = 1; }
+message GetClientConfigResponse { required string client_config_str = 1; }
 service MultiLangGeneralModelService {
   rpc Inference(InferenceRequest) returns (InferenceResponse) {}
...
...
doc/COMPILE.md
...
@@ -153,7 +153,7 @@ cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR/ \
     -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
     -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
     -DOPENCV_DIR=${OPENCV_DIR} \
-    -DWITH_OPENCV=ON
+    -DWITH_OPENCV=ON \
     -DSERVER=ON ..
 make -j10
 ```
...
...
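The one-character fix above matters because the shell only joins physical lines that end in a backslash. The sketch below is a simplified stand-in for that rule (it ignores quoting and is not how any shell is implemented); `join_continuations` and the shortened `cmake` strings are illustrative names and inputs, not part of the repository.

```python
from typing import List


def join_continuations(script: str) -> List[str]:
    """Split a shell snippet into logical commands, honoring trailing backslashes."""
    commands, current = [], ""
    for line in script.splitlines():
        stripped = line.rstrip()
        if stripped.endswith("\\"):
            # A trailing backslash glues the next physical line onto this one.
            current += stripped[:-1].strip() + " "
        else:
            current += stripped.strip()
            if current:
                commands.append(current)
            current = ""
    if current.strip():
        commands.append(current.strip())
    return commands


# Shortened versions of the command before and after the fix.
broken = "cmake -DWITH_OPENCV=ON\n    -DSERVER=ON ..\nmake -j10"
fixed = "cmake -DWITH_OPENCV=ON \\\n    -DSERVER=ON ..\nmake -j10"

print(join_continuations(broken))  # three commands: -DSERVER=ON is orphaned
print(join_continuations(fixed))   # two commands: cmake receives -DSERVER=ON
```

With the backslash missing, `-DSERVER=ON ..` is treated as its own command, so `cmake` never sees the flag.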
doc/COMPILE_CN.md
...
@@ -152,7 +152,7 @@ cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR/ \
     -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
     -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
     -DOPENCV_DIR=${OPENCV_DIR} \
-    -DWITH_OPENCV=ON
+    -DWITH_OPENCV=ON \
     -DSERVER=ON ..
 make -j10
 ```
...
...
doc/PADDLE_SERVING_ON_KUBERNETES.md
...
@@ -25,10 +25,10 @@ kubectl apply -f https://bit.ly/kong-ingress-dbless
 The script is `tools/generate_runtime_docker.sh`; it is used as follows:
 ```bash
-bash tool/generate_runtime_docker.sh --env cuda10.1 --python 2.7 --serving 0.5.0 --paddle 2.0.0 --name serving_runtime:cuda10.1-py27
+bash tool/generate_runtime_docker.sh --env cuda10.1 --python 3.6 --serving 0.6.0 --paddle 2.0.1 --name serving_runtime:cuda10.1-py36
 ```
-This generates a runtime image with cuda10.1, python 2.7, serving 0.5.0 and paddle 2.0.0. If you have other questions, run the following command for help.
+This generates a runtime image with cuda10.1, python 3.6, serving 0.6.0 and paddle 2.0.1. If you have other questions, run the following command for help.
 ```
 bash tools/generate_runtime_docker.sh --help
...
...
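@@ -39,7 +39,7 @@ bash tools/generate_runtime_docker.sh --help
 - paddle-serving-server, paddle-serving-client, paddle-serving-app, paddlepaddle; the exact versions are listed in tools/runtime.dockerfile, which can also be edited if you need a customized image.
 - the paddle-serving-server binary executable
-In other words, once the runtime image is generated, we only need to copy our code (if any) and models into it. The generated image is named `paddle_serving:cuda10.2-py37`
+In other words, once the runtime image is generated, we only need to copy our code (if any) and models into it. The generated image is named `paddle_serving:cuda10.2-py36`
 ### Add your code and models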
...
...
@@ -50,8 +50,8 @@ bash tools/generate_runtime_docker.sh --help
 For pipeline mode, we need to make sure that the model, program files, configuration files and other dependencies can all run inside the image. We can therefore place our executable files under `/home/project`; we take `Serving/python/example/pipeline/ocr`, an OCR text-recognition task, as the example.
 ```bash
-# Assume you already have a Serving runtime image named paddle_serving:cuda10.2-py37
-docker run --rm -dit --name pipeline_serving_demo paddle_serving:cuda10.2-py37 bash
+# Assume you already have a Serving runtime image named paddle_serving:cuda10.2-py36
+docker run --rm -dit --name pipeline_serving_demo paddle_serving:cuda10.2-py36 bash
 cd Serving/python/example/pipeline/ocr
 # get models
 python -m paddle_serving_app.package --get_model ocr_rec
...
...
@@ -71,7 +71,7 @@ docker commit pipeline_serving_demo ocr_serving:latest
 ```
 docker exec -it pipeline_serving_demo bash
 cd /home/ocr
-python3.7 web_service.py
+python3.6 web_service.py
 ```
 Once inside the container and in the project directory, the remaining steps are similar to debugging code.
...
...
@@ -83,8 +83,8 @@ python3.7 web_service.py
 Web service mode is essentially similar to pipeline mode, so we take `Serving/python/examples/bert` as the example.
 ```bash
-# Assume you already have a Serving runtime image named registry.baidubce.com/paddlepaddle/serving:0.6.0-cuda10.2-py37
-docker run --rm -dit --name webservice_serving_demo registry.baidubce.com/paddlepaddle/serving:0.6.0-cpu-py27 bash
+# Assume you already have a Serving runtime image named registry.baidubce.com/paddlepaddle/serving:0.6.0-cuda10.2-py36
+docker run --rm -dit --name webservice_serving_demo registry.baidubce.com/paddlepaddle/serving:0.6.0-cpu-py36 bash
 cd Serving/python/examples/bert
 ### download model
 wget https://paddle-serving.bj.bcebos.com/paddle_hub_models/text/SemanticModel/bert_chinese_L-12_H-768_A-12.tar.gz
...
...
@@ -102,7 +102,7 @@ docker commit webservice_serving_demo bert_serving:latest
 ```bash
 docker exec -it webservice_serving_demo bash
 cd /home/bert
-python3.7 bert_web_service.py 9292
+python3.6 bert_web_service.py bert_seq128_model 9292
 ```
 Once inside the container and in the project directory, the remaining steps are similar to debugging code.
...
...
@@ -118,14 +118,15 @@ Operating a Kubernetes cluster requires `kubectl` to work with yaml files. Here we provide
 - pipeline ocr example
 ```bash
-sh tools/generate_k8s_yamls.sh --app_name ocr --image_name registry.baidubce.com/paddlepaddle/serving:k8s-pipeline-demo --workdir /home/ocr --command "python2.7 web_service.py" --port 9999
+sh tools/generate_k8s_yamls.sh --app_name ocr --image_name registry.baidubce.com/paddlepaddle/serving:k8s-pipeline-demo --workdir /home/ocr --command "python3.6 web_service.py" --port 9999
 ```
 - web service bert example
 ```bash
-sh tools/generate_k8s_yamls.sh --app_name bert --image_name registry.baidubce.com/paddlepaddle/serving:k8s-web-demo --workdir /home/bert --command "python2.7 bert_web_service.py 9292" --port 9292
+sh tools/generate_k8s_yamls.sh --app_name bert --image_name registry.baidubce.com/paddlepaddle/serving:k8s-web-demo --workdir /home/bert --command "python3.6 bert_web_service.py bert_seq128_model 9292" --port 9292
 ```
 **Note that app_name must match the function name in the URL. For example, the bert example is accessed at `https://127.0.0.1:9292/bert/prediction`, so app_name should be bert.**
 Next we will see two yaml files, `k8s_serving.yaml` and `k8s_ingress.yaml`.
...
...
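The rule that `app_name` must match the function name in the URL can be checked mechanically. The following is a minimal sketch, assuming the function name is always the first path segment of the prediction URL (as in the `/bert/prediction` example); `app_name_from_url` is a hypothetical helper, not part of Paddle Serving.

```python
from urllib.parse import urlparse


def app_name_from_url(url: str) -> str:
    """Return the first path segment of a prediction URL."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if not segments:
        raise ValueError("URL has no path segment to use as app_name: " + url)
    return segments[0]


# URL taken from the note above.
print(app_name_from_url("https://127.0.0.1:9292/bert/prediction"))  # bert
```

The value printed here is what would be passed as `--app_name` to `tools/generate_k8s_yamls.sh`.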
@@ -174,7 +175,7 @@ spec:
         workingDir: /home/ocr
         name: ocr
         command: ['/bin/bash', '-c']
-        args: ["python3.7 web_service.py"]
+        args: ["python3.6 bert_web_service.py bert_seq128_model 9292"]
         env:
           - name: NODE_NAME
             valueFrom:
...
...
@@ -216,7 +217,8 @@ spec:
 Finally, run the following to start the containers and the API gateway.
 ```
-kubectl apply -f k8s_serving.yaml k8s_ingress.yaml
+kubectl apply -f k8s_serving.yaml
+kubectl apply -f k8s_ingress.yaml
 ```
 Enter
...
...
python/examples/bert/README_CN.md
...
@@ -94,4 +94,7 @@ curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"words": "hello"}]
 bash benchmark.sh bert_seq128_model bert_seq128_client
 ```
 The benchmark log file is profile_log_bert_seq128_model
+To change the benchmark parameters, edit the configuration in benchmark.sh.
+Note: do not append '/' to the bert_seq128_model and bert_seq128_client paths; the example must be run on a machine with a GPU.
python/examples/bert/benchmark.sh
...
@@ -17,27 +17,30 @@ sleep 5
 #warm up
 $PYTHONROOT/bin/python3 benchmark.py --thread 4 --batch_size 1 --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
-echo -e "import psutil\ncpu_utilization=psutil.cpu_percent(1,False)\nprint('CPU_UTILIZATION:', cpu_utilization)\n" > cpu_utilization.py
+echo -e "import psutil\nimport time\nwhile True:\n\tcpu_res = psutil.cpu_percent()\n\twith open('cpu.txt', 'a+') as f:\n\t\tf.write(f'{cpu_res}\\\n')\n\ttime.sleep(0.1)" > cpu.py
 for thread_num in 1 4 8 16
 do
 for batch_size in 1 4 16 64
 do
     job_bt=`date '+%Y%m%d%H%M%S'`
-    nvidia-smi --id=0 --query-compute-apps=used_memory --format=csv -lms 100 > gpu_use.log 2>&1 &
+    nvidia-smi --id=0 --query-compute-apps=used_memory --format=csv -lms 100 > gpu_memory_use.log 2>&1 &
     nvidia-smi --id=0 --query-gpu=utilization.gpu --format=csv -lms 100 > gpu_utilization.log 2>&1 &
+    rm -rf cpu.txt
+    $PYTHONROOT/bin/python3 cpu.py &
     gpu_memory_pid=$!
     $PYTHONROOT/bin/python3 benchmark.py --thread $thread_num --batch_size $batch_size --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
-    kill ${gpu_memory_pid}
-    kill `ps -ef|grep used_memory|awk '{print $2}'`
+    kill `ps -ef|grep used_memory|awk '{print $2}'` > /dev/null
+    kill `ps -ef|grep utilization.gpu|awk '{print $2}'` > /dev/null
+    kill `ps -ef|grep cpu.py|awk '{print $2}'` > /dev/null
     echo "model_name:" $1
     echo "thread_num:" $thread_num
     echo "batch_size:" $batch_size
     echo "=================Done===================="
     echo "model_name:$1" >> profile_log_$1
     echo "batch_size:$batch_size" >> profile_log_$1
-    $PYTHONROOT/bin/python3 cpu_utilization.py >> profile_log_$1
     job_et=`date '+%Y%m%d%H%M%S'`
-    awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "MAX_GPU_MEMORY:", max}' gpu_use.log >> profile_log_$1
+    awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "CPU_UTILIZATION:", max}' cpu.txt >> profile_log_$1
+    awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "MAX_GPU_MEMORY:", max}' gpu_memory_use.log >> profile_log_$1
     awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "GPU_UTILIZATION:", max}' gpu_utilization.log >> profile_log_$1
     rm -rf gpu_use.log gpu_utilization.log
     $PYTHONROOT/bin/python3 ../util/show_profile.py profile $thread_num >> profile_log_$1
...
...
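The three `awk` one-liners in this hunk all apply the same rule: skip the first record (`NR>1`) and keep the largest first field. A rough Python equivalent is sketched below to make that behavior explicit; `max_first_field` and the sample log lines are illustrative assumptions, and non-numeric first fields are simply ignored here, which is a simplification of awk's comparison rules.

```python
from typing import Iterable


def max_first_field(lines: Iterable[str], skip_header: bool = True) -> float:
    """Largest numeric first field, optionally skipping the first record (NR>1)."""
    best = 0.0  # mirrors BEGIN {max = 0}
    for line_number, line in enumerate(lines, start=1):
        if skip_header and line_number == 1:
            continue  # NR>1: the first record is never considered
        fields = line.split()
        if not fields:
            continue
        try:
            value = float(fields[0])
        except ValueError:
            continue  # non-numeric first field: ignored in this sketch
        best = max(best, value)
    return best


# Hypothetical samples: nvidia-smi CSV output starts with a header line,
# while cpu.txt (written by cpu.py) contains only values.
gpu_log = ["utilization.gpu [%]", "12 %", "87 %", "40 %"]
cpu_log = ["95.0", "10.0", "20.0"]

print(max_first_field(gpu_log))
print(max_first_field(cpu_log))
print(max_first_field(cpu_log, skip_header=False))
```

One consequence worth noting: as far as the script shows, `cpu.txt` has no header line, so applying the same `NR>1` filter drops its first CPU sample.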
python/examples/fit_a_line/README_CN.md
...
@@ -49,4 +49,7 @@ curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1
 bash benchmark.sh uci_housing_model uci_housing_client
 ```
 The benchmark log file is profile_log_uci_housing_model
+To change the benchmark parameters, edit the configuration in benchmark.sh.
+Note: do not append '/' to the uci_housing_model and uci_housing_client paths; the example must be run on a machine with a GPU.
python/examples/fit_a_line/benchmark.py
...
@@ -30,6 +30,7 @@ def single_func(idx, resource):
             paddle.dataset.uci_housing.train(), buf_size=500),
         batch_size=1)
     total_number = sum(1 for _ in train_reader())
+    latency_list = []
     if args.request == "rpc":
         client = Client()
...
@@ -37,9 +38,12 @@ def single_func(idx, resource):
         client.connect([args.endpoint])
         start = time.time()
         for data in train_reader():
+            l_start = time.time()
             fetch_map = client.predict(feed={"x": data[0][0]}, fetch=["price"])
+            l_end = time.time()
+            latency_list.append(l_end * 1000 - l_start * 1000)
         end = time.time()
-        return [[end - start], [total_number]]
+        return [[end - start], latency_list, [total_number]]
     elif args.request == "http":
         train_reader = paddle.batch(
             paddle.reader.shuffle(
...
@@ -47,11 +51,14 @@ def single_func(idx, resource):
             batch_size=1)
         start = time.time()
         for data in train_reader():
+            l_start = time.time()
             r = requests.post(
                 'http://{}/uci/prediction'.format(args.endpoint),
                 data={"x": data[0]})
+            l_end = time.time()
+            latency_list.append(l_end * 1000 - l_start * 1000)
         end = time.time()
-        return [[end - start], [total_number]]
+        return [[end - start], latency_list, [total_number]]
 start = time.time()
...
...
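The change above makes `single_func` return a per-request `latency_list` in milliseconds. How the rest of benchmark.py aggregates it is not shown in this diff, so the following is only a sketch of one common way to summarize such a list (mean plus nearest-rank percentiles); `summarize_latency` is a hypothetical helper.

```python
from typing import Dict, List


def summarize_latency(latency_ms: List[float]) -> Dict[str, float]:
    """Mean and nearest-rank percentiles of per-request latencies in milliseconds."""
    if not latency_ms:
        raise ValueError("latency list is empty")
    ordered = sorted(latency_ms)
    count = len(ordered)

    def percentile(p: int) -> float:
        # Nearest-rank: smallest sample with at least p% of samples at or below it.
        # Integer arithmetic avoids floating-point rounding in the rank.
        rank = max(1, (p * count + 99) // 100)
        return ordered[rank - 1]

    return {
        "mean": sum(ordered) / count,
        "p50": percentile(50),
        "p90": percentile(90),
        "p99": percentile(99),
    }


# Synthetic latencies of 1.0 .. 100.0 ms, purely for illustration.
sample = [float(v) for v in range(1, 101)]
print(summarize_latency(sample))
```

Percentiles are usually more informative than the mean for serving benchmarks, because a few slow requests can dominate user-visible latency.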
python/examples/fit_a_line/benchmark.sh
New file (mode 0 → 100755)
+rm profile_log*
+export CUDA_VISIBLE_DEVICES=0,1
+export FLAGS_profile_server=1
+export FLAGS_profile_client=1
+export FLAGS_serving_latency=1
+gpu_id=0
+#save cpu and gpu utilization log
+if [ -d utilization ];then
+    rm -rf utilization
+else
+    mkdir utilization
+fi
+#start server
+$PYTHONROOT/bin/python3 -m paddle_serving_server.serve --model $1 --port 9292 --thread 4 --gpu_ids 0,1 --mem_optim --ir_optim > elog 2>&1 &
+sleep 5
+#warm up
+$PYTHONROOT/bin/python3 benchmark.py --thread 4 --batch_size 1 --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
+echo -e "import psutil\nimport time\nwhile True:\n\tcpu_res = psutil.cpu_percent()\n\twith open('cpu.txt', 'a+') as f:\n\t\tf.write(f'{cpu_res}\\\n')\n\ttime.sleep(0.1)" > cpu.py
+for thread_num in 1 4 8 16
+do
+for batch_size in 1 4 16 64
+do
+    job_bt=`date '+%Y%m%d%H%M%S'`
+    nvidia-smi --id=0 --query-compute-apps=used_memory --format=csv -lms 100 > gpu_memory_use.log 2>&1 &
+    nvidia-smi --id=0 --query-gpu=utilization.gpu --format=csv -lms 100 > gpu_utilization.log 2>&1 &
+    rm -rf cpu.txt
+    $PYTHONROOT/bin/python3 cpu.py &
+    gpu_memory_pid=$!
+    $PYTHONROOT/bin/python3 benchmark.py --thread $thread_num --batch_size $batch_size --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
+    kill `ps -ef|grep used_memory|awk '{print $2}'` > /dev/null
+    kill `ps -ef|grep utilization.gpu|awk '{print $2}'` > /dev/null
+    kill `ps -ef|grep cpu.py|awk '{print $2}'` > /dev/null
+    echo "model_name:" $1
+    echo "thread_num:" $thread_num
+    echo "batch_size:" $batch_size
+    echo "=================Done===================="
+    echo "model_name:$1" >> profile_log_$1
+    echo "batch_size:$batch_size" >> profile_log_$1
+    job_et=`date '+%Y%m%d%H%M%S'`
+    awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "CPU_UTILIZATION:", max}' cpu.txt >> profile_log_$1
+    awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "MAX_GPU_MEMORY:", max}' gpu_memory_use.log >> profile_log_$1
+    awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "GPU_UTILIZATION:", max}' gpu_utilization.log >> profile_log_$1
+    rm -rf gpu_use.log gpu_utilization.log
+    $PYTHONROOT/bin/python3 ../util/show_profile.py profile $thread_num >> profile_log_$1
+    tail -n 8 profile >> profile_log_$1
+    echo "" >> profile_log_$1
+done
+done
+#Divided log
+awk 'BEGIN{RS="\n\n"}{i++}{print > "bert_log_"i}' profile_log_$1
+mkdir bert_log && mv bert_log_* bert_log
+ps -ef|grep 'serving'|grep -v grep|cut -c 9-15 | xargs kill -9
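The "Divided log" step near the end of the script splits `profile_log_$1` into one file per benchmark run by using a blank line as the record separator. A minimal Python sketch of that splitting rule follows; `split_records` and the sample log text are illustrative assumptions rather than code from the repository.

```python
from typing import List


def split_records(log_text: str) -> List[str]:
    """Split a log into records separated by blank lines (like awk RS="\\n\\n")."""
    return [rec.strip("\n") for rec in log_text.split("\n\n") if rec.strip()]


# Two hypothetical runs, each terminated by the blank line that
# `echo "" >> profile_log_$1` appends at the end of an iteration.
log = "model_name:uci\nbatch_size:1\n\nmodel_name:uci\nbatch_size:4\n\n"

for index, record in enumerate(split_records(log), start=1):
    print(f"bert_log_{index}:", record.splitlines())
```

Note that this fit_a_line script writes its pieces under the `bert_log_` prefix, exactly as shown in the diff.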
python/paddle_serving_client/client.py
...
@@ -554,15 +554,8 @@ class MultiLangClient(object):
         get_client_config_req = multi_lang_general_model_service_pb2.GetClientConfigRequest(
         )
         resp = self.stub_.GetClientConfig(get_client_config_req)
-        model_config_path_list = resp.client_config_str_list
-        file_path_list = []
-        for single_model_config in model_config_path_list:
-            if os.path.isdir(single_model_config):
-                file_path_list.append("{}/serving_server_conf.prototxt".format(
-                    single_model_config))
-            elif os.path.isfile(single_model_config):
-                file_path_list.append(single_model_config)
-        self._parse_model_config(file_path_list)
+        model_config_str = resp.client_config_str
+        self._parse_model_config(model_config_str)
     def _flatten_list(self, nested_list):
         for item in nested_list:
...
@@ -572,23 +565,10 @@ class MultiLangClient(object):
             else:
                 yield item
-    def _parse_model_config(self, model_config_path_list):
-        if isinstance(model_config_path_list, str):
-            model_config_path_list = [model_config_path_list]
-        elif isinstance(model_config_path_list, list):
-            pass
-        file_path_list = []
-        for single_model_config in model_config_path_list:
-            if os.path.isdir(single_model_config):
-                file_path_list.append("{}/serving_client_conf.prototxt".format(
-                    single_model_config))
-            elif os.path.isfile(single_model_config):
-                file_path_list.append(single_model_config)
+    def _parse_model_config(self, model_config_str):
         model_conf = m_config.GeneralModelConfig()
-        f = open(file_path_list[0], 'r')
-        model_conf = google.protobuf.text_format.Merge(
-            str(f.read()), model_conf)
+        model_conf = google.protobuf.text_format.Merge(model_config_str,
+                                                       model_conf)
         self.feed_names_ = [var.alias_name for var in model_conf.feed_var]
         self.feed_types_ = {}
         self.feed_shapes_ = {}
...
@@ -598,11 +578,6 @@ class MultiLangClient(object):
             self.feed_shapes_[var.alias_name] = var.shape
             if var.is_lod_tensor:
                 self.lod_tensor_set_.add(var.alias_name)
-        if len(file_path_list) > 1:
-            model_conf = m_config.GeneralModelConfig()
-            f = open(file_path_list[-1], 'r')
-            model_conf = google.protobuf.text_format.Merge(
-                str(f.read()), model_conf)
         self.fetch_names_ = [var.alias_name for var in model_conf.fetch_var]
         self.fetch_types_ = {}
         for i, var in enumerate(model_conf.fetch_var):
...
...
python/paddle_serving_server/rpc_service.py
...
@@ -198,5 +198,14 @@ class MultiLangServerServiceServicer(multi_lang_general_model_service_pb2_grpc.
         #model_config_path_list is list right now.
         #dict should be added when graphMaker is used.
         resp = multi_lang_general_model_service_pb2.GetClientConfigResponse()
-        resp.client_config_str_list[:] = self.model_config_path_list
+        model_config_str = []
+        for single_model_config in self.model_config_path_list:
+            if os.path.isdir(single_model_config):
+                with open("{}/serving_server_conf.prototxt".format(
+                        single_model_config)) as f:
+                    model_config_str.append(str(f.read()))
+            elif os.path.isfile(single_model_config):
+                with open(single_model_config) as f:
+                    model_config_str.append(str(f.read()))
+        resp.client_config_str = model_config_str[0]
         return resp
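Taken together, the proto, client.py and rpc_service.py changes move config resolution to the server: it reads the config file and returns its text, so the client no longer needs access to the server's file paths. The path-resolution step can be sketched in isolation as below; `read_model_configs` is a hypothetical standalone helper (the real logic lives inline in the servicer method shown above), and the config text written to the temporary file is made up for the demonstration.

```python
import os
import tempfile
from typing import List


def read_model_configs(paths: List[str],
                       default_name: str = "serving_server_conf.prototxt") -> List[str]:
    """Return the text of each model config; a directory resolves to <dir>/<default_name>."""
    contents = []
    for path in paths:
        if os.path.isdir(path):
            path = os.path.join(path, default_name)
        elif not os.path.isfile(path):
            continue  # neither a directory nor a file: skipped, as in the diff
        with open(path) as f:
            contents.append(f.read())
    return contents


# Demonstration with a throwaway model directory.
with tempfile.TemporaryDirectory() as model_dir:
    with open(os.path.join(model_dir, "serving_server_conf.prototxt"), "w") as f:
        f.write('feed_var { alias_name: "x" }\n')
    configs = read_model_configs([model_dir, os.path.join(model_dir, "missing")])
    print(configs[0])
```

In the diff, only the first entry (`model_config_str[0]`) is sent back to the client.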
tools/Dockerfile.cuda10.1-cudnn7.devel
...
@@ -104,7 +104,7 @@ ENV PATH=usr/local/go/bin:/root/go/bin:${PATH}
 # Downgrade TensorRT
 COPY tools/dockerfiles/build_scripts /build_scripts
-RUN bash /build_scripts/install_trt.sh
+RUN bash /build_scripts/install_trt.sh cuda10.1
 RUN rm -rf /build_scripts
 # git credential to skip password typing
...
@@ -132,9 +132,9 @@ RUN wget https://paddle-ci.gz.bcebos.com/ccache-3.7.9.tar.gz && \
     make -j8 && make install && \
     ln -s /usr/local/ccache-3.7.9/bin/ccache /usr/local/bin/ccache
-RUN python3.8 -m pip install --upgrade pip requests && \
-    python3.7 -m pip install --upgrade pip requests && \
-    python3.6 -m pip install --upgrade pip requests
+RUN python3.8 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.7 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.6 -m pip install --upgrade pip==21.1.1 requests
 RUN wget https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar && \
     tar xf centos_ssl.tar && rm -rf centos_ssl.tar && \
...
...
tools/Dockerfile.cuda10.2-cudnn8.devel
...
@@ -104,7 +104,7 @@ ENV PATH=usr/local/go/bin:/root/go/bin:${PATH}
 # Downgrade TensorRT
 COPY tools/dockerfiles/build_scripts /build_scripts
-RUN bash /build_scripts/install_trt.sh
+RUN bash /build_scripts/install_trt.sh cuda10.2
 RUN rm -rf /build_scripts
 # git credential to skip password typing
...
@@ -132,9 +132,9 @@ RUN wget https://paddle-ci.gz.bcebos.com/ccache-3.7.9.tar.gz && \
     make -j8 && make install && \
     ln -s /usr/local/ccache-3.7.9/bin/ccache /usr/local/bin/ccache
-RUN python3.8 -m pip install --upgrade pip requests && \
-    python3.7 -m pip install --upgrade pip requests && \
-    python3.6 -m pip install --upgrade pip requests
+RUN python3.8 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.7 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.6 -m pip install --upgrade pip==21.1.1 requests
 RUN wget https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar && \
     tar xf centos_ssl.tar && rm -rf centos_ssl.tar && \
...
...
tools/Dockerfile.cuda11.2-cudnn8.devel → tools/Dockerfile.cuda11-cudnn8.devel
 # A image for building paddle binaries
 # Use cuda devel base image for both cpu and gpu environment
 # When you modify it, please be aware of cudnn-runtime version
-FROM nvidia/cuda:11.2.0-cudnn8-devel-ubuntu16.04
+FROM nvidia/cuda:11.0.3-cudnn8-devel-ubuntu16.04
 MAINTAINER PaddlePaddle Authors <paddle-dev@baidu.com>
 # ENV variables
...
@@ -104,7 +104,7 @@ ENV PATH=usr/local/go/bin:/root/go/bin:${PATH}
 # Downgrade TensorRT
 COPY tools/dockerfiles/build_scripts /build_scripts
-RUN bash /build_scripts/install_trt.sh
+RUN bash /build_scripts/install_trt.sh cuda11
 RUN rm -rf /build_scripts
 # git credential to skip password typing
...
@@ -132,9 +132,9 @@ RUN wget https://paddle-ci.gz.bcebos.com/ccache-3.7.9.tar.gz && \
     make -j8 && make install && \
     ln -s /usr/local/ccache-3.7.9/bin/ccache /usr/local/bin/ccache
-RUN python3.8 -m pip install --upgrade pip requests && \
-    python3.7 -m pip install --upgrade pip requests && \
-    python3.6 -m pip install --upgrade pip requests
+RUN python3.8 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.7 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.6 -m pip install --upgrade pip==21.1.1 requests
 RUN wget https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar && \
     tar xf centos_ssl.tar && rm -rf centos_ssl.tar && \
...
...
tools/Dockerfile.devel
...
@@ -132,9 +132,9 @@ RUN wget https://paddle-ci.gz.bcebos.com/ccache-3.7.9.tar.gz && \
     make -j8 && make install && \
     ln -s /usr/local/ccache-3.7.9/bin/ccache /usr/local/bin/ccache
-RUN python3.8 -m pip install --upgrade pip requests && \
-    python3.7 -m pip install --upgrade pip requests && \
-    python3.6 -m pip install --upgrade pip requests
+RUN python3.8 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.7 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.6 -m pip install --upgrade pip==21.1.1 requests
 RUN wget https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar && \
     tar xf centos_ssl.tar && rm -rf centos_ssl.tar && \
...
...
tools/Dockerfile.runtime_template
...
@@ -28,12 +28,12 @@ WORKDIR /home
 # install whl and bin
 WORKDIR /home
 COPY tools/dockerfiles/build_scripts /build_scripts
-RUN bash /build_scripts/install_whl.sh 0.5.0 2.0.0 <<run_env>> <<python_version>> && rm -rf /build_scripts
+RUN bash /build_scripts/install_whl.sh <<serving_version>> <<paddle_version>> <<run_env>> <<python_version>> && rm -rf /build_scripts
 # install tensorrt
 WORKDIR /home
 COPY tools/dockerfiles/build_scripts /build_scripts
-RUN bash /build_scripts/install_trt.sh && rm -rf /build_scripts
+RUN bash /build_scripts/install_trt.sh <<run_env>> && rm -rf /build_scripts
 # install go
 RUN wget -qO- https://dl.google.com/go/go1.14.linux-amd64.tar.gz | \
...
...
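With this change the template no longer hard-codes `0.5.0 2.0.0`; every version is a `<<name>>` placeholder filled in when the runtime Dockerfile is generated. How `tools/generate_runtime_docker.sh` performs the substitution is not shown in this diff, so the sketch below only illustrates the idea in Python; `render_template` is a hypothetical helper, and the values are the ones from the documentation example (`--env cuda10.1 --python 3.6 --serving 0.6.0 --paddle 2.0.1`).

```python
import re
from typing import Dict


def render_template(template: str, values: Dict[str, str]) -> str:
    """Replace every <<name>> placeholder; fail loudly on unknown names."""

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder <<{name}>>")
        return values[name]

    # Optional whitespace inside the brackets is tolerated, e.g. << APP_NAME >>.
    return re.sub(r"<<\s*(\w+)\s*>>", substitute, template)


line = ("RUN bash /build_scripts/install_whl.sh <<serving_version>> "
        "<<paddle_version>> <<run_env>> <<python_version>> && rm -rf /build_scripts")

print(render_template(line, {
    "serving_version": "0.6.0",
    "paddle_version": "2.0.1",
    "run_env": "cuda10.1",
    "python_version": "3.6",
}))
```

Raising on an unknown placeholder is a design choice of this sketch: a silently unfilled `<<...>>` would otherwise surface only as a confusing Docker build failure.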
tools/dockerfiles/build_scripts/install_trt.sh
...
@@ -14,31 +14,21 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
-VERSION=$(nvcc --version | grep release | grep -oEi "release ([0-9]+)\.([0-9])"| sed "s/release //")
-if [[ "$VERSION" == "10.1" ]]; then
+VERSION=$1
+if [[ "$VERSION" == "cuda10.1" ]]; then
     wget -q https://paddle-ci.gz.bcebos.com/TRT/TensorRT6-cuda10.1-cudnn7.tar.gz --no-check-certificate
     tar -zxf TensorRT6-cuda10.1-cudnn7.tar.gz -C /usr/local
     cp -rf /usr/local/TensorRT6-cuda10.1-cudnn7/include/* /usr/include/ && cp -rf /usr/local/TensorRT6-cuda10.1-cudnn7/lib/* /usr/lib/
+    echo "cuda10.1 trt install ==============>>>>>>>>>>>>"
     rm TensorRT6-cuda10.1-cudnn7.tar.gz
-elif [[ "$VERSION" == "11.0" ]]; then
+elif [[ "$VERSION" == "cuda11" ]]; then
     wget -q https://paddle-ci.cdn.bcebos.com/TRT/TensorRT-7.1.3.4.Ubuntu-16.04.x86_64-gnu.cuda-11.0.cudnn8.0.tar.gz --no-check-certificate
     tar -zxf TensorRT-7.1.3.4.Ubuntu-16.04.x86_64-gnu.cuda-11.0.cudnn8.0.tar.gz -C /usr/local
     cp -rf /usr/local/TensorRT-7.1.3.4/include/* /usr/include/ && cp -rf /usr/local/TensorRT-7.1.3.4/lib/* /usr/lib/
     rm TensorRT-7.1.3.4.Ubuntu-16.04.x86_64-gnu.cuda-11.0.cudnn8.0.tar.gz
-elif [[ "$VERSION" == "10.2" ]]; then
+elif [[ "$VERSION" == "cuda10.2" ]]; then
     wget https://paddle-ci.gz.bcebos.com/TRT/TensorRT7-cuda10.2-cudnn8.tar.gz --no-check-certificate
     tar -zxf TensorRT7-cuda10.2-cudnn8.tar.gz -C /usr/local
     cp -rf /usr/local/TensorRT-7.1.3.4/include/* /usr/include/ && cp -rf /usr/local/TensorRT-7.1.3.4/lib/* /usr/lib/
     rm TensorRT7-cuda10.2-cudnn8.tar.gz
-elif [[ "$VERSION" == "10.0" ]]; then
-    wget -q https://paddle-ci.gz.bcebos.com/TRT/TensorRT6-cuda10.0-cudnn7.tar.gz --no-check-certificate
-    tar -zxf TensorRT6-cuda10.0-cudnn7.tar.gz -C /usr/local
-    cp -rf /usr/local/TensorRT6-cuda10.0-cudnn7/include/* /usr/include/ && cp -rf /usr/local/TensorRT6-cuda10.0-cudnn7/lib/* /usr/lib/
-    rm TensorRT6-cuda10.0-cudnn7.tar.gz
-elif [[ "$VERSION" == "9.0" ]]; then
-    wget -q https://paddle-ci.gz.bcebos.com/TRT/TensorRT6-cuda9.0-cudnn7.tar.gz --no-check-certificate
-    tar -zxf TensorRT6-cuda9.0-cudnn7.tar.gz -C /usr/local
-    cp -rf /usr/local/TensorRT6-cuda9.0-cudnn7/include/* /usr/include/ && cp -rf /usr/local/TensorRT6-cuda9.0-cudnn7/lib/* /usr/lib/
-    rm TensorRT6-cuda9.0-cudnn7.tar.gz
 fi
tools/dockerfiles/build_scripts/install_whl.sh
...
@@ -40,6 +40,9 @@ if [[ $SERVING_VERSION == "0.5.0" ]]; then
     elif [[ "$RUN_ENV" == "cuda10.2" ]];then
         server_release="paddle-serving-server-gpu==$SERVING_VERSION.post102"
         serving_bin="https://paddle-serving.bj.bcebos.com/bin/serving-gpu-102-${SERVING_VERSION}.tar.gz"
+    elif [[ "$RUN_ENV" == "cuda11" ]];then
+        server_release="paddle-serving-server-gpu==$SERVING_VERSION.post11"
+        serving_bin="https://paddle-serving.bj.bcebos.com/bin/serving-gpu-11-${SERVING_VERSION}.tar.gz"
     fi
     client_release="paddle-serving-client==$SERVING_VERSION"
     app_release="paddle-serving-app==0.3.1"
...
@@ -53,6 +56,9 @@ elif [[ $SERVING_VERSION == "0.6.0" ]]; then
     elif [[ "$RUN_ENV" == "cuda10.2" ]];then
         server_release="https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-$SERVING_VERSION.post102-py3-none-any.whl"
         serving_bin="https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-gpu-102-$SERVING_VERSION.tar.gz"
+    elif [[ "$RUN_ENV" == "cuda11" ]];then
+        server_release="https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server_gpu-$SERVING_VERSION.post11-py3-none-any.whl"
+        serving_bin="https://paddle-serving.bj.bcebos.com/test-dev/bin/serving-gpu-11-$SERVING_VERSION.tar.gz"
     fi
     client_release="https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-$SERVING_VERSION-cp$CPYTHON-none-any.whl"
     app_release="https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_app-$SERVING_VERSION-py3-none-any.whl"
...
@@ -88,6 +94,16 @@ elif [[ "$RUN_ENV" == "cuda10.2" ]];then
     echo "export SERVING_BIN=$PWD/serving_bin/serving">>/root/.bashrc
     rm -rf serving-gpu-102-${SERVING_VERSION}.tar.gz
     cd -
+elif [[ "$RUN_ENV" == "cuda11" ]];then
+    python$PYTHON_VERSION -m pip install $client_release $app_release $server_release
+    python$PYTHON_VERSION -m pip install paddlepaddle-gpu==${PADDLE_VERSION}
+    cd /usr/local/
+    wget $serving_bin
+    tar xf serving-gpu-11-${SERVING_VERSION}.tar.gz
+    mv $PWD/serving-gpu-11-${SERVING_VERSION} $PWD/serving_bin
+    echo "export SERVING_BIN=$PWD/serving_bin/serving">>/root/.bashrc
+    rm -rf serving-gpu-11-${SERVING_VERSION}.tar.gz
+    cd -
 fi
tools/generate_k8s_yamls.sh
...
@@ -60,8 +60,8 @@ function run
     echo "named arg: command: $start_command"
     echo "named arg: port: $port"
-    sed -e "s/<< APP_NAME >>/$app_name/g" -e "s/<< IMAGE_NAME >>/$(echo $image_name | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g" -e "s/<< WORKDIR >>/$(echo $workdir | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g" -e "s/<< COMMAND >>/\"$start_command\"/g" -e "s/<< PORT >>/$port/g" tools/k8s_serving.yaml_template > k8s_serving.yaml
-    sed -e "s/<< APP_NAME >>/$app_name/g" -e "s/<< IMAGE_NAME >>/$(echo $image_name | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g" -e "s/<< WORKDIR >>/$(echo $workdir | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g" -e "s/<< COMMAND >>/\"$start_command\"/g" -e "s/<< PORT >>/$port/g" tools/k8s_ingress.yaml_template > k8s_ingress.yaml
+    sed -e "s/<< APP_NAME >>/$app_name/g" -e "s/<< IMAGE_NAME >>/$(echo $image_name | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g" -e "s/<< WORKDIR >>/$(echo $workdir | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g" -e "s/<< COMMAND >>/\"$(echo $start_command | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')\"/g" -e "s/<< PORT >>/$port/g" tools/k8s_serving.yaml_template > k8s_serving.yaml
+    sed -e "s/<< APP_NAME >>/$app_name/g" -e "s/<< IMAGE_NAME >>/$(echo $image_name | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g" -e "s/<< WORKDIR >>/$(echo $workdir | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g" -e "s/<< COMMAND >>/\"$(echo $start_command | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')\"/g" -e "s/<< PORT >>/$port/g" tools/k8s_ingress.yaml_template > k8s_ingress.yaml
     echo "check k8s_serving.yaml and k8s_ingress.yaml please."
 }
...
...
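The fix above routes `$start_command` through the same inner `sed` that already escapes the image name and working directory, because a raw `/`, `&` or `\` in the replacement half of `s/.../.../` changes or breaks the substitution. The escaping rule can be sketched in Python as follows; `escape_sed_replacement` is a hypothetical helper, and the command containing an absolute path is an illustrative input rather than one of the documented examples.

```python
def escape_sed_replacement(text: str) -> str:
    """Escape backslash, slash and ampersand for use in sed's s/.../REPLACEMENT/."""
    # Order matters: escape backslashes first so the backslashes added
    # for '/' and '&' are not themselves doubled.
    return text.replace("\\", "\\\\").replace("/", "\\/").replace("&", "\\&")


# Hypothetical start command with a path, which would break an unescaped s/// expression.
command = "python3.6 /home/bert/bert_web_service.py bert_seq128_model 9292"
print(escape_sed_replacement(command))
```

This mirrors the three substitutions in `'s/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g'`, applied in the same order.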
tools/generate_runtime_docker.sh
...
@@ -66,6 +66,8 @@ function run
         base_image="nvidia\/cuda:10.1-cudnn7-runtime-ubuntu16.04"
     elif [ $env == "cuda10.2" ]; then
         base_image="nvidia\/cuda:10.2-cudnn8-runtime-ubuntu16.04"
+    elif [ $env == "cuda11" ]; then
+        base_image="nvidia\/cuda:11.0.3-cudnn8-runtime-ubuntu16.04"
     fi
     echo "base image: $base_image"
     echo "named arg: python: $python"
...
...
tools/k8s_serving.yaml_template
...
@@ -34,6 +34,7 @@ spec:
       containers:
       - image: << IMAGE_NAME >>
         name: << APP_NAME >>
+        imagePullPolicy: Always
         ports:
         - containerPort: << PORT >>
         workingDir: << WORKDIR >>
...
@@ -41,6 +42,8 @@ spec:
         command: ['/bin/bash', '-c']
         args: [<< COMMAND >>]
         env:
+        - name: SERVING_BIN
+          value: "/usr/local/serving_bin/serving"
         - name: NODE_NAME
           valueFrom:
             fieldRef:
...
...