PaddlePaddle / Serving
[WIP] Pipeline Serving !667

!667 · Merged Jun 09, 2020 · created by saxon_zh@saxon_zh

Overview · 95 commits · 23 changed files

Created by: barrierye

Performance testing of the current version

Conclusions (simple models: BOW/CNN; complex model: LSTM)

  1. With the grpc interface for the prediction part, performance is not ideal:
    • (thread version, async interface) simple single model: 230% of the C++ core time
    • (thread version, async interface) simple two-model: 336% of the C++ core time
    • (thread version, async interface) complex two-model: 134% of the C++ core time
    • process version (sync interface, since futures cannot be serialized): complex two-model time is about the same as the thread version (in fact slightly higher)
    • thread version sync and async interfaces take essentially the same time on the complex two-model case
  2. With brpc for the prediction part, results match expectations:
    • (thread version) complex two-model: 63.3% of the C++ core time

So the likely cause is that the grpc interface is constrained by the GIL and does not actually run in parallel.
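The suspected GIL effect can be illustrated with a small stdlib-only sketch (not taken from the PR): a pure-Python CPU-bound task gains nothing from threads, while processes sidestep the GIL (at the cost of the IPC overhead noted in the experiments below).

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def cpu_bound(n=200_000):
    # A pure-Python loop holds the GIL for its whole duration.
    total = 0
    for i in range(n):
        total += i * i
    return total

def timed(executor_cls, workers=4):
    """Run `workers` copies of cpu_bound concurrently and time the batch."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as ex:
        results = list(ex.map(cpu_bound, [200_000] * workers))
    return time.perf_counter() - start, results

if __name__ == "__main__":
    t_threads, _ = timed(ThreadPoolExecutor)
    t_procs, _ = timed(ProcessPoolExecutor)
    # Threads typically show no speedup over serial execution here,
    # while processes do -- the same thread-vs-process contrast the
    # measurements in this PR probe for.
    print(f"threads: {t_threads:.3f}s  processes: {t_procs:.3f}s")
```

This is only a model of the hypothesis; in the real serving pipeline the prediction call happens inside the grpc stack, and process-based workers add serialization and queue costs, as the experiments below show.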

Experiments

  1. Testing with two LSTM models (large prediction latency), PyServing took 34% longer than the original C++ core. The guess is that data processing runs in threads, and when that part accounts for a large share of the time PyServing performs poorly, so the multithreading was rewritten as multiprocessing.
  2. After switching threads to processes, the dual-LSTM test took about as long as the thread version (in fact longer). One reason is that putting data into an empty multiprocessing queue incurs a short delay (https://hg.python.org/cpython/rev/860fc6a2bd21 ), and the current channel implementation cannot block on get, so a 1e-3 s timeout was added to the get (https://github.com/PaddlePaddle/Serving/pull/667/files#diff-f6ad39aa57713e2da4fb9e81e6193357R377 ); the queue gets the data before the timeout, but testing shows that the larger the timeout, the longer the wait. The other reason is the added inter-process communication.
  3. Re-profiling: in the thread version, preprocessing turns out to account for a very small share of the time; most time is spent obtaining futures.
    Thread version, dual LSTM (entries in the same column execute sequentially):
    #G prepack: 0.155 ms
    #G push: 0.058 ms
    <long empty: approx. 0.3 ms>
    l1 prep: 0.158 ms
    l2 prep: 0.175 ms
                              l2 midp: 3.898 ms
    l1 midp: 4.618 ms         l2 postp: 0.022 ms
                              l2 push: 0.050 ms
    l1 postp: 0.022 ms
    l1 push: 0.054 ms
    combine prep: 7.806 ms
    combine postp: 0.026 ms
    combine push: 0.045 ms
    <long empty: approx. 0.3 ms>
    #G postpack: 0.120 ms
    <long empty: approx. 6 ms>
    <next iter>
    Dropping futures in favor of synchronous mode, the time concentrates in each op's mid-process (the prediction step); this part does not appear to run concurrently, and the total time is about the same as async mode: (image)
  4. After switching the prediction part to the brpc interface, total time drops to 63.3% of the C++ core; profile: (image)
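The timeout-based get described in item 2 can be sketched as follows (`poll_get` is a hypothetical helper for illustration, not the PR's actual Channel code): `multiprocessing.Queue.put` hands the item to a background feeder thread, so a reader can momentarily see the queue as empty right after a put (the CPython changeset linked above); polling `get` with a small timeout such as the PR's 1e-3 s bridges that gap, but a larger timeout inflates the wait.

```python
import multiprocessing
import queue

def poll_get(q, timeout=1e-3):
    """Fetch one item, waiting at most `timeout` seconds.

    Returns None if nothing arrived in time, instead of raising
    queue.Empty -- mirroring a non-blocking channel get.
    """
    try:
        return q.get(timeout=timeout)
    except queue.Empty:
        return None
```

As the PR observes, latency grows with the chosen timeout; a channel that supports a truly blocking get would remove the need to tune this constant.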

PyServing, BOW single model (thread version, grpc-async-impl)

Timer unit: 1e-06 s
Total time: 7.91556 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    21                                           def func():
    22         1          8.0      8.0      0.0      client = PyClient()
    23         1       5452.0   5452.0      0.1      client.connect('localhost:8080')
    24
    25                                               #lp = LineProfiler()
    26                                               #lp_wrapper = lp(client.predict)
    27
    28         1          1.0      1.0      0.0      words = 'i am very sad | 0'
    29         1          5.0      5.0      0.0      imdb_dataset = IMDBDataset()
    30         1     106775.0 106775.0      1.3      imdb_dataset.load_resource('imdb.vocab')
    31
    32      2085       7741.0      3.7      0.1      for line in sys.stdin:
    33      2084     861203.0    413.2     10.9          word_ids, label = imdb_dataset.get_words_and_label(line)
    34                                                   #fetch_map = lp_wrapper(
    35      2084       1813.0      0.9      0.0          fetch_map = client.predict(
    36      2084    6932563.0   3326.6     87.6              feed={"words": word_ids}, fetch=["prediction"])
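The per-line tables in this PR have the shape of line_profiler output (the commented-out `LineProfiler` lines above hint at that). As a self-contained illustration of how such Hits/Time-per-line tables can be collected, here is a minimal stdlib-only line timer built on `sys.settrace`; `profile_lines` is a hypothetical helper, not the tool actually used here:

```python
import sys
import time
from collections import defaultdict

def profile_lines(func, *args, **kwargs):
    """Run func and return (result, hits, seconds), keyed by line number.

    Time spent on a line includes any calls it makes; the last line
    before return is not timed -- good enough for a sketch.
    """
    hits = defaultdict(int)
    seconds = defaultdict(float)
    code = func.__code__
    last = {}  # previous line number and timestamp within func

    def tracer(frame, event, arg):
        if frame.f_code is code and event == "line":
            now = time.perf_counter()
            if last:
                # Attribute the elapsed time to the line that just finished.
                seconds[last["line"]] += now - last["t"]
            hits[frame.f_lineno] += 1
            last["line"], last["t"] = frame.f_lineno, now
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(None)
    return result, dict(hits), dict(seconds)
```

The real line_profiler additionally reports Per Hit and % Time columns, as in the tables in this PR.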

PyServing, BOW+CNN two-model (thread version, grpc-async-impl)

Total time: 12.3527 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    20                                           def func():
    21         1          9.0      9.0      0.0      client = PyClient()
    22         1       5916.0   5916.0      0.0      client.connect('localhost:8080')
    23
    24         1          1.0      1.0      0.0      words = 'i am very sad | 0'
    25         1          6.0      6.0      0.0      imdb_dataset = IMDBDataset()
    26         1     104786.0 104786.0      0.8      imdb_dataset.load_resource('imdb.vocab')
    27
    28      2085       8467.0      4.1      0.1      for line in sys.stdin:
    29      2084     877019.0    420.8      7.1          word_ids, label = imdb_dataset.get_words_and_label(line)
    30      2084   11356504.0   5449.4     91.9          fetch_map = client.predict(feed={"words": word_ids}, fetch=["prediction"])

PyServing, LSTM+LSTM two-model (thread version, grpc-async-impl)

Total time: 21.9295 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    20                                           def func():
    21         1         29.0     29.0      0.0      client = PyClient()
    22         1       6088.0   6088.0      0.0      client.connect('localhost:8080')
    23
    24         1          1.0      1.0      0.0      words = 'i am very sad | 0'
    25         1          7.0      7.0      0.0      imdb_dataset = IMDBDataset()
    26         1     228226.0 228226.0      1.0      imdb_dataset.load_resource('imdb.vocab')
    27
    28      2085      11646.0      5.6      0.1      for line in sys.stdin:
    29      2084     965434.0    463.3      4.4          word_ids, label = imdb_dataset.get_words_and_label(line)
    30      2084   20718107.0   9941.5     94.5          fetch_map = client.predict(feed={"words": word_ids}, fetch=["prediction"])

PyServing, LSTM+LSTM two-model (process version, grpc-async-impl)

Total time: 24.3444 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    20                                           def func():
    21         1          7.0      7.0      0.0      client = PyClient()
    22         1       5404.0   5404.0      0.0      client.connect('localhost:8080')
    23
    24         1          1.0      1.0      0.0      words = 'i am very sad | 0'
    25         1          4.0      4.0      0.0      imdb_dataset = IMDBDataset()
    26         1     103255.0 103255.0      0.4      imdb_dataset.load_resource('imdb.vocab')
    27
    28      2085       9329.0      4.5      0.0      for line in sys.stdin:
    29      2084     895440.0    429.7      3.7          word_ids, label = imdb_dataset.get_words_and_label(line)
    30      2084   23329327.0  11194.5     95.8          fetch_map = client.predict(feed={"words": word_ids}, fetch=["prediction"])
    31         1       1675.0   1675.0      0.0      print(fetch_map)

PyServing, LSTM+LSTM two-model (thread version, C++ core)

Total time: 10.3327 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    20                                           def func():
    21         1          6.0      6.0      0.0      client = PyClient()
    22         1       5441.0   5441.0      0.1      client.connect('localhost:8080')
    23
    24         1          1.0      1.0      0.0      words = 'i am very sad | 0'
    25         1          4.0      4.0      0.0      imdb_dataset = IMDBDataset()
    26         1     104695.0 104695.0      1.0      imdb_dataset.load_resource('imdb.vocab')
    27
    28      2085       7459.0      3.6      0.1      for line in sys.stdin:
    29      2084     872195.0    418.5      8.4          word_ids, label = imdb_dataset.get_words_and_label(line)
    30      2084    9339443.0   4481.5     90.4          fetch_map = client.predict(feed={"words": word_ids}, fetch=["prediction"])
    31      2084       2712.0      1.3      0.0          if fetch_map['ecode'] != 0:
    32                                                       print(fetch_map['error_info'])
    33         1        772.0    772.0      0.0      print(fetch_map)

C++ core, BOW single model

Total time: 3.3367 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    20                                           def func():
    21         1          3.0      3.0      0.0      words = 'i am very sad | 0'
    22         1          5.0      5.0      0.0      imdb_dataset = IMDBDataset()
    23         1     101483.0 101483.0      3.0      imdb_dataset.load_resource('imdb.vocab')
    24
    25         1      21312.0  21312.0      0.6      client = Client()
    26         1       2796.0   2796.0      0.1      client.load_client_config('imdb_bow_client_conf/serving_client_conf.prototxt')
    27         1       4403.0   4403.0      0.1      client.connect(["127.0.0.1:8888"])
    28
    29                                               #lp = LineProfiler()
    30                                               #lp_wrapper = lp(client.predict)
    31
    32      2085       7266.0      3.5      0.2      for line in sys.stdin:
    33      2084     840609.0    403.4     25.2          word_ids, label = imdb_dataset.get_words_and_label(line)
    34                                                   #fetch_map = lp_wrapper(
    35      2084       1757.0      0.8      0.1          fetch_map = client.predict(
    36      2084    2357063.0   1131.0     70.6              feed={"words": word_ids}, fetch=["prediction"])

C++ core, BOW+CNN two-model

Total time: 3.7254 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    21                                           def func():
    22         1      18822.0  18822.0      0.5      client = Client()
    23                                               # If you have more than one model, make sure that the input
    24                                               # and output of more than one model are the same.
    25         1       2541.0   2541.0      0.1      client.load_client_config('imdb_bow_client_conf/serving_client_conf.prototxt')
    26         1       4270.0   4270.0      0.1      client.connect(["127.0.0.1:8889"])
    27
    28                                               # you can define any english sentence or dataset here
    29                                               # This example reuses imdb reader in training, you
    30                                               # can define your own data preprocessing easily.
    31         1         11.0     11.0      0.0      imdb_dataset = IMDBDataset()
    32         1     107277.0 107277.0      2.9      imdb_dataset.load_resource('imdb.vocab')
    33
    34      2085       7840.0      3.8      0.2      for line in sys.stdin:
    35      2084     858674.0    412.0     23.0          word_ids, label = imdb_dataset.get_words_and_label(line)
    36      2084       3835.0      1.8      0.1          feed = {"words": word_ids}
    37      2084       1556.0      0.7      0.0          fetch = ["prediction"]
    38      2084    2717955.0   1304.2     73.0          fetch_maps = client.predict(feed=feed, fetch=fetch)

C++ core, LSTM+LSTM two-model

Total time: 16.3102 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    21                                           def func():
    22         1     453148.0 453148.0      2.8      client = Client()
    23                                               # If you have more than one model, make sure that the input
    24                                               # and output of more than one model are the same.
    25         1       5346.0   5346.0      0.0      client.load_client_config('imdb_bow_client_conf/serving_client_conf.prototxt')
    26         1       4825.0   4825.0      0.0      client.connect(["127.0.0.1:8889"])
    27
    28                                               # you can define any english sentence or dataset here
    29                                               # This example reuses imdb reader in training, you
    30                                               # can define your own data preprocessing easily.
    31         1         13.0     13.0      0.0      imdb_dataset = IMDBDataset()
    32         1     162498.0 162498.0      1.0      imdb_dataset.load_resource('imdb.vocab')
    33
    34      2085      12260.0      5.9      0.1      for line in sys.stdin:
    35      2084     942797.0    452.4      5.8          word_ids, label = imdb_dataset.get_words_and_label(line)
    36      2084       6451.0      3.1      0.0          feed = {"words": word_ids}
    37      2084       1705.0      0.8      0.0          fetch = ["prediction"]
    38      2084   14718960.0   7062.8     90.2          fetch_maps = client.predict(feed=feed, fetch=fetch)
    39         1       2219.0   2219.0      0.0      print(fetch_maps)
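As a sanity check, the headline percentages in the conclusions can be recomputed from the "Total time" lines of the LSTM+LSTM profiles above; the LSTM ratios match exactly, while the BOW/CNN figures differ slightly, suggesting those came from a separate run. A small sketch (names are illustrative, values copied from the tables above):

```python
# "Total time" values (seconds) from the LSTM+LSTM profiles above.
totals_s = {
    "pyserving_lstm2_thread_grpc": 21.9295,   # PyServing, thread version, grpc
    "pyserving_lstm2_process_grpc": 24.3444,  # PyServing, process version, grpc
    "pyserving_lstm2_brpc": 10.3327,          # PyServing, thread version, C++ core (brpc)
    "cpp_core_lstm2": 16.3102,                # C++ core baseline
}

def pct(name, baseline="cpp_core_lstm2"):
    """Total time relative to the C++ core baseline, in percent."""
    return 100.0 * totals_s[name] / totals_s[baseline]

print(f"{pct('pyserving_lstm2_thread_grpc'):.0f}%")   # 134% -- the "134%" above
print(f"{pct('pyserving_lstm2_process_grpc'):.0f}%")  # 149% -- "slightly higher" than thread
print(f"{pct('pyserving_lstm2_brpc'):.1f}%")          # 63.4% -- the "63.3%" above, up to rounding
```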
Source branch: github/fork/barrierye/pyserving