PaddlePaddle / ERNIE
Opened Nov 06, 2019 by saxon_zh (@saxon_zh, Guest)

Client request fails after setting up the vector server: Assertion `id < N` failed

Created by: fengyunshuo

Step 1: Install Paddle using Docker

docker pull hub.baidubce.com/paddlepaddle/paddle:1.6.0-gpu-cuda10.0-cudnn7

docker run --runtime nvidia --name paddle -p 8129:8888 -it -v $PWD:/paddle hub.baidubce.com/paddlepaddle/paddle:1.6.0-gpu-cuda10.0-cudnn7 /bin/bash

Checking with the Python interpreter inside the container prints: Your Paddle Fluid is installed succesfully!

Step 2: Set environment variables and install the required packages

export CUDA_VISIBLE_DEVICES=0

pip3 install zmq

export PYTHONPATH=./:$PYTHONPATH

Unpack the Chinese pretrained model, and so on.

Step 3: Start the vector server

python3 ernie/service/encoder_server.py -m ./inf_model/ -p 8888 -v --encode_layer pooler

Output:

[INFO] 2019-11-06 03:03:51,781 [encoder_server.py: 65]: propeller server listent on port 8888
[INFO] 2019-11-06 03:03:51,783 [ server.py: 131]: InferenceProxy starting...
[DEBUG] 2019-11-06 03:03:51,784 [ server.py: 65]: run_worker 0
[DEBUG] 2019-11-06 03:03:51,785 [ server.py: 68]: cuda_env 0
[INFO] 2019-11-06 03:03:51,787 [ server.py: 140]: Queue init done
[DEBUG] 2019-11-06 03:03:53,028 [ server.py: 76]: Predictor building 0
[DEBUG] 2019-11-06 03:03:53,028 [ server.py: 50]: create predictor on card 0
I1106 03:03:53.876863 559 analysis_predictor.cc:88] Profiler is deactivated, and no profiling report will be generated.
I1106 03:03:53.893898 559 op_compatible_info.cc:201] The default operator required version is missing. Please update the model version.
I1106 03:03:53.893931 559 analysis_predictor.cc:847] MODEL VERSION: 0.0.0
I1106 03:03:53.893942 559 analysis_predictor.cc:849] PREDICTOR VERSION: 1.6.0
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [is_test_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [conv_affine_channel_fuse_pass]
--- Running IR pass [conv_eltwiseadd_affine_channel_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass]
I1106 03:03:55.595218 559 graph_pattern_detector.cc:96] --- detected 12 subgraphs
--- Running IR pass [fc_fuse_pass]
I1106 03:03:55.610505 559 graph_pattern_detector.cc:96] --- detected 12 subgraphs
I1106 03:03:55.618679 559 graph_pattern_detector.cc:96] --- detected 25 subgraphs
--- Running IR pass [fc_elementwise_layernorm_fuse_pass]
I1106 03:03:55.633041 559 graph_pattern_detector.cc:96] --- detected 24 subgraphs
--- Running IR pass [conv_elementwise_add_act_fuse_pass]
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
--- Running IR pass [conv_elementwise_add_fuse_pass]
--- Running IR pass [transpose_flatten_concat_fuse_pass]
--- Running IR pass [runtime_context_cache_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
I1106 03:03:55.645728 559 ir_params_sync_among_devices_pass.cc:41] Sync params from CPU to GPU
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [ir_graph_to_program_pass]
I1106 03:03:55.840788 559 analysis_predictor.cc:473] ======= optimize end =======
[DEBUG] 2019-11-06 03:03:55,841 [ server.py: 78]: Predictor 0

The server appears to have started correctly.

Step 4: Send a request from the client

Python code:

from ernie.service.client import ErnieClient
client = ErnieClient('./config/vocab.txt', host='localhost', port=8888)
ret = client(['谁有狂三这张高清的', '英雄联盟什么英雄最好'])

Client-side error:

[INFO] 2019-11-06 03:10:14,655 [ client.py: 60]: Connecting to server... tcp://localhost:8888
[ERROR] 2019-11-06 03:10:24,685 [ client.py: 91]: Resource temporarily unavailable
Traceback (most recent call last):
  File "/home/ERNIE/propeller/service/client.py", line 88, in get
    reply = await socket.recv(zmq.NOBLOCK)
  File "/usr/lib/python3.5/asyncio/futures.py", line 363, in __iter__
    return self.result() # May raise too.
  File "/usr/lib/python3.5/asyncio/futures.py", line 274, in result
    raise self._exception
  File "/usr/local/lib/python3.5/dist-packages/zmq/_future.py", line 315, in _add_recv_event
    r = recv(**kwargs)
  File "zmq/backend/cython/socket.pyx", line 796, in zmq.backend.cython.socket.Socket.recv
  File "zmq/backend/cython/socket.pyx", line 832, in zmq.backend.cython.socket.Socket.recv
  File "zmq/backend/cython/socket.pyx", line 191, in zmq.backend.cython.socket._recv_copy
  File "zmq/backend/cython/socket.pyx", line 186, in zmq.backend.cython.socket._recv_copy
  File "zmq/backend/cython/checkrc.pxd", line 19, in zmq.backend.cython.checkrc._check_rc
zmq.error.Again: Resource temporarily unavailable
Traceback (most recent call last):
  File "test_client.py", line 3, in <module>
    ret = client(['\u8c01\u6709\u72c2\u4e09\u8fd9\u5f20\u9ad8\u6e05\u7684', '\u82f1\u96c4\u8054\u76df\u4ec0\u4e48\u82f1\u96c4\u6700\u597d'])
  File "/home/ERNIE/ernie/service/client.py", line 73, in __call__
    ret, = super(ErnieClient, self).__call__(sen_ids, token_type_ids)
  File "/home/ERNIE/propeller/service/client.py", line 102, in __call__
    raise RuntimeError('Client call failed')
RuntimeError: Client call failed

Server-side error:

W1106 03:21:16.098963 1561 device_context.cc:235] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 10.1, Runtime API Version: 10.0
W1106 03:21:16.105224 1561 device_context.cc:243] device: 0, cuDNN Version: 7.5.
W1106 03:21:19.452029 1561 naive_executor.cc:43] The NaiveExecutor can not work properly if the cmake flag ON_INFER is not set.
W1106 03:21:19.452064 1561 naive_executor.cc:45] Unlike the training phase, all the scopes and variables will be reused to save the allocation overhead.
W1106 03:21:19.452075 1561 naive_executor.cc:48] Please re-compile the inference library by setting the cmake flag ON_INFER=ON if you are running Paddle Inference
F1106 03:21:20.245061 1561 device_context.cc:318] cudaStreamSynchronize unspecified launch failure errno: 4

Followed by many repetitions of the following:

Exception: /paddle/paddle/fluid/operators/lookup_table_op.cu:43 Assertion `id < N` failed. Variable value (input) of OP(fluid.layers.embedding) expected >= 0 and < 18000, but got 4656722015836110848. Please check input value.
Exception: /paddle/paddle/fluid/operators/lookup_table_op.cu:43 Assertion `id < N` failed. Variable value (input) of OP(fluid.layers.embedding) expected >= 0 and < 18000, but got 4656722015836110848. Please check input value.

*** Check failure stack trace: ***
@ 0x7fa91bd2811d google::LogMessage::Fail()
@ 0x7fa91bd2bbcc google::LogMessage::SendToLog()
@ 0x7fa91bd27c43 google::LogMessage::Flush()
@ 0x7fa91bd2d0de google::LogMessageFatal::~LogMessageFatal()
@ 0x7fa91e50b587 paddle::platform::CUDADeviceContext::Wait()
@ 0x7fa91e4a0163 paddle::framework::TransDataDevice()
@ 0x7fa91e49f11e paddle::framework::TransformData()
@ 0x7fa91e48212b paddle::framework::OperatorWithKernel::PrepareData()
@ 0x7fa91e483508 paddle::framework::OperatorWithKernel::RunImpl()
@ 0x7fa91e483af3 paddle::framework::OperatorWithKernel::RunImpl()
@ 0x7fa91e47e2b3 paddle::framework::OperatorBase::Run()
@ 0x7fa91e438ef0 paddle::framework::NaiveExecutor::Run()
@ 0x7fa91be04cac paddle::AnalysisPredictor::Run()
@ 0x7fa91bcf7cd6 ZZN8pybind1112cpp_function10initializeIZN6paddle6pybind12_GLOBAL__N_121BindAnalysisPredictorEPNS_6moduleEEUlRNS2_17AnalysisPredictorERKSt6vectorINS2_12PaddleTensorESaISA_EEE_SC_IS8_SE_EINS_4nameENS_9is_methodENS_7siblingEEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNESW
@ 0x7fa91bbfc176 pybind11::cpp_function::dispatcher()
@ 0x4e1307 PyCFunction_Call
@ 0x535fcb PyEval_EvalFrameEx
@ 0x53a81b PyEval_EvalCodeEx
@ 0x4e3537 (unknown)
@ 0x5c3bd7 PyObject_Call
@ 0x532a22 PyEval_EvalFrameEx
@ 0x53af6a PyEval_EvalCodeEx
@ 0x4e3423 (unknown)
@ 0x5c3bd7 PyObject_Call
@ 0x4f08be (unknown)
@ 0x5c3bd7 PyObject_Call
@ 0x55fbf6 (unknown)
@ 0x5c3bd7 PyObject_Call
@ 0x5354af PyEval_EvalFrameEx
@ 0x53af6a PyEval_EvalCodeEx
@ 0x4e3537 (unknown)
@ 0x5c3bd7 PyObject_Call
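One way to inspect the out-of-range id reported by the embedding assertion is to look at its raw bits. The sketch below is only an illustration and not part of the original report: the helper name `split_int64_as_float32` is hypothetical, and little-endian byte order is assumed. It reinterprets the 8 bytes of the reported value as two float32 numbers.

```python
import struct

# Out-of-range id copied from the embedding assertion in the server log.
bad_id = 4656722015836110848


def split_int64_as_float32(value):
    """Reinterpret the 8 bytes of a signed 64-bit integer as two float32 values.

    Assumption: little-endian byte order, as on typical x86 hosts.
    """
    raw = struct.pack("<q", value)    # 8 raw bytes of the int64
    return struct.unpack("<2f", raw)  # the same bytes read as two float32


low, high = split_int64_as_float32(bad_id)
print(hex(bad_id))  # bit pattern of the reported id
print(low, high)    # the two 32-bit halves read as float32
```

Under these assumptions the two halves decode to the whole numbers 334.0 and 5.0. That would be consistent with float32 data being read as int64 somewhere between client and server, but it is only a hint and does not by itself identify the cause.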

Is this a problem with my environment, or something else? Thanks.

Reference: paddlepaddle/ERNIE#363