Normalize layer fails at inference after converting a Caffe SSD model to fluid
Created by: nizihan
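Environment: paddle 1.6.3
I converted a Caffe model to a fluid model with the conversion tool and then tried to verify the converted model's predictions:
1. Using the converted parameters and the model.py generated under model_with_code/, I built the network and loaded the parameters with load_vars. Inference fails, and I traced the failure to the Normalize layer.
2. Using the model under inference_model/ instead, without defining the network structure and loading the program directly with load_inference_model, I get the same error.
The error output is: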

    F0207 22:01:00.297178 116936 device_context.cc:328] cudaStreamSynchronize an illegal memory access was encountered errno: 77
    *** Check failure stack trace: ***
    @ 0x7f34711e72dd google::LogMessage::Fail()
    @ 0x7f34711ead8c google::LogMessage::SendToLog()
    @ 0x7f34711e6e03 google::LogMessage::Flush()
    @ 0x7f34711ec29e google::LogMessageFatal::~LogMessageFatal()
    @ 0x7f34739a5137 paddle::platform::CUDADeviceContext::Wait()
    @ 0x7f347393ee93 paddle::framework::TransDataDevice()
    @ 0x7f347393d87e paddle::framework::TransformData()
    @ 0x7f347391bfcb paddle::framework::OperatorWithKernel::PrepareData()
    @ 0x7f347391d418 paddle::framework::OperatorWithKernel::RunImpl()
    @ 0x7f347391daf1 paddle::framework::OperatorWithKernel::RunImpl()
    @ 0x7f347391777c paddle::framework::OperatorBase::Run()
    @ 0x7f347122e416 paddle::framework::Executor::RunPreparedContext()
    @ 0x7f3471231c5f paddle::framework::Executor::Run()
    @ 0x7f347106b6ad _ZZN8pybind1112cpp_function10initializeIZN6paddle6pybindL22pybind11_init_core_avxERNS_6moduleEEUlRNS2_9framework8ExecutorERKNS6_11ProgramDescEPNS6_5ScopeEibbRKSt6vectorISsSaISsEEE103_vIS8_SB_SD_ibbSI_EINS_4nameENS_9is_methodENS_7siblingEEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNES10_
    @ 0x7f34710b4816 pybind11::cpp_function::dispatcher()
    @ 0x7f353064dbb8 PyEval_EvalFrameEx
    @ 0x7f35306510bd PyEval_EvalCodeEx
    @ 0x7f353064e345 PyEval_EvalFrameEx
    @ 0x7f35306510bd PyEval_EvalCodeEx
    @ 0x7f353064e345 PyEval_EvalFrameEx
    @ 0x7f35306510bd PyEval_EvalCodeEx
    @ 0x7f353064e345 PyEval_EvalFrameEx
    @ 0x7f35306510bd PyEval_EvalCodeEx
    @ 0x7f353064e345 PyEval_EvalFrameEx
    @ 0x7f35306510bd PyEval_EvalCodeEx
    @ 0x7f35306511f2 PyEval_EvalCode
    @ 0x7f3530679f42 PyRun_FileExFlags
    @ 0x7f353067b2d9 PyRun_SimpleFileExFlags
    @ 0x7f353069100d Py_Main
    @ 0x7f352f88ebd5 __libc_start_main
    @ 0x4007a1 (unknown)
    @ (nil) (unknown)
    Aborted

The Normalize layer is defined in the Caffe deploy prototxt as:

    layer {
      name: "conv4_3_norm"
      type: "Normalize"
      bottom: "conv4_3"
      top: "conv4_3_norm"
      norm_param {
        across_spatial: false
        scale_filler {
          type: "constant"
          value: 20
        }
        channel_shared: false
      }
    }

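For comparing outputs, here is a minimal pure-Python sketch of what I understand this layer to compute when across_spatial and channel_shared are both false: at each spatial position, the channel vector is divided by its L2 norm and then multiplied by a per-channel scale. The function name `normalize_reference` and the `eps` value of 1e-10 are assumptions made for illustration and are not taken from the converted model.

```python
import math


def normalize_reference(x, scale, eps=1e-10):
    """Reference Normalize for one image.

    x:     nested lists indexed as x[c][h][w]
    scale: one value per channel (scale_filler initialises each to 20)
    eps:   small constant for numerical stability (assumed value)
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for h in range(height):
        for w in range(width):
            # L2 norm across channels at this spatial position
            norm = math.sqrt(sum(x[c][h][w] ** 2 for c in range(channels)) + eps)
            for c in range(channels):
                out[c][h][w] = scale[c] * x[c][h][w] / norm
    return out


# Two channels, one pixel: the channel vector (3, 4) has L2 norm 5.
x = [[[3.0]], [[4.0]]]
y = normalize_reference(x, scale=[20.0, 20.0])
print(y)  # values close to [[[12.0]], [[16.0]]]
```

Running the same input through the converted fluid layer and comparing against this reference should show whether the scale is applied along the channel axis.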
But in model.py it is implemented as:

```
def normalize_layer(inputs,
                    across_spatial=None,
                    channel_shared=None,
                    input_shape=None,
                    name=None):
    assert across_spatial == False, "Only support across_spatial == False for Normalize"
    input = inputs[0]
    l2_norm = fluid.layers.l2_normalize(input, axis=1, name=name + '_l2')
    scale_param = fluid.layers.create_parameter(
        shape=[1]
        if channel_shared else [input_shape[0][0], 1, 1, input_shape[0][1]],
        dtype=input.dtype,
        attr=name + '_scale')
    scale_param = fluid.layers.reshape(x=scale_param, \
                shape=[1] if channel_shared else [input_shape[0][0], 1, 1, input_shape[0][1]])
    out = fluid.layers.elementwise_mul(x=l2_norm,
                                       y=scale_param,
                                       axis=-1 if channel_shared else 1)
    return out
```

Problem 1: I traced the failure in the Normalize function to a shape mismatch: l2_norm has shape [1, 512, 52, 52] while scale_param has shape [1, 1, 1, 512], so elementwise_mul fails.

Problem 2: The Caffe deploy file defines the scale as constant(20), but model.py does not define it as a constant.

Problem 3: After changing the shape of scale_param from Problem 1 to [512], inference no longer crashes, but the network output is always [[-1]], with no detection results.
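To make the shape mismatch in Problem 1 concrete, here is a small pure-Python sketch of the broadcasting rule as I read it from the fluid elementwise op documentation: y's shape must match a contiguous run of x's dimensions starting at `axis`, with trailing size-1 dimensions of y ignored. The helper name `y_fits_x` is hypothetical, and this is only an illustration of the rule, not Paddle's actual implementation.

```python
def y_fits_x(x_shape, y_shape, axis):
    """Return True if y_shape can be broadcast onto x_shape starting at axis."""
    y = list(y_shape)
    # axis == -1 means "align y with the trailing dimensions of x"
    if axis == -1:
        axis = len(x_shape) - len(y)
    # trailing dimensions of size 1 in y are ignored
    while y and y[-1] == 1:
        y.pop()
    # y must not run past the last dimension of x
    if axis < 0 or axis + len(y) > len(x_shape):
        return False
    return list(x_shape[axis:axis + len(y)]) == y


l2_norm_shape = [1, 512, 52, 52]
print(y_fits_x(l2_norm_shape, [1, 1, 1, 512], axis=1))  # False
print(y_fits_x(l2_norm_shape, [512], axis=1))           # True
```

Under this reading, a scale of shape [1, 1, 1, 512] with axis=1 would need five dimensions in x, while a scale of shape [512] lines up with the channel dimension.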
Any help would be appreciated.