Modified the ResNet bottleneck structure, but it always fails with an illegal memory error
Created by: cuicheng01
Original bottleneck code:

```python
def bottleneck_block(self, input, num_filters, stride, name):
    # Standard bottleneck: 1x1 reduce -> 3x3 (carries the stride) -> 1x1 expand (4x),
    # plus a projection shortcut.
    conv0 = self.conv_bn_layer(
        input=input, num_filters=num_filters, filter_size=1, stride=1,
        act='relu', name=name + "_branch2a")
    conv1 = self.conv_bn_layer(
        input=conv0, num_filters=num_filters, filter_size=3, stride=stride,
        act='relu', name=name + "_branch2b")
    conv2 = self.conv_bn_layer(
        input=conv1, num_filters=num_filters * 4, filter_size=1, act=None,
        name=name + "_branch2c")

    short = self.shortcut(input, num_filters * 4, stride, is_first=False,
                          name=name + "_branch1")

    return fluid.layers.elementwise_add(x=short, y=conv2, act='relu',
                                        name=name + ".add.output.5")
```
Modified code (introducing kernels of different sizes):

```python
def bottleneck_block(self, input, num_filters, stride, name):
    conv0 = self.conv_bn_layer(
        input=input, num_filters=num_filters, filter_size=1, stride=stride,
        act='relu', name=name + "_branch2a")
    # Split the channels into six groups of increasing width; each group
    # gets its own kernel size (11, 9, 7, 5, 1, 3).
    c = conv0.shape[1]
    xs = fluid.layers.split(conv0, [c // 32, c // 32, c // 16, c // 8, c // 4, c // 2], 1)
    ys = []
    for s in range(6):
        if s == 0:
            ys.append(self.conv_bn_layer(input=xs[s], num_filters=c // 32, filter_size=11, act='relu',
                                         name=name + "_branch2b_" + str(s)))
        elif s == 1:
            ys.append(self.conv_bn_layer(input=xs[s], num_filters=c // 32, filter_size=9, act='relu',
                                         name=name + "_branch2b_" + str(s)))
        elif s == 2:
            ys.append(self.conv_bn_layer(input=xs[s], num_filters=c // 16, filter_size=7, act='relu',
                                         name=name + "_branch2b_" + str(s)))
        elif s == 3:
            ys.append(self.conv_bn_layer(input=xs[s], num_filters=c // 8, filter_size=5, act='relu',
                                         name=name + "_branch2b_" + str(s)))
        elif s == 4:
            ys.append(self.conv_bn_layer(input=xs[s], num_filters=c // 4, filter_size=1, act='relu',
                                         name=name + "_branch2b_" + str(s)))
        else:
            ys.append(self.conv_bn_layer(input=xs[s], num_filters=c // 2, filter_size=3, act='relu',
                                         name=name + "_branch2b_" + str(s)))
    conv1 = fluid.layers.concat(ys, axis=1)
    conv2 = self.conv_bn_layer(
        input=conv1, num_filters=int(num_filters * 2), filter_size=1, act=None,
        name=name + "_branch2c")

    short = self.shortcut(input, int(num_filters * 2), stride, name=name + "_branch1")

    return fluid.layers.elementwise_add(x=short, y=conv2, act='relu')
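Compared with the original block, the stride has moved from branch2b to branch2a, and the output expansion has dropped from num_filters * 4 to num_filters * 2. Two bookkeeping constraints in the rewrite are also easy to violate and worth ruling out before suspecting the framework: fluid.layers.split requires the section list to sum exactly to the size of the split axis, which the c // 32 ... c // 2 sections only do when c is a multiple of 32; and concat along axis 1 needs all six branches to keep the same spatial size, which assumes conv_bn_layer pads each conv with (filter_size - 1) // 2. A minimal pure-Python sketch of the split arithmetic (no Paddle required; the padding rule is an assumption about conv_bn_layer, not confirmed by the snippet above):

```python
# Hypothetical sanity check (pure Python): the section list handed to
# fluid.layers.split must sum exactly to the channel count being split.
def split_sections_ok(c):
    sections = [c // 32, c // 32, c // 16, c // 8, c // 4, c // 2]
    return sum(sections) == c  # holds iff c is a multiple of 32

# Stage widths a standard ResNet bottleneck would feed into branch2a:
for c in (64, 128, 256, 512):
    assert split_sections_ok(c)  # all multiples of 32, so the split is valid

# A width that is not a multiple of 32 silently loses channels:
assert not split_sections_ok(48)  # 1 + 1 + 3 + 6 + 12 + 24 = 47 != 48
```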
Error message:
```
F0806 13:54:13.074252  7719 device_context.cc:333] cudaStreamSynchronize an illegal memory access was encountered errno:77
*** Check failure stack trace: ***
    @     0x7f165d6c281d  google::LogMessage::Fail()
    @     0x7f165d6c62cc  google::LogMessage::SendToLog()
    @     0x7f165d6c2343  google::LogMessage::Flush()
    @     0x7f165d6c77de  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f165f793f8d  _ZNSt17_Function_handlerIFvvEZNK6paddle8platform17CUDADeviceContext4WaitEvEUlvE_E9_M_invokeERKSt9_Any_data
    @     0x7f165f7a3834  paddle::platform::TemporaryAllocator::Release()
    @     0x7f165f7970b1  paddle::platform::CUDADeviceContext::Wait()
    @     0x7f165f4ec2a7  paddle::framework::details::OpHandleBase::RecordWaitEventOnCtx()
    @     0x7f165f4ace75  paddle::framework::details::FetchOpHandle::WaitInputVarGenerated()
    @     0x7f165f4ad524  paddle::framework::details::FetchOpHandle::RunImpl()
    @     0x7f165f4ab7b6  paddle::framework::details::FastThreadedSSAGraphExecutor::RunOpSync()
    @     0x7f165f4aa40f  paddle::framework::details::FastThreadedSSAGraphExecutor::RunOp()
    @     0x7f165f4aa7d0  _ZNSt17_Function_handlerIFvvESt17reference_wrapperISt12_Bind_simpleIFS1_ISt5_BindIFZN6paddle9framework7details28FastThreadedSSAGraphExecutor10RunOpAsyncEPSt13unordered_mapIPNS6_12OpHandleBaseESt6atomicIiESt4hashISA_ESt8equal_toISA_ESaISt4pairIKSA_SC_EEESA_RKSt10shared_ptrINS5_13BlockingQueueImEEEEUlvE_vEEEvEEEE9_M_invokeERKSt9_Any_data
    @     0x7f165d7b7283  std::_Function_handler<>::_M_invoke()
    @     0x7f165d647dd7  std::__future_base::_State_base::_M_do_set()
    @     0x7f16ac597973  __GI___pthread_once
    @     0x7f165f4a5d52  _ZNSt13__future_base11_Task_stateISt5_BindIFZN6paddle9framework7details28FastThreadedSSAGraphExecutor10RunOpAsyncEPSt13unordered_mapIPNS4_12OpHandleBaseESt6atomicIiESt4hashIS8_ESt8equal_toIS8_ESaISt4pairIKS8_SA_EEES8_RKSt10shared_ptrINS3_13BlockingQueueImEEEEUlvE_vEESaIiEFvvEE6_M_runEv
    @     0x7f165d649354  _ZZN10ThreadPoolC1EmENKUlvE_clEv
    @     0x7f1680fa58a0  execute_native_thread_routine
    @     0x7f16ac5921c3  start_thread
    @     0x7f16abbba12d  __clone
    @              (nil)  (unknown)
Aborted
```
Notes: Paddle version 1.5.0. The crash is reproducible with GPU memory optimization both enabled and disabled; it may show up after a few iterations, or right at the start of training.
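One note on reading the trace: cudaStreamSynchronize only surfaces a failure from some kernel launched earlier on the stream, so the stack above points at the synchronization site, not at the op that actually touched illegal memory. A common way to localize the faulting op (standard CUDA debugging, not a Paddle-specific API) is to force synchronous kernel launches, as in the sketch below.

```python
# Hypothetical debugging harness: CUDA_LAUNCH_BLOCKING is a standard CUDA
# environment variable. It must be set before the CUDA context is created,
# i.e. before paddle.fluid initializes the GPU.
import os
os.environ['CUDA_LAUNCH_BLOCKING'] = '1'

import paddle.fluid as fluid  # import only after setting the variable

# ... build and run the network as before. With blocking launches, training
# is much slower, but the illegal-memory-access error is raised by the op
# that triggered it instead of by a later cudaStreamSynchronize.
```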