FetchOpHandle Error after using control_flow.Switch()
Created by: sefira
I want to implement an exponential decay learning rate policy with warmup in https://github.com/sefira/models/blob/ssd_coco/fluid/object_detection/utility.py#L119.
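For context, here is the schedule I'm after, written as plain Python (my own sketch of the semantics, mirroring fluid's piecewise_decay plus a linear warmup ramp; not the actual Fluid code):

```python
# Plain-Python sketch of the target schedule (not Fluid ops): linearly ramp
# from WARM_UP_FACTOR * values[0] up to values[0] during warmup, then decay
# stepwise through `values` at each entry of `boundaries`.
def intended_lr(step, boundaries, values, warm_up_iters, warm_up_factor):
    if step < warm_up_iters:
        alpha = float(step) / warm_up_iters
        return values[0] * (warm_up_factor * (1 - alpha) + alpha)
    for i, boundary in enumerate(boundaries):
        if step < boundary:
            return values[i]
    return values[-1]
```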
But I have found some strange behavior in control_flow.Switch(). This code runs:
```python
with control_flow.Switch() as switch:
    with switch.case(global_step < WARM_UP_ITERS):
        alpha = global_step / WARM_UP_ITERS
        warmup_factor = WARM_UP_FACTOR * (1 - alpha) + alpha
        warmup_val = (values[0] * warmup_factor)
        tensor.assign(warmup_val, lr)
    # one switch case per boundary, as in fluid's piecewise_decay
    for i in range(len(boundaries)):
        boundary_val = tensor.fill_constant(
            shape=[1], dtype='float32', value=float(boundaries[i]))
        value_var = tensor.fill_constant(
            shape=[1], dtype='float32', value=float(values[i]))
        with switch.case(global_step < boundary_val):
            tensor.assign(value_var, lr)
    with switch.default():
        last_value_var = tensor.fill_constant(
            shape=[1],
            dtype='float32',
            value=float(values[len(values) - 1]))
        tensor.assign(last_value_var, lr)
```
The above code runs fine. But with the following code:
```python
with control_flow.Switch() as switch:
    with switch.case(global_step < WARM_UP_ITERS):
        alpha = global_step / WARM_UP_ITERS
        warmup_factor = WARM_UP_FACTOR * (1 - alpha) + alpha
        warmup_val = (values[0] * warmup_factor)
        tensor.assign(warmup_val, lr)
    boundary_val = tensor.fill_constant(
        shape=[1], dtype='float32', value=float(boundaries[1]))
    value_var = tensor.fill_constant(
        shape=[1], dtype='float32', value=float(values[1]))
    with switch.case(global_step < boundary_val):
        tensor.assign(value_var, lr)
    with switch.default():
        last_value_var = tensor.fill_constant(
            shape=[1],
            dtype='float32',
            value=float(values[len(values) - 1]))
        tensor.assign(last_value_var, lr)
```
It crashes with the following stack trace:
```
*** Aborted at 1525768737 (unix time) try "date -d @1525768737" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGSEGV (@0x0) received by PID 40098 (TID 0x7f6e3fbd1700) from PID 0; stack trace: ***
@ 0x318b20f500 (unknown)
@ 0x7f738b7d52d2 paddle::framework::details::FetchOpHandle::RunImpl()
@ 0x7f738b7d8d9a paddle::framework::details::OpHandleBase::Run()
@ 0x7f738b7cef2c _ZNSt17_Function_handlerIFSt10unique_ptrINSt13__future_base12_Result_baseENS2_8_DeleterEEvENS1_12_Task_setterIS0_INS1_7_ResultIvEES3_ESt12_Bind_simpleIFSt17reference_wrapperISt5_BindIFZN6paddle9framework7details24ThreadedSSAGraphExecutor5RunOpEPNSF_13BlockingQueueIPNSF_13VarHandleBaseEEEPNSF_12OpHandleBaseEEUlvE_vEEEvEEvEEE9_M_invokeERKSt9_Any_data
@ 0x7f738b6854af std::__future_base::_State_baseV2::_M_do_set()
@ 0x318b20cb23 (unknown)
@ 0x7f738b7cd3e8 _ZNSt17_Function_handlerIFvvEZN10ThreadPool7enqueueIRZN6paddle9framework7details24ThreadedSSAGraphExecutor5RunOpEPNS5_13BlockingQueueIPNS5_13VarHandleBaseEEEPNS5_12OpHandleBaseEEUlvE_JEEESt6futureINSt9result_ofIFT_DpT0_EE4typeEEOSI_DpOSJ_EUlvE_E9_M_invokeERKSt9_Any_data
@ 0x7f738b7d2779 _ZNSt6thread5_ImplISt12_Bind_simpleIFZN10ThreadPoolC4EmEUlvE_vEEE6_M_runEv
@ 0x7f742ee2b640 execute_native_thread_routine
@ 0x318b207851 (unknown)
@ 0x318aee767d (unknown)
@ 0x0 (unknown)
```
The only difference between the two snippets is that the second one replaces the for loop with a single hard-coded case.
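In case it helps with reproducing, here is my attempt at a minimal standalone version of the failing construction. This is only a sketch: it assumes the fluid.layers.tensor / fluid.layers.control_flow API used above, the boundaries/values/warmup numbers are placeholders, and global_step is a plain fill_constant rather than the real step counter from utility.py. Also note the trace goes through ThreadedSSAGraphExecutor and FetchOpHandle, i.e. the ParallelExecutor fetch path, while this sketch fetches through a plain Executor for brevity, so it may not exercise exactly the same path.

```python
import paddle.fluid as fluid
from paddle.fluid.layers import control_flow, tensor

# placeholder schedule values, not the real ones from my training config
boundaries = [60000.0, 80000.0]
values = [0.001, 0.0001, 0.00001]
WARM_UP_FACTOR = 1.0 / 3.0

lr = tensor.create_global_var(
    shape=[1], value=0.0, dtype='float32',
    persistable=True, name='learning_rate')
# stand-in for the real global step counter
global_step = tensor.fill_constant(shape=[1], dtype='float32', value=0.0)
WARM_UP_ITERS = tensor.fill_constant(shape=[1], dtype='float32', value=500.0)

with control_flow.Switch() as switch:
    with switch.case(global_step < WARM_UP_ITERS):
        alpha = global_step / WARM_UP_ITERS
        warmup_factor = WARM_UP_FACTOR * (1 - alpha) + alpha
        tensor.assign(values[0] * warmup_factor, lr)
    # the inlined (loop-free) second case from the failing snippet
    boundary_val = tensor.fill_constant(
        shape=[1], dtype='float32', value=float(boundaries[1]))
    value_var = tensor.fill_constant(
        shape=[1], dtype='float32', value=float(values[1]))
    with switch.case(global_step < boundary_val):
        tensor.assign(value_var, lr)
    with switch.default():
        last_value_var = tensor.fill_constant(
            shape=[1], dtype='float32', value=float(values[-1]))
        tensor.assign(last_value_var, lr)

exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())
print(exe.run(fluid.default_main_program(), fetch_list=[lr]))
```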