FetchOpHandle Error after using control_flow.Switch()
Created by: sefira
I want to implement an exponential decay learning rate policy with warmup in https://github.com/sefira/models/blob/ssd_coco/fluid/object_detection/utility.py#L119.
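For context, here is the schedule I'm after, written as plain Python (my own sketch of the semantics, mirroring fluid's piecewise_decay plus a linear warmup ramp; not the actual Fluid code):

```python
# Plain-Python sketch of the target schedule (not Fluid ops): linearly ramp
# from WARM_UP_FACTOR * values[0] up to values[0] during warmup, then decay
# stepwise through `values` at each entry of `boundaries`.
def intended_lr(step, boundaries, values, warm_up_iters, warm_up_factor):
    if step < warm_up_iters:
        alpha = float(step) / warm_up_iters
        return values[0] * (warm_up_factor * (1 - alpha) + alpha)
    for i, boundary in enumerate(boundaries):
        if step < boundary:
            return values[i]
    return values[-1]
```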
But I have found some strange behavior in control_flow.Switch(). This code runs:
```python
with control_flow.Switch() as switch:
    with switch.case(global_step < WARM_UP_ITERS):
        alpha = global_step / WARM_UP_ITERS
        warmup_factor = WARM_UP_FACTOR * (1 - alpha) + alpha
        warmup_val = (values[0] * warmup_factor)
        tensor.assign(warmup_val, lr)
    # one switch case per boundary, as in fluid's piecewise_decay
    for i in range(len(boundaries)):
        boundary_val = tensor.fill_constant(
            shape=[1], dtype='float32', value=float(boundaries[i]))
        value_var = tensor.fill_constant(
            shape=[1], dtype='float32', value=float(values[i]))
        with switch.case(global_step < boundary_val):
            tensor.assign(value_var, lr)
    with switch.default():
        last_value_var = tensor.fill_constant(
            shape=[1],
            dtype='float32',
            value=float(values[len(values) - 1]))
        tensor.assign(last_value_var, lr)
```
The above code runs fine. But with the following code:
```python
with control_flow.Switch() as switch:
    with switch.case(global_step < WARM_UP_ITERS):
        alpha = global_step / WARM_UP_ITERS
        warmup_factor = WARM_UP_FACTOR * (1 - alpha) + alpha
        warmup_val = (values[0] * warmup_factor)
        tensor.assign(warmup_val, lr)
    boundary_val = tensor.fill_constant(
        shape=[1], dtype='float32', value=float(boundaries[1]))
    value_var = tensor.fill_constant(
        shape=[1], dtype='float32', value=float(values[1]))
    with switch.case(global_step < boundary_val):
        tensor.assign(value_var, lr)
    with switch.default():
        last_value_var = tensor.fill_constant(
            shape=[1],
            dtype='float32',
            value=float(values[len(values) - 1]))
        tensor.assign(last_value_var, lr)
```
It crashes with the following stack trace:
```
*** Aborted at 1525768737 (unix time) try "date -d @1525768737" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGSEGV (@0x0) received by PID 40098 (TID 0x7f6e3fbd1700) from PID 0; stack trace: ***
@ 0x318b20f500 (unknown)
@ 0x7f738b7d52d2 paddle::framework::details::FetchOpHandle::RunImpl()
@ 0x7f738b7d8d9a paddle::framework::details::OpHandleBase::Run()
@ 0x7f738b7cef2c _ZNSt17_Function_handlerIFSt10unique_ptrINSt13__future_base12_Result_baseENS2_8_DeleterEEvENS1_12_Task_setterIS0_INS1_7_ResultIvEES3_ESt12_Bind_simpleIFSt17reference_wrapperISt5_BindIFZN6paddle9framework7details24ThreadedSSAGraphExecutor5RunOpEPNSF_13BlockingQueueIPNSF_13VarHandleBaseEEEPNSF_12OpHandleBaseEEUlvE_vEEEvEEvEEE9_M_invokeERKSt9_Any_data
@ 0x7f738b6854af std::__future_base::_State_baseV2::_M_do_set()
@ 0x318b20cb23 (unknown)
@ 0x7f738b7cd3e8 _ZNSt17_Function_handlerIFvvEZN10ThreadPool7enqueueIRZN6paddle9framework7details24ThreadedSSAGraphExecutor5RunOpEPNS5_13BlockingQueueIPNS5_13VarHandleBaseEEEPNS5_12OpHandleBaseEEUlvE_JEEESt6futureINSt9result_ofIFT_DpT0_EE4typeEEOSI_DpOSJ_EUlvE_E9_M_invokeERKSt9_Any_data
@ 0x7f738b7d2779 _ZNSt6thread5_ImplISt12_Bind_simpleIFZN10ThreadPoolC4EmEUlvE_vEEE6_M_runEv
@ 0x7f742ee2b640 execute_native_thread_routine
@ 0x318b207851 (unknown)
@ 0x318aee767d (unknown)
@ 0x0 (unknown)
```
The only difference between the two snippets is that the second one replaces the for loop with a single hard-coded case.
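In case it helps with reproducing, here is my attempt at a minimal standalone version of the failing construction. This is only a sketch: it assumes the fluid.layers.tensor / fluid.layers.control_flow API used above, the boundaries/values/warmup numbers are placeholders, and global_step is a plain fill_constant rather than the real step counter from utility.py. Also note the trace goes through ThreadedSSAGraphExecutor and FetchOpHandle, i.e. the ParallelExecutor fetch path, while this sketch fetches through a plain Executor for brevity, so it may not exercise exactly the same path.

```python
import paddle.fluid as fluid
from paddle.fluid.layers import control_flow, tensor

# placeholder schedule values, not the real ones from my training config
boundaries = [60000.0, 80000.0]
values = [0.001, 0.0001, 0.00001]
WARM_UP_FACTOR = 1.0 / 3.0

lr = tensor.create_global_var(
    shape=[1], value=0.0, dtype='float32',
    persistable=True, name='learning_rate')
# stand-in for the real global step counter
global_step = tensor.fill_constant(shape=[1], dtype='float32', value=0.0)
WARM_UP_ITERS = tensor.fill_constant(shape=[1], dtype='float32', value=500.0)

with control_flow.Switch() as switch:
    with switch.case(global_step < WARM_UP_ITERS):
        alpha = global_step / WARM_UP_ITERS
        warmup_factor = WARM_UP_FACTOR * (1 - alpha) + alpha
        tensor.assign(values[0] * warmup_factor, lr)
    # the inlined (loop-free) second case from the failing snippet
    boundary_val = tensor.fill_constant(
        shape=[1], dtype='float32', value=float(boundaries[1]))
    value_var = tensor.fill_constant(
        shape=[1], dtype='float32', value=float(values[1]))
    with switch.case(global_step < boundary_val):
        tensor.assign(value_var, lr)
    with switch.default():
        last_value_var = tensor.fill_constant(
            shape=[1], dtype='float32', value=float(values[-1]))
        tensor.assign(last_value_var, lr)

exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())
print(exe.run(fluid.default_main_program(), fetch_list=[lr]))
```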