1. 11 12月, 2019 7 次提交
  2. 10 12月, 2019 13 次提交
  3. 09 12月, 2019 13 次提交
  4. 07 12月, 2019 2 次提交
  5. 06 12月, 2019 5 次提交
    • Z
      Polish op registry codes (#21561) · 0f888836
      Zeng Jinle 提交于
      * polish infer shape registry, test=develop
      
      * modify some operators registry, test=develop
      0f888836
    • A
      3d9dee57
    • Z
    • Z
      97e76cb9
    • H
      Add Much Complex Test and Fix Bugs for Control Flow cond API (#21532) · 1dcf6a72
      Huihuang Zheng 提交于
      Add tests to use dy/dx to make sure the gradient values calculated by the control flow backward is correct. Also fixed bugs detected by those tests.
      
      Fix bugs:
      
      1. Unlike sum_op, optimizer ops don't allow uninitialized input tensor. But in conditional_block_grad_op, since the conditional_block may not run, the output gradient tensor may be uninitialized, which will cause the optimizer op error. To fix it, we should let optimizer ops support uninitialized input like sum_op or assign the uninitialized gradient to 0 when the conditional_block_grad_op doesn't run. I found there are about 10+ optimizer ops. **To be simpler, I just assign output gradient of the conditional_block_grad_op to 0 in this PR**. But it can be further explored whether we can make optimizer ops like sum_op to support uninitialized input tensor because theoretically we can speed up without the assigning in conditional_block_grad_op.
      
      2. Infer parameter shapes during append_backward. I didn't know that all our parameters are in global block. When op_desc is inferring shapes at the sub-block, it may not know the shape of gradients of parameters whose shape information is at global block. I fixed it by inferring shapes of gradients from forward var.
      
      This PR also did some code clean up:
      1. Print the var name when sgd_op catches shape error so that it is easier to debug
      2. Fix a typo: dicta -> dict
      1dcf6a72