Commits · 00c8ee18ce0ba2bdba75fe7f12b1d6be46fdedd0 · Crayon鑫 / Paddle

20 May, 2020 1 commit

Fix test_cond flaky test under Windows (#24633) · 00c8ee18

Committed by Huihuang Zheng on May 20, 2020

Previously, test_cond failed with about 2% probability, and the failure was easy to reproduce.

Now I have re-run it 300 times and no failure occurred. If the failure rate were still 2%, the probability of 300 consecutive passes would be (1 - 2%) ^ 300 ~= 0.002. We can say the random failure has disappeared. Maybe someone fixed some bugs in PE (ParallelExecutor).
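The estimate above can be checked with a short calculation. This is a minimal sketch, assuming the runs are independent and share one fixed per-run failure rate; the function name `prob_all_pass` is hypothetical and not part of Paddle.

```python
def prob_all_pass(failure_rate: float, runs: int) -> float:
    """Probability that `runs` independent runs all pass,
    given a fixed per-run failure probability."""
    return (1.0 - failure_rate) ** runs


# Chance of 300 clean runs if the old 2% failure rate still applied.
p = prob_all_pass(0.02, 300)
print(f"P(no failure in 300 runs) = {p:.5f}")
```

A small value means that a clean streak of this length would be unlikely if the old failure rate still held.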

12 Apr, 2020 1 commit

Error Message Enhancement (#23483) · 1d3b0134

Committed by Huihuang Zheng on Apr 12, 2020

This PR enhances error messages of several API/OPs:

ParallelExecutor (python && C++)
Executor (python && C++)
StaticRNN (python)
IfElse (python)
cond (python)
split_lod_tensor (python && C++)

11 Apr, 2020 1 commit

Temporary Disable Flaky test_cond Under Windows (#23424) · 4c57e395

Committed by Huihuang Zheng on Apr 11, 2020

The flaky Windows test is hard to debug. It exits with code 0xc0000374 without any log, so we don't know where or why it fails. The probability of failure is about 1/50.

I spent 3 days on it and found that it happens only with PE + control flow + Windows. Exit code 0xc0000374 indicates heap corruption or an access violation, but I found that memory was sufficient during debugging. There was no failure in 500+ Linux tests. I suspect the cause is a difference in multithreading between Windows and Linux, but I don't have time to debug it fully now. I will temporarily disable the test and fix it in the coming days.
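One common way to disable a Python test on a single platform is `unittest.skipIf`. The sketch below illustrates that pattern only; it is not taken from the PR, which may have disabled the test elsewhere (for example in the build configuration). The class name, test name, and placeholder body are hypothetical.

```python
import sys
import unittest

# True on Windows interpreters ("win32"), False on Linux and macOS.
IS_WINDOWS = sys.platform.startswith("win")


class TestCondFlaky(unittest.TestCase):  # hypothetical test case
    @unittest.skipIf(IS_WINDOWS, "flaky under Windows: exit code 0xc0000374")
    def test_cond_with_parallel_executor(self):
        # Placeholder body; the real test would build and run a cond program.
        self.assertEqual(1 + 1, 2)


def run_suite() -> unittest.TestResult:
    """Run the hypothetical test case and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestCondFlaky)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping the reason string next to the decorator makes it easy to find and remove the skip once the underlying bug is fixed.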

26 Feb, 2020 1 commit

support control flow cond in dygraph mode (#22693) · b813c948

Committed by songyouwei on Feb 26, 2020

* dygraph support cond op
test=develop

* unittest coverage
test=develop

* fix coverage
test=develop

* fix for coverage
test=develop

* refine TypeError msg
test=develop

* remove restrict
test=develop
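In an eagerly executed (dygraph-style) setting, a conditional can be evaluated directly: the predicate is already a concrete value, so only the chosen branch function needs to run. The following is a minimal pure-Python sketch of that idea under this assumption. It is not Paddle's implementation, and `eager_cond` is a hypothetical name.

```python
from typing import Any, Callable, Optional


def eager_cond(pred: Any,
               true_fn: Optional[Callable[[], Any]] = None,
               false_fn: Optional[Callable[[], Any]] = None) -> Any:
    """Run exactly one branch, chosen by the concrete value of `pred`."""
    # Reject non-callable branches up front with a descriptive message.
    for name, fn in (("true_fn", true_fn), ("false_fn", false_fn)):
        if fn is not None and not callable(fn):
            raise TypeError(f"{name} must be callable, got {type(fn).__name__}")
    chosen = true_fn if bool(pred) else false_fn
    # A missing branch is treated as a no-op that returns None.
    return chosen() if chosen is not None else None


print(eager_cond(3 > 2, lambda: "true branch", lambda: "false branch"))
```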

06 Jan, 2020 1 commit

Add ParallelExecutor Test for Cond API and Fix PE Checks Shape Bug (#22029) · dd436156

Committed by Huihuang Zheng on Jan 06, 2020

18 Dec, 2019 1 commit

Fix Backward Bugs in Conditional Block (#21809) · 557bce77

Committed by Huihuang Zheng on Dec 18, 2019

Bugs fixed:

1. The condition sub-graph was not pruned.
2. When the backward graph was extremely simple, all backward ops were pruned.

06 Dec, 2019 1 commit

Add Much Complex Test and Fix Bugs for Control Flow cond API (#21532) · 1dcf6a72

Committed by Huihuang Zheng on Dec 06, 2019

Add tests that use dy/dx to make sure the gradient values calculated by the control flow backward pass are correct. Also fix the bugs detected by those tests.
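A dy/dx test of this kind compares the gradient produced by the backward pass with a numerical estimate. The sketch below shows the idea on a scalar function with a data-dependent branch, using a central finite difference. It is pure Python and is not the PR's test code; the function names and the tolerance are assumptions.

```python
def f(x: float) -> float:
    """Piecewise function with a data-dependent branch, like a cond."""
    if x < 1.0:
        return x * x      # "true" branch
    return 3.0 * x        # "false" branch


def analytic_grad(x: float) -> float:
    """Hand-derived dy/dx for each branch of f."""
    return 2.0 * x if x < 1.0 else 3.0


def numeric_grad(fn, x: float, eps: float = 1e-6) -> float:
    """Central finite-difference estimate of dy/dx."""
    return (fn(x + eps) - fn(x - eps)) / (2.0 * eps)


def grads_match(x: float, tol: float = 1e-4) -> bool:
    """True when the analytic and numeric gradients agree within tol."""
    return abs(numeric_grad(f, x) - analytic_grad(x)) < tol


# Check one point in each branch, away from the branch boundary at x = 1.
print(grads_match(0.5), grads_match(2.0))
```

Sample points should stay away from the branch boundary, where the finite difference straddles both branches and the estimate is not meaningful.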

Fix bugs:

1. Unlike sum_op, optimizer ops don't allow uninitialized input tensors. But in conditional_block_grad_op, since the conditional_block may not run, the output gradient tensor may be uninitialized, which causes an error in the optimizer op. To fix it, we should either let optimizer ops support uninitialized input as sum_op does, or set the uninitialized gradient to 0 when the conditional_block_grad_op doesn't run. I found there are 10+ optimizer ops. **To keep it simple, I just set the output gradient of the conditional_block_grad_op to 0 in this PR**. It can be further explored whether optimizer ops can support uninitialized input tensors as sum_op does, because in theory skipping the assignment in conditional_block_grad_op would be faster.

2. Infer parameter shapes during append_backward. I didn't know that all our parameters are in the global block. When op_desc infers shapes in the sub-block, it may not know the shapes of gradients of parameters whose shape information is in the global block. I fixed it by inferring the shapes of gradients from the forward vars.
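The zero-fill approach from item 1 can be illustrated with a small pure-Python sketch. It only mirrors the idea described above (emit a zero gradient when the block did not run, so that a strict optimizer step never sees an uninitialized input). It is not Paddle code; all names and the placeholder backward rule are hypothetical.

```python
from typing import List, Optional


def block_grad(ran: bool, upstream: List[float], size: int) -> List[float]:
    """Gradient for a parameter used only inside a conditional block."""
    if ran:
        # Placeholder backward rule standing in for the real computation.
        return [2.0 * g for g in upstream]
    # Block was skipped: zero-fill instead of leaving the gradient unset.
    return [0.0] * size


def sgd_step(param: List[float], grad: Optional[List[float]],
             lr: float) -> List[float]:
    """Strict optimizer step that rejects uninitialized gradients."""
    if grad is None:
        raise ValueError("uninitialized gradient")
    return [p - lr * g for p, g in zip(param, grad)]


param = [1.0, 2.0]
# With the block skipped, the zero gradient leaves the parameter unchanged.
print(sgd_step(param, block_grad(False, [1.0, 1.0], len(param)), lr=0.1))
```

The trade-off matches the one noted in the commit message: zero-filling costs an extra assignment, but it avoids changing every optimizer op.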

This PR also does some code cleanup:
1. Print the var name when sgd_op catches a shape error so that it is easier to debug
2. Fix a typo: dicta -> dict

29 Nov, 2019 1 commit

Fix Cond Bug for Nested Control Flow (#21340) · 630be319

Committed by Huihuang Zheng on Nov 29, 2019

* Commit before merging develop

test=develop

* Backup after working with Huihuang logs

* Commit before deleting Huihuang debug loggings

* Commit before debug

test=develop

* Fix bug commit

test=develop

* Backup of fixing bugs

test=develop

* Clean up code

test=develop

* Fix a bug in sum_op

test=develop

11 Nov, 2019 1 commit

Add basic Python Cond Layer (#21050) · e64d55f0

Committed by Huihuang Zheng on Nov 11, 2019
