- 20 May, 2020: 1 commit

Submitted by Huihuang Zheng
In the past, test_cond failed with about 2% probability and the failure was easy to reproduce. Now I have re-run it 300 times with no failures. If the failure rate were still 2%, the probability of 300 consecutive passes would be (1 - 0.02)^300 ≈ 0.002, so we can say the random failure has disappeared. Maybe someone fixed some bugs in ParallelExecutor (PE).
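As a quick sanity check of that arithmetic (plain Python):

```python
# If the failure rate were still 2%, the chance of 300 consecutive
# passes would be (1 - 0.02) ** 300, i.e. roughly 0.2%.
p_all_pass = (1 - 0.02) ** 300
print(p_all_pass)  # ~0.0023
```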

- 12 Apr, 2020: 1 commit

Submitted by Huihuang Zheng
This PR enhances the error messages of several APIs/OPs:

- ParallelExecutor (Python && C++)
- Executor (Python && C++)
- StaticRNN (Python)
- IfElse (Python)
- cond (Python)
- split_lod_tensor (Python && C++)
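A hypothetical sketch of the style of input check such an enhancement adds; the helper name and message wording here are illustrative, not the PR's actual code:

```python
# Illustrative helper: raise a descriptive TypeError instead of letting a
# bad argument fail deep inside the framework.
def _check_callable(fn, arg_name, api_name):
    if fn is not None and not callable(fn):
        raise TypeError(
            "The '{}' argument of {} must be callable, but received {}.".format(
                arg_name, api_name, type(fn).__name__))

# For example, cond(pred, true_fn, false_fn) could validate its branches:
_check_callable(lambda: None, 'true_fn', 'fluid.layers.cond')  # passes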

- 11 Apr, 2020: 1 commit

Submitted by Huihuang Zheng
The flaky Windows test is hard to debug: it exits with code 0xc0000374 without any log, so we don't know where or why it fails. The failure probability is about 1/50. I spent 3 days on it and found it happens only with the combination of PE + control flow + Windows. Exit code 0xc0000374 indicates heap corruption or an access violation, but I found there was enough memory during debugging. There were no failures across 500+ Linux test runs. I suspect the cause is a multithreading difference between Windows and Linux, but I don't have time to debug it completely now. I will temporarily disable the test and fix it in the coming days.
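A hedged sketch of disabling a test on Windows only; the test class and method names are illustrative, not the repository's actual test code:

```python
import sys
import unittest

# Skip the flaky case on Windows only; it still runs everywhere else.
@unittest.skipIf(sys.platform == 'win32',
                 "Flaky under PE + control flow on Windows, "
                 "exit code 0xc0000374; temporarily disabled.")
class TestCondWithParallelExecutor(unittest.TestCase):
    def test_cond(self):
        pass  # the real test body would go here
```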

- 26 Feb, 2020: 1 commit

Submitted by songyouwei
* dygraph support cond op test=develop
* unittest coverage test=develop
* fix coverage test=develop
* fix for coverage test=develop
* refine TypeError msg test=develop
* remove restrict test=develop
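A minimal sketch of cond under dygraph after this change, assuming the fluid 1.x-era API: the predicate is evaluated eagerly and only the taken branch is executed.

```python
import numpy as np
import paddle.fluid as fluid

with fluid.dygraph.guard():
    x = fluid.dygraph.to_variable(np.array([3.0], dtype='float32'))
    y = fluid.dygraph.to_variable(np.array([5.0], dtype='float32'))
    pred = fluid.layers.less_than(x, y)  # True here
    # Only the true branch runs, since pred is evaluated immediately.
    out = fluid.layers.cond(pred, lambda: x + y, lambda: x - y)
    print(out.numpy())  # expected: [8.]
```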

- 06 Jan, 2020: 1 commit

Submitted by Huihuang Zheng

- 18 Dec, 2019: 1 commit

Submitted by Huihuang Zheng
The fixed bugs:

1. The condition sub-graph was not pruned.
2. When the backward graph is extremely simple, all of the backward ops were pruned (see the sketch below for an illustration of this case).
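A hedged illustration of case 2, not the PR's actual test: the loss is just the mean of the cond output, so an over-aggressive pruning pass could drop every backward op.

```python
import paddle.fluid as fluid

# An "extremely simple" backward graph built around a single cond.
x = fluid.data(name='x', shape=[1], dtype='float32')
x.stop_gradient = False
a = fluid.layers.fill_constant(shape=[1], dtype='float32', value=0.23)
b = fluid.layers.fill_constant(shape=[1], dtype='float32', value=0.25)
out = fluid.layers.cond(fluid.layers.less_than(a, b),
                        lambda: x * 2.0, lambda: x * 3.0)
loss = fluid.layers.mean(out)
fluid.backward.append_backward(loss)

# After the fix, the backward ops should still be present in the program.
op_types = [op.type for op in
            fluid.default_main_program().global_block().ops]
assert 'conditional_block_grad' in op_types
```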

- 06 Dec, 2019: 1 commit

Submitted by Huihuang Zheng
Add tests that use dy/dx to make sure the gradient values calculated by the control flow backward pass are correct (a hedged sketch of this style of check follows below), and fix the bugs detected by those tests.

Fixed bugs:

1. Unlike sum_op, optimizer ops don't allow uninitialized input tensors. But in conditional_block_grad_op, since the conditional_block may not run, the output gradient tensor may be uninitialized, which causes the optimizer op to error. To fix it, we could either let optimizer ops support uninitialized inputs like sum_op does, or assign the uninitialized gradient to 0 when the conditional_block_grad_op doesn't run. There are about 10+ optimizer ops. **To keep it simple, this PR just assigns the output gradient of conditional_block_grad_op to 0.** Whether optimizer ops can be made to support uninitialized input tensors like sum_op can be explored further, because theoretically we could then skip the assignment in conditional_block_grad_op and speed things up.
2. Infer parameter shapes during append_backward. All our parameters live in the global block, so when an op_desc infers shapes in a sub-block, it may not know the shapes of parameter gradients whose shape information is in the global block. This is fixed by inferring the shapes of gradients from the corresponding forward vars.

This PR also does some code cleanup:

1. Print the var name when sgd_op catches a shape error, so that it is easier to debug.
2. Fix a typo: dicta -> dict.
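A hedged sketch in the spirit of the dy/dx tests this PR adds (the actual test code may differ): build a cond, take dy/dx with fluid.gradients, and compare the numeric gradient against the taken branch's derivative.

```python
import numpy as np
import paddle.fluid as fluid

# y = 2x if x < 1 else 3x; dy/dx should match the taken branch.
x = fluid.data(name='x', shape=[1], dtype='float32')
x.stop_gradient = False
one = fluid.layers.fill_constant(shape=[1], dtype='float32', value=1.0)
y = fluid.layers.cond(fluid.layers.less_than(x, one),
                      lambda: x * 2.0, lambda: x * 3.0)
dy_dx = fluid.gradients(y, x)

exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())
grad, = exe.run(feed={'x': np.array([0.5], dtype='float32')},
                fetch_list=dy_dx)
# x = 0.5 < 1.0, so the true branch runs and dy/dx should be 2.
assert np.allclose(grad, [2.0])
```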

- 29 Nov, 2019: 1 commit

Submitted by Huihuang Zheng
* Commit before merging develop test=develop
* Backup after working with Huihuang logs
* Commit before deleting Huihuang debug loggings
* Commit before debug test=develop
* Fix bug commit test=develop
* Backup of fixing bugs test=develop
* Clean up code test=develop
* Fix a bug in sum_op test=develop

- 11 Nov, 2019: 1 commit

Submitted by Huihuang Zheng