Commits · 405bb94bd1ef51b51c9fe67ade00c262b169e3ce · PaddlePaddle / Paddle

20 Feb, 2020 · 1 commit

Add Basic Node Var Type Analysis (#22603) · 14672a63

Committed by Huihuang Zheng on Feb 20, 2020

1. Move AstNodeWrapper, StaticAnalysisVisitor to a new Python file: static_analysis.py
2. Add basic node var type analysis
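As a rough illustration of what a basic variable type analysis over an AST can look like, here is a minimal sketch using only Python's standard `ast` module. It is not the code from `static_analysis.py`: the `SimpleVarTypeVisitor` class, its `var_types` mapping and the `analyze` helper are hypothetical names, and the sketch only infers types for variables assigned from literal constants.

```python
import ast


class SimpleVarTypeVisitor(ast.NodeVisitor):
    """Hypothetical visitor: records a type name for variables assigned from literals."""

    def __init__(self):
        self.var_types = {}

    def visit_Assign(self, node):
        # Only literal constants on the right-hand side are handled, e.g. `x = 1`.
        if isinstance(node.value, ast.Constant):
            type_name = type(node.value.value).__name__
            for target in node.targets:
                if isinstance(target, ast.Name):
                    self.var_types[target.id] = type_name
        self.generic_visit(node)


def analyze(source):
    """Parse `source` and return a {variable name: type name} mapping."""
    visitor = SimpleVarTypeVisitor()
    visitor.visit(ast.parse(source))
    return visitor.var_types
```

Assignments whose right-hand side is not a literal (for example `d = a`) are simply skipped here; a fuller analysis would propagate types between names.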

19 Feb, 2020 · 2 commits
- Linear use mul op (#22662) · 0aee4300
  Committed by songyouwei on Feb 19, 2020
```
* Linear use mul op
test=develop

* fix unittest
test=develop
```
- add no_check_list for no_grad_set rule (#22571) · df144e21
  Committed by HappyAngel on Feb 19, 2020

18 Feb, 2020 · 4 commits
- [UT coverage] improve the mul_mkldnn_op line coverage (#22408) · d9262145
  Committed by lidanqing on Feb 18, 2020
```
* improve the mul_mkldnn_op line coverage
test=develop

* remove fp32 mul mkldnn kernel
test=develop

* locally refactoring
test=develop

* change according to reviews
test=develop
```
- add flag to control profile level in python API (#22319) · c65c6ae5
  Committed by wangchaochaohu on Feb 18, 2020
```
* add python flag to control profile level test=develop
```
- support num_flatten_dims=-1 of API fc. (#22634) · 17a6b50f
  Committed by liym27 on Feb 18, 2020
```
* support num_flatten_dims=-1 of API fc. test=develop

* fix name of class Test* and add CUDAPlace test. test=develop
```
- add gast to replace ast test=develop (#22630) · 8b41e2b3
  Committed by Aurelius84 on Feb 18, 2020
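For the `num_flatten_dims=-1` change to `fc` listed above (#22634), the following sketch shows how a flatten index of this kind can split an input shape into a 2-D matrix shape. It is plain Python rather than Paddle code: `flatten_to_2d` is a hypothetical helper, and treating `-1` as `rank - 1` is an assumption based on the commit title, not a copy of the implementation.

```python
from functools import reduce
from operator import mul


def flatten_to_2d(shape, num_flatten_dims=1):
    """Return (rows, cols) after flattening `shape` around `num_flatten_dims`.

    The first `num_flatten_dims` dimensions are multiplied into `rows`,
    the remaining ones into `cols`. A value of -1 is assumed to mean
    "all but the last dimension".
    """
    rank = len(shape)
    if num_flatten_dims == -1:
        num_flatten_dims = rank - 1
    if not 0 < num_flatten_dims < rank:
        raise ValueError("num_flatten_dims must be in [1, rank - 1] or -1")
    rows = reduce(mul, shape[:num_flatten_dims], 1)
    cols = reduce(mul, shape[num_flatten_dims:], 1)
    return rows, cols
```

Under these assumptions, an input of shape `[2, 3, 4, 5]` with `num_flatten_dims=-1` is viewed as a `24 x 5` matrix, so only the last dimension is treated as the feature dimension.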
17 Feb, 2020 · 6 commits

support dumping params/grads in transpiler mode (#22490) · 00594c1c

Committed by 123malin on Feb 17, 2020

fix py_func bug when out is list and add unittest case (#22595) · d8600777

Committed by zhouwei25 on Feb 17, 2020

support set param with None value (#22418) · d9f0c9f5

Committed by songyouwei on Feb 17, 2020

* support reset param with None value

* add unittest
test=develop

* update unittest
test=develop

Add Queue.get delay for multiprocess data loader (#22604) · 19211072

Committed by Chen Weihang on Feb 17, 2020

* add get delay for multiprocess data loader, test=develop

* add unittest for coverage ci, test=develop

* add timeout unittest, test=develop

* increase the delay time, test=develop
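The idea of bounding `Queue.get` with a delay can be sketched with the standard library alone. This is a hedged illustration, not the data loader code from the commit: `get_with_timeout` and its 0.1-second default are hypothetical, and a thread-safe `queue.Queue` stands in for the multiprocess queue.

```python
import queue


def get_with_timeout(q, timeout=0.1):
    """Return (True, item), or (False, None) if nothing arrives within `timeout` seconds."""
    try:
        # Block for at most `timeout` seconds instead of waiting forever.
        return True, q.get(timeout=timeout)
    except queue.Empty:
        return False, None
```

A caller can use the `False` result to check whether worker processes are still alive before retrying, rather than hanging on an empty queue.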

Add TopK Op Grad CPU&GPU Kernel test=develop (#22628) · 8f035fb6

Committed by Jiawei Wang on Feb 17, 2020

* Add TopK Op Grad CPU&GPU Kernel test=develop

* Add TopK Op Grad, modify grad op maker test=develop

* Add TopK Op Grad, modify grad op maker test=develop

* Add TopK Op Grad, modify PADDLE_ENFORCE test=develop

* Add TopK Op Grad, modify PADDLE_THROW test=develop

* Add TopK Op Grad, modify unittest test=develop

* fix ngraph top k op unittest test=develop

Modify english document and unittest of while_loop (#22615) · 9ed59da4

Committed by guofei on Feb 17, 2020
```
Modify english document and unittest of while_loop
```

16 Feb, 2020 · 3 commits
- Fix data loader test failed problem in release 1.7 (#22624) · fc645d8a
  Committed by Chen Weihang on Feb 16, 2020
```
* split unittests in data loader test, test=release/1.7

* split unittests to different files, test=develop

* remove repeat unittest, test=develop
```
- test=develop, add distributed tools (#22623) · e59463ef
  Committed by 123malin on Feb 16, 2020
- add texttable for pretty flag output (#22584) · 1aab3e61
  Committed by tangwei12 on Feb 16, 2020
```
pretty print for communicator flag
```
15 Feb, 2020 · 1 commit

update ops's unittest data type from float32 to float64 and shape over 100 (#22544) · 90ee3666

Committed by Steffy-zxf on Feb 15, 2020

* update ops's unittest of elementwise_pow, elementwise_max, elementwise_min, scale and sqrt
1. update elementwise_pow, elementwise_max and scale's unittests with input data type (float32 -> float64)
2. fix bug that elementwise_pow doesn't meet threshold requirements when handling float64 data
3. remove sqrt from op_accuracy_white_list.py
4. update the unittests of elementwise_pow, elementwise_max and elementwise_min ops so that their input data shape is over 100
5. test=develop

* modify the writing style according to suggestions
test=develop

14 Feb, 2020 · 1 commit
- Adjust sleep time of main process in signal handler test (#22597) · ec907427
  Committed by Chen Weihang on Feb 14, 2020

13 Feb, 2020 · 5 commits
- Enhance load program state (#22546) · 69802396
  Committed by hong on Feb 13, 2020
```
* enhance load program state; test=develop

* optimize comment; test=develop
```
- [Ernie GPU Optim]: Fuse three fc to multihead matmul (#22486) · 8acd745c
  Committed by Zhaolong Xing on Feb 13, 2020
```
* 1. optim multihead matmul: fuse three fc to multihead matmul

test=develop

* fix conflict
test=develop

* fix comments
test=develop
```
- Add Static Analysis to Construct AstNodeWrapper (#22569) · a8dd425a
  Committed by Huihuang Zheng on Feb 13, 2020
```
As the title
```
- Add test with reused requantize op (#22482) · 146ed409
  Committed by joanna.wozna.intel on Feb 13, 2020
```
test=develop
```
- fix traced layer with non persistable vars, test=develop (#22552) · 08033c86
  Committed by Zeng Jinle on Feb 12, 2020

12 Feb, 2020 · 3 commits

Add support for dynamic_decode(while) training. (#22231) · 31b54646

Committed by Guo Sheng on Feb 12, 2020

* Add support for dynamic_decode(while) training. test=develop

* Fix assign_op and tensor_array_read_write_op after solving conflict. test=develop

* Fix test_rnn_decode_api.py. test=develop

* Refine docs for apis in rnn.py. test=develop

* Adjust outputs of dynamic_decode. test=develop

* Remove the force_cpu update in assign_op. test=develop

* Remove the force_cpu update in assign_op. test=develop

* Make RNNCell.get_initial_states support batch_dim_idx argument. test=develop

* Rename _create_array_outof_while as _create_array_out_of_while in rnn.py.
test=develop

fix bug with compiledProgram (#22495) · b0675c81

Committed by tangwei12 on Feb 12, 2020
```
* add thread barrier for the compiled program
```

support slice double grad, test=develop (#22166) · 58d99247

Committed by Double_V on Feb 12, 2020

* support slice double grad, test=develop
* merge two doublegradopmaker to one doublegradopmaker,test=develop
* change the shape of slice_OP's unittest, test=develop

11 Feb, 2020 · 4 commits

【OpPorting Example】DEMO OF FIX COMPILE&RUNTIME LOD_EQUALITY (#22460) · 9e29d3eb

Committed by huzhiqiang on Feb 11, 2020

multi-loss optimization by adding a DownpourOpt worker (#22025) · 2235ee1a

Committed by yaoxuefeng on Feb 11, 2020

* update

* update test=develop

* update compile set test=develop

* update compile set test=develop

* update test=develop

* update test=develop

* update test=develop

* update compile setting test=develop

* update compile setting test=develop

* update run demo test=develop

* update test=develop

* update test=develop

* fix test=develop

* update test=develop

* update test=develop

* update test=develop

* update test=develop

* update test=develop

* update test=develop

* update test=develop

* update test=develop

* update test=develop

* update format test=develop

* update format test=develop

* update style test=develop

* update style test=develop

* change style test=develop

* change style test=develop

* change style test=develop

* add dataset unittest test=develop

* update test=develop

* update for record test=develop

* update style for record test=develop

* update for record test=develop

* update for record test=develop

* update for record test=develop

* fix format test=develop

* update test=develop

* update test=develop

* update test=develop

* update test=develop

* update test=develop

Improve transpose performance with tile sm copy, test=develop (#22311) · 54970444

Committed by zhaoyuchen2018 on Feb 11, 2020


* Refine code, fix select tile error,test=develop

* Refine element type and some comments, test=develop

* Refine comments and gpu utils, test=develop

* Remove some useless condition

* Refine floor and ceil, test=develop

* refine for loop. test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>

Make assign op support LoDTensorArray and modify while_loop API (#22309) · 3a59a7a1

Committed by guofei on Feb 11, 2020
```
This PR makes the assign op support LoDTensorArray and enables the loop_vars in
while_loop to be a tuple or a list.
```

10 Feb, 2020 · 4 commits
- Implement InferencePassTest for testing precision of inference passes (#22387) · 14b6133b
  Committed by liu zhengxi on Feb 10, 2020
```
* add InferencePassTest for testing precision of inference passes, test=develop
```
- Fix the leaving out of rnn_memory_helper_grad's output vars. test=develop (#22499) · e7bbad6c
  Committed by Guo Sheng on Feb 10, 2020
- fix test_fusion_seqpool_concat lod level between compile and runtime (#22488) · 870f4658
  Committed by Wilber on Feb 10, 2020
- Add Simple Framework for Transforming Dygraph to Static Graph (#22491) · 903039a3
  Committed by Huihuang Zheng on Feb 10, 2020
```
This PR provides a very basic and simple framework for transforming Dygraph to Static Graph.

API names and final outputs are not determined yet. Feel free to modify or add classes/functions/types if you find the framework is not extensible enough for you.
```
07 Feb, 2020 · 2 commits

Enable the detection of subgraph composed of grad ops (#21223) · dcfb6038

Committed by Yiqun Liu on Feb 07, 2020

* Add the first implementation of fusion_group op #19621 (#3)

* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop

* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop

* Disable for mac and windows.
test=develop

* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop

* Refine the CUDA kernel to support large dims.
test=develop

* Add DeviceCodePool to manage all device codes.

* Add the first implementation fusion_group op.

* Add unit-test for fusion_group op.

* Add the check of result.

* Add the check of nvrtc in unit-test.
test=develop

* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop

* Disable fusion_group op for mac and windows.
test=develop

* Make the compiling of device code return status instead of hanging up.
test=develop

* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.

* Unify fusion_group_op's input and output names.
test=develop

* Add the check of CUDA driver library in unittest.
test=develop

* Enable generating code for a given subgraph. #21126 (#4)

* Enable generating code for a given subgraph.

* Support sorting the subgraph.

* Remove the rearrangement of expressions because we use the sorted subgraph directly.

* Enable generating code for a subgraph which is composed of grad ops.

* Use expression information to check the accuracy in unittest.

* Separate load and store from computation expressions.
test=develop

* Improve the loading statements in generated codes.
test=develop

* Remove unused arguments from formal list.
test=develop

* Enable the detection of subgraph of grad ops.

* Generate code for detected subgraph in fusion_group_pass.

* Add an option in BuildStrategy to enable fusion_group_pass and add unittest.
test=develop

* Fix a bug when checking whether the shapes of all inputs are the same.

* Add debug information.

* Move subgraph_detector from inference/analysis to the common framework/ir directory. (#5)

test=develop

* Call subgraph_detector in fusion_group pass.
test=develop

* Disable fusion_group when WITH_GPU is OFF.
test=develop

* Refine all PADDLE_ENFORCE message.
test=develop

* Fix the case that some inputs are not defined in grad ops, and set op_role for fused op.
test=develop

* Follow review comments.
test=develop

polish no_grad_set of gradient and append_backward (#22440) · 50af6b5d

Committed by Aurelius84 on Feb 07, 2020

* polish backward api doc test=develop, test=document_preview, test=document_fix

* polish backward api doc test=develop, test=document_preview, test=document_fix

* no_grad supports set of Variable test=develop, test=document_preview

* polish sample code of append_backward test=develop, test=document_preview

* modify assert into Raise TypeError test=develop,test=document_preview

* fix unittest failed test=develop

* rm useless file test=develop

* polish en doc test=develop

* polish code of no_grad_set test=develop

* polish code of no_grad_set test=develop

06 Feb, 2020 · 1 commit
- add skip_check_grad_ci of var_conv_2d (#22451) · c2f39431
  Committed by Aurelius84 on Feb 06, 2020
```
* add skip_check_grad_ci of var_conv_2d test=develop

* modify check_shape_white_list test=develop
```
05 Feb, 2020 · 2 commits

add WITH_NCCL option for cmake. (#22384) · 7bc4b095

Committed by Wilber on Feb 05, 2020

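Added a WITH_NCCL cmake option to explicitly specify whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is turned OFF if WITH_GPU is OFF.

Added the PADDLE_WITH_NCCL definition.

NCCL compilation can be turned off for single-machine, single-GPU use. Multi-GPU use needs NCCL turned on (the default); if NCCL is turned off, only a single GPU can be used.
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>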

fix deformable_conv small cases, test=develop (#22441) · c8b90d8f

Committed by Bai Yifan on Feb 05, 2020

04 Feb, 2020 · 1 commit

Support int16 for Tensor (#22423) · 822e5b36

Committed by Leo Chen on Feb 04, 2020

* add int16 support, test=develop

* add test, test=develop

* fix typo, test=develop

* fix dtype error in slice, test=develop


PaddlePaddle / Paddle synced successfully about 1 year ago
