Commits · d5cc144c60039813521c14b037ce40839f78d6d8 · 机器未来 / Paddle

14 Oct, 2020 1 commit
- tune backward filter algorithm for float16 (#27529) · d5cc144c
  Committed by Zhang Ting on Oct 14, 2020
```
* use exhaustive_search for float16

* tune algo only when dtype is float16
```
12 Oct, 2020 2 commits
- [oneDNN] adaptive pool support (#27747) · 55e63763
  Committed by Jacek Czaja on Oct 12, 2020
- add musl option (#27798) · 6335e6a0
  Committed by chen.zhiyu on Oct 12, 2020
01 Oct, 2020 1 commit

Committed by Jacek Czaja on Oct 01, 2020

* - candidate fix to issue #25537

test=develop

* - UT for transpose NHWC

test=develop

b9fda2ff

30 Sep, 2020 1 commit
- Add avx512 core instructions check (#27732) · 0cd4907e
  Committed by joanna.wozna.intel on Sep 30, 2020
```
* Add avx instructions check

* Small fix

* Change function name

* Change uint to unsigned int
```
29 Sep, 2020 2 commits
- test=develop, optimize geo communicator (#26857) · cc780b19
  Committed by 123malin on Sep 29, 2020
```
* test=develop, optimize geo communicator 
```
- Initialize gloo for low level collective apis (#27672) · bbc2add7
  Committed by lilong12 on Sep 29, 2020
```
* add gloo initializer, test=develop
```
28 Sep, 2020 4 commits
- Add support for mkldnn ops types selection with FLAGS in dygraph (#27482) · 0ecf441a
  Committed by arlesniak on Sep 28, 2020
```
* Add support for mkldnn ops types selection with FLAGS in dygraph

* use regex to match DNNL verbose

* python3 encoding fix
```
- Revert "Initialize gloo for low level collective apis (#27356)", test=document_fix (#27665) · 36c04102
  Committed by lilong12 on Sep 28, 2020
- add ncclSend and ncclRecv (#27621) · 5218b7af
  Committed by lilong12 on Sep 28, 2020
```
* include ncclRecv and ncclSend, test=develop
```
- Initialize gloo for low level collective apis (#27356) · fa73e4a2
  Committed by lilong12 on Sep 28, 2020
```
* add gloo initializer, test=develop
```
27 Sep, 2020 2 commits

add support to float64 input of warpctc op. (#27399) · 1501a80f

Committed by Li Fuchen on Sep 27, 2020

* add float64 input to ctc_loss

* modified error message of  warpctc

* update repo and tag of warpctc

* add test for warpctc with float64 input

* modified warpctc.cmake to make sure build always

* resolved sample code bug of warpctc

* add core.ops in warpctc dygraph

* fix a bug of test

support elementwise add, activation, matmul on Baidu Kunlun (#27143) · 6b727e08

Committed by QingshuChen on Sep 27, 2020

* support elementwise add, activation, matmul on Baidu Kunlun
* test=kunlun

* minor
* test=kunlun

* reconstruct the xpu directory
* test=kunlun

* minor
* test=kunlun

* minor
* test=kunlun

* minor
* test=kunlun

* minor
* test=kunlun

* minor
* test=kunlun

26 Sep, 2020 1 commit
- fix cpplint error for the atomic max/min · a85592bc
  Committed by Zhong Hui on Sep 26, 2020
```
fix cpplint error for the atomic max/min
```
25 Sep, 2020 1 commit
- fix cuda atomic for ARCH<350 for the automic_max · 597345d1
  Committed by Zhong Hui on Sep 25, 2020
```
fix cuda atomic for ARCH<350 for the automic_max
```
24 Sep, 2020 3 commits

fix tensorrt 6 build error. test=develop (#27511) · 8f7bb52b
Committed by Shibo Tao on Sep 24, 2020
```
* fix tensorrt 6 build error. test=develop

* fix. test=develop

* bug fix

* test=develop
```

use iwyu clean include (#27267) · df43905f

Committed by wanghuancoder on Sep 24, 2020

* use iwyu clean include, test=develop, test=win

* compilation error, test=develop

* fix compilation error2, test=develop

* fix compilation error3, test=develop

* fix compilation error4, test=develop

* fix compilation error5, test=develop

* fix compilation error6, test=develop

* fix compilation error7, test=develop

* fix compilation error8, test=develop

* fix compilation error8, test=develop

* fix compilation error10, test=develop

* fix compilation error11, test=develop

Add GPU Kernels of Segment Ops, support sum, max, min, mean · 4a9d21de
Committed by Zhong Hui on Sep 24, 2020
```
Add GPU Kernels of Segment Ops, support sum, max, min, mean
```

23 Sep, 2020 2 commits
- [bug fix]:Memory increases after adapting the cudnn version to cudnn8 (#27436) · c17f9cf2
  Committed by Shang Zhizhou on Sep 23, 2020
```
* [bug fix]:Memory increases after adapting the cudnn version to 8

* [bug fix]cudnnGetConvolutionForwardAlgorithm not defined
```
- Polish some lost invalid error message (#27445) · 76506447
  Committed by Chen Weihang on Sep 23, 2020
```
* polish some lost error msg

* add some math file to white list

* polish detail based on reviewer comment
```
21 Sep, 2020 1 commit

[Feature] Enhance inplace addto strategy for gradient accumulation in static graph (#27112) · aba759ba

Committed by Leo Chen on Sep 21, 2020

* support use add instead of sum to do gradient accumulation

* add inplace addto pass

* add grad_add op and inplace addto pass

* remove debug code

* code refine

* fix bug when several sum ops are inserted at same op_idx

* fix Flags type

* add addto attribute for conv3d

* fix ut

* code clean

* fix type

18 Sep, 2020 1 commit
- fix cudnn dyload (#27308) · 1a755971
  Committed by GaoWei8 on Sep 18, 2020
```
* fix cudnn dyload error
```
17 Sep, 2020 1 commit
- enhance reduce op which can reduce tensor with arbitrary rank · 63203c4a
  Committed by Jack Zhou on Sep 17, 2020
```
enhance reduce op which can reduce tensor with arbitrary rank 
```
15 Sep, 2020 1 commit
- change sequence length attribute to input (#27193) · ee1ed42c
  Committed by GaoWei8 on Sep 15, 2020
```
* replace sequence length attr with input
```
14 Sep, 2020 1 commit
- Add bfloat16 passes (#26999) · 1483ea23
  Committed by joanna.wozna.intel on Sep 14, 2020
07 Sep, 2020 1 commit
- Add padding cudnn interface (#26370) · 4ff16eb2
  Committed by GaoWei8 on Sep 07, 2020
```
* add lstm cudnn of padding data and refine cudnn codes
```
03 Sep, 2020 2 commits
- [cuda11 support] add support for cublas load of same function name (parameter diff) (#26963) · 3eacced9
  Committed by wangchaochaohu on Sep 03, 2020
- Add bfloat16 data type (#25402) · 95e1434b
  Committed by joanna.wozna.intel on Sep 03, 2020
28 Aug, 2020 1 commit

Update the demo code and the doc of varbase.backward. (#26506) · f9066e6a

Committed by Zhen Wang on Aug 28, 2020

* update the demo code and the doc of varbase.backward.

* update the doc of the fake interface `paddle.fluid.Variable`.

* remove BackwardStrategy.

27 Aug, 2020 1 commit
- [api 2.0] add collective op for cpu using gloo and paddle.distributed.* apis (#26552) · 1c681383
  Committed by lilong12 on Aug 27, 2020
```
add collective op for cpu using gloo and paddle.distributed.* apis
```
26 Aug, 2020 1 commit
- Small change in conv2d and quantize pass (#26671) · 559e43ee
  Committed by joanna.wozna.intel on Aug 26, 2020
21 Aug, 2020 2 commits

Add mechanism for blocking oneDNN cache clearing (#26502) · f3909020
Committed by Adam on Aug 21, 2020
```
* Add mechanism for blocking oneDNN cache clearing

* Review changes and Add thread guards
```

support Baidu Kunlun AI Accelerator (#25959) · 138ecf24

Committed by QingshuChen on Aug 21, 2020

* support Baidu AI Accelerator
  * test=kunlun

* minor
 * test=kunlun

* support xpu op in separate file
 * test=kunlun

* update XPU error message and remove duplicated code

 * test=kunlun

* minor
 * test=kunlun

* minor
 * test=kunlun

19 Aug, 2020 1 commit
- remove scope in cudnn lstm (#25188) · 1fbee267
  Committed by GaoWei8 on Aug 19, 2020
17 Aug, 2020 1 commit
- Print user-friendly error message in core.ops (#26261) · 672578a7
  Committed by Leo Chen on Aug 17, 2020
```
* print user-friendly error message

* adjust error summary
```
16 Aug, 2020 1 commit
- [API2.0] add op for cudnn version query test=develop (#26180) · 0b81d763
  Committed by wangchaochaohu on Aug 16, 2020
08 Aug, 2020 1 commit

Change use_quantizer attribute name and data type (#25838) · 734cf1c3

Committed by joanna.wozna.intel on Aug 08, 2020

* Change use_quantizer attribute name and data type

* Fix problem with setting attribute

* Add changes due to review

* Small change in function

* Restore use_quantizer attr for compatibility

07 Aug, 2020 2 commits
- Add flags to control call stack of error message (#25997) · 751305ec
  Committed by Leo Chen on Aug 07, 2020
```
* add flags_call_stack_level

* update

* refine code
```
- Fix TRT plugin registry without TRT lib (#25982) · beb0ca5f
  Committed by Pei Yang on Aug 07, 2020
```
* fix trt plugin registry without trt lib

* support trt4

* refine code style
```
06 Aug, 2020 1 commit

Add oneDNN fusion_gru kernel (#25594) · 68c6160e

Committed by Adam on Aug 06, 2020

* Add oneDNN fusion_gru kernel and fix fc+gru pass
test=develop

* Formatting changes
test=develop

* Lint fixes
test=develop

* Add memory::format_tag::any to GRU weights
test=develop

* Fix build with CUDA

* Fix build with CUDA v2
