Commits · 99fd9815b65ace288ca55ba60f1a4673e858cf82 · PaddlePaddle / Paddle

23 Feb, 2021 · 3 commits
- fix windows for optimization of elementwise_add Op (#31068) · 364cfa26
  Authored by wangchaochaohu on Feb 23, 2021
```
* fix windows for optimization of elementwise_add Op
```
  364cfa26
- Unification of BF16 enablement process (#31034) · 781df300
  Authored by joanna.wozna.intel on Feb 23, 2021
```
* Unification of bfloat16 enablement process and refactor

* Remove unnecessary function

* Standardize the output name search
```
  781df300
- fix softmax cross entropy integer overflow (#30590) · 16fe11d7
  Authored by Zhong Hui on Feb 23, 2021
```
[BUG FIX] Fix softmax cross entropy overflow problem.
```
  16fe11d7
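
The commit message for #30590 does not say which quantity overflowed. As a general illustration of this class of bug only, the sketch below assumes a flattened offset of the form `row * num_classes + col` and uses `ctypes` to show how such an offset wraps when held in a signed 32-bit integer but not in a 64-bit one. The function names and sizes are hypothetical and are not taken from the Paddle source.

```python
import ctypes

INT32_MAX = 2**31 - 1

def flat_index_int32(row, num_classes, col):
    """Flattened offset, wrapped the way a signed 32-bit integer would wrap."""
    # ctypes integer types do no overflow checking; the value is truncated to 32 bits.
    return ctypes.c_int32(row * num_classes + col).value

def flat_index_int64(row, num_classes, col):
    """Same offset held in a signed 64-bit integer."""
    return ctypes.c_int64(row * num_classes + col).value

# Hypothetical sizes: 70,000 rows of 50,000 classes exceed INT32_MAX elements.
row, num_classes, col = 70_000, 50_000, 7
wide = flat_index_int64(row, num_classes, col)
narrow = flat_index_int32(row, num_classes, col)

print(wide, narrow, wide > INT32_MAX)
```

With these sizes the 32-bit result is negative, which is the kind of out-of-range index that a wider index type avoids.
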
22 Feb, 2021 · 1 commit
- fix the bug in backward OP of index_sample. (#31026) · b95eb38b
  Authored by JamesLim on Feb 22, 2021
  
  b95eb38b
20 Feb, 2021 · 2 commits

add squeeze_op/unsqueeze_op on kunlun;fix conv op and parallel... · d5323dab

Authored by TTerror on Feb 20, 2021

add squeeze_op/unsqueeze_op on kunlun;fix conv op and parallel executor;optimize lookup_table op (#31056)

* add squeeze_op/unsqueeze_op on kunlun; fix conv op and parallel executor on kunlun; optimize lookup_table op on kunlun

* update squeeze/unsqueeze op

d5323dab

[static setitem] Support the index is Tensor; step>1; step<0 .(#30949) · 5b367dab

Authored by liym27 on Feb 20, 2021

* [static setitem] support the index step > 1. tensor_a[::3] = value

* [static setitem] support the index step < 0. Eg: tensor_a[::-3] = value

* [static setitem] support the index is Tensor. eg: tensor_a[tensor_3:0:-1] = value

* Add op version.

5b367dab
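
The three index forms named in #30949 follow ordinary Python slice semantics. The sketch below is not Paddle code: it uses a plain list in place of a tensor and a hypothetical helper `set_strided` to show which positions each form selects. It also assumes that `tensor_3` in the commit message holds the value 3.

```python
def set_strided(seq, start, stop, step, value):
    """Return a copy of seq with value written to every position in seq[start:stop:step]."""
    out = list(seq)
    # slice.indices() resolves None and negative bounds exactly as Python slicing does.
    for i in range(*slice(start, stop, step).indices(len(out))):
        out[i] = value
    return out

tensor_a = list(range(10))

step_pos = set_strided(tensor_a, None, None, 3, -1)   # like tensor_a[::3] = value
step_neg = set_strided(tensor_a, None, None, -3, -1)  # like tensor_a[::-3] = value
from_var = set_strided(tensor_a, 3, 0, -1, -1)        # like tensor_a[tensor_3:0:-1] = value

print(step_pos)
print(step_neg)
print(from_var)
```

For a length-10 sequence, `[::3]` and `[::-3]` happen to select the same positions (0, 3, 6, 9), while `[3:0:-1]` selects positions 3, 2 and 1 and leaves position 0 untouched.
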

19 Feb, 2021 · 3 commits
- Added reshape grad bf16 (#31035) · f7465641
  Authored by Jacek Czaja on Feb 19, 2021
```
* - added Reshape grad bf16

* - Added reshape grad bf16

* - cosmetics in py
```
  f7465641
- Modify relu native implementation 2 (#30996) · 615d8a22
  Authored by Wojciech Uss on Feb 18, 2021
```
* Modify relu native implementation

* fix GPU performance
```
  615d8a22
- add offset parameter in roi_align,generate_proposals.etc ops (#30864) · 5b267474
  Authored by Guanghua Yu on Feb 19, 2021
```
* add  parameter in roi_align op
```
  5b267474
18 Feb, 2021 · 2 commits

enable exhaustive_search for forward and backward algos when dtype is float16 (#30959) · f0ee1592
Authored by Zhang Ting on Feb 18, 2021
```
* enable exhaustive_search for input_grad when dtype is float16

* enable exhaustive_search for forward algos
```
f0ee1592

Add Conv Transpose BF16 (#30877) · caf9d398

Authored by joanna.wozna.intel on Feb 18, 2021

* Add conv transpose BF16

* Share function GetWeightsTz

* Adjust to review and fix op compatibility

* Add bias to unique handler name

* Remove errors related to paddle enforce

* Add conv2d_transpose to bf16 list and kernel refator

caf9d398

09 Feb, 2021 · 1 commit
- try to fix reader and signal test failed (#30960) · 010f2caa
  Authored by Chen Weihang on Feb 08, 2021
  
  010f2caa
08 Feb, 2021 · 1 commit
- Add error message for slice op(#30851) · 97f7a70c
  Authored by liym27 on Feb 08, 2021
  
  97f7a70c
06 Feb, 2021 · 1 commit
- [oneDNN] Added basic changes for elementwise_add_grad bf16 (#30925) · 9e527d99
  Authored by Jacek Czaja on Feb 06, 2021
  
  9e527d99
05 Feb, 2021 · 2 commits
- [Kunlun] add gen_bkcl_id_op, support multi XPU cards training using multiprocess (#30858) · 4a8b8b45
  Authored by liuyuhui on Feb 05, 2021
  
  4a8b8b45
- dyngraph (#30892) · 24873f4f
  Authored by taixiurong on Feb 05, 2021
  
  24873f4f
04 Feb, 2021 · 2 commits
- [oneDNN]Extended adaptive pooling support for oneDNN pool kernel (#30757) · abfa8226
  Authored by Jacek Czaja on Feb 04, 2021
  
  abfa8226
- use iwyu clean include second time, test=develop (#30829) · 35c5b23f
  Authored by wanghuancoder on Feb 04, 2021
```
* use iwyu clean include second time, test=develop
```
  35c5b23f
03 Feb, 2021 · 6 commits
- add clip_by_norm on kunlun, *test=kunlun (#30862) · ac2e2e6b
  Authored by cucuzg on Feb 03, 2021
  
  ac2e2e6b
- fix the broadcast for the large second input (#30818) · b7560a59
  Authored by wawltor on Feb 03, 2021
```
fix the broadcast for the large second input 
```
  b7560a59
- Implement cuda kernel for index_sample. (#30380) · 6e1e036a
  Authored by JamesLim on Feb 03, 2021
  
  6e1e036a
- Call new cudnn batch norm API regardless of data type and data layout (#30157) · 666efc23
  Authored by AshburnLee on Feb 03, 2021
  
  666efc23
- fix WITH_XPU_BKCL in CMakeLists.txt (#30854) · 2cb55eff
  Authored by liuyuhui on Feb 03, 2021
  
  2cb55eff
- 【kunlun】dygraph supports multi xpu card training (#30671) · b1026f64
  Authored by WangXi on Feb 03, 2021
  
  b1026f64
02 Feb, 2021 · 1 commit
- Update Xbyak to v5.81 (#30809) · 04532b8a
  Authored by joanna.wozna.intel on Feb 02, 2021
  
  04532b8a
01 Feb, 2021 · 1 commit
- ci compilation depends on a stable release (#30755) · b08ae368
  Authored by Wilber on Feb 01, 2021
```
* update lite tag

* disable ut
```
  b08ae368
29 Jan, 2021 · 1 commit
- Fix the nan bug when passing all zero values into clip_by_norm_op. (#30777) · 53d01afe
  Authored by Zhen Wang on Jan 29, 2021
  
  53d01afe
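
Clipping by norm rescales a vector so that its L2 norm does not exceed a limit, which involves dividing by the norm. The sketch below is a plain-Python illustration, not the change made in #30777. It shows one way to keep an all-zero input from reaching that division: return the input unchanged whenever its norm is already within the limit. The function name and signature are hypothetical.

```python
import math

def clip_by_norm(values, max_norm):
    """Scale values so their L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in values))
    # This branch also covers norm == 0, so the division below never sees a zero norm.
    if norm <= max_norm:
        return list(values)
    scale = max_norm / norm
    return [v * scale for v in values]

zeros = clip_by_norm([0.0, 0.0, 0.0], 1.0)   # stays all zeros, no NaN
clipped = clip_by_norm([3.0, 4.0], 1.0)      # norm 5.0 is scaled down to norm 1.0

print(zeros, clipped)
```
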
28 Jan, 2021 · 2 commits
- A fix for oneDNN matmul kernel. Fixes issue #30309 (#30723) · fc002405
  Authored by Wojciech Uss on Jan 28, 2021
  
  fc002405
- fix bugs in transformer predict in xpu place (#30730) · caf3680b
  Authored by taixiurong on Jan 28, 2021
```
* transformer predict

* trans bug fix
```
  caf3680b
27 Jan, 2021 · 1 commit

REUPLOAD Added vanilla LSTM and LSTM with peepholes oneDNN fp32 kernel (#30719) · f8da5536

Authored by jakpiase on Jan 27, 2021

* added external reorder to profiler

* resolved conflict

* added enable_static

* initial version of lstm, not working yet

* added lstm to operators.cmake

* added vanilla lstm mkldnn op

* added peephole weights integration

* minor changes

* added formatting

* added fusion_lstm_mkldnn to static_whitelist

* added formatting

* removed comment

* moved use_peepholes attribute inside is_cached block

* reverted wrong changes

* minor formatting change

* minor changes

* changed stream handling

* minor change

* added datatype to GetExpectedKernelType()

* added reading stream from TLS

f8da5536

26 Jan, 2021 · 2 commits

Revert "Added vanilla LSTM and LSTM with peepholes oneDNN fp32 kernel (#30661)" (#30708) · 824a79d3
Authored by Tao Luo on Jan 26, 2021
```
This reverts commit d834f4e6.
```
824a79d3

Added vanilla LSTM and LSTM with peepholes oneDNN fp32 kernel (#30661) · d834f4e6

Authored by jakpiase on Jan 26, 2021

* added external reorder to profiler

* resolved conflict

* added enable_static

* initial version of lstm, not working yet

* added lstm to operators.cmake

* added vanilla lstm mkldnn op

* added peephole weights integration

* minor changes

* added formatting

* added fusion_lstm_mkldnn to static_whitelist

* added formatting

* removed comment

* moved use_peepholes attribute inside is_cached block

* reverted wrong changes

* minor formatting change

* minor changes

d834f4e6

25 Jan, 2021 · 3 commits
- More precise mkldnn kernel rules in GetExpectedKernelType (#29840) · 5bf25d1e
  Authored by arlesniak on Jan 25, 2021
```
* More precise mkldnn kernel choice in GetExpectedKernelType

* Fixes after review

* Refresh develop for CI

* CI experiment

* get back from CI exper
```
  5bf25d1e
- [oneDNN] Cache oneDNN stream not to recreate in each oneDNN op (#30358) · 173660be
  Authored by Jacek Czaja on Jan 25, 2021
  
  173660be
- fix abs bug and add abs test case (#30637) · fb7fbc7a
  Authored by chentianyu03 on Jan 25, 2021
```
* add abs test case

* use std::abs to fix abs bug

* fix the abs bug

* fix abs bug
```
  fb7fbc7a
22 Jan, 2021 · 1 commit
- Fix scatter grad bug (#30604) · 9514b4aa
  Authored by ShenLiang on Jan 22, 2021
  
  9514b4aa
20 Jan, 2021 · 4 commits
- - Disabling oneDNN inplace pass (#30588) · dfdb0359
  Authored by Jacek Czaja on Jan 20, 2021
  
  dfdb0359
- support reduce_max op on kunlun (#30581) · 10271ddf
  Authored by TTerror on Jan 20, 2021
```
* support reduce_max op on kunlun

* support reduce_max op on kunlun

* support reduce_max op on kunlun

* support reduce_max op on kunlun
```
  10271ddf
- fix softmax bug for multi_card in kunlun (#30600) · 5013c676
  Authored by QingshuChen on Jan 20, 2021
  
  5013c676
- optimize unity build (#30195) · 7e671c07
  Authored by wuhuanzhou on Jan 20, 2021
```
* optimize unity build, test=develop

* fix code style error, test=develop

* fix code style error and test /MP settings, test=develop
```
  7e671c07

PaddlePaddle / Paddle · synced successfully about 1 year ago
