Commits · b367122a94b2aef00ad695141bf1bb6765ed7cf5 · BaiXuePrincess / Paddle

05 Oct, 2022 1 commit
- compilation fix · b367122a
  Committed by Jacek Czaja on Oct 05, 2022
04 Oct, 2022 2 commits
- more fixes · f1cebfef
  Committed by Jacek Czaja on Oct 04, 2022
- first commit · d7652d5f
  Committed by Jacek Czaja on Oct 04, 2022
03 Oct, 2022 2 commits
- OneDNN md-in-tensor refactoring: Added support for md in transpose (#46620) · 19746835
  Committed by jakpiase on Oct 03, 2022
```
* added transpose

* CI fix

* fix for transpose

* fix after review
```
- Requantize to use Memory Desc in Tensors (#46608) · a579e523
  Committed by Jacek Czaja on Oct 03, 2022
```
* - some more MD changes

* - lint

* - compilation fixes

* - compilation fixes

* - lint

* - fix
```
30 Sep, 2022 12 commits
- Support both use_calc_stream and sync_op in allgather API (#46295) · ecae7b31
  Committed by Wen Sun on Sep 30, 2022
- Release memory cache after build_op_func_list in interpretercore (#46670) · 255890ff
  Committed by Ruibiao Chen on Sep 30, 2022
- opt GetExpectedKernelType code of fill_constant_op (#46667) · 136b1f42
  Committed by HongyuJia on Sep 30, 2022
- remove MKLDNN hard code in addmm (#46660) · 9900ed52
  Committed by HongyuJia on Sep 30, 2022
- [IPU] paddle-inference support custom-ops (#45235) · a6b4bee3
  Committed by Allen Guo on Sep 30, 2022
```
* paddle-inference support custom-ops
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>

* fix tolower
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
```
- [MLU] fix phi::Tensor compile error of mlu. (#46649) · 2e231402
  Committed by Chenxiao Niu on Sep 30, 2022
- [MLU] add_fluid_mluop_yolo_box (#46573) · 832b0a15
  Committed by 光明和真理 on Sep 30, 2022
- fix bugs of tipc, test=kunlun (#46540) · d16360c8
  Committed by ykkk2333 on Sep 30, 2022
```
* migrate sigmoid with cross entropy, and tile xpu kernels to phi, test=kunlun

* migrate add_n kernep to phi, test=kunlun

* fix bugs of tipc, test=kunlun
```
- [Opt Code] Opt GetExpectedKernelType code of conv_transpose_op (#46666) · 8c067ec1
  Committed by HongyuJia on Sep 30, 2022
```
* opt GetExpectedKernelType code of conv_transpose_op

* fix if error
```
- remove_dequantize_mkldnn_headerfile (#46665) · 7a1e1f99
  Committed by HongyuJia on Sep 30, 2022
- change mkldnn kernel layout, ALL_LAYOUT->ONEDNN (#46629) · abee2210
  Committed by HongyuJia on Sep 30, 2022
- support pure bfloat16 for more ops (#46364) · b7b231a6
  Committed by sneaxiy on Sep 30, 2022
```
* support pure bfloat16

* support bf16 linear

* update PR to pass CI

* tiny fix where_grad_kernel.cu

* add bfloat16 to selu_grad to pass CI

* fix selu grad compilation error
```
29 Sep, 2022 7 commits

- fix mpi include bug (#46601) · 7057093e
  Committed by Xinger on Sep 29, 2022
- [GPUPS]add afs OpenWriter (#46611) · c7d60ce4
  Committed by zmxdream on Sep 29, 2022
```
* add afs OpenWriter

* update
```

- Add index_select, index_select_grad, reduce_min kernel and their unittests for... · 9a1855ff
  Committed by Leo Guo on Sep 29, 2022

  Add index_select, index_select_grad, reduce_min kernel and their unittests for kunlun. Add registers of index_select, index_select_grad, reduce_min, sqrt, sqrt_grad to xpu2_op_list.test=kunlun. (#46557)

- fix P40 topk: Make the optimized topk compatible with P40. (#46547) · 667082c0
  Committed by carryyu on Sep 29, 2022

  * fix P40 topk: Make the optimized topk compatible with P40.
  * fix P40 topk: Make the optimized topk compatible with P40.
  * fix P40 topk: Make the optimized topk compatible with P40.

- Remove calibration file path when deploy quantize model (#46283) · d71f1b3f
  Committed by yeliang2258 on Sep 29, 2022
```
* remove calibration file path

* remove useless code
```
- [MLU] add mlu kernel for add_reduce_max_grad (#45651) · 1ef1cace
  Committed by 光明和真理 on Sep 29, 2022
```
Co-authored-by: liupeiyu <liupeiyu@cambricon.com>
```

- [Eager, Performance optimization] support mod / matmul ( % and @ operator) to... · 7d7444cc
  Committed by Weilong Wu on Sep 29, 2022

  [Eager, Performance optimization] support mod / matmul ( % and @ operator) to sink to Cpp layer (#46565)

  * [Eager, Performance optimization] support mod ( % operator) to sink to Cpp layer
  * fix mod logic
  * support matmul math operator
  * rm LOG(warning), use VLOG(6)
  * fix conflicts mistake

28 Sep, 2022 12 commits

- fix collective helper (#46582) · bd10211c
  Committed by sneaxiy on Sep 28, 2022

- Remove the declaration of using Tensor in framework/tensor.h (#46432) · e12a905e
  Committed by Chen Weihang on Sep 28, 2022

  * remove needless using tensor
  * remove needless using tensor
  * resolve conflict
  * replace tensor using
  * fix format error
  * revert needless changing
  * fix rocm and npu compile error
  * fix cinn compile error
  * fix format error
  * fix mkldnn format error
  * fix mkldnn format error
  * fix cinn compile error
  * fix cinn compile error
  * fix cinn compile error
  * resolve conflict

- Convert GradMergeAllReduceOpHandle in GraphToBlock (#46544) · 6a706e63
  Committed by Ruibiao Chen on Sep 28, 2022
```
* Convert GradMergeAllReduceOpHandle in GraphToBlock

* Set FLAGS_CONVERT_GRAPH_TO_PROGRAM to False
```
- [Eager, Performance optimization] support less_than & less_equal( < & <=... · 7d238139
  Committed by Weilong Wu on Sep 28, 2022
```
[Eager, Performance optimization] support less_than & less_equal( < & <= operator) to sink to Cpp layer (#46542)
```
- [GPUPS]fix ChannelReader (#46575) · 2aec65be
  Committed by zmxdream on Sep 28, 2022
- remove const qualifier in function return (#46546) · 8c5b9cf8
  Committed by Leo Chen on Sep 28, 2022

- Replacing set_format with set_mem_desc in FC onednn kernel (#46372) · 844d9855
  Committed by Jacek Czaja on Sep 28, 2022

  * added fc int8 tests
  * CI fix
  * added skipping UTs for GPUs
  * fixes for CI
  * added support for residual connections inside fc
  * fix for quant int8 bias
  * - lint

  Co-authored-by: jakpiase <jakpia21@gmail.com>

- first commit (#46525) · 806b252c
  Committed by limingshu on Sep 28, 2022

- [PHI] relu6_grad kernel (#46501) · cee2b12d
  Committed by Sławomir Siwek on Sep 28, 2022

  * Relu6
  * remove fluid handler
  * add individual kernel signature
  * coding style
  * replace bounded_relu with clip
  * whitespace
  * code style

- [NPU] add gpu kernel for transfer layout (#46307) · 526d963e
  Committed by kangguangli on Sep 28, 2022

  * add gpu kernel for transfer layout
  * comment error throw
  * fix: flag setting in testcase; add condition check for raising error
  * fix typo
  * fix: add error type for PADDLE_THROW
  * remove kernel fallback in data_transfer.cc
  * remove useless variable definition

- merge develop (#46520) · 1ecc39b4
  Committed by Weilong Wu on Sep 28, 2022
- Fix clip_extra logic in remove_training_info (#46534) · 7e2e2ee7
  Committed by zyfncg on Sep 28, 2022
```
* fix clip_extra code in remove_training_info

* revert rnn opmaker clear
```

27 Sep, 2022 4 commits

- [MLU] add huber_loss kernel. (#46455) · f786fcf9
  Committed by Chenxiao Niu on Sep 27, 2022

- [Eager, Performance optimization] support divide( / operator) to sink to Cpp layer (#46329) · f20b361c
  Committed by Weilong Wu on Sep 27, 2022

  * [Eager] math op sink to Cpp level
  * fix ci errors
  * draft version
  * support + and - operator under cpp directly
  * add static test
  * polish code
  * promote types or unify right type to left
  * recover static test case
  * polish code and fix some ci errors
  * support complex and polish code
  * fix conflicts
  * fix windows ci errors
  * fix windows-inference-ci errors
  * polish and fix tests
  * fix test case
  * polish code
  * [Eager, Performance optimization] support multiply( * operator) to sink to Cpp layer
  * rm useless glog
  * [Eager, Performance optimization] support divide( / and // operator) to sink to Cpp layer
  * polish code
  * polish code and fix code-format
  * polish code
  * fix CI
  * polish code
  * update test
  * support div operator under cpp
  * fix scalar as input
  * Polish div logic, fix ci test
  * fix errors

- Add bernoulli primitive op and support dropout op in new AD. (#46238) · fee84e09
  Committed by levi131 on Sep 27, 2022
```
* init dropout

* small format fix

* fix pr comments

* add value test
```
- speedup ChannelClipAndQuantDequantKernelQuantAxis1 kernel (#46471) · 9c426728
  Committed by ceci3 on Sep 27, 2022
