Commits · 9e0686ed45f79bbe6a5434bf453509cab0b630ea · BaiXuePrincess / Paddle

14 Jan, 2022 2 commits
- S
  
  fix bug of -DPADDLE_WITH_SSE3 not set when WITH_AVX AND AVX_FOUND even SSE3_FOUND (#38931) · 9e0686ed
  Committed by Sing_chan on Jan 14, 2022
  
  9e0686ed
- 石
  
  remove interface: DenseTensor::release, test=develop (#38937) · 9ff989ae
  Committed by 石晓伟 on Jan 14, 2022
  
  9ff989ae
13 Jan, 2022 14 commits
- C
  [PTen] Rename kernel register marco (#38861) · 158bf13f
  Committed by Chen Weihang on Jan 13, 2022
```
* rename register marco

* fix error changing

* fix format error
```
  158bf13f
- W
  [Paddle-Inference] add Paddle Trt config: with_interleaved (#38884) · dccdc719
  Committed by Wangzheee on Jan 13, 2022
```
* add Paddle Trt config: with_interleaved
```
  dccdc719
- S
  
  [bug fix] fix unfold bug in compile time (#38907) · 7f123456
  Committed by shangliang Xu on Jan 13, 2022
  
  7f123456
- F
  [NPU] fix tril_triu (#38864) · eaccdc71
  Committed by furnace on Jan 13, 2022
```
[NPU] fix tril_triu
```
  eaccdc71
- F
  [NPU] fix expand op (#38526) · 7a5af630
  Committed by furnace on Jan 13, 2022
```
* [NPU] fix expand op

* [NPU] optimize codes

* [NPU] optimize codes
```
  7a5af630
- S
  force close eager_generator.exe (#38896) · 23aa7b08
  Committed by Sing_chan on Jan 13, 2022
```
* force close eager_generator.exe

* modify according to zhouwei's comment
```
  23aa7b08
- C
  [pten]Remove pten/include dir files (#38878) · 7e0292ea
  Committed by chentianyu03 on Jan 13, 2022
```
* move dot_dev api into dot_kernel.h

* add infermate header

* modify to dotkerel in dot_op.h

* mvoe conj dev api into complex_kernel.h

* move sign dev api into  sign_kernel.h

* move scale dev api into kernel.h and remove infermete.h

* rm paddle/pten/include/math.h

* rm paddle/pten/include/math.h

* rm include dir

* rm paddle/pten/include/math.h

* fix conflict with develop branch

* rm devContext in conj_op.h

* add the missing complex_kernel header
```
  7e0292ea
- J
  
  [Dist Pass] AMP pass add dist_update_loss_scaling op (#38902) · 53783e1e
  Committed by JZ-LIANG on Jan 13, 2022
  
  53783e1e
- L
  
  [fleet_executor] fix uninitialized pointer (#38904) · a6cf6cdd
  Committed by LiYuRio on Jan 13, 2022
  
  a6cf6cdd
- W
  roi_align aligned supported (#38905) · 08dcea18
  Committed by wenbin on Jan 13, 2022
```
roi_align aligned supported
```
  08dcea18
- J
  Added mul BF16/FP32 FWD/BWD oneDNN kernel (#38552) · fc6eed5b
  Committed by jakpiase on Jan 13, 2022
```
* base changes for mul reimplementation

* empty commit

* tmp save

* full implementation of mul bf16/fp32 fwd bwd

* CI fix

* CI rerun

* changed unity build cmake to avoid gpu issues

* removed mul mkldnn from unity build

* added skipping tests if not cpu_bf16

* CI fix

* CI fix

* CI fix
```
  fc6eed5b
- C
  Fix mkldnn invalid infershape impl (#38837) · 281644cd
  Committed by Chen Weihang on Jan 13, 2022
```
* fix mkldnn invalid infershape

* add unittest for mkldnn in new executor

* add import os
```
  281644cd
- W
  Support test_imperative using_non_zero_gpu with _test_eager_guard() (#38881) · 5e515781
  Committed by Weilong Wu on Jan 13, 2022
```
* Support test_imperative using_non_zero_gpu and Add a TODO comment

* Change GPU number to 0

* Modify the cuda device selection method
```
  5e515781
- 石
  
  splits allocation for pten, test=develop (#38853) · 277cf900
  Committed by 石晓伟 on Jan 13, 2022
  
  277cf900
12 Jan, 2022 23 commits

Z
[part 3]change type of function args (#38887) · 0efcae86
Committed by Zhang Ting on Jan 12, 2022
```
* code clean

* [part 3]change type of function args
```
0efcae86
Z

pscore perfermance optimization (#38582) · f1201482
Committed by zhaocaibei123 on Jan 12, 2022

f1201482

Committed by Allen Guo on Jan 12, 2022

* support more ops

* Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

* add authors
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

* update date
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

050fd168

the_one_ps dirs reconstruct (#38804) · 50609214

Committed by ziyoujiyi on Jan 12, 2022

* delete gloo connect retry

* the_one_ps dirs reconstruct

* .

* .

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* create the_one_ps dirs

* the one ps dirs modify

* the one ps dirs modify

* the one ps dirs modify

* the one ps dirs modify

50609214

S
Fix conv act int8 scale (#38331) · 4825addd
Committed by Sylwester Fraczek on Jan 12, 2022
```
* fix conv act int8 scale

* add unit test for conv+hard_swish
```
4825addd

support 5d for nearest interp (#38868) · d296456c

Committed by xiaoting on Jan 12, 2022

* support 5d for nearest

* update nearest3d unittest, test=develop

* fix approve ci, test=develop

* fix approve ci, test=develop

d296456c

[Dist Pass] Amp Pass (#38764) · cc24427e

Committed by JZ-LIANG on Jan 12, 2022

* auto parallel sharding base

* chmod

* add unitest

* set unitest cmake dist label

* revise code according to rewiew

* chmod

* bugfix for grad_clip and param broadcast

* chmod

* update unitest

* chmod

* add clip

* chmod

* add amp pass

* chmod

* add unitest

* remove grad update

* fixed bug

* fixed bug

* fixed typose

* fixed typoes

cc24427e

optimize elementwise_max_grad using new interfaces (#37906) · 4a64ca1e

Committed by Lijunhui on Jan 12, 2022

* init elem_max_grad op

* optimize code and reply review comments

* ternary functors

* apply new reduce func

* move functor to .h

* multi-outputs init

* rearrange code

* modifed functors

* optimizer code

* pass nullptr

* revert the last change as seg fault occurs

* optimize code

* remove inplace

* remove comments

4a64ca1e

C
[PTen] Remove hybird dir (#38863) · 5f5f626b
Committed by Chen Weihang on Jan 12, 2022
```
* remove hybird dir

* resolve conflit
```
5f5f626b
L
optimize elementwise_min_grad using new reduce interface (#38236) · c2f825d7
Committed by Lijunhui on Jan 12, 2022
```
* ini commit

* multi-outputs init commit

* optimize code

* remove inplace
```
c2f825d7
Z

[part 6]change type of function args (#38891) · 12c5b1fe
Committed by Zhang Ting on Jan 12, 2022

12c5b1fe

[pten]Move dot, conj, sign dev_api into kernel.h (#38862) · 5fc8bbf7

Committed by chentianyu03 on Jan 12, 2022

* move dot_dev api into dot_kernel.h

* add infermate header

* modify to dotkerel in dot_op.h

* mvoe conj dev api into complex_kernel.h

* move sign dev api into  sign_kernel.h

5fc8bbf7

J

support test_auto_prune_partial (#38871) · 4640955c
Committed by Jiabin Yang on Jan 12, 2022

4640955c
Z
Add pten change file check for op benchmark (#38796) · e7f2bf37
Committed by Zhang Zheng on Jan 12, 2022
```
* Add pten change file check for op benchmark

* fix style format

* test

* revert
```
e7f2bf37
Y
[PTen]Refactor impl of elementwise op grad_kernel (Part1) (#38873) · 676903d5
Committed by YuanRisheng on Jan 12, 2022
```
* refactor the impl of elementwise grad kernel

* refactor impl of elementwise grad kernel(cuda)

* fix compile bugs
```
676903d5

Fix api docs (#38882) · 572ba24e

Committed by Chen Long on Jan 12, 2022

* update readme test=document_fix

* update conll05 docs

* update conll05 docs test=document_fix

572ba24e

Z

[part 4]change type of function args (#38888) · a250c56c
Committed by Zhang Ting on Jan 12, 2022

a250c56c
Z

[part 2]change type of function args (#38886) · 86434818
Committed by Zhang Ting on Jan 12, 2022

86434818
Z

[part 1]change type of function args (#38885) · df5d55bb
Committed by Zhang Ting on Jan 12, 2022

df5d55bb

Adjust warpper of gpu_lanuch_config (#38654) · f5166284

Committed by limingshu on Jan 12, 2022

* first commit

* fix wrong filename

* fix the wrong spell name

* fix gpu config warper

* modify according to pr advices

* fix GpuLauchConfig1D api bugs

* change the config for dropout grad

* fix bugs

* modification according to pr advices

* modification according to pr advices

f5166284

Os info (#38779) · 0d8d1e0e

Committed by liutiexing on Jan 12, 2022

* add align for WorkQueue

* add spinlock

* merge develop

* merge

* Add EventsWaiter

* Revert "Add EventsWaiter"

This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.

* os_info update

* update

* update

* update

* update

* update

* fix

* update

* update for windows

* fix windows

* update

* update
Co-authored-by: liutiexing <liutiexing@google.com>

0d8d1e0e

S
add args check and comment for exp,polynomy decay (#38782) · b7bae939
Committed by Sing_chan on Jan 12, 2022
```
* add args check and comment for exp,polynomy decay

* modify according to zhouwei's comment
```
b7bae939
C

add xiaoguang into big pr approve list, test=document_fix (#38883) · e9c77e09
Committed by Chen Weihang on Jan 12, 2022

e9c77e09

11 Jan, 2022 1 commit
- Y
  
  refactor reshape grad kernel (#38833) · 8cc09552
  Committed by YuanRisheng on Jan 11, 2022
  
  8cc09552

BaiXuePrincess / Paddle is up to date with the fork source project