Commits · 305f32d1c40683a32141d151ac610bec37a6e1c4 · Crayon鑫 / Paddle

24 Mar, 2022 4 commits

Authored by Roc on Mar 24, 2022

* # This is a combination of 10 commits.
# The first commit's message is:
add expert count op

add ut for expert_count

# This is the 2nd commit message:

update UT only for cuda

# This is the 3rd commit message:

fix for rocm

# This is the 4th commit message:

update ut

# This is the 5th commit message:

add moe module

# This is the 6th commit message:

add expert count op

add ut for expert_count

# This is the 7th commit message:

update UT only for cuda

# This is the 8th commit message:

update ut

# This is the 9th commit message:

add moe module

# This is the 10th commit message:

make expert count private

* add assign pos op

* fix upper num name

* add api _assign pos

* add ut for assign pos op

* update date

* fix for win

* update for test (timeout)

* fix ut

* update

* fix ut for number count
Co-authored-by: hlygit66666 <2570058140@qq.com>

305f32d1

[Phi] Migrate InferShape of multiplex, qr, tril_triu (#40102) · 2e736531

Authored by caozhou on Mar 24, 2022

* migrate infershape

* fix tril_triu infershape error

* fix qr_op infershape

* add parse qr mode func

* move order

2e736531

[Phi] Move mul op kernel into phi (#40833) · 1b491818

Authored by Chen Weihang on Mar 24, 2022

* add mul phi kernel

* remove mul op kernel

* remove original mul grad op

* fix cinn test

* fix dygraph test failed

1b491818

Add is_mean param for mean op (#40757) · 7e1155ed

Authored by niuliling123 on Mar 24, 2022

7e1155ed

23 Mar, 2022 12 commits

Added support for BF16 datatype for all oneDNN activation kernels (#40721) · 8e67629c

Authored by jakpiase on Mar 23, 2022

* added missing BF16 activations

* added softplus bf16

* minor change

* disabled tests for GPU

8e67629c

[NPU] add npu support for conv3d and conv3d_grad (#38480) · ff568afa

Authored by furnace on Mar 23, 2022

* [NPU] add npu support for conv3d and conv3d_grad

* [NPU] delete failed unittests due to Ascend not support

* [NPU] delete debug codes

* [NPU] optimize codes, notest

* [NPU] remove const_cast

* [NPU] optimize for remove const_cast

* [NPU] fix written errors

ff568afa

two-phase training for ps (#40762) · b1a4668c

Authored by zhaocaibei123 on Mar 23, 2022

* fix benchmark and communicator config

* fix bugs of the_one_ps

* multi program and fix bug in optimizer

* multi program in the_one_ps

* public commcontext

* ps optimizer multi programs

* cvm & datanorm backend

* fix dim

* fix unittest

* fix

* the one ps merge

* remove comm

* add DownpourLiteWorker

* all

* fix

* fix

* device worker downpour lite

* fix

* fix bug in global shuffle

* save inference model

* fix & add log

* fix

* remove log

* fix

* fix save summary

* fix

* fix pscore

* fix

* fix

* fix

* fix

* fix

* remove logs

* fix

* fix

* fix

* fix

* fix

* add some comments

* fix
Co-authored-by: esythan <esythan@126.com>

b1a4668c

[Phi] Move deformable_conv and deformable_conv_v1 to phi (#40794) · 7e3752bb

Authored by zyfncg on Mar 23, 2022

* move deformable_conv_grad to phi

* move infershape of deformable_conv to phi

* adjust some code format

* move deformable_conv_v1 to phi

7e3752bb

[Phi]Remove InferShape and Kernel of flatten_contiguous_range op (#40638) · 778008d7

Authored by YuanRisheng on Mar 23, 2022

* remove flatten infermeta

* fix bugs when run inference ci

* fix bugs when run inference ci

* fix bugs when run ci

* support infrt

* inplace infershape code'

778008d7

Fix quant and dequant cuda kernels when quant_axis==1 (#40772) · 8991e9ae

Authored by whs on Mar 23, 2022

8991e9ae

Add complex type compatibility for stft api and stft op. (#40113) · 319f95d0

Authored by KP on Mar 23, 2022

* Add stft_op.

* Add stft_grad_op.

* Add stft_op unittest.

* [DLTP-45176] Add complex compatibility in static mode for stft api.

* [DLTP-45176] Add complex compatibility in static mode for stft api.

* Add doc.

* Update unitests of stft op.

* Update spectral helper.

* fix coding style.

319f95d0

Modified dropout Kernel with Kernel Primitive API (#40766) · 95d3ebc8

Authored by niuliling123 on Mar 23, 2022

95d3ebc8

fix cinn graph may hasn't input problem (#40814) · 17b8335b

Authored by jiangcheng on Mar 23, 2022

17b8335b

[phi] transfer unsqueeze to phi (#40596) · 9121115b

Authored by xiongkun on Mar 23, 2022

* transfer unsqueeze to phi

* fix conflict

* add squeeze

* add infershape

* fix xpu and npu error

9121115b

[Phi]Move log/log2/log10/log1p Kernels to Phi (#40785) · 13c99434

Authored by YuanRisheng on Mar 23, 2022

* move activation

* fix bugs when run ce

13c99434

[Phi] Move fill_constant_batch_size_like op kernel into phi (#40784) · b03ef424

Authored by Chen Weihang on Mar 23, 2022

* add full_batch_size_like phi kernel

* remove fill constant bs like

* update year

b03ef424

22 Mar, 2022 5 commits

Move embedding to phi (#39901) · 0331cfda

Authored by hong on Mar 22, 2022

* move embeding to phi;

* update sig; test=develop

* move reset impl to phi; test=develop

* remove old register; test=develop

* fix cpu bf16 bug; test=develop

* fix lookup speed error

* polish code

* fix paddle throw type

0331cfda

[Phi] Move reverse kernel and infershape into phi (#40791) · 7fc0c619

Authored by Chen Weihang on Mar 22, 2022

* add reverse phi kernel

* add reverse infermeta

* remove original reverse op kernl & infershape

7fc0c619

[phi] Update graph_send_recv OP (#40509) · 67b46e45

Authored by Siming Dai on Mar 22, 2022

* add out_size shape for graph_send_recv

* fix bug in register kernel: no const int& support

* add out_size in infermeta

* change unittest

* fix unittest

* fix out_size default value

* fix doc

* delete arg mapping

* add sig

* move -1 to 0

* move -1 to 0

67b46e45

[Phi]Modify reduce arg order (#40706) · 67ffb86e

Authored by chentianyu03 on Mar 22, 2022

* modify out and out_grad order in reduce_grad_kernel

* delete unsed boolReduceKernel

* fix conflict

67ffb86e

fix group_norm address misalignment (#40657) · dd9d7206

Authored by crystal on Mar 22, 2022

* fix group_norm address misalignment

* fix vectorize

* fix code

* fix vectorize length

* optimize code

dd9d7206

21 Mar, 2022 6 commits

[Phi]add pad3d kernel into phi (#40701) · 382e460b

Authored by chentianyu03 on Mar 21, 2022

* add pad3d kernel into phi

* add pad3d infermeta

* fix build error

* remove raw pad3d infershape function

382e460b

conv2d support FP16 on xpu and update unittest for conv2d, test=kunlun (#40395) · 276017bb

Authored by zhangyikun02 on Mar 21, 2022

276017bb

Move conv-transpose OPs to phi (#40675) · 1eb96eec

Authored by From00 on Mar 21, 2022

* Move conv-transpose OPs to phi

* Fix CI errors

* Fix CI errors

1eb96eec

Move frobenius_norm OP to phi (#40707) · 564dcd52

Authored by From00 on Mar 21, 2022

564dcd52

[IPU] update ipu_backend (#40685) · d67fe921

Authored by Allen Guo on Mar 21, 2022

* sync changes

* copy sOpNamescope

* fix UTs

* add authors
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Allen Guo <alleng@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Zhaorui Chen <zhaoruic@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

* fix code-format

* fix compile error

* add comments for feed_op
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai>
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Zhaorui Chen <zhaoruic@graphcore.ai>
Co-authored-by: Han Zhao <hanzhao@graphcore.ai>

d67fe921

[MLU]add compiler options and remove redundant code (#40705) · a6f77fdf

Authored by zn on Mar 21, 2022

a6f77fdf

19 Mar, 2022 2 commits

move deformable_conv forward kernel to phi (#40700) · a8e5c9be

Authored by zyfncg on Mar 19, 2022

a8e5c9be

Add infer meta (#40544) · 8e4e19ab

Authored by hong on Mar 19, 2022

* add infer meta; test=develop

* add histogram infer meta; test=develop

* fix unitest bug; test=develop

* format; test=develop

* format; test=develop

* bn not use new infer meta; test=develop

* add infer meta; test=develop

* fixbug; test=develop

* fix bug;

* recover unitest; test=develop

8e4e19ab

18 Mar, 2022 8 commits

[Phi] Migrate gelu/log_softmax/prelu op kernel and infershape (#40393) · aed6faf2

Authored by shentanyue on Mar 18, 2022

* add gelu

* fix gelu

* add log_softmax

* add prelu kernel and prelu/gelu/logsoftmax infershape

* fix

* fix

* fix

* fix

* fix ci

* log_softmax rewrite

* fix

* fix

* fix conflict

* fix compile error

* fix comment

* fix

* ci_fix
Co-authored-by: Yan Li <liyan665@gmail.com>

aed6faf2

[Phi]Move hierarchical_sigmoid kernel to phi (#40553) · 64a7cbd3

Authored by Zhang Zheng on Mar 18, 2022

* first commit

* fix compile error

* support std::vector<std::srting>

* fix

* fix op support on GPU by chenweihang

* pass test

* infershape

* add set_dtype

* fix order

* fix

* unify the impl of dt and sr

* fix

64a7cbd3

[NPU] fix fp16 (PART I) (#40259) · aaa71ea4

Authored by furnace on Mar 18, 2022

[NPU] fix fp16 (PART I)

aaa71ea4

[Phi] Move infershape of roi_pool to phi (#40682) · 579173d8

Authored by zyfncg on Mar 18, 2022

* move infershape of roi_pool to phi

* polish code

579173d8

[phi] tranfer kthvalue from fluid to phi (#40676) · d7ccd6bf

Authored by xiongkun on Mar 18, 2022

* tranfer kthvalue from fluid to phi

* transfer infershape

d7ccd6bf

[Phi] move reduce_grad kernel into phi (#40522) · 70726696

Authored by chentianyu03 on Mar 18, 2022

* move reduce_mean_grad kernel into phi

* move reduce_max/min_grad into phi

* remove raw max/min grad kernel

* fix bug

* fix max/min grad error

* move all reduce_grad kernel into one file

* add prod grad kernel

* add infermeta for prod kernel

70726696

[NPU] fix fp16 (PART II) (#40537) · 1a13fa0f

Authored by furnace on Mar 18, 2022

[NPU] fix fp16 (PART II)

1a13fa0f

Optimize perf of softmax_with_cross_entropy_bwd (#40643) · 081e4307

Authored by Zhang Zheng on Mar 18, 2022

* Optimize perf of softmax_with_cross_entropy_bwd

* fix

* fix

081e4307

17 Mar, 2022 3 commits

[Phi] Move assign kernel into phi (#40022) · 1904572a

Authored by Chen Weihang on Mar 17, 2022

* move assign kernel init commit

* change vec<tensor> to vec<tensor*>

* support tensor array

* support api declare

* fix test_list failed

* fix npu and xpu failed

* fix infrt failed

* remove assign array size in operator

* move assign sr header into sr dir

* add infermeta for assign

* test op success

* fix test_list failed

* fix kunlun failed

* add set host allocator in tests

* support tensor array in arg ctx

* open set layout in share_meta

* fix meta tensor layout error

* fix test failed

1904572a

Revert "Fix truncated norm operator (#40287)" (#40614) · 313bff6b

Authored by Chang Xu on Mar 17, 2022

This reverts commit 0c333543.

313bff6b

rename math (#40641) · 883a8eea

Authored by YuanRisheng on Mar 17, 2022

883a8eea

Crayon鑫 / Paddle is in sync with the fork's source project
