Commits · 3e1e482b4960c8030753ade44c0aa61d89187642 · PaddlePaddle / Paddle

26 Sep, 2022 1 commit
- [MLU] fluid: add mluop (#46429) · 3e1e482b
  Committed by cifar10 on Sep 26, 2022
16 Sep, 2022 2 commits

Support broadcast elementwise operators with int64 index type (#45741) · 20b5bf84

Committed by sneaxiy on Sep 16, 2022

* support int64 non-broadcast

* support broadcast case for int64 index

* fix bug

* support more Arity

* remove some codes

* upgrade patchelf to v0.15.0 to pass CI build

* fix bug

* fix patchelf installation

* add debug flags

* remove useless codes

* fix viterbi_decode and set_value op uts

* remove always enable int64
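
The commit above concerns the index type used by broadcast elementwise operators. As a rough illustration only (this is not Paddle's code; `broadcast_shape`, `pick_index_type`, and the "use int32 whenever the element count fits" rule are assumptions made for this sketch), the decision could look like this:

```python
import math

INT32_MAX = 2**31 - 1


def broadcast_shape(a, b):
    """Return the NumPy-style broadcast shape of two shapes (hypothetical helper)."""
    ndim = max(len(a), len(b))
    # Right-align the shapes by padding the shorter one with leading 1s.
    a = (1,) * (ndim - len(a)) + tuple(a)
    b = (1,) * (ndim - len(b)) + tuple(b)
    out = []
    for da, db in zip(a, b):
        if da != db and da != 1 and db != 1:
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
        # A dimension of size 1 stretches to match the other operand.
        out.append(db if da == 1 else da)
    return tuple(out)


def pick_index_type(shape):
    """Assumed rule: keep 32-bit indexing unless the element count overflows it."""
    numel = math.prod(shape)
    return "int32" if numel <= INT32_MAX else "int64"


out_shape = broadcast_shape((65536, 1), (1, 65536))
print(out_shape, pick_index_type(out_shape))
```

Here the two small operands broadcast to 2^32 elements, which no longer fits in a signed 32-bit index, so the sketch selects `int64`.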


[CustomDevice] add new executor support (#46038) · 268f097e

Committed by ronnywang on Sep 16, 2022

* [CustomDevice] add custom_device_resource_pool & device_event_custom_device

* update

* update

* update

* update

09 Sep, 2022 1 commit
- [MLU] fix mluinfo compile error. (#45886) · f06ab336
  Committed by Chenxiao Niu on Sep 09, 2022
08 Sep, 2022 1 commit
- xpu-paddlepaddle-40 [Task] fused_gemm_epilogue support for xpu (#45706) · 7085cb97
  Committed by taixiurong on Sep 08, 2022
```
* add gemm_epilogue

* xpu-paddlepaddle-40 [Task] fused_gemm_epilogue support test=kunlun
```
07 Sep, 2022 1 commit
- [XPU] move rnn op to phi. (#45822) · 91631492
  Committed by houj04 on Sep 07, 2022
05 Sep, 2022 2 commits
- Fix jetson compile error (#45692) · cfaee812
  Committed by chalsliu on Sep 05, 2022
- fix some op int32 exceed range (#45711) · a1dbee23
  Committed by sneaxiy on Sep 05, 2022
01 Sep, 2022 2 commits
- [XPU] add c_embedding_op_xpu. (#45617) · ed2ad5d9
  Committed by houj04 on Sep 01, 2022
- xpu-paddlepaddle-37 [Task] migrate lamb to phi (#45520) · 1a0ef45e
  Committed by taixiurong on Sep 01, 2022
```
test=kunlun
```
29 Aug, 2022 2 commits

[IPU] support depthwise_conv2d ops (#45234) · a237ff8e

Committed by Allen Guo on Aug 29, 2022

* support depthwise_conv2d ops
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Zhaorui Chen <zhaoruic@graphcore.ai>

* fix duplicate name
Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
Co-authored-by: Zhaorui Chen <zhaoruic@graphcore.ai>

fix compile (#45441) · f49f3b4f

Committed by Allen Guo on Aug 29, 2022

26 Aug, 2022 1 commit
- [XPU] add load_combine_op_xpu. test=kunlun (#45436) · 3055d71a
  Committed by houj04 on Aug 26, 2022
25 Aug, 2022 1 commit
- add temporal shift and grad *test=kunlun (#45300) · 63d9a175
  Committed by haosicheng on Aug 25, 2022
24 Aug, 2022 1 commit

Support fp16 of adam operator in xpu environment (#45292) · a012d426

Committed by mengqingchun02 on Aug 24, 2022

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support fp16 of adam operator in xpu environment. test=kunlun

* support fp16 of adam operator in xpu environment. test=kunlun

* support fp16 of adam operator in xpu environment. test=kunlun

19 Aug, 2022 3 commits

[XPU] c_allreduce support int. update bkcl to 1.0.5. test=kunlun (#45248) · 9f1f1b0a

Committed by houj04 on Aug 19, 2022

[XPU] add merged_momentum unittest and change momentum (#45241) · e0f1c9f2

Committed by dongfangshenzhu on Aug 19, 2022

* add merged_momentum *test=kunlun

* add merged_momentum *test=kunlun

* add fp16 to merged_momentum,*test=kunlun

* change dist_model.cc

* add merged_momentum unittest and  change momentum,test=kunlun

* add merged_momentum unittest and  change momentum,test=kunlun

* add merged_momentum unittest and  change momentum,test=kunlun

* add merged_momentum unittest and  change momentum,test=kunlun


Support beam search decode op in XPU environment (#44917) · adaffb7b

Committed by mengqingchun02 on Aug 19, 2022

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* fix beam_search operator bugs on xpu. test=kunlun

* fix beam_search operator bugs on xpu. test=kunlun

* fix beam_search operator bugs on xpu. test=kunlun

* fix beam_search operator bugs on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

18 Aug, 2022 1 commit

change to async mode for xpu multi-card training in static graph mode, test=kunlun (#45024) · 41bdf41d

Committed by zhangxiaoci on Aug 18, 2022

* change to async mode for xpu multi-card training in static graph mode

* minor bugfix

* irrelevant. move to another pr

* move change to other pr

* fix stream issue

* fix 'stream not meet with current context' error

* fix branch diverge, test=kunlun

17 Aug, 2022 1 commit

add instance norm op for xpu (#45097) · 216d25ac

Committed by ykkk2333 on Aug 17, 2022

* xpu unittest grad compute supports more types, *test=kunlun

* add instance norm xpu, *test=kunlun

16 Aug, 2022 1 commit
- [XPU] add truncated_gaussian_random op. (#45152) · 5bcabf78
  Committed by houj04 on Aug 16, 2022
15 Aug, 2022 2 commits
- add mish and mish_grad for XPU, test=kunlun (#45098) · 6815c8ab
  Committed by zhangyikun02 on Aug 15, 2022
- [XPU] add some collective ops. (#45049) · 7e2a20d5
  Committed by houj04 on Aug 15, 2022
```
* [XPU] add some collective ops. test=kunlun

* use XPUOpTestWrapper. test=kunlun

* skip kl1 for collective ops. fix typo: deivce -> device. test=kunlun
```
12 Aug, 2022 2 commits

fix compilation (#45087) · 4eec94dd

Committed by Allen Guo on Aug 12, 2022

[geometric]Add paddle.geometric.send_ue_recv API (#43174) · 615b15a3

Committed by Siming Dai on Aug 12, 2022

* add init file

* add op definition and infermeta

* add kernel definition funcs

* add broadcast infer shape

* add gpu forward kernel

* delete SUB and DIV

* add x_grad

* add template

* add e_grad for min and max

* fix small bug

* temp commit

* temp commit

* add e_grad for sum and mean

* fix some compile bug

* fix compile bugs

* fix compile problem

* add sum forward unittest

* fix broadcast error, add kernel sig, register e_grad, change unit test

* fix grad

* add temp grad fix

* temp commit

* add min max unittest

* add max, min unittest, fix mul bug

* add cpu forward sum and mean

* add forward min max, fix mean unittest

* add cpu backward min max

* fix code-style

* add backward sum mean

* fix rocm ci

* set uniitest timeout

* fix bug of x broadcast to e, gpu grad

* fix bug of x broadcast to e, cpu grad

* rename BOOST_GET_CONST macro

* fix rocm ci

* mv graph_send_e_recv to graph_send_ue_recv

* move out_size to IntArray

* add eager op test

* fix max pool type bug, add unittest for api

* revise api doc

* add fp16 for atomic min and max, add unittest

* add unittest

* add fp16 support for graph_send_recv

* fix unittest fp16 bug

* change OutSizeTensor to Out_size

* move E to Y

* add copyright, fix comment

* review code

* fix thread block size

* fix thread block size

* change api attribute name: pool_type to reduce_op, compute_type to message_op

* change api attribute name, move pool_type to reduce_op, move compute_type to message_op
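
The bullets above describe an operator that combines node features with edge features (`message_op`) and then reduces the resulting messages at each destination node (`reduce_op`). A minimal pure-Python sketch of that pattern follows. It is not Paddle's kernel: `send_ue_recv_sketch` is a hypothetical name, features are plain scalars, and filling nodes that receive no message with 0.0 is an assumption of this sketch.

```python
import operator

# Message ops kept after the "delete SUB and DIV" step in the log above.
MESSAGE_OPS = {"add": operator.add, "mul": operator.mul}

REDUCE_OPS = {
    "sum": sum,
    "mean": lambda msgs: sum(msgs) / len(msgs),
    "min": min,
    "max": max,
}


def send_ue_recv_sketch(x, y, src_index, dst_index,
                        message_op="add", reduce_op="sum", out_size=None):
    """Toy scalar version of the send / compute / receive pattern."""
    combine = MESSAGE_OPS[message_op]
    reduce_fn = REDUCE_OPS[reduce_op]
    n_out = out_size if out_size is not None else len(x)
    # Collect, per destination node, the messages arriving along its edges.
    buckets = [[] for _ in range(n_out)]
    for edge, (src, dst) in enumerate(zip(src_index, dst_index)):
        buckets[dst].append(combine(x[src], y[edge]))
    # Assumption: nodes with no incoming message get 0.0.
    return [reduce_fn(msgs) if msgs else 0.0 for msgs in buckets]


x = [1.0, 2.0, 3.0]            # one feature per node
y = [10.0, 20.0, 30.0, 40.0]   # one feature per edge
src = [0, 1, 2, 0]
dst = [1, 2, 1, 0]
print(send_ue_recv_sketch(x, y, src, dst, "add", "sum"))  # [41.0, 44.0, 22.0]
```

Node 1 receives two messages (1+10 and 3+30), so its `sum` result is 44.0, while `max` would give 33.0.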

10 Aug, 2022 1 commit
- add macro control in enforce_xpu.h, test=kunlun (#45022) · 9e74211f
  Committed by zhangxiaoci on Aug 10, 2022
```
* add macro control in enforce_xpu.h, test=kunlun

* minor bugfix

* minor bugfix
```
09 Aug, 2022 1 commit

add phi empty kernel for xpu,*test=kunlun (#44745) · cd0b03cd

Committed by z8hanghuan on Aug 09, 2022

* add phi empty,*test=kunlun

* support empty op in xpu, *test=kunlun

* support empty op in xpu, *test=kunlun

05 Aug, 2022 1 commit
- refactor xpu tests for squeeze/unsqueeze, *test=kunlun (#44812) · 54d98963
  Committed by zhangxiaoci on Aug 05, 2022
04 Aug, 2022 1 commit
- [XPU] add merged_momentum including fp32 and fp16 (#44824) · 4922376c
  Committed by dongfangshenzhu on Aug 04, 2022
```
* add merged_momentum *test=kunlun

* add merged_momentum *test=kunlun

* add fp16 to merged_momentum,*test=kunlun
```
03 Aug, 2022 1 commit

add sequence_unpad for xpu (#44808) · ed0e95a8

Committed by z8hanghuan on Aug 03, 2022

* add sequence_unpad for xpu,*test=kunlun

* add sequence_unpad, *test=kunlun

* fix bug in testcase,should not be sequence_pad,*test=kunlun

02 Aug, 2022 2 commits

[XPU] fp16 for layer_norm op (#44778) · 4c3e13de
Committed by houj04 on Aug 02, 2022
```
* [XPU] fp16 for layer_norm op. test=kunlun
```

support beam_search operator on xpu. test=kunlun (#44720) · 9bf80772

Committed by mengqingchun02 on Aug 02, 2022

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

01 Aug, 2022 1 commit

unify gpu context (#44740) · 86763023

Committed by Leo Chen on Aug 01, 2022

* remove cudaDeviceContext

* remove more template

* fix rocm compile

* remove alias name CUDADeviceContext

* fix compile

* fix tests

* revert changes

29 Jul, 2022 3 commits
- add some fp16 op for kunlun resnet50 model (#44672) · fecbc958
  Committed by QingshuChen on Jul 29, 2022
```
* add some fp16 op for kunlun resnet50 model
*test=kunlun

* tmp
*test=kunlun
```
- update to sdk2.6.0 (#44673) · 23ad0cc4
  Committed by Allen Guo on Jul 29, 2022
- [XPU] add sampling_id op, add top_k op, update xdnn api. test=kunlun (#44704) · e61f48c1
  Committed by houj04 on Jul 29, 2022
28 Jul, 2022 4 commits
- delete elementwise pow in xpu_kp_list (#44661) · dfeb1942
  Committed by niuliling123 on Jul 28, 2022
- support log_grad op, *test=kunlun (#44662) · 067107ad
  Committed by z8hanghuan on Jul 28, 2022
- Complete the dtypes for all_gather, add all_gather_object api (#44417) · d4cf02bc
  Committed by LiYuRio on Jul 28, 2022
- [XPU] add top_k op (#44656) · acf07c74
  Committed by houj04 on Jul 28, 2022
```
* [XPU] add top_k op. test=kunlun

* [XPU] add top_k op. test=kunlun

* use PADDLE_ENFORCE_XDNN_NOT_NULL to check pointer. test=kunlun
```

PaddlePaddle / Paddle synced successfully about 1 year ago
