Commits · adaffb7be19cc4311d5b828693fa11cbc3062c41 · 机器未来 / Paddle

19 Aug, 2022 · 1 commit

Support beam search decode op in XPU environment (#44917) · adaffb7b

Committed by mengqingchun02 on Aug 19, 2022

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* fix beam_search operator bugs on xpu. test=kunlun

* fix beam_search operator bugs on xpu. test=kunlun

* fix beam_search operator bugs on xpu. test=kunlun

* fix beam_search operator bugs on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun


18 Aug, 2022 · 2 commits

apply buffer_shared_inplace_pass and inplace_addto_op_pass pass to program in Standalone Executor (#45085) · d8d124b6

Committed by pangyoki on Aug 18, 2022

* apply inplace addto in python apply_pass

* fix

* apply inplace pass for program

* skip feed and fetch var

* fix block_desc.move_from

* fix block desc

* alltoall remove inplace

* fix


[OpAttr]Squeeze axes support Tensor (#45189) · c93451f4
Committed by Aurelius84 on Aug 18, 2022
```
* [OpAttr]Squeeze axes support Tensor

* add support_tensor

* fix unittest

* fix coverage
```

17 Aug, 2022 · 4 commits
- [OpAttr]Add SupportTensor for OpMaker with whitelist mechanism (#45084) · 2594935a
  Committed by Aurelius84 on Aug 17, 2022
```
* [OpAttr]Add SupportTensor for OpMaker

* fix typo

* fix code style

* add SupportTensor for concat op

* add unittest for register Tensor

* add shape checker and split attribute
```
- fix multi stream error. (#45196) · a79d4a75
  Committed by Wilber on Aug 17, 2022
```
* fix multi stream error.
```
- [MLU] fix copy error (#45194) · 75690584
  Committed by fwenguang on Aug 17, 2022
- add instance norm op for xpu (#45097) · 216d25ac
  Committed by ykkk2333 on Aug 17, 2022
```
* xpu unittest grad compute supports more types, *test=kunlun

* add instance norm xpu, *test=kunlun
```

16 Aug, 2022 · 5 commits

[Phi] Move amp ops into phi (#45079) · b4f67757

Committed by Chen Weihang on Aug 16, 2022

* move check finite and unscale kernel into phi

* move infershape into phi

* move update_loss_scaling kernel into phi

* remove original kernels

* move update loss scaling infershape into phi

* add header for xpu and npu

* solve coverage failed

* fix npu test failed

* remove mutable data in cu file

* fix new executor failed

* add valid check for meta tensor output


convert multihead to oss (#45019) · f706d95d

Committed by feng_shuai on Aug 16, 2022

* convert multihead to oss

* fix:bug

* fix:delete const cast

* fix:don't support bias_qk

* add vit pass

* fix:convert bug and add preln_residual_bias

* support length=-1

* add UT for convert

* add no_bias_qk support for gpu_multihead_op

* delete infer_shape depends on bias_qk

* oss just can be used in T4 and A*

* fix:change api for ROCM CI


support fp16 softmax on custom place (#45177) · a0bbfbd4
Committed by Aganlengzi on Aug 16, 2022

Fix problem that the shape of tensor is not inited correctly when backward in static graph (#45030) · e26f80ad
Committed by feifei-111 on Aug 16, 2022
```
* fix_shape

* code style

* fix assert

* fix to_tensor badreturn
```

【autograd】add select_p、eq_p、pow_p primitive operator for new autograd (#44813) · b681c88c

Committed by Sing_chan on Aug 16, 2022

* add select_p

* fix bugs

* add custom test for select_p; modify select_p primrules

* modify according to xiaoxu's comment

* add eq_p, select_p, pow_p, use autograd to test grad of high order

* add requirement of autograd, modify expected type of eq

* modify according to xiaoxu's comment

* import primops to use primops.pow


15 Aug, 2022 · 4 commits
- fused_embedding_eltwise_layernorm_op and skip_layernorm_op support fp16 (#44969) · ac0553a0
  Committed by Yuanle Liu on Aug 15, 2022
- add mish and mish_grad for XPU, test=kunlun (#45098) · 6815c8ab
  Committed by zhangyikun02 on Aug 15, 2022
- [XPU] add some collective ops. (#45049) · 7e2a20d5
  Committed by houj04 on Aug 15, 2022
```
* [XPU] add some collective ops. test=kunlun

* use XPUOpTestWrapper. test=kunlun

* skip kl1 for collective ops. fix typo: deivce -> device. test=kunlun
```
- convert_fp16 support multi block (#45050) · 9aecf286
  Committed by Wilber on Aug 15, 2022
```
* convert_fp16 support multi block

* update

* update
```

12 Aug, 2022 · 6 commits

Offload calculations from matmul op to fuse pass (#44941) · acb78ea2

Committed by Sławomir Siwek on Aug 12, 2022

* remove v2_transpose_reshape

* matmul_transpose_reshape

* reshape_transpose_matmul

* Add int8 support for matmulV2

* restore ut

* adjust old ut

* restore parallel UT ruels

* remove mkldnn code from base ops

* move enforces to pass

* remove duplicated functions

* delete duplicated enforces

* feedback from review

* add comments to variables

* enable eltwise support

* dynamic attribute

* remove fusepass tests from op test

* remove fuse pass cases from op test

* revert introduction of dynamic attributes

* style

Co-authored-by: Nwozna <joanna.wozna@intel.com>


transfer memcpy_h2d from fluid to phi (#44932) · 7bc57d35

Committed by kangguangli on Aug 12, 2022

* transfer memcpy_h2d from fluid to phi

* use UnchangedInferMeta instead

* restore test_standalone_executor

* add newline to fix codestyle check

* rename pt -> phi

* simplify logic and add check

* make the comment more clear

* remove useless comment

* refine code


trt engine input data type should be consistent with trt input bindin… (#45103) · a3eb341e
Committed by Yuanle Liu on Aug 12, 2022
```
* trt engine input data type should be consistent with trt input bindings type

* fix some bugs

* fix some bugs

* fix some bugs
```

enhance grid_sampler to support 3d input (#45015) · 1773fbba
Committed by duanyanhui on Aug 12, 2022
```
* enhance grid_sampler to support 3d input
```

fix extra output of kernels for inference (#45048) · 1cb883da
Committed by zyfncg on Aug 12, 2022

[geometric]Add paddle.geometric.send_ue_recv API (#43174) · 615b15a3

Committed by Siming Dai on Aug 12, 2022

* add init file

* add op definition and infermeta

* add kernel definition funcs

* add broadcast infer shape

* add gpu forward kernel

* delete SUB and DIV

* add x_grad

* add template

* add e_grad for min and max

* fix small bug

* temp commit

* temp commit

* add e_grad for sum and mean

* fix some compile bug

* fix compile bugs

* fix compile problem

* add sum forward unittest

* fix broadcast error, add kernel sig, register e_grad, change unit test

* fix grad

* add temp grad fix

* temp commit

* add min max unittest

* add max, min unittest, fix mul bug

* add cpu forward sum and mean

* add forward min max, fix mean unittest

* add cpu backward min max

* fix code-style

* add backward sum mean

* fix rocm ci

* set uniitest timeout

* fix bug of x broadcast to e, gpu grad

* fix bug of x broadcast to e, cpu grad

* rename BOOST_GET_CONST macro

* fix rocm ci

* mv graph_send_e_recv to graph_send_ue_recv

* move out_size to IntArray

* add eager op test

* fix max pool type bug, add unittest for api

* revise api doc

* add fp16 for atomic min and max, add unittest

* add unittest

* add fp16 support for graph_send_recv

* fix unittest fp16 bug

* change OutSizeTensor to Out_size

* move E to Y

* add copyright, fix comment

* review code

* fix thread block size

* fix thread block size

* change api attribute name: pool_type to reduce_op, compute_type to message_op

* change api attribute name, move pool_type to reduce_op, move compute_type to message_op
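
For readers unfamiliar with the operator, the bullets above (features `x`, edge features moved from `E` to `Y`, `message_op`, `reduce_op`, `out_size`) can be read as a gather, combine, reduce pattern. The following is a minimal pure-Python sketch of that pattern, not Paddle's kernel: the function name `send_ue_recv_sketch` is hypothetical, `y` is simplified to one scalar per edge, and filling rows that receive no message with zeros is an assumption of this sketch.

```python
from typing import List, Optional, Sequence

# Message ops kept after the "delete SUB and DIV" step in the commit list.
MESSAGE_OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}


def send_ue_recv_sketch(
    x: Sequence[Sequence[float]],
    y: Sequence[float],
    src_index: Sequence[int],
    dst_index: Sequence[int],
    message_op: str = "add",
    reduce_op: str = "sum",
    out_size: Optional[int] = None,
) -> List[List[float]]:
    """Gather x[src], combine with the edge value y, reduce onto dst rows."""
    if not (len(src_index) == len(dst_index) == len(y)):
        raise ValueError("src_index, dst_index and y must have one entry per edge")
    combine = MESSAGE_OPS[message_op]
    num_out = out_size if out_size is not None else len(x)
    width = len(x[0])

    # Collect the message of every edge in the bucket of its destination row.
    buckets: List[List[List[float]]] = [[] for _ in range(num_out)]
    for edge, (src, dst) in enumerate(zip(src_index, dst_index)):
        buckets[dst].append([combine(value, y[edge]) for value in x[src]])

    out: List[List[float]] = []
    for messages in buckets:
        if not messages:
            # Assumption of this sketch: rows with no incoming edge are zeros.
            out.append([0.0] * width)
            continue
        columns = list(zip(*messages))
        if reduce_op == "sum":
            out.append([sum(col) for col in columns])
        elif reduce_op == "mean":
            out.append([sum(col) / len(col) for col in columns])
        elif reduce_op == "max":
            out.append([max(col) for col in columns])
        elif reduce_op == "min":
            out.append([min(col) for col in columns])
        else:
            raise ValueError(f"unsupported reduce_op: {reduce_op}")
    return out


if __name__ == "__main__":
    x = [[0, 2, 3], [1, 4, 5], [2, 6, 7]]
    y = [1, 1, 1, 1]
    print(send_ue_recv_sketch(x, y, [0, 1, 2, 0], [1, 2, 1, 0]))
```

Grouping messages per destination row before reducing keeps the four reduce modes in one place; a real kernel would instead accumulate in place (for example with atomic operations on GPU, as the fp16 atomic min/max bullet suggests).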


11 Aug, 2022 · 1 commit
- make affine_grid_op support 5d input_dim on cpu and gpu (#45012) · 7812522c
  Committed by carryyu on Aug 11, 2022
```
* make affine_grid_op support 5d_input on cpu and gpu
```

10 Aug, 2022 · 4 commits
- fix mkldnn interpolate ops (#45008) · 3f49817a
  Committed by yeliang2258 on Aug 10, 2022
- [phi] migration of class center sample infermeta (#45025) · b1e33bea
  Committed by duanboqiang on Aug 10, 2022
```
* add class center sample infershape

* add yaml

* modify unittest

* modify unittest

* remove comment
```
- fix bug of adaptive pool2d_grad, *test=kunlun (#45031) · 01d05bc0
  Committed by z8hanghuan on Aug 10, 2022
```
* fix bug of adaptive pool2d_grad, *test=kunlun

* fix bug of adaptive pool2d_grad, *test=kunlun

* fix bug of adaptive pool2d_grad, *test=kunlun
```
- [OpAttr]Support VarDesc* and vector<VarDesc*> in Attribute (#44737) · 81d6fa6c
  Committed by Aurelius84 on Aug 10, 2022
```
* [OpAttr]Support VarDesc* and vector<VarDesc*> in Attribute

* add unittest for inference predictor
```

09 Aug, 2022 · 7 commits
- [geometric]Add paddle.geometric.send_u_recv API (#44580) · 34b43555
  Committed by Siming Dai on Aug 09, 2022
```
* change out_size to INTArray

* fix out_size eager bug

* add unittest for out_size tensor

* add deprecated for paddle.incubate.graph_send_recv, add paddle.geometric.send_u_recv and unittests

* fix lowest bug

* fix according review comment

* add default value in yaml

* change api file name

* change name
```
- move api(erfinv) from legacy_api.yaml to api.yaml (#44987) · 76e0926c
  Committed by Charles-hit on Aug 09, 2022
```
* move api(erfinv) from legacy_api.yaml to api.yaml

* change inplace_map key
```
- [phi]migrate class center sample kernel (#44949) · a46d7fe6
  Committed by duanboqiang on Aug 09, 2022
```
* migrate class center sample kernel

* fix Resize ddim error

* set buffer ptr

* add header

* add header

* remove comment

* remove header
```
- fix vol2col (#44998) · ecc3098e
  Committed by yeliang2258 on Aug 09, 2022
- [phi] migrate margin infer shape and yaml (#44940) · 6d5744b4
  Committed by duanboqiang on Aug 09, 2022
```
* add margin infer

* migrate yaml

* modify unittests script
```
- Fix a bug in transpose2 when run native cpu (#44659) · 8185cecd
  Committed by yeliang2258 on Aug 09, 2022
```
* fix a bug in transpose2 about mkldnn

* fix bug
```
- fix format for paddle/phi/api/lib/tensor.cc (#44972) · b54abbe8
  Committed by Allen Guo on Aug 09, 2022

08 Aug, 2022 · 6 commits

【autograd】add log_p primitive operator for new autograd (#44779) · 463fc15e

Committed by Sing_chan on Aug 08, 2022

* add log_p for auto_grad

* add log_p_op.cc in prim_op_test srcs

* fix bug of wrong op name; add test in test_primops

* add test case of log in testprimapi

* fix bug of test_without_guard

* no need to fix test_without_guard


[phi] Transfer fluid fill_any to PHI fill (#44879) · ad716551

Committed by HongyuJia on Aug 08, 2022

* transfer kernel, make complete

* add fill_sig file

* fix code style

* fix fill_sig, add yaml, modify python API

* fix inplace, add inplace testcase

* deprecated_op_names append fill

* resolve comments, add test_backward


Lml/fix utf8 bug windows (#44945) · cf5742ac

Committed by levi131 on Aug 08, 2022

* for test

* Revert "for test"

This reverts commit baf58738ca485a06073d771e20e3644d8811bf31.

* fix utf8 bug on windows


move lamb_op to phi (#44899) · 4a7aa7c3
Committed by Thomas Young on Aug 08, 2022

[MLU] fix bn_grad and hard_sigmoid_grad error (#44919) · 8573ca54
Committed by fwenguang on Aug 08, 2022

clean includes of tensor.h (#44928) · ee9ea48d
Committed by Leo Chen on Aug 08, 2022
```
* clean tensor.h

* fix gather_nd
```

机器未来 / Paddle, in sync with the fork source project
