Commits · adaffb7be19cc4311d5b828693fa11cbc3062c41 · TonyTonyFun / Paddle

19 Aug, 2022 1 commit

Support beam search decode op in XPU environment (#44917) · adaffb7b

Committed by mengqingchun02 on Aug 19, 2022

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* fix beam_search operator bugs on xpu. test=kunlun

* fix beam_search operator bugs on xpu. test=kunlun

* fix beam_search operator bugs on xpu. test=kunlun

* fix beam_search operator bugs on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

* support beam_search_decode operator on xpu. test=kunlun

adaffb7b

17 Aug, 2022 1 commit

add instance norm op for xpu (#45097) · 216d25ac

Committed by ykkk2333 on Aug 17, 2022

* xpu unittest grad compute supports more types, *test=kunlun

* add instance norm xpu, *test=kunlun

216d25ac

16 Aug, 2022 1 commit
- [XPU] add truncated_gaussian_random op. (#45152) · 5bcabf78
  Committed by houj04 on Aug 16, 2022
  
  5bcabf78
15 Aug, 2022 2 commits
- add mish and mish_grad for XPU, test=kunlun (#45098) · 6815c8ab
  Committed by zhangyikun02 on Aug 15, 2022
  
  6815c8ab
- [XPU] add some collective ops. (#45049) · 7e2a20d5
  Committed by houj04 on Aug 15, 2022
```
* [XPU] add some collective ops. test=kunlun

* use XPUOpTestWrapper. test=kunlun

* skip kl1 for collective ops. fix typo: deivce -> device. test=kunlun
```
  7e2a20d5
09 Aug, 2022 1 commit

add phi empty kernel for xpu,*test=kunlun (#44745) · cd0b03cd

Committed by z8hanghuan on Aug 09, 2022

* add phi empty,*test=kunlun

* support empty op in xpu, *test=kunlun

* support empty op in xpu, *test=kunlun

cd0b03cd

05 Aug, 2022 1 commit
- refactor xpu tests for squeeze/unsqueeze, *test=kunlun (#44812) · 54d98963
  Committed by zhangxiaoci on Aug 05, 2022
  
  54d98963
04 Aug, 2022 1 commit
- [XPU] add merged_momentum including fp32 and fp16 (#44824) · 4922376c
  Committed by dongfangshenzhu on Aug 04, 2022
```
* add merged_momentum *test=kunlun

* add merged_momentum *test=kunlun

* add fp16 to merged_momentum,*test=kunlun
```
  4922376c
03 Aug, 2022 1 commit

add sequence_unpad for xpu (#44808) · ed0e95a8

Committed by z8hanghuan on Aug 03, 2022

* add sequence_unpad for xpu,*test=kunlun

* add sequence_unpad, *test=kunlun

* fix bug in testcase,should not be sequence_pad,*test=kunlun

ed0e95a8

02 Aug, 2022 2 commits

[XPU] fp16 for layer_norm op (#44778) · 4c3e13de
Committed by houj04 on Aug 02, 2022
```
* [XPU] fp16 for layer_norm op. test=kunlun
```
4c3e13de

support beam_search operator on xpu. test=kunlun (#44720) · 9bf80772

Committed by mengqingchun02 on Aug 02, 2022

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

9bf80772

29 Jul, 2022 2 commits
- add some fp16 op for kunlun resnet50 model (#44672) · fecbc958
  Committed by QingshuChen on Jul 29, 2022
```
* add some fp16 op for kunlun resnet50 model
*test=kunlun

* tmp
*test=kunlun
```
  fecbc958
- [XPU] add sampling_id op, add top_k op, update xdnn api. test=kunlun (#44704) · e61f48c1
  Committed by houj04 on Jul 29, 2022
  
  e61f48c1
28 Jul, 2022 2 commits
- support log_grad op, *test=kunlun (#44662) · 067107ad
  Committed by z8hanghuan on Jul 28, 2022
  
  067107ad
- [XPU] add top_k op (#44656) · acf07c74
  Committed by houj04 on Jul 28, 2022
```
* [XPU] add top_k op. test=kunlun

* [XPU] add top_k op. test=kunlun

* use PADDLE_ENFORCE_XDNN_NOT_NULL to check pointer. test=kunlun
```
  acf07c74
22 Jul, 2022 1 commit
- add xpu lars_momentum/pow2_decay (#44448) · 8ccbb863
  Committed by QingshuChen on Jul 22, 2022
```
*test=kunlun
```
  8ccbb863
18 Jul, 2022 1 commit
- add xpu resnet_unit (#44297) · 02e9453f
  Committed by QingshuChen on Jul 18, 2022
```
* add xpu resnet_unit
*test=kunlun

* tmp
*test=kunlun
```
  02e9453f
15 Jul, 2022 1 commit
- support KL2 multi-card training, *test=kunlun (#43889) · 270f25e9
  Committed by zhangxiaoci on Jul 15, 2022
```
* update xccl lib
    * use separate streams for compute/comm on XPU
    * add broadcast op to xpu2_op_list
```
  270f25e9
14 Jul, 2022 1 commit
- add xpu pnorm op and fix pool op, *test=kunlun (#44214) · 84b72c5f
  Committed by ykkk2333 on Jul 14, 2022
  
  84b72c5f
13 Jul, 2022 4 commits
- add ResNetBasicBlock python api for kunlun, test=kunlun (#44171) · 917235be
  Committed by zhangyikun02 on Jul 13, 2022
  
  917235be
- Zhusonghe (#44274) · 01b3ccae
  Committed by dongfangshenzhu on Jul 13, 2022
```
* add relu6 and relu6_grad

* change code style of relu6 and relu6_grad

* add relu6 and relu6_grad *test=kunlun

* add relu6 and relu6_grad *test=kunlun
```
  01b3ccae
- fix cpu lars_momentum bug & add xpu grad_add/log_softmax/log_softmax_… (#44260) · d6d60cbc
  Committed by QingshuChen on Jul 13, 2022
```
* fix cpu lars_momentum bug & add xpu grad_add/log_softmax/log_softmax_grad
*test=kunlun

* minor
*test=kunlun
```
  d6d60cbc
- add grid_sampler and update relu op for xpu. (#44227) · 0470e9da
  Committed by houj04 on Jul 13, 2022
```
* grid sampler op for xpu. test=kunlun

* update relu xdnn api. test=kunlun.
```
  0470e9da
11 Jul, 2022 1 commit
- rmsprop for xpu. test=kunlun (#44175) · 3ca713ee
  Committed by houj04 on Jul 11, 2022
```
* rmsprop for xpu. test=kunlun

* minor fix (follow comments). test=kunlun
```
  3ca713ee
08 Jul, 2022 2 commits
- unsqueeze2 support fp16. test=kunlun (#44142) · 19902a12
  Committed by houj04 on Jul 08, 2022
  
  19902a12
- add implement of resnet_basic_block op for XPU2, test=kunlun (#44143) · d7be46b3
  Committed by zhangyikun02 on Jul 08, 2022
  
  d7be46b3
07 Jul, 2022 1 commit
- xpu-paddlepaddle-31 optimize matmul test=kunlun (#43975) · d752a7f2
  Committed by taixiurong on Jul 07, 2022
  
  d752a7f2
26 Jun, 2022 1 commit
- format all files in fluid using new config (#43776) · 576236a0
  Committed by Sing_chan on Jun 26, 2022
  
  576236a0
16 Jun, 2022 1 commit
- remove fp16 support of depthwise_conv2d and add unittest for depthwise_conv2d, test=kunlun (#43483) · 6be3ee26
  Committed by zhangyikun02 on Jun 16, 2022
  
  6be3ee26
02 Jun, 2022 1 commit

Add generate_proposals_v2 op and expend function of gather op for kunlun. *test=kunlun (#43162) · ff22a9c4

Committed by Leo Guo on Jun 02, 2022

* Add generate_proposals_v2 op and unittest for kunlun. *test=kunlun

* Add the assign op to xpu2_op_list and expand the function of gather op. Add the unit-test of generate_proposals_v2. *test=kunlun

ff22a9c4

11 May, 2022 1 commit
- remove old XDNN implementation test=kunlun (#42404) · 7b828f71
  Committed by taixiurong on May 11, 2022
  
  7b828f71
10 May, 2022 1 commit
- add fp16 for reshape op on kunlun2, *test=kunlun (#42605) · 754edf6e
  Committed by TTerror on May 10, 2022
  
  754edf6e
06 May, 2022 1 commit

bind elementwise_mod_op_xpu (#42175) · 6ea2f049

Committed by enzodechine on May 06, 2022

* bind elementwise_mod_op_xpu *test=kunlun

* add more supported dtypes and UTs *test=kunlun

* fix datatype error

* add op to in xpu1_op_list

* Update Mac cmake version >=3.15 (#41456)

* Update Mac cmake version >=3.15

* notest;read test1

notest;read test2

notest;read test3

* fix inference link error

* fix inference link error

* fix windows link error

* fix cmake_policy

* fix build big size

* Add paddle::variant and replace paddle::any (#42139)

* add variant and replace any

* split attribute

* disable unittest failed in eager CI in temporary (#42101)

* test=py3-eager

* test=py3-eager

* test=py3-eager

* combine graph_table and feature_table in graph_engine (#42134)

* extract sub-graph

* graph-engine merging

* fix

* fix

* fix heter-ps config

* test performance

* test performance

* test performance

* test

* test

* update bfs

* change cmake

* test

* test gpu speed

* gpu_graph_engine optimization

* add dsm sample method

* add graph_neighbor_sample_v2

* Add graph_neighbor_sample_v2

* fix for loop

* add cpu sample interface

* fix kernel judgement

* add ssd layer to graph_engine

* fix allocation

* fix syntax error

* fix syntax error

* fix pscore class

* fix

* change index settings

* recover test

* recover test

* fix spelling

* recover

* fix

* move cudamemcpy after cuda stream sync

* fix linking problem

* remove comment

* add cpu test

* test

* add cpu test

* change comment

* combine feature table and graph table

* test

* test

* pybind

* test

* test

* test

* test

* pybind

* pybind

* fix cmake

* pybind

* fix

* fix

* add pybind

* add pybind
Co-authored-by: DesmonDay <908660116@qq.com>

* [CustomDevice] add eager mode support (#42034)

* fix FlattenContiguousRangeOpConverter out dim error (#42087)

* fix FlattenContiguousRangeOpConverter out dim error

* update code

* fix python3.10 compile bug on windows (#42140)

* Optimize dygraph GetExpectedKernelType perf (#42154)

* opt dygraph scheduling

* revert part impl

* fix incorrect usages of std::move and other compile errors (#41045)

* fix bug of std::move and others

* fix an compile error in debug mode

* fix wrong copy assignment operator
Signed-off-by: tiancaishaonvjituizi <452565578@qq.com>

* reformat
Signed-off-by: tiancaishaonvjituizi <452565578@qq.com>

* reformat
Signed-off-by: tiancaishaonvjituizi <452565578@qq.com>

* fix ArrayRef constructor following llvm

* fix format

* fix conflict with master

* fix variant compile error (#42203)

* [Eager] Support numpy.ndarry in CastNumpy2Scalar (#42136)

* [Eager] Remove redundancy code, fix fp16 case (#42169)

* [Eager] Support div(scalar) in eager mode (#42148)

* [Eager] Support div scalar in eager mode

* Updated and remove debug logs

* Remove list, use 'or' directly

* Remove useless statement

* fix recompute (#42128)

* fix recompute

* modify return

* add LICENSE in wheel dist-info package (#42187)

* replace any by variant in infermeta (#42181)

* [PaddlePaddle Hackathon 2] No. 24: add the nn.ChannelShuffle network-building API to Paddle (#40743)

* Add infermeta for ChannelShuffle

* Create channel_shuffle_grad_kernel.h

* Create channel_shuffle_kernel.h

* Create channel_shuffle_sig.cc

* Create channel_shuffle_op.cc

Description of the ChannelShuffle operator

* Create channel_shuffle_kernel_impl.h

Implementation of the ChannelShuffle kernel

* Create channel_shuffle_grad_kernel_impl.h

Implementation of the ChannelShuffle backward kernel

* Add kernel register of channel shuffle and grad

Register the ChannelShuffle kernel and its backward kernel

* add nn.functional.channel_shuffle

* add nn.ChannelShuffle

* Create test_channel_shuffle.py

* Update example of ChannelShuffle in vision.py

* Update test_channel_shuffle.py

* Change where the channel_shuffle kernel is implemented

* Fix code formatting

* Remove extra spaces

* Improve error checking for channel_shuffle

* Update unary.cc

* Update channel_shuffle_op.cc

* Update test_channel_shuffle.py

* Update unary.cc

* add channel_shuffle

* Update test_channel_shuffle.py

* Update vision.py

* Adjust code formatting

* Update channel_shuffle_sig.cc

* Update the ChannelShuffle documentation

* Update the channel_shuffle documentation

* remove ChannelShuffleOpArgumentMapping

* add ChannelShuffleGradInferMeta

* Update channel_shuffle_op.cc

* Relocate the kernels for channel_shuffle and its gradient

* Do not reset default stream for StreamSafeCUDAAllocator (#42149)

* remove redundant computation in Categorical.probs (#42114)

* Downloading data for test_analyzer_vit_ocr (#42041)

* Change server URL

* update config

* add test to parallel UT rule

* add checksum to ensure files are downloaded

* change downloading target

* reuse existing variable

* change target directory

* fix en docs of some Apis (gradients, scope_guard, cuda_places, name_scope, device_guard, load_program_state, scale, ParamAttr and WeightNormParamAttr) (#41604)

* Update scope_guard; test=document_fix

* gradients; test=document_fix

* gradients; test=document_fix

* name_scope; test=document_fix

* cpu_places; test=document_fix

* WeightNormParamAttr; test=document_fix

* cuda_places; test=document_fix

* load_program_state; test=document_fix

* device_guard; test=document_fix

* device_guard; test=document_fix

* ParamAttr; test=document_fix

* scale; test=document_fix

* scale; test=document_fix

* update code example; test=document_fix
Co-authored-by: Chen Long <1300851984@qq.com>

* fix datatype error

add op to in xpu1_op_list

*test=kunlun

* fix elementwise_mod op path error  *test=kunlun

* fix elementwise_mod UT error  *test=kunlun

* fix datatype error

add op to in xpu1_op_list

*test=kunlun

add op to in xpu1_op_list

fix elementwise_mod op path error  *test=kunlun

fix elementwise_mod UT error  *test=kunlun
Co-authored-by: tianshuo78520a <707759223@qq.com>
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: pangyoki <pangyoki@126.com>
Co-authored-by: seemingwang <seemingwang@users.noreply.github.com>
Co-authored-by: DesmonDay <908660116@qq.com>
Co-authored-by: ronnywang <524019753@qq.com>
Co-authored-by: baoachun <962571062@qq.com>
Co-authored-by: Zhou Wei <1183042833@qq.com>
Co-authored-by: tiancaishaonvjituizi <452565578@qq.com>
Co-authored-by: Weilong Wu <veyron_wu@163.com>
Co-authored-by: Roc <30228238+sljlp@users.noreply.github.com>
Co-authored-by: BrilliantYuKaimin <91609464+BrilliantYuKaimin@users.noreply.github.com>
Co-authored-by: Ruibiao Chen <chenruibiao@baidu.com>
Co-authored-by: Feiyu Chan <chenfeiyu@baidu.com>
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
Co-authored-by: Yilingyelu <103369238+Yilingyelu@users.noreply.github.com>
Co-authored-by: Chen Long <1300851984@qq.com>

6ea2f049

23 Apr, 2022 1 commit
- update reduce_max for kunlun, *test=kunlun (#42116) · 1587ad07
  Committed by TTerror on Apr 23, 2022
  
  1587ad07
19 Apr, 2022 1 commit
- support bmm&bmm_grad for KL2, *test=kunlun (#41935) · 60bec700
  Committed by z8hanghuan on Apr 19, 2022
  
  60bec700
18 Apr, 2022 1 commit
- support tril_triu_grad for KL2, *test=kunlun (#41877) · 0759e99d
  Committed by z8hanghuan on Apr 18, 2022
  
  0759e99d
15 Apr, 2022 1 commit
- add fp16 for masked_select on kunlun, *test=kunlun (#41215) · ff818c77
  Committed by TTerror on Apr 15, 2022
  
  ff818c77
14 Apr, 2022 1 commit

support multi layer and bidirection of lstm_grad, *test=kunlun (#41742) · 8b07ce0e

Committed by z8hanghuan on Apr 14, 2022

* support multi layer and bidirection of lstm_grad, *test=kunlun

* support multi layer and bidirection of lstm_grad, *test=kunlun

8b07ce0e

13 Apr, 2022 2 commits
- concat and relu sopport FP16 in XPU, test=kunlun (#41631) · c4d5a77f
  Committed by zhangyikun02 on Apr 13, 2022
  
  c4d5a77f
- support bce_loss and bce_loss_grad in XPU, test=kunlun (#41610) · 468c1ad7
  Committed by zhangyikun02 on Apr 13, 2022
  
  468c1ad7

TonyTonyFun / Paddle: in sync with the fork source project