Commits · bed9aaea4339c20d7b71c272788b903e95411461 · BaiXuePrincess / Paddle

07 May, 2022 2 commits

[Auto Parallel] Improve the codes of the completion and distributed context (#40671) · bed9aaea

Committed by Yulong Ao on May 07, 2022

* [Auto Parallel] Replace the old planner by the new partition tuner

* [Auto Parallel] Improve the completion and distributed context

* [Auto Parallel] Fix some bugs of the compatible check of some dist ops

* [Auto Parallel] Fix some bugs

bed9aaea

sync misc changes (#42534) · 37580838
Committed by Allen Guo on May 07, 2022

37580838

06 May, 2022 11 commits

bind elementwise_mod_op_xpu (#42175) · 6ea2f049

Committed by enzodechine on May 06, 2022

* bind elementwise_mod_op_xpu *test=kunlun

* add more supported dtypes and UTs *test=kunlun

* fix datatype error

* add op to in xpu1_op_list

* Update Mac cmake version >=3.15 (#41456)

* Update Mac cmake version >=3.15

* notest;read test1

notest;read test2

notest;read test3

* fix inference link error

* fix inference link error

* fix windows link error

* fix cmake_policy

* fix build big size

* Add paddle::variant and replace paddle::any (#42139)

* add variant and replace any

* split attribute

* disable unittest failed in eager CI in temporary (#42101)

* test=py3-eager

* test=py3-eager

* test=py3-eager

* combine graph_table and feature_table in graph_engine (#42134)

* extract sub-graph

* graph-engine merging

* fix

* fix

* fix heter-ps config

* test performance

* test performance

* test performance

* test

* test

* update bfs

* change cmake

* test

* test gpu speed

* gpu_graph_engine optimization

* add dsm sample method

* add graph_neighbor_sample_v2

* Add graph_neighbor_sample_v2

* fix for loop

* add cpu sample interface

* fix kernel judgement

* add ssd layer to graph_engine

* fix allocation

* fix syntax error

* fix syntax error

* fix pscore class

* fix

* change index settings

* recover test

* recover test

* fix spelling

* recover

* fix

* move cudamemcpy after cuda stream sync

* fix linking problem

* remove comment

* add cpu test

* test

* add cpu test

* change comment

* combine feature table and graph table

* test

* test

* pybind

* test

* test

* test

* test

* pybind

* pybind

* fix cmake

* pybind

* fix

* fix

* add pybind

* add pybind
Co-authored-by: DesmonDay <908660116@qq.com>

* [CustomDevice] add eager mode support (#42034)

* fix FlattenContiguousRangeOpConverter out dim error (#42087)

* fix FlattenContiguousRangeOpConverter out dim error

* update code

* fix python3.10 compile bug on windows (#42140)

* Optimize dygraph GetExpectedKernelType perf (#42154)

* opt dygraph scheduling

* revert part impl

* fix incorrect usages of std::move and other compile errors (#41045)

* fix bug of std::move and others

* fix an compile error in debug mode

* fix wrong copy assignment operator
Signed-off-by: tiancaishaonvjituizi <452565578@qq.com>

* reformat
Signed-off-by: tiancaishaonvjituizi <452565578@qq.com>

* reformat
Signed-off-by: tiancaishaonvjituizi <452565578@qq.com>

* fix ArrayRef constructor following llvm

* fix format

* fix conflict with master

* fix variant compile error (#42203)

* [Eager] Support numpy.ndarry in CastNumpy2Scalar (#42136)

* [Eager] Remove redundancy code, fix fp16 case (#42169)

* [Eager] Support div(scalar) in eager mode (#42148)

* [Eager] Support div scalar in eager mode

* Updated and remove debug logs

* Remove list, use 'or' directly

* Remove useless statement

* fix recompute (#42128)

* fix recompute

* modify return

* add LICENSE in wheel dist-info package (#42187)

* replace any by variant in infermeta (#42181)

* [PaddlePaddle Hackathon 2] No. 24: Add the nn.ChannelShuffle network-building API to Paddle (#40743)

* Add infermeta for ChannelShuffle

* Create channel_shuffle_grad_kernel.h

* Create channel_shuffle_kernel.h

* Create channel_shuffle_sig.cc

* Create channel_shuffle_op.cc

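Description of the ChannelShuffle operator

* Create channel_shuffle_kernel_impl.h

Implementation of the ChannelShuffle kernel function

* Create channel_shuffle_grad_kernel_impl.h

Implementation of the ChannelShuffle backward kernel function

* Add kernel register of channel shuffle and grad

Register the kernel functions of ChannelShuffle and its backward pass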

* add nn.functional.channel_shuffle

* add nn.ChannelShuffle

* Create test_channel_shuffle.py

* Update example of ChannelShuffle in vision.py

* Update test_channel_shuffle.py

* Change where the channel_shuffle kernel functions are implemented

* Fix code formatting

* Remove redundant spaces

* Improve the error checking in channel_shuffle

* Update unary.cc

* Update channel_shuffle_op.cc

* Update test_channel_shuffle.py

* Update unary.cc

* add channel_shuffle

* Update test_channel_shuffle.py

* Update vision.py

* Adjust code formatting

* Update channel_shuffle_sig.cc

* Update the ChannelShuffle documentation

* Update the channel_shuffle documentation

* remove ChannelShuffleOpArgumentMapping

* add ChannelShuffleGradInferMeta

* Update channel_shuffle_op.cc

* Move the kernel functions of channel_shuffle and its gradient

* Do not reset default stream for StreamSafeCUDAAllocator (#42149)

* remove redundant computation in Categorical.probs (#42114)

* Downloading data for test_analyzer_vit_ocr (#42041)

* Change server URL

* update config

* add test to parallel UT rule

* add checksum to ensure files are downloaded

* change downloading target

* reuse existing variable

* change target directory

* fix en docs of some Apis (gradients, scope_guard, cuda_places, name_scope, device_guard, load_program_state, scale, ParamAttr and WeightNormParamAttr) (#41604)

* Update scope_guard; test=document_fix

* gradients; test=document_fix

* gradients; test=document_fix

* name_scope; test=document_fix

* cpu_places; test=document_fix

* WeightNormParamAttr; test=document_fix

* cuda_places; test=document_fix

* load_program_state; test=document_fix

* device_guard; test=document_fix

* device_guard; test=document_fix

* ParamAttr; test=document_fix

* scale; test=document_fix

* scale; test=document_fix

* update code example; test=document_fix
Co-authored-by: Chen Long <1300851984@qq.com>

* fix datatype error

add op to in xpu1_op_list

*test=kunlun

* fix elementwise_mod op path error  *test=kunlun

* fix elementwise_mod UT error  *test=kunlun

* fix datatype error

add op to in xpu1_op_list

*test=kunlun

add op to in xpu1_op_list

fix elementwise_mod op path error  *test=kunlun

fix elementwise_mod UT error  *test=kunlun
Co-authored-by: tianshuo78520a <707759223@qq.com>
Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: pangyoki <pangyoki@126.com>
Co-authored-by: seemingwang <seemingwang@users.noreply.github.com>
Co-authored-by: DesmonDay <908660116@qq.com>
Co-authored-by: ronnywang <524019753@qq.com>
Co-authored-by: baoachun <962571062@qq.com>
Co-authored-by: Zhou Wei <1183042833@qq.com>
Co-authored-by: tiancaishaonvjituizi <452565578@qq.com>
Co-authored-by: Weilong Wu <veyron_wu@163.com>
Co-authored-by: Roc <30228238+sljlp@users.noreply.github.com>
Co-authored-by: BrilliantYuKaimin <91609464+BrilliantYuKaimin@users.noreply.github.com>
Co-authored-by: Ruibiao Chen <chenruibiao@baidu.com>
Co-authored-by: Feiyu Chan <chenfeiyu@baidu.com>
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
Co-authored-by: Yilingyelu <103369238+Yilingyelu@users.noreply.github.com>
Co-authored-by: Chen Long <1300851984@qq.com>

6ea2f049
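
For readers unfamiliar with the operation behind the nn.ChannelShuffle entry in the commit above, here is a minimal pure-Python sketch of the channel reordering. It assumes the usual ShuffleNet-style definition (view the channels as a groups-by-channels-per-group grid, transpose it, then flatten). The helper name `channel_shuffle` is hypothetical and only illustrates the index arithmetic; it is not Paddle's kernel or its public API.

```python
def channel_shuffle(channels, groups):
    """Reorder a list of per-channel items the way a channel shuffle would.

    Hypothetical illustration only: `channels` is a plain Python list with
    one entry per channel, not a tensor.
    """
    num_channels = len(channels)
    if groups <= 0 or num_channels % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = num_channels // groups
    # Treat the list as a (groups, per_group) grid, read it column by column
    # (i.e. transpose), and flatten the result.
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


shuffled = channel_shuffle([0, 1, 2, 3, 4, 5], groups=3)
print(shuffled)
```

With six channels and three groups, channels from different groups end up interleaved, which is the point of the operation: later grouped convolutions see inputs from every group.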

[NPU] add clip_by_norm op (#42411) · 1588e7e7
Committed by Aganlengzi on May 06, 2022
```
* [NPU] add clip_by_norm op

* fix

* update
```
1588e7e7

update UTs 3 (#42519) · 94acf7c8
Committed by Allen Guo on May 06, 2022

94acf7c8

fix dataset ut (#42504) · 06927016
Committed by yaoxuefeng on May 06, 2022
```
* fix dataset ut

* fix seed state ut
```
06927016

skip bf16 test if not supported (#42503) · 69b5d74d
Committed by Leo Chen on May 06, 2022

69b5d74d

[IPU] remove transfer cast pass (#42520) · 09a13294
Committed by Allen Guo on May 06, 2022
```
* rm transfer_cast_op_pass

* rm header
```
09a13294

update UTs 2 (#42518) · 001dab0b
Committed by Allen Guo on May 06, 2022

001dab0b

update UTs 1 (#42517) · 063a3509
Committed by Allen Guo on May 06, 2022

063a3509

[IPU] update UTs 0 (#42516) · 63d4d05a
Committed by Allen Guo on May 06, 2022
```
* update UTs 0

* fix ci

* fix ci 3
```
63d4d05a

[Eager] Enabled several ops test under eager mode (#42510) · 6ff35e17
Committed by Weilong Wu on May 06, 2022

6ff35e17

[AutoParallel] adapt for 2d laplace (#41601) · c043a21b

Committed by zhaoyingli on May 06, 2022

* add default_ctx in backward.py

* record grad_var_to_var with grad_times

* fix backward

* update annotation

* add complete_high_order_grad in complete_forward

* add dist slice op

* update grad_var_to_var type

* update partition_block init mapping before loss op

* update compatible for 'XShape' & update 'allreduce_vars'

* add dist reshape op when input dim equal to output dim

* update 'set_grad_var_shape' with grad_var_to_var

* fix dist slice

* fix set_grad_var_shape

* add dist pnorm op

* fix dist pnorm dist_attr

* fix engine startprogram & adapt highorder grad

* fix set_grad_var_shape when mp

* update unittest

* update cmakelist

* default strategy in engine: dp

* bug fix

* tiny fix

* flatten outputs

* fix default strategy

* init default ctx

* tiny fix

* test=allcase

c043a21b

05 May, 2022 6 commits
- [IPU] merge recent changes (#42078) · 6ec89eeb
  Committed by Allen Guo on May 05, 2022
```
* merge recent changes

* fix setting pipline
```
  6ec89eeb
- fix wrong place in ut (#42486) · a5de44f5
  Committed by Leo Chen on May 05, 2022
  
  a5de44f5
- fix sparse mask (#42305) · e8e3b997
  Committed by zhangkaihuo on May 05, 2022
  
  e8e3b997
- Disable standalone executor for test_tensordot (#42476) · e51fad5f
  Committed by Ruibiao Chen on May 05, 2022
  
  e51fad5f
- fix unittest of conv2d due to V100 do not support bfloat16 (#42483) · 70120c7f
  Committed by wangxinxin08 on May 05, 2022
  
  70120c7f
- fix the v100 cuda11.2 matmul_v2 and elementwise_div bug (#42477) · 98c3f85e
  Committed by wawltor on May 05, 2022
  
  98c3f85e
04 May, 2022 4 commits
- fix Tensor share memory in eager mode. test=develop (#42445) · be77aeea
  Committed by Kaipeng Deng on May 04, 2022
  
  be77aeea
- support fuse conv and bn in QAT (#42255) · d6442df6
  Committed by Guanghua Yu on May 04, 2022
  
  d6442df6
- support skip_op_list in PostTrainingQuantization (#42378) · b621a4f1
  Committed by Guanghua Yu on May 04, 2022
  
  b621a4f1
- fix PTQ unittest timeout (#42450) · 87afccb2
  Committed by Guanghua Yu on May 04, 2022
  
  87afccb2
03 May, 2022 1 commit

Hotfix Release 2.3 Bug for CUDA 11.2 (#42437) · b0a64800

Committed by Huihuang Zheng on May 03, 2022

This PR hotfixes `test_cond.py` under CUDA 11.2.

The bug occurs because the `fill_constant` op returns a wrong value in the modified test case `test_extremely_simple_net_with_op_in_condition`. SWEs can add `layers.Print(a)` and `layers.Print(b)` to the test case to reproduce it; they will see that `fill_constant` returns a value on the order of `e-50` instead of `1.23` and `1.25`.

This PR works around the bug by comparing against the value of `b` instead of a literal number, which keeps the `cond` logic correct. **However, this PR does not fix `fill_constant`.** The SWEs who work in this area are left to find the op bug and fix it.

b0a64800

29 Apr, 2022 9 commits
- modify reshape to reshape2 in paddle.nn.initializer.dirac (#42396) · eca6638c
  Committed by zhouweiwei2014 on Apr 29, 2022
  
  eca6638c
- add unit test for batch_norm and leaky_relu (#42369) · dbe189b1
  Committed by YuanRisheng on Apr 29, 2022
  
  dbe189b1
- Make einsum_v2 support multi-operands (#42327) · 32cae24c
  Committed by xiongkun on Apr 29, 2022
```
* Extend python einsum interface to make einsum_v2 support multi-operands and switch it to default.

* add opt_einsum dependence

* add yaml and support eager model

* fix by code review
```
  32cae24c
- [Eager] Support test_diff_op switch to eager mode (#42360) · 21d94dd3
  Committed by Weilong Wu on Apr 29, 2022
  
  21d94dd3
- [Eager] Remove enable_legacy_dygraph setting (#42363) · 05d6be7e
  Committed by Weilong Wu on Apr 29, 2022
```
* [Eager] Remove enable_legacy_dygraph setting

* Add more tests
```
  05d6be7e
- [Eager] Support test_label_smooth_functional switch to eager mode (#42366) · c3852b08
  Committed by Weilong Wu on Apr 29, 2022
  
  c3852b08
- [Eager] Support test_eigh_op switch to eager mode (#42379) · 08f07dcb
  Committed by Weilong Wu on Apr 29, 2022
  
  08f07dcb
- Add some double/triple grad kernel yaml file (#42361) · 24ec6ed0
  Committed by YuanRisheng on Apr 29, 2022
```
* add double yaml

* add inline func
```
  24ec6ed0
- [Dy2Stat]Fix losting pre/post hook from outermost layer while jit.save (#42273) · 27cf7afb
  Committed by Aurelius84 on Apr 29, 2022
```
* [Dy2Stat]Fix losting pre/post hook from outermost layer while jit.save

* fix kwargs

* fix unittest
```
  27cf7afb
28 Apr, 2022 4 commits

Add gradient merge for DistributedFusedLamb optimizer (#40177) · 108aeb28

Committed by sneaxiy on Apr 28, 2022

* add gradient merge for DistributedFusedLamb

* use master acc gradient

* fix CI ut

* polish

* remove math_function_impl.h change

* fix test_update_loss_scaling_op.py

* try to fix XPU/NPU CI

* add gm ut

108aeb28

[CustomDevice] add amp support (#42035) · acbb5dbe
Committed by ronnywang on Apr 28, 2022

acbb5dbe

[CustomDevice]change import way of unpublished file in op_test test=allcases (#42285) · 62c0304b

Committed by Aganlengzi on Apr 28, 2022

* test op_test test=allcases

* fix

* avoid copy many same file

* fix for win

* test PYTHONPATH

* change path adding way

* fix win

* use old way

* use old way test=allcase

* use old way test=allcase

62c0304b

fix collections.Sequence in python3.10 (#42242) · edb61a52
Committed by pangyoki on Apr 28, 2022
```
* fix collections.Sequence in python3.10

* fix format
```
edb61a52
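
The two Python 3.10 fixes by pangyoki in this log concern the abstract base classes that used to be reachable as `collections.Sequence` and `collections.Iterable`. Those aliases had been deprecated since Python 3.3 and were removed in Python 3.10, so the classes have to be imported from `collections.abc` instead. The following is a minimal sketch of the portable spelling; the helper `is_sequence` is hypothetical and is not taken from the Paddle change itself.

```python
from collections.abc import Iterable, Sequence


def is_sequence(obj):
    """Return True for objects registered as sequences (list, tuple, str, ...)."""
    # collections.abc.Sequence is available on every supported Python 3
    # version, unlike the removed collections.Sequence alias.
    return isinstance(obj, Sequence)


print(is_sequence([1, 2, 3]))
print(is_sequence({1, 2, 3}))
print(isinstance({1, 2, 3}, Iterable))
```

A set is iterable but not a sequence, which is why the two checks differ for the same object.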

27 Apr, 2022 3 commits
- Added missing test for shuffle_channel_mkldnn_detect_pass (#42001) · 5134f110
  Committed by jakpiase on Apr 27, 2022
```
* added test for shuffle_channel_mkldnn_detect_pass

* added UT using new framework

* CI fix
```
  5134f110
- implement autotune python API (#42299) · 2094a584
  Committed by Zhang Ting on Apr 27, 2022
  
  2094a584
- fix collections.Iterable in python3.10 (#42295) · 3d6fb260
  Committed by pangyoki on Apr 27, 2022
  
  3d6fb260

BaiXuePrincess / Paddle is in sync with its fork source project