Commits · 40cf6e649c7a11305c7c753898cfcd1c1c9eab1d · PaddlePaddle / Paddle

26 Jul, 2023 · 3 commits
- [BUG] fix bug of float/int/long/index Tensor (#55568) · a4644c50
  Committed by zhouweiwei2014 on Jul 26, 2023
- [New IR] Bind core structure (#55665) · ee506c2f
  Committed by YuanRisheng on Jul 26, 2023
```
* bind ir core

* perfect code

* deal with conflict
```
- Add py3.10 (#55286) · 97ec1d84
  Committed by tianshuo78520a on Jul 26, 2023
```
* Add py3.10;test=py3-ninja

* Add py3.10;test=py3-ninja

* test=py3-ninja

* test=py3-ninja

* test=py3-ninja

* test=py3-ninja

* test=py3-ninja

* Fix test error

* Fix build docker error

* Fix build docker error
```
25 Jul, 2023 · 1 commit

Call multiply_ instead of scale_ to avoid multiple DtoH copy. (#55589) · 05720257

Committed by Yiqun Liu on Jul 25, 2023

* Call multiply_ instead of scale_ to avoid multiple DtoH copy.

* Call _squared_l2_norm to calculate grad_clip.

* Fix import error.

24 Jul, 2023 · 4 commits
- Modify COPY-FROM No.13 distributed (#55236) · 38fbbe6b
  Committed by jjyaoao on Jul 24, 2023
```
Signed-off-by: jjyaoao <jjyaoao@126.com>
```
- [Bug Fix] convert environment variables' types (#55586) · 0f0dfe9a
  Committed by Windfarer on Jul 24, 2023
- [sharding stage 1 optim] Sharding comm overlap with backward (#55598) · a9f877ff
  Committed by Yuang Liu on Jul 24, 2023
- [AutoParallel] Add shard tensor and DistAttr api (#55494) · bd60757d
  Committed by Chen Weihang on Jul 24, 2023
```
* add shard tensor api

* add DistAttr api

* add unittest for coverage

* fix process mesh sample code

* fix checking error
```
22 Jul, 2023 · 2 commits
- fix group_shard3_get_all_parameter (#55572) · 6da9db50
  Committed by zhenhailiu on Jul 22, 2023
- Fix launch error when PADDLE_TRAINER_ENDPOINTS is too long (#55478) · db921ae9
  Committed by sneaxiy on Jul 22, 2023
```
* fix new launch

* fix ps uit
```
21 Jul, 2023 · 1 commit

Develop the grad_fn and next_functions APIs and expose them to the Python side; move the unit test file to a proper location (#55311) · 03f06841

Committed by qiuwenbo on Jul 21, 2023

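* [Attempt] Add an attribute to tensor whose value is the constant 1

* Expose gradnode and add a new gradnode method (for testing) that is exposed to Python, so the Python side can access it

* Develop the grad_fn and next_functions APIs and expose them to the Python side; apply some cleanup for conformity

* Add a unit test

* Improve code style

* Move the unit test file to the correct location

* Improve code style

* Remove useless comments

* Fix "__main__ has no attribute" error

* Modify the unit test file

* Modify the unit test script (temp)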

20 Jul, 2023 · 7 commits
- polish some code (#55583) · f172b02f
  Committed by Leo Chen on Jul 20, 2023
- Add fuse_linear_activation (#55420) · fa084e5e
  Committed by niuliling123 on Jul 20, 2023
- [Dy2St] fix `func_self` maybe a callable empty list (#55554) · 3b58a68f
  Committed by Nyakku Shigure on Jul 20, 2023
- [Kunlun] Modify some legacy code on distributed training (#55515) · 806f8d2b
  Committed by XiaociZhang on Jul 20, 2023
```
* [Kunlun] Modify some legacy code on distributed training

There were limitations on XPUs before, such as concat/split not being
supported and c_broadcast only supporting fp32. These limitations were
lifted recently.

Multi-device profiling on XPU will also be supported by this PR.
Without this PR, a hanging broadcast will be issued by devices that
enable profiling, eventually leading to a kernel timeout error.

* fix typo
```
- shard grad reduce (#55495) · 284e0d12
  Committed by zhenhailiu on Jul 20, 2023
- fix data load error in static mode (#55541) · 746e7cdc
  Committed by Kai Song on Jul 20, 2023
- pp comm overlap use tensor fusion helper (#55540) · 1f79fd47
  Committed by Yuang Liu on Jul 20, 2023
19 Jul, 2023 · 7 commits

[AutoParallel] keep lr_scheduler same between executor and engine (#55516) · 36bc5511
Committed by zhaoyingli on Jul 19, 2023

enhance etcd stability (#55499) · 7fc0fed8
Committed by caozhou on Jul 19, 2023

add sequence parallel utils to fleet utils (#55462) · bc153701
Committed by 陶泽伟 on Jul 19, 2023

Sharding stage 1 tensor fusion (#55427) · 4c4d3185
Committed by Yuang Liu on Jul 19, 2023

Modify COPY-FROM No.14 incubate (#55234) · cf146106
Committed by jjyaoao on Jul 19, 2023
```
Signed-off-by: jjyaoao <jjyaoao@126.com>
```
Modify COPY-FROM No.4 optimizer (#55238) · 413efdc9
Committed by jjyaoao on Jul 19, 2023
```
Signed-off-by: jjyaoao <jjyaoao@126.com>
```

disable __setitem__ in static mode & add API paddle.static.setitem with dy2st strategy (#53682) · 7849d58d

Committed by JYChen on Jul 19, 2023

* add paddle.static.setitem

* add some help doc

* support setitem

* support mechanism

* add more unittest

* remove useless code

* raise error in static setitem

* fix d2s UT

* remove static only for both-used code

* fix bool set_value in static, fix set_value_op UT

* fix unittests

* [May cause some error]: remove inplace-version check

* add two test case for dy2st

* fix function in vision

* fix dy2st setitem support, refine UT case

* fix slice in static_mode

* add ParametersMap

* remove pop

* modify place

* [fix]: variable is also a tensor

* rewrite some ut & remove slicetransformer in dy2st

* solve error in static-mode

* fix ut

* return a result for set_array_write

* fix test_set_value_op_xpu

* code is different in dynamic / static mode

---------
Co-authored-by: Aurelius84 <zhangliujie@baidu.com>
Co-authored-by: NotHaozi <zhangmenghao@baidu.com>

18 Jul, 2023 · 5 commits

Modify COPY-FROM add_example_for_lazygurd (#55411) · 96ff6103
Committed by zhangjingwei on Jul 18, 2023
```
* add_example_for_lazygurd

* fix
```

batch add inplace api (#55078) · 19302938

Committed by GGBond8488 on Jul 18, 2023

* batch add inplace api

* fix inplace fn generate

* add test for new inplace api

* fix typo

* fix typo

* fix typo

* fix test error

* fix atan2

* remove atan2

* auto generate inplace api

* fix inplace generate fn error

* fix windows error

* fix test error

* fix test error

* fix windows ci error

* fix test error

* fix test_error

* fix test error

* fix eigen aliasing error in inplace

* remove elementwise_pow inplace

* fix doc error

* fix test error

[NewIR]Fix new ir concat split bug (#55419) · 5e6645d7

Committed by hong on Jul 18, 2023

* fix new ir concat op bug

* fix bug

* using add_n_with_kernel instead of add_n impl

* fix pd_op yaml bug

* fix bug

[Dy2St] skip compare between func and module attribute to fix NumPy 1.25 error (#55482) · 2dcb0ebf
Committed by Nyakku Shigure on Jul 18, 2023

[Add] Introduce xdoctest checks into the Paddle code CI (#55295) · 26fba07c

Committed by megemini on Jul 18, 2023

* [Add]Add Xdoctester

* [Fix]fix beta docstring

* [Doctest]change dirichlet docstring

* [Doctest]change gumbel docstring

* [Doctest]change bernoulli docstring

* [Doctest]change categorical docstring

* [Doctest]change ops.py docstring

* [Doctest]change conv docstring

* [Doctest]change distance docstring, test=docs_preview

* [Change]add ref

* [Change]patch xdoctest debug

17 Jul, 2023 · 1 commit

Support more dtype for any/all API. (#55253) · 7b19efe4

Committed by zxcd on Jul 17, 2023

* add more data type for all/any.

* remove xpu fix.

* add test unit.

* fix typename name.

* fix output data type.

14 Jul, 2023 · 1 commit

[AutoTuner] Distribute best cfg (#54834) · 7f6d222f

Committed by caozhou on Jul 14, 2023

* distribute best cfg

* adapt to multi args transmission

* update metric extracting

* fix bugs of prune and reading log

* fix time default value

* remove time record

* adjust the order of searching dim

* fix prune bugs

* fix adding cfg bug

* fix multi nodes bug

* reset status

* remove alarm and set logdir

* deepcopy ctx

* change alarm

* fix restart bug

* add exit

* best no need alarm

* add warmup time

13 Jul, 2023 · 7 commits
- Add fused_attention, fused_feedforward, fused_gemm_epilogue to amp white_list (#55373) · cb68b58a
  Committed by niuliling123 on Jul 13, 2023
  
- Support nvprof for auto parallel (#55347) · 9210b1af
  Committed by Ruibiao Chen on Jul 13, 2023
```
* Support nvprof for auto parallel

* Fix CI errors

* Fix CI errors
```
- [AMP Prim OP] support instance_norm prim ops for fp16 and bf16 dtype (#55368) · 65950324
  Committed by Charles-hit on Jul 13, 2023
```
* [prim]support fp16 for instance_norm and instance_norm_grad

* support fp16 and bfp16 dtype for instance_norm prim rules

* fix new ir test

---------
Co-authored-by: cxxly <chenxx_id@163.com>
```
- add phi operator c_concat and ut (#55320) · 788be26d
  Committed by lil-Xing on Jul 13, 2023
```
* add phi operator c_concat and ut

* update create_var use

* update copyright
```
- Integrate QAT into distributed optimizer (#54241) · aaf021c9
  Committed by Leo Chen on Jul 13, 2023
```
* Support AMP program for onnx QAT API

* Integrate QAT into distributed optimizer

* Reduce the size of test data and increase time limit

* Use logger and reduce time limit of unittests

* Rename and move unittest into fleet test

* Test qat_init API
```
- fix protobuf problem (#55305) · 0cea7b7d
  Committed by risemeup1 on Jul 13, 2023
```
* fix protobuf problem

* fix protobuf problem
```
- sharding vpp overlap bug fixer (#55365) · 1558ee02
  Committed by Yuang Liu on Jul 13, 2023
12 Jul, 2023 · 1 commit

[ONEDNN] Upgrade oneDNN version to v3.1 (#52463) · cfa513f7

Committed by YangQun on Jul 12, 2023

* squash pick the poc code
* fix build after rebase
* fix int8 conv and fc uts
* Fix and clean-up Get_SRC_Scale_Memory
* fix floating point fc uts
* fix test_analyzer_int8_googlenet
* test_analyzer_int8_mobilenetv1
* fix int8 mobilenet v2 and v3
* fix build error after rebase
* [oneDNN] rename library version
* fix conv bias datatype
* try to fix import error
* fix rebase error
* [oneDNN] pack library into python wheel
* add MKLDNN_SHARED_LIB_3 to env_dict
* fix test_analyzer_bert
* fix fill_constant op kernel
* fix ernie and matmul op ut
* fix softplus ut
* fix conv+relu6 fusion ut
* fix hardswish fusion
* fix quant+transpose fusion ut
* fix sgd ut
* fix int8 matmul with flatten
* fix fc+scale fusion
* fix conv/matmul+gelu fusion uts
* fix rebase error
* Revert "fix conv/matmul+gelu fusion uts"
This reverts commit 47eb5e49972bd8f7271a233def9bfb3e98ce78e1.
* upgrade to onednn v3.1
* remove older version onednn
* use densetensor::data() for achieving mean and var in layernorm impl
* comments for atol of integer tests
* fix clang-format
* Revert "remove older version onednn"
This reverts commit 783e57ddfd4401254596eae7d47adb9b03590c09.
* improve binary handle
* fix expand kernel
* Revert "use densetensor::data() for achieving mean and var in layernorm impl"
* always use forward_inference for conv
* remove activation scales
* rollback changes to mkldnn.cmake
* address comments
* port changes to dequantize kernel
* fix merge error
* fix fused_elementwise_kernel
* upgrade onednn version to v3.1.1
* fix some approval error
* fix error msg format
* remove old onednn libs
* try to fix symbolic link issue
* fix cinn test case segfault
* do not explicit link test with onednn
* remove unnecessary changes
* integrate CINN with onednn v3
* link with mkldnn project
* fix cinn build file

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
Co-authored-by: Chen, Xinyu1 <xinyu1.chen@intel.com>
Co-authored-by: tianshuo78520a <707759223@qq.com>

PaddlePaddle / Paddle · synced successfully about 1 year ago