Commits · c789907430fe2014c3aafc4276f50829a4c867d4 · BaiXuePrincess / Paddle

06 Jan, 2023 1 commit

[Auto Parallel] Merge dist attrs from python into c++ (#49214) · c7899074

Committed by Yulong Ao on Jan 06, 2023

* [Auto Parallel] Rename methods of ProcessMesh

* [Auto Parallel] Impl the python process_mesh by the c++ one

* [Auto Parallel] Add some minor modifications

* [Auto Parallel] Rename some methods

* [Auto Parallel] Remove unnecessary codes

* [Auto Parallel] Add back some removed files

* [Auto Parallel] Fix bugs

* [Auto Parallel] Fix a bug

* Update process_mesh.cc

* [Auto Parallel] Merge dist attrs of Python into C++

* [Auto Parallel] Add back deleted importing

* [Auto Parallel] Add back removed unittest

* [Auto Parallel] Remove type qualifiers of return types

* [Auto Parallel] Fix some bugs

* [Auto Parallel] Fix a bug of the quant pass

* [Auto Parallel] Fix the code style

c7899074

05 Jan, 2023 9 commits

sequence_mask fix: when the input length is an empty tensor, the kernel tries... · 0f3ccd14
Committed by Feiyu Chan on Jan 05, 2023
```
sequence_mask fix: when the input length is an empty tensor, the kernel tries to dereference illegal sentinel iterator (#49525)
```
0f3ccd14
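
The failure mode named in this commit message can be illustrated outside Paddle. The following is a minimal pure-Python sketch, not the kernel changed by the commit: `sequence_mask_sketch` and its `maxlen` parameter are hypothetical names, and the empty-input guard is an assumption about one reasonable fix. In C++, a typical way to hit this class of bug is calling `std::max_element` on an empty range, which returns the end iterator; that iterator must not be dereferenced.

```python
from typing import List, Optional


def sequence_mask_sketch(lengths: List[int],
                         maxlen: Optional[int] = None) -> List[List[int]]:
    """Return one 0/1 mask row per entry in `lengths` (illustrative only)."""
    if maxlen is None:
        # max() of an empty list raises ValueError in Python; the C++ analogue
        # is dereferencing the end iterator. Guard the empty case explicitly.
        maxlen = max(lengths) if lengths else 0
    # Position i in a row is 1 while i is inside that sequence's length.
    return [[1 if i < n else 0 for i in range(maxlen)] for n in lengths]


print(sequence_mask_sketch([1, 3]))  # [[1, 0, 0], [1, 1, 1]]
print(sequence_mask_sketch([]))      # []
```

With the guard in place, an empty `lengths` input yields an empty mask instead of an error.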

Support 0D for paddle.sort/argsort (#49501) · 032da731

Committed by Siming Dai on Jan 05, 2023

* support 0D for paddle.sort/argsort

* support 0D tensor for paddle.sort/argsort in xpu

* fix bug

* fix grad and add value assertion

032da731

[inference][trt]Upgrade expand cast nearestinterp for sd (#48998) · 5defefd6

Committed by Zhang Jun on Jan 05, 2023

* update nearest_interp, expand_v2, cast for stable diffusion

* update nearest_interp, expand_v2, cast for stable diffusion

* correct shape rank

* Update expand_v2_op.cc

5defefd6

[Auto Parallel] Add conv2d and pool flops (#48084) · 351d37d9
Committed by Jianghai on Jan 05, 2023
```
* add pool flops

* add annotations and tests
```
351d37d9
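
As background for what a conv2d or pooling FLOPs estimate typically counts, here is a small sketch under stated assumptions. It is not the code added by this commit: the function names are hypothetical, dilation is ignored, output sizes use floor rounding, and the choice between reporting multiply-accumulates (MACs) or 2 × MACs as FLOPs is a convention that the commit log does not specify.

```python
def conv_out_size(size, kernel, stride=1, padding=0):
    """Output length along one spatial axis (no dilation, floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv2d_macs(n, c_in, h, w, c_out, kernel, stride=1, padding=0, groups=1):
    """Multiply-accumulates for a square-kernel conv2d, bias excluded."""
    h_out = conv_out_size(h, kernel, stride, padding)
    w_out = conv_out_size(w, kernel, stride, padding)
    # Each output element reads a (c_in / groups) x kernel x kernel window.
    macs_per_output = (c_in // groups) * kernel * kernel
    return n * c_out * h_out * w_out * macs_per_output


def pool2d_ops(n, c, h, w, kernel, stride):
    """Element reads for a square pooling window, one op per window element."""
    h_out = conv_out_size(h, kernel, stride)
    w_out = conv_out_size(w, kernel, stride)
    return n * c * h_out * w_out * kernel * kernel


macs = conv2d_macs(1, 3, 32, 32, 8, kernel=3, stride=1, padding=1)
print(macs, 2 * macs)                                 # 221184 442368
print(pool2d_ops(1, 8, 32, 32, kernel=2, stride=2))   # 8192
```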

Yj/rm core ops exp (#49490) · 70ea88bf

Committed by 姜永久 on Jan 05, 2023

* rm op_function_generator

* rm op_func_generator.h

* rm op_function

* modify cmake

* rm op_function.h

* rm check for op_function_generator.cc

* reset imperative

* rm python part

* fix imperative

* lint

* lint

* modify legacy_c

* review

* modify

* modify legacy

* rm gen op_functions code

* reset framework

* rm core.ops for test

* core.ops->core.eager.ops.legacy

* not raiseError for xpu

70ea88bf

[Inference] inplace all reshape op (#49146) · 017af746
Committed by Wilber on Jan 05, 2023

017af746

Add 0d Tensor Test Cases for cond, case, switch_case (#49544) · d5f1e300

Committed by Huihuang Zheng on Jan 05, 2023

Add 0D tensor test cases for cond, case, and switch_case. Because these three APIs are control-flow APIs, their 0D tensor support depends on the underlying APIs. This PR only adds test cases showing that the three APIs already handle 0D tensors correctly.

d5f1e300

Add transpose_qkv_wb flags to the fused_attention_op. (#49494) · ec857b85
Committed by Yuang Liu on Jan 05, 2023

ec857b85

move fluid.dygraph.amp to paddle.amp (#49193) · da3e9d66
Committed by zhangkaihuo on Jan 05, 2023

da3e9d66

04 Jan, 2023 5 commits
- [XPU] fix clip op unit test. (#49535) · 2098c283
  Committed by houj04 on Jan 04, 2023
  
  2098c283
- [Inference] Add conv_fusion nhwc impl. (#49047) · 4a8708bb
  Committed by Wilber on Jan 04, 2023
  
  4a8708bb
- [Auto Parallel-Performance] Sharding Comm Optimization (#48604) · 5592f8ad
  Committed by JZ-LIANG on Jan 04, 2023
```
* remove deps and prior comm

* grad comm fuse

* add deps for amp&global norm

* stage2 broadcast prior deps

* stage2 grad overlap

* stream_analyzer bugfix

* overlap enable

* dep op namescope

* depend support multiple inputs

* check finite deps

* stage2 param comm overlap

* Set kD2HStream

* grad comm hierarchical

* grad comm hierarchical

* new unittest
Co-authored-by: chenruibiao <chenruibiao@baidu.com>
```
  5592f8ad
- Revert "Replace matmul with matmul_v2 during oneDNN fuse passes (#49108)" (#49524) · 338cbeaa
  Committed by Sławomir Siwek on Jan 04, 2023
```
This reverts commit 2c444dfa.
```
  338cbeaa
- Add for-else (#49521) · 49f5a97b
  Committed by 张春乔 on Jan 04, 2023
```
* add for-else

* add * for unpacking
```
  49f5a97b
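
For readers unfamiliar with the two Python constructs named in this commit, a plain-Python sketch follows. It shows only the language semantics; it does not use Paddle and says nothing about how the commit implements support for them. `first_negative` is a hypothetical helper.

```python
def first_negative(values):
    """Return the first negative value, or None if there is none."""
    for v in values:
        if v < 0:
            result = v
            break
    else:
        # The else branch runs only when the loop ends without `break`.
        result = None
    return result


# Starred unpacking: `head` takes the first item, `tail` collects the rest.
head, *tail = [1, 2, 3]

print(first_negative([3, -1, 2]))  # -1
print(first_negative([3, 1, 2]))   # None
print(head, tail)                  # 1 [2, 3]
```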
03 Jan, 2023 10 commits
- [Dy2St]Fix param and out grad names in dy2st for high order grad (#49461) · f484a61e
  Committed by WangZhen on Jan 03, 2023
```
* Fix param and out grad names in dy2st for high order grad
```
  f484a61e
- Replace matmul with matmul_v2 during oneDNN fuse passes (#49108) · 2c444dfa
  Committed by Sławomir Siwek on Jan 03, 2023
```
* replace matmul with matmul_v2 in fuse passes

* Remove fusion logic from matmul

* removing fusion methods

* add proper name

* adjust namespaces
```
  2c444dfa
- move fc api to paddle2.0 (#49379) · 958b9f07
  Committed by Charles-hit on Jan 03, 2023
```
* move fc from fluid to paddle2.0

* fix unit test

* fix some examples

* fix some examples
```
  958b9f07
- Move out sequential and replace save_dygraph and load_dygraph (#48709) · 7ff66973
  Committed by GGBond8488 on Jan 03, 2023
```
* remove fluid.save_dygraph and fluid.load_dygraph; use paddle.save and paddle.load instead

* move Sequential to paddle.nn

* modify convert_call_func.py Sequential reference

* remove related unittests

* remove fluid.dynamic.Sequential

* test remove convert_call_func

* fix conflicts

* fix typo

* fix unittests

* fix sample_code

* fix unittest

* fix __init__
```
  7ff66973
- [Paddle Inference] Implement conv2d_fusion NHWC format using cutlass (#47989) · c123dd1e
  Committed by zhoutianzi666 on Jan 03, 2023
```
* Implement conv2d_fusion NHWC format using CUTLASS
* Add unit testing for CUTLASS Conv in inference
* Add experimental API for CUTLASS.
```
  c123dd1e
- [OpAttr]Fix Ignore AttriteTensor in IndicateDataType bug in grad_op (#49472) · 5ac96468
  Committed by Aurelius84 on Jan 03, 2023
```
* [OpAttr]Fix Ignore AttriteTensor in IndicateDataType bug in grad_op

* add GetExpectedKernelType
```
  5ac96468
- [Zero-Dim] reshape/reshape_/reverse 0D support (#49357) · 347d2123
  Committed by zhaoyingli on Jan 03, 2023
```
* [Zero-Dim] reshape/reshape_/reverse 0D support

* rm comment

* change paddle.to_tensor to paddle.full

* fix docs

* update paddle.full
```
  347d2123
- [FluidAPI]remove clip api (#48946) · fe0dc40d
  Committed by 骑马小猫 on Jan 03, 2023
  
  fe0dc40d
- Add not_equal trt converter (#49393) · 822ea0f9
  Committed by Sanbu on Jan 03, 2023
  
  822ea0f9
- [Auto Parallel] Add All Relu Flops (#48083) · c5137b22
  Committed by Jianghai on Jan 03, 2023
```
* relu flops all

* add annotations and tests

* revision for codestyle
```
  c5137b22
02 Jan, 2023 1 commit
- Scale Matmul Fuse pass rewritten (#49105) · 18c0a002
  Committed by Hulek on Jan 02, 2023
  
  18c0a002
31 Dec, 2022 1 commit
- support flip 0D (#49460) · cb22a5c7
  Committed by caozhou on Dec 31, 2022
  
  cb22a5c7
30 Dec, 2022 9 commits

[ bugfix ] fix bugs in Indexable and support LayerDict (#49409) · 291cf821
Committed by xiongkun on Dec 30, 2022
```
* bugfix: fix bugs in Indexable and support LayerDict

* fix bugs.
```
291cf821

check weight shape of conv1d_transpose (#49417) · 5c4adfae
Committed by wangxinxin08 on Dec 30, 2022
```
* check weight shape of conv1d_transpose

* add unittest case
```
5c4adfae

[Custom device] Add custom_cpu testcase of custom_relu (#49300) · 69c7edcf

Committed by HongyuJia on Dec 30, 2022

* add custom_cpu testcase

* update test_custom_device_setup

* update path to custom_runtime

* fix cmd wait

* test Linux only

* setup once

* integrate to one run_cmd

* add pip install

* change timeout

* add debug string

* add debug string

* add debug string

* use os.system and change module name

* add runtime

* add more debug message

* continue debug

* timestamp

* fix testcase import bug

* remove error message

* set TIMEOUT property

69c7edcf

unit test of reduce with zero dim (#49436) · b2f41825
Committed by Roc on Dec 30, 2022

b2f41825

[Custom Extension] Polish xpu testcase (#49158) · 9f5afa62

Committed by HongyuJia on Dec 30, 2022

* clean custom_xpu testcase test_static_pe

* use assert_allclose to solve precision error

* adjust precision

* flatten tensor

* fix flatten

9f5afa62

[clean fluid api] Move fluid/contrib/slim and remove fluid api. (#48717) · 72973d5a
Committed by zhouzj on Dec 30, 2022

72973d5a

Unify the English translations of static graph mode and dynamic graph mode in the docs (#49170) · a186e60d

Committed by Sanbu on Dec 30, 2022

* 1219

* temporarily change the num_diff_files limit, test=document_fix

* Revert "temporarily change the num_diff_files limit, test=document_fix"

This reverts commit 8e70f00ef468d2dad0e38b3da06295ed62990d20.

* for codestyle

* remove duplicate license

* `static mode` -> `static graph mode`

* Update hybrid_parallel_inference.py

* Update layer_function_generator.py

* Update manipulation.py

* reset
Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
Co-authored-by: SigureMo <sigure.qaq@gmail.com>

a186e60d

Fix default GetExpectedKernelType for ops supported tensor attrs (#49414) · 8a859554
Committed by WangZhen on Dec 30, 2022
```
* Fix default GetExpectedKernelType for ops supported tensor attrs
```
8a859554

Yj/rm legacy part 0 (#49424) · 3ffcd693
Committed by 姜永久 on Dec 30, 2022
```
* rm legacy

* clear in_legacy

* fix tracer
```
3ffcd693

29 Dec, 2022 4 commits
- Add scale and floor_divide ut cases (#49418) · a30e3602
  Committed by Lin Manhui on Dec 29, 2022
  
  a30e3602
- auto parallel bf16 (#49079) · 418edae5
  Committed by xu98bin on Dec 29, 2022
```
* auto parallel bf16
```
  418edae5
- rm legacy dygraph part7 (#49285) · df3f74df
  Committed by 姜永久 on Dec 29, 2022
```
* rm legacy dygraph part7

* rm non_static_mode

* modify

* modify

* add static test

* set static for lstm_cudnn test

* reset tracer

* reset varbase

* fix
```
  df3f74df
- fused_attention_op parameters stop grad support (#49351) · 0bb999b6
  Committed by Wang Bojun on Dec 29, 2022
```
* fusedAttenGrad_noGrad

* code style fix

* add ut

* remove unnecessary log
```
  0bb999b6

BaiXuePrincess / Paddle is in sync with its fork source project
