Commits · 39b5960357c5eb9569460f383a995b0e52830981 · PaddlePaddle / Paddle

04 Aug, 2023, 8 commits

[CINN] Dump more compilation result and optimize parallel compiler flags (#55935) · 39b59603

Committed by Fisher on Aug 04, 2023

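1. `Parallel Compiler`:
    - Merged `FLAGS_cinn_parallel_compile_size` and `FLAGS_cinn_parallel_compile_thread`. `FLAGS_cinn_parallel_compile_thread` alone now sets the number of threads used for compilation, and all `fusion_groups` are distributed evenly across the available threads.
    - Enriched the information returned after compilation: in addition to `instruction`, the `lowered_function`, `source_code`, and `source_ptx` are now returned for further use by upper layers.
2. Debug information:
    - Added `FLAGS_cinn_dump_group_lowered_func`, `FLAGS_cinn_dump_group_source_code`, `FLAGS_cinn_dump_group_ptx`, and `FLAGS_cinn_dump_group_instruction`, which save the intermediate code of each compilation stage, grouped by `fusion_groups`.
    - Reorganized `graph_visualization` so that all visualization graphs and unit-test code are saved in the correct groups.
3. Bug fixes:
    - Fixed `MakeDirectory` failing to create directories correctly.
4. Miscellaneous:
    - Removed some unused code.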
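
The even distribution of `fusion_groups` across compile threads described in item 1 could look roughly like the sketch below. It is an illustration in Python only, not CINN's actual C++ implementation: `split_evenly`, `compile_group`, and `parallel_compile` are hypothetical names, the contiguous-chunk strategy is an assumption, and the returned fields are placeholder strings.

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(groups, num_threads):
    """Split `groups` into contiguous chunks whose sizes differ by at most one."""
    # Never use more threads than there are groups, and always at least one.
    num_threads = max(1, min(num_threads, len(groups)))
    base, extra = divmod(len(groups), num_threads)
    chunks, start = [], 0
    for i in range(num_threads):
        size = base + (1 if i < extra else 0)  # the first `extra` chunks get one more
        chunks.append(groups[start:start + size])
        start += size
    return chunks


def compile_group(group):
    """Hypothetical stand-in for compiling one fusion group (placeholder values)."""
    return {
        "group": group,
        "lowered_function": f"lowered<{group}>",
        "source_code": f"code<{group}>",
        "source_ptx": f"ptx<{group}>",
        "instruction": f"instr<{group}>",
    }


def parallel_compile(groups, num_threads):
    """Compile every group, one chunk per worker thread, preserving input order."""
    chunks = split_evenly(groups, num_threads)
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        per_chunk = pool.map(lambda chunk: [compile_group(g) for g in chunk], chunks)
        return [result for chunk_results in per_chunk for result in chunk_results]
```

For example, `parallel_compile([f"group_{i}" for i in range(7)], num_threads=3)` would hand chunks of 3, 2, and 2 groups to three workers and return one result record per group in the original order.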

[clang-tidy] enable modernize-use-emplace (#55799) · 469a0392

Committed by Ruibin Cheung on Aug 04, 2023

* [clang-tidy] enable modernize-use-emplace

* Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into modernize_use_emplace


[clang-tidy] NO.12 enable modernize-use-nullptr check (#55800) · 1e4f627d

Committed by Zhenghai Zhang on Aug 04, 2023

[NewIR] add decorator for dy2st test with new ir (#55840) · b67715a4

Committed by kangguangli on Aug 04, 2023

* add decorator for new_ir_test

* fix bug and only test in ci-coverage

* fix bug and only test in ci-coverage

* fix

* fix bugs

* fix

* fix

Support Combined indexing for __getitem__ and __setitem__ (#55211) · 697c712f

Committed by JYChen on Aug 04, 2023

* WIP: start writing combined indexing get

* list/tuple/Variable

* getitem 80%

* add setitem

* add some unittest for setitem

* lazy import

* fix some setitem error

* fix advance indexing with decreasing axes; fix strided_slice input name

* combine int-tensor getitem is ok (without boolean support & broadcast); add getitem unittest for static

* add broadcast & parse bool tensor for __getitem__

* [change getitem] _getitem_impl_ to _getitem_static, not deleting the former one

* refine new getitem; fix ut in variable/var_base

* add __getitem__ ut in dygraph

* re-dispatch getitem for Py/CPP; fix strided_slice decrease axes error in dygraph

* fix ut; support tensor in slice

* [change setitem] _setitem_impl_ to _setitem_static, not deleting the former one

* remove some UT (for some, temporarily)

* add IndexError to solve timeout problem in static-mode

* 1. temporarily forbid all-False bool-indexput; 2. setitem_static will return new variable

* xpu uses old strategy

* rename dy2st setitem ut to avoid same-name problem

* dy2st for new combined index

* ut case for combine-index with dy2st

* open ut with all-false-bool setitem

* remove useless doc and _getitem_impl_

* change static res

* fix static xpu
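
As we read the entries above, combined indexing means mixing a basic index (an integer or slice) with an advanced one (an index list, integer tensor, or boolean tensor) in a single subscript, such as `x[[0, 2], 1:3]`. The sketch below emulates those semantics on plain nested lists. It is not Paddle's implementation: `combined_getitem` and `combined_setitem` are hypothetical helpers, and only the 2-D "row list plus column slice" case with a scalar value is covered.

```python
def combined_getitem(matrix, row_indices, col_slice):
    """Emulate `matrix[row_indices, col_slice]` on a 2-D nested list."""
    # Advanced index picks the rows; basic slice is applied within each picked row.
    return [list(matrix[r][col_slice]) for r in row_indices]


def combined_setitem(matrix, row_indices, col_slice, value):
    """Emulate `matrix[row_indices, col_slice] = value` with a scalar `value`.

    Returns a new matrix instead of mutating the input, loosely mirroring the
    note above that setitem_static returns a new variable.
    """
    result = [list(row) for row in matrix]
    for r in row_indices:
        width = len(result[r][col_slice])
        result[r][col_slice] = [value] * width  # broadcast the scalar over the slice
    return result


x = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
picked = combined_getitem(x, [0, 2], slice(1, 3))       # rows 0 and 2, columns 1..2
updated = combined_setitem(x, [0, 2], slice(1, 3), -1)  # x itself is left untouched
```

Here `picked` holds the two selected rows restricted to columns 1 and 2, and `updated` is a copy of `x` with those same positions overwritten.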


Fix a bug in VecAutomaticAddPerBlock (#55929) · 81511469
Committed by niuliling123 on Aug 04, 2023

[IR] Reshape2 and Flatten_contiguous_range Support Inplace (#55809) · dd0681e3
Committed by chen on Aug 04, 2023
```
* inplace pass support reshape2 and flatten_contiguous_range

* recover the modification to inplace_op_var_pass.cc
```

[XPU] Add int support for elementwise_sub/elementwise_div (#55920) · 97ab6aa6
Committed by jiangfan06 on Aug 04, 2023

03 Aug, 2023, 14 commits
- Optim fused linear grad add (#55927) · 91873469
  Committed by Yuang Liu on Aug 03, 2023
- FLUID: move limit_by_capacity to PHI (#55948) · 230c6ce1
  Committed by yangguohao on Aug 03, 2023
- [clang-tidy] [No.4] enable `modernize-loop-convert` (#55704) · 81ccd99e
  Committed by Wang Xin on Aug 03, 2023
- [clang-tidy][task 46] enable `modernize-avoid-bind` (#55895) · a172e6cc
  Committed by gouzil on Aug 03, 2023
```
* [clang-tidy] modernize-avoid-bind

* rollback
```
- eliminate small pattern (#55843) · dc4b48f6
  Committed by wz1qqx on Aug 03, 2023
- [NewIR] fix bug: program translator not set value index correctly (#55789) · c4694c15
  Committed by kangguangli on Aug 03, 2023
```
* fix bug: program translator not set value index correctly

* fix slice for setparameter
```
- add eq and hash (#55909) · 8ddf51ff
  Committed by xiaoguoguo626807 on Aug 03, 2023
- Fix run program grad node mem (#55869) · 275a8102
  Committed by WangZhen on Aug 03, 2023
- [XPU] Fix compilation errors of XPU plugin on multiple versions of GCC (#55924) · 613beeb6
  Committed by hong19860320 on Aug 03, 2023
- fix new ir optimizer bug (#55910) · 445d7337
  Committed by hong on Aug 03, 2023
- fix new ir stream analysis bug (#55915) · 37c487e0
  Committed by hong on Aug 03, 2023
- fix security bug (#55870) · 08f28b40
  Committed by wanghuancoder on Aug 03, 2023
```
* fix security bug
```
- fix security bug (#55865) · dcf30692
  Committed by wanghuancoder on Aug 03, 2023
```
* fix security bug
```
- Eager tensor doc2 (#55886) · 9db219d1
  Committed by wanghuancoder on Aug 03, 2023
```
* add docstring of three eager method

* test=docs_preview

* update element size bind

* update docs of numpy, clone, clear_gradient, element_size; test=docs_preview

* refine clear_gradient docs; test=docs_preview

* refine element_size docs; test=docs_preview

* add detach doc; test=docs_preview

* empty commit; test=docs_preview

* update signature; test=docs_preview

* refactor; test=docs_preview

* empty commit; test=docs_preview

* add docstring of Tensor

* empty commit; test=docs_preview

* refine TensorDoc; test=docs_preview

* refine TensorDoc; test=docs_preview

* remove extra indent in TensorDoc; test=docs_preview

* remove a space; test=docs_preview

* move docs ahead of implementation; test=docs_preview

* add doc

* refine

* refine

* refine

---------
Co-authored-by: wj-Mcat <1435130236@qq.com>
Co-authored-by: SigureMo <sigure.qaq@gmail.com>
```
02 Aug, 2023, 11 commits

[EvalFrame] support python3.11 in eval frame. (#55887) · f45dd5ee
Committed by xiongkun on Aug 02, 2023

Eager tensor doc (#55879) · 880e94fc

Committed by wanghuancoder on Aug 02, 2023

* add docstring of three eager method

* test=docs_preview

* update element size bind

* update docs of numpy, clone, clear_gradient, element_size; test=docs_preview

* refine clear_gradient docs; test=docs_preview

* refine element_size docs; test=docs_preview

* add detach doc; test=docs_preview

* empty commit; test=docs_preview

* update signature; test=docs_preview

* refactor; test=docs_preview

* empty commit; test=docs_preview

* add docstring of Tensor

* empty commit; test=docs_preview

* refine TensorDoc; test=docs_preview

* refine TensorDoc; test=docs_preview

* remove extra indent in TensorDoc; test=docs_preview

* remove a space; test=docs_preview

* move docs ahead of implementation; test=docs_preview

* refine

---------
Co-authored-by: wj-Mcat <1435130236@qq.com>
Co-authored-by: SigureMo <sigure.qaq@gmail.com>

[clang-tidy] NO.6 enable `modernize-avoid-c-arrays` check (#55774) · c000091e

Committed by gouzil on Aug 02, 2023

* [clang-tidy] modernize-avoid-c-arrays

* rollback

* [clang-tidy] fix

* close modernize-avoid-c-arrays

* fix PHI_DEFINE_string; add PHI_DEFINE_bool NOLINT

* fix PHI_DEFINE_string

* fix next_h_state and parity err

* fix win32

* fix cuda_graph

* fix accuracy_kernel

* fix math_function

* fix fused_softmax_mask_kernel.cu load_data and warp_reduce; rollback concat_and_split_functor ins_addr

* fix fused_dropout_add_grad_kernel

* fix

* rollback cu

* rollback concat_and_split_functor.cu

* rollback


[XPU]Add conv1d fuse pass (#55719) · 22c7a6eb
Committed by wz1qqx on Aug 02, 2023

[IR] NewIr Interpreter Beta run regular (#55828) · 63b7fc80

Committed by zhangbo9674 on Aug 02, 2023

* add interface

* add code

* add code

* add code

* add code

* fix bug

* fix bug

* add var prefix

* add code

* add code

* add code

* fix compile bug

* fix bug

* refine code

* refine code

* refine code

* refine code

* fix bug

* add code

* add code

* fix bug

* add code

* add code

* refine code

* refine code

* fix bug

* add code

* fix bug in phi__kernel_utils

* refine code

* fix bug

* open flag

* refine code

* fix bug

* fix bug

* refine code

* fix bug

[Inference] Replace groupNorm when data types are bf16 and fp16, and data format is NHWC implementation. (#55399) · e61d892a

Committed by yangjianfengo1 on Aug 02, 2023

* finish

* cpergroup odd

* fix bf16

* single channel

* code style

* precision alignment

* add head_file

* add bf16 head file

* bf16 2

* bf16

* bf16 head

* bf16 compile

* py test

* bf16 compile

* bf16 compile

* unset py test

* nhwc

* test

* mean var

* bf16 success

* su

* ctest success

* use is_same_as

* is_same

* use is_same

* rtol

* gpu_stream

* del sigmod

* fix bfloat16 type

* use cuda_bf16_hpp

* use_cuda_arch

* bfloat162float2

* del inplace_tol

* del max_releative_tol

* temp store

* precision alignment

* temp store

* plugin

* precision alignment

* alignment

* include cuda.h

* del half

* half single

* ci

* add const

* ci

* cudamemset

* del printf

* fp16 test

* add half compute

* del br16 ci

* del ci

* ci approve

* del fluid include

fix security bug (#55866) · 92aa92fa
Committed by wanghuancoder on Aug 02, 2023
```
* fix security bug
```

Add FP16 & BF16 for erfinv (#55287) · 6d7efd09
Committed by cyberslack_lee on Aug 02, 2023

fix security bug (#55782) · 19da5c0c
Committed by wanghuancoder on Aug 02, 2023
```
* fix security bug
```

[XPU] Add gather_squeeze_pass (#55605) · d13a49d6
Committed by jiangfan06 on Aug 02, 2023

[new ir] add ir pybind api (#55745) · ef29468e

Committed by xiaoguoguo626807 on Aug 02, 2023

* add ir core

* add test

* modify name

* merge

* add test for __eq__

* shield  test for __eq__

* --amend

* Update new_ir_compiler.cc

01 Aug, 2023, 7 commits
- [CustomDevice] release gil in predictor.run (#55783) · 683287ba
  Committed by ronnywang on Aug 01, 2023
- move clear (#55844) · 2b258c58
  Committed by YuanRisheng on Aug 01, 2023
- enable bugprone-unused-raii check (#55815) · 4aad9c69
  Committed by Zhenghai Zhang on Aug 01, 2023
- move prune_gate_by_capacity to phi (#55780) · 6b93ba0a
  Committed by Sonder on Aug 01, 2023
```
* move prune_gate_by_capacity to phi

* fix

* fix registe info

* remove useless codes
```
- [phi] move nop to phi (#55816) · 719b1ed3
  Committed by gouzil on Aug 01, 2023
- [NewIR] New ir support print op (#55648) · 75c29ac1
  Committed by hong on Aug 01, 2023
```
* new ir support print op

* fix gpu bug

* fix bug

* update

* remove layout to string

* remove useless header

* polish code

* fix bug

* polish code
```
- [NewIR] Fix new ir dy2st cache bug (#55703) · 33e50b27
  Committed by hong on Aug 01, 2023
```
* skip inplace check for new ir

* program id combine inner scope pointer
```

PaddlePaddle / Paddle: synced successfully about 1 year ago