Commits · 3d1981ad87d91bb88c46d87e2c4df40812ce291f · PaddlePaddle / Paddle

Nov 25, 2022 · 5 commits
- [PROFILER] add flops for Profiler (#47766) · 3d1981ad
  Committed by Chitsing KUI on Nov 25, 2022
```
* attr ready

* op ip ready

* start dynamic

* end2end ok

* input shape to map, stat by op

* layer wip

* first version ready

* fix proto depds

* fix profiler deps

* fix flops typo, rm tuple shape
```
  3d1981ad
- Refactor stream anayzer (#48158) · 889318d8
  Committed by Ruibiao Chen on Nov 25, 2022
```
* Move stream_anayzer to interpreter

* Refactor StreamAnalyzer

* Refactor RunNextInstructionList

* Remove no_data_transform_index

* Fix typos

* Fix data_transfer OpFuncType error

* Add event for depend_op

* Update transfer OpFuncType for heter place
```
  889318d8
- [Dy2St] clean `FLAGS_jit_engine_type` related comments (#48354) · 66eeb6a6
  Committed by Nyakku Shigure on Nov 25, 2022
  
  66eeb6a6
- fix mac python link (#48317) · 00b3b4bd
  Committed by wanghuancoder on Nov 25, 2022
  
  00b3b4bd
- fix xpu compile on phi::enforce. (#48345) · d90469a4
  Committed by houj04 on Nov 25, 2022
  
  d90469a4
Nov 24, 2022 · 12 commits

Delete fluid_convert_utils fix PR-CI-Build (#48347) · 1b59830b
Committed by tianshuo78520a on Nov 24, 2022

1b59830b
add exp_grad, hard_sigmoid and hard_sigmoid_grad for xpu, test=kunlun (#48307) · d2f87d96
Committed by zhangyikun02 on Nov 24, 2022

d2f87d96
add pad3d and pad3d_grad op for xpu, test=kunlun (#48306) · 22555e96
Committed by zhangyikun02 on Nov 24, 2022

22555e96

[PHI decoupling] simplify "convert_utils.h" in fluid (#48168) · de4310e6

Committed by huangjiyi on Nov 24, 2022

* rm dependence to "convert_utils.h" in some files

* fix bugs

* replace DataType2String with DataTypeToString

* replace framework::DataTypeSize with phi::SizeOf

* mv convert_function from fluid to phi and rm old map

* recommit with pre-commit

* replace ProtoVarType with ProtoDataType and update comment.

* fix error about include "dnnl.hpp"

* revert add dep mkldnn to convert_utils in phi

* add mkldnn deps in convert_utils.h in phi

* move deps to convert_utils.h in phi

de4310e6

[PHI decoupling] remove "paddle/fluid/platform/enforce.h" in phi (#48049) · df23c7c3
Committed by PuQing on Nov 24, 2022

df23c7c3
[Paddle Inference]optimize token prune for Paddle-TensorRT (#48241) · 29782728
Committed by Wangzheee on Nov 24, 2022
```
* optimize token prune
```
29782728
[Dy2St] remove deprecated JIT engines (#48298) · 5664306b
Committed by Nyakku Shigure on Nov 24, 2022

5664306b

[Phi Support CuDNN] Support ALL CuDNN (#47865) · 1623f1b4

Committed by HongyuJia on Nov 24, 2022

* support default use_gpudnn=True

* fully support cudnn in phi

* add header file

* add white_list, verify accuracy

* phi support all cudnn

* opt affine_grad

* try different arches of pretrained_model

* try different arches of pretrained_model

* add debug string

* debug eager_method

* add debug string, pass all local ctest

* polish all debug code

* delete use_cudnn relevant code autogen

* fix depthwise_conv2d

* Share all other members of Tensor except use_cudnn

* polish codes according to review opinion

* polish codes according to review opinion, fix bug

* polish codes according to review opinion, opt performance

* polish codes according to review opinion, fix pooling.py

1623f1b4

[PHI] Migrate batch_norm_grad kernel (#48288) · 561b7278
Committed by Sławomir Siwek on Nov 24, 2022

561b7278

processgroup bkcl support reduce (#48232) · 5f995d3f

Committed by james on Nov 24, 2022

Note: this is a temporary solution, should be replaced once reduce kernel
is natively supported on KL2

5f995d3f

do not calc reduce_all in eager mode (#48199) · bcf75132

Committed by wanghuancoder on Nov 24, 2022

* do not calc reduce_all in eager mode

* refine python c cast list

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

bcf75132

dense tensor in eager mode support data_ptr (#48235) · 3f265815
Committed by wanghuancoder on Nov 24, 2022
```
* dense tensor in eager mode support data_ptr
```
3f265815

Nov 23, 2022 · 10 commits
- Add static checks for collective communication on NCCL (#48256) · d828ca46
  Committed by Wen Sun on Nov 23, 2022
```
* feat: static check
```
  d828ca46
- [PHI decoupling] move im2col from fluid to phi (#48174) · 88cac16b
  Committed by huangjiyi on Nov 23, 2022
```
* decouple im2col from fluid

* move im2col to phi

* fix build error

* delete redundant comment
```
  88cac16b
- Add nparray case for basic operator (#48229) · b7d3143f
  Committed by Charles-hit on Nov 23, 2022
```
* add nparray case for basic operator

* fix unit test

* fix unit test

* add unit test

* fix unit test
```
  b7d3143f
- add masked_select_grad kernel (#48137) · db0ea0ce
  Committed by ykkk2333 on Nov 23, 2022
```
* add stat tool

* add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun

* add masked_selected_grad kernel,test=kunlun
```
  db0ea0ce
- add map_depthwise_conv_to_conv pass (#47955) · 3daf5185
  Committed by Wilber on Nov 23, 2022
  
  3daf5185
- [Paddle Inference] add Conv2d fusion layout transfer pass (#48128) · 67204c18
  Committed by Yuanle Liu on Nov 23, 2022
  
  67204c18
- Make bfloat16 implicitly convert to float/double (#48238) · 1066094a
  Committed by sneaxiy on Nov 23, 2022
```
* make bfloat16 implicit convert to float/double

* fix bfloat16_test ut compile
```
  1066094a
- add support of controlflow op for custom device (#48259) · edf46919
  Committed by duanyanhui on Nov 23, 2022
  
  edf46919
- add warpctc kernel and change cast_v2 to cast for xpu, test=kunlun (#48134) · 25ffe9c2
  Committed by zhangyikun02 on Nov 23, 2022
  
  25ffe9c2
- Use cublaslt in multi transformer FFN (#48052) · b07e6b45
  Committed by MarDino on Nov 23, 2022
```
* use fused mlp in multi transformer
* Restruct code
* use cublaslt to fuse ffn
* fix conflict
```
  b07e6b45
Nov 22, 2022 · 8 commits

[PHI] Migrate elementwise_div + all elementwise grad kernels (#48210) · 78b30e97
Committed by Piotr Paturej on Nov 22, 2022
```
* Migrate elementwise_div

* Migrate elementwise grad kernels
```
78b30e97
fix:fix the bug of TRT_8.0.3.4 (#48135) · 1022b777
Committed by feng_shuai on Nov 22, 2022
```
* fix:fix the bug of trt_8.0.3.4

* fix: fix the bug of trt_8.0

* fix: notes
```
1022b777
fix typo error (#48156) · 0cdca676
Committed by HongyuJia on Nov 22, 2022

0cdca676
[PHI decoupling] move vol2col from fluid to phi (#48175) · aa36c6aa
Committed by huangjiyi on Nov 22, 2022
```
* move vol2col from fluid to phi

* update copyright year
```
aa36c6aa
CudnnNormConvolution is no longer supported on NVIDIA Hopper GPUs (#48203) · df4dfda0
Committed by Tian Zheng on Nov 22, 2022
```
* Skip tests that use fused_ops on H100

* Add error message to FusedOps on H100
```
df4dfda0

Some residualdata fixes (#48118) · 7bbdbe5b

Committed by Sylwester Fraczek on Nov 22, 2022

Removed ResidualData and Bias from ExtraAttrProperties because it's not an attribute.
Removed bug with checking for ResidualData attribute in matmul_elementwise_add_fuse_pass
Removed residualData from list of matmul outputs in cpu_bfloat16_pass.cc because it's input
Co-authored-by: NTomasz Socha <tomasz.socha@intel.com>

7bbdbe5b

Delete caching from requantize_mkldnn_op and changed to Acquire API (#48113) · 7d6a4a54
Committed by Hulek on Nov 22, 2022
```
* Delete caching from requantize_mkldnn_op and changed to Acquire API
* Fixed codestyle and implementation
```
7d6a4a54

[PHI decoupling] remove "gpu_device_function.h" in fluid. (#48117) · 4da1a0fe

Committed by huangjiyi on Nov 22, 2022

* move "paddle/phi/backends/gpu/gpu_device_function.h" to phi

* update copyright years

* rm "fluid/platform/device/gpu/gpu_device_function.h" in phi

* rm dependence to "gpu_device_function.h" in fluid

* rm gpu_device_function.h etc in fluid

* fix rocm-compile bugs

* fix cuda_helper_test.cu bugs

4da1a0fe

Nov 21, 2022 · 5 commits

fix doc of NPUPlace (#48148) · 809516f6
Committed by Leo Chen on Nov 21, 2022
```
* fix doc of NPUPlace

* fix doc of NPUPlace, test=document_fix
```
809516f6
Fix Ctx Dev pointer for KUNLUN (#48184) · 2d0fb059
Committed by Roc on Nov 21, 2022

2d0fb059

add fc-residual quantization (#46917) · fed0ed34

Committed by Sylwester Fraczek on Nov 21, 2022

* add fc-residual quantization

* revert removal of check for use_mkldnn

* fix bug

* add disable_logs

* review fix

call twice AreScalesPresntForNodes instead of if-else

* rewrite residual input to output

* revert fc mkldnn taking residual data

* format fix

* fix LoDTensor->DenseTensor

* LoDTensor->DenseTensor

* output->input

* revert changes to unsupported script

revert changes to unsupported script

* remove fc residualdata from output blocklist in cpu_bfloat16_pass.cc

fed0ed34

delete unnecessary shape and slice op (#48112) · 41483383
Committed by RichardWooSJTU on Nov 21, 2022

41483383

[PHI] Migrate mul_grad kernel (#48061) · 55f6fb3d

Committed by Sławomir Siwek on Nov 21, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

* mul_grad

55f6fb3d

PaddlePaddle / Paddle synced successfully about 1 year ago