Commits · cc52501e471e6dcd09c5f045b3460da37ebac3d2 · 吃玉米的猫 / Paddle

29 Mar, 2022 · 1 commit
- [Infrt] delete custom_pdop.td and move op to infrt dialect. (#41021) · 9bb3744f
  Committed by 王明冬 on Mar 29, 2022
28 Mar, 2022 · 2 commits
- [infrt] move graph op from pd dialect to infrt dialect. (#41003) · bf93050c
  Committed by 王明冬 on Mar 28, 2022
- [Phi] Fix assign kernel bug (#40927) · 822a2d1f
  Committed by Chen Weihang on Mar 28, 2022
```
* fix assign kernel bug

* fix xpu kernel select error

* add cuda pinned place

* fix copy error

* fix infrt error
```
27 Mar, 2022 · 1 commit

Committed by Jack Zhou on Mar 27, 2022

* add string tensor and case convert kernels

* Add strings empty kernel; Reorganize the structure of case convert kernel

* Add string infermeta

* Update mutable_data of string tensor

* rename kernel name

* add string copy tmp

* Fix strings copy device bug

* add utf8 gpu converter

* add string tensor c++ api

* Remove mutable_data of string tensor

* update string tensor interface

* remove charcases_flag.h

* remove some fluid headers

* Add make_ddim

* __HIPCC__ -> PADDLE_WITH_HIP

* remove fluid headers

* fix cpu compile

* remove std::hash

* Fix cudaMalloc

* Remove strings/impl directory

* Fix infrt/get_phi_kernel_info.py;Add custom_kernels deps

* Add empty kernel test

* Remove some comments

* Modify lower/upper api encoding type: string->bool

* STRING->PSTRING; Add CreateInferLikeMeta

* Add code gen for C++ String API

* remove strings_api_utils.h

* Add ignore file (strings_api.h, strings_api.cc)

* update strings gen script

* change args order of case convert kernels

* Add comments for pstring, StringTensor

* cpstring_internal.h -> cpstring_impl.h

* Update according to comments:

1. Remove fluid headers
2. paddle::platform::errors -> phi::errors
3. Use 'place.GetType() == phi::AllocationType::GPU' instead of 'paddle::platform::is_cpu_space()'
4. Use camel code style

* Remove all singletons in strings kernels

* fix rocm compile

* Fix py3 compile

* Fix c++ coverage

* 1. Add pstring proto type
2. Add StringTensor debug info
3. Rename case_convert_kernel to strings_lower_upper
4. Remove serialize and deserialize strings kernels

* DataLayout::PSTRING -> DataLayout::PSTRING_UNION

* Register pstring data type

* Fix strings api gen

* Fix dense tensor register pstring dtype

* Fix error messages

* remove line

* add pstring unittest

* remove test string api unittest

* remove empty line

* Remove some headers to decrease the size of executable file

0695e1ac

25 Mar, 2022 · 3 commits

infrt update phi gpu register. (#40866) · 5f6038ff
Committed by Wilber on Mar 25, 2022
```
* update register every make.

* fix

* update
```

[Phi] Migrate Adam and AdamW into Phi (#40351) · 56cd3407

Committed by Aurelius84 on Mar 25, 2022

* [Phi] Migrate Adam and Adamw into Phi

* fix compile error and unittest ok

* fix compile error and unittest ok

* fix undefined reference to fLI::FLAGS

* test depend on operator

* fix cmake

* fix xpu compile

* fix infrt

* fix amp_type_traits

* fix amp_type_traits

* modify according to reviewer

* modify according to reviewer

* fix dtype float16

* fix typo

* fix Cmake

* fix code style

[infrt] add phi_dt.create_inited_dense_tensor.cpu.f32 kernel. (#40902) · 65478332
Committed by 王明冬 on Mar 25, 2022

24 Mar, 2022 · 4 commits

[AMP] Support amp for Intermediate_dygraph (#40623) · c12f7d48

Committed by zhangbo9674 on Mar 24, 2022

* approve amp for intermediate_dygraph

* add amp_utils for intermediate_dygraph

* add amp needcast check for mlu & npu

* test unittest

* add SetGradNode for set_stop_gradient && add checktensor for GradientHooks

* refine code

* refine unittest of imperative_amp for new dygraph

* inplace api skip amp

* add test_imperative_qat_amp for intermediate amp

* refine code

* refine test_amp ci strategy

* refine unittest code

* refine amp_utils code

* refine amp getpromotetype for some special op

* refine unittest code

the `defaults` in FullArgSpec may be `None` (#40882) · 99541895
Committed by Ren Wei (任卫) on Mar 24, 2022
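
The behaviour named in this commit title is standard Python: `inspect.getfullargspec(func).defaults` is `None`, not an empty tuple, when no positional parameter has a default. The log does not show the actual patch, so the following is only a minimal sketch of the usual guard; `positional_defaults`, `no_defaults` and `with_defaults` are hypothetical names introduced for illustration.

```python
import inspect


def positional_defaults(func):
    """Map each positional parameter that has a default to that default."""
    spec = inspect.getfullargspec(func)
    # `spec.defaults` is None (not an empty tuple) when no positional
    # parameter has a default, so normalise it before calling len().
    defaults = spec.defaults or ()
    if not defaults:
        return {}
    # Defaults align with the last len(defaults) positional parameters.
    return dict(zip(spec.args[-len(defaults):], defaults))


def no_defaults(a, b):  # hypothetical example function
    return a + b


def with_defaults(a, b=2, c=3):  # hypothetical example function
    return a + b + c
```

Calling `len(spec.defaults)` directly would raise `TypeError` for `no_defaults`, which is the failure mode the guard avoids.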

[Infrt] add method for automatically scanning pass and kernel info (#40822) · f51a5791
Committed by huzhiqiang on Mar 24, 2022

[Infrt] upgrade kernel launcher fun generator (#40826) · 7fa3a724
Committed by huzhiqiang on Mar 24, 2022

23 Mar, 2022 · 2 commits
- [infrt] add ir support for phi kernel batch_norm_infer. (#40755) · c751e405
  Committed by 王明冬 on Mar 23, 2022
- Removed redundant use of declarations.h (#40703) · 2a1b4c07
  Committed by Zhanlue Yang on Mar 23, 2022
```
* Removed redundant use of declarations.h

* Fixed minor bug
```
21 Mar, 2022 · 1 commit
- add the map for dense tensor, test=develop (#40665) · b77e20ac
  Committed by 石晓伟 on Mar 21, 2022
18 Mar, 2022 · 4 commits
- update infrt script (#40670) · 50fad3ed
  Committed by huzhiqiang on Mar 18, 2022
- set +x to close showing command, update check_change code with linux (#40456) · 161d27dc
  Committed by Sing_chan on Mar 18, 2022
- support register with attr (#40564) · 755a6c53
  Committed by Wilber on Mar 18, 2022
```
* support register with attr

* add infrt_with_gpu macro
```
- [infrt] rename pd dialect from mlir to infrt. (#40651) · ef4ef154
  Committed by 王明冬 on Mar 18, 2022
```
* [infrt] rename pd dialect from mlir to infrt. test=develop

* [infrt] fix the kernel signature generator bug.
```
17 Mar, 2022 · 2 commits
- [infrt] move pd_ops.td to pd folder. test=develop (#40613) · 4c01763c
  Committed by 王明冬 on Mar 17, 2022
- [infrt] move pd dialect position. test=develop (#40616) · 3a256637
  Committed by 王明冬 on Mar 17, 2022
15 Mar, 2022 · 2 commits

Skip infrt when checking log fatal (#40529) · c9f3ad03

Committed by Chen Weihang on Mar 15, 2022

* skip infrt when checking log fatal, test=document_fix

* remove test=document_fix

* update commit

[Phi]Move Tanh/BRelu/LeakyRelu/ThresholdedRelu Kernels to Phi (#40385) · d7112180

Committed by YuanRisheng on Mar 15, 2022

* move activation op

* adjust code format

* fix compile bugs

* fix ci bugs

* code format adjust

* code format adjust2

* activate ci status

* modify according to comment

* move activation kernel

* revert relu6

* reduce add code

* perfect use_phi_functor

* completing func name

* fix bugs when run ci

* fix bugs when run infrt

* modify infrt get kernel signature

14 Mar, 2022 · 1 commit
- [infrt] add skip list (#40450) · 95a526b2
  Committed by huzhiqiang on Mar 13, 2022
10 Mar, 2022 · 1 commit

Add trt execute (#40224) · e72ef603

Committed by Shang Zhizhou on Mar 10, 2022

* add trt.execute

* merge trt.engine type

* update return op

* update comments

* fix style

* fix style

09 Mar, 2022 · 2 commits

[Infrt]Update kernel dialect (#40141) · 767647ce
Committed by huzhiqiang on Mar 09, 2022

build documents if public apis modified, meanwhile their samplecodes should be tested (#39728) · 041c4bca

Committed by Ren Wei (任卫) on Mar 09, 2022

* run document_preview when samplecodes be tested

* run document_preview when samplecodes be tested

* sphinx-build symbol link; and build-doc default

* FLUIDDOCDIR typo

* download the required configurations and some other scripts

* install required python packages.

* clone specified branch of docs repo, and if failed, clone the default branch

* clean workspace for docs repo

* use the conf.py imported by https://github.com/PaddlePaddle/docs/pull/4222/

* download and install the boscmd

* Optimize the code comments.

* specify the pypi index server

* only do doc-build when running in cpu mode

* pull docs pr

git log

paddle_pr_info

* install jq

* force using sphinx-build under py3.7

* using our new domain name for preview

* install python package error

* don't build doc default

07 Mar, 2022 · 2 commits

[infrt] fold the infrt.cvtTensorOp. test=develop (#40214) · b798fb07
Committed by 王明冬 on Mar 07, 2022

cuBlasLt Epilogue To Fuse Linear + ReLU|GeLU (#39437) · 2a3d9eca

Committed by Ming-Xu Huang on Mar 07, 2022

* Added cuBlasLtHandle_t to device context.

* Added fused_gemm_epilogue op.

1. Added fused_gemm_epilogue op to leverage cuBlasLt Epilogue.
2. Support fusion Act(X*Y + bias), X's dims >= 2 and Y's dims should be 2.
3. Act currently supports only ReLU (GeLU will be added in the future).

* Added UT to fused_gemm_epilogue op.

* Added LinearAct Pattern

1. Added LinearAct into graph_pattern_detector.* to define (2.)'s
pattern.
2. LinearAct is used to detect act(element_add(matmul_v2(x, w), bias)).
3. act currently supports only ReLU (GeLU will be supported in the future).

* Added FuseGemmEpiloguePass

1. Added FuseGemmEpiloguePass to handle nn.Linear + Act{ReLU}
fusion (GeLU will be supported in the future).
2. Only support matmul_v2 from nn.Linear.

* Added pybind to BuildStrategy.fuse_gemm_epilogue_.

* Added UT for fuse_gemm_epilogue_pass.

* GeLU support and EpilogueSingleton

1. Added GeLU support to fused_gemm_epilogue op.
2. Added EpilogueSingleton to cache auxiliary pointer.
3. Added related UTs.

* Rename cublaslt_epilogue_op to gemm_epilogue_op.*.

* Added both train and infer pattern to LinearAct.

1. Added support of fwd graph with grad_ops linking to LinearAct.
2. Added related changes to fuse_gemm_epilogue_pass for above
modification.

* Changed CUDA requirement from 11.4 to 11.6 for fuse_gemm_epilogue_pass.

* Added identity activation support to gemm_epilogue_op.

* Added Linear Fusion (matmul_v2 + ele_add)

1. Added matmul_v2 + ele_add pattern to LinearActPattern.
2. Added matmul_v2 + ele_add support to fuse_gemm_epilogue_pass.

* Rename gemm_epilogue_op.* to fused_gemm_epilogue_op.*

* Add fused_gemm_epilogue_grad op.

1. Added fused_gemm_epilogue_grad to support backward epilogue fusion.

* Add UTs to fused_gemm_epilogue_grad_op.

* Change attribute name in fused_gemm_epilogue_grad_op for clarity.

* Allow DX and DBias be dispensable to fused_gemm_epilogue_grad op.

* Added ElementwiseAdd+Matmul+Act graph pattern detection.

* Fuse backward of Linear( Act(x))

1. Added backward fusion pass to Linear( Act(x)).
2. Added backward fusion pass to Linear(x).

* Added UTs to backward fusion of Linear(Act(x)).

* Complete document of arguments to fused_gemm_epilogue_op.

* Made arguments of some functions pass by reference.

* Modify code with review comments.

1. Made arguments of some function pass by reference.
2. Removed redundant code.
3. Followed Google code style to change code.

* Made 'const' code style be consistent

* Fixed random seed of python UTs.

* Set compile constraints for cuBlasLt

1. Require CUDA 11.6+
2. Remove fuse_gemm_epilogue related tests when CUDA < 11.6.

* Code Review from Paddle

1. Changed argument name is_first_gemm to without_x_gradient for
clarity.
2. Applied PADDLE_THROW in fused_gemm_epilogue_op.

* Remove EpilogueSingleton

1. Applied ReserveSpace to replace Epilogue for passing auxiliary
pointers between FWD and BWD.

* Fix a logical error and enhance UTs.

1. Added act op count checking in UTs.
2. Fix issue to fuse backward of ReLU(Linear(X)).
3. TODO: solve GELU fusion issues.

* Fix Linear and GeLU fusion issues.

1. Modified graph_pattern_detector to fit both linear with gelu and
linear with relu.
2. Modified data range in UTs to allow negative values.

* Removed fused_gemm_epilogue_op.h.

* Rename namespace pten to phi.

* Rename name of arguments in fused_gemm_epilogue_op

1. bias -> Bias.
2. out -> Out.
3. reserve_space -> ReserveSpace.

* Change EpiloguePassActivationCache to a local variable.

1. Removed singleton in EpiloguePassActivationCache.
2. Made EpiloguePassActivationCache an argument to each pass
function.
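
For readers unfamiliar with the pattern this commit fuses, the expression `act(element_add(matmul_v2(x, w), bias))` can be written out as an unfused reference computation. The sketch below is a plain-Python illustration only; it is not Paddle's implementation and does not call cuBlasLt. The names `linear_act`, `relu` and `gelu` are hypothetical, and the erf-based GeLU is an assumption, since the log does not say which GeLU variant the fused op uses.

```python
import math


def relu(v):
    return max(v, 0.0)


def gelu(v):
    # Exact erf-based GeLU; assumed here, the log does not name the variant.
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def linear_act(x, w, bias, act=relu):
    """Unfused reference for act(x @ w + bias).

    x is an [m, k] list of rows, w is a [k, n] list of rows,
    bias is a length-n list.
    """
    out = []
    for x_row in x:
        out_row = []
        for j, b in enumerate(bias):
            # Dot product of one row of x with column j of w.
            acc = sum(x_row[p] * w[p][j] for p in range(len(w)))
            # Bias add and activation are the steps the epilogue folds in.
            out_row.append(act(acc + b))
        out.append(out_row)
    return out
```

Passing `act=lambda v: v` corresponds to the identity-activation case (matmul_v2 + ele_add) that the log lists as a separate fusion.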

04 Mar, 2022 · 1 commit
- [infrt] add ir for convert pd dialect to phi dialect. test=develop (#40104) · 3ac9bc95
  Committed by 王明冬 on Mar 04, 2022
03 Mar, 2022 · 1 commit
- mlir attr types for infrt place, test=develop (#40087) · b1d38dea
  Committed by 石晓伟 on Mar 03, 2022
```
* mlir attr types for infrt place, test=develop

* fix a bug, test=develop
```
02 Mar, 2022 · 3 commits
- [IPU] update dockerfile (#40061) · 7ef61789
  Committed by Allen Guo on Mar 02, 2022
```
* update dockerfile for ipu

* update comments, test=document_fix
```
- support checking `phi` directory in CI op benchmark (#40026) · f30b3f81
  Committed by pangyoki on Mar 02, 2022
```
* support phi checking in CI op benchmark

* add sparse/gpu

* remove h file in cpu directory
```
- [Infrt]add phi kernel dialect (#39726) · 07dad6d6
  Committed by huzhiqiang on Mar 02, 2022
01 Mar, 2022 · 2 commits
- remove conv_affine_channel_fuse_pass (#39817) · fc06be9d
  Committed by wenbin on Mar 01, 2022
```
* remove

* pass

* more pass
```
- change tests_v2 to dynamic_tests_v2 in CI op benchmark (#39995) · 4204b97a
  Committed by pangyoki on Mar 01, 2022
28 Feb, 2022 · 2 commits
- Change CI-Build build develop (#39863) · 61443a0e
  Committed by tianshuo78520a on Feb 28, 2022
- infrt add trt engine (#39885) · 27536a32
  Committed by Wilber on Feb 28, 2022
22 Feb, 2022 · 3 commits
- add pten convert pass.test=develop (#39664) · a6abb6e7
  Committed by 王明冬 on Feb 22, 2022
- [pten]add check for using HostAlloc (#39771) · 12c6d06a
  Committed by chentianyu03 on Feb 22, 2022
```
* add check for using HostAlloc

* add check for using HostAlloc
```
- update precision catalog (#39717) · df1dbff1
  Committed by zhangchunle on Feb 22, 2022

吃玉米的猫 / Paddle is in sync with the fork's source project
