Commits · 3ce879dba700ef20415e95722de1c5845deab403 · Crayon鑫 / Paddle

14 Apr, 2022 · 1 commit
- Optimize the finding of max workspace size. (#41741) · 3ce879db
  Committed by Yiqun Liu on Apr 14, 2022
  
  3ce879db
09 Apr, 2022 · 1 commit

Autotune the workspace_size_limit in conv. (#40338) · b937cdc5

Committed by limingshu on Apr 09, 2022

* Use the maximum workspace_size of all algorithms to limit the workspace size in exhaustive search mode.

* Use the system cudaMalloc and cudaFree to allocate workspace during searching.

* Enable switching between the two kinds of workspace-setting methods.

Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>

b937cdc5
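The first bullet of #40338 describes limiting the workspace by the largest workspace any candidate algorithm asks for. A minimal Python sketch of that idea follows. The function name, the algorithm names, the byte sizes, and the clamp to available memory are all assumptions made for illustration; this is not Paddle's implementation.

```python
from typing import Dict


def workspace_limit(requested: Dict[str, int], available_bytes: int) -> int:
    """Pick a workspace budget for exhaustive search (illustrative only).

    requested: hypothetical map of algorithm name -> workspace bytes it asks for.
    available_bytes: assumed upper bound, e.g. free device memory.
    """
    if not requested:
        return 0
    # The largest request is enough for every candidate; never exceed what is free.
    return min(max(requested.values()), available_bytes)


# Made-up candidates: sizes are in bytes (64 MiB, 24 MiB, and no workspace).
candidates = {"implicit_gemm": 0, "fft": 64 << 20, "winograd": 24 << 20}
limit = workspace_limit(candidates, available_bytes=48 << 20)
```

With these made-up numbers the largest request (64 MiB) exceeds the assumed 48 MiB of free memory, so the budget is clamped to 48 MiB.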

04 Mar, 2022 · 1 commit

Move conv to pten (#39354) · d50fb43e

Committed by hong on Mar 04, 2022

* move conv to pten

* move conv to pten; test=develop

* fix bug;

* add conv cudnn impl; test=develop

* update

* update operator; test=develop

* fix bug; test=develop

* move operator and prepared_operator to develop; test=develop

* resolve conflict; test=develop

* remove useless code;test=develop

* add dependency; test=develop

* fix bug;

* add sig.cc ; test=develop

* fix use_op error; test=develop

* fix bug; test=develop

* fix bug; test=develop

* add conv3d register; test=develop

* fix star gan and conv_nn_grad test failed; test=develop

* add header; test=develop

* manually recover to develop;

* resolve conflict; test=develop

* remove useless code

* fix bug;

* remove conv2d_cudnn; test=develop

* fix bugs; test=develop

* fix cpu rocm compile bugs; test=develop

* fix blas error; test=develop

* fix compile bug; test=develop

* fix windows compile error; test=develop

* fix windows error; test=develop

* resolve conflict; test=develop

d50fb43e

20 Feb, 2022 · 1 commit

[PTen->Phi PR1] Change pten dirname and namespace to phi (#39748) · dcfe1986

Committed by Chen Weihang on Feb 20, 2022

* rename pten dir to phi

* rename namespace to phi

* rename infrt pten dir to phi

* resolve conflict

* rename pten to phi in cmake

* revert all infrt change

* change needed files

* fix infrt failed

* fix inference failed

dcfe1986

19 Feb, 2022 · 1 commit

[Pten]Unify paddle/pten::framework::ddim into pten::ddim (#39614) · 2fe04264

Committed by Aurelius84 on Feb 19, 2022

* Unify paddle/pten::framework::ddim into pten::ddim

* fix paddle namespace

* compile successfully

* fix npu src file

* fix conflict

* fix conflict

* fix tensorrt compiler error

* fix conflict

* fix conflict

* fix test file conflict

* fix conflict

* fix mlu file conflict

* fix mlu file conflict

* fix cinn header file conflict

* fix conflict

* fix conflict

* fix conflict

* fix conflict

2fe04264

08 Feb, 2022 · 1 commit
- [PTEN] Update gpu_context. (#39359) · 24103cbb
  Committed by Wilber on Feb 08, 2022
```
* gpu_context..

* update

* update

* update
```
  24103cbb
25 Jan, 2022 · 1 commit
- GetWorkspaceSize trigger modification in heuristic cudnn conv (#39184) · 4c61e141
  Committed by limingshu on Jan 25, 2022
```
* first commit

* add more changes
```
  4c61e141
30 Dec, 2021 · 1 commit
- first commit (#38590) · ebc72ac2
  Committed by limingshu on Dec 30, 2021
  
  ebc72ac2
03 Dec, 2021 · 1 commit
- refine structure for cuda and rocm (#37202) · a6d2fddb
  Committed by ronnywang on Dec 03, 2021
```
* refine structure for cuda and rocm

* update

* update

* update

* update
```
  a6d2fddb
08 Oct, 2021 · 1 commit

Support CUDA Graph on ParallelExecutor (#36250) · f9591bb1

Committed by Zeng Jinle on Oct 08, 2021

* support CUDA Graph on PE

* add ut, fix CI compile

* reduce memory consumption

* fix CUDA 10 CI

* improve coverage

* improve python coverage

f9591bb1

04 Aug, 2021 · 1 commit
- Set Tensor Core MathType for bfloat16 in conv using cudnn (#34409) · c79fa1c3
  Committed by Lijunhui on Aug 04, 2021
  
  c79fa1c3
02 Jun, 2021 · 1 commit
- conv2d support bfloat16 (#32221) · 5981bee2
  Committed by wuhuanzhou on Jun 02, 2021
  
  5981bee2
26 May, 2021 · 1 commit

optimize OP's compilation time (#32617) · 78ecb668

Committed by wuhuanzhou on May 26, 2021

* optimize OP's compilation time, test=develop

* add more op and run ci test, test=develop

* CUDA Kernel register in cc file, test=develop

* fix macros, test=develop

* fix undefined symbol error, test=develop

* fix compilation error and undefined symbol, test=develop

* fix compilation error on Windows, test=develop

* fix compilation error on Windows, test=develop

78ecb668

18 Feb, 2021 · 1 commit
- enable exhaustive_search for forward and backward algos when dtype is float16 (#30959) · f0ee1592
  Committed by Zhang Ting on Feb 18, 2021
```
* enable exhaustive_search for input_grad when dtype is float16

* enable exhaustive_search for forward algos
```
  f0ee1592
11 Jan, 2021 · 1 commit
- Add tf32 switch for cuDNN (#29192) · 924aac22
  Committed by AshburnLee on Jan 11, 2021
  
  924aac22
20 Nov, 2020 · 1 commit
- fix the number of perf algo for conv cudnn in exhaustive mode (#28694) · 8b853b30
  Committed by wangchaochaohu on Nov 20, 2020
  
  8b853b30
16 Nov, 2020 · 1 commit
- Fix cudnn workspace limit in cudnn-8 (#28611) · f962bd34
  Committed by Leo Chen on Nov 16, 2020
  
  f962bd34
14 Oct, 2020 · 1 commit
- tune backward filter algorithm for float16 (#27529) · d5cc144c
  Committed by Zhang Ting on Oct 14, 2020
```
* use exhaustive_search for float16

* tune algo only when dtype is float16
```
  d5cc144c
23 Sep, 2020 · 1 commit
- [bug fix]:Memory increases after adapting the cudnn version to cudnn8 (#27436) · c17f9cf2
  Committed by Shang Zhizhou on Sep 23, 2020
```
* [bug fix]:Memory increases after adapting the cudnn version to 8

* [bug fix]cudnnGetConvolutionForwardAlgorithm not defined
```
  c17f9cf2
05 Aug, 2020 · 1 commit
- [CUDNN8 support] : support CUDNN8 (#25664) · 358bc06c
  Committed by Zhaolong Xing on Aug 05, 2020
```
* cudnn8 support
test=develop

* fix ci error
test=develop
```
  358bc06c
27 May, 2020 · 1 commit
- fix conv_transpose Op fp16 error test=develop (#24695) · 355caee1
  Committed by wangchaochaohu on May 27, 2020
  
  355caee1
12 Apr, 2020 · 1 commit
- fix bug for exhaustive_search in conv_fusion_op, test=develop (#23727) · b4b6763a
  Committed by zhongpu on Apr 12, 2020
  
  b4b6763a
03 Apr, 2020 · 1 commit

support Exhaustive search in dygraph (#23415) · dbfbd7ea

Committed by zhongpu on Apr 03, 2020

* use global conv cache; test=develop

* use singleton cache; test=develop

* fix format error; test=develop

* add cudnn helper header; test=develop

* fix header error; test=develop

* fix mac unit test; test=develop

* fix mac unit test; test=develop

* fix file format; test=develop

* fix include file error, test=develop

* remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop

* fix test_elementwise_mul_op_dim, test=develop

* fix compile error, test=develop
Co-authored-by: phlrain <phliuhongyu@126.com>

dbfbd7ea
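The "use global conv cache" and "use singleton cache" bullets of #23415 suggest that the result of an exhaustive search is stored once per process and reused. The sketch below shows one way such a cache could look in Python. The class name, the key layout, and the stand-in search function are hypothetical and do not reflect Paddle's actual C++ code.

```python
from typing import Callable, Dict, Hashable, Optional


class ConvAlgoCache:
    """Hypothetical process-wide cache: conv configuration -> chosen algorithm."""

    _instance: Optional["ConvAlgoCache"] = None

    def __init__(self) -> None:
        self._store: Dict[Hashable, str] = {}

    @classmethod
    def instance(cls) -> "ConvAlgoCache":
        # Lazily create the single shared instance (not thread-safe in this sketch).
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def get_or_search(self, key: Hashable, search: Callable[[], str]) -> str:
        # Run the expensive search only on a cache miss.
        if key not in self._store:
            self._store[key] = search()
        return self._store[key]


search_calls = 0


def fake_exhaustive_search() -> str:
    # Stand-in for timing every candidate algorithm; returns a made-up name.
    global search_calls
    search_calls += 1
    return "algo_1"


# Assumed key: input shape, filter shape, dtype.
key = ((1, 3, 224, 224), (64, 3, 7, 7), "float16")
first = ConvAlgoCache.instance().get_or_search(key, fake_exhaustive_search)
second = ConvAlgoCache.instance().get_or_search(key, fake_exhaustive_search)
```

The second lookup hits the cache, so the stand-in search runs only once for a given configuration. A real implementation would also need locking if several threads can query the cache.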

02 Apr, 2020 · 2 commits

Revert "Exhaustive search (#22821)", test=develop (#23401) · bfb07aaf

Committed by zhongpu on Apr 02, 2020
```
This reverts commit 48144e40.
```
bfb07aaf

Exhaustive search (#22821) · 48144e40

Committed by zhongpu on Apr 02, 2020

* use global conv cache; test=develop

* use singleton cache; test=develop

* fix format error; test=develop

* add cudnn helper header; test=develop

* fix header error; test=develop

* fix mac unit test; test=develop

* fix mac unit test; test=develop

* fix file format; test=develop

* fix include file error, test=develop

* remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop

* fix test_elementwise_mul_op_dim, test=develop
Co-authored-by: phlrain <phliuhongyu@126.com>

48144e40

07 Jan, 2020 · 1 commit
- replace CUDNN_ENFORCE with PADDLE_ENFORCE_CUDA_SUCCESS, test=develop (#22109) · ba8414d3
  Committed by Chen Weihang on Jan 07, 2020
  
  ba8414d3
04 Nov, 2019 · 1 commit
- refine code for code reuse test=develop (#20988) · bf379fef
  Committed by wangchaochaohu on Nov 04, 2019
  
  bf379fef
09 Oct, 2019 · 1 commit
- fix conv_op compilation issue on windows (#20230) · e03c1d8a
  Committed by liuwei1031 on Oct 09, 2019
  
  e03c1d8a
03 Sep, 2019 · 1 commit
- Change backward_guard to optimize_guard to maximize the allreduce overlap. (#19506) · abaf87be
  Committed by gongweibao on Sep 03, 2019
```
Change backward_guard to optimize_guard to maximize the allreduce overlap
```
  abaf87be
23 Jul, 2019 · 1 commit

Cudnn convolution reconstruction (#18284) · 6b78e00d

Committed by wangchaochaohu on Jul 23, 2019

* rewrite the conv_op using cudnn_conv_helper

* add workspace limit for v7 test=develop

* fix test=develop

* add half float test=develop

* fix test=develop

* fix test=develop

* revise code style test=develop

* fix test=develop

6b78e00d

10 May, 2019 · 1 commit

Double backward of conv2d. (#17211) · e32c9888

Committed by qingqing01 on May 10, 2019

* Add conv2d_grad_grad_op
* Extract the cuDNN conv algo searching code into conv_cudnn_helper.h.
    - Now use it in conv2d_grad_grad.
    - Will simplify the searching code in conv2d and conv2d_grad in the next PR.
* Enhance and fix a bug in the unit testing of gradient_checker.
* Support fetching empty variables, returning None in Python.

e32c9888

Crayon鑫 / Paddle is up to date with the fork source project