提交 · c0ed87296c3a09a1afc61effae736597dd70d350 · PaddlePaddle / Paddle

20 10月, 2022 1 次提交

[Cherry-pick] Simplify conv codes and fix cache and autotune bugs. (#47197) · c0ed8729

由 Yiqun Liu 提交于 10月 20, 2022

* Simplify the codes of conv. (#45966)

* Enable to record whether the conv algo is got by exhaustive search to fix autotune cache bug. (#47065)

c0ed8729

20 9月, 2022 1 次提交

Fix wrong eigen header include (#46082) (#46202) · ac8cce20

由 zyfncg 提交于 9月 20, 2022

* fix wrong eigen header include

* fix complie bug

* fix nan_inf_utils_detail

* fix resource_manager

* fix conv_miopen_helper

ac8cce20

25 8月, 2022 1 次提交

optimize conv algo cache (#41891) · 1cd7e68b

由 hong 提交于 8月 25, 2022

* optimizer conv alog speed

* code polish

* remove useless code

* fix compile error

* fix cpu compile error

* not use cudnn alog t

* add search cache max number

* polish code

* fix cache test bug

* add groups data format to conv args

* fix cache test bug

* fix cudnn_deterministic bug

* fix test switch auto tune bug

* fix test swith autotune bug;

* fix conv cache bug

* fix cache test error

* fix cache test bug

* fix windows mac compile error

* fix workspace search error

* update cudnn cache

* fix cache test bug; test=develop

* fix autotune swith test error

* polish code

* oplish code

1cd7e68b

26 6月, 2022 1 次提交
- S
  
  format all files in fluid using new config (#43776) · 576236a0
  由 Sing_chan 提交于 6月 26, 2022
  
  576236a0
27 5月, 2022 1 次提交
- R
  Support memory stats for CPU (#42945) · 21f11d35
  由 Ruibiao Chen 提交于 5月 27, 2022
```
* Support memory stats for CPU

* Add UTs

* Fix typos

* Fix typos
```
  21f11d35
15 4月, 2022 1 次提交

Change cuDNN Conv kernel for auto tune feature (#41313) · 35acfeda

由 limingshu 提交于 4月 15, 2022

* change cudnn helper for auto-tune

* Add FLAGS_use_autotune to set the global status of autotune and change the order of choosing algorithm.

* Fix the bug in calculating and printing current step cache hit rate.

* Improve the autotune cache and fix unittest.

* Change the key from AlgorithmType to int64_t.

* Fix unittest for cpu-only env.

* change ChooseAlgoByWorkspace for heuristic mode
Co-authored-by: NLiu Yiqun <liuyiqun01@baidu.com>

35acfeda

14 4月, 2022 1 次提交
- Y
  
  Optimize the finding of max workspace size. (#41741) · 3ce879db
  由 Yiqun Liu 提交于 4月 14, 2022
  
  3ce879db
09 4月, 2022 1 次提交

Autotune the workspace_size_limit in conv. (#40338) · b937cdc5

由 limingshu 提交于 4月 09, 2022

* Using the maximum workspace_size of all alogirhms to limit the workspace size in exhaustive search mode.

* Use the system cudaMalloc and cudaFree to allocate workspace during searching.

* Enable switch of two kind of workspace setting methods.
Co-authored-by: NLiu Yiqun <liuyiqun01@baidu.com>

b937cdc5

04 3月, 2022 1 次提交

Move conv to pten (#39354) · d50fb43e

由 hong 提交于 3月 04, 2022

* move conv to pten

* move conv to pten; test=develop

* fix bug;

* add conv cudnn impl; test=develop

* update

* update operator; test=develop

* fix bug; test=develop

* move operator and prepared_operator to develop; test=develop

* resolve conflict; test=develop

* remove useless code;test=develop

* add depency ; test=develop

* fix bug;

* add sig.cc ; test=develop

* fix use_op error; test=develop

* fix bug; test=develop

* fix bug; test=develop

* add conv3d register; test=develop

* fix star gan and conv_nn_grad test failed; test=develop

* add header; test=develop

* manul to recover to develop;

* resolve confilct; test=develop

* remove useless code

* fix bug;

* remove conv2d_cudnn; test=develop

* fix bugs; test=develop

* fix cpu rocm compile bugs; test=develop

* fix blas error; test=develop

* fix compile bug; test=develop

* fix windows compile error; test=develop

* fix windows error; test=develop

* resolve confilct; test=develop

d50fb43e

20 2月, 2022 1 次提交

[PTen->Phi PR1] Change pten dirname and namespace to phi (#39748) · dcfe1986

由 Chen Weihang 提交于 2月 20, 2022

* rename pten dir to phi

* rename namespace to phi

* rename infrt pten dir to phi

* resolve conflict

* rename pten to phi in cmake

* revert all infrt change

* change needed files

* fix infrt failed

* fix inference failed

dcfe1986

19 2月, 2022 1 次提交

[Pten]Unify paddle/pten::framework::ddim into pten::ddim (#39614) · 2fe04264

由 Aurelius84 提交于 2月 19, 2022

* Unify paddle/pten::framework::ddim into pten::ddim

* fix paddle namespace

* compile sucessfully

* fix npu src file

* fix conflict

* fix conflict

* fix tensorrt compiler error

* fix conflict

* fix conflict

* fix tesst file conflict

* fix conflict

* fix mlu file conflict

* fix mlu file conflict

* fix cinn header file conflict

* fix conflict

* fix conflict

* fix conflict

* fix conflict

2fe04264

08 2月, 2022 1 次提交
- W
  [PTEN] Update gpu_context. (#39359) · 24103cbb
  由 Wilber 提交于 2月 08, 2022
```
* gpu_context..

* update

* update

* update
```
  24103cbb
25 1月, 2022 1 次提交
- L
  GetWorkspaceSize trigger modfication in heuristic cudnn conv (#39184) · 4c61e141
  由 limingshu 提交于 1月 25, 2022
```
* first commit

* add more changes
```
  4c61e141
30 12月, 2021 1 次提交
- L
  
  first commit (#38590) · ebc72ac2
  由 limingshu 提交于 12月 30, 2021
  
  ebc72ac2
03 12月, 2021 1 次提交
- R
  refine structure for cuda and rocm (#37202) · a6d2fddb
  由 ronnywang 提交于 12月 03, 2021
```
* refine structure for cuda and rocm

* update

* update

* update

* update
```
  a6d2fddb
08 10月, 2021 1 次提交

Support CUDA Graph on ParallelExecutor (#36250) · f9591bb1

由 Zeng Jinle 提交于 10月 08, 2021

* support CUDA Graph on PE

* add ut, fix CI compile

* reduce memory consumption

* fix CUDA 10 CI

* improve coverage

* improve python coverage

f9591bb1

04 8月, 2021 1 次提交
- L
  
  Set Tensor Core MathType for bfloat16 in conv using cudnn (#34409) · c79fa1c3
  由 Lijunhui 提交于 8月 04, 2021
  
  c79fa1c3
02 6月, 2021 1 次提交
- W
  
  conv2d support bfloat16 (#32221) · 5981bee2
  由 wuhuanzhou 提交于 6月 02, 2021
  
  5981bee2
26 5月, 2021 1 次提交

optimize OP's compilation time (#32617) · 78ecb668

由 wuhuanzhou 提交于 5月 26, 2021

* optimize OP's compilation time, test=develop

* add more op and run ci test, test=develop

* CUDA Kernel register in cc file, test=develop

* fix macros, test=develop

* fix undefined symbol error, test=develop

* fix compilation error and undefined symbol, test=develop

* fix compilation error on Windows, test=develop

* fix compilation error on Windows, test=develop

78ecb668

18 2月, 2021 1 次提交
- Z
  enable exhaustive_search for forward and backward algos when dtype is float16 (#30959) · f0ee1592
  由 Zhang Ting 提交于 2月 18, 2021
```
* enable exhaustive_search for input_grad when dtype is float16

* enable exhaustive_search for forward algos
```
  f0ee1592
11 1月, 2021 1 次提交
- A
  
  Add tf32 switch for cuDNN (#29192) · 924aac22
  由 AshburnLee 提交于 1月 11, 2021
  
  924aac22
20 11月, 2020 1 次提交
- W
  
  fix the number of perf algo for conv cudnn in exhaustive mode (#28694) · 8b853b30
  由 wangchaochaohu 提交于 11月 20, 2020
  
  8b853b30
16 11月, 2020 1 次提交
- L
  
  Fix cudnn workspace limit in cudnn-8 (#28611) · f962bd34
  由 Leo Chen 提交于 11月 16, 2020
  
  f962bd34
14 10月, 2020 1 次提交
- Z
  tune backward filter algorithm for float16 (#27529) · d5cc144c
  由 Zhang Ting 提交于 10月 14, 2020
```
* use exhaustive_search for float16

* tune algo only when dtype is float16
```
  d5cc144c
23 9月, 2020 1 次提交
- S
  [bug fix]:Memory increases after adapting the cudnn version to cudnn8 (#27436) · c17f9cf2
  由 Shang Zhizhou 提交于 9月 23, 2020
```
* [bug fix]:Memory increases after adapting the cudnn version to 8

* [bug fix]cudnnGetConvolutionForwardAlgorithm not defined
```
  c17f9cf2
05 8月, 2020 1 次提交
- Z
  [CUDNN8 support] : support CUDNN8 (#25664) · 358bc06c
  由 Zhaolong Xing 提交于 8月 05, 2020
```
* cunn8 support
test=develop

* fix ci error
test=develop
```
  358bc06c
27 5月, 2020 1 次提交
- W
  
  fix conv_transpose Op fp16 error test=develop (#24695) · 355caee1
  由 wangchaochaohu 提交于 5月 27, 2020
  
  355caee1
12 4月, 2020 1 次提交
- Z
  
  fix bug for exhaustive_search in conv_fusion_op, test=develop (#23727) · b4b6763a
  由 zhongpu 提交于 4月 12, 2020
  
  b4b6763a
03 4月, 2020 1 次提交

support Exhaustive search in dygraph (#23415) · dbfbd7ea

由 zhongpu 提交于 4月 03, 2020

* use global conv cache; test=develop

* use singleton cache; test=develop

* fix format error; test=develop

* add cudnn helper header; test=develop

* fix header error; test=develop

* fix mac unitest; test=develop

* fix mac unitest; test=develop

* fix file format; test=develop

* fix include file error, test=develop

* remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop

* fix test_elementwise_mul_op_dim, test=develop

* fix compile error, test=develop
Co-authored-by: Nphlrain <phliuhongyu@126.com>

dbfbd7ea

02 4月, 2020 2 次提交

Z
Revert "Exhaustive search (#22821)", test=develop (#23401) · bfb07aaf
由 zhongpu 提交于 4月 02, 2020
```
This reverts commit 48144e40.
```
bfb07aaf

Exhaustive search (#22821) · 48144e40

由 zhongpu 提交于 4月 02, 2020

* use global conv cache; test=develop

* use singleton cache; test=develop

* fix format error; test=develop

* add cudnn helper header; test=develop

* fix header error; test=develop

* fix mac unitest; test=develop

* fix mac unitest; test=develop

* fix file format; test=develop

* fix include file error, test=develop

* remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop

* fix test_elementwise_mul_op_dim, test=develop
Co-authored-by: Nphlrain <phliuhongyu@126.com>

48144e40

07 1月, 2020 1 次提交
- C
  
  replace CUDNN_ENFORCE with PADDLE_ENFORCE_CUDA_SUCCESS, test=develop (#22109) · ba8414d3
  由 Chen Weihang 提交于 1月 07, 2020
  
  ba8414d3
04 11月, 2019 1 次提交
- W
  
  refine code for code reuse test=develop (#20988) · bf379fef
  由 wangchaochaohu 提交于 11月 04, 2019
  
  bf379fef
09 10月, 2019 1 次提交
- L
  
  fix conv_op compilation issue on windows (#20230) · e03c1d8a
  由 liuwei1031 提交于 10月 09, 2019
  
  e03c1d8a
03 9月, 2019 1 次提交
- G
  Change backward_guard to optimize_guard to maximize the allreduce overlap. (#19506) · abaf87be
  由 gongweibao 提交于 9月 03, 2019
```
Change backward_guard to optimize_guard to maximize the allreduce overlap
```
  abaf87be
23 7月, 2019 1 次提交

Cudnn convolution reconstruction (#18284) · 6b78e00d

由 wangchaochaohu 提交于 7月 23, 2019

* rewrite the conv_op using cudnn_conv_helper

* add workspace limit for v7 test=develop

* fix test=develop

* add half float test=develop

* fix test=develop

* fix test=develop

* revise code style test=develop

* fix test=develop

6b78e00d

10 5月, 2019 1 次提交

Double backward of conv2d. (#17211) · e32c9888

由 qingqing01 提交于 5月 10, 2019

* Add conv2d_grad_grad_op
* Extracte the cuDNN conv algo searching code in conv_cudnn_helper.h.
    - Now use it in conv2d_grad_grad.
    - Will simply the searching code in conv2d and conv2d_grad in next PR.
* Enhance and fix bug in unit testing of gradient_checker.
* Support to fetch empty variables，return None in Python.

e32c9888

PaddlePaddle / Paddle 1 年多 前同步成功

PaddlePaddle / Paddle
1 年多前同步成功