- 19 Feb, 2022 1 commit
Committed by Aurelius84
* Unify paddle/pten::framework::ddim into pten::ddim
* fix paddle namespace
* compile successfully
* fix npu src file
* fix conflict
* fix conflict
* fix tensorrt compiler error
* fix conflict
* fix conflict
* fix test file conflict
* fix conflict
* fix mlu file conflict
* fix mlu file conflict
* fix cinn header file conflict
* fix conflict
* fix conflict
* fix conflict
* fix conflict
-
- 08 Feb, 2022 1 commit
Committed by Wilber
* gpu_context..
* update
* update
* update
-
- 25 Jan, 2022 1 commit
Committed by limingshu
* first commit
* add more changes
-
- 30 Dec, 2021 1 commit
Committed by limingshu
-
- 03 Dec, 2021 1 commit
Committed by ronnywang
* refine structure for cuda and rocm
* update
* update
* update
* update
-
- 08 Oct, 2021 1 commit
Committed by Zeng Jinle
* support CUDA Graph on PE
* add ut, fix CI compile
* reduce memory consumption
* fix CUDA 10 CI
* improve coverage
* improve python coverage
-
- 04 Aug, 2021 1 commit
Committed by Lijunhui
-
- 02 Jun, 2021 1 commit
Committed by wuhuanzhou
-
- 26 May, 2021 1 commit
Committed by wuhuanzhou
* optimize OP's compilation time, test=develop
* add more op and run ci test, test=develop
* CUDA Kernel register in cc file, test=develop
* fix macros, test=develop
* fix undefined symbol error, test=develop
* fix compilation error and undefined symbol, test=develop
* fix compilation error on Windows, test=develop
* fix compilation error on Windows, test=develop
-
- 18 Feb, 2021 1 commit
Committed by Zhang Ting
* enable exhaustive_search for input_grad when dtype is float16
* enable exhaustive_search for forward algos
-
- 11 Jan, 2021 1 commit
Committed by AshburnLee
-
- 20 Nov, 2020 1 commit
Committed by wangchaochaohu
-
- 16 Nov, 2020 1 commit
Committed by Leo Chen
-
- 14 Oct, 2020 1 commit
Committed by Zhang Ting
* use exhaustive_search for float16
* tune algo only when dtype is float16
-
- 23 Sep, 2020 1 commit
Committed by Shang Zhizhou
* [bug fix]: Memory increases after adapting the cudnn version to 8
* [bug fix]: cudnnGetConvolutionForwardAlgorithm not defined
-
- 05 Aug, 2020 1 commit
Committed by Zhaolong Xing
* cudnn8 support test=develop
* fix ci error test=develop
-
- 27 May, 2020 1 commit
Committed by wangchaochaohu
-
- 12 Apr, 2020 1 commit
Committed by zhongpu
-
- 03 Apr, 2020 1 commit
Committed by zhongpu
* use global conv cache; test=develop
* use singleton cache; test=develop
* fix format error; test=develop
* add cudnn helper header; test=develop
* fix header error; test=develop
* fix mac unit test; test=develop
* fix mac unit test; test=develop
* fix file format; test=develop
* fix include file error, test=develop
* remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop
* fix test_elementwise_mul_op_dim, test=develop
* fix compile error, test=develop
Co-authored-by: Nphlrain <phliuhongyu@126.com>
-
- 02 Apr, 2020 2 commits
Committed by zhongpu
* use global conv cache; test=develop
* use singleton cache; test=develop
* fix format error; test=develop
* add cudnn helper header; test=develop
* fix header error; test=develop
* fix mac unit test; test=develop
* fix mac unit test; test=develop
* fix file format; test=develop
* fix include file error, test=develop
* remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop
* fix test_elementwise_mul_op_dim, test=develop
Co-authored-by: Nphlrain <phliuhongyu@126.com>
- 07 Jan, 2020 1 commit
Committed by Chen Weihang
-
- 04 Nov, 2019 1 commit
Committed by wangchaochaohu
-
- 09 Oct, 2019 1 commit
Committed by liuwei1031
-
- 03 Sep, 2019 1 commit
Committed by gongweibao
Change backward_guard to optimize_guard to maximize the allreduce overlap
-
- 23 Jul, 2019 1 commit
Committed by wangchaochaohu
* rewrite the conv_op using cudnn_conv_helper
* add workspace limit for v7 test=develop
* fix test=develop
* add half float test=develop
* fix test=develop
* fix test=develop
* revise code style test=develop
* fix test=develop
-
- 10 May, 2019 1 commit
Committed by qingqing01
* Add conv2d_grad_grad_op
* Extract the cuDNN conv algo searching code in conv_cudnn_helper.h.
  - Now use it in conv2d_grad_grad.
  - Will simplify the searching code in conv2d and conv2d_grad in next PR.
* Enhance and fix bug in unit testing of gradient_checker.
* Support fetching empty variables, return None in Python.
-