- 18 Nov, 2022 2 commits
-
-
Committed by huangjiyi
* move "paddle/phi/backends/gpu/gpu_device_function.h" to phi
* update copyright years
* rm "fluid/platform/device/gpu/gpu_device_function.h" in phi
* fix rocm-compile bugs
-
Committed by Tian Zheng
* Refactor conv_kernel and conv_grad_kernel to provide interface for CUDNNv8 implementation
* Fix macro
* Add implementation for conv_kernel and conv_grad_kernel
* Modification after rebase onto latest develop
* Modify plan cache to comply with the API of phi::autotune
* Refactor to reduce duplicate code
* Review fix:
  - move functions in conv_kernel_impl_v8.h and conv_grad_kernel_impl_v8.h to conv_kernel.cu and conv_grad_kernel.cu
  - add const specifier for input tensor
  - add logging when plans fail to execute
  - move CudnnConvBwdFilterV8 and CudnnConvBwdDataV8 to conv_cudnn_frontend.h
* move plan building outside of cache
* Fix ROCM build
-
- 16 Nov, 2022 1 commit
-
-
Committed by Wang Xin
-
- 11 Nov, 2022 1 commit
-
-
Committed by Wang Xin
-
- 10 Nov, 2022 1 commit
-
-
Committed by huangjiyi
[PHI Decoupling] remove "paddle/fluid/platform/float16.h" and "paddle/fluid/platform/for_range.h" in phi. (#47817)
* rm "paddle/fluid/platform/float16.h" in phi
* rm "paddle/fluid/platform/for_range.h" in phi
-
- 09 Nov, 2022 1 commit
-
-
Committed by huangjiyi
* rm "paddle/fluid/platform/dynload/cudnn.h" in phi
* rm "paddle/fluid/platform/dynload/mklml.h" in phi
* rm "paddle/fluid/platform/dynload/rocblas.h" in phi
* replace "paddle::platform::dynload::" with "phi::dynload::" in phi
* revert "blas_impl.cu.h"
-
- 08 Nov, 2022 1 commit
-
-
Committed by jzhang533
* remove dependency on fluid/framework/eigen.h in phi
* more fixes for the PR-CI-Py3 failure
-
- 07 Nov, 2022 1 commit
-
-
Committed by Yiqun Liu
* Define ConvRunner to wrap the calls to cudnn conv functions.
* Use ConvKind in SearchAlgorithm.
-
- 02 Nov, 2022 1 commit
-
-
Committed by YuanRisheng
* Standardize batch norm
* standardize conv3d and depthwise_conv2d
* fix ci bugs
-
- 01 Nov, 2022 1 commit
-
-
Committed by Chen Weihang
* add extra attr property set
* add type_info for all context
* add onednn context to all context
* fix context compile error
* simplify conv kernel args
* pass runtime attr into dev_ctx
* fix macro error
* clear conv_grad_kernel extra args
* merge conv_grad_grad into conv_grad
* clear conv2d_grad_grad extra attrs
* clear yaml and eager extra attr
* fix conv1d error
* change to thread local
* fix npu compile failed
* try to fix windows compile failed
* add conv2d onednn phi kernel
* fix ci bugs (#36)
* fix compile bugs (#38)
* fix extra input transform bug (#39)
* support dynamic created attr (#40)
* reset extra info gen code
* rm conv_grad_grad kernel
* reimpl pass attr adapting
* add int attr support
* remove vector inputnames creating
* fix map at error
* Update paddle/phi/kernels/onednn/conv_grad_kernel.cc
  Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
* remove useless extra attrs
* replace mkldnn_engine by onednn_engine

Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
-
- 25 Oct, 2022 1 commit
-
-
Committed by zhouweiwei2014
-
- 24 Oct, 2022 2 commits
- 19 Oct, 2022 1 commit
-
-
Committed by Yiqun Liu
Record whether the conv algo was obtained by exhaustive search to fix an autotune cache bug. (#47065)
-
- 13 Oct, 2022 1 commit
-
-
Committed by carryyu
-
- 29 Sep, 2022 1 commit
-
-
Committed by carryyu
-
- 09 Sep, 2022 1 commit
-
-
Committed by sneaxiy
* fix softmax int64
* follow comments
-
- 07 Sep, 2022 1 commit
-
-
Committed by WangZhen
Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose
-
- 05 Sep, 2022 1 commit
-
-
Committed by Aurelius84
* [OpAttr] ksize of pool2d supports Tensor type
* fix unittest
* add unittest
-
- 25 Aug, 2022 1 commit
-
-
Committed by hong
* optimize conv algo speed
* code polish
* remove useless code
* fix compile error
* fix cpu compile error
* not use cudnn algo t
* add search cache max number
* polish code
* fix cache test bug
* add groups data format to conv args
* fix cache test bug
* fix cudnn_deterministic bug
* fix test switch auto tune bug
* fix test switch autotune bug;
* fix conv cache bug
* fix cache test error
* fix cache test bug
* fix windows mac compile error
* fix workspace search error
* update cudnn cache
* fix cache test bug; test=develop
* fix autotune switch test error
* polish code
* polish code
-
- 23 Aug, 2022 1 commit
-
-
Committed by niuliling123
-
- 03 Aug, 2022 1 commit
-
-
Committed by Thomas Young
* save change
* save change by YSL
* save change by YSL
* change by YSL
* test pre commit
* Revert "test pre commit"
  This reverts commit eee5e116331186cc544de871b4a5174a6431f17c.
* fix code style
* fix ctest
* temp save
* save change
* change by YSL
* final change by ysl
* fix ci
* fix code style
* delete unused code
* change by ysl
-
- 21 Jun, 2022 2 commits
-
-
Committed by Sing_chan
Re-sort .cu headers; set clang-format to not sort include blocks and to treat .cu as a main source file (#43633)
-
Committed by Zhang Ting
-
- 10 Jun, 2022 1 commit
-
-
Committed by Chen Weihang
* fix depthwise conv yaml error
* fix depthwise conv double grad error
-
- 05 Jun, 2022 1 commit
-
-
Committed by Sing_chan
-
- 01 Jun, 2022 1 commit
-
-
Committed by chentianyu03
* add conv3d yaml
* add conv3d_grad, conv3d_double_grad
* add final_state_conv3d test case
* add conv3d double test case
* add depthwise_conv2d grad yaml
* add depthwise_conv2d double grad test case
* modify the order of args
* add depthwise_conv2d_grad_grad config
-
- 30 May, 2022 1 commit
-
-
Committed by crystal
-
- 27 May, 2022 1 commit
-
-
Committed by zyfncg
* refactor the optional tensor
* remove optional<MetaTensor> in InferMeta
* fix bug
* fix optional<vector<Tensor>>
* fix bug
* fix rmsprop
* fix amp of eager_gen
* polish code
* fix deleted code
* fix merge conflict
* polish code
* remove is_nullopt_
* fix merge conflict
* fix merge conflict
-
- 15 Apr, 2022 1 commit
-
-
Committed by Zhanlue Yang
* [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad
* Fixed elementwise issue
* Addressed CI failures
* [DoubleGrad] Enabled test_imperative_triple_grad test cases under eager_mode
* [DoubleGrad] Enabled test_autograd_functional_dynamic.py under eager mode
* Enabled more test cases
* [DoubleGrad] Enabled test_imperative_star_gan_with_gradient_penalty.py under eager mode
* Adjusted test_imperative_star_gan_with_gradient_penalty.py
-
- 12 Apr, 2022 1 commit
-
-
Committed by hong
-
- 09 Apr, 2022 2 commits
-
-
Committed by hong
-
Committed by limingshu
* Use the maximum workspace_size of all algorithms to limit the workspace size in exhaustive search mode.
* Use the system cudaMalloc and cudaFree to allocate workspace during searching.
* Enable switching between the two kinds of workspace setting methods.

Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-
- 06 Apr, 2022 1 commit
-
-
Committed by hong
* update
* add conv yaml
* add backward
* remove useless code
* fix bug
* fix bug
* revert fluid dygraph conv2d
* remove useless infermeta function
* fix meta fn duplicate error
* conv using custom impl
* remove amp include
* fix bug
* use cudnn = true
* fix test mkldnn caching bug
-
- 22 Mar, 2022 1 commit
-
-
Committed by hong
* move mutable_data to context alloc
* move mutable_data to context alloc
* remove duplicate code
-
- 21 Mar, 2022 1 commit
-
-
Committed by From00
* Move conv-transpose OPs to phi
* Fix CI errors
* Fix CI errors
-
- 16 Mar, 2022 1 commit
-
-
Committed by Zhang Zheng
* Optimize the computation of log_softmax
* modify the var name
-
- 14 Mar, 2022 2 commits
-
-
Committed by Zhang Zheng
* Optimize performance of log_softmax
* delete unity build
* modify to phi
* fix
* fixfixfixfix
* fix
* fix
* fix
* fix
* simplify
* fix
* fix enforce
-
Committed by From00
* Move Pool OPs to phi
* Fix CI error
* Fix conflicts
-
- 12 Mar, 2022 1 commit
-
-
Committed by Chen Weihang
* rename softmax kernel name
* move softmax infershape
* fix failed test
-