- 29 Sep, 2021 14 commits
-
-
Committed by Zeng Jinle
* add basic support for CUDA Graph * fix ci compile error * fix LOG print, fix windows CI * follow comments and update * small fix for default ctor * fix rocm compile error * fix CPU compile error
-
Committed by Liu-xiandong
* fix cusparse compile problem, test=develop * Modify file permissions
-
Committed by liutiexing
* add align for WorkQueue * add spinlock * merge spinlock
-
Committed by yaoxuefeng
-
Committed by zhulei
* [npu] add box coder * [npu] add box coder
-
Committed by pangyoki
-
Committed by zhulei
* [NPU] Add group norm * [NPU] Add group norm * [NPU] Add group norm * [NPU] Add group norm * [NPU] Add group_norm op
-
Committed by Aganlengzi
* merge conflict of paddle_gtest_main.cc * modify FLAGS_npu_precision_mode and default not to call aclSetCompileopt
-
Committed by Yiqun Liu
-
Committed by Zeng Jinle
-
Committed by baoachun
-
Committed by Zeng Jinle
-
Committed by Li Min
-
Committed by ronnywang
-
- 28 Sep, 2021 13 commits
-
-
Committed by Liu-xiandong
Add sparse_attention OPs, python api will be added in next pr
-
Committed by Lijunhui
* Add paddle.linalg.eig op * remove comments * remove comments * extend batch_size to the origin * add real times complex functor & destroy the backward complex output bug * terminate output diff when input real tensors * correct tiny doc errors * move functions from eig_helper to svd_helper and remove eig_helper * remove tensor.Resize * remove no longer used code * use existing lapack functions * reply review comments 21/27 * remove .cu as this op is only executed on CPU * remove const_cast & add const in argument list for read-only references * fix sample code error in CI * remove template typename Tbase and more * remove eig exposure in paddle.* * add 'name=None' in eig python implementation * handle the unittest * try to solve the unittest * solve CI coverage * remove no longer used code * polish API doc and more * reply review comments * polish unittest, commit plan B * polish unittest
-
Committed by ronnywang
-
Committed by Thunderbrook
* ps gpu dump * remove log
-
Committed by xiayanming
* [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid * [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid * [HIP] fix op not support AMD GPU bug * [hybrid] seed and dropout op support force-cpu * [hybrid] seed and dropout op support force-cpu * [hybrid] seed and dropout op support force-cpu * [hybrid] seed and dropout op support force-cpu * [hybrid] seed and dropout op support force-cpu * [hybrid] fix seed ci failed issue * add AsExtra for force_cpu of seed op
-
Committed by Jiabin Yang
* fix dygraph double grad dtype error when calling for high differential scenario * reinvoke ci * add test for partial_engine.cc
-
Committed by Leo Chen
* read envs in flags_map * add flags to undefok
-
Committed by Leo Chen
-
Committed by Guoxia Wang
-
Committed by Zeng Jinle
-
Committed by Yanxing Shi
* Initial Commit * add unittest and add error information * modify doc * fix some error * fix some word * fix bug cudaDeviceProp* and modify error explanation * fix cudaDeviceProp* error and unittest samples * fix hip error and PADDLE_WITH_HIP * update style * fix error is_compiled_with_cuda * fix paddle.device.cuda.get_device_properties * fix error for multi thread safe * update style * merge conflict * modify after mentor review * update style * delete word * fix unittest error for windows * support string input and modify some code * modify doc to support string input * fix error for express information * fix error for express information * fix unittest for windows * fix device.startswith('gpu:') * format error and doc * fix after review * format code * fix error for doc compile * fix error for doc compile * fix error for doc compile * fix error for doc compile * fix error for doc compile * fix py2 error * fix wrong words and doc * fix _gpuDeviceProperties
-
Committed by Huihuang Zheng
* Add Basic CINN Runner Class * Add CinnCacheKey * Add Cache logic and improve CinnCacheKey * Modify as reviewer commented * Implement hash_combine to fix MAC build.
-
Committed by Siming Dai
-
- 27 Sep, 2021 6 commits
-
-
Committed by xiaoxiao-luomu
* gloo hdfs set check & gloo connect retry * add vlog * print gloo connect addr & add vlog * . * modify vlog * modify vlog * modify vlog
-
Committed by Jiawei Wang
* fix extra op for expand, expand_as, tile, unstack * fix unique unstack dim 0 * Update expand_v2_op.cc * fix unique_op format
-
Committed by limingshu
* A leap of try for cudaLaunchCooperativeKernel * fix bugs * Totally replace the lar cuda kernel * Fix bugs * fix code according to comments * fix codes according to review comments * adding some function overload * relocate the power operation.
-
Committed by jakpiase
* refactored reshape multiop kernel and added flatten1/2 kernels * added formatting for flatten tests * CI fix * disabled reshape_kernel ops after successful CI run * minor fix
-
Committed by Aurelius84
* Polish multi-thread schedule strategy * fix atomic_deps * modify into lambda function * add and run
-
- 26 Sep, 2021 7 commits
-
-
Committed by JZ-LIANG
-
Committed by JYChen
* add func/class API psroi_pool and UT * add UT in static mode * Remove redundant type checks in static mode * More detailed description for test_psroi_pool_op * fix code format of UT * fix en-doc
-
Committed by Thunderbrook
* set file_num in one shard * format
-
Committed by zhangbo9674
* adam to adamw in AdamW * add lr_ratio in adamw * refine logic bug in cpu adamw * delete fix bug for cpu adamw * delete fix bug for cpu adamw
-
Committed by Leo Chen
-
Committed by Yulong Ao
-
Committed by whs
-