- 01 9月, 2022 1 次提交
-
-
由 Leo Chen 提交于
* refine cmake of framework * add deps for dense tensor * fix deps * remove alloc(ctx) * add depends on mkldnn
-
- 01 8月, 2022 1 次提交
-
-
由 Leo Chen 提交于
* remove cudaDeviceContext * remove more template * fix rocm compile * remove alias name CUDADeviceContext * fix compile * fix tests * revert changes
-
- 26 6月, 2022 1 次提交
-
-
由 Sing_chan 提交于
-
- 20 2月, 2022 1 次提交
-
-
由 Chen Weihang 提交于
* rename pten dir to phi * rename namespace to phi * rename infrt pten dir to phi * resolve conflict * rename pten to phi in cmake * revert all infrt change * change needed files * fix infrt failed * fix inference failed
-
- 11 2月, 2022 1 次提交
-
-
由 Feiyu Chan 提交于
* move operators/math/math_function_* to pten/kernels/func * namespace from `paddle::operators::math` to `pten::funcs`
-
- 03 12月, 2021 1 次提交
-
-
由 ronnywang 提交于
* refine structure for cuda and rocm * update * update * update * update
-
- 11 7月, 2020 1 次提交
-
-
由 Chen Weihang 提交于
* fix softmax_with_cross_entropy cuda kernel overflow bug, test=develop * replace old macro & for condition, test=develop * polish details, test=develop
-
- 11 9月, 2019 1 次提交
-
-
由 Huihuang Zheng 提交于
TemporaryAllocator is a singleton used for allocating memory for Cudnn. Since it is a singleton, we can delete it for better performance in memory. We replace TemporaryAllocator by CUDADeviceContextAllocator and CUDADeviceContextAllocation, which uses stream callback to delete the memory allocated for the stream to avoid singleton. Also added data_feed_proto to operator to fix CI in CPU compilation
-
- 21 12月, 2018 1 次提交
-
-
由 chengduo 提交于
* Add Temporal Allocator * add Temporay Allocator to DeviceContext test=develop * code refine test=develop * fix mean_iou test=develop * Add DeviceTemporaryAllocator test=develop * fix conv_op bug test=develop * small fix test=develop * code refine test=develop * log refine test=develop * fix unit test test=develop * move double check * refine concat_and_split test=develop * add limit_of_temporary_allocation test=develop * fix name test=develop
-
- 14 6月, 2018 1 次提交
-
-
由 whs 提交于
* Add mean_iou op. * Add unitest for mean iou op. * Add optional collections of confusion matrix and mean_iou. * Fix cuda kernel. * Refine code. 1. Merge computing in GPU to two kernel. 2. Use wrong array and correct array instead of confusion matrix. * Add python api and fix cuda kernel. * Fix comments. * Small fix. * Small fix.
-