- 27 May, 2022 1 commit
Committed by zyfncg
* refactor the optional tensor
* remove optional<MetaTensor> in InferMeta
* fix bug
* fix optional<vector<Tensor>>
* fix bug
* fix rmsprop
* fix amp of eager_gen
* polish code
* fix deleted code
* fix merge conflict
* polish code
* remove is_nullopt_
* fix merge conflict
* fix merge conflict
-
- 16 March, 2022 1 commit
Committed by Zhong Hui
* segment pool support for int and int64 kernels.
* add support in python api
-
- 10 March, 2022 1 commit
Committed by Zhong Hui
* move segment_pool to phi.
* mark summed ids as optional tensor.
* fix per review comments.
-
- 02 March, 2022 1 commit
Committed by sneaxiy
* move gather.h, gather.cu.h, scatter.h, scatter.cu.h to phi library
* fix CI
* fix rocm ci
-
- 20 February, 2022 1 commit
Committed by Chen Weihang
* rename pten dir to phi
* rename namespace to phi
* rename infrt pten dir to phi
* resolve conflict
* rename pten to phi in cmake
* revert all infrt change
* change needed files
* fix infrt failed
* fix inference failed
-
- 11 February, 2022 1 commit
Committed by Feiyu Chan
* move operators/math/math_function_* to pten/kernels/func
* change namespace from `paddle::operators::math` to `pten::funcs`
-
- 17 December, 2021 1 commit
Committed by zlsh80826
According to --ptxas-options=-v, SegmentOpsKernel uses 66 registers per thread. There are two ways to resolve this problem:
1. Reduce the threads per block in the launch configuration.
2. Add __launch_bounds__ to give the nvcc compiler the information it needs to reduce register usage.
This PR chooses the __launch_bounds__ solution because changing gpu_launch_config may affect other ops.
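The register pressure described above can be checked with simple arithmetic. The sketch below is illustrative only and rests on assumptions not stated in the commit: that the figure of 66 is a per-thread register count (which is what ptxas reports), that the device allows 65536 32-bit registers per block, and that the kernel was launched with 1024 threads per block. The names `kRegistersPerBlockLimit`, `MaxThreadsForRegisters`, and `MaxRegistersForThreads` are hypothetical helpers, not Paddle or CUDA APIs.

```cpp
#include <cstdint>

// Assumption: 65536 registers available to one block (typical of many
// NVIDIA GPUs; query cudaDeviceProp::regsPerBlock for the real value).
constexpr std::uint32_t kRegistersPerBlockLimit = 65536;
constexpr std::uint32_t kWarpSize = 32;

// Largest block size (rounded down to whole warps) that fits the register
// budget when each thread needs `regs_per_thread` registers.
// Register allocation granularity is ignored for simplicity.
constexpr std::uint32_t MaxThreadsForRegisters(std::uint32_t regs_per_thread) {
  return regs_per_thread == 0
             ? 0
             : (kRegistersPerBlockLimit / regs_per_thread) / kWarpSize * kWarpSize;
}

// Largest per-thread register count that still lets `threads_per_block`
// threads fit in one block. This is the budget a launch-bounds hint
// effectively asks the compiler to respect.
constexpr std::uint32_t MaxRegistersForThreads(std::uint32_t threads_per_block) {
  return threads_per_block == 0 ? 0 : kRegistersPerBlockLimit / threads_per_block;
}

// 66 registers x 1024 threads = 67584, which exceeds the assumed 65536 limit.
static_assert(66u * 1024u > kRegistersPerBlockLimit, "over budget at 1024 threads");
// Option 1: shrink the block to at most 992 threads (31 warps).
static_assert(MaxThreadsForRegisters(66) == 992, "block size that fits");
// Option 2: keep 1024 threads and cap each thread at 64 registers.
static_assert(MaxRegistersForThreads(1024) == 64, "register cap that fits");
```

Under these assumptions, the second option leaves the shared launch configuration unchanged, which matches the reasoning given in the commit message.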
-
- 03 December, 2021 1 commit
Committed by ronnywang
* refine structure for cuda and rocm
* update
* update
* update
* update
-
- 27 April, 2021 1 commit
Committed by Zhong Hui
* [OPs] Bug fix, fix the segment mean for illegal syncthreads usage.
-
- 20 October, 2020 1 commit
Committed by wangchaochaohu
-
- 26 September, 2020 1 commit
Committed by Zhong Hui
Fix cpplint error for the atomic max/min
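For readers unfamiliar with atomic max/min on floating-point values: a common way to build one is a compare-and-swap retry loop. The sketch below is a host-side C++ analogue using `std::atomic<float>`, not the code from this commit. `AtomicMaxFloat` is a hypothetical name chosen for illustration.

```cpp
#include <atomic>

// Atomically sets *target to max(*target, value) and returns the value that
// was observed before the update (or the existing value if it was already
// greater than or equal to `value`).
inline float AtomicMaxFloat(std::atomic<float>* target, float value) {
  float observed = target->load(std::memory_order_relaxed);
  // Retry only while our value would still raise the stored maximum.
  // On failure, compare_exchange_weak refreshes `observed` with the
  // current contents, so the loop condition is re-evaluated against it.
  while (observed < value &&
         !target->compare_exchange_weak(observed, value,
                                        std::memory_order_relaxed)) {
  }
  return observed;
}
```

An atomic min follows the same pattern with the comparison reversed.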
-
- 24 September, 2020 1 commit
Committed by Zhong Hui
Add GPU kernels for segment ops, supporting sum, max, min, and mean
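As a reminder of what the four segment ops compute, here is a minimal CPU reference sketch. It is not the GPU kernel from this commit. It assumes row-major input of shape [rows, dim], sorted non-negative segment ids, and zero output for segments that receive no rows. `PoolType` and `SegmentPoolReference` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

enum class PoolType { kSum, kMean, kMax, kMin };

// Reduces rows of `data` that share a segment id into one output row.
// Output shape is [last_segment_id + 1, dim], row-major.
std::vector<float> SegmentPoolReference(const std::vector<float>& data,
                                        const std::vector<std::int64_t>& segment_ids,
                                        std::size_t dim, PoolType type) {
  const std::size_t rows = segment_ids.size();
  if (dim == 0 || data.size() != rows * dim) {
    throw std::invalid_argument("data size must equal rows * dim");
  }
  if (rows == 0) return {};
  if (segment_ids.front() < 0 ||
      !std::is_sorted(segment_ids.begin(), segment_ids.end())) {
    throw std::invalid_argument("segment ids must be sorted and non-negative");
  }
  const std::size_t num_segments =
      static_cast<std::size_t>(segment_ids.back()) + 1;
  std::vector<float> out(num_segments * dim, 0.0f);  // empty segments stay 0
  std::vector<std::size_t> counts(num_segments, 0);

  for (std::size_t r = 0; r < rows; ++r) {
    const std::size_t seg = static_cast<std::size_t>(segment_ids[r]);
    const bool first = (counts[seg] == 0);  // first row seen for this segment
    ++counts[seg];
    for (std::size_t c = 0; c < dim; ++c) {
      float& acc = out[seg * dim + c];
      const float v = data[r * dim + c];
      switch (type) {
        case PoolType::kSum:
        case PoolType::kMean: acc += v; break;
        case PoolType::kMax: acc = first ? v : std::max(acc, v); break;
        case PoolType::kMin: acc = first ? v : std::min(acc, v); break;
      }
    }
  }
  if (type == PoolType::kMean) {
    // Mean is the segment sum divided by the number of rows in the segment.
    for (std::size_t s = 0; s < num_segments; ++s) {
      if (counts[s] == 0) continue;
      for (std::size_t c = 0; c < dim; ++c) {
        out[s * dim + c] /= static_cast<float>(counts[s]);
      }
    }
  }
  return out;
}
```

For example, with rows {1, 2, 3}, {3, 2, 1}, {4, 5, 6} and segment ids {0, 0, 1}, the sum op yields {4, 4, 4} for segment 0 and {4, 5, 6} for segment 1.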
-