- 09 Dec, 2021 1 commit
  - Committed by sneaxiy
    * fix cuda atomicAdd for FP16
    * try to fix ci
- 03 Dec, 2021 1 commit
  - Committed by ronnywang
    * refine structure for cuda and rocm
    * update
    * update
    * update
    * update
- 19 Nov, 2021 1 commit
  - Committed by Siming Dai
    * add cpu version, using set: sum, min, max
    * add cpu version: mean
    * improve cpu code and fix dynamic memory allocation problem
    * fix arg error, add index judge, delete fp16
    * fix bug in CudaAtomicMax and CudaAtomicMin
    * add CUDA version
    * fix grad_op bug for index
    * add op test, add correct cpu grad op
    * Add correct CUDA Mean grad
    * [Add] Successful MEAN and SUM
    * [Add] Successful MIN and MAX in CPU
    * [Add] Successful MIN and MAX in CUDA
    * fix windows dtype ci
    * fix ROCM ci by adding HIP flag
    * rename fused_gather_scatter to send_recv
    * unify name as send and recv
    * change zero index return time
    * add send_recv incubate api
    * fix index data type, add unittest case for API
    * delete redundant input tensor
    * fix en example and docs, add default value in pool_type
    * add shape judge and max grid judge
    * fix comment
    * fix index type bug
    * add const &
    * fix en docs
    * delete numpy in examples
    * add unittest for int input
    * fix send_recv comment
    * change send_recv to graph_send_recv
- 01 Jun, 2021 1 commit
  - Committed by chentianyu03
    * replace and remove complex64/128 types in custom OP and other files
    * fix custom_tensor_test fail bug
    * fix custom_conj_test fail bug
    * fix dispatch_test_op build fail bug
- 07 Apr, 2021 1 commit
  - Committed by furnace
- 08 Feb, 2021 1 commit
  - Committed by Qi Li
- 25 Dec, 2020 1 commit
  - Committed by Chen Weihang
    * add support for complex grad accumulated
    * add unittest for coverage
    * update test dtype
    * remove useless blank line
- 26 Sep, 2020 1 commit
  - Committed by Zhong Hui
    fix cpplint error for the atomic max/min
- 25 Sep, 2020 1 commit
  - Committed by Zhong Hui
    fix cuda atomic for ARCH<350 for the atomic_max
- 24 Sep, 2020 1 commit
  - Committed by Zhong Hui
    Add GPU Kernels of Segment Ops, support sum, max, min, mean
- 31 Jul, 2018 1 commit
  - Committed by dzhwinter
    * "rewrite the test case"
    * "follow comment"
- 30 Jul, 2018 1 commit
  - Committed by dzhwinter
    * cherry picked
    * "cherry picked platform"
    * "add comment"
    * "fix ci"
- 03 May, 2018 1 commit
  - Committed by chengduo
    * fix __shfl_down_sync_ of cross_entropy
    * use reduceSum
    * "fix ci"
- 02 May, 2018 2 commits
  - Committed by chengduoZH
  - Committed by chengduoZH
- 30 Apr, 2018 1 commit
  - Committed by dzhwinter
    * "re-commit"
    * "picked up"
    * "fix ci"
    * "fix pdb hang up issue in cuda 9"
- 10 Apr, 2018 2 commits
- 28 Feb, 2018 1 commit
  - Committed by chengduoZH
- 26 Feb, 2018 1 commit
  - Committed by chengduoZH
- 24 Feb, 2018 1 commit
  - Committed by chengduoZH
- 12 Feb, 2018 1 commit
  - Committed by qingqing01
- 10 Feb, 2018 1 commit
  - Committed by Yi Wang
- 23 Nov, 2017 1 commit
  - Committed by Yu Yang
    * Support int64 for sum op
    * Refine code
- 18 Sep, 2017 1 commit
  - Committed by 武毅
    * refine accuracy_op
    * follow comments
    * follow comments
- 23 Aug, 2017 1 commit
  - Committed by dangqingqing
- 22 Aug, 2017 2 commits
  - Committed by dangqingqing
  - Committed by dangqingqing
    1. finish lookup table CPU and GPU kernel
    2. Add some CUDA helpers
    3. Add some math functors