- 02 Nov, 2022 1 commit
-
-
Committed by Siming Dai
-
- 24 Oct, 2022 1 commit
-
-
Committed by Ghost Screaming
* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result is wrong.
* support pure bfloat16
* support bf16 linear
* update PR to pass CI
* tiny fix where_grad_kernel.cu
* Support bfloat16 type for reducer and sharding.
* Fix some bugs.
* Polish code.
* Polish code.
* Add bfloat16 datatype in fill_grad kernels.
Co-authored-by: sneaxiy <sneaxiy@126.com>
-
- 20 Oct, 2022 2 commits
-
-
Committed by liu zhengxi
Add value check & error message for gather_tree cherry-pick #47051
-
Committed by sneaxiy
support pure bfloat16 for more ops
-
- 17 Oct, 2022 2 commits
-
-
Committed by Zhang Zheng
Optimize performance of depthwise_conv
Config: input[2048, 1024, 4, 4], filter[1024, 1, 4, 4], stride=1, pad=0, dilation=1
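For reference, the output shape implied by this benchmark configuration can be worked out with the standard convolution size formula. This is only a sketch: the helper name `conv_out_size` is hypothetical (not a Paddle API), and the NCHW layout is an assumption, since the commit message does not state the data format.

```python
def conv_out_size(in_size, kernel, stride=1, pad=0, dilation=1):
    """Standard convolution output-size formula along one spatial axis."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + 2 * pad - effective_kernel) // stride + 1


# Shapes from the commit message. Assumed layout: input is NCHW and the
# filter is [out_channels, in_channels / groups, kH, kW].
n, c, h, w = 2048, 1024, 4, 4
out_c, per_group_c, kh, kw = 1024, 1, 4, 4

# Depthwise: one group per input channel, so each filter sees one channel.
assert per_group_c == 1 and out_c == c

out_shape = (n, out_c, conv_out_size(h, kh), conv_out_size(w, kw))
print(out_shape)  # (2048, 1024, 1, 1)
```

With a 4x4 filter over a 4x4 input and no padding, each channel collapses to a single output value, so under these assumptions the kernel reads far more data than it writes.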
-
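Committed by Zhang Zheng
To improve performance, move the label bounds check from the Python side into the kernel, which avoids extra op calls such as min, max, and synchronous copies. The current template parameter IgnoreIndex only takes effect when ignore_index lies in the range [0, dim). However, when a label value is out of bounds and ignore_index equals that label, the computation should still proceed normally. Although the current logic does not produce wrong results, it is still logically flawed, and the template parameter IgnoreIndex is unnecessary.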
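The rule described in this commit message can be sketched in plain Python. This is a minimal illustration, assuming a negative log-likelihood style loss; the commit message does not name the op, the function name `nll_with_ignore` is hypothetical, the default `ignore_index=-100` is only an illustrative choice, and a Python loop stands in for the GPU kernel.

```python
import math


def nll_with_ignore(log_probs, labels, ignore_index=-100):
    """Per-sample negative log-likelihood with the bounds check done inline.

    log_probs: list of rows, each a list of `dim` log-probabilities.
    labels: list of integer class ids, one per row.
    """
    losses = []
    for row, label in zip(log_probs, labels):
        dim = len(row)
        # ignore_index is compared first, so it works even when it lies
        # outside [0, dim); no separate IgnoreIndex code path is needed.
        if label == ignore_index:
            losses.append(0.0)
            continue
        # The bounds check happens here, per element, instead of in a
        # separate min/max pass over the whole label tensor beforehand.
        if not 0 <= label < dim:
            raise ValueError(f"label {label} out of range [0, {dim})")
        losses.append(-row[label])
    return losses


rows = [[math.log(0.5), math.log(0.5)], [math.log(0.25), math.log(0.75)]]
# Label 7 is out of range for dim=2, but it equals ignore_index, so it is
# skipped rather than reported as an error.
print(nll_with_ignore(rows, [0, 7], ignore_index=7))
```

Checking `ignore_index` before the range test is what lets an out-of-range ignore value behave correctly without special-casing it.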
-
- 11 Oct, 2022 1 commit
-
-
Committed by Feiyu Chan
-
- 29 Sep, 2022 1 commit
-
-
Committed by 傅剑寒
Add FP16 support for uniform in dygraph mode on Nvidia GPU
Dev PR link: PR46212
-
- 27 Sep, 2022 1 commit
-
-
Committed by zhaoyingli
-
- 20 Sep, 2022 2 commits
-
-
Committed by HongyuJia
* polish code comments
* polish data_device_transform.cc
-
Committed by Jiabin Yang
* [Eager] Fix ocr (#46124)
* fix linspace error in amp
* fix log
* fix amp error
* fix ocr error caused by amp
* add more check
* rename dtype ns
* [Eager Bug fix]Fix Detection (#46147)
* fix linspace error in amp
* fix log
* fix amp error
* Revert "Simplify size op impl (#45808)" This reverts commit c252b1de.
* fix_seg
* fix detection
Co-authored-by: Chen Weihang <sunny_cwh@163.com>
-
- 19 Sep, 2022 3 commits
-
-
Committed by RichardWooSJTU
[vision.ops.nms] Fix return order error and duplicate results with specific inputs (#46148) (#46193)
* fix return order error and duplicate results with specific inputs
-
Committed by sneaxiy
-
Committed by Chen Weihang
This reverts commit c252b1de.
-
- 14 Sep, 2022 1 commit
-
-
Committed by engineer1109
Fix compilation error with CUDA 11.7
-
- 13 Sep, 2022 1 commit
-
-
Committed by JingZhuangzhuang
-
- 09 Sep, 2022 1 commit
-
-
Committed by Chen Weihang
* simplify size op
* trans to cuda manually
* fix copy error
-
- 07 Sep, 2022 2 commits
- 06 Sep, 2022 4 commits
-
-
Committed by ykkk2333
-
Committed by xiaohemaikoo
-
Committed by LielinJiang
* fix grad error of group_norm op when cuda version==11.7
-
Committed by Wen Sun
-
- 05 Sep, 2022 1 commit
-
-
Committed by sneaxiy
-
- 02 Sep, 2022 2 commits
-
-
Committed by Yuanle Liu
-
Committed by thunder95
* add dist cuda kernel
* reuse some funcs in phi
* use pnorm
* fix code style - explicit
* fix code style
* fix bug
* remove unused headers
-
- 01 Sep, 2022 2 commits
-
-
Committed by HongyuJia
* copy kernel file to phi
* delete some code
* migrate uniform_random, test=kunlun
* fix input error, test=kunlun
* fix gpu register error, test=kunlun
* add include file, test=kunlun
* try fix error from CI, test=kunlun
* polish other PR
* fix CI-coverage error, test=kunlun
-
Committed by Leo Chen
* refine cmake of framework
* add deps for dense tensor
* fix deps
* remove alloc(ctx)
* add depends on mkldnn
-
- 31 Aug, 2022 3 commits
-
-
Committed by Aurelius84
* [OpAttr]output_size of unpool support Tensor type
* fix coverage
* fix contain_var
* fix coverage
-
Committed by Charles-hit
* fix split bug
* solve function redefine
* fix fluid.layers.split and add unit test
* delete splitInferMeta register in unary.cc
* modify test_split_op GPU unit test
* modify test_split_op GPU unit test place param
* refactor split op and fix infershape bugs
* add () in && and ||
* fix split C++ unit test
* fix split infershape
-
Committed by Li Min
-
- 30 Aug, 2022 4 commits
- 29 Aug, 2022 1 commit
-
-
Committed by Siming Dai
* move incubate to geometric
* add paddle.geometric
* fix unittest bug
* add float16 support for segment op
* change reindex and sample neighbors flag name
* add heter graph reindex
* move sample_neighbors.py to neighbors.py
* delete khop_sampler in geometric
* delete unused code
* change sample_neighbors api input order
* fix en doc
* fix unittest
* fix unittest
* change reindex
* fix division by 0
* delete unnecessary input argument
* delete final_state
-
- 25 Aug, 2022 4 commits
-
-
Committed by Aurelius84
* [OpAttr]min/max of Uniform_rand support Tensor type
* fix typo
-
Committed by Sing_chan
* make full_like support double_max in dygraph
* fix bug
-
Committed by wanghuancoder
* sync_batch_norm_grad delete mean and variance
-
Committed by Rayman
-