- 08 Sep, 2021 17 commits
-
-
Committed by WangXi
-
Committed by Leo Chen
* add clip_by_norm fp16 kernel * add ut
-
Committed by Shang Zhizhou
* update slice plugin * add test * fix code style * fix trt6 * update test * fix test * add timeout * update trt version * update cmake
-
Committed by xiongkun
* can pass the fake test * add files * modify cmake to pass windows-ci * for ci pass * WITH_GLOO=ON * for pass coverage test * add cpuonly testcase * add * disable nccl when compile with cuda * change python version in cpuonly * add backend argument * add required gpu * add required:gpu
-
Committed by Guoxia Wang
-
Committed by Zeng Jinle
* fix scatter_add_nd and gather bug * fix gather compile error
-
Committed by Zeng Jinle
* add fleet api for program pass * turn on apply pass for CI test * fix disable fuse_all_optimizer bug * try to test ci * fix CI * fill unspecified op role * fix fuse_allreduce * add ut to improve coverage * remove useless change * improve c++ coverage * follow some comments * test ir pass pipeline * update doc * reduce ut time again
-
Committed by zhangkaihuo
The bug is that mean and var are indexed incorrectly, which reads out of bounds: both have shape [batch_size], but the thread idx ranges over 0~feature_size, so mean[idx] and var[idx] are wrong. When batch_size=1, the correct accesses are mean[0] and var[0]. A unit test with batch_size=1 is added.
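The indexing issue described above can be illustrated with a small NumPy sketch. It is not the patched kernel: the function names `normalize_rows` and `buggy_stat_access` are hypothetical, and the per-row normalization formula is an assumption made only to give `mean` and `var` a concrete use. The point it shows is that statistics of shape [batch_size] must be indexed by the batch row, not by the per-feature index.

```python
import numpy as np


def normalize_rows(x, eps=1e-5):
    """Normalize each row of x, where x has shape [batch_size, feature_size].

    Hypothetical illustration only; the normalization formula is assumed.
    """
    x = np.asarray(x, dtype=np.float64)
    batch_size, feature_size = x.shape
    mean = x.mean(axis=1)  # shape [batch_size]
    var = x.var(axis=1)    # shape [batch_size]
    out = np.empty_like(x)
    for b in range(batch_size):
        for idx in range(feature_size):  # idx plays the role of the thread index
            # Correct access: the statistics are indexed by the batch row b.
            out[b, idx] = (x[b, idx] - mean[b]) / np.sqrt(var[b] + eps)
    return out


def buggy_stat_access(x):
    """Reproduce the faulty pattern: index mean with the feature index."""
    x = np.asarray(x, dtype=np.float64)
    mean = x.mean(axis=1)  # shape [batch_size]
    feature_size = x.shape[1]
    # Out of bounds as soon as idx >= batch_size.
    return [mean[idx] for idx in range(feature_size)]
```

With a single row, for example `np.array([[1.0, 2.0, 3.0, 4.0]])`, `normalize_rows` only ever reads `mean[0]` and `var[0]`, while `buggy_stat_access` fails at `mean[1]`. NumPy raises an `IndexError` there; a GPU kernel would instead read past the end of the buffer without warning.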
-
Committed by cc
-
Committed by lilong12
* update, test=develop
-
Committed by lilong12
* update, test=develop
-
Committed by feng_shuai
* merge CMakeList.txt manually * add platform for changethreadnum * repair some bugs according to make error * do nothing just flush CI * forget change thread num * add inplace_atol param for check_output_with_place * Windows * std::min and std::max should be changed because of windows
-
Committed by lilong12
* support weight sharing
-
Committed by Leo Chen
* release gil before op run * support npu grad test * fix op_test
-
Committed by Zhong Hui
-
Committed by wawltor
* add the matmul v2 grad kernel * reduce the test case time * update the test case for the matmul double grad * remove the unused code for the matmul double grad * update the test case for the double grad matmul * remove the unused code in dot
-
Committed by WangXi
-
- 07 Sep, 2021 15 commits
-
-
Committed by Zeng Jinle
* fix scatter_nd_add doc, test=document_fix * update test=document_fix
-
Committed by yaoxuefeng
-
Committed by wangxinxin08
* add conv op check for illegal input or attributes
-
Committed by Qi Li
* [NPU] update batch norm op, test=develop * add NHWC support for bn, test=develop
-
Committed by XiangGao
Co-authored-by: Nroot <root@bjyz-sys-gpu-kongming9.bjyz.baidu.com>
-
Committed by Aurelius84
* Add DPADDLE_WITH_CUDA for GCC * polish code
-
Committed by furnace
* [NPU] fix for test_norm_op_npu * [NPU] add norm_grad * [NPU] add CheckAxis for axis * [NPU] delete debug codes * norm can not use L2Normalize, norm_grad can use L2NormalizeGrad * [NPU] delete useless codes * [NPU] optimize norm_grad OpMaker * Update python import path
-
Committed by Qi Li
* [NPU] log_softmax_grad, test=develop * remove debug files, test=develop * update lookup_table_v2 for CANN 5.0.x, test=develop
-
Committed by jakpiase
* fix for reshape2 * added reviewers' suggestions
-
Committed by XiangGao
* add AsExtra in data_norm op * pass data_layout from python to data_norm op * fix data_layout in data_norm op Co-authored-by: Nroot <root@bjyz-sys-gpu-kongming9.bjyz.baidu.com>
-
Committed by Aurelius84
* fix commit * Open unittest * fix unittest on Windows * fix constructor
-
Committed by Sing_chan
-
Committed by Aurelius84
* open test_resnet_amp on Windows * disable on Windows CPU CI for timeout * disable on Windows CPU CI for timeout * fix code style
-
Committed by wawltor
* transfer the static.accuracy to v2 api * remove the unused code
-
Committed by xiayanming
* [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid * [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid * [HIP] fix op not support AMD GPU bug
-
- 06 Sep, 2021 8 commits
-
-
Committed by wangguanzhong
* support double in deformable conv * add double for dcn v2
-
Committed by joanna.wozna.intel
* Add fusion_lstm INT8 PTQ * Correct mkldnn_cache_capacity and enable fc_lstm_fuse_pass only for this test * Change mkldnn_cache_capacity
-
Committed by Wei Shengyu
* add pool2d grad grad * dbg * add unittest * update format * add more unittests * dbg
-
Committed by Double_V
* add kernel, stride check * add unittest for param out of range * delete max limit check
-
Committed by heliqi
* add depthwise_conv_npu_grad op * add depthwise_conv_npu_grad op * add depthwise_conv_npu_grad op * add NHWC test case
-
Committed by WeiXin
* support numpy dtype and polish code of list index. * polish code.
-
Committed by Feng Xing
This PR makes the fused transformer Python interface raise an exception. The function bodies are not implemented yet (they will be implemented later). Following zhiqiu's comment on the previous PR-35206 (already merged), it is better to raise an exception than to use "pass".
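The reasoning behind that choice can be sketched in plain Python. The class names `SilentStub` and `RaisingStub` and the `forward` method are hypothetical and are not taken from the actual fused transformer interface. The sketch only contrasts a placeholder body that uses `pass` with one that raises.

```python
class SilentStub:
    """Hypothetical placeholder whose body is just `pass`."""

    def forward(self, x):
        # Returns None silently, so callers may not notice that
        # nothing has been implemented.
        pass


class RaisingStub:
    """Hypothetical placeholder that fails loudly until implemented."""

    def forward(self, x):
        # Signals clearly that the body is still missing.
        raise NotImplementedError("forward is not implemented yet")
```

Calling `SilentStub().forward(data)` returns `None`, which can propagate into later code and fail far from the real cause. `RaisingStub().forward(data)` stops at the call site with a clear message.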
-
Committed by Wilber
-