- 02 Jun, 2022 3 commits
-
-
Committed by Guoxia Wang
-
Committed by Li Min
* extend forward fast_ln_kernel to support more column values.
-
Committed by sneaxiy
* support CUDAGraph for partial graph
* add ut
* fix ci
* fix ut again because of eager mode
* fix kunlun ci
* fix win ci
-
- 01 Jun, 2022 11 commits
-
-
Committed by YuanRisheng
* add yaml
* fix infrt compile bugs
-
Committed by Aganlengzi
-
Committed by Qi Li
-
Committed by Guoxia Wang
-
Committed by sneaxiy
* support weight transpose
* add ut
* add template
* fix transpose error
* fix transpose_comment
* add api tests
* add skipif
* add doc
-
Committed by YUNSHEN XIE
-
Committed by Sing_chan
-
Committed by zhangchunle
unittest parallel
Co-authored-by: zhangbo9674 <zhangbo54@baidu.com>
-
Committed by Ruibiao Chen
* Add pinned memory to HostMemoryStats
* Add macro for WrapStatAllocator
* Fix CI errors
-
Committed by huzhiqiang
-
Committed by chentianyu03
* add conv3d yaml
* add conv3d_grad, conv3d_double_grad
* add final_state_conv3d test case
* add conv3d double test case
* add depthwise_conv2d grad yaml
* add depthwise_conv2d double grad test case
* modify the order of args
* add depthwise_conv2d_grad_grad config
-
- 31 May, 2022 15 commits
-
-
Committed by Sławomir Siwek
* remove attrs from base op
* fix typos
* remove brelu
* undo removing code related to matmul
* remove whitespaces
* undo changes in matmul
* remove empty line
-
Committed by wanghuancoder
* fix full zero
* fix full zero
* fix full zero
* fix full zero
* refine
* refine
* refine
-
Committed by Sing_chan
-
Committed by Chen Weihang
* fix assign kernel copy impl
* fix test failed
-
Committed by cambriconhsq
-
Committed by Chen Weihang
* polish append op using
* fix var error
* fix group norm impl
-
Committed by Aganlengzi
* fix arg_max and reduce_max
* add arg_max ut
-
Committed by thunder95
* rrelu logic part
* unregistered op kernel (unresolved)
* commit before merge
* enrich test cases
* fix rrelu-sig bug
* fix tests in CPU environment
* fix spelling errors
* fix code format
* try to fix the test case timeout issue
* optimize test cases
* remove seed, optimize random function
* update en doc for rrelu
* fix rrelu en docs, test=document_fix
* add paper link for en docs, test=document_fix
* update en doc
* add r,test=document_fix
-
Committed by xiongkun
* change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0
* make EInsumOP support bf16
* add unittest for BF16
* add condition for test_BF16
* fix bugs
* fix
-
Committed by Leo Chen
Co-authored-by: Ryan Jeng <rjeng@nvidia.com>
-
Committed by Jiabin Yang
* support is empty
* fix error
* fix code error
* change to fake empty
* using fake empty first
* using fake empty first
* Support backward prune in fluid
-
Committed by Li Min
* replace dropout_is_test with is_test.
* improve atol on a100.
-
Committed by zyfncg
* add embedding yaml
* fix infermeta bug
* fix bug of selected_rows infer_meta
* fix selected_rows
* add unittest
-
Committed by Wilber
-
Committed by jakpiase
OneDNN md-in-tensor refactoring part 5: Memory descriptor enabled for elementwises, reductions and expand_v2 ops (#43036)
* enabled md in elementwises, reductions and expand_v2
* CI fix for invalid numpy copy
* fixed formatting
* CI rerun
* changes after review
-
- 30 May, 2022 11 commits
-
-
Committed by Chenxiao Niu
-
Committed by Li Min
* add fused_bias_dropout_residual_ln op and layer.
-
Committed by heliqi
-
Committed by shentanyue
* update lite compile cmake
* Update delete_fill_constant_op_pass.cc
* Update analysis_config.cc
-
Committed by pangyoki
* support backward inplace in eager fluid mode
* fix
* fix
* optimize format
* little change
-
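Committed by crystal
-
Committed by thunder95
* nanmedian op
* fix CUDA kernel bug
* fix count_if incompatibility on other hardware platforms
* fix incompatibility on some CPU hardware
* fix incompatibility on some CPU hardware
* fix isnan check
* handle older numpy versions that do not support all-NaN input
* handle older numpy versions that do not support all-NaN input
* fix code example
* fix api comment error
* modify backward propagation logic and C++ handling logic
* apply review suggestions
* typo pre_dim
* update en docs, test=document_fix
* remove numpy in en doc, test=document_fix
* add r,test=document_fix
* add api to all
* follow advice from chenwhql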
-
Committed by huzhiqiang
-
Committed by cambriconhsq
-
Committed by limingshu
* 1st commit
* fix useless change in header transpose_kernel_h file
* add sync
-
Committed by WangZhen
* Fix cond_block_grad error when handling no-need-grad vars
* Add comment and UT
-