- 16 Aug, 2021 5 commits
-
-
Committed by zyfncg
* Support NPU OP hard_swish and hard_swish_grad
* Support NPU OP hard_swish and hard_swish_grad
* add the unittest to compare the result between npu and cpu
* format the prompt of exception
* replace Min and Max op by ClipByValue op
* fix the precision problem for fp16
* Using HardtanhGrad to improve performance
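To illustrate the "replace Min and Max op by ClipByValue op" item, here is a minimal plain-Python sketch, not the NPU kernel itself. It assumes the usual hard_swish definition `x * clip(x + offset, 0, threshold) / scale` with defaults `threshold=6`, `scale=6`, `offset=3`; the function names are hypothetical and chosen only for this example.

```python
# Sketch only: scalar reference versions, not the NPU operator code.
# The defaults (threshold=6, scale=6, offset=3) are assumed, not taken from the commit.

def clip_by_value(v, lo, hi):
    """Clamp v into the closed interval [lo, hi]."""
    return max(lo, min(v, hi))


def hard_swish_min_max(x, threshold=6.0, scale=6.0, offset=3.0):
    """hard_swish written with separate max and min steps."""
    return x * min(max(x + offset, 0.0), threshold) / scale


def hard_swish_clip(x, threshold=6.0, scale=6.0, offset=3.0):
    """hard_swish written with a single clip step."""
    return x * clip_by_value(x + offset, 0.0, threshold) / scale


# Both formulations agree on a few sample points.
samples = [-4.0, -3.0, -1.0, 0.0, 1.0, 3.0, 4.0]
agree = all(hard_swish_min_max(x) == hard_swish_clip(x) for x in samples)
```

The clip form expresses the clamp as one operation instead of two, which is the substitution the commit message describes.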
-
Committed by Zhanlue Yang
-
Committed by Leo Chen
-
Committed by tianshuo78520a
-
Committed by ronnywang
* add p_norm_op_npu
* remove p_norm_grad op
* update
-
- 13 Aug, 2021 9 commits
-
-
Committed by Tongxin Bai
* OP dot: refactor CPU kernels and get better loop performance.
* Minor fix on code format.
* Fixed minor errors.
* Add new API: einsum
* Update the Einsum unit test. One case failed with matmul_v2, where the dtype is int64:
  a = np.arange(2 * 3 * 1).reshape(2, 3, 1)
  b = np.arange(1)
  paddle.einsum("...i, ...i", a, b)
* Test cases in test_einsum test floating point dtypes only. As of now Paddle only supports float/double dtypes in matmul, which is one of the building blocks of this Einsum implementation. We decided not to test einsum against other dtypes.
* Polish format.
* More formatting.
* Format...
* Einsum: improve test coverage.
* Einsum: bug fixes and more testcases for testing error messages
* Einsum: fix format..
* Einsum: fixed typo and format.
* Einsum: format again...
* Einsum: applied suggested changes.
* Einsum API: improve API documentation.
* Einsum API: apply suggested changes.
* Einsum API: Add dygraph only note.
* Einsum API: Add dygraph only note.
* Einsum API: fixed unittest.
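For readers unfamiliar with the `"...i, ...i"` expression in the failing case above, the following plain-Python sketch spells out what that contraction computes for those shapes. It does not call Paddle or NumPy; `einsum_last_axis` is a hypothetical helper written only for this illustration, and the nested lists mirror `np.arange(2 * 3 * 1).reshape(2, 3, 1)` and `np.arange(1)`.

```python
# Sketch only: what "...i, ...i" means for a (2, 3, 1) operand and a (1,) operand.
# The shared index i is multiplied element-wise and summed away; the leading
# "..." axes of the first operand are kept, giving a (2, 3) result.

def einsum_last_axis(a, b):
    """Contract the trailing axis of a 3-D nested list `a` with the 1-D list `b`."""
    return [
        [sum(x * y for x, y in zip(row, b)) for row in plane]
        for plane in a
    ]


# Integer inputs, matching the int64 case from the commit message.
a = [[[k] for k in range(j * 3, j * 3 + 3)] for j in range(2)]  # shape (2, 3, 1)
b = [0]                                                         # shape (1,)

result = einsum_last_axis(a, b)  # shape (2, 3)
```

Because `b` contains only zero here, every entry of the result is zero; the case is interesting for its integer dtype rather than for its values.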
-
Committed by zyfncg
-
Committed by zyfncg
* Fix a bug: can't load more than one custom op module
* Fix a bug: can't load more than one custom op module
* add test for load multiple modules of custom c++ op
* add config for Coverage CI
-
Committed by Zeng Jinle
-
Committed by zhouweiwei2014
-
Committed by Qi Li
-
Committed by ronnywang
-
Committed by Baibaifan
-
Committed by andyjpaddle
-
- 12 Aug, 2021 11 commits
-
-
Committed by Qi Li
-
Committed by Chen Weihang
* remove unmatched signal error stack
* fix error writing for cond
-
Committed by Chen Weihang
This reverts commit 0a5c99e8.
-
Committed by zhouweiwei2014
-
Committed by Wilber
-
Committed by Feng Xing
This PR adds fused transformer related files that define the C interface, including classes, functions, etc.
-
Committed by zhulei
* Fix safety-bug of functional.linear
* Fix safety-bug of functional.linear
* Fix safety-bug of functional.linear
* Fix safety-bug of functional.linear
-
Committed by ShenLiang
* add recompute for pp
* add recompute offload
* add recompute partition
-
Committed by wuhuachaocoding
-
Committed by Fan Zhang
* [NPU] Support npu op expand_v2 and expand_v2_grad
* [NPU] Support npu op expand_v2 and expand_v2_grad
* [NPU] Support npu op expand_v2 and expand_v2_grad
* update test_expand_v2_op_npu.py
* update test_expand_v2_op_npu.py
* modify expand_v2_op_npu.cc
* modify expand_v2_op_npu.cc
-
Committed by Peihan
* add det_mv3_db & LeViT test case in pr-ci-inference
* fix LeViT model dir bugs
* fix grammar error
-
- 11 Aug, 2021 15 commits
-
-
Committed by Jacek Czaja
* - Added softmax without caching
* - Binary is no longer manually cached
* - Activation onednn caching removed
* - Removed manual caching of activation
* - modified UT
* - fix
* - fix
* - fixes to building
* - fix
* - fix
* - fix to UT
* - Faulty UT workaround
* - approval workaround
* - Fixes after review
* - compilation fixes
* - more lint fixes
* - more fixes after review
* - fixes after another round of review
-
Committed by WeiXin
* add set_value_grad op
* add unittest.
* polish unittest.
* polish code.
* support cuda kernel
* polish code according to CI
* polish code.
* polish code
* remove *.pyc
* polish code.
* add unittest to improve coverage.
* polish code.
-
Committed by Wangzheee
* fix_fc_reshape_convert
* fix
-
Committed by Fan Zhang
-
Committed by pangyoki
* add while read_from_array write_to_array npu op
* optimize unittest
-
Committed by Roc
-
Committed by ronnywang
* add momentum_op_npu and test
* update
* fix hang
-
Committed by ronnywang
* add reduce_mean_op_npu and test
* remove skip.If
* update
-
Committed by ronnywang
* add batch_norm_op_npu and tests
* remove skip.If
* fix bug
-
Committed by Hao Lin
* Add ext_tensor.slice() API, test=develop
* Call Tensor::mutable_data first to fix bugs and add test for writing to sliced tensor
* Fix unit test bug
* Fix code format problem, test=develop
* Fix code format problem
* Fix code format problem
* strengthen unit test
* Use CustomTensorUtils::ShareDataFrom to simplify code
-
Committed by lilong12
* add auto_parallel apis
-
Committed by 0x45f
* add exp and exp_grad npu op
* modify support register type
* remove empty line and remove exp_grad support data type int/int64
* move exp and exp_grad kernel to activation_op_npu.cc, delete attrs
* move code to activation_op_npu.cc
-
Committed by andyjpaddle
-
Committed by wenbin
-
Committed by niuliling123
-