- 11 Oct, 2021 9 commits
-
-
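Committed by Liu-xiandong
Add paddle.nn.functional.sparse_attention API. This PR mainly wraps the sparse_attention functionality in a Python-layer interface; for the main OP code, see #PR35676. It also adds unit tests for the wrapped Python interface.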
-
Committed by caozhou
* add reshard module
* fix conflict
* update reshard module
* update and add unit test
* update reshard module and unit test
* add more unit tests
-
Committed by wangxinxin08
* enhance yolobox plugin
-
Committed by Qi Li
* [NPU] fix matmul_v2 and utils.run_check, test=develop
* remove debug files, test=develop
* fix install_check, test=develop
* fix doc, test=develop
* fix review comments, test=develop
-
Committed by Qi Li
* [NPU] fix set_value, test=develop
* fix typo, test=develop
* fix typo, test=develop
-
Committed by wangxinxin08
* add mish trt plugin, compile & install success, run error. test=develop
* modify code according to review
* add TRT_NOEXCEPT for mish trt plugin
* add unittest for mish trt plugin
* remove unnecessary check of mish in op_teller.cc
* fix some problem of trt8
* add check and modify unittest while converting mish to trt plugin
Co-authored-by: dengkaipeng <dengkaipeng@baidu.com>
-
Committed by baoachun
* add skip case in trt converter ut
* disable group_norm trt plugin
-
Committed by Huihuang Zheng
Add use_cinn flag and use it to control whether we run PaddlePaddle using CINN. Also add:
* Replace PaddlePaddle graph with a CINN graph in a pass
* PE Method to feed data and run the graph by CINN
-
Committed by JingZhuangzhuang
-
- 09 Oct, 2021 5 commits
-
-
Committed by Yiqun Liu
-
Committed by From00
* Add new API tensordot
* Set timeout value 400 for UT; Fix format for EN docs
* Set timeout value 1000 for UT; Fix format for EN docs
* Remove some input check
* Coding style improve: don't compare boolean values to True or False using ==
-
Committed by zhiboniu
-
Committed by zhaoyingli
* support ClipGradByGlobalNorm in sharding
* support ClipGradByGlobalNorm in sharding
* test=allcase
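As a rough illustration of what global-norm clipping has to do once gradients are sharded, here is a minimal pure-Python sketch. It is not Paddle's implementation: the function names are hypothetical, gradients are plain lists of floats, and a plain `sum` over shards stands in for the cross-rank all-reduce that a real sharded setup would presumably need, since each rank only sees part of the gradients.

```python
import math


def local_sq_sum(grads):
    """Sum of squared entries over the gradients held by one rank."""
    return sum(g * g for grad in grads for g in grad)


def clip_sharded_by_global_norm(shards, clip_norm):
    """Clip gradients by their global norm across all shards.

    shards: one entry per (hypothetical) rank, each a list of gradients,
            each gradient a list of floats.
    Returns (clipped_shards, global_norm).
    """
    # Assumption: this sum stands in for an all-reduce across ranks.
    total_sq = sum(local_sq_sum(grads) for grads in shards)
    global_norm = math.sqrt(total_sq)
    # Scale is 1.0 when the norm is already within clip_norm.
    scale = clip_norm / max(global_norm, clip_norm)
    clipped = [
        [[g * scale for g in grad] for grad in grads] for grads in shards
    ]
    return clipped, global_norm


# Two ranks, one gradient each; the global norm is sqrt(3^2 + 4^2) = 5.0.
clipped, norm = clip_sharded_by_global_norm([[[3.0]], [[4.0]]], clip_norm=1.0)
```

The point of the sketch is that the norm is computed over all shards before any shard is scaled; clipping each shard by its own local norm would give a different result.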
-
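Committed by wuhuanzhou
For names that do not meet the conditions of the overloaded __getattr__, always raise AttributeError, so that the behavior matches the non-overloaded version.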
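The behavior described above can be sketched in plain Python. This is only an illustration of the general pattern, not the code from the commit: the `Wrapper` and `Inner` classes and the forwarded attribute names are hypothetical.

```python
class Inner:
    """Hypothetical wrapped object."""

    shape = (2, 3)
    dtype = "float32"


class Wrapper:
    """Hypothetical wrapper that forwards only a fixed set of attributes."""

    _forwarded = {"shape", "dtype"}

    def __init__(self, inner):
        self._inner = inner

    def __getattr__(self, name):
        # __getattr__ runs only after normal attribute lookup has failed.
        if name in self._forwarded:
            return getattr(self._inner, name)
        # Anything else raises AttributeError, as an object without a
        # custom __getattr__ would, so hasattr() and getattr() with a
        # default keep working as callers expect.
        raise AttributeError(
            f"{type(self).__name__!r} object has no attribute {name!r}"
        )


w = Wrapper(Inner())
print(w.shape)                     # forwarded to the inner object
print(hasattr(w, "missing"))       # False, because AttributeError is raised
print(getattr(w, "missing", 42))   # falls back to the default
```

Raising any other exception type from `__getattr__` (or returning `None`) would break `hasattr` and the three-argument form of `getattr`, which is why matching the default behavior matters.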
-
- 08 Oct, 2021 5 commits
-
-
Committed by Zeng Jinle
* support CUDA Graph on PE
* add ut, fix CI compile
* reduce memory consumption
* fix CUDA 10 CI
* improve coverage
* improve python coverage
-
Committed by yaoxuefeng
-
Committed by Qi Li
* [NPU] support NCL and NCL for BatchNorm, test=develop
* [NPU] remove debug files, test=develop
* update, test=develop
-
Committed by huangxu96
Add python interface of subgraph:
1. all_sub_graphs()
2. get_sub_graph(idx)
-
Committed by arlesniak
* Added oneDNN BF16 relu
* fixed typo
* refactored test, review fixes
-
- 07 Oct, 2021 1 commit
-
-
Committed by Adam Osewski
* Remove unused header.
* Use ConvMKLDNNHandlerT for conv2d INT8.
* Use absolute module path to import.
-
- 05 Oct, 2021 1 commit
-
-
Committed by jakpiase
* tmp
* added concat BF16/FP32 BWD oneDNN kernel
* minor change
* minor change
* fix for CI
* added formatting
* Reverted deleting static keyword
* added reviewers suggestions
* reverted deleting concat bf16 test file
* fixed concat tests
-
- 30 Sep, 2021 4 commits
-
-
Committed by levi131
-
Committed by Aganlengzi
* [NPU] modify transpose2 and index_select_grad kernels for model xlnet
* add transpose2 int64_t unit test
* add more transpose2 unit tests
* update test_transpose_op_npu.py
-
Committed by 李季
* fix raw optim
* pre-commit test file
Co-authored-by: sneaxiy <sneaxiy@126.com>
-
Committed by 李季
-
- 29 Sep, 2021 10 commits
-
-
Committed by Zeng Jinle
* add basic support for CUDA Graph
* fix ci compile error
* fix LOG print, fix windows CI
* follow comments and update
* small fix for default ctor
* fix rocm compile error
* fix CPU compile error
-
Committed by zhaoyingli
* update func name
* skip cpu
* update unittest
* update unittest
-
Committed by Liu-xiandong
* fix cusparse compile problem, test=develop
* Modify file permissions
-
Committed by levi131
* init functional jacobian api
* finish test with dtype float32
* add float64 test case
* polish code
* use atol=1e-5 with dtype float64
* fix for ci
* set timeout for test_jacobian
* init hessian API
* save status
* polish API docstring
* modify docstring
* add utils.py
* save status
* fix dygraph double grad dtype error when calling for high differential scenario
* reinvoke ci
* test_hessian.py is ok
* polish hessian API
* init vhp
* Revert "init vhp" This reverts commit cbd4d3b66abe82b0ac10721b9eddeb7d82e0a1c8.
* add test for partial_engine.cc
* modify numerical_delta with dtype float32
* merge fix for dtype float64
* spell fix
* polish code
* rm _stop_gradient_pre_process
Co-authored-by: JiabinYang <360788950@qq.com>
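The commit message above mentions a numerical delta and an absolute tolerance for the Jacobian tests. A common way to check an analytic Jacobian is against a finite-difference estimate; the sketch below shows that idea in pure Python. It is an assumption about the general approach, not the test code from the commit: `numerical_jacobian`, its `delta` default, and the example function `f` are all hypothetical.

```python
def numerical_jacobian(f, x, delta=1e-4):
    """Central-difference estimate of the Jacobian of f at x.

    f maps a list of floats to a list of floats.
    Returns jac with jac[i][j] approximating d f_i / d x_j.
    """
    columns = []
    for j in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += delta
        x_minus[j] -= delta
        f_plus, f_minus = f(x_plus), f(x_minus)
        # One column of the Jacobian: sensitivity of every output to x_j.
        columns.append(
            [(a - b) / (2 * delta) for a, b in zip(f_plus, f_minus)]
        )
    # Transpose columns into rows so rows index outputs.
    return [list(row) for row in zip(*columns)]


def f(x):
    # Hypothetical example: analytic Jacobian is [[x1, x0], [1, 1]].
    return [x[0] * x[1], x[0] + x[1]]


jac = numerical_jacobian(f, [2.0, 3.0])
expected = [[3.0, 2.0], [1.0, 1.0]]
ok = all(
    abs(jac[i][j] - expected[i][j]) < 1e-5
    for i in range(2)
    for j in range(2)
)
```

The step size and tolerance interact with floating-point precision, which is presumably why the commit adjusts them separately for float32 and float64.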
-
Committed by zhulei
* [npu] add box coder
* [npu] add box coder
-
Committed by pangyoki
-
Committed by zhulei
* [NPU] Add group norm
* [NPU] Add group norm
* [NPU] Add group norm
* [NPU] Add group norm
* [NPU] Add group_norm op
-
Committed by Aganlengzi
* merge conflict of paddle_gtest_main.cc
* modify FLAGS_npu_precision_mode and default not to call aclSetCompileopt
-
Committed by WangXi
-
Committed by hlygit66666
* add op paddle.device.cuda.get_device_name
* fix some bugs
* fix some bugs
* fix error message bugs
* fix en docs
* fix bugs
* fix bugs
* fix bugs
* add error message test case
* add get_device_name and get_device_capability
* fix review
* fix docs bug
* fix docs
* fix docs
-
- 28 Sep, 2021 5 commits
-
-
Committed by Liu-xiandong
Add sparse_attention OPs; the Python API will be added in the next PR
-
Committed by Lijunhui
* Add paddle.linalg.eig op
* remove comments
* remove comments
* extend batch_size to the origin
* add real times complex functor & destroy the backward complex output bug
* terminate output diff when input real tensors
* correct tiny doc errors
* move functions from eig_helper to svd_helper and remove eig_helper
* remove tensor.Resize
* remove no longer used code
* use existing lapack functions
* reply review comments 21/27
* remove .cu as this op is only executed on CPU
* remove const_cast & add const in argument list for read-only references
* fix sample code error in CI
* remove template typename Tbase and more
* remove eig exposure in paddle.*
* add 'name=None' in eig python implementation
* handle the unittest
* try to solve the unittest
* solve CI coverage
* remove no longer used code
* polish API doc and more
* reply review comments
* polish unittest, commit plan B
* polish unittest
-
Committed by xiayanming
* [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid
* [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid
* [HIP] fix op not support AMD GPU bug
* [hybrid] seed and dropout op support force-cpu
* [hybrid] seed and dropout op support force-cpu
* [hybrid] seed and dropout op support force-cpu
* [hybrid] seed and dropout op support force-cpu
* [hybrid] seed and dropout op support force-cpu
* [hybrid] fix seed ci failed issue
* add AsExtra for force_cpu of seed op
-
Committed by zhiboniu
Remove recent linalg APIs from paddle.init; add the 'name' argument to some new linalg API interfaces. Same change as #36112 on the develop branch.
-
Committed by Jiabin Yang
* fix dygraph double grad dtype error when calling for high differential scenario
* reinvoke ci
* add test for partial_engine.cc
-