- 11 Oct, 2021 27 commits
-
-
Committed by Leo Chen
* do not use alignedAllocator when cuda has alignment * update test * fix error during multiple process
-
Committed by danleifeng
* heterps:add fuse_allreduce op; test=develop * add program_mode in minimize for pslib mode;test=develop
-
Committed by jakpiase
-
Committed by Zeng Jinle
* add FLAGS_allreduce_record_one_event * add more comments * fix ut * improve coverage * fix ut, improve coverage
-
Committed by Liu-xiandong
Add paddle.nn.functional.sparse_attention API. This PR mainly wraps the sparse_attention functionality in a Python-level interface; the main OP code is in #PR35676. It also adds unit tests for the wrapped Python interface.
-
Committed by jakpiase
-
Committed by Zhang Zheng
-
Committed by niuliling123
* Add functor_primitives.h for kernel primitive api * update * move namespace kps * subFunctor init_data * delete InvalidArgumentError
-
Committed by Sing_chan
-
Committed by Sing_chan
-
Committed by Yuang Liu
-
Committed by zlsh80826
The sparse tensor core for convolution requires the input channel dimension to be 2:4 structured sparse, so we have to mask the input channel dimension in order to use the sparse tensor core.
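A 2:4 structured-sparse layout means that at most two of every four consecutive values are non-zero. The following is a minimal sketch of such a mask in plain Python, not the code from this commit. The function name `mask_2_of_4` is hypothetical, and keeping the two largest magnitudes in each group is an assumption; the commit message does not say how the kept entries are chosen.

```python
def mask_2_of_4(values):
    """Zero the two smallest-magnitude entries in every group of four.

    Assumption: entries are selected by magnitude. The commit message
    only states that the channel dimension must be 2:4 sparse.
    """
    if len(values) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    masked = []
    for start in range(0, len(values), 4):
        group = values[start:start + 4]
        # Indices of the two largest magnitudes; the sort is stable,
        # so ties keep the earlier index.
        keep = sorted(range(4), key=lambda i: -abs(group[i]))[:2]
        masked.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return masked


channels = [0.1, -0.9, 0.4, 0.05, 2.0, 0.0, -3.0, 1.0]
pruned = mask_2_of_4(channels)
# pruned == [0.0, -0.9, 0.4, 0.0, 2.0, 0.0, -3.0, 0.0]
```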
-
Committed by caozhou
* add reshard module * fix conflict * update reshard module * update and add unittest * update reshard module and unittest * add more unittests
-
Committed by yaoxuefeng
-
Committed by tianshuo78520a
-
Committed by wangxinxin08
* enhance yolobox plugin
-
Committed by Qi Li
* [NPU] fix matmul_v2 and utils.run_check, test=develop * remove debug files, test=develop * fix install_check, test=develop * fix doc, test=develop * fix review comments, test=develop
-
Committed by Qi Li
* [NPU] fix set_value, test=develop * fix typo, test=develop * fix typo, test=develop
-
Committed by Qi Li
-
Committed by Sing_chan
-
Committed by Xiaoxu Chen
-
Committed by Feiyu Chan
fix: `-1` is used when fft's axis is `0`
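The commit message does not say what caused the bug. One common way for an explicit `0` to be replaced by `-1` in Python is a truthiness-based default, because `0` is falsy. The sketch below illustrates that general pitfall only; both function names are hypothetical and this is an assumption about the class of bug, not the actual fft code.

```python
def normalize_axis_buggy(axis=None):
    # `0 or -1` evaluates to -1, so an explicit axis of 0 is silently lost.
    return axis or -1


def normalize_axis_fixed(axis=None):
    # Fall back to the last axis only when no axis was given at all.
    return -1 if axis is None else axis
```

Comparing against `None` rather than relying on truthiness keeps `0` distinguishable from "not specified".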
-
Committed by 李季
-
Committed by wangxinxin08
* add mish trt plugin, compile & install success, run error. test=develop * modify code according to review * add TRT_NOEXCEPT for mish trt plugin * add unittest for mish trt plugin * remove unnecessary check of mish in op_teller.cc * fix some problem of trt8 * add check and modify unittest while converting mish to trt plugin Co-authored-by: dengkaipeng <dengkaipeng@baidu.com>
-
Committed by baoachun
* add skip case in trt converter ut * disable group_norm trt plugin
-
Committed by Huihuang Zheng
Add use_cinn flag and use it to control whether we run PaddlePaddle using CINN. Also add: * Replace PaddlePaddle graph with a CINN graph in a pass * PE Method to feed data and run the graph by CINN
-
Committed by JingZhuangzhuang
-
- 09 Oct, 2021 9 commits
-
-
Committed by Zhang Zheng
-
Committed by Yiqun Liu
-
Committed by Zeng Jinle
* add const OpDesc id() * add const for VarDesc::id()
-
Committed by From00
* Add new API tensordot * Set timeout value 400 for UT; Fix format for EN docs * Set timeout value 1000 for UT; Fix format for EN docs * Remove some input check * Coding style improve: don't compare boolean values to True or False using ==
-
Committed by zhiboniu
-
Committed by zhiboniu
* update fft api path * add sample code for ihfft2 Co-authored-by: chenfeiyu <chenfeiyu@baidu.com>
-
Committed by zhaoyingli
* support ClipGradByGlobalNorm in sharding * support ClipGradByGlobalNorm in sharding * test=allcase
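Global-norm clipping scales every gradient by `clip_norm / max(global_norm, clip_norm)`, where `global_norm` is the L2 norm over all gradients. When gradients are sharded across ranks, each rank holds only part of them, so the local sums of squares have to be combined before the square root is taken. The following is a single-process sketch of that arithmetic, not Paddle's implementation: the function name is hypothetical, and the plain `sum` over shards is an assumed stand-in for an all-reduce.

```python
import math


def clip_by_global_norm(shards, clip_norm):
    """shards: one list of gradient values per (simulated) rank."""
    # Each rank contributes only its local sum of squares ...
    local_sq = [sum(g * g for g in grads) for grads in shards]
    # ... and summing across ranks stands in for an all-reduce here.
    global_norm = math.sqrt(sum(local_sq))
    # Scale is 1.0 when the global norm is already within clip_norm.
    scale = clip_norm / max(global_norm, clip_norm)
    clipped = [[g * scale for g in grads] for grads in shards]
    return clipped, global_norm


shards = [[3.0, 0.0], [0.0, 4.0]]
clipped, norm = clip_by_global_norm(shards, clip_norm=1.0)
```

Clipping each shard by its own local norm instead would give a different scale on every rank, which is why the squared norms are combined first.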
-
Committed by wuhuanzhou
Support registering GeneratePass from C++, simplifying development for subgraph optimization scenarios such as fusion.
-
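Committed by wuhuanzhou
For arguments that do not meet the conditions after __getattr__ is overridden, always raise AttributeError, so the behavior matches the non-overridden version.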
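Raising `AttributeError` from an overridden `__getattr__` matters because `hasattr` and three-argument `getattr` only treat that exception as "attribute missing". The following is a minimal sketch of the pattern using a hypothetical `Wrapper` class; it is not the class changed by this commit.

```python
class Wrapper:
    """Forwards only whitelisted names to a wrapped object."""

    _forwarded = {"upper", "lower"}

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # __getattr__ runs only after normal attribute lookup has failed.
        if name in self._forwarded:
            return getattr(self._target, name)
        # Same exception type as the default lookup, so hasattr() and
        # getattr(obj, name, default) keep working as usual.
        raise AttributeError(
            f"{type(self).__name__!r} object has no attribute {name!r}")


w = Wrapper("abc")
```

With this in place, `hasattr(w, "missing")` returns `False` instead of propagating some other exception type.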
-
- 08 Oct, 2021 4 commits
-
-
Committed by jakpiase
* fix for conv op * Minor change
-
Committed by Zeng Jinle
* support CUDA Graph on PE * add ut, fix CI compile * reduce memory consumption * fix CUDA 10 CI * improve coverage * improve python coverage
-
Committed by yaoxuefeng
-
Committed by Qi Li
* [NPU] support NCL and NCL for BatchNorm, test=develop * [NPU] remove debug files, test=develop * update, test=develop
-