- 26 Jan, 2022 11 commits
-
-
Committed by yaozhixin
-
Committed by zyfncg
-
Committed by 石晓伟
-
Committed by Li Min
* Optimize layer_norm fwd when cols is 1024.
-
Committed by yaozhixin
-
Committed by houj04
* add sigmoid cross entropy with logits to kl2. test=kunlun
* add sigmoid cross entropy with logits to kl2. test=kunlun
* follow comments. test=kunlun
-
Committed by baoachun
* support npu weight unified H2D copy
* remove redundant variable
-
Committed by houj04
* fix gradient accumulator bug. test=kunlun
* fix typo. test=kunlun
* fix typo. test=kunlun
* fix unit tests. test=kunlun
* using TensorCopySync. test=kunlun
* only fix for xpu place. test=kunlun
-
Committed by Yuang Liu
-
Committed by joeqiao12
-
Committed by Chen Weihang
* infermeta context init design
* support infermeta called in fluid op
* add hasattr and attr methods
* add dygraph GetVarPtrs support
* rename arg_map_context to arg_map_utils
* add registry for arg map func
* resolve conflict
* refactor op utils design
* polish meta config
* fix details
* remove hasattr method
* resolve conflict
* revert cmake order change
* revert some change
* change init pos
* fix compile failed
* fix typo
* fix inference failed
* fix windows compile failed
* polish format

Co-authored-by: NWang Huan <wanghuan29@baidu.com>
-
- 25 Jan, 2022 29 commits
-
-
Committed by yaoxuefeng
-
Committed by zyfncg
-
Committed by YuanRisheng
-
Committed by hlygit66666
* add fuse_relu_depthwise_conv_pass unittest
* fix atol and rtol
* fix according to review
* Add fuse_bn_act_pass unittest
* rm others
* add fuse_bn_act_pass
-
Committed by limingshu
* first commit
* add more changes
-
Committed by chenjian
* add trace event data structure definition
* convert enum item to string for cupti enum explanation
* modify paddle_enforce_eq description
-
Committed by Zhang Jun
* [inference] update convert reduce op&ut,test=develop
* update
* update
* update
* add int32 support
* add int32 support
* add comments
* trt < 7.0 do not support int32
* test=develop
* update
* test=develop
-
Committed by joeqiao12
* [MLU]add mlu kernel for fill_constant op
* delete device_context DEPS
-
Committed by niuliling123
This reverts commit 9059ef69.
-
Committed by 石晓伟
-
Committed by feng_shuai
-
Committed by sneaxiy
* assert _compile_dir include file existence
* polish
-
Committed by fwenguang
-
Committed by joeqiao12
* [MLU]add mlu kernel for concat and split op
* delete device_context DEPS
-
Committed by Yuang Liu
-
Committed by Baibaifan
-
Committed by niuliling123
-
Committed by Haohongxiang
* support param groups in grad_clip
* update
* modify for review
-
Committed by kuizhiqing
-
Committed by TTerror
-
Committed by Lijunhui
* init commit
* remove comments
* remove nchw branch
* optimize code
* apply fast div mod in 1D kernel, rm 3D kernel
* move init of FastDivMode to CPU
* 3D kernel for nchw, FastDiv for 1D kernel
* debug done. process boundary
* 2^n
* optimize
* optimize
* change code & optimize code
-
Committed by Weilong Wu
* Added selected_rows and rw_lock to pten
* Renamed the unit test target to fix CI
* Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid
* Remove rw_lock.h,rw_lock_test.cc in fluid
* Use pten::RWLock and pten::AutoRDLock, fix CI
* Use pten::SelectedRows
* Use pten::SelectedRows
* Fix to pass NPU CI
* Use pten::SelectedRows, to pass NPU CI
* To fix NPU CI
* To fix NPU CI again
-
Committed by Noel
-
Committed by Wilber
-
Committed by From00
-
Committed by caozhou
* update reshard for newest completion
* update unittest
* merge newest
-
Committed by zyfncg
-
Committed by Zhanlue Yang
-
Committed by xiongkun
* transfer: string tinyformat errors and part of enforce into pten
* remove comment
* fix by code review
* assert is not compiled in -DNDEBUG
* add string as dependencies of paddle_inference
-