- 26 Jan, 2022 4 commits
-
-
Committed by Li Min
* Optimize layer_norm fwd when cols is 1024.
-
Committed by houj04
* add sigmoid cross entropy with logits to kl2. test=kunlun
* add sigmoid cross entropy with logits to kl2. test=kunlun
* follow comments. test=kunlun
-
Committed by joeqiao12
-
Committed by Chen Weihang
* infermeta context init design
* support infermeta called in fluid op
* add hasattr and attr methods
* add dygraph GetVarPtrs support
* rename arg_map_context to arg_map_utils
* add registry for arg map func
* resolve conflict
* refactor op utils design
* polish meta config
* fix details
* remove hasattr method
* resolve conflict
* revert cmake order change
* revert some change
* change init pos
* fix compile failed
* fix typo
* fix inference failed
* fix windows compile failed
* polish format
Co-authored-by: NWang Huan <wanghuan29@baidu.com>
-
- 25 Jan, 2022 12 commits
-
-
Committed by yaoxuefeng
-
Committed by YuanRisheng
-
Committed by limingshu
* first commit
* add more changes
-
Committed by Zhang Jun
* [inference] update convert reduce op&ut,test=develop
* update
* update
* update
* add int32 support
* add int32 support
* add comments
* trt < 7.0 do not support int32
* test=develop
* update
* test=develop
-
Committed by joeqiao12
* [MLU]add mlu kernel for fill_constant op
* delete device_context DEPS
-
Committed by niuliling123
This reverts commit 9059ef69.
-
Committed by joeqiao12
* [MLU]add mlu kernel for concat and split op
* delete device_context DEPS
-
Committed by niuliling123
-
Committed by Lijunhui
* init commit
* remove comments
* remove nchw branch
* optimize code
* apply fast div mod in 1D kernel, rm 3D kernel
* move init of FastDivMode to CPU
* 3D kernel for nchw, FastDiv for 1D kernel
* debug done. process boundary
* 2^n
* optimize
* optimize
* change code & optimize code
-
Committed by Weilong Wu
* Added selected_rows and rw_lock to pten
* Renamed the unit test target to fix CI
* Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid
* Remove rw_lock.h,rw_lock_test.cc in fluid
* Use pten::RWLock and pten::AutoRDLock, fix CI
* Use pten::SelectedRows
* Use pten::SelectedRows
* Fix to pass NPU CI
* Use pten::SelectedRows, to pass NPU CI
* To fix NPU CI
* To fix NPU CI again
-
Committed by Noel
-
Committed by Wilber
-
- 24 Jan, 2022 7 commits
-
-
Committed by chentianyu03
* add scale xpu kernel
* add scale xpu kernel
* add scale xpu kernel
* replace with pten scale kernel
* change dev_ctx
* modify float16 head path
* remove unused xpu header
-
Committed by YuanRisheng
[Pten]Refactor elementwise_add grad / double grad / triple grad Kernel and move them to pten (#39048)
* refactor elementwise add grad
* fix compile bugs
* fix unit test bugs
* fix file conflicts
* fix bugs when buildPtenContext
-
Committed by Jacek Czaja
* - more unlikely
* - compilation fix
* - removed redundant definition
* - fix
* - Fixes
* - compilation fix for windows
-
Committed by Feiyu Chan
* migration of functors in paddle/fluid/operators/eigen and paddle/fluid/platform/eigen_ext.h
* update path of data types like float16.h in includes in extensions.h
-
Committed by Zhang Ting
-
Committed by Wilber
* move dynload from fluid to pten.
* fix ci compile
* fix windows ci compile.
* update
* update
* fix compile error
-
Committed by z8hanghuan
* support sparse of adam, *test=kunlun
* add pre-commit-config.yaml
* support sparse of adam in KL2,*test=kunlun
* support sparse of adam in KL2, *test=kunlun
* modify xpu.cmake, *test=kunlun
* support sparse of adam, rm some wait, *test=kunlun
* support sparse of adam, rm some wait, *test=kunlun
* support sparse of adam, *test=kunlun
* support sparse of adam, *test=kunlun
* support sparse of adam, *test=kunlun
* support sparse of adam, *test=kunlun
* support sparse of adam, *test=kunlun
-
- 21 Jan, 2022 12 commits
-
-
Committed by chentianyu03
* fix test concat dev api build failed
* fix conflict
* fix conflict
-
Committed by YuanRisheng
* add kernel for c++ api
* fix compile bugs
* fix kunlun compile bugs
* perfect cmake
* fix compile bugs when run ci-inference
* fix compile bugs
* add non-raw kernel for fluid op
* fix compile bugs
* fix compile bugs
* fix unit test bug
-
Committed by chentianyu03
-
Committed by Weilong Wu
-
Committed by Zhang Ting
-
Committed by TeslaZhao
Keep strided_slice op behavior consistent with slice op when starts input is less than -rank (#39066)
-
Committed by fwenguang
* [MLU]add mlu ci dockerfile
* fix comment
* add cncl
-
Committed by Aurelius84
* Migrate Dim and DDim from paddle::framework into pten namespace
* fix paddle::framework::Array
* fix framework::Array
-
Committed by ronnywang
-
Committed by FlyingQianMM
* add block and grid loop for index_sample kernel to deal with a large-shape tensor
* fix code format
* limit grid dim
-
Committed by fwenguang
-
Committed by Wilber
* add cpu_context.
* update
* update
* update
* update
* update
* fix ci problem
* fix npu ci problem
* update
* fix ci compile
-
- 20 Jan, 2022 5 commits
-
-
Committed by fwenguang
-
Committed by fwenguang
-
Committed by Aurelius84
* Migrate bfloat16/float16/complex from platform into pten::common
* fix typo
* fix code style
-
Committed by yaoxuefeng
-
Committed by zhangbo9674
* fix mp
* support merged_momentum for mp
-