- 26 Jan, 2022 1 commit
-
-
Committed by houj04
* add sigmoid cross entropy with logits to kl2. test=kunlun * add sigmoid cross entropy with logits to kl2. test=kunlun * follow comments. test=kunlun
-
- 25 Jan, 2022 3 commits
-
-
Committed by joeqiao12
* [MLU]add mlu kernel for concat and split op * delete device_context DEPS
-
Committed by Lijunhui
* init commit * remove comments * remove nchw branch * optimize code * apply fast div mod in 1D kernel, rm 3D kernel * move init of FastDivMode to CPU * 3D kernel for nchw, FastDiv for 1D kernel * debug done. process boundary * 2^n * optimize * optimize * change code & optimize code
-
Committed by Wilber
-
- 24 Jan, 2022 1 commit
-
-
Committed by Wilber
* move dynload from fluid to pten. * fix ci compile * fix windows ci compile. * update * update * fix compile error
-
- 21 Jan, 2022 3 commits
- 20 Jan, 2022 1 commit
-
-
Committed by Aurelius84
* Migrate bfloat16/float16/complex from platform into pten::common * fix typo * fix code style
-
- 19 Jan, 2022 1 commit
-
-
Committed by zhangyikun02
-
- 18 Jan, 2022 1 commit
-
-
Committed by Zhanlue Yang
* Merged LoDTensor with Tensor,test=allcases * Patched python level LoDTensor * Patched python level LoDTensor * Merge Tensor into DenseTensor * Fixed namespace issues,test=allcases * Fixed merge issues * Fixed inference issues * Fixed NPU test issues * Fixed merge issues
-
- 17 Jan, 2022 3 commits
-
-
Committed by Allen Guo
Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai> Co-authored-by: Allen Guo <alleng@graphcore.ai> Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai> Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai> Co-authored-by: Han Zhao <hanzhao@graphcore.ai> Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai> Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai> Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai> Co-authored-by: Han Zhao <hanzhao@graphcore.ai>
-
Committed by Allen Guo
* update ipu_backend * sync with paddle internal Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai> Co-authored-by: Allen Guo <alleng@graphcore.ai> Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai> Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai> Co-authored-by: Han Zhao <hanzhao@graphcore.ai> * apply comments 01 * update error message * restore ipu_executor and ipu_optimizer * add clang-format on Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai> Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai> Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai> Co-authored-by: Han Zhao <hanzhao@graphcore.ai>
-
Committed by Wilber
* add pten::Place data structure. * update ci problem * fix ci problem * update * using platform::Place=pten::Place * remove BOOST_GET_CONST for CPUPlace and GPUPlace * compile pass 25%. * compile pass 45% * compile pass 60% * remove boost_get for xpu npu mlu and ipu * compile pass on cpu and gpu. * fix compile problem * fix compile error. * update * fix ci problem * update * ci approve * fix ci problem * fix ci eager test problem * remove BOOST_GET_CONST * fix npu compile
-
- 14 Jan, 2022 1 commit
-
-
Committed by Zhangjingyu06
* [XPU]add split op for kunlun2,*test=kunlun * [XPU]add split op for kunlun2,*test=kunlun * [XPU]add split op for kunlun,*test=kunlun * [XPU]add stack_grad op for kunlun2,*test=kunlun Co-authored-by: QingshuChen <chenqingshu@baidu.com>
-
- 13 Jan, 2022 1 commit
-
-
Committed by 石晓伟
-
- 12 Jan, 2022 2 commits
-
-
Committed by Allen Guo
* support more ops * Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai> Co-authored-by: Allen Guo <alleng@graphcore.ai> Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai> Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai> Co-authored-by: Han Zhao <hanzhao@graphcore.ai> * add authors Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai> Co-authored-by: Allen Guo <alleng@graphcore.ai> Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai> Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai> Co-authored-by: Han Zhao <hanzhao@graphcore.ai> * update date Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai> Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai> Co-authored-by: Haicheng Jiang <haichengj@graphcore.ai> Co-authored-by: Han Zhao <hanzhao@graphcore.ai>
-
Committed by limingshu
* first commit * fix wrong filename * fix the wrong spell name * fix gpu config wrapper * modify according to pr advices * fix GpuLauchConfig1D api bugs * change the config for dropout grad * fix bugs * modification according to pr advices * modification according to pr advices
-
- 10 Jan, 2022 2 commits
-
-
Committed by taixiurong
-
Committed by Zhanlue Yang
* Added shared_ptr<Allocation> member & corresponding interfaces to Storage * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly * Fixed issues with storage offset * Used place to malloc allocation for TensorStorage * [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor * Fixed issues with place * Added comments * Moved mutable_data with stream argument to DenseTensor * Added set_offset interface * Fixed CI issues,test=allcases * [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor * Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor * Modified framework::Tensor to inherit from DenseTensor * Reverted changes to pten_layout() interface * Removed friend classes * Rearranged cfunction calls from tensor.data<void>() to tensor.data() * Fixed CI issues * Fixed lite issues * Fixed data() interface issues,test=allcases * Resolved IsInitialized() issues * Fixed ResetHolder() issues * Fixed MKLDNN & Storage issues * Resolved ShareBufferWith() issues * Fixed LoD issues
-
- 05 Jan, 2022 1 commit
-
-
Committed by TTerror
* add huber_loss for kunlun * update xpu.cmake * update unitests * update unitests * update elementwise_add * update elementwise_add * update elementwise_add
-
- 04 Jan, 2022 2 commits
- 31 Dec, 2021 1 commit
-
-
Committed by Zhangjingyu06
* [XPU]add split op for kunlun2,*test=kunlun * [XPU]add split op for kunlun2,*test=kunlun * [XPU]add split op for kunlun,*test=kunlun Co-authored-by: QingshuChen <chenqingshu@baidu.com>
-
- 30 Dec, 2021 4 commits
-
-
Committed by houj04
* add sigmoid cross entropy with logits to kl1. test=kunlun * add sigmoid cross entropy with logits to kl1. test=kunlun
-
Committed by zhangyk0314
Add exp, abs_grad, reciprocal, reciprocal_grad operator for XPU and update xpu2_op_list.h,test=kunlun (#38570)
-
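Committed by zhangkaihuo
Bind the cuSparse handle to DeviceContext to avoid creating and destroying it inside ops. Add wrappers for the cuSparse APIs that convert between dense and sparse. Add unit tests for the wrapped APIs.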
-
Committed by Leo Guo
* Fix the bug of batch_norm and batch_norm_grad op. Add the "roi_align" and "roi_align_grad" op in xpu2 op list. * Fix the bug of batch_norm and batch_norm_grad op. Add the "roi_align" and "roi_align_grad" op in xpu2 op list. test=kunlun Co-authored-by: Zibin <guozibin@baidu.com>
-
- 29 Dec, 2021 2 commits
- 28 Dec, 2021 1 commit
-
-
Committed by houj04
* add reduce_prod_xpu. fix reduce_mean_xpu bug. * add reduce_prod_xpu. fix reduce_mean_xpu bug. test=kunlun
-
- 27 Dec, 2021 1 commit
-
-
Committed by sneaxiy
-
- 23 Dec, 2021 1 commit
-
-
Committed by houj04
-
- 20 Dec, 2021 1 commit
-
-
Committed by fwenguang
-
- 17 Dec, 2021 2 commits
-
-
Committed by From00
* Get GPU BasePtr from CUDA allocation * Fix compile error for ROCm * Add BasePtr function for IPUPlace in naive_best_fit_allocator.cc * Add alignment for BuddyAllocator * Set address alignment of BuddyAllocator to 32 bytes * Fix CI error * Remove code for naive_best_fit strategy
-
Committed by houj04
-
- 13 Dec, 2021 1 commit
-
-
Committed by jianghaicheng
-
- 10 Dec, 2021 3 commits
-
-
Committed by sneaxiy
-
Committed by jianghaicheng
-
Committed by jianghaicheng
-