- 02 Nov, 2022 1 commit
-
-
Committed by houj04
* [XPU] add int64 support for slice and subtract. test=kunlun
* try to fix xpu compile. test=kunlun
* try to fix xpu compile. test=kunlun
* try to fix xpu compile. test=kunlun
* remove unnecessary modification. test=kunlun
-
- 01 Nov, 2022 2 commits
-
-
Committed by HongyuJia
* move cudnn hardcode outside GetExpectedKernelType
* add header file
* debug
* update interpreter_util with hardcode
* update interpreter_util headerfile
* solve activation hardcode
* debug with CI
* add mkldnn_op_list header file
* temporarily uncomment mkldnn
* temporarily uncomment mkldnn
* delete sequence_softmax cudnn hardcode
* add hardcode to data_transfer.cc
* update data_transfer headerfile
* try fix segment fault
* update cudnn&miopen_helper
* reset HasAttr of DygraphExctnCtx
* debug, this commit should pass all CI
* debug should pass CI, temporarily disable activation
* debug should pass CI
* fix default_attr=nullptr bug
* clean debug code
-
Committed by Chen Weihang
* add extra attr property set
* add type_info for all context
* add onednn context to all context
* fix context compile error
* simplify conv kernel args
* pass runtime attr into dev_ctx
* fix macro error
* clear conv_grad_kernel extra args
* merge conv_grad_grad into conv_grad
* clear conv2d_grad_grad extra attrs
* clear yaml and eager extra attr
* fix conv1d error
* change to thread local
* fix npu compile failed
* try to fix windows compile failed
* add conv2d onednn phi kernel
* fix ci bugs (#36)
* fix compile bugs (#38)
* fix extra input transform bug (#39)
* support dynamic created attr (#40)
* reset extra info gen code
* rm conv_grad_grad kernel
* reimpl pass attr adapting
* add int attr support
* remove vector inputnames creating
* fix map at error
* Update paddle/phi/kernels/onednn/conv_grad_kernel.cc
  Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
* remove useless extra attrs
* replace mkldnn_engine by onednn_engine
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
-
- 25 Oct, 2022 1 commit
-
-
Committed by HongyuJia
-
- 21 Oct, 2022 1 commit
-
-
Committed by Yuanle Liu
* fix nvprof_nvtx_push interface bug
-
- 19 Oct, 2022 2 commits
-
-
Committed by Yuanle Liu
-
Committed by Leo Chen
* clean unused code: piece.cc/h
* clean usage
-
- 17 Oct, 2022 1 commit
-
-
Committed by YuanRisheng
* namespace modify
* update by comment
-
- 11 Oct, 2022 2 commits
-
-
Committed by Wen Sun
-
Committed by Chen Weihang
* remove using lodtensor part1
* polish history code format
-
- 30 Sep, 2022 3 commits
-
-
Committed by Allen Guo
* paddle-inference support custom-ops
  Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
* fix tolower
  Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
-
Committed by ykkk2333
* migrate sigmoid with cross entropy, and tile xpu kernels to phi, test=kunlun
* migrate add_n kernel to phi, test=kunlun
* fix bugs of tipc, test=kunlun
-
Committed by sneaxiy
* support pure bfloat16
* support bf16 linear
* update PR to pass CI
* tiny fix where_grad_kernel.cu
* add bfloat16 to selu_grad to pass CI
* fix selu grad compilation error
-
- 29 Sep, 2022 1 commit
-
-
Committed by Leo Guo
Add index_select, index_select_grad, reduce_min kernel and their unittests for kunlun. Add registers of index_select, index_select_grad, reduce_min, sqrt, sqrt_grad to xpu2_op_list.test=kunlun. (#46557)
-
- 28 Sep, 2022 1 commit
-
-
Committed by Chen Weihang
* remove needless using tensor
* remove needless using tensor
* resolve conflict
* replace tensor using
* fix format error
* revert needless changing
* fix rocm and npu compile error
* fix cinn compile error
* fix format error
* fix mkldnn format error
* fix mkldnn format error
* fix cinn compile error
* fix cinn compile error
* fix cinn compile error
* resolve conflict
-
- 26 Sep, 2022 1 commit
-
-
Committed by cifar10
-
- 16 Sep, 2022 2 commits
-
-
Committed by sneaxiy
* support int64 non-broadcast
* support broadcast case for int64 index
* fix bug
* support more Arity
* remove some codes
* upgrade patchelf to v0.15.0 to pass CI build
* fix bug
* fix patchelf installation
* add debug flags
* remove useless codes
* fix viterbi_decode and set_value op uts
* remove always enable int64
-
Committed by ronnywang
* [CustomDevice] add custom_device_resource_pool & device_event_custom_device
* update
* update
* update
* update
-
- 09 Sep, 2022 1 commit
-
-
Committed by Chenxiao Niu
-
- 08 Sep, 2022 1 commit
-
-
Committed by taixiurong
* add gemm_epilogue
* xpu-paddlepaddle-40 [Task] fused_gemm_epilogue support test=kunlun
-
- 07 Sep, 2022 1 commit
-
-
Committed by houj04
-
- 05 Sep, 2022 2 commits
- 01 Sep, 2022 2 commits
-
-
Committed by houj04
-
Committed by taixiurong
test=kunlun
-
- 29 Aug, 2022 2 commits
-
-
Committed by Allen Guo
* support depthwise_conv2d ops
  Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
  Co-authored-by: Zhaorui Chen <zhaoruic@graphcore.ai>
* fix duplicate name
  Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai>
  Co-authored-by: Zhaorui Chen <zhaoruic@graphcore.ai>
-
Committed by Allen Guo
-
- 26 Aug, 2022 1 commit
-
-
Committed by houj04
-
- 25 Aug, 2022 1 commit
-
-
Committed by haosicheng
-
- 24 Aug, 2022 1 commit
-
-
Committed by mengqingchun02
* support beam_search operator on xpu. test=kunlun
* support beam_search operator on xpu. test=kunlun
* support beam_search operator on xpu. test=kunlun
* support beam_search operator on xpu. test=kunlun
* support beam_search operator on xpu. test=kunlun
* support fp16 of adam operator in xpu environment. test=kunlun
* support fp16 of adam operator in xpu environment. test=kunlun
* support fp16 of adam operator in xpu environment. test=kunlun
-
- 19 Aug, 2022 3 commits
-
-
Committed by houj04
-
Committed by dongfangshenzhu
* add merged_momentum *test=kunlun
* add merged_momentum *test=kunlun
* add fp16 to merged_momentum,*test=kunlun
* change dist_model.cc
* add merged_momentum unittest and change momentum,test=kunlun
* add merged_momentum unittest and change momentum,test=kunlun
* add merged_momentum unittest and change momentum,test=kunlun
* add merged_momentum unittest and change momentum,test=kunlun
-
Committed by mengqingchun02
* support beam_search operator on xpu. test=kunlun
* support beam_search operator on xpu. test=kunlun
* support beam_search operator on xpu. test=kunlun
* support beam_search operator on xpu. test=kunlun
* support beam_search operator on xpu. test=kunlun
* support beam_search operator on xpu. test=kunlun
* support beam_search operator on xpu. test=kunlun
* fix beam_search operator bugs on xpu. test=kunlun
* fix beam_search operator bugs on xpu. test=kunlun
* fix beam_search operator bugs on xpu. test=kunlun
* fix beam_search operator bugs on xpu. test=kunlun
* support beam_search_decode operator on xpu. test=kunlun
* support beam_search_decode operator on xpu. test=kunlun
* support beam_search_decode operator on xpu. test=kunlun
* support beam_search_decode operator on xpu. test=kunlun
* support beam_search_decode operator on xpu. test=kunlun
* support beam_search_decode operator on xpu. test=kunlun
* support beam_search_decode operator on xpu. test=kunlun
* support beam_search_decode operator on xpu. test=kunlun
* support beam_search_decode operator on xpu. test=kunlun
* support beam_search_decode operator on xpu. test=kunlun
* support beam_search_decode operator on xpu. test=kunlun
* support beam_search_decode operator on xpu. test=kunlun
* support beam_search_decode operator on xpu. test=kunlun
* support beam_search_decode operator on xpu. test=kunlun
* support beam_search_decode operator on xpu. test=kunlun
-
- 18 Aug, 2022 1 commit
-
-
Committed by zhangxiaoci
* change to async mode for xpu multi-card training in static graph mode
* minor bugfix
* irrelevant. move to another pr
* move change to other pr
* fix stream issue
* fix 'stream not meet with current context' error
* fix branch diverge, test=kunlun
-
- 17 Aug, 2022 1 commit
-
-
Committed by ykkk2333
* xpu unittest grad compute supports more types, *test=kunlun
* add instance norm xpu, *test=kunlun
-
- 16 Aug, 2022 1 commit
-
-
Committed by houj04
-
- 15 Aug, 2022 2 commits
-
-
Committed by zhangyikun02
-
Committed by houj04
* [XPU] add some collective ops. test=kunlun
* use XPUOpTestWrapper. test=kunlun
* skip kl1 for collective ops. fix typo: deivce -> device. test=kunlun
-
- 12 Aug, 2022 2 commits
-
-
Committed by Allen Guo
-
Committed by Siming Dai
* add init file
* add op definition and infermeta
* add kernel definition funcs
* add broadcast infer shape
* add gpu forward kernel
* delete SUB and DIV
* add x_grad
* add template
* add e_grad for min and max
* fix small bug
* temp commit
* temp commit
* add e_grad for sum and mean
* fix some compile bug
* fix compile bugs
* fix compile problem
* add sum forward unittest
* fix broadcast error, add kernel sig, register e_grad, change unit test
* fix grad
* add temp grad fix
* temp commit
* add min max unittest
* add max, min unittest, fix mul bug
* add cpu forward sum and mean
* add forward min max, fix mean unittest
* add cpu backward min max
* fix code-style
* add backward sum mean
* fix rocm ci
* set unittest timeout
* fix bug of x broadcast to e, gpu grad
* fix bug of x broadcast to e, cpu grad
* rename BOOST_GET_CONST macro
* fix rocm ci
* mv graph_send_e_recv to graph_send_ue_recv
* move out_size to IntArray
* add eager op test
* fix max pool type bug, add unittest for api
* revise api doc
* add fp16 for atomic min and max, add unittest
* add unittest
* add fp16 support for graph_send_recv
* fix unittest fp16 bug
* change OutSizeTensor to Out_size
* move E to Y
* add copyright, fix comment
* review code
* fix thread block size
* fix thread block size
* change api attribute name: pool_type to reduce_op, compute_type to message_op
* change api attribute name, move pool_type to reduce_op, move compute_type to message_op
-