- 04 Mar, 2022 11 commits
-
-
Committed by chentianyu03
* move reduce gpu impl funcs into pten/kernels/funcs
* change reduce header name and namespace
* fix spell word error
* change mutable_data to dev_ctx.Alloc
* modify place to devcontext
* format code style
* fix build error
* fix build error
* fix conflict
-
Committed by 王明冬
-
Committed by xiongkun
-
Committed by chenjian
-
Committed by Zhanlue Yang
* [Eager][Yaml]Supported Scalar and ScalarArray for AutoCodeGen
* Generate forward-only operators
* [Yaml]Support parsing fwd & bwd returns with name
* Fixed issues
* Fixed minor issues
-
Committed by tianshuo78520a
-
Committed by Chen Weihang
* change input vec tensor to pointer
* update input between
* fix format error
* resolve conflict
* resolve conflict
-
Committed by Zhanlue Yang
-
Committed by Jiabin Yang
-
Committed by hong
-
Committed by hong
* move conv to pten
* move conv to pten; test=develop
* fix bug;
* add conv cudnn impl; test=develop
* update
* update operator; test=develop
* fix bug; test=develop
* move operator and prepared_operator to develop; test=develop
* resolve conflict; test=develop
* remove useless code;test=develop
* add dependency ; test=develop
* fix bug;
* add sig.cc ; test=develop
* fix use_op error; test=develop
* fix bug; test=develop
* fix bug; test=develop
* add conv3d register; test=develop
* fix star gan and conv_nn_grad test failed; test=develop
* add header; test=develop
* manually recover to develop;
* resolve conflict; test=develop
* remove useless code
* fix bug;
* remove conv2d_cudnn; test=develop
* fix bugs; test=develop
* fix cpu rocm compile bugs; test=develop
* fix blas error; test=develop
* fix compile bug; test=develop
* fix windows compile error; test=develop
* fix windows error; test=develop
* resolve conflict; test=develop
-
- 03 Mar, 2022 29 commits
-
-
Committed by YuanRisheng
-
Committed by 梦柳
-
Committed by 0x45f
-
Committed by TeFeng Chen
* switch to PE execution in cinn launch
* fix outer variables erased
* skip the map bug temporarily for test
* temporary solution for batch_norm bug
* update comment
* fix compile error
* cinn_instruction_run_op_test: update code to skip external alloc/free instructions generated
-
Committed by JingZhuangzhuang
-
Committed by 石晓伟
* mlir attr types for infrt place, test=develop
* fix a bug, test=develop
-
Committed by From00
* Move compare OPs to phi
* Fix bug
* Use BroadcastKernel and ElementwiseKernel in phi
-
Committed by From00
* Support cuda graph in StreamSafeCudaAllocator
* Fix CI error
* Arrange AllocatorFacade
* Fix CI error
* Fix CI error
* Fix ROCM Compile error
* Fix ROCM Compile error
-
Committed by Zhanlue Yang
-
Committed by ronnywang
-
Committed by wangxinxin08
* modify infershape of multiclass nms
-
Committed by Sing_chan
-
Committed by YuanRisheng
* delete elementwise_sub kernel registry
* fix compile bugs in xpu ci
* fix bugs when run inference ci
-
Committed by wenbin
* emb fix
* fix trt6 compile
* fix half
* absolute error fix
-
Committed by huangxu96
* Modified sigmoid by elementwise interface.
* using TensorReduceImpl to replace Sum function
* using reduceimpl to calculate the norm variable
* Removed useless code
-
Committed by Li Min
* add support of int16 for gather op.
* Recover formats.
* Recover formats.
* fix.
* Fix format.
* Fix format.
-
Committed by xiongkun
* add pad forward
* fix error
* transfer pad and pass the test_pad_op
-
Committed by lilong12
-
Committed by chentianyu03
-
Committed by zyfncg
* support sparse api in yaml
* support auto-gen code of sparse api
* do some refactor
* add unittest test_sparse_conv_api
* add unittest file

Co-authored-by: zkh2016 <zhangkaihuo@baidu.com>
-
Committed by liutiexing
* add align for WorkQueue
* add spinlock
* merge develop
* merge
* Add EventsWaiter
* Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.
* Set thread name for WorkQueue
* Add thread names
* fix ut

Co-authored-by: liutiexing <liutiexing@google.com>
-
Committed by crystal
-
Committed by furnace
[Phi] move gaussian_random kernel
-
Committed by Baibaifan
-
Committed by zhangxiaoci
-
Committed by Jiabin Yang
* eager, test=develop
* fix bug, test=develop
* eager, test=develop
* merge legacy to fluid
* eager, test=develop
* eager, test=develop
* Refactor TensorAdd func by template and remove gradient_accumulation in eager
* Remove needless target name
* eager, test=develop
* eager, test=develop
* Use overload instead of template
* Remove legacy code
* Remove legacy code
* selectedrows, test=develop
* Remove DataType test
* eager, test=develop
* eager, test=develop
* support gan, test=develop
* Using Tensor directly instead of using EagerTensor
* support gradient_accumulation
* make test_imperative_lod_tensor_to_selected_rows longer
* make test_imperative_lod_tensor_to_selected_rows longer
* refine code
* ptb, test=develop
* Rename all EagerTensor to Tensor
* Rename some EagerTensor to Tensor
* rename EagerTensor to EagerVariable
* eager, test=develop
* eager, test=develop
* eager, test=develop
* eager, test=develop
* add more test
* eager, test=develop
* Support copiable selected rows and merge develop
* save load, eager, test=develop
* save load, eager, test=develop
* refine, test=develop
* remove useless _set_value method
* refine, test=develop
* refine, test=develop
* revert static_runner, test=develop
* EagerTensor to Tensor, test=develop
* refine, test=develop
* refine, test=develop
* clear grad, test=develop
* merge, develop
* merge, develop
* merge, test=develop
* merge, test=develop
* Support quant and part of slice
* support legacy static save
* extend slim tests time
* remove imperative on inference
* remove imperative on inference
* merge develop
* fix typo
* fix typo
* split slice related code into 2 part for imperative and eager
* split slice from inference
* split slice from inference
* fix test_tensor_register_hook

Co-authored-by: Wang Huan <wanghuan29@baidu.com>
Co-authored-by: Weilong Wu <veyron_wu@163.com>
Co-authored-by: wanghuancoder <wanghuancoder@163.com>
-
Committed by zyfncg
-
Committed by niuliling123
1. set xpu2 block_size = 64
2. fix a bug when reduce_num is too large
-
Committed by zhangkaihuo
* sparse conv3d: gpu code
-