- 04 Jan, 2022: 4 commits
-
-
Committed by niuliling123
Add OpFunctor and replace cast, scale, clip, bce_loss and abs_grad with elementwise_no_broadcast (#38500)
-
Committed by YuanRisheng
* change 'math' to 'math_kernel'
* fix compile bugs
* merge develop
* fix compile bugs
* move cpu_impl of elementwise kernel to new directory
-
Committed by Zhanlue Yang
[Unify Tensors PR #3] Port framework::Tensor members & interfaces to pten::DenseTensor, test=allcases (#38473)
* Added shared_ptr<Allocation> member & corresponding interfaces to Storage
* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly
* Fixed issues with storage offset
* Used place to malloc allocation for TensorStorage
* [Unify Tensors PR #3] Ported framework::Tensor interfaces to pten::DenseTensor
* Fixed issues with place
* Added comments
* Moved mutable_data with stream argument to DenseTensor
* Added set_offset interface
* Fixed CI issues, test=allcases
* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor
* Reverted changes to pten_layout() interface
* Removed friend classes
-
Committed by Chen Weihang
* move inner cast api to cast_kernel.h
* resolve conflict
-
- 31 Dec, 2021: 3 commits
-
-
Committed by Chen Weihang
-
Committed by Chen Weihang
* unify data layout
* fix test_transfer_layout error
-
Committed by YuanRisheng
* change 'math' to 'math_kernel'
* fix compile bugs
* merge develop
* fix compile bugs
-
- 30 Dec, 2021: 2 commits
-
-
Committed by sneaxiy
-
Committed by Chen Weihang
* remove offset in storage
* revert api change
* fix custom op slice bug
* fix mutable_data error
-
- 29 Dec, 2021: 3 commits
-
-
Committed by Chen Weihang
-
Committed by Shang Zhizhou
-
Committed by limingshu
-
- 28 Dec, 2021: 5 commits
-
-
Committed by limingshu
* first commit
* pass ctest of elementwise_div_grad
-
Committed by Jiabin Yang
* Rearranged Eager AutoCodeGen directory structure
* Removed USE_OP in Eager AutoCodeGen
* Enabled generation for Operators without Grad/Inputs/Outputs
* Resolved operators without input
* Fixed merge conflicts
* Enabled Eager AutoCodeGen for 10+ more operators
* Refactored Eager AutoCodeGen with more organized helper objects
* Enabled Eager AutoCodeGen for operators with multiple OpBases
* Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument
* Handled Dispensable Inputs/Outputs in Eager AutoCodeGen
* Adjusted function generation/call between Python-C API & Dygraph API
* Synchronized auto-generated Python-C API with Dygraph Forward Functions
* support more eager tensor api
* fix merge compile error
* fix compile error and fit develop code
* support pure CPU
* fix some logic error in eager_mode
* support _varbase_creator in eager mode
* Added safe_initialized interface to EagerTensor for use in processing dispensable inputs
* for eager mode
* refine
* support multiple constructor for eager tensor
* add place related code
* polish code
* specific randint with dtype of int64
* Support pure cpu test
* eager logic
* refine test in pure cpu
* eager logic
* eager logic
* eager logic, test=develop
* skip core.eager when in inference, test=develop
* refine, test=develop
* refine, test=develop
* call RetainGrad after run forward kernel, test=develop
* refine, test=develop
* support dygraph util, meta, guard test
* support inference test
* refine test and fix initializer failed
* support create varbase and fix retain grad error
* fix windows error
* support test code coverage
* support test code coverage
* support test code coverage

Co-authored-by: jim19930609 <jim19930609@gmail.com>
Co-authored-by: Wang Huan <wanghuan29@baidu.com>
-
Committed by zyfncg
* refactor matmul directory in pten
* fix merge conflict
-
Committed by chentianyu03
* remove intype arg in cast kernel
* modify conj config in api.yaml by dictionary order
* rm unused code in cast_kernel.cu
-
Committed by Zhanlue Yang
-
- 27 Dec, 2021: 4 commits
-
-
Committed by YuanRisheng
* move reshape
* fix compile bugs
* delete manipulation file
* fix compile bugs
-
Committed by limingshu
* No harm to KP
* Pass the compile stage
* change the WriteData function
* fix template bugs and pass ctest of current elementwise
* for passing partial template specialization of template function in CI-ROCm
* To make 'WriteData' function flexible.
* a less harmful way to support multi-output
* a less harmful way to support multi-output
-
Committed by Chen Weihang
-
Committed by Chen Weihang
* rename to api to copy_to
* revert needless change
* polish format
-
- 26 Dec, 2021: 2 commits
-
-
Committed by Chen Weihang
* add register general kernel macro
* move copy kernel impl
* revert needless change
* polish details
* fix xpu compile failed
* fix xpu compile failed
* polish format
-
Committed by Zhanlue Yang
* Replaced pten::LoD with paddle::framework::LoD
* Overrode CPUVector with CUDAVector
* Refactored paddle::framework::Vector
-
- 24 Dec, 2021: 4 commits
-
-
Committed by Chen Weihang
-
Committed by chentianyu03
* combine reduce_cuda codes
* support float16 in pten reduce_mean
* replace ReduceCudaKernel impl with pten reduce impl
* mv reduce funcs into reduce_cuda_impl
* rm unused codes and headers
* mv GetReduceDim into reduce_cuda_impl
* recover GetReduceDim in reduce_op.h
* add new dispatch macro
* fix pool op output not inited and cause transform to pten::denseTensor error
* fix output tensor not initialized error
* rename new dispatch macro and format code style
* rm reduce_functor_op.h file
-
Committed by Zhanlue Yang
[Unify Tensors PR #1] Replaced pten::Allocation with shared_ptr<memory::Allocation> for Storage (#38301)
* Added shared_ptr<Allocation> member & corresponding interfaces to Storage
* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly
* Fixed issues with storage offset
* Used place to malloc allocation for TensorStorage
-
Committed by Chen Weihang
-
- 23 Dec, 2021: 5 commits
-
-
Committed by Chen Weihang
-
Committed by zyfncg
* add empty and empty_like kernel in pten
* add empty dev_api
-
Committed by Chen Weihang
-
Committed by Chen Weihang
* move dot kernel impl
* remove needless cmake items
-
Committed by 石晓伟
* updates the pten allocation, test=develop
* avoids an error message, test=develop
-
- 22 Dec, 2021: 5 commits
-
-
Committed by Chen Weihang
* change functions to funcs
* remove useless code
-
Committed by Chen Weihang
* add pten kernel cmake
* add pten kernel cmake function
* fix compile error
* add enforce include for full kernel
* fix compile failed
* change cuda to gpu
* fix cmake function error
-
Committed by Chen Weihang
-
Committed by YuanRisheng
* move flatten
* fix bugs of test
* modify header file
* add copy declare
* fix compile bugs
-
Committed by zyfncg
* rename full infer_meta
* fix merge problem
-
- 21 Dec, 2021: 3 commits
-
-
Committed by Chen Weihang
* rename cuda to gpu
* revert CMake change
* resolve conflict
* rename other cuda to gpu
* polish details
-
Committed by chentianyu03
* fix when out_dtype is same with x.dtype and still transform type error
* fix spell error
-
Committed by Chen Weihang
* remove eigen and blas dir
* fix declare error
-