- 24 12月, 2021 3 次提交
-
-
由 chentianyu03 提交于
* combine reduce_cuda codes * support float16 in pten redcue_mean * replace ReduceCudaKernel impl with pten reduce impl * mv reduce funcs into reduce_cuda_impl * rm unsed codes and headers * mv GetReduceDim into reduce_cuda_impl * recover GetReduceDim in reduce_op.h * add new dispatch macro * fix pool op output not inited and cause transform to pten::denseTensor error * fix output tensor not initialized error * rename new dispatch macro and format code style * rm reduce_functor_op.h file
-
由 Zhanlue Yang 提交于
[Unify Tensors PR #1] Replaced pten::Allocation with shared_ptr<memory::Allocation> for Storage (#38301) * Added shared_ptr<Allocation> member & corresponding interfaces to Storage * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly * Fixed issues with storage offset * Used place to malloc allocation for TensorStorage
-
由 Chen Weihang 提交于
-
- 23 12月, 2021 5 次提交
-
-
由 Chen Weihang 提交于
-
由 zyfncg 提交于
* add empty and empty_like kernel in pten * add empty dev_api
-
由 Chen Weihang 提交于
-
由 Chen Weihang 提交于
* move dot kernel impl * remove needless cmake items
-
由 石晓伟 提交于
* updates the pten allocation, test=develop * avoids an error message, test=develop
-
- 22 12月, 2021 5 次提交
-
-
由 Chen Weihang 提交于
* change functions to funcs * remove useless code
-
由 Chen Weihang 提交于
* add pten kernel cmake * add pten kernel cmake function * fix compile error * add enforce include for full kernel * fix compile failed * change cuda to gpu * fix cmake function error
-
由 Chen Weihang 提交于
-
由 YuanRisheng 提交于
* move flatten * fix bugs of test * modify header file * add copy declare * fix compile bugs
-
由 zyfncg 提交于
* rename full infer_meta * fix merge problem
-
- 21 12月, 2021 3 次提交
-
-
由 Chen Weihang 提交于
* rename cuda to gpu * revert CMake change * resolve conflit * rename other cuda to gpu * poish details
-
由 chentianyu03 提交于
* fix when out_dtype is same with x.dtype and still transform type error * fix spell error
-
由 Chen Weihang 提交于
* remove eigen and blas dir * fix declare error
-
- 20 12月, 2021 3 次提交
-
-
由 chentianyu03 提交于
* add pten conj kernel * modify conj_kernel file path * add defined cuda macro to cuda/conj_kernel.h
-
由 石晓伟 提交于
-
由 zyfncg 提交于
-
- 17 12月, 2021 5 次提交
-
-
由 Jiabin Yang 提交于
* support more eager tensor api * support multiple constructor for eager tensor * add place related code * polish code * specific randint with dtype of int64 * Support pure cpu test * refine test in pure cpu * refine test in pure cpu
-
由 Chen Weihang 提交于
-
由 chentianyu03 提交于
* modify sum mean args * add GetExpectedPtenKernelArgs for redcue_op * modify kernel args number * modify kernel args number
-
由 Chen Weihang 提交于
-
由 limingshu 提交于
* fix_bugs_for_elementwise_branch_selection * fix merge_dims bugs * fix all influenced file
-
- 16 12月, 2021 4 次提交
-
-
由 Chen Weihang 提交于
* unify device context entrance * move all_context include to header * polish cmake relay for device_context * fix npu compile failed * fix npu compile failed
-
由 Chen Weihang 提交于
* add register_ctx_kernel and move scale kernel * polish details by reviewer comment * fix xpu compile failed * fix cmake error
-
由 YuanRisheng 提交于
* Reduce reshape kernel functions in pten * delete notes * fix bugs when compile * modify register name * fix compile bugs
-
由 Chen Weihang 提交于
* unify device context entrance * move all_context include to header * polish cmake relay for device_context * fix npu compile failed * fix npu compile failed * revert part of change
-
- 15 12月, 2021 4 次提交
-
-
由 Yiqun Liu 提交于
test=document_fix
-
由 Chen Weihang 提交于
-
由 Chen Weihang 提交于
-
由 Chen Weihang 提交于
-
- 14 12月, 2021 3 次提交
-
-
由 Chen Weihang 提交于
* polish register marco * resolve compile failed * revert needless change * revert eager related change * revert eager related change * change register marco name * polish deetails
-
由 YuanRisheng 提交于
-
由 YuanRisheng 提交于
* Reduce reshape kernel functions in pten * delete notes * fix bugs when compile
-
- 13 12月, 2021 3 次提交
-
-
由 Chen Weihang 提交于
-
由 zyfncg 提交于
* add variadic_args kernel in pten * merge develop code * add variadic_args kernel and benchmark * change dynamic_cast to static_cast for DeviceContext * merge the code * modify code format * refactor variadic kernel function
-
由 Shang Zhizhou 提交于
* fix reduce_max bug * add unittest
-
- 10 12月, 2021 2 次提交
-
-
由 chentianyu03 提交于
-
由 YuanRisheng 提交于
* add alias kernel name * modify code as suggestions * add alias name for matmul and remove redundant member in kernel factory
-