- 11 4月, 2022 1 次提交
-
-
由 Allen Guo 提交于
cherry from #41481
-
- 02 4月, 2022 1 次提交
-
-
由 Chen Weihang 提交于
* fix no pinned trans * fix cond error
-
- 01 4月, 2022 1 次提交
-
-
由 Chen Weihang 提交于
* add cross_entropy_with_softmax phi kernel * remove softmax_with_cross_entropy kernel * add softmax_with_cross_entropy grad kernel * remove original op kernel * refine cross entropy impl * fix pointer error * revert kernel cu change * fix xpu failed * fix cinn failed * fix npu failed * add forward sig * add check_nan_inf for pt kernel * remove repeat cmake item * fix unittest error
-
- 22 3月, 2022 1 次提交
-
-
由 zyfncg 提交于
* optimize performance of C++ API * remove stop_data_transform flag temparily
-
- 03 3月, 2022 1 次提交
-
-
由 ronnywang 提交于
-
- 01 3月, 2022 1 次提交
-
-
由 Chen Weihang 提交于
* support kps backend and compile * resolve conflict * fix kps backend trans * test in xpu2 device * remove dummy kernel
-
- 28 2月, 2022 1 次提交
-
-
由 Chen Weihang 提交于
* rename pten_utils to phi_utils * rename pten_utils target * rename Pten to Phi * replace pten with phi * resolve conflict
-
- 25 2月, 2022 1 次提交
-
-
由 Chen Weihang 提交于
* support cudnn kernel moving * polish cmake rules * add unittest for coverage * remove orig kernel * remove softmax cudnn kernel * fix softmax test failed * fix npu func error * resolve conflict * rename gpu dnn kernels * fix name rule error * fix compile error * update fp16 namespace
-
- 24 2月, 2022 1 次提交
-
-
由 Aurelius84 提交于
* [Phi] Fix XPU OP segmentation Fault problem * fix cast_op_xpu in kunlun1 * fix cast_op_xpu in kunlun1
-
- 20 2月, 2022 1 次提交
-
-
由 Chen Weihang 提交于
* rename pten dir to phi * rename namespace to phi * rename infrt pten dir to phi * resolve conflict * rename pten to phi in cmake * revert all infrt change * change needed files * fix infrt failed * fix inference failed
-
- 18 2月, 2022 1 次提交
-
-
由 ronnywang 提交于
-
- 15 2月, 2022 1 次提交
-
-
由 Aurelius84 提交于
* #1 migrate dist-related type()-> dtype() * move datatype function from pten -> fluid/framework * change type() in imperative into convert(dtype()) * modify xx_tensor->type into xx_tensor->dtype * change the set_type interface and the caller * modify xx_tensor.type into xx_tensor.dtype * fix mutable_data(place, dtype()) * change caller of mutable_data in pten and distributed * change the caller of mutable_data in fluid/framework * change the caller of mutable_data in imperative directory * mutable_data: inference * update the call of mutable_data * transfer MakePenScalarArray MakePtenScalar ResetHolderWithType * pass the compile. the next step is remove VarType in Pten * fix all and remove VarType from pten. success in linux. Next task is other platform * fix conflict with develop * fix compiled error * Fix reset conversion * fix conflict * fix compiled problem * fix typo * Fix << in tensor_utils.cc * fix type->dtype * fix unittest * fix tensor init constructor * fix DataTypeSize for BFloat16 * fix code style * fix npu compiled error * fix npu * compile npu sucessfully * fix conflict * fix conflict Co-authored-by: Nxiongkun <xiongkun03@baidu.com>
-
- 11 2月, 2022 1 次提交
-
-
由 Chen Weihang 提交于
* ermove xxx_info include * fix namespace error * resolve conflict * skip xpu context in registry * fix macro error * resolve conflict * resolve conflict * revert xpu convert * remove trans to fluid place * remove useless headers
-
- 02 2月, 2022 1 次提交
-
-
由 Chen Weihang 提交于
* remove kernel alias name * fix depreacted error * fix deprecated failed * fix mean error * resolve conflict * fix windows failed
-
- 30 1月, 2022 1 次提交
-
-
由 Leo Chen 提交于
* upgrade _get_all_register_op_kernels * add ut * support xpu/npu * fix device id * enhance TransToFluidPlace * fix compile
-
- 29 1月, 2022 1 次提交
-
-
由 Chen Weihang 提交于
* open header for custom kernel * add core utils * tidy core code * tify header * tidy include * tidy namespace * resolve conflit * fix unittest and coverage * remove platform using * resolve conflict * resolve conflict * fix digamma namespace error * fix xpu full kernel error * fix xpu full kernel error * polish details * add place for lib storage
-
- 31 12月, 2021 1 次提交
-
-
由 Chen Weihang 提交于
* unify data layout * fix test_transfer_layout error
-
- 21 12月, 2021 1 次提交
-
-
由 Chen Weihang 提交于
* rename cuda to gpu * revert CMake change * resolve conflit * rename other cuda to gpu * poish details
-
- 08 12月, 2021 1 次提交
-
-
由 YuanRisheng 提交于
* add alias kernel name * modify code as suggestions
-
- 07 12月, 2021 1 次提交
-
-
由 wanghuancoder 提交于
* refine a test case, test=develop * rm python, test=develop * refine, test=develop * fix cmake generate error, and fix circular import, test=develop
-
- 03 12月, 2021 2 次提交
-
-
由 ronnywang 提交于
* refine structure for cuda and rocm * update * update * update * update
-
由 wanghuancoder 提交于
* refine a test case, test=develop * publish python c api for eager, test=develop * revert modify about test_allclose_layer.py, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * delete numpy includes, use pybind11 numpy.h, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * suport eager error msg, and add grad test case, test=develop * refine, test=develop * refine, test=develop
-
- 22 11月, 2021 1 次提交
-
-
由 chentianyu03 提交于
* add cast kernel * add cast cuda kernel * add cast kernel * make cast kernel output dtype undefined * get cast dtype from vardesc * move cast to manipulation and add test case * add castinfershape * avoid reinitilaze variable * InitializeVariable support datatype * merge develop branch * fix merge bug * revert modify initializeVariable * revert modify on InitializeVariable * revert modify on InitializeVariable * mutable support reset dtype * enable make pten tensor from variable when def_arg.type is undefined * fix build pten ctx start_idx error * copy pten out tensor to variable * merge develop branch * fix non pten kernel cast failed * add reset allocation place for remake tensor * fix inplace realloc error * add mutable on pten kernles and remove unused cast files * rename function names * fix output type error * fix conflict with develop branch * set data type to variable with pten's dtype * fix test_cast_api type mismatch * densorTensro mutable_data support 0 bytes value * fix the inplace bug of reshape kernel * fix pten.backend != variable.place when moving storage, palce mismatch bug * fix conflict with develop branch * Fix bug of paddle::experimental::MovesStorage * fix ReMakePtenDenseTensor place mismatch bug * Revert "fix ReMakePtenDenseTensor place mismatch bug" This reverts commit 86336032f60b8a15eacd2c1ff2fa513f5d8dfd1a. * fix ReMakePtenDenseTensor place mismatch bug * reverts the set_lod interface, test=develop * modify by the review options * modify error message * add & for const input arguments * add reference in params * elementwise_sub add mutable_data * fix ResetHolderWithType check size bug * add dependence pten_tensor to test_cast_api object * remove unused code to pass ci coverage Co-authored-by: NChen Weihang <chenweihang@baidu.com> Co-authored-by: NYuanRisheng <yuanrisheng@baidu.com> Co-authored-by: Nshixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
-
- 01 11月, 2021 1 次提交
-
-
由 Chen Weihang 提交于
* initial tensor design & sign kernel demo * add move constructor for meta & add lodtensor * add dirs & sign xpu kernel * add mean cpu&cuda kernel impl * move sign & mean xpu & npu kernel * add selected_rows basic impl * refactor design, BaseTensor to DenseTensor, etc. * add scale mkldnn kernel * polish xpu & npu impl details * fix mkldnn reuse compile failed * change tensor operation lib name * rename util filename * add more comments * change TensorImplInterface to TensorInterface * add kernel key and factory * remove MKLDNNTensorMeta, add MKLDNNDenseTensor * change XXDeviceContext to XXContext * add base kernel registrar utils & test on sign * replace boost::any by paddle::any * fix several ci failed * fix npu compile error * add ordered map util * fix multiple ordered_map compile errors * move dev into include dir * support sign op in static op run * fix static op run error * fix new executor compile failed * add dygraph branch & remove sign_op.h * fix test_infer_no_need_buffer_slots * fix rocm compile link error * fix unitybuild error & clear glog * fix npu compile failed * skip quant trans test * fix part windows compile problem * fix xpu enforce error * fix inference test failed * remove ordered_map to solve quant failed * fix part of rcom compile faild * add more register kernels * revert scale kernel temporarily * fix code format error * add new kernel registrar marco * rename top to tcmpt * revert xpu, npu, mkldnn impl & remove op def * add kernel args parse functor to auto parse args * revert some change & add scale kernels * add op proto in dygraph kernelcontext building * polish kernel dispatch logic & nameing rule * fix scale kernel match error * fix scale test failed * add mean API and unittest * test mean api success * add branch to solve compiled error * skip clang format error * add mean skip rule in op_library * add dot kernel, api and unittest (#6) * remove old kernel and add symbol link * fix dot compiled failed * add merco for module declare * fix npu and xpu compile error * revert sign, mean, scale, dot kernel removing * add comment for keeping old kernel impl * fix mutable_data error * fix bfloat16 conflit * fix inference undef error * adapt to msvc compile rules * polish comment for template inst * add cmake template instantiation for win * fix backend to place device id bug * fix ifdef error * Op2functor (#7) * add kernel args maker class * make args maker non-const * remove debug log * modify codes by review options * split constructPrKernelContext function * fix output name bug * fix test_mean_op test_sign_op failed * fill_any_like kernel refactor (#10) * fill_any_like kernel refactor * remove useless code of full_like c++ api * skip dtype for fill_any_like * add attrs for kernel key constrcut * add use_pt_kernel Flags to control whether to use pt kernel (#13) * add use_pt_kernel Flags to control whether to use pt kernel * change the default value to true for cheking pt kernels * fix mutable_data cuda place error * move high level apis into hapi * remove selectedrows adapting temporarily * Support Scalar in Tensor Compute Library (#14) * fill_any_like kernel refactor * remove useless code of full_like c++ api * Support Scalar in Tensor Compute Library * add scalar in dygraph and static graph mode * keep the basic type for attr, instead of using scalar for all * merge the code * remove mkldnn tensor & polish details * use flat_hash_map and small_vector in kernel factory * Refactor flatten kernel (#12) * refactor flatten kernel * update infershape function * fix compile bugs * fix bugs when merge * fix compiler bugs * fix bugs when run test_flatten_api * fix bugs when run test * Revert "use flat_hash_map and small_vector in kernel factory" This reverts commit 23091495cfdd3df8cc1be592d30f09ea66a7c72b. * Move cpu, cuda and other device code into kernels (#15) * fill_any_like kernel refactor * remove useless code of full_like c++ api * Support Scalar in Tensor Compute Library * add scalar in dygraph and static graph mode * keep the basic type for attr, instead of using scalar for all * merge the code * start refactor matmul * move cpu, cuda and other device modules into kernels * merge code * polish code in operator.cc * Perfect unitests (#16) * perfect unittest * update license * replace with flat_hash_map, small_vector (#19) * fix small_vector build error on windows platform * replace with flat_hash_map, small_vector * remove todo * Perfect unitests (#20) * perfect unittest * update license * fix bug when run tcmpt_utils_test * refactor execution adapting impl * fix insert conflit * Fix CI bug of test_yolov3 (#21) * fill_any_like kernel refactor * remove useless code of full_like c++ api * Support Scalar in Tensor Compute Library * add scalar in dygraph and static graph mode * keep the basic type for attr, instead of using scalar for all * merge the code * start refactor matmul * move cpu, cuda and other device modules into kernels * merge code * polish code in operator.cc * Fix CI bug of test_yolov3 * add the tensor base class, test=develop (#17) * update the tensor base class, test=develop * remove two funcs, test=develop * update the error msg, test=develop Co-authored-by: NChen Weihang <chenweihang@baidu.com> * [no-verify] commit backend and tensor signature changes * Rename tcmpt to pten (#23) * rename tcmpt to pten * update omitted files for rename to pten * update omitted file for rename to pten * remove k of all enum var * remove kernel_instantiate (#26) * remove symbols and spatial_tensor * change common to functions * readd share tensor impl methods * add a candidate dense tensor class, test=develop (#28) * change all Pt to Pten * resolve conflit with xiaowei * Op2functor opt1 (#27) * replace to small vector and change to const & * add std::move Co-authored-by: NChen Weihang <chenweihang@baidu.com> * polish kernel factory and kernel registry * fix operator test error msg mismatch * remove tensor signature and backend set member * move scalar and polish enforce * revert dtype layout change to fix error * fix enum operator override error * add several base unittests * add pten utils tests * polish some details * Dev/op2func refactor 3 (#30) * add a candidate dense tensor class, test=develop * remove TensorBase::backend(), test=develop * remove some ops, test=develop * cherry-pick the pr of tensor meta, test=develop * moves the dense tensor and some ops, test=develop * update the linalg operator, test=develop * update other operators, test=develop * fix errors, test=develop * fix bugs, test=develop * try to resolve the problem of windows ci, test=develop * updates codes, test=develop * fix the tensor_utils.cc, test=develop * modify the dense tensor, test=develop * fix the data type, test=develop Co-authored-by: Nshixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com> * polish some details * polish kernel signature details * fix a bug about offsets of the tensor, test=develop (#31) Co-authored-by: Nshixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com> * polish some details Co-authored-by: Nchentianyu03 <ctychentianyu@gmail.com> Co-authored-by: Nzyfncg <1370305206@qq.com> Co-authored-by: NYuanRisheng <yuanrisheng@baidu.com> Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
-