- 01 Mar, 2022 1 commit
-
-
Committed by sneaxiy
* vectorize lamb kernel * remove flags, add ut * remove useless code * refine code, add param order
-
- 25 Feb, 2022 1 commit
-
-
Committed by sneaxiy
* add multi tensor apply l2 norm * add multi_tensor_apply code * make sizeof(TensorMeta) smaller * move code to distributed_fused_lamb_op.cu * remove useless FLAGS
-
- 22 Feb, 2022 1 commit
-
-
Committed by xiongkun
* change Vector to std::vector and provide MixVector class as a helper wrapper class * solve the multi-gpu hang problem * remove the duplicate template instantiation * Copy vector to cpu * add CopyToCPU * xxx * final version: fix the problem of all reduce * remove mixvector dependence * fix * merge * fix code * fix by CI
-
- 21 Feb, 2022 1 commit
-
-
Committed by sneaxiy
-
- 20 Feb, 2022 1 commit
-
-
Committed by Chen Weihang
* rename pten dir to phi * rename namespace to phi * rename infrt pten dir to phi * resolve conflict * rename pten to phi in cmake * revert all infrt change * change needed files * fix infrt failed * fix inference failed
-
- 19 Feb, 2022 2 commits
-
-
Committed by Aurelius84
* Unify paddle/pten::framework::ddim into pten::ddim * fix paddle namespace * compile successfully * fix npu src file * fix conflict * fix conflict * fix tensorrt compiler error * fix conflict * fix conflict * fix test file conflict * fix conflict * fix mlu file conflict * fix mlu file conflict * fix cinn header file conflict * fix conflict * fix conflict * fix conflict * fix conflict
-
Committed by sneaxiy
* add DistributedFusedLamb op * polish code * fix compile error * compatible with pten change * fix rocm compile error * improve coverage * update upstream/develop * fix cast_with_ptr.h * add FLAGS_distributed_lamb_divide_nranks_when_allreduce=1 * fix clip before allreduce * add use_master_param_norm * code polish * fix bug * fix ROCM ci
-
- 15 Feb, 2022 2 commits
-
-
Committed by Feiyu Chan
Move paddle/fluid/operators/math/algorithm.h to paddle/pten/kernels/funcs and rename all references to symbols in it.
-
Committed by Aurelius84
* #1 migrate dist-related type()-> dtype() * move datatype function from pten -> fluid/framework * change type() in imperative into convert(dtype()) * modify xx_tensor->type into xx_tensor->dtype * change the set_type interface and the caller * modify xx_tensor.type into xx_tensor.dtype * fix mutable_data(place, dtype()) * change caller of mutable_data in pten and distributed * change the caller of mutable_data in fluid/framework * change the caller of mutable_data in imperative directory * mutable_data: inference * update the call of mutable_data * transfer MakePenScalarArray MakePtenScalar ResetHolderWithType * pass the compile. the next step is remove VarType in Pten * fix all and remove VarType from pten. success in linux. Next task is other platform * fix conflict with develop * fix compiled error * Fix reset conversion * fix conflict * fix compiled problem * fix typo * Fix << in tensor_utils.cc * fix type->dtype * fix unittest * fix tensor init constructor * fix DataTypeSize for BFloat16 * fix code style * fix npu compiled error * fix npu * compile npu successfully * fix conflict * fix conflict Co-authored-by: xiongkun <xiongkun03@baidu.com>
-
- 11 Feb, 2022 1 commit
-
-
Committed by Feiyu Chan
* move operators/math/math_function_* to pten/kernels/func * namespace from `paddle::operators::math` to `pten::funcs`
-
- 09 Feb, 2022 2 commits
-
-
Committed by fwenguang
-
Committed by Jiabin Yang
* merge legacy to fluid * Remove legacy code * Remove legacy code * Remove DataType test * Using Tensor directly instead of using EagerTensor * support gradient_accumulation * make test_imperative_lod_tensor_to_selected_rows longer * make test_imperative_lod_tensor_to_selected_rows longer
-
- 07 Feb, 2022 1 commit
-
-
Committed by jakpiase
* Added adam kernel * CI rerun
-
- 27 Jan, 2022 1 commit
-
-
Committed by Feiyu Chan
-
- 25 Jan, 2022 1 commit
-
-
Committed by Weilong Wu
* Added selected_rows and rw_lock to pten * Renamed the unit test target to fix CI * Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid * Remove rw_lock.h,rw_lock_test.cc in fluid * Use pten::RWLock and pten::AutoRDLock, fix CI * Use pten::SelectedRows * Use pten::SelectedRows * Fix to pass NPU CI * Use pten::SelectedRows, to pass NPU CI * To fix NPU CI * To fix NPU CI again
-
- 24 Jan, 2022 2 commits
-
-
Committed by Feiyu Chan
* migration of functors in paddle/fluid/operators/eigen and paddle/fluid/platform/eigen_ext.h * update path of data types like float16.h in includes in extensions.h
-
Committed by z8hanghuan
* support sparse of adam, *test=kunlun * add pre-commit-config.yaml * support sparse of adam in KL2,*test=kunlun * support sparse of adam in KL2, *test=kunlun * modify xpu.cmake, *test=kunlun * support sparse of adam, rm some wait, *test=kunlun * support sparse of adam, rm some wait, *test=kunlun * support sparse of adam, *test=kunlun * support sparse of adam, *test=kunlun * support sparse of adam, *test=kunlun * support sparse of adam, *test=kunlun * support sparse of adam, *test=kunlun
-
- 21 Jan, 2022 1 commit
-
-
Committed by Weilong Wu
-
- 20 Jan, 2022 1 commit
-
-
Committed by zhangbo9674
* fix mp * support merged_momentum for mp
-
- 18 Jan, 2022 1 commit
-
-
Committed by Zhanlue Yang
* Merged LoDTensor with Tensor,test=allcases * Patched python level LoDTensor * Patched python level LoDTensor * Merge Tensor into DenseTensor * Fixed namespace issues,test=allcases * Fixed merge issues * Fixed inference issues * Fixed NPU test issues * Fixed merge issues
-
- 17 Jan, 2022 1 commit
-
-
Committed by sneaxiy
-
- 10 Jan, 2022 1 commit
-
-
Committed by Zhanlue Yang
* Added shared_ptr<Allocation> member & corresponding interfaces to Storage * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly * Fixed issues with storage offset * Used place to malloc allocation for TensorStorage * [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor * Fixed issues with place * Added comments * Moved mutable_data with stream argument to DenseTensor * Added set_offset interface * Fixed CI issues,test=allcases * [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor * Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor * Modified framework::Tensor to inherit from DenseTensor * Reverted changes to pten_layout() interface * Removed friend classes * Rearranged cfunction calls from tensor.data<void>() to tensor.data() * Fixed CI issues * Fixed lite issues * Fixed data() interface issues,test=allcases * Resolved IsInitialized() issues * Fixed ResetHolder() issues * Fixed MKLDNN & Storage issues * Resolved ShareBufferWith() issues * Fixed LoD issues
-
- 07 Jan, 2022 1 commit
-
-
Committed by zhangbo9674
* add multi tensor for adam * add merged_adam op * refine code * refine adam compute logic
-
- 29 Dec, 2021 1 commit
-
-
Committed by sneaxiy
-
- 28 Dec, 2021 1 commit
-
-
Committed by Guoxia Wang
-
- 24 Dec, 2021 1 commit
-
-
Committed by zhangbo9674
-
- 17 Dec, 2021 1 commit
-
-
Committed by sneaxiy
* support multi precision update for LAMB * hide some api * fix ci uts * fix lamb output of dygraph * remove some changes to some PR * try to fix Py3 CI compile error * fix test_imperative_optimizer, add lars ut, add layer_norm ut * fix ut, fix format * fix ut * fix windows ci
-
- 03 Dec, 2021 1 commit
-
-
Committed by ronnywang
* refine structure for cuda and rocm * update * update * update * update
-
- 30 Nov, 2021 1 commit
-
-
Committed by zhangbo9674
* add regularization and Nesterov for merged_momentum * refine unittest for use_nesterov attr * refine op check * refine code * fix bug * refine code of regularization_flag * delete useless code
-
- 29 Nov, 2021 1 commit
-
-
Committed by piotrekobiIntel
-
- 27 Nov, 2021 1 commit
-
-
Committed by Aganlengzi
* [NPU] reorganization for device API abstraction * [NPU] delete old files * [NPU] fix npu_collective_helper * [NPU] fix collective_helper * [NPU] fix ut * [NPU] mod memory allocation and hccl_helper * [NPU] fix place_type * [NPU] split enforce.h * move acl* call into npu_info * merge conflict * fix merge * merge conflict * merge conflict
-
- 17 Nov, 2021 1 commit
-
-
Committed by Leo Chen
* copy beta pow to same place when skip_update=1 * fix xpu
-
- 20 Oct, 2021 1 commit
-
-
Committed by Zeng Jinle
-
- 19 Oct, 2021 1 commit
-
-
Committed by Zeng Jinle
* add pow2_warmup op * remove contrib __all__ * add AttrT * rename * follow comments * fix duplicate PADDLE_RESTRICT
-
- 17 Oct, 2021 1 commit
-
-
Committed by Zeng Jinle
-
- 15 Oct, 2021 2 commits
-
-
Committed by Zeng Jinle
* remove wrong restrict * remove master_param_out __restrict__ * update
-
Committed by Zeng Jinle
-
- 14 Oct, 2021 3 commits
-
-
Committed by Zeng Jinle
-
Committed by Zeng Jinle
-
Committed by Zeng Jinle
* merge momentum ops * update * add ut to improve coverage * remove optimizer change * fix error msg * update ut * add __restrict__ for CUDA * update ut * move merged_momentum_op to optimizer dir * fix coverage
-