- 23 2月, 2022 4 次提交
-
-
由 ShenLiang 提交于
* add processgroup_nccl
-
由 Zhanlue Yang 提交于
-
由 mhhhh1 提交于
* [MLU] add cncl parallel context and mlu resource pool * [MLU] fix the cncl_context_test
-
由 wanghuancoder 提交于
* eager, test=develop * fix bug, test=develop * eager, test=develop * merge legacy to fluid * eager, test=develop * eager, test=develop * Refactor TensorAdd func by template and remove gradient_accumulation in eager * Remove needless target name * eager, test=develop * eager, test=develop * Use overload instead of template * Remove legacy code * Remove legacy code * selectedrows, test=develop * Remove DataType test * eager, test=develop * eager, test=develop * support gan, test=develop * Using Tensor directly instead of using EagerTensor * support gradient_accumulation * make test_imperative_lod_tensor_to_selected_rows longer * make test_imperative_lod_tensor_to_selected_rows longer * refine code * ptb, test=develop * Rename all EagerTensor to Tensor * Rename some EagerTensor to Tensor * rename EagerTensor to EagerVariable * eager, test=develop * eager, test=develop * eager, test=develop * eager, test=develop * add more test * eager, test=develop * Support copiable selected rows and merge develop * refine, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * clear grad, test=develop * merge, develop * merge, develop Co-authored-by: NJiabinYang <360788950@qq.com> Co-authored-by: NWeilong Wu <veyron_wu@163.com>
-
- 22 2月, 2022 3 次提交
-
-
由 lilong12 提交于
* add tcp_socket and tcp_store
-
由 wangguanqun 提交于
* fix benchmark and communicator config * fix bugs of the_one_ps * multi program and fix bug in optimizer * multi program in the_one_ps * public commcontext
-
由 Yuang Liu 提交于
-
- 20 2月, 2022 1 次提交
-
-
由 Chen Weihang 提交于
* rename pten dir to phi * rename namespace to phi * rename infrt pten dir to phi * resolve conflict * rename pten to phi in cmake * revert all infrt change * change needed files * fix infrt failed * fix inference failed
-
- 19 2月, 2022 2 次提交
-
-
由 Aurelius84 提交于
* Unify paddle/pten::framework::ddim into pten::ddim * fix paddle namespace * compile sucessfully * fix npu src file * fix conflict * fix conflict * fix tensorrt compiler error * fix conflict * fix conflict * fix tesst file conflict * fix conflict * fix mlu file conflict * fix mlu file conflict * fix cinn header file conflict * fix conflict * fix conflict * fix conflict * fix conflict
-
由 chenjian 提交于
* fix RecordEvent interface * modify default level to 4 * update interface use * add const default trace level * update record event interface using * update operator.cc * update part1 * fix include profiler.h header in ps server * fix include profiler.h header in ps server
-
- 18 2月, 2022 2 次提交
-
-
由 zhangbo9674 提交于
* support dtype param for auto_cast * add amp_dtype for tracer * add unsupported bf16 list * support bf16 amp for O2 * refine python interface for bfloat16 * refine code * refine code * refine unittest * refine code * refine code * add bf16 o1 * refine code by comment * add gradient accumulator * add recompute
-
由 Shang Zhizhou 提交于
* add tool: print kernel signaturs * fix windows compile
-
- 16 2月, 2022 3 次提交
-
-
由 Yuang Liu 提交于
-
由 Jiabin Yang 提交于
* merge legacy to fluid * Remove legacy code * Remove legacy code * Remove DataType test * Using Tensor directly instead of using EagerTensor * support gradient_accumulation * make test_imperative_lod_tensor_to_selected_rows longer * make test_imperative_lod_tensor_to_selected_rows longer * refine code * Rename all EagerTensor to Tensor * Rename some EagerTensor to Tensor * rename EagerTensor to EagerVariable * add more test * merge develop and refine code
-
由 Weilong Wu 提交于
-
- 15 2月, 2022 2 次提交
-
-
由 ronnywang 提交于
* [CustomRuntime] Add DeviceManager * [CustomRuntime] Add DeviceInterface * [CustomRuntime] Add Stream, Event, DeviceGuard, CallbackManager * [CustomRuntime] Add plug-in device * [CustomRuntime] Memory module support PluggableDevice * [CustomRuntime] Add WITH_PLUGGABLE_DEVICE cmake option * update * [API] update API doc based on comments, test=develop Co-authored-by: Nqili93 <qili93@qq.com>
-
由 Aurelius84 提交于
* #1 migrate dist-related type()-> dtype() * move datatype function from pten -> fluid/framework * change type() in imperative into convert(dtype()) * modify xx_tensor->type into xx_tensor->dtype * change the set_type interface and the caller * modify xx_tensor.type into xx_tensor.dtype * fix mutable_data(place, dtype()) * change caller of mutable_data in pten and distributed * change the caller of mutable_data in fluid/framework * change the caller of mutable_data in imperative directory * mutable_data: inference * update the call of mutable_data * transfer MakePenScalarArray MakePtenScalar ResetHolderWithType * pass the compile. the next step is remove VarType in Pten * fix all and remove VarType from pten. success in linux. Next task is other platform * fix conflict with develop * fix compiled error * Fix reset conversion * fix conflict * fix compiled problem * fix typo * Fix << in tensor_utils.cc * fix type->dtype * fix unittest * fix tensor init constructor * fix DataTypeSize for BFloat16 * fix code style * fix npu compiled error * fix npu * compile npu sucessfully * fix conflict * fix conflict Co-authored-by: Nxiongkun <xiongkun03@baidu.com>
-
- 14 2月, 2022 3 次提交
-
-
由 Chen Weihang 提交于
-
由 Wilber 提交于
* context add generator * update
-
由 Zhanlue Yang 提交于
* Enabled Eager OpTest #1 * Enabled Eager OpTest #1 * Fixed get_tensor method for EagerTensor
-
- 11 2月, 2022 1 次提交
-
-
由 Leo Chen 提交于
-
- 10 2月, 2022 2 次提交
-
-
由 Zhanlue Yang 提交于
* Removed debug info * Added automatic code generation for final state Eager Dygraph * Modified backward yaml * Added EagerUtils helper functions for final state CodeGen * Adjusted CMakeFiles to support compilation for final state auto generated codes * Added python-c code generation for final state Eager Dygraph * Fixed minor issue * Fixed yaml.load() method failure * Fixed minor issues * Refactored Python-C Attributes Parsing Functions * Fixed minor issue with Python-C AddFunctions * Fixed issues from merge * Fixed merge issues
-
由 Zhanlue Yang 提交于
-
- 09 2月, 2022 2 次提交
-
-
由 Leo Chen 提交于
* fit pten for amp * fix typo
-
由 Jiabin Yang 提交于
* merge legacy to fluid * Remove legacy code * Remove legacy code * Remove DataType test * Using Tensor directly instead of using EagerTensor * support gradient_accumulation * make test_imperative_lod_tensor_to_selected_rows longer * make test_imperative_lod_tensor_to_selected_rows longer
-
- 08 2月, 2022 1 次提交
-
-
由 sneaxiy 提交于
* hack custom op * add ut * skip windows ci
-
- 07 2月, 2022 1 次提交
-
-
由 sneaxiy 提交于
-
- 06 2月, 2022 1 次提交
-
-
由 Wilber 提交于
-
- 02 2月, 2022 1 次提交
-
-
由 Jiabin Yang 提交于
-
- 30 1月, 2022 1 次提交
-
-
由 Leo Chen 提交于
* upgrade _get_all_register_op_kernels * add ut * support xpu/npu * fix device id * enhance TransToFluidPlace * fix compile
-
- 29 1月, 2022 1 次提交
-
-
由 Chen Weihang 提交于
* open header for custom kernel * add core utils * tidy core code * tify header * tidy include * tidy namespace * resolve conflit * fix unittest and coverage * remove platform using * resolve conflict * resolve conflict * fix digamma namespace error * fix xpu full kernel error * fix xpu full kernel error * polish details * add place for lib storage
-
- 28 1月, 2022 1 次提交
-
-
由 Fan Zhang 提交于
* [PSLIB] Add Metrics Module, Support User-defined Add Metric * [PSLIB] Modify According to CI * [PSLIB] Modify According to CI * [PSLIB] Modify According to CI * [PSLIB] Modify According to CI Coverage * [PSLIB] Modify According to CI * [PSLIB] Modify According to CI * [PSLIB] Modify According to CI * [PSLIB] Modify According to CI * [PSLIB] Modify According to CI * [PSLIB] Modify According to CI Coverage * [PSLIB] Modify According to CI Coverage * [PSLIB] Modify According to CI Coverage * modify role_maker * update CMakeLists.txt
-
- 27 1月, 2022 5 次提交
-
-
由 Siming Dai 提交于
* add the test case for the UVA * add the context load for the uva * Add graph_sample kernel * Add graph_sample commit * add new commit for graph_sample * add unsigned long long int * delete some remarks * add cpu version * add cuda eids * add cpu eids * delete _uva * optimize speed: emplace_back, last_layer * add to_uva_tensor * add cpu return_eids choice * add gpu return_eids choice * add cpu reindex_nodes * add gpu reindex_nodes * rename op and add OMP for cpu * add incubate api * fix the compile problem for the PADDLE_ENFORE and different device * fix the rcom and windows compile problem * add unittest for graph_sample_neighbors * fix cpu unittest and unique problem * fix uva unittest, fix cuda unique problem * fix the windows compile problem * fix the windows rand_r compile problem * add correct unittest, add src_eids dispensable * delete black * combine uva unittest * mv Sample_index to Sample_Index; check input shape; fix random sample func * delete memset & cudaMemset * fix according to PR comments * fix rocm ci * modify function names according to the specification * fix windows_openblas ci * refine annotations, fix windows unittest, add default value for uva device_id, fix bug for input nodes with empty neighbors * fix rocm ci * rename graph_sample_neighbors as graph_khop_sampler, add incubate api doc * add data type * fix conflict Co-authored-by: Nwawltor <fangzeyang0904@hotmail.com>
-
由 zyfncg 提交于
* remove remake densetensor * fix eager test error * fix bug in eager
-
由 Aganlengzi 提交于
* [Demo] custom kernel based on pten kernel * merge and npu custom work well * del comments * delete other code * fix CUDAContext * fix not found small_vector.h * support NPU * fix NPUContext * fix DeviceContext support * add UT * fix call * add UT * fix * fix for comments and ut * add MACRO control * fix multi input output * support env CUSTOM_DEVICE_ROOT * deal with special cases * fix for Windows * try coverage with test_custom_kernel_dot.py * fix test_custom_kernel_dot * fix test_custom_kernel_dot * fix merge * fix merge * fix CI * update * merge and fix * remove WITH_CUSTOM_KERNEL * fix merge * merge and fix * fix ut * fix ut for mac * add more UT * add more UT * fix
-
由 Thunderbrook 提交于
* compile for afs api * with pslib
-
由 Yuang Liu 提交于
-
- 26 1月, 2022 3 次提交
-
-
由 Leo Chen 提交于
* update cmake file to remove fluid kernel * add pten declaration.h to where pybind.h used * fix sync_bn and tensorrt_engine * refine detection_library * fix interpreter_core * support eager legacy * fit eager legacy for pten * fall back to cpu if not found kernel * fix compile problem * fix compile problem * refine fallback logic * fit operator.run() * fix xpu compile * fit for new_exec * add REGISTER_OP_WITHOUT_GRADIENT * un-cache pt_kernel_context * fix compile * fix cudnn * fix compiling with on_infer * fix mkldnn * fix isfinite_v2 * fix xpu problem * fix op_device * refine fallback for xpu * fix xpu compile * merge develop * refine code format * fix compile * fix compile * add data_transfer * fix PreparePtenData * fix cpu context * merge develop * fix compile * fix error device context * fix xpu * fix dev_ctx
-
由 Allen Guo 提交于
* sync misc changes * apply comments 01 * fix compile error * remove is_ipu_place check * add authors Co-authored-by: NXiaobing Wang <xiaobingw@graphcore.ai> Co-authored-by: NAllen Guo <alleng@graphcore.ai> Co-authored-by: NZhixin Yao <zhixiny@graphcore.ai> Co-authored-by: NHaicheng Jiang <haichengj@graphcore.ai> Co-authored-by: NHan Zhao <hanzhao@graphcore.ai> * sync changes * restore cmake * update ir cmake and setup.py * update inference_lib cmake * restore for split PR Co-authored-by: NXiaobing Wang <xiaobingw@graphcore.ai> Co-authored-by: NZhixin Yao <zhixiny@graphcore.ai> Co-authored-by: NHaicheng Jiang <haichengj@graphcore.ai> Co-authored-by: NHan Zhao <hanzhao@graphcore.ai>
-
由 pangyoki 提交于
* add profile record for dygraph * add op type in record * fix little bug * solve conflict
-