- 31 Jan, 2023 1 commit
-
-
Committed by 姜永久
* rm flags_retain grad in pybind * retain grads for xpu test * set retain grad for xpu * rm flag * lint --------- Co-authored-by: wanghuancoder <wanghuan29@baidu.com>
-
- 30 Jan, 2023 5 commits
-
-
Committed by jiangcheng
-
Committed by engineer1109
replace all TensorFromVector & TensorToVector AssignKernel async copy
-
Committed by Ruibiao Chen
* Support stream priority for standalone executor * Fix compile error * Fix compile error * Fix compile error * Fix compile error * Fix compile error
-
Committed by zmxdream
* add set slot_num for psgpuwraper (#177) * add set slot_num_for_pull_feature for psgpuwarper * Add get_epoch_finish python interface (#182) * add get_epoch_finish interface * add return * delete return * add unzip op (#183) * fix miss key for error dataset (#186) * fix miss key for error dataset * fix miss key for error dataset Co-authored-by: yangjunchao <yangjunchao@baidu.com> * add excluded_train_pair and infer_node_type (#187) * support return of degree (#188) * fix task stuck in barrier (#189) Co-authored-by: yangjunchao <yangjunchao@baidu.com> * check node/feature format when loading (#190) * check node&feature format when loading * check node&feature format when loading (2) * degrade log (#191) * [PGLBOX]fix conflict * [PGLBOX]fix conflict * [PGLBOX]replace LodTensor with phi::DenseTensor * [PGLBOX]fix gpu_primitives.h include path * [PGLBOX]from platform::PADDLE_CUDA_NUM_THREADS to phi::PADDLE_CUDA_NUM_THREADS * [PGLBOX]fix unzip example code * [PGLBOX]fix unzip example code * [PGLBOX]fix unzip example code * [PGLBOX]fix unzip example code * [PGLBOX]fix unzip ut * [PGLBOX]fix unzip ut * [PGLBOX]fix code style * [PGLBOX]fix code style * [PGLBOX]fix code style * fix code style * fix code style * fix unzip ut * fix unzip ut * fix unzip ut * fix unzip * fix code style * add ut * add c++ ut & fix train_mode_ set * fix load into memory * fix c++ ut * fix c++ ut * fix c++ ut * fix c++ ut * fix code style * fix collective * fix unzip_op.cc * fix barrier * fix code style * fix barrier * fix barrier * fix code style * fix unzip * add unzip.py * add unzip.py * fix unzip.py --------- Co-authored-by: chao9527 <33347532+chao9527@users.noreply.github.com> Co-authored-by: Siming Dai <908660116@qq.com> Co-authored-by: huwei02 <53012141+huwei02@users.noreply.github.com> Co-authored-by: yangjunchao <yangjunchao@baidu.com>
-
Committed by gem5
-
- 29 Jan, 2023 8 commits
-
-
Committed by jiangcheng
-
Committed by zhangbo9674
-
Committed by sneaxiy
-
Committed by sneaxiy
* add missing proto file * fix windows ci * fix ci compile error
-
Committed by ronnywang
[CustomDevice] registering feed_dense_tensor, feed_sparse_coo_tensor, feed_strings kernels for custom device (#50042) * [CustomDevice] registering feed_dense_tensor, feed_sparse_coo_tensor, feed_strings kernels for custom device * update * update * update
-
Committed by LiYuRio
* remove max_slot_num * fix test case
-
Committed by jiangcheng
* [CINN] collect inplace var into cinn op desc's kInplaceVarNames attribute * attr move from op desc to subgraph * GetFetchIds from var_map instead of var_model_to_program_map_
-
Committed by Yuang Liu
-
- 28 Jan, 2023 1 commit
-
-
Committed by LiYuRio
-
- 20 Jan, 2023 2 commits
-
-
Committed by Jiabin Yang
-
Committed by jameszhang
* update xccl lib & use native Reduce in dygraph * minor
-
- 19 Jan, 2023 4 commits
-
-
Committed by Feiyu Chan
-
Committed by xiaoguoguo626807
* modify name * merge develop * fix param * fix exp gen bug * fix sum_grad * comment
-
Committed by jameszhang
* [KUNLUN] add op: maxpool_with_index * use DeviceContext::Alloc() instead of DenseTensor::mutable_data() * fix file format * solve clip unittest failure * minor fix * Revert "solve clip unittest failure" since the issue is fixed in #49535. This reverts commit 1127adc66e79afe35ac3c00bb34e6aaa7cd7d78b. * align with xdnn on the definition of mask in max_pool_with_index * minor
-
Committed by heliqi
* support PaddlePaddle Backend on Triton * fix test cases * fix Codestyle * add test case * add test case
-
- 18 Jan, 2023 6 commits
-
-
Committed by Sławomir Siwek
* extract fuse pass logic to header file * adjust namespaces * Update paddle/fluid/framework/ir/mkldnn/activation_onednn_fuse_pass.h update date Co-authored-by: Tomasz Socha <tomasz.socha@intel.com> * add inline remove static Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
-
Committed by wenbin
* fix cast issue * add ut
-
Committed by jameszhang
-
Committed by wawltor
* Add the cumsum 0d tensor * xpu and cpu judge the 0d tensor * change to 2022 to 2023 in new commit * fix the reverse logic
-
Committed by Leo Chen
-
Committed by jameszhang
* revert to use default XPU stream for computing. XPUContext now has a null stream by default. If you want to use a separate stream (e.g. in async collective communication), you should create a dedicated XPUContext and invoke its XPUContext::CreateStream(). * minor
-
- 17 Jan, 2023 10 commits
-
-
Committed by Jiabin Yang
* add test for composite with dy2st * add more log
-
Committed by zhangbo9674
* refine munmap freq for ref_cnt_mmap_allocator * add shm reuse logic * fix compile bug * fix compile bug * fix bug of file refcount * fix compile bug * fix compile bug * refine code for delete shm case * polish code * refine shm cache pool size setting logic * set buffer is 2 * refine shm cache size logic * refine max shm cache * refine shm cache size
-
Committed by Paulina Gacek
* reshape_transpose_matmul_pass_tester rewritten * matmul_transpose_reshape_pass_tester rewritten * mkldnn to onednn
-
Committed by pangyoki
* new exe supports CUDA Graph * fix * fix * fix * fix FLAGS_use_stream_safe_cuda_allocator in unittest * insert output of coalesce_tensor op to skip_gc_var * fix
-
Committed by xiaoguoguo626807
* proto type of composite grad in paddle * proto type of composite grad in paddle * refactor composite api with phi * fix compile error * support static graph code-gen for squeeze op * generate static graph code of unsqueeze * refine op name * fix compile error * add extra output in op_compat * remove debug log * fix clang compile error * support prim switch flag * support prim switch flag * fix dygraph error * merge develop * add code_gen * add necessary files without codegen * fix code_gen bug * add deps * modify ignore * add ignore * delete std cout * add composite logic for backward.py * add tanh first order grad composite * support enable_prim flag for static graph * throw exception when both GrapOpMaker and GradCompOpMaker not been registered * reorganize the directory of prim api tests * fix windows error * add eager_utils * add eager_utils * modify code gen * add composite parse * add unittest for get_grad_op_desc * code optimize * fix static test on windows * support generate static graph code for imag and real op * fix windows compile error in test_static_prim * merge develop * disable test eager in inference * prim code gen * disable eager compile in inference * origin_yaml codegen success * rm other file * rm gitignore file * code_style * add eager test * code_style * clear # * merge develop * clear # * remove useless files * modify static test * support bool flag from singleton * merge develop * recover git ignore * fix conflict * clear prim_gen * recover git ignore for generated op * parse_yaml success * fix test compile error * remove some tests * add python test * code_style * revert parse_utils+ clear prim_gen * fix some name issue * add composite code gen * modify backward yaml * fix static composite grad maker code gen * remove additional files * add some static funcs unit test * fix some bugs * fix composite grad maker register code gen * optimize some functions * modify gen cmake * add more api gen * add header * modify static * add static expand unsqueeze * comments * modify compopmaker * revert * modify gen name Co-authored-by: JiabinYang <360788950@qq.com> Co-authored-by: zyfncg <zhangyunfei07@baidu.com> Co-authored-by: cxxly <chenxx_id@163.com> Co-authored-by: charles-hit <wanghao107@baidu.com>
-
Committed by YuanRisheng
* change feed_op to phi kernel * fix ci bugs * fix build bugs * fix ci bugs * fix compile bugs * fix ci bugs * perfect code * perfect comment code * fix install bugs * modify code according comment * remove visitor in feed_op * modify according comment * perfect code according comment * add infershape * fix py3 bugs * fix getexpected kernel type * fix getexpected kernel type * fix ci bugs * add registry for custom device * fix py3 bugs * fix floating point error * fix py3 test bugs
-
Committed by Jiabin Yang
-
Committed by WangZhen
* Support call backward() without params in dy2st
-
Committed by LiYuRio
-
Committed by Xiaoxu Chen
* support elementwise base func * fix compiling error and add test * support vjp for div using comp * remove additional change * fix dy2st error with magic num * fix dy magic num * another magic * another magic * another magic * add skip rename strategy * support add vjp * support add with new axis cal * support sub vjp * [prim] add multiply vjp rules * [prim] add multiply vjp rules * [prim] fix no infershape with composite in _append_backward_ops * [prim] add expand vjp rule * [prim] add exp vjp rule * uncomment infer shape for reshape/sum static prim api * [prim] fix tanh nullptr error * remove some print message * fix magic number in run_program relative tests @JiaBinYang * [prim] add expand,multiply,exp vjp rules * fix only support single direction reduce error * infer reduce dims using out dims Co-authored-by: JiabinYang <360788950@qq.com>
-
- 16 Jan, 2023 3 commits
-
-
Committed by HappyHeavyRain
* support the 'data_transform' for generating static graph ops * reset 'pow' code * change the 'GetKernelTypeForVar'
-
Committed by zlsh80826
* Update warpctc for cuda-12 * Deprecate cudaProfilerInitialize for CUDA > 11 * Deprecate CUSPARSE_MV_ALG_DEFAULT for CUDA_VERSION >= 11040 * Add the missing thrust header
-
Committed by Zhang Jun
* add outvar name for nvtx mark * only network created with kEXPLICIT_BATCH can set setMaxBatchSize
-