- 01 Sep, 2023 9 commits
-
-
Committed by Scotty
* support index_select op * index_sample in cpu * support index_sample in gpu * change data_transform * fix api gen and use skip_transform in yaml
-
Committed by Aurelius84
* [NewIR]Part-2.1 Refactor NewIRCompiler to support Group Ops * fix gflags link error * fix include ir_printer.h * fix unittest * fix conflict * fix flags * fix comment
-
Committed by chen2016013
* add test for legacy_op * add test for legacy_op * add test * change test legacy op : pd.c_concat * fix codestyle * add legacy kernel op test * Update ir_kernel_dialect_pass_test.cc * Update ir_kernel_dialect_pass_test.cc * Update ir_kernel_dialect_pass_test.cc
-
Committed by gouzil
-
Committed by cyberslack_lee
[clang-tidy] No.34,36 enable performance-noexcept-move-constructor,modernize-use-transparent-functors (#56261) * fix * fix * CI * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * CI * fix * CI
-
Committed by chen2016013
* Generate pd_op.parsed.yaml from pd_op.yaml * Generate pd_op.parsed.yaml from pd_op.yaml * fix bug * bug fix * bug fix * bug fix * add new operators to pd_ops.yaml & change the storage path of pd_ops.parsed.yaml * fix path dependency bug & add .gitignore file * fix bug - compat input args in save_combine op * fix compat file * fix set_value_with_tensor yaml * split backward op in original yaml file * add send_v2 & recv_v2
-
Committed by risemeup1
* set make -j18,test=document_fix * set make -j18,test=document_fix * set make -j18,test=document_fix
-
Committed by Chen Weihang
* fix custom device error by dist * polish details
-
Committed by zhangbo9674
* fix bug * fix bug
-
- 31 Aug, 2023 11 commits
-
-
Committed by iSerendipity
* add complex support for isclose * add complex test for isclose * fix template compile issue * fix cuda compilation error * fix type typo * fix error for complex's abs * add complex dtype into input * fix ut
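A minimal sketch of what a closeness test on complex values can look like, assuming the usual tolerance formula |x - y| <= atol + rtol * |y| with |.| taken as the complex modulus. The function name `isclose_complex` is hypothetical and this is plain Python, not the kernel added by the commit.

```python
def isclose_complex(x: complex, y: complex,
                    rtol: float = 1e-05, atol: float = 1e-08) -> bool:
    """Illustrative closeness check for complex scalars (hypothetical helper).

    Assumes the common rule |x - y| <= atol + rtol * |y|, where abs() on a
    complex number returns its modulus sqrt(re^2 + im^2).
    """
    return abs(x - y) <= atol + rtol * abs(y)


# Differ only by a tiny imaginary offset: treated as close.
near = isclose_complex(1 + 1j, 1 + (1 + 1e-9) * 1j)
# Imaginary parts differ by 1: not close.
far = isclose_complex(1 + 1j, 1 + 2j)
```

Using the modulus means the real and imaginary differences are compared jointly rather than component by component.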
-
Committed by hong
* fix install check bug * fix bug
-
Committed by Leo Chen
-
Committed by Leo Chen
* Add elementwise_add support into NHWC IR
-
Committed by hong
* update * fix batch norm grad args def * fix bug * fix combine slice bug * fix slice bug * update builtin split * disable using kernel register dtype * polish code * disable some test
-
Committed by Tian Zheng
* Add fused_scale_bias_relu_conv_bnstats op * Review changes * Fix no CUDNN Frontend build * Fix PADDLE_ENFORCE format * Fix PADDLE_ENFORCE CI error * Rename kernel filename * Refactor unittest to use paddle eager_op_test * Fix padding bugs * Review changes * test=cuda117 * test=cuda117
-
Committed by engineer1109
fix style
-
Committed by LiYuRio
-
Committed by Zero Rains
-
Committed by ronnywang
-
Committed by Chen Weihang
* move matmul spmd rules into phi * add basic infer spmd utils * add spmd factory * fix compile error * add unittest * refine infer spmd test and utils * debug infer spmd test * adapt python test * polish details * change to vector attr arg * revert needless change * update matmul spmd rule test * remove original rule * polish details * fix macro error * add comment * pass backward test * fix compile error * add cmake rule for spmd_rules_test * add dist meta tensor * update pybind impl * add macro for rules
-
- 30 Aug, 2023 13 commits
-
-
Committed by kangguangli
* fix logical op infermeta * add test * adapt inplace api
-
Committed by huangjiyi
* update * replace gflags header * replace DEFINE_<type> with PD_DEFINE_<type> * fix bug * fix bug * fix bug * update cmake * add :: before some paddle namespace * fix link error * fix CI-Py3 * allow commandline parse * fix SetFlagsFromEnv * fix bug * fix bug * fix CI-CINN * fix CI-Coverage-build * fix CI-Windows-build * fix CI-Inference * fix bug * fix bug * fix CI-CINN * fix inference api test * fix infer_ut test * revert infer_ut gflags usage * update * fix inference * remove flags export macro * revert inference demo_ci gflags usage * update * update * update * update * update * update * update * update * fix bug when turn on WITH_GFLAGS * turn on WITH_GFLAGS * fix bug when turn on WITH_GFLAGS * fix bug when turn on WITH_GFLAGS * update * update and add unittest * add unittest * fix conflict * rerun ci * update * resolve conflict
-
Committed by Nyakku Shigure
-
Committed by xuxinyi389
* fix bugs of tp * fix bugs of tp * fix bugs * fix bugs * fix bugs of md5
-
Committed by ronnywang
-
Committed by WangZhen
-
Committed by kangguangli
* add_arg_mapping_for_fetch * fix * fix
-
Committed by Ghost Screaming
* for verify fluid operator support new comm library * u * u * u * compatible new comm library upgrade for c_allgather, c_reduce, c_reduce_scatter and c_scatter. * Remove useless comments in process_group.py * Polish code style. * Fix some problems. * Remove use fluid api in phi comm_context_manager. * Add PADDLE_WITH_CUDA and PADDLE_WITH_NCCL macro judgement. * Fix bug of HIP architecture. * Fix some problems. 1. Remove useless loggings. 2. Fix conditional compilation for HIP. 3. Fix problems of test_pass_generation_pipeline.py. It calls paddle.distributed.init_parallel_env() first, then auto.Engine calls _init_comm(), which calls process_group.instantiate(). However, init_parallel_env() calls paddle.distributed.barrier(), which calls CreateNCCLEnvCache and creates the corresponding NCCLCommContext. But dev_id is not set, so NCCLCommContext's dev_ctx is not initialized. * Fix some problems. * Polish code. * Polish code. * Revert compatible upgrade for communication operators. Their upgrades will be submitted in another PR. * Remove StaticTCPStore. * Remove useless modification. * Remove useless set_cuda_device_id. * Polish code. * Remove fluid header files in phi files. * Remove useless comments. * Fix problems of hip arch. * Fix some problems. * Polish code. * Polish code style. --------- Co-authored-by: hitywt <yuwentao126@126.com>
-
Committed by chen2016013
* Register LegacyKernelDialect & Register LegacyKernelOp * fix code style * delete LegacyKernelDialect, register LegacyKernelOp into PaddleKernelDialect * fix bug * change as reviewed comments * bug fix * bug fix * try to restart coverage CI * pass legacy op to kernel pass * fix code style * fix code style * fix code style
-
Committed by ronnywang
-
Committed by Nyakku Shigure
* [clang-tidy] enable `hicpp-exception-baseclass` and fix existing errors * config * update error format to pass the ci check (at least 20 chars)
-
Committed by gouzil
-
Committed by iSerendipity
* 【complex op】No.6 add complex support for logical_and/or/xor/not * fix dtype check * modify the docs * add special condition for not raise when x.dtype is complex * add random generate for complex dtype * fix generate for complex * fix * fix * add corner case for complex type * fix ut * fix ut
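As a rough illustration of logical operations over complex inputs, the sketch below assumes the common convention that a complex value counts as true when it is nonzero, that is, when its real or imaginary part is nonzero. The helper names are hypothetical and the code is plain Python, not the operators touched by the commit.

```python
def truthy(z: complex) -> bool:
    # Assumption: a complex number is "true" unless both parts are zero.
    return z.real != 0 or z.imag != 0


def logical_and(a: complex, b: complex) -> bool:
    return truthy(a) and truthy(b)


def logical_or(a: complex, b: complex) -> bool:
    return truthy(a) or truthy(b)


def logical_xor(a: complex, b: complex) -> bool:
    # True when exactly one operand is nonzero.
    return truthy(a) != truthy(b)


def logical_not(a: complex) -> bool:
    return not truthy(a)
```

Under this convention a purely imaginary value such as `1j` is true, which is the corner case most likely to be missed if only the real part were inspected.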
-
- 29 Aug, 2023 7 commits
-
-
Committed by ronnywang
-
Committed by zhaoyingli
* [AutoParallel][NewIR] support calc_sync/comm_sync/send_v2/recv_v2 * pre-commit * rm unittest * tiny fix * api_gen support send_v2's output is empty * fix format * python_c_gen support send_v2
-
Committed by Fisher
When using paddle2cinn, CompilationContext.with_instantiate_variables should be set to false. Otherwise CINN instantiates and manages the variables' memory itself, which doubles memory usage and eventually leads to an out-of-memory error. This PR sets CompilationContext.with_instantiate_variables to false before the context is passed in to construct the graph compiler.
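The memory effect described in this commit message can be pictured with a toy model. Everything below (`ToyCompilationContext`, `ToyGraphCompiler`, `build`) is a hypothetical stand-in written for illustration; only the flag name `with_instantiate_variables` comes from the message, and none of this is CINN's actual API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyCompilationContext:
    # Hypothetical stand-in; only the flag name is taken from the commit message.
    with_instantiate_variables: bool = True


@dataclass
class ToyGraphCompiler:
    context: ToyCompilationContext
    owned_buffers: Dict[str, bytearray] = field(default_factory=dict)

    def build(self, framework_buffers: Dict[str, bytearray]) -> int:
        """Return the total bytes held by the framework plus the compiler."""
        if self.context.with_instantiate_variables:
            # The compiler allocates its own buffer for every variable,
            # duplicating what the framework already holds.
            self.owned_buffers = {
                name: bytearray(len(buf))
                for name, buf in framework_buffers.items()
            }
        framework_bytes = sum(len(b) for b in framework_buffers.values())
        compiler_bytes = sum(len(b) for b in self.owned_buffers.values())
        return framework_bytes + compiler_bytes


buffers = {"x": bytearray(1024), "w": bytearray(1024)}
doubled = ToyGraphCompiler(ToyCompilationContext(True)).build(buffers)
shared = ToyGraphCompiler(ToyCompilationContext(False)).build(buffers)
```

In this model, setting the flag before the compiler is constructed is what prevents the second allocation, which mirrors the order of operations the commit message describes.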
-
Committed by Chen Zhiyang
* add vjp code gen for SplitOp * change vjp manual file name
-
Committed by Leo Chen
* add pass registry * add pass registry macro
-
Committed by Sonder
* remove flag * open static build flag * add searchsorted to list * add register info for fused layernorm * fix fused_layernorm_kernel output register info * fix stft register info * add include * fix register info * add skip fake init for fused_layernorm:residual_out * fix error * add distributed_fused_lamb_init to StaticBuildBlackList * set static_build flag to false
-
Committed by duanyanhui
* support cum & multinomial for dcu * rm comment
-