- 02 Dec, 2022 12 commits
-
-
Committed by Piotr Paturej
* Add migrations * Fix build errors * Remove elementwise_mul from migration
-
Committed by Hulek
* Migrate mul_mkldnn_op to matmul_kernel * Review fixes - changed mutable_data, changed ctx to dev_ctx, fixed namespaces * switched some funcs to phi * Deleted not needed phi:: and changed place checking according to standards
-
Committed by Jiabin Yang
* [Eager] Fix paddle.grad interface * [Eager] Support minimum SubGraph for GeneralGrad * Add needed_nodes to prune grad graph more thoroughly * [Eager] Add grad_node_trans_mapping_ to record which grad_node has been transformed to AccumulationNode * [Eager] Fix paddle.grad interface * Polish code * remove potential_stop_node * Add endding_nodes to enhance genSugraph logic * clear endding_nodes_ * polish code * rename endding_nodes to endding_nades_ * Refactor grad interface * Add register_hook case to fix coverage-ci * Fix code format * Refactor general_grad * Add more code comments * call clear directly to release GradSlotMeta * fix a mistake * fix matmul/ multiply kernel logic and optional input in yaml, fill zeros logic and so on. * fix batch_norm_double_grad yaml optional config * fix tanh_triple_grad yaml and kernels * fix MultiplyTripleGradKernel optional logic * fix merge mistake * fix compile error * remove legacy attr for bn * polish code * fix some kernel * merge develop * fix error * remote log * fix kernel with full like * hide value log behind * hide value log behind * fix matmul_triple grad * fix xpu compile error * fix xpu compile error * fix xpu ut * fix xpu ut * fix_xpu_compile_error Co-authored-by: Weilong Wu <veyron_wu@163.com>
-
Committed by Bo Zhang
* profile reduce kernel for fp16 and reduceHigherdim * use reinterpret_cast * fix for CI on ROCm * add Macro for ROCm * ROCm CI config * ROCm CI config * unit test repair * pull * add common_funcs.h * reduceType * Update reduce_function.h * not higher * rename
-
Committed by Shijie
* Fix fuse_gemm_epilogue * update tests * Update CMakeLists.txt * Update CMakeLists.txt * Update CMakeLists.txt * fix random seed * use assert_allclose * Update test_dist_fuse_gemm_epilogue_pass.py * Update cpp_pass.py * Update test_dist_fuse_gemm_epilogue_pass.py * fix codestyle * update seed and atol
-
Committed by gem5
-
Committed by ronnywang
* fix capi kernel registration macro error * update
-
Committed by Weilong Wu
[Eager, Performance Optimization] modify AllocateFrom to reduce deconstruction of shared_ptr (#48548)
-
Committed by Jiabin Yang
* [Eager] Fix paddle.grad interface * [Eager] Support minimum SubGraph for GeneralGrad * Add needed_nodes to prune grad graph more thoroughly * [Eager] Add grad_node_trans_mapping_ to record which grad_node has been transformed to AccumulationNode * [Eager] Fix paddle.grad interface * Polish code * remove potential_stop_node * Add endding_nodes to enhance genSugraph logic * clear endding_nodes_ * polish code * rename endding_nodes to endding_nades_ * Refactor grad interface * Add register_hook case to fix coverage-ci * Fix code format * Refactor general_grad * Add more code comments * call clear directly to release GradSlotMeta * fix a mistake * fix matmul/ multiply kernel logic and optional input in yaml, fill zeros logic and so on. * fix batch_norm_double_grad yaml optional config * fix tanh_triple_grad yaml and kernels * fix MultiplyTripleGradKernel optional logic * fix merge mistake * fix compile error * remove legacy attr for bn * polish code * fix some kernel * merge develop * fix error * remote log * fix kernel with full like * hide value log behind * hide value log behind * fix matmul_triple grad Co-authored-by: Weilong Wu <veyron_wu@163.com>
-
Committed by ykkk2333
* add stat tool * add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun * add silu, unfold and their grads,test=kunlun
-
Committed by Infinity_lee
* fix broadcasting superlink * Update bitwise_op.cc * fix typo errors(from 48186) * Update python/paddle/distribution/uniform.py Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com> * Update math.py * Update math.py * refix * Update logic.py * BaseTransform api doc; test=docs_preview * Update python/paddle/vision/transforms/transforms.py * for text block; test=docs_preview * Update transforms.py Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
-
Committed by Chen Weihang
-
- 01 Dec, 2022 11 commits
-
-
Committed by zyfncg
* rename kernel for top_k, slogdeterminant, generate_proposals_v2 * fix bug
-
Committed by Wangzheee
* general optimization for no_varlen multihead
-
Committed by Wilber
* update memory_optimize pass
-
Committed by wanghuancoder
* do not link python lib in tensor wrapper
-
Committed by Zhang Jun
* instance norm support dynamic shape * update unittest
-
Committed by xiaoxiaohehe001
-
Committed by zhoutianzi666
* remove conv_act_set from graph_pattern_detector.cc
-
Committed by Zhang Jun
* Support FP16 in generic TensorRT plugin. * Support FP16 for Pad3D.
-
Committed by minghaoBD
* fuse-mt passes compatible with structured pruning
-
Committed by HongyuJia
* fix typo error * pass CI-coverage
-
Committed by zhangyikun02
-
- 30 Nov, 2022 15 commits
-
-
Committed by Qi Li
-
Committed by zyfncg
* fix error log for yaml check * remove grad_op of increment
-
Committed by Netpunk
* migrate transpose_op.cu.h and gpu_utils.h * format code style * fix some problems * format code * reset transpose_op.cc * test commit * recover transpose_op.h * delete transpose_op.h * adjust header files order in transpose_op.cc
-
Committed by ShenLiang
* fix bug of pylayer * fix bug
-
Committed by wanghuancoder
-
Committed by Aurelius84
* [Perf]Fix interpolate OutSize data transform problem * fix code style * fix grad * fix phi kernel
-
Committed by MarDino
* add activation support * fix cublasLt bug * remove useless code and fix test random range
-
Committed by feng_shuai
-
Committed by Yuanle Liu
-
Committed by zhangbo9674
* add fuse act add grad pass * polish code * refine code * add test * refine code
-
Committed by zyfncg
* rename some kernel name * fix compile problem
-
Committed by zyfncg
* fix bug of eigen_dependency * fix xpu compile
-
Committed by RichardWooSJTU
* delete unnecessary shape and slice op Co-authored-by: Your Name <you@example.com>
-
Committed by james
Some legacy code still uses xpu_wait() for stream sync, which only syncs the default stream. This PR replaces those calls with dev_ctx.Wait() to ensure that the correct stream is always used.
-
Committed by zhangyikun02
-
- 29 Nov, 2022 2 commits