1. 06 Dec, 2022 8 commits
    • Z
      Clear extra input (Bias, ResidualData) in OpMaker of conv2d (#47579) · 0a2dfa38
      zyfncg committed
      * delete Bias and ResidualData in OpMaker of conv2d
      
      * delete extra input of conv3d
      
      * refactor pass of conv_bias_fusion
      
      * fix mkldnn dependency
      
      * fix mkldnn compile
      
      * fix test_conv_bias_mkldnn_fuse_pass
      
      * polish some code
      
      * remove useless log
      
      * fix analyzer_vit_ocr_tester
      
      * fix conv_activation_mkldnn_fuse_pass
      
      * fix test_analyzer_ocr
      
      * add fused_conv_sig
      
      * fix performance regression
      
      * fix performance regression
    • Q
      add xpu_support op function (#48606) · 06b32b38
      QingshuChen committed
      *test=kunlun
    • S
      [PHI] Migrate elementwise_(add/mul) kernels (#48625) · 7575d37c
      Sławomir Siwek committed
      * remove fluid code
      
      * init
      
      * typo
      
      * fix merge conflicts
    • H
      [XPU] add tile_grad op (#48720) · 8de336f9
      houj04 committed
    • K
      Remove fluid matmul (#47988) · 8fb829ba
      kangguangli committed
      * remove layers.matmul in nets.py
      
      * remove layers.matmul in rnn_impl/test_quantization_pass/auto_parallel_gpt_model/test_auto_parallel_completion_gpt
      
      * remove layers.matmul in other files
      
      * fix
      
      * fix
      
      * remove layers.matmul itself
      
      * remove ref in CMakeLists.txt and tools directory
      
      * remove matmul in fluid.layers.nn.py
      
      * remove matmul in fluid.dygraph.rnn.py && restore test_matmul_op.py
      
      * replace matmul in fluid.dygraph.rnn.py && clean api_test in test_matmul_op.py
      
      * fix error && restore empty test_auto_search_dist_matmul_op.py
      
      * fix check in test_auto_parallel_partitioner.py
      
      * fix test_dist_matmul && test_flags_mkldnn_ops_on_off
      
      * fix test_fused_attention_op_xpu.py && test_matmul_op_xpu.py
      
      * remove test_auto_search_dist_matmul_op.py
      
      * remove layers.matmul in auto_parallel_gpt_model.py && fix doc in fluid/io.py
      
      * fix for matmul_grad
      
      * fix codestyle
      
      * fix codestyle
      
      * resolve conflicts error
      
      * restore unit test file but do not compile it, for later removal
      
      * fix codestyle
      
      * fix wrong unittest skip
      
      * fix unittest delete
      
      * fix scale cost
      
      * fix scale cost
      
      * resolve conflicts error
      
      * resolve conflicts error
      Co-authored-by: jakpiase <jakpia21@gmail.com>
    • Z
      [inference][trt] add reduce max for trt (#48684) · dd304f31
      Zhang Jun committed
      * add reduce max for trt
    • Y
    • Y
      add xpu centered rmsprop (#48658) · 54b756e2
      ykkk2333 committed
      * add stat tool
      
      * add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun
      
      * add xpu rmsprop centered, test=kunlun
  2. 05 Dec, 2022 20 commits
  3. 04 Dec, 2022 1 commit
  4. 03 Dec, 2022 2 commits
  5. 02 Dec, 2022 9 commits
    • Y
    • P
      [PHI] Migrate elementwise_sub kernel (#48611) · 493825a5
      Piotr Paturej committed
      * Add migrations
      
      * Fix build errors
      
      * Remove elementwise_mul from migration
    • H
      Migrate mul_mkldnn_op to phi matmul_kernel (#48299) · e8edbb09
      Hulek committed
      * Migrate mul_mkldnn_op to matmul_kernel
      
      * Review fixes - changed mutable_data, changed ctx to dev_ctx, fixed namespaces
      
      * switched some funcs to phi
      
      * Deleted not needed phi:: and changed place checking according to standards
    • J
      [XPU] Fix xpu compile error (#48621) · 2af82190
      Jiabin Yang committed
      * [Eager] Fix paddle.grad interface
      
      * [Eager] Support minimum SubGraph for GeneralGrad
      
      * Add needed_nodes to prune grad graph more thoroughly
      
      * [Eager] Add grad_node_trans_mapping_ to record which grad_node has been transformed to AccumulationNode
      
      * [Eager] Fix paddle.grad interface
      
      * Polish code
      
      * remove potential_stop_node
      
      * Add endding_nodes to enhance genSugraph logic
      
      * clear endding_nodes_
      
      * polish code
      
      * rename endding_nodes to endding_nodes_
      
      * Refactor grad interface
      
      * Add register_hook case to fix coverage-ci
      
      * Fix code format
      
      * Refactor general_grad
      
      * Add more code comments
      
      * call clear directly to release GradSlotMeta
      
      * fix a mistake
      
      * fix matmul/ multiply kernel logic and optional input in yaml, fill zeros logic and so on.
      
      * fix batch_norm_double_grad yaml optional config
      
      * fix tanh_triple_grad yaml and kernels
      
      * fix MultiplyTripleGradKernel optional logic
      
      * fix merge mistake
      
      * fix compile error
      
      * remove legacy attr for bn
      
      * polish code
      
      * fix some kernel
      
      * merge develop
      
      * fix error
      
      * remove log
      
      * fix kernel with full like
      
      * hide value log behind
      
      * hide value log behind
      
      * fix matmul_triple grad
      
      * fix xpu compile error
      
      * fix xpu compile error
      
      * fix xpu ut
      
      * fix xpu ut
      
      * fix_xpu_compile_error
      Co-authored-by: Weilong Wu <veyron_wu@163.com>
    • B
      Split common funcs from reduction and structure modification (#46970) · ef575d6a
      Bo Zhang committed
      * profile reduce kernel for fp16 and reduceHigherdim
      
      * use reinterpret_cast
      
      * fix for CI on ROCm
      
      * add Macro for ROCm
      
      * ROCm CI config
      
      * ROCm CI config
      
      * unit test repair
      
      * pull
      
      * add common_funcs.h
      
      * reduceType
      
      * Update reduce_function.h
      
      * not higher
      
      * rename
    • S
      Fix fuse_gemm_epilogue (#47805) · 6efc2888
      Shijie committed
      * Fix fuse_gemm_epilogue
      
      * update tests
      
      * Update CMakeLists.txt
      
      * Update CMakeLists.txt
      
      * Update CMakeLists.txt
      
      * fix random seed
      
      * use assert_allclose
      
      * Update test_dist_fuse_gemm_epilogue_pass.py
      
      * Update cpp_pass.py
      
      * Update test_dist_fuse_gemm_epilogue_pass.py
      
      * fix codestyle
      
      * update seed and atol
    • G
      add some compare and logical trt converter (#48592) · 4c38b87e
      gem5 committed
    • R
      fix phi capi kernel registration macro error (#48616) · 0f3b1ad6
      ronnywang committed
      * fix capi kernel registration macro error
      
      * update
    • W
      [Eager, Performance Optimization] modify AllocateFrom to reduce deconstruction of shared_ptr (#48548) · 708c4f88
      Weilong Wu committed