1. 16 Dec, 2021 20 commits
    • pylayer support HIP (#38184) · 2e76d5ad
      chentianyu03 committed
    • Fixed LD_LIBRARY_PATH for eager_code_generator (#38160) · af30f545
      Zhanlue Yang committed
      * Rearranged Eager AutoCodeGen directory structure
      
      * Removed USE_OP in Eager AutoCodeGen
      
      * Enabled generation for Operators without Grad/Inputs/Outputs
      
      * Resolved operators without input
      
      * Fixed merge conflicts
      
      * Enabled Eager AutoCodeGen for 10+ more operators
      
      * Refactored Eager AutoCodeGen with more organized helper objects
      
      * Enabled Eager AutoCodeGen for operators with multiple OpBases
      
      * Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument
      
      * Handled Dispensable Inputs/Outputs in Eager AutoCodeGen
      
      * Enabled Eager AutoCodeGen for All Existing Operators & Possible Future Operators
      
      * Fixed CI issues
      
      * Fixed LD_LIBRARY_PATH for eager_code_generator
    • Add arc hyperbolic function op (#37076) · 36b7368d
      xiaoting committed
      * add activation
      
      * update activation_op
      
      * add unittest for activation
      
      * fix acosh for init, test=develop
    • Conv transpose eltwiseadd bn fuse pass (#37800) · e64f0997
      feng_shuai committed
      * conv_transpose_eltwiseadd_bn_fuse_pass
      
      * change timeout
      
      * add TIMEOUT
      
      * add random num for group and dilation
      
      * change PassCompat
    • Revert "modify the fix_seed attribute in dropout op is a def... · 464f2af8
      王明冬 committed
      Revert "modify the fix_seed attribute in dropout op is a def attribute.test=develop (#38100)" (#38127)
      
      This reverts commit f44add7b.
    • Add tests for PaddleInference Pass (#37676) · 96597a85
      yeliang2258 committed
      * add test for conv_elementwise_add2_act_fuse_pass and conv_elementwise_add_act_fuse_pass
      
      * Add conv_eltwiseadd_bn_fuse_pass test and fix test_conv_elementwise_addX_act_fuse_pass
      
      * add tests for conv_act_mkldnn_fuse_pass
      
      * add test for conv_bias_mkldnn_fuse_pass
      
      * update code
      
      * add conv_act_mkldnn_fuse_pass for relu, relu6, swish, leaky_relu
      
      * update test
      
      * update
      
      * update bug
      
      * update
      
      * update pattern_detector
      
      * fix test_conv_eltwiseadd_bn_fuse_pass
      
      * add diff display notest;test=windows_ci_inference
      
      * fix
      
      * remove test_conv_act_mkldnn_fuse_pass.py
      
      * fix
    • [PTen] Unify device context entrance in pten part 2 (#38182) · e02537f9
      Chen Weihang committed
      * unify device context entrance
      
      * move all_context include to header
      
      * polish cmake relay for device_context
      
      * fix npu compile failed
      
      * fix npu compile failed
    • add default value for disable_ut (#38110) · 55509ae7
      YUNSHEN XIE committed
    • [PTen] Add register_ctx_kernel macro and move scale kernel (#38121) · af498677
      Chen Weihang committed
      * add register_ctx_kernel and move scale kernel
      
      * polish details by reviewer comment
      
      * fix xpu compile failed
      
      * fix cmake error
    • support eager switch system (#38170) · 8305c2be
      Jiabin Yang committed
      * support eager switch system
      
      * polish code
    • [psgpu]add checknan print and fix trainer device (#38131) · 092839d6
      danleifeng committed
      * trainer_device fix and checknan tool for psgpu;test=develop
      
      * disable show_one_table;test=develop
    • Adapt host event recorder to profiler (#37766) · 5b6be4d7
      liutiexing committed
      * add align for WorkQueue
      
      * add spinlock
      
      * merge develop
      
      * merge
      
      * Add EventsWaiter
      
      * Revert "Add EventsWaiter"
      
      This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.
      
      * add os_info
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update for bugfix
      
      * update
      
      * update
      
      * update
      Co-authored-by: liutiexing <liutiexing@google.com>
    • Add fmax and fmin operators (#37826) · dd3afc9d
      LJQ❤️ committed
      Add elementwise_fmax and elementwise_fmin operators
    • Add sparse_attention mask, test=develop (#37973) · fa463b90
      Liu-xiandong committed
      Add key_padding_mask and attn_mask to the sparse_attention API

      1. The key padding mask is a tensor of shape [batch_size, seq_len], and the attention mask is a tensor of shape [seq_len, seq_len]. Both masks use the same data type as Q, K, and V, which is float32 or float64. A value of 0 in a mask means that the position is masked.

      2. The changed files are mainly paddle/fluid/operators/sparse_attention_op.cu and python/paddle/fluid/tests/unittests/test_sparse_attention_op.py. sparse_attention has three parts: sddmm, softmax, and dsd. Adding the mask only requires modifying the softmax part; the other two parts are unaffected. Tests for the mask function have also been added.
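The mask semantics described in the commit above can be illustrated with a minimal NumPy sketch. This is not the CUDA kernel from sparse_attention_op.cu: it works on a dense score tensor rather than a sparse layout, and the function name `masked_softmax`, the `[batch_size, num_heads, seq_len, seq_len]` score shape, and the choice to return zeros for fully masked rows are all assumptions made for illustration. Only the mask shapes and the "0 means masked" convention come from the commit message.

```python
import numpy as np

def masked_softmax(scores, key_padding_mask=None, attn_mask=None):
    """Dense softmax over the last axis, skipping positions whose mask is 0.

    scores:           [batch_size, num_heads, seq_len, seq_len] (assumed layout)
    key_padding_mask: [batch_size, seq_len], 0 marks a masked key position
    attn_mask:        [seq_len, seq_len], 0 marks a masked (query, key) pair
    """
    keep = np.ones(scores.shape, dtype=bool)
    if key_padding_mask is not None:
        # Broadcast over heads and query positions.
        keep &= (key_padding_mask != 0)[:, None, None, :]
    if attn_mask is not None:
        # Broadcast over batch and heads.
        keep &= (attn_mask != 0)[None, None, :, :]

    masked = np.where(keep, scores, -np.inf)
    row_max = np.max(masked, axis=-1, keepdims=True)
    # A fully masked row has max -inf; use 0 there to avoid inf - inf = NaN.
    row_max = np.where(np.isfinite(row_max), row_max, 0.0)
    exp = np.exp(masked - row_max)
    denom = np.sum(exp, axis=-1, keepdims=True)
    # Fully masked rows sum to 0; this sketch returns zeros for them.
    return np.divide(exp, denom, out=np.zeros_like(exp), where=denom > 0)
```

Because masking happens only inside the softmax step, the surrounding matrix products (sddmm before it, dsd after it) would not need to change, which matches the commit's description.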
    • Add the transformop parameter in TensorReduceFunctorImpl (#38135) · 524389ee
      niuliling123 committed
      * Add the transformop parameter in TensorReduceFunctorImpl
    • [Pten]Modify registered kernel name (#38109) · be874c08
      YuanRisheng committed
      * Reduce reshape kernel functions in pten
      
      * delete notes
      
      * fix bugs when compile
      
      * modify register name
      
      * fix compile bugs
    • [PTen] Unify device context entrance in pten part 1 (#38172) · 047ee26c
      Chen Weihang committed
      * unify device context entrance
      
      * move all_context include to header
      
      * polish cmake relay for device_context
      
      * fix npu compile failed
      
      * fix npu compile failed
      
      * revert part of change
    • pylayer support tuple/list type args and fix check args bug (#38146) · 861053eb
      chentianyu03 committed
      * Revert "Revert "pylayer support tuple/list type args (#37727)" (#37956)"
      
      This reverts commit d848ff04.
      
      * move check args,kwargs before forward execute
    • Add float16 type for scatter op. (#38136) · 9bac4a76
      Li Min committed
      * Add float16 type for scatter op.
      
      * Add fp16 test for scatter op.
      
      * Add int and int64 support for scatter_grad on gpu.
      
      * Add int and int64 for check_variable_and_dtype routine.
      
      * Minors.
      
      * Code format.
    • Enabled Eager AutoCodeGen for All Existing Operators & Possible Future Operators (#37969) · 08482a86
      Zhanlue Yang committed
      * Rearranged Eager AutoCodeGen directory structure
      
      * Removed USE_OP in Eager AutoCodeGen
      
      * Enabled generation for Operators without Grad/Inputs/Outputs
      
      * Resolved operators without input
      
      * Fixed merge conflicts
      
      * Enabled Eager AutoCodeGen for 10+ more operators
      
      * Refactored Eager AutoCodeGen with more organized helper objects
      
      * Enabled Eager AutoCodeGen for operators with multiple OpBases
      
      * Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument
      
      * Handled Dispensable Inputs/Outputs in Eager AutoCodeGen
      
      * Enabled Eager AutoCodeGen for All Existing Operators & Possible Future Operators
      
      * Fixed CI issues
  2. 15 Dec, 2021 14 commits
  3. 14 Dec, 2021 6 commits