1. 30 Mar, 2022 7 commits
    • X
      Optest refactor (#40998) · 04325d2c
      xiongkun committed
      * first version, maybe many errors
      
      * refactor op_test
      
      * fix compare list
      
      * fix bg
      
      * fix bugs
      
      * skip name
      04325d2c
    • H
      swish and pow op for xpu test=kunlun (#40654) · d951f3af
      houj04 committed
      * swish and pow op for xpu. test=kunlun
      
      * fix code style. test=kunlun.
      
      * use pow_grad xdnn api. test=kunlun.
      d951f3af
    • P
      suppor inplace in tensor_method_setitem (#40915) · 7170c687
      pangyoki committed
      * suppor inplace in tensor_method_setitem
      
      * delete bump_inplace_version
      
      * optimize inplace unittest
      
      * fix
      
      * fix setitem bug
      
      * update eager_generator
      
      * optimize inplace unittest
      
      * little change
      7170c687
    • P
      support view strategy in dygraph eager_final state (#40891) · 495ca4aa
      pangyoki committed
      * support view strategy in eager_final state
      
      * perfect reshape kernel
      
      * fix bugs of sig
      
      * add unittest for reshape_sig
      
      * fix bugs when run converage
      
      * fix inplace bug in final_state eager_gen
      
      * fix python_c_gen
      
      * support view strategy for final state
      
      * fix order of out and xshape in reshape
      
      * fix Coverage_CI unittest timeout error
      
      * support reshape view
      
      * fix reshape_sig
      
      * fix yml and api_base
      Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
      495ca4aa
    • Z
      Add timer tool to Profiler (#40386) · 83efeeae
      Zhang Ting committed
      83efeeae
    • W
      Fix argsort cpu kernel when with input of NaN (#41070) · 17af293f
      wawltor committed
      * fix the argosrt cpu
      
      * add the test case for the paddle.argsort
      17af293f
    • W
      [Eager] Pylayer (#39989) · 157c1a28
      wanghuancoder committed
      * Supported Complex2Real Conversion for Eager Dygraph
      
      * Supported Complex2Real Conversion for Eager Dygraph
      
      * Enabled complex type promotion test for matmul_v2
      
      * pylayer, test=develop
      
      * Fix CI issues
      
      * Support initializing specific grad tensors to zero for selected operators
      
      * finish forward, test=develop
      
      * create grad node finish, test=develop
      
      * Merged adj_edges_ with GradSlotMeta
      
      * Fixed monir issue
      
      * backward finish, start dbg, test=develop
      
      * Adjusted num runs
      
      * Recovered Eager performance tests configurations
      
      * Recovered Eager performance tests configurations
      
      * finish, test=develop
      
      * polish, test=develop
      
      * polish, test=develop
      
      * refine, test=develop
      
      * eager, test=develop
      
      * Adjusted performance tests configurations
      
      * Fixed Minor Issues with performance tests
      
      * [Phi] Fix macro name typo
      
      * support set_materialize_grads, test=develop
      
      * suppotr mark_non_differentiable, test=develop
      
      * support once_differentiable, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * Moved out Edge from GradSlotMeta
      
      * Fixed issues from merge
      
      * Fixed typo
      
      * Addressed review comments
      
      * Fixed merge issues
      
      * Fixed minor issues
      
      * Fixed minor issue
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * Fixed major issues and enabled auto_prune test cases
      
      * Fixed issues from merge
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      Co-authored-by: jim19930609 <jim19930609@gmail.com>
      Co-authored-by: Aurelius84 <zhangliujie@baidu.com>
      157c1a28
  2. 29 Mar, 2022 8 commits
    • J
      Update of oneDNN to 2.5 (#39426) · 35b96d48
      Jacek Czaja committed
      * - update of oneDNN to 2.5
      
      * - changes to UT testing onednn verbose
      
      * - Update of oneDNN to 2.5.3
      
      * - update onednn to 2.5.4
      35b96d48
    • R
      [MoE] Moe apis (#40895) · aeade538
      Roc committed
      * add random routing op
      
      add _random_routing api in utils
      
      add random routing ut
      
      * # This is a combination of 10 commits.
      # The first commit's message is:
      add expert count op
      
      add ut for expert_count
      
      # This is the 2nd commit message:
      
      update UT only for cuda
      
      # This is the 3rd commit message:
      
      fix for rocm
      
      # This is the 4th commit message:
      
      update ut
      
      # This is the 5th commit message:
      
      add moe module
      
      # This is the 6th commit message:
      
      add expert count op
      
      add ut for expert_count
      
      # This is the 7th commit message:
      
      update UT only for cuda
      
      # This is the 8th commit message:
      
      update ut
      
      # This is the 9th commit message:
      
      add moe module
      
      # This is the 10th commit message:
      
      make expert count private
      
      * add assign pos op
      
      * fix upper num name
      
      * add api _assign pos
      
      * add ut for assign pos op
      
      * update date
      
      * add op about moe gate
      
      update utils
      
      add limit by capacity op
      
      add ut for limit_by_capacity
      
      add ut for prune_gate_by_capacity
      
      add ut for limit_by_capacity
      
      add ut for prune_gate_by_capacity
      
      * fix for win
      
      * fix bugs in test_limit_by_capacity_op
      
      * update ut
      
      * update for test (timeout)
      
      * fix ut
      
      * update
      
      * update(fix) ut for win
      
      * moe apis in incubate
      
      * # This is a combination of 10 commits.
      # The first commit's message is:
      add expert count op
      
      add ut for expert_count
      
      # This is the 2nd commit message:
      
      update UT only for cuda
      
      # This is the 3rd commit message:
      
      fix for rocm
      
      # This is the 4th commit message:
      
      update ut
      
      # This is the 5th commit message:
      
      add moe module
      
      # This is the 6th commit message:
      
      add expert count op
      
      add ut for expert_count
      
      # This is the 7th commit message:
      
      update UT only for cuda
      
      # This is the 8th commit message:
      
      update ut
      
      # This is the 9th commit message:
      
      add moe module
      
      # This is the 10th commit message:
      
      make expert count private
      
      * add assign pos op
      
      * fix upper num name
      
      * add api _assign pos
      
      * add ut for assign pos op
      
      * update date
      
      * fix for win
      
      * update for test (timeout)
      
      * fix ut
      
      * update
      
      * fix ut for number count
      
      * add apis and utils
      
      * add gate apis
      
      * add moe and grad clip apis
      
      * update moe apis
      
      * add ops for moe gate
      
      * fix
      
      * update for base moe layer api
      
      * add random routing op
      
      add _random_routing api in utils
      
      add random routing ut
      
      * fix for dygraph
      
      * update with ranodm routing
      
      * update
      
      * fix ut for limit by capacity
      
      * update
      Co-authored-by: hlygit66666 <2570058140@qq.com>
      aeade538
    • W
      add elementwise sub and elementwise div in tensorrt op teller (#40806) · f3022dfa
      wangxinxin08 committed
      * add elementwise sub and elementwise div in tensorrt op teller
      
      * add unittest of elementwise mul, sub and div
      f3022dfa
    • Z
      Add Sparse op sparse_relu (#40959) · c544a181
      zhangkaihuo committed
      c544a181
    • T
      Revert "Move some activation to phi (#40727)" (#41056) · 05f3d48e
      tianshuo78520a committed
      This reverts commit e77a947e.
      05f3d48e
    • Z
    • Z
      [MLU]add reduce op mlu kernel (#41028) · d1c1d731
      zn committed
      d1c1d731
    • A
      [Eager]Switch new Eager mode (#40990) · 55f9b71a
      Aurelius84 committed
      * [Eager]Switch new Eager mode
      
      * switch into eager
      
      * fix typo
      55f9b71a
  3. 28 Mar, 2022 13 commits
  4. 27 Mar, 2022 4 commits
    • X
      [ Optest ] refactor optest check_output_with_place logic (#40928) · 37f914c8
      xiongkun committed
      * first version, maybe many errors
      
      * refactor op_test
      
      * fix compare list
      
      * fix bg
      
      * fix bugs
      37f914c8
    • L
      [new-exec] fit for mkldnn and inplace op (#40955) · afa0e82c
      Leo Chen committed
      * fit for mkldnn and inplace op
      
      * fix compile
      
      * refine ut
      
      * register op version
      
      * fix inplace op
      
      * fix transfer_layout
      afa0e82c
    • H
      Move slice to phi (#40736) · b8236b7b
      hong committed
      * move slice to pten
      
      * merge develop; test=develop
      
      * fix slice bug;
      
      * update
      
      * update
      
      * fix error
      
      * update
      
      * fix bug
      
      * polish code
      
      * polish code
      
      * polish code
      
      * try to fix windows bug
      
      * add gpu compile flag;
      
      * try to fix
      
      * remov template;
      
      * polish code;
      
      * fix npu bug;
      
      * fix npu bug
      
      * fix npu bug; test=develop
      
      * fix slice bug;
      
      * remove no need dep
      b8236b7b
    • A
      [NPU] fix npu cast ut (#40982) · f6b6b057
      Aganlengzi committed
      * [NPU] fix npu cast ut
      
      * [NPU] fix npu cast ut
      f6b6b057
  5. 25 Mar, 2022 8 commits
    • H
      update eager code gen (#40924) · afe2fdd1
      hong committed
      * update
      
      * remove useless code
      
      * remove label smooth test
      
      * polish code
      
      * polish code
      
      * polish code
      
      * remove _in_eager_mode error;
      afe2fdd1
    • Z
      [MLU]add allreduce max/prod/min mlu kernel (#40792) · 9261dff4
      zn committed
      9261dff4
    • Z
      add cast_grad phi kernel (#40798) · b79c6a9b
      zhangbo9674 committed
      * add cast_grad phi kernel
      
      * refie unittest
      
      * refien unittest
      
      * refine unittest
      
      * refine include header path
      
      * refien xpu cast unittest
      
      * refine code
      b79c6a9b
    • Z
      support multi_dims for tril_triu, *test=kunlun (#40712) · 9ffedcfd
      z8hanghuan committed
      * support multi_dims for tril_triu, *test=kunlun
      
      * support multi_dims for tril_triu, *test=kunlun
      
      * support multi_dims for tril_triu, *test=kunlun
      
      * update xpu.cmake date, support multi_dims for tril_triu, *test=kunlun
      9ffedcfd
    • Z
      change CUDA implementation of dropout OP (#40874) · 1c01d1cc
      zhouweiwei2014 committed
      1c01d1cc
    • J
      Refactor Dygraph Flags (#40786) · 3085d5e4
      Jiabin Yang committed
      * refactor eager flags
      
      * fix flags error when we switch from eager to dygraph
      
      * fix ci problem
      
      * fix ci
      
      * fix ci
      
      * merge develop and fix code style
      
      * merge develop and fix code style
      
      * fix op test error
      
      * fix op test error
      
      * fix op test error
      
      * fix op test error
      
      * fix op test error
      
      * merge develop
      3085d5e4
    • T
      fix xpu op test, *test=kunlun (#40862) · 1db9cd46
      TTerror committed
      1db9cd46
    • X
      [OpTest] Polish optest (#40879) · d43e8433
      xiongkun committed
      * 1. add the python api grad 2. add final and intermediate state vlog 3. change the python_api error logic
      
      * add python api or close the check_eager=True
      
      * fix the compatibility
      
      * matmul
      
      * disable unittests: test_elementwise_add_op test_scatter_nd_op test_gather_nd_op test_scatter_op test_index_sample_op test_elementwise_add_mkldnn_op
      
      * refine the logic of prepara_parameter logic
      
      * fix Tensor(gpu) 2 Scalar segment fault.
      
      * add multi-attribute. (test_unsqueeze_op); add python_sig_out for customizing op sig out
      
      * fix some bugs, support python_out_sig
      d43e8433