1. 30 Mar, 2022 24 commits
    • R
      [MoE] Moe apis (#41092) · aac7879a
      Roc committed
      * add random routing op
      
      add _random_routing api in utils
      
      add random routing ut
      
      * # This is a combination of 10 commits.
      # The first commit's message is:
      add expert count op
      
      add ut for expert_count
      
      # This is the 2nd commit message:
      
      update UT only for cuda
      
      # This is the 3rd commit message:
      
      fix for rocm
      
      # This is the 4th commit message:
      
      update ut
      
      # This is the 5th commit message:
      
      add moe module
      
      # This is the 6th commit message:
      
      add expert count op
      
      add ut for expert_count
      
      # This is the 7th commit message:
      
      update UT only for cuda
      
      # This is the 8th commit message:
      
      update ut
      
      # This is the 9th commit message:
      
      add moe module
      
      # This is the 10th commit message:
      
      make expert count private
      
      * add assign pos op
      
      * fix upper num name
      
      * add api _assign pos
      
      * add ut for assign pos op
      
      * update date
      
      * add op about moe gate
      
      update utils
      
      add limit by capacity op
      
      add ut for limit_by_capacity
      
      add ut for prune_gate_by_capacity
      
      add ut for limit_by_capacity
      
      add ut for prune_gate_by_capacity
      
      * fix for win
      
      * fix bugs in test_limit_by_capacity_op
      
      * update ut
      
      * update for test (timeout)
      
      * fix ut
      
      * update
      
      * update(fix) ut for win
      
      * moe apis in incubate
      
      * # This is a combination of 10 commits.
      # The first commit's message is:
      add expert count op
      
      add ut for expert_count
      
      # This is the 2nd commit message:
      
      update UT only for cuda
      
      # This is the 3rd commit message:
      
      fix for rocm
      
      # This is the 4th commit message:
      
      update ut
      
      # This is the 5th commit message:
      
      add moe module
      
      # This is the 6th commit message:
      
      add expert count op
      
      add ut for expert_count
      
      # This is the 7th commit message:
      
      update UT only for cuda
      
      # This is the 8th commit message:
      
      update ut
      
      # This is the 9th commit message:
      
      add moe module
      
      # This is the 10th commit message:
      
      make expert count private
      
      * add assign pos op
      
      * fix upper num name
      
      * add api _assign pos
      
      * add ut for assign pos op
      
      * update date
      
      * fix for win
      
      * update for test (timeout)
      
      * fix ut
      
      * update
      
      * fix ut for number count
      
      * add apis and utils
      
      * add gate apis
      
      * add moe and grad clip apis
      
      * update moe apis
      
      * add ops for moe gate
      
      * fix
      
      * update for base moe layer api
      
      * add random routing op
      
      add _random_routing api in utils
      
      add random routing ut
      
      * fix for dygraph
      
      * update with random routing
      
      * update
      
      * fix ut for limit by capacity
      
      * update
      
      * update limit by capacity to make it easy to switch to single-thread mode
      
      * update api docs
      Co-authored-by: hlygit66666 <2570058140@qq.com>
      aac7879a
    • H
      [Op] Fix uncontrolled randomness of index_select op (#41078) · 8f7c02f2
      Haohongxiang committed
      * fix uncontrolled randomness of op
      
      * fix bugs
      8f7c02f2
    • F
      Add new APIs for GPU memory monitoring (max_memory_allocated,... · afe02e9d
      From00 committed
      Add new APIs for GPU memory monitoring (max_memory_allocated, max_memory_reserved, memory_allocated, memory_reserved) (#38657)
      
      * Add new API memory_reserved
      
      * Add memory_allocated, max_memory_reserved and max_memory_allocated
      
      * Fix CI error
      
      * Fix CI error
      
      * Enhance UT
      
      * Add FLAGS_memory_stats_opt
      
      * Add STATS macro functions
      
      * Add StatAllocator
      
      * Fix CI errors
      
      * Add UT
      
      * Fix CI errors
      afe02e9d
    • C
      fix reshard bug (#41106) · e494b73b
      caozhou committed
      e494b73b
    • H
      Revert "Revert "Move some activation to phi (#40727)" (#41056)" (#41095) · 91bb52cd
      hong committed
      This reverts commit 05f3d48e.
      91bb52cd
    • Z
      [DoubleGrad PR #3] Supported higher-order GradNode generation (#41051) · abd2df4c
      Zhanlue Yang committed
      * [Refactor] refactored eager_gen.py PR #2
      
      * [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes
      
      * Fixed minor issue
      
      * Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition
      
      * Fixed issues
      
      * Supported higher-order grad node generation
      
      * [DoubleGrad PR #4] Supported higher-order GradNode generation
      
      * Fixed yaml typo
      abd2df4c
    • P
      add _reset_grad_inplace_version (#41101) · cb8afc24
      pangyoki committed
      cb8afc24
    • 0
      Switch some dy2st UT to eager mode (#41052) · a5bfa797
      0x45f committed
      * Switch some dy2st UT to eager mode
      
      * Add UT
      a5bfa797
    • C
      remove set_value numpy (#41017) · 1042f42e
      crystal committed
      * remove set_value numpy
      
      * optimize code
      
      * optimize to_tensor
      
      * use common function
      Co-authored-by: root <root@yq01-sys-hic-k8s-v100-box-a225-0186.yq01.baidu.com>
      1042f42e
    • A
      [Yaml] Fix topk yaml compilation problem on Windows (#41082) · 95265d5c
      Aurelius84 committed
      * [Yaml] Fix topk yaml compilation on Windows
      
      * fix make_shared
      
      * fix conflict
      95265d5c
    • Y
      add bilinear interpolate v2 to xpu list and unittest, *test=kunlun (#41037) · 4e86dff2
      ykkk2333 committed
      * add bilinear interpolate v2 to xpu list and unittest, *test=kunlun
      
      * Delete ps_usr_print_log
      
      * Delete ps_usr_print_log
      
      * Delete xpu_op_test
      4e86dff2
    • A
      [Eager] Fix legacy always make sense (#41048) · 922e076e
      Aurelius84 committed
      922e076e
    • W
      [Eager] dlpack (#40811) · 4d300224
      wanghuancoder committed
      * dlpack eager, test=develop
      
      * eager test_base_layer, test=develop
      
      * fix error report, test=develop
      
      * eager _getitem_from_offset, test=develop
      
      * refine, test=develop
      
      * refine offset, test=develop
      
      * add test_inner test_outer, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      4d300224
    • X
      Optest refactor (#40998) · 04325d2c
      xiongkun committed
      * first version, maybe many errors
      
      * refactor op_test
      
      * fix compare list
      
      * fix bg
      
      * fix bugs
      
      * skip name
      04325d2c
    • H
      swish and pow op for xpu test=kunlun (#40654) · d951f3af
      houj04 committed
      * swish and pow op for xpu. test=kunlun
      
      * fix code style. test=kunlun.
      
      * use pow_grad xdnn api. test=kunlun.
      d951f3af
    • Z
      fix (#41083) · b1ee9d5e
      zhaocaibei123 committed
      b1ee9d5e
    • P
      support inplace in tensor_method_setitem (#40915) · 7170c687
      pangyoki committed
      * support inplace in tensor_method_setitem
      
      * delete bump_inplace_version
      
      * optimize inplace unittest
      
      * fix
      
      * fix setitem bug
      
      * update eager_generator
      
      * optimize inplace unittest
      
      * little change
      7170c687
    • Z
      9fcb6a1d
    • Z
      Refactor code auto-gene for no_need_buffer (#41025) · 97cd0f51
      zyfncg committed
      * refactor code auto-gene for no_need_buffer
      
      * fix some bug
      
      * delete test code
      97cd0f51
    • P
      support view strategy in dygraph eager_final state (#40891) · 495ca4aa
      pangyoki committed
      * support view strategy in eager_final state
      
      * perfect reshape kernel
      
      * fix bugs of sig
      
      * add unittest for reshape_sig
      
      * fix bugs when running coverage
      
      * fix inplace bug in final_state eager_gen
      
      * fix python_c_gen
      
      * support view strategy for final state
      
      * fix order of out and xshape in reshape
      
      * fix Coverage_CI unittest timeout error
      
      * support reshape view
      
      * fix reshape_sig
      
      * fix yml and api_base
      Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
      495ca4aa
    • Z
      Add timer tool to Profiler (#40386) · 83efeeae
      Zhang Ting committed
      83efeeae
    • W
      Fix argsort cpu kernel when with input of NaN (#41070) · 17af293f
      wawltor committed
      * fix the argsort cpu
      
      * add the test case for the paddle.argsort
      17af293f
    • W
      [Eager] Pylayer (#39989) · 157c1a28
      wanghuancoder committed
      * Supported Complex2Real Conversion for Eager Dygraph
      
      * Supported Complex2Real Conversion for Eager Dygraph
      
      * Enabled complex type promotion test for matmul_v2
      
      * pylayer, test=develop
      
      * Fix CI issues
      
      * Support initializing specific grad tensors to zero for selected operators
      
      * finish forward, test=develop
      
      * create grad node finish, test=develop
      
      * Merged adj_edges_ with GradSlotMeta
      
      * Fixed minor issue
      
      * backward finish, start dbg, test=develop
      
      * Adjusted num runs
      
      * Recovered Eager performance tests configurations
      
      * Recovered Eager performance tests configurations
      
      * finish, test=develop
      
      * polish, test=develop
      
      * polish, test=develop
      
      * refine, test=develop
      
      * eager, test=develop
      
      * Adjusted performance tests configurations
      
      * Fixed Minor Issues with performance tests
      
      * [Phi] Fix macro name typo
      
      * support set_materialize_grads, test=develop
      
      * support mark_non_differentiable, test=develop
      
      * support once_differentiable, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * Moved out Edge from GradSlotMeta
      
      * Fixed issues from merge
      
      * Fixed typo
      
      * Addressed review comments
      
      * Fixed merge issues
      
      * Fixed minor issues
      
      * Fixed minor issue
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * Fixed major issues and enabled auto_prune test cases
      
      * Fixed issues from merge
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      Co-authored-by: jim19930609 <jim19930609@gmail.com>
      Co-authored-by: Aurelius84 <zhangliujie@baidu.com>
      157c1a28
  2. 29 Mar, 2022 14 commits
    • A
      [Eager]Add sort-simple-yaml for automatically sort api|backward.yaml (#41038) · cc52501e
      Aurelius84 committed
      * [Eager]Add sort-simple-yaml for automatically sort api|backward.yaml
      
      * remove it test=document_fix
      
      * refine
      
      * add more yaml
      
      * remove optional
      
      * fix infRT CI
      cc52501e
    • J
      Update of oneDNN to 2.5 (#39426) · 35b96d48
      Jacek Czaja committed
      * - update of oneDNN to 2.5
      
      * - changes to UT testing onednn verbose
      
      * - Update of oneDNN to 2.5.3
      
      * - update onednn to 2.5.4
      35b96d48
    • A
      [Yaml] Refine yaml as order test=document_fix (#41098) · e04493de
      Aurelius84 committed
      e04493de
    • R
      [MoE] Moe apis (#40895) · aeade538
      Roc committed
      * add random routing op
      
      add _random_routing api in utils
      
      add random routing ut
      
      * # This is a combination of 10 commits.
      # The first commit's message is:
      add expert count op
      
      add ut for expert_count
      
      # This is the 2nd commit message:
      
      update UT only for cuda
      
      # This is the 3rd commit message:
      
      fix for rocm
      
      # This is the 4th commit message:
      
      update ut
      
      # This is the 5th commit message:
      
      add moe module
      
      # This is the 6th commit message:
      
      add expert count op
      
      add ut for expert_count
      
      # This is the 7th commit message:
      
      update UT only for cuda
      
      # This is the 8th commit message:
      
      update ut
      
      # This is the 9th commit message:
      
      add moe module
      
      # This is the 10th commit message:
      
      make expert count private
      
      * add assign pos op
      
      * fix upper num name
      
      * add api _assign pos
      
      * add ut for assign pos op
      
      * update date
      
      * add op about moe gate
      
      update utils
      
      add limit by capacity op
      
      add ut for limit_by_capacity
      
      add ut for prune_gate_by_capacity
      
      add ut for limit_by_capacity
      
      add ut for prune_gate_by_capacity
      
      * fix for win
      
      * fix bugs in test_limit_by_capacity_op
      
      * update ut
      
      * update for test (timeout)
      
      * fix ut
      
      * update
      
      * update(fix) ut for win
      
      * moe apis in incubate
      
      * # This is a combination of 10 commits.
      # The first commit's message is:
      add expert count op
      
      add ut for expert_count
      
      # This is the 2nd commit message:
      
      update UT only for cuda
      
      # This is the 3rd commit message:
      
      fix for rocm
      
      # This is the 4th commit message:
      
      update ut
      
      # This is the 5th commit message:
      
      add moe module
      
      # This is the 6th commit message:
      
      add expert count op
      
      add ut for expert_count
      
      # This is the 7th commit message:
      
      update UT only for cuda
      
      # This is the 8th commit message:
      
      update ut
      
      # This is the 9th commit message:
      
      add moe module
      
      # This is the 10th commit message:
      
      make expert count private
      
      * add assign pos op
      
      * fix upper num name
      
      * add api _assign pos
      
      * add ut for assign pos op
      
      * update date
      
      * fix for win
      
      * update for test (timeout)
      
      * fix ut
      
      * update
      
      * fix ut for number count
      
      * add apis and utils
      
      * add gate apis
      
      * add moe and grad clip apis
      
      * update moe apis
      
      * add ops for moe gate
      
      * fix
      
      * update for base moe layer api
      
      * add random routing op
      
      add _random_routing api in utils
      
      add random routing ut
      
      * fix for dygraph
      
      * update with random routing
      
      * update
      
      * fix ut for limit by capacity
      
      * update
      Co-authored-by: hlygit66666 <2570058140@qq.com>
      aeade538
    • W
      add elementwise sub and elementwise div in tensorrt op teller (#40806) · f3022dfa
      wangxinxin08 committed
      * add elementwise sub and elementwise div in tensorrt op teller
      
      * add unittest of elementwise mul, sub and div
      f3022dfa
    • Z
      Add Sparse op sparse_relu (#40959) · c544a181
      zhangkaihuo committed
      c544a181
    • T
      Revert "Move some activation to phi (#40727)" (#41056) · 05f3d48e
      tianshuo78520a committed
      This reverts commit e77a947e.
      05f3d48e
    • S
      Add Identity module name in __init__ (#39615) · 869287f8
      shiyutang committed
      * add_module_in_init_
      
      * Update __init__.py
      
      * Update __init__.py
      869287f8
    • H
      fix lrn bug in export model · bea725bb
      huangjun12 committed
      bea725bb
    • Z
    • Z
      [MLU]add reduce op mlu kernel (#41028) · d1c1d731
      zn committed
      d1c1d731
    • 0
      Use _C_ops.yolov3_loss in eager mode for test_yolov3.py (#40831) · 3b381aac
      0x45f committed
      * Use _C_ops.yolov3_loss in eager mode for test_yolov3.py
      
      * fix code for test_yolov3_loss_op
      
      * remove useless import
      
      * Fix dygraph_mode flag
      3b381aac
    • A
      [Eager]Switch new Eager mode (#40990) · 55f9b71a
      Aurelius84 committed
      * [Eager]Switch new Eager mode
      
      * switch into eager
      
      * fix typo
      55f9b71a
    • J
      support env variable control flags (#41013) · 5de41ef2
      Jiabin Yang committed
      5de41ef2
  3. 28 Mar, 2022 2 commits
    • H
      Move meshgrid to phi (#40994) · ca871957
      hong committed
      * move momentum, rmsprop to phi; test=develop
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update; test=develop
      
      * fix xpu npu bugs; test=develop
      
      * fix npu bug; test=develop
      
      * fix windows compile error; test=develop
      
      * fix windows compile error; test=develop
      
      * polish code; test=develop
      
      * fix conflict; test=develop
      
      * add meshgrid;
      
      * update
      
      * polish code
      
      * polish code;
      
      * fix bug
      
      * format; remove useless code
      
      * fix npu bug
      
      * fix bug
      ca871957
    • H
      Move some activation to phi (#40727) · e77a947e
      hong committed
      * update
      
      * add forward case
      
      * update
      
      * update; test=develop
      
      * add some grad kernel; test=develop
      
      * move gpu kernel; test=develop
      
      * update
      
      * update;
      
      * update test;
      
      * fix selected rows bug;
      
      * add mix vector include ;
      
      * add mixed vector depen; test=develop
      
      * add logit grad signature;
      
      * polish code
      
      * fix bug;
      
      * add namespace for abs
      
      * revert code
      
      * not move softsign
      
      * remove duplicate register;
      
      * fix softsign bug
      
      * polish code
      
      * format
      
      * format
      
      * fix bug
      
      * remove cmake dep
      
      * add square sqrt selected rows support
      
      * update
      
      * remove clip norm
      
      * add standalone executor sqrt dep
      
      * standalone exec denp sqrt
      
      * remove sqrt op in cmakelist
      
      * open some case
      e77a947e