1. 28 Mar, 2022 3 commits
  2. 27 Mar, 2022 4 commits
    • X
      [ Optest ] refactor optest check_output_with_place logic (#40928) · 37f914c8
      xiongkun committed
      * first version, maybe many errors
      
      * refactor op_test
      
      * fix compare list
      
      * fix bug
      
      * fix bugs
    • L
      [new-exec] fit for mkldnn and inplace op (#40955) · afa0e82c
      Leo Chen committed
      * fit for mkldnn and inplace op
      
      * fix compile
      
      * refine ut
      
      * register op version
      
      * fix inplace op
      
      * fix transfer_layout
    • H
      Move slice to phi (#40736) · b8236b7b
      hong committed
      * move slice to pten
      
      * merge develop; test=develop
      
      * fix slice bug;
      
      * update
      
      * update
      
      * fix error
      
      * update
      
      * fix bug
      
      * polish code
      
      * polish code
      
      * polish code
      
      * try to fix windows bug
      
      * add gpu compile flag;
      
      * try to fix
      
      * remove template;
      
      * polish code;
      
      * fix npu bug;
      
      * fix npu bug
      
      * fix npu bug; test=develop
      
      * fix slice bug;
      
      * remove no need dep
    • A
      [NPU] fix npu cast ut (#40982) · f6b6b057
      Aganlengzi committed
      * [NPU] fix npu cast ut
      
      * [NPU] fix npu cast ut
  3. 26 Mar, 2022 1 commit
  4. 25 Mar, 2022 11 commits
  5. 24 Mar, 2022 10 commits
    • Z
      [AMP] Support amp for Intermediate_dygraph (#40623) · c12f7d48
      zhangbo9674 committed
      * approve amp for intermediate_dygraph
      
      * add amp_utils for intermediate_dygraph
      
      * add amp needcast check for mlu & npu
      
      * test unittest
      
      * add SetGradNode for set_stop_gradient && add checktensor for GradientHooks
      
      * refine code
      
      * refine unittest of imperative_amp for new dygraph
      
      * inplace api skip amp
      
      * add test_imperative_qat_amp for intermediate amp
      
      * refine code
      
      * refine test_amp ci strategy
      
      * refine unittest code
      
      * refine amp_utils code
      
      * refine amp getpromotetype for some special op
      
      * refine unittest code
    • R
      [MoE]Assign pos op (#40580) · 305f32d1
      Roc committed
      * # This is a combination of 10 commits.
      # The first commit's message is:
      add expert count op
      
      add ut for expert_count
      
      # This is the 2nd commit message:
      
      update UT only for cuda
      
      # This is the 3rd commit message:
      
      fix for rocm
      
      # This is the 4th commit message:
      
      update ut
      
      # This is the 5th commit message:
      
      add moe module
      
      # This is the 6th commit message:
      
      add expert count op
      
      add ut for expert_count
      
      # This is the 7th commit message:
      
      update UT only for cuda
      
      # This is the 8th commit message:
      
      update ut
      
      # This is the 9th commit message:
      
      add moe module
      
      # This is the 10th commit message:
      
      make expert count private
      
      * add assign pos op
      
      * fix upper num name
      
      * add api _assign pos
      
      * add ut for assign pos op
      
      * update date
      
      * fix for win
      
      * update for test (timeout)
      
      * fix ut
      
      * update
      
      * fix ut for number count
      Co-authored-by: hlygit66666 <2570058140@qq.com>
    • L
      Wrap dist api for dygraph mode (#40408) · 9d8cfc1b
      lilong12 committed
    • X
      [Auto Parallel] Gradient merge pass support dist attribute (#40737) · 0443c6f4
      xiayanming committed
      * [Auto Parallel] gradient merge pass support dist attribute
    • Z
      a8f86600
    • K
      fix device id env (#40844) · 8562668e
      kuizhiqing committed
    • X
      Polish optest: refine the optest parameter logic. support name, dtype, out,... · a8df3901
      xiongkun committed
      Polish optest: refine the optest parameter logic. support name, dtype, out, output in arbitrary position (#40824)
      
      * 1. add the python api grad 2. add final and intermediate state vlog 3. change the python_api error logic
      
      * add python api or close the check_eager=True
      
      * fix the compatibility
      
      * matmul
      
      * disable unittests: test_elementwise_add_op test_scatter_nd_op test_gather_nd_op test_scatter_op test_index_sample_op test_elementwise_add_mkldnn_op
      
      * refine the logic of prepara_parameter logic
      
      * fix Tensor(gpu) to Scalar segmentation fault.
    • 0
      Refine eager run_program OP for dy2st UT (#40768) · 4ccd5cb8
      0x45f committed
      * Refine eager run_program OP for dy2st UT
      
      * append run_program error string and refine run_program_grad
      
      * remove some comments
      
      * refine ConstructXGradTensors
    • C
      [Auto Parallel] Update cost model (#40457) · c1c9368f
      caozhou committed
      * refactor cost model
  6. 23 Mar, 2022 11 commits
    • J
      Added support for BF16 datatype for all oneDNN activation kernels (#40721) · 8e67629c
      jakpiase committed
      * added missing BF16 activations
      
      * added softplus bf16
      
      * minor change
      
      * disabled tests for GPU
    • F
      [NPU] add npu support for conv3d and conv3d_grad (#38480) · ff568afa
      furnace committed
      * [NPU] add npu support for conv3d and conv3d_grad
      
      * [NPU] delete failed unittests due to Ascend not support
      
      * [NPU] delete debug codes
      
      * [NPU] optimize codes, notest
      
      * [NPU] remove const_cast
      
      * [NPU] optimize for remove const_cast
      
      * [NPU] fix written errors
    • Z
      two-phase training for ps (#40762) · b1a4668c
      zhaocaibei123 committed
      * fix benchmark and communicator config
      
      * fix bugs of the_one_ps
      
      * multi program and fix bug in optimizer
      
      * multi program in the_one_ps
      
      * public commcontext
      
      * ps optimizer multi programs
      
      * cvm & datanorm backend
      
      * fix dim
      
      * fix unittest
      
      * fix
      
      * the one ps merge
      
      * remove comm
      
      * add DownpourLiteWorker
      
      * all
      
      * fix
      
      * fix
      
      * device worker downpour lite
      
      * fix
      
      * fix bug in global shuffle
      
      * save inference model
      
      * fix & add log
      
      * fix
      
      * remove log
      
      * fix
      
      * fix save summary
      
      * fix
      
      * fix pscore
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * remove logs
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * add some comments
      
      * fix
      Co-authored-by: esythan <esythan@126.com>
    • Z
      [AutoParallel] engine & dist_saver (#40528) · 3980e222
      zhaoyingli committed
      * add dist_saver and update engine
      
      * add dist_saver and update engine
    • W
      [Eager Hook + Inplace] Refactor register_hook and test with inplace operation (#40778) · ff7cbaae
      Weilong Wu committed
      * disable scatter case in test_inplace_eager_fluid
      
      * Update register_hook logic
      
      * Add register_hook test cases
      Co-authored-by: pangyoki <pangyoki@126.com>
    • J
      Support sharding (#40637) · fe291daf
      Jiabin Yang committed
      * support sharding api
      
      * support multi api for sharding in eager
      
      * support multi api for sharding in eager
      
      * fix test
      
      * fix test coverage
    • H
      Add yaml config part2 (#40742) · f4075db8
      hong committed
      * fix error; test=develop
      
      * update
      
      * close some yaml
      
      * fix backward attribute error; test=develop
      
      * add div test
      
      * polish code; test=develop
      
      * remove non-GBK characters;
      
      * remove some yaml;
      
      * fix optional bug
      
      * recover yaml config
      
      * resolve conflict; test=develop
      
      * close div; test=develop
    • W
      [Eager] Slice (#40587) · b07d239c
      wanghuancoder committed
      * fix some slice bug, test=develop
      
      * eager slice, test=develop
      
      * eager slice, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * fix bug, test=develop
      
      * refine, test=develop
      
      * rename function name, test=develop
    • Z
      Support initializing specific grad tensors to zero for selected operators (#39963) · 2f50ae99
      Zhanlue Yang committed
      * Supported Complex2Real Conversion for Eager Dygraph
      
      * Supported Complex2Real Conversion for Eager Dygraph
      
      * Enabled complex type promotion test for matmul_v2
      
      * Fix CI issues
      
      * Support initializing specific grad tensors to zero for selected operators
      
      * Merged adj_edges_ with GradSlotMeta
      
      * Fixed minor issue
      
      * Adjusted num runs
      
      * Recovered Eager performance tests configurations
      
      * Recovered Eager performance tests configurations
      
      * Adjusted performance tests configurations
      
      * Fixed Minor Issues with performance tests
      
      * Moved out Edge from GradSlotMeta
      
      * Fixed issues from merge
      
      * Fixed typo
      
      * Addressed review comments
      
      * Fixed merge issues
      
      * Fixed minor issues
      
      * Fixed minor issue
      
      * Fixed major issues and enabled auto_prune test cases
      
      * Fixed issues from merge
    • K
      Add complex type compatibility for stft api and stft op. (#40113) · 319f95d0
      KP committed
      * Add stft_op.
      
      * Add stft_grad_op.
      
      * Add stft_op unittest.
      
      * [DLTP-45176] Add complex compatibility in static mode for stft api.
      
      * [DLTP-45176] Add complex compatibility in static mode for stft api.
      
      * Add doc.
      
      * Update unittests of stft op.
      
      * Update spectral helper.
      
      * fix coding style.
    • C
      Add profiler features (#40357) · c15e3823
      chenjian committed
      * add event record for model profiling
      
      * fix format
      
      * fix format
      
      * fix code example bug
      
      * no
      
      * add profiler statistic
      
      * add profiler feature
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * required: gpu
      
      * required: gpu
      
      * fix bug
      
      * required: gpu
      
      * fix ci bug
      
      * fix ci error
      
      * fix ci error
      
      * upgrade document
      
      * fix doc
      
      * fix ci bug
      
      * add doc and fix bug
      
      * nothing
      
      * fix bug
      
      * fix format bug
      
      * modify format
      
      * add deprecated description for old profiler
      
      * fix bug
      
      * fix bug
      
      * fix
      
      * add load_profiler_result doc
      
      * add load_profiler_result doc
      
      * add load_profiler_result doc
      
      * help fix old profiler sample code
      
      * add api doc
      
      * fix format
      
      * fix api doc
      
      * fix api doc format
      
      * fix api doc format
      
      * fix api doc c format
      
      * fix api doc format