1. 22 Feb, 2022 2 commits
  2. 21 Feb, 2022 18 commits
  3. 20 Feb, 2022 4 commits
  4. 19 Feb, 2022 9 commits
    • A
      [Pten]Unify paddle/pten::framework::ddim into pten::ddim (#39614) · 2fe04264
      Aurelius84 committed
      * Unify paddle/pten::framework::ddim into pten::ddim
      
      * fix paddle namespace
      
      * compile successfully
      
      * fix npu src file
      
      * fix conflict
      
      * fix conflict
      
      * fix tensorrt compiler error
      
      * fix conflict
      
      * fix conflict
      
      * fix test file conflict
      
      * fix conflict
      
      * fix mlu file conflict
      
      * fix mlu file conflict
      
      * fix cinn header file conflict
      
      * fix conflict
      
      * fix conflict
      
      * fix conflict
      
      * fix conflict
    • Z
      [Pten] Add selected_rows kernel for Full (#39465) · 79f8eeca
      zyfncg committed
      * Add selected_rows kernel for full
      
      * remove fill_constant register in fluid
      
      * fix bug without GPU
      
      * add jit_kernel_helper dependency for fc
      
      * do some refactor
      
      * add unittest for ops signatures
      
      * add coverage unittest
      
      * fix merge conflict
      
      * fix full selected_rows bug
    • C
      Update record interface using part1 (#39693) · eec6ef81
      chenjian committed
      * fix RecordEvent interface
      
      * modify default level to 4
      
      * update interface use
      
      * add const default trace level
      
      * update record event interface using
      
      * update operator.cc
      
      * update part1
      
      * fix include profiler.h header in ps server
      
      * fix include profiler.h header in ps server
    • Z
      Enabled test_matmul_v2_op for final state Eager Dygraph (#39504) · 77625d7d
      Zhanlue Yang committed
      * Enabled test_matmul_v2_op for final state Eager Dygraph
      
      * Fixed minor issue
      
      * Fixed format issue
    • C
      [PTen] Support parse cc file in gpu (#39691) · b29c05c7
      Chen Weihang committed
      * support parse cc in gpu
      
      * change file name
    • C
      fix RecordEvent interface (#39675) · 019a552b
      chenjian committed
      * fix RecordEvent interface
      
      * modify default level to 4
      
      * update interface use
      
      * add const default trace level
      
      * update operator.cc
    • Z
      [Pten] Adjust the params of creation kernel for inference (#39573) · 4e5d6743
      zyfncg committed
      * remove manual_api
      
      * change sig map of full and empty
      
      * fix fill_any_like_xpu_op
      
      * fix fill_any_like_xpu_op
      
      * fix problem of fill_any_like_xpu_op
      
      * fix conflict
      
      * polish code
    • W
      [Eager Hook] Support ReduceHook in GradNodeAccumulation (#39674) · 06b177c0
      Weilong Wu committed
      * [Eager] Support GradientHook before running separate GradNode
      
      * Fix CI issue
      
      * Support eager ReduceHook in accumulation_node
      
      * Fix CI issue
      
      * Add some tests to fix coverage CI issue
    • S
      Add the DistributedFusedLamb optimizer (#39148) · 5df3cd61
      sneaxiy committed
      * add DistributedFusedLamb op
      
      * polish code
      
      * fix compile error
      
      * compatible with pten changes
      
      * fix rocm compile error
      
      * improve coverage
      
      * update upstream/develop
      
      * fix cast_with_ptr.h
      
      * add FLAGS_distributed_lamb_divide_nranks_when_allreduce=1
      
      * fix clip before allreduce
      
      * add use_master_param_norm
      
      * code polish
      
      * fix bug
      
      * fix ROCM ci
  5. 18 Feb, 2022 7 commits
    • J
      Shared selected rows (#39608) · 7fc04070
      Jiabin Yang committed
      * merge legacy to fluid
      
      * Remove legacy code
      
      * Remove legacy code
      
      * Remove DataType test
      
      * Using Tensor directly instead of using EagerTensor
      
      * support gradient_accumulation
      
      * make test_imperative_lod_tensor_to_selected_rows longer
      
      * make test_imperative_lod_tensor_to_selected_rows longer
      
      * refine code
      
      * Rename all EagerTensor to Tensor
      
      * Rename some EagerTensor to Tensor
      
      * rename EagerTensor to EagerVariable
      
      * add more test
      
      * Support copiable selected rows and merge develop
    • Z
      bug fix (#39630) · bbf31a4e
      zhaoyingli committed
    • F
      [Pten] blas and lapack migration (#39587) · 8c7ee8c2
      Feiyu Chan committed
      * move blas related files
      * move lapack related files
    • Z
      Fix wrong inputs (#39700) · 1d6fd81d
      zlsh80826 committed
    • T
      cinn_instruction_run_op test (#39576) · fdc4fe3b
      TeFeng Chen committed
      * add cinn_instruction_run_op test code
      
      * update several interfaces of CinnLaunchContext
      
      * update several interfaces and add detail comments in CinnLaunchContext class
      
      * to skip the bug of error message check
      
      * fix ut test failed due to reliant interface updated
    • X
      [pten] trans diagonal kernel into pten (#39575) · 5c66338f
      xiongkun committed
      * trans diagonal kernel into pten
      
      * fix by code review
    • Z
      [AMP] support GPU BF16 amp for dygraph (#39029) · 7d6d3848
      zhangbo9674 committed
      * support dtype param for auto_cast
      
      * add amp_dtype for tracer
      
      * add unsupported bf16 list
      
      * support bf16 amp for O2
      
      * refine python interface for bfloat16
      
      * refine code
      
      * refine code
      
      * refine unittest
      
      * refine code
      
      * refine code
      
      * add bf16 o1
      
      * refine code by comment
      
      * add gradient accumulator
      
      * add recompute