1. 09 Oct, 2021 8 commits
  2. 08 Oct, 2021 7 commits
  3. 07 Oct, 2021 2 commits
  4. 05 Oct, 2021 1 commit
    • J
      Added concat BF16/FP32 BWD OneDNN kernel (#35889) · dc4d5719
      jakpiase committed
      * tmp
      
      * added concat BF16/FP32 BWD oneDNN kernel
      
      * minor change
      
      * minor change
      
      * fix for CI
      
      * added formatting
      
      * Reverted deleting static keyword
      
      * added reviewers suggestions
      
      * reverted deleting concat bf16 test file
      
      * fixed concat tests
  5. 04 Oct, 2021 1 commit
  6. 30 Sep, 2021 6 commits
  7. 29 Sep, 2021 15 commits
    • Z
      Add basic support for CUDA Graph (#36190) · 21b93c3d
      Zeng Jinle committed
      * add basic support for CUDA Graph
      
      * fix ci compile error
      
      * fix LOG print, fix windows CI
      
      * follow comments and update
      
      * small fix for default ctor
      
      * fix rocm compile error
      
      * fix CPU compile error
    • Z
      add optest for adamw (#36148) · 69eed34d
      zhaoyingli committed
      * update func name
      
      * skip cpu
      
      * update unittest
      
      * update unittest
    • L
      fix cusparse compile problem, test=develop (#36199) · 3eb50715
      Liu-xiandong committed
      * fix cusparse compile problem, test=develop
      
      * Modify file permissions
    • L
      Add functional autograd API:hessian (#36108) · 1f93582c
      levi131 committed
      * init functional jacobian api
      
      * finish test with dtype float32
      
      * add float64 test case
      
      * polish code
      
      * use atol=1e-5 with dtype float64
      
      * fix for ci
      
      * set timeout for test_jacobian
      
      * init hessian API
      
      * save status
      
      * polish API docstring
      
      * modify docstring
      
      * add utils.py
      
      * save status
      
      * fix dygraph double grad dtype error when calling for high differential scenario
      
      * reinvoke ci
      
      * test_hessian.py is ok
      
      * polish hessian API
      
      * init vhp
      
      * Revert "init vhp"
      
      This reverts commit cbd4d3b66abe82b0ac10721b9eddeb7d82e0a1c8.
      
      * add test for partial_engine.cc
      
      * modify numerical_delta with dtype float32
      
      * merge fix for dtype float64
      
      * spell fix
      
      * polish code
      
      * rm _stop_gradient_pre_process
      Co-authored-by: JiabinYang <360788950@qq.com>
    • L
      Spinlock (#36030) · a9ea41c5
      liutiexing committed
      * add align for WorkQueue
      
      * add spinlock
      
      * merge spinlock
    • Y
      add slot record dataset (#36200) · 79bd5f90
      yaoxuefeng committed
    • Z
      [npu] add box coder (#36171) · 83578cfa
      zhulei committed
      * [npu] add box coder
      
      * [npu] add box coder
    • P
      fix bug of top_k npu op (#36175) · 2b8fd704
      pangyoki committed
    • Z
      [NPU] Add group norm (#35937) · c79de728
      zhulei committed
      * [NPU] Add group norm
      
      * [NPU] Add group norm
      
      * [NPU] Add group norm
      
      * [NPU] Add group norm
      
      * [NPU] Add group_norm op
    • A
      [NPU] mod for model bert (#36165) · 7bddf2e8
      Aganlengzi committed
      * merge conflict of paddle_gtest_main.cc
      
      * modify FLAGS_npu_precision_mode and default not to call aclSetCompileopt
    • W
      bec9fc9a
    • H
      Add op paddle.device.cuda.get_device_name and paddle.device.cuda.get_device_capability. (#35672) · f703558d
      hlygit66666 committed
      * add op paddle.device.cuda.get_device_name
      
      * fix some bugs
      
      * fix some bugs
      
      * fix error message bugs
      
      * fix en docs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * add error message test case
      
      * add get_device_name and get_device_capability
      
      * fix review
      
      * fix docs bug
      
      * fix docs
      
      * fix docs
    • Y
      fix paddle.device.cuda.get_device_properties doc (#36178) · 6d4435ac
      Yanxing Shi committed
      * Initial Commit
      
      * add unittest and add error information
      
      * modify doc
      
      * fix some error
      
      * fix some word
      
      * fix bug cudaDeviceProp* and modify error explanation
      
      * fix cudaDeviceProp* error and unittest samples
      
      * fix hip error and PADDLE_WITH_HIP
      
      * update style
      
      * fix error is_compiled_with_cuda
      
      * fix paddle.device.cuda.get_device_properties
      
      * fix error for multi thread safe
      
      * update style
      
      * merge conflict
      
      * modify after mentor review
      
      * update style
      
      * delete word
      
      * fix unittest error for windows
      
      * support string input and modify some code
      
      * modify doc to support string input
      
      * fix error for express information
      
      * fix error for express information
      
      * fix unittest for windows
      
      * fix device.startswith('gpu:')
      
      * format error and doc
      
      * fix after review
      
      * format code
      
      * fix error for doc compile
      
      * fix error for doc compile
      
      * fix error for doc compile
      
      * fix error for doc compile
      
      * fix error for doc compile
      
      * fix py2 error
      
      * fix wrong words and doc
      
      * fix _gpuDeviceProperties
      
      * test=document_fix
    • Y
    • Z
      remove wait if no fetch (#36150) · b3d2dc7b
      Zeng Jinle committed