1. 18 Aug, 2023 1 commit
    • [Inference] Make share_external_data supports bf16 and bool; fix while_op cache_inference_while_scope when using fleet_executor. (#56055) · c65ef07c
      lzy committed
      
      * 1. make share_external_data supports bf16 and bool; 2. don't drop_kids when cache_inference_while_scope
      
      * fix FLAGS_cache_inference_while_scope
      
      * add unitest
      
      * add unitest
      
      * skip unitest when cudnn_version < 8100
      
      * skip test share_external_data_bf16 when CUDA_ARCH < 80
  2. 22 Mar, 2023 1 commit
  3. 05 Jan, 2023 1 commit
  4. 01 Aug, 2022 1 commit
    • unify gpu context (#44740) · 86763023
      Leo Chen committed
      * remove cudaDeviceContext
      
      * remove more template
      
      * fix rocm compile
      
      * remove alias name CUDADeviceContext
      
      * fix compile
      
      * fix tests
      
      * revert changes
  5. 28 Jun, 2022 1 commit
  6. 26 Jun, 2022 1 commit
  7. 05 Jun, 2022 1 commit
  8. 03 Dec, 2021 1 commit
  9. 14 Sep, 2021 1 commit
  10. 31 Aug, 2021 1 commit
  11. 27 Aug, 2021 1 commit
  12. 26 Aug, 2021 1 commit
    • Add copy from tensor (#34406) · ac33c0ca
      Shang Zhizhou committed
      * add api
      
      * temp save
      
      * revert
      
      * copytocpu async ok
      
      * fix style
      
      * copy sync ok
      
      * fix compile error
      
      * fix compile error
      
      * api done
      
      * update python async api
      
      * fix compile
      
      * remove async python api; add c++ async unittest
      
      * remove python async api
      
      * update unittest
      
      * update unittest
      
      * add C++ unittest for copytensor
      
      * add unittest
      
      * update namespace utils to class TensorUtils
      
      * add unittest
      
      * update unittest
      
      * update unittest
      
      * update code style
      
      * update code style
      
      * update unittest