- 29 Jun, 2020 1 commit
-
-
Committed by Wilber
-
- 23 Apr, 2020 2 commits
-
-
Committed by Wojciech Uss
test=release/2.0 Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>
-
Committed by 石晓伟
* cherry-pick of DeviceContext Split, test=develop (#23737)
* New feature: thread local allocator, test=develop (#23989)
* add the thread_local_allocator, test=develop
* refactor the thread_local_allocator, test=develop
* provides option setting strategy, test=develop
* add boost dependency to cuda_stream, test=develop
* declare the stream::Priority as enum class, test=develop
* deal with PADDLE_ENFORCE_CUDA_SUCCESS macro in pr #23816
-
- 21 Apr, 2020 1 commit
-
-
Committed by Zhou Wei
* cherry-pick, Optimize the error messages of paddle CUDA API
* fix the error messages of paddle CUDA API
* Refactoring PADDLE_ENFORCE_CUDA_SUCCESS, and apply to curand/cudnn/cublas/NCCL
* remove build_ex_string
-
- 15 Apr, 2020 1 commit
-
-
Committed by Yi Liu
eagerly release CUDA resources before the CUDA environment is destroyed test=develop
-
- 11 Apr, 2020 1 commit
-
-
Committed by Michał Gallus
* Initial FP32 DNNL MatMul Implementation
* Implement int8 DNNL MatMul
* Unify in-kernel-naming, clean UTs
* MatMul: Introduce op caching
* Final adjustments test=develop
* Remove dy_graph disablement test=develop
* Change dnnl header name to new one test=develop
* Constrain multi head check to prevent fails test=develop
* Resolve dnnl header problems on MAC CI
* Variable namings to kernel and skip_grad_ci added test=develop
* Prevent MAC CI from failing
* Prevent windows build from failing test=develop
* Modify UTs to conform to the rules
* Modify MatMul aux functions namings test=develop
-
- 10 Apr, 2020 4 commits
-
-
Committed by littletomatodonkey
add addmm op
-
Committed by Zeng Jinle
-
Committed by silingtong123
-
Committed by Tao Luo
-
- 09 Apr, 2020 1 commit
-
-
Committed by mozga-intel
* Remove the NGraph engine from PDPD repository
  1. Each operator was removed from the operator's directory
  2. Each test was removed from the unittest directory
  3. The parallel executor support was removed from the PDPD
  4. The CMake file was removed from the PDPD
  5. The NG flags were removed from the repository
  test=develop
* Remove ngraph from:
  1. Cmake file
  2. Python file
  test=develop
-
- 08 Apr, 2020 1 commit
-
-
Committed by Zhang Ting
-
- 04 Apr, 2020 2 commits
-
-
Committed by Chen Weihang
* delete invalid check interface Ref & VectorRef, test=develop
* fix vector ref delete error, test=develop
* try the new check interface, test=develop
* change all related code with new check macro, test=develop
* remove static assert, test=develop
* polish detail, test=develop
* skip coverage problem, test=develop
* add new check macro, test=develop
-
Committed by Leo Chen
* fix init_gflags with 'python -c', test=develop
* add test, test=develop
* use sys.executable instead of python, test=develop
* keep dummy, test=develop
-
- 03 Apr, 2020 1 commit
-
-
Committed by Chen Weihang
* add op inout check macro, test=develop
* fix enforce_test, test=develop
-
- 02 Apr, 2020 1 commit
-
-
Committed by Adam
* Delete is_test from activation operators test=develop
* Revert unneeded changes test=develop
-
- 01 Apr, 2020 1 commit
-
-
Committed by 石晓伟
-
- 31 Mar, 2020 2 commits
-
-
Committed by Yi Liu
As the NCCL comm is not created by CUDADeviceContext, it should be destroyed by its creator, following RAII best practice.
-
Committed by wangchaochaohu
* refine output of profiler for child event
-
- 30 Mar, 2020 2 commits
- 27 Mar, 2020 1 commit
-
-
Committed by Zeng Jinle
-
- 25 Mar, 2020 1 commit
-
-
Committed by Zeng Jinle
-
- 19 Mar, 2020 1 commit
-
-
Committed by Sylwester Fraczek
-
- 18 Mar, 2020 1 commit
-
-
Committed by Yi Liu
initialize global nccl context in dygraph test=develop
-
- 13 Mar, 2020 1 commit
-
-
Committed by wangchaochaohu
-
- 12 Mar, 2020 1 commit
-
-
Committed by wangchaochaohu
-
- 07 Mar, 2020 2 commits
-
-
Committed by Zhang Ting
-
Committed by wangchaochaohu
* refine the profiler print test=develop
-
- 04 Mar, 2020 1 commit
-
-
Committed by Zeng Jinle
* add recorded cuda memory apis, fix typo, test=develop
* add more ut, test=develop
* follow comments, test=develop
* fix py35 incompatible issues, test=develop
-
- 03 Mar, 2020 1 commit
-
-
Committed by Zhang Ting
-
- 02 Mar, 2020 2 commits
-
-
Committed by wangchaochaohu
-
Committed by wangchaochaohu
* add profiler_help.h to refine the code test=develop
-
- 26 Feb, 2020 1 commit
-
-
Committed by Adam
-
- 25 Feb, 2020 1 commit
-
-
Committed by Zhang Ting
* add framework overhead ratio, test=develop
* print GpuMemcpy overhead, test=develop
-
- 24 Feb, 2020 1 commit
-
-
Committed by wangchaochaohu
* add support for the driver API callback and fix the profiler name display bug
-
- 23 Feb, 2020 1 commit
-
-
Committed by tianshuo78520a
-
- 21 Feb, 2020 1 commit
-
-
Committed by Yiqun Liu
-
- 19 Feb, 2020 1 commit
-
-
Committed by wangchaochaohu
* fix the profile print error test=develop
-
- 18 Feb, 2020 1 commit
-
-
Committed by wangchaochaohu
* add python flag to control profile level test=develop
-