- 18 4月, 2020 1 次提交
-
-
由 Zhang Ting 提交于
* update eigen, test=develop * remove patches, test=develop * add definition of -fabi-version, test=develop * add patch for TensorBlock.h, test=develop * test windows, test=develop * only update eigen for Linux, test=develop * add code comments, test=develop
-
- 17 4月, 2020 1 次提交
-
-
由 石晓伟 提交于
* supports thread-binding stream, test=develop * avoid using thread_local variables in dtor, test=develop * modify the stream priority enum, test=develop
-
- 15 4月, 2020 1 次提交
-
-
由 guofei 提交于
Correct the name [`FLAGS_sync_nccl_allreduce`](https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/flags/others_cn.html#flags-sync-nccl-allreduce) based on the information from our official website.
-
- 14 4月, 2020 1 次提交
-
-
由 Yi Liu 提交于
eagerly release cuda resources before cuda enviroment destroying test=develop
-
- 11 4月, 2020 1 次提交
-
-
由 Michał Gallus 提交于
* Initial FP32 DNNL MatMul Implementation * Implement int8 DNNL MatMul * Unify in-kernel-naming, clean UTs * MatmuL: Introduce op caching * Final adjustments test=develop * Remove dy_graph disablement test=develop * Change dnnl header name to new one test=develop * Contrain multi head check to prevent fails test=develop * Resolve dnnl header problems on MAC CI * Variable namings to kernel and skip_grad_ci added test=develop * Prevent MAC CI from failing * Prevent windows build from failing test=develop * Modify UTs to conform to the rules * Modify MatMul aux functions namings test=develop
-
- 10 4月, 2020 4 次提交
-
-
由 littletomatodonkey 提交于
add addmm op
-
由 Zeng Jinle 提交于
-
由 silingtong123 提交于
-
由 Tao Luo 提交于
-
- 09 4月, 2020 1 次提交
-
-
由 mozga-intel 提交于
* Remove the NGraph engine from PDPD repository 1. Each operator was removed from the operator's directory 2. Each test was removed from the unittest directory 3. The parallel executor support was removed from the PDPD 4. The CMake file was removed from the PDPD 5. The NG flags were removed from the repository test=develop * Remove ngraph from: 1. Cmake file 2. Python file test=develop
-
- 08 4月, 2020 1 次提交
-
-
由 Zhang Ting 提交于
-
- 04 4月, 2020 2 次提交
-
-
由 Chen Weihang 提交于
* delete invalid check inferface Ref & VectorRef, test=develop * fix vector ref delete error, test=develop * try the new check inferface, test=develop * change all related code with new check macro, test=develop * remove static assert, test=develop * polish detail, test=develop * skip coverage problem, test=develop * add new check macro, test=develop
-
由 Leo Chen 提交于
* fix init_gflags with 'python -c', test=develop * add test, test=develop * use sys.executable instead of python, test=develop * keep dummy, test=develop
-
- 03 4月, 2020 1 次提交
-
-
由 Chen Weihang 提交于
* add op inout check macro, test=develop * fix enforce_test, test=develop
-
- 02 4月, 2020 1 次提交
-
-
由 Adam 提交于
* Delete is_test from activation operators test=develop * Revent unneeded changes test=develop
-
- 01 4月, 2020 1 次提交
-
-
由 石晓伟 提交于
-
- 31 3月, 2020 2 次提交
-
-
由 Yi Liu 提交于
As nccl comm is not created by CUDADeviceContext, it should be destroyed by the creator as the best practice of RAII.
-
由 wangchaochaohu 提交于
* refine output of profiler for child event
-
- 30 3月, 2020 2 次提交
- 27 3月, 2020 1 次提交
-
-
由 Zeng Jinle 提交于
-
- 25 3月, 2020 1 次提交
-
-
由 Zeng Jinle 提交于
-
- 19 3月, 2020 1 次提交
-
-
由 Sylwester Fraczek 提交于
-
- 18 3月, 2020 1 次提交
-
-
由 Yi Liu 提交于
initialize global nccl context in dygraph test=develop
-
- 13 3月, 2020 1 次提交
-
-
由 wangchaochaohu 提交于
-
- 12 3月, 2020 1 次提交
-
-
由 wangchaochaohu 提交于
-
- 07 3月, 2020 2 次提交
-
-
由 Zhang Ting 提交于
-
由 wangchaochaohu 提交于
* refine the profiler print test=develop
-
- 04 3月, 2020 1 次提交
-
-
由 Zeng Jinle 提交于
* add recorded cuda memory apis, fix typo, test=develop * add more ut, test=develop * follow comments, test=develop * fix py35 incompatible issues, test=develop
-
- 03 3月, 2020 1 次提交
-
-
由 Zhang Ting 提交于
-
- 02 3月, 2020 2 次提交
-
-
由 wangchaochaohu 提交于
-
由 wangchaochaohu 提交于
* add profiler_help.h to refine the code test=develop
-
- 26 2月, 2020 1 次提交
-
-
由 Adam 提交于
-
- 25 2月, 2020 1 次提交
-
-
由 Zhang Ting 提交于
* add framework overhead ratio, test=develop * print GpuMemcpy overhead, test=develop
-
- 24 2月, 2020 1 次提交
-
-
由 wangchaochaohu 提交于
* add support for the driver api callback and fix the profiler name show bug
-
- 23 2月, 2020 1 次提交
-
-
由 tianshuo78520a 提交于
-
- 21 2月, 2020 1 次提交
-
-
由 Yiqun Liu 提交于
-
- 19 2月, 2020 1 次提交
-
-
由 wangchaochaohu 提交于
* fix the profile print error test=develop
-
- 18 2月, 2020 1 次提交
-
-
由 wangchaochaohu 提交于
* add python flag to control profile level test=develop
-
- 14 2月, 2020 1 次提交
-
-
由 Chen Weihang 提交于
-