- 25 Apr, 2020 1 commit
-
-
Committed by wangchaochaohu
-
- 24 Apr, 2020 2 commits
-
-
Committed by Guo Sheng
* Add cholesky_op forward part. test=develop
* Complete cholesky_op forward part. test=develop
* Add cholesky_op backward part. test=develop
* Complete cholesky_op backward part. test=develop
* Refine cholesky_op error check and docs. test=develop
* Add grad_check unit test for cholesky_op. test=develop
* Fix sample code in cholesky doc. test=develop
* Refine some error messages of cholesky_op. test=develop
* Refine some error messages of cholesky_op. test=develop
* Remove unused input in cholesky_grad. test=develop
* Remove unused input in cholesky_grad. test=develop
* Fix stream for cusolverDnSetStream. test=develop
* Update PADDLE_ENFORCE_CUDA_SUCCESS from cholesky_op to adapt to latest code. test=develop
* Add CUSOLVER ERROR in enforce.h test=develop
* Fix the missing return value in cholesky. test=develop
-
Committed by wangchaochaohu
-
- 23 Apr, 2020 1 commit
-
-
Committed by 石晓伟
-
- 22 Apr, 2020 2 commits
-
-
Committed by Jacek Czaja
-
Committed by 石晓伟
-
- 21 Apr, 2020 1 commit
-
-
Committed by 石晓伟
* add the thread_local_allocator, test=develop
* refactor the thread_local_allocator, test=develop
* provides option setting strategy, test=develop
-
- 20 Apr, 2020 1 commit
-
-
Committed by Zhou Wei
* Optimize the error messages of paddle CUDA API, test=develop
* fix the error messages of paddle CUDA API, test=develop
* Refactoring PADDLE_ENFORCE_CUDA_SUCCESS, and apply to curand/cudnn/cublas/NCCL, test=develop
* remove build_ex_string, test=develop
* merge conflict, test=develop
-
- 18 Apr, 2020 1 commit
-
-
Committed by Zhang Ting
* update eigen, test=develop
* remove patches, test=develop
* add definition of -fabi-version, test=develop
* add patch for TensorBlock.h, test=develop
* test windows, test=develop
* only update eigen for Linux, test=develop
* add code comments, test=develop
-
- 17 Apr, 2020 1 commit
-
-
Committed by 石晓伟
* supports thread-binding stream, test=develop
* avoid using thread_local variables in dtor, test=develop
* modify the stream priority enum, test=develop
-
- 15 Apr, 2020 1 commit
-
-
Committed by guofei
Correct the name [`FLAGS_sync_nccl_allreduce`](https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/flags/others_cn.html#flags-sync-nccl-allreduce) based on the information from our official website.
-
- 14 Apr, 2020 1 commit
-
-
Committed by Yi Liu
eagerly release CUDA resources before the CUDA environment is destroyed test=develop
-
- 11 Apr, 2020 1 commit
-
-
Committed by Michał Gallus
* Initial FP32 DNNL MatMul Implementation
* Implement int8 DNNL MatMul
* Unify in-kernel-naming, clean UTs
* MatMul: Introduce op caching
* Final adjustments test=develop
* Remove dy_graph disablement test=develop
* Change dnnl header name to new one test=develop
* Constrain multi head check to prevent fails test=develop
* Resolve dnnl header problems on MAC CI
* Variable namings to kernel and skip_grad_ci added test=develop
* Prevent MAC CI from failing
* Prevent windows build from failing test=develop
* Modify UTs to conform to the rules
* Modify MatMul aux functions namings test=develop
-
- 10 Apr, 2020 4 commits
-
-
Committed by littletomatodonkey
add addmm op
-
Committed by Zeng Jinle
-
Committed by silingtong123
-
Committed by Tao Luo
-
- 09 Apr, 2020 1 commit
-
-
Committed by mozga-intel
* Remove the NGraph engine from PDPD repository
  1. Each operator was removed from the operator's directory
  2. Each test was removed from the unittest directory
  3. The parallel executor support was removed from the PDPD
  4. The CMake file was removed from the PDPD
  5. The NG flags were removed from the repository
  test=develop
* Remove ngraph from:
  1. Cmake file
  2. Python file
  test=develop
-
- 08 Apr, 2020 1 commit
-
-
Committed by Zhang Ting
-
- 04 Apr, 2020 2 commits
-
-
Committed by Chen Weihang
* delete invalid check interface Ref & VectorRef, test=develop
* fix vector ref delete error, test=develop
* try the new check interface, test=develop
* change all related code with new check macro, test=develop
* remove static assert, test=develop
* polish detail, test=develop
* skip coverage problem, test=develop
* add new check macro, test=develop
-
Committed by Leo Chen
* fix init_gflags with 'python -c', test=develop
* add test, test=develop
* use sys.executable instead of python, test=develop
* keep dummy, test=develop
-
- 03 Apr, 2020 1 commit
-
-
Committed by Chen Weihang
* add op inout check macro, test=develop
* fix enforce_test, test=develop
-
- 02 Apr, 2020 1 commit
-
-
Committed by Adam
* Delete is_test from activation operators test=develop
* Revert unneeded changes test=develop
-
- 01 Apr, 2020 1 commit
-
-
Committed by 石晓伟
-
- 31 Mar, 2020 2 commits
-
-
Committed by Yi Liu
Because the NCCL comm is not created by CUDADeviceContext, it should be destroyed by its creator, following RAII best practice.
-
Committed by wangchaochaohu
* refine output of profiler for child event
-
- 30 Mar, 2020 2 commits
- 27 Mar, 2020 1 commit
-
-
Committed by Zeng Jinle
-
- 25 Mar, 2020 1 commit
-
-
Committed by Zeng Jinle
-
- 19 Mar, 2020 1 commit
-
-
Committed by Sylwester Fraczek
-
- 18 Mar, 2020 1 commit
-
-
Committed by Yi Liu
initialize global nccl context in dygraph test=develop
-
- 13 Mar, 2020 1 commit
-
-
Committed by wangchaochaohu
-
- 12 Mar, 2020 1 commit
-
-
Committed by wangchaochaohu
-
- 07 Mar, 2020 2 commits
-
-
Committed by Zhang Ting
-
Committed by wangchaochaohu
* refine the profiler print test=develop
-
- 04 Mar, 2020 1 commit
-
-
Committed by Zeng Jinle
* add recorded cuda memory apis, fix typo, test=develop
* add more ut, test=develop
* follow comments, test=develop
* fix py35 incompatible issues, test=develop
-
- 03 Mar, 2020 1 commit
-
-
Committed by Zhang Ting
-
- 02 Mar, 2020 2 commits
-
-
Committed by wangchaochaohu
-
Committed by wangchaochaohu
* add profiler_help.h to refine the code test=develop
-