- 03 Apr, 2020 12 commits
-
-
Committed by Feiyu Chan
-
Committed by zhongpu
* use global conv cache; test=develop
* use singleton cache; test=develop
* fix format error; test=develop
* add cudnn helper header; test=develop
* fix header error; test=develop
* fix mac unittest; test=develop
* fix mac unittest; test=develop
* fix file format; test=develop
* fix include file error, test=develop
* remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop
* fix test_elementwise_mul_op_dim, test=develop
* fix compile error, test=develop
Co-authored-by: Nphlrain <phliuhongyu@126.com>
-
Committed by zhaoyuchen2018
An elementwise function was used before its definition, which failed under CUDA 8; move the definition ahead of its use.
-
Committed by gongweibao
-
Committed by Zeng Jinle
-
Committed by channings
* update linspace, equal operators to API 2.0, test=develop
* equal supports a higher-performance CUDA kernel, test=develop
* update comment of equal&linspace operator, test=develop
* update comment of equal&linspace operator, test=develop
-
Committed by zhaoyuchen2018
* improve elementwise performance.
* Add contiguous check, test=develop
-
Committed by wangchaochaohu
-
Committed by Leo Chen
* prune train program by fetch_list, test=develop
* add unittest for prune, test=develop
* fix pruned feed, test=develop
* support ParallelExecutor and feed prune, test=develop
* add comments, test=develop
* update unittest, test=develop
* update unittests, test=develop
* remove debug code, test=develop
* support cond in clone, test=develop
* support cond in prune, test=develop
* support multiple minimize, test=develop
* support cache, test=develop
* fix _copy_param_info_from, test=develop
* support python2 str, test=develop
* remove debug code, test=develop
* fix bug of caching CompiledProgram, test=develop
* fix multi_device issue, test=develop
* tmp
* support tuple in fetch_list and overriding use_prune, test=develop
* dont use nonlocal in python2, test=develop
* remove nonlocal, test=develop
* code clean, test=develop
* code clean, test=develop
* feed list, test=develop
* test adam, test=develop
* follow comments, test=develop
* reduce duplicate code, test=develop
* update comments, test=develop
-
Committed by Chen Weihang
* add op inout check macro, test=develop
* fix enforce_test, test=develop
-
Committed by Yiqun Liu
-
Committed by Zeng Jinle
-
- 02 Apr, 2020 8 commits
-
-
Committed by Pei Yang
-
Committed by liym27
* Add unittest for transformer prediction in dygraph_to_static.
* fix bug in fill_constant api.
* Make transpose support size 0. test=develop
-
Committed by xujiaqi01
* fix stat var in hogwild worker
* test=develop
-
Committed by joanna.wozna.intel
-
Committed by zhongpu
* use global conv cache; test=develop
* use singleton cache; test=develop
* fix format error; test=develop
* add cudnn helper header; test=develop
* fix header error; test=develop
* fix mac unittest; test=develop
* fix mac unittest; test=develop
* fix file format; test=develop
* fix include file error, test=develop
* remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop
* fix test_elementwise_mul_op_dim, test=develop
Co-authored-by: Nphlrain <phliuhongyu@126.com>
-
Committed by Adam
* Delete is_test from activation operators test=develop
* Revert unneeded changes test=develop
-
Committed by Kaipeng Deng
* add inplace_abn_op. test=develop
-
- 01 Apr, 2020 9 commits
-
-
Committed by Yi Liu
test=develop
-
Committed by Zeng Jinle
-
Committed by wangchaochaohu
* refine the error message of tensor_array_read_write Op
-
Committed by 石晓伟
-
Committed by wangchaochaohu
* add attr support for fusion group and add support for fill_constant and scale Op
-
Committed by xujiaqi01
* add fleet pslib pull and push sparse op and push dense op
* test=develop
-
Committed by songyouwei
test=develop
-
Committed by Zhaolong Xing
test=develop
-
Committed by Jacek Czaja
-
- 31 Mar, 2020 4 commits
-
-
Committed by Yi Liu
As the NCCL comm is not created by CUDADeviceContext, it should be destroyed by its creator, following RAII best practice.
-
Committed by wangchaochaohu
* refine output of profiler for child event
-
Committed by Leo Chen
* expand parameters, test=develop
* support resnet, test=develop
* fix resnet, test=develop
* support duplicable out, test=develop
* support ptb
* fix bugs, test=develop
* support null input, test=develop
* fix bugs, test=develop
* fix batchNorm is_test, test=develop
* refine code, test=develop
* follow comments, test=develop
* follow comments, test=develop
* follow comments, test=develop
* follow comments, test=develop
-
Committed by GaoWei8
-
- 30 Mar, 2020 4 commits
-
-
Committed by Yi Liu
-
Committed by Jacek Czaja
-
Committed by 石晓伟
-
Committed by 石晓伟
-
- 29 Mar, 2020 2 commits
-
-
Committed by Zeng Jinle
* distinguish public/private vars, test=develop
* fix windows issues, test=develop
-
Committed by zhaoyuchen2018
* Improve elementwise performance. Elementwise performance is poor when execution falls into CommonGradBroadcastCUDA, so add new kernels for different data patterns.
* Add some CUDA kernels to speed up common broadcast cases. test=develop
* Add more test cases and fix CUDA kernel bug. test=develop
* Remove tests as CPU precision fails. test=develop
* Refine SplitDims, test=develop
* Change file mode, test=develop
-
- 28 Mar, 2020 1 commit
-
-
Committed by Wojciech Uss
-