- 20 Apr, 2020 3 commits
-
-
Committed by zhaoyuchen2018
* OP(fusion_gru) error message enhancement. test=develop * refine code, test=develop * Refine inout log, test=develop * Refine description, test=develop
-
Committed by Zhou Wei
* Optimize the error messages of paddle CUDA API, test=develop * fix the error messages of paddle CUDA API, test=develop * Refactoring PADDLE_ENFORCE_CUDA_SUCCESS, and apply to curand/cudnn/cublas/NCCL, test=develop * remove build_ex_string, test=develop * merge conflict, test=develop
-
Committed by Yiqun Liu
-
- 15 Apr, 2020 2 commits
-
-
Committed by yiicy
-
Committed by zhaoyuchen2018
* API(fused_embedding_fc_lstm) error message enhancement. test=develop C++ API enhancement. * Refine code, test=develop * Refine code. test=develop
-
- 14 Apr, 2020 2 commits
-
-
Committed by yiicy
-
Committed by huzhiqiang
-
- 12 Apr, 2020 1 commit
-
-
Committed by zhongpu
-
- 10 Apr, 2020 3 commits
-
-
Committed by zhaoyuchen2018
C++ OP enhancement.
-
Committed by Wilber
error message enhancement for fusion_seqpool_concat_op
-
Committed by Wilber
error message enhancement for py_func op.
-
- 09 Apr, 2020 2 commits
-
-
Committed by Zhaolong Xing
* refine fusion_transpose_flatten_concat_op log test=develop * fix ci error test=develop
-
Committed by Wilber
error message enhancement for repeated fc
-
- 07 Apr, 2020 1 commit
-
-
Committed by zhangchunle
-
- 04 Apr, 2020 2 commits
-
-
Committed by zhangchunle
-
Committed by Chen Weihang
* delete invalid check interface Ref & VectorRef, test=develop * fix vector ref delete error, test=develop * try the new check interface, test=develop * change all related code with new check macro, test=develop * remove static assert, test=develop * polish detail, test=develop * skip coverage problem, test=develop * add new check macro, test=develop
-
- 03 Apr, 2020 1 commit
-
-
Committed by zhongpu
* use global conv cache; test=develop * use singleton cache; test=develop * fix format error; test=develop * add cudnn helper header; test=develop * fix header error; test=develop * fix mac unittest; test=develop * fix mac unittest; test=develop * fix file format; test=develop * fix include file error, test=develop * remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop * fix test_elementwise_mul_op_dim, test=develop * fix compile error, test=develop Co-authored-by: phlrain <phliuhongyu@126.com>
-
- 02 Apr, 2020 2 commits
-
-
Committed by zhongpu
* use global conv cache; test=develop * use singleton cache; test=develop * fix format error; test=develop * add cudnn helper header; test=develop * fix header error; test=develop * fix mac unittest; test=develop * fix mac unittest; test=develop * fix file format; test=develop * fix include file error, test=develop * remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop * fix test_elementwise_mul_op_dim, test=develop Co-authored-by: phlrain <phliuhongyu@126.com>
- 26 Mar, 2020 1 commit
-
-
Committed by Zhaolong Xing
* add dynamic plugin support. test=develop * change emb eltwise layernorm to math function test=develop * add emb eltwise layernorm test=develop * can run dynamic shape ernie test=develop * fix ci test=develop * add ut for trt ernie dynamic test=develop * refine dynamic shape c++ interface. test=develop * fix comments test=develop * fix comments test=develop
-
- 20 Mar, 2020 2 commits
-
-
Committed by Wilber
update embedding_eltwise_layernorm fuse pass and fused kernel, to support multi input
-
Committed by Zeng Jinle
* add double grad implementation for dygraph, test=develop * polish code, add uts, test=develop * fix place bug, test=develop * polish codes, add more uts for coverages, test=develop * add no_grad_set, test=develop * add star gan ut, test=develop * follow comments, test=develop
-
- 19 Mar, 2020 1 commit
-
-
Committed by Zhaolong Xing
test=develop
-
- 13 Mar, 2020 1 commit
-
-
Committed by wangchaochaohu
* add fusion group test for backward and refine code
-
- 12 Mar, 2020 1 commit
-
-
Committed by wangchaochaohu
* add support for expression type convert and add cast Op support in fusion group
-
- 11 Mar, 2020 1 commit
-
-
Committed by Zhaolong Xing
* 1. add embedding eltwise layernorm fuse 2. add embedding eltwise layernorm op 3. refine inplace_add_relu 4. refine fc_eltwise_layernorm test=develop * 1. refine fc test=develop * fix comments test=develop * fix comments test=develop
-
- 09 Mar, 2020 1 commit
-
-
Committed by Zeng Jinle
* refine grad maker, test=develop * refactor tracer stage 1, test=develop * merge develop to resolve conflicts a third time, test=develop
-
- 05 Mar, 2020 1 commit
-
-
Committed by Zhaolong Xing
test=develop
-
- 28 Feb, 2020 1 commit
-
-
Committed by tianshuo78520a
-
- 23 Feb, 2020 1 commit
-
-
Committed by tianshuo78520a
-
- 21 Feb, 2020 1 commit
-
-
Committed by Yiqun Liu
-
- 13 Feb, 2020 1 commit
-
-
Committed by Zhaolong Xing
* 1. optimize multihead matmul: fuse three fc into multihead matmul test=develop * fix conflict test=develop * fix comments test=develop
-
- 10 Feb, 2020 1 commit
-
-
Committed by Wilber
-
- 16 Jan, 2020 1 commit
-
-
Committed by lidanqing
-
- 10 Jan, 2020 1 commit
-
-
Committed by Zhen Wang
* add bn and relu fuse pass * add op attr assert and dtype assert * fix some inputs&&outputs bugs for the fused op and pattern. * add the unittest for fuse_bn_act_pass. test=develop * use normative enforce statements. test=develop * add the cpu test. test=develop * add the support of batch_size=1 for the bn with relu op. test=develop * add the error type for paddle throws. test=develop * add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop
-
- 07 Jan, 2020 2 commits
-
-
Committed by zhaoyuchen2018
Windows conv_fusion failed with no kernel; explicitly declare lambda. Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
-
Committed by Chen Weihang
-
- 03 Jan, 2020 1 commit
-
-
Committed by Yiqun Liu
* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop * Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop * Disable for mac and windows. test=develop * Refine the codes to support manually specified num_threads and workload_per_thread. test=develop * Refine the CUDA kernel to support large dims. test=develop * Add DeviceCodePool to manage all device codes. * Add the first implementation fusion_group op. * Add unit-test for fusion_group op. * Add the check of result. * Add the check of nvrtc in unit-test. test=develop * Add comment to explain the inputs, outputs and features of fusion_group op. test=develop * Disable fusion_group op for mac and windows. test=develop * Make the compiling of device code return status instead of hanging up. test=develop * Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API. * Unify fusion_group_op's input and output names. test=develop * Add the check of CUDA driver library in unittest. test=develop * Refine the calling of PADDLE_ENFORCE. test=develop
-
- 27 Dec, 2019 1 commit
-
-
Committed by zhaoyuchen2018
* Refine multihead kernel, align block to 32 test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine log comments test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
-
- 16 Dec, 2019 1 commit
-
-
Committed by zhaoyuchen2018
* Fix softmax cuda bug * Refine multihead log and softmax logic
-