- 23 12月, 2021 9 次提交
-
-
由 wuhuanzhou 提交于
* add erfinv API, test=develop * fix gradient accuracy error, test=develop * fix cuda compilation error on Windows, test=develop * fix M_2_SQRTPI undeclared identifier on Windows, test=develop
-
由 liutiexing 提交于
* add align for WorkQueue * add spinlock * merge develop * merge * Add EventsWaiter * Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2. * update EventsWater * fix * split workqueue files * add more tests * fix * bugfix * bugfix * update Co-authored-by: Nliutiexing <liutiexing@google.com>
-
由 zyfncg 提交于
* add empty and empty_like kernel in pten * add empty dev_api
-
由 Wilber 提交于
* support external stream. * update * update * update
-
由 houj04 提交于
-
由 baoachun 提交于
* add mkldnn conv_elementwise_add_mkldnn_fuse_pass ut * update mkldnn conv_elementwise_add_mkldnn_fuse_pass ut * update conv_elementwise_add_mkldnn_fuse_pass ut * update conv_elementwise_add_mkldnn_fuse_pass ut * update conv_elementwise_add_mkldnn_fuse_pass ut * restrict conv2d data_format in conv_elementwise_add_mkldnn_fuse_pass * update conv_elementwise_add_mkldnn_fuse_pass OpCompat * update conv_elementwise_add_mkldnn_fuse_pass ut * update ut
-
由 zhouweiwei2014 提交于
* add new API: paddle.clone;Tensor.element_size;nn.utils.parameters_to_vector * fix comment
-
由 heliqi 提交于
* add flatten2_matmul squeeze2_matmul reshape2_matmul test case * modify skip func to ignore_pass_case func * rebuild CI * add test_xx_matmul_fuse_pass timeout * add test_map_xx_pass timeout * add max_duration of test cast * add trt skip * add timeout * del commented code
-
由 Chen Weihang 提交于
-
- 22 12月, 2021 8 次提交
-
-
由 crystal 提交于
* optimize gelu backward * optimize gelu backward * optimize code * Number to expression * Replacement number
-
由 Yang 提交于
-
由 baoachun 提交于
* add mkldnn reshape_transpose_matmul fuse pass ut and op version check * update reshape_transpose_matmul_mkldnn_fuse_pass ut * update ut
-
由 baoachun 提交于
* update mkldnn batch_norm_activation fuse pass ut * update ut * update mkldnn batch_norm_act_fuse_pass ut * update batch_norm_act_fuse_pass ut * update ut
-
由 LiYuRio 提交于
-
由 YuanRisheng 提交于
* move flatten * fix bugs of test * modify header file * add copy declare * fix compile bugs
-
由 joanna.wozna.intel 提交于
-
由 wenbin 提交于
* CE fix * more format
-
- 21 12月, 2021 9 次提交
-
-
由 zyfncg 提交于
* add inplace_map for trace_op in pybind * fix inplace problem of setitem * refactor the param format of trace_op Co-authored-by: Npangyoki <pangyoki@126.com>
-
由 baoachun 提交于
* update seqconv_eltadd_relu_fuse_pass ut * update ut * update ut * update ut
-
由 baoachun 提交于
* update squared_mat_sub_fuse_pass ut * update ut * update ut
-
由 Chen Weihang 提交于
* rename cuda to gpu * revert CMake change * resolve conflit * rename other cuda to gpu * poish details
-
由 crystal 提交于
* relu forward opt * add gelu functor * optimize code
-
由 arlesniak 提交于
-
由 Yuang Liu 提交于
-
由 baoachun 提交于
* add seqpool_cvm_concat_fuse_pass ut * rename ut name
-
由 sneaxiy 提交于
* mean first version * fix scalar mean * add fp16 dtype for api
-
- 20 12月, 2021 14 次提交
-
-
由 baoachun 提交于
* add mkldnn conv_transpose_bias fuse pass ut * update conv_transpose_bias_mkldnn_fuse_pass ut * update conv_transpose_bias_mkldnn_fuse_pass ut * update conv_transpose_bias_mkldnn_fuse_pass ut * restrict conv2d data_format in conv_transpose_bias_mkldnn_fuse_pass * update ut timeout setting * update ut
-
由 chentianyu03 提交于
* add pten conj kernel * modify conj_kernel file path * add defined cuda macro to cuda/conj_kernel.h
-
由 baoachun 提交于
-
由 fwenguang 提交于
-
由 From00 提交于
-
由 sneaxiy 提交于
* support FP16 for more ops * add amp list tests * refine reduce_mean_grad * fix OP benchmark ci * fix fp16 reduce_mean * updat ut, but still have some problems * remove mean/reduce_mean fp16 kernel
-
由 Feng Xing 提交于
softmax_with_cross_entropy optimization with soft label. This PR includes optimization of "SoftmaxWithCrossEntropySoftLabel" : compute log_softmax and then compute loss. "CrossEntropySoftLabel" : compute loss with softmax as input. These optimization includes following technics: read data to buffer with vectorization compute max and sum in warp fixed loop size with macro Performance (computation time): softmax_with_cross_entropy_0 (forward) : -40.1% softmax_with_cross_entropy_0 (backward): -41%
-
由 石晓伟 提交于
-
由 Feiyu Chan 提交于
-
由 heliqi 提交于
* add matmul_scale matmul_v2_scale fuse pass * add scaletensor judge * modify var name * add timeout notest;test=coverag * fix error commit * fix use_mkldnn attr * fix use_mkldnn attr
-
由 Sylwester Fraczek 提交于
-
由 zhangbo9674 提交于
* add multi_tensor for momentum and clear_grads for optimizer * fix bug for dygraph * add unittest * refine comment * add param_group * refine regularizaiton logic * del clear_grads * add clear_grads * add dispensable check of None * refine clear_grad * fix build bug * refine code by comment * refine code * add multi tensor check * refine param_group update * add multi tensor for static mode * refine comments * delete useless comma for momentum * refine comment for momentum * refine code by commment
-
由 Yuang Liu 提交于
-
由 YuanRisheng 提交于
* fix bugs when run reshape * fix ci bug
-