- 16 Dec, 2021 7 commits
-
-
Committed by Liu-xiandong
Add key_padding_mask and attn_mask to the sparse_attention API. 1. The key padding mask is a tensor with dimensions [batch_size, seq_len], and the attention mask is a tensor with dimensions [seq_len, seq_len]. The data types of the two masks are consistent with Q, K, and V, which are float32 or float64. A value of 0 in a mask means that the position is masked. 2. The changed files are mainly paddle/fluid/operators/sparse_attention_op.cu and python/paddle/fluid/tests/unittests/test_sparse_attention_op.py. sparse_attention has three parts: sddmm, softmax, and dsd. Adding the mask operation only requires modifying the softmax; it has no effect on the other two parts. In addition, tests for the mask function have been added.
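The mask semantics described in this commit message can be illustrated with a minimal dense sketch in plain Python. It is not the CUDA kernel from sparse_attention_op.cu and ignores the sparse layout entirely; the function names, the choice to combine the two masks with a logical AND, and the all-zero output for a fully masked row are assumptions made only for illustration.

```python
import math


def masked_softmax_row(scores, key_padding_row, attn_mask_row):
    """Softmax over one query row; a mask value of 0 removes that key position."""
    # Assumption: a position is kept only if BOTH masks are non-zero there.
    keep = [kp != 0 and am != 0 for kp, am in zip(key_padding_row, attn_mask_row)]
    if not any(keep):
        # Assumption: a fully masked row yields all zeros.
        return [0.0] * len(scores)
    # Subtract the largest kept score for numerical stability.
    top = max(s for s, k in zip(scores, keep) if k)
    exps = [math.exp(s - top) if k else 0.0 for s, k in zip(scores, keep)]
    total = sum(exps)
    return [e / total for e in exps]


def masked_softmax(scores, key_padding_mask, attn_mask):
    """scores: [batch][seq_len][seq_len]
    key_padding_mask: [batch][seq_len], attn_mask: [seq_len][seq_len]."""
    return [
        [
            masked_softmax_row(row, key_padding_mask[b], attn_mask[i])
            for i, row in enumerate(batch_scores)
        ]
        for b, batch_scores in enumerate(scores)
    ]
```

With `key_padding_mask = [[1.0, 0.0]]`, the second key receives zero weight in every query row of the first batch entry, while the sddmm and dsd stages would be unaffected because only the softmax input changes.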
-
Committed by niuliling123
* Add the transformop parameter in TensorReduceFunctorImpl
-
Committed by YuanRisheng
* Reduce reshape kernel functions in pten * delete notes * fix bugs when compile * modify register name * fix compile bugs
-
Committed by Chen Weihang
* unify device context entrance * move all_context include to header * polish cmake relay for device_context * fix npu compile failed * fix npu compile failed * revert part of change
-
Committed by chentianyu03
* Revert "Revert "pylayer support tuple/list type args (#37727)" (#37956)" This reverts commit d848ff04. * move check args,kwargs before forward execute
-
Committed by Li Min
* Add float16 type for scatter op. * Add fp16 test for scatter op. * Add int and int64 support for scatter_grad on gpu. * Add int and int64 for check_variable_and_dtype routine. * Minors. * Code format.
-
Committed by Zhanlue Yang
* Rearranged Eager AutoCodeGen directory structure * Removed USE_OP in Eager AutoCodeGen * Enabled generation for Operators without Grad/Inputs/Outputs * Resolved operators without input * Fixed merge conflicts * Enabled Eager AutoCodeGen for 10+ more operators * Refactored Eager AutoCodeGen with more organized helper objects * Enabled Eager AutoCodeGen for operators with multiple OpBases * Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument * Handled Dispensable Inputs/Outputs in Eager AutoCodeGen * Enabled Eager AutoCodeGen for All Existing Operators & Possible Future Operators * Fixed CI issues
-
- 15 Dec, 2021 14 commits
-
-
Committed by baoachun
* add mkldnn conv3d_bias_mkldnn_fuse_pass ut * update conv3d_bias_mkldnn_fuse_pass ut * disable conv3d_bias_mkldnn_fuse_pass
-
Committed by Yiqun Liu
test=document_fix
-
Committed by jianghaicheng
* add ipu_inference * resolve comments * resolve comments * add EnableIpu introduction * rm line * restore npu update * add ernie and resnet50 test * fix copyright time Co-authored-by: Nyaozhixin <522190855@qq.com>
-
Committed by Leo Chen
* refine test * add download_program target * update ut code * refine code * disable profiler * add comments * refine cmake * skip coverage ci
-
Committed by baoachun
* update mkldnn conv_concat_relu_mkldnn_fuse_pass ut * update conv_concat_relu_mkldnn_fuse_pass ut * restrict conv2d data_format in conv_concat_relu_mkldnn_fuse_pass
-
Committed by Chen Weihang
-
Committed by wenbin
* remove bf16 * remove comments * remove wrong return * fix UT
-
Committed by Yiqun Liu
test=document_fix
-
Committed by 王明冬
-
Committed by Chen Weihang
-
Committed by Huihuang Zheng
As the title.
-
Committed by Chen Weihang
-
Committed by chentianyu03
* replace with pten kernel in cast cuda compute and remove unused codes * rm unused header file * replace CastCUDAOpKernel with CastOpKernel
-
Committed by Zhanlue Yang
* Rearranged Eager AutoCodeGen directory structure * Removed USE_OP in Eager AutoCodeGen * Enabled generation for Operators without Grad/Inputs/Outputs * Resolved operators without input * Fixed merge conflicts * Enabled Eager AutoCodeGen for 10+ more operators * Refactored Eager AutoCodeGen with more organized helper objects * Enabled Eager AutoCodeGen for operators with multiple OpBases * Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument * Handled Dispensable Inputs/Outputs in Eager AutoCodeGen * Adjusted function generation/call between Python-C API & Dygraph API * Synchronized auto-generated Python-C API with Dygraph Forward Functions * Added safe_initialized interface to EagerTensor for use in processing dispensable inputs
-
- 14 Dec, 2021 14 commits
-
-
Committed by Sylwester Fraczek
* add map_matmul passes to quant2_int8_mkldnn_pass * fix fc+act fuse (activation scale) * ci fix, c++17 structured bindings not available * fix ci static check
-
Committed by zyfncg
* fix bug of set_value op * fix BumpInplaceVersion * polish some comments * revert change of full_like
-
Committed by Chen Weihang
* polish register macro * resolve compile failed * revert needless change * revert eager related change * revert eager related change * change register macro name * polish details
-
Committed by baoachun
* add conv_gelu_mkldnn_fuse_pass * add post ops
-
Committed by Aurelius84
-
Committed by weishengying
-
Committed by YuanRisheng
-
Committed by Yuang Liu
-
Committed by YuanRisheng
* Reduce reshape kernel functions in pten * delete notes * fix bugs when compile
-
Committed by feng_shuai
* test_mkldnn_depthwise_conv_pass * test: add TimeOut * set TIMEOUT * fix:add random num for dilation and group
-
Committed by Zhanlue Yang
* Rearranged Eager AutoCodeGen directory structure * Removed USE_OP in Eager AutoCodeGen * Enabled generation for Operators without Grad/Inputs/Outputs * Resolved operators without input * Fixed merge conflicts * Enabled Eager AutoCodeGen for 10+ more operators * Refactored Eager AutoCodeGen with more organized helper objects * Enabled Eager AutoCodeGen for operators with multiple OpBases * Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument * Handled Dispensable Inputs/Outputs in Eager AutoCodeGen
-
Committed by heliqi
* add layer_norm_fuse_pass test case * restore cmakelist code * Merge branch 'develop' into layer_norm_fuse_pass * Merge branch 'develop' into layer_norm_fuse_pass * add bad case test
-
Committed by wangguanzhong
-
Committed by Sylwester Fraczek
* reshape+transpose+matmul_v2 * in_name->input_name * fix pr-ci-static-check
-
- 13 Dec, 2021 5 commits
-
-
Committed by zhenlin
* update 3 tests * fix typo error
-
Committed by wenbin
* disabled bad case * int to size_t
-
Committed by jianghaicheng
-
Committed by taixiurong
-
Committed by xiongkun
* fix single card 8 unittests in new executor * fix * fix
-