- 23 Nov, 2022 6 commits
-
-
Committed by limingshu
* first commit * 2nd commit
-
Committed by duanyanhui
-
Committed by HongyuJia
-
Committed by Leo Chen
-
Committed by zhangyikun02
-
Committed by MarDino
* use fused mlp in multi transformer * Restruct code * use cublaslt to fuse ffn * fix conflict
-
- 22 Nov, 2022 10 commits
-
-
Committed by Piotr Paturej
* Migrate elementwise_div * Migrate elementwise grad kernels
-
Committed by Zhang Zheng
-
Committed by feng_shuai
* fix:fix the bug of trt_8.0.3.4 * fix: fix the bug of trt_8.0 * fix: notes
-
Committed by HongyuJia
-
Committed by huangjiyi
* move vol2col from fluid to phi * update copyright year
-
Committed by Tian Zheng
* Skip tests that use fused_ops on H100 * Add error message to FusedOps on H100
-
Committed by Sylwester Fraczek
Removed ResidualData and Bias from ExtraAttrProperties because it's not an attribute. Removed bug with checking for ResidualData attribute in matmul_elementwise_add_fuse_pass. Removed residualData from list of matmul outputs in cpu_bfloat16_pass.cc because it's input. Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
-
Committed by Hulek
* Delete caching from requantize_mkldnn_op and changed to Acquire API * Fixed codestyle and implementation
-
Committed by Yuang Liu
-
Committed by huangjiyi
* move "paddle/phi/backends/gpu/gpu_device_function.h" to phi * update copyright years * rm "fluid/platform/device/gpu/gpu_device_function.h" in phi * rm dependence to "gpu_device_function.h" in fluid * rm gpu_device_function.h etc in fluid * fix rocm-compile bugs * fix cuda_helper_test.cu bugs
-
- 21 Nov, 2022 16 commits
-
-
Committed by Leo Chen
* fix doc of NPUPlace * fix doc of NPUPlace, test=document_fix
-
Committed by Roc
-
Committed by Sylwester Fraczek
* add fc-residual quantization * revert removal of check for use_mkldnn * fix bug * add disable_logs * review fix call twice AreScalesPresntForNodes instead of if-else * rewrite residual input to output * revert fc mkldnn taking residual data * format fix * fix LoDTensor->DenseTensor * LoDTensor->DenseTensor * output->input * revert changes to unsupported script revert changes to unsupported script * remove fc residualdata from output blocklist in cpu_bfloat16_pass.cc
-
Committed by RichardWooSJTU
-
Committed by Sławomir Siwek
* cleanup unused code * unify is_int8 is_bfloat16 * Simplify matmul_v2 FWD kernel * remove RunKernel methods * remove import namespace * remove headers * clean fluid/phi cross imports * remove fluid axpy_handler * delete fluid methods * activations * OneDNNMemDesc * MKLDNNFormatForSize * MatchShapeToLayout * MKLDNNMemoryFormat * MKLDNNFormat * ReorderMKLDNNHandler * to_void_cast * review suggestions * interpolate * remove fluid dependency * init * ExecuteMatMulV2 * rm fluid kernel * matmul_grad * remove mutable_data * mul_grad
-
Committed by lzy
* use mma for QK dot computing in fused_multi_transformer. * Update fused_multi_transformer_op.cu.h
-
Committed by wanghuancoder
* refine reduce_all
-
Committed by zyfncg
* Fix wrong eigen header include * fix compile bug
-
Committed by PuQing
* move threadpool fix cmake * fix make
-
Committed by taixiurong
-
Committed by wenbin
-
Committed by huangjiyi
* move cross_entropy from fluid to phi * replace mutable_data with Alloc * use .template
-
Committed by Wen Sun
* refactor: replace Collective & PointToPoint with NCCLEnv * refactor: rename to RunFnInNCCLEnv * refactor: pass std::function by value
-
Committed by LiYuRio
-
Committed by LiYuRio
-
Committed by PuQing
-
- 19 Nov, 2022 2 commits
-
-
Committed by Wen Sun
-
Committed by Aganlengzi
* [CustomPlace] fix amp * [CustomPlace] fix amp * fix ut because of too long time matmul fp16
-
- 18 Nov, 2022 6 commits
-
-
Committed by wanghuancoder
-
Committed by MarDino
* fused qkvBiasAdd and transpose with split qkv * fix typo * fix format * fix name * add annotation * fix comment
-
Committed by Sławomir Siwek
* cleanup unused code * unify is_int8 is_bfloat16 * Simplify matmul_v2 FWD kernel * remove RunKernel methods * remove import namespace * remove headers * clean fluid/phi cross imports * remove fluid axpy_handler * delete fluid methods * activations * OneDNNMemDesc * MKLDNNFormatForSize * MatchShapeToLayout * MKLDNNMemoryFormat * MKLDNNFormat * ReorderMKLDNNHandler * to_void_cast * review suggestions * interpolate * remove fluid dependency * init * ExecuteMatMulV2 * rm fluid kernel * matmul_grad * remove mutable_data
-
Committed by Zuza Gawrysiak
* Migrate conv_transpose to phi * Move handler to kernel * kernel m * Fix formatting * handler * remove fluid * revert tcp_store * tcp_store * remove unused * Fix declaration * add dnn input * Fix typo Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
-
Committed by zyfncg
* fix bug of zero_allocator in host * fix test compile bug * add unittest * update test
-
Committed by MarDino
* Add quick gelu and fused bias add kernel * fix annotation * remove useless code * add fast gelu option and set it in multi transformer op * add flag to restrict if use fast gelu approximate * fix flags conflict * fix use tanh function instead * add cudart version limit * use phi fast tanh func * fix comment
-