- 17 Mar, 2022 2 commits
- 15 Mar, 2022 2 commits
-
-
Committed by Chen Weihang
* skip infrt when checking log fatal, test=document_fix * remove test=document_fix * update commit
-
Committed by YuanRisheng
* move activation op * adjust code format * fix compile bugs * fix ci bugs * code format adjust * code format adjust2 * activate ci status * modify according to comment * move activation kernel * revert relu6 * reduce add code * perfect use_phi_functor * completing func name * fix bugs when run ci * fix bugs when run infr * modify infrt get kernel signature
-
- 14 Mar, 2022 1 commit
-
-
Committed by huzhiqiang
-
- 10 Mar, 2022 1 commit
-
-
Committed by Shang Zhizhou
* add trt.execute * merge trt.engine type * update return op * update comments * fix style * fix style
-
- 09 Mar, 2022 2 commits
-
-
Committed by huzhiqiang
-
Committed by Ren Wei (任卫)
* run document_preview when sample codes are tested * run document_preview when sample codes are tested * sphinx-build symbolic link; and build-doc default * FLUIDDOCDIR typo * download the required configurations and some other scripts * install required python packages. * clone specified branch of docs repo, and if failed, clone the default branch * clean workspace for docs repo * use the conf.py imported by https://github.com/PaddlePaddle/docs/pull/4222/ * download and install the boscmd * Optimize the code comments. * specify the pypi index server * only do doc-build when running in cpu mode * pull docs pr git log paddle_pr_info * install jq * force using sphinx-build under py3.7 * using our new domain name for preview * install python package error * don't build doc by default
-
- 07 Mar, 2022 2 commits
-
-
Committed by 王明冬
-
Committed by Ming-Xu Huang
* Added cuBlasLtHandle_t to device context.
* Added fused_gemm_epilogue op.
  1. Added fused_gemm_epilogue op to leverage cuBlasLt Epilogue.
  2. Support fusion Act(X*Y + bias); X's dims must be >= 2 and Y's dims should be 2.
  3. Act currently supports only ReLU. (Will add GeLU in the future.)
* Added UT to fused_gemm_epilogue op.
* Added LinearAct Pattern
  1. Added LinearAct into graph_pattern_detector.* to define (2.)'s pattern.
  2. LinearAct is used to detect act(element_add(matmul_v2(x, w), bias)).
  3. act currently supports only ReLU (will support GeLU in the future).
* Added FuseGemmEpiloguePass
  1. Added FuseGemmEpiloguePass to handle nn.Linear + Act{ReLU} fusion (GeLU will be supported in the future).
  2. Only supports matmul_v2 from nn.Linear.
* Added pybind to BuildStrategy.fuse_gemm_epilogue_.
* Added UT for fuse_gemm_epilogue_pass.
* GeLU support and EpilogueSingleton
  1. Added GeLU support to fused_gemm_epilogue op.
  2. Added EpilogueSingleton to cache auxiliary pointer.
  3. Added related UTs.
* Rename cublaslt_epilogue_op to gemm_epilogue_op.*.
* Added both train and infer pattern to LinearAct.
  1. Added support of fwd graph with grad_ops linking to LinearAct.
  2. Added related changes to fuse_gemm_epilogue_pass for above modification.
* Changed CUDA requirement from 11.4 to 11.6 for fuse_gemm_epilogue_pass.
* Added identity activation support to gemm_epilogue_op.
* Added Linear Fusion (matmul_v2 + ele_add)
  1. Added matmul_v2 + ele_add pattern to LinearActPattern.
  2. Added matmul_v2 + ele_add support to fuse_gemm_epilogue_pass.
* Rename gemm_epilogue_op.* to fused_gemm_epilogue_op.*
* Add fused_gemm_epilogue_grad op.
  1. Added fused_gemm_epilogue_grad to support backward epilogue fusion.
* Add UTs to fused_gemm_epilogue_grad_op.
* Change attribute name in fused_gemm_epilogue_grad_op for clarity.
* Allow DX and DBias to be dispensable in fused_gemm_epilogue_grad op.
* Added ElementwiseAdd+Matmul+Act graph pattern detection.
* Fuse backward of Linear(Act(x))
  1. Added backward fusion pass to Linear(Act(x)).
  2. Added backward fusion pass to Linear(x).
* Added UTs to backward fusion of Linear(Act(x)).
* Complete document of arguments to fused_gemm_epilogue_op.
* Made arguments of some functions pass by reference.
* Modify code with review comments.
  1. Made arguments of some functions pass by reference.
  2. Removed redundant code.
  3. Followed Google code style to change code.
* Made 'const' code style consistent
* Fixed random seed of python UTs.
* Set compiling constraints for cuBlasLt
  1. Require CUDA 11.6+
  2. Remove fuse_gemm_epilogue related tests when CUDA < 11.6.
* Code Review from Paddle
  1. Changed argument name is_first_gemm to without_x_gradient for clarity.
  2. Applied PADDLE_THROW in fused_gemm_epilogue_op.
* Remove EpilogueSingleton
  1. Applied ReserveSpace to replace Epilogue for passing auxiliary pointers between FWD and BWD.
* Fix a logical error and enhance UTs.
  1. Added act op count checking in UTs.
  2. Fix issue to fuse backward of ReLU(Linear(X)).
  3. TODO: solve GELU fusion issues.
* Fix Linear and GeLU fusion issues.
  1. Modified graph_pattern_detector to fit both linear with gelu and linear with relu.
  2. Modified data range in UTs to allow negative values.
* Removed fused_gemm_epilogue_op.h.
* Rename namespace pten to phi.
* Rename arguments in fused_gemm_epilogue_op
  1. bias -> Bias.
  2. out -> Out.
  3. reserve_space -> ReserveSpace.
* Change EpiloguePassActivationCache to a local variable.
  1. Removed singleton in EpiloguePassActivationCache.
  2. Made EpiloguePassActivationCache an argument to each pass function.
-
- 04 Mar, 2022 1 commit
-
-
Committed by 王明冬
-
- 03 Mar, 2022 1 commit
-
-
Committed by 石晓伟
* mlir attr types for infrt place, test=develop * fix a bug, test=develop
-
- 02 Mar, 2022 3 commits
-
-
Committed by Allen Guo
* update dockerfile for ipu * update comments, test=document_fix
-
Committed by pangyoki
* support phi checking in CI op benchmark * add sparse/gpu * remove h file in cpu directory
-
Committed by huzhiqiang
-
- 01 Mar, 2022 2 commits
- 28 Feb, 2022 2 commits
-
-
Committed by tianshuo78520a
-
Committed by Wilber
-
- 22 Feb, 2022 4 commits
-
-
Committed by 王明冬
-
Committed by chentianyu03
* add check for using HostAlloc * add check for using HostAlloc
-
Committed by zhangchunle
-
Committed by Chen Weihang
* unify register macro * rename declare macro * fix infrt error
-
- 20 Feb, 2022 1 commit
-
-
Committed by Chen Weihang
* rename pten dir to phi * rename namespace to phi * rename infrt pten dir to phi * resolve conflict * rename pten to phi in cmake * revert all infrt change * change needed files * fix infrt failed * fix inference failed
-
- 18 Feb, 2022 1 commit
-
-
Committed by Wilber
* the mlir representation of pten, test=develop * fixes an error, test=develop * infrt registers pten kernels Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
-
- 17 Feb, 2022 3 commits
-
-
Committed by YUNSHEN XIE
-
Committed by QingshuChen
* update kunlun label_smooth unittest *test=kunlun * minor *test=kunlun
-
Committed by huzhiqiang
* update generate_pd_op_dialect_from_paddle_op_maker.py * update mlir tensor load interface * refine * fix bug * fix * refine * fix * 3 * fix * codestyle Co-authored-by: weishengying <1343838695@qq.com>
-
- 16 Feb, 2022 2 commits
-
-
Committed by Shang Zhizhou
-
Committed by chentianyu03
* change the directory covered by the CI mutable_data() usage check from paddle/pten to paddle/pten/kernels * change echo info from paddle/pten to paddle/pten/kernels
-
- 15 Feb, 2022 1 commit
-
-
Committed by YUNSHEN XIE
-
- 14 Feb, 2022 2 commits
-
-
Committed by chentianyu03
-
Committed by Qi Li
-
- 11 Feb, 2022 1 commit
-
-
Committed by zhangchunle
-
- 09 Feb, 2022 1 commit
-
-
Committed by huzhiqiang
-
- 07 Feb, 2022 1 commit
-
-
Committed by Yan Chunwei
-
- 29 Jan, 2022 2 commits
-
-
Committed by Zhanlue Yang
-
Committed by QingshuChen
* fix kunlun2 softmax unittest bug *test=kunlun * minor
-
- 27 Jan, 2022 2 commits