- 11 Mar, 2022 5 commits
-
-
Committed by Yuang Liu
-
Committed by zn
-
Committed by z8hanghuan
-
Committed by houj04
-
Committed by Baibaifan
-
- 10 Mar, 2022 3 commits
-
-
Committed by caozhou
* update reshard for while sub block * fix code format error
-
Committed by z8hanghuan
* add tril_triu for xpu, *test=kunlun * add tril_triu for xpu, *test=kunlun * add tril_triu for xpu, *test=kunlun * add tril_triu for xpu, *test=kunlun * add tril_triu for xpu, *test=kunlun
-
Committed by hong
* move dropout to phi; test=develop * fix xpu, npu compile error; test=develop
-
- 09 Mar, 2022 11 commits
-
-
Committed by Baibaifan
-
Committed by feng_shuai
-
Committed by wawltor
* fix full_like when filling with the value inf * update the test case for fill_any_like * update the comments for full_like
-
Committed by 0x45f
* adapt run_program OP for eager * fix program_id * refine code * fix test
-
Committed by ShenLiang
* fix time of utest
-
Committed by WangXi
-
Committed by Weilong Wu
-
Committed by xiongkun
[optest]: fix transpose, support different parameter name between python_api and KernelSignature. (#40258) * optest: fix transpose * fix
-
Committed by Allen Guo
-
Committed by Allen Guo
* update ipu UTs part1 * rename ut * sync api changes * update uts for new api * update use_ipumodel() * update use_ipumodel() * split pr
-
Committed by Allen Guo
* update ipu UTs part3 * rename uts * sync api changes * update uts for new api * update use_ipumodel() * split pr
-
- 08 Mar, 2022 10 commits
-
-
Committed by chenjian
* add python profiler package * update according to review * fix bug * fix bug * fix bug * add unit test * Revert "add unit test" This reverts commit 4e69ff71b0645e069afe5dd8fea0d07717852c48. * reduce for pr * add unit test * modify for pr * fix unittest * update for ci coverage * modify according to review * fix bug * improve coverage * add profiler code * add statistic code * reduce content for pr
-
Committed by Kaipeng Deng
-
Committed by xiaoting
* fix fold python examples, test=develop * fix size type, test=develop * fix python example, test=develop * fix fold shape check * fix fold dygraph mode, test=develop
-
Committed by lilong12
* add pg_hccl
-
Committed by xiongkun
-
Committed by Allen Guo
* update ipu UTs part4 * rename uts * sync api changes * update uts for new api
-
Committed by Allen Guo
* update ipu UTs part2 * clean git * rename ut * rename ut 1 * sync api changes * update uts for new api * update uts for new api * fix re-define
-
Committed by mhhhh1
* [MLU] add fleet init api and collective api pytest for mlu * fix no value for argument 'data_type' in method call
-
Committed by chenjian
* add profiler helper * fix unittest * improve test coverage rate
-
Committed by chenjian
* add python profiler package * update according to review * fix bug * fix bug * fix bug * add unit test * Revert "add unit test" This reverts commit 4e69ff71b0645e069afe5dd8fea0d07717852c48. * reduce for pr * add unit test * modify for pr * fix unittest * update for ci coverage * modify according to review * fix bug * improve coverage
-
- 07 Mar, 2022 8 commits
-
-
Committed by xiongkun
* add python api test in TestOp * test_python_api if self.python_api is set * fix code by CR
-
Committed by houj04
* refactor unittest for nearest_interp_v2_op_xpu. test=kunlun * fix code style. test=kunlun * fix code style. test=kunlun
-
Committed by Ming-Xu Huang
* Added cuBlasLtHandle_t to device context. * Added fused_gemm_epilogue op. 1. Added fused_gemm_epilogue op to leverage cuBlasLt Epilogue. 2. Support fusion Act(X*Y + bias), X's dims >= 2 and Y's dims should be 2. 3. Act currently supports only ReLU (will add GeLU in the future). * Added UT to fused_gemm_epilogue op. * Added LinearAct Pattern 1. Added LinearAct into graph_pattern_detector.* to define (2.)'s pattern. 2. LinearAct is used to detect act(element_add(matmul_v2(x, w), bias)). 3. act currently supports only ReLU (will support GeLU in the future). * Added FuseGemmEpiloguePass 1. Added FuseGemmEpiloguePass to handle nn.Linear + Act{ReLU} fusion (GeLU will be supported in the future). 2. Only support matmul_v2 from nn.Linear. * Added pybind to BuildStrategy.fuse_gemm_epilogue_. * Added UT for fuse_gemm_epilogue_pass. * GeLU support and EpilogueSingleton 1. Added GeLU support to fused_gemm_epilogue op. 2. Added EpilogueSingleton to cache auxiliary pointer. 3. Added related UTs. * Rename cublaslt_epilogue_op to gemm_epilogue_op.*. * Added both train and infer pattern to LinearAct. 1. Added support of fwd graph with grad_ops linking to LinearAct. 2. Added related changes to fuse_gemm_epilogue_pass for above modification. * Changed CUDA requirement from 11.4 to 11.6 for fuse_gemm_epilogue_pass. * Added identity activation support to gemm_epilogue_op. * Added Linear Fusion (matmul_v2 + ele_add) 1. Added matmul_v2 + ele_add pattern to LinearActPattern. 2. Added matmul_v2 + ele_add support to fuse_gemm_epilogue_pass. * Rename gemm_epilogue_op.* to fused_gemm_epilogue_op.* * Add fused_gemm_epilogue_grad op. 1. Added fused_gemm_epilogue_grad to support backward epilogue fusion. * Add UTs to fused_gemm_epilogue_grad_op. * Change attribute name in fused_gemm_epilogue_grad_op for clarity. * Allow DX and DBias to be dispensable in fused_gemm_epilogue_grad op. * Added ElementwiseAdd+Matmul+Act graph pattern detection. * Fuse backward of Linear(Act(x)) 1. Added backward fusion pass to Linear(Act(x)). 2. Added backward fusion pass to Linear(x). * Added UTs to backward fusion of Linear(Act(x)). * Complete document of arguments to fused_gemm_epilogue_op. * Made arguments of some functions pass by reference. * Modify code with review comments. 1. Made arguments of some functions pass by reference. 2. Removed redundant code. 3. Followed Google code style to change code. * Made 'const' code style consistent * Fixed random seed of python UTs. * Set compile constraints for cuBlasLt 1. Require CUDA 11.6+ 2. Remove fuse_gemm_epilogue related tests when CUDA < 11.6. * Code Review from Paddle 1. Changed argument name is_first_gemm to without_x_gradient for clarity. 2. Applied PADDLE_THROW in fused_gemm_epilogue_op. * Remove EpilogueSingleton 1. Applied ReserveSpace to replace Epilogue for passing auxiliary pointers between FWD and BWD. * Fix a logical error and enhance UTs. 1. Added act op count checking in UTs. 2. Fix issue to fuse backward of ReLU(Linear(X)). 3. TODO: solve GELU fusion issues. * Fix Linear and GeLU fusion issues. 1. Modified graph_pattern_detector to fit both linear with gelu or relu. 2. Modified data range in UTs to allow negative values. * Removed fused_gemm_epilogue_op.h. * Rename namespace pten to phi. * Rename name of arguments in fused_gemm_epilogue_op 1. bias -> Bias. 2. out -> Out. 3. reserve_space -> ReserveSpace. * Change EpiloguePassActivationCache as local variable. 1. Removed singleton in EpiloguePassActivationCache. 2. Made EpiloguePassActivationCache an argument to each pass function.
-
Committed by JingZhuangzhuang
* fix_conv2d_trt_convert_test_case * fix_conv2d_trt_convert_test_case * fix_conv2d_trt_convert_test_case * fix_conv2d_trt_convert_test_case
-
Committed by zhangbo9674
* add gaussian random * add full * refine reduce * refine code * refine gaussian_random unittest * add unittest for fill_any_like fill_constant
-
Committed by zhaoyingli
* engine support pp * fix format * avoid multi print * fix convert * bug fix * add pp unittest
-
Committed by zhangbo9674
* add activ * refine unittest * refine unittest * refine unittest * refine unittest * refine code
-
Committed by lilong12
-
- 05 Mar, 2022 1 commit
-
-
Committed by wangguanqun
* fix benchmark and communicator config * fix bugs of the_one_ps * multi program and fix bug in optimizer * multi program in the_one_ps * public commcontext * ps optimizer multi programs * the one ps merge * fix bug in test
-
- 04 Mar, 2022 2 commits