- 08 Mar, 2022 3 commits
-
-
Committed by tanzhipeng
-
Committed by furnace
* [Phi] move InferShape for truncated_gaussian_random and gaussian_random * [Phi] delete useless codes
-
Committed by Linjie Chen
* move infershapes to phi * update code format * update code format
-
- 07 Mar, 2022 11 commits
-
-
Committed by 0x45f
* move bincount OP to phi * fix dtype * set_dtype by weights or x * fix conflicts
-
Committed by Ming-Xu Huang
* Added cuBlasLtHandle_t to device context.
* Added fused_gemm_epilogue op.
  1. Added fused_gemm_epilogue op to leverage cuBlasLt Epilogue.
  2. Support fusion Act(X*Y + bias); X's dims >= 2 and Y's dims should be 2.
  3. Act currently only supports ReLU (GeLU will be added in the future).
* Added UT to fused_gemm_epilogue op.
* Added LinearAct Pattern
  1. Added LinearAct into graph_pattern_detector.* to define (2.)'s pattern.
  2. LinearAct is used to detect act(element_add(matmul_v2(x, w), bias)).
  3. act currently only supports ReLU (GeLU will be supported in the future).
* Added FuseGemmEpiloguePass
  1. Added FuseGemmEpiloguePass to handle nn.Linear + Act{ReLU} fusion (GeLU will be supported in the future).
  2. Only support matmul_v2 from nn.Linear.
* Added pybind to BuildStrategy.fuse_gemm_epilogue_.
* Added UT for fuse_gemm_epilogue_pass.
* GeLU support and EpilogueSingleton
  1. Added GeLU support to fused_gemm_epilogue op.
  2. Added EpilogueSingleton to cache auxiliary pointer.
  3. Added related UTs.
* Rename cublaslt_epilogue_op to gemm_epilogue_op.*.
* Added both train and infer pattern to LinearAct.
  1. Added support of fwd graph with grad_ops linking to LinearAct.
  2. Added related changes to fuse_gemm_epilogue_pass for above modification.
* Changed CUDA requirement from 11.4 to 11.6 for fuse_gemm_epilogue_pass.
* Added identity activation support to gemm_epilogue_op.
* Added Linear Fusion (matmul_v2 + ele_add)
  1. Added matmul_v2 + ele_add pattern to LinearActPattern.
  2. Added matmul_v2 + ele_add support to fuse_gemm_epilogue_pass.
* Rename gemm_epilogue_op.* to fused_gemm_epilogue_op.*
* Add fused_gemm_epilogue_grad op.
  1. Added fused_gemm_epilogue_grad to support backward epilogue fusion.
* Add UTs to fused_gemm_epilogue_grad_op.
* Change attribute name in fused_gemm_epilogue_grad_op for clarity.
* Allow DX and DBias be dispensable to fused_gemm_epilogue_grad op.
* Added ElementwiseAdd+Matmul+Act graph pattern detection.
* Fuse backward of Linear(Act(x))
  1. Added backward fusion pass to Linear(Act(x)).
  2. Added backward fusion pass to Linear(x).
* Added UTs to backward fusion of Linear(Act(x)).
* Complete document of arguments to fused_gemm_epilogue_op.
* Made arguments of some functions pass by reference.
* Modify code with review comments.
  1. Made arguments of some functions pass by reference.
  2. Removed redundant code.
  3. Followed Google code style to change code.
* Made 'const' code style be consistent
* Fixed random seed of python UTs.
* Set compiling constraints to cuBlasLt
  1. Require CUDA 11.6+
  2. Remove fuse_gemm_epilogue related tests when CUDA < 11.6.
* Code Review from Paddle
  1. Changed argument name is_first_gemm to without_x_gradient for clarity.
  2. Applied PADDLE_THROW in fused_gemm_epilogue_op.
* Remove EpilogueSingleton
  1. Applied ReserveSpace to replace Epilogue for passing auxiliary pointers between FWD and BWD.
* Fix a logical error and enhance UTs.
  1. Added act op count checking in UTs.
  2. Fix issue to fuse backward of ReLU(Linear(X)).
  3. TODO: solve GELU fusion issues.
* Fix Linear and GeLU fusion issues.
  1. Modified graph_pattern_detector to fit both Linear with GeLU and Linear with ReLU.
  2. Modified data range in UTs to allow negative values.
* Removed fused_gemm_epilogue_op.h.
* Rename namespace pten to phi.
* Rename name of arguments in fused_gemm_epilogue_op
  1. bias -> Bias.
  2. out -> Out.
  3. reserve_space -> ReserveSpace.
* Change EpiloguePassActivationCache as local variable.
  1. Removed singleton in EpiloguePassActivationCache.
  2. Made EpiloguePassActivationCache an argument to each pass function.
-
Committed by WJJ1995
* Add is_empty * fixed for CI * fixed code style * resolve conflict * deal with comments * replace pt by pd
-
Committed by YuanRisheng
* move elementwise_div grad * change mutable_data to alloc * fix compile bugs
-
Committed by Wei Shengyu
* dbg pool infer shapes * dbg * fix format
-
Committed by Aurelius84
-
Committed by zhangbo9674
* add gaussian random * add full * refine reduce * refine code * refine gaussian_random unittest * add unittest for fill_any_like fill_constant
-
Committed by Liu-xiandong
* [phi] move multi_dot OP * fix the segment bug * fix bug * delete useless comment * fix CI bug
-
Committed by zhangbo9674
* add activ * refine unittest * refine unittest * refine unittest * refine unittest * refine code
-
Committed by zn
* [MLU]support reduce tensors on mlu * [MLU]fix compiler options
-
Committed by Aurelius84
* [Phi]Migrate Adamax into phi * Add adadelta kernel
-
- 06 Mar, 2022 3 commits
-
-
Committed by Chen Weihang
* replace prefix pt by pd * replace added kernel * revert util change * pd kernel to phi * resolve conflict * resolve conflict
-
Committed by Zhong Hui
* move dist op to phi * fix * fix * fix as reviews
-
Committed by zhouweiwei2014
* Migrate triangular_solve op into phi * fix CI * move MatrixReduceSum to phi funcs * move MatrixReduceSum to phi funcs * fix comment * fix CI
-
- 05 Mar, 2022 3 commits
-
-
Committed by Chen Weihang
* remove eig dep for svd helper * fix win failed
-
Committed by furnace
* [Phi] move infershape for mv * [Phi] delete extra codes for mv
-
Committed by Chen Weihang
-
- 04 Mar, 2022 13 commits
-
-
Committed by hong
* add yolo box kernel; test=develop * fix compile error; test=develop
-
Committed by sneaxiy
* move gather_nd/scatter/scatter_nd_add * fix npu/xpu ci * follow comments * small fix
-
Committed by Feiyu Chan
move cpu_vec.h to phi/kernels/funcs.
-
Committed by Linjie Chen
* move sigmoid cross entropy with logits to phi * fix ci * move log_loss to phi * move cumsum to phi * revert infershape * fix xpu ci * move auc to phi * remove comment * update sigmoid_cross_entropy_with_logits_op.cu * update sigmoid_cross_entropy_with_logits_op * Update log_loss
-
Committed by hong
* add digamma, abs, trunc; test=develop * fix bug and add diagonal; test=develop * add name converter; test=develop * update tracer.py; test=develop * add test case; test=develop * fix bugs; test=develop
-
Committed by zyfncg
* remove empty kernel and infershape in fluid * fix bug of infershape_utils
-
Committed by zyfncg
* fix bug caused by split infershape * revert infer_shape of split * revert split
-
Committed by Chen Weihang
* remove cholesky solve deps with svd helper * fix shape infer bug
-
Committed by zhouweiwei2014
* Migrate bitwise_and/or/xor/not op into phi * fix CI
-
Committed by Leo Chen
* clean distribution_helper, index_impl, aligned_vector code in fluid * fix conflicts
-
Committed by chentianyu03
* move reduce gpu impl funcs into pten/kernels/funcs * change reduce header name and namespace * fix spell word error * change mutable_data to dev_ctx.Alloc * modify place to devcontex * format code style * fix build error * fix build error * fix conflict
-
Committed by xiongkun
-
Committed by hong
* move conv to pten * move conv to pten; test=develop * fix bug; * add conv cudnn impl; test=develop * update * update operator; test=develop * fix bug; test=develop * move operator and prepared_operator to develop; test=develop * resolve conflict; test=develop * remove useless code;test=develop * add dependency; test=develop * fix bug; * add sig.cc ; test=develop * fix use_op error; test=develop * fix bug; test=develop * fix bug; test=develop * add conv3d register; test=develop * fix star gan and conv_nn_grad test failed; test=develop * add header; test=develop * manual to recover to develop; * resolve conflict; test=develop * remove useless code * fix bug; * remove conv2d_cudnn; test=develop * fix bugs; test=develop * fix cpu rocm compile bugs; test=develop * fix blas error; test=develop * fix compile bug; test=develop * fix windows compile error; test=develop * fix windows error; test=develop * resolve conflict; test=develop
-
- 03 Mar, 2022 7 commits
-
-
Committed by YuanRisheng
-
Committed by 0x45f
-
Committed by TeFeng Chen
* switch to PE execution in cinn launch * fix outer variables erased * skip the map bug temporarily for test * temporary solution for batch_norm bug * update comment * fix compile error * cinn_instruction_run_op_test: update code to skip external alloc/free instructions generated
-
Committed by From00
* Move compare OPs to phi * Fix bug * Use BroadcastKernel and ElementwiseKernel in phi
-
Committed by wangxinxin08
* modify infershape of multiclass nms
-
Committed by YuanRisheng
* delete elementwise_sub kernel registry * fix compile bugs in xpu ci * fix bugs when run inference ci
-
Committed by wenbin
* emb fix * fix trt6 compile * fix half * absolute error fix
-