- 02 Mar, 2022 25 commits
-
Committed by pangyoki
* support phi checking in CI op benchmark * add sparse/gpu * remove h file in cpu directory
-
Committed by huzhiqiang
-
Committed by zhangbo9674
* add softmax log_softmax * refine rocm * refine unittest
-
Committed by JZ-LIANG
* adapt dist op * add dist_fill_constant_batch_size_like * remove print * update compatible * add unittest
-
Committed by crystal
* move to phi * migrate gather_tree_op into phi * move reduce_prod to phi * optimize code
-
Committed by Allen Guo
* update ipu UTs part0 * rename UT * sync api changes * update uts for new api * use_ipumodel() as classmethod
-
Committed by chenjian
* add new profiler components * fix bug * upgrade new profiler * fix operator.cc * fix operator.cc * fix cmakelists.txt * fix bug * fix according to pr * fix bug * fix cmake * fix bug * fix a bug * fix bug * fix bug
-
Committed by joeqiao12
-
Committed by Yuang Liu
[fleet_executor] Add entrance of FleetExecutor in AnalysisPredictor for distributed inference (#39992)
-
Committed by Lijunhui
-
Committed by Chen Weihang
* unify complex type trait and fix real imag bug * add unittest for type traits
-
Committed by qipengh
* [MLU] adapt matmul op * [MLU] fix phi namespace
-
Committed by zhangchunle
-
Committed by 王明冬
-
Committed by fwenguang
-
Committed by Baibaifan
-
Committed by lkylkylky
-
Committed by zhouweiwei2014
* change CUDA implementation of randint OP, move distribution common func to phi * fix CI * fix CI
-
Committed by wanghuancoder
* open eager when WITH_PYTHON, test=develop * refine, test=develop * refine, test=develop * add DWITH_PYTHON for gen_fluid_lib, test=develop
-
Committed by Wangzheee
-
Committed by JingZhuangzhuang
-
Committed by Feiyu Chan
* move sequence2batch * move lstm and gru * Add phi/kernels directory into exclusion to stop using hipcc to compile non .cu files in it.
-
Committed by Weilong Wu
-
Committed by From00
-
Committed by Shang Zhizhou
* update pd_2_trt lower pass * update pd_2_trt lower pass * update style * update * change trt.graph to trt.create_engine * update comments * update comments * add test
-
- 01 Mar, 2022 15 commits
-
Committed by Zhanlue Yang
-
Committed by Qi Li
-
Committed by zhouweiwei2014
* fix bug of paddle.to_tensor and paddle.moveaxis * fix CI
-
Committed by Allen Guo
-
Committed by chentianyu03
* modify infershape utils and rm reduce infershape * merge develop * fix infermeta bug * add IsForInferShape func in ArgumentMappingContext * add reduce_mean infermeta * modify annotation * add default dims
-
Committed by xiongkun
* transfer the selu_op and pass the CI * add sig files * fix code * fix by code review * remove TODO * change the include position * change the head position
-
Committed by niuliling123
* Add function description for Kernel Primitive API 1. Set cumsum and sort shared memory size = 1024 2. sort and cumsum API limitation: blockDim.x must be at most 512 (blockDim.x <= 512)
-
Committed by Zhanlue Yang
* Refactored GradNodeAccumulation data structure and behaviour * Fixed CI issues * Fix compilation issues * Fixed minor issues * Reverted changes for intermediate and OverwriteOutput * fixed minor issue * Fixed auto codegen for intermediate tensors * Removed restriction on AccumulationNode modification * Fixed CI Coverage issues * Adjusted Log contents * Fixed CI issues
-
Committed by joanna.wozna.intel
* Add mobilenetv3_large performance test * Disable the BF16 test if the device does not support BF16 computations * Change test timeout
-
Committed by zhangbo9674
* add layer norm * add p norm * add reduce sum * refine layer norm register bf16 for cudnn811 * add bf16 cast for hip * add unittest * refine rocm * refine layer_norm unittest * refine reduce op * refine unittest * enhance atol for reduce unittest
-
Committed by wenbin
* remove * pass * more pass
-
Committed by zhangchunle
-
Committed by zhangbo9674
* add scale gather sum * refine CUDA_ATOMIC_WRAPPER ADD for bf16 * add gather unittest * solve conflict * add scale unittest * add sum unittest * solve conflict * refine gather unittest * refine unittest
-
Committed by Guoxia Wang
-
Committed by pangyoki
-