Committed by GaoWei8
* Optimize the kernel implementation of layernorm with openmp (#20895)
* Add ernie c++ inference test (#21015)
  * Add ernie unit test test=develop
  * Add ernie unit test test=develop
  * Add ernie unit test test=develop
  * remove ngraph
  * optimize gpu test test=develop
  * optimize codes test=develop
* fix cmake fails on inference_download_and_uncompress (#21185)
  * solve cmake fails on inference_download_and_uncompress test=develop
  * solve cmake fails on inference_download_and_uncompress test=develop
* Add fc padding to improve mkl GEMM's performance when N and K are multiple of 128. (#20972)
  * Add fc padding to solve mkl performance test=develop
  * fix gpu pass and error information test=develop
  * fix fc_fuse_pass_test test=develop
  * fix error information test=develop
  * fix error information test=develop
  * fix name and add fc op padding test test=develop
  * fix attributes test=develop
  * optimize fc padding test=develop
  * fix test test=develop
* Polish the codes of fc when needs padding (#21378) test=develop
* Add ernie large c++ inference test (#21365)
  * add ernie-large test test=develop
  * add ernie large c++ inference test test=develop
* Modify padding strategy: remove weight copy in fc padding (#21650) test=develop
* optimize fc jit (#21878) test=develop

Co-authored-by: NYihua Xu <yihuaxu@hotmail.com>
3df38f5c