[OptionalOptimization]: LayerNorm forward Optimization with Welford (#50362)
* first commit
* main codes has been developed
* fix all bugs
* add vectorize input&output
* a test for optimization_of_layer_norm_fwd
* add some changes
* fix memory coalesced access for more optimization.
* fix addition ctest error
* fix according to ci-approval
* remove change on slice
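
For readers unfamiliar with the technique named in the title, here is a minimal scalar sketch of Welford's single-pass mean/variance computation applied to one LayerNorm row. It is not the kernel from this commit: the names `WelfordState`, `WelfordUpdate`, `WelfordMerge`, and `LayerNormRow` are hypothetical, the code runs on the host rather than as a CUDA kernel, and scale and bias are omitted. The merge step is the standard pairwise combination formula that a parallel reduction would presumably rely on to combine per-thread partial results.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Running statistics for Welford's algorithm (hypothetical helper type).
struct WelfordState {
  float mean = 0.0f;      // running mean of the values seen so far
  float m2 = 0.0f;        // running sum of squared deviations from the mean
  std::size_t count = 0;  // number of values seen so far
};

// Fold one value into the running statistics in a single pass.
inline void WelfordUpdate(WelfordState* s, float x) {
  s->count += 1;
  const float delta = x - s->mean;
  s->mean += delta / static_cast<float>(s->count);
  s->m2 += delta * (x - s->mean);  // uses the already-updated mean
}

// Combine two partial states, as a parallel reduction would do
// when merging results computed over disjoint chunks of a row.
inline WelfordState WelfordMerge(const WelfordState& a, const WelfordState& b) {
  if (a.count == 0) return b;
  if (b.count == 0) return a;
  WelfordState out;
  out.count = a.count + b.count;
  const float n = static_cast<float>(out.count);
  const float na = static_cast<float>(a.count);
  const float nb = static_cast<float>(b.count);
  const float delta = b.mean - a.mean;
  out.mean = a.mean + delta * nb / n;
  out.m2 = a.m2 + b.m2 + delta * delta * na * nb / n;
  return out;
}

// Normalize one row: y = (x - mean) / sqrt(var + eps).
// Uses the biased variance (m2 / count); scale and bias are omitted.
inline std::vector<float> LayerNormRow(const std::vector<float>& x, float eps) {
  WelfordState s;
  for (float v : x) WelfordUpdate(&s, v);
  const float var = s.count ? s.m2 / static_cast<float>(s.count) : 0.0f;
  const float inv_std = 1.0f / std::sqrt(var + eps);
  std::vector<float> y(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) y[i] = (x[i] - s.mean) * inv_std;
  return y;
}
```

Compared with a two-pass approach (one pass for the mean, a second for the variance), this reads each input element once for the statistics, and it avoids the cancellation that the E[x^2] - E[x]^2 formulation can suffer from.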