Cherry-pick fixes for operator precision. (#52705)
* Fix the scale kernel for low precision; cherry-pick of #50998.
* Fix the FP16 precision problem of add_n. (#50129)
* Change squared_l2_norm to reuse ReduceKernel and register fp16 and bf16 kernels; cherry-pick of #48315.
* Cherry-pick the fix of MPTypeTrait in KP, which is implemented in #50993.
* Cherry-pick the multi-precision support of AdamW for bf16. (#48041)
* Fix compile error.
* Cherry-pick the fix of CubTensorReduceImpl for bfloat16 in #50993.
* Fix unit test.
---------
Co-authored-by: liuruyan <44316842+liuruyan@users.noreply.github.com>