Add `scale_attn_by_inverse_layer_idx` feature (#2486)
* Add scale_attn_by_inverse_layer_idx feature
* Fix layer_id bug
* Fix scaling value

Co-authored-by: Connor Holmes <connorholmes@microsoft.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
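
For context, a minimal sketch of what this option is generally understood to do. It assumes the same semantics as the Hugging Face GPT-2 config flag of the same name: the pre-softmax attention scores are additionally divided by `layer_idx + 1`, where `layer_idx` is 0-based. The function name `attention_scores` and the list-based tensors are hypothetical and are not taken from this commit's diff.

```python
import math


def attention_scores(q, k, layer_idx, scale_attn_by_inverse_layer_idx=True):
    """Pre-softmax attention scores for a single head (illustrative only).

    q, k: lists of vectors, each of length head_dim.
    layer_idx: 0-based index of the transformer layer (assumed convention).
    """
    head_dim = len(q[0])
    # Standard scaled dot-product factor: 1 / sqrt(head_dim).
    scale = 1.0 / math.sqrt(head_dim)
    if scale_attn_by_inverse_layer_idx:
        # Extra per-layer factor: 1 / (layer_idx + 1).
        scale /= float(layer_idx + 1)
    return [
        [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        for qi in q
    ]
```

Under this assumption, the first layer (`layer_idx == 0`) is unaffected, and deeper layers see progressively smaller scores. An off-by-one in the layer index would change the factor applied to every layer.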