IB/hfi1: Add adaptive cacheless verbs copy
The kernel memcpy is faster than a cacheless copy. However, if too much of the L3 cache is overwritten by one-time copies then overall bandwidth suffers. Implement an adaptive scheme where full page copies are tracked and if the number of unique entries are larger than a threshold, verbs will use a cacheless copy. Tracked entries are gradually cleaned, allowing memcpy to resume once the larger copies have stopped. Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NDean Luick <dean.luick@intel.com> Signed-off-by: NJubin John <jubin.john@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
Showing
想要评论请 注册 或 登录