arm64: Implement optimised IP checksum helpers
AArch64 is capable of 128-bit memory accesses without alignment restrictions, which makes it both possible and highly practical to slurp up a typical 20-byte IP header in just 2 loads. Implement our own version of ip_fast_checksum() to take advantage of that, resulting in considerably fewer instructions and memory accesses than the generic version.

We can also get more optimal code generation for csum_fold() by defining it a slightly different way round from the generic version, so throw that into the mix too.

Suggested-by: Luke Starrett <luke.starrett@broadcom.com>
Acked-by: Luke Starrett <luke.starrett@broadcom.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
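
For illustration only, the idea described in the message can be sketched in portable userspace C. This is not the code added by this commit: the names fold32() and ip_header_csum() are hypothetical, the wide load is approximated with two 64-bit memcpy() loads instead of a single 128-bit access, and the caller is assumed to pass a header length of at least 5 words.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: fold a 32-bit partial sum to 16 bits and complement it.
 * Adding the value to itself rotated by 16 bits leaves the end-around-carry
 * sum of both halves in the top 16 bits.
 */
static uint16_t fold32(uint32_t sum)
{
    sum += (sum >> 16) | (sum << 16);
    return (uint16_t)~(sum >> 16);
}

/*
 * Hypothetical sketch of an IPv4 header checksum.
 * iph: start of the header (any alignment); ihl: header length in 32-bit
 * words, assumed to be in the range 5..15.
 * The result is in the machine's native byte order, ready to be copied
 * into the checksum field as-is.
 */
static uint16_t ip_header_csum(const void *iph, unsigned int ihl)
{
    const unsigned char *p = iph;
    uint64_t lo, hi, sum;
    uint32_t word;

    /* First 16 bytes as two wide loads; memcpy avoids alignment traps. */
    memcpy(&lo, p, 8);
    memcpy(&hi, p + 8, 8);
    sum = lo + hi;
    sum += (sum < lo);                       /* end-around carry */

    /* Reduce to about 33 bits so the remaining additions cannot overflow. */
    sum = (sum >> 32) + (sum & 0xffffffffu);

    /* Remaining header words: one for a 20-byte header, more with options. */
    p += 16;
    for (ihl -= 4; ihl > 0; ihl--) {
        memcpy(&word, p, 4);
        sum += word;
        p += 4;
    }

    /* Two folds are enough to bring the total down to 32 bits. */
    sum = (sum >> 32) + (sum & 0xffffffffu);
    sum = (sum >> 32) + (sum & 0xffffffffu);
    return fold32((uint32_t)sum);
}
```

Under these assumptions, calling ip_header_csum(hdr, 5) on a header whose checksum field is already correct returns 0, and calling it with the checksum field zeroed returns the value to copy into that field.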
arch/arm64/include/asm/checksum.h
0 → 100644