arm64: lib: accelerate crc32_be
It makes no sense to leave crc32_be using the generic code while we
only accelerate the little-endian ops.
Even though the big-endian form doesn't fit as smoothly into the arm64,
we can speed it up and avoid hitting the D cache.
Tested on Cortex-A53. Without acceleration:
crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64
crc32: self tests passed, processed 225944 bytes in 192240 nsec
crc32c: CRC_LE_BITS = 64
crc32c: self tests passed, processed 112972 bytes in 21360 nsec
With acceleration:
crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64
crc32: self tests passed, processed 225944 bytes in 53480 nsec
crc32c: CRC_LE_BITS = 64
crc32c: self tests passed, processed 112972 bytes in 21480 nsec
Signed-off-by: NKevin Bracey <kevin@bracey.fi>
Tested-by: NArd Biesheuvel <ardb@kernel.org>
Reviewed-by: NArd Biesheuvel <ardb@kernel.org>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Showing
想要评论请 注册 或 登录