Commits · 652ccae5cc4e1305fb0a4619947f9ee89d8c7f5a · openeuler / Kernel

02 Aug, 2014 2 commits

ARM: 8120/1: crypto: sha512: add ARM NEON implementation · c8611d71

Committed by Jussi Kivilinna on Jul 29, 2014

This patch adds an ARM NEON assembly implementation of the SHA-512 and SHA-384
algorithms.

tcrypt benchmark results on Cortex-A8, sha512-generic vs sha512-neon-asm:

block-size      bytes/update    old-vs-new
16              16              2.99x
64              16              2.67x
64              64              3.00x
256             16              2.64x
256             64              3.06x
256             256             3.33x
1024            16              2.53x
1024            256             3.39x
1024            1024            3.52x
2048            16              2.50x
2048            256             3.41x
2048            1024            3.54x
2048            2048            3.57x
4096            16              2.49x
4096            256             3.42x
4096            1024            3.56x
4096            4096            3.59x
8192            16              2.48x
8192            256             3.42x
8192            1024            3.56x
8192            4096            3.60x
8192            8192            3.60x
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ARM: 8119/1: crypto: sha1: add ARM NEON implementation · 60468255

Committed by Jussi Kivilinna on Jul 29, 2014

This patch adds an ARM NEON assembly implementation of the SHA-1 algorithm.

tcrypt benchmark results on Cortex-A8, sha1-arm-asm vs sha1-neon-asm:

block-size      bytes/update    old-vs-new
16              16              1.04x
64              16              1.02x
64              64              1.05x
256             16              1.03x
256             64              1.04x
256             256             1.30x
1024            16              1.03x
1024            256             1.36x
1024            1024            1.52x
2048            16              1.03x
2048            256             1.39x
2048            1024            1.55x
2048            2048            1.59x
4096            16              1.03x
4096            256             1.40x
4096            1024            1.57x
4096            4096            1.62x
8192            16              1.03x
8192            256             1.40x
8192            1024            1.58x
8192            4096            1.63x
8192            8192            1.63x
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

05 Oct, 2013 1 commit

ARM: add support for bit sliced AES using NEON instructions · e4e7f10b

Committed by Ard Biesheuvel on Sep 16, 2013

Bit sliced AES gives around 45% speedup on Cortex-A15 for encryption
and around 25% for decryption. This implementation of the AES algorithm
does not rely on any lookup tables so it is believed to be invulnerable
to cache timing attacks.

This algorithm processes up to 8 blocks in parallel in constant time. This
means that it is not usable by chaining modes that are strictly sequential
in nature, such as CBC encryption. CBC decryption, however, can benefit from
this implementation and runs about 25% faster. The other chaining modes
implemented in this module, XTS and CTR, can execute fully in parallel in
both directions.
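
The difference between CBC encryption and CBC decryption described above can be sketched as follows. This is a minimal illustration, not the kernel or OpenSSL code: `toy_encrypt_block` and `toy_decrypt_block` are hypothetical stand-ins for a real block cipher (they are NOT AES and offer no security), and all function names are assumptions made for this example. Encryption needs the previous ciphertext block before it can start the next one, while decryption can run the block cipher on every ciphertext block independently and apply the chaining XOR afterwards.

```python
BLOCK = 16  # block size in bytes, same as AES


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Hypothetical invertible stand-in for a block cipher (NOT AES).
    x = xor_bytes(block, key)
    return x[1:] + x[:1]  # rotate left by one byte


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    x = block[-1:] + block[:-1]  # rotate right by one byte
    return xor_bytes(x, key)


def split_blocks(data: bytes) -> list:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    # Strictly sequential: block i needs ciphertext block i-1 as input.
    prev = iv
    out = []
    for block in split_blocks(plaintext):
        prev = toy_encrypt_block(key, xor_bytes(block, prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    blocks = split_blocks(ciphertext)
    # Step 1: every block cipher call depends only on its own ciphertext
    # block, so these calls could be batched (e.g. 8 at a time).
    decrypted = [toy_decrypt_block(key, c) for c in blocks]
    # Step 2: the chaining XOR uses ciphertext that is already known.
    previous = [iv] + blocks[:-1]
    return b"".join(xor_bytes(d, p) for d, p in zip(decrypted, previous))


key = bytes(range(16))
iv = bytes([0xA5] * 16)
message = bytes(range(128))  # 8 blocks
assert cbc_decrypt(key, iv, cbc_encrypt(key, iv, message)) == message
```

In `cbc_decrypt`, the list comprehension in step 1 is the part that a parallel implementation can batch; `cbc_encrypt` has no equivalent because each loop iteration consumes the result of the previous one.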

The core code has been adopted from the OpenSSL project (in collaboration
with the original author, on cc). For ease of maintenance, this version is
identical to the upstream OpenSSL code, i.e., all modifications that were
required to make it suitable for inclusion into the kernel have been made
upstream. The original can be found here:

    http://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=6f6a6130

Note to integrators:
While this implementation is significantly faster than the existing table
based ones (generic or ARM asm), especially in CTR mode, the effects on
power efficiency are unclear as of yet. This code does fundamentally more
work, by calculating values that the table based code obtains by a simple
lookup; only by doing all of that work in a SIMD fashion, it manages to
perform better.
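
To illustrate what "calculating values" instead of looking them up means in a bit sliced layout, here is a small sketch. It is not taken from the kernel or OpenSSL code, and all function names are assumptions made for this example. It handles only one small AES building block, multiplication by 2 in GF(2^8) (often called xtime), and uses 8 byte-wide lanes packed into Python integers rather than NEON registers. Bit i of every input byte is gathered into one "bit plane", and the operation is then expressed purely as XORs between planes, with no data-dependent branch or table index.

```python
def xtime_reference(x: int) -> int:
    """Multiply one byte by 2 in the AES field GF(2^8), one byte at a time."""
    x <<= 1
    if x & 0x100:
        x ^= 0x11B  # reduce by the AES polynomial x^8 + x^4 + x^3 + x + 1
    return x & 0xFF


def to_bit_planes(values):
    """Transpose 8 bytes into 8 planes; planes[i] holds bit i of every byte."""
    planes = [0] * 8
    for lane, value in enumerate(values):
        for bit in range(8):
            planes[bit] |= ((value >> bit) & 1) << lane
    return planes


def from_bit_planes(planes):
    """Inverse of to_bit_planes: rebuild the 8 bytes from the 8 planes."""
    return [
        sum(((planes[bit] >> lane) & 1) << bit for bit in range(8))
        for lane in range(8)
    ]


def xtime_bitsliced(p):
    """xtime on all 8 lanes at once, using only XORs of whole bit planes."""
    hi = p[7]  # lanes whose top bit is set need the polynomial reduction
    return [hi, p[0] ^ hi, p[1], p[2] ^ hi, p[3] ^ hi, p[4], p[5], p[6]]


values = [0x00, 0x01, 0x57, 0x80, 0xFF, 0x1B, 0xAE, 0x7F]
result = from_bit_planes(xtime_bitsliced(to_bit_planes(values)))
assert result == [xtime_reference(v) for v in values]
```

The bit sliced version does the same sequence of operations for every input, which is the property the commit message relies on when it mentions cache timing attacks; the cost is the extra transposition work and the larger number of logic operations.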

Cc: Andy Polyakov <appro@openssl.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

07 Sep, 2012 1 commit

arm/crypto: Add optimized AES and SHA1 routines · f0be44f4

Committed by David McCullough on Sep 07, 2012

Add assembler versions of AES and SHA1 for ARM platforms. This has provided
up to a 50% improvement in IPsec/TCP throughput for tunnels using AES128/SHA1.

Platform    CPU Speed    Endian    Before (bps)    After (bps)    Improvement
IXP425      533 MHz      big       11217042        15566294       ~38%
KS8695      166 MHz      little    3828549         5795373        ~51%
Signed-off-by: David McCullough <ucdevel@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
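
The improvement column in the throughput table above can be re-derived from the before and after figures. The snippet below is only an arithmetic check on the numbers quoted in the commit message; the helper name `improvement_percent` is an assumption made for this example.

```python
def improvement_percent(before_bps: int, after_bps: int) -> float:
    """Relative throughput gain, in percent of the 'before' figure."""
    return (after_bps - before_bps) / before_bps * 100.0


# (before, after) throughput in bits per second, copied from the table
results = {
    "IXP425": (11217042, 15566294),
    "KS8695": (3828549, 5795373),
}

for platform, (before, after) in results.items():
    print(f"{platform}: {improvement_percent(before, after):.1f}%")
```

Both results land close to the ~38% and ~51% figures quoted in the table.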
