提交 · a4397635afea5d127548d64e0055ed471ef2d5be · openeuler / Kernel

15 8月, 2019 2 次提交

crypto: aegis128 - provide a SIMD implementation based on NEON intrinsics · a4397635

由 Ard Biesheuvel 提交于 8月 12, 2019

Provide an accelerated implementation of aegis128 by wiring up the
SIMD hooks in the generic driver to an implementation based on NEON
intrinsics, which can be compiled to both ARM and arm64 code.

This results in a performance of 2.2 cycles per byte on Cortex-A53,
which is a performance increase of ~11x compared to the generic
code.
Reviewed-by: NOndrej Mosnacek <omosnace@redhat.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

a4397635

crypto: aegis128 - add support for SIMD acceleration · cf3d41ad

由 Ard Biesheuvel 提交于 8月 12, 2019

Add some plumbing to allow the AEGIS128 code to be built with SIMD
routines for acceleration.
Reviewed-by: NOndrej Mosnacek <omosnace@redhat.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

cf3d41ad

02 8月, 2019 2 次提交

crypto: jitterentropy - build without sanitizer · dec0fb39

由 Arnd Bergmann 提交于 7月 24, 2019

Recent clang-9 snapshots double the kernel stack usage when building
this file with -O0 -fsanitize=kernel-hwaddress, compared to clang-8
and older snapshots, this changed between commits svn364966 and
svn366056:

crypto/jitterentropy.c:516:5: error: stack frame size of 2640 bytes in function 'jent_entropy_init' [-Werror,-Wframe-larger-than=]
int jent_entropy_init(void)
    ^
crypto/jitterentropy.c:185:14: error: stack frame size of 2224 bytes in function 'jent_lfsr_time' [-Werror,-Wframe-larger-than=]
static __u64 jent_lfsr_time(struct rand_data *ec, __u64 time, __u64 loop_cnt)
             ^

I prepared a reduced test case in case any clang developers want to
take a closer look, but from looking at the earlier output it seems
that even with clang-8, something was very wrong here.

Turn off any KASAN and UBSAN sanitizing for this file, as that likely
clashes with -O0 anyway.  Turning off just KASAN avoids the warning
already, but I suspect both of these have undesired side-effects
for jitterentropy.

Link: https://godbolt.org/z/fDcwZ5Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

dec0fb39

Revert "crypto: aegis128 - add support for SIMD acceleration" · c9f1fd4f

由 Herbert Xu 提交于 8月 02, 2019

This reverts commit ecc8bc81
("crypto: aegis128 - provide a SIMD implementation based on NEON
intrinsics") and commit 7cdc0ddb
("crypto: aegis128 - add support for SIMD acceleration").

They cause compile errors on platforms other than ARM because
the mechanism to selectively compile the SIMD code is broken.
Repoted-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

c9f1fd4f

26 7月, 2019 4 次提交

crypto: aegis128 - provide a SIMD implementation based on NEON intrinsics · ecc8bc81

由 Ard Biesheuvel 提交于 7月 03, 2019

Provide an accelerated implementation of aegis128 by wiring up the
SIMD hooks in the generic driver to an implementation based on NEON
intrinsics, which can be compiled to both ARM and arm64 code.

This results in a performance of 2.2 cycles per byte on Cortex-A53,
which is a performance increase of ~11x compared to the generic
code.
Reviewed-by: NOndrej Mosnacek <omosnace@redhat.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

ecc8bc81

crypto: aegis128 - add support for SIMD acceleration · 7cdc0ddb

由 Ard Biesheuvel 提交于 7月 03, 2019

Add some plumbing to allow the AEGIS128 code to be built with SIMD
routines for acceleration.
Reviewed-by: NOndrej Mosnacek <omosnace@redhat.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

7cdc0ddb

crypto: aegis128l/aegis256 - remove x86 and generic implementations · 520c1993

由 Ard Biesheuvel 提交于 7月 03, 2019

Three variants of AEGIS were proposed for the CAESAR competition, and
only one was selected for the final portfolio: AEGIS128.

The other variants, AEGIS128L and AEGIS256, are not likely to ever turn
up in networking protocols or other places where interoperability
between Linux and other systems is a concern, nor are they likely to
be subjected to further cryptanalysis. However, uninformed users may
think that AEGIS128L (which is faster) is equally fit for use.

So let's remove them now, before anyone starts using them and we are
forced to support them forever.

Note that there are no known flaws in the algorithms or in any of these
implementations, but they have simply outlived their usefulness.
Reviewed-by: NOndrej Mosnacek <omosnace@redhat.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

520c1993

crypto: morus - remove generic and x86 implementations · 5cb97700

由 Ard Biesheuvel 提交于 7月 03, 2019

MORUS was not selected as a winner in the CAESAR competition, which
is not surprising since it is considered to be cryptographically
broken [0]. (Note that this is not an implementation defect, but a
flaw in the underlying algorithm). Since it is unlikely to be in use
currently, let's remove it before we're stuck with it.

[0] https://eprint.iacr.org/2019/172.pdfReviewed-by: NOndrej Mosnacek <omosnace@redhat.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

5cb97700

06 6月, 2019 1 次提交

crypto: xxhash - Implement xxhash support · 67882e76

由 Nikolay Borisov 提交于 5月 30, 2019

xxhash is currently implemented as a self-contained module in /lib.
This patch enables that module to be used as part of the generic kernel
crypto framework. It adds a simple wrapper to the 64bit version.

I've also added test vectors (with help from Nick Terrell). The upstream
xxhash code is tested by running hashing operation on random 222 byte
data with seed values of 0 and a prime number. The upstream test
suite can be found at https://github.com/Cyan4973/xxHash/blob/cf46e0c/xxhsum.c#L664

Essentially hashing is run on data of length 0,1,14,222 with the
aforementioned seed values 0 and prime 2654435761. The particular random
222 byte string was provided to me by Nick Terrell by reading
/dev/random and the checksums were calculated by the upstream xxsum
utility with the following bash script:

dd if=/dev/random of=TEST_VECTOR bs=1 count=222

for a in 0 1; do
	for l in 0 1 14 222; do
		for s in 0 2654435761; do
			echo algo $a length $l seed $s;
			head -c $l TEST_VECTOR | ~/projects/kernel/xxHash/xxhsum -H$a -s$s
		done
	done
done

This produces output as follows:

algo 0 length 0 seed 0
02cc5d05  stdin
algo 0 length 0 seed 2654435761
02cc5d05  stdin
algo 0 length 1 seed 0
25201171  stdin
algo 0 length 1 seed 2654435761
25201171  stdin
algo 0 length 14 seed 0
c1d95975  stdin
algo 0 length 14 seed 2654435761
c1d95975  stdin
algo 0 length 222 seed 0
b38662a6  stdin
algo 0 length 222 seed 2654435761
b38662a6  stdin
algo 1 length 0 seed 0
ef46db3751d8e999  stdin
algo 1 length 0 seed 2654435761
ac75fda2929b17ef  stdin
algo 1 length 1 seed 0
27c3f04c2881203a  stdin
algo 1 length 1 seed 2654435761
4a15ed26415dfe4d  stdin
algo 1 length 14 seed 0
3d33dc700231dfad  stdin
algo 1 length 14 seed 2654435761
ea5f7ddef9a64f80  stdin
algo 1 length 222 seed 0
5f3d3c08ec2bef34  stdin
algo 1 length 222 seed 2654435761
6a9df59664c7ed62  stdin

algo 1 is xx64 variant, algo 0 is the 32 bit variant which is currently
not hooked up.
Signed-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NEric Biggers <ebiggers@kernel.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

67882e76

30 5月, 2019 1 次提交

crypto: cryptd - move kcrypto_wq into cryptd · 3e56e168

由 Eric Biggers 提交于 5月 20, 2019

kcrypto_wq is only used by cryptd, so move it into cryptd.c and change
the workqueue name from "crypto" to "cryptd".
Signed-off-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

3e56e168

18 4月, 2019 2 次提交

crypto: ecrdsa - add EC-RDSA (GOST 34.10) algorithm · 0d7a7864

由 Vitaly Chikunov 提交于 4月 11, 2019

Add Elliptic Curve Russian Digital Signature Algorithm (GOST R
34.10-2012, RFC 7091, ISO/IEC 14888-3) is one of the Russian (and since
2018 the CIS countries) cryptographic standard algorithms (called GOST
algorithms). Only signature verification is supported, with intent to be
used in the IMA.

Summary of the changes:

* crypto/Kconfig:
  - EC-RDSA is added into Public-key cryptography section.

* crypto/Makefile:
  - ecrdsa objects are added.

* crypto/asymmetric_keys/x509_cert_parser.c:
  - Recognize EC-RDSA and Streebog OIDs.

* include/linux/oid_registry.h:
  - EC-RDSA OIDs are added to the enum. Also, a two currently not
    implemented curve OIDs are added for possible extension later (to
    not change numbering and grouping).

* crypto/ecc.c:
  - Kenneth MacKay copyright date is updated to 2014, because
    vli_mmod_slow, ecc_point_add, ecc_point_mult_shamir are based on his
    code from micro-ecc.
  - Functions needed for ecrdsa are EXPORT_SYMBOL'ed.
  - New functions:
    vli_is_negative - helper to determine sign of vli;
    vli_from_be64 - unpack big-endian array into vli (used for
      a signature);
    vli_from_le64 - unpack little-endian array into vli (used for
      a public key);
    vli_uadd, vli_usub - add/sub u64 value to/from vli (used for
      increment/decrement);
    mul_64_64 - optimized to use __int128 where appropriate, this speeds
      up point multiplication (and as a consequence signature
      verification) by the factor of 1.5-2;
    vli_umult - multiply vli by a small value (speeds up point
      multiplication by another factor of 1.5-2, depending on vli sizes);
    vli_mmod_special - module reduction for some form of Pseudo-Mersenne
      primes (used for the curves A);
    vli_mmod_special2 - module reduction for another form of
      Pseudo-Mersenne primes (used for the curves B);
    vli_mmod_barrett - module reduction using pre-computed value (used
      for the curve C);
    vli_mmod_slow - more general module reduction which is much slower
     (used when the modulus is subgroup order);
    vli_mod_mult_slow - modular multiplication;
    ecc_point_add - add two points;
    ecc_point_mult_shamir - add two points multiplied by scalars in one
      combined multiplication (this gives speed up by another factor 2 in
      compare to two separate multiplications).
    ecc_is_pubkey_valid_partial - additional samity check is added.
  - Updated vli_mmod_fast with non-strict heuristic to call optimal
      module reduction function depending on the prime value;
  - All computations for the previously defined (two NIST) curves should
    not unaffected.

* crypto/ecc.h:
  - Newly exported functions are documented.

* crypto/ecrdsa_defs.h
  - Five curves are defined.

* crypto/ecrdsa.c:
  - Signature verification is implemented.

* crypto/ecrdsa_params.asn1, crypto/ecrdsa_pub_key.asn1:
  - Templates for BER decoder for EC-RDSA parameters and public key.

Cc: linux-integrity@vger.kernel.org
Signed-off-by: NVitaly Chikunov <vt@altlinux.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

0d7a7864

crypto: ecc - make ecc into separate module · 4a2289da

由 Vitaly Chikunov 提交于 4月 11, 2019

ecc.c have algorithms that could be used togeter by ecdh and ecrdsa.
Make it separate module. Add CRYPTO_ECC into Kconfig. EXPORT_SYMBOL and
document to what seems appropriate. Move structs ecc_point and ecc_curve
from ecc_curve_defs.h into ecc.h.

No code changes.
Signed-off-by: NVitaly Chikunov <vt@altlinux.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

4a2289da

08 3月, 2019 1 次提交

lib/lzo: separate lzo-rle from lzo · 45ec975e

由 Dave Rodgman 提交于 3月 07, 2019

To prevent any issues with persistent data, separate lzo-rle from lzo so
that it is treated as a separate algorithm, and lzo is still available.

Link: http://lkml.kernel.org/r/20190205155944.16007-3-dave.rodgman@arm.comSigned-off-by: NDave Rodgman <dave.rodgman@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com>
Cc: Matt Sealey <matt.sealey@arm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <nitingupta910@gmail.com>
Cc: Richard Purdie <rpurdie@openedhand.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Sonny Rao <sonnyrao@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

45ec975e

07 12月, 2018 1 次提交

crypto: user - made crypto_user_stat optional · 2ced2607

由 Corentin Labbe 提交于 11月 29, 2018

Even if CRYPTO_STATS is set to n, some part of CRYPTO_STATS are
compiled.
This patch made all part of crypto_user_stat uncompiled in that case.
Signed-off-by: NCorentin Labbe <clabbe@baylibre.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

2ced2607

20 11月, 2018 3 次提交

crypto: adiantum - add Adiantum support · 059c2a4d

由 Eric Biggers 提交于 11月 16, 2018

Add support for the Adiantum encryption mode. Adiantum was designed by
Paul Crowley and is specified by our paper:

Adiantum: length-preserving encryption for entry-level processors
(https://eprint.iacr.org/2018/720.pdf)

See our paper for full details; this patch only provides an overview.

Adiantum is a tweakable, length-preserving encryption mode designed for
fast and secure disk encryption, especially on CPUs without dedicated
crypto instructions. Adiantum encrypts each sector using the XChaCha12
stream cipher, two passes of an ε-almost-∆-universal (εA∆U) hash
function, and an invocation of the AES-256 block cipher on a single
16-byte block. On CPUs without AES instructions, Adiantum is much
faster than AES-XTS; for example, on ARM Cortex-A7, on 4096-byte sectors
Adiantum encryption is about 4 times faster than AES-256-XTS encryption,
and decryption about 5 times faster.

Adiantum is a specialization of the more general HBSH construction. Our
earlier proposal, HPolyC, was also a HBSH specialization, but it used a
different εA∆U hash function, one based on Poly1305 only. Adiantum's
εA∆U hash function, which is based primarily on the "NH" hash function
like that used in UMAC (RFC4418), is about twice as fast as HPolyC's;
consequently, Adiantum is about 20% faster than HPolyC.

This speed comes with no loss of security: Adiantum is provably just as
secure as HPolyC, in fact slightly *more* secure. Like HPolyC,
Adiantum's security is reducible to that of XChaCha12 and AES-256,
subject to a security bound. XChaCha12 itself has a security reduction
to ChaCha12. Therefore, one need not "trust" Adiantum; one need only
trust ChaCha12 and AES-256. Note that the εA∆U hash function is only
used for its proven combinatorical properties so cannot be "broken".

Adiantum is also a true wide-block encryption mode, so flipping any
plaintext bit in the sector scrambles the entire ciphertext, and vice
versa. No other such mode is available in the kernel currently; doing
the same with XTS scrambles only 16 bytes. Adiantum also supports
arbitrary-length tweaks and naturally supports any length input >= 16
bytes without needing "ciphertext stealing".

For the stream cipher, Adiantum uses XChaCha12 rather than XChaCha20 in
order to make encryption feasible on the widest range of devices.
Although the 20-round variant is quite popular, the best known attacks
on ChaCha are on only 7 rounds, so ChaCha12 still has a substantial
security margin; in fact, larger than AES-256's. 12-round Salsa20 is
also the eSTREAM recommendation. For the block cipher, Adiantum uses
AES-256, despite it having a lower security margin than XChaCha12 and
needing table lookups, due to AES's extensive adoption and analysis
making it the obvious first choice. Nevertheless, for flexibility this
patch also permits the "adiantum" template to be instantiated with
XChaCha20 and/or with an alternate block cipher.

We need Adiantum support in the kernel for use in dm-crypt and fscrypt,
where currently the only other suitable options are block cipher modes
such as AES-XTS. A big problem with this is that many low-end mobile
devices (e.g. Android Go phones sold primarily in developing countries,
as well as some smartwatches) still have CPUs that lack AES
instructions, e.g. ARM Cortex-A7. Sadly, AES-XTS encryption is much too
slow to be viable on these devices. We did find that some "lightweight"
block ciphers are fast enough, but these suffer from problems such as
not having much cryptanalysis or being too controversial.

The ChaCha stream cipher has excellent performance but is insecure to
use directly for disk encryption, since each sector's IV is reused each
time it is overwritten. Even restricting the threat model to offline
attacks only isn't enough, since modern flash storage devices don't
guarantee that "overwrites" are really overwrites, due to wear-leveling.
Adiantum avoids this problem by constructing a
"tweakable super-pseudorandom permutation"; this is the strongest
possible security model for length-preserving encryption.

Of course, storing random nonces along with the ciphertext would be the
ideal solution. But doing that with existing hardware and filesystems
runs into major practical problems; in most cases it would require data
journaling (like dm-integrity) which severely degrades performance.
Thus, for now length-preserving encryption is still needed.
Signed-off-by: NEric Biggers <ebiggers@google.com>
Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

059c2a4d

crypto: nhpoly1305 - add NHPoly1305 support · 26609a21

由 Eric Biggers 提交于 11月 16, 2018

Add a generic implementation of NHPoly1305, an ε-almost-∆-universal hash
function used in the Adiantum encryption mode.

CONFIG_NHPOLY1305 is not selectable by itself since there won't be any
real reason to enable it without also enabling Adiantum support.
Signed-off-by: NEric Biggers <ebiggers@google.com>
Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

26609a21

crypto: chacha20-generic - refactor to allow varying number of rounds · 1ca1b917

由 Eric Biggers 提交于 11月 16, 2018

In preparation for adding XChaCha12 support, rename/refactor
chacha20-generic to support different numbers of rounds. The
justification for needing XChaCha12 support is explained in more detail
in the patch "crypto: chacha - add XChaCha12 support".

The only difference between ChaCha{8,12,20} are the number of rounds
itself; all other parts of the algorithm are the same. Therefore,
remove the "20" from all definitions, structures, functions, files, etc.
that will be shared by all ChaCha versions.

Also make ->setkey() store the round count in the chacha_ctx (previously
chacha20_ctx). The generic code then passes the round count through to
chacha_block(). There will be a ->setkey() function for each explicitly
allowed round count; the encrypt/decrypt functions will be the same. I
decided not to do it the opposite way (same ->setkey() function for all
round counts, with different encrypt/decrypt functions) because that
would have required more boilerplate code in architecture-specific
implementations of ChaCha and XChaCha.
Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: NMartin Willi <martin@strongswan.org>
Signed-off-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

1ca1b917

16 11月, 2018 1 次提交

crypto: streebog - add Streebog hash function · fe18957e

由 Vitaly Chikunov 提交于 11月 07, 2018

Add GOST/IETF Streebog hash function (GOST R 34.11-2012, RFC 6986)
generic hash transformation.

Cc: linux-integrity@vger.kernel.org
Signed-off-by: NVitaly Chikunov <vt@altlinux.org>
Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

fe18957e

28 9月, 2018 2 次提交

crypto: ofb - add output feedback mode · e497c518

由 Gilad Ben-Yossef 提交于 9月 20, 2018

Add a generic version of output feedback mode. We already have support of
several hardware based transformations of this mode and the needed test
vectors but we somehow missed adding a generic software one. Fix this now.
Signed-off-by: NGilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

e497c518

crypto: user - Implement a generic crypto statistics · cac5818c

由 Corentin Labbe 提交于 9月 19, 2018

This patch implement a generic way to get statistics about all crypto
usages.
Signed-off-by: NCorentin Labbe <clabbe@baylibre.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

cac5818c

04 9月, 2018 2 次提交

crypto: x86 - remove SHA multibuffer routines and mcryptd · ab8085c1

由 Ard Biesheuvel 提交于 8月 22, 2018

As it turns out, the AVX2 multibuffer SHA routines are currently
broken [0], in a way that would have likely been noticed if this
code were in wide use. Since the code is too complicated to be
maintained by anyone except the original authors, and since the
performance benefits for real-world use cases are debatable to
begin with, it is better to drop it entirely for the moment.

[0] https://marc.info/?l=linux-crypto-vger&m=153476243825350&w=2Suggested-by: NEric Biggers <ebiggers@google.com>
Cc: Megha Dey <megha.dey@linux.intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

ab8085c1

crypto: speck - remove Speck · 578bdaab

由 Jason A. Donenfeld 提交于 8月 07, 2018

These are unused, undesired, and have never actually been used by
anybody. The original authors of this code have changed their mind about
its inclusion. While originally proposed for disk encryption on low-end
devices, the idea was discarded [1] in favor of something else before
that could really get going. Therefore, this patch removes Speck.

[1] https://marc.info/?l=linux-crypto-vger&m=153359499015659Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
Acked-by: NEric Biggers <ebiggers@google.com>
Cc: stable@vger.kernel.org
Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

578bdaab

31 5月, 2018 1 次提交

crypto: morus - Mark MORUS SIMD glue as x86-specific · 2808f173

由 Ondrej Mosnacek 提交于 5月 21, 2018

Commit 56e8e57f ("crypto: morus - Add common SIMD glue code for
MORUS") accidetally consiedered the glue code to be usable by different
architectures, but it seems to be only usable on x86.

This patch moves it under arch/x86/crypto and adds 'depends on X86' to
the Kconfig options and also removes the prompt to hide these internal
options from the user.
Reported-by: Nkbuild test robot <lkp@intel.com>
Signed-off-by: NOndrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

2808f173

19 5月, 2018 3 次提交

crypto: morus - Add common SIMD glue code for MORUS · 56e8e57f

由 Ondrej Mosnacek 提交于 5月 11, 2018

This patch adds a common glue code for optimized implementations of
MORUS AEAD algorithms.
Signed-off-by: NOndrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

56e8e57f

crypto: morus - Add generic MORUS AEAD implementations · 396be41f

由 Ondrej Mosnacek 提交于 5月 11, 2018

This patch adds the generic implementation of the MORUS family of AEAD
algorithms (MORUS-640 and MORUS-1280). The original authors of MORUS
are Hongjun Wu and Tao Huang.

At the time of writing, MORUS is one of the finalists in CAESAR, an
open competition intended to select a portfolio of alternatives to
the problematic AES-GCM:

https://competitions.cr.yp.to/caesar-submissions.html
https://competitions.cr.yp.to/round3/morusv2.pdfSigned-off-by: NOndrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

396be41f

crypto: aegis - Add generic AEGIS AEAD implementations · f606a88e

由 Ondrej Mosnacek 提交于 5月 11, 2018

This patch adds the generic implementation of the AEGIS family of AEAD
algorithms (AEGIS-128, AEGIS-128L, and AEGIS-256). The original
authors of AEGIS are Hongjun Wu and Bart Preneel.

At the time of writing, AEGIS is one of the finalists in CAESAR, an
open competition intended to select a portfolio of alternatives to
the problematic AES-GCM:

https://competitions.cr.yp.to/caesar-submissions.html
https://competitions.cr.yp.to/round3/aegisv11.pdfSigned-off-by: NOndrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

f606a88e

21 4月, 2018 1 次提交

crypto: zstd - Add zstd support · d28fc3db

由 Nick Terrell 提交于 3月 30, 2018

Adds zstd support to crypto and scompress. Only supports the default
level.

Previously we held off on this patch, since there weren't any users.
Now zram is ready for zstd support, but depends on CONFIG_CRYPTO_ZSTD,
which isn't defined until this patch is in. I also see a patch adding
zstd to pstore [0], which depends on crypto zstd.

[0] lkml.kernel.org/r/9c9416b2dff19f05fb4c35879aaa83d11ff72c92.1521626182.git.geliangtang@gmail.com
Signed-off-by: NNick Terrell <terrelln@fb.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

d28fc3db

07 4月, 2018 2 次提交

kbuild: rename *-asn1.[ch] to *.asn1.[ch] · 4fa8bc94

由 Masahiro Yamada 提交于 3月 23, 2018

Our convention is to distinguish file types by suffixes with a period
as a separator.

*-asn1.[ch] is a different pattern from other generated sources such
as *.lex.c, *.tab.[ch], *.dtb.S, etc.  More confusing, files with
'-asn1.[ch]' are generated files, but '_asn1.[ch]' are checked-in
files:
  net/netfilter/nf_conntrack_h323_asn1.c
  include/linux/netfilter/nf_conntrack_h323_asn1.h
  include/linux/sunrpc/gss_asn1.h

Rename generated files to *.asn1.[ch] for consistency.
Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>

4fa8bc94

kbuild: clean up *-asn1.[ch] patterns from top-level Makefile · 3ca3273e

由 Masahiro Yamada 提交于 3月 23, 2018

Clean up these patterns from the top Makefile to omit 'clean-files'
in each Makefile.
Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>

3ca3273e

16 3月, 2018 1 次提交

crypto: sm4 - introduce SM4 symmetric cipher algorithm · 747c8ce4

由 Gilad Ben-Yossef 提交于 3月 06, 2018

Introduce the SM4 cipher algorithms (OSCCA GB/T 32907-2016).

SM4 (GBT.32907-2016) is a cryptographic standard issued by the
Organization of State Commercial Administration of China (OSCCA)
as an authorized cryptographic algorithms for the use within China.

SMS4 was originally created for use in protecting wireless
networks, and is mandated in the Chinese National Standard for
Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure)
(GB.15629.11-2003).
Signed-off-by: NGilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

747c8ce4

09 3月, 2018 1 次提交

crypto: cfb - add support for Cipher FeedBack mode · a7d85e06

由 James Bottomley 提交于 3月 01, 2018

TPM security routines require encryption and decryption with AES in
CFB mode, so add it to the Linux Crypto schemes. CFB is basically a
one time pad where the pad is generated initially from the encrypted
IV and then subsequently from the encrypted previous block of
ciphertext. The pad is XOR'd into the plain text to get the final
ciphertext.

https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#CFBSigned-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

a7d85e06

03 3月, 2018 1 次提交

crypto: ablk_helper - remove ablk_helper · 0e145b47

由 Eric Biggers 提交于 2月 19, 2018

All users of ablk_helper have been converted over to crypto_simd, so
remove ablk_helper.
Signed-off-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

0e145b47

22 2月, 2018 1 次提交

crypto: speck - add support for the Speck block cipher · da7a0ab5

由 Eric Biggers 提交于 2月 14, 2018

Add a generic implementation of Speck, including the Speck128 and
Speck64 variants. Speck is a lightweight block cipher that can be much
faster than AES on processors that don't have AES instructions.

We are planning to offer Speck-XTS (probably Speck128/256-XTS) as an
option for dm-crypt and fscrypt on Android, for low-end mobile devices
with older CPUs such as ARMv7 which don't have the Cryptography
Extensions. Currently, such devices are unencrypted because AES is not
fast enough, even when the NEON bit-sliced implementation of AES is
used. Other AES alternatives such as Twofish, Threefish, Camellia,
CAST6, and Serpent aren't fast enough either; it seems that only a
modern ARX cipher can provide sufficient performance on these devices.

This is a replacement for our original proposal
(https://patchwork.kernel.org/patch/10101451/) which was to offer
ChaCha20 for these devices. However, the use of a stream cipher for
disk/file encryption with no space to store nonces would have been much
more insecure than we thought initially, given that it would be used on
top of flash storage as well as potentially on top of F2FS, neither of
which is guaranteed to overwrite data in-place.

Speck has been somewhat controversial due to its origin. Nevertheless,
it has a straightforward design (it's an ARX cipher), and it appears to
be the leading software-optimized lightweight block cipher currently,
with the most cryptanalysis. It's also easy to implement without side
channels, unlike AES. Moreover, we only intend Speck to be used when
the status quo is no encryption, due to AES not being fast enough.

We've also considered a novel length-preserving encryption mode based on
ChaCha20 and Poly1305. While theoretically attractive, such a mode
would be a brand new crypto construction and would be more complicated
and difficult to implement efficiently in comparison to Speck-XTS.

There is confusion about the byte and word orders of Speck, since the
original paper doesn't specify them. But we have implemented it using
the orders the authors recommended in a correspondence with them. The
test vectors are taken from the original paper but were mapped to byte
arrays using the recommended byte and word orders.
Signed-off-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

da7a0ab5

20 1月, 2018 1 次提交

crypto: aes-generic - fix aes-generic regression on powerpc · 6e36719f

由 Arnd Bergmann 提交于 1月 15, 2018

My last bugfix added -Os on the command line, which unfortunately caused
a build regression on powerpc in some configurations.

I've done some more analysis of the original problem and found slightly
different workaround that avoids this regression and also results in
better performance on gcc-7.0: -fcode-hoisting is an optimization step
that got added in gcc-7 and that for all gcc-7 versions causes worse
performance.

This disables -fcode-hoisting on all compilers that understand the option.
For gcc-7.1 and 7.2 I found the same performance as my previous patch
(using -Os), in gcc-7.0 it was even better. On gcc-8 I could see no
change in performance from this patch. In theory, code hoisting should
not be able make things better for the AES cipher, so leaving it
disabled for gcc-8 only serves to simplify the Makefile change.
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
Link: https://www.mail-archive.com/linux-crypto@vger.kernel.org/msg30418.html
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651
Fixes: 148b974d ("crypto: aes-generic - build with -Os on gcc-7+")
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

6e36719f

12 1月, 2018 1 次提交

crypto: aes-generic - build with -Os on gcc-7+ · 148b974d

由 Arnd Bergmann 提交于 1月 03, 2018

While testing other changes, I discovered that gcc-7.2.1 produces badly
optimized code for aes_encrypt/aes_decrypt. This is especially true when
CONFIG_UBSAN_SANITIZE_ALL is enabled, where it leads to extremely
large stack usage that in turn might cause kernel stack overflows:

crypto/aes_generic.c: In function 'aes_encrypt':
crypto/aes_generic.c:1371:1: warning: the frame size of 4880 bytes is larger than 2048 bytes [-Wframe-larger-than=]
crypto/aes_generic.c: In function 'aes_decrypt':
crypto/aes_generic.c:1441:1: warning: the frame size of 4864 bytes is larger than 2048 bytes [-Wframe-larger-than=]

I verified that this problem exists on all architectures that are
supported by gcc-7.2, though arm64 in particular is less affected than
the others. I also found that gcc-7.1 and gcc-8 do not show the extreme
stack usage but still produce worse code than earlier versions for this
file, apparently because of optimization passes that generally provide
a substantial improvement in object code quality but understandably fail
to find any shortcuts in the AES algorithm.

Possible workarounds include

a) disabling -ftree-pre and -ftree-sra optimizations, this was an earlier
   patch I tried, which reliably fixed the stack usage, but caused a
   serious performance regression in some versions, as later testing
   found.

b) disabling UBSAN on this file or all ciphers, as suggested by Ard
   Biesheuvel. This would lead to massively better crypto performance in
   UBSAN-enabled kernels and avoid the stack usage, but there is a concern
   over whether we should exclude arbitrary files from UBSAN at all.

c) Forcing the optimization level in a different way. Similar to a),
   but rather than deselecting specific optimization stages,
   this now uses "gcc -Os" for this file, regardless of the
   CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE/SIZE option. This is a reliable
   workaround for the stack consumption on all architecture, and I've
   retested the performance results now on x86, cycles/byte (lower is
   better) for cbc(aes-generic) with 256 bit keys:

			-O2     -Os
	gcc-6.3.1	14.9	15.1
	gcc-7.0.1	14.7	15.3
	gcc-7.1.1	15.3	14.7
	gcc-7.2.1	16.8	15.9
	gcc-8.0.0	15.5	15.6

This implements the option c) by enabling forcing -Os on all compiler
versions starting with gcc-7.1. As a workaround for PR83356, it would
only be needed for gcc-7.2+ with UBSAN enabled, but since it also shows
better performance on gcc-7.1 without UBSAN, it seems appropriate to
use the faster version here as well.

Side note: during testing, I also played with the AES code in libressl,
which had a similar performance regression from gcc-6 to gcc-7.2,
but was three times slower overall. It might be interesting to
investigate that further and possibly port the Linux implementation
into that.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651
Cc: Richard Biener <rguenther@suse.de>
Cc: Jakub Jelinek <jakub@gcc.gnu.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

148b974d

02 11月, 2017 1 次提交

License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318

由 Greg Kroah-Hartman 提交于 11月 01, 2017

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: NPhilippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

b2441318

22 9月, 2017 1 次提交

crypto: sm3 - add OSCCA SM3 secure hash · 4f0fc160

由 Gilad Ben-Yossef 提交于 8月 21, 2017

Add OSCCA SM3 secure hash (OSCCA GM/T 0004-2012 SM3)
generic hash transformation.
Signed-off-by: NGilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

4f0fc160

10 6月, 2017 1 次提交

crypto: ecdh - add privkey generation support · 6755fd26

由 Tudor-Dan Ambarus 提交于 5月 30, 2017

Add support for generating ecc private keys.

Generation of ecc private keys is helpful in a user-space to kernel
ecdh offload because the keys are not revealed to user-space. Private
key generation is also helpful to implement forward secrecy.

If the user provides a NULL ecc private key, the kernel will generate it
and further use it for ecdh.

Move ecdh's object files below drbg's. drbg must be present in the kernel
at the time of calling.
Signed-off-by: NTudor Ambarus <tudor.ambarus@microchip.com>
Reviewed-by: NStephan Müller <smueller@chronox.de>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

6755fd26

11 2月, 2017 2 次提交

crypto: improve gcc optimization flags for serpent and wp512 · 7d6e9105

由 Arnd Bergmann 提交于 2月 03, 2017

An ancient gcc bug (first reported in 2003) has apparently resurfaced
on MIPS, where kernelci.org reports an overly large stack frame in the
whirlpool hash algorithm:

crypto/wp512.c:987:1: warning: the frame size of 1112 bytes is larger than 1024 bytes [-Wframe-larger-than=]

With some testing in different configurations, I'm seeing large
variations in stack frames size up to 1500 bytes for what should have
around 300 bytes at most. I also checked the reference implementation,
which is essentially the same code but also comes with some test and
benchmarking infrastructure.

It seems that recent compiler versions on at least arm, arm64 and powerpc
have a partial fix for this problem, but enabling "-fsched-pressure", but
even with that fix they suffer from the issue to a certain degree. Some
testing on arm64 shows that the time needed to hash a given amount of
data is roughly proportional to the stack frame size here, which makes
sense given that the wp512 implementation is doing lots of loads for
table lookups, and the problem with the overly large stack is a result
of doing a lot more loads and stores for spilled registers (as seen from
inspecting the object code).

Disabling -fschedule-insns consistently fixes the problem for wp512,
in my collection of cross-compilers, the results are consistently better
or identical when comparing the stack sizes in this function, though
some architectures (notable x86) have schedule-insns disabled by
default.

The four columns are:
default: -O2
press:	 -O2 -fsched-pressure
nopress: -O2 -fschedule-insns -fno-sched-pressure
nosched: -O2 -no-schedule-insns (disables sched-pressure)

				default	press	nopress	nosched
alpha-linux-gcc-4.9.3		1136	848	1136	176
am33_2.0-linux-gcc-4.9.3	2100	2076	2100	2104
arm-linux-gnueabi-gcc-4.9.3	848	848	1048	352
cris-linux-gcc-4.9.3		272	272	272	272
frv-linux-gcc-4.9.3		1128	1000	1128	280
hppa64-linux-gcc-4.9.3		1128	336	1128	184
hppa-linux-gcc-4.9.3		644	308	644	276
i386-linux-gcc-4.9.3		352	352	352	352
m32r-linux-gcc-4.9.3		720	656	720	268
microblaze-linux-gcc-4.9.3	1108	604	1108	256
mips64-linux-gcc-4.9.3		1328	592	1328	208
mips-linux-gcc-4.9.3		1096	624	1096	240
powerpc64-linux-gcc-4.9.3	1088	432	1088	160
powerpc-linux-gcc-4.9.3		1080	584	1080	224
s390-linux-gcc-4.9.3		456	456	624	360
sh3-linux-gcc-4.9.3		292	292	292	292
sparc64-linux-gcc-4.9.3		992	240	992	208
sparc-linux-gcc-4.9.3		680	592	680	312
x86_64-linux-gcc-4.9.3		224	240	272	224
xtensa-linux-gcc-4.9.3		1152	704	1152	304

aarch64-linux-gcc-7.0.0		224	224	1104	208
arm-linux-gnueabi-gcc-7.0.1	824	824	1048	352
mips-linux-gcc-7.0.0		1120	648	1120	272
x86_64-linux-gcc-7.0.1		240	240	304	240

arm-linux-gnueabi-gcc-4.4.7	840			392
arm-linux-gnueabi-gcc-4.5.4	784	728	784	320
arm-linux-gnueabi-gcc-4.6.4	736	728	736	304
arm-linux-gnueabi-gcc-4.7.4	944	784	944	352
arm-linux-gnueabi-gcc-4.8.5	464	464	760	352
arm-linux-gnueabi-gcc-4.9.3	848	848	1048	352
arm-linux-gnueabi-gcc-5.3.1	824	824	1064	336
arm-linux-gnueabi-gcc-6.1.1	808	808	1056	344
arm-linux-gnueabi-gcc-7.0.1	824	824	1048	352

Trying the same test for serpent-generic, the picture is a bit different,
and while -fno-schedule-insns is generally better here than the default,
-fsched-pressure wins overall, so I picked that instead.

				default	press	nopress	nosched
alpha-linux-gcc-4.9.3		1392	864	1392	960
am33_2.0-linux-gcc-4.9.3	536	524	536	528
arm-linux-gnueabi-gcc-4.9.3	552	552	776	536
cris-linux-gcc-4.9.3		528	528	528	528
frv-linux-gcc-4.9.3		536	400	536	504
hppa64-linux-gcc-4.9.3		524	208	524	480
hppa-linux-gcc-4.9.3		768	472	768	508
i386-linux-gcc-4.9.3		564	564	564	564
m32r-linux-gcc-4.9.3		712	576	712	532
microblaze-linux-gcc-4.9.3	724	392	724	512
mips64-linux-gcc-4.9.3		720	384	720	496
mips-linux-gcc-4.9.3		728	384	728	496
powerpc64-linux-gcc-4.9.3	704	304	704	480
powerpc-linux-gcc-4.9.3		704	296	704	480
s390-linux-gcc-4.9.3		560	560	592	536
sh3-linux-gcc-4.9.3		540	540	540	540
sparc64-linux-gcc-4.9.3		544	352	544	496
sparc-linux-gcc-4.9.3		544	344	544	496
x86_64-linux-gcc-4.9.3		528	536	576	528
xtensa-linux-gcc-4.9.3		752	544	752	544

aarch64-linux-gcc-7.0.0		432	432	656	480
arm-linux-gnueabi-gcc-7.0.1	616	616	808	536
mips-linux-gcc-7.0.0		720	464	720	488
x86_64-linux-gcc-7.0.1		536	528	600	536

arm-linux-gnueabi-gcc-4.4.7	592			440
arm-linux-gnueabi-gcc-4.5.4	776	448	776	544
arm-linux-gnueabi-gcc-4.6.4	776	448	776	544
arm-linux-gnueabi-gcc-4.7.4	768	448	768	544
arm-linux-gnueabi-gcc-4.8.5	488	488	776	544
arm-linux-gnueabi-gcc-4.9.3	552	552	776	536
arm-linux-gnueabi-gcc-5.3.1	552	552	776	536
arm-linux-gnueabi-gcc-6.1.1	560	560	776	536
arm-linux-gnueabi-gcc-7.0.1	616	616	808	536

I did not do any runtime tests with serpent, so it is possible that stack
frame size does not directly correlate with runtime performance here and
it actually makes things worse, but it's more likely to help here, and
the reduced stack frame size is probably enough reason to apply the patch,
especially given that the crypto code is often used in deep call chains.

Link: https://kernelci.org/build/id/58797d7559b5149efdf6c3a9/logs/
Link: http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11488
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

7d6e9105

crypto: aes - add generic time invariant AES cipher · b5e0b032

由 Ard Biesheuvel 提交于 2月 02, 2017

Lookup table based AES is sensitive to timing attacks, which is due to
the fact that such table lookups are data dependent, and the fact that
8 KB worth of tables covers a significant number of cachelines on any
architecture, resulting in an exploitable correlation between the key
and the processing time for known plaintexts.

For network facing algorithms such as CTR, CCM or GCM, this presents a
security risk, which is why arch specific AES ports are typically time
invariant, either through the use of special instructions, or by using
SIMD algorithms that don't rely on table lookups.

For generic code, this is difficult to achieve without losing too much
performance, but we can improve the situation significantly by switching
to an implementation that only needs 256 bytes of table data (the actual
S-box itself), which can be prefetched at the start of each block to
eliminate data dependent latencies.

This code encrypts at ~25 cycles per byte on ARM Cortex-A57 (while the
ordinary generic AES driver manages 18 cycles per byte on this
hardware). Decryption is substantially slower.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

b5e0b032

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功