Commits · 435d3e51af3de0c1fe9f6ca1a18df3cd4d6b8c17 · openeuler / raspberrypi-kernel

25 Feb, 2012 4 commits

crypto: serpent-sse2 - combine ablk_*_init functions · 435d3e51

Jussi Kivilinna authored Feb 17, 2012

The driver name in the ablk_*_init functions can be constructed at runtime.
Therefore use a single function, ablk_init, to reduce object size.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

435d3e51
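
The idea of building the driver name at runtime can be sketched in userspace C. This is only an illustration: the helper name `build_driver_name` and the `__driver-` prefix are assumptions, not taken from the patch itself.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: derive the internal driver name from the algorithm's
 * driver name, so that one init function can serve every mode.
 * Returns 0 on success, -1 if the name does not fit in the buffer.
 */
static int build_driver_name(char *buf, size_t len, const char *alg_driver_name)
{
	int n = snprintf(buf, len, "__driver-%s", alg_driver_name);

	/* snprintf returns the length it wanted to write; reject truncation. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With a helper like this, per-mode init functions that differ only in a hard-coded string collapse into one function that formats the string on demand.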

crypto: blowfish-x86_64 - use crypto_[un]register_algs · d433208c

Jussi Kivilinna authored Feb 17, 2012

Combine all crypto_alg structures to be registered and use the new
crypto_[un]register_algs functions. This simplifies the init/exit code and
reduces object size.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

d433208c
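
A minimal userspace sketch of array-based registration follows. It assumes the bulk helper undoes earlier registrations when a later one fails; `struct alg`, `register_alg` and the other names are hypothetical stand-ins, not the kernel API.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct crypto_alg; only what the sketch needs. */
struct alg {
	const char *name;
	int fail;		/* set to simulate a registration failure */
	int registered;
};

static int register_alg(struct alg *a)
{
	if (a->fail)
		return -1;
	a->registered = 1;
	return 0;
}

static void unregister_alg(struct alg *a)
{
	a->registered = 0;
}

/* Register every entry; on failure, undo the ones already registered. */
static int register_algs(struct alg *algs, size_t count)
{
	size_t i;
	int ret = 0;

	for (i = 0; i < count; i++) {
		ret = register_alg(&algs[i]);
		if (ret)
			goto err;
	}
	return 0;

err:
	while (i > 0)
		unregister_alg(&algs[--i]);
	return ret;
}

static void unregister_algs(struct alg *algs, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		unregister_alg(&algs[i]);
}
```

The module init then reduces to one call on the array, and the exit path to the matching unregister call, instead of one hand-written error path per algorithm.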

crypto: twofish-x86_64-3way - use crypto_[un]register_algs · 53709dde

Jussi Kivilinna authored Feb 17, 2012

53709dde

crypto: serpent-sse2 - use crypto_[un]register_algs · 35474c3b

Jussi Kivilinna authored Feb 17, 2012

35474c3b

14 Feb, 2012 2 commits

crypto: serpent-sse2 - remove dead code from serpent_sse2_glue.c::serpent_sse2_init() · 6e77fe8c

Jesper Juhl authored Feb 09, 2012

We cannot reach the line after 'return err'. Remove it.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

6e77fe8c

crypto: twofish-x86 - Remove dead code from twofish_glue_3way.c::init() · 8d21190e

Jesper Juhl authored Feb 09, 2012

We can never reach the line just after the 'return 0'
statement. Remove it.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

8d21190e

13 Jan, 2012 3 commits

crypto: serpent-sse2 - change transpose_4x4 to only use integer instructions · 847cb7ef

Jussi Kivilinna authored Dec 20, 2011

The matrix transpose macro in serpent-sse2 uses a mix of SSE2 integer and SSE
floating point instructions, which might cause a performance penalty on some CPUs.

This patch replaces the transpose_4x4 macro with a version that uses only SSE2
integer instructions.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

847cb7ef
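
One way to transpose a 4x4 matrix of 32-bit words with SSE2 integer instructions only is shown below using C intrinsics. This is a sketch and not the assembler macro from the patch. It needs an x86 compiler with SSE2 enabled, and the mnemonics in the comments are the instructions these intrinsics usually compile to.

```c
#include <emmintrin.h>	/* SSE2 intrinsics */
#include <stdint.h>

/* Transpose a 4x4 matrix of 32-bit words in place with integer unpacks only. */
static void transpose_4x4(uint32_t m[4][4])
{
	__m128i r0 = _mm_loadu_si128((const __m128i *)m[0]);
	__m128i r1 = _mm_loadu_si128((const __m128i *)m[1]);
	__m128i r2 = _mm_loadu_si128((const __m128i *)m[2]);
	__m128i r3 = _mm_loadu_si128((const __m128i *)m[3]);

	/* Interleave 32-bit words of row pairs (punpckldq / punpckhdq). */
	__m128i t0 = _mm_unpacklo_epi32(r0, r1);	/* a0 b0 a1 b1 */
	__m128i t1 = _mm_unpackhi_epi32(r0, r1);	/* a2 b2 a3 b3 */
	__m128i t2 = _mm_unpacklo_epi32(r2, r3);	/* c0 d0 c1 d1 */
	__m128i t3 = _mm_unpackhi_epi32(r2, r3);	/* c2 d2 c3 d3 */

	/* Interleave 64-bit halves (punpcklqdq / punpckhqdq). */
	_mm_storeu_si128((__m128i *)m[0], _mm_unpacklo_epi64(t0, t2));
	_mm_storeu_si128((__m128i *)m[1], _mm_unpackhi_epi64(t0, t2));
	_mm_storeu_si128((__m128i *)m[2], _mm_unpacklo_epi64(t1, t3));
	_mm_storeu_si128((__m128i *)m[3], _mm_unpackhi_epi64(t1, t3));
}
```

Every step stays in the integer domain, so no value crosses between integer and floating point execution units.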

crypto: blowfish-x86_64 - blacklist Pentium 4 · 4c58464b

Jussi Kivilinna authored Dec 20, 2011

The implementation in blowfish-x86_64 uses 64-bit rotations, which are slow on
the Pentium 4, making blowfish-x86_64 slower than the generic C implementation.
Therefore blacklist the P4.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

4c58464b
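
A blacklist check of this kind can be sketched as follows. `struct cpu_id`, the vendor enum and both function names are hypothetical; the only hardware fact relied on is that Pentium 4 parts report Intel family 15 (0x0f).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the CPU identification the kernel keeps. */
enum cpu_vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_OTHER };

struct cpu_id {
	enum cpu_vendor vendor;
	unsigned int family;
	unsigned int model;
};

/* Pentium 4 parts report Intel family 15 (0x0f). */
static bool is_blacklisted_cpu(const struct cpu_id *cpu)
{
	if (cpu->vendor != VENDOR_INTEL)
		return false;

	return cpu->family == 0x0f;
}

/* Init sketch: refuse to load so that the generic C cipher stays in use. */
static int blowfish_init_sketch(const struct cpu_id *cpu)
{
	if (is_blacklisted_cpu(cpu))
		return -1;	/* the kernel would return a negative errno here */

	return 0;
}
```

The vendor test matters because family 15 is not unique to Intel: AMD K8 processors report the same family number.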

crypto: twofish-x86_64-3way - blacklist pentium4 and atom · a522ee85

Jussi Kivilinna authored Dec 20, 2011

The performance of twofish-x86_64-3way on Intel Pentium 4 and Atom is lower than
that of the twofish-x86_64 module, so blacklist these CPUs.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

a522ee85

20 Dec, 2011 2 commits

crypto: serpent-sse2 - remove unneeded LRW/XTS #ifdefs · 7ba8babf

Jussi Kivilinna authored Dec 13, 2011

Since LRW & XTS are selected by serpent-sse2, we don't need these #ifdefs
anymore.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

7ba8babf

crypto: twofish-x86_64-3way - remove unneeded LRW/XTS #ifdefs · 88715b9a

Jussi Kivilinna authored Dec 13, 2011

Since LRW & XTS are selected by twofish-x86_64-3way, we don't need these
#ifdefs anymore.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

88715b9a

21 Nov, 2011 5 commits

crypto: serpent-sse2 - clear CRYPTO_TFM_REQ_MAY_SLEEP in lrw and xts modes · d3564338

Jussi Kivilinna authored Nov 09, 2011

The LRW/XTS patches for serpent-sse2 forgot to add this. CRYPTO_TFM_REQ_MAY_SLEEP
should be cleared, as sleeping between kernel_fpu_begin() and kernel_fpu_end() is
not allowed.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

d3564338
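
Clearing a single request flag while leaving the others intact is a plain bit-mask operation. The sketch below uses made-up flag values and a hypothetical `struct desc_sketch`; the real constants and descriptor type live in the kernel's crypto headers.

```c
#include <stdint.h>

/* Illustrative flag values only; not the kernel's definitions. */
#define REQ_MAY_SLEEP	0x200u
#define REQ_MAY_BACKLOG	0x400u

/* Hypothetical stand-in for a request descriptor that carries flags. */
struct desc_sketch {
	uint32_t flags;
};

/* Clear only the may-sleep bit before entering a section that must not sleep. */
static void prepare_for_fpu_section(struct desc_sketch *desc)
{
	desc->flags &= ~REQ_MAY_SLEEP;
}
```

Code that later inspects the flags to decide whether it may sleep or reschedule will then see that sleeping is forbidden.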

crypto: serpent-sse2 - add xts support · 5962f8b6

Jussi Kivilinna authored Nov 09, 2011

This patch adds XTS support for serpent-sse2 by using xts_crypt(). The patch has
been tested with tcrypt and automated filesystem tests.

Tcrypt benchmark results (serpent-sse2/serpent_generic speed ratios):

Intel Celeron T1600 (x86_64) (fam:6, model:15, step:13):
size    xts-enc xts-dec
16B     0.98x   1.00x
64B     1.00x   1.01x
256B    2.78x   2.75x
1024B   3.30x   3.26x
8192B   3.39x   3.30x

AMD Phenom II 1055T (x86_64) (fam:16, model:10):
size    xts-enc xts-dec
16B     1.05x   1.02x
64B     1.04x   1.03x
256B    2.10x   2.05x
1024B   2.34x   2.35x
8192B   2.34x   2.40x

Intel Atom N270 (i586):
size    xts-enc xts-dec
16B     0.95x   0.96x
64B     1.53x   1.50x
256B    1.72x   1.75x
1024B   1.88x   1.87x
8192B   1.86x   1.83x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

5962f8b6

crypto: serpent-sse2 - add lrw support · 18482053

Jussi Kivilinna authored Nov 09, 2011

This patch adds LRW support for serpent-sse2 by using lrw_crypt(). The patch has
been tested with tcrypt and automated filesystem tests.

Tcrypt benchmark results (serpent-sse2/serpent_generic speed ratios):

Benchmark results with tcrypt:

Intel Celeron T1600 (x86_64) (fam:6, model:15, step:13):
size    lrw-enc lrw-dec
16B     1.00x   0.96x
64B     1.01x   1.01x
256B    3.01x   2.97x
1024B   3.39x   3.33x
8192B   3.35x   3.33x

AMD Phenom II 1055T (x86_64) (fam:16, model:10):
size    lrw-enc lrw-dec
16B     0.98x   1.03x
64B     1.01x   1.04x
256B    2.10x   2.14x
1024B   2.28x   2.33x
8192B   2.30x   2.33x

Intel Atom N270 (i586):
size    lrw-enc lrw-dec
16B     0.97x   0.97x
64B     1.47x   1.50x
256B    1.72x   1.69x
1024B   1.88x   1.81x
8192B   1.84x   1.79x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

18482053

crypto: serpent - add 4-way parallel i586/SSE2 assembler implementation · 251496db

Jussi Kivilinna authored Nov 09, 2011

This patch adds an i586/SSE2 assembler implementation of the serpent cipher. The
assembler functions encrypt and decrypt data in four-block chunks.

The patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmark results (serpent-sse2/serpent_generic speed ratios):

Intel Atom N270:

size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
16      0.95x   1.12x   1.02x   1.07x   0.97x   0.98x
64      1.73x   1.82x   1.08x   1.82x   1.72x   1.73x
256     2.08x   2.00x   1.04x   2.07x   1.99x   2.01x
1024    2.28x   2.18x   1.05x   2.23x   2.17x   2.20x
8192    2.28x   2.13x   1.05x   2.23x   2.18x   2.20x

Full output:
 http://koti.mbnet.fi/axh/kernel/crypto/atom-n270/serpent-generic.txt
 http://koti.mbnet.fi/axh/kernel/crypto/atom-n270/serpent-sse2.txt

Userspace test results:

Encryption/decryption of sse2-i586 vs generic on Intel Atom N270:
 encrypt: 2.35x
 decrypt: 2.54x

Encryption/decryption of sse2-i586 vs generic on AMD Phenom II:
 encrypt: 1.82x
 decrypt: 2.51x

Encryption/decryption of sse2-i586 vs generic on Intel Xeon E7330:
 encrypt: 2.99x
 decrypt: 3.48x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

251496db

crypto: serpent - add 8-way parallel x86_64/SSE2 assembler implementation · 937c30d7

Jussi Kivilinna authored Nov 09, 2011

This patch adds an x86_64/SSE2 assembler implementation of the serpent cipher.
The assembler functions encrypt and decrypt data in eight-block chunks (two
four-block SSE2 operations in parallel to improve performance on out-of-order
CPUs). The glue code is based on the one from the AES-NI implementation, so
requests from irq context are redirected to cryptd.

v2:
 - add missing include of linux/module.h
   (apparently crypto.h used to include module.h, which changed for 3.2 with
    commit 7c926402)

The patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmark results (serpent-sse2/serpent_generic speed ratios):

AMD Phenom II 1055T (fam:16, model:10):

size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
16B     1.03x   1.01x   1.03x   1.05x   1.00x   0.99x
64B     1.00x   1.01x   1.02x   1.04x   1.02x   1.01x
256B    2.34x   2.41x   0.99x   2.43x   2.39x   2.40x
1024B   2.51x   2.57x   1.00x   2.59x   2.56x   2.56x
8192B   2.50x   2.54x   1.00x   2.55x   2.57x   2.57x

Intel Celeron T1600 (fam:6, model:15, step:13):

size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
16B     0.97x   0.97x   1.01x   1.01x   1.01x   1.02x
64B     1.00x   1.00x   1.00x   1.02x   1.01x   1.01x
256B    3.41x   3.35x   1.00x   3.39x   3.42x   3.44x
1024B   3.75x   3.72x   0.99x   3.74x   3.75x   3.75x
8192B   3.70x   3.68x   0.99x   3.68x   3.69x   3.69x

Full output:
 http://koti.mbnet.fi/axh/kernel/crypto/phenom-ii-1055t/serpent-generic.txt
 http://koti.mbnet.fi/axh/kernel/crypto/phenom-ii-1055t/serpent-sse2.txt
 http://koti.mbnet.fi/axh/kernel/crypto/celeron-t1600/serpent-generic.txt
 http://koti.mbnet.fi/axh/kernel/crypto/celeron-t1600/serpent-sse2.txt
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

937c30d7

09 Nov, 2011 2 commits

crypto: twofish-x86_64-3way - add xts support · bae6d303

Jussi Kivilinna authored Oct 18, 2011

This patch adds XTS support for twofish-x86_64-3way by using xts_crypt(). The
patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmark results (twofish-3way/twofish-asm speed ratios):

Intel Celeron T1600 (fam:6, model:15, step:13):

size    xts-enc xts-dec
16B     0.98x   1.00x
64B     1.14x   1.15x
256B    1.23x   1.25x
1024B   1.26x   1.29x
8192B   1.28x   1.30x

AMD Phenom II 1055T (fam:16, model:10):

size    xts-enc xts-dec
16B     1.03x   1.03x
64B     1.13x   1.16x
256B    1.20x   1.20x
1024B   1.22x   1.22x
8192B   1.22x   1.21x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

bae6d303

crypto: twofish-x86_64-3way - add lrw support · 81559f9a

Jussi Kivilinna authored Oct 18, 2011

This patch adds LRW support for twofish-x86_64-3way by using lrw_crypt(). The
patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmark results (twofish-3way/twofish-asm speed ratios):

Intel Celeron T1600 (fam:6, model:15, step:13):

size	lrw-enc	lrw-dec
16B	0.99x	1.00x
64B	1.17x	1.17x
256B	1.26x	1.27x
1024B	1.30x	1.31x
8192B	1.31x	1.32x

AMD Phenom II 1055T (fam:16, model:10):

size	lrw-enc	lrw-dec
16B	1.06x	1.01x
64B	1.08x	1.14x
256B	1.19x	1.20x
1024B	1.21x	1.22x
8192B	1.23x	1.24x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

81559f9a

01 Nov, 2011 1 commit

x86: fix up files really needing to include module.h · 7c52d551

Paul Gortmaker authored May 27, 2011

These files aren't just exporting symbols -- they are also defining
a MODULE_LICENSE etc., so give them the full module.h file.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

7c52d551

21 Oct, 2011 6 commits

crypto: twofish-x86_64-3way - fix ctr blocksize to 1 · 906b2c9f

Jussi Kivilinna authored Oct 10, 2011

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

906b2c9f

crypto: blowfish-x86_64 - fix ctr blocksize to 1 · a516ebaf

Jussi Kivilinna authored Oct 10, 2011

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

a516ebaf

crypto: twofish - add 3-way parallel x86_64 assembler implemention · 8280daad

Jussi Kivilinna authored Sep 26, 2011

This patch adds a 3-way parallel x86_64 assembly implementation of twofish as a
new module. The new assembler functions encrypt and decrypt data in three-block
chunks, improving cipher performance on out-of-order CPUs.

Patch has been tested with tcrypt and automated filesystem tests.

Summary of the tcrypt benchmarks:

Twofish 3-way-asm vs twofish asm (128bit 8kb block ECB)
 encrypt: 1.3x speed
 decrypt: 1.3x speed

Twofish 3-way-asm vs twofish asm (128bit 8kb block CBC)
 encrypt: 1.07x speed
 decrypt: 1.4x speed

Twofish 3-way-asm vs twofish asm (128bit 8kb block CTR)
 encrypt: 1.4x speed

Twofish 3-way-asm vs AES asm (128bit 8kb block ECB)
 encrypt: 1.0x speed
 decrypt: 1.0x speed

Twofish 3-way-asm vs AES asm (128bit 8kb block CBC)
 encrypt: 0.84x speed
 decrypt: 1.09x speed

Twofish 3-way-asm vs AES asm (128bit 8kb block CTR)
 encrypt: 1.15x speed

Full output:
 http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-twofish-3way-asm-x86_64.txt
 http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-twofish-asm-x86_64.txt
 http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-aes-asm-x86_64.txt

Tests were run on:
 vendor_id  : AuthenticAMD
 cpu family : 16
 model      : 10
 model name : AMD Phenom(tm) II X6 1055T Processor

Userspace tests were also run on:
 vendor_id  : GenuineIntel
 cpu family : 6
 model      : 15
 model name : Intel(R) Xeon(R) CPU           E7330  @ 2.40GHz
 stepping   : 11

Userspace test results:

Encryption/decryption of twofish 3-way vs x86_64-asm on AMD Phenom II:
 encrypt: 1.27x
 decrypt: 1.25x

Encryption/decryption of twofish 3-way vs x86_64-asm on Intel Xeon E7330:
 encrypt: 1.36x
 decrypt: 1.36x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

8280daad

crypto: twofish-x86-asm - make assembler functions use twofish_ctx instead of crypto_tfm · 91d41f15

Jussi Kivilinna authored Sep 26, 2011

This is needed by the 3-way twofish patch so that it can easily use the
one-block assembler functions. As the glue code is shared between i586 and
x86_64, apply the change to the i586 assembler too. Also export the assembler
functions for the 3-way parallel twofish module.

CC: Joachim Fritschi <jfritschi@freenet.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

91d41f15

crypto: blowfish-x86_64 - add credits · a071d06e

Jussi Kivilinna authored Sep 23, 2011

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

a071d06e

crypto: blowfish-x86_64 - improve x86_64 blowfish 4-way performance · e827bb09

Jussi Kivilinna authored Sep 23, 2011

This patch adds an improved F-macro for the 4-way parallel functions. With the
new F-macro, blowfish sees a ~15% improvement in speed tests on AMD Phenom II
(~5% on Intel Xeon E7330).

However, when used in the 1-way blowfish function, the new macro would be ~10%
slower than the original, so the old F-macro is kept for the 1-way functions.
The patch cleans up the old F-macro, as it is no longer needed in the 4-way part.

The patch also renames register macros to reduce stack usage.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

e827bb09

22 Sep, 2011 2 commits

crypto: aes-x86 - quiet sparse noise about symbol not declared · 4a4cc2b6

H Hartley Sweeten authored Sep 22, 2011

Include <asm/aes.h> to pick up the declarations for crypto_aes_encrypt_x86
and crypto_aes_decrypt_x86 to quiet the sparse noise:

warning: symbol 'crypto_aes_encrypt_x86' was not declared. Should it be static?
warning: symbol 'crypto_aes_decrypt_x86' was not declared. Should it be static?
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Mandeep Singh Baines <msb@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

4a4cc2b6

crypto: blowfish - add x86_64 assembly implementation · 64b94cea

Jussi Kivilinna authored Sep 02, 2011

This patch adds an x86_64 assembly implementation of blowfish. Two sets of
assembler functions are provided. The first set is the regular 'one block at a
time' encrypt/decrypt functions. The second is 'four blocks at a time' functions
that gain performance on out-of-order CPUs. On in-order CPUs, the performance of
the 4-way functions should be equal to that of the 1-way functions.

Summary of the tcrypt benchmarks:

Blowfish assembler vs blowfish C (256bit 8kb block ECB)
encrypt: 2.2x speed
decrypt: 2.3x speed

Blowfish assembler vs blowfish C (256bit 8kb block CBC)
encrypt: 1.12x speed
decrypt: 2.5x speed

Blowfish assembler vs blowfish C (256bit 8kb block CTR)
encrypt: 2.5x speed

Full output:
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-asm-x86_64.txt
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-c-x86_64.txt

Tests were run on:
 vendor_id	: AuthenticAMD
 cpu family	: 16
 model		: 10
 model name	: AMD Phenom(tm) II X6 1055T Processor
 stepping	: 0
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

64b94cea
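
How glue code might choose between the two sets of functions can be sketched as a loop that consumes four blocks at a time while enough data remains, then finishes one block at a time. The cipher here is a toy XOR and not Blowfish, and all function names are hypothetical; only the 8-byte block size is Blowfish's.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 8	/* Blowfish uses 64-bit blocks */

/* Toy stand-in for the 1-way routine: NOT Blowfish, just an XOR. */
static void toy_encrypt_1way(uint8_t *dst, const uint8_t *src, uint8_t key)
{
	for (size_t i = 0; i < BLOCK_SIZE; i++)
		dst[i] = src[i] ^ key;
}

/* Toy stand-in for the 4-way routine: four independent blocks per call. */
static void toy_encrypt_4way(uint8_t *dst, const uint8_t *src, uint8_t key)
{
	for (size_t i = 0; i < 4 * BLOCK_SIZE; i++)
		dst[i] = src[i] ^ key;
}

/*
 * ECB-style walk over whole blocks: batches of four first, then the
 * remainder one block at a time. Returns the number of 4-way calls made.
 */
static size_t ecb_encrypt_sketch(uint8_t *dst, const uint8_t *src,
				 size_t nbytes, uint8_t key)
{
	size_t calls_4way = 0;

	while (nbytes >= 4 * BLOCK_SIZE) {
		toy_encrypt_4way(dst, src, key);
		src += 4 * BLOCK_SIZE;
		dst += 4 * BLOCK_SIZE;
		nbytes -= 4 * BLOCK_SIZE;
		calls_4way++;
	}

	while (nbytes >= BLOCK_SIZE) {
		toy_encrypt_1way(dst, src, key);
		src += BLOCK_SIZE;
		dst += BLOCK_SIZE;
		nbytes -= BLOCK_SIZE;
	}

	return calls_4way;
}
```

Batching only pays off when the blocks are independent of each other, which fits the benchmark summary: CBC encryption, where each block depends on the previous ciphertext block, gains far less than ECB, CTR, or CBC decryption.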

10 Aug, 2011 1 commit

crypto: sha1 - SSSE3 based SHA1 implementation for x86-64 · 66be8951

Mathias Krause authored Aug 04, 2011

This is an assembler implementation of the SHA1 algorithm using the
Supplemental SSE3 (SSSE3) instructions or, when available, the
Advanced Vector Extensions (AVX).

Testing with the tcrypt module shows the raw hash performance is up to
2.3 times faster than the C implementation, using 8k data blocks on a
Core 2 Duo T5500. For the smallest data set (16 bytes) it is still 25%
faster.

Since this implementation uses SSE/YMM registers it cannot safely be
used in every situation, e.g. while an IRQ interrupts a kernel thread.
The implementation falls back to the generic SHA1 variant, if using
the SSE/YMM registers is not possible.

With this algorithm I was able to increase the throughput of a single
IPsec link from 344 Mbit/s to 464 Mbit/s on a Core 2 Quad CPU using
the SSSE3 variant -- a speedup of +34.8%.

Saving and restoring SSE/YMM state might make the actual throughput
fluctuate when there are FPU-intensive userland applications running.
For example, measuring the performance using iperf2 directly on the
machine under test gives wobbling numbers because iperf2 uses the FPU
for each packet to check if the reporting interval has expired (in the
above test I got min/max/avg: 402/484/464 MBit/s).

Using this algorithm on an IPsec gateway gives much more reasonable and
stable numbers, albeit not as high as in the directly connected case.
Here is the result from an RFC 2544 test run with an EXFO Packet Blazer
FTB-8510:

 frame size    sha1-generic     sha1-ssse3    delta
    64 byte     37.5 MBit/s    37.5 MBit/s     0.0%
   128 byte     56.3 MBit/s    62.5 MBit/s   +11.0%
   256 byte     87.5 MBit/s   100.0 MBit/s   +14.3%
   512 byte    131.3 MBit/s   150.0 MBit/s   +14.2%
  1024 byte    162.5 MBit/s   193.8 MBit/s   +19.3%
  1280 byte    175.0 MBit/s   212.5 MBit/s   +21.4%
  1420 byte    175.0 MBit/s   218.7 MBit/s   +25.0%
  1518 byte    150.0 MBit/s   181.2 MBit/s   +20.8%

The throughput for the largest frame size is lower than for the
previous size because the IP packets need to be fragmented in this
case to make their way through the IPsec tunnel.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Maxim Locktyukhin <maxim.locktyukhin@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

66be8951
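
The fallback described in the message can be sketched as a dispatch on whether the SIMD registers may be used at the moment. Nothing below computes SHA1: the two update functions only record which path ran, and every name, including the `fpu_usable` parameter that stands in for the kernel's own check, is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

enum sha1_path { PATH_GENERIC, PATH_SIMD };

/* Hypothetical context: records which path handled the last update. */
struct sha1_sketch {
	enum sha1_path last_path;
	size_t bytes_hashed;
};

/* Stand-in for the generic C implementation. */
static void update_generic(struct sha1_sketch *ctx, size_t len)
{
	ctx->last_path = PATH_GENERIC;
	ctx->bytes_hashed += len;
}

/* Stand-in for the SSSE3/AVX implementation. */
static void update_simd(struct sha1_sketch *ctx, size_t len)
{
	ctx->last_path = PATH_SIMD;
	ctx->bytes_hashed += len;
}

/* Use the SIMD path only when touching SSE/YMM registers is safe. */
static void sha1_update_sketch(struct sha1_sketch *ctx, size_t len,
			       bool fpu_usable)
{
	if (!fpu_usable) {
		update_generic(ctx, len);
		return;
	}

	/* In the kernel, this call would sit between
	 * kernel_fpu_begin() and kernel_fpu_end(). */
	update_simd(ctx, len);
}
```

Both paths must leave the hash state in the same format, so that a message can be hashed partly by one and partly by the other.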

30 Jun, 2011 1 commit

crypto: ghash-intel - Fix set but not used in ghash_async_setkey() · c3e73e76

Gustavo F. Padovan authored May 26, 2011

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

c3e73e76

18 May, 2011 1 commit

crypto: aesni-intel - fix aesni build on i386 · 9bed4aca

Randy Dunlap authored May 18, 2011

Fix build error on i386 by moving function prototypes:

arch/x86/crypto/aesni-intel_glue.c: In function 'aesni_init':
arch/x86/crypto/aesni-intel_glue.c:1263: error: implicit declaration of function 'crypto_fpu_init'
arch/x86/crypto/aesni-intel_glue.c: In function 'aesni_exit':
arch/x86/crypto/aesni-intel_glue.c:1373: error: implicit declaration of function 'crypto_fpu_exit'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

9bed4aca

16 May, 2011 1 commit

crypto: aesni-intel - Merge with fpu.ko · b23b6451

Andy Lutomirski authored May 16, 2011

Loading fpu without aesni-intel does nothing.  Loading aesni-intel
without fpu causes modes like xts to fail.  (Unloading
aesni-intel will restore those modes.)

One solution would be to make aesni-intel depend on fpu, but it
seems cleaner to just combine the modules.

This is probably responsible for bugs like:
https://bugzilla.redhat.com/show_bug.cgi?id=589390
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

b23b6451

27 Mar, 2011 1 commit

crypto: aesni-intel - fixed problem with packets that are not multiple of 64bytes · 60af520c

Tadeusz Struk authored Mar 13, 2011

This patch fixes a problem with packets whose length is not a multiple of 64 bytes.
Signed-off-by: Adrian Hoban <adrian.hoban@intel.com>
Signed-off-by: Aidan O'Mahony <aidan.o.mahony@intel.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

60af520c

18 Mar, 2011 1 commit

x86: Fix common misspellings · 0d2eb44f

Lucas De Marchi authored Mar 17, 2011

They were generated by 'codespell' and then manually reviewed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Cc: trivial@kernel.org
LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

0d2eb44f

16 Feb, 2011 1 commit

crypto: aesni-intel - Fix remaining leak in rfc4106_set_hash_key · fc9044e2

Jesper Juhl authored Feb 16, 2011

Fix up the previous patch, which failed to properly fix the memory leak in
rfc4106_set_hash_subkey(). This add-on patch fixes the leak, moves
kfree() out of the error path, and returns -ENOMEM rather than -EINVAL when
ablkcipher_request_alloc() fails.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

fc9044e2

23 Jan, 2011 1 commit

crypto: aesni-intel - Don't leak memory in rfc4106_set_hash_subkey · 7efd95f6

Jesper Juhl authored Jan 23, 2011

There's a small memory leak in
arch/x86/crypto/aesni-intel_glue.c::rfc4106_set_hash_subkey(). If the call
to kmalloc() fails and returns NULL then the memory allocated previously
by ablkcipher_request_alloc() is not freed when we leave the function.

I could have just added a call to ablkcipher_request_free() before we
return -ENOMEM, but that started to look too much like the code we
already had at the end of the function, so I chose instead to rework the
code a bit so that there are now a few labels at the end that we goto when
various allocations fail, so we don't have to repeat the same blocks of
code (this also reduces the object code size slightly).
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

7efd95f6
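
The label-based cleanup described above follows a common C pattern, sketched here in userspace. The allocator wrappers, the failure hook and the label names are all hypothetical; only the shape (one exit path that releases resources in reverse order of acquisition) reflects the message.

```c
#include <errno.h>
#include <stdlib.h>

/* Test hooks (hypothetical): which allocation should fail, 1-based; 0 = none. */
static int fail_at;
static int alloc_count;
static int live_allocs;

static void *alloc_sketch(size_t size)
{
	void *p;

	if (++alloc_count == fail_at)
		return NULL;

	p = malloc(size);
	if (p)
		live_allocs++;
	return p;
}

static void free_sketch(void *p)
{
	if (p)
		live_allocs--;
	free(p);
}

/* Three dependent allocations; each failure jumps to the matching label. */
static int set_hash_subkey_sketch(void)
{
	int ret = -ENOMEM;
	void *tfm, *req, *req_data;

	tfm = alloc_sketch(32);
	if (!tfm)
		return ret;

	req = alloc_sketch(64);
	if (!req)
		goto out_free_tfm;

	req_data = alloc_sketch(128);
	if (!req_data)
		goto out_free_request;

	ret = 0;	/* the real work would happen here */

	free_sketch(req_data);
out_free_request:
	free_sketch(req);
out_free_tfm:
	free_sketch(tfm);
	return ret;
}
```

Because the success path falls through the same labels, each release call appears exactly once, which is also why this style tends to shrink the object code.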

15 Dec, 2010 1 commit

crypto: ghash-intel - ghash-clmulni-intel_glue needs err.h · 52f6c5ad

Randy Dunlap authored Dec 15, 2010

Add missing header file:

arch/x86/crypto/ghash-clmulni-intel_glue.c:256: error: implicit declaration of function 'IS_ERR'
arch/x86/crypto/ghash-clmulni-intel_glue.c:257: error: implicit declaration of function 'PTR_ERR'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

52f6c5ad

13 Dec, 2010 1 commit

crypto: aesni-intel - Fixed build with binutils 2.16 · 3c097b80

Tadeusz Struk authored Dec 13, 2010

This patch fixes the build problem with binutils 2.16.
Signed-off-by: Aidan O'Mahony <aidan.o.mahony@intel.com>
Signed-off-by: Adrian Hoban <adrian.hoban@intel.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

3c097b80

29 Nov, 2010 1 commit

crypto: aesni-intel - Fixed build error on x86-32 · 559ad0ff

Mathias Krause authored Nov 29, 2010

Exclude the AES-GCM code on x86-32 because it makes heavy use of 64-bit
registers that are not available on x86-32.

While at it, fix the unregister order in aesni_exit().
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

559ad0ff

27 Nov, 2010 1 commit

crypto: aesni-intel - Ported implementation to x86-32 · 0d258efb

Mathias Krause authored Nov 27, 2010

The AES-NI instructions are also available in legacy mode so the 32-bit
architecture may profit from those, too.

To illustrate the performance gain here's a short summary of a dm-crypt
speed test on a Core i7 M620 running at 2.67GHz comparing both assembler
implementations:

x86:                   i586       aes-ni    delta
ECB, 256 bit:     93.8 MB/s   123.3 MB/s   +31.4%
CBC, 256 bit:     84.8 MB/s   262.3 MB/s  +209.3%
LRW, 256 bit:    108.6 MB/s   222.1 MB/s  +104.5%
XTS, 256 bit:    105.0 MB/s   205.5 MB/s   +95.7%

Additionally, due to some minor optimizations, the 64-bit version also
got a minor performance gain as seen below:

x86-64:           old impl.    new impl.    delta
ECB, 256 bit:    121.1 MB/s   123.0 MB/s    +1.5%
CBC, 256 bit:    285.3 MB/s   290.8 MB/s    +1.9%
LRW, 256 bit:    263.7 MB/s   265.3 MB/s    +0.6%
XTS, 256 bit:    251.1 MB/s   255.3 MB/s    +1.7%
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

0d258efb

13 Nov, 2010 1 commit

crypto: aesni-intel - RFC4106 AES-GCM Driver Using Intel New Instructions · 0bd82f5f

Tadeusz Struk authored Nov 04, 2010

This patch adds an optimized RFC4106 AES-GCM implementation for 64-bit
kernels. It supports 128-bit AES key size. This leverages the crypto
AEAD interface type to facilitate a combined AES & GCM operation to
be implemented in assembly code. The assembly code leverages Intel(R)
AES New Instructions and the PCLMULQDQ instruction.
Signed-off-by: Adrian Hoban <adrian.hoban@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: Aidan O'Mahony <aidan.o.mahony@intel.com>
Signed-off-by: Erdinc Ozturk <erdinc.ozturk@intel.com>
Signed-off-by: James Guilford <james.guilford@intel.com>
Signed-off-by: Wajdi Feghali <wajdi.k.feghali@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

0bd82f5f