Commits · 32dc43e40a2707d0cb1ab8768d080c3e9bcfed52 · openanolis / cloud-kernel

20 Jan, 2013 14 commits

crypto: crc32-pclmul - Kill warning on x86-32 · 79836276

Authored by Herbert Xu on Jan 20, 2013

This patch removes a gratuitous warning on x86-32:

arch/x86/crypto/crc32-pclmul_asm.S:87:2: warning: #warning Using 32bit code support [-Wcpp]
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/twofish - assembler clean-ups: use ENTRY/ENDPROC, localize jump labels · d3f5188d

Authored by Jussi Kivilinna on Jan 19, 2013

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/sha1 - assembler clean-ups: use ENTRY/ENDPROC · ac9d55dd

Authored by Jussi Kivilinna on Jan 19, 2013

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/serpent - use ENTRY/ENDPROC for assember functions and localize jump targets · 2dcfd44d

Authored by Jussi Kivilinna on Jan 19, 2013

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/salsa20 - assembler cleanup, use ENTRY/ENDPROC for assember... · 04443808

Authored by Jussi Kivilinna on Jan 19, 2013

crypto: x86/salsa20 - assembler cleanup, use ENTRY/ENDPROC for assember functions and rename ECRYPT_* to salsa20_*
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/ghash - assembler clean-up: use ENDPROC at end of assember functions · b05d3f37

Authored by Jussi Kivilinna on Jan 19, 2013

Signed-off-by: Jussi Kivilinna <jussi.kivilinn@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/crc32c - assembler clean-up: use ENTRY/ENDPROC · 698a5abb

Authored by Jussi Kivilinna on Jan 19, 2013

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: cast6-avx: use ENTRY()/ENDPROC() for assembler functions · 1985fecf

Authored by Jussi Kivilinna on Jan 19, 2013

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: cast5-avx: use ENTRY()/ENDPROC() for assembler functions and localize jump targets · e17e209e

Authored by Jussi Kivilinna on Jan 19, 2013

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: camellia-x86_64/aes-ni: use ENTRY()/ENDPROC() for assembler functions... · 59990684

Authored by Jussi Kivilinna on Jan 19, 2013

crypto: camellia-x86_64/aes-ni: use ENTRY()/ENDPROC() for assembler functions and localize jump targets
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: blowfish-x86_64: use ENTRY()/ENDPROC() for assembler functions and localize jump targets · 5186e395

Authored by Jussi Kivilinna on Jan 19, 2013

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: aesni-intel - add ENDPROC statements for assembler functions · 8309b745

Authored by Jussi Kivilinna on Jan 19, 2013

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/aes - assembler clean-ups: use ENTRY/ENDPROC, localize jump targets · 3f299743

Authored by Jussi Kivilinna on Jan 19, 2013

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: crc32 - add crc32 pclmulqdq implementation and wrappers for table implementation · 78c37d19

Authored by Alexander Boyko on Jan 10, 2013

This patch adds crc32 algorithms to the shash crypto API. One is a wrapper
around the generic crc32_le function. The second is a crc32 pclmulqdq
implementation. It uses the hardware-provided PCLMULQDQ instruction to
accelerate the CRC32 computation. This instruction is available starting with
Intel Westmere and AMD Bulldozer CPUs.

On an Intel Core i5 I got 450 MB/s for the table implementation and 2100 MB/s
for the pclmulqdq implementation.

Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
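For readers unfamiliar with the "table implementation" that the commit above wraps, the following is a minimal Python sketch of a byte-at-a-time, table-driven little-endian CRC-32. It is not the kernel code: the function names only echo `crc32_le`, and the seed and final inversion used in `crc32()` follow the common zlib convention, which is an assumption about how a caller would use the raw update. The PCLMULQDQ variant computes the same checksum by folding larger chunks with carry-less multiplication and is not shown here.

```python
POLY = 0xEDB88320  # reflected form of the standard CRC-32 polynomial


def make_table():
    """Precompute the CRC of every possible byte value (256 entries)."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


TABLE = make_table()


def crc32_le(crc, data):
    """Raw little-endian CRC update: one table lookup per input byte.

    No pre- or post-inversion is applied here (illustrative, not kernel code).
    """
    for b in data:
        crc = TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc


def crc32(data):
    """zlib-style CRC-32: all-ones seed and final inversion (assumed convention)."""
    return crc32_le(0xFFFFFFFF, data) ^ 0xFFFFFFFF
```

The one-lookup-per-byte loop is what bounds the table version's throughput; a hardware-assisted version is free to process many bytes per step as long as it produces the same value.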

08 Jan, 2013 1 commit

crypto: aesni-intel - remove rfc3686(ctr(aes)), utilize rfc3686 from ctr-module instead · 0024dc53

Authored by Jussi Kivilinna on Dec 28, 2012

rfc3686 in the CTR module is now able to use asynchronous ctr(aes) from
aesni-intel, so rfc3686(ctr(aes)) in aesni-intel is no longer needed.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

06 Dec, 2012 1 commit

crypto: cast5/cast6 - move lookup tables to shared module · 044ab525

Authored by Jussi Kivilinna on Nov 13, 2012

CAST5 and CAST6 both use the same lookup tables, which can be moved to the
shared module 'cast_common'.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

09 Nov, 2012 2 commits

crypto: camellia - add AES-NI/AVX/x86_64 assembler implementation of camellia cipher · d9b1d2e7

Authored by Jussi Kivilinna on Oct 26, 2012

This patch adds an AES-NI/AVX/x86_64 assembler implementation of the Camellia
block cipher. The implementation processes data in sixteen-block chunks, which
are byte-sliced, and AES SubBytes is reused for the Camellia s-box with the
help of pre- and post-filtering.

Patch has been tested with tcrypt and automated filesystem tests.

tcrypt test results:

Intel Core i5-2450M:

camellia-aesni-avx vs camellia-asm-x86_64-2way:
128bit key: (lrw:256bit) (xts:256bit)
size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B 0.98x 0.96x 0.99x 0.96x 0.96x 0.95x 0.95x 0.94x 0.97x 0.98x
64B 0.99x 0.98x 1.00x 0.98x 0.98x 0.99x 0.98x 0.93x 0.99x 0.98x
256B 2.28x 2.28x 1.01x 2.29x 2.25x 2.24x 1.96x 1.97x 1.91x 1.90x
1024B 2.57x 2.56x 1.00x 2.57x 2.51x 2.53x 2.19x 2.17x 2.19x 2.22x
8192B 2.49x 2.49x 1.00x 2.53x 2.48x 2.49x 2.17x 2.17x 2.22x 2.22x

256bit key: (lrw:384bit) (xts:512bit)
size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B 0.97x 0.98x 0.99x 0.97x 0.97x 0.96x 0.97x 0.98x 0.98x 0.99x
64B 1.00x 1.00x 1.01x 0.99x 0.98x 0.99x 0.99x 0.99x 0.99x 0.99x
256B 2.37x 2.37x 1.01x 2.39x 2.35x 2.33x 2.10x 2.11x 1.99x 2.02x
1024B 2.58x 2.60x 1.00x 2.58x 2.56x 2.56x 2.28x 2.29x 2.28x 2.29x
8192B 2.50x 2.52x 1.00x 2.56x 2.51x 2.51x 2.24x 2.25x 2.26x 2.29x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: camellia-x86_64 - share common functions and move structures and... · cf582cce

Authored by Jussi Kivilinna on Oct 26, 2012

crypto: camellia-x86_64 - share common functions and move structures and function definitions to header file

Prepare camellia-x86_64 functions to be reused from AVX/AESNI implementation
module.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

24 Oct, 2012 5 commits

crypto: cast5/avx - avoid using temporary stack buffers · c12ab20b

Authored by Jussi Kivilinna on Oct 20, 2012

Introduce new assembler functions to avoid using temporary stack buffers in
glue code. This also allows the use of vector instructions for XORing output
in CTR and CBC modes and for constructing IVs in CTR mode.

ECB mode sees a ~0.5% decrease in speed because of one extra function
call. CBC mode decryption and CTR mode benefit from vector operations
and gain ~5%.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: serpent/avx - avoid using temporary stack buffers · facd416f

Authored by Jussi Kivilinna on Oct 20, 2012

Introduce new assembler functions to avoid using temporary stack buffers in
glue code. This also allows the use of vector instructions for XORing output
in CTR and CBC modes and for constructing IVs in CTR mode.

ECB mode sees a ~0.5% decrease in speed because of one extra function
call. CBC mode decryption and CTR mode benefit from vector operations
and gain ~3%.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: twofish/avx - avoid using temporary stack buffers · 8435a3c3

Authored by Jussi Kivilinna on Oct 20, 2012

Introduce new assembler functions to avoid using temporary stack buffers in
glue code. This also allows the use of vector instructions for XORing output
in CTR and CBC modes and for constructing IVs in CTR mode.

ECB mode sees a ~0.2% decrease in speed because of one extra function
call. CBC mode decryption and CTR mode benefit from vector operations
and gain ~3%.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: cast6/avx - avoid using temporary stack buffers · cba1cce0

Authored by Jussi Kivilinna on Oct 20, 2012

Introduce new assembler functions to avoid using temporary stack buffers in
glue code. This also allows the use of vector instructions for XORing output
in CTR and CBC modes and for constructing IVs in CTR mode.

ECB mode sees a ~0.5% decrease in speed because of one extra function
call. CBC mode decryption and CTR mode benefit from vector operations
and gain ~2%.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/glue_helper - use le128 instead of u128 for CTR mode · 58990986

Authored by Jussi Kivilinna on Oct 20, 2012

The 'u128' type currently used for CTR mode is 'long long'-swapped on
little-endian machines and would require extra swap operations in SSE/AVX
code. Using le128 instead of u128 makes it easier to do IV calculations in
vector registers.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
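The layout argument in the commit above can be made concrete with a small Python sketch. It is an illustration under assumptions, not the kernel's glue code: the helper names are chosen to resemble the kernel's, the counter is modelled as a `(low, high)` pair of 64-bit integers, and the "u128" layout assumes a struct whose high half comes first in memory with each half stored in native little-endian order. Under those assumptions, the le128 form is a plain byte reversal of the big-endian counter block, so a 128-bit register loaded from it holds an ordinary little-endian integer, while the u128 form would still need its two halves swapped.

```python
import struct

MASK64 = (1 << 64) - 1


def be128_to_le128(iv):
    """16-byte big-endian counter block -> (low, high) 64-bit halves."""
    high, low = struct.unpack(">QQ", iv)
    return low, high


def le128_inc(ctr):
    """Add one to the 128-bit counter, carrying from the low half into the high half."""
    low, high = ctr
    low = (low + 1) & MASK64
    if low == 0:
        high = (high + 1) & MASK64
    return low, high


def le128_to_be128(ctr):
    """(low, high) halves -> 16-byte big-endian counter block."""
    low, high = ctr
    return struct.pack(">QQ", high, low)


def le128_memory(ctr):
    """Bytes as an le128 would sit in memory: low half first, each half little-endian."""
    low, high = ctr
    return struct.pack("<QQ", low, high)


def u128_memory_on_le(ctr):
    """Assumed u128 layout on a little-endian CPU: high half first, native byte order."""
    low, high = ctr
    return struct.pack("<QQ", high, low)
```

For example, converting a counter block whose low 64 bits are all ones, incrementing it, and converting back carries into the high half, and `le128_memory()` of the original counter equals the input block reversed byte for byte.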

19 Oct, 2012 1 commit

crypto: aesni - fix XTS mode on x86-32, add wrapper function for asmlinkage aesni_enc() · 32bec973

Authored by Jussi Kivilinna on Oct 18, 2012

Calling convention for internal functions and 'asmlinkage' functions is
different on x86-32. Therefore do not directly cast aesni_enc as XTS tweak
function, but use wrapper function in between. Fixes crash with "XTS +
aesni_intel + x86-32" combination.

Cc: stable@vger.kernel.org
Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

15 Oct, 2012 2 commits

crypto: crc32c - Optimize CRC32C calculation with PCLMULQDQ instruction · 6a8ce1ef

Authored by Tim Chen on Sep 27, 2012

This patch adds the crc_pcl function, which calculates the CRC32C checksum
using the PCLMULQDQ instruction on processors that support it. This provides
a speedup over using the CRC32 instruction alone.
Using PCLMULQDQ requires calling kernel_fpu_begin and kernel_fpu_end, which
incurs some overhead, so the new crc_pcl function is only invoked for buffers
of 512 bytes or more. Larger buffers can expect a greater speedup. This
feature is best used together with eager_fpu, which reduces the
kernel_fpu_begin/end overhead. For a buffer size of 1K the speedup is around
1.6x, and for buffer sizes greater than 4K the speedup is around 3x compared
to the original implementation in the crc32c-intel module. The test was
performed on a Sandy Bridge based platform with the CPU frequency held
constant.

A white paper detailing the algorithm can be found here:
http://download.intel.com/design/intarch/papers/323405.pdf

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: crc32c - Rename crc32c-intel.c to crc32c-intel_glue.c · 35b80920

Authored by Tim Chen on Sep 27, 2012

This patch renames the crc32c-intel.c file to crc32c-intel_glue.c file
in preparation for linking with the new crc32c-pcl-intel-asm.S file,
which contains optimized crc32c calculation based on PCLMULQDQ
instruction.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

04 Oct, 2012 1 commit

crypto: x86/glue_helper - fix storing of new IV in CBC encryption · c9f97a27

Authored by Jussi Kivilinna on Sep 19, 2012

Glue_helper incorrectly XORs new IV over old IV at end of CBC encryption
function when it should store. This causes CBC encryption to give
incorrect output on multi-page encryption requests.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
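The effect described in the commit above, a correct first chunk followed by wrong output once a request spans more than one chunk, can be reproduced with a toy model. The Python sketch below is an assumption-laden illustration, not the glue_helper code: `toy_encrypt` is a made-up byte-wise bijection standing in for a real block cipher, and the `store_iv` flag switches between storing the last ciphertext block as the next IV (correct) and XORing it over the old IV (the bug as described).

```python
BLOCK = 16


def toy_encrypt(key, block):
    """Stand-in for a real block cipher: a keyed byte-wise bijection (hypothetical)."""
    return bytes(((b ^ k) + 1) & 0xFF for b, k in zip(block, key))


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def cbc_encrypt_chunk(key, iv, data, store_iv=True):
    """CBC-encrypt one chunk and return (ciphertext, IV for the next chunk)."""
    out = bytearray()
    prev = iv
    for i in range(0, len(data), BLOCK):
        prev = toy_encrypt(key, xor(data[i:i + BLOCK], prev))
        out += prev
    # Correct: the next IV is the last ciphertext block.
    # Buggy variant: the last ciphertext block is XORed over the old IV.
    new_iv = prev if store_iv else xor(iv, prev)
    return bytes(out), new_iv


def encrypt_in_chunks(key, iv, data, split, store_iv=True):
    """Encrypt data as two chunks, chaining the IV between them."""
    first, iv = cbc_encrypt_chunk(key, iv, data[:split], store_iv)
    second, _ = cbc_encrypt_chunk(key, iv, data[split:], store_iv)
    return first + second
```

With a non-zero IV, splitting a 64-byte message at 32 bytes gives the same ciphertext as a single call when `store_iv=True`, and a different second half when `store_iv=False`, which mirrors why the bug only shows up on multi-page requests.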

27 Sep, 2012 1 commit

crypto: cast5/avx - fix storing of new IV in CBC encryption · 200429cc

Authored by Jussi Kivilinna on Sep 19, 2012

cast5/avx incorrectly XORs new IV over old IV at end of CBC encryption
function when it should store. This causes CBC encryption to give
incorrect output on multi-page encryption requests.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

07 Sep, 2012 4 commits

crypto: camellia-x86_64 - fix sparse warnings (constant is so big) · 1ffb72a3

Authored by Jussi Kivilinna on Aug 28, 2012

Fix "constant 0xXXXXXXXXXXXXXXXX is so big it's unsigned long" sparse warnings.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: cast6-avx - tune assembler code for more performance · c09220e1

Authored by Jussi Kivilinna on Aug 28, 2012

Patch replaces 'movb' instructions with 'movzbl' to break false register
dependencies, interleaves instructions better for out-of-order scheduling
and merges constant 16-bit rotation with round-key variable rotation.

tcrypt ECB results:

Intel Core i5-2450M:

size    old-vs-new      new-vs-generic  old-vs-generic
        enc     dec     enc     dec     enc     dec
256     1.13x   1.19x   2.05x   2.17x   1.82x   1.82x
1k      1.18x   1.21x   2.26x   2.33x   1.93x   1.93x
8k      1.19x   1.19x   2.32x   2.33x   1.95x   1.95x

[v2]
 - Do instruction interleaving another way to avoid adding new FPU<=>CPU
   register moves as these cause performance drop on Bulldozer.
 - Improvements to round-key variable rotation handling.
 - Further interleaving improvements for better out-of-order scheduling.

Cc: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: cast5-avx - tune assembler code for more performance · ddaea786

Authored by Jussi Kivilinna on Aug 28, 2012

Patch replaces 'movb' instructions with 'movzbl' to break false register
dependencies, interleaves instructions better for out-of-order scheduling
and merges constant 16-bit rotation with round-key variable rotation.

tcrypt ECB results (128bit key):

Intel Core i5-2450M:

size    old-vs-new      new-vs-generic  old-vs-generic
        enc     dec     enc     dec     enc     dec
256     1.18x   1.18x   2.45x   2.47x   2.08x   2.10x
1k      1.20x   1.20x   2.73x   2.73x   2.28x   2.28x
8k      1.20x   1.19x   2.73x   2.73x   2.28x   2.29x

[v2]
 - Do instruction interleaving another way to avoid adding new FPU<=>CPU
   register moves as these cause performance drop on Bulldozer.
 - Improvements to round-key variable rotation handling.
 - Further interleaving improvements for better out-of-order scheduling.

Cc: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: twofish-avx - tune assembler code for more performance · f94a73f8

Authored by Jussi Kivilinna on Aug 28, 2012

Patch replaces 'movb' instructions with 'movzbl' to break false register
dependencies and interleaves instructions better for out-of-order scheduling.

Tested on Intel Core i5-2450M and AMD FX-8100.

tcrypt ECB results:

Intel Core i5-2450M:

size    old-vs-new      new-vs-3way     old-vs-3way
        enc     dec     enc     dec     enc     dec
256     1.12x   1.13x   1.36x   1.37x   1.21x   1.22x
1k      1.14x   1.14x   1.48x   1.49x   1.29x   1.31x
8k      1.14x   1.14x   1.50x   1.52x   1.32x   1.33x

AMD FX-8100:

size    old-vs-new      new-vs-3way     old-vs-3way
        enc     dec     enc     dec     enc     dec
256     1.10x   1.11x   1.01x   1.01x   0.92x   0.91x
1k      1.11x   1.12x   1.08x   1.07x   0.97x   0.96x
8k      1.11x   1.13x   1.10x   1.08x   0.99x   0.97x

[v2]
 - Do instruction interleaving another way to avoid adding new FPU<=>CPU
   register moves as these cause performance drop on Bulldozer.
 - Further interleaving improvements for better out-of-order scheduling.
Tested-by: Borislav Petkov <bp@alien8.de>
Cc: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

20 Aug, 2012 1 commit

crypto: aesni_intel - improve lrw and xts performance by utilizing parallel... · 023af608

Authored by Jussi Kivilinna on Jul 22, 2012

crypto: aesni_intel - improve lrw and xts performance by utilizing parallel AES-NI hardware pipelines

Use parallel LRW and XTS encryption facilities to better utilize AES-NI
hardware pipelines and gain extra performance.

Tcrypt benchmark results (async), old vs new ratios:

Intel Core i5-2450M CPU (fam: 6, model: 42, step: 7)

aes:128bit
        lrw:256bit      xts:256bit
size    lrw-enc lrw-dec xts-enc xts-dec
16B     0.99x   1.00x   1.22x   1.19x
64B     1.38x   1.50x   1.58x   1.61x
256B    2.04x   2.02x   2.27x   2.29x
1024B   2.56x   2.54x   2.89x   2.92x
8192B   2.85x   2.99x   3.40x   3.23x

aes:192bit
        lrw:320bit      xts:384bit
size    lrw-enc lrw-dec xts-enc xts-dec
16B     1.08x   1.08x   1.16x   1.17x
64B     1.48x   1.54x   1.59x   1.65x
256B    2.18x   2.17x   2.29x   2.28x
1024B   2.67x   2.67x   2.87x   3.05x
8192B   2.93x   2.84x   3.28x   3.33x

aes:256bit
        lrw:384bit      xts:512bit
size    lrw-enc lrw-dec xts-enc xts-dec
16B     1.07x   1.07x   1.18x   1.19x
64B     1.56x   1.56x   1.70x   1.71x
256B    2.22x   2.24x   2.46x   2.46x
1024B   2.76x   2.77x   3.13x   3.05x
8192B   2.99x   3.05x   3.40x   3.30x

Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Reviewed-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

01 Aug, 2012 3 commits

crypto: cast6 - add x86_64/avx assembler implementation · 4ea1277d

Authored by Johannes Goetzfried on Jul 11, 2012

This patch adds a x86_64/avx assembler implementation of the Cast6 block
cipher. The implementation processes eight blocks in parallel (two 4 block
chunk AVX operations). The table-lookups are done in general-purpose registers.
For small blocksizes the functions from the generic module are called. A good
performance increase is provided for blocksizes greater than or equal to 128B.

Patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmark results:

Intel Core i5-2500 CPU (fam:6, model:42, step:7)

cast6-avx-x86_64 vs. cast6-generic
128bit key: (lrw:256bit) (xts:256bit)
size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B 0.97x 1.00x 1.01x 1.01x 0.99x 0.97x 0.98x 1.01x 0.96x 0.98x
64B 0.98x 0.99x 1.02x 1.01x 0.99x 1.00x 1.01x 0.99x 1.00x 0.99x
256B 1.77x 1.84x 0.99x 1.85x 1.77x 1.77x 1.70x 1.74x 1.69x 1.72x
1024B 1.93x 1.95x 0.99x 1.96x 1.93x 1.93x 1.84x 1.85x 1.89x 1.87x
8192B 1.91x 1.95x 0.99x 1.97x 1.95x 1.91x 1.86x 1.87x 1.93x 1.90x

256bit key: (lrw:384bit) (xts:512bit)
size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B 0.97x 0.99x 1.02x 1.01x 0.98x 0.99x 1.00x 1.00x 0.98x 0.98x
64B 0.98x 0.99x 1.01x 1.00x 1.00x 1.00x 1.01x 1.01x 0.97x 1.00x
256B 1.77x 1.83x 1.00x 1.86x 1.79x 1.78x 1.70x 1.76x 1.71x 1.69x
1024B 1.92x 1.95x 0.99x 1.96x 1.93x 1.93x 1.83x 1.86x 1.89x 1.87x
8192B 1.94x 1.95x 0.99x 1.97x 1.95x 1.95x 1.87x 1.87x 1.93x 1.91x
Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: cast5 - add x86_64/avx assembler implementation · 4d6d6a2c

Authored by Johannes Goetzfried on Jul 11, 2012

This patch adds a x86_64/avx assembler implementation of the Cast5 block
cipher. The implementation processes sixteen blocks in parallel (four 4 block
chunk AVX operations). The table-lookups are done in general-purpose registers.
For small blocksizes the functions from the generic module are called. A good
performance increase is provided for blocksizes greater than or equal to 128B.

Patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmark results:

Intel Core i5-2500 CPU (fam:6, model:42, step:7)

cast5-avx-x86_64 vs. cast5-generic
64bit key:
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
16B     0.99x   0.99x   1.00x   1.00x   1.02x   1.01x
64B     1.00x   1.00x   0.98x   1.00x   1.01x   1.02x
256B    2.03x   2.01x   0.95x   2.11x   2.12x   2.13x
1024B   2.30x   2.24x   0.95x   2.29x   2.35x   2.35x
8192B   2.31x   2.27x   0.95x   2.31x   2.39x   2.39x

128bit key:
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
16B     0.99x   0.99x   1.00x   1.00x   1.01x   1.01x
64B     1.00x   1.00x   0.98x   1.01x   1.02x   1.01x
256B    2.17x   2.13x   0.96x   2.19x   2.19x   2.19x
1024B   2.29x   2.32x   0.95x   2.34x   2.37x   2.38x
8192B   2.35x   2.32x   0.95x   2.35x   2.39x   2.39x
Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations · 7af6c245

Authored by Jussi Kivilinna on Jul 11, 2012

Initialization of cra_list is currently mixed, most ciphers initialize this
field and most shashes do not. Initialization however is not needed at all
since cra_list is initialized/overwritten in __crypto_register_alg() with
list_add(). Therefore perform cleanup to remove all unneeded initializations
of this field in 'arch/x86/crypto/'.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

11 Jul, 2012 2 commits

crypto: twofish-avx - remove useless instruction · a4347886

Authored by Johannes Goetzfried on Jul 05, 2012

The register %rdx is written, but never read till the end of the encryption
routine. Therefore let's delete the useless instruction.
Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: aesni-intel - fix wrong kfree pointer · bf084d8f

Authored by Milan Broz on Jun 28, 2012

kfree(new_key_mem) in rfc4106_set_key() should be called on the originally
allocated pointer, not on the aligned one; otherwise the free can be passed
an invalid pointer.

(Seen at least once when running tcrypt tests with debug kernel.)
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

27 Jun, 2012 2 commits

crypto: move arch/x86/include/asm/aes.h to arch/x86/include/asm/crypto/ · 70ef2601

Authored by Jussi Kivilinna on Jun 18, 2012

Move AES header to the new asm/crypto directory.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: move arch/x86/include/asm/serpent-{sse2|avx}.h to arch/x86/include/asm/crypto/ · d4af0e9d

Authored by Jussi Kivilinna on Jun 18, 2012

Move serpent crypto headers to the new asm/crypto/ directory.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
