Commits · 8280daad436edb7dd9e7e06fc13bcecb6b2a885c · OpenHarmony / kernel_linux

21 Oct, 2011 · 4 commits

crypto: twofish - add 3-way parallel x86_64 assembler implemention · 8280daad

Committed by Jussi Kivilinna on Sep 26, 2011

Patch adds 3-way parallel x86_64 assembly implementation of twofish as a new
module. The new assembler functions process data in three-block chunks, improving
cipher performance on out-of-order CPUs.

Patch has been tested with tcrypt and automated filesystem tests.

Summary of the tcrypt benchmarks:

Twofish 3-way-asm vs twofish asm (128bit 8kb block ECB)
 encrypt: 1.3x speed
 decrypt: 1.3x speed

Twofish 3-way-asm vs twofish asm (128bit 8kb block CBC)
 encrypt: 1.07x speed
 decrypt: 1.4x speed

Twofish 3-way-asm vs twofish asm (128bit 8kb block CTR)
 encrypt: 1.4x speed

Twofish 3-way-asm vs AES asm (128bit 8kb block ECB)
 encrypt: 1.0x speed
 decrypt: 1.0x speed

Twofish 3-way-asm vs AES asm (128bit 8kb block CBC)
 encrypt: 0.84x speed
 decrypt: 1.09x speed

Twofish 3-way-asm vs AES asm (128bit 8kb block CTR)
 encrypt: 1.15x speed

Full output:
 http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-twofish-3way-asm-x86_64.txt
 http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-twofish-asm-x86_64.txt
 http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-aes-asm-x86_64.txt

Tests were run on:
 vendor_id  : AuthenticAMD
 cpu family : 16
 model      : 10
 model name : AMD Phenom(tm) II X6 1055T Processor

Userspace tests were also run on:
 vendor_id  : GenuineIntel
 cpu family : 6
 model      : 15
 model name : Intel(R) Xeon(R) CPU           E7330  @ 2.40GHz
 stepping   : 11

Userspace test results:

Encryption/decryption of twofish 3-way vs x86_64-asm on AMD Phenom II:
 encrypt: 1.27x
 decrypt: 1.25x

Encryption/decryption of twofish 3-way vs x86_64-asm on Intel Xeon E7330:
 encrypt: 1.36x
 decrypt: 1.36x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

8280daad
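Note: the three-block chunking described in the commit above can be sketched as follows. This is an illustration only, not the kernel code. The name split_for_3way is hypothetical, and handing the leftover blocks to a one-block routine is an assumption based on the follow-up commit that exports the one-block assembler functions.

```python
BLOCK_SIZE = 16  # Twofish block size in bytes


def split_for_3way(data: bytes):
    """Split data into 3-block chunks plus leftover single blocks (hypothetical helper)."""
    if len(data) % BLOCK_SIZE:
        raise ValueError("data length must be a multiple of the block size")
    chunk = 3 * BLOCK_SIZE
    wide_end = len(data) - len(data) % chunk
    # Chunks a 3-way routine would take, three blocks at a time.
    three_way = [data[i:i + chunk] for i in range(0, wide_end, chunk)]
    # The remaining 0, 1 or 2 blocks are assumed to go to a one-block routine.
    one_way = [data[i:i + BLOCK_SIZE] for i in range(wide_end, len(data), BLOCK_SIZE)]
    return three_way, one_way


three_way, one_way = split_for_3way(bytes(8 * BLOCK_SIZE))
print(len(three_way), "three-block chunks,", len(one_way), "single blocks")
```

For eight blocks of input this gives two three-block chunks and two single blocks.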

crypto: twofish-x86-asm - make assembler functions use twofish_ctx instead of crypto_tfm · 91d41f15

Committed by Jussi Kivilinna on Sep 26, 2011

This is needed by the 3-way twofish patch so that it can easily use the one-block
assembler functions. As the glue code is shared between i586 and x86_64, apply the
change to the i586 assembler too. Also export the assembler functions for the
3-way parallel twofish module.

CC: Joachim Fritschi <jfritschi@freenet.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

91d41f15

crypto: blowfish-x86_64 - add credits · a071d06e

Committed by Jussi Kivilinna on Sep 23, 2011

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

a071d06e

crypto: blowfish-x86_64 - improve x86_64 blowfish 4-way performance · e827bb09

Committed by Jussi Kivilinna on Sep 23, 2011

This patch adds improved F-macro for 4-way parallel functions. With new
F-macro for 4-way parallel functions, blowfish sees ~15% improvement in
speed tests on AMD Phenom II (~5% on Intel Xeon E7330).

However when used in 1-way blowfish function new macro would be ~10%
slower than original, so old F-macro is kept for 1-way functions.
Patch cleans up old F-macro as it is no longer needed in 4-way part.

Patch also does register macro renaming to reduce stack usage.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

e827bb09

22 Sep, 2011 · 2 commits

crypto: aes-x86 - quiet sparse noise about symbol not declared · 4a4cc2b6

Committed by H Hartley Sweeten on Sep 22, 2011

Include <asm/aes.h> to pick up the declarations for crypto_aes_encrypt_x86
and crypto_aes_decrypt_x86 to quiet the sparse noise:

warning: symbol 'crypto_aes_encrypt_x86' was not declared. Should it be static?
warning: symbol 'crypto_aes_decrypt_x86' was not declared. Should it be static?
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Mandeep Singh Baines <msb@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

4a4cc2b6

crypto: blowfish - add x86_64 assembly implementation · 64b94cea

Committed by Jussi Kivilinna on Sep 02, 2011

Patch adds x86_64 assembly implementation of blowfish. Two sets of assembler
functions are provided. The first set is the regular 'one block at a time'
encrypt/decrypt functions. The second is the 'four blocks at a time' functions that
gain performance on out-of-order CPUs. Performance of the 4-way
functions should be equal to the 1-way functions on in-order CPUs.

Summary of the tcrypt benchmarks:

Blowfish assembler vs blowfish C (256bit 8kb block ECB)
encrypt: 2.2x speed
decrypt: 2.3x speed

Blowfish assembler vs blowfish C (256bit 8kb block CBC)
encrypt: 1.12x speed
decrypt: 2.5x speed

Blowfish assembler vs blowfish C (256bit 8kb block CTR)
encrypt: 2.5x speed

Full output:
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-asm-x86_64.txt
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-c-x86_64.txt

Tests were run on:
 vendor_id	: AuthenticAMD
 cpu family	: 16
 model		: 10
 model name	: AMD Phenom(tm) II X6 1055T Processor
 stepping	: 0
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

64b94cea

10 Aug, 2011 · 1 commit

crypto: sha1 - SSSE3 based SHA1 implementation for x86-64 · 66be8951

Committed by Mathias Krause on Aug 04, 2011

This is an assembler implementation of the SHA1 algorithm using the
Supplemental SSE3 (SSSE3) instructions or, when available, the
Advanced Vector Extensions (AVX).

Testing with the tcrypt module shows the raw hash performance is up to
2.3 times faster than the C implementation, using 8k data blocks on a
Core 2 Duo T5500. For the smallest data set (16 bytes) it is still 25%
faster.

Since this implementation uses SSE/YMM registers it cannot safely be
used in every situation, e.g. while an IRQ interrupts a kernel thread.
The implementation falls back to the generic SHA1 variant, if using
the SSE/YMM registers is not possible.

With this algorithm I was able to increase the throughput of a single
IPsec link from 344 Mbit/s to 464 Mbit/s on a Core 2 Quad CPU using
the SSSE3 variant -- a speedup of +34.8%.

Saving and restoring SSE/YMM state might make the actual throughput
fluctuate when there are FPU intensive userland applications running.
For example, measuring the performance using iperf2 directly on the
machine under test gives wobbling numbers because iperf2 uses the FPU
for each packet to check if the reporting interval has expired (in the
above test I got min/max/avg: 402/484/464 MBit/s).

Using this algorithm on an IPsec gateway gives much more reasonable and
stable numbers, albeit not as high as in the directly connected case.
Here is the result from an RFC 2544 test run with an EXFO Packet Blazer
FTB-8510:

 frame size    sha1-generic     sha1-ssse3    delta
    64 byte     37.5 MBit/s    37.5 MBit/s     0.0%
   128 byte     56.3 MBit/s    62.5 MBit/s   +11.0%
   256 byte     87.5 MBit/s   100.0 MBit/s   +14.3%
   512 byte    131.3 MBit/s   150.0 MBit/s   +14.2%
  1024 byte    162.5 MBit/s   193.8 MBit/s   +19.3%
  1280 byte    175.0 MBit/s   212.5 MBit/s   +21.4%
  1420 byte    175.0 MBit/s   218.7 MBit/s   +25.0%
  1518 byte    150.0 MBit/s   181.2 MBit/s   +20.8%

The throughput for the largest frame size is lower than for the
previous size because the IP packets need to be fragmented in this
case to make their way through the IPsec tunnel.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Maxim Locktyukhin <maxim.locktyukhin@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

66be8951
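Note: the delta column in the RFC 2544 table above is the relative throughput change against sha1-generic. A minimal sketch of that arithmetic, using rows copied from the table (the helper name delta_percent is made up for this example):

```python
def delta_percent(baseline: float, improved: float) -> float:
    """Relative change of `improved` against `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# (frame size in bytes, sha1-generic MBit/s, sha1-ssse3 MBit/s), from the table above
rows = [
    (64, 37.5, 37.5),
    (256, 87.5, 100.0),
    (1280, 175.0, 212.5),
    (1518, 150.0, 181.2),
]

for size, generic, ssse3 in rows:
    print(f"{size:5d} byte  {delta_percent(generic, ssse3):+.1f}%")
```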

30 Jun, 2011 · 1 commit

crypto: ghash-intel - Fix set but not used in ghash_async_setkey() · c3e73e76

Committed by Gustavo F. Padovan on May 26, 2011

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

c3e73e76

18 May, 2011 · 1 commit

crypto: aesni-intel - fix aesni build on i386 · 9bed4aca

Committed by Randy Dunlap on May 18, 2011

Fix build error on i386 by moving function prototypes:

arch/x86/crypto/aesni-intel_glue.c: In function 'aesni_init':
arch/x86/crypto/aesni-intel_glue.c:1263: error: implicit declaration of function 'crypto_fpu_init'
arch/x86/crypto/aesni-intel_glue.c: In function 'aesni_exit':
arch/x86/crypto/aesni-intel_glue.c:1373: error: implicit declaration of function 'crypto_fpu_exit'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

9bed4aca

16 May, 2011 · 1 commit

crypto: aesni-intel - Merge with fpu.ko · b23b6451

Committed by Andy Lutomirski on May 16, 2011

Loading fpu without aesni-intel does nothing.  Loading aesni-intel
without fpu causes modes like xts to fail.  (Unloading
aesni-intel will restore those modes.)

One solution would be to make aesni-intel depend on fpu, but it
seems cleaner to just combine the modules.

This is probably responsible for bugs like:
https://bugzilla.redhat.com/show_bug.cgi?id=589390
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

b23b6451

27 Mar, 2011 · 1 commit

crypto: aesni-intel - fixed problem with packets that are not multiple of 64bytes · 60af520c

Committed by Tadeusz Struk on Mar 13, 2011

This patch fixes a problem with packets whose length is not a multiple of 64 bytes.
Signed-off-by: Adrian Hoban <adrian.hoban@intel.com>
Signed-off-by: Aidan O'Mahony <aidan.o.mahony@intel.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

60af520c

18 Mar, 2011 · 1 commit

x86: Fix common misspellings · 0d2eb44f

Committed by Lucas De Marchi on Mar 17, 2011

They were generated by 'codespell' and then manually reviewed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Cc: trivial@kernel.org
LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

0d2eb44f

16 Feb, 2011 · 1 commit

crypto: aesni-intel - Fix remaining leak in rfc4106_set_hash_key · fc9044e2

Committed by Jesper Juhl on Feb 16, 2011

Fix up the previous patch, which failed to properly fix the memory leak in
rfc4106_set_hash_subkey(). This add-on patch fixes the leak, moves
kfree() out of the error path, and returns -ENOMEM rather than -EINVAL when
ablkcipher_request_alloc() fails.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

fc9044e2

23 Jan, 2011 · 1 commit

crypto: aesni-intel - Don't leak memory in rfc4106_set_hash_subkey · 7efd95f6

Committed by Jesper Juhl on Jan 23, 2011

There's a small memory leak in
arch/x86/crypto/aesni-intel_glue.c::rfc4106_set_hash_subkey(). If the call
to kmalloc() fails and returns NULL then the memory allocated previously
by ablkcipher_request_alloc() is not freed when we leave the function.

I could have just added a call to ablkcipher_request_free() before we
return -ENOMEM, but that started to look too much like the code we
already had at the end of the function, so I chose instead to rework the
code a bit so that there are now a few labels at the end that we goto when
various allocations fail, so we don't have to repeat the same blocks of
code (this also reduces the object code size slightly).
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

7efd95f6

15 Dec, 2010 · 1 commit

crypto: ghash-intel - ghash-clmulni-intel_glue needs err.h · 52f6c5ad

Committed by Randy Dunlap on Dec 15, 2010

Add missing header file:

arch/x86/crypto/ghash-clmulni-intel_glue.c:256: error: implicit declaration of function 'IS_ERR'
arch/x86/crypto/ghash-clmulni-intel_glue.c:257: error: implicit declaration of function 'PTR_ERR'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

52f6c5ad

13 Dec, 2010 · 1 commit

crypto: aesni-intel - Fixed build with binutils 2.16 · 3c097b80

Committed by Tadeusz Struk on Dec 13, 2010

This patch fixes the problem with 2.16 binutils.
Signed-off-by: Aidan O'Mahony <aidan.o.mahony@intel.com>
Signed-off-by: Adrian Hoban <adrian.hoban@intel.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

3c097b80

29 Nov, 2010 · 1 commit

crypto: aesni-intel - Fixed build error on x86-32 · 559ad0ff

Committed by Mathias Krause on Nov 29, 2010

Exclude AES-GCM code for x86-32 due to heavy usage of 64-bit registers
not available on x86-32.

While at it, fixed unregister order in aesni_exit().
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

559ad0ff

27 Nov, 2010 · 1 commit

crypto: aesni-intel - Ported implementation to x86-32 · 0d258efb

Committed by Mathias Krause on Nov 27, 2010

The AES-NI instructions are also available in legacy mode so the 32-bit
architecture may profit from those, too.

To illustrate the performance gain here's a short summary of a dm-crypt
speed test on a Core i7 M620 running at 2.67GHz comparing both assembler
implementations:

x86:                   i586       aes-ni    delta
ECB, 256 bit:     93.8 MB/s   123.3 MB/s   +31.4%
CBC, 256 bit:     84.8 MB/s   262.3 MB/s  +209.3%
LRW, 256 bit:    108.6 MB/s   222.1 MB/s  +104.5%
XTS, 256 bit:    105.0 MB/s   205.5 MB/s   +95.7%

Additionally, due to some minor optimizations, the 64-bit version also
got a minor performance gain as seen below:

x86-64:           old impl.    new impl.    delta
ECB, 256 bit:    121.1 MB/s   123.0 MB/s    +1.5%
CBC, 256 bit:    285.3 MB/s   290.8 MB/s    +1.9%
LRW, 256 bit:    263.7 MB/s   265.3 MB/s    +0.6%
XTS, 256 bit:    251.1 MB/s   255.3 MB/s    +1.7%
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

0d258efb

13 Nov, 2010 · 1 commit

crypto: aesni-intel - RFC4106 AES-GCM Driver Using Intel New Instructions · 0bd82f5f

Committed by Tadeusz Struk on Nov 04, 2010

This patch adds an optimized RFC4106 AES-GCM implementation for 64-bit
kernels. It supports 128-bit AES key size. This leverages the crypto
AEAD interface type to facilitate a combined AES & GCM operation to
be implemented in assembly code. The assembly code leverages Intel(R)
AES New Instructions and the PCLMULQDQ instruction.
Signed-off-by: Adrian Hoban <adrian.hoban@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: Aidan O'Mahony <aidan.o.mahony@intel.com>
Signed-off-by: Erdinc Ozturk <erdinc.ozturk@intel.com>
Signed-off-by: James Guilford <james.guilford@intel.com>
Signed-off-by: Wajdi Feghali <wajdi.k.feghali@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

0bd82f5f

30 Mar, 2010 · 1 commit

include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6

Committed by Tejun Heo on Mar 24, 2010

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability.  As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and tries to put the new include such that its order conforms
  to its surroundings.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   widely available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

5a0e3ad6
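Note: the first rule of the sweep script described above (slab usage needs slab.h, gfp-only usage needs gfp.h) can be sketched roughly as follows. This is not the actual slabh-sweep.py. The regular expressions and the function name needed_include are assumptions made for illustration, and the real script's matching rules may differ.

```python
import re

# Hypothetical usage patterns; the real script's rules are not given in the commit message.
SLAB_RE = re.compile(r"\b(?:kmalloc|kzalloc|kcalloc|kfree|kmem_cache_\w+)\s*\(")
GFP_RE = re.compile(r"\b(?:__)?GFP_[A-Z_]+\b")


def needed_include(source: str):
    """Return the header a file should include directly, or None."""
    if SLAB_RE.search(source):
        # Slab users include slab.h, which the commit says pulls in gfp.h itself.
        return "linux/slab.h"
    if GFP_RE.search(source):
        # Only gfp flags are used, so gfp.h alone is enough.
        return "linux/gfp.h"
    return None


print(needed_include("buf = kmalloc(len, GFP_KERNEL);"))
print(needed_include("page = alloc_page(GFP_KERNEL);"))
print(needed_include("int x = 0;"))
```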

13 Mar, 2010 · 1 commit

crypto: aesni-intel - Fix CTR optimization build failure with gas 2.16.1 · 32cbd7df

Committed by Huang Ying on Mar 13, 2010

Andrew Morton reported that AES-NI CTR optimization failed to compile
with gas 2.16.1. The error message is as follows:

arch/x86/crypto/aesni-intel_asm.S: Assembler messages:
arch/x86/crypto/aesni-intel_asm.S:752: Error: suffix or operands invalid for `movq'
arch/x86/crypto/aesni-intel_asm.S:753: Error: suffix or operands invalid for `movq'

To fix this, a gas macro is defined to assemble movq with 64bit
general purpose registers and XMM registers. The macro will generate
the raw .byte sequence for needed instructions.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

32cbd7df

10 Mar, 2010 · 1 commit

crypto: aesni-intel - Add AES-NI accelerated CTR mode · 12387a46

Committed by Huang Ying on Mar 10, 2010

To take advantage of the hardware pipeline implementation of AES-NI
instructions, CTR mode encryption/decryption is implemented in ASM to schedule
multiple AES-NI instructions one after another. This way, some latency
of the AES-NI instructions can be eliminated.

Performance testing based on dm-crypt shows a 50% reduction of
encryption/decryption time.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

12387a46
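Note: CTR mode suits this kind of instruction scheduling because each keystream block depends only on the key and a counter value, never on a previous block's output. The sketch below illustrates that property only. toy_block_encrypt is a made-up stand-in built from SHA-256 and is NOT AES, and all names here are assumptions for the example, not kernel interfaces.

```python
import hashlib

BLOCK = 16  # block size in bytes


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for one AES block encryption (a truncated keyed hash, not a real cipher)."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def ctr_crypt(key: bytes, nonce: int, data: bytes) -> bytes:
    """Encrypt or decrypt `data` in CTR mode (the two operations are identical)."""
    out = bytearray()
    for index in range((len(data) + BLOCK - 1) // BLOCK):
        # Each counter block is independent, so several could be in flight at once.
        counter = ((nonce + index) % (1 << 128)).to_bytes(BLOCK, "big")
        keystream = toy_block_encrypt(key, counter)
        chunk = data[index * BLOCK:(index + 1) * BLOCK]
        out.extend(c ^ k for c, k in zip(chunk, keystream))
    return bytes(out)
```

Because no block waits on its predecessor, an implementation is free to interleave the work for several counter blocks, which is the scheduling opportunity the commit message refers to.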

09 Feb, 2010 · 1 commit

tree-wide: Assorted spelling fixes · 3ad2f3fb

Committed by Daniel Mack on Feb 03, 2010

In particular, several occurrences of funny versions of 'success',
'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address',
'beginning', 'desirable', 'separate' and 'necessary' are fixed.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: Joe Perches <joe@perches.com>
Cc: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

3ad2f3fb

23 Nov, 2009 · 3 commits

crypto: ghash-clmulni-intel - Put proper .data section in place · 68ee8716

Committed by Jiri Kosina on Nov 23, 2009

Lbswap_mask, Lpoly and Ltwo_one should clearly belong to
.data section, not .text.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

68ee8716

crypto: ghash-clmulni-intel - Use gas macro for PCLMULQDQ-NI and PSHUFB · 564ec0ec

Committed by Huang Ying on Nov 23, 2009

Old binutils do not support PCLMULQDQ-NI and PSHUFB. To allow the kernel
to be compiled with them, .byte code is used instead of assembly
instructions. But the readability and flexibility of raw .byte code is
not good.

So a gas macro that looks like the corresponding assembly instruction is used instead.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

564ec0ec

crypto: aesni-intel - Use gas macro for AES-NI instructions · b369e521

Committed by Huang Ying on Nov 23, 2009

Old binutils do not support AES-NI instructions. To allow the kernel to be
compiled with them, .byte code is used instead of AES-NI assembly
instructions. But the readability and flexibility of raw .byte code is
not good.

So a gas macro that looks like the corresponding assembly instruction is used instead.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

b369e521

03 Nov, 2009 · 2 commits

crypto: ghash-intel - Fix irq_fpu_usable usage · 01dd9582

Committed by Huang Ying on Nov 03, 2009

When renaming kernel_fpu_using to irq_fpu_usable, the semantics of the
function changed too: from measuring whether the kernel is using the FPU,
that is, the FPU is NOT available, to measuring whether the FPU is usable,
that is, the FPU is available.

But the usage of irq_fpu_usable in ghash-clmulni-intel_glue.c is not
changed accordingly. This patch fixes this.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

01dd9582
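Note: the inverted meaning described in the commit above is easy to get wrong when only the function name is swapped at a call site. A minimal model of the situation follows. The single fpu_busy flag and the pick_path_* helpers are simplifications invented for this sketch, not the kernel's actual logic.

```python
def kernel_fpu_using(fpu_busy: bool) -> bool:
    """Old-style predicate: True means the FPU is NOT available."""
    return fpu_busy


def irq_fpu_usable(fpu_busy: bool) -> bool:
    """New-style predicate: True means the FPU is available."""
    return not fpu_busy


def pick_path_old(fpu_busy: bool) -> str:
    return "fallback" if kernel_fpu_using(fpu_busy) else "accelerated"


def pick_path_buggy(fpu_busy: bool) -> str:
    # Name swapped, branch order left untouched: the choice is now inverted.
    return "fallback" if irq_fpu_usable(fpu_busy) else "accelerated"


def pick_path_fixed(fpu_busy: bool) -> str:
    # Branch order adjusted to match the new meaning of the predicate.
    return "accelerated" if irq_fpu_usable(fpu_busy) else "fallback"


for busy in (False, True):
    print(busy, pick_path_old(busy), pick_path_buggy(busy), pick_path_fixed(busy))
```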

crypto: ghash-intel - Add PSHUFB macros · 3b0d6596

Committed by Herbert Xu on Nov 03, 2009

Add PSHUFB macros instead of repeating byte sequences, suggested
by Ingo.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ingo Molnar <mingo@elte.hu>

3b0d6596

02 Nov, 2009 · 1 commit

crypto: ghash-intel - Hard-code pshufb · 2d06ef7f

Committed by Herbert Xu on Nov 01, 2009

Old gases don't have a clue what pshufb stands for so we have
to hard-code it for now.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

2d06ef7f

20 Oct, 2009 · 1 commit

crypto: aesni-intel - Fix irq_fpu_usable usage · 13b79b97

Committed by Huang Ying on Oct 20, 2009

When renaming kernel_fpu_using to irq_fpu_usable, the semantics of the
function changed too: from measuring whether the kernel is using the FPU,
that is, the FPU is NOT available, to measuring whether the FPU is usable,
that is, the FPU is available.

But the usage of irq_fpu_usable in aesni-intel_glue.c is not changed
accordingly. This patch fixes this.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

13b79b97

19 Oct, 2009 · 1 commit

crypto: ghash - Add PCLMULQDQ accelerated implementation · 0e1227d3

Committed by Huang Ying on Oct 19, 2009

PCLMULQDQ is used to accelerate the most time-consuming part of GHASH,
carry-less multiplication. More information about PCLMULQDQ can be
found at:

http://software.intel.com/en-us/articles/carry-less-multiplication-and-its-usage-for-computing-the-gcm-mode/

Because PCLMULQDQ changes XMM state, its usage must be enclosed with
kernel_fpu_begin/end, which can be used only in process context. The
acceleration is therefore implemented as crypto_ahash. That is, requests in soft
IRQ context will be deferred to the cryptd kernel thread.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

0e1227d3
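Note: carry-less multiplication, the operation the commit above accelerates, is ordinary shift-and-add multiplication with XOR in place of addition, so no carries propagate between bit positions. A small software sketch for non-negative integers (the function name clmul is chosen for this example; this is not how the kernel or the instruction implements it):

```python
def clmul(a: int, b: int) -> int:
    """Carry-less product of two non-negative integers (polynomials over GF(2))."""
    result = 0
    shift = 0
    while b:
        if b & 1:
            # XOR instead of add: partial products combine without carries.
            result ^= a << shift
        b >>= 1
        shift += 1
    return result


print(bin(clmul(0b11, 0b11)))  # (x + 1)^2 = x^2 + 1 over GF(2)
print(3 * 3)                   # ordinary product, for comparison
```

The carry-less product of two 64-bit operands fits in 127 bits, which is why a 128-bit register can hold the result.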

02 Sep, 2009 · 1 commit

x86: Move kernel_fpu_using to irq_fpu_usable in asm/i387.h · ae4b688d

Committed by Huang Ying on Aug 31, 2009

This function measures whether the FPU/SSE state can be touched in
interrupt context. If the interrupted code is in user space or has no
valid FPU/SSE context (CR0.TS == 1), FPU/SSE state can be used in IRQ
or soft_irq context too.

This is used by AES-NI accelerated AES implementation and PCLMULQDQ
accelerated GHASH implementation.

v3:
 - Renamed to irq_fpu_usable to reflect the purpose of the function.

v2:
 - Renamed to irq_is_fpu_using to reflect the real situation.
Signed-off-by: Huang Ying <ying.huang@intel.com>
CC: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

ae4b688d

24 Jun, 2009 · 1 commit

crypto: aes-ni - Don't print message with KERN_ERR on old system · c9944881

Committed by Roland Dreier on Jun 24, 2009

When the aes-intel module is loaded on a system that does not have the
AES instructions, it prints

    Intel AES-NI instructions are not detected.

at level KERN_ERR.  Since aes-intel is aliased to "aes" it will be tried
whenever anything uses AES and spam the console.  This doesn't match
existing practice for how to handle "no hardware" when initializing a
module, so downgrade the message to KERN_INFO.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

c9944881

18 Jun, 2009 · 3 commits

crypto: aes-ni - Remove CRYPTO_TFM_REQ_MAY_SLEEP from fpu template · b6f34d44

Committed by Huang Ying on Jun 18, 2009

kernel_fpu_begin/end used preempt_disable/enable, so sleep should be
prevented between kernel_fpu_begin/end.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

b6f34d44

crypto: aes-ni - Do not sleep when using the FPU · 9251b64f

Committed by Huang Ying on Jun 18, 2009

Because AES-NI instructions will touch XMM state, corresponding code
must be enclosed within kernel_fpu_begin/end, which used
preempt_disable/enable. So sleep should be prevented between
kernel_fpu_begin/end.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

9251b64f

crypto: aes-ni - Fix cbc mode IV saving · e6efaa02

Committed by Huang Ying on Jun 18, 2009

The original implementation of aesni_cbc_dec does not save the IV if input
length % 4 == 0. This makes decryption of the next block fail.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

e6efaa02
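Note: in CBC mode the IV for a follow-up call is the last ciphertext block of the previous call, so a decrypt routine that does not hand it back breaks chained decryption. The sketch below shows that dependency with a toy XOR "cipher". toy_block and the other names are assumptions for illustration and have nothing to do with the AES-NI assembly; the specific length condition from the commit is not modelled.

```python
BLOCK = 16  # block size in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_block(key: bytes, block: bytes) -> bytes:
    """Toy stand-in for a block cipher: XOR with the key (its own inverse, NOT secure)."""
    return xor(block, key)


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes):
    out, prev = bytearray(), iv
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_block(key, xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return bytes(out), prev


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes):
    """Return (plaintext, next_iv); next_iv must be kept for the following call."""
    out, prev = bytearray(), iv
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out += xor(toy_block(key, block), prev)
        prev = block  # the next IV is the last ciphertext block seen
    return bytes(out), prev
```

Decrypting a message in two calls gives the right result only when the second call receives the next_iv returned by the first.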

02 Jun, 2009 · 2 commits

crypto: aes-ni - Add support for more modes · 2cf4ac8b

Committed by Huang Ying on Mar 29, 2009

Because kernel_fpu_begin() and kernel_fpu_end() operations are too
slow, the performance gain of general mode implementation + aes-aesni
is almost entirely cancelled out.

The AES-NI support for more modes is implemented as follows:

- Add a new AES algorithm implementation named __aes-aesni without
  kernel_fpu_begin/end()

- Use fpu(<mode>(AES)) to provide the kernel_fpu_begin/end() calls

- Add <mode>(AES) ablkcipher, which uses cryptd(fpu(<mode>(AES))) to
  defer cryption to cryptd context in soft_irq context.

Support for ctr, lrw, pcbc and xts is now added.

Performance testing based on dm-crypt shows that cryption time can be
reduced to 50% of general mode implementation + aes-aesni implementation.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

2cf4ac8b

crypto: fpu - Add template for blkcipher touching FPU · 150c7e85

Committed by Huang Ying on Mar 29, 2009

Blkcipher touching FPU need to be enclosed by kernel_fpu_begin() and
kernel_fpu_end(). If they are invoked in cipher algorithm
implementation, they will be invoked for each block, so that
performance will be hurt, because they are "slow" operations. This
patch implements "fpu" template, which makes these operations to be
invoked for each request.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

150c7e85
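Note: the point of the "fpu" template above is to pay the begin/end cost once per request rather than once per block. The sketch below only counts how often a begin/end pair would run in each arrangement. FpuSection, transform and both process_* functions are hypothetical names for this illustration and do not mirror the kernel's interfaces.

```python
class FpuSection:
    """Counts entries into a (hypothetical) FPU save/restore section."""

    def __init__(self):
        self.entries = 0

    def __enter__(self):
        self.entries += 1  # stands in for the "slow" begin operation
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # stands in for the matching end operation


def transform(block: bytes) -> bytes:
    """Placeholder for the per-block cipher work."""
    return block[::-1]


def process_per_block(blocks, fpu):
    out = []
    for block in blocks:
        with fpu:  # begin/end around every single block
            out.append(transform(block))
    return out


def process_per_request(blocks, fpu):
    with fpu:  # begin/end once around the whole request
        return [transform(block) for block in blocks]
```

Both arrangements produce the same output; for a request of eight blocks the first enters the section eight times and the second once.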

18 Feb, 2009 · 2 commits

crypto: aes-ni - Add support to Intel AES-NI instructions for x86_64 platform · 54b6a1bd

Committed by Huang Ying on Jan 18, 2009

Intel AES-NI is a new set of Single Instruction Multiple Data (SIMD)
instructions that are going to be introduced in the next generation of
Intel processor, as of 2009. These instructions enable fast and secure
data encryption and decryption, using the Advanced Encryption Standard
(AES), defined by FIPS Publication number 197.  The architecture
introduces six instructions that offer full hardware support for
AES. Four of them support high performance data encryption and
decryption, and the other two instructions support the AES key
expansion procedure.

The white paper can be downloaded from:

http://softwarecommunity.intel.com/isn/downloads/intelavx/AES-Instructions-Set_WP.pdf

AES may be used in soft_irq context, but MMX/SSE context can not be
touched safely in soft_irq context. So in_interrupt() is checked, if
in IRQ or soft_irq context, the general x86_64 implementation is used
instead.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

54b6a1bd

crypto: aes - Export x86 AES encrypt/decrypt functions · 07bf44f8

Committed by Huang Ying on Jan 09, 2009

Intel AES-NI AES acceleration instructions touch XMM state, to use
that in soft_irq context, general x86 AES implementation is used as
fallback. The first parameter is changed from struct crypto_tfm * to
struct crypto_aes_ctx * to make it easier to deal with 16 bytes
alignment requirement of AES-NI implementation.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

07bf44f8

OpenHarmony / kernel_linux · last synced about 4 years ago
