- 25 August 2018, 1 commit
-
-
By Dave Watson

A regression was reported bisecting to 1476db2d ("Move HashKey computation from stack to gcm_context"). That diff moved HashKey computation from the stack, which was explicitly aligned in the asm, to a struct provided from the C code, depending on AESNI_ALIGN_ATTR for alignment. It appears some compilers may not align this struct correctly, resulting in a crash on the movdqa instruction when attempting to encrypt or decrypt data.

Fix by using unaligned loads for the HashKeys. On modern hardware there is no perf difference between the unaligned and aligned loads. All other accesses to gcm_context_data already use unaligned loads.

Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Fixes: 1476db2d ("Move HashKey computation from stack to gcm_context")
Cc: <stable@vger.kernel.org>
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
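The fix pattern is simply swapping the aligned SSE load for its unaligned form when fetching the precomputed HashKeys; a minimal sketch (the register choice and the HashKey_4 offset name are illustrative assumptions, not the exact kernel code):

    # before: movdqa faults unless gcm_context_data is 16-byte aligned
    movdqa  HashKey_4(%arg2), %xmm13

    # after: movdqu accepts any alignment, at no cost on modern CPUs
    movdqu  HashKey_4(%arg2), %xmm13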
-
- 03 July 2018, 1 commit
-
-
By Jan Beulich

Some Intel CPUs don't recognize 64-bit XORs as zeroing idioms. Zeroing idioms don't require execution bandwidth, as they're being taken care of in the frontend (through register renaming). Use 32-bit XORs instead.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davem@davemloft.net
Cc: herbert@gondor.apana.org.au
Cc: pavel@ucw.cz
Cc: rjw@rjwysocki.net
Link: http://lkml.kernel.org/r/5B39FF1A02000078001CFB54@prv1-mh.provo.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
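In 64-bit mode, writing a 32-bit register implicitly zero-extends into the full 64-bit register, so the shorter form clears exactly as much state. A sketch of the substitution:

    # before: not recognized as a zeroing idiom by some Intel CPUs
    xor  %rax, %rax

    # after: same effect (upper 32 bits are zero-extended), but
    # eliminated in the frontend via register renaming
    xor  %eax, %eax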
-
- 22 February 2018, 13 commits
-
-
By Dave Watson

The asm macros are all set up now; introduce entry points. GCM_INIT and GCM_COMPLETE have arguments supplied, so that the new scatter/gather entry points take only the arguments they need.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
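Such an entry point is a thin wrapper around the corresponding macro; a hedged sketch of the shape (the symbol name and the meaning of the %arg aliases are assumptions for illustration):

    ENTRY(aesni_gcm_init)
        FUNC_SAVE
        GCM_INIT %arg3, %arg4, %arg5, %arg6   # IV, subkey, AAD, AAD length
        FUNC_RESTORE
        ret
    ENDPROC(aesni_gcm_init)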
-
By Dave Watson

We can fast-path any < 16 byte read if the full message is > 16 bytes, and shift over by the appropriate amount. Usually we are reading > 16 bytes, so this should be faster than the READ_PARTIAL macro introduced in b20209c9 for the average case.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
By Dave Watson

Before this diff, multiple calls to GCM_ENC_DEC will succeed, but only if all calls are a multiple of 16 bytes. Handle partial blocks at the start of GCM_ENC_DEC, and update aadhash as appropriate. The data offset %r11 is also updated after the partial block.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
By Dave Watson

HashKey computation only needs to happen once per scatter/gather operation; save it between calls in the gcm_context struct instead of on the stack. Since the asm no longer stores anything on the stack, we can use %rsp directly, and clean up the frame save/restore macros a bit.

HashKeys actually only need to be calculated once per key and could be computed when set_key is called; however, the current glue code falls back to the generic AES code if the FPU is disabled.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
By Dave Watson

Prepare to handle partial blocks between scatter/gather calls. For the last partial block, we only want to calculate the aadhash in GCM_COMPLETE, and a new partial block macro will handle both aadhash update and encrypting partial blocks between calls.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
By Dave Watson

Fill in aadhash, aadlen, pblocklen, and curcount with appropriate values. pblocklen, aadhash, and pblockenckey are also updated at the end of each scatter/gather operation, to be carried over to the next operation.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
By Dave Watson

AAD hash only needs to be calculated once for each scatter/gather operation. Move it to its own macro, and call it from GCM_INIT instead of INITIAL_BLOCKS.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
By Dave Watson

Introduce a gcm_context_data struct that will be used to pass context data between scatter/gather update calls. It is passed as the second argument (after crypto keys); the other arguments are renumbered.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
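On the assembly side, such a context struct is typically mirrored as a block of byte offsets into the pointer argument; a minimal sketch (field names and offsets are illustrative assumptions, not the exact kernel layout):

    # offsets into the gcm_context_data block passed in %arg2
    #define AadHash   16*0    # running AAD/GHASH accumulator
    #define AadLen    16*1    # AAD length
    #define InLen     16*1+8  # total plaintext/ciphertext length so far
    #define CurCount  16*2    # current counter block

        movdqu  AadHash(%arg2), %xmm8   # resume the running hash
        # ... process this update call ...
        movdqu  %xmm8, AadHash(%arg2)   # save it back for the next call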
-
By Dave Watson

Make a macro for the main encode/decode routine. Only a small handful of lines differ for enc and dec. This will also become the main scatter/gather update routine.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
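The few differing lines can be selected with gas string conditionals inside one macro. A hedged sketch of the pattern (registers are illustrative; the real difference in GCM is that decryption must feed the ciphertext, i.e. the input, into GHASH):

    .macro GCM_ENC_DEC operation
        movdqu  (%rsi), %xmm1       # load one input block
        movdqa  %xmm1, %xmm2        # keep a copy of the input
        pxor    %xmm0, %xmm1        # XOR with the encrypted counter block
        movdqu  %xmm1, (%rdi)       # store the output block
    .ifc \operation, dec
        movdqa  %xmm2, %xmm1        # GHASH must see the ciphertext
    .endif
        # ... fold %xmm1 into the running GHASH, advance pointers ...
    .endm

    # expanded once inside each direction's function: GCM_ENC_DEC enc / dec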
-
By Dave Watson

Merge encode and decode tag calculations in the GCM_COMPLETE macro. Scatter/gather routines will call this once at the end of encryption or decryption.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
By Dave Watson

Reduce code duplication by introducing the GCM_INIT macro. This macro will also be exposed as a function for implementing scatter/gather support, since INIT only needs to be called once for the full operation.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
By Dave Watson

Macro-ify function save and restore. These will be used in new functions added for scatter/gather update operations.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
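A save/restore pair is naturally expressed as two matching macros so every entry point stays consistent; a minimal sketch (the exact register set is an assumption):

    .macro FUNC_SAVE
        push  %r12
        push  %r13
        push  %r14
        # stack realignment would go here while aligned spills are still used
    .endm

    .macro FUNC_RESTORE
        pop   %r14
        pop   %r13
        pop   %r12
    .endm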
-
By Dave Watson

Use macro operations to merge the implementations of INITIAL_BLOCKS, since they differ by only a small handful of lines. Use the macro counter \@ to simplify the implementation.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
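gas expands \@ to a counter that differs for each macro execution, so labels defined inside a macro don't collide when the macro is expanded more than once. A small sketch:

    .macro INITIAL_BLOCKS num_blocks
        cmp   $0, %r13
        je    .Lzero_len\@      # \@ makes this label unique per expansion
        # ... process \num_blocks initial blocks ...
    .Lzero_len\@:
    .endm

        INITIAL_BLOCKS 4    # each expansion gets a distinct \@ value,
        INITIAL_BLOCKS 8    # so there is no duplicate-label error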
-
- 12 January 2018, 1 commit
-
-
By David Woodhouse

Convert all indirect jumps in crypto assembler code to use non-speculative sequences when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-6-git-send-email-dwmw@amazon.co.uk
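A retpoline routes the indirect branch through a return whose speculative path is trapped in a harmless spin loop. The kernel uses the CALL_NOSPEC/JMP_NOSPEC macros rather than open-coding this, but the underlying sequence looks roughly like the following sketch (register choice illustrative):

    # before: speculatable indirect call
    call  *%r11

    # after: the real target comes from the overwritten return address;
    # mispredicted speculation is captured in the pause/lfence loop
        call  .Ldo_call
    .Lspec_trap:
        pause
        lfence
        jmp   .Lspec_trap
    .Ldo_call:
        mov   %r11, (%rsp)
        ret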
-
- 28 December 2017, 2 commits
-
-
By Junaid Shahid

The aesni_gcm_enc/dec functions can access memory after the end of the AAD buffer if the AAD length is not a multiple of 4 bytes. It didn't matter with rfc4106-gcm-aesni, as in that case the AAD was always followed by the 8-byte IV, but that is no longer the case with generic-gcm-aesni. This can potentially result in accessing a page that is not mapped and thus causing the machine to crash. This patch fixes that by reading the last <16 byte block of the AAD byte-by-byte, and optionally via an 8-byte load if the block was at least 8 bytes.

Fixes: 0487ccac ("crypto: aesni - make non-AVX AES-GCM work with any aadlen")
Cc: <stable@vger.kernel.org>
Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
By Junaid Shahid

The aesni_gcm_enc/dec functions can access memory before the start of the data buffer if the length of the data buffer is less than 16 bytes. This is because they perform the read via a single 16-byte load. This can potentially result in accessing a page that is not mapped and thus causing the machine to crash. This patch fixes that by reading the partial block byte-by-byte, and optionally via an 8-byte load if the block was at least 8 bytes.

Fixes: 0487ccac ("crypto: aesni - make non-AVX AES-GCM work with any aadlen")
Cc: <stable@vger.kernel.org>
Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
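Both fixes use the same pattern: an 8-byte load when at least 8 bytes remain, then a byte loop that walks backwards from the tail, accumulating into a general-purpose register before moving the result into an XMM register. A condensed sketch of the sub-8-byte case (register assignments are illustrative):

    # %rdi = source, %rcx = remaining length (1..7)
        xor   %eax, %eax
    .Lread_next_byte:
        shl   $8, %rax
        mov   -1(%rdi, %rcx, 1), %al   # src[len-1], then src[len-2], ...
        sub   $1, %rcx
        jnz   .Lread_next_byte
        movq  %rax, %xmm0              # little-endian partial block, zero-padded

    # never touches a byte at or beyond %rdi + len, so no page overrun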
-
- 18 May 2017, 2 commits
-
-
By Sabrina Dubroca

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
By Sabrina Dubroca

This is the first step to make the aesni AES-GCM implementation generic. The current code was written for rfc4106, so it handles only some specific sizes of associated data.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 23 January 2017, 1 commit
-
-
By Denys Vlasenko

A lot of asm-optimized routines in arch/x86/crypto/ keep their constants in .data. This is wrong; they should be in .rodata. Many of these constants are the same in different modules. For example, the 128-bit shuffle mask 0x000102030405060708090A0B0C0D0E0F exists in at least half a dozen places. There is a way to let the linker merge them and use just one copy.

The rules are as follows: mergeable objects of different sizes should not share sections. You can't put them all in one .rodata section; they will lose "mergeability". GCC puts its mergeable constants in ".rodata.cstSIZE" sections, or ".rodata.cstSIZE.<object_name>" if -fdata-sections is used. This patch does the same:

    .section .rodata.cst16.SHUF_MASK, "aM", @progbits, 16

It is important that all data in such a section consists of 16-byte elements, not larger ones, and that there is no implicit use of one element from another. When this is not the case, use a non-mergeable section:

    .section .rodata[.VAR_NAME], "a", @progbits

This reduces .data by ~15 kbytes:

        text    data     bss      dec     hex filename
    11097415 2705840 2630712 16433967  fac32f vmlinux-prev.o
    11112095 2690672 2630712 16433479  fac147 vmlinux.o

Merged objects are visible in System.map:

    ffffffff81a28810 r POLY
    ffffffff81a28810 r POLY
    ffffffff81a28820 r TWOONE
    ffffffff81a28820 r TWOONE
    ffffffff81a28830 r PSHUFFLE_BYTE_FLIP_MASK <- merged regardless of
    ffffffff81a28830 r SHUF_MASK   <------------- the name difference
    ffffffff81a28830 r SHUF_MASK
    ..
    ffffffff81a28d00 r K512 <- merged three identical 640-byte tables
    ffffffff81a28d00 r K512
    ffffffff81a28d00 r K512

Use of object names in section name suffixes is not strictly necessary, but might help if someday the link stage uses garbage collection to eliminate unused sections (ld --gc-sections).

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: Josh Poimboeuf <jpoimboe@redhat.com>
CC: Xiaodong Liu <xiaodong.liu@intel.com>
CC: Megha Dey <megha.dey@intel.com>
CC: linux-crypto@vger.kernel.org
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
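Applied to one such constant, the before/after looks roughly like this sketch (the label and mask value are the ones quoted above; the surrounding context is an assumption):

    # before: writable, and duplicated in every module that defines it
        .data
    SHUF_MASK:
        .octa 0x000102030405060708090A0B0C0D0E0F

    # after: read-only, and identical 16-byte entries across modules
    # are folded by the linker into a single copy
        .section .rodata.cst16.SHUF_MASK, "aM", @progbits, 16
        .align 16
    SHUF_MASK:
        .octa 0x000102030405060708090A0B0C0D0E0F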
-
- 24 February 2016, 2 commits
-
-
By Josh Poimboeuf

The crypto code has several callable non-leaf functions which don't honor CONFIG_FRAME_POINTER, which can result in bad stack traces. Create stack frames for them when CONFIG_FRAME_POINTER is enabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/6c20192bcf1102ae18ae5a242cabf30ce9b29895.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
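The kernel's asm/frame.h provides FRAME_BEGIN/FRAME_END helpers that expand to a standard frame-pointer prologue/epilogue under CONFIG_FRAME_POINTER and to nothing otherwise. A sketch of how a non-leaf asm function uses them (function and callee names are hypothetical):

    #include <asm/frame.h>

    ENTRY(aesni_helper)
        FRAME_BEGIN            # push %rbp; mov %rsp, %rbp (if enabled)
        # ...
        call  _expand_key      # non-leaf: calls another function
        # ...
        FRAME_END
        ret
    ENDPROC(aesni_helper)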
-
By Josh Poimboeuf

stacktool reports the following warning:

    stacktool: arch/x86/crypto/aesni-intel_asm.o: _aesni_inc_init(): can't find starting instruction

stacktool gets confused when it tries to disassemble the following data in the .text section:

    .Lbswap_mask:
        .byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0

Move it to .rodata, which is a more appropriate section for read-only data.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/b6a2f3f8bda705143e127c025edb2b53c86e6eb4.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
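The fix amounts to a one-line section change ahead of the data; a sketch (alignment directive is an assumption):

        .section .rodata       # was implicitly in .text after the code above
        .align 16
    .Lbswap_mask:
        .byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0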
-
- 14 January 2015, 1 commit
-
-
By Timothy McCaffrey

These patches fix the RFC4106 implementation in the aesni-intel module so it supports 192- and 256-bit keys. Since the AVX support that was added to this module also only supports 128-bit keys, and this patch only affects the SSE implementation, changes were also made to use the SSE version if key sizes other than 128 are specified. RFC4106 specifies that 192- and 256-bit keys must be supported (section 8.4).

Also, this should fix Strongswan issue 341, where the aesni module needs to be unloaded if 256-bit keys are used:

    http://wiki.strongswan.org/issues/341

This patch has been tested with Sandy Bridge and Haswell processors. With 128-bit keys and input buffers > 512 bytes, a slight performance degradation was noticed (~1%). For input buffers of less than 512 bytes there was no performance impact. Compared to 128-bit keys, 256-bit key size performance is approx. 0.5 cycles per byte slower on Sandy Bridge, and 0.37 cycles per byte slower on Haswell (vs. SSE code).

This patch has also been tested with StrongSwan IPsec connections, where it worked correctly.

I created this diff from a git clone of crypto-2.6.git. Any questions, please feel free to contact me.

Signed-off-by: Timothy McCaffrey <timothy.mccaffrey@unisys.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 13 June 2013, 1 commit
-
-
By Jussi Kivilinna

The new XTS code for aesni_intel uses input buffers directly as memory operands for pxor instructions, which causes a crash if those buffers are not aligned to 16 bytes. This patch changes the XTS code to handle unaligned memory correctly, by loading it with movdqu instead.

Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
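Legacy SSE instructions with a memory operand, pxor included, require 16-byte alignment; the fix is to load through a register with movdqu first. A sketch (register choice illustrative):

    # before: faults if %rdx is not 16-byte aligned
    pxor    (%rdx), %xmm0

    # after: unaligned-safe load, then register-register xor
    movdqu  (%rdx), %xmm1
    pxor    %xmm1, %xmm0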
-
- 25 April 2013, 1 commit
-
-
By Jussi Kivilinna

Add more optimized XTS code for aesni_intel in 64-bit mode, for smaller stack usage and a boost in speed.

tcrypt results, with Intel i5-2450M:

    256-bit key
            enc     dec
    16B     0.98x   0.99x
    64B     0.64x   0.63x
    256B    1.29x   1.32x
    1024B   1.54x   1.58x
    8192B   1.57x   1.60x

    512-bit key
            enc     dec
    16B     0.98x   0.99x
    64B     0.60x   0.59x
    256B    1.24x   1.25x
    1024B   1.39x   1.42x
    8192B   1.38x   1.42x

I chose not to optimize smaller than a block size of 256 bytes, since XTS is practically always used with data blocks of size 512 bytes. This is why performance is reduced in tcrypt for 64-byte long blocks.

Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 20 January 2013, 1 commit
-
-
By Jussi Kivilinna

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 31 May 2012, 1 commit
-
-
By Mathias Krause

The 32-bit variant of cbc(aes) decrypt is using instructions requiring 128-bit aligned memory locations but fails to ensure this constraint in the code. Fix this by loading the data into intermediate registers with load unaligned instructions. This fixes reported general protection faults related to aesni.

References: https://bugzilla.kernel.org/show_bug.cgi?id=43223
Reported-by: Daniel <garkein@mailueberfall.de>
Cc: stable@kernel.org [v2.6.39+]
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 27 March 2011, 1 commit
-
-
By Tadeusz Struk

This patch fixes a problem with packets that are not a multiple of 64 bytes.

Signed-off-by: Adrian Hoban <adrian.hoban@intel.com>
Signed-off-by: Aidan O'Mahony <aidan.o.mahony@intel.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 18 March 2011, 1 commit
-
-
By Lucas De Marchi

These spelling fixes were generated by 'codespell' and then manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Cc: trivial@kernel.org
LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 13 December 2010, 1 commit
-
-
By Tadeusz Struk

This patch fixes the problem with 2.16 binutils.

Signed-off-by: Aidan O'Mahony <aidan.o.mahony@intel.com>
Signed-off-by: Adrian Hoban <adrian.hoban@intel.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 29 November 2010, 1 commit
-
-
By Mathias Krause

Exclude the AES-GCM code for x86-32 due to heavy usage of 64-bit registers not available on x86-32. While at it, fixed the unregister order in aesni_exit().

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 27 November 2010, 1 commit
-
-
By Mathias Krause

The AES-NI instructions are also available in legacy mode, so the 32-bit architecture may profit from those, too.

To illustrate the performance gain, here's a short summary of a dm-crypt speed test on a Core i7 M620 running at 2.67GHz, comparing both assembler implementations:

    x86:            i586         aes-ni       delta
    ECB, 256 bit:    93.8 MB/s   123.3 MB/s   +31.4%
    CBC, 256 bit:    84.8 MB/s   262.3 MB/s  +209.3%
    LRW, 256 bit:   108.6 MB/s   222.1 MB/s  +104.5%
    XTS, 256 bit:   105.0 MB/s   205.5 MB/s   +95.7%

Additionally, due to some minor optimizations, the 64-bit version also got a minor performance gain, as seen below:

    x86-64:         old impl.    new impl.    delta
    ECB, 256 bit:   121.1 MB/s   123.0 MB/s    +1.5%
    CBC, 256 bit:   285.3 MB/s   290.8 MB/s    +1.9%
    LRW, 256 bit:   263.7 MB/s   265.3 MB/s    +0.6%
    XTS, 256 bit:   251.1 MB/s   255.3 MB/s    +1.7%

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 13 November 2010, 1 commit
-
-
By Tadeusz Struk

This patch adds an optimized RFC4106 AES-GCM implementation for 64-bit kernels. It supports the 128-bit AES key size. This leverages the crypto AEAD interface type to facilitate a combined AES & GCM operation to be implemented in assembly code. The assembly code leverages Intel(R) AES New Instructions and the PCLMULQDQ instruction.

Signed-off-by: Adrian Hoban <adrian.hoban@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: Aidan O'Mahony <aidan.o.mahony@intel.com>
Signed-off-by: Erdinc Ozturk <erdinc.ozturk@intel.com>
Signed-off-by: James Guilford <james.guilford@intel.com>
Signed-off-by: Wajdi Feghali <wajdi.k.feghali@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
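PCLMULQDQ computes a 64x64 -> 128-bit carry-less product, with the immediate selecting which qword of each operand participates; four such multiplies give the 128x128-bit product that GHASH needs. A sketch of the schoolbook multiply, with the final reduction modulo the GHASH polynomial omitted (register use is an illustrative assumption):

    # %xmm0 = a, %xmm1 = b (hash subkey); compute a*b carry-less
        movdqa    %xmm0, %xmm3
        pclmulqdq $0x00, %xmm1, %xmm3   # a0*b0 (low halves)
        movdqa    %xmm0, %xmm4
        pclmulqdq $0x11, %xmm1, %xmm4   # a1*b1 (high halves)
        movdqa    %xmm0, %xmm5
        pclmulqdq $0x01, %xmm1, %xmm5   # a1*b0 (cross term)
        movdqa    %xmm0, %xmm6
        pclmulqdq $0x10, %xmm1, %xmm6   # a0*b1 (cross term)
        pxor      %xmm6, %xmm5          # combine the middle terms
        movdqa    %xmm5, %xmm6
        pslldq    $8, %xmm5             # middle low -> top of low word
        psrldq    $8, %xmm6             # middle high -> bottom of high word
        pxor      %xmm5, %xmm3          # xmm3 = low 128 bits of product
        pxor      %xmm6, %xmm4          # xmm4 = high 128 bits of product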
-
- 13 March 2010, 1 commit
-
-
By Huang Ying

Andrew Morton reported that the AES-NI CTR optimization failed to compile with gas 2.16.1; the error messages are as follows:

    arch/x86/crypto/aesni-intel_asm.S: Assembler messages:
    arch/x86/crypto/aesni-intel_asm.S:752: Error: suffix or operands invalid for `movq'
    arch/x86/crypto/aesni-intel_asm.S:753: Error: suffix or operands invalid for `movq'

To fix this, a gas macro is defined to assemble movq with 64-bit general purpose registers and XMM registers. The macro generates the raw .byte sequence for the needed instructions.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
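The idea is to emit the instruction's encoding directly so the old assembler never sees the mnemonic. A minimal sketch for one fixed register pair (an assumption for illustration; the kernel's actual macros compute the ModRM byte from their register arguments):

    .macro MOVQ_RAX_XMM0
        # movq %rax, %xmm0 == 66 (opsize) 48 (REX.W) 0f 6e c0 (ModRM)
        .byte 0x66, 0x48, 0x0f, 0x6e, 0xc0
    .endm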
-
- 10 March 2010, 1 commit
-
-
By Huang Ying

To take advantage of the hardware pipeline implementation of the AES-NI instructions, CTR mode encryption/decryption is implemented in ASM to schedule multiple AES-NI instructions one after another. This way, some of the latency of the AES-NI instructions can be hidden. Performance testing based on dm-crypt showed a 50% reduction of encryption/decryption time.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
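Because aesenc has multi-cycle latency but can be issued back-to-back, processing several independent counter blocks per round key keeps the pipeline full. A sketch of the per-round interleaving (register assignments and the four-way width are illustrative assumptions):

    # %xmm0-%xmm3 = four independent counter blocks,
    # %r10 = key schedule pointer, %r11d = rounds left
    .Lround_loop:
        movaps  (%r10), %xmm4     # load this round's key once
        aesenc  %xmm4, %xmm0      # these four aesenc ops have no data
        aesenc  %xmm4, %xmm1      # dependency on each other, so their
        aesenc  %xmm4, %xmm2      # latencies overlap in the pipeline
        aesenc  %xmm4, %xmm3
        add     $16, %r10
        dec     %r11d
        jnz     .Lround_loop

The final round would use aesenclast, and the encrypted counters are then XORed with the plaintext.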
-
- 23 November 2009, 1 commit
-
-
By Huang Ying

Old binutils do not support the AES-NI instructions. So that the kernel can still be compiled with them, .byte code is used instead of the AES-NI assembly mnemonics. But the readability and flexibility of raw .byte code is not good. So gas macros that look like the corresponding assembly instructions are used instead.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 18 June 2009, 1 commit
-
-
By Huang Ying

The original implementation of aesni_cbc_dec does not save the IV if input length % 4 == 0. This makes decryption of the next block fail.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 18 February 2009, 1 commit
-
-
By Huang Ying

Intel AES-NI is a new set of Single Instruction Multiple Data (SIMD) instructions that are going to be introduced in the next generation of Intel processors, as of 2009. These instructions enable fast and secure data encryption and decryption, using the Advanced Encryption Standard (AES) defined by FIPS Publication number 197. The architecture introduces six instructions that offer full hardware support for AES. Four of them support high-performance data encryption and decryption, and the other two instructions support the AES key expansion procedure.

The white paper can be downloaded from:

    http://softwarecommunity.intel.com/isn/downloads/intelavx/AES-Instructions-Set_WP.pdf

AES may be used in soft_irq context, but MMX/SSE context cannot be touched safely in soft_irq context. So in_interrupt() is checked; in IRQ or soft_irq context, the general x86_64 implementation is used instead.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-