1. 17 Jul, 2015 4 commits
    • crypto: poly1305 - Add a SSE2 SIMD variant for x86_64 · c70f4abe
      Martin Willi committed
      Implements an x86_64 assembler driver for the Poly1305 authenticator. This
      single block variant holds the 130-bit integer in 5 32-bit words, but uses
      SSE to do two multiplications/additions in parallel.
      
      When calling updates with small blocks, the overhead of kernel_fpu_begin()/
      kernel_fpu_end() negates the performance gain. We therefore use the
      poly1305-generic fallback for small updates.
      
      For large messages, throughput increases by ~5-10% compared to
      poly1305-generic:
      
      testing speed of poly1305 (poly1305-generic)
      test  0 (   96 byte blocks,   16 bytes per update,   6 updates): 4080026 opers/sec,  391682496 bytes/sec
      test  1 (   96 byte blocks,   32 bytes per update,   3 updates): 6221094 opers/sec,  597225024 bytes/sec
      test  2 (   96 byte blocks,   96 bytes per update,   1 updates): 9609750 opers/sec,  922536057 bytes/sec
      test  3 (  288 byte blocks,   16 bytes per update,  18 updates): 1459379 opers/sec,  420301267 bytes/sec
      test  4 (  288 byte blocks,   32 bytes per update,   9 updates): 2115179 opers/sec,  609171609 bytes/sec
      test  5 (  288 byte blocks,  288 bytes per update,   1 updates): 3729874 opers/sec, 1074203856 bytes/sec
      test  6 ( 1056 byte blocks,   32 bytes per update,  33 updates):  593000 opers/sec,  626208000 bytes/sec
      test  7 ( 1056 byte blocks, 1056 bytes per update,   1 updates): 1081536 opers/sec, 1142102332 bytes/sec
      test  8 ( 2080 byte blocks,   32 bytes per update,  65 updates):  302077 opers/sec,  628320576 bytes/sec
      test  9 ( 2080 byte blocks, 2080 bytes per update,   1 updates):  554384 opers/sec, 1153120176 bytes/sec
      test 10 ( 4128 byte blocks, 4128 bytes per update,   1 updates):  278715 opers/sec, 1150536345 bytes/sec
      test 11 ( 8224 byte blocks, 8224 bytes per update,   1 updates):  140202 opers/sec, 1153022070 bytes/sec
      
      testing speed of poly1305 (poly1305-simd)
      test  0 (   96 byte blocks,   16 bytes per update,   6 updates): 3790063 opers/sec,  363846076 bytes/sec
      test  1 (   96 byte blocks,   32 bytes per update,   3 updates): 5913378 opers/sec,  567684355 bytes/sec
      test  2 (   96 byte blocks,   96 bytes per update,   1 updates): 9352574 opers/sec,  897847104 bytes/sec
      test  3 (  288 byte blocks,   16 bytes per update,  18 updates): 1362145 opers/sec,  392297990 bytes/sec
      test  4 (  288 byte blocks,   32 bytes per update,   9 updates): 2007075 opers/sec,  578037628 bytes/sec
      test  5 (  288 byte blocks,  288 bytes per update,   1 updates): 3709811 opers/sec, 1068425798 bytes/sec
      test  6 ( 1056 byte blocks,   32 bytes per update,  33 updates):  566272 opers/sec,  597984182 bytes/sec
      test  7 ( 1056 byte blocks, 1056 bytes per update,   1 updates): 1111657 opers/sec, 1173910108 bytes/sec
      test  8 ( 2080 byte blocks,   32 bytes per update,  65 updates):  288857 opers/sec,  600823808 bytes/sec
      test  9 ( 2080 byte blocks, 2080 bytes per update,   1 updates):  590746 opers/sec, 1228751888 bytes/sec
      test 10 ( 4128 byte blocks, 4128 bytes per update,   1 updates):  301825 opers/sec, 1245936902 bytes/sec
      test 11 ( 8224 byte blocks, 8224 bytes per update,   1 updates):  153075 opers/sec, 1258896201 bytes/sec
      
      Benchmark results from a Core i5-4670T.
      Signed-off-by: Martin Willi <martin@strongswan.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
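The poly1305 commit message says the 130-bit integer is held in five 32-bit words. A minimal sketch of that layout, assuming the common 26-bit limb split (5 x 26 = 130), is shown below. The function name `poly1305_block_to_limbs` is hypothetical and is not taken from the kernel source; the code only illustrates how one 16-byte message block, plus the 2^128 padding bit, could be spread across five limbs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian word from a possibly unaligned pointer. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: split a full 16-byte block into five 26-bit limbs.
 * Limb i holds bits [26*i, 26*i + 25] of the 129-bit value
 * (block interpreted as little-endian) + 2^128.
 */
void poly1305_block_to_limbs(const uint8_t block[16], uint32_t limbs[5])
{
    assert(block != NULL && limbs != NULL);

    limbs[0] = load_le32(block + 0) & 0x3ffffff;          /* bits   0..25  */
    limbs[1] = (load_le32(block + 3) >> 2) & 0x3ffffff;   /* bits  26..51  */
    limbs[2] = (load_le32(block + 6) >> 4) & 0x3ffffff;   /* bits  52..77  */
    limbs[3] = (load_le32(block + 9) >> 6) & 0x3ffffff;   /* bits  78..103 */
    limbs[4] = (load_le32(block + 12) >> 8) | (1u << 24); /* bits 104..128 */
}
```

With 26-bit limbs, the product of two limbs fits in 52 bits, so several such products can be summed in a 64-bit lane without overflow. That headroom is what makes the limbs convenient operands for 32x32-bit SIMD multiplies.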
    • crypto: chacha20 - Add an eight block AVX2 variant for x86_64 · 3d1e93cd
      Martin Willi committed
      Extends the x86_64 ChaCha20 implementation by a function processing eight
      ChaCha20 blocks in parallel using AVX2.
      
      For large messages, throughput increases by ~55-70% compared to four block
      SSSE3:
      
      testing speed of chacha20 (chacha20-simd) encryption
      test 0 (256 bit key, 16 byte blocks): 42249230 operations in 10 seconds (675987680 bytes)
      test 1 (256 bit key, 64 byte blocks): 46441641 operations in 10 seconds (2972265024 bytes)
      test 2 (256 bit key, 256 byte blocks): 33028112 operations in 10 seconds (8455196672 bytes)
      test 3 (256 bit key, 1024 byte blocks): 11568759 operations in 10 seconds (11846409216 bytes)
      test 4 (256 bit key, 8192 byte blocks): 1448761 operations in 10 seconds (11868250112 bytes)
      
      testing speed of chacha20 (chacha20-simd) encryption
      test 0 (256 bit key, 16 byte blocks): 41999675 operations in 10 seconds (671994800 bytes)
      test 1 (256 bit key, 64 byte blocks): 45805908 operations in 10 seconds (2931578112 bytes)
      test 2 (256 bit key, 256 byte blocks): 32814947 operations in 10 seconds (8400626432 bytes)
      test 3 (256 bit key, 1024 byte blocks): 19777167 operations in 10 seconds (20251819008 bytes)
      test 4 (256 bit key, 8192 byte blocks): 2279321 operations in 10 seconds (18672197632 bytes)
      
      Benchmark results from a Core i5-4670T.
      Signed-off-by: Martin Willi <martin@strongswan.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
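The benchmark shape in the AVX2 commit (no gain for 16- to 256-byte requests, a large gain from 1024 bytes up) is consistent with the wide routine only being used when enough whole blocks are available. The sketch below is an assumption about how such a dispatch could be planned, not the kernel's actual glue code: `chacha20_plan_calls` and `struct chacha20_plan` are hypothetical names, and the widest-first order is assumed. The only facts relied on are that a ChaCha20 block is 64 bytes and that the variants handle eight, four, or one block per call.

```c
#include <assert.h>
#include <stddef.h>

#define CHACHA20_BLOCK_SIZE 64 /* one ChaCha20 block is 64 bytes */

/* Hypothetical summary of how a request is split across the variants. */
struct chacha20_plan {
    size_t eight_block_calls;  /* AVX2 routine, 512 bytes per call  */
    size_t four_block_calls;   /* SSSE3 routine, 256 bytes per call */
    size_t single_block_calls; /* SSSE3 routine, 64 bytes per call  */
    size_t tail_bytes;         /* trailing partial block            */
};

/* Greedy split, widest routine first (assumed order). */
struct chacha20_plan chacha20_plan_calls(size_t nbytes, int have_avx2)
{
    struct chacha20_plan plan = {0, 0, 0, 0};

    assert(have_avx2 == 0 || have_avx2 == 1);

    if (have_avx2) {
        plan.eight_block_calls = nbytes / (8 * CHACHA20_BLOCK_SIZE);
        nbytes %= 8 * CHACHA20_BLOCK_SIZE;
    }
    plan.four_block_calls = nbytes / (4 * CHACHA20_BLOCK_SIZE);
    nbytes %= 4 * CHACHA20_BLOCK_SIZE;
    plan.single_block_calls = nbytes / CHACHA20_BLOCK_SIZE;
    plan.tail_bytes = nbytes % CHACHA20_BLOCK_SIZE;
    return plan;
}
```

Under this assumed scheme a 256-byte request never reaches the eight-block routine, while an 8192-byte request is served entirely by it.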
    • crypto: chacha20 - Add a four block SSSE3 variant for x86_64 · 274f938e
      Martin Willi committed
      Extends the x86_64 SSSE3 ChaCha20 implementation by a function processing
      four ChaCha20 blocks in parallel. This avoids the word shuffling needed
      in the single block variant, further increasing throughput.
      
      For large messages, throughput increases by ~110% compared to single block
      SSSE3:
      
      testing speed of chacha20 (chacha20-simd) encryption
      test 0 (256 bit key, 16 byte blocks): 43141886 operations in 10 seconds (690270176 bytes)
      test 1 (256 bit key, 64 byte blocks): 46845874 operations in 10 seconds (2998135936 bytes)
      test 2 (256 bit key, 256 byte blocks): 18458512 operations in 10 seconds (4725379072 bytes)
      test 3 (256 bit key, 1024 byte blocks): 5360533 operations in 10 seconds (5489185792 bytes)
      test 4 (256 bit key, 8192 byte blocks): 692846 operations in 10 seconds (5675794432 bytes)
      
      testing speed of chacha20 (chacha20-simd) encryption
      test 0 (256 bit key, 16 byte blocks): 42249230 operations in 10 seconds (675987680 bytes)
      test 1 (256 bit key, 64 byte blocks): 46441641 operations in 10 seconds (2972265024 bytes)
      test 2 (256 bit key, 256 byte blocks): 33028112 operations in 10 seconds (8455196672 bytes)
      test 3 (256 bit key, 1024 byte blocks): 11568759 operations in 10 seconds (11846409216 bytes)
      test 4 (256 bit key, 8192 byte blocks): 1448761 operations in 10 seconds (11868250112 bytes)
      
      Benchmark results from a Core i5-4670T.
      Signed-off-by: Martin Willi <martin@strongswan.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: chacha20 - Add a SSSE3 SIMD variant for x86_64 · c9320b6d
      Martin Willi committed
      Implements an x86_64 assembler driver for the ChaCha20 stream cipher. This
      single block variant works on a single state matrix using SSE instructions.
      It requires SSSE3 due to the use of pshufb for efficient 8/16-bit rotate
      operations.
      
      For large messages, throughput increases by ~65% compared to
      chacha20-generic:
      
      testing speed of chacha20 (chacha20-generic) encryption
      test 0 (256 bit key, 16 byte blocks): 45089207 operations in 10 seconds (721427312 bytes)
      test 1 (256 bit key, 64 byte blocks): 43839521 operations in 10 seconds (2805729344 bytes)
      test 2 (256 bit key, 256 byte blocks): 12702056 operations in 10 seconds (3251726336 bytes)
      test 3 (256 bit key, 1024 byte blocks): 3371173 operations in 10 seconds (3452081152 bytes)
      test 4 (256 bit key, 8192 byte blocks): 422468 operations in 10 seconds (3460857856 bytes)
      
      testing speed of chacha20 (chacha20-simd) encryption
      test 0 (256 bit key, 16 byte blocks): 43141886 operations in 10 seconds (690270176 bytes)
      test 1 (256 bit key, 64 byte blocks): 46845874 operations in 10 seconds (2998135936 bytes)
      test 2 (256 bit key, 256 byte blocks): 18458512 operations in 10 seconds (4725379072 bytes)
      test 3 (256 bit key, 1024 byte blocks): 5360533 operations in 10 seconds (5489185792 bytes)
      test 4 (256 bit key, 8192 byte blocks): 692846 operations in 10 seconds (5675794432 bytes)
      
      Benchmark results from a Core i5-4670T.
      Signed-off-by: Martin Willi <martin@strongswan.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
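The remark about pshufb and 8/16-bit rotates refers to the ChaCha20 quarter round, which rotates 32-bit words left by 16, 12, 8 and 7 bits. Rotations by 16 and 8 move whole bytes, so a byte shuffle can perform them in one instruction, while 12 and 7 still need a shift pair. The scalar sketch below follows the quarter round as defined in RFC 7539; it is a plain C illustration, not the assembler from the commit, and the function names are chosen here for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    assert(n > 0 && n < 32);
    return (v << n) | (v >> (32 - n));
}

/* One ChaCha20 quarter round on state words s[a], s[b], s[c], s[d]. */
void chacha20_quarter_round(uint32_t s[16], int a, int b, int c, int d)
{
    s[a] += s[b]; s[d] ^= s[a]; s[d] = rotl32(s[d], 16); /* byte-granular */
    s[c] += s[d]; s[b] ^= s[c]; s[b] = rotl32(s[b], 12);
    s[a] += s[b]; s[d] ^= s[a]; s[d] = rotl32(s[d], 8);  /* byte-granular */
    s[c] += s[d]; s[b] ^= s[c]; s[b] = rotl32(s[b], 7);
}
```

A SIMD version applies the same sequence to whole rows of the 4x4 state at once, which is why the cost of each rotate matters for throughput.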
  2. 14 Jul, 2015 1 commit
  3. 29 Jun, 2015 1 commit
  4. 15 Jun, 2015 1 commit
    • crypto: aesni - fix crypto_fpu_exit() section mismatch · de1e0087
      Jeremiah Mahler committed
      The '__init aesni_init()' function calls the '__exit crypto_fpu_exit()'
      function directly.  Since they are in different sections, this generates
      a warning.
      
        make CONFIG_DEBUG_SECTION_MISMATCH=y
        ...
        WARNING: arch/x86/crypto/aesni-intel.o(.init.text+0x12b): Section
        mismatch in reference from the function init_module() to the function
        .exit.text:crypto_fpu_exit()
        The function __init init_module() references
        a function __exit crypto_fpu_exit().
        This is often seen when error handling in the init function
        uses functionality in the exit path.
        The fix is often to remove the __exit annotation of
        crypto_fpu_exit() so it may be used outside an exit section.
      
      Fix the warning by removing the __exit annotation.
      Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
  5. 03 Jun, 2015 2 commits
  6. 22 May, 2015 1 commit
    • x86/fpu, crypto: Fix AVX2 feature tests · b54b4bbb
      Ingo Molnar committed
      For some CPU models I broke the AVX2 feature detection in:
      
        7bc371fa ("x86/fpu, crypto x86/camellia_aesni_avx2: Simplify the camellia_aesni_init() xfeature checks")
        534ff06e ("x86/fpu, crypto x86/serpent_avx2: Simplify the init() xfeature checks")
      
      ... because I did not realize that it's possible for a CPU to support
      the xstate necessary for AVX2 execution (XSTATE_YMM), but not have
      the AVX2 instructions themselves.
      
      Restore the necessary CPUID checks as well.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
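The distinction this fix restores can be sketched as two independent conditions: the OS must have enabled saving of the YMM register state, and the CPU must report the AVX2 instructions themselves. The sketch below takes the raw register values as parameters so the logic stays portable; `avx2_usable` and the macro names are hypothetical and are not the kernel's API. The bit positions are the architectural ones: XCR0 bit 1 for SSE state, XCR0 bit 2 for YMM state, and CPUID leaf 7 (subleaf 0) EBX bit 5 for AVX2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XCR0_X87        (1ull << 0) /* x87 state, always set on real hardware */
#define XCR0_SSE        (1ull << 1) /* XMM state enabled by the OS            */
#define XCR0_YMM        (1ull << 2) /* upper YMM halves enabled by the OS     */
#define CPUID7_EBX_AVX2 (1u << 5)   /* CPUID.(EAX=7,ECX=0):EBX bit 5          */

/*
 * Hypothetical check: AVX2 code may run only if the xstate is enabled
 * AND the CPU actually implements the AVX2 instructions.
 */
bool avx2_usable(uint64_t xcr0, uint32_t cpuid7_ebx)
{
    const uint64_t need = XCR0_SSE | XCR0_YMM;
    bool os_saves_ymm, cpu_has_avx2;

    assert((xcr0 & XCR0_X87) != 0);

    os_saves_ymm = (xcr0 & need) == need;
    cpu_has_avx2 = (cpuid7_ebx & CPUID7_EBX_AVX2) != 0;
    return os_saves_ymm && cpu_has_avx2;
}
```

A CPU with AVX but without AVX2 passes the first condition and fails the second, which is the case the commit message describes.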
  7. 19 May, 2015 16 commits
    • x86/fpu, crypto x86/sha1_mb: Remove FPU internal headers from sha1_mb.c · 57dd083e
      Ingo Molnar committed
      This file only uses the public FPU APIs, so remove the xcr.h, fpu/xstate.h
      and fpu/internal.h headers and add the fpu/api.h include.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu, crypto x86/serpent_avx2: Simplify the init() xfeature checks · 534ff06e
      Ingo Molnar committed
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu, crypto x86/sha1_ssse3: Simplify the sha1_ssse3_mod_init() xfeature checks · d1e50966
      Ingo Molnar committed
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu, crypto x86/cast6_avx: Simplify the cast6_init() xfeature checks · 1debf7db
      Ingo Molnar committed
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu, crypto x86/sha512_ssse3: Simplify the sha512_ssse3_mod_init() xfeature checks · c93b8a39
      Ingo Molnar committed
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu, crypto x86/cast5_avx: Simplify the cast5_init() xfeature checks · d5d34d98
      Ingo Molnar committed
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu, crypto x86/serpent_avx: Simplify the serpent_init() xfeature checks · c1c23f7e
      Ingo Molnar committed
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu, crypto x86/twofish_avx: Simplify the twofish_init() xfeature checks · 4eecd261
      Ingo Molnar committed
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu, crypto x86/camellia_aesni_avx2: Simplify the camellia_aesni_init() xfeature checks · 7bc371fa
      Ingo Molnar committed
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu, crypto x86/sha256_ssse3: Simplify the sha256_ssse3_mod_init() xfeature checks · 70d51eb6
      Ingo Molnar committed
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu, crypto x86/camellia_aesni_avx: Simplify the camellia_aesni_init() xfeature checks · ce4f5f9b
      Ingo Molnar committed
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit:
      
           text    data     bss     dec     hex filename
           2128    2896       0    5024    13a0 camellia_aesni_avx_glue.o.before
           2067    2896       0    4963    1363 camellia_aesni_avx_glue.o.after
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Rename fpu/xsave.h to fpu/xstate.h · 669ebabb
      Ingo Molnar committed
      'xsave' is an x86 instruction name to most people - but xsave.h is
      about a lot more than just the XSAVE instruction: it includes
      definitions and support, both internal and external, related to
      xstate and xfeatures support.
      
      As a first step in cleaning up the various xstate uses rename this
      header to 'fpu/xstate.h' to better reflect what this header file
      is about.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Rename fpu-internal.h to fpu/internal.h · 78f7f1e5
      Ingo Molnar committed
      This unifies all the FPU related header files under a unified, hierarchical
      naming scheme:
      
       - asm/fpu/types.h:      FPU related data types, needed for 'struct task_struct',
                               widely included in almost all kernel code, and hence kept
                               as small as possible.
      
       - asm/fpu/api.h:        FPU related 'public' methods exported to other subsystems.
      
       - asm/fpu/internal.h:   FPU subsystem internal methods
      
       - asm/fpu/xsave.h:      XSAVE support internal methods
      
      (Also standardize the header guard in asm/fpu/internal.h.)
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Move xsave.h to fpu/xsave.h · a137fb6b
      Ingo Molnar committed
      Move the xsave.h header file to the FPU directory as well.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Rename i387.h to fpu/api.h · df6b35f4
      Ingo Molnar committed
      We already have fpu/types.h, move i387.h to fpu/api.h.
      
      The file name has become a misnomer anyway: it offers generic FPU APIs,
      but is not limited to i387 functionality.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/fpu: Fix header file dependencies of fpu-internal.h · f89e32e0
      Ingo Molnar committed
      Fix a minor header file dependency bug in asm/fpu-internal.h: it
      relies on i387.h but does not include it. All users of fpu-internal.h
      included it explicitly.
      
      Also remove unnecessary includes, to reduce compilation time.
      
      This also makes it easier to use it as a standalone header file
      for FPU internals, such as an upcoming C module in arch/x86/kernel/fpu/.
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  8. 13 May, 2015 1 commit
  9. 26 Apr, 2015 1 commit
  10. 24 Apr, 2015 1 commit
  11. 10 Apr, 2015 3 commits
  12. 01 Apr, 2015 1 commit
  13. 31 Mar, 2015 7 commits