1. 23 Jan, 2017 1 commit
    • D
      crypto: x86 - make constants readonly, allow linker to merge them · e183914a
      Denys Vlasenko committed
      A lot of asm-optimized routines in arch/x86/crypto/ keep their
      constants in .data. This is wrong; they should be in .rodata.
      
      Many of these constants are the same in different modules.
      For example, 128-bit shuffle mask 0x000102030405060708090A0B0C0D0E0F
      exists in at least half a dozen places.
      
      There is a way to let the linker merge them and use just one copy.
      The rules are as follows: mergeable objects of different sizes
      should not share sections. You can't put them all in one .rodata
      section, or they will lose "mergeability".
      
      GCC puts its mergeable constants in ".rodata.cstSIZE" sections,
      or ".rodata.cstSIZE.<object_name>" if -fdata-sections is used.
      This patch does the same:
      
      	.section .rodata.cst16.SHUF_MASK, "aM", @progbits, 16
      
      It is important that all data in such a section consists of
      16-byte elements, not larger ones, and that there is no implicit
      use of one element from another.
      
      When this is not the case, use a non-mergeable section:
      
      	.section .rodata[.VAR_NAME], "a", @progbits
      
      This reduces .data by ~15 kbytes:
      
          text    data     bss     dec      hex filename
      11097415 2705840 2630712 16433967  fac32f vmlinux-prev.o
      11112095 2690672 2630712 16433479  fac147 vmlinux.o
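
The arithmetic behind the "~15 kbytes" figure can be checked with a short sketch. The numbers are copied from the size output above; the variable names are illustrative only. The reading that the constants moved from the data column to the text column relies on the assumption that `size` counts read-only data under text in this output format.

```python
# Figures copied from the `size` output quoted in the commit message.
prev = {"text": 11097415, "data": 2705840, "bss": 2630712}  # vmlinux-prev.o
new = {"text": 11112095, "data": 2690672, "bss": 2630712}   # vmlinux.o

# Per-column change, new minus old (negative means the column shrank).
delta = {column: new[column] - prev[column] for column in prev}
net = sum(delta.values())

print(delta)  # per-column differences
print(net)    # overall change in total size
```

Under that assumption, .data shrinks by 15168 bytes, text grows by 14680 bytes, and the difference of 488 bytes is what merging duplicate constants actually saved in total.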
      
      Merged objects are visible in System.map:
      
      ffffffff81a28810 r POLY
      ffffffff81a28810 r POLY
      ffffffff81a28820 r TWOONE
      ffffffff81a28820 r TWOONE
      ffffffff81a28830 r PSHUFFLE_BYTE_FLIP_MASK <- merged regardless of
      ffffffff81a28830 r SHUF_MASK   <------------- the name difference
      ffffffff81a28830 r SHUF_MASK
      ffffffff81a28830 r SHUF_MASK
      ..
      ffffffff81a28d00 r K512 <- merged three identical 640-byte tables
      ffffffff81a28d00 r K512
      ffffffff81a28d00 r K512
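
The merging behaviour described above can be pictured with a toy model. This is a sketch of the idea only, not how any real linker is implemented: each object is treated as a single fixed-size element, and objects are deduplicated by element size and content, never by name. The module names and the POLY bytes are hypothetical placeholders.

```python
def merge_constants(objects):
    """Toy model of merging fixed-size constants.

    objects: list of (symbol, entsize, data) tuples, one element per object.
    Returns (symbol -> offset mapping, total size of the merged pool).
    """
    pool = {}     # (entsize, data) -> offset of the single retained copy
    symbols = {}  # symbol name -> offset it resolves to
    size = 0
    for name, entsize, data in objects:
        if len(data) != entsize:
            raise ValueError(f"{name}: expected one {entsize}-byte element")
        key = (entsize, data)
        if key not in pool:
            # First time this content is seen: keep a copy.
            pool[key] = size
            size += entsize
        symbols[name] = pool[key]
    return symbols, size


# The 128-bit shuffle mask mentioned in the commit message.
MASK = (0x000102030405060708090A0B0C0D0E0F).to_bytes(16, "little")
# Placeholder bytes, NOT the real POLY constant.
POLY = bytes(range(16, 32))

objects = [
    ("sha_mod.PSHUFFLE_BYTE_FLIP_MASK", 16, MASK),  # hypothetical module names
    ("aes_mod.SHUF_MASK", 16, MASK),
    ("ghash_mod.SHUF_MASK", 16, MASK),
    ("ghash_mod.POLY", 16, POLY),
]

symbols, size = merge_constants(objects)
for name, offset in symbols.items():
    print(f"{offset:#06x} {name}")
print("pool size:", size)
```

In this model the three mask symbols resolve to the same offset even though their names differ, which mirrors the PSHUFFLE_BYTE_FLIP_MASK / SHUF_MASK lines in the System.map excerpt.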
      
      Use of object names in section name suffixes is not strictly necessary,
      but might help if the link stage someday uses garbage collection
      to eliminate unused sections (ld --gc-sections).
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      CC: Herbert Xu <herbert@gondor.apana.org.au>
      CC: Josh Poimboeuf <jpoimboe@redhat.com>
      CC: Xiaodong Liu <xiaodong.liu@intel.com>
      CC: Megha Dey <megha.dey@intel.com>
      CC: linux-crypto@vger.kernel.org
      CC: x86@kernel.org
      CC: linux-kernel@vger.kernel.org
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
  2. 24 Feb, 2016 1 commit
    • J
      x86/asm/crypto: Create stack frames in crypto functions · 8691ccd7
      Josh Poimboeuf committed
      The crypto code has several callable non-leaf functions which don't
      honor CONFIG_FRAME_POINTER, which can result in bad stack traces.
      
      Create stack frames for them when CONFIG_FRAME_POINTER is enabled.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Chris J Arges <chris.j.arges@canonical.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Pedro Alves <palves@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: live-patching@vger.kernel.org
      Link: http://lkml.kernel.org/r/6c20192bcf1102ae18ae5a242cabf30ce9b29895.1453405861.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 25 Apr, 2013 1 commit
  4. 20 Jan, 2013 1 commit
  5. 24 Oct, 2012 1 commit
  6. 14 Jun, 2012 1 commit
  7. 12 Jun, 2012 1 commit
    • J
      crypto: serpent - add x86_64/avx assembler implementation · 7efe4076
      Johannes Goetzfried committed
      This patch adds an x86_64/avx assembler implementation of the Serpent block
      cipher. The implementation is very similar to the sse2 implementation and
      processes eight blocks in parallel. Because of the new non-destructive
      three-operand syntax, all move instructions can be removed, which gives a
      small performance increase.
      
      Patch has been tested with tcrypt and automated filesystem tests.
      
      Tcrypt benchmark results:
      
      Intel Core i5-2500 CPU (fam:6, model:42, step:7)
      
      serpent-avx-x86_64 vs. serpent-sse2-x86_64
      128bit key:                                             (lrw:256bit)    (xts:256bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     1.03x   1.01x   1.01x   1.01x   1.00x   1.00x   1.00x   1.00x   1.00x   1.01x
      64B     1.00x   1.00x   1.00x   1.00x   1.00x   0.99x   1.00x   1.01x   1.00x   1.00x
      256B    1.05x   1.03x   1.00x   1.02x   1.05x   1.06x   1.05x   1.02x   1.05x   1.02x
      1024B   1.05x   1.02x   1.00x   1.02x   1.05x   1.06x   1.05x   1.03x   1.05x   1.02x
      8192B   1.05x   1.02x   1.00x   1.02x   1.06x   1.06x   1.04x   1.03x   1.04x   1.02x
      
      256bit key:                                             (lrw:384bit)    (xts:512bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     1.01x   1.00x   1.01x   1.01x   1.00x   1.00x   0.99x   1.03x   1.01x   1.01x
      64B     1.00x   1.00x   1.00x   1.00x   1.00x   1.00x   1.00x   1.01x   1.00x   1.02x
      256B    1.05x   1.02x   1.00x   1.02x   1.05x   1.02x   1.04x   1.05x   1.05x   1.02x
      1024B   1.06x   1.02x   1.00x   1.02x   1.07x   1.06x   1.05x   1.04x   1.05x   1.02x
      8192B   1.05x   1.02x   1.00x   1.02x   1.06x   1.06x   1.04x   1.05x   1.05x   1.02x
      
      serpent-avx-x86_64 vs aes-asm (8kB block):
               128bit  256bit
      ecb-enc  1.26x   1.73x
      ecb-dec  1.20x   1.64x
      cbc-enc  0.33x   0.45x
      cbc-dec  1.24x   1.67x
      ctr-enc  1.32x   1.76x
      ctr-dec  1.32x   1.76x
      lrw-enc  1.20x   1.60x
      lrw-dec  1.15x   1.54x
      xts-enc  1.22x   1.64x
      xts-dec  1.17x   1.57x
      Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>