1. 12 Jun, 2012: 1 commit
  2. 31 May, 2012: 1 commit
  3. 15 May, 2012: 2 commits
  4. 22 Mar, 2012: 2 commits
  5. 20 Mar, 2012: 1 commit
  6. 14 Mar, 2012: 1 commit
    • crypto: camellia - add assembler implementation for x86_64 · 0b95ec56
      Committed by Jussi Kivilinna
      Patch adds an x86_64 assembler implementation of the Camellia block cipher. Two
      sets of functions are provided. The first set is the regular 'one block at a time'
      encrypt/decrypt functions. The second is 'two blocks at a time' functions that gain
      a performance increase on out-of-order CPUs. Performance of the 2-way functions
      should be equal to that of the 1-way functions on in-order CPUs.
      
      Patch has been tested with tcrypt and automated filesystem tests.
      
      Tcrypt benchmark results:
      
      AMD Phenom II 1055T (fam:16, model:10):
      
      camellia-asm vs camellia_generic:
      128bit key:                                             (lrw:256bit)    (xts:256bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     1.27x   1.22x   1.30x   1.42x   1.30x   1.34x   1.19x   1.05x   1.23x   1.24x
      64B     1.74x   1.79x   1.43x   1.87x   1.81x   1.87x   1.48x   1.38x   1.55x   1.62x
      256B    1.90x   1.87x   1.43x   1.94x   1.94x   1.95x   1.63x   1.62x   1.67x   1.70x
      1024B   1.96x   1.93x   1.43x   1.95x   1.98x   2.01x   1.67x   1.69x   1.74x   1.80x
      8192B   1.96x   1.96x   1.39x   1.93x   2.01x   2.03x   1.72x   1.64x   1.71x   1.76x
      
      256bit key:                                             (lrw:384bit)    (xts:512bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     1.23x   1.23x   1.33x   1.39x   1.34x   1.38x   1.04x   1.18x   1.21x   1.29x
      64B     1.72x   1.69x   1.42x   1.78x   1.81x   1.89x   1.57x   1.52x   1.56x   1.65x
      256B    1.85x   1.88x   1.42x   1.86x   1.93x   1.96x   1.69x   1.65x   1.70x   1.75x
      1024B   1.88x   1.86x   1.45x   1.95x   1.96x   1.95x   1.77x   1.71x   1.77x   1.78x
      8192B   1.91x   1.86x   1.42x   1.91x   2.03x   1.98x   1.73x   1.71x   1.78x   1.76x
      
      camellia-asm vs aes-asm (8kB block):
               128bit  256bit
      ecb-enc  1.15x   1.22x
      ecb-dec  1.16x   1.16x
      cbc-enc  0.85x   0.90x
      cbc-dec  1.20x   1.23x
      ctr-enc  1.28x   1.30x
      ctr-dec  1.27x   1.28x
      lrw-enc  1.12x   1.16x
      lrw-dec  1.08x   1.10x
      xts-enc  1.11x   1.15x
      xts-dec  1.14x   1.15x
      
      Intel Core2 T8100 (fam:6, model:23, step:6):
      
      camellia-asm vs camellia_generic:
      128bit key:                                             (lrw:256bit)    (xts:256bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     1.10x   1.12x   1.14x   1.16x   1.16x   1.15x   1.02x   1.02x   1.08x   1.08x
      64B     1.61x   1.60x   1.17x   1.68x   1.67x   1.66x   1.43x   1.42x   1.44x   1.42x
      256B    1.65x   1.73x   1.17x   1.77x   1.81x   1.80x   1.54x   1.53x   1.58x   1.54x
      1024B   1.76x   1.74x   1.18x   1.80x   1.85x   1.85x   1.60x   1.59x   1.65x   1.60x
      8192B   1.77x   1.75x   1.19x   1.81x   1.85x   1.86x   1.63x   1.61x   1.66x   1.62x
      
      256bit key:                                             (lrw:384bit)    (xts:512bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     1.10x   1.07x   1.13x   1.16x   1.11x   1.16x   1.03x   1.02x   1.08x   1.07x
      64B     1.61x   1.62x   1.15x   1.66x   1.63x   1.68x   1.47x   1.46x   1.47x   1.44x
      256B    1.71x   1.70x   1.16x   1.75x   1.69x   1.79x   1.58x   1.57x   1.59x   1.55x
      1024B   1.78x   1.72x   1.17x   1.75x   1.80x   1.80x   1.63x   1.62x   1.65x   1.62x
      8192B   1.76x   1.73x   1.17x   1.78x   1.80x   1.81x   1.64x   1.62x   1.68x   1.64x
      
      camellia-asm vs aes-asm (8kB block):
               128bit  256bit
      ecb-enc  1.17x   1.21x
      ecb-dec  1.17x   1.20x
      cbc-enc  0.80x   0.82x
      cbc-dec  1.22x   1.24x
      ctr-enc  1.25x   1.26x
      ctr-dec  1.25x   1.26x
      lrw-enc  1.14x   1.18x
      lrw-dec  1.13x   1.17x
      xts-enc  1.14x   1.18x
      xts-dec  1.14x   1.17x
      Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
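The tables above show cbc-enc gaining noticeably less than cbc-dec. One plausible reading, which the commit message does not state, is that CBC encryption chains each block on the previous ciphertext and so cannot use the two-block functions, while CBC decryption can process two blocks independently. The sketch below illustrates that dependency with a toy stand-in cipher; `toy_enc`, `toy_dec`, and the function names are hypothetical and unrelated to the kernel code.

```python
BLOCK = 16  # Camellia operates on 128-bit (16-byte) blocks


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_enc(block: bytes, key: bytes) -> bytes:
    """Toy stand-in for a block cipher (XOR with key, rotate left). NOT secure."""
    x = xor(block, key)
    return x[1:] + x[:1]


def toy_dec(block: bytes, key: bytes) -> bytes:
    """Inverse of toy_enc (rotate right, XOR with key)."""
    x = block[-1:] + block[:-1]
    return xor(x, key)


def cbc_encrypt(pt: bytes, key: bytes, iv: bytes) -> bytes:
    """CBC encryption: each block needs the previous ciphertext, so it is serial."""
    out, prev = [], iv
    for i in range(0, len(pt), BLOCK):
        prev = toy_enc(xor(pt[i:i + BLOCK], prev), key)
        out.append(prev)
    return b"".join(out)


def cbc_decrypt_2way(ct: bytes, key: bytes, iv: bytes) -> bytes:
    """CBC decryption, two blocks per step: the two cipher calls are independent."""
    blocks = [ct[i:i + BLOCK] for i in range(0, len(ct), BLOCK)]
    prevs = [iv] + blocks[:-1]  # ciphertext that precedes each block
    out = []
    i = 0
    while i + 1 < len(blocks):
        d0 = toy_dec(blocks[i], key)      # does not depend on d1
        d1 = toy_dec(blocks[i + 1], key)  # does not depend on d0
        out.append(xor(d0, prevs[i]))
        out.append(xor(d1, prevs[i + 1]))
        i += 2
    if i < len(blocks):  # leftover single block
        out.append(xor(toy_dec(blocks[i], key), prevs[i]))
    return b"".join(out)
```

In `cbc_decrypt_2way` the two `toy_dec` calls share no data, which is the kind of independent work an out-of-order CPU can overlap; `cbc_encrypt` has no such pair.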
  7. 25 Feb, 2012: 6 commits
  8. 14 Feb, 2012: 2 commits
  9. 27 Jan, 2012: 1 commit
    • crypto: Add support for x86 cpuid auto loading for x86 crypto drivers · 3bd391f0
      Committed by Andi Kleen
      Add support for auto-loading of crypto drivers based on cpuid features.
      This enables auto-loading of the VIA and Intel specific drivers
      for AES, hashing and CRCs.
      
      Requires the earlier infrastructure patch to add x86 modinfo.
      I kept it all in a single patch for now.
      
      I dropped the printks when the driver cpuid doesn't match (imho
      drivers never should print anything in such a case)
      
      One drawback is that udev doesn't know whether the drivers are used or not,
      so they will be unconditionally loaded at boot up. That's better
      than not loading them at all, as often happens.
      
      Cc: Dave Jones <davej@redhat.com>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Thomas Renninger <trenn@suse.de>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
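The matching idea in the commit above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `DRIVER_FEATURES` table and `drivers_to_load` are hypothetical, the flag names follow /proc/cpuinfo conventions, and the actual mechanism in the patch goes through x86 modinfo and udev rather than a lookup like this.

```python
# Illustrative mapping only (an assumption, not taken from the patch):
# driver module name -> CPU feature flag it depends on.
DRIVER_FEATURES = {
    "aesni-intel": "aes",
    "crc32c-intel": "sse4_2",
    "ghash-clmulni-intel": "pclmulqdq",
}


def drivers_to_load(cpu_flags):
    """Return the drivers whose required feature is present on this CPU.

    Drivers without a match are skipped silently, mirroring the commit's
    choice to drop the printks for the non-matching case.
    """
    flags = set(cpu_flags)
    return sorted(name for name, feature in DRIVER_FEATURES.items()
                  if feature in flags)
```

Note that the selection depends only on what the CPU supports, not on whether anything will use the driver, which is the drawback the commit message mentions.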
  10. 13 Jan, 2012: 3 commits
  11. 20 Dec, 2011: 2 commits
  12. 21 Nov, 2011: 5 commits
  13. 09 Nov, 2011: 2 commits
    • crypto: twofish-x86_64-3way - add xts support · bae6d303
      Committed by Jussi Kivilinna
      Patch adds XTS support for twofish-x86_64-3way by using xts_crypt(). Patch has
      been tested with tcrypt and automated filesystem tests.
      
      Tcrypt benchmark results (twofish-3way/twofish-asm speed ratios):
      
      Intel Celeron T1600 (fam:6, model:15, step:13):
      
      size    xts-enc xts-dec
      16B     0.98x   1.00x
      64B     1.14x   1.15x
      256B    1.23x   1.25x
      1024B   1.26x   1.29x
      8192B   1.28x   1.30x
      
      AMD Phenom II 1055T (fam:16, model:10):
      
      size    xts-enc xts-dec
      16B     1.03x   1.03x
      64B     1.13x   1.16x
      256B    1.20x   1.20x
      1024B   1.22x   1.22x
      8192B   1.22x   1.21x
      Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: twofish-x86_64-3way - add lrw support · 81559f9a
      Committed by Jussi Kivilinna
      Patch adds LRW support for twofish-x86_64-3way by using lrw_crypt(). Patch has
      been tested with tcrypt and automated filesystem tests.
      
      Tcrypt benchmark results (twofish-3way/twofish-asm speed ratios):
      
      Intel Celeron T1600 (fam:6, model:15, step:13):
      
      size	lrw-enc	lrw-dec
      16B	0.99x	1.00x
      64B	1.17x	1.17x
      256B	1.26x	1.27x
      1024B	1.30x	1.31x
      8192B	1.31x	1.32x
      
      AMD Phenom II 1055T (fam:16, model:10):
      
      size	lrw-enc	lrw-dec
      16B	1.06x	1.01x
      64B	1.08x	1.14x
      256B	1.19x	1.20x
      1024B	1.21x	1.22x
      8192B	1.23x	1.24x
      Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
  14. 01 Nov, 2011: 1 commit
  15. 21 Oct, 2011: 6 commits
  16. 22 Sep, 2011: 2 commits
  17. 10 Aug, 2011: 1 commit
    • crypto: sha1 - SSSE3 based SHA1 implementation for x86-64 · 66be8951
      Committed by Mathias Krause
      This is an assembler implementation of the SHA1 algorithm using the
      Supplemental SSE3 (SSSE3) instructions or, when available, the
      Advanced Vector Extensions (AVX).
      
      Testing with the tcrypt module shows the raw hash performance is up to
      2.3 times faster than the C implementation, using 8k data blocks on a
      Core 2 Duo T5500. For the smallest data set (16 bytes) it is still 25%
      faster.
      
      Since this implementation uses SSE/YMM registers it cannot safely be
      used in every situation, e.g. while an IRQ interrupts a kernel thread.
      The implementation falls back to the generic SHA1 variant, if using
      the SSE/YMM registers is not possible.
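
The fallback described above can be sketched as a simple dispatch. This is a minimal illustration, not the kernel code: `simd_usable`, `sha1_simd`, and `sha1_generic` are hypothetical names, and both paths here call Python's `hashlib` so that only the control flow is shown.

```python
import hashlib


def sha1_generic(data: bytes) -> bytes:
    """Stand-in for the portable C implementation."""
    return hashlib.sha1(data).digest()


def sha1_simd(data: bytes) -> bytes:
    """Stand-in for the SSSE3/AVX implementation; the digest must be identical."""
    return hashlib.sha1(data).digest()


def sha1_digest(data: bytes, simd_usable: bool):
    """Pick the accelerated path only when the SIMD registers may be used.

    Returns (path_name, digest) so the chosen path is visible to the caller.
    """
    if simd_usable:
        return "simd", sha1_simd(data)
    return "generic", sha1_generic(data)
```

Whichever path is taken, the caller sees the same digest; only the speed differs.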
      
      With this algorithm I was able to increase the throughput of a single
      IPsec link from 344 Mbit/s to 464 Mbit/s on a Core 2 Quad CPU using
      the SSSE3 variant -- a speedup of +34.8%.
      
      Saving and restoring SSE/YMM state might make the actual throughput
      fluctuate when there are FPU intensive userland applications running.
      For example, measuring the performance using iperf2 directly on the
      machine under test gives wobbling numbers because iperf2 uses the FPU
      for each packet to check if the reporting interval has expired (in the
      above test I got min/max/avg: 402/484/464 MBit/s).
      
      Using this algorithm on an IPsec gateway gives much more reasonable and
      stable numbers, albeit not as high as in the directly connected case.
      Here is the result from an RFC 2544 test run with an EXFO Packet Blazer
      FTB-8510:
      
       frame size    sha1-generic     sha1-ssse3    delta
          64 byte     37.5 MBit/s    37.5 MBit/s     0.0%
         128 byte     56.3 MBit/s    62.5 MBit/s   +11.0%
         256 byte     87.5 MBit/s   100.0 MBit/s   +14.3%
         512 byte    131.3 MBit/s   150.0 MBit/s   +14.2%
        1024 byte    162.5 MBit/s   193.8 MBit/s   +19.3%
        1280 byte    175.0 MBit/s   212.5 MBit/s   +21.4%
        1420 byte    175.0 MBit/s   218.7 MBit/s   +25.0%
        1518 byte    150.0 MBit/s   181.2 MBit/s   +20.8%
      
      The throughput for the largest frame size is lower than for the
      previous size because the IP packets need to be fragmented in this
      case to make their way through the IPsec tunnel.
      Signed-off-by: Mathias Krause <minipli@googlemail.com>
      Cc: Maxim Locktyukhin <maxim.locktyukhin@intel.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
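The delta column in the RFC 2544 table and the IPsec speedup figure appear to be plain relative gains over the sha1-generic baseline. A small sketch of that arithmetic follows; the helper name `percent_gain` is hypothetical and the row values are copied from the table above.

```python
def percent_gain(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# (frame size in bytes, sha1-generic MBit/s, sha1-ssse3 MBit/s), from the table
rows = [
    (64, 37.5, 37.5),
    (128, 56.3, 62.5),
    (256, 87.5, 100.0),
    (512, 131.3, 150.0),
    (1024, 162.5, 193.8),
    (1280, 175.0, 212.5),
    (1420, 175.0, 218.7),
    (1518, 150.0, 181.2),
]

for size, generic, ssse3 in rows:
    print(f"{size:5d} byte  {percent_gain(generic, ssse3):+.1f}%")

# Single IPsec link: 344 Mbit/s -> 464 Mbit/s
print(f"IPsec link  {percent_gain(344.0, 464.0):+.2f}%")
```

For the single-link case the gain works out to roughly 34.88%, which matches the quoted +34.8% if that figure was truncated rather than rounded.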
  18. 30 Jun, 2011: 1 commit