• J
    crypto: twofish - add x86_64/avx assembler implementation · 107778b5
    Johannes Goetzfried 提交于
    This patch adds a x86_64/avx assembler implementation of the Twofish block
    cipher. The implementation processes eight blocks in parallel (two 4 block
    chunk AVX operations). The table-lookups are done in general-purpose registers.
    For small blocksizes the 3way-parallel functions from the twofish-x86_64-3way
    module are called. A good performance increase is provided for blocksizes
    greater or equal to 128B.
    
    Patch has been tested with tcrypt and automated filesystem tests.
    
    Tcrypt benchmark results:
    
    Intel Core i5-2500 CPU (fam:6, model:42, step:7)
    
    twofish-avx-x86_64 vs. twofish-x86_64-3way
    128bit key:                                             (lrw:256bit)    (xts:256bit)
    size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
    16B     0.96x   0.97x   1.00x   0.95x   0.97x   0.97x   0.96x   0.95x   0.95x   0.98x
    64B     0.99x   0.99x   1.00x   0.99x   0.98x   0.98x   0.99x   0.98x   0.99x   0.98x
    256B    1.20x   1.21x   1.00x   1.19x   1.15x   1.14x   1.19x   1.20x   1.18x   1.19x
    1024B   1.29x   1.30x   1.00x   1.28x   1.23x   1.24x   1.26x   1.28x   1.26x   1.27x
    8192B   1.31x   1.32x   1.00x   1.31x   1.25x   1.25x   1.28x   1.29x   1.28x   1.30x
    
    256bit key:                                             (lrw:384bit)    (xts:512bit)
    size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
    16B     0.96x   0.96x   1.00x   0.96x   0.97x   0.98x   0.95x   0.95x   0.95x   0.96x
    64B     1.00x   0.99x   1.00x   0.98x   0.98x   1.01x   0.98x   0.98x   0.98x   0.98x
    256B    1.20x   1.21x   1.00x   1.21x   1.15x   1.15x   1.19x   1.20x   1.18x   1.19x
    1024B   1.29x   1.30x   1.00x   1.28x   1.23x   1.23x   1.26x   1.27x   1.26x   1.27x
    8192B   1.31x   1.33x   1.00x   1.31x   1.26x   1.26x   1.29x   1.29x   1.28x   1.30x
    
    twofish-avx-x86_64 vs aes-asm (8kB block):
             128bit  256bit
    ecb-enc  1.19x   1.63x
    ecb-dec  1.18x   1.62x
    cbc-enc  0.75x   1.03x
    cbc-dec  1.23x   1.67x
    ctr-enc  1.24x   1.65x
    ctr-dec  1.24x   1.65x
    lrw-enc  1.15x   1.53x
    lrw-dec  1.14x   1.52x
    xts-enc  1.16x   1.56x
    xts-dec  1.16x   1.56x
    Signed-off-by: NJohannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
    Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
    107778b5
Makefile 1.8 KB