- 22 3月, 2012 2 次提交
-
-
由 Jussi Kivilinna 提交于
This caused conflict with camellia-x86_64 when compiled into kernel, same function names and not static. Reported-by: NRandy Dunlap <rdunlap@xenotime.net> Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Acked-by: NRandy Dunlap <rdunlap@xenotime.net> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
This caused conflict with twofish-x86_64-3way when compiled into kernel, same function names and not static. Reported-by: NRandy Dunlap <rdunlap@xenotime.net> Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Acked-by: NRandy Dunlap <rdunlap@xenotime.net> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
- 14 3月, 2012 9 次提交
-
-
由 Steffen Klassert 提交于
When padata_do_parallel() is called from multiple cpus for the same padata instance, we can get object reordering on sequence number wrap because testing for sequence number wrap and reseting the sequence number must happen atomically but is implemented with two atomic operations. This patch fixes this by converting the sequence number from atomic_t to an unsigned int and protect the access with a spin_lock. As a side effect, we get rid of the sequence number wrap handling because the seqence number wraps back to null now without the need to do anything. Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Steffen Klassert 提交于
When a padata object is queued to the serialization queue, another cpu might process and free the padata object. So don't dereference it after queueing to the serialization queue. Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
Patch adds x86_64 assembler implementation of Camellia block cipher. Two set of functions are provided. First set is regular 'one-block at time' encrypt/decrypt functions. Second is 'two-block at time' functions that gain performance increase on out-of-order CPUs. Performance of 2-way functions should be equal to 1-way functions with in-order CPUs. Patch has been tested with tcrypt and automated filesystem tests. Tcrypt benchmark results: AMD Phenom II 1055T (fam:16, model:10): camellia-asm vs camellia_generic: 128bit key: (lrw:256bit) (xts:256bit) size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec 16B 1.27x 1.22x 1.30x 1.42x 1.30x 1.34x 1.19x 1.05x 1.23x 1.24x 64B 1.74x 1.79x 1.43x 1.87x 1.81x 1.87x 1.48x 1.38x 1.55x 1.62x 256B 1.90x 1.87x 1.43x 1.94x 1.94x 1.95x 1.63x 1.62x 1.67x 1.70x 1024B 1.96x 1.93x 1.43x 1.95x 1.98x 2.01x 1.67x 1.69x 1.74x 1.80x 8192B 1.96x 1.96x 1.39x 1.93x 2.01x 2.03x 1.72x 1.64x 1.71x 1.76x 256bit key: (lrw:384bit) (xts:512bit) size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec 16B 1.23x 1.23x 1.33x 1.39x 1.34x 1.38x 1.04x 1.18x 1.21x 1.29x 64B 1.72x 1.69x 1.42x 1.78x 1.81x 1.89x 1.57x 1.52x 1.56x 1.65x 256B 1.85x 1.88x 1.42x 1.86x 1.93x 1.96x 1.69x 1.65x 1.70x 1.75x 1024B 1.88x 1.86x 1.45x 1.95x 1.96x 1.95x 1.77x 1.71x 1.77x 1.78x 8192B 1.91x 1.86x 1.42x 1.91x 2.03x 1.98x 1.73x 1.71x 1.78x 1.76x camellia-asm vs aes-asm (8kB block): 128bit 256bit ecb-enc 1.15x 1.22x ecb-dec 1.16x 1.16x cbc-enc 0.85x 0.90x cbc-dec 1.20x 1.23x ctr-enc 1.28x 1.30x ctr-dec 1.27x 1.28x lrw-enc 1.12x 1.16x lrw-dec 1.08x 1.10x xts-enc 1.11x 1.15x xts-dec 1.14x 1.15x Intel Core2 T8100 (fam:6, model:23, step:6): camellia-asm vs camellia_generic: 128bit key: (lrw:256bit) (xts:256bit) size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec 16B 1.10x 1.12x 1.14x 1.16x 1.16x 1.15x 1.02x 1.02x 1.08x 1.08x 64B 1.61x 1.60x 1.17x 1.68x 1.67x 1.66x 1.43x 1.42x 1.44x 1.42x 256B 1.65x 1.73x 1.17x 1.77x 1.81x 1.80x 1.54x 1.53x 1.58x 1.54x 1024B 1.76x 1.74x 1.18x 1.80x 1.85x 1.85x 1.60x 1.59x 1.65x 1.60x 8192B 1.77x 1.75x 1.19x 1.81x 1.85x 1.86x 1.63x 1.61x 1.66x 1.62x 256bit key: (lrw:384bit) (xts:512bit) size ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec 16B 1.10x 1.07x 1.13x 1.16x 1.11x 1.16x 1.03x 1.02x 1.08x 1.07x 64B 1.61x 1.62x 1.15x 1.66x 1.63x 1.68x 1.47x 1.46x 1.47x 1.44x 256B 1.71x 1.70x 1.16x 1.75x 1.69x 1.79x 1.58x 1.57x 1.59x 1.55x 1024B 1.78x 1.72x 1.17x 1.75x 1.80x 1.80x 1.63x 1.62x 1.65x 1.62x 8192B 1.76x 1.73x 1.17x 1.78x 1.80x 1.81x 1.64x 1.62x 1.68x 1.64x camellia-asm vs aes-asm (8kB block): 128bit 256bit ecb-enc 1.17x 1.21x ecb-dec 1.17x 1.20x cbc-enc 0.80x 0.82x cbc-dec 1.22x 1.24x ctr-enc 1.25x 1.26x ctr-dec 1.25x 1.26x lrw-enc 1.14x 1.18x lrw-dec 1.13x 1.17x xts-enc 1.14x 1.18x xts-dec 1.14x 1.17x Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
Fix checkpatch warnings before renaming file. Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
Rename camellia module to camellia_generic to allow optimized assembler implementations to autoload with module-alias. Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
Add tests for CTR, LRW and XTS modes. Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
New ECB, CBC, CTR, LRW and XTS test vectors for camellia. Larger ECB/CBC test vectors needed for parallel 2-way camellia implementation. Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
camellia_setup_tail() applies 'inverse of the last half of P-function' to subkeys, which is unneeded if keys are applied directly to yl/yr in CAMELLIA_ROUNDSM. Patch speeds up key setup and should speed up CAMELLIA_ROUNDSM as applying key to yl/yr early has less register dependencies. Quick tcrypt camellia results: x86_64, AMD Phenom II, ~5% faster x86_64, Intel Core 2, ~0.5% faster i386, Intel Atom N270, ~1% faster Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
- 25 2月, 2012 6 次提交
-
-
由 Jussi Kivilinna 提交于
x86 has fast unaligned accesses, so twofish-x86_64/i586 does not need to enforce alignment. Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
x86 has fast unaligned accesses, so blowfish-x86_64 does not need to enforce alignment. Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
Driver name in ablk_*_init functions can be constructed runtime. Therefore use single function ablk_init to reduce object size. Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
Combine all crypto_alg to be registered and use new crypto_[un]register_algs functions. Simplifies init/exit code and reduce object size. Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
Combine all crypto_alg to be registered and use new crypto_[un]register_algs functions. Simplifies init/exit code and reduce object size. Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
Combine all crypto_alg to be registered and use new crypto_[un]register_algs functions. Simplifies init/exit code and reduce object size. Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
- 16 2月, 2012 2 次提交
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6由 Herbert Xu 提交于
Merge crypto tree as it has cherry-picked the ror64 patch from cryptodev.
-
由 Alexey Dobriyan 提交于
Use standard ror64() instead of hand-written. There is no standard ror64, so create it. The difference is shift value being "unsigned int" instead of uint64_t (for which there is no reason). gcc starts to emit native ROR instructions which it doesn't do for some reason currently. This should make the code faster. Patch survives in-tree crypto test and ping flood with hmac(sha512) on. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
- 14 2月, 2012 2 次提交
-
-
由 Jesper Juhl 提交于
We cannot reach the line after 'return err'. Remove it. Signed-off-by: NJesper Juhl <jj@chaosbits.net> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jesper Juhl 提交于
We can never reach the line just after the 'return 0' statement. Remove it. Signed-off-by: NJesper Juhl <jj@chaosbits.net> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
- 05 2月, 2012 2 次提交
-
-
由 Jesper Juhl 提交于
We declare 'exact' without initializing it and then do: [...] if (strlen(p->cru_driver_name)) exact = 1; if (priority && !exact) return -EINVAL; [...] If the first 'if' is not true, then the second will test an uninitialized 'exact'. As far as I can tell, what we want is for 'exact' to be initialized to 0 (zero/false). Signed-off-by: NJesper Juhl <jj@chaosbits.net> Acked-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Herbert Xu 提交于
Unfortunately in reducing W from 80 to 16 we ended up unrolling the loop twice. As gcc has issues dealing with 64-bit ops on i386 this means that we end up using even more stack space (>1K). This patch solves the W reduction by moving LOAD_OP/BLEND_OP into the loop itself, thus avoiding the need to duplicate it. While the stack space still isn't great (>0.5K) it is at least in the same ball park as the amount of stack used for our C sha1 implementation. Note that this patch basically reverts to the original code so the diff looks bigger than it really is. Cc: stable@vger.kernel.org Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
- 26 1月, 2012 3 次提交
-
-
由 Herbert Xu 提交于
The previous patch used the modulus operator over a power of 2 unnecessarily which may produce suboptimal binary code. This patch changes changes them to binary ands instead. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Kim Phillips 提交于
drivers/crypto/caam/ctrl.c: In function 'caam_probe': drivers/crypto/caam/ctrl.c:49:6: warning: unused variable 'd' [-Wunused-variable] Signed-off-by: NKim Phillips <kim.phillips@freescale.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Mark Brown 提交于
Hardware crypto engines frequently need to register a selection of different algorithms with the core. Simplify their code slightly, especially the error handling, by providing functions to register a number of algorithms in a single call. Signed-off-by: NMark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
- 15 1月, 2012 3 次提交
-
-
由 Alexey Dobriyan 提交于
Use standard ror64() instead of hand-written. There is no standard ror64, so create it. The difference is shift value being "unsigned int" instead of uint64_t (for which there is no reason). gcc starts to emit native ROR instructions which it doesn't do for some reason currently. This should make the code faster. Patch survives in-tree crypto test and ping flood with hmac(sha512) on. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Alexey Dobriyan 提交于
For rounds 16--79, W[i] only depends on W[i - 2], W[i - 7], W[i - 15] and W[i - 16]. Consequently, keeping all W[80] array on stack is unnecessary, only 16 values are really needed. Using W[16] instead of W[80] greatly reduces stack usage (~750 bytes to ~340 bytes on x86_64). Line by line explanation: * BLEND_OP array is "circular" now, all indexes have to be modulo 16. Round number is positive, so remainder operation should be without surprises. * initial full message scheduling is trimmed to first 16 values which come from data block, the rest is calculated before it's needed. * original loop body is unrolled version of new SHA512_0_15 and SHA512_16_79 macros, unrolling was done to not do explicit variable renaming. Otherwise it's the very same code after preprocessing. See sha1_transform() code which does the same trick. Patch survives in-tree crypto test and original bugreport test (ping flood with hmac(sha512). See FIPS 180-2 for SHA-512 definition http://csrc.nist.gov/publications/fips/fips180-2/fips180-2withchangenotice.pdfSigned-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Alexey Dobriyan 提交于
commit f9e2bca6 aka "crypto: sha512 - Move message schedule W[80] to static percpu area" created global message schedule area. If sha512_update will ever be entered twice, hash will be silently calculated incorrectly. Probably the easiest way to notice incorrect hashes being calculated is to run 2 ping floods over AH with hmac(sha512): #!/usr/sbin/setkey -f flush; spdflush; add IP1 IP2 ah 25 -A hmac-sha512 0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000025; add IP2 IP1 ah 52 -A hmac-sha512 0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000052; spdadd IP1 IP2 any -P out ipsec ah/transport//require; spdadd IP2 IP1 any -P in ipsec ah/transport//require; XfrmInStateProtoError will start ticking with -EBADMSG being returned from ah_input(). This never happens with, say, hmac(sha1). With patch applied (on BOTH sides), XfrmInStateProtoError does not tick with multiple bidirectional ping flood streams like it doesn't tick with SHA-1. After this patch sha512_transform() will start using ~750 bytes of stack on x86_64. This is OK for simple loads, for something more heavy, stack reduction will be done separatedly. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
- 13 1月, 2012 10 次提交
-
-
由 Kim Phillips 提交于
sha224 and 384 support extends caam noise to 21 lines. Do the same as commit 5b859b6e "crypto: talitos - be less noisy on startup", but for caam, and display: caam ffe300000.crypto: fsl,sec-v4.0 algorithms registered in /proc/crypto Signed-off-by: NKim Phillips <kim.phillips@freescale.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Hemant Agrawal 提交于
Signed-off-by: NHemant Agrawal <hemant@freescale.com> Signed-off-by: NKim Phillips <kim.phillips@freescale.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Julia Lawall 提交于
The function is called with locks held and thus should not use GFP_KERNEL. The semantic patch that makes this report is available in scripts/coccinelle/locks/call_kern.cocci. More information about semantic patching is available at http://coccinelle.lip6.fr/Signed-off-by: NJulia Lawall <julia.lawall@lip6.fr> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Nikos Mavrogiannopoulos 提交于
The added CRYPTO_ALG_KERN_DRIVER_ONLY indicates whether a cipher is only available via a kernel driver. If the cipher implementation might be available by using an instruction set or by porting the kernel code, then it must not be set. Signed-off-by: NNikos Mavrogiannopoulos <nmav@gnutls.org> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Julia Lawall 提交于
Reimplement a call to devm_request_mem_region followed by a call to ioremap or ioremap_nocache by a call to devm_request_and_ioremap. The semantic patch that makes this transformation is as follows: (http://coccinelle.lip6.fr/) // <smpl> @nm@ expression myname; identifier i; @@ struct platform_driver i = { .driver = { .name = myname } }; @@ expression dev,res,size; expression nm.myname; @@ -if (!devm_request_mem_region(dev, res->start, size, - \(res->name\|dev_name(dev)\|myname\))) { - ... - return ...; -} ... when != res->start ( -devm_ioremap(dev,res->start,size) +devm_request_and_ioremap(dev,res) | -devm_ioremap_nocache(dev,res->start,size) +devm_request_and_ioremap(dev,res) ) ... when any when != res->start // </smpl> Signed-off-by: NJulia Lawall <julia@diku.dk> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
Matrix transpose macro in serpent-sse2 uses mix of SSE2 integer and SSE floating point instructions, which might cause performance penality on some CPUs. This patch replaces transpose_4x4 macro with version that uses only SSE2 integer instructions. Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
Implementation in blowfish-x86_64 uses 64bit rotations which are slow on P4, making blowfish-x86_64 slower than generic C implementation. Therefore blacklist P4. Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
Performance of twofish-x86_64-3way on Intel Pentium 4 and Atom is lower than of twofish-x86_64 module. So blacklist these CPUs. Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Varun Wadekar 提交于
driver supports ecb/cbc/ofb/ansi_x9.31rng modes, 128, 192 and 256-bit key sizes Signed-off-by: NVarun Wadekar <vwadekar@nvidia.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Henning Heinold 提交于
The crypto driver will need this api to use it in the RNG calculations. In order to build the crypto driver as a module, tegra_chip_uid has to be exported. Acked-by: NOlof Johansson <olof@lixom.net> Signed-off-by: NHenning Heinold <heinold@inf.fu-berlin.de> Signed-off-by: NVarun Wadekar <vwadekar@nvidia.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
- 11 1月, 2012 1 次提交
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6由 Linus Torvalds 提交于
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (54 commits) crypto: gf128mul - remove leftover "(EXPERIMENTAL)" in Kconfig crypto: serpent-sse2 - remove unneeded LRW/XTS #ifdefs crypto: serpent-sse2 - select LRW and XTS crypto: twofish-x86_64-3way - remove unneeded LRW/XTS #ifdefs crypto: twofish-x86_64-3way - select LRW and XTS crypto: xts - remove dependency on EXPERIMENTAL crypto: lrw - remove dependency on EXPERIMENTAL crypto: picoxcell - fix boolean and / or confusion crypto: caam - remove DECO access initialization code crypto: caam - fix polarity of "propagate error" logic crypto: caam - more desc.h cleanups crypto: caam - desc.h - convert spaces to tabs crypto: talitos - convert talitos_error to struct device crypto: talitos - remove NO_IRQ references crypto: talitos - fix bad kfree crypto: convert drivers/crypto/* to use module_platform_driver() char: hw_random: convert drivers/char/hw_random/* to use module_platform_driver() crypto: serpent-sse2 - should select CRYPTO_CRYPTD crypto: serpent - rename serpent.c to serpent_generic.c crypto: serpent - cleanup checkpatch errors and warnings ...
-