- 21 10月, 2011 4 次提交
-
-
由 Jussi Kivilinna 提交于
Patch adds 3-way parallel x86_64 assembly implementation of twofish as new module. New assembler functions crypt data in three blocks chunks, improving cipher performance on out-of-order CPUs. Patch has been tested with tcrypt and automated filesystem tests. Summary of the tcrypt benchmarks: Twofish 3-way-asm vs twofish asm (128bit 8kb block ECB) encrypt: 1.3x speed decrypt: 1.3x speed Twofish 3-way-asm vs twofish asm (128bit 8kb block CBC) encrypt: 1.07x speed decrypt: 1.4x speed Twofish 3-way-asm vs twofish asm (128bit 8kb block CTR) encrypt: 1.4x speed Twofish 3-way-asm vs AES asm (128bit 8kb block ECB) encrypt: 1.0x speed decrypt: 1.0x speed Twofish 3-way-asm vs AES asm (128bit 8kb block CBC) encrypt: 0.84x speed decrypt: 1.09x speed Twofish 3-way-asm vs AES asm (128bit 8kb block CTR) encrypt: 1.15x speed Full output: http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-twofish-3way-asm-x86_64.txt http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-twofish-asm-x86_64.txt http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-aes-asm-x86_64.txt Tests were run on: vendor_id : AuthenticAMD cpu family : 16 model : 10 model name : AMD Phenom(tm) II X6 1055T Processor Also userspace test were run on: vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Xeon(R) CPU E7330 @ 2.40GHz stepping : 11 Userspace test results: Encryption/decryption of twofish 3-way vs x86_64-asm on AMD Phenom II: encrypt: 1.27x decrypt: 1.25x Encryption/decryption of twofish 3-way vs x86_64-asm on Intel Xeon E7330: encrypt: 1.36x decrypt: 1.36x Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
This needed by 3-way twofish patch to be able to easily use one block assembler functions. As glue code is shared between i586/x86_64 apply change to i586 assembler too. Also export assembler functions for 3-way parallel twofish module. CC: Joachim Fritschi <jfritschi@freenet.de> Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
This patch adds improved F-macro for 4-way parallel functions. With new F-macro for 4-way parallel functions, blowfish sees ~15% improvement in speed tests on AMD Phenom II (~5% on Intel Xeon E7330). However when used in 1-way blowfish function new macro would be ~10% slower than original, so old F-macro is kept for 1-way functions. Patch cleans up old F-macro as it is no longer needed in 4-way part. Patch also does register macro renaming to reduce stack usage. Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
- 22 9月, 2011 2 次提交
-
-
由 H Hartley Sweeten 提交于
Include <asm/aes.h> to pick up the declarations for crypto_aes_encrypt_x86 and crypto_aes_decrypt_x86 to quiet the sparse noise: warning: symbol 'crypto_aes_encrypt_x86' was not declared. Should it be static? warning: symbol 'crypto_aes_decrypt_x86' was not declared. Should it be static? Signed-off-by: NH Hartley Sweeten <hsweeten@visionengravers.com> Acked-by: NMandeep Singh Baines <msb@chromium.org> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Jussi Kivilinna 提交于
Patch adds x86_64 assembly implementation of blowfish. Two set of assembler functions are provided. First set is regular 'one-block at time' encrypt/decrypt functions. Second is 'four-block at time' functions that gain performance increase on out-of-order CPUs. Performance of 4-way functions should be equal to 1-way functions with in-order CPUs. Summary of the tcrypt benchmarks: Blowfish assembler vs blowfish C (256bit 8kb block ECB) encrypt: 2.2x speed decrypt: 2.3x speed Blowfish assembler vs blowfish C (256bit 8kb block CBC) encrypt: 1.12x speed decrypt: 2.5x speed Blowfish assembler vs blowfish C (256bit 8kb block CTR) encrypt: 2.5x speed Full output: http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-asm-x86_64.txt http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-c-x86_64.txt Tests were run on: vendor_id : AuthenticAMD cpu family : 16 model : 10 model name : AMD Phenom(tm) II X6 1055T Processor stepping : 0 Signed-off-by: NJussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
- 10 8月, 2011 1 次提交
-
-
由 Mathias Krause 提交于
This is an assembler implementation of the SHA1 algorithm using the Supplemental SSE3 (SSSE3) instructions or, when available, the Advanced Vector Extensions (AVX). Testing with the tcrypt module shows the raw hash performance is up to 2.3 times faster than the C implementation, using 8k data blocks on a Core 2 Duo T5500. For the smalest data set (16 byte) it is still 25% faster. Since this implementation uses SSE/YMM registers it cannot safely be used in every situation, e.g. while an IRQ interrupts a kernel thread. The implementation falls back to the generic SHA1 variant, if using the SSE/YMM registers is not possible. With this algorithm I was able to increase the throughput of a single IPsec link from 344 Mbit/s to 464 Mbit/s on a Core 2 Quad CPU using the SSSE3 variant -- a speedup of +34.8%. Saving and restoring SSE/YMM state might make the actual throughput fluctuate when there are FPU intensive userland applications running. For example, meassuring the performance using iperf2 directly on the machine under test gives wobbling numbers because iperf2 uses the FPU for each packet to check if the reporting interval has expired (in the above test I got min/max/avg: 402/484/464 MBit/s). Using this algorithm on a IPsec gateway gives much more reasonable and stable numbers, albeit not as high as in the directly connected case. Here is the result from an RFC 2544 test run with a EXFO Packet Blazer FTB-8510: frame size sha1-generic sha1-ssse3 delta 64 byte 37.5 MBit/s 37.5 MBit/s 0.0% 128 byte 56.3 MBit/s 62.5 MBit/s +11.0% 256 byte 87.5 MBit/s 100.0 MBit/s +14.3% 512 byte 131.3 MBit/s 150.0 MBit/s +14.2% 1024 byte 162.5 MBit/s 193.8 MBit/s +19.3% 1280 byte 175.0 MBit/s 212.5 MBit/s +21.4% 1420 byte 175.0 MBit/s 218.7 MBit/s +25.0% 1518 byte 150.0 MBit/s 181.2 MBit/s +20.8% The throughput for the largest frame size is lower than for the previous size because the IP packets need to be fragmented in this case to make there way through the IPsec tunnel. Signed-off-by: NMathias Krause <minipli@googlemail.com> Cc: Maxim Locktyukhin <maxim.locktyukhin@intel.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
- 01 8月, 2011 8 次提交
-
-
由 David Brown 提交于
Migrate the driver for the v7-based MSM chips into drivers/gpio. The driver is unchanged, only moved. Change-Id: I810db5b50b71cdca4e869aa0d0310f7f48781a55 Signed-off-by: NDavid Brown <davidb@codeaurora.org> Acked-by: NLinus Walleij <linus.walleij@linaro.org> Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org>
-
由 David Brown 提交于
Migrate the driver for the v6-based MSM chips into drivers/gpio. The driver is unchanged, only moved. Change-Id: I03ba597b95b4d62b42da112a8efac88d67aa40f9 Signed-off-by: NDavid Brown <davidb@codeaurora.org> Acked-by: NLinus Walleij <linus.walleij@linaro.org> Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org>
-
由 David Brown 提交于
No need to have a separate header file containing only register definitions that are used by a single driver. Fold these into the gpio driver. Signed-off-by: NDavid Brown <davidb@codeaurora.org>
-
由 David Brown 提交于
The gpiomux.h header contains some SOC ifdefs. However, the API that is actually used by the GPIO driver only uses two functions that are general. Move these general definitions into a public header file. Change-Id: Ia5df8af87dba268225598d56908e523bcfc24ef6 Signed-off-by: NDavid Brown <davidb@codeaurora.org> Acked-by: NLinus Walleij <linus.walleij@linaro.org> Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org>
-
由 David Brown 提交于
Select the GPIO register configuration at runtime rather than through idefs. Change-Id: I02ea0a3d61bc81669f32097c32420f0688552231 Signed-off-by: NDavid Brown <davidb@codeaurora.org> Acked-by: NLinus Walleij <linus.walleij@linaro.org> Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org>
-
由 David Brown 提交于
Put an SOC prefix on each GPIO register definition, eliminating the need to have SOC ifdefs around the definitions. Change-Id: I5a01fd328a89ce1be610847934d6e118f5465e42 Signed-off-by: NDavid Brown <davidb@codeaurora.org> Acked-by: NLinus Walleij <linus.walleij@linaro.org> Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org>
-
由 David Brown 提交于
The two GPIO controllers are always mapped to the same virtual address across all MSM devices. Instead of selecting this at compile time, determine the physical address at runtime, eliminating yet something else preventing multiple MSM targets from being compiled into the same kernel. Change-Id: I1672219d978ab6243526adeda6badf49472baa27 Signed-off-by: NDavid Brown <davidb@codeaurora.org> Acked-by: NLinus Walleij <linus.walleij@linaro.org> Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org>
-
由 David Brown 提交于
The MSM7x25 and MSM7x27 devices are not yet supported in the kernel. Remove #ifdef-based tables supporting these chips for now. Change-Id: I4d9f5abc4cc0942ce75a067097b072489493c1b8 Signed-off-by: NDavid Brown <davidb@codeaurora.org> Acked-by: NLinus Walleij <linus.walleij@linaro.org> Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org>
-
- 31 7月, 2011 14 次提交
-
-
由 Greg Dietsche 提交于
Remove unnecessary code that matches this coccinelle pattern if (...) return ret; return ret; Signed-off-by: NGreg Dietsche <Gregory.Dietsche@cuw.edu> Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
-
由 Geert Uytterhoeven 提交于
It's been unused for ages, and contains bugs (e.g. incorrect shifts in lsl64()). Reported-by: NJonathan Elchison <jelchison@gmail.com> Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
-
由 Geert Uytterhoeven 提交于
arch/m68k/kernel/setup_mm.c: In function ‘setup_arch’: arch/m68k/kernel/setup_mm.c:219: warning: unused variable ‘i’ Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
-
由 Geert Uytterhoeven 提交于
These defines are way to generic, and cause conflicts: drivers/net/wireless/rtlwifi/rtl8192c/../rtl8192ce/reg.h:369:1: warning: "GPIO_IN" redefined drivers/net/wireless/rtlwifi/rtl8192c/../rtl8192ce/reg.h:370:1: warning: "GPIO_OUT" redefined drivers/net/wireless/rtlwifi/rtl8192se/reg.h:252:1: warning: "GPIO_IN" redefined drivers/net/wireless/rtlwifi/rtl8192se/reg.h:253:1: warning: "GPIO_OUT" redefined Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
-
由 Geert Uytterhoeven 提交于
Replace a custom implementation (which doesn't lock the resource tree) by a call to lookup_resource() Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org> Acked-by: NDavid S. Miller <davem@davemloft.net>
-
由 Michael Schmitz 提交于
Based on an original patch from Michael Schmitz: Because mem_init() is now called before device init, devices that rely on ST-RAM may find all ST-RAM already allocated to other users by the time device init happens. In particular, a large initrd RAM disk may use up enough of ST-RAM to cause atari_stram_alloc() to resort to __get_dma_pages() allocation. In the current state of Atari memory management, all of RAM is marked DMA capable, so __get_dma_pages() may well return RAM that is not in actual fact DMA capable. Using this for frame buffer or SCSI DMA buffer causes subtle failure. The ST-RAM allocator has been changed to allocate memory from a pool of reserved ST-RAM of configurable size, set aside on ST-RAM init (i.e. before mem_init()). As long as this pool is not exhausted, allocation of real ST-RAM can be guaranteed. Other changes: - Replace the custom allocator in the ST-RAM pool by the existing allocator in the resource subsystem, - Remove mem_init_done and its hook, as memory init is now done before device init, - Remove /proc/stram, as ST-RAM usage now shows up under /proc/iomem, e.g. 005f2000-006f1fff : ST-RAM Pool 005f2000-0063dfff : atafb 0063e000-00641fff : ataflop 00642000-00642fff : SCSI Signed-off-by: NMichael Schmitz <schmitz@debian.org> [Andreas Schwab <schwab@linux-m68k.org>: Use memparse()] [Geert: Use the resource subsystem instead of a custom allocator] Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
-
由 Geert Uytterhoeven 提交于
Replace a custom implementation (which doesn't lock the resource tree) by a call to lookup_resource() Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
-
由 Geert Uytterhoeven 提交于
The address that's passed to _sparc_find_resource() should always be the start address of a resource: - iounmap() passes a page-aligned virtual address, while the original address was created by adding the in-page offset to the resource's start address, - sbus_free_coherent() and pci32_free_coherent() should be passed an address obtained from sbus_alloc_coherent() resp. pci32_alloc_coherent(), which is always a resource's start address. Hence replace the range check by a check for an exact match. Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org> Acked-by: NDavid S. Miller <davem@davemloft.net>
-
由 Geert Uytterhoeven 提交于
Technically, the end of Chip RAM should be offset by CHIP_PHYSADDR (which is zero). Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
-
由 Geert Uytterhoeven 提交于
Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
-
由 Geert Uytterhoeven 提交于
While the core resource handling code is safe, our global counter must still be protected against concurrent modifications. Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
-
由 Geert Uytterhoeven 提交于
As of commit 5df1abdb ('m68k/amiga: Fix "debug=mem"'), "debug=mem" no longer uses amiga_chip_alloc_res(), so we can remove the hack to prefer memory at the safe end. This allows to simplify the code and make amiga_chip_alloc() just call amiga_chip_alloc_res() internally. Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
-
由 Geert Uytterhoeven 提交于
and fix a few formattings: - resource sizes are now resource_size_t, use %pR to make it future proof, - use %lu for unsigned long. Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
-
由 Geert Uytterhoeven 提交于
Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
-
- 30 7月, 2011 1 次提交
-
-
由 Greg Dietsche 提交于
remove unnecessary code that matches this coccinelle pattern if (...) return ret; return ret; Signed-off-by: NGreg Dietsche <Gregory.Dietsche@cuw.edu> Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 29 7月, 2011 1 次提交
-
-
由 Arnd Bergmann 提交于
My previous commit left the file empty and present in the Makefile, which is a bit dirty and caused problems with 'make distclean', as pointed out by David Howells. This hopefully cleans it up the right way. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NDavid Howells <dhowells@redhat.com> Acked-by: NJohn Linn <john.linn@xilinx.com>
-
- 28 7月, 2011 9 次提交
-
-
Signed-off-by: NJean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Patrice Vilchez <patrice.vilchez@atmel.com>
-
Signed-off-by: NJean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Patrice Vilchez <patrice.vilchez@atmel.com>
-
Signed-off-by: NJean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Patrice Vilchez <patrice.vilchez@atmel.com>
-
Signed-off-by: NJean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Patrice Vilchez <patrice.vilchez@atmel.com>
-
instead of reading the registers everytime the current implementation respect the following constrain: - allow 1 to n soc to be enabled - allow to have a virtual cpu type and subtype - always detect the cpu type and subtype and report it - detect if the soc support is enabled - prepare for sysfs export support - drop soc specific code via compiler when the soc not enabled (via cpu_is_xxx) Today if we read the exid we will have the same value for 9g35 and 9m11 and we will need to check the cidr too with the new implementation we just need to check the soc subtype this will also allow to have specific virtual subtype for rm9200 which the board will have to specify via at91rm9200_set_type(int) as we have no way to detect it. this implementation is inspired by the SH cpu detection support Signed-off-by: NJean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Patrice Vilchez <patrice.vilchez@atmel.com>
-
to make the soc base specified at runtime instead of compiled time Signed-off-by: NJean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
-
they are the same except the default priority Signed-off-by: NJean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Patrice Vilchez <patrice.vilchez@atmel.com>
-
On all at91 except rm9200 and x40 have the System Controller starts at address 0xffffc000 and has a size of 16KiB. On rm9200 it's start at 0xfffe4000 of 111KiB with non reserved data starting at 0xfffff000 This patch removes the individual definitions of AT91_BASE_SYS and replaces them with a common version at base 0xfffffc000 and size 16KiB and map the same memory space Signed-off-by: NJean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Patrice Vilchez <patrice.vilchez@atmel.com>
-
由 Grant Likely 提交于
Everything required to populate NVIDIA Tegra devices from the device tree. This patch adds a new DT_MACHINE_DESC() which matches against a tegra20 device tree. So far it only registers the on-chip devices, but it will be refined in follow on patches to configure clocks and pin IO from the device tree also. Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>
-