- 24 3月, 2012 16 次提交
-
-
由 Darrick J. Wong 提交于
Reuse the existing crc32 code to stamp out a crc32c implementation. Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Bob Pearson <rpearson@systemfabricworks.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Bob Pearson 提交于
Add a comment at the top of crc32.c [djwong@us.ibm.com: Minor changelog tweaks] Signed-off-by: NBob Pearson <rpearson@systemfabricworks.com> Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Bob Pearson 提交于
Add two changes that improve the performance of x86 systems 1. replace main loop with incrementing counter this change improves the performance of the selftest by about 5-6% on Nehalem CPUs. The apparent reason is that the compiler can use the loop index to perform an indexed memory access. This is reported to make the performance of PowerPC CPUs to get worse. 2. replace the rem_len loop with incrementing counter this change improves the performance of the selftest, which has more than the usual number of occurances, by about 1-2% on x86 CPUs. In actual work loads the length is most often a multiple of 4 bytes and this code does not get executed as often if at all. Again this change is reported to make the performance of PowerPC get worse. [djwong@us.ibm.com: Minor changelog tweaks] Signed-off-by: NBob Pearson <rpearson@systemfabricworks.com> Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Bob Pearson 提交于
Add slicing-by-8 algorithm to the existing slicing-by-4 algorithm. This consists of: - extend largest BITS size from 32 to 64 - extend tables from tab[4][256] to up to tab[8][256] - Add code for inner loop. [djwong@us.ibm.com: Minor changelog tweaks] Signed-off-by: NBob Pearson <rpearson@systemfabricworks.com> Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Bob Pearson 提交于
crc32.c provides a choice of one of several algorithms for computing the LSB and LSB versions of the CRC32 checksum based on the parameters CRC_LE_BITS and CRC_BE_BITS. In the original version the values 1, 2, 4 and 8 respectively selected versions of the alrogithm that computed the crc 1, 2, 4 and 32 bits as a time. This patch series adds a new version that computes the CRC 64 bits at a time. To make things easier to understand the parameter has been reinterpreted to actually stand for the number of bits processed in each step of the algorithm so that the old value 8 has been replaced with the value 32. This also allows us to add in a widely used crc algorithm that computes the crc 8 bits at a time called the Sarwate algorithm. [djwong@us.ibm.com: Minor changelog tweaks] Signed-off-by: NBob Pearson <rpearson@systemfabricworks.com> Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Bob Pearson 提交于
crc32.c in its original version freely mixed u32, __le32 and __be32 types which caused warnings from sparse with __CHECK_ENDIAN__. This patch fixes these by forcing the types to u32. [djwong@us.ibm.com: Minor changelog tweaks] Signed-off-by: NBob Pearson <rpearson@systemfabricworks.com> Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Bob Pearson 提交于
Misc cleanup of lib/crc32.c and related files. - remove unnecessary header files. - straighten out some convoluted ifdef's - rewrite some references to 2 dimensional arrays as 1 dimensional arrays to make them correct. I.e. replace tab[i] with tab[0][i]. - a few trivial whitespace changes - fix a warning in gen_crc32tables.c caused by a mismatch in the type of the pointer passed to output table. Since the table is only used at kernel compile time, it is simpler to make the table big enough to hold the largest column size used. One cannot make the column size smaller in output_table because it has to be used by both the le and be tables and they can have different column sizes. [djwong@us.ibm.com: Minor changelog tweaks] Signed-off-by: NBob Pearson <rpearson@systemfabricworks.com> Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Bob Pearson 提交于
Replace the unit test provided in crc32.c, which doesn't have a makefile and doesn't compile with current headers, with a simpler self test routine that also gives a measure of performance and runs at module init time. The self test option can be enabled through a configuration option CONFIG_CRC32_SELFTEST. The test stresses the pre and post loops and is thus not very realistic since actual uses will likely have addresses and lengths that are at least 4 byte aligned. However, the main loop is long enough so that the performance is dominated by that loop. The expected values for crc32_le and crc32_be were generated with the original version of crc32.c using CRC_BITS_LE = 8 and CRC_BITS_BE = 8. These values were then used to check all the values of the BITS parameters in both the original and new versions. The performance results show some variability from run to run in spite of attempts to both warm the cache and reduce the amount of OS noise by limiting interrutps during the test. To get comparable results and to analyse options wrt performance the best time reported over a small sample of runs has been taken. [djwong@us.ibm.com: Minor changelog tweaks] Signed-off-by: NBob Pearson <rpearson@systemfabricworks.com> Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Bob Pearson 提交于
Move a long comment from lib/crc32.c to Documentation/crc32.txt where it will more likely get read. Edited the resulting document to add an explanation of the slicing-by-n algorithm. [djwong@us.ibm.com: minor changelog tweaks] [akpm@linux-foundation.org: fix typo, per George] Signed-off-by: NGeorge Spelvin <linux@horizon.com> Signed-off-by: NBob Pearson <rpearson@systemfabricworks.com> Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Bob Pearson 提交于
This patchset (re)uses Bob Pearson's crc32 slice-by-8 code to stamp out a software crc32c implementation. It removes the crc32c implementation in crypto/ in favor of using the stamped-out one in lib/. There is also a change to Kconfig so that the kernel builder can pick an implementation best suited for the hardware. The motivation for this patchset is that I am working on adding full metadata checksumming to ext4. As far as performance impact of adding checksumming goes, I see nearly no change with a standard mail server ffsb simulation. On a test that involves only file creation and deletion and extent tree writes, I see a drop of about 50 pcercent with the current kernel crc32c implementation; this improves to a drop of about 20 percent with the enclosed crc32c code. When metadata is usually a small fraction of total IO, this new implementation doesn't help much because metadata is usually a small fraction of total IO. However, when we are doing IO that is almost all metadata (such as rm -rf'ing a tree), then this patch speeds up the operation substantially. Incidentally, given that iscsi, sctp, and btrfs also use crc32c, this patchset should improve their speed as well. I have not yet quantified that, however. This latest submission combines Bob's patches from late August 2011 with mine so that they can be one coherent patch set. Please excuse my inability to combine some of the patches; I've been advised to leave Bob's patches alone and build atop them instead. :/ Since the last posting, I've also collected some crc32c test results on a bunch of different x86/powerpc/sparc platforms. The results can be viewed here: http://goo.gl/sgt3i ; the "crc32-kern-le" and "crc32c" columns describe the performance of the kernel's current crc32 and crc32c software implementations. The "crc32c-by8-le" column shows crc32c performance with this patchset applied. I expect crc32 performance to be roughly the same. The two _boost columns at the right side of the spreadsheet shows how much faster the new implementation is over the old one. As you can see, crc32 rises substantially, and crc32c experiences a huge increase. This patch: - remove trailing whitespace from lib/crc32.c - remove trailing whitespace from lib/crc32defs.h [djwong@us.ibm.com: changelog tweaks] Signed-off-by: NBob Pearson <rpearson@systemfabricworks.com> Signed-off-by: NDarrick J. Wong <djwong@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Xiao Guangrong 提交于
Introduce prio_set_parent() to abstract the operation which is used to attach the node to its parent. Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Xiao Guangrong 提交于
In current code, the deleted-node is recorded from first to last, actually, we can directly attach these node on 'node' we will insert as the left child, it can let the code more readable. Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Xiao Guangrong 提交于
Introduce iter_walk_down()/iter_walk_up() to remove the common code between prio_tree_left() and prio_tree_right(). Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Xiao Guangrong 提交于
Remove the code since 'node' has already been initialized in the begin of the function Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Akinobu Mita 提交于
- Generate a 64-bit pattern more efficiently memchr_inv needs to generate a 64-bit pattern filled with a target character. The operation can be done by more efficient way. - Don't call the slow check_bytes() if the memory area is 64-bit aligned memchr_inv compares contiguous 64-bit words with the 64-bit pattern as much as possible. The outside of the region is checked by check_bytes() that scans for each byte. Unfortunately, the first 64-bit word is unexpectedly scanned by check_bytes() even if the memory area is aligned to a 64-bit boundary. Both changes were originally suggested by Eric Dumazet. Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com> Suggested-by: NEric Dumazet <eric.dumazet@gmail.com> Cc: Brian Norris <computersforpeace@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Cong Wang 提交于
ARCH_HAS_NMI_WATCHDOG is a macro defined by arch, but config HARDLOCKUP_DETECTOR depends on it. This is wrong, ARCH_HAS_NMI_WATCHDOG has to be a Kconfig config, and arch's need it should select it explicitly. Signed-off-by: NWANG Cong <xiyou.wangcong@gmail.com> Acked-by: NDon Zickus <dzickus@redhat.com> Acked-by: NMike Frysinger <vapier@gentoo.org> Cc: David Howells <dhowells@redhat.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 3月, 2012 1 次提交
-
-
由 Hugh Dickins 提交于
Make one small adjustment to idr_get_next(): take the height from the top layer (stable under RCU) instead of from the root (unprotected by RCU), as idr_find() does: so that it can be used with RCU locking. Copied comment on RCU locking from idr_find(). Signed-off-by: NHugh Dickins <hughd@google.com> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: NLi Zefan <lizf@cn.fujitsu.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 3月, 2012 1 次提交
-
-
由 Cong Wang 提交于
Signed-off-by: NCong Wang <amwang@redhat.com>
-
- 12 3月, 2012 1 次提交
-
-
由 Tom Herbert 提交于
In some configurations, jiffies may be undefined in lib/dynamic_queue_limits.c. Adding include of jiffies.h to avoid this. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 3月, 2012 1 次提交
-
-
由 Andrew Vagin 提交于
The queue handling in the udev daemon assumes that the events are ordered. Before this patch uevent_seqnum is incremented under sequence_lock, than an event is send uner uevent_sock_mutex. I want to say that code contained a window between incrementing seqnum and sending an event. This patch locks uevent_sock_mutex before incrementing uevent_seqnum. v2: delete sequence_lock, uevent_seqnum is protected by uevent_sock_mutex v3: unlock the mutex before the goto exit Thanks for Kay for the comments. Signed-off-by: NAndrew Vagin <avagin@openvz.org> Tested-By: NKay Sievers <kay.sievers@vrfy.org> Cc: stable <stable@vger.kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 07 3月, 2012 1 次提交
-
-
由 Jan Beulich 提交于
kasprintf() (and potentially other functions that I didn't run across so far) want to evaluate argument lists twice. Caring to do so for the primary list is obviously their job, but they can't reasonably be expected to check the format string for instances of %pV, which however need special handling too: On architectures like x86-64 (as opposed to e.g. ix86), using the same argument list twice doesn't produce the expected results, as an internally managed cursor gets updated during the first run. Fix the problem by always acting on a copy of the original list when handling %pV. Signed-off-by: NJan Beulich <jbeulich@suse.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 3月, 2012 1 次提交
-
-
由 Stephen Boyd 提交于
debugobjects is now printing a warning when a fixup for a NOTAVAILABLE object is run. This causes the selftest to fail like: ODEBUG: selftest warnings failed 4 != 5 We could just increase the number of warnings that the selftest is expecting to see because that is actually what has changed. But, it turns out that fixup_activate() was written with inverted logic and thus a fixup for a static object returned 1 indicating the object had been fixed, and 0 otherwise. Fix the logic to be correct and update the counts to reflect that nothing needed fixing for a static object. Signed-off-by: NStephen Boyd <sboyd@codeaurora.org> Reported-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 2月, 2012 2 次提交
-
-
由 Paul E. McKenney 提交于
There have been situations where RCU CPU stall warnings were caused by issues in scheduling-clock timer initialization. To make it easier to track these down, this commit causes the RCU CPU stall-warning messages to print out the number of scheduling-clock interrupts taken in the current grace period for each stalled CPU. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
The RCU_TRACE kernel parameter has always been intended for debugging, not for production use. Formalize this by moving RCU_TRACE from init/Kconfig to lib/Kconfig.debug. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 11 2月, 2012 1 次提交
-
-
The soft and hard lockup thresholds have changed so the corresponding Kconfig entries need to be updated accordingly. Add a reference to watchdog_thresh while at it. Signed-off-by: NFernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: NDon Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1328827342-6253-2-git-send-email-dzickus@redhat.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 10 2月, 2012 1 次提交
-
-
由 David Howells 提交于
_parse_integer() does one or two division instructions (which are slow) per digit parsed to perform the overflow check. Furthermore, these are particularly expensive examples of division instruction as the number of clock cycles required to complete them may go up with the position of the most significant set bit in the dividend: if (*res > div_u64(ULLONG_MAX - val, base)) which is as maximal as possible. Worse, on 32-bit arches, more than one of these division instructions may be required per digit. So, assuming we don't support a base of more than 16, skip the check if the top nibble of the result is not set at this point. Signed-off-by: NDavid Howells <dhowells@redhat.com> [ Changed it to not dereference the pointer all the time - even if the compiler can and does optimize it away, the code just looks cleaner. And edited the top nybble test slightly to make the code generated on x86-64 better in the loop - test against a hoisted constant instead of shifting and testing the result ] Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 02 2月, 2012 2 次提交
-
-
由 David Miller 提交于
This copy of longlong.h is extremely dated and results in compile errors on sparc32 when MPILIB is enabled, copy over the more uptodate implementation from arch/sparc/math/sfp-util_32.h Reported-by: NAl Viro <viro@ZenIV.linux.org.uk> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 David Miller 提交于
Both sparc 32-bit's software divide assembler and MPILIB provide clz_tab[] with identical contents. Break it out into a seperate object file and select it when SPARC32 or MPILIB is set. Reported-by: NAl Viro <viro@ZenIV.linux.org.uk> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NJames Morris <jmorris@namei.org>
-
- 01 2月, 2012 12 次提交
-
-
由 Dmitry Kasatkin 提交于
mpi_read_from_buffer() return value must not be NULL. Signed-off-by: NDmitry Kasatkin <dmitry.kasatkin@intel.com> Reviewed-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 Dmitry Kasatkin 提交于
Added missing NULL check after mpi_alloc_limb_space(). Signed-off-by: NDmitry Kasatkin <dmitry.kasatkin@intel.com> Reviewed-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 Dmitry Kasatkin 提交于
Comment explains that existing clients do not call this function with dsize == 0, which means that 1/0 should not happen. Signed-off-by: NDmitry Kasatkin <dmitry.kasatkin@intel.com> Reviewed-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 Dmitry Kasatkin 提交于
Buggy client might pass zero nlimbs which is meaningless. Added check for zero length. Signed-off-by: NDmitry Kasatkin <dmitry.kasatkin@intel.com> Reviewed-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 Dmitry Kasatkin 提交于
Removed useless 'is_valid' variable in pkcs_1_v1_5_decode_emsa(), which was inhereted from original code. Client now uses return value to check for an error. Signed-off-by: NDmitry Kasatkin <dmitry.kasatkin@intel.com> Reviewed-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 Dmitry Kasatkin 提交于
Added sanity checks for possible wrongly formatted key payload data: - minimum key payload size - zero modulus length - corrected upper key payload boundary. Signed-off-by: NDmitry Kasatkin <dmitry.kasatkin@intel.com> Reviewed-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 Dmitry Kasatkin 提交于
do_encode_md() and mpi_get_keyid() are not parts of mpi library. They were used early versions of gnupg and in digsig project, but they are not used neither here nor there anymore. Signed-off-by: NDmitry Kasatkin <dmitry.kasatkin@intel.com> Reviewed-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 Dmitry Kasatkin 提交于
Divisor length should not be 0. Signed-off-by: NDmitry Kasatkin <dmitry.kasatkin@intel.com> Reviewed-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 Dmitry Kasatkin 提交于
Definitely better to return error code than to divide by zero. Signed-off-by: NDmitry Kasatkin <dmitry.kasatkin@intel.com> Reviewed-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 Dmitry Kasatkin 提交于
MPI_NULL is replaced with normal NULL. Signed-off-by: NDmitry Kasatkin <dmitry.kasatkin@intel.com> Reviewed-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 Dmitry Kasatkin 提交于
Added missing NULL check after mpi_alloc(). Signed-off-by: NDmitry Kasatkin <dmitry.kasatkin@intel.com> Reviewed-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 Michael S. Tsirkin 提交于
Some architectures need to override the way IO port mapping is done on PCI devices. Supply a generic macro that calls ioport_map, and make it possible for architectures to override. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NArnd Bergmann <arnd@arndb.de>
-