- 18 10月, 2007 1 次提交
-
-
由 David S. Miller 提交于
When the cpu count is high and contention hits an atomic object, the processors can synchronize such that some cpus continually get knocked out and cannot complete the atomic update. So implement an exponential backoff when SMP. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 10月, 2007 1 次提交
-
-
由 David S. Miller 提交于
Some typos led to using %i6/%i7 instead of %l6/%l7 in loads which is really really bad because those are the frame pointer and return PC. Based upon a raid5 crash report by Bertrand Joel. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 10月, 2007 1 次提交
-
-
由 David S. Miller 提交于
It doesn't matter for use in 64-bit objects, but when used in 32-bit environments the top 32-bits of the local and in registers will get chopped off on the next register window spill/restore which leads to difficult to track down and subtle bugs. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 10月, 2007 1 次提交
-
-
由 David S. Miller 提交于
For the case where the source is not aligned modulo 8 we don't use load-twins to suck the data in and this kills performance since normal loads allocate in the L1 cache (unlike load-twin) and thus big memcpys swipe the entire L1 D-cache. We need to allocate a register window to implement this properly, but that actually simplifies a lot of things as a nice side-effect. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 8月, 2007 1 次提交
-
-
由 David S. Miller 提交于
The bzero/memset implementation stays the same as Niagara-1. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 8月, 2007 1 次提交
-
-
由 David S. Miller 提交于
Check the cpu type in the OBP device tree before committing to using the optimized Niagara memcpy and memset implementation. If we don't recognize the cpu type, use a completely generic version. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 7月, 2007 1 次提交
-
-
由 David S. Miller 提交于
Take a page from the powerpc folks and just calculate the delay factor directly. Since frequency scaling chips use a system-tick register, the value is going to be the same system-wide. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 3月, 2007 1 次提交
-
-
由 David S. Miller 提交于
The manual says that it is required and we actually have crash reports where loads see stale data due to not having membars here. In one case the networking does: memset(skb, 0, offsetof(struct sk_buff, truesize)); and then some code later checks skb->nohdr for zero, but it's still the value that was there before the memset(). Note that arch/sparc64/lib/xor.S already got this right. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 7月, 2006 1 次提交
-
-
由 Jörn Engel 提交于
Signed-off-by: NJörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: NAdrian Bunk <bunk@stusta.de>
-
- 05 6月, 2006 1 次提交
-
-
由 David S. Miller 提交于
Both csum_partial() and the csum_partial_copy*() family of routines forget to do a final fold on the computed checksum value on sparc64. So do the standard Sparc "add + set condition codes, add carry" sequence, then make sure the high 32-bits of the return value are clear. Based upon some excellent detective work and debugging done by Richard Braun and Samuel Thibault. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 3月, 2006 1 次提交
-
-
由 Akinobu Mita 提交于
- remove __{,test_and_}{set,clear,change}_bit() and test_bit() - remove ffz() - remove __ffs() - remove generic_fls() - remove generic_fls64() - remove sched_find_first_bit() - remove ffs() - unless defined(ULTRA_HAS_POPULATION_COUNT) - remove generic_hweight{64,32,16,8}() - remove find_{next,first}{,_zero}_bit() - remove ext2_{set,clear,test,find_first_zero,find_next_zero}_bit() - remove minix_{test,set,test_and_clear,test,find_first_zero}_bit() Signed-off-by: NAkinobu Mita <mita@miraclelinux.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 20 3月, 2006 11 次提交
-
-
由 David S. Miller 提交于
We only need to write an invalid tag every 16 bytes, so taking advantage of this can save many instructions compared to the simple memset() call we make now. A prefetching implementation is implemented for sun4u and a block-init store version if implemented for Niagara. The next trick is to be able to perform an init and a copy_tsb() in parallel when growing a TSB table. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This gives more consistent bogomips and delay() semantics, especially on sun4v. It gives weird looking values though... Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
The bug that hit SUN4V TLB patching exists elsewhere. Make sure we cure all such cases. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Yes, you heard it right, they changed the PTE layout for SUN4V. Ho hum... This is the simple and inefficient way to support this. It'll get optimized, don't worry. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
We need to restore the %asi register properly. For the kernel this means get_fs(), for user this means ASI_PNF. Also, NGcopy_to_user.S was including U3memcpy.S instead of NGmemcpy.S, oops :-) Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Happily we have no D-cache aliasing issues on these chips, so the implementation is very straightforward. Add a stub in bootup which will be where the patching calls will be made for niagara/sun4v/hypervisor. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Some of the trap code was still assuming that alternate global %g6 was hard coded with current_thread_info(). Let's just consistently flush at KERNBASE when we need a pipeline synchronization. That's locked into the TLB and will always work. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 3月, 2006 1 次提交
-
-
由 David S. Miller 提交于
We must use the "a" (allocate) attribute every time we emit an entry into the __ex_table section. For consistency, use "a" instead of #alloc which is some Solaris compat cruft GNU as provides on Sparc. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 10月, 2005 1 次提交
-
-
由 David S. Miller 提交于
We need to use stricter memory barriers around the block load and store instructions we use to save and restore the FPU register file. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 9月, 2005 2 次提交
-
-
由 David S. Miller 提交于
Instead of doing byte-at-a-time user accesses to figure out where the fault occurred, read the saved fault_address from the current thread structure. For the sake of defensive programming, if the fault_address does not fall into the user buffer range, simply assume the whole area faulted. This will cause the fixup for copy_from_user() to clear the entire kernel side buffer. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
We were not calling kernel_mna_trap_fault() correctly. Instead of being fancy, just return 0 vs. -EFAULT from the assembler stubs, and handle that return value as appropriate. Create an "__retl_efault" stub for assembler exception table entries and use it where possible. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 9月, 2005 1 次提交
-
-
由 David S. Miller 提交于
Several implementations were essentialy a common piece of C code using the cmpxchg() macro. Put the implementation in one spot that everyone can share, and convert sparc64 over to using this. Alpha is the lone arch-specific implementation, which codes up a special fast path for the common case in order to avoid GP reloading which a pure C version would require. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 9月, 2005 1 次提交
-
-
由 Ingo Molnar 提交于
This patch (written by me and also containing many suggestions of Arjan van de Ven) does a major cleanup of the spinlock code. It does the following things: - consolidates and enhances the spinlock/rwlock debugging code - simplifies the asm/spinlock.h files - encapsulates the raw spinlock type and moves generic spinlock features (such as ->break_lock) into the generic code. - cleans up the spinlock code hierarchy to get rid of the spaghetti. Most notably there's now only a single variant of the debugging code, located in lib/spinlock_debug.c. (previously we had one SMP debugging variant per architecture, plus a separate generic one for UP builds) Also, i've enhanced the rwlock debugging facility, it will now track write-owners. There is new spinlock-owner/CPU-tracking on SMP builds too. All locks have lockup detection now, which will work for both soft and hard spin/rwlock lockups. The arch-level include files now only contain the minimally necessary subset of the spinlock code - all the rest that can be generalized now lives in the generic headers: include/asm-i386/spinlock_types.h | 16 include/asm-x86_64/spinlock_types.h | 16 I have also split up the various spinlock variants into separate files, making it easier to see which does what. The new layout is: SMP | UP ----------------------------|----------------------------------- asm/spinlock_types_smp.h | linux/spinlock_types_up.h linux/spinlock_types.h | linux/spinlock_types.h asm/spinlock_smp.h | linux/spinlock_up.h linux/spinlock_api_smp.h | linux/spinlock_api_up.h linux/spinlock.h | linux/spinlock.h /* * here's the role of the various spinlock/rwlock related include files: * * on SMP builds: * * asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the * initializers * * linux/spinlock_types.h: * defines the generic type and initializers * * asm/spinlock.h: contains the __raw_spin_*()/etc. lowlevel * implementations, mostly inline assembly code * * (also included on UP-debug builds:) * * linux/spinlock_api_smp.h: * contains the prototypes for the _spin_*() APIs. * * linux/spinlock.h: builds the final spin_*() APIs. * * on UP builds: * * linux/spinlock_type_up.h: * contains the generic, simplified UP spinlock type. * (which is an empty structure on non-debug builds) * * linux/spinlock_types.h: * defines the generic type and initializers * * linux/spinlock_up.h: * contains the __raw_spin_*()/etc. version of UP * builds. (which are NOPs on non-debug, non-preempt * builds) * * (included on UP-non-debug builds:) * * linux/spinlock_api_up.h: * builds the _spin_*() APIs. * * linux/spinlock.h: builds the final spin_*() APIs. */ All SMP and UP architectures are converted by this patch. arm, i386, ia64, ppc, ppc64, s390/s390x, x64 was build-tested via crosscompilers. m32r, mips, sh, sparc, have not been tested yet, but should be mostly fine. From: Grant Grundler <grundler@parisc-linux.org> Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU). Builds 32-bit SMP kernel (not booted or tested). I did not try to build non-SMP kernels. That should be trivial to fix up later if necessary. I converted bit ops atomic_hash lock to raw_spinlock_t. Doing so avoids some ugly nesting of linux/*.h and asm/*.h files. Those particular locks are well tested and contained entirely inside arch specific code. I do NOT expect any new issues to arise with them. If someone does ever need to use debug/metrics with them, then they will need to unravel this hairball between spinlocks, atomic ops, and bit ops that exist only because parisc has exactly one atomic instruction: LDCW (load and clear word). From: "Luck, Tony" <tony.luck@intel.com> ia64 fix Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NArjan van de Ven <arjanv@infradead.org> Signed-off-by: NGrant Grundler <grundler@parisc-linux.org> Cc: Matthew Wilcox <willy@debian.org> Signed-off-by: NHirokazu Takata <takata@linux-m32r.org> Signed-off-by: NMikael Pettersson <mikpe@csd.uu.se> Signed-off-by: NBenoit Boissinot <benoit.boissinot@ens-lyon.org> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 09 9月, 2005 1 次提交
-
-
由 David S. Miller 提交于
Since GCC has to emit a call and a delay slot to the out-of-line "membar" routines in arch/sparc64/lib/mb.S it is much better to just do the necessary predicted branch inline instead as: ba,pt %xcc, 1f membar #whatever 1: instead of the current: call membar_foo dslot because this way GCC is not required to allocate a stack frame if the function can be a leaf function. This also makes this bug fix easier to backport to 2.4.x Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 9月, 2005 1 次提交
-
-
由 David S. Miller 提交于
This kills warnings when building drivers/ide/ide-iops.c and puts us in-line with what other platforms do here. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 8月, 2005 1 次提交
-
-
由 David S. Miller 提交于
Just patch the branch at boot time instead. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 8月, 2005 2 次提交
-
-
由 David S. Miller 提交于
It appears that a memory barrier soon after a mispredicted branch, not just in the delay slot, can cause the hang condition of this cpu errata. So move them out-of-line, and explicitly put them into a "branch always, predict taken" delay slot which should fully kill this problem. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
When the spinlock routines were moved out of line into kernel/spinlock.c this made it so that the debugging spinlocks record lock acquisition program counts in the kernel/spinlock.c functions not in their callers. This makes the debugging info kind of useless. So record the correct caller's program counter and now this feature is useful once more. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 6月, 2005 1 次提交
-
-
由 David S. Miller 提交于
In particular, avoid membar instructions in the delay slot of a jmpl instruction. UltraSPARC-I, II, IIi, and IIe have a bug, documented in the UltraSPARC-IIi User's Manual, Appendix K, Erratum 51 The long and short of it is that if the IMU unit misses on a branch or jmpl, and there is a store buffer synchronizing membar in the delay slot, the chip can stop fetching instructions. If interrupts are enabled or some other trap is enabled, the chip will unwedge itself, but performance will suffer. We already had a workaround for this bug in a few spots, but it's better to have the entire tree sanitized for this rule. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 6月, 2005 1 次提交
-
-
由 Ingo Molnar 提交于
This patch implements a number of smp_processor_id() cleanup ideas that Arjan van de Ven and I came up with. The previous __smp_processor_id/_smp_processor_id/smp_processor_id API spaghetti was hard to follow both on the implementational and on the usage side. Some of the complexity arose from picking wrong names, some of the complexity comes from the fact that not all architectures defined __smp_processor_id. In the new code, there are two externally visible symbols: - smp_processor_id(): debug variant. - raw_smp_processor_id(): nondebug variant. Replaces all existing uses of _smp_processor_id() and __smp_processor_id(). Defined by every SMP architecture in include/asm-*/smp.h. There is one new internal symbol, dependent on DEBUG_PREEMPT: - debug_smp_processor_id(): internal debug variant, mapped to smp_processor_id(). Also, i moved debug_smp_processor_id() from lib/kernel_lock.c into a new lib/smp_processor_id.c file. All related comments got updated and/or clarified. I have build/boot tested the following 8 .config combinations on x86: {SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT} I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT. (Other architectures are untested, but should work just fine.) Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NArjan van de Ven <arjan@infradead.org> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 17 4月, 2005 1 次提交
-
-
由 Linus Torvalds 提交于
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
-