- 21 2月, 2013 1 次提交
-
-
由 David S. Miller 提交于
If our first THP installation for an MM is via the set_pmd_at() done during khugepaged's collapsing we'll end up in tsb_grow() trying to do a GFP_KERNEL allocation with several locks held. Simply using GFP_ATOMIC in this situation is not the best option because we really can't have this fail, so we'd really like to keep this an order 0 GFP_KERNEL allocation if possible. Also, doing the TSB allocation from khugepaged is a really bad idea because we'll allocate it potentially from the wrong NUMA node in that context. So what we do is defer the hugepage TSB allocation until the first TLB miss we take on a hugepage. This is slightly tricky because we have to handle two unusual cases: 1) Taking the first hugepage TLB miss in the window trap handler. We'll call the winfix_trampoline when that is detected. 2) An initial TSB allocation via TLB miss races with a hugetlb fault on another cpu running the same MM. We handle this by unconditionally loading the TSB we see into the current cpu even if it's non-NULL at hugetlb_setup time. Reported-by: NMeelis Roos <mroos@ut.ee> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 10月, 2012 1 次提交
-
-
由 David S. Miller 提交于
Missing error types, attributes, and report fields. Pad out to 64-bytes. Make string reporting cleaner and easier to extend in the future using "const char *" arrays that index by either bit position, or absolute field value. Report the raw 64-byte error report as a sequence of u64s before the annotated version. Only report fields which are valid, given the context and the attribute bits which are set. For shutdown requests, use the local copy of the error report not the one we just freed up back to the queue. Also, use orderly_poweroff() just like the Domain Services shutdown request code does. If the real-address reported is "-1" (unknown) try to disassemble the instruction to report the effective address of the access. Only do this in privileged mode. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 10月, 2012 2 次提交
-
-
由 David Miller 提交于
This is relatively easy since PMD's now cover exactly 4MB of memory. Our PMD entries are 32-bits each, so we use a special encoding. The lowest bit, PMD_ISHUGE, determines the interpretation. This is possible because sparc64's page tables are purely software entities so we can use whatever encoding scheme we want. We just have to make the TLB miss assembler page table walkers aware of the layout. set_pmd_at() works much like set_pte_at() but it has to operate in two page from a table of non-huge PTEs, so we have to queue up TLB flushes based upon what mappings are valid in the PTE table. In the second regime we are going from huge-page to non-huge-page, and in that case we need only queue up a single TLB flush to push out the huge page mapping. We still have 5 bits remaining in the huge PMD encoding so we can very likely support any new pieces of THP state tracking that might get added in the future. With lots of help from Johannes Weiner. Signed-off-by: NDavid S. Miller <davem@davemloft.net> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Shaohua Li 提交于
.fault now can retry. The retry can break state machine of .fault. In filemap_fault, if page is miss, ra->mmap_miss is increased. In the second try, since the page is in page cache now, ra->mmap_miss is decreased. And these are done in one fault, so we can't detect random mmap file access. Add a new flag to indicate .fault is tried once. In the second try, skip ra->mmap_miss decreasing. The filemap_fault state machine is ok with it. I only tested x86, didn't test other archs, but looks the change for other archs is obvious, but who knows :) Signed-off-by: NShaohua Li <shaohua.li@fusionio.com> Cc: Rik van Riel <riel@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 05 4月, 2012 1 次提交
-
-
由 Kautuk Consul 提交于
Commit d065bd81 (mm: retry page fault when blocking on disk transfer) and commit 37b23e05 (x86,mm: make pagefault killable) The above commits introduced changes into the x86 pagefault handler for making the page fault handler retryable as well as killable. These changes reduce the mmap_sem hold time, which is crucial during OOM killer invocation. Port these changes to 64-bit sparc. Signed-off-by: NKautuk Consul <consul.kautuk@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 7月, 2011 1 次提交
-
-
由 Peter Zijlstra 提交于
The nmi parameter indicated if we could do wakeups from the current context, if not, we would set some state and self-IPI and let the resulting interrupt do the wakeup. For the various event classes: - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from the PMI-tail (ARM etc.) - tracepoint: nmi=0; since tracepoint could be from NMI context. - software: nmi=[0,1]; some, like the schedule thing cannot perform wakeups, and hence need 0. As one can see, there is very little nmi=1 usage, and the down-side of not using it is that on some platforms some software events can have a jiffy delay in wakeup (when arch_irq_work_raise isn't implemented). The up-side however is that we can remove the nmi parameter and save a bunch of conditionals in fast paths. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Will Deacon <will.deacon@arm.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: Anton Blanchard <anton@samba.org> Cc: Eric B Munson <emunson@mgebm.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 01 3月, 2010 1 次提交
-
-
由 David S. Miller 提交于
Just faults right now, will add other traps later. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 1月, 2010 1 次提交
-
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 12月, 2009 2 次提交
-
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 8月, 2009 1 次提交
-
-
由 David S. Miller 提交于
As noted by Nick Piggin. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 6月, 2009 1 次提交
-
-
由 Linus Torvalds 提交于
This allows the callers to now pass down the full set of FAULT_FLAG_xyz flags to handle_mm_fault(). All callers have been (mechanically) converted to the new calling convention, there's almost certainly room for architectures to clean up their code and then add FAULT_FLAG_RETRY when that support is added. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 2月, 2009 1 次提交
-
-
由 David S. Miller 提交于
This builds upon eeabac73 ("sparc64: Validate kernel generated fault addresses on sparc64.") Upon further consideration, we actually should never see any fault addresses for 32-bit tasks with the upper 32-bits set. If it does every happen, by definition it's a bug. Whatever context created that fault would only have that fault satisfied if we used the full 64-bit address. If we truncate it, we'll always fault the wrong address and we'll always loop faulting forever. So catch such conditions and mark them as errors always. Log the error and fail the fault. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 2月, 2009 1 次提交
-
-
由 David S. Miller 提交于
In order to handle all of the cases of address calculation overflow properly, we run sparc 32-bit processes in "address masking" mode when running on a 64-bit kernel. Address masking mode zeros out the top 32-bits of the address calculated for every load and store instruction. However, when we're in privileged mode we have to run with that address masking mode disabled even when accessing userspace from the kernel. To "simulate" the address masking mode we clear the top-bits by hand for 32-bit processes in the fault handler. It is the responsibility of code in the compat layer to properly zero extend addresses used to access userspace. If this isn't followed properly we can get into a fault loop. Say that the user address is 0xf0000000 but for whatever reason the kernel code sign extends this to 64-bit, and then the kernel tries to access the result. In such a case we'll fault on address 0xfffffffff0000000 but the fault handler will process that fault as if it were to address 0xf0000000. We'll loop faulting forever because the fault never gets satisfied. So add a check specifically for this case, when the kernel is faulting on a user address access and the addresses don't match up. This code path is sufficiently slow path, and this bug is sufficiently painful to diagnose, that this kind of bug check is warranted. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 12月, 2008 1 次提交
-
-
由 Sam Ravnborg 提交于
- move all sparc64/mm/ files to arch/sparc/mm/ - commonly named files are named _64.c - add files to sparc/mm/Makefile preserving link order - delete now unused sparc64/mm/Makefile - sparc64 now finds mm/ in sparc Signed-off-by: NSam Ravnborg <sam@ravnborg.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 9月, 2008 1 次提交
-
-
由 David S. Miller 提交于
1) set_brkpt() is referenced by nothing and hasn't been used by anyone to my knowledge for many many years. So just delete it. 2) add extern decl for do_sparc64_fault() in asm/pgtable_64.h Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 7月, 2008 1 次提交
-
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 5月, 2008 1 次提交
-
-
由 Adrian Bunk 提交于
This patch removes the CVS keywords that weren't updated for a long time from comments. Signed-off-by: NAdrian Bunk <bunk@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 2月, 2008 1 次提交
-
-
由 David S. Miller 提交于
Because of the new futex validation init handler, we have to accept faults in init section text as well as the normal kernel text. Thanks to Tom Callaway for the bug report. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 2月, 2008 1 次提交
-
-
由 David S. Miller 提交于
Some parts of the kernel now do things like do *_user() accesses while set_fs(KERNEL_DS) that fault on purpose. See, for example, the code added by changeset a0c1e907 ("futex: runtime enable pi and robust functionality"). That trips up the ASI sanity checking we make in do_kernel_fault(). Just remove it for now. Maybe we can add it back later with an added conditional which looks at the current get_fs() value. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 10月, 2007 1 次提交
-
-
由 Will Schmidt 提交于
We have had complaints where a threaded application is left in a bad state after one of it's threads is killed when we hit a VM: out_of_memory condition. Killing just one of the process threads can leave the application in a bad state, whereas killing the entire process group would allow for the application to restart, or be otherwise handled, and makes it very obvious that something has gone wrong. This change allows the entire process group to be taken down, rather than just the one thread. Signed-off-by: NWill Schmidt <will_schmidt@vnet.ibm.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Andi Kleen <ak@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Matthew Wilcox <willy@debian.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 7月, 2007 1 次提交
-
-
由 David S. Miller 提交于
It didn't handle that case at all, and now dump_stack() can be implemented directly as show_stack(current, NULL) Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 7月, 2007 1 次提交
-
-
由 Nick Piggin 提交于
This patch completes Linus's wish that the fault return codes be made into bit flags, which I agree makes everything nicer. This requires requires all handle_mm_fault callers to be modified (possibly the modifications should go further and do things like fault accounting in handle_mm_fault -- however that would be for another patch). [akpm@linux-foundation.org: fix alpha build] [akpm@linux-foundation.org: fix s390 build] [akpm@linux-foundation.org: fix sparc build] [akpm@linux-foundation.org: fix sparc64 build] [akpm@linux-foundation.org: fix ia64 build] Signed-off-by: NNick Piggin <npiggin@suse.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Bryan Wu <bryan.wu@analog.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Matthew Wilcox <willy@debian.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Chris Zankel <chris@zankel.net> Acked-by: NKyle McMartin <kyle@mcmartin.ca> Acked-by: NHaavard Skinnemoen <hskinnemoen@atmel.com> Acked-by: NRalf Baechle <ralf@linux-mips.org> Acked-by: NAndi Kleen <ak@muc.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> [ Still apparently needs some ARM and PPC loving - Linus ] Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 5月, 2007 3 次提交
-
-
由 David S. Miller 提交于
And eliminate DIE_GPF while we're at it. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Randy Dunlap 提交于
Remove includes of <linux/smp_lock.h> where it is not used/needed. Suggested by Al Viro. Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc, sparc64, and arm (all 59 defconfigs). Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Hellwig 提交于
This patch moves the die notifier handling to common code. Previous various architectures had exactly the same code for it. Note that the new code is compiled unconditionally, this should be understood as an appel to the other architecture maintainer to implement support for it aswell (aka sprinkling a notify_die or two in the proper place) arm had a notifiy_die that did something totally different, I renamed it to arm_notify_die as part of the patch and made it static to the file it's declared and used at. avr32 used to pass slightly less information through this interface and I brought it into line with the other architectures. [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix vmalloc_sync_all bustage] [bryan.wu@analog.com: fix vmalloc_sync_all in nommu] Signed-off-by: NChristoph Hellwig <hch@lst.de> Cc: <linux-arch@vger.kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: NBryan Wu <bryan.wu@analog.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 7月, 2006 1 次提交
-
-
由 David S. Miller 提交于
That way we'll have at least some debugging info even if the stack dump explodes. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 6月, 2006 1 次提交
-
-
由 Anil S Keshavamurthy 提交于
Overloading of page fault notification with the notify_die() has performance issues(since the only interested components for page fault is kprobes and/or kdb) and hence this patch introduces the new notifier call chain exclusively for page fault notifications their by avoiding notifying unnecessary components in the do_page_fault() code path. Signed-off-by: NAnil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 01 4月, 2006 1 次提交
-
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 3月, 2006 1 次提交
-
-
由 David S. Miller 提交于
The worst part about this bug is what it would cause a hugepage TSB to be allocated for every address space since "0 >= 0". Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 3月, 2006 1 次提交
-
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 3月, 2006 3 次提交
-
-
由 David S. Miller 提交于
This is good for up to %50 performance improvement of some test cases. The problem has been the race conditions, and hopefully I've plugged them all up here. 1) There was a serious race in switch_mm() wrt. lazy TLB switching to and from kernel threads. We could erroneously skip a tsb_context_switch() and thus use a stale TSB across a TSB grow event. There is a big comment now in that function describing exactly how it can happen. 2) All code paths that do something with the TSB need to be guarded with the mm->context.lock spinlock. This makes page table flushing paths properly synchronize with both TSB growing and TLB context changes. 3) TSB growing events are moved to the end of successful fault processing. Previously it was in update_mmu_cache() but that is deadlock prone. At the end of do_sparc64_fault() we hold no spinlocks that could deadlock the TSB grow sequence. We also have dropped the address space semaphore. While we're here, add prefetching to the copy_tsb() routine and put it in assembler into the tsb.S file. This piece of code is quite time critical. There are some small negative side effects to this code which can be improved upon. In particular we grab the mm->context.lock even for the tsb insert done by update_mmu_cache() now and that's a bit excessive. We can get rid of that locking, and the same lock taking in flush_tsb_user(), by disabling PSTATE_IE around the whole operation including the capturing of the tsb pointer and tsb_nentries value. That would work because anyone growing the TSB won't free up the old TSB until all cpus respond to the TSB change cross call. I'm not quite so confident in that optimization to put it in right now, but eventually we might be able to and the description is here for reference. This code seems very solid now. It passes several parallel GCC bootstrap builds, and our favorite "nut cruncher" stress test which is a full "make -j8192" build of a "make allmodconfig" kernel. That puts about 256 processes on each cpu's run queue, makes lots of process cpu migrations occur, causes lots of page table and TLB flushing activity, incurs many context version number changes, and it swaps the machine real far out to disk even though there is 16GB of ram on this test system. :-) Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Yes, you heard it right, they changed the PTE layout for SUN4V. Ho hum... This is the simple and inefficient way to support this. It'll get optimized, don't worry. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 11月, 2005 1 次提交
-
-
由 Tobias Klauser 提交于
Use ARRAY_SIZE macro instead of sizeof(x)/sizeof(x[0]) and remove a duplicate of ARRAY_SIZE which is never used anyways. Signed-off-by: NTobias Klauser <tklauser@nuerscht.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 11月, 2005 1 次提交
-
-
由 Hugh Dickins 提交于
Update comment on get_user_insn to the more general "pte lock", which may or may not be the page_table_lock. Note vmtruncate handled like kswapd. Signed-off-by: NHugh Dickins <hugh@veritas.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 9月, 2005 4 次提交
-
-
由 David S. Miller 提交于
Thus, we can mark sp_banks[] static in arch/sparc64/mm/init.c Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Also, move prom_probe_memory() into arch/sparc64/mm/init.c Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Instead of doing byte-at-a-time user accesses to figure out where the fault occurred, read the saved fault_address from the current thread structure. For the sake of defensive programming, if the fault_address does not fall into the user buffer range, simply assume the whole area faulted. This will cause the fixup for copy_from_user() to clear the entire kernel side buffer. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
The funny "range" exception table entries we had were only used by the compat layer socketcall assembly, and it wasn't even needed there. For free we now get proper exception table sorting and fast binary searching. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-