1. 30 1月, 2015 1 次提交
    • L
      vm: add VM_FAULT_SIGSEGV handling support · 33692f27
      Linus Torvalds 提交于
      The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
      "you should SIGSEGV" error, because the SIGSEGV case was generally
      handled by the caller - usually the architecture fault handler.
      
      That results in lots of duplication - all the architecture fault
      handlers end up doing very similar "look up vma, check permissions, do
      retries etc" - but it generally works.  However, there are cases where
      the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.
      
      In particular, when accessing the stack guard page, libsigsegv expects a
      SIGSEGV.  And it usually got one, because the stack growth is handled by
      that duplicated architecture fault handler.
      
      However, when the generic VM layer started propagating the error return
      from the stack expansion in commit fee7e49d ("mm: propagate error
      from stack expansion even for guard page"), that now exposed the
      existing VM_FAULT_SIGBUS result to user space.  And user space really
      expected SIGSEGV, not SIGBUS.
      
      To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
      duplicate architecture fault handlers about it.  They all already have
      the code to handle SIGSEGV, so it's about just tying that new return
      value to the existing code, but it's all a bit annoying.
      
      This is the mindless minimal patch to do this.  A more extensive patch
      would be to try to gather up the mostly shared fault handling logic into
      one generic helper routine, and long-term we really should do that
      cleanup.
      
      Just from this patch, you can generally see that most architectures just
      copied (directly or indirectly) the old x86 way of doing things, but in
      the meantime that original x86 model has been improved to hold the VM
      semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
      "newer" things, so it would be a good idea to bring all those
      improvements to the generic case and teach other architectures about
      them too.
      Reported-and-tested-by: NTakashi Iwai <tiwai@suse.de>
      Tested-by: NJan Engelhardt <jengelh@inai.de>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
      Cc: linux-arch@vger.kernel.org
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      33692f27
  2. 13 9月, 2013 2 次提交
    • J
      arch: mm: pass userspace fault flag to generic fault handler · 759496ba
      Johannes Weiner 提交于
      Unlike global OOM handling, memory cgroup code will invoke the OOM killer
      in any OOM situation because it has no way of telling faults occuring in
      kernel context - which could be handled more gracefully - from
      user-triggered faults.
      
      Pass a flag that identifies faults originating in user space from the
      architecture-specific fault handlers to generic code so that memcg OOM
      handling can be improved.
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: NMichal Hocko <mhocko@suse.cz>
      Cc: David Rientjes <rientjes@google.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: azurIt <azurit@pobox.sk>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      759496ba
    • J
      arch: mm: remove obsolete init OOM protection · 94bce453
      Johannes Weiner 提交于
      The memcg code can trap tasks in the context of the failing allocation
      until an OOM situation is resolved.  They can hold all kinds of locks
      (fs, mm) at this point, which makes it prone to deadlocking.
      
      This series converts memcg OOM handling into a two step process that is
      started in the charge context, but any waiting is done after the fault
      stack is fully unwound.
      
      Patches 1-4 prepare architecture handlers to support the new memcg
      requirements, but in doing so they also remove old cruft and unify
      out-of-memory behavior across architectures.
      
      Patch 5 disables the memcg OOM handling for syscalls, readahead, kernel
      faults, because they can gracefully unwind the stack with -ENOMEM.  OOM
      handling is restricted to user triggered faults that have no other
      option.
      
      Patch 6 reworks memcg's hierarchical OOM locking to make it a little
      more obvious wth is going on in there: reduce locked regions, rename
      locking functions, reorder and document.
      
      Patch 7 implements the two-part OOM handling such that tasks are never
      trapped with the full charge stack in an OOM situation.
      
      This patch:
      
      Back before smart OOM killing, when faulting tasks were killed directly on
      allocation failures, the arch-specific fault handlers needed special
      protection for the init process.
      
      Now that all fault handlers call into the generic OOM killer (see commit
      609838cf: "mm: invoke oom-killer from remaining unconverted page
      fault handlers"), which already provides init protection, the
      arch-specific leftovers can be removed.
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: NMichal Hocko <mhocko@suse.cz>
      Acked-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: azurIt <azurit@pobox.sk>
      Acked-by: Vineet Gupta <vgupta@synopsys.com>	[arch/arc bits]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      94bce453
  3. 10 7月, 2013 1 次提交
    • J
      mm: invoke oom-killer from remaining unconverted page fault handlers · 609838cf
      Johannes Weiner 提交于
      A few remaining architectures directly kill the page faulting task in an
      out of memory situation.  This is usually not a good idea since that
      task might not even use a significant amount of memory and so may not be
      the optimal victim to resolve the situation.
      
      Since 2.6.29's 1c0fe6e3 ("mm: invoke oom-killer from page fault") there
      is a hook that architecture page fault handlers are supposed to call to
      invoke the OOM killer and let it pick the right task to kill.  Convert
      the remaining architectures over to this hook.
      
      To have the previous behavior of simply taking out the faulting task the
      vm.oom_kill_allocating_task sysctl can be set to 1.
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: NMichal Hocko <mhocko@suse.cz>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Acked-by: Vineet Gupta <vgupta@synopsys.com>   [arch/arc bits]
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chen Liqin <liqin.chen@sunplusct.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      609838cf
  4. 19 6月, 2009 1 次提交
  5. 31 1月, 2009 1 次提交
  6. 20 10月, 2007 1 次提交
    • S
      pid namespaces: define is_global_init() and is_container_init() · b460cbc5
      Serge E. Hallyn 提交于
      is_init() is an ambiguous name for the pid==1 check.  Split it into
      is_global_init() and is_container_init().
      
      A cgroup init has it's tsk->pid == 1.
      
      A global init also has it's tsk->pid == 1 and it's active pid namespace
      is the init_pid_ns.  But rather than check the active pid namespace,
      compare the task structure with 'init_pid_ns.child_reaper', which is
      initialized during boot to the /sbin/init process and never changes.
      
      Changelog:
      
      	2.6.22-rc4-mm2-pidns1:
      	- Use 'init_pid_ns.child_reaper' to determine if a given task is the
      	  global init (/sbin/init) process. This would improve performance
      	  and remove dependence on the task_pid().
      
      	2.6.21-mm2-pidns2:
      
      	- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
      	  ppc,avr32}/traps.c for the _exception() call to is_global_init().
      	  This way, we kill only the cgroup if the cgroup's init has a
      	  bug rather than force a kernel panic.
      
      [akpm@linux-foundation.org: fix comment]
      [sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
      [bunk@stusta.de: kernel/pid.c: remove unused exports]
      [sukadev@us.ibm.com: Fix capability.c to work with threaded init]
      Signed-off-by: NSerge E. Hallyn <serue@us.ibm.com>
      Signed-off-by: NSukadev Bhattiprolu <sukadev@us.ibm.com>
      Acked-by: NPavel Emelianov <xemul@openvz.org>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Herbert Poetzel <herbert@13thfloor.at>
      Cc: Kirill Korotaev <dev@sw.ru>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b460cbc5
  7. 17 10月, 2007 1 次提交
    • W
      During VM oom condition, kill all threads in process group · dcca2bde
      Will Schmidt 提交于
      We have had complaints where a threaded application is left in a bad state
      after one of it's threads is killed when we hit a VM: out_of_memory
      condition.
      
      Killing just one of the process threads can leave the application in a bad
      state, whereas killing the entire process group would allow for the
      application to restart, or be otherwise handled, and makes it very obvious
      that something has gone wrong.
      
      This change allows the entire process group to be taken down, rather
      than just the one thread.
      Signed-off-by: NWill Schmidt <will_schmidt@vnet.ibm.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Matthew Wilcox <willy@debian.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Chris Zankel <chris@zankel.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dcca2bde
  8. 20 7月, 2007 1 次提交
    • N
      mm: fault feedback #2 · 83c54070
      Nick Piggin 提交于
      This patch completes Linus's wish that the fault return codes be made into
      bit flags, which I agree makes everything nicer.  This requires requires
      all handle_mm_fault callers to be modified (possibly the modifications
      should go further and do things like fault accounting in handle_mm_fault --
      however that would be for another patch).
      
      [akpm@linux-foundation.org: fix alpha build]
      [akpm@linux-foundation.org: fix s390 build]
      [akpm@linux-foundation.org: fix sparc build]
      [akpm@linux-foundation.org: fix sparc64 build]
      [akpm@linux-foundation.org: fix ia64 build]
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Bryan Wu <bryan.wu@analog.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: Greg Ungerer <gerg@uclinux.org>
      Cc: Matthew Wilcox <willy@debian.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Chris Zankel <chris@zankel.net>
      Acked-by: NKyle McMartin <kyle@mcmartin.ca>
      Acked-by: NHaavard Skinnemoen <hskinnemoen@atmel.com>
      Acked-by: NRalf Baechle <ralf@linux-mips.org>
      Acked-by: NAndi Kleen <ak@muc.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      [ Still apparently needs some ARM and PPC loving - Linus ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      83c54070
  9. 09 5月, 2007 1 次提交
  10. 30 3月, 2007 1 次提交
  11. 30 11月, 2006 1 次提交
    • A
      [MIPS] Load modules to CKSEG0 if CONFIG_BUILD_ELF64=n · 656be92f
      Atsushi Nemoto 提交于
      This is a patch to load 64-bit modules to CKSEG0 so that can be
      compiled with -msym32 option.  This makes each module ~10% smaller.
      
      * introduce MODULE_START and MODULE_END
      * custom module_alloc()
      * PGD for modules
      * change XTLB refill handler synthesizer
      * enable -msym32 for modules again
        (revert ca78b1a5c6a6e70e052d3ea253828e49b5d07c8a)
      
      New XTLB refill handler looks like this:
      
      80000080 dmfc0   k0,C0_BADVADDR
      80000084 bltz    k0,800000e4			# goto l_module_alloc
      80000088 lui     k1,0x8046			# %high(pgd_current)
      8000008c ld      k1,24600(k1)			# %low(pgd_current)
      80000090 dsrl    k0,k0,0x1b			# l_vmalloc_done:
      80000094 andi    k0,k0,0x1ff8
      80000098 daddu   k1,k1,k0
      8000009c dmfc0   k0,C0_BADVADDR
      800000a0 ld      k1,0(k1)
      800000a4 dsrl    k0,k0,0x12
      800000a8 andi    k0,k0,0xff8
      800000ac daddu   k1,k1,k0
      800000b0 dmfc0   k0,C0_XCONTEXT
      800000b4 ld      k1,0(k1)
      800000b8 andi    k0,k0,0xff0
      800000bc daddu   k1,k1,k0
      800000c0 ld      k0,0(k1)
      800000c4 ld      k1,8(k1)
      800000c8 dsrl    k0,k0,0x6
      800000cc mtc0    k0,C0_ENTRYLO0
      800000d0 dsrl    k1,k1,0x6
      800000d4 mtc0    k1,C0_ENTRYL01
      800000d8 nop
      800000dc tlbwr
      800000e0 eret
      800000e4 dsll    k1,k0,0x2			# l_module_alloc:
      800000e8 bgez    k1,80000008			# goto l_vmalloc
      800000ec lui     k1,0xc000
      800000f0 dsubu   k0,k0,k1
      800000f4 lui     k1,0x8046			# %high(module_pg_dir)
      800000f8 beq     zero,zero,80000000
      800000fc nop
      80000000 beq     zero,zero,80000090		# goto l_vmalloc_done
      80000004 daddiu  k1,k1,0x4000
      80000008 dsll32  k1,k1,0x0			# l_vmalloc:
      8000000c dsubu   k0,k0,k1
      80000010 beq     zero,zero,80000090		# goto l_vmalloc_done
      80000014 lui     k1,0x8046			# %high(swapper_pg_dir)
      Signed-off-by: NAtsushi Nemoto <anemo@mba.ocn.ne.jp>
      Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      656be92f
  12. 30 9月, 2006 1 次提交
  13. 27 9月, 2006 1 次提交
  14. 19 4月, 2006 1 次提交
  15. 30 10月, 2005 3 次提交
  16. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4