1. 09 10月, 2013 22 次提交
  2. 06 10月, 2013 1 次提交
  3. 01 10月, 2013 4 次提交
    • F
      irq: Force hardirq exit's softirq processing on its own stack · ded79754
      Frederic Weisbecker 提交于
      The commit facd8b80
      ("irq: Sanitize invoke_softirq") converted irq exit
      calls of do_softirq() to __do_softirq() on all architectures,
      assuming it was only used there for its irq disablement
      properties.
      
      But as a side effect, the softirqs processed in the end
      of the hardirq are always called on the inline current
      stack that is used by irq_exit() instead of the softirq
      stack provided by the archs that override do_softirq().
      
      The result is mostly safe if the architecture runs irq_exit()
      on a separate irq stack because then softirqs are processed
      on that same stack that is near empty at this stage (assuming
      hardirq aren't nesting).
      
      Otherwise irq_exit() runs in the task stack and so does the softirq
      too. The interrupted call stack can be randomly deep already and
      the softirq can dig through it even further. To add insult to the
      injury, this softirq can be interrupted by a new hardirq, maximizing
      the chances for a stack overrun as reported in powerpc for example:
      
      	do_IRQ: stack overflow: 1920
      	CPU: 0 PID: 1602 Comm: qemu-system-ppc Not tainted 3.10.4-300.1.fc19.ppc64p7 #1
      	Call Trace:
      	[c0000000050a8740] .show_stack+0x130/0x200 (unreliable)
      	[c0000000050a8810] .dump_stack+0x28/0x3c
      	[c0000000050a8880] .do_IRQ+0x2b8/0x2c0
      	[c0000000050a8930] hardware_interrupt_common+0x154/0x180
      	--- Exception: 501 at .cp_start_xmit+0x3a4/0x820 [8139cp]
      		LR = .cp_start_xmit+0x390/0x820 [8139cp]
      	[c0000000050a8d40] .dev_hard_start_xmit+0x394/0x640
      	[c0000000050a8e00] .sch_direct_xmit+0x110/0x260
      	[c0000000050a8ea0] .dev_queue_xmit+0x260/0x630
      	[c0000000050a8f40] .br_dev_queue_push_xmit+0xc4/0x130 [bridge]
      	[c0000000050a8fc0] .br_dev_xmit+0x198/0x270 [bridge]
      	[c0000000050a9070] .dev_hard_start_xmit+0x394/0x640
      	[c0000000050a9130] .dev_queue_xmit+0x428/0x630
      	[c0000000050a91d0] .ip_finish_output+0x2a4/0x550
      	[c0000000050a9290] .ip_local_out+0x50/0x70
      	[c0000000050a9310] .ip_queue_xmit+0x148/0x420
      	[c0000000050a93b0] .tcp_transmit_skb+0x4e4/0xaf0
      	[c0000000050a94a0] .__tcp_ack_snd_check+0x7c/0xf0
      	[c0000000050a9520] .tcp_rcv_established+0x1e8/0x930
      	[c0000000050a95f0] .tcp_v4_do_rcv+0x21c/0x570
      	[c0000000050a96c0] .tcp_v4_rcv+0x734/0x930
      	[c0000000050a97a0] .ip_local_deliver_finish+0x184/0x360
      	[c0000000050a9840] .ip_rcv_finish+0x148/0x400
      	[c0000000050a98d0] .__netif_receive_skb_core+0x4f8/0xb00
      	[c0000000050a99d0] .netif_receive_skb+0x44/0x110
      	[c0000000050a9a70] .br_handle_frame_finish+0x2bc/0x3f0 [bridge]
      	[c0000000050a9b20] .br_nf_pre_routing_finish+0x2ac/0x420 [bridge]
      	[c0000000050a9bd0] .br_nf_pre_routing+0x4dc/0x7d0 [bridge]
      	[c0000000050a9c70] .nf_iterate+0x114/0x130
      	[c0000000050a9d30] .nf_hook_slow+0xb4/0x1e0
      	[c0000000050a9e00] .br_handle_frame+0x290/0x330 [bridge]
      	[c0000000050a9ea0] .__netif_receive_skb_core+0x34c/0xb00
      	[c0000000050a9fa0] .netif_receive_skb+0x44/0x110
      	[c0000000050aa040] .napi_gro_receive+0xe8/0x120
      	[c0000000050aa0c0] .cp_rx_poll+0x31c/0x590 [8139cp]
      	[c0000000050aa1d0] .net_rx_action+0x1dc/0x310
      	[c0000000050aa2b0] .__do_softirq+0x158/0x330
      	[c0000000050aa3b0] .irq_exit+0xc8/0x110
      	[c0000000050aa430] .do_IRQ+0xdc/0x2c0
      	[c0000000050aa4e0] hardware_interrupt_common+0x154/0x180
      	 --- Exception: 501 at .bad_range+0x1c/0x110
      		 LR = .get_page_from_freelist+0x908/0xbb0
      	[c0000000050aa7d0] .list_del+0x18/0x50 (unreliable)
      	[c0000000050aa850] .get_page_from_freelist+0x908/0xbb0
      	[c0000000050aa9e0] .__alloc_pages_nodemask+0x21c/0xae0
      	[c0000000050aaba0] .alloc_pages_vma+0xd0/0x210
      	[c0000000050aac60] .handle_pte_fault+0x814/0xb70
      	[c0000000050aad50] .__get_user_pages+0x1a4/0x640
      	[c0000000050aae60] .get_user_pages_fast+0xec/0x160
      	[c0000000050aaf10] .__gfn_to_pfn_memslot+0x3b0/0x430 [kvm]
      	[c0000000050aafd0] .kvmppc_gfn_to_pfn+0x64/0x130 [kvm]
      	[c0000000050ab070] .kvmppc_mmu_map_page+0x94/0x530 [kvm]
      	[c0000000050ab190] .kvmppc_handle_pagefault+0x174/0x610 [kvm]
      	[c0000000050ab270] .kvmppc_handle_exit_pr+0x464/0x9b0 [kvm]
      	[c0000000050ab320]  kvm_start_lightweight+0x1ec/0x1fc [kvm]
      	[c0000000050ab4f0] .kvmppc_vcpu_run_pr+0x168/0x3b0 [kvm]
      	[c0000000050ab9c0] .kvmppc_vcpu_run+0xc8/0xf0 [kvm]
      	[c0000000050aba50] .kvm_arch_vcpu_ioctl_run+0x5c/0x1a0 [kvm]
      	[c0000000050abae0] .kvm_vcpu_ioctl+0x478/0x730 [kvm]
      	[c0000000050abc90] .do_vfs_ioctl+0x4ec/0x7c0
      	[c0000000050abd80] .SyS_ioctl+0xd4/0xf0
      	[c0000000050abe30] syscall_exit+0x0/0x98
      
      Since this is a regression, this patch proposes a minimalistic
      and low-risk solution by blindly forcing the hardirq exit processing of
      softirqs on the softirq stack. This way we should reduce significantly
      the opportunities for task stack overflow dug by softirqs.
      
      Longer term solutions may involve extending the hardirq stack coverage to
      irq_exit(), etc...
      Reported-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: #3.9.. <stable@vger.kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@au1.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul Mackerras <paulus@au1.ibm.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      ded79754
    • O
      pidns: fix free_pid() to handle the first fork failure · 314a8ad0
      Oleg Nesterov 提交于
      "case 0" in free_pid() assumes that disable_pid_allocation() should
      clear PIDNS_HASH_ADDING before the last pid goes away.
      
      However this doesn't happen if the first fork() fails to create the
      child reaper which should call disable_pid_allocation().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Reviewed-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Cc: "Serge E. Hallyn" <serge@hallyn.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      314a8ad0
    • T
      kernel/kmod.c: check for NULL in call_usermodehelper_exec() · 4c1c7be9
      Tetsuo Handa 提交于
      If /proc/sys/kernel/core_pattern contains only "|", a NULL pointer
      dereference happens upon core dump because argv_split("") returns
      argv[0] == NULL.
      
      This bug was once fixed by commit 264b83c0 ("usermodehelper: check
      subprocess_info->path != NULL") but was by error reintroduced by commit
      7f57cfa4 ("usermodehelper: kill the sub_info->path[0] check").
      
      This bug seems to exist since 2.6.19 (the version which core dump to
      pipe was added).  Depending on kernel version and config, some side
      effect might happen immediately after this oops (e.g.  kernel panic with
      2.6.32-358.18.1.el6).
      Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Acked-by: NOleg Nesterov <oleg@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4c1c7be9
    • R
      PM / hibernate: Fix user space driven resume regression · aab17289
      Rafael J. Wysocki 提交于
      Recent commit 8fd37a4c (PM / hibernate: Create memory bitmaps after
      freezing user space) broke the resume part of the user space driven
      hibernation (s2disk), because I forgot that the resume utility
      loaded the image into memory without freezing user space (it still
      freezes tasks after loading the image).  This means that during user
      space driven resume we need to create the memory bitmaps at the
      "device open" time rather than at the "freeze tasks" time, so make
      that happen (that's a special case anyway, so it needs to be treated
      in a special way).
      Reported-and-tested-by: NRonald <ronald645@gmail.com>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      aab17289
  4. 29 9月, 2013 1 次提交
  5. 27 9月, 2013 1 次提交
  6. 25 9月, 2013 11 次提交