1. 23 8月, 2013 3 次提交
  2. 22 8月, 2013 5 次提交
  3. 21 8月, 2013 1 次提交
  4. 20 8月, 2013 13 次提交
    • C
      xen/smp: initialize IPI vectors before marking CPU online · fc78d343
      Chuck Anderson 提交于
      An older PVHVM guest (v3.0 based) crashed during vCPU hot-plug with:
      
      	kernel BUG at drivers/xen/events.c:1328!
      
      RCU has detected that a CPU has not entered a quiescent state within the
      grace period.  It needs to send the CPU a reschedule IPI if it is not
      offline.  rcu_implicit_offline_qs() does this check:
      
      	/*
      	 * If the CPU is offline, it is in a quiescent state.  We can
      	 * trust its state not to change because interrupts are disabled.
      	 */
      	if (cpu_is_offline(rdp->cpu)) {
      		rdp->offline_fqs++;
      		return 1;
      	}
      
      	Else the CPU is online.  Send it a reschedule IPI.
      
      The CPU is in the middle of being hot-plugged and has been marked online
      (!cpu_is_offline()).  See start_secondary():
      
      	set_cpu_online(smp_processor_id(), true);
      	...
      	per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE;
      
      start_secondary() then waits for the CPU bringing up the hot-plugged CPU to
      mark it as active:
      
      	/*
      	 * Wait until the cpu which brought this one up marked it
      	 * online before enabling interrupts. If we don't do that then
      	 * we can end up waking up the softirq thread before this cpu
      	 * reached the active state, which makes the scheduler unhappy
      	 * and schedule the softirq thread on the wrong cpu. This is
      	 * only observable with forced threaded interrupts, but in
      	 * theory it could also happen w/o them. It's just way harder
      	 * to achieve.
      	 */
      	while (!cpumask_test_cpu(smp_processor_id(), cpu_active_mask))
      		cpu_relax();
      
      	/* enable local interrupts */
      	local_irq_enable();
      
      The CPU being hot-plugged will be marked active after it has been fully
      initialized by the CPU managing the hot-plug.  In the Xen PVHVM case
      xen_smp_intr_init() is called to set up the hot-plugged vCPU's
      XEN_RESCHEDULE_VECTOR.
      
      The hot-plugging CPU is marked online, not marked active and does not have
      its IPI vectors set up.  rcu_implicit_offline_qs() sees the hot-plugging
      cpu is !cpu_is_offline() and tries to send it a reschedule IPI:
      This will lead to:
      
      	kernel BUG at drivers/xen/events.c:1328!
      
      	xen_send_IPI_one()
      	xen_smp_send_reschedule()
      	rcu_implicit_offline_qs()
      	rcu_implicit_dynticks_qs()
      	force_qs_rnp()
      	force_quiescent_state()
      	__rcu_process_callbacks()
      	rcu_process_callbacks()
      	__do_softirq()
      	call_softirq()
      	do_softirq()
      	irq_exit()
      	xen_evtchn_do_upcall()
      
      because xen_send_IPI_one() will attempt to use an uninitialized IRQ for
      the XEN_RESCHEDULE_VECTOR.
      
      There is at least one other place that has caused the same crash:
      
      	xen_smp_send_reschedule()
      	wake_up_idle_cpu()
      	add_timer_on()
      	clocksource_watchdog()
      	call_timer_fn()
      	run_timer_softirq()
      	__do_softirq()
      	call_softirq()
      	do_softirq()
      	irq_exit()
      	xen_evtchn_do_upcall()
      	xen_hvm_callback_vector()
      
      clocksource_watchdog() uses cpu_online_mask to pick the next CPU to handle
      a watchdog timer:
      
      	/*
      	 * Cycle through CPUs to check if the CPUs stay synchronized
      	 * to each other.
      	 */
      	next_cpu = cpumask_next(raw_smp_processor_id(), cpu_online_mask);
      	if (next_cpu >= nr_cpu_ids)
      		next_cpu = cpumask_first(cpu_online_mask);
      	watchdog_timer.expires += WATCHDOG_INTERVAL;
      	add_timer_on(&watchdog_timer, next_cpu);
      
      This resulted in an attempt to send an IPI to a hot-plugging CPU that
      had not initialized its reschedule vector. One option would be to make
      the RCU code check to not check for CPU offline but for CPU active.
      As becoming active is done after a CPU is online (in older kernels).
      
      But Srivatsa pointed out that "the cpu_active vs cpu_online ordering has been
      completely reworked - in the online path, cpu_active is set *before* cpu_online,
      and also, in the cpu offline path, the cpu_active bit is reset in the CPU_DYING
      notification instead of CPU_DOWN_PREPARE." Drilling in this the bring-up
      path: "[brought up CPU].. send out a CPU_STARTING notification, and in response
      to that, the scheduler sets the CPU in the cpu_active_mask. Again, this mask
      is better left to the scheduler alone, since it has the intelligence to use it
      judiciously."
      
      The conclusion was that:
      "
      1. At the IPI sender side:
      
         It is incorrect to send an IPI to an offline CPU (cpu not present in
         the cpu_online_mask). There are numerous places where we check this
         and warn/complain.
      
      2. At the IPI receiver side:
      
         It is incorrect to let the world know of our presence (by setting
         ourselves in global bitmasks) until our initialization steps are complete
         to such an extent that we can handle the consequences (such as
         receiving interrupts without crashing the sender etc.)
      " (from Srivatsa)
      
      As the native code enables the interrupts at some point we need to be
      able to service them. In other words a CPU must have valid IPI vectors
      if it has been marked online.
      
      It doesn't need to handle the IPI (interrupts may be disabled) but needs
      to have valid IPI vectors because another CPU may find it in cpu_online_mask
      and attempt to send it an IPI.
      
      This patch will change the order of the Xen vCPU bring-up functions so that
      Xen vectors have been set up before start_secondary() is called.
      It also will not continue to bring up a Xen vCPU if xen_smp_intr_init() fails
      to initialize it.
      
      Orabug 13823853
      Signed-off-by Chuck Anderson <chuck.anderson@oracle.com>
      Acked-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      fc78d343
    • D
      xen/events: mask events when changing their VCPU binding · 4704fe4f
      David Vrabel 提交于
      When a event is being bound to a VCPU there is a window between the
      EVTCHNOP_bind_vpcu call and the adjustment of the local per-cpu masks
      where an event may be lost.  The hypervisor upcalls the new VCPU but
      the kernel thinks that event is still bound to the old VCPU and
      ignores it.
      
      There is even a problem when the event is being bound to the same VCPU
      as there is a small window beween the clear_bit() and set_bit() calls
      in bind_evtchn_to_cpu().  When scanning for pending events, the kernel
      may read the bit when it is momentarily clear and ignore the event.
      
      Avoid this by masking the event during the whole bind operation.
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Reviewed-by: NJan Beulich <jbeulich@suse.com>
      CC: stable@vger.kernel.org
      4704fe4f
    • D
      xen/events: initialize local per-cpu mask for all possible events · 84ca7a8e
      David Vrabel 提交于
      The sizeof() argument in init_evtchn_cpu_bindings() is incorrect
      resulting in only the first 64 (or 32 in 32-bit guests) ports having
      their bindings being initialized to VCPU 0.
      
      In most cases this does not cause a problem as request_irq() will set
      the irq affinity which will set the correct local per-cpu mask.
      However, if the request_irq() is called on a VCPU other than 0, there
      is a window between the unmasking of the event and the affinity being
      set were an event may be lost because it is not locally unmasked on
      any VCPU. If request_irq() is called on VCPU 0 then local irqs are
      disabled during the window and the race does not occur.
      
      Fix this by initializing all NR_EVENT_CHANNEL bits in the local
      per-cpu masks.
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      CC: stable@vger.kernel.org
      84ca7a8e
    • D
      x86/xen: do not identity map UNUSABLE regions in the machine E820 · 3bc38cbc
      David Vrabel 提交于
      If there are UNUSABLE regions in the machine memory map, dom0 will
      attempt to map them 1:1 which is not permitted by Xen and the kernel
      will crash.
      
      There isn't anything interesting in the UNUSABLE region that the dom0
      kernel needs access to so we can avoid making the 1:1 mapping and
      treat it as RAM.
      
      We only do this for dom0, as that is where tboot case shows up.
      A PV domU could have an UNUSABLE region in its pseudo-physical map
      and would need to be handled in another patch.
      
      This fixes a boot failure on hosts with tboot.
      
      tboot marks a region in the e820 map as unusable and the dom0 kernel
      would attempt to map this region and Xen does not permit unusable
      regions to be mapped by guests.
      
        (XEN)  0000000000000000 - 0000000000060000 (usable)
        (XEN)  0000000000060000 - 0000000000068000 (reserved)
        (XEN)  0000000000068000 - 000000000009e000 (usable)
        (XEN)  0000000000100000 - 0000000000800000 (usable)
        (XEN)  0000000000800000 - 0000000000972000 (unusable)
      
      tboot marked this region as unusable.
      
        (XEN)  0000000000972000 - 00000000cf200000 (usable)
        (XEN)  00000000cf200000 - 00000000cf38f000 (reserved)
        (XEN)  00000000cf38f000 - 00000000cf3ce000 (ACPI data)
        (XEN)  00000000cf3ce000 - 00000000d0000000 (reserved)
        (XEN)  00000000e0000000 - 00000000f0000000 (reserved)
        (XEN)  00000000fe000000 - 0000000100000000 (reserved)
        (XEN)  0000000100000000 - 0000000630000000 (usable)
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      [v1: Altered the patch and description with domU's with UNUSABLE regions]
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      3bc38cbc
    • W
      arm64: perf: fix event validation for software group leaders · ee7538a0
      Will Deacon 提交于
      This is a port of c95eb318 ("ARM: 7809/1: perf: fix event validation
      for software group leaders") to arm64, which fixes a panic in the arm64
      perf backend found as a result of Vince's fuzzing tool.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      ee7538a0
    • W
      arm64: perf: fix array out of bounds access in armpmu_map_hw_event() · 868f6fea
      Will Deacon 提交于
      This is a port of d9f96635 ("ARM: 7810/1: perf: Fix array out of
      bounds access in armpmu_map_hw_event()") to arm64, which fixes an oops
      in the arm64 perf backend found as a result of Vince's fuzzing tool.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      868f6fea
    • L
      proc: more readdir conversion bug-fixes · fd3930f7
      Linus Torvalds 提交于
      In the previous commit, Richard Genoud fixed proc_root_readdir(), which
      had lost the check for whether all of the non-process /proc entries had
      been returned or not.
      
      But that in turn exposed _another_ bug, namely that the original readdir
      conversion patch had yet another problem: it had lost the return value
      of proc_readdir_de(), so now checking whether it had completed
      successfully or not didn't actually work right anyway.
      
      This reinstates the non-zero return for the "end of base entries" that
      had also gotten lost in commit f0c3b509 ("[readdir] convert
      procfs").  So now you get all the base entries *and* you get all the
      process entries, regardless of getdents buffer size.
      
      (Side note: the Linux "getdents" manual page actually has a nice example
      application for testing getdents, which can be easily modified to use
      different buffers.  Who knew? Man-pages can be useful)
      Reported-by: NEmmanuel Benisty <benisty.e@gmail.com>
      Reported-by: NMarc Dionne <marc.c.dionne@gmail.com>
      Cc: Richard Genoud <richard.genoud@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fd3930f7
    • R
      proc: return on proc_readdir error · 94fc5d9d
      Richard Genoud 提交于
      Commit f0c3b509 ("[readdir] convert procfs") introduced a bug on the
      listing of the proc file-system.  The return value of proc_readdir()
      isn't tested anymore in the proc_root_readdir function.
      
      This lead to an "interesting" behaviour when we are using the getdents()
      system call with a buffer too small: instead of failing, it returns the
      first entries of /proc (enough to fill the given buffer), plus the PID
      directories.
      
      This is not triggered on glibc (as getdents is called with a 32KB
      buffer), but on uclibc, the buffer size is only 1KB, thus some proc
      entries are missing.
      
      See https://lkml.org/lkml/2013/8/12/288 for more background.
      Signed-off-by: NRichard Genoud <richard.genoud@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      94fc5d9d
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes · d6a5e06c
      Linus Torvalds 提交于
      Pull gfs2 fixes from Steven Whitehouse:
       "Out of these five patches, the one for ensuring that the number of
        revokes is not exceeded, and the one for checking the glock is not
        already held in gfs2_getxattr are the two most important.  The latter
        can be triggered by selinux.
      
        The other three patches are very small and fix mostly fairly trivial
        issues"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
        GFS2: Check for glock already held in gfs2_getxattr
        GFS2: alloc_workqueue() doesn't return an ERR_PTR
        GFS2: don't overrun reserved revokes
        GFS2: WQ_NON_REENTRANT is meaningless and going away
        GFS2: Fix typo in gfs2_create_inode()
      d6a5e06c
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7067552d
      Linus Torvalds 提交于
      Pull x86 fixes from Ingo Molnar:
       "Two AMD microcode loader fixes and an OLPC firmware support fix"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, microcode, AMD: Fix early microcode loading
        x86, microcode, AMD: Make cpu_has_amd_erratum() use the correct struct cpuinfo_x86
        x86: Don't clear olpc_ofw_header when sentinel is detected
      7067552d
    • L
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e91dade5
      Linus Torvalds 提交于
      Pull timer fixes from Ingo Molnar:
       "Three small fixlets"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        nohz: fix compile warning in tick_nohz_init()
        nohz: Do not warn about unstable tsc unless user uses nohz_full
        sched_clock: Fix integer overflow
      e91dade5
    • L
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · fbf21849
      Linus Torvalds 提交于
      Pull drm fixes from Dave Airlie:
       "Bit late with these, was under the weather for a a few days, nothing
        too crazy:
      
        Some radeon regression fixes, one intel regression fix, and one fix to
        avoid a warn with i915 when used with dma-buf"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm/i915: unpin backing storage in dmabuf_unmap
        drm/radeon: fix WREG32_OR macro setting bits in a register
        drm/radeon/r7xx: fix copy paste typo in golden register setup
        drm/i915: Don't deref pipe->cpu_transcoder in the hangcheck code
        drm/radeon: fix UVD message buffer validation
      fbf21849
    • R
      kernel: fix new kernel-doc warning in wait.c · 2203547f
      Randy Dunlap 提交于
      Fix new kernel-doc warnings in kernel/wait.c:
      
        Warning(kernel/wait.c:374): No description found for parameter 'p'
        Warning(kernel/wait.c:374): Excess function parameter 'word' description in 'wake_up_atomic_t'
        Warning(kernel/wait.c:374): Excess function parameter 'bit' description in 'wake_up_atomic_t'
      Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2203547f
  5. 19 8月, 2013 9 次提交
  6. 18 8月, 2013 2 次提交
  7. 17 8月, 2013 7 次提交
新手
引导
客服 返回
顶部