1. 23 3月, 2015 5 次提交
  2. 20 3月, 2015 1 次提交
  3. 18 3月, 2015 4 次提交
  4. 17 3月, 2015 2 次提交
  5. 16 3月, 2015 18 次提交
  6. 04 3月, 2015 2 次提交
    • N
      powerpc/iommu: Remove IOMMU device references via bus notifier · 4ad04e59
      Nishanth Aravamudan 提交于
      After d905c5df ("PPC: POWERNV: move iommu_add_device earlier"), the
      refcnt on the kobject backing the IOMMU group for a PCI device is
      elevated by each call to pci_dma_dev_setup_pSeriesLP() (via
      set_iommu_table_base_and_group). When we go to dlpar a multi-function
      PCI device out:
      
              iommu_reconfig_notifier ->
                      iommu_free_table ->
                              iommu_group_put
                              BUG_ON(tbl->it_group)
      
      We trip this BUG_ON, because there are still references on the table, so
      it is not freed. Fix this by moving the powernv bus notifier to common
      code and calling it for both powernv and pseries.
      
      Fixes: d905c5df ("PPC: POWERNV: move iommu_add_device earlier")
      Signed-off-by: NNishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Tested-by: NNishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      4ad04e59
    • M
      powerpc/smp: Wait until secondaries are active & online · 875ebe94
      Michael Ellerman 提交于
      Anton has a busy ppc64le KVM box where guests sometimes hit the infamous
      "kernel BUG at kernel/smpboot.c:134!" issue during boot:
      
        BUG_ON(td->cpu != smp_processor_id());
      
      Basically a per CPU hotplug thread scheduled on the wrong CPU. The oops
      output confirms it:
      
        CPU: 0
        Comm: watchdog/130
      
      The problem is that we aren't ensuring the CPU active bit is set for the
      secondary before allowing the master to continue on. The master unparks
      the secondary CPU's kthreads and the scheduler looks for a CPU to run
      on. It calls select_task_rq() and realises the suggested CPU is not in
      the cpus_allowed mask. It then ends up in select_fallback_rq(), and
      since the active bit isnt't set we choose some other CPU to run on.
      
      This seems to have been introduced by 6acbfb96 "sched: Fix hotplug
      vs. set_cpus_allowed_ptr()", which changed from setting active before
      online to setting active after online. However that was in turn fixing a
      bug where other code assumed an active CPU was also online, so we can't
      just revert that fix.
      
      The simplest fix is just to spin waiting for both active & online to be
      set. We already have a barrier prior to set_cpu_online() (which also
      sets active), to ensure all other setup is completed before online &
      active are set.
      
      Fixes: 6acbfb96 ("sched: Fix hotplug vs. set_cpus_allowed_ptr()")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      875ebe94
  7. 23 2月, 2015 1 次提交
    • P
      powerpc: Re-enable dynticks · fea559f3
      Paul Clarke 提交于
      Implement arch_irq_work_has_interrupt() for powerpc
      
      Commit 9b01f5bf introduced a dependency on "IRQ work self-IPIs" for
      full dynamic ticks to be enabled, by expecting architectures to
      implement a suitable arch_irq_work_has_interrupt() routine.
      
      Several arches have implemented this routine, including x86 (3010279f)
      and arm (09f6edd4), but powerpc was omitted.
      
      This patch implements this routine for powerpc.
      
      The symptom, at boot (on powerpc systems) with "nohz_full=<CPU list>"
      is displayed:
      
           NO_HZ: Can't run full dynticks because arch doesn't support irq work self-IPIs
      
      after this patch:
      
           NO_HZ: Full dynticks CPUs: <CPU list>.
      
      Tested against 3.19.
      
      powerpc implements "IRQ work self-IPIs" by setting the decrementer to 1 in
      arch_irq_work_raise(), which causes a decrementer exception on the next
      timebase tick. We then handle the work in __timer_interrupt().
      
      CC: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NPaul A. Clarke <pc@us.ibm.com>
      Reviewed-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      [mpe: Flesh out change log, fix ws & include guards, remove include of processor.h]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      fea559f3
  8. 19 2月, 2015 1 次提交
  9. 18 2月, 2015 1 次提交
  10. 17 2月, 2015 1 次提交
  11. 14 2月, 2015 1 次提交
  12. 13 2月, 2015 3 次提交
    • C
      powerpc: add running_clock for powerpc to prevent spurious softlockup warnings · 4be1b297
      Cyril Bur 提交于
      On POWER8 virtualised kernels the VTB register can be read to have a view
      of time that only increases while the guest is running.  This will prevent
      guests from seeing time jump if a guest is paused for significant amounts
      of time.
      
      On POWER7 and below virtualised kernels stolen time is subtracted from
      local_clock as a best effort approximation.  This will not eliminate
      spurious warnings in the case of a suspended guest but may reduce the
      occurance in the case of softlockups due to host over commit.
      
      Bare metal kernels should avoid reading the VTB as KVM does not restore
      sane values when not executing, the approxmation is fine as host kernels
      won't observe any stolen time.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Andrew Jones <drjones@redhat.com>
      Acked-by: NDon Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: chai wen <chaiw.fnst@cn.fujitsu.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Aaron Tomlin <atomlin@redhat.com>
      Cc: Ben Zhang <benzh@chromium.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4be1b297
    • A
      all arches, signal: move restart_block to struct task_struct · f56141e3
      Andy Lutomirski 提交于
      If an attacker can cause a controlled kernel stack overflow, overwriting
      the restart block is a very juicy exploit target.  This is because the
      restart_block is held in the same memory allocation as the kernel stack.
      
      Moving the restart block to struct task_struct prevents this exploit by
      making the restart_block harder to locate.
      
      Note that there are other fields in thread_info that are also easy
      targets, at least on some architectures.
      
      It's also a decent simplification, since the restart code is more or less
      identical on all architectures.
      
      [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: NRichard Weinberger <richard@nod.at>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f56141e3
    • M
      mm: remove remaining references to NUMA hinting bits and helpers · 21d9ee3e
      Mel Gorman 提交于
      This patch removes the NUMA PTE bits and associated helpers.  As a
      side-effect it increases the maximum possible swap space on x86-64.
      
      One potential source of problems is races between the marking of PTEs
      PROT_NONE, NUMA hinting faults and migration.  It must be guaranteed that
      a PTE being protected is not faulted in parallel, seen as a pte_none and
      corrupting memory.  The base case is safe but transhuge has problems in
      the past due to an different migration mechanism and a dependance on page
      lock to serialise migrations and warrants a closer look.
      
      task_work hinting update			parallel fault
      ------------------------			--------------
      change_pmd_range
        change_huge_pmd
          __pmd_trans_huge_lock
            pmdp_get_and_clear
      						__handle_mm_fault
      						pmd_none
      						  do_huge_pmd_anonymous_page
      						  read? pmd_lock blocks until hinting complete, fail !pmd_none test
      						  write? __do_huge_pmd_anonymous_page acquires pmd_lock, checks pmd_none
            pmd_modify
            set_pmd_at
      
      task_work hinting update			parallel migration
      ------------------------			------------------
      change_pmd_range
        change_huge_pmd
          __pmd_trans_huge_lock
            pmdp_get_and_clear
      						__handle_mm_fault
      						  do_huge_pmd_numa_page
      						    migrate_misplaced_transhuge_page
      						    pmd_lock waits for updates to complete, recheck pmd_same
            pmd_modify
            set_pmd_at
      
      Both of those are safe and the case where a transhuge page is inserted
      during a protection update is unchanged.  The case where two processes try
      migrating at the same time is unchanged by this series so should still be
      ok.  I could not find a case where we are accidentally depending on the
      PTE not being cleared and flushed.  If one is missed, it'll manifest as
      corruption problems that start triggering shortly after this series is
      merged and only happen when NUMA balancing is enabled.
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Tested-by: NSasha Levin <sasha.levin@oracle.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      21d9ee3e