1. 11 10月, 2013 2 次提交
    • A
      powerpc: add real mode support for dma operations on powernv · 8e0a1611
      Alexey Kardashevskiy 提交于
      The existing TCE machine calls (tce_build and tce_free) only support
      virtual mode as they call __raw_writeq for TCE invalidation what
      fails in real mode.
      
      This introduces tce_build_rm and tce_free_rm real mode versions
      which do mostly the same but use "Store Doubleword Caching Inhibited
      Indexed" instruction for TCE invalidation.
      
      This new feature is going to be utilized by real mode support of VFIO.
      Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      8e0a1611
    • A
      powerpc: Prepare to support kernel handling of IOMMU map/unmap · 8e0861fa
      Alexey Kardashevskiy 提交于
      The current VFIO-on-POWER implementation supports only user mode
      driven mapping, i.e. QEMU is sending requests to map/unmap pages.
      However this approach is really slow, so we want to move that to KVM.
      Since H_PUT_TCE can be extremely performance sensitive (especially with
      network adapters where each packet needs to be mapped/unmapped) we chose
      to implement that as a "fast" hypercall directly in "real
      mode" (processor still in the guest context but MMU off).
      
      To be able to do that, we need to provide some facilities to
      access the struct page count within that real mode environment as things
      like the sparsemem vmemmap mappings aren't accessible.
      
      This adds an API function realmode_pfn_to_page() to get page struct when
      MMU is off.
      
      This adds to MM a new function put_page_unless_one() which drops a page
      if counter is bigger than 1. It is going to be used when MMU is off
      (for example, real mode on PPC64) and we want to make sure that page
      release will not happen in real mode as it may crash the kernel in
      a horrible way.
      
      CONFIG_SPARSEMEM_VMEMMAP and CONFIG_FLATMEM are supported.
      
      Cc: linux-mm@kvack.org
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      8e0861fa
  2. 08 10月, 2013 1 次提交
  3. 03 10月, 2013 9 次提交
  4. 25 9月, 2013 4 次提交
    • B
      powerpc/pseries: Do not start secondaries in Open Firmware · dbe78b40
      Benjamin Herrenschmidt 提交于
      Starting secondary CPUs early on from Open Firmware and placing them
      in a holding spin loop slows down the boot process significantly under
      some hypervisors such as KVM.
      
      This is also unnecessary when RTAS supports querying the CPU state
      
      So let's not do it.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      dbe78b40
    • B
      powerpc/zImage: make the "OF" wrapper support ePAPR boot · 0c9fa291
      Benjamin Herrenschmidt 提交于
      This makes the "OF" zImage wrapper (zImage.pseries, zImage.pmac,
      zImage.maple) work if booted via a flat device-tree (ePAPR boot
      mode), and thus potentially usable with kexec.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      0c9fa291
    • B
      powerpc: Remove ksp_limit on ppc64 · cbc9565e
      Benjamin Herrenschmidt 提交于
      We've been keeping that field in thread_struct for a while, it contains
      the "limit" of the current stack pointer and is meant to be used for
      detecting stack overflows.
      
      It has a few problems however:
      
       - First, it was never actually *used* on 64-bit. Set and updated but
      not actually exploited
      
       - When switching stack to/from irq and softirq stacks, it's update
      is racy unless we hard disable interrupts, which is costly. This
      is fine on 32-bit as we don't soft-disable there but not on 64-bit.
      
      Thus rather than fixing 2 in order to implement 1 in some hypothetical
      future, let's remove the code completely from 64-bit. In order to avoid
      a clutter of ifdef's, we remove the updates from C code completely
      during interrupt stack switching, and instead maintain it from the
      asm helper that is used to do the stack switching in the first place.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      cbc9565e
    • B
      powerpc/irq: Run softirqs off the top of the irq stack · 0366a1c7
      Benjamin Herrenschmidt 提交于
      Nowadays, irq_exit() calls __do_softirq() pretty much directly
      instead of calling do_softirq() which switches to the decicated
      softirq stack.
      
      This has lead to observed stack overflows on powerpc since we call
      irq_enter() and irq_exit() outside of the scope that switches to
      the irq stack.
      
      This fixes it by moving the stack switching up a level, making
      irq_enter() and irq_exit() run off the irq stack.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      0366a1c7
  5. 13 9月, 2013 2 次提交
  6. 12 9月, 2013 1 次提交
    • N
      mm: migrate: check movability of hugepage in unmap_and_move_huge_page() · 83467efb
      Naoya Horiguchi 提交于
      Currently hugepage migration works well only for pmd-based hugepages
      (mainly due to lack of testing,) so we had better not enable migration of
      other levels of hugepages until we are ready for it.
      
      Some users of hugepage migration (mbind, move_pages, and migrate_pages) do
      page table walk and check pud/pmd_huge() there, so they are safe.  But the
      other users (softoffline and memory hotremove) don't do this, so without
      this patch they can try to migrate unexpected types of hugepages.
      
      To prevent this, we introduce hugepage_migration_support() as an
      architecture dependent check of whether hugepage are implemented on a pmd
      basis or not.  And on some architecture multiple sizes of hugepages are
      available, so hugepage_migration_support() also checks hugepage size.
      Signed-off-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      83467efb
  7. 11 9月, 2013 4 次提交
    • V
      powerpc: Default arch idle could cede processor on pseries · 363edbe2
      Vaidyanathan Srinivasan 提交于
      When adding cpuidle support to pSeries, we introduced two
      regressions:
      
        - The new cpuidle backend driver only works under hypervisors
          supporting the "SLPLAR" option, which isn't the case of the
          old POWER4 hypervisor and the HV "light" used on js2x blades
      
        - The cpuidle driver registers fairly late, meaning that for
          a significant portion of the boot process, we end up having
          all threads spinning. This slows down the boot process and
          increases the overall resource usage if the hypervisor has
          shared processors.
      
      This fixes both by implementing a "default" idle that will cede
      to the hypervisor when possible, in a very simple way without
      all the bells and whisles of cpuidle.
      Reported-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NVaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
      Acked-by: NDeepthi Dharwar <deepthi@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: <stable@vger.kernel.org>
      363edbe2
    • V
      powerpc: Fix section mismatch warning for prom_rtas_call · 620e5050
      Vladimir Murzin 提交于
      While cross-building for PPC64 I've got
      
      WARNING: vmlinux.o(.text.unlikely+0x1ba): Section mismatch in
      reference from the function .prom_rtas_call() to the variable
      .init.data:dt_string_start The function .prom_rtas_call() references
      the variable __initdata dt_string_start.  This is often because
      .prom_rtas_call lacks a __initdata annotation or the annotation of
      dt_string_start is wrong.
      
      WARNING: vmlinux.o(.meminit.text+0xeb0): Section mismatch in reference
      from the function .free_area_init_core.isra.47() to the function
      .init.text:.set_pageblock_order() The function __meminit
      .free_area_init_core.isra.47() references a function __init
      .set_pageblock_order().  If .set_pageblock_order is only used by
      .free_area_init_core.isra.47 then annotate .set_pageblock_order with a
      matching annotation.
      
      Fix it by proper annotation of prom_rtas_call.
      Signed-off-by: NVladimir Murzin <murzin.v@gmail.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      620e5050
    • A
      powerpc: Fix possible deadlock on page fault · 69e044dd
      Aneesh Kumar K.V 提交于
       stack_grow_into/14082 is trying to acquire lock:
        (&mm->mmap_sem){++++++}, at: [<c000000000206d28>] .might_fault+0x78/0xe0
      
       but task is already holding lock:
        (&mm->mmap_sem){++++++}, at: [<c0000000007ffd8c>] .do_page_fault+0x24c/0x910
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&mm->mmap_sem);
         lock(&mm->mmap_sem);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       1 lock held by stack_grow_into/14082:
        #0:  (&mm->mmap_sem){++++++}, at: [<c0000000007ffd8c>] .do_page_fault+0x24c/0x910
      
       stack backtrace:
       CPU: 21 PID: 14082 Comm: stack_grow_into Not tainted 3.10.0-10.el7.ppc64.debug #1
       Call Trace:
       [c0000003d396b850] [c000000000016e7c] .show_stack+0x7c/0x1f0 (unreliable)
       [c0000003d396b920] [c000000000813fc8] .dump_stack+0x28/0x3c
       [c0000003d396b990] [c000000000124b90] .__lock_acquire+0x1640/0x1800
       [c0000003d396bab0] [c00000000012570c] .lock_acquire+0xac/0x250
       [c0000003d396bb80] [c000000000206d54] .might_fault+0xa4/0xe0
       [c0000003d396bbf0] [c0000000007ffe2c] .do_page_fault+0x2ec/0x910
       [c0000003d396be30] [c0000000000092e8] handle_page_fault+0x10/0x30
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      69e044dd
    • G
      powerpc: Export cpu_to_chip_id() to fix build error · 256588fd
      Guenter Roeck 提交于
      powerpc allmodconfig build fails with:
      
      ERROR: ".cpu_to_chip_id" [drivers/block/mtip32xx/mtip32xx.ko] undefined!
      
      The problem was introduced with commit 15863ff3 (powerpc: Make chip-id
      information available to userspace).
      
      Export the missing symbol.
      
      Cc: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
      Cc: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
      Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      256588fd
  8. 05 9月, 2013 2 次提交
  9. 04 9月, 2013 4 次提交
  10. 29 8月, 2013 2 次提交
  11. 28 8月, 2013 6 次提交
  12. 27 8月, 2013 3 次提交
    • B
      powerpc/powernv: Return secondary CPUs to firmware on kexec · 13906db6
      Benjamin Herrenschmidt 提交于
      With OPAL v3 we can return secondary CPUs to firmware on kexec. This
      allows firmware to do various cleanups making things generally more
      reliable, and will enable the "new" kernel to call OPAL to perform
      some reconfiguration tasks early on that can only be done while
      all the CPUs are in firmware.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      13906db6
    • P
      powerpc: Work around gcc miscompilation of __pa() on 64-bit · bdbc29c1
      Paul Mackerras 提交于
      On 64-bit, __pa(&static_var) gets miscompiled by recent versions of
      gcc as something like:
      
              addis 3,2,.LANCHOR1+4611686018427387904@toc@ha
              addi 3,3,.LANCHOR1+4611686018427387904@toc@l
      
      This ends up effectively ignoring the offset, since its bottom 32 bits
      are zero, and means that the result of __pa() still has 0xC in the top
      nibble.  This happens with gcc 4.8.1, at least.
      
      To work around this, for 64-bit we make __pa() use an AND operator,
      and for symmetry, we make __va() use an OR operator.  Using an AND
      operator rather than a subtraction ends up with slightly shorter code
      since it can be done with a single clrldi instruction, whereas it
      takes three instructions to form the constant (-PAGE_OFFSET) and add
      it on.  (Note that MEMORY_START is always 0 on 64-bit.)
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      bdbc29c1
    • B
      powerpc: Don't Oops when accessing /proc/powerpc/lparcfg without hypervisor · f5f6cbb6
      Benjamin Herrenschmidt 提交于
      /proc/powerpc/lparcfg is an ancient facility (though still actively used)
      which allows access to some informations relative to the partition when
      running underneath a PAPR compliant hypervisor.
      
      It makes no sense on non-pseries machines. However, currently, not only
      can it be created on these if the kernel has pseries support, but accessing
      it on such a machine will crash due to trying to do hypervisor calls.
      
      In fact, it should also not do HV calls on older pseries that didn't have
      an hypervisor either.
      
      Finally, it has the plumbing to be a module but is a "bool" Kconfig option.
      
      This fixes the whole lot by turning it into a machine_device_initcall
      that is only created on pseries, and adding the necessary hypervisor
      check before calling the H_GET_EM_PARMS hypercall
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      f5f6cbb6