1. 27 Feb 2016 (2 commits)
  2. 24 Feb 2016 (1 commit)
  3. 22 Feb 2016 (2 commits)
  4. 17 Feb 2016 (3 commits)
    • powerpc: atomic: Implement acquire/release/relaxed variants for cmpxchg · 56c08e6d
      Committed by Boqun Feng
      Implement cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed, based on
      which _release variants can be built.
      
      To avoid superfluous barriers in _acquire variants, we implement these
      operations with assembly code rather than using __atomic_op_acquire()
      to build them automatically.
      
      For the same reason, we keep the assembly implementation of fully
      ordered cmpxchg operations.
      
      However, we don't do the same for _release, because that would require
      putting barriers in the middle of ll/sc loops, which is probably a bad
      idea.
      
      Note cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed are not
      compiler barriers.
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      56c08e6d
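      The reason the _acquire variant is hand-coded can be seen in a rough
      sketch of a 32-bit acquire cmpxchg (kernel-context sketch, helper name
      illustrative, not the literal patch): the acquire barrier can be placed
      inside the asm so that, for example, a failed cmpxchg skips it, which
      the generic __atomic_op_acquire() wrapper cannot do.

      static inline unsigned long
      sketch_cmpxchg_u32_acquire(u32 *p, unsigned long old, unsigned long new)
      {
              unsigned long prev;

              __asm__ __volatile__ (
      "1:     lwarx   %0,0,%2         # load-reserve the current value\n"
      "       cmpw    0,%0,%3\n"
      "       bne-    2f\n"
      "       stwcx.  %4,0,%2         # store-conditional, retry on lost reservation\n"
      "       bne-    1b\n"
              PPC_ACQUIRE_BARRIER     /* reached only when the exchange succeeded */
              "\n"
      "2:"
              : "=&r" (prev), "+m" (*p)
              : "r" (p), "r" (old), "r" (new)
              : "cc", "memory");

              return prev;
      }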
    • powerpc: atomic: Implement acquire/release/relaxed variants for xchg · 26760fc1
      Committed by Boqun Feng
      Implement xchg{,64}_relaxed and atomic{,64}_xchg_relaxed; based on these
      _relaxed variants, the release/acquire variants and fully ordered
      versions can be built.
      
      Note that xchg{,64}_relaxed and atomic_{,64}_xchg_relaxed are not
      compiler barriers.
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      26760fc1
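      For reference, the _relaxed form is just the bare ll/sc loop; a
      simplified sketch (kernel-context, helper name illustrative) with no
      entry/exit barrier and no "memory" clobber, which is also why it is not
      a compiler barrier:

      static inline unsigned long sketch_xchg_u32_relaxed(u32 *p, unsigned long val)
      {
              unsigned long prev;

              __asm__ __volatile__(
      "1:     lwarx   %0,0,%2         # load-reserve the old value\n"
      "       stwcx.  %3,0,%2         # try to store the new value\n"
      "       bne-    1b"             /* reservation lost: retry */
              : "=&r" (prev), "+m" (*p)
              : "r" (p), "r" (val)
              : "cc");                /* note: no "memory" clobber */

              return prev;
      }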
    • powerpc: atomic: Implement atomic{,64}_*_return_* variants · dc53617c
      Committed by Boqun Feng
      On powerpc, acquire and release semantics can be achieved with
      lightweight barriers ("lwsync" and "ctrl+isync"), which can be used to
      implement __atomic_op_{acquire,release}.
      
      For release semantics, we only need to ensure that all memory accesses
      issued beforehand take effect before the -store- part of the atomic, so
      "lwsync" is all we need. On platforms without "lwsync", "sync" should
      be used instead. Therefore __atomic_op_release() uses
      PPC_RELEASE_BARRIER.
      
      For acquire semantics, "lwsync" is likewise all we need, for a similar
      reason. However, on platforms without "lwsync" we can use "isync"
      rather than "sync" as the acquire barrier. Therefore
      __atomic_op_acquire() uses PPC_ACQUIRE_BARRIER, which is barrier() on
      UP, "lwsync" if available, and "isync" otherwise.
      
      Implement atomic{,64}_{add,sub,inc,dec}_return_relaxed, and build other
      variants with these helpers.
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      dc53617c
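      The two builders referred to above look roughly like this (simplified
      sketch of the arch-specific definitions): the release barrier is issued
      before the relaxed operation, the acquire barrier after it.

      #define __atomic_op_acquire(op, args...)                                \
      ({                                                                      \
              typeof(op##_relaxed(args)) __ret = op##_relaxed(args);          \
              __asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");    \
              __ret;                                                          \
      })

      #define __atomic_op_release(op, args...)                                \
      ({                                                                      \
              __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");    \
              op##_relaxed(args);                                             \
      })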
  5. 15 Feb 2016 (3 commits)
    • powerpc/mm: Fix Multi hit ERAT cause by recent THP update · c777e2a8
      Committed by Aneesh Kumar K.V
      With ppc64 we use the deposited pgtable_t to store the hash pte slot
      information. We should not withdraw the deposited pgtable_t without
      marking the pmd none. This ensures that low-level hash fault handling
      will skip this huge pte and we will handle it at upper levels.
      
      A recent change to pmd splitting changed the above in order to handle
      the race between pmd split and exit_mmap. The race is explained below.
      
      Consider the following race:
      
      		CPU0				CPU1
      shrink_page_list()
        add_to_swap()
          split_huge_page_to_list()
            __split_huge_pmd_locked()
              pmdp_huge_clear_flush_notify()
      	// pmd_none() == true
      					exit_mmap()
      					  unmap_vmas()
      					    zap_pmd_range()
      					      // no action on pmd since pmd_none() == true
      	pmd_populate()
      
      As a result the THP will not be freed. The leak is detected by check_mm():
      
      	BUG: Bad rss-counter state mm:ffff880058d2e580 idx:1 val:512
      
      The above required us to not mark pmd none during a pmd split.
      
      The fix for ppc is to clear _PAGE_USER in the huge pte, so that the
      low-level fault handling code skips this pte. At a higher level we do
      take the ptl lock; that should serialize us against the pmd split. Once
      the lock is acquired we check the pmd again using pmd_same. That should
      always return false for us and hence we should retry the access. We do
      the pmd_same check in all cases after taking the ptl with THP
      (do_huge_pmd_wp_page, do_huge_pmd_numa_page and huge_pmd_set_accessed).
      
      Also make sure we wait for the irq-disabled sections on other cpus to
      finish before replacing a huge pte entry with a regular pmd entry. Code
      paths like find_linux_pte_or_hugepte depend on disabled irqs to get a
      stable pte_t pointer. A parallel thp split needs to make sure we don't
      convert a huge pmd entry to a regular pmd entry without waiting for the
      irq-disabled sections to finish.
      
      Fixes: eef1b3ba ("thp: implement split_huge_pmd()")
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      c777e2a8
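      The lock-and-recheck pattern described above, as a hedged sketch (the
      helper name and exact control flow are illustrative, not the literal
      kernel code):

      static int sketch_huge_pmd_op(struct mm_struct *mm, pmd_t *pmd, pmd_t orig_pmd)
      {
              spinlock_t *ptl = pmd_lock(mm, pmd);

              if (unlikely(!pmd_same(*pmd, orig_pmd))) {
                      /* A parallel split (or the _PAGE_USER-cleared huge pte)
                       * changed the pmd under us: drop the lock and retry. */
                      spin_unlock(ptl);
                      return 0;
              }
              /* ... safe to operate on the huge pmd here ... */
              spin_unlock(ptl);
              return 1;
      }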
    • powerpc/eeh: Fix stale cached primary bus · 05ba75f8
      Committed by Gavin Shan
      When a PE is created, its primary bus is cached in pe->bus. At a later
      point, the cached primary bus is returned from eeh_pe_bus_get().
      However, we can get a stale cached primary bus and run into a kernel
      crash in one case: a full hotplug, done as part of fenced PHB error
      recovery, releases all PCI buses under the PHB at unplugging time and
      recreates them at plugging time, while pe->bus still references the PCI
      bus that was released.
      
      This adds another PE flag (EEH_PE_PRI_BUS) to represent the validity
      of pe->bus. pe->bus is updated when its first child EEH device is
      online and the flag is set. Before unplugging in full hotplug for
      error recovery, the flag is cleared.
      
      Fixes: 8cdb2833 ("powerpc/eeh: Trace PCI bus from PE")
      Cc: stable@vger.kernel.org #v3.11+
      Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      05ba75f8
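      The rule the new flag enforces, as an illustrative sketch (the helper
      name is hypothetical and the real eeh_pe_bus_get() handles more cases):

      static struct pci_bus *sketch_eeh_pe_bus_get(struct eeh_pe *pe)
      {
              /* Trust the cached primary bus only while EEH_PE_PRI_BUS says
               * it is valid; the flag is cleared before the full-hotplug
               * unplug during fenced-PHB recovery. */
              if (!(pe->state & EEH_PE_PRI_BUS))
                      return NULL;
              return pe->bus;
      }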
    • powerpc/pseries: Don't trace hcalls on offline CPUs · 126df08c
      Committed by Denis Kirjanov
      If a cpu is hotplugged while the hcall trace points are active, it's
      possible to hit a warning from RCU due to the trace points calling into
      RCU from an offline cpu, eg:
      
        RCU used illegally from offline CPU!
        rcu_scheduler_active = 1, debug_locks = 1
      
      Make the hypervisor tracepoints conditional by using
      TRACE_EVENT_FN_COND.
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      126df08c
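      A sketch of the conversion (the entry fields and printk format here are
      abbreviated and illustrative; the essential part is the TP_CONDITION,
      which stops the tracepoint from firing on an offline CPU):

      TRACE_EVENT_FN_COND(hcall_entry,

              TP_PROTO(unsigned long opcode, unsigned long *args),

              TP_ARGS(opcode, args),

              TP_CONDITION(cpu_online(raw_smp_processor_id())),

              TP_STRUCT__entry(__field(unsigned long, opcode)),

              TP_fast_assign(__entry->opcode = opcode;),

              TP_printk("opcode=%lu", __entry->opcode),

              hcall_tracepoint_regfunc, hcall_tracepoint_unregfunc
      );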
  6. 10 Feb 2016 (2 commits)
  7. 09 Feb 2016 (1 commit)
  8. 28 Jan 2016 (1 commit)
  9. 21 Jan 2016 (4 commits)
    • powerpc: Remove newly added extra definition of pmd_dirty · 0e2bce74
      Committed by Stephen Rothwell
      Commit d5d6a443 ("arch/powerpc/include/asm/pgtable-ppc64.h:
      add pmd_[dirty|mkclean] for THP") added a new identical definition
      of pmd_dirty(). Remove it again.
      
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      0e2bce74
    • powerpc: Wire up copy_file_range() syscall · d7f9ee60
      Committed by Chandan Rajendra
      Test runs on a ppc64 BE guest succeeded using modified fstests.
      
      Also tested on ppc64 LE using a home made test - mpe.
      Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      d7f9ee60
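      Wiring up a syscall on powerpc is mostly table plumbing; a hedged
      sketch of the places touched (the syscall number and whether the plain
      SYSCALL entry or a compat/SPU variant is used are only illustrative,
      see the real patch for the assigned slot):

      /* arch/powerpc/include/asm/systbl.h */
      SYSCALL(copy_file_range)

      /* arch/powerpc/include/uapi/asm/unistd.h */
      #define __NR_copy_file_range    379     /* illustrative slot number */

      /* arch/powerpc/include/asm/unistd.h */
      #define NR_syscalls             380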
    • dma-mapping: always provide the dma_map_ops based implementation · e1c7e324
      Committed by Christoph Hellwig
      Move the generic implementation to <linux/dma-mapping.h> now that all
      architectures support it, and remove the HAVE_DMA_ATTR Kconfig symbol
      now that every architecture supports DMA attributes.
      
      [valentinrothberg@gmail.com: remove leftovers in Kconfig]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Helge Deller <deller@gmx.de>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e1c7e324
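      The "generic implementation" is essentially a set of inline wrappers
      that dispatch through the device's dma_map_ops; a simplified sketch of
      the pattern (helper name illustrative, not the exact header contents):

      static inline dma_addr_t sketch_dma_map_single(struct device *dev,
                                                     void *ptr, size_t size,
                                                     enum dma_data_direction dir)
      {
              struct dma_map_ops *ops = get_dma_ops(dev);

              /* Every architecture now provides get_dma_ops(), so a wrapper
               * like this can live once in <linux/dma-mapping.h> rather than
               * be duplicated per arch. */
              return ops->map_page(dev, virt_to_page(ptr), offset_in_page(ptr),
                                   size, dir, NULL);
      }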
    • powerpc/fadump: rename cpu_online_mask member of struct fadump_crash_info_header · a0512164
      Committed by Rasmus Villemoes
      The four cpumasks cpu_{possible,online,present,active}_bits are exposed
      readonly via the corresponding const variables cpu_xyz_mask.  But they are
      also accessible for arbitrary writing via the exposed functions
      set_cpu_xyz.  There's quite a bit of code throughout the kernel which
      iterates over or otherwise accesses these bitmaps, and having the access
      go via the cpu_xyz_mask variables is nowadays [1] simply a useless
      indirection.
      
      It may be that any problem in CS can be solved by an extra level of
      indirection, but that doesn't mean every extra indirection solves a
      problem.  In this case, it even necessitates some minor ugliness (see
      4/6).
      
      Patch 1/6 is new in v2, and fixes a build failure on ppc by renaming a
      struct member, to avoid problems when the identifier cpu_online_mask
      becomes a macro later in the series.  The next four patches eliminate the
      cpu_xyz_mask variables by simply exposing the actual bitmaps, after
      renaming them to discourage direct access - that still happens through
      cpu_xyz_mask, which are now simply macros with the same type and value as
      they used to have.
      
      After that, there's no longer any reason to have the setter functions be
      out-of-line: The boolean parameter is almost always a literal true or
      false, so by making them static inlines they will usually compile to one
      or two instructions.
      
      For a defconfig build on x86_64, bloat-o-meter says we save ~3000 bytes.
      We also save a little stack (stackdelta says 127 functions have a 16 byte
      smaller stack frame, while two grow by that amount).  Mostly because, when
      iterating over the mask, gcc typically loads the value of cpu_xyz_mask
      into a callee-saved register and from there into %rdi before each
      find_next_bit call - now it can just load the appropriate immediate
      address into %rdi before each call.
      
      [1] See Rusty's kind explanation
      http://thread.gmane.org/gmane.linux.kernel/2047078/focus=2047722 for
      some historic context.
      
      This patch (of 6):
      
      As preparation for eliminating the indirect access to the various global
      cpu_*_bits bitmaps via the pointer variables cpu_*_mask, rename the
      cpu_online_mask member of struct fadump_crash_info_header to simply
      online_mask, thus allowing cpu_online_mask to become a macro.
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a0512164
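      The rename itself is a one-liner; sketched here with only the relevant
      member shown (the rest of the header is unchanged):

      struct fadump_crash_info_header {
              /* ... */
              cpumask_t       online_mask;    /* was: cpu_online_mask, which would
                                               * collide once cpu_online_mask becomes
                                               * a macro later in the series */
      };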
  10. 16 Jan 2016 (3 commits)
    • kvm: rename pfn_t to kvm_pfn_t · ba049e93
      Committed by Dan Williams
      To date, we have implemented two I/O usage models for persistent memory,
      PMEM (a persistent "ram disk") and DAX (mmap persistent memory into
      userspace).  This series adds a third, DAX-GUP, that allows DAX mappings
      to be the target of direct-i/o.  It allows userspace to coordinate
      DMA/RDMA from/to persistent memory.
      
      The implementation leverages the ZONE_DEVICE mm-zone that went into
      4.3-rc1 (also discussed at kernel summit) to flag pages that are owned
      and dynamically mapped by a device driver.  The pmem driver, after
      mapping a persistent memory range into the system memmap via
      devm_memremap_pages(), arranges for DAX to distinguish pfn-only versus
      page-backed pmem-pfns via flags in the new pfn_t type.
      
      The DAX code, upon seeing a PFN_DEV+PFN_MAP flagged pfn, flags the
      resulting pte(s) inserted into the process page tables with a new
      _PAGE_DEVMAP flag.  Later, when get_user_pages() is walking ptes it keys
      off _PAGE_DEVMAP to pin the device hosting the page range active.
      Finally, get_page() and put_page() are modified to take references
      against the device driver established page mapping.
      
      This need for "struct page" for persistent memory requires memory
      capacity to store the memmap array.  Given that the memmap array for a
      large pool of persistent memory may exhaust available DRAM, introduce a
      mechanism to allocate the memmap from persistent memory.  The new
      "struct vmem_altmap *" parameter to devm_memremap_pages() enables
      arch_add_memory() to use reserved pmem capacity rather than the page
      allocator.
      
      This patch (of 18):
      
      The core has developed a need for a "pfn_t" type [1].  Move the existing
      pfn_t in KVM to kvm_pfn_t [2].
      
      [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html
      [2]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002218.html
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ba049e93
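      The rename in this patch is mechanical; in KVM's type header it amounts
      to roughly this (sketch, exact underlying typedef per kvm_types.h):

      /* include/linux/kvm_types.h */
      typedef u64 kvm_pfn_t;          /* was: typedef u64 pfn_t; */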
    • arch/powerpc/include/asm/pgtable-ppc64.h: add pmd_[dirty|mkclean] for THP · d5d6a443
      Committed by Minchan Kim
      MADV_FREE needs pmd_dirty and pmd_mkclean for detecting recent
      overwrite of the contents, since the MADV_FREE syscall can be called on
      a THP page.
      
      This patch adds pmd_dirty and pmd_mkclean for THP page MADV_FREE
      support.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: <yalin.wang2010@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chen Gang <gang.chen.5i5j@gmail.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Daniel Micay <danielmicay@gmail.com>
      Cc: Darrick J. Wong <darrick.wong@oracle.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Jason Evans <je@fb.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mika Penttil <mika.penttila@nextfour.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d5d6a443
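      On ppc64 a transparent huge pmd can be treated like a pte, so the new
      helpers can be sketched as thin wrappers over the existing pte versions
      (illustrative; pmd_pte()/pte_pmd() are the usual conversion helpers):

      #define pmd_dirty(pmd)          pte_dirty(pmd_pte(pmd))

      static inline pmd_t pmd_mkclean(pmd_t pmd)
      {
              return pte_pmd(pte_mkclean(pmd_pte(pmd)));
      }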
    • powerpc, thp: remove infrastructure for handling splitting PMDs · 7aa9a23c
      Committed by Kirill A. Shutemov
      With the new refcounting we don't need to mark PMDs as splitting. Let's
      drop the code that handles this.
      
      pmdp_splitting_flush() is not needed either: when splitting a PMD we
      will do pmdp_clear_flush() + set_pte_at(), and pmdp_clear_flush() will
      do the IPI needed for fast_gup.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Steve Capper <steve.capper@linaro.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7aa9a23c
  11. 13 Jan 2016 (5 commits)
  12. 12 Jan 2016 (2 commits)
    • powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff · 2f10f1a7
      Committed by Hugh Dickins
      Swapoff after swapping hangs on the G5, when CONFIG_CHECKPOINT_RESTORE=y
      but CONFIG_MEM_SOFT_DIRTY is not set.  That's because the non-zero
      _PAGE_SWP_SOFT_DIRTY bit, added by CONFIG_HAVE_ARCH_SOFT_DIRTY=y, is not
      discounted when CONFIG_MEM_SOFT_DIRTY is not set: so swap ptes cannot be
      recognized.
      
      (I suspect that the peculiar dependence of HAVE_ARCH_SOFT_DIRTY on
      CHECKPOINT_RESTORE in arch/powerpc/Kconfig comes from an incomplete
      attempt to solve this problem.)
      
      It's true that the relationship between CONFIG_HAVE_ARCH_SOFT_DIRTY
      and CONFIG_MEM_SOFT_DIRTY is too confusing, and it's true that swapoff
      should be made more robust; but nevertheless, fix up the powerpc ifdefs
      as x86_64 and s390 (which met the same problem) have them, defining the
      bits as 0 if CONFIG_MEM_SOFT_DIRTY is not set.
      
      Fixes: 7207f436 ("powerpc/mm: Add page soft dirty tracking")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      2f10f1a7
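      The shape of the fix, sketched (only the ifdef structure is shown; the
      non-zero bit definition is the one already in the powerpc headers):

      #ifdef CONFIG_MEM_SOFT_DIRTY
      /* keep the existing non-zero _PAGE_SWP_SOFT_DIRTY definition */
      #else
      #define _PAGE_SWP_SOFT_DIRTY    0UL     /* bit vanishes, so swap ptes
                                               * are recognized again */
      #endif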
    • powerpc/mm: Fix _PAGE_PTE breaking swapoff · 44734f23
      Committed by Aneesh Kumar K.V
      The core kernel expects swp_entry_t to consist of only the swap type
      and swap offset. We should not leak pte bits into swp_entry_t. Doing so
      breaks swapoff, which uses the swap type and offset to build a
      swp_entry_t and later compares that to the swp_entry_t obtained from
      the Linux page table pte. Leaking pte bits into swp_entry_t breaks that
      comparison and results in us looping in try_to_unuse.
      
      The stack trace can be anywhere below try_to_unuse() in mm/swapfile.c,
      since swapoff is circling around and around that function, reading from
      each used swap block into a page, then trying to find where that page
      belongs, looking at every non-file pte of every mm that ever swapped.
      
      Fixes: 6a119eae ("powerpc/mm: Add a _PAGE_PTE bit")
      Reported-by: Hugh Dickins <hughd@google.com>
      Suggested-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      44734f23
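      The invariant being restored, as a hedged sketch (the macro below is
      hypothetical and only illustrates the idea; the real powerpc conversion
      macros differ in detail):

      /* Converting a swap pte to a swp_entry_t must strip arch-private pte
       * bits such as _PAGE_PTE, so the result compares equal to an entry
       * built from (type, offset) alone in try_to_unuse(). */
      #define sketch_pte_to_swp_entry(pte)    \
              ((swp_entry_t) { (pte_val(pte) & ~_PAGE_PTE) >> _PAGE_BIT_SWAP_TYPE })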
  13. 11 Jan 2016 (1 commit)
  14. 09 Jan 2016 (2 commits)
  15. 05 Jan 2016 (1 commit)
  16. 04 Jan 2016 (1 commit)
  17. 28 Dec 2015 (1 commit)
  18. 27 Dec 2015 (2 commits)
    • powerpc/powernv: Add a kmsg_dumper that flushes console output on panic · affddff6
      Committed by Russell Currey
      On BMC machines, console output is controlled by the OPAL firmware and is
      only flushed when its pollers are called.  When the kernel is in a panic
      state, it no longer calls these pollers and thus console output does not
      completely flush, causing some output from the panic to be lost.
      
      Output is only actually lost when the kernel is configured to not power off
      or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL
      flushes the console buffer as part of its power down routines.  Before this
      patch, however, only partial output would be printed during the timeout wait.
      
      This patch adds a new kmsg_dumper which gets called at panic time to ensure
      panic output is not lost.  It accomplishes this by calling OPAL_CONSOLE_FLUSH
      in the OPAL API, and if that is not available, the pollers are called enough
      times to (hopefully) completely flush the buffer.
      
      The flushing mechanism will only affect output printed at and before the
      kmsg_dump call in kernel/panic.c:panic().  As such, the "end Kernel panic"
      message may still be truncated as follows:
      
      >Call Trace:
      >[c000000f1f603b00] [c0000000008e9458] dump_stack+0x90/0xbc (unreliable)
      >[c000000f1f603b30] [c0000000008e7e78] panic+0xf8/0x2c4
      >[c000000f1f603bc0] [c000000000be4860] mount_block_root+0x288/0x33c
      >[c000000f1f603c80] [c000000000be4d14] prepare_namespace+0x1f4/0x254
      >[c000000f1f603d00] [c000000000be43e8] kernel_init_freeable+0x318/0x350
      >[c000000f1f603dc0] [c00000000000bd74] kernel_init+0x24/0x130
      >[c000000f1f603e30] [c0000000000095b0] ret_from_kernel_thread+0x5c/0xac
      >---[ end Kernel panic - not
      
      This functionality is implemented as a kmsg_dumper as it seems to be the
      most sensible way to introduce platform-specific functionality to the
      panic function.
      Signed-off-by: Russell Currey <ruscur@russell.cc>
      Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      affddff6
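      A minimal sketch of the mechanism, assuming illustrative names for the
      OPAL call wrapper and the dumper (the real patch also handles poller
      fallback details and retry counts):

      static void sketch_opal_kmsg_dump(struct kmsg_dumper *dumper,
                                        enum kmsg_dump_reason reason)
      {
              if (reason != KMSG_DUMP_PANIC)
                      return;

              /* Ask firmware to flush its console buffer; if OPAL_CONSOLE_FLUSH
               * is not implemented, poke the pollers a number of times instead. */
              if (opal_flush_console(0) == OPAL_UNSUPPORTED)
                      opal_poll_events(NULL);
      }

      static struct kmsg_dumper sketch_opal_kmsg_dumper = {
              .dump = sketch_opal_kmsg_dump,
      };

      /* registered once at boot with kmsg_dump_register(&sketch_opal_kmsg_dumper) */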
    • powerpc: Copy only required pieces of the mm_context_t to the paca · 2fc251a8
      Committed by Michael Neuling
      Currently we copy the whole mm_context_t to the paca but only access a
      few bits of it. This is wasteful of paca space and also takes quite
      some time in the hot path of context switching.
      
      This patch pulls in only the required bits from the mm_context_t to
      the paca and on context switch, copies only those.
      
      Benchmarking this (on top of Anton's recent MSR context switching
      changes [1]) using processes and yield shows an improvement of almost
      3% on POWER8:
      
        http://ozlabs.org/~anton/junkcode/context_switch2.c
        ./context_switch2 --test=yield --process 0 0
      
      1. https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135700.html
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      [mpe: Rename paca fields to be mm_ctx_foo rather than context_foo]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      2fc251a8
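      A sketch of the idea (field names follow the mm_ctx_foo convention
      noted above; the exact set of fields copied is per the patch):

      static inline void sketch_copy_mm_to_paca(mm_context_t *context)
      {
              /* Copy only the handful of fields the low-level hash/SLB code
               * actually reads, instead of the whole mm_context_t. */
              get_paca()->mm_ctx_id = context->id;
              get_paca()->mm_ctx_user_psize = context->user_psize;
              /* ... the few remaining mm_ctx_* fields ... */
      }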
  19. 23 Dec 2015 (3 commits)