1. 30 10月, 2015 1 次提交
    • Y
      sparc/PCI: Add mem64 resource parsing for root bus · af86fa40
      Yinghai Lu 提交于
      David reported that a T5-8 sparc system failed to boot with:
      
        pci_sun4v f02dbcfc: PCI host bridge to bus 0000:00
        pci_bus 0000:00: root bus resource [io  0x804000000000-0x80400fffffff] (bus address [0x0000-0xfffffff])
        pci_bus 0000:00: root bus resource [mem 0x800000000000-0x80007effffff] (bus address [0x00000000-0x7effffff])
        pci 0000:00:01.0: can't claim BAR 15 [mem 0x100000000-0x4afffffff pref]: no compatible bridge window
      
      Note that we don't know about a host bridge aperture that contains
      BAR 15.  OF does report a MEM64 aperture, but before this patch,
      pci_determine_mem_io_space() ignored it.
      
      Add support for host bridge apertures with 64-bit PCI addresses.  Also
      set IORESOURCE_MEM_64 for PCI device and bridge resources in PCI 64-bit
      memory space.
      
      Sparc doesn't actually print the device and bridge resources, but after
      this patch, we should have the equivalent of this:
      
        pci_sun4v f02dbcfc: PCI host bridge to bus 0000:00
        pci_bus 0000:00: root bus resource [io  0x804000000000-0x80400fffffff] (bus address [0x0000-0xfffffff])
        pci_bus 0000:00: root bus resource [mem 0x800000000000-0x80007effffff] (bus address [0x00000000-0x7effffff])
        pci_bus 0000:00: root bus resource [mem 0x800100000000-0x8007ffffffff] (bus address [0x100000000-0x7ffffffff])
        pci 0000:00:01.0:   bridge window [mem 0x800100000000-0x8004afffffff 64bit pref]
      
      [bhelgaas: changelog, URL to David's report]
      Fixes: d63e2e1f ("sparc/PCI: Clip bridge windows to fit in upstream windows")
      Link: http://lkml.kernel.org/r/5514391F.2030300@oracle.comReported-by: NDavid Ahern <david.ahern@oracle.com>
      Tested-by: NDavid Ahern <david.ahern@oracle.com>
      Tested-by: NKhalid Aziz <khalid.aziz@oracle.com>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      af86fa40
  2. 11 9月, 2015 5 次提交
    • C
      dma-mapping: consolidate dma_set_mask · 452e06af
      Christoph Hellwig 提交于
      Almost everyone implements dma_set_mask the same way, although some time
      that's hidden in ->set_dma_mask methods.
      
      This patch consolidates those into a common implementation that either
      calls ->set_dma_mask if present or otherwise uses the default
      implementation.  Some architectures used to only call ->set_dma_mask
      after the initial checks, and those instance have been fixed to do the
      full work.  h8300 implemented dma_set_mask bogusly as a no-ops and has
      been fixed.
      
      Unfortunately some architectures overload unrelated semantics like changing
      the dma_ops into it so we still need to allow for an architecture override
      for now.
      
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      452e06af
    • C
      dma-mapping: consolidate dma_supported · ee196371
      Christoph Hellwig 提交于
      Most architectures just call into ->dma_supported, but some also return 1
      if the method is not present, or 0 if no dma ops are present (although
      that should never happeb). Consolidate this more broad version into
      common code.
      
      Also fix h8300 which inorrectly always returned 0, which would have been
      a problem if it's dma_set_mask implementation wasn't a similarly buggy
      noop.
      
      As a few architectures have much more elaborate implementations, we
      still allow for arch overrides.
      
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ee196371
    • C
      dma-mapping: cosolidate dma_mapping_error · efa21e43
      Christoph Hellwig 提交于
      Currently there are three valid implementations of dma_mapping_error:
      
       (1) call ->mapping_error
       (2) check for a hardcoded error code
       (3) always return 0
      
      This patch provides a common implementation that calls ->mapping_error
      if present, then checks for DMA_ERROR_CODE if defined or otherwise
      returns 0.
      
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      efa21e43
    • C
      dma-mapping: consolidate dma_{alloc,free}_noncoherent · 1e893752
      Christoph Hellwig 提交于
      Most architectures do not support non-coherent allocations and either
      define dma_{alloc,free}_noncoherent to their coherent versions or stub
      them out.
      
      Openrisc uses dma_{alloc,free}_attrs to implement them, and only Mips
      implements them directly.
      
      This patch moves the Openrisc version to common code, and handles the
      DMA_ATTR_NON_CONSISTENT case in the mips dma_map_ops instance.
      
      Note that actual non-coherent allocations require a dma_cache_sync
      implementation, so if non-coherent allocations didn't work on
      an architecture before this patch they still won't work after it.
      
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1e893752
    • C
      dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent} · 6894258e
      Christoph Hellwig 提交于
      Since 2009 we have a nice asm-generic header implementing lots of DMA API
      functions for architectures using struct dma_map_ops, but unfortunately
      it's still missing a lot of APIs that all architectures still have to
      duplicate.
      
      This series consolidates the remaining functions, although we still need
      arch opt outs for two of them as a few architectures have very
      non-standard implementations.
      
      This patch (of 5):
      
      The coherent DMA allocator works the same over all architectures supporting
      dma_map operations.
      
      This patch consolidates them and converges the minor differences:
      
       - the debug_dma helpers are now called from all architectures, including
         those that were previously missing them
       - dma_alloc_from_coherent and dma_release_from_coherent are now always
         called from the generic alloc/free routines instead of the ops
         dma-mapping-common.h always includes dma-coherent.h to get the defintions
         for them, or the stubs if the architecture doesn't support this feature
       - checks for ->alloc / ->free presence are removed.  There is only one
         magic instead of dma_map_ops without them (mic_dma_ops) and that one
         is x86 only anyway.
      
      Besides that only x86 needs special treatment to replace a default devices
      if none is passed and tweak the gfp_flags.  An optional arch hook is provided
      for that.
      
      [linux@roeck-us.net: fix build]
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6894258e
  3. 09 9月, 2015 1 次提交
    • M
      sparc32: do not include swap.h from pgtable_32.h · b3d9ed3f
      Michal Hocko 提交于
      "memcg: export struct mem_cgroup" will add includes into
      linux/memcontrol.h which lead to further header dependency issues as
      reported by Guenter Roeck:
      
        In file included from include/linux/highmem.h:7:0,
                         from include/linux/bio.h:23,
                         from include/linux/writeback.h:192,
                         from include/linux/memcontrol.h:30,
                         from include/linux/swap.h:8,
                         from ./arch/sparc/include/asm/pgtable_32.h:17,
                         from ./arch/sparc/include/asm/pgtable.h:6,
                         from arch/sparc/kernel/traps_32.c:23:
        include/linux/mm.h: In function 'is_vmalloc_addr':
        include/linux/mm.h:371:17: error: 'VMALLOC_START' undeclared (first use in this function)
        include/linux/mm.h:371:17: note: each undeclared identifier is reported only once for each function it appears in
        include/linux/mm.h:371:41: error: 'VMALLOC_END' undeclared (first use in this function)
        include/linux/mm.h: In function 'maybe_mkwrite':
        include/linux/mm.h:556:3: error: implicit declaration of function 'pte_mkwrite'
      
      The issue is that pgtable_32.h depends on swap.h to get swap_entry_t but
      that goes all the way down to linux/mm.h which wants to have VMALLOC_*
      which is defined later in pgtable_32.h, though.
      
      swap_entry_t is defined in include/mm_types.h so it should be sufficient
      to include this header without more dependencies.
      Signed-off-by: NMichal Hocko <mhocko@suse.com>
      Reported-by: NGuenter Roeck <linux@roeck-us.net>
      Tested-by: NGuenter Roeck <linux@roeck-us.net>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b3d9ed3f
  4. 11 8月, 2015 1 次提交
    • D
      cleanup IORESOURCE_CACHEABLE vs ioremap() · 92b19ff5
      Dan Williams 提交于
      Quoting Arnd:
          I was thinking the opposite approach and basically removing all uses
          of IORESOURCE_CACHEABLE from the kernel. There are only a handful of
          them.and we can probably replace them all with hardcoded
          ioremap_cached() calls in the cases they are actually useful.
      
      All existing usages of IORESOURCE_CACHEABLE call ioremap() instead of
      ioremap_nocache() if the resource is cacheable, however ioremap() is
      uncached by default. Clearly none of the existing usages care about the
      cacheability. Particularly devm_ioremap_resource() never worked as
      advertised since it always fell back to plain ioremap().
      
      Clean this up as the new direction we want is to convert
      ioremap_<type>() usages to memremap(..., flags).
      Suggested-by: NArnd Bergmann <arnd@arndb.de>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      92b19ff5
  5. 10 8月, 2015 1 次提交
  6. 07 8月, 2015 2 次提交
    • L
      treewide: Fix typo compatability -> compatibility · 60acc4eb
      Laurent Pinchart 提交于
      Even though 'compatability' has a dedicated entry in the Wiktionary,
      it's listed as 'Mispelling of compatibility'. Fix it.
      Signed-off-by: NLaurent Pinchart <laurent.pinchart@ideasonboard.com>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> for the atomic_helper.c
      Signed-off-by: NJiri Kosina <jkosina@suse.com>
      60acc4eb
    • D
      sparc64: Fix userspace FPU register corruptions. · 44922150
      David S. Miller 提交于
      If we have a series of events from userpsace, with %fprs=FPRS_FEF,
      like follows:
      
      ETRAP
      	ETRAP
      		VIS_ENTRY(fprs=0x4)
      		VIS_EXIT
      		RTRAP (kernel FPU restore with fpu_saved=0x4)
      	RTRAP
      
      We will not restore the user registers that were clobbered by the FPU
      using kernel code in the inner-most trap.
      
      Traps allocate FPU save slots in the thread struct, and FPU using
      sequences save the "dirty" FPU registers only.
      
      This works at the initial trap level because all of the registers
      get recorded into the top-level FPU save area, and we'll return
      to userspace with the FPU disabled so that any FPU use by the user
      will take an FPU disabled trap wherein we'll load the registers
      back up properly.
      
      But this is not how trap returns from kernel to kernel operate.
      
      The simplest fix for this bug is to always save all FPU register state
      for anything other than the top-most FPU save area.
      
      Getting rid of the optimized inner-slot FPU saving code ends up
      making VISEntryHalf degenerate into plain VISEntry.
      
      Longer term we need to do something smarter to reinstate the partial
      save optimizations.  Perhaps the fundament error is having trap entry
      and exit allocate FPU save slots and restore register state.  Instead,
      the VISEntry et al. calls should be doing that work.
      
      This bug is about two decades old.
      Reported-by: NJames Y Knight <jyknight@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      44922150
  7. 04 8月, 2015 1 次提交
    • I
      sched, sparc32: Update scheduler comments in copy_thread() · bef033a3
      Ingo Molnar 提交于
      There's no finish_arch_switch() anymore in the latest scheduler tree.
      Also update some other details.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Cc: sparclinux@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      bef033a3
  8. 03 8月, 2015 3 次提交
    • P
      locking/static_keys: Add a new static_key interface · 11276d53
      Peter Zijlstra 提交于
      There are various problems and short-comings with the current
      static_key interface:
      
       - static_key_{true,false}() read like a branch depending on the key
         value, instead of the actual likely/unlikely branch depending on
         init value.
      
       - static_key_{true,false}() are, as stated above, tied to the
         static_key init values STATIC_KEY_INIT_{TRUE,FALSE}.
      
       - we're limited to the 2 (out of 4) possible options that compile to
         a default NOP because that's what our arch_static_branch() assembly
         emits.
      
      So provide a new static_key interface:
      
        DEFINE_STATIC_KEY_TRUE(name);
        DEFINE_STATIC_KEY_FALSE(name);
      
      Which define a key of different types with an initial true/false
      value.
      
      Then allow:
      
         static_branch_likely()
         static_branch_unlikely()
      
      to take a key of either type and emit the right instruction for the
      case.
      
      This means adding a second arch_static_branch_jump() assembly helper
      which emits a JMP per default.
      
      In order to determine the right instruction for the right state,
      encode the branch type in the LSB of jump_entry::key.
      
      This is the final step in removing the naming confusion that has led to
      a stream of avoidable bugs such as:
      
        a833581e ("x86, perf: Fix static_key bug in load_mm_cr4()")
      
      ... but it also allows new static key combinations that will give us
      performance enhancements in the subsequent patches.
      
      Tested-by: Rabin Vincent <rabin@rab.in> # arm
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> # ppc
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      11276d53
    • P
      jump_label: Rename JUMP_LABEL_{EN,DIS}ABLE to JUMP_LABEL_{JMP,NOP} · 76b235c6
      Peter Zijlstra 提交于
      Since we've already stepped away from ENABLE is a JMP and DISABLE is a
      NOP with the branch_default bits, and are going to make it even worse,
      rename it to make it all clearer.
      
      This way we don't mix multiple levels of logic attributes, but have a
      plain 'physical' name for what the current instruction patching status
      of a jump label is.
      
      This is a first step in removing the naming confusion that has led to
      a stream of avoidable bugs such as:
      
        a833581e ("x86, perf: Fix static_key bug in load_mm_cr4()")
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      [ Beefed up the changelog. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      76b235c6
    • A
      locking, arch: use WRITE_ONCE()/READ_ONCE() in smp_store_release()/smp_load_acquire() · 76695af2
      Andrey Konovalov 提交于
      Replace ACCESS_ONCE() macro in smp_store_release() and smp_load_acquire()
      with WRITE_ONCE() and READ_ONCE() on x86, arm, arm64, ia64, metag, mips,
      powerpc, s390, sparc and asm-generic since ACCESS_ONCE() does not work
      reliably on non-scalar types.
      
      WRITE_ONCE() and READ_ONCE() were introduced in the following commits:
      
        230fa253 ("kernel: Provide READ_ONCE and ASSIGN_ONCE")
        43239cbe ("kernel: Change ASSIGN_ONCE(val, x) to WRITE_ONCE(x, val)")
      Signed-off-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NDavidlohr Bueso <dbueso@suse.de>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Acked-by: NRalf Baechle <ralf@linux-mips.org>
      Cc: Alexander Duyck <alexander.h.duyck@redhat.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arch@vger.kernel.org
      Link: http://lkml.kernel.org/r/1438528264-714-1-git-send-email-andreyknvl@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      76695af2
  9. 01 8月, 2015 2 次提交
  10. 31 7月, 2015 2 次提交
  11. 27 7月, 2015 2 次提交
  12. 23 7月, 2015 1 次提交
  13. 21 7月, 2015 1 次提交
  14. 18 7月, 2015 1 次提交
  15. 26 6月, 2015 1 次提交
  16. 25 6月, 2015 11 次提交
    • D
      sparc64: perf: Use UREG_FP rather than UREG_I6 · 2d89cd86
      David Ahern 提交于
      perf walks userspace callchains by following frame pointers. Use the
      UREG_FP macro to make it clearer that the %fp is being used.
      Signed-off-by: NDavid Ahern <david.ahern@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2d89cd86
    • D
      sparc64: perf: Add sanity checking on addresses in user stack · b69fb769
      David Ahern 提交于
      Processes are getting killed (sigbus or segv) while walking userspace
      callchains when using perf. In some instances I have seen ufp = 0x7ff
      which does not seem like a proper stack address.
      
      This patch adds a function to run validity checks against the address
      before attempting the copy_from_user. The checks are copied from the
      x86 version as a start point with the addition of a 4-byte alignment
      check.
      Signed-off-by: NDavid Ahern <david.ahern@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b69fb769
    • D
      sparc64: Convert BUG_ON to warning · 2bf7c3ef
      David Ahern 提交于
      Pagefault handling has a BUG_ON path that panics the system. Convert it to
      a warning instead. There is no need to bring down the system for this kind
      of failure.
      
      The following was hit while running:
          perf sched record -g -- make -j 16
      
      [3609412.782801] kernel BUG at /opt/dahern/linux.git/arch/sparc/mm/fault_64.c:416!
      [3609412.782833]               \|/ ____ \|/
      [3609412.782833]               "@'/ .. \`@"
      [3609412.782833]               /_| \__/ |_\
      [3609412.782833]                  \__U_/
      [3609412.782870] cat(4516): Kernel bad sw trap 5 [#1]
      [3609412.782889] CPU: 0 PID: 4516 Comm: cat Tainted: G            E   4.1.0-rc8+ #6
      [3609412.782909] task: fff8000126e31f80 ti: fff8000110d90000 task.ti: fff8000110d90000
      [3609412.782931] TSTATE: 0000004411001603 TPC: 000000000096b164 TNPC: 000000000096b168 Y: 0000004e    Tainted: G            E
      [3609412.782964] TPC: <do_sparc64_fault+0x5e4/0x6a0>
      [3609412.782979] g0: 000000000096abe0 g1: 0000000000d314c4 g2: 0000000000000000 g3: 0000000000000001
      [3609412.783009] g4: fff8000126e31f80 g5: fff80001302d2000 g6: fff8000110d90000 g7: 00000000000000ff
      [3609412.783045] o0: 0000000000aff6a8 o1: 00000000000001a0 o2: 0000000000000001 o3: 0000000000000054
      [3609412.783080] o4: fff8000100026820 o5: 0000000000000001 sp: fff8000110d935f1 ret_pc: 000000000096b15c
      [3609412.783117] RPC: <do_sparc64_fault+0x5dc/0x6a0>
      [3609412.783137] l0: 000007feff996000 l1: 0000000000030001 l2: 0000000000000004 l3: fff8000127bd0120
      [3609412.783174] l4: 0000000000000054 l5: fff8000127bd0188 l6: 0000000000000000 l7: fff8000110d9dba8
      [3609412.783210] i0: fff8000110d93f60 i1: fff8000110ca5530 i2: 000000000000003f i3: 0000000000000054
      [3609412.783244] i4: fff800010000081a i5: fff8000100000398 i6: fff8000110d936a1 i7: 0000000000407c6c
      [3609412.783286] I7: <sparc64_realfault_common+0x10/0x20>
      [3609412.783308] Call Trace:
      [3609412.783329]  [0000000000407c6c] sparc64_realfault_common+0x10/0x20
      [3609412.783353] Disabling lock debugging due to kernel taint
      [3609412.783379] Caller[0000000000407c6c]: sparc64_realfault_common+0x10/0x20
      [3609412.783449] Caller[fff80001002283e4]: 0xfff80001002283e4
      [3609412.783471] Instruction DUMP: 921021a0  7feaff91  901222a8 <91d02005> 82086100  02f87f7b  808a2873  81cfe008  01000000
      [3609412.783542] Kernel panic - not syncing: Fatal exception
      [3609412.784605] Press Stop-A (L1-A) to return to the boot prom
      [3609412.784615] ---[ end Kernel panic - not syncing: Fatal exception
      
      With this patch rather than a panic I occasionally get something like this:
          perf sched record -g -m 1024  -- make -j N
      
      where N is based on number of cpus (128 to 1024 for a T7-4 and 8 for an 8 cpu
      VM on a T5-2).
      
      WARNING: CPU: 211 PID: 52565 at /opt/dahern/linux.git/arch/sparc/mm/fault_64.c:417 do_sparc64_fault+0x340/0x70c()
      address (7feffcd6000) != regs->tpc (fff80001004873c0)
      Modules linked in: ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 cdc_ether usbnet mii ixgbe mdio igb i2c_algo_bit i2c_core ptp crc32c_sparc64 camellia_sparc64 des_sparc64 des_generic md5_sparc64 sha512_sparc64 sha1_sparc64 uio_pdrv_genirq uio usb_storage mpt3sas scsi_transport_sas raid_class aes_sparc64 sunvnet sunvdc sha256_sparc64(E) sha256_generic(E)
      CPU: 211 PID: 52565 Comm: ld Tainted: G        W   E   4.1.0-rc8+ #19
      Call Trace:
       [000000000045ce30] warn_slowpath_common+0x7c/0xa0
       [000000000045ceec] warn_slowpath_fmt+0x30/0x40
       [000000000098ad64] do_sparc64_fault+0x340/0x70c
       [0000000000407c2c] sparc64_realfault_common+0x10/0x20
      ---[ end trace 62ee02065a01a049 ]---
      ld[52565]: segfault at fff80001004873c0 ip fff80001004873c0 (rpc fff8000100158868) sp 000007feffcd70e1 error 30002 in libc-2.12.so[fff8000100410000+184000]
      
      The segfault is horrible, but better than a system panic.
      
      An 8-cpu VM on a T5-2 also showed the above traces from time to time,
      so it is a general problem and not specific to the T7 or baremetal.
      Signed-off-by: NDavid Ahern <david.ahern@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2bf7c3ef
    • D
      sparc: perf: Disable pagefaults while walking userspace stacks · c17af4dd
      David Ahern 提交于
      Page faults generated walking userspace stacks can call schedule to switch
      out the task. When collecting callchains for scheduler tracepoints this
      causes a deadlock as the tracepoints can be hit with the runqueue lock held:
      
      [ 8138.159054] WARNING: CPU: 758 PID: 12488 at /opt/dahern/linux.git/arch/sparc/kernel/nmi.c:80 perfctr_irq+0x1f8/0x2b4()
      
      [ 8138.203152] Watchdog detected hard LOCKUP on cpu 758
      
      [ 8138.410969] CPU: 758 PID: 12488 Comm: perf Not tainted 4.0.0-rc6+ #6
      [ 8138.437146] Call Trace:
      [ 8138.447193]  [000000000045cdd4] warn_slowpath_common+0x7c/0xa0
      [ 8138.471238]  [000000000045ce90] warn_slowpath_fmt+0x30/0x40
      [ 8138.494189]  [0000000000983e38] perfctr_irq+0x1f8/0x2b4
      [ 8138.515716]  [00000000004209f4] tl0_irq15+0x14/0x20
      [ 8138.535791]  [00000000009839ec] _raw_spin_trylock_bh+0x68/0x108
      [ 8138.560180]  [0000000000980018] __schedule+0xcc/0x710
      [ 8138.580981]  [00000000009806dc] preempt_schedule_common+0x10/0x3c
      [ 8138.606082]  [000000000098077c] _cond_resched+0x34/0x44
      [ 8138.627603]  [0000000000565990] kmem_cache_alloc_node+0x24/0x1a0
      [ 8138.652345]  [0000000000450b60] tsb_grow+0xac/0x488
      [ 8138.672429]  [0000000000985040] do_sparc64_fault+0x4dc/0x6e4
      [ 8138.695736]  [0000000000407c2c] sparc64_realfault_common+0x10/0x20
      [ 8138.721202]  [00000000006f2e24] NG4copy_from_user+0xa4/0x3c0
      [ 8138.744510]  [000000000044f900] perf_callchain_user+0x5c/0x6c
      [ 8138.768182]  [0000000000517b5c] perf_callchain+0x16c/0x19c
      [ 8138.790774]  [0000000000515f84] perf_prepare_sample+0x68/0x218
      [ 8138.814801] ---[ end trace 42ca6294b1ff7573 ]---
      
      As with PowerPC (b59a1bfc, "powerpc/perf: Disable pagefaults during
      callchain stack read") disable pagefaults while walking userspace stacks.
      Signed-off-by: NDavid Ahern <david.ahern@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c17af4dd
    • T
      mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute · fc6daaf9
      Tony Luck 提交于
      Some high end Intel Xeon systems report uncorrectable memory errors as a
      recoverable machine check.  Linux has included code for some time to
      process these and just signal the affected processes (or even recover
      completely if the error was in a read only page that can be replaced by
      reading from disk).
      
      But we have no recovery path for errors encountered during kernel code
      execution.  Except for some very specific cases were are unlikely to ever
      be able to recover.
      
      Enter memory mirroring. Actually 3rd generation of memory mirroing.
      
      Gen1: All memory is mirrored
      	Pro: No s/w enabling - h/w just gets good data from other side of the
      	     mirror
      	Con: Halves effective memory capacity available to OS/applications
      
      Gen2: Partial memory mirror - just mirror memory begind some memory controllers
      	Pro: Keep more of the capacity
      	Con: Nightmare to enable. Have to choose between allocating from
      	     mirrored memory for safety vs. NUMA local memory for performance
      
      Gen3: Address range partial memory mirror - some mirror on each memory
            controller
      	Pro: Can tune the amount of mirror and keep NUMA performance
      	Con: I have to write memory management code to implement
      
      The current plan is just to use mirrored memory for kernel allocations.
      This has been broken into two phases:
      
      1) This patch series - find the mirrored memory, use it for boot time
         allocations
      
      2) Wade into mm/page_alloc.c and define a ZONE_MIRROR to pick up the
         unused mirrored memory from mm/memblock.c and only give it out to
         select kernel allocations (this is still being scoped because
         page_alloc.c is scary).
      
      This patch (of 3):
      
      Add extra "flags" to memblock to allow selection of memory based on
      attribute.  No functional changes
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Xiexiuqi <xiexiuqi@huawei.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fc6daaf9
    • A
      mm: clarify that the function operates on hugepage pte · 8809aa2d
      Aneesh Kumar K.V 提交于
      We have confusing functions to clear pmd, pmd_clear_* and pmd_clear.  Add
      _huge_ to pmdp_clear functions so that we are clear that they operate on
      hugepage pte.
      
      We don't bother about other functions like pmdp_set_wrprotect,
      pmdp_clear_flush_young, because they operate on PTE bits and hence
      indicate they are operating on hugepage ptes
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8809aa2d
    • Z
      mm/hugetlb: reduce arch dependent code about hugetlb_prefault_arch_hook · a67a31fa
      Zhang Zhen 提交于
      Currently we have many duplicates in definitions of
      hugetlb_prefault_arch_hook.  In all architectures this function is empty.
      Signed-off-by: NZhang Zhen <zhenzhang.zhang@huawei.com>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a67a31fa
    • L
      mm: new mm hook framework · 2ae416b1
      Laurent Dufour 提交于
      CRIU is recreating the process memory layout by remapping the checkpointee
      memory area on top of the current process (criu).  This includes remapping
      the vDSO to the place it has at checkpoint time.
      
      However some architectures like powerpc are keeping a reference to the
      vDSO base address to build the signal return stack frame by calling the
      vDSO sigreturn service.  So once the vDSO has been moved, this reference
      is no more valid and the signal frame built later are not usable.
      
      This patch serie is introducing a new mm hook framework, and a new
      arch_remap hook which is called when mremap is done and the mm lock still
      hold.  The next patch is adding the vDSO remap and unmap tracking to the
      powerpc architecture.
      
      This patch (of 3):
      
      This patch introduces a new set of header file to manage mm hooks:
      - per architecture empty header file (arch/x/include/asm/mm-arch-hooks.h)
      - a generic header (include/linux/mm-arch-hooks.h)
      
      The architecture which need to overwrite a hook as to redefine it in its
      header file, while architecture which doesn't need have nothing to do.
      
      The default hooks are defined in the generic header and are used in the
      case the architecture is not defining it.
      
      In a next step, mm hooks defined in include/asm-generic/mm_hooks.h should
      be moved here.
      Signed-off-by: NLaurent Dufour <ldufour@linux.vnet.ibm.com>
      Suggested-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2ae416b1
    • Z
      mm/hugetlb: reduce arch dependent code about huge_pmd_unshare · e81f2d22
      Zhang Zhen 提交于
      Currently we have many duplicates in definitions of huge_pmd_unshare.  In
      all architectures this function just returns 0 when
      CONFIG_ARCH_WANT_HUGE_PMD_SHARE is N.
      
      This patch puts the default implementation in mm/hugetlb.c and lets these
      architectures use the common code.
      Signed-off-by: NZhang Zhen <zhenzhang.zhang@huawei.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: James Yang <James.Yang@freescale.com>
      Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e81f2d22
    • A
      sparc: use for_each_sg() · 8c07a308
      Akinobu Mita 提交于
      This replaces the plain loop over the sglist array with for_each_sg()
      macro which consists of sg_next() function calls.  Since sparc does select
      ARCH_HAS_SG_CHAIN, it is necessary to use for_each_sg() in order to loop
      over each sg element.  This also help find problems with drivers that do
      not properly initialize their sg tables when CONFIG_DEBUG_SG is enabled.
      Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8c07a308
    • X
      sparc: time: Replace update_persistent_clock() with CONFIG_RTC_SYSTOHC · 460ea8d7
      Xunlei Pang 提交于
      On Sparc systems, update_persistent_clock() uses RTC drivers to do
      the job, it makes more sense to hand it over to CONFIG_RTC_SYSTOHC.
      
      In the long run, all the update_persistent_clock() should migrate to
      proper class RTC drivers if any and use CONFIG_RTC_SYSTOHC instead.
      Signed-off-by: NXunlei Pang <pang.xunlei@linaro.org>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NAlexandre Belloni <alexandre.belloni@free-electrons.com>
      460ea8d7
  17. 08 6月, 2015 1 次提交
  18. 07 6月, 2015 1 次提交
    • T
      arch/*/io.h: Add ioremap_wt() to all architectures · 556269c1
      Toshi Kani 提交于
      Add ioremap_wt() to all arch-specific asm/io.h headers which
      define ioremap_wc() locally. These headers do not include
      <asm-generic/iomap.h>. Some of them include <asm-generic/io.h>,
      but ioremap_wt() is defined for consistency since they define
      all ioremap_xxx locally.
      
      In all architectures without Write-Through support, ioremap_wt()
      is defined indentical to ioremap_nocache().
      
      frv and m68k already have ioremap_writethrough(). On those we
      add ioremap_wt() indetical to ioremap_writethrough() and defines
      ARCH_HAS_IOREMAP_WT in both architectures.
      
      The ioremap_wt() interface is exported to drivers.
      Signed-off-by: NToshi Kani <toshi.kani@hp.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Elliott@hp.com
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arnd@arndb.de
      Cc: hch@lst.de
      Cc: hmh@hmh.eng.br
      Cc: jgross@suse.com
      Cc: konrad.wilk@oracle.com
      Cc: linux-mm <linux-mm@kvack.org>
      Cc: linux-nvdimm@lists.01.org
      Cc: stefan.bader@canonical.com
      Cc: yigal@plexistor.com
      Link: http://lkml.kernel.org/r/1433436928-31903-9-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      556269c1
  19. 01 6月, 2015 2 次提交