1. 10 11月, 2015 3 次提交
  2. 04 11月, 2015 1 次提交
  3. 23 10月, 2015 1 次提交
  4. 16 10月, 2015 1 次提交
  5. 14 10月, 2015 1 次提交
  6. 01 10月, 2015 15 次提交
  7. 28 9月, 2015 1 次提交
  8. 25 9月, 2015 1 次提交
  9. 23 9月, 2015 1 次提交
    • A
      x86, efi, kasan: #undef memset/memcpy/memmove per arch · 769a8089
      Andrey Ryabinin 提交于
      In not-instrumented code KASAN replaces instrumented memset/memcpy/memmove
      with not-instrumented analogues __memset/__memcpy/__memove.
      
      However, on x86 the EFI stub is not linked with the kernel.  It uses
      not-instrumented mem*() functions from arch/x86/boot/compressed/string.c
      
      So we don't replace them with __mem*() variants in EFI stub.
      
      On ARM64 the EFI stub is linked with the kernel, so we should replace
      mem*() functions with __mem*(), because the EFI stub runs before KASAN
      sets up early shadow.
      
      So let's move these #undef mem* into arch's asm/efi.h which is also
      included by the EFI stub.
      
      Also, this will fix the warning in 32-bit build reported by kbuild test
      robot:
      
      	efi-stub-helper.c:599:2: warning: implicit declaration of function 'memcpy'
      
      [akpm@linux-foundation.org: use 80 cols in comment]
      Signed-off-by: NAndrey Ryabinin <ryabinin.a.a@gmail.com>
      Reported-by: NFengguang Wu <fengguang.wu@gmail.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Matt Fleming <matt.fleming@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      769a8089
  10. 21 9月, 2015 1 次提交
  11. 16 9月, 2015 1 次提交
    • P
      KVM: add halt_attempted_poll to VCPU stats · 62bea5bf
      Paolo Bonzini 提交于
      This new statistic can help diagnosing VCPUs that, for any reason,
      trigger bad behavior of halt_poll_ns autotuning.
      
      For example, say halt_poll_ns = 480000, and wakeups are spaced exactly
      like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes
      10+20+40+80+160+320+480 = 1110 microseconds out of every
      479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then
      is consuming about 30% more CPU than it would use without
      polling.  This would show as an abnormally high number of
      attempted polling compared to the successful polls.
      
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com<
      Reviewed-by: NDavid Matlack <dmatlack@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      62bea5bf
  12. 14 9月, 2015 1 次提交
  13. 11 9月, 2015 8 次提交
    • P
      locking/qspinlock/x86: Only emit the test-and-set fallback when building guest support · a6b27785
      Peter Zijlstra 提交于
      Only emit the test-and-set fallback for Hypervisors lacking
      PARAVIRT_SPINLOCKS support when building for guests.
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Cc: stable@vger.kernel.org # 4.2
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      a6b27785
    • P
      locking/qspinlock/x86: Fix performance regression under unaccelerated VMs · 43b3f028
      Peter Zijlstra 提交于
      Dave ran into horrible performance on a VM without PARAVIRT_SPINLOCKS
      set and Linus noted that the test-and-set implementation was retarded.
      
      One should spin on the variable with a load, not a RMW.
      
      While there, remove 'queued' from the name, as the lock isn't queued
      at all, but a simple test-and-set.
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Reported-by: NDave Chinner <david@fromorbit.com>
      Tested-by: NDave Chinner <david@fromorbit.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Waiman Long <Waiman.Long@hp.com>
      Cc: stable@vger.kernel.org # v4.2+
      Link: http://lkml.kernel.org/r/20150904152523.GR18673@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      43b3f028
    • C
      dma-mapping: consolidate dma_set_mask · 452e06af
      Christoph Hellwig 提交于
      Almost everyone implements dma_set_mask the same way, although some time
      that's hidden in ->set_dma_mask methods.
      
      This patch consolidates those into a common implementation that either
      calls ->set_dma_mask if present or otherwise uses the default
      implementation.  Some architectures used to only call ->set_dma_mask
      after the initial checks, and those instance have been fixed to do the
      full work.  h8300 implemented dma_set_mask bogusly as a no-ops and has
      been fixed.
      
      Unfortunately some architectures overload unrelated semantics like changing
      the dma_ops into it so we still need to allow for an architecture override
      for now.
      
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      452e06af
    • C
      dma-mapping: consolidate dma_supported · ee196371
      Christoph Hellwig 提交于
      Most architectures just call into ->dma_supported, but some also return 1
      if the method is not present, or 0 if no dma ops are present (although
      that should never happeb). Consolidate this more broad version into
      common code.
      
      Also fix h8300 which inorrectly always returned 0, which would have been
      a problem if it's dma_set_mask implementation wasn't a similarly buggy
      noop.
      
      As a few architectures have much more elaborate implementations, we
      still allow for arch overrides.
      
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ee196371
    • C
      dma-mapping: cosolidate dma_mapping_error · efa21e43
      Christoph Hellwig 提交于
      Currently there are three valid implementations of dma_mapping_error:
      
       (1) call ->mapping_error
       (2) check for a hardcoded error code
       (3) always return 0
      
      This patch provides a common implementation that calls ->mapping_error
      if present, then checks for DMA_ERROR_CODE if defined or otherwise
      returns 0.
      
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      efa21e43
    • C
      dma-mapping: consolidate dma_{alloc,free}_noncoherent · 1e893752
      Christoph Hellwig 提交于
      Most architectures do not support non-coherent allocations and either
      define dma_{alloc,free}_noncoherent to their coherent versions or stub
      them out.
      
      Openrisc uses dma_{alloc,free}_attrs to implement them, and only Mips
      implements them directly.
      
      This patch moves the Openrisc version to common code, and handles the
      DMA_ATTR_NON_CONSISTENT case in the mips dma_map_ops instance.
      
      Note that actual non-coherent allocations require a dma_cache_sync
      implementation, so if non-coherent allocations didn't work on
      an architecture before this patch they still won't work after it.
      
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1e893752
    • C
      dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent} · 6894258e
      Christoph Hellwig 提交于
      Since 2009 we have a nice asm-generic header implementing lots of DMA API
      functions for architectures using struct dma_map_ops, but unfortunately
      it's still missing a lot of APIs that all architectures still have to
      duplicate.
      
      This series consolidates the remaining functions, although we still need
      arch opt outs for two of them as a few architectures have very
      non-standard implementations.
      
      This patch (of 5):
      
      The coherent DMA allocator works the same over all architectures supporting
      dma_map operations.
      
      This patch consolidates them and converges the minor differences:
      
       - the debug_dma helpers are now called from all architectures, including
         those that were previously missing them
       - dma_alloc_from_coherent and dma_release_from_coherent are now always
         called from the generic alloc/free routines instead of the ops
         dma-mapping-common.h always includes dma-coherent.h to get the defintions
         for them, or the stubs if the architecture doesn't support this feature
       - checks for ->alloc / ->free presence are removed.  There is only one
         magic instead of dma_map_ops without them (mic_dma_ops) and that one
         is x86 only anyway.
      
      Besides that only x86 needs special treatment to replace a default devices
      if none is passed and tweak the gfp_flags.  An optional arch hook is provided
      for that.
      
      [linux@roeck-us.net: fix build]
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6894258e
    • D
      kexec: split kexec_load syscall from kexec core code · 2965faa5
      Dave Young 提交于
      There are two kexec load syscalls, kexec_load another and kexec_file_load.
       kexec_file_load has been splited as kernel/kexec_file.c.  In this patch I
      split kexec_load syscall code to kernel/kexec.c.
      
      And add a new kconfig option KEXEC_CORE, so we can disable kexec_load and
      use kexec_file_load only, or vice verse.
      
      The original requirement is from Ted Ts'o, he want kexec kernel signature
      being checked with CONFIG_KEXEC_VERIFY_SIG enabled.  But kexec-tools use
      kexec_load syscall can bypass the checking.
      
      Vivek Goyal proposed to create a common kconfig option so user can compile
      in only one syscall for loading kexec kernel.  KEXEC/KEXEC_FILE selects
      KEXEC_CORE so that old config files still work.
      
      Because there's general code need CONFIG_KEXEC_CORE, so I updated all the
      architecture Kconfig with a new option KEXEC_CORE, and let KEXEC selects
      KEXEC_CORE in arch Kconfig.  Also updated general kernel code with to
      kexec_load syscall.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NDave Young <dyoung@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Petr Tesarik <ptesarik@suse.cz>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2965faa5
  14. 09 9月, 2015 2 次提交
    • J
      xen: Use correctly the Xen memory terminologies · 0df4f266
      Julien Grall 提交于
      Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN
      is meant, I suspect this is because the first support for Xen was for
      PV. This resulted in some misimplementation of helpers on ARM and
      confused developers about the expected behavior.
      
      For instance, with pfn_to_mfn, we expect to get an MFN based on the name.
      Although, if we look at the implementation on x86, it's returning a GFN.
      
      For clarity and avoid new confusion, replace any reference to mfn with
      gfn in any helpers used by PV drivers. The x86 code will still keep some
      reference of pfn_to_mfn which may be used by all kind of guests
      No changes as been made in the hypercall field, even
      though they may be invalid, in order to keep the same as the defintion
      in xen repo.
      
      Note that page_to_mfn has been renamed to xen_page_to_gfn to avoid a
      name to close to the KVM function gfn_to_page.
      
      Take also the opportunity to simplify simple construction such
      as pfn_to_mfn(page_to_pfn(page)) into xen_page_to_gfn. More complex clean up
      will come in follow-up patches.
      
      [1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cbSigned-off-by: NJulien Grall <julien.grall@citrix.com>
      Reviewed-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
      Acked-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
      Acked-by: NWei Liu <wei.liu2@citrix.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      0df4f266
    • J
      xen: Make clear that swiotlb and biomerge are dealing with DMA address · 32e09870
      Julien Grall 提交于
      The swiotlb is required when programming a DMA address on ARM when a
      device is not protected by an IOMMU.
      
      In this case, the DMA address should always be equal to the machine address.
      For DOM0 memory, Xen ensure it by have an identity mapping between the
      guest address and host address. However, when mapping a foreign grant
      reference, the 1:1 model doesn't work.
      
      For ARM guest, most of the callers of pfn_to_mfn expects to get a GFN
      (Guest Frame Number), i.e a PFN (Page Frame Number) from the Linux point
      of view given that all ARM guest are auto-translated.
      
      Even though the name pfn_to_mfn is misleading, we need to ensure that
      those caller get a GFN and not by mistake a MFN. In pratical, I haven't
      seen error related to this but we should fix it for the sake of
      correctness.
      
      In order to fix the implementation of pfn_to_mfn on ARM in a follow-up
      patch, we have to introduce new helpers to return the DMA from a PFN and
      the invert.
      
      On x86, the new helpers will be an alias of pfn_to_mfn and mfn_to_pfn.
      
      The helpers will be used in swiotlb and xen_biovec_phys_mergeable.
      
      This is necessary in the latter because we have to ensure that the
      biovec code will not try to merge a biovec using foreign page and
      another using Linux memory.
      
      Lastly, the helper mfn_to_local_pfn has been renamed to bfn_to_local_pfn
      given that the only usage was in swiotlb.
      Signed-off-by: NJulien Grall <julien.grall@citrix.com>
      Reviewed-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      32e09870
  15. 05 9月, 2015 1 次提交
    • M
      mm: send one IPI per CPU to TLB flush all entries after unmapping pages · 72b252ae
      Mel Gorman 提交于
      An IPI is sent to flush remote TLBs when a page is unmapped that was
      potentially accesssed by other CPUs.  There are many circumstances where
      this happens but the obvious one is kswapd reclaiming pages belonging to a
      running process as kswapd and the task are likely running on separate
      CPUs.
      
      On small machines, this is not a significant problem but as machine gets
      larger with more cores and more memory, the cost of these IPIs can be
      high.  This patch uses a simple structure that tracks CPUs that
      potentially have TLB entries for pages being unmapped.  When the unmapping
      is complete, the full TLB is flushed on the assumption that a refill cost
      is lower than flushing individual entries.
      
      Architectures wishing to do this must give the following guarantee.
      
              If a clean page is unmapped and not immediately flushed, the
              architecture must guarantee that a write to that linear address
              from a CPU with a cached TLB entry will trap a page fault.
      
      This is essentially what the kernel already depends on but the window is
      much larger with this patch applied and is worth highlighting.  The
      architecture should consider whether the cost of the full TLB flush is
      higher than sending an IPI to flush each individual entry.  An additional
      architecture helper called flush_tlb_local is required.  It's a trivial
      wrapper with some accounting in the x86 case.
      
      The impact of this patch depends on the workload as measuring any benefit
      requires both mapped pages co-located on the LRU and memory pressure.  The
      case with the biggest impact is multiple processes reading mapped pages
      taken from the vm-scalability test suite.  The test case uses NR_CPU
      readers of mapped files that consume 10*RAM.
      
      Linear mapped reader on a 4-node machine with 64G RAM and 48 CPUs
      
                                                 4.2.0-rc1          4.2.0-rc1
                                                   vanilla       flushfull-v7
      Ops lru-file-mmap-read-elapsed      159.62 (  0.00%)   120.68 ( 24.40%)
      Ops lru-file-mmap-read-time_range    30.59 (  0.00%)     2.80 ( 90.85%)
      Ops lru-file-mmap-read-time_stddv     6.70 (  0.00%)     0.64 ( 90.38%)
      
                 4.2.0-rc1    4.2.0-rc1
                   vanilla flushfull-v7
      User          581.00       611.43
      System       5804.93      4111.76
      Elapsed       161.03       122.12
      
      This is showing that the readers completed 24.40% faster with 29% less
      system CPU time.  From vmstats, it is known that the vanilla kernel was
      interrupted roughly 900K times per second during the steady phase of the
      test and the patched kernel was interrupts 180K times per second.
      
      The impact is lower on a single socket machine.
      
                                                 4.2.0-rc1          4.2.0-rc1
                                                   vanilla       flushfull-v7
      Ops lru-file-mmap-read-elapsed       25.33 (  0.00%)    20.38 ( 19.54%)
      Ops lru-file-mmap-read-time_range     0.91 (  0.00%)     1.44 (-58.24%)
      Ops lru-file-mmap-read-time_stddv     0.28 (  0.00%)     0.47 (-65.34%)
      
                 4.2.0-rc1    4.2.0-rc1
                   vanilla flushfull-v7
      User           58.09        57.64
      System        111.82        76.56
      Elapsed        27.29        22.55
      
      It's still a noticeable improvement with vmstat showing interrupts went
      from roughly 500K per second to 45K per second.
      
      The patch will have no impact on workloads with no memory pressure or have
      relatively few mapped pages.  It will have an unpredictable impact on the
      workload running on the CPU being flushed as it'll depend on how many TLB
      entries need to be refilled and how long that takes.  Worst case, the TLB
      will be completely cleared of active entries when the target PFNs were not
      resident at all.
      
      [sasha.levin@oracle.com: trace tlb flush after disabling preemption in try_to_unmap_flush]
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Reviewed-by: NRik van Riel <riel@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      72b252ae
  16. 28 8月, 2015 1 次提交
    • D
      x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB · 96601adb
      Dan Williams 提交于
      Given that a write-back (WB) mapping plus non-temporal stores is
      expected to be the most efficient way to access PMEM, update the
      definition of ARCH_HAS_PMEM_API to imply arch support for
      WB-mapped-PMEM.  This is needed as a pre-requisite for adding PMEM to
      the direct map and mapping it with struct page.
      
      The above clarification for X86_64 means that memcpy_to_pmem() is
      permitted to use the non-temporal arch_memcpy_to_pmem() rather than
      needlessly fall back to default_memcpy_to_pmem() when the pcommit
      instruction is not available.  When arch_memcpy_to_pmem() is not
      guaranteed to flush writes out of cache, i.e. on older X86_32
      implementations where non-temporal stores may just dirty cache,
      ARCH_HAS_PMEM_API is simply disabled.
      
      The default fall back for persistent memory handling remains.  Namely,
      map it with the WT (write-through) cache-type and hope for the best.
      
      arch_has_pmem_api() is updated to only indicate whether the arch
      provides the proper helpers to meet the minimum "writes are visible
      outside the cache hierarchy after memcpy_to_pmem() + wmb_pmem()".  Code
      that cares whether wmb_pmem() actually flushes writes to pmem must now
      call arch_has_wmb_pmem() directly.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      [hch: set ARCH_HAS_PMEM_API=n on x86_32]
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      [toshi: x86_32 compile fixes]
      Signed-off-by: NToshi Kani <toshi.kani@hp.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      96601adb