1. 23 3月, 2016 2 次提交
  2. 16 3月, 2016 1 次提交
  3. 10 3月, 2016 3 次提交
    • A
      memremap: check pfn validity before passing to pfn_to_page() · ac343e88
      Ard Biesheuvel 提交于
      In memremap's helper function try_ram_remap(), we dereference a struct
      page pointer that was derived from a PFN that is known to be covered by
      a 'System RAM' iomem region, and is thus assumed to be a 'valid' PFN,
      i.e., a PFN that has a struct page associated with it and is covered by
      the kernel direct mapping.
      
      However, the assumption that there is a 1:1 relation between the System
      RAM iomem region and the kernel direct mapping is not universally valid
      on all architectures, and on ARM and arm64, 'System RAM' may include
      regions for which pfn_valid() returns false.
      
      Generally speaking, both __va() and pfn_to_page() should only ever be
      called on PFNs/physical addresses for which pfn_valid() returns true, so
      add that check to try_ram_remap().
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ac343e88
    • D
      mm: fix mixed zone detection in devm_memremap_pages · 5f29a77c
      Dan Williams 提交于
      The check for whether we overlap "System RAM" needs to be done at
      section granularity.  For example a system with the following mapping:
      
          100000000-37bffffff : System RAM
          37c000000-837ffffff : Persistent Memory
      
      ...is unable to use devm_memremap_pages() as it would result in two
      zones colliding within a given section.
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: NToshi Kani <toshi.kani@hpe.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5f29a77c
    • D
      list: kill list_force_poison() · d77a117e
      Dan Williams 提交于
      Given we have uninitialized list_heads being passed to list_add() it
      will always be the case that those uninitialized values randomly trigger
      the poison value.  Especially since a list_add() operation will seed the
      stack with the poison value for later stack allocations to trip over.
      
      For example, see these two false positive reports:
      
        list_add attempted on force-poisoned entry
        WARNING: at lib/list_debug.c:34
        [..]
        NIP [c00000000043c390] __list_add+0xb0/0x150
        LR [c00000000043c38c] __list_add+0xac/0x150
        Call Trace:
          __list_add+0xac/0x150 (unreliable)
          __down+0x4c/0xf8
          down+0x68/0x70
          xfs_buf_lock+0x4c/0x150 [xfs]
      
        list_add attempted on force-poisoned entry(0000000000000500),
         new->next == d0000000059ecdb0, new->prev == 0000000000000500
        WARNING: at lib/list_debug.c:33
        [..]
        NIP [c00000000042db78] __list_add+0xa8/0x140
        LR [c00000000042db74] __list_add+0xa4/0x140
        Call Trace:
          __list_add+0xa4/0x140 (unreliable)
          rwsem_down_read_failed+0x6c/0x1a0
          down_read+0x58/0x60
          xfs_log_commit_cil+0x7c/0x600 [xfs]
      
      Fixes: commit 5c2c2587 ("mm, dax, pmem: introduce {get|put}_dev_pagemap() for dax-gup")
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Reported-by: NEryu Guan <eguan@redhat.com>
      Tested-by: NEryu Guan <eguan@redhat.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d77a117e
  4. 24 2月, 2016 1 次提交
  5. 19 2月, 2016 1 次提交
  6. 12 2月, 2016 1 次提交
  7. 01 2月, 2016 1 次提交
  8. 30 1月, 2016 2 次提交
    • T
      memremap: Change region_intersects() to take @flags and @desc · 1c29f25b
      Toshi Kani 提交于
      Change region_intersects() to identify a target with @flags and
      @desc, instead of @name with strcmp().
      
      Change the callers of region_intersects(), memremap() and
      devm_memremap(), to set IORESOURCE_SYSTEM_RAM in @flags and
      IORES_DESC_NONE in @desc when searching System RAM.
      
      Also, export region_intersects() so that the ACPI EINJ error
      injection driver can call this function in a later patch.
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NDan Williams <dan.j.williams@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jakub Sitnicki <jsitnicki@gmail.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mm <linux-mm@kvack.org>
      Link: http://lkml.kernel.org/r/1453841853-11383-13-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1c29f25b
    • D
      devm_memremap_pages: fix vmem_altmap lifetime + alignment handling · eb7d78c9
      Dan Williams 提交于
      to_vmem_altmap() needs to return valid results until
      arch_remove_memory() completes.  It also needs to be valid for any pfn
      in a section regardless of whether that pfn maps to data.  This escape
      was a result of a bug in the unit test.
      
      The signature of this bug is that free_pagetable() fails to retrieve a
      vmem_altmap and goes off into the weeds:
      
       BUG: unable to handle kernel NULL pointer dereference at           (null)
       IP: [<ffffffff811d2629>] get_pfnblock_flags_mask+0x49/0x60
       [..]
       Call Trace:
        [<ffffffff811d3477>] free_hot_cold_page+0x97/0x1d0
        [<ffffffff811d367a>] __free_pages+0x2a/0x40
        [<ffffffff8191e669>] free_pagetable+0x8c/0xd4
        [<ffffffff8191ef4e>] remove_pagetable+0x37a/0x808
        [<ffffffff8191b210>] vmemmap_free+0x10/0x20
      
      Fixes: 4b94ffdc ("x86, mm: introduce vmem_altmap to augment vmemmap_populate()")
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Reported-by: NJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      eb7d78c9
  9. 16 1月, 2016 5 次提交
    • D
      mm, x86: get_user_pages() for dax mappings · 3565fce3
      Dan Williams 提交于
      A dax mapping establishes a pte with _PAGE_DEVMAP set when the driver
      has established a devm_memremap_pages() mapping, i.e.  when the pfn_t
      return from ->direct_access() has PFN_DEV and PFN_MAP set.  Later, when
      encountering _PAGE_DEVMAP during a page table walk we lookup and pin a
      struct dev_pagemap instance to keep the result of pfn_to_page() valid
      until put_page().
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Tested-by: NLogan Gunthorpe <logang@deltatee.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3565fce3
    • D
      mm, dax, pmem: introduce {get|put}_dev_pagemap() for dax-gup · 5c2c2587
      Dan Williams 提交于
      get_dev_page() enables paths like get_user_pages() to pin a dynamically
      mapped pfn-range (devm_memremap_pages()) while the resulting struct page
      objects are in use.  Unlike get_page() it may fail if the device is, or
      is in the process of being, disabled.  While the initial lookup of the
      range may be an expensive list walk, the result is cached to speed up
      subsequent lookups which are likely to be in the same mapped range.
      
      devm_memremap_pages() now requires a reference counter to be specified
      at init time.  For pmem this means moving request_queue allocation into
      pmem_alloc() so the existing queue usage counter can track "device
      pages".
      
      ZONE_DEVICE pages always have an elevated count and will never be on an
      lru reclaim list.  That space in 'struct page' can be redirected for
      other uses, but for safety introduce a poison value that will always
      trip __list_add() to assert.  This allows half of the struct list_head
      storage to be reclaimed with some assurance to back up the assumption
      that the page count never goes to zero and a list_add() is never
      attempted.
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Tested-by: NLogan Gunthorpe <logang@deltatee.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5c2c2587
    • D
      x86, mm: introduce vmem_altmap to augment vmemmap_populate() · 4b94ffdc
      Dan Williams 提交于
      In support of providing struct page for large persistent memory
      capacities, use struct vmem_altmap to change the default policy for
      allocating memory for the memmap array.  The default vmemmap_populate()
      allocates page table storage area from the page allocator.  Given
      persistent memory capacities relative to DRAM it may not be feasible to
      store the memmap in 'System Memory'.  Instead vmem_altmap represents
      pre-allocated "device pages" to satisfy vmemmap_alloc_block_buf()
      requests.
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Reported-by: Nkbuild test robot <lkp@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4b94ffdc
    • D
      mm: introduce find_dev_pagemap() · 9476df7d
      Dan Williams 提交于
      There are several scenarios where we need to retrieve and update
      metadata associated with a given devm_memremap_pages() mapping, and the
      only lookup key available is a pfn in the range:
      
      1/ We want to augment vmemmap_populate() (called via arch_add_memory())
         to allocate memmap storage from pre-allocated pages reserved by the
         device driver.  At vmemmap_alloc_block_buf() time it grabs device pages
         rather than page allocator pages.  This is in support of
         devm_memremap_pages() mappings where the memmap is too large to fit in
         main memory (i.e. large persistent memory devices).
      
      2/ Taking a reference against the mapping when inserting device pages
         into the address_space radix of a given inode.  This facilitates
         unmap_mapping_range() and truncate_inode_pages() operations when the
         driver is tearing down the mapping.
      
      3/ get_user_pages() operations on ZONE_DEVICE memory require taking a
         reference against the mapping so that the driver teardown path can
         revoke and drain usage of device pages.
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Tested-by: NLogan Gunthorpe <logang@deltatee.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9476df7d
    • D
      mm, dax, pmem: introduce pfn_t · 34c0fd54
      Dan Williams 提交于
      For the purpose of communicating the optional presence of a 'struct
      page' for the pfn returned from ->direct_access(), introduce a type that
      encapsulates a page-frame-number plus flags.  These flags contain the
      historical "page_link" encoding for a scatterlist entry, but can also
      denote "device memory".  Where "device memory" is a set of pfns that are
      not part of the kernel's linear mapping by default, but are accessed via
      the same memory controller as ram.
      
      The motivation for this new type is large capacity persistent memory
      that needs struct page entries in the 'memmap' to support 3rd party DMA
      (i.e.  O_DIRECT I/O with a persistent memory source/target).  However,
      we also need it in support of maintaining a list of mapped inodes which
      need to be unmapped at driver teardown or freeze_bdev() time.
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      34c0fd54
  10. 27 10月, 2015 1 次提交
    • D
      memremap: fix highmem support · 182475b7
      Dan Williams 提交于
      Currently memremap checks if the range is "System RAM" and returns the
      kernel linear address.  This is broken for highmem platforms where a
      range may be "System RAM", but is not part of the kernel linear mapping.
      Fallback to ioremap_cache() in these cases, to let the arch code attempt
      to handle it.
      
      Note that ARM ioremap will WARN when attempting to remap ram, and in
      that case the caller needs to be fixed.  For this reason, existing
      ioremap_cache() usages for ARM are already trained to avoid attempts to
      remap ram.
      
      The impact of this bug is low for now since the pmem driver is the only
      user of memremap(), but this is important to fix before more conversions
      to memremap arrive in 4.4.
      
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reported-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      182475b7
  11. 10 10月, 2015 4 次提交
  12. 28 8月, 2015 1 次提交
  13. 15 8月, 2015 2 次提交
    • C
      devres: add devm_memremap · 7d3dcf26
      Christoph Hellwig 提交于
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      7d3dcf26
    • D
      arch: introduce memremap() · 92281dee
      Dan Williams 提交于
      Existing users of ioremap_cache() are mapping memory that is known in
      advance to not have i/o side effects.  These users are forced to cast
      away the __iomem annotation, or otherwise neglect to fix the sparse
      errors thrown when dereferencing pointers to this memory.  Provide
      memremap() as a non __iomem annotated ioremap_*() in the case when
      ioremap is otherwise a pointer to cacheable memory. Empirically,
      ioremap_<cacheable-type>() call sites are seeking memory-like semantics
      (e.g.  speculative reads, and prefetching permitted).
      
      memremap() is a break from the ioremap implementation pattern of adding
      a new memremap_<type>() for each mapping type and having silent
      compatibility fall backs.  Instead, the implementation defines flags
      that are passed to the central memremap() and if a mapping type is not
      supported by an arch memremap returns NULL.
      
      We introduce a memremap prototype as a trivial wrapper of
      ioremap_cache() and ioremap_wt().  Later, once all ioremap_cache() and
      ioremap_wt() usage has been removed from drivers we teach archs to
      implement arch_memremap() with the ability to strictly enforce the
      mapping type.
      
      Cc: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      92281dee