1. 20 Apr, 2017 1 commit
  2. 13 Jan, 2017 1 commit
  3. 17 Dec, 2016 1 commit
    • libnvdimm: fix mishandled nvdimm_clear_poison() return value · 868f036f
      Dan Williams committed
      Colin, via static analysis, reports that the length could be negative
      from nvdimm_clear_poison() in the error case. There was a similar
      problem with commit 0a3f27b9 "libnvdimm, namespace: avoid multiple
      sector calculations" that I noticed when merging the for-4.10/libnvdimm
      topic branch into libnvdimm-for-next, but I missed this one. Fix both of
      them to follow this procedure:
      
      * if we clear a block's worth of media, clear that many blocks in
        badblocks
      
      * if we clear less than the requested size of the transfer, return an
        error
      
      * always invalidate cache after any non-error / non-zero
        nvdimm_clear_poison result
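
      The three rules above can be sketched as a small userspace model. This
      is an illustration under stated assumptions, not the driver code:
      handle_clear_result(), struct clear_outcome and the 512-byte
      SECTOR_SIZE are hypothetical names and values chosen for the example.

```c
#include <errno.h>
#include <stdbool.h>

#define SECTOR_SIZE 512 /* assumed badblocks granularity */

/* Hypothetical summary of what the caller should do with the result. */
struct clear_outcome {
	long blocks_cleared;   /* whole sectors to clear in badblocks */
	bool invalidate_cache; /* whether the CPU cache must be invalidated */
	int rc;                /* 0 on success, -EIO on a short or failed clear */
};

/*
 * 'cleared' models the return value of nvdimm_clear_poison(): the number
 * of bytes cleared, or a negative errno. 'len' is the requested size.
 */
static struct clear_outcome handle_clear_result(long cleared, unsigned int len)
{
	struct clear_outcome out = { 0, false, 0 };

	/* Compare as signed so a negative errno is not read as a huge length. */
	if (cleared < (long)len)
		out.rc = -EIO;

	if (cleared > 0) {
		/* Only whole sectors are removed from the bad block list. */
		out.blocks_cleared = cleared / SECTOR_SIZE;
		/* Any non-error, non-zero result changed the media. */
		out.invalidate_cache = true;
	}

	return out;
}
```

      For example, handle_clear_result(512, 4096) reports one cleared block,
      asks for a cache invalidate, and still returns -EIO because the clear
      was short.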
      
      Fixes: 82bf1037 ("libnvdimm: check and clear poison before writing to pmem")
      Fixes: 0a3f27b9 ("libnvdimm, namespace: avoid multiple sector calculations")
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Reported-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  4. 05 Dec, 2016 1 commit
  5. 29 Nov, 2016 1 commit
    • libnvdimm: use consistent naming for request_mem_region() · 450c6633
      Dan Williams committed
      Here is an example /proc/iomem listing for a system with 2 namespaces,
      one in "sector" mode and one in "memory" mode:
      
        1fc000000-2fbffffff : Persistent Memory (legacy)
          1fc000000-2fbffffff : namespace1.0
        340000000-34fffffff : Persistent Memory
          340000000-34fffffff : btt0.1
      
      Here is the corresponding ndctl listing:
      
        # ndctl list
        [
          {
            "dev":"namespace1.0",
            "mode":"memory",
            "size":4294967296,
            "blockdev":"pmem1"
          },
          {
            "dev":"namespace0.0",
            "mode":"sector",
            "size":267091968,
            "uuid":"f7594f86-badb-4592-875f-ded577da2eaf",
            "sector_size":4096,
            "blockdev":"pmem0s"
          }
        ]
      
      Notice that the ndctl listing is purely in terms of namespace devices,
      while the iomem listing leaks the internal "btt0.1" implementation
      detail. Given that ndctl requires the namespace device name to change
      the mode, for example:
      
        # ndctl create-namespace --reconfig=namespace0.0 --mode=raw --force
      
      ...use the namespace name in the iomem listing to keep the claiming
      device name consistent across different mode settings.
      
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  6. 20 Oct, 2016 1 commit
    • pmem: report error on clear poison failure · 3115bb02
      Toshi Kani committed
      The ACPI Clear Uncorrectable Error DSM function may fail or may be
      unsupported on a platform.  pmem_clear_poison() returns without clearing
      badblocks in such cases.  This failure is detected at the next read
      (-EIO).
      
      This behavior can lead to an issue when a user keeps writing but does not
      read immediately.  For instance, a flight recorder file may only be read
      when it is necessary for troubleshooting.
      
      Change pmem_do_bvec() and pmem_clear_poison() to return -EIO so that
      the filesystem can log an error message on a write error.
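
      A minimal userspace model of the behavior described above, assuming a
      single block with a "bad" flag. The fake_* names and struct fake_block
      are hypothetical and exist only for this sketch; they are not the
      driver's functions.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state of one block. */
struct fake_block {
	bool bad;             /* block is listed in badblocks */
	bool clear_supported; /* platform is able to clear the error */
};

/* Returns 0 if the poison was cleared, -EIO if it could not be. */
static int fake_clear_poison(struct fake_block *blk)
{
	if (!blk->clear_supported)
		return -EIO; /* the badblocks entry stays in place */
	blk->bad = false;
	return 0;
}

/* Models a write: a failed clear is reported now, not at the next read. */
static int fake_write(struct fake_block *blk)
{
	int rc = 0;

	if (blk->bad)
		rc = fake_clear_poison(blk);
	return rc;
}
```

      With this shape, a writer that never reads back still sees the failure
      as the return value of the write.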
      
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  7. 01 Oct, 2016 1 commit
  8. 08 Aug, 2016 2 commits
    • block: rename bio bi_rw to bi_opf · 1eff9d32
      Jens Axboe committed
      Since commit 63a4cc24, bio->bi_rw contains flags in the lower
      portion and the op code in the higher portions. This means that
      old code that relies on manually setting bi_rw is most likely
      going to be broken. Instead of letting that brokenness linger,
      rename the member, to force old and out-of-tree code to break
      at compile time instead of at runtime.
      
      No intended functional changes in this commit.
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • block/mm: make bdev_ops->rw_page() take a bool for read/write · c11f0c0b
      Jens Axboe committed
      Commit abf54548 changed it from an 'rw' flags type to the
      newer ops based interface, but now we're effectively leaking
      some bdev internals to the rest of the kernel. Since we only
      care about whether it's a read or a write at that level, just
      pass in a bool 'is_write' parameter instead.
      
      Then we can also move op_is_write() and friends back under
      CONFIG_BLOCK protection.
      Reviewed-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  9. 05 Aug, 2016 1 commit
  10. 24 Jul, 2016 1 commit
  11. 21 Jul, 2016 1 commit
  12. 13 Jul, 2016 2 commits
  13. 12 Jul, 2016 2 commits
    • libnvdimm, pmem: use REQ_FUA, REQ_FLUSH for nvdimm_flush() · 7e267a8c
      Dan Williams committed
      Given that nvdimm_flush() has higher overhead than wmb_pmem() (pointer
      chasing through nd_region), and that we otherwise assume a platform has
      ADR capability when flush hints are not present, move nvdimm_flush() to
      REQ_FLUSH context.
      
      Note that we still arrange for nvdimm_flush() to be called even in the
      ADR case. We need at least one wmb() fence to push buffered writes in
      the cpu out to the ADR protected domain.
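
      One way to picture "REQ_FLUSH context" is the sketch below, which only
      runs the expensive flush when a request carries a flush or FUA flag.
      The flag values, struct fake_region and the fake_* functions are
      assumptions made for this example and do not match the kernel's
      definitions.

```c
/* Hypothetical flag bits; the kernel's actual values differ. */
#define FAKE_REQ_FLUSH (1u << 0)
#define FAKE_REQ_FUA   (1u << 1)

struct fake_region {
	int flush_calls; /* how many times the expensive flush ran */
};

/* Stand-in for nvdimm_flush(); here it only counts invocations. */
static void fake_nvdimm_flush(struct fake_region *r)
{
	r->flush_calls++;
}

static void fake_make_request(struct fake_region *r, unsigned int flags)
{
	/* REQ_FLUSH: make earlier writes durable before this request. */
	if (flags & FAKE_REQ_FLUSH)
		fake_nvdimm_flush(r);

	/* ... the request's data would be copied to pmem here ... */

	/* REQ_FUA: make this request's own data durable before completing. */
	if (flags & FAKE_REQ_FUA)
		fake_nvdimm_flush(r);
}
```

      An ordinary write with neither flag set never pays for the flush in
      this model.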
      
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm: introduce nvdimm_flush() and nvdimm_has_flush() · f284a4f2
      Dan Williams committed
      nvdimm_flush() is a replacement for the x86 'pcommit' instruction.  It is
      an optional write flushing mechanism that an nvdimm bus can provide for
      the pmem driver to consume.  In the case of the NFIT nvdimm-bus-provider
      nvdimm_flush() is implemented as a series of flush-hint-address [1]
      writes to each dimm in the interleave set (region) that backs the
      namespace.
      
      The nvdimm_has_flush() routine relies on platform firmware to describe
      the flushing capabilities of a platform.  It uses the heuristic of
      whether an nvdimm bus provider provides flush address data to return a
      ternary result:
      
            1: flush addresses defined
            0: dimm topology described without flush addresses (assume ADR)
       -errno: no topology information, unable to determine flush mechanism
      
      The pmem driver is expected to take the following actions on this ternary
      result:
      
            1: nvdimm_flush() in response to REQ_FUA / REQ_FLUSH and shutdown
            0: do not set WC or FUA on the queue, take no further action
       -errno: warn and then operate as if nvdimm_has_flush() returned '0'
      
      The caveat of this heuristic is that it can not distinguish the "dimm
      does not have flush address" case from the "platform firmware is broken
      and failed to describe a flush address".  Given we are already
      explicitly trusting the NFIT there's not much more we can do beyond
      blacklisting broken firmwares if they are ever encountered.
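
      The ternary result and the expected driver reaction can be sketched as
      follows. This follows the description in this message rather than the
      driver source; struct fake_queue, fake_setup_queue() and the warning
      text are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the two queue properties discussed above. */
struct fake_queue {
	bool write_cache; /* WC advertised on the queue */
	bool fua;         /* FUA advertised on the queue */
};

/* 'has_flush' models nvdimm_has_flush(): 1, 0, or a negative errno. */
static void fake_setup_queue(struct fake_queue *q, int has_flush)
{
	if (has_flush < 0) {
		/* -errno: warn, then behave as if the result were 0. */
		fprintf(stderr, "no topology information, assuming ADR\n");
		has_flush = 0;
	}

	/* 1: flush requests reach the driver; 0: nothing to advertise. */
	q->write_cache = (has_flush == 1);
	q->fua = (has_flush == 1);
}
```

      Note that the model cannot tell "no flush address needed" apart from
      "firmware forgot to describe one", which is the caveat stated above.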
      
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  14. 28 Jun, 2016 1 commit
    • block: convert to device_add_disk() · 0d52c756
      Dan Williams committed
      For block drivers that specify a parent device, convert them to use
      device_add_disk().
      
      This conversion was done with the following semantic patch:
      
          @@
          struct gendisk *disk;
          expression E;
          @@
      
          - disk->driverfs_dev = E;
          ...
          - add_disk(disk);
          + device_add_disk(E, disk);
      
          @@
          struct gendisk *disk;
          expression E1, E2;
          @@
      
          - disk->driverfs_dev = E1;
          ...
          E2 = disk;
          ...
          - add_disk(E2);
          + device_add_disk(E1, E2);
      
      ...plus some manual fixups for a few missed conversions.
      
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  15. 25 Jun, 2016 1 commit
    • libnvdimm, pmem: allow nfit_test to override pmem_direct_access() · f295e53b
      Dan Williams committed
      Currently phys_to_pfn_t() is an exported symbol to allow nfit_test to
      override it and indicate that nfit_test-pmem is not device-mapped.  Now,
      we want to enable nfit_test to operate without DMA_CMA and the pmem it
      provides will no longer be physically contiguous, i.e. won't be capable
      of supporting direct_access requests larger than a page.  Make
      pmem_direct_access() a weak symbol so that it can be replaced by the
      tools/testing/nvdimm/ version, and move phys_to_pfn_t() to a static
      inline now that it no longer needs to be overridden.
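
      For readers unfamiliar with weak symbols, here is a minimal userspace
      sketch of the mechanism, assuming GCC or Clang. fake_direct_access()
      and its simplified signature are hypothetical and much smaller than the
      real pmem_direct_access().

```c
/*
 * Default implementation. The 'weak' attribute lets a non-weak definition
 * of the same symbol in another object file replace this one at link
 * time; the kernel spells this attribute __weak.
 */
__attribute__((weak)) long fake_direct_access(long offset, long size)
{
	(void)offset;
	/* The default assumes physically contiguous memory: grant it all. */
	return size;
}

/* Callers do not know which definition the linker picked. */
static long fake_max_mapping(long offset, long size)
{
	return fake_direct_access(offset, size);
}
```

      A test build could link in its own non-weak fake_direct_access() that
      caps the result at one page, without touching the callers.
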
      Acked-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  16. 16 Jun, 2016 1 commit
  17. 21 May, 2016 1 commit
  18. 19 May, 2016 1 commit
  19. 07 May, 2016 1 commit
  20. 01 May, 2016 1 commit
    • libnvdimm, pfn: fix memmap reservation sizing · 658922e5
      Dan Williams committed
      When configuring a pfn-device instance to allocate the memmap array it
      needs to account for the fact that vmemmap_populate_hugepages()
      allocates struct page blocks in HPAGE_SIZE chunks.  We need to align the
      reserved area size to 2MB otherwise arch_add_memory() runs out of memory
      while establishing the memmap:
      
       WARNING: CPU: 0 PID: 496 at arch/x86/mm/init_64.c:704 arch_add_memory+0xe7/0xf0
       [..]
       Call Trace:
        [<ffffffff8148bdb3>] dump_stack+0x85/0xc2
        [<ffffffff810a749b>] __warn+0xcb/0xf0
        [<ffffffff810a75cd>] warn_slowpath_null+0x1d/0x20
        [<ffffffff8106a497>] arch_add_memory+0xe7/0xf0
        [<ffffffff811d2097>] devm_memremap_pages+0x287/0x450
        [<ffffffff811d1ffa>] ? devm_memremap_pages+0x1ea/0x450
        [<ffffffffa0000298>] __wrap_devm_memremap_pages+0x58/0x70 [nfit_test_iomap]
        [<ffffffffa0047a58>] pmem_attach_disk+0x318/0x420 [nd_pmem]
        [<ffffffffa0047bcf>] nd_pmem_probe+0x6f/0x90 [nd_pmem]
        [<ffffffffa0009469>] nvdimm_bus_probe+0x69/0x110 [libnvdimm]
       [..]
        ndbus0: nd_pmem.probe(pfn3.0) = -12
       nd_pmem: probe of pfn3.0 failed with error -12
      libndctl: ndctl_pfn_enable: pfn3.0: failed to enable
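
      The sizing rule can be illustrated with a little arithmetic. This is a
      sketch under assumptions: 4K base pages and a 64-byte struct page are
      assumed values, and memmap_reservation() is a hypothetical helper that
      ignores any other metadata stored in the reserved area.

```c
#include <stdint.h>

#define FAKE_PAGE_SIZE   4096ULL      /* assumed base page size */
#define FAKE_STRUCT_PAGE 64ULL        /* assumed sizeof(struct page) */
#define FAKE_HPAGE_SIZE  (2ULL << 20) /* 2MB, per the commit message */

/* Round x up to a multiple of a; 'a' must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Bytes to reserve for the memmap that describes 'bytes' of pmem. */
static uint64_t memmap_reservation(uint64_t bytes)
{
	uint64_t npfns = bytes / FAKE_PAGE_SIZE;

	return align_up(npfns * FAKE_STRUCT_PAGE, FAKE_HPAGE_SIZE);
}
```

      Under these assumptions a 100MB range needs 25600 page structures,
      about 1.56MB, and the reservation is rounded up to a full 2MB so that a
      huge-page-sized allocation fits.
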
      Reported-by: Namratha Kothapalli <namratha.n.kothapalli@intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  21. 23 Apr, 2016 9 commits
  22. 16 Apr, 2016 1 commit
  23. 08 Apr, 2016 1 commit
    • libnvdimm, pfn: fix nvdimm_namespace_add_poison() vs section alignment · a3901802
      Dan Williams committed
      When section alignment padding is in effect we need to shift / truncate
      the range that is queried for poison by the 'start_pad' or 'end_trunc'
      reservations.
      
      It's easiest if we just pass in an adjusted resource range rather than
      deriving it from the passed in namespace.  With the resource range
      resolution pushed out to the caller we can also push the
      namespace-to-region lookup to the caller and drop the implicit pmem-type
      assumption about the passed in namespace object.
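
      The range adjustment amounts to shrinking the queried range at both
      ends. A minimal sketch, assuming an inclusive start/end range in the
      style of struct resource; struct fake_range and poison_query_range()
      are hypothetical names.

```c
#include <stdint.h>

/* Inclusive physical address range, in the style of struct resource. */
struct fake_range {
	uint64_t start;
	uint64_t end;
};

/* Shrink a namespace range by the padding reserved at each end. */
static struct fake_range poison_query_range(struct fake_range ns,
					    uint32_t start_pad,
					    uint32_t end_trunc)
{
	struct fake_range r = ns;

	r.start += start_pad; /* skip the padding at the front */
	r.end -= end_trunc;   /* drop the truncated tail */
	return r;
}
```

      The caller computes the adjusted range once and passes it in, so the
      poison lookup itself no longer needs to know about the namespace type.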
      
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  24. 05 Apr, 2016 1 commit
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov committed
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
      ago with the promise that one day it would be possible to implement the
      page cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized, and it is unlikely that it ever will.
      
      We have many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE.  And it's a constant source of confusion on whether a
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
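
      The first two rewrites are safe because the shift amount is zero when
      the two constants are equal. A tiny sketch, assuming 4K pages; the
      FAKE_* macros and the two helper functions are hypothetical.

```c
#define FAKE_PAGE_SHIFT       12              /* assumed 4K pages */
#define FAKE_PAGE_CACHE_SHIFT FAKE_PAGE_SHIFT /* the two were equal */

/* Old style: convert a page cache index to a page index. */
static unsigned long old_index(unsigned long idx)
{
	return idx << (FAKE_PAGE_CACHE_SHIFT - FAKE_PAGE_SHIFT);
}

/* New style: the shift by zero is simply dropped. */
static unsigned long new_index(unsigned long idx)
{
	return idx;
}
```

      Both functions return the same value for every input, which is why the
      conversion can be done mechanically.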
      
      This patch contains automated changes generated with coccinelle using
      the script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      
      There are a few places in the code that coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation will
      also be addressed in a separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  25. 29 Mar, 2016 1 commit
  26. 10 Mar, 2016 3 commits
  27. 07 Mar, 2016 1 commit