1. 12 10月, 2017 2 次提交
  2. 10 7月, 2017 1 次提交
  3. 30 6月, 2017 1 次提交
  4. 10 6月, 2017 1 次提交
    • D
      x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations · 0aed55af
      Dan Williams 提交于
      The pmem driver has a need to transfer data with a persistent memory
      destination and be able to rely on the fact that the destination writes are not
      cached. It is sufficient for the writes to be flushed to a cpu-store-buffer
      (non-temporal / "movnt" in x86 terms), as we expect userspace to call fsync()
      to ensure data-writes have reached a power-fail-safe zone in the platform. The
      fsync() triggers a REQ_FUA or REQ_FLUSH to the pmem driver which will turn
      around and fence previous writes with an "sfence".
      
      Implement a __copy_from_user_inatomic_flushcache, memcpy_page_flushcache, and
      memcpy_flushcache, that guarantee that the destination buffer is not dirty in
      the cpu cache on completion. The new copy_from_iter_flushcache and sub-routines
      will be used to replace the "pmem api" (include/linux/pmem.h +
      arch/x86/include/asm/pmem.h). The availability of copy_from_iter_flushcache()
      and memcpy_flushcache() are gated by the CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
      config symbol, and fallback to copy_from_iter_nocache() and plain memcpy()
      otherwise.
      
      This is meant to satisfy the concern from Linus that if a driver wants to do
      something beyond the normal nocache semantics it should be something private to
      that driver [1], and Al's concern that anything uaccess related belongs with
      the rest of the uaccess code [2].
      
      The first consumer of this interface is a new 'copy_from_iter' dax operation so
      that pmem can inject cache maintenance operations without imposing this
      overhead on other dax-capable drivers.
      
      [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
      [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
      
      Cc: <x86@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      0aed55af
  5. 03 4月, 2017 1 次提交
  6. 06 12月, 2016 1 次提交
    • A
      [iov_iter] new primitives - copy_from_iter_full() and friends · cbbd26b8
      Al Viro 提交于
      copy_from_iter_full(), copy_from_iter_full_nocache() and
      csum_and_copy_from_iter_full() - counterparts of copy_from_iter()
      et.al., advancing iterator only in case of successful full copy
      and returning whether it had been successful or not.
      
      Convert some obvious users.  *NOTE* - do not blindly assume that
      something is a good candidate for those unless you are sure that
      not advancing iov_iter in failure case is the right thing in
      this case.  Anything that does short read/short write kind of
      stuff (or is in a loop, etc.) is unlikely to be a good one.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      cbbd26b8
  7. 01 11月, 2016 1 次提交
  8. 11 10月, 2016 1 次提交
  9. 06 10月, 2016 1 次提交
    • A
      new iov_iter flavour: pipe-backed · 241699cd
      Al Viro 提交于
      iov_iter variant for passing data into pipe.  copy_to_iter()
      copies data into page(s) it has allocated and stuffs them into
      the pipe; copy_page_to_iter() stuffs there a reference to the
      page given to it.  Both will try to coalesce if possible.
      iov_iter_zero() is similar to copy_to_iter(); iov_iter_get_pages()
      and friends will do as copy_to_iter() would have and return the
      pages where the data would've been copied.  iov_iter_advance()
      will truncate everything past the spot it has advanced to.
      
      New primitive: iov_iter_pipe(), used for initializing those.
      pipe should be locked all along.
      
      Running out of space acts as fault would for iovec-backed ones;
      in other words, giving it to ->read_iter() may result in short
      read if the pipe overflows, or -EFAULT if it happens with nothing
      copied there.
      
      In other words, ->read_iter() on those acts pretty much like
      ->splice_read().  Moreover, all generic_file_splice_read() users,
      as well as many other ->splice_read() instances can be switched
      to that scheme - that'll happen in the next commit.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      241699cd
  10. 28 9月, 2016 1 次提交
    • A
      get rid of separate multipage fault-in primitives · 4bce9f6e
      Al Viro 提交于
      * the only remaining callers of "short" fault-ins are just as happy with generic
      variants (both in lib/iov_iter.c); switch them to multipage variants, kill the
      "short" ones
      * rename the multipage variants to now available plain ones.
      * get rid of compat macro defining iov_iter_fault_in_multipage_readable by
      expanding it in its only user.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      4bce9f6e
  11. 18 9月, 2016 1 次提交
  12. 09 4月, 2016 1 次提交
  13. 07 12月, 2015 1 次提交
  14. 12 4月, 2015 2 次提交
  15. 30 3月, 2015 1 次提交
    • A
      saner iov_iter initialization primitives · bc917be8
      Al Viro 提交于
      iovec-backed iov_iter instances are assumed to satisfy several properties:
      	* no more than UIO_MAXIOV elements in iovec array
      	* total size of all ranges is no more than MAX_RW_COUNT
      	* all ranges pass access_ok().
      
      The problem is, invariants of data structures should be established in the
      primitives creating those data structures, not in the code using those
      primitives.  And iov_iter_init() violates that principle.  For a while we
      managed to get away with that, but once the use of iov_iter started to
      spread, it didn't take long for shit to hit the fan - missed check in
      sys_sendto() had introduced a roothole.
      
      We _do_ have primitives for importing and validating iovecs (both native and
      compat ones) and those primitives are almost always followed by shoving the
      resulting iovec into iov_iter.  Life would be considerably simpler (and safer)
      if we combined those primitives with initializing iov_iter.
      
      That gives us two new primitives - import_iovec() and compat_import_iovec().
      Calling conventions:
      	iovec = iov_array;
      	err = import_iovec(direction, uvec, nr_segs,
      			   ARRAY_SIZE(iov_array), &iovec,
      			   &iter);
      imports user vector into kernel space (into iov_array if it fits, allocated
      if it doesn't fit or if iovec was NULL), validates it and sets iter up to
      refer to it.  On success 0 is returned and allocated kernel copy (or NULL
      if the array had fit into caller-supplied one) is returned via iovec.
      On failure all allocations are undone and -E... is returned.  If the total
      size of ranges exceeds MAX_RW_COUNT, the excess is silently truncated.
      
      compat_import_iovec() expects uvec to be a pointer to user array of compat_iovec;
      otherwise it's identical to import_iovec().
      
      Finally, import_single_range() sets iov_iter backed by single-element iovec
      covering a user-supplied range -
      
      	err = import_single_range(direction, address, size, iovec, &iter);
      
      does validation and sets iter up.  Again, size in excess of MAX_RW_COUNT gets
      silently truncated.
      
      Next commits will be switching the things up to use of those and reducing
      the amount of iov_iter_init() instances.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      bc917be8
  16. 18 2月, 2015 1 次提交
  17. 04 2月, 2015 3 次提交
  18. 29 1月, 2015 1 次提交
  19. 17 12月, 2014 1 次提交
  20. 10 12月, 2014 1 次提交
  21. 09 12月, 2014 4 次提交
  22. 09 10月, 2014 1 次提交
    • M
      Add copy_to_iter(), copy_from_iter() and iov_iter_zero() · c35e0248
      Matthew Wilcox 提交于
      For DAX, we want to be able to copy between iovecs and kernel addresses
      that don't necessarily have a struct page.  This is a fairly simple
      rearrangement for bvec iters to kmap the pages outside and pass them in,
      but for user iovecs it gets more complicated because we might try various
      different ways to kmap the memory.  Duplicating the existing logic works
      out best in this case.
      
      We need to be able to write zeroes to an iovec for reads from unwritten
      ranges in a file.  This is performed by the new iov_iter_zero() function,
      again patterned after the existing code that handles iovec iterators.
      
      [AV: and export the buggers...]
      Signed-off-by: NMatthew Wilcox <willy@linux.intel.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      c35e0248
  23. 27 9月, 2014 1 次提交
  24. 08 8月, 2014 1 次提交
  25. 28 6月, 2014 1 次提交
  26. 27 6月, 2014 1 次提交
    • A
      Fix 32-bit regression in block device read(2) · 0b86dbf6
      Al Viro 提交于
      blkdev_read_iter() wants to cap the iov_iter by the amount of data
      remaining to the end of device.  That's what iov_iter_truncate() is for
      (trim iter->count if it's above the given limit).  So far, so good, but
      the argument of iov_iter_truncate() is size_t, so on 32bit boxen (in
      case of a large device) we end up with that upper limit truncated down
      to 32 bits *before* comparing it with iter->count.
      
      Easily fixed by making iov_iter_truncate() take 64bit argument - it does
      the right thing after such change (we only reach the assignment in there
      when the current value of iter->count is greater than the limit, i.e.
      for anything that would get truncated we don't reach the assignment at
      all) and that argument is not the new value of iter->count - it's an
      upper limit for such.
      
      The overhead of passing u64 is not an issue - the thing is inlined, so
      callers passing size_t won't pay any penalty.
      Reported-and-tested-by: NTheodore Tso <tytso@mit.edu>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Tested-by: NAlan Cox <gnomes@lxorguk.ukuu.org.uk>
      Tested-by: NBruno Wolff III <bruno@wolff.to>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0b86dbf6
  27. 07 5月, 2014 7 次提交
    • A
      bio_vec-backed iov_iter · 62a8067a
      Al Viro 提交于
      New variant of iov_iter - ITER_BVEC in iter->type, backed with
      bio_vec array instead of iovec one.  Primitives taught to deal
      with such beasts, __swap_write() switched to using that kind
      of iov_iter.
      
      Note that bio_vec is just a <page, offset, length> triple - there's
      nothing block-specific about it.  I've left the definition where it
      was, but took it from under ifdef CONFIG_BLOCK.
      
      Next target: ->splice_write()...
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      62a8067a
    • A
      lustre: get rid of messing with iovecs · b42b15fd
      Al Viro 提交于
      * switch to ->read_iter/->write_iter
      * keep a pointer to iov_iter instead of iov/nr_segs
      * do not modify iovecs; use iov_iter_truncate()/iov_iter_advance() and
      a new primitive - iov_iter_reexpand() (expand previously truncated
      iterator) istead.
      * (racy) check for lustre VMAs intersecting with iovecs kept for now as
      for_each_iov() loop.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      b42b15fd
    • A
      new helper: copy_page_from_iter() · f0d1bec9
      Al Viro 提交于
      parallel to copy_page_to_iter().  pipe_write() switched to it (and became
      ->write_iter()).
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      f0d1bec9
    • A
      iov_iter_truncate() · 0c949334
      Al Viro 提交于
      Now It Can Be Done(tm) - we don't need to do iov_shorten() in
      generic_file_direct_write() anymore, now that all ->direct_IO()
      instances are converted to proper iov_iter methods and honour
      iter->count and iter->iov_offset properly.
      
      Get rid of count/ocount arguments of generic_file_direct_write(),
      while we are at it.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      0c949334
    • A
      new helper: iov_iter_get_pages_alloc() · 91f79c43
      Al Viro 提交于
      same as iov_iter_get_pages(), except that pages array is allocated
      (kmalloc if possible, vmalloc if that fails) and left for caller to
      free.  Lustre and NFS ->direct_IO() switched to it.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      91f79c43
    • A
      new helper: iov_iter_npages() · f67da30c
      Al Viro 提交于
      counts the pages covered by iov_iter, up to given limit.
      do_block_direct_io() and fuse_iter_npages() switched to
      it.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      f67da30c
    • A
      new helper: iov_iter_get_pages() · 7b2c99d1
      Al Viro 提交于
      iov_iter_get_pages(iter, pages, maxsize, &start) grabs references pinning
      the pages of up to maxsize of (contiguous) data from iter.  Returns the
      amount of memory grabbed or -error.  In case of success, the requested
      area begins at offset start in pages[0] and runs through pages[1], etc.
      Less than requested amount might be returned - either because the contiguous
      area in the beginning of iterator is smaller than requested, or because
      the kernel failed to pin that many pages.
      
      direct-io.c switched to using iov_iter_get_pages()
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      7b2c99d1