1. 07 Sep 2017, 2 commits
  2. 11 Jul 2017, 1 commit
  3. 06 Jul 2017, 2 commits
  4. 05 Jul 2017, 1 commit
    • fs: generic_block_bmap(): initialize all of the fields in the temp bh · 2a527d68
      Committed by Alexander Potapenko
      KMSAN (KernelMemorySanitizer, a new error detection tool) reports the
      use of uninitialized memory in ext4_update_bh_state():
      
      ==================================================================
      BUG: KMSAN: use of unitialized memory
      CPU: 3 PID: 1 Comm: swapper/0 Tainted: G    B           4.8.0-rc6+ #597
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
      01/01/2011
       0000000000000282 ffff88003cc96f68 ffffffff81f30856 0000003000000008
       ffff88003cc96f78 0000000000000096 ffffffff8169742a ffff88003cc96ff8
       ffffffff812fc1fc 0000000000000008 ffff88003a1980e8 0000000100000000
      Call Trace:
       [<     inline     >] __dump_stack lib/dump_stack.c:15
       [<ffffffff81f30856>] dump_stack+0xa6/0xc0 lib/dump_stack.c:51
       [<ffffffff812fc1fc>] kmsan_report+0x1ec/0x300 mm/kmsan/kmsan.c:?
       [<ffffffff812fc33b>] __msan_warning+0x2b/0x40 ??:?
       [<     inline     >] ext4_update_bh_state fs/ext4/inode.c:727
       [<ffffffff8169742a>] _ext4_get_block+0x6ca/0x8a0 fs/ext4/inode.c:759
       [<ffffffff81696d4c>] ext4_get_block+0x8c/0xa0 fs/ext4/inode.c:769
       [<ffffffff814a2d36>] generic_block_bmap+0x246/0x2b0 fs/buffer.c:2991
       [<ffffffff816ca30e>] ext4_bmap+0x5ee/0x660 fs/ext4/inode.c:3177
      ...
      origin description: ----tmp@generic_block_bmap
      ==================================================================
      
      (the line numbers are relative to 4.8-rc6, but the bug persists
      upstream)
      
      The local |tmp| is created in generic_block_bmap() and then passed into
      ext4_bmap() => ext4_get_block() => _ext4_get_block() =>
      ext4_update_bh_state(). Along the way tmp.b_page is never initialized
      before ext4_update_bh_state() checks its value.
      
      [ Use the approach suggested by Kees Cook of initializing the whole bh
        structure; a sketch of this approach follows this entry. ]
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      2a527d68
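
      A minimal sketch of the approach, assuming the kernel's i_blocksize()
      helper and the generic_block_bmap() shape of that era; it is meant as an
      illustration of the fix, not as the exact upstream diff:

        #include <linux/buffer_head.h>
        #include <linux/fs.h>

        sector_t generic_block_bmap(struct address_space *mapping, sector_t block,
                                    get_block_t *get_block)
        {
                struct inode *inode = mapping->host;
                /*
                 * The designated initializer zeroes every field that is not
                 * set explicitly, so b_page, b_state and friends no longer
                 * carry stack garbage into get_block() and the helpers it
                 * calls (such as ext4_update_bh_state()).
                 */
                struct buffer_head tmp = {
                        .b_size = i_blocksize(inode),
                };

                get_block(inode, block, &tmp, 0);
                return tmp.b_blocknr;
        }
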
  5. 03 Jul 2017, 1 commit
    • vfs: Add page_cache_seek_hole_data helper · 334fd34d
      Committed by Andreas Gruenbacher
      Both ext4 and xfs implement seeking for the next hole or piece of data
      in unwritten extents by scanning the page cache, and both versions share
      the same bug when iterating the buffers of a page: the start offset into
      the page isn't taken into account, so when a page fits more than two
      filesystem blocks, things will go wrong.  For example, on a filesystem
      with a block size of 1k, the following command will fail:
      
        xfs_io -f -c "falloc 0 4k" \
                  -c "pwrite 1k 1k" \
                  -c "pwrite 3k 1k" \
                  -c "seek -a -r 0" foo
      
      In this example, neither lseek(fd, 1024, SEEK_HOLE) nor lseek(fd, 2048,
      SEEK_DATA) will return the correct result.
      
      Introduce a generic vfs helper for seeking in the page cache that gets
      this right (a simplified sketch of the buffer walk follows this entry).
      The next commits will replace the filesystem-specific implementations.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      [hch: dropped the export]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      334fd34d
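
      A simplified sketch of the buffer walk such a helper has to get right.
      The function name page_has_data_after() is hypothetical and the data test
      is reduced to buffer_uptodate() for brevity, so this illustrates only the
      offset bookkeeping, not the full upstream page_cache_seek_hole_data():

        #include <linux/buffer_head.h>
        #include <linux/kernel.h>
        #include <linux/pagemap.h>

        static bool page_has_data_after(struct page *page, loff_t offset,
                                        loff_t *data_offset)
        {
                struct buffer_head *bh, *head;
                loff_t pos = page_offset(page);

                if (!page_has_buffers(page))
                        return false;

                bh = head = page_buffers(page);
                do {
                        /*
                         * Skip buffers that end at or before the search
                         * offset; forgetting this step is the bug described
                         * above once a page holds several filesystem blocks.
                         */
                        if (pos + bh->b_size > offset && buffer_uptodate(bh)) {
                                *data_offset = max(pos, offset);
                                return true;
                        }
                        pos += bh->b_size;
                } while ((bh = bh->b_this_page) != head);

                return false;
        }
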
  6. 28 Jun 2017, 1 commit
  7. 09 Jun 2017, 1 commit
  8. 09 May 2017, 1 commit
  9. 27 Apr 2017, 1 commit
    • fs: remove _submit_bh() · 020c2833
      Committed by Eric Biggers
      _submit_bh() allowed submitting a buffer_head for I/O using custom
      bio_flags.  It used to be used by jbd to set BIO_SNAP_STABLE, introduced
      by commit 71368511 ("mm: make snapshotting pages for stable writes a
      per-bio operation").  However, the code and flag has since been removed
      and no _submit_bh() users remain.
      
      These days, bio_flags are mostly used internally by the block layer to
      track the state of bios.  As such, it doesn't really make sense for
      filesystems to use them instead of op_flags when wanting special
      behavior for block requests.
      
      Therefore, remove _submit_bh() and trim the bio_flags argument from
      submit_bh_wbc().
      
      Cc: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      020c2833
  10. 02 Mar 2017, 1 commit
  11. 28 Feb 2017, 1 commit
  12. 03 Jan 2017, 1 commit
  13. 10 Nov 2016, 1 commit
  14. 05 Nov 2016, 3 commits
  15. 03 Nov 2016, 1 commit
  16. 01 Nov 2016, 1 commit
  17. 28 Oct 2016, 1 commit
    • block: better op and flags encoding · ef295ecf
      Committed by Christoph Hellwig
      Now that we don't need the common flags to overflow outside the range
      of a 32-bit type, we can encode them the same way for both the bio and
      request fields.  This in addition allows us to place the operation
      first (and make some room for more ops while we're at it) and to
      stop having to shift around the operation values.
      
      In addition this allows passing around only one value in the block layer
      instead of two (and eventually also in the file systems, but we can do
      that later), and thus cleans up a lot of code.
      
      Last but not least this allows decreasing the size of the cmd_flags
      field in struct request to 32-bits.  Various functions passing this
      value could also be updated, but I'd like to avoid the churn for now.
      (A hedged sketch of the new encoding follows this entry.)
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      ef295ecf
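
      A hedged sketch of the resulting encoding.  The names REQ_OP_BITS,
      REQ_OP_MASK and bi_opf follow the block layer of that era, but the
      SKETCH_* helpers below are illustrative stand-ins rather than copies of
      include/linux/blk_types.h:

        #include <linux/blk_types.h>

        /* The operation occupies the low bits of a single 32-bit field and
         * the flags sit above it, for bios and requests alike. */
        #define SKETCH_OP_BITS  8
        #define SKETCH_OP_MASK  ((1 << SKETCH_OP_BITS) - 1)

        static inline unsigned int sketch_bio_op(const struct bio *bio)
        {
                return bio->bi_opf & SKETCH_OP_MASK;    /* no shifting needed */
        }

        static inline void sketch_bio_set_op_attrs(struct bio *bio,
                                                   unsigned int op,
                                                   unsigned int op_flags)
        {
                bio->bi_opf = op | op_flags;    /* one value to pass around */
        }
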
  18. 12 Oct 2016, 1 commit
  19. 28 Sep 2016, 1 commit
  20. 21 Jul 2016, 1 commit
  21. 27 Jun 2016, 1 commit
    • fs: export __block_write_full_page · b4bba389
      Committed by Benjamin Marzinski
      gfs2 needs to be able to skip the check to see if a page is outside of
      the file size when writing it out. gfs2 can get into a situation where
      it needs to flush its in-memory log to disk while a truncate is in
      progress. If the file being trucated has data journaling enabled, it is
      possible that there are data blocks in the log that are past the end of
      the file. gfs can't finish the log flush without either writing these
      blocks out or revoking them. Otherwise, if the node crashed, it could
      overwrite subsequent changes made by other nodes in the cluster when
      it's journal was replayed.
      
      Unfortunately, there is no way to add log entries to the log during a
      flush. So gfs2 simply writes out the page instead. This situation can
      only occur when the truncate code still has the file locked exclusively,
      and hasn't marked this block as free in the metadata (which happens
      later in trunc_dealloc).  After gfs2 writes this page out, the truncation
      code will shortly invalidate it and write out any revokes if necessary.
      
      In order to make this work, gfs2 needs to be able to skip the check for
      writes outside the file size. Since the check exists in
      block_write_full_page, this patch exports __block_write_full_page, which
      doesn't have the check (a simplified sketch follows this entry).
      Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      b4bba389
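
      A hedged sketch of the relationship between the wrapper and the newly
      exported helper, simplified from the fs/buffer.c of that era (the real
      wrapper also zeroes the tail of a partial page at EOF before writing it):

        #include <linux/buffer_head.h>
        #include <linux/fs.h>
        #include <linux/pagemap.h>

        static int block_write_full_page_sketch(struct page *page,
                                                get_block_t *get_block,
                                                struct writeback_control *wbc)
        {
                struct inode *inode = page->mapping->host;
                pgoff_t end_index = i_size_read(inode) >> PAGE_SHIFT;

                /* Page fully inside i_size: write it out. */
                if (page->index < end_index)
                        return __block_write_full_page(inode, page, get_block,
                                                       wbc, end_buffer_async_write);

                /*
                 * Page at or beyond EOF: this is the check gfs2 needs to
                 * skip when flushing journaled data during a truncate.
                 */
                unlock_page(page);
                return 0;
        }

      With __block_write_full_page() exported, gfs2's log flush can call it
      directly and write such a page regardless of i_size.
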
  22. 21 Jun 2016, 1 commit
    • fs: introduce iomap infrastructure · ae259a9c
      Committed by Christoph Hellwig
      Add infrastructure for multipage buffered writes.  This is implemented
      using a main iterator that applies an actor function to a range that
      can be written.
      
      This infrastructure is used to implement a buffered write helper, one
      to zero file ranges and one to implement the ->page_mkwrite VM
      operation.  All of them borrow a fair amount of code from fs/buffer.c
      for now by using an internal version of __block_write_begin that
      gets passed an iomap and builds the corresponding buffer head.
      
      The file system gets a set of paired ->iomap_begin and ->iomap_end
      calls which allow it to map/reserve a range and get a notification
      once the write code is finished with it (a sketch follows this entry).
      
      Based on earlier code from Dave Chinner.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      ae259a9c
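
      A hedged sketch of the pairing and of the iterator.  The shapes follow
      the original include/linux/iomap.h and fs/iomap.c, but iomap_apply_sketch()
      and the actor typedef are simplified illustrations of the internal helper:

        #include <linux/fs.h>
        #include <linux/iomap.h>

        typedef loff_t (*iomap_actor_t)(struct inode *inode, loff_t pos,
                                        loff_t len, void *data,
                                        struct iomap *iomap);

        static loff_t
        iomap_apply_sketch(struct inode *inode, loff_t pos, loff_t length,
                           unsigned flags, const struct iomap_ops *ops,
                           void *data, iomap_actor_t actor)
        {
                struct iomap iomap = { 0 };
                loff_t written;
                int ret;

                /* Ask the file system to map or reserve the range. */
                ret = ops->iomap_begin(inode, pos, length, flags, &iomap);
                if (ret)
                        return ret;

                /* The actor does the real work (buffered copy-in, zeroing,
                 * page_mkwrite) against the mapping that was just returned. */
                written = actor(inode, pos, length, data, &iomap);

                /* Tell the file system how much was actually used so it can
                 * commit that part and unreserve the rest. */
                if (ops->iomap_end)
                        ops->iomap_end(inode, pos, length,
                                       written > 0 ? written : 0, flags, &iomap);
                return written;
        }
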
  23. 08 Jun 2016, 3 commits
  24. 20 May 2016, 1 commit
    • mm, page_alloc: avoid looking up the first zone in a zonelist twice · c33d6c06
      Committed by Mel Gorman
      The allocator fast path looks up the first usable zone in a zonelist and
      then get_page_from_freelist does the same job in the zonelist iterator.
      This patch preserves the result of that first lookup so it does not have
      to be repeated (a sketch follows this entry).
      
                                                   4.6.0-rc2                  4.6.0-rc2
                                              fastmark-v1r20             initonce-v1r20
        Min      alloc-odr0-1               364.00 (  0.00%)           359.00 (  1.37%)
        Min      alloc-odr0-2               262.00 (  0.00%)           260.00 (  0.76%)
        Min      alloc-odr0-4               214.00 (  0.00%)           214.00 (  0.00%)
        Min      alloc-odr0-8               186.00 (  0.00%)           186.00 (  0.00%)
        Min      alloc-odr0-16              173.00 (  0.00%)           173.00 (  0.00%)
        Min      alloc-odr0-32              165.00 (  0.00%)           165.00 (  0.00%)
        Min      alloc-odr0-64              161.00 (  0.00%)           162.00 ( -0.62%)
        Min      alloc-odr0-128             159.00 (  0.00%)           161.00 ( -1.26%)
        Min      alloc-odr0-256             168.00 (  0.00%)           170.00 ( -1.19%)
        Min      alloc-odr0-512             180.00 (  0.00%)           181.00 ( -0.56%)
        Min      alloc-odr0-1024            190.00 (  0.00%)           190.00 (  0.00%)
        Min      alloc-odr0-2048            196.00 (  0.00%)           196.00 (  0.00%)
        Min      alloc-odr0-4096            202.00 (  0.00%)           202.00 (  0.00%)
        Min      alloc-odr0-8192            206.00 (  0.00%)           205.00 (  0.49%)
        Min      alloc-odr0-16384           206.00 (  0.00%)           205.00 (  0.49%)
      
      The benefit is negligible and the results are within the noise but each
      cycle counts.
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c33d6c06
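
      A hedged sketch of the idea.  The names alloc_context, preferred_zoneref,
      first_zones_zonelist() and get_page_from_freelist() follow the mm code of
      that era (they live in mm/ internals), but alloc_fastpath_sketch() is an
      illustration rather than the actual diff:

        #include <linux/gfp.h>
        #include <linux/mmzone.h>

        static struct page *
        alloc_fastpath_sketch(gfp_t gfp_mask, unsigned int order,
                              int alloc_flags, struct alloc_context *ac)
        {
                /* Look up the first usable zone exactly once ... */
                ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
                                                             ac->high_zoneidx,
                                                             ac->nodemask);
                if (!ac->preferred_zoneref->zone)
                        return NULL;    /* no usable zone in this zonelist */

                /* ... and let the zonelist iterator inside
                 * get_page_from_freelist() start from the cached zoneref
                 * instead of repeating the lookup. */
                return get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
        }
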
  25. 05 Apr 2016, 1 commit
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Committed by Kirill A. Shutemov
      The PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long*
      time ago with the promise that one day it would be possible to implement
      the page cache with bigger chunks than PAGE_SIZE.

      This promise never materialized, and it is unlikely it ever will.
      
      We have many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE, and it is a constant source of confusion whether the
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      
      There are a few places in the code that coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation will
      also be addressed in a separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  26. 16 Mar 2016, 2 commits
  27. 07 Jan 2016, 1 commit
  28. 11 Nov 2015, 1 commit
    • vfs: remove unused wrapper block_page_mkwrite() · 5c500029
      Committed by Ross Zwisler
      The function currently called "__block_page_mkwrite()" used to be called
      "block_page_mkwrite()" until a wrapper for this function was added by:
      
      commit 24da4fab ("vfs: Create __block_page_mkwrite() helper passing
      	error values back")
      
      This wrapper, the current "block_page_mkwrite()", is now unused.
      __block_page_mkwrite() is used directly by ext4, nilfs2 and xfs.
      
      Remove the unused wrapper, rename __block_page_mkwrite() back to
      block_page_mkwrite() and update the comment above block_page_mkwrite().
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: Jan Kara <jack@suse.com>
      Cc: Jan Kara <jack@suse.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      5c500029
  29. 07 Nov 2015, 1 commit
  30. 14 Aug 2015, 1 commit
  31. 29 Jul 2015, 2 commits
    • block: manipulate bio->bi_flags through helpers · b7c44ed9
      Committed by Jens Axboe
      Some places use helpers now, others don't.  We only have the 'is set'
      helper, so add helpers for setting and clearing flags too (sketched
      after this entry).
      
      It was a bit of a mess of atomic vs non-atomic access. With
      BIO_UPTODATE gone, we don't have any risk of concurrent access to the
      flags. So relax the restriction and don't make any of them atomic. The
      flags that do have serialization issues (reffed and chained), we
      already handle those separately.
      Signed-off-by: Jens Axboe <axboe@fb.com>
      b7c44ed9
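
      The added helpers look approximately like this: plain, non-atomic bit
      operations, per the reasoning above (a sketch of include/linux/bio.h
      rather than a verbatim copy):

        #include <linux/blk_types.h>

        static inline bool bio_flagged(struct bio *bio, unsigned int bit)
        {
                return (bio->bi_flags & (1U << bit)) != 0;
        }

        static inline void bio_set_flag(struct bio *bio, unsigned int bit)
        {
                bio->bi_flags |= (1U << bit);
        }

        static inline void bio_clear_flag(struct bio *bio, unsigned int bit)
        {
                bio->bi_flags &= ~(1U << bit);
        }
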
    • block: add a bi_error field to struct bio · 4246a0b6
      Committed by Christoph Hellwig
      Currently we have two different ways to signal an I/O error on a BIO:
      
       (1) by clearing the BIO_UPTODATE flag
       (2) by returning a Linux errno value to the bi_end_io callback
      
      The first one has the drawback of only communicating a single possible
      error (-EIO), and the second one has the drawback of not being persistent
      when bios are queued up and of not being passed along from child to parent
      bio in the ever more popular chaining scenario.  Having both mechanisms
      available has the additional drawback of utterly confusing driver authors
      and introducing bugs where various I/O submitters only deal with one of
      them, and the others have to add boilerplate code to deal with both kinds
      of error returns.
      
      So add a new bi_error field to store an errno value directly in struct
      bio and remove the existing mechanisms to clean all this up (a sketch
      follows this entry).
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      4246a0b6
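
      A hedged before/after sketch of the error signalling; these are
      illustrative fragments, not the full diff (BIO_UPTODATE and the
      two-argument bio_endio() are the pre-patch interfaces):

        /* Before: two independent ways to report failure, per (1) and (2)
         * in the list above. */
        clear_bit(BIO_UPTODATE, &bio->bi_flags);   /* (1) flag only, errno is lost */
        bio_endio(bio, -EIO);                      /* (2) errno handed to bi_end_io */

        /* After: a single mechanism; the errno is stored in the bio itself
         * and survives queueing and chaining. */
        bio->bi_error = -EIO;
        bio_endio(bio);
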
  32. 02 Jun 2015, 1 commit