1. 05 2月, 2017 1 次提交
  2. 14 11月, 2016 2 次提交
  3. 05 11月, 2016 1 次提交
  4. 01 11月, 2016 1 次提交
  5. 12 10月, 2016 1 次提交
  6. 30 9月, 2016 1 次提交
  7. 11 7月, 2016 1 次提交
  8. 08 6月, 2016 2 次提交
  9. 06 5月, 2016 1 次提交
    • J
      ext4: remove unnecessary bio get/put · 32157de2
      Jens Axboe 提交于
      ext4_io_submit() used to check for EOPNOTSUPP after bio submission,
      which is why it had to get an extra reference to the bio before
      submitting it. But since we no longer touch the bio after submission,
      get rid of the redundant get/put of the bio. If we do get the extra
      reference, we enter the slower path of having to flag this bio as now
      having external references.
      Signed-off-by: NJens Axboe <axboe@fb.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      32157de2
  10. 05 4月, 2016 1 次提交
    • K
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov 提交于
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  11. 03 4月, 2016 1 次提交
  12. 27 3月, 2016 1 次提交
    • T
      ext4 crypto: don't let data integrity writebacks fail with ENOMEM · c9af28fd
      Theodore Ts'o 提交于
      We don't want the writeback triggered from the journal commit (in
      data=writeback mode) to cause the journal to abort due to
      generic_writepages() returning an ENOMEM error.  In addition, if
      fsync() fails with ENOMEM, most applications will probably not do the
      right thing.
      
      So if we are doing a data integrity sync, and ext4_encrypt() returns
      ENOMEM, we will submit any queued I/O to date, and then retry the
      allocation using GFP_NOFAIL.
      
      Google-Bug-Id: 27641567
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      c9af28fd
  13. 09 3月, 2016 1 次提交
  14. 29 2月, 2016 1 次提交
  15. 07 1月, 2016 1 次提交
  16. 03 10月, 2015 1 次提交
    • T
      ext4 crypto: fix memory leak in ext4_bio_write_page() · 937d7b84
      Theodore Ts'o 提交于
      There are times when ext4_bio_write_page() is called even though we
      don't actually need to do any I/O.  This happens when ext4_writepage()
      gets called by the jbd2 commit path when an inode needs to force its
      pages written out in order to provide data=ordered guarantees --- and
      a page is backed by an unwritten (e.g., uninitialized) block on disk,
      or if delayed allocation means the page's backing store hasn't been
      allocated yet.  In that case, we need to skip the call to
      ext4_encrypt_page(), since in addition to wasting CPU, it leads to a
      bounce page and an ext4 crypto context getting leaked.
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      937d7b84
  17. 14 8月, 2015 1 次提交
  18. 29 7月, 2015 1 次提交
    • C
      block: add a bi_error field to struct bio · 4246a0b6
      Christoph Hellwig 提交于
      Currently we have two different ways to signal an I/O error on a BIO:
      
       (1) by clearing the BIO_UPTODATE flag
       (2) by returning a Linux errno value to the bi_end_io callback
      
      The first one has the drawback of only communicating a single possible
      error (-EIO), and the second one has the drawback of not beeing persistent
      when bios are queued up, and are not passed along from child to parent
      bio in the ever more popular chaining scenario.  Having both mechanisms
      available has the additional drawback of utterly confusing driver authors
      and introducing bugs where various I/O submitters only deal with one of
      them, and the others have to add boilerplate code to deal with both kinds
      of error returns.
      
      So add a new bi_error field to store an errno value directly in struct
      bio and remove the existing mechanisms to clean all this up.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NHannes Reinecke <hare@suse.de>
      Reviewed-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      4246a0b6
  19. 22 7月, 2015 2 次提交
    • T
      ext4: implement cgroup writeback support · 001e4a87
      Tejun Heo 提交于
      For ordered and writeback data modes, all data IOs go through
      ext4_io_submit.  This patch adds cgroup writeback support by invoking
      wbc_init_bio() from io_submit_init_bio() and wbc_account_io() in
      io_submit_add_bh().  Journal data which is written by jbd2 worker is
      left alone by this patch and will always be written out from the root
      cgroup.
      
      ext4_fill_super() is updated to set MS_CGROUPWB when data mode is
      either ordered or writeback.  In journaled data mode, most IOs become
      synchronous through the journal and enabling cgroup writeback support
      doesn't make much sense or difference.  Journaled data mode is left
      alone.
      
      Lightly tested with sequential data write workload.  Behaves as
      expected.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      001e4a87
    • T
      ext4: replace ext4_io_submit->io_op with ->io_wbc · 5a33911f
      Tejun Heo 提交于
      ext4_io_submit_init() takes the pointer to writeback_control to test
      its sync_mode and determine between WRITE and WRITE_SYNC and records
      the result in ->io_op.  This patch makes it record the pointer
      directly and moves the test to ext4_io_submit().
      
      This doesn't cause any noticeable differences now but having
      writeback_control available throughout IO submission path will be
      depended upon by the planned cgroup writeback support.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      5a33911f
  20. 01 6月, 2015 1 次提交
  21. 19 5月, 2015 1 次提交
  22. 12 4月, 2015 1 次提交
  23. 03 4月, 2015 1 次提交
  24. 26 3月, 2015 1 次提交
  25. 05 6月, 2014 1 次提交
  26. 28 5月, 2014 1 次提交
  27. 12 5月, 2014 1 次提交
    • N
      ext4: fix data integrity sync in ordered mode · 1c8349a1
      Namjae Jeon 提交于
      When we perform a data integrity sync we tag all the dirty pages with
      PAGECACHE_TAG_TOWRITE at start of ext4_da_writepages.  Later we check
      for this tag in write_cache_pages_da and creates a struct
      mpage_da_data containing contiguously indexed pages tagged with this
      tag and sync these pages with a call to mpage_da_map_and_submit.  This
      process is done in while loop until all the PAGECACHE_TAG_TOWRITE
      pages are synced. We also do journal start and stop in each iteration.
      journal_stop could initiate journal commit which would call
      ext4_writepage which in turn will call ext4_bio_write_page even for
      delayed OR unwritten buffers. When ext4_bio_write_page is called for
      such buffers, even though it does not sync them but it clears the
      PAGECACHE_TAG_TOWRITE of the corresponding page and hence these pages
      are also not synced by the currently running data integrity sync. We
      will end up with dirty pages although sync is completed.
      
      This could cause a potential data loss when the sync call is followed
      by a truncate_pagecache call, which is exactly the case in
      collapse_range.  (It will cause generic/127 failure in xfstests)
      
      To avoid this issue, we can use set_page_writeback_keepwrite instead of
      set_page_writeback, which doesn't clear TOWRITE tag.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NAshish Sangwan <a.sangwan@samsung.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: NJan Kara <jack@suse.cz>
      1c8349a1
  28. 07 4月, 2014 1 次提交
  29. 24 11月, 2013 2 次提交
    • K
      block: Abstract out bvec iterator · 4f024f37
      Kent Overstreet 提交于
      Immutable biovecs are going to require an explicit iterator. To
      implement immutable bvecs, a later patch is going to add a bi_bvec_done
      member to this struct; for now, this patch effectively just renames
      things.
      Signed-off-by: NKent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "Ed L. Cashin" <ecashin@coraid.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Yehuda Sadeh <yehuda@inktank.com>
      Cc: Sage Weil <sage@inktank.com>
      Cc: Alex Elder <elder@inktank.com>
      Cc: ceph-devel@vger.kernel.org
      Cc: Joshua Morris <josh.h.morris@us.ibm.com>
      Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: dm-devel@redhat.com
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux390@de.ibm.com
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Cc: Benny Halevy <bhalevy@tonian.com>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Chris Mason <chris.mason@fusionio.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Dave Kleikamp <shaggy@kernel.org>
      Cc: Joern Engel <joern@logfs.org>
      Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Ben Myers <bpm@sgi.com>
      Cc: xfs@oss.sgi.com
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Guo Chao <yan@linux.vnet.ibm.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Cc: "Roger Pau Monné" <roger.pau@citrix.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Cc: Ian Campbell <Ian.Campbell@citrix.com>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchand@redhat.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Peng Tao <tao.peng@emc.com>
      Cc: Andy Adamson <andros@netapp.com>
      Cc: fanchaoting <fanchaoting@cn.fujitsu.com>
      Cc: Jie Liu <jeff.liu@oracle.com>
      Cc: Sunil Mushran <sunil.mushran@gmail.com>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Pankaj Kumar <pankaj.km@samsung.com>
      Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
      Cc: Mel Gorman <mgorman@suse.de>6
      4f024f37
    • K
      block: Convert various code to bio_for_each_segment() · 2c30c71b
      Kent Overstreet 提交于
      With immutable biovecs we don't want code accessing bi_io_vec directly -
      the uses this patch changes weren't incorrect since they all own the
      bio, but it makes the code harder to audit for no good reason - also,
      this will help with multipage bvecs later.
      Signed-off-by: NKent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Chris Mason <chris.mason@fusionio.com>
      Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      Cc: Joern Engel <joern@logfs.org>
      Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      2c30c71b
  30. 16 10月, 2013 1 次提交
  31. 04 9月, 2013 1 次提交
    • C
      direct-io: Implement generic deferred AIO completions · 7b7a8665
      Christoph Hellwig 提交于
      Add support to the core direct-io code to defer AIO completions to user
      context using a workqueue.  This replaces opencoded and less efficient
      code in XFS and ext4 (we save a memory allocation for each direct IO)
      and will be needed to properly support O_(D)SYNC for AIO.
      
      The communication between the filesystem and the direct I/O code requires
      a new buffer head flag, which is a bit ugly but not avoidable until the
      direct I/O code stops abusing the buffer_head structure for communicating
      with the filesystems.
      
      Currently this creates a per-superblock unbound workqueue for these
      completions, which is taken from an earlier patch by Jan Kara.  I'm
      not really convinced about this use and would prefer a "normal" global
      workqueue with a high concurrency limit, but this needs further discussion.
      
      JK: Fixed ext4 part, dynamic allocation of the workqueue.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      7b7a8665
  32. 12 7月, 2013 1 次提交
    • A
      ext4: rate limit printk in buffer_io_error() · e8974c39
      Anatol Pomozov 提交于
      If there are a lot of outstanding buffered IOs when a device is
      taken offline (due to hardware errors etc), ext4_end_bio prints
      out a message for each failed logical block. While this is desirable,
      we see thousands of such lines being printed out before the
      serial console gets overwhelmed, causing ext4_end_bio() wait for
      the printk to complete.
      
      This in itself isn't a disaster, except for the detail that this
      function is being called with the queue lock held.
      This causes any other function in the block layer
      to spin on its spin_lock_irqsave while the serial console is
      draining. If NMI watchdog is enabled on this machine then it
      eventually comes along and shoots the machine in the head.
      
      The end result is that losing any one disk causes the machine to
      go down. This patch rate limits the printk to bandaid around the
      problem.
      
      Tested: xfstests
      Change-Id: I8ab5690dcf4f3a67e78be147d45e489fdf4a88d8
      Signed-off-by: NAnatol Pomozov <anatol.pomozov@gmail.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      e8974c39
  33. 11 7月, 2013 1 次提交
    • J
      ext4: fix warning in ext4_evict_inode() · 822dbba3
      Jan Kara 提交于
      The following race can lead to ext4_evict_inode() seeing i_ioend_count
      > 0 and thus triggering a sanity check warning:
      
              CPU1                                    CPU2
      ext4_end_bio()                          ext4_evict_inode()
        ext4_finish_bio()
          end_page_writeback();
                                                truncate_inode_pages()
                                                  evict page
                                              WARN_ON(i_ioend_count > 0);
        ext4_put_io_end_defer()
          ext4_release_io_end()
            dec i_ioend_count
      
      This is possible use-after-free bug since we decrement i_ioend_count in
      possibly released inode.
      
      Since i_ioend_count is used only for sanity checks one possible solution
      would be to just remove it but for now I'd like to keep those sanity
      checks to help debugging the new ext4 writeback code.
      
      This patch changes ext4_end_bio() to call ext4_put_io_end_defer() before
      ext4_finish_bio() in the shortcut case when unwritten extent conversion
      isn't needed.  In that case we don't need the io_end so we are safe to
      drop it early.
      Reported-by: NGuenter Roeck <linux@roeck-us.net>
      Tested-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      822dbba3
  34. 06 6月, 2013 1 次提交
  35. 05 6月, 2013 2 次提交