1. 15 Sep, 2009 4 commits
    • udf: Fix possible corruption when close races with write · cbc8cc33
      Committed by Jan Kara
      When we close a file, we remove preallocated blocks from it. But this
      truncation was not protected by i_mutex, so it could race with a write
      through a different fd and cause crashes or even filesystem corruption.
      Signed-off-by: Jan Kara <jack@suse.cz>
    • udf: Perform preallocation only for regular files · 81056dd0
      Committed by Jan Kara
      So far we have preallocated blocks for directories as well, but that
      raises the question of when to get rid of preallocated blocks we don't
      need. Until now we removed them in udf_clear_inode(), which has two
      disadvantages:
      1) blocks are unavailable long after writing to a directory has finished,
         so one can run out of space unnecessarily early
      2) releasing blocks from udf_clear_inode() is problematic because the VFS
         does not expect us to redirty the inode there, and it also slows down
         memory reclaim.
      
      So preallocate blocks only for regular files, where we can drop the
      preallocation in udf_release_file().
      Signed-off-by: Jan Kara <jack@suse.cz>
    • udf: Remove wrong assignment in udf_symlink · 7c6e3d1a
      Committed by Jan Kara
      Recomputation of the pointer was wrong (it should have been just an increment).
      Luckily, we never use the computed value. Remove it.
      Signed-off-by: Jan Kara <jack@suse.cz>
    • udf: Remove dead code · 5891d9dd
      Committed by Jan Kara
      Remove code that is never used.
      Signed-off-by: Jan Kara <jack@suse.cz>
  2. 14 Sep, 2009 36 commits
    • fsync: wait for data writeout completion before calling ->fsync · 2daea67e
      Committed by Christoph Hellwig
      Currently vfs_fsync(_range) first calls filemap_fdatawrite to write out
      the data, then calls into ->fsync to write out the metadata and then finally
      calls filemap_fdatawait to wait for the data I/O to complete.  What sounds
      like a clever micro-optimization actually is a nasty trap for many filesystems.
      
      For many modern filesystems i_size or other inode information is only
      updated on I/O completion and we need to wait for I/O to finish before
      we can write out the metadata.  For old-fashioned filesystems that
      instantiate blocks during the actual write and also update the metadata
      at that point it opens up a large window where we could expose uninitialized
      blocks after a crash.  While a few filesystems that need it already wait
      for the I/O to finish inside their ->fsync methods it is rather suboptimal
      as it is done under the i_mutex and also always for the whole file instead
      of just a part as we could do for O_SYNC handling.
      
      Here is a small audit of all fsync instances in the tree:
      
       - spufs_mfc_fsync:
       - ps3flash_fsync:
       - vol_cdev_fsync:
       - printer_fsync:
       - fb_deferred_io_fsync:
       - bad_file_fsync:
       - simple_sync_file:
      
      	don't care - filesystems/drivers don't use the page cache or are
      	purely in-memory.
      
       - simple_fsync:
       - file_fsync:
       - affs_file_fsync:
       - fat_file_fsync:
       - jfs_fsync:
       - ubifs_fsync:
       - reiserfs_dir_fsync:
       - reiserfs_sync_file:
      
      	never touch pagecache themselves.  We need to wait before if we do
      	not want to expose stale data after an allocation.
      
       - afs_fsync:
       - fuse_fsync_common:
      
      	do the waiting writeback itself in awkward ways, would benefit from
      	proper semantics
      
       - block_fsync:
      
      	Does a filemap_write_and_wait on the block device inode.  Because we
      	now have f_mapping that is the same inode we call it on in vfs_fsync.
      	So just removing it and letting the VFS do the work in one go would
      	be an improvement.
      
       - btrfs_sync_file:
       - cifs_fsync:
       - xfs_file_fsync:
      
      	need the wait first and currently do it themselves. would benefit from
      	doing it outside i_mutex.
      
       - coda_fsync:
       - ecryptfs_fsync:
       - exofs_file_fsync:
       - shm_fsync:
      
      	only passes the fsync through to the lower layer
      
       - ext3_sync_file:
      
      	doesn't seem to care, comments are confusing.
      
       - ext4_sync_file:
      
      	would need the wait to work correctly for delalloc mode with late
      	i_size updates.  Otherwise the ext3 comment applies.
      
      	currently implements its own writeback and wait in an odd way,
      	could benefit from doing it properly.
      
       - gfs2_fsync:
      
      	not needed for journaled data mode, but probably harmless there.
      	Currently writes back data asynchronously itself.  Needs some
      	major audit.
      
       - hostfs_fsync:
      
      	just calls fsync/datasync on the host FD.  Without the wait before
      	data might not even be inflight yet if we're unlucky.
      
       - hpfs_file_fsync:
       - ncp_fsync:
      
      	no-ops.  Dangerous before and after.
      
       - jffs2_fsync:
      
      	just calls jffs2_flush_wbuf_gc, not sure how this relates to data.
      
       - nfs_fsync_dir:
      
      	just increments stats, claims all directory operations are synchronous
      
       - nfs_file_fsync:
      
      	only writes out data???  Looks very odd.
      
       - nilfs_sync_file:
      
      	looks like it expects all data done, but not sure from the code
      
       - ntfs_dir_fsync:
       - ntfs_file_fsync:
      
      	appear to do their own data writeback.  Very convoluted code.
      
       - ocfs2_sync_file:
      
      	does its own data writeback, but no wait.  probably needs the wait.
      
       - smb_fsync:
      
      	according to a comment expects all pages written already, probably needs
      	the wait before.
      
      This patch only changes vfs_fsync_range, removal of the wait in the methods
      that have it is left to the filesystem maintainers.  Note that most
      filesystems really do need an audit for their fsync methods given the
      gems found in this very brief audit.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
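The ordering problem described in the fsync commit above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyInode` and its methods are hypothetical stand-ins named after the kernel helpers mentioned in the message, and the model assumes a filesystem that updates i_size only on I/O completion.

```python
class ToyInode:
    """Toy inode whose size is only updated when data I/O completes."""

    def __init__(self):
        self.pending_bytes = 0     # data submitted but not yet completed
        self.i_size = 0            # in-memory size, updated on I/O completion
        self.synced_i_size = None  # size recorded by the last metadata write

    def fdatawrite(self, nbytes):
        # Start writeback: the I/O is submitted, not finished.
        self.pending_bytes += nbytes

    def fdatawait(self):
        # I/O completion: only now does i_size reflect the new data.
        self.i_size += self.pending_bytes
        self.pending_bytes = 0

    def fsync_metadata(self):
        # Stand-in for ->fsync: persist whatever metadata is current.
        self.synced_i_size = self.i_size


def fsync_old_order(inode, nbytes):
    """Order before the patch: write, ->fsync, then wait."""
    inode.fdatawrite(nbytes)
    inode.fsync_metadata()  # metadata written before the data I/O completed
    inode.fdatawait()
    return inode.synced_i_size


def fsync_new_order(inode, nbytes):
    """Order after the patch: write, wait, then ->fsync."""
    inode.fdatawrite(nbytes)
    inode.fdatawait()       # wait for the data first
    inode.fsync_metadata()
    return inode.synced_i_size


print(fsync_old_order(ToyInode(), 4096))  # stale size recorded
print(fsync_new_order(ToyInode(), 4096))  # up-to-date size recorded
```

In this model the old order persists a size of 0 although 4096 bytes were written, which is the kind of stale metadata the message warns about.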
    • vfs: Remove generic_osync_inode() and sync_page_range{_nolock}() · 18f2ee70
      Committed by Jan Kara
      Remove these three functions since nobody uses them anymore.
      Signed-off-by: Jan Kara <jack@suse.cz>
    • fat: Opencode sync_page_range_nolock() · 2f3d675b
      Committed by Jan Kara
      fat_cont_expand() is the only user of sync_page_range_nolock(). It's also the
      only user of generic_osync_inode() which does not have a file open.  So
      open-code the needed actions for FAT so that we can convert
      generic_osync_inode() to a standard syncing path.
      
      Update a comment about generic_osync_inode().
      
      CC: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Signed-off-by: Jan Kara <jack@suse.cz>
    • xfs: Convert sync_page_range() to simple filemap_write_and_wait_range() · af0f4414
      Committed by Jan Kara
      Christoph Hellwig says that it is enough for XFS to call
      filemap_write_and_wait_range() instead of sync_page_range() because we do
      all the metadata syncing when forcing the log.
      
      CC: Felix Blyakher <felixb@sgi.com>
      CC: xfs@oss.sgi.com
      CC: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
    • ocfs2: Update syncing after splicing to match generic version · d23c937b
      Committed by Jan Kara
      Update the ocfs2-specific splicing code to use the generic syncing helper. The
      sync no longer happens under rw_lock because generic_write_sync() acquires
      i_mutex, which ranks above rw_lock. That should not matter because the standard
      fsync path does not hold it either.
      Acked-by: Joel Becker <Joel.Becker@oracle.com>
      Acked-by: Mark Fasheh <mfasheh@suse.com>
      CC: ocfs2-devel@oss.oracle.com
      Signed-off-by: Jan Kara <jack@suse.cz>
    • ntfs: Use new syncing helpers and update comments · ebbbf757
      Committed by Jan Kara
      Use new syncing helpers in .write and .aio_write functions. Also
      remove superfluous syncing in ntfs_file_buffered_write() and update
      comments about generic_osync_inode().
      
      CC: Anton Altaparmakov <aia21@cantab.net>
      CC: linux-ntfs-dev@lists.sourceforge.net
      Signed-off-by: Jan Kara <jack@suse.cz>
    • ext4: Remove syncing logic from ext4_file_write · 0d34ec62
      Committed by Jan Kara
      The syncing is now properly handled by generic_file_aio_write() so
      no special ext4 code is needed.
      
      CC: linux-ext4@vger.kernel.org
      CC: tytso@mit.edu
      Signed-off-by: Jan Kara <jack@suse.cz>
    • ext3: Remove syncing logic from ext3_file_write · e367626b
      Committed by Jan Kara
      Syncing is now properly done by generic_file_aio_write() so no special logic is
      needed in ext3.
      
      CC: linux-ext4@vger.kernel.org
      Signed-off-by: Jan Kara <jack@suse.cz>
    • ext2: Update comment about generic_osync_inode · a2a735ad
      Committed by Jan Kara
      We rely on generic_write_sync() now.
      
      CC: linux-ext4@vger.kernel.org
      Signed-off-by: Jan Kara <jack@suse.cz>
    • vfs: Introduce new helpers for syncing after writing to O_SYNC file or IS_SYNC inode · 148f948b
      Committed by Jan Kara
      Introduce new function for generic inode syncing (vfs_fsync_range) and use
      it from fsync() path. Introduce also new helper for syncing after a sync
      write (generic_write_sync) using the generic function.
      
      Use these new helpers for syncing from generic VFS functions. This makes
      O_SYNC writes to block devices acquire i_mutex for syncing. If we really
      care about this, we can make block_fsync() drop the i_mutex and reacquire
      it before it returns.
      
      CC: Evgeniy Polyakov <zbr@ioremap.net>
      CC: ocfs2-devel@oss.oracle.com
      CC: Joel Becker <joel.becker@oracle.com>
      CC: Felix Blyakher <felixb@sgi.com>
      CC: xfs@oss.sgi.com
      CC: Anton Altaparmakov <aia21@cantab.net>
      CC: linux-ntfs-dev@lists.sourceforge.net
      CC: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      CC: linux-ext4@vger.kernel.org
      CC: tytso@mit.edu
      Acked-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
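From user space, the path this commit reworks is reached by writing through a descriptor opened with O_SYNC. The following minimal sketch assumes a POSIX system where `os.O_SYNC` is available; the helper names `write_osync`, `write_then_fsync` and `demo` are illustrative only and are not part of the kernel change.

```python
import os
import tempfile


def write_osync(path, data):
    """Write through a descriptor opened with O_SYNC (synced on each write)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC, 0o600)
    try:
        return os.write(fd, data)
    finally:
        os.close(fd)


def write_then_fsync(path, data):
    """Write normally, then request the sync explicitly with fsync()."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = os.write(fd, data)
        os.fsync(fd)
        return written
    finally:
        os.close(fd)


def demo():
    """Exercise both variants in a temporary directory and read the data back."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path_a = os.path.join(tmpdir, "osync.bin")
        path_b = os.path.join(tmpdir, "fsync.bin")
        write_osync(path_a, b"hello")
        write_then_fsync(path_b, b"world")
        with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
            return fa.read(), fb.read()
```

Both variants end with the data on stable storage; the difference is whether the application or the kernel's write path triggers the sync.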
    • vfs: Rename generic_file_aio_write_nolock · eef99380
      Committed by Christoph Hellwig
      generic_file_aio_write_nolock() is now used only by block devices and the raw
      character device. Filesystems should use __generic_file_aio_write() in case
      generic_file_aio_write() doesn't suit them. So rename the function to
      blkdev_aio_write() and move it to fs/block_dev.c.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
    • ocfs2: Use __generic_file_aio_write instead of generic_file_aio_write_nolock · 918941a3
      Committed by Jan Kara
      Use the new helper. We have to submit data pages ourselves in case of an O_SYNC
      write because __generic_file_aio_write does not do it for us. OCFS2 developers
      might think about moving the sync out of i_mutex, which seems to be easily
      possible, but that's out of scope of this patch.
      
      CC: ocfs2-devel@oss.oracle.com
      Acked-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
    • fs/Kconfig: move nilfs2 outside misc filesystems · 41f4db0f
      Committed by Ryusuke Konishi
      Some people asked me questions like the following:
      
      On Wed, 15 Jul 2009 13:11:21 +0200, Leon Woestenberg wrote:
      > just wondering, any reasons why NILFS2 is one of the miscellaneous
      > filesystems and, for example, btrfs, is not in Kconfig?
      
      Actually, nilfs is NOT a filesystem that came from other operating
      systems, but a filesystem created purely for Linux.  Nor is it a flash
      filesystem; it is meant for generic block devices.
      
      So, this moves nilfs outside the misc category as I responded in LKML
      "Re: Why does NILFS2 hide under Miscellaneous filesystems?"
      (Message-Id: <20090716.002526.93465395.ryusuke@osrg.net>).
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: convert nilfs_bmap_lookup to an inline function · 0f3fe33b
      Committed by Ryusuke Konishi
      nilfs_bmap_lookup() is now a wrapper function of
      nilfs_bmap_lookup_at_level().
      
      This moves nilfs_bmap_lookup() to a header file, converting it to
      an inline function, which gives an opportunity for optimization.
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: allow btree code to directly call dat operations · 2e0c2c73
      Committed by Ryusuke Konishi
      The current btree code is written so that btree functions call dat
      operations via wrapper functions in bmap.c when they allocate, free,
      or modify virtual block addresses.
      
      This abstraction requires additional function calls and causes
      frequent calls of the nilfs_bmap_get_dat() function since it is used in
      every wrapper function.
      
      This removes the wrapper functions and makes the dat operations available
      from btree.c and direct.c, which will increase the opportunity for
      compiler optimization.
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: add update functions of virtual block address to dat · bd8169ef
      Committed by Ryusuke Konishi
      This is a preparation for the subsequent cleanup ("nilfs2: allow btree
      to directly call dat operations").
      
      This adds functions bundling a few operations to change an entry of
      virtual block address on the dat file.
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: remove individual gfp constants for each metadata file · 7a102b09
      Committed by Ryusuke Konishi
      This gets rid of NILFS_CPFILE_GFP, NILFS_SUFILE_GFP, NILFS_DAT_GFP,
      and NILFS_IFILE_GFP.  All of these constants refer to NILFS_MDT_GFP,
      and can be removed.
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: stop zero-fill of btree path just before free it · 3218929d
      Committed by Ryusuke Konishi
      The btree path object is cleared just before it is freed.
      
      This removes the code doing the unnecessary clear operation.
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: remove unused btree argument from btree functions · 6d28f7ea
      Committed by Ryusuke Konishi
      Even though many btree functions take a btree object as their first
      argument, most of these functions never use it.
      
      This habitual use of the btree argument hurts code readability and
      may lead to inefficient code generation.
      
      So, this removes the unnecessary btree arguments.
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: remove nilfs_dat_abort_start and nilfs_dat_abort_free · 9ead9863
      Committed by Ryusuke Konishi
      These functions are not called from anywhere.
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: shorten freeze period due to GC in write operation v3 · 1cf58fa8
      Committed by Jiro SEKIBA
      This is a re-revised patch to shorten the freeze period.
      This version includes a fix for the bug Konishi-san mentioned last time.
      
      When GC is running, it moves live blocks to different segments.
      Copying live blocks into memory is done in a transaction,
      however it does not need to be in the transaction.
      This patch moves nilfs_ioctl_move_blocks() out of the
      transaction lock and puts it before the transaction.
      
      I ran the sysbench fileio test against a nilfs partition.
      I copied some DVD/CD images and created a snapshot to create live blocks
      before starting the benchmark.
      
      The following is a summary of the per-request statistics (min/max and
      avg) for rc8 and for rc8 with the patch.  I ran each test three times;
      below is the average of those numbers.
      
      According to this benchmark result, the average time is slightly degraded.
      However, the worst-case (max) result is significantly improved.
      This can address write freezes of a few seconds.
      
      - random write per-request performance of rc8
       min   0.843ms
       max 680.406ms
       avg   3.050ms
      - random write per-request performance of rc8 w/ this patch
       min   0.843ms -> 100.00%
       max 380.490ms ->  55.90%
       avg   3.233ms -> 106.00%
      
      - sequential write per-request performance of rc8
       min   0.736ms
       max 774.343ms
       avg   2.883ms
      - sequential write per-request performance of rc8 w/ this patch
       min   0.720ms ->  97.80%
       max 644.280ms ->  83.20%
       avg   3.130ms -> 108.50%
      
      -----8<-----8<-----nilfs_cleanerd.conf-----8<-----8<-----
      protection_period       150
      selection_policy        timestamp       # timestamp in ascend order
      nsegments_per_clean     2
      cleaning_interval       2
      retry_interval          60
      use_mmap
      log_priority            info
      -----8<-----8<-----nilfs_cleanerd.conf-----8<-----8<-----
      Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
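The percentages in the benchmark summary above are the patched value relative to the rc8 value. A small sketch that recomputes them from the listed timings (the dictionary layout and function names are illustrative assumptions; since the inputs are themselves rounded averages, the last digit may differ slightly from the listed figures):

```python
def ratio_percent(before_ms, after_ms):
    """Patched time as a percentage of the baseline time, to one decimal."""
    return round(after_ms / before_ms * 100.0, 1)


# (rc8, rc8 with patch) per-request times in milliseconds, from the message.
random_write = {
    "min": (0.843, 0.843),
    "max": (680.406, 380.490),
    "avg": (3.050, 3.233),
}
sequential_write = {
    "min": (0.736, 0.720),
    "max": (774.343, 644.280),
    "avg": (2.883, 3.130),
}


def summarize(results):
    """Map each statistic to its patched/baseline percentage."""
    return {name: ratio_percent(before, after)
            for name, (before, after) in results.items()}


print("random write:    ", summarize(random_write))
print("sequential write:", summarize(sequential_write))
```

The worst-case (max) ratios are the interesting ones here: values well below 100% correspond to the shorter write freeze the patch aims for.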
    • nilfs2: add more check routines in mount process · 43be0ec0
      Committed by Zhu Yanhai
      nilfs2: Add more safeguard routines and protections in the mount process,
      which also makes nilfs2 report consistency error messages when the
      checkpoint number is invalid.
      Signed-off-by: Zhu Yanhai <zhu.yanhai@gmail.com>
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: An unassigned variable is assigned to a never used structure member · a4f0b9c5
      Committed by Zhang Qiang
      nilfs2: In procedure 'nilfs_get_sb()', when a nilfs filesystem is
      mounted for the first time, local variable 'nilfs->ns_last_cno' is
      used before loading the latest checkpoint number from disk (in
      'nilfs_fill_super'). 'nilfs->ns_last_cno' is assigned to 'sd.cno', but
      'sd.cno' has never been used in the procedure.
      Signed-off-by: Zhang Qiang <zhangqiang.buaa@gmail.com>
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: use GFP_NOIO for bio_alloc instead of GFP_NOWAIT · c1b353f0
      Committed by Ryusuke Konishi
      Alberto Bertogli advised me about bio_alloc() use in nilfs:
      On Sat, 13 Jun 2009 22:52:40 -0300, Alberto Bertogli wrote:
      > By the way, those bio_alloc()s are using GFP_NOWAIT but it looks
      > like they could use at least GFP_NOIO or GFP_NOFS, since the caller
      > can (and sometimes do) sleep. The only caller is nilfs_submit_bh(),
      > which calls nilfs_submit_seg_bio() which can sleep calling
      > wait_for_completion().
      
      This follows that advice and replaces the use of the GFP_NOWAIT flag with
      GFP_NOIO.
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: stop using periodic write_super callback · 1dfa2710
      Committed by Jiro SEKIBA
      This removes nilfs_write_super and commits the super block in a nilfs
      internal thread instead of the periodic write_super callback.
      
      The VFS layer calls the ->write_super callback periodically.  However,
      it looks like the callback is omitted when disk I/O is busy.
      And when cleanerd (nilfs GC) is running, disk I/O tends to be busy, thus
      the nilfs superblock is not synchronized as nilfs is designed to do.
      
      To avoid this, sync the superblock from a nilfs thread instead of pdflush.
      Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: clean up nilfs_write_super · 79efdd94
      Committed by Jiro SEKIBA
      Separate the conditions that check whether syncing the super block and the
      alternative super block is required into inline functions, so that the
      conditions can be reused.
      Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: fix disorder of nilfs_write_super in nilfs_sync_fs · 6233caa9
      Committed by Jiro SEKIBA
      This fixes the misordering of nilfs_write_super in nilfs_sync_fs.  Committing
      the super block must come at the end of the function so that all changes are
      reflected.
      
      ->sync_fs() is not called frequently, so this makes nilfs_sync_fs call
      nilfs_commit_super instead of nilfs_write_super.
      Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: remove redundant super block commit · ec5d66ab
      Committed by Jiro SEKIBA
      This removes a redundant super block commit.
      
      nilfs_write_super calls nilfs_commit_super to store the super block
      on the block device.  However, nilfs_put_super calls
      nilfs_commit_super right after calling nilfs_write_super.  So calling
      nilfs_write_super in nilfs_put_super is redundant.
      Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: implement nilfs_show_options to display mount options in /proc/mounts · b58a285b
      Committed by Jiro SEKIBA
      This is a patch to display mount options in procfs.
      Mount options will show up in /proc/mounts as they do for other filesystems.
      
      ...
      /dev/sda6 /mnt nilfs2 ro,relatime,barrier=off,cp=3,order=strict 0 0
      ...
      Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
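A tool consuming the /proc/mounts line shown in the commit above might split it as follows. This is a minimal sketch: `parse_mounts_line` is a hypothetical helper, and it assumes the usual six whitespace-separated fields with comma-separated options.

```python
def parse_mounts_line(line):
    """Split one /proc/mounts line into its fields and an options dict."""
    device, mount_point, fstype, options, dump, passno = line.split()
    parsed = {}
    for item in options.split(","):
        key, sep, value = item.partition("=")
        # Flags such as "ro" carry no value; record them as True.
        parsed[key] = value if sep else True
    return {
        "device": device,
        "mount_point": mount_point,
        "fstype": fstype,
        "options": parsed,
        "dump": int(dump),
        "passno": int(passno),
    }


sample = "/dev/sda6 /mnt nilfs2 ro,relatime,barrier=off,cp=3,order=strict 0 0"
entry = parse_mounts_line(sample)
print(entry["fstype"], entry["options"])
```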
    • nilfs2: always lookup disk block address before reading metadata block · 14351104
      Committed by Ryusuke Konishi
      The current metadata file code skips disk address lookup for its data
      block if the buffer has a mapped flag.
      
      This risks a read request being performed against a stale block
      address that GC has moved, which may lead to metadata
      corruption.  The mapped flag is safe if the buffer has an
      uptodate flag, otherwise it may prevent the necessary update of the disk
      address in the next read.
      
      This avoids the potential problem by ensuring disk address lookup
      before reading a metadata block even for buffers with the mapped flag.
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: use semaphore to protect pointer to a writable FS-instance · 027d6404
      Committed by Ryusuke Konishi
      This will get rid of the nilfs_get_writer() and nilfs_put_writer() pair used
      to retain a writable FS-instance for a period.
      
      The pair of functions made up a kind of recursive lock with a
      mutex, but they became overkill since the commit
      201913ed.  Furthermore, they caused
      the following lockdep warning because the mutex can be released by a
      task which didn't lock it:
      
       =====================================
       [ BUG: bad unlock balance detected! ]
       -------------------------------------
       kswapd0/422 is trying to release lock (&nilfs->ns_writer_mutex) at:
       [<c1359ff5>] mutex_unlock+0x8/0xa
       but there are no more locks to release!
      
       other info that might help us debug this:
       no locks held by kswapd0/422.
      
       stack backtrace:
       Pid: 422, comm: kswapd0 Not tainted 2.6.31-rc4-nilfs #51
       Call Trace:
        [<c1358f97>] ? printk+0xf/0x18
        [<c104fea7>] print_unlock_inbalance_bug+0xcc/0xd7
        [<c11578de>] ? prop_put_global+0x3/0x35
        [<c1050195>] lock_release+0xed/0x1dc
        [<c1359ff5>] ? mutex_unlock+0x8/0xa
        [<c1359f83>] __mutex_unlock_slowpath+0xaf/0x119
        [<c1359ff5>] mutex_unlock+0x8/0xa
        [<d1284add>] nilfs_mdt_write_page+0xd8/0xe1 [nilfs2]
        [<c1092653>] shrink_page_list+0x379/0x68d
        [<c109171b>] ? isolate_pages_global+0xb4/0x18c
        [<c1092bd2>] shrink_list+0x26b/0x54b
        [<c10930be>] shrink_zone+0x20c/0x2a2
        [<c10936b7>] kswapd+0x407/0x591
        [<c1091667>] ? isolate_pages_global+0x0/0x18c
        [<c1040603>] ? autoremove_wake_function+0x0/0x33
        [<c10932b0>] ? kswapd+0x0/0x591
        [<c104033b>] kthread+0x69/0x6e
        [<c10402d2>] ? kthread+0x0/0x6e
        [<c1003e33>] kernel_thread_helper+0x7/0x1a
      
      This patch uses a reader/writer semaphore instead of its own lock and
      kills this warning.
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
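The cause of the warning above, a lock released by a task that never took it, has a loose user-space analogy. The sketch below is only that analogy, not the nilfs2 code: it uses Python's owner-tracking `threading.RLock`, and `release_from_other_thread` is a hypothetical helper.

```python
import threading


def release_from_other_thread(lock):
    """Acquire in this thread, try to release from another thread.

    Returns the exception raised by the foreign release, or None if the
    release was accepted.
    """
    result = {}

    def worker():
        try:
            lock.release()
            result["error"] = None
        except RuntimeError as exc:
            result["error"] = exc

    lock.acquire()
    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result["error"]


# An owner-tracking lock rejects the release from a non-owning thread.
error = release_from_other_thread(threading.RLock())
print(type(error).__name__)
```

The owner check plays the role lockdep plays in the trace: the unlock is flagged because the releasing context holds no such lock.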
    • nilfs2: fix format string compile warning (ino_t) · b5696e5e
      Committed by Heiko Carstens
      Unlike on most other architectures, ino_t is an unsigned int on s390.
      So add an explicit cast to avoid this compile warning:
      
      fs/nilfs2/recovery.c: In function 'recover_dsync_blocks':
      fs/nilfs2/recovery.c:555: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: fix ignored error code in __nilfs_read_inode() · 1b2f5a64
      Committed by Ryusuke Konishi
      The __nilfs_read_inode function ignores the error code returned
      from nilfs_read_inode_common(), and wrongly returns a success code
      (zero) when it exits from the function in error cases.
      
      This adds the missing error handling.
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • GFS2: Whitespace fixes · 86d00636
      Committed by Steven Whitehouse
      Reported-by: Daniel Walker <dwalker@fifo99.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • block: use blkdev_issue_discard in blk_ioctl_discard · 746cd1e7
      Committed by Christoph Hellwig
      blk_ioctl_discard duplicates large amounts of code from blkdev_issue_discard;
      the only differences between the two are that blkdev_issue_discard needs to
      send a barrier discard request and blk_ioctl_discard a non-barrier one,
      and that blk_ioctl_discard needs to wait on the request.  To facilitate this,
      add a flags argument to blkdev_issue_discard to control both aspects of the
      behaviour.  This will be very useful later on for using the waiting
      functionality for other callers.
      
      Based on an earlier patch from Matthew Wilcox <matthew@wil.cx>.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
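The idea of one function whose flags argument selects barrier and wait behaviour can be sketched as a toy model. The flag names `WAIT` and `BARRIER`, the `DiscardFlags` class and all functions below are hypothetical illustrations; they do not reproduce the kernel's actual definitions, and no I/O is performed.

```python
from enum import IntFlag


class DiscardFlags(IntFlag):
    """Hypothetical flag names for the two aspects the message describes."""
    WAIT = 1      # caller wants to wait for the request to complete
    BARRIER = 2   # request should be issued as a barrier


def issue_discard(sector, nr_sects, flags):
    """Toy model: describe the request that would be issued."""
    return {
        "sector": sector,
        "nr_sects": nr_sects,
        "barrier": bool(flags & DiscardFlags.BARRIER),
        "waited": bool(flags & DiscardFlags.WAIT),
    }


def filesystem_discard(sector, nr_sects):
    # Existing callers: barrier discard, no waiting.
    return issue_discard(sector, nr_sects, DiscardFlags.BARRIER)


def ioctl_discard(sector, nr_sects):
    # ioctl path: non-barrier discard that waits on the request.
    return issue_discard(sector, nr_sects, DiscardFlags.WAIT)


print(filesystem_discard(0, 8))
print(ioctl_discard(0, 8))
```

With both aspects behind flags, the ioctl path no longer needs its own copy of the request-building code.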
    • Seperate read and write statistics of in_flight requests · a9327cac
      Committed by Nikanth Karthikesan
      Currently, there is a single in_flight counter measuring the number of
      requests in the request_queue. But some monitoring tools would like to
      know how many read requests and write requests are in progress. Split the
      current in_flight counter into two separate counters for read and write.
      
      This information is exported as a sysfs attribute, as changing the
      currently available stat files would break the existing tools.
      Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
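Splitting one in-flight counter into a read and a write counter can be modelled in a few lines. This is a sketch only: the `InFlight` class, its method names and the two-column text format in `show()` are assumptions for illustration, not the kernel's data structure or its exact sysfs output.

```python
READ, WRITE = 0, 1


class InFlight:
    """Toy per-direction in-flight request counters."""

    def __init__(self):
        self.counts = [0, 0]  # indexed by READ / WRITE

    def start(self, direction):
        self.counts[direction] += 1

    def finish(self, direction):
        self.counts[direction] -= 1

    def total(self):
        # The old single counter is the sum of the two new ones.
        return sum(self.counts)

    def show(self):
        # Assumed export format: read count, then write count.
        return f"{self.counts[READ]:8d} {self.counts[WRITE]:8d}\n"


stats = InFlight()
stats.start(READ)
stats.start(WRITE)
stats.start(WRITE)
stats.finish(READ)
print(stats.show(), end="")
```

Tools that only need the old combined number can still derive it with `total()`, while new tools read the two columns separately.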