1. 04 Jul, 2013 (28 commits)
  2. 03 Jul, 2013 (3 commits)
    • ext4: ->tmpfile() support · af51a2ac
      Committed by Al Viro
      very similar to ext3 counterpart...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • vfs: export lseek_execute() to modules · 46a1c2c7
      Committed by Jie Liu
      For those file systems (btrfs/ext4/ocfs2/tmpfs) that support the
      SEEK_DATA/SEEK_HOLE operations, we end up handling the same task
      in lseek_execute(): updating the current file offset to the
      desired offset if it is valid. ceph also does something similar
      in ceph_llseek().
      
      To reduce the duplication, this patch makes lseek_execute()
      publicly accessible so that we can call it directly from the
      underlying file systems.
      
      Thanks to Dave Chinner for this suggestion.
      
      [AV: call it vfs_setpos(), don't bring the removed 'inode' argument back]
      
      v2->v1:
      - Add kernel-doc comments for lseek_execute()
      - Call lseek_execute() in ceph->llseek()
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Cc: Dave Chinner <dchinner@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Chris Mason <chris.mason@fusionio.com>
      Cc: Josef Bacik <jbacik@fusionio.com>
      Cc: Ben Myers <bpm@sgi.com>
      Cc: Ted Tso <tytso@mit.edu>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Sage Weil <sage@inktank.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
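      As a rough illustration of how a filesystem might use the newly exported
      helper (a sketch, not code from this patch; myfs_llseek() and
      myfs_seek_data_hole() are invented placeholders, and vfs_setpos() is the
      name given to the helper in the [AV:] note above):

      static loff_t myfs_llseek(struct file *file, loff_t offset, int whence)
      {
              struct inode *inode = file_inode(file);
              loff_t maxsize = inode->i_sb->s_maxbytes;

              switch (whence) {
              case SEEK_DATA:
              case SEEK_HOLE:
                      /* fs-specific lookup of the next data/hole boundary */
                      offset = myfs_seek_data_hole(inode, offset, whence);
                      if (offset < 0)
                              return offset;
                      /* shared offset validation and f_pos update in the VFS */
                      return vfs_setpos(file, offset, maxsize);
              default:
                      return generic_file_llseek(file, offset, whence);
              }
      }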
    • sync: don't block the flusher thread waiting on IO · 7747bd4b
      Committed by Dave Chinner
      When sync does its WB_SYNC_ALL writeback, it issues data IO and
      then immediately waits for IO completion. This is done in the
      context of the flusher thread, and hence completely ties up the
      flusher thread for the backing device until all the dirty inodes
      have been synced. On filesystems that are dirtying inodes constantly
      and quickly, this means the flusher thread can be tied up for
      minutes per sync call and hence badly affect system level write IO
      performance as the page cache cannot be cleaned quickly.
      
      We already have a wait loop for IO completion for sync(2), so cut
      this out of the flusher thread and delegate it to wait_sb_inodes().
      Hence we can do rapid IO submission, and then wait for it all to
      complete.
      
      Effect of sync on fsmark before the patch:
      
      FSUse%        Count         Size    Files/sec     App Overhead
      .....
           0       640000         4096      35154.6          1026984
           0       720000         4096      36740.3          1023844
           0       800000         4096      36184.6           916599
           0       880000         4096       1282.7          1054367
           0       960000         4096       3951.3           918773
           0      1040000         4096      40646.2           996448
           0      1120000         4096      43610.1           895647
           0      1200000         4096      40333.1           921048
      
      And a single sync pass took:
      
        real    0m52.407s
        user    0m0.000s
        sys     0m0.090s
      
      After the patch, there is no impact on fsmark results, and each
      individual sync(2) operation run concurrently with the same fsmark
      workload takes roughly 7s:
      
        real    0m6.930s
        user    0m0.000s
        sys     0m0.039s
      
      IOWs, sync is 7-8x faster on a busy filesystem and does not have an
      adverse impact on ongoing async data write operations.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
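      A minimal sketch of the resulting split (sync_fs_data() and
      queue_sync_writeback() are invented names; wait_sb_inodes() is the wait
      loop mentioned above), assuming the sync(2) caller rather than the
      flusher thread does the waiting:

      static void sync_fs_data(struct super_block *sb)
      {
              /* the flusher thread only submits the WB_SYNC_ALL IO and returns */
              queue_sync_writeback(sb);

              /* the sync(2) caller blocks here once for all the submitted IO */
              wait_sb_inodes(sb);
      }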
  3. 02 Jul, 2013 (7 commits)
    • f2fs: fix to recover i_size from roll-forward · a1dd3c13
      Committed by Jaegeuk Kim
      If a user issues many data writes together with fsync, the last updated i_size
      should be stored to the inode block consistently.
      
      But the previous write_end just marks the inode as dirty and doesn't write its
      metadata back into its inode block.
      After that, fsync writes the inode block with only the newly updated data index,
      excluding the inode metadata updates.
      
      So this patch introduces a write_end that also updates the inode block when
      i_size has changed.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
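      A sketch of the shape of such a write_end (myfs_write_end() and
      f2fs_update_inode_page() are illustrative names assumed from the
      description, not necessarily the exact helpers in the patch):

      static int myfs_write_end(struct file *file, struct address_space *mapping,
                                loff_t pos, unsigned len, unsigned copied,
                                struct page *page, void *fsdata)
      {
              struct inode *inode = mapping->host;

              SetPageUptodate(page);
              set_page_dirty(page);

              if (pos + copied > i_size_read(inode)) {
                      i_size_write(inode, pos + copied);
                      mark_inode_dirty(inode);
                      /* also push the new i_size into the on-disk inode block */
                      f2fs_update_inode_page(inode);  /* assumed helper */
              }

              unlock_page(page);
              page_cache_release(page);
              return copied;
      }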
    • f2fs: remove the unused argument "sbi" of func destroy_fsync_dnodes() · 5ebefc5b
      Committed by Gu Zheng
      destroy_fsync_dnodes() is a simple list-cleanup function, so remove its
      unused and unrelated f2fs_sb_info argument.
      Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
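      The change is just an argument drop; roughly (prototypes sketched from the
      description, not copied from the diff):

      /* before: sbi was passed but never used */
      static void destroy_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head);

      /* after: the list head is all the cleanup needs */
      static void destroy_fsync_dnodes(struct list_head *head);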
    • f2fs: remove reusing any prefree segments · 763bfe1b
      Committed by Jaegeuk Kim
      This patch removes check_prefree_segments, which was initially designed to
      enhance performance by narrowing the range of LBA usage across the whole
      block device.
      
      When allocating a new segment, the previous f2fs tries to find suitable
      prefree segments and, if it finds one, reuses that segment for further
      data or node block allocation.
      
      However, I found that this was a totally wrong approach, since the prefree
      segments contain data or node blocks that will be used by the roll-forward
      mechanism run after a sudden power-off.
      
      Let's assume the following scenario.
      
      /* write 8MB with fsync: 2048 sequential 4KB writes, each followed by fsync */
      for (i = 0; i < 2048; i++) {
      	offset = i * 4096;
      	pwrite(fd, buf, 4096, offset);	/* buf: a 4KB data buffer */
      	fsync(fd);
      }
      
      In this case, a naive segment allocation sequence will look like:
       data segment: x, x+1, x+2, x+3
       node segment: y, y+1, y+2, y+3.
      
      But if we can reuse prefree segments, the sequence can instead become:
       data segment: x, x+1, y, y+1
       node segment: y, y+1, y+2, y+3.
      This is because y, y+1, and y+2 become prefree segments one by one and are
      then reused by data allocation.
      
      After running this workload, we need to consider how to recover the latest
      inode with its data.
      If we reuse prefree segments such as y or y+1, we lose the old node blocks,
      so f2fs cannot even start roll-forward recovery.
      
      Therefore, I suggest that we stop reusing prefree segments.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: code cleanup and simplify in func {find/add}_gc_inode · 6cc4af56
      Committed by Gu Zheng
      This patch simplifies list operations in find_gc_inode and add_gc_inode.
      Just simple code cleanup.
      Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      [Jaegeuk Kim: add description]
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
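      For illustration, the kind of simplification this describes (a sketch that
      assumes an inode_entry node with 'list' and 'inode' members, not the
      verbatim result of the patch):

      static struct inode *find_gc_inode(nid_t ino, struct list_head *ilist)
      {
              struct inode_entry *ie;

              /* one list_for_each_entry() walk instead of open-coded list ops */
              list_for_each_entry(ie, ilist, list)
                      if (ie->inode->i_ino == ino)
                              return ie->inode;
              return NULL;
      }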
    • f2fs: optimize the init_dirty_segmap function · 8736fbf0
      Committed by Namjae Jeon
      Optimize the while loop condition
      
      The loop condition is always true, because the loop is actually terminated
      by the following check inside the loop body:
      
      if (segno >= TOTAL_SEGS(sbi))
          break;
      Hence we can replace the while loop condition with while (1) instead of
      checking on every iteration that segno is less than the total segment count.
      
      Also, we do not need to call TOTAL_SEGS() every time; we can store its
      value in a local variable since it is constant.
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
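      The shape of the change, sketched with simplified names (find_next_dirty()
      is an invented stand-in for the real lookup; TOTAL_SEGS(sbi) is the
      constant being cached):

      unsigned int segno = 0, offset = 0;
      unsigned int total_segs = TOTAL_SEGS(sbi);      /* computed once */

      while (1) {                             /* was: while (segno < TOTAL_SEGS(sbi)) */
              segno = find_next_dirty(sbi, total_segs, offset);
              if (segno >= total_segs)
                      break;                  /* the real termination condition */
              offset = segno + 1;
              /* ... mark segno dirty ... */
      }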
    • f2fs: fix an endian conversion bug detected by sparse · 060dd67b
      Committed by Jaegeuk Kim
      This patch should fix the following bug reported by kbuild test robot.
      
      fs/f2fs/recovery.c:233:33: sparse: incorrect type in assignment
      (different base types)
      
      sparse warnings: (new ones prefixed by >>)
      
      >> recovery.c:233: sparse: incorrect type in assignment (different base types)
         recovery.c:233:    expected unsigned int [unsigned] [assigned] ofs_in_node
         recovery.c:233:    got restricted __le16 [assigned] [usertype] ofs_in_node
      >> recovery.c:238: sparse: incorrect type in assignment (different base types)
         recovery.c:238:    expected unsigned int [unsigned] ofs_in_node
         recovery.c:238:    got restricted __le16 [assigned] [usertype] ofs_in_node
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
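      The class of fix sparse is asking for, in simplified form (the 'sum' entry
      with a __le16 ofs_in_node field is assumed for illustration; the real
      lines are in fs/f2fs/recovery.c):

      unsigned int ofs_in_node;

      /* buggy: assigns the on-disk __le16 straight into a CPU-endian int */
      /* ofs_in_node = sum.ofs_in_node; */

      /* fixed: convert explicitly from little-endian to host byte order */
      ofs_in_node = le16_to_cpu(sum.ofs_in_node);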
    • f2fs: fix crc endian conversion · 7e586fa0
      Committed by Jaegeuk Kim
      While calculating CRC for the checkpoint block, we use __u32, but when storing
      the crc value to the disk, we use __le32.
      
      Let's fix the inconsistency.
      Reported-and-Tested-by: Oded Gabbay <ogabbay@advaoptical.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
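      Roughly what the consistent version looks like (a sketch; f2fs_crc32() and
      the checkpoint layout are assumed from the description, not quoted from
      the patch):

      __u32 crc = f2fs_crc32(ckpt, crc_offset);       /* host-endian while computing */

      /* convert once, when the value is laid down in the on-disk checkpoint block */
      *((__le32 *)((unsigned char *)ckpt + crc_offset)) = cpu_to_le32(crc);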
  4. 01 Jul, 2013 (2 commits)
    • ext4: optimize starting extent in ext4_ext_rm_leaf() · 6ae06ff5
      Committed by Ashish Sangwan
      Both hole punch and truncate use ext4_ext_rm_leaf() for removing
      blocks.  Currently we choose the last extent as the starting
      point for removing blocks:
      
      	ex = EXT_LAST_EXTENT(eh);
      
      This is OK for truncate, but for hole punch we can optimize the extent
      selection because the path is already initialized.  We can use this
      information to select a proper starting extent.  The code change in this
      patch will not affect truncate, as for truncate path[depth].p_ext will
      always be NULL.
      Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
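      The selection logic this describes, roughly (a sketch of the idea, not the
      verbatim hunk):

      if (path[depth].p_ext)
              ex = path[depth].p_ext;         /* hole punch: path already points here */
      else
              ex = EXT_LAST_EXTENT(eh);       /* truncate: p_ext is NULL, old behaviour */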
    • jbd2: invalidate handle if jbd2_journal_restart() fails · 41a5b913
      Committed by Theodore Ts'o
      If jbd2_journal_restart() fails, the handle will have been disconnected
      from the current transaction.  In this situation, the handle must not
      be used for any jbd2 function other than jbd2_journal_stop().
      Enforce this by treating a handle which has a NULL transaction
      pointer as an aborted handle, and issue a kernel warning if
      jbd2_journal_extend(), jbd2_journal_get_write_access(),
      jbd2_journal_dirty_metadata(), etc. is called with an invalid handle.
      
      This commit also fixes a bug where jbd2_journal_stop() would trip over
      a kernel jbd2 assertion check when trying to free an invalid handle.
      
      Also move the responsibility of setting current->journal_info to
      start_this_handle(), simplifying the three users of this function.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Reported-by: Younger Liu <younger.liu@huawei.com>
      Cc: Jan Kara <jack@suse.cz>
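      A simplified sketch of the enforcement (handle_is_valid() is an invented
      name; the real checks live inside the jbd2 helpers listed above):

      static int handle_is_valid(handle_t *handle)
      {
              /*
               * A NULL h_transaction means jbd2_journal_restart() failed and
               * the handle may only be passed to jbd2_journal_stop().
               */
              if (WARN_ON_ONCE(handle->h_transaction == NULL))
                      return 0;
              return 1;
      }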