1. 28 Oct, 2010 1 commit
    • ext4: improve llseek error handling for overly large seek offsets · e0d10bfa
      Toshiyuki Okajima committed
      The llseek system call should return EINVAL if passed a seek offset
      beyond the maximum offset that can be written without error.  What
      this maximum offset should be depends on whether the huge_file file
      system feature is set and whether the file is extent-based.
      
      
      If the file has no "EXT4_EXTENTS_FL" flag, the maximum size which can be
      written (write system call) is different from the maximum size which can be
      sought (lseek system call).
      
      For example, the following 2 cases demonstrate the differences
      between the maximum size which can be written, versus the seek offset
      allowed by the llseek system call:
      
      #1: mkfs.ext3 <dev>; mount -t ext4 <dev>
      #2: mkfs.ext3 <dev>; tune2fs -Oextent,huge_file <dev>; mount -t ext4 <dev>
      
      Table. Maximum file size (in bytes) that can be written or sought
             for each file system feature setting and file flag
      +============+===============================+===============================+
      | \ File flag|                               |                               |
      |      \     |     !EXT4_EXTENTS_FL          |        EXT4_EXTENTS_FL        |
      |case       \|                               |                               |
      +------------+-------------------------------+-------------------------------+
      | #1         |   write:      2194719883264   | write:       --------------   |
      |            |   seek:       2199023251456   | seek:        --------------   |
      +------------+-------------------------------+-------------------------------+
      | #2         |   write:      4402345721856   | write:       17592186044415   |
      |            |   seek:      17592186044415   | seek:        17592186044415   |
      +------------+-------------------------------+-------------------------------+
      
      The differences exist because ext4 has two maxbytes values:
      sb->s_maxbytes (the extent-mapped maximum) and
      EXT4_SB(sb)->s_bitmap_maxbytes (the block-mapped maximum).  However,
      generic_file_llseek uses only the extent-mapped value.  (The llseek
      method in ext4_file_operations is generic_file_llseek, which uses
      sb->s_maxbytes.)
      
      Therefore we create an ext4-specific llseek function which uses both
      maxbytes values.

      The new function is derived from generic_file_llseek().  If the file
      flag "EXT4_EXTENTS_FL" is not set, the function uses
      EXT4_SB(inode->i_sb)->s_bitmap_maxbytes as the limit in place of
      inode->i_sb->s_maxbytes.
      Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
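The limit selection described in this commit can be sketched in user-space C. This is a minimal illustration, not the kernel function: `struct sketch_sb`, `sketch_maxbytes` and `sketch_check_seek` are hypothetical names, and the sample limits assume that the "write" value in row #2 of the table equals s_bitmap_maxbytes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two per-superblock limits named above. */
struct sketch_sb {
    int64_t s_maxbytes;        /* limit for extent-mapped files */
    int64_t s_bitmap_maxbytes; /* limit for block-mapped files */
};

/* Sample limits from row #2 of the table (assumption: the "write" value
 * for block-mapped files is s_bitmap_maxbytes). */
static const struct sketch_sb sketch_case2 = {
    .s_maxbytes = 17592186044415LL,
    .s_bitmap_maxbytes = 4402345721856LL,
};

/* Choose the limit that applies to one file. */
static int64_t sketch_maxbytes(const struct sketch_sb *sb, bool extent_mapped)
{
    assert(sb != NULL);
    return extent_mapped ? sb->s_maxbytes : sb->s_bitmap_maxbytes;
}

/* Validate an absolute seek target: return it, or -EINVAL if out of range. */
static int64_t sketch_check_seek(const struct sketch_sb *sb, bool extent_mapped,
                                 int64_t offset)
{
    if (offset < 0 || offset > sketch_maxbytes(sb, extent_mapped))
        return -EINVAL;
    return offset;
}
```

With these sample limits, an offset one byte past 4402345721856 is rejected for a block-mapped file but still accepted for an extent-mapped one.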
  2. 27 Jul, 2010 1 commit
  3. 12 Jun, 2010 1 commit
    • ext4: Clean up s_dirt handling · a0375156
      Theodore Ts'o committed
      We don't need to set s_dirt in most of the ext4 code when journaling
      is enabled.  In ext3/4 some of the summary statistics for # of free
      inodes, blocks, and directories are calculated from the per-block
      group statistics when the file system is mounted or unmounted.  As a
      result the superblock doesn't have to be updated, either via the
      journal or by setting s_dirt.  There are a few exceptions, most
      notably when resizing the file system, where the superblock needs to
      be modified --- and in that case it should be done as a journalled
      operation if possible, and s_dirt set only in no-journal mode.
      
      This patch will optimize out some unneeded disk writes when using ext4
      with a journal.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  4. 17 May, 2010 1 commit
  5. 05 Mar, 2010 2 commits
    • dquot: cleanup dquot initialize routine · 871a2931
      Christoph Hellwig committed
      Get rid of the initialize dquot operation - it is now always called from
      the filesystem and if a filesystem really needs its own (which none
      currently does) it can just call into its own routine directly.
      
      Rename the now static low-level dquot_initialize helper to __dquot_initialize
      and vfs_dq_init to dquot_initialize to have a consistent namespace.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
    • dquot: move dquot initialization responsibility into the filesystem · 907f4554
      Christoph Hellwig committed
      Currently various places in the VFS call vfs_dq_init directly.  This means
      we tie the quota code into the VFS.  Get rid of that and make the
      filesystem responsible for the initialization.  For most metadata operations
      this is a straightforward move into the methods, but for truncate and
      open it's a bit more complicated.
      
      For truncate we currently only call vfs_dq_init for the sys_truncate case
      because open already takes care of it for ftruncate and open(O_TRUNC) - the
      new code causes an additional vfs_dq_init for those which is harmless.
      
      For open the initialization is moved from do_filp_open into the open method,
      which means it happens slightly earlier now, and only for regular files.
      The latter is fine because we don't need to initialize it for operations
      on special files, and we already do it as part of the namespace operations
      for directories.
      
      Add a dquot_file_open helper that filesystems that support generic quotas
      can use to fill in ->open.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
  6. 04 Mar, 2010 1 commit
  7. 25 Jan, 2010 1 commit
    • ext4: Use bitops to read/modify EXT4_I(inode)->i_state · 19f5fb7a
      Theodore Ts'o committed
      At several places we modify EXT4_I(inode)->i_state without holding
      i_mutex (ext4_release_file, ext4_bmap, ext4_journalled_writepage,
      ext4_do_update_inode, ...). These modifications are racy and we can
      lose updates to i_state. So convert handling of i_state to use bitops
      which are atomic.
      
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
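The idea of replacing plain read-modify-write updates with atomic bit operations can be sketched in user space with C11 atomics. This is only an analogy to the kernel's bitops: the struct, the helper names and the flag numbers below are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit numbers, chosen for illustration only. */
enum { SKETCH_STATE_NEW = 0, SKETCH_STATE_XATTR = 1 };

struct sketch_inode_info {
    atomic_ulong i_state; /* each bit is one independent state flag */
};

/* Turn a bit number into a mask, rejecting out-of-range bits. */
static unsigned long sketch_mask(int bit)
{
    assert(bit >= 0 && bit < 32);
    return 1UL << bit;
}

/* Atomically set one flag; concurrent updates to other bits are not lost. */
static void sketch_set_state(struct sketch_inode_info *ei, int bit)
{
    atomic_fetch_or(&ei->i_state, sketch_mask(bit));
}

/* Atomically clear one flag. */
static void sketch_clear_state(struct sketch_inode_info *ei, int bit)
{
    atomic_fetch_and(&ei->i_state, ~sketch_mask(bit));
}

/* Read one flag. */
static bool sketch_test_state(struct sketch_inode_info *ei, int bit)
{
    return (atomic_load(&ei->i_state) & sketch_mask(bit)) != 0;
}
```

A plain `i_state |= flag` compiles to a separate load and store, so two threads updating different flags without a common lock can overwrite each other's change; the atomic fetch-or and fetch-and above avoid that.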
  8. 28 Sep, 2009 1 commit
  9. 14 Sep, 2009 1 commit
  10. 09 Sep, 2009 1 commit
  11. 13 Jun, 2009 1 commit
    • ext4: update the s_last_mounted field in the superblock · bc0b0d6d
      Theodore Ts'o committed
      This field can be very helpful when a system administrator is trying
      to sort through large numbers of block devices or filesystem images.
      What is stored in this field can be ambiguous if multiple filesystem
      namespaces are in play; what we store in practice is the mountpoint
      interpreted by the process's namespace which first opens a file in the
      filesystem.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  12. 28 Mar, 2009 1 commit
  13. 24 Feb, 2009 1 commit
    • ext4: Automatically allocate delay allocated blocks on close · 7d8f9f7d
      Theodore Ts'o committed
      When closing a file that had been previously truncated, force any
      delay-allocated blocks to be allocated so that if the filesystem
      is mounted with data=ordered, the data blocks will be pushed out to
      disk along with the journal commit.  Many application programs expect
      this, so we do this to avoid zero length files if the system crashes
      unexpectedly.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  14. 23 Nov, 2008 1 commit
  15. 11 Oct, 2008 1 commit
  16. 07 Oct, 2008 1 commit
  17. 10 Oct, 2008 1 commit
  18. 09 Sep, 2008 1 commit
  19. 12 Jul, 2008 2 commits
    • ext4: delayed allocation i_blocks fix for stat · 3e3398a0
      Mingming Cao committed
      Right now i_blocks is not getting updated until the blocks are actually
      allocated on disk.  This means with delayed allocation, right after files
      are copied, "ls -sF" shows the file as taking 0 blocks on disk.  "du"
      also shows the files taking zero space, which is highly confusing to the
      user.
      
      Since delayed allocation already keeps track of the per-inode total
      number of blocks that are subject to delayed allocation, this patch fixes
      this by using that count to adjust the value returned by stat(2). When real
      block allocation is done, i_blocks will get updated. Since the
      reserved blocks for delayed allocation will be decreased at the same time,
      the value returned by stat(2) stays consistent.
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
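The adjustment described above amounts to simple arithmetic, sketched here under stated assumptions: stat(2) reports blocks in 512-byte units, the reserved count is kept in file system blocks, and `sketch_stat_blocks` with its parameters is a hypothetical helper rather than the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Blocks to report through stat(2), in 512-byte units:
 * blocks already allocated on disk plus blocks still reserved for
 * delayed allocation, converted from file system blocks.
 */
static uint64_t sketch_stat_blocks(uint64_t i_blocks_512,
                                   uint64_t reserved_fs_blocks,
                                   unsigned blocksize_bits)
{
    assert(blocksize_bits >= 9 && blocksize_bits < 64);
    return i_blocks_512 + (reserved_fs_blocks << (blocksize_bits - 9));
}
```

For a 4096-byte block size (blocksize_bits of 12), three reserved blocks are reported as 24 units before allocation. Once they are allocated, i_blocks holds 24 and the reserved count drops to 0, so the reported value does not change.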
    • ext4: Use page_mkwrite vma_operations to get mmap write notification. · 2e9ee850
      Aneesh Kumar K.V committed
      We would like to get notified when we are doing a write on an mmap section.
      This is needed with respect to preallocated areas. We split the preallocated
      area into an initialized extent and an uninitialized extent in the callback. This
      lets us handle ENOSPC better. Otherwise we get ENOSPC in writepage and
      that would result in data loss. The changes are also needed to handle ENOSPC
      when writing to an mmap section of files with holes.
      Acked-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  20. 30 Apr, 2008 2 commits
  21. 29 Jan, 2008 2 commits
  22. 18 Jul, 2007 1 commit
    • fallocate support in ext4 · a2df2a63
      Amit Arora committed
      This patch implements ->fallocate() inode operation in ext4. With this
      patch users of ext4 file systems will be able to use fallocate() system
      call for persistent preallocation. The current implementation supports
      preallocation only for regular files (directories are not yet supported)
      with extent maps. This patch does not currently support block-mapped files.
      Only FALLOC_ALLOCATE and FALLOC_RESV_SPACE modes are being supported as of
      now.
      Signed-off-by: Amit Arora <aarora@in.ibm.com>
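From user space, persistent preallocation might be exercised roughly as follows. This sketch assumes the present-day glibc `fallocate(fd, mode, offset, len)` wrapper with mode 0; the mode names mentioned in the commit message are not used here, and `sketch_preallocate` and the scratch path are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a scratch file at `path`, preallocate `len` bytes for it, and
 * return the resulting file size, or a negative errno value on failure.
 * The scratch file is removed before returning.
 */
static long long sketch_preallocate(const char *path, off_t len)
{
    struct stat st;
    long long result;
    int fd;

    assert(path != NULL && len > 0);

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -errno;

    /* Mode 0: allocate the blocks and extend the file size to offset + len. */
    if (fallocate(fd, 0, 0, len) < 0)
        result = -errno;
    else if (fstat(fd, &st) < 0)
        result = -errno;
    else
        result = (long long)st.st_size;

    close(fd);
    unlink(path);
    return result;
}
```

On a file system without fallocate support the call is expected to fail with EOPNOTSUPP, which this helper returns as a negative value.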
  23. 10 Jul, 2007 1 commit
  24. 13 Feb, 2007 1 commit
  25. 09 Dec, 2006 1 commit
  26. 12 Oct, 2006 3 commits
  27. 01 Oct, 2006 3 commits
  28. 27 Sep, 2006 1 commit
  29. 31 Mar, 2006 1 commit
    • [PATCH] Introduce sys_splice() system call · 5274f052
      Jens Axboe committed
      This adds support for the sys_splice system call. Using a pipe as a
      transport, it can connect to files or sockets (latter as output only).
      
      From the splice.c comments:
      
         "splice": joining two ropes together by interweaving their strands.
      
         This is the "extended pipe" functionality, where a pipe is used as
         an arbitrary in-memory buffer. Think of a pipe as a small kernel
         buffer that you can use to transfer data from one end to the other.
      
         The traditional unix read/write is extended with a "splice()" operation
         that transfers data buffers to or from a pipe buffer.
      
         Named by Larry McVoy, original implementation from Linus, extended by
         Jens to support splicing to files and fixing the initial implementation
         bugs.
      Signed-off-by: Jens Axboe <axboe@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
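The "pipe as an in-memory buffer" idea can be illustrated with a file-to-file copy that moves data through a pipe. This is a minimal sketch that assumes the present-day glibc `splice()` wrapper on Linux; `sketch_splice_copy`, `sketch_splice_demo` and the scratch file names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Copy up to len bytes from in_fd to out_fd through a pipe.
 * Returns the number of bytes copied, or -1 on error.
 */
static ssize_t sketch_splice_copy(int in_fd, int out_fd, size_t len)
{
    int pipefd[2];
    ssize_t total = 0;

    assert(in_fd >= 0 && out_fd >= 0);
    if (pipe(pipefd) < 0)
        return -1;

    while (len > 0) {
        /* Stage 1: file -> pipe. A return of 0 means end of input. */
        ssize_t in = splice(in_fd, NULL, pipefd[1], NULL, len, SPLICE_F_MOVE);
        if (in < 0) {
            total = -1;
            break;
        }
        if (in == 0)
            break;
        len -= (size_t)in;

        /* Stage 2: pipe -> file, draining everything staged above. */
        while (in > 0) {
            ssize_t out = splice(pipefd[0], NULL, out_fd, NULL, (size_t)in,
                                 SPLICE_F_MOVE);
            if (out <= 0) {
                total = -1;
                goto done;
            }
            in -= out;
            total += out;
        }
    }
done:
    close(pipefd[0]);
    close(pipefd[1]);
    return total;
}

/* Usage sketch: round-trip a short message between two scratch files.
 * Returns 0 if the copy matches the original, -1 otherwise. */
static int sketch_splice_demo(void)
{
    static const char msg[] = "hello splice";
    char src[] = "/tmp/sketch_src_XXXXXX";
    char dst[] = "/tmp/sketch_dst_XXXXXX";
    char buf[sizeof(msg)] = { 0 };
    int in_fd = mkstemp(src);
    int out_fd = mkstemp(dst);
    int ok = -1;

    if (in_fd >= 0 && out_fd >= 0 &&
        write(in_fd, msg, sizeof(msg)) == (ssize_t)sizeof(msg) &&
        lseek(in_fd, 0, SEEK_SET) == 0 &&
        sketch_splice_copy(in_fd, out_fd, sizeof(msg)) == (ssize_t)sizeof(msg) &&
        pread(out_fd, buf, sizeof(buf), 0) == (ssize_t)sizeof(buf) &&
        memcmp(buf, msg, sizeof(msg)) == 0)
        ok = 0;

    if (in_fd >= 0) {
        close(in_fd);
        unlink(src);
    }
    if (out_fd >= 0) {
        close(out_fd);
        unlink(dst);
    }
    return ok;
}
```

Each splice() call has a pipe on one side, so a file-to-file copy takes two stages; the data does not pass through a user-space buffer in either stage.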
  30. 29 Mar, 2006 1 commit
  31. 23 Mar, 2006 1 commit
  32. 29 Jun, 2005 1 commit
    • [PATCH] ext3: reduce allocate-with-reservation lock latencies · 21fe3471
      Mingming Cao committed
      Currently in the ext3 block reservation code, the global filesystem reservation
      tree lock (rsv_block) is held during the process of searching for a space
      to make a new reservation window, including while scanning the block bitmap
      to verify that the available window has a free block.  Holding the lock during
      the bitmap scan is unnecessary and could cause scalability and
      latency issues.
      
      This patch tries to address this by dropping the lock before scanning the
      bitmap.  Before that we need to reserve the open window in case someone
      else is targeting the same window.  The question was whether we should reserve the
      whole free reservable space or just the window size we need.  Reserving the
      whole free reservable space could force other threads which
      intended to do block allocation nearby to move to another block group (causing
      bad layout).  In this patch, we just reserve the desired size before dropping
      the lock and scanning the block bitmap.  This patch fixed an ext3 reservation
      latency issue seen on a cvs checkout test.  The patch was tested with many fsx,
      tiobench, dbench and kernel untar tests.
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
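The locking pattern described in this commit (claim only the desired window under the lock, then scan the bitmap with the lock dropped) can be sketched with a toy 64-block group. Everything here is a hypothetical simplification: the real code keeps windows in a tree, retries when a window has no free block, and synchronizes bitmap updates separately.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_BLOCKS 64 /* one toy block group, one bit per block */

struct sketch_group {
    atomic_flag rsv_lock; /* stands in for the reservation tree lock */
    uint64_t reserved;    /* bit set: block lies inside some reservation window */
    uint64_t bitmap;      /* bit set: block is already allocated */
};

static void sketch_lock(struct sketch_group *g)
{
    while (atomic_flag_test_and_set_explicit(&g->rsv_lock, memory_order_acquire))
        ; /* spin until the lock is free */
}

static void sketch_unlock(struct sketch_group *g)
{
    atomic_flag_clear_explicit(&g->rsv_lock, memory_order_release);
}

/*
 * Under the lock: claim the first unreserved window of `size` blocks at or
 * after `goal`. Only the desired size is claimed, not all free space.
 * Returns the window start, or -1 if no window fits.
 */
static int sketch_reserve_window(struct sketch_group *g, int goal, int size)
{
    int start = -1;
    uint64_t mask;

    assert(goal >= 0 && size > 0 && size < SKETCH_BLOCKS);
    mask = (UINT64_C(1) << size) - 1;

    sketch_lock(g);
    for (int s = goal; s + size <= SKETCH_BLOCKS; s++) {
        if ((g->reserved & (mask << s)) == 0) {
            g->reserved |= mask << s;
            start = s;
            break;
        }
    }
    sketch_unlock(g);
    return start;
}

/*
 * Without the lock: scan the bitmap inside our own window for a free block.
 * Returns the block number, or -1 if the window is fully allocated.
 */
static int sketch_first_free(const struct sketch_group *g, int start, int size)
{
    for (int b = start; b < start + size; b++)
        if (((g->bitmap >> b) & 1) == 0)
            return b;
    return -1;
}
```

Because the window is claimed before the lock is dropped, a second caller with the same goal is steered to the next window while the first caller is still scanning its own.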