1. 30 Sep, 2009 4 commits
    • ext4: Fix time encoding with extra epoch bits · c1fccc06
      Theodore Ts'o committed
      "Looking at ext4.h, I think the setting of extra time fields forgets to
      mask the epoch bits so the epoch part overwrites nsec part. The second
      change is only for coherency (2 -> EXT4_EPOCH_BITS)."
      
      Thanks to Damien Guibouret for pointing out this problem.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
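The masking problem described in this commit can be illustrated with a minimal sketch. It assumes the layout implied by the message: the low two bits of a 32-bit "extra" field hold the epoch and the remaining bits hold nanoseconds. The names `EPOCH_BITS`, `EPOCH_MASK`, `encode_extra_time`, `decode_nsec`, and `decode_epoch` are hypothetical and are not the kernel's own helpers.

```c
#include <assert.h>
#include <stdint.h>

#define EPOCH_BITS 2                          /* stands in for EXT4_EPOCH_BITS */
#define EPOCH_MASK ((1u << EPOCH_BITS) - 1)   /* low two bits of the field */

/* Pack the epoch (bits 32..33 of the seconds value) and the nanoseconds
 * into one 32-bit field. */
static uint32_t encode_extra_time(int64_t sec, uint32_t nsec)
{
    uint32_t epoch;

    assert(nsec < 1000000000u);   /* fits in 30 bits, so the shift is safe */

    /* Without the mask, a negative or very large seconds value would
     * spill into the bits reserved for nanoseconds. */
    epoch = (uint32_t)((uint64_t)sec >> 32) & EPOCH_MASK;
    return epoch | (nsec << EPOCH_BITS);
}

static uint32_t decode_nsec(uint32_t extra)
{
    return extra >> EPOCH_BITS;
}

static uint32_t decode_epoch(uint32_t extra)
{
    return extra & EPOCH_MASK;
}
```

With a seconds value of -1 the upper 32 bits are all ones; the mask keeps only two of them, so the nanosecond part survives a round trip.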
    • jbd2: Use tracepoints for history file · bf699327
      Theodore Ts'o committed
      The /proc/fs/jbd2/<dev>/history was maintained manually; by using
      tracepoints, we can get all of the existing functionality of the /proc
      file plus extra capabilities thanks to the ftrace infrastructure.  We
      save memory as a bonus.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: Use tracepoints for mb_history trace file · 296c355c
      Theodore Ts'o committed
      The /proc/fs/ext4/<dev>/mb_history was maintained manually, and had a
      number of problems: it required a largish amount of memory to be
      allocated for each ext4 filesystem, and the s_mb_history_lock
      introduced a CPU contention problem.  
      
      By ripping out the mb_history code and replacing it with ftrace
      tracepoints, we get more functionality: timestamps, event
      filtering, the ability to correlate mballoc history with other ext4
      tracepoints, etc.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4, jbd2: Drop unneeded printks at mount and unmount time · 90576c0b
      Theodore Ts'o committed
      There are a number of kernel printk's which are printed when an ext4
      filesystem is mounted and unmounted.  Disable them to economize space
      in the system logs.  In addition, disabling the mballoc stats by
      default saves a number of unneeded atomic operations for every block
      allocation or deallocation.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  2. 29 Sep, 2009 1 commit
  3. 30 Sep, 2009 1 commit
  4. 29 Sep, 2009 8 commits
    • ext4: Avoid updating the inode table bh twice in no journal mode · 830156c7
      Frank Mayhar committed
      This is a cleanup of commit 91ac6f43.  Since ext4_mark_inode_dirty()
      has already called ext4_mark_iloc_dirty(), which in turn calls
      ext4_do_update_inode(), it's not necessary to have ext4_write_inode()
      call ext4_do_update_inode() in no journal mode.  Indeed, it would be
      duplicated work.
      Reviewed-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Frank Mayhar <fmayhar@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • nilfs2: fix missing initialization of i_dir_start_lookup member · 3cc811bf
      Ryusuke Konishi committed
      The i_dir_start_lookup field in nilfs_inode_info objects should be
      cleared when the objects are allocated, but the initialization was
      missing in the case of reading from disk.  This adds the initialization.
      
      Since the variable just gives a start page on directory lookups, the
      bug was nonfatal until now.
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    • nilfs2: fix missing zero-fill initialization of btree node cache · 1f28fcd9
      Ryusuke Konishi committed
      This will fix file system corruption which infrequently happens after
      mount.  The problem was reported by users under the title "[NILFS
      users] Fail to mount NILFS." (Message-ID:
      <200908211918.34720.yuri@itinteg.net>), and so forth.  I've also
      experienced the corruption multiple times on kernel 2.6.30 and 2.6.31.
      
      The problem turned out to be caused by a discrepancy between
      mapping->nrpages of a btree node cache and the actual number of pages
      held in the cache; if mapping->nrpages becomes zero even though the
      cache has pages, truncate_inode_pages() returns without doing anything.
      Usually this is harmless except that it may cause a page leak, but
      garbage collection occasionally sees a stale page remaining in the
      btree node cache of the DAT (i.e. the disk address translation file of
      nilfs), which induces the corruption.
      
      I identified a missing initialization in btree node caches as the
      root cause.  This corrects the bug.
      
      I've tested this for kernel 2.6.30 and 2.6.31.
      Reported-by: Yuri Chislov <yuri@itinteg.net>
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Cc: stable <stable@kernel.org>
    • ext4: EXT4_IOC_MOVE_EXT: Check for different original and donor inodes first · f3ce8064
      Theodore Ts'o committed
      Move the check to make sure the original and donor inodes are
      different earlier, to avoid a potential deadlock by trying to lock the
      same inode twice.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
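The ordering described in this commit (compare the two inodes before locking either one) can be sketched with a toy model. Everything here is hypothetical: `toy_inode`, `toy_lock`, and `toy_move_extents` are illustrative names, the lock is a plain flag rather than a kernel mutex, and the `-EINVAL` return value is an assumption.

```c
#include <assert.h>
#include <errno.h>

/* Toy inode with a non-recursive lock flag; not the kernel's struct inode. */
struct toy_inode {
    unsigned long ino;
    int locked;
};

static int toy_lock(struct toy_inode *inode)
{
    if (inode->locked)
        return -EDEADLK;   /* a real mutex would hang here instead */
    inode->locked = 1;
    return 0;
}

static void toy_unlock(struct toy_inode *inode)
{
    inode->locked = 0;
}

static int toy_move_extents(struct toy_inode *orig, struct toy_inode *donor)
{
    int err;

    assert(orig && donor);

    /* Reject identical inodes before taking any lock. */
    if (orig->ino == donor->ino)
        return -EINVAL;

    err = toy_lock(orig);
    if (err)
        return err;
    err = toy_lock(donor);
    if (err) {
        toy_unlock(orig);
        return err;
    }

    /* ... the extent swap would happen here ... */

    toy_unlock(donor);
    toy_unlock(orig);
    return 0;
}
```

If the identity check came after the first `toy_lock()` call, passing the same inode twice would reach the second lock attempt, which is the deadlock the commit message warns about.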
    • ext4: async direct IO for holes and fallocate support · 8d5d02e6
      Mingming Cao committed
      For async direct I/O that covers holes or fallocated space, the end_io
      callback now queues the conversion work on a workqueue but does not
      flush the work right away, as that might take too long.
      
      But when fsync is called after all the data has completed, the user
      expects the metadata to be updated as well before fsync returns.
      
      Thus we need to flush the conversion work when fsync() is called.
      This patch keeps track of a list of completed async direct I/O requests
      that have work queued on the workqueue.  When fsync() is called, it will
      go through the list and do the conversion.
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
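The bookkeeping this commit describes (remember completed async direct I/O requests whose conversion is still pending, then drain that list on fsync) can be sketched in user space as follows. This is a simplified model under stated assumptions, not the ext4 implementation: `toy_io_end`, `toy_dio_inode`, `toy_end_io`, `toy_flush_completed`, and `toy_fsync` are hypothetical names, and the deferred workqueue is reduced to a singly linked list.

```c
#include <assert.h>
#include <stddef.h>

/* One completed async direct I/O whose extent conversion is still pending. */
struct toy_io_end {
    int converted;
    struct toy_io_end *next;
};

/* Per-inode list of completed but not yet converted requests. */
struct toy_dio_inode {
    struct toy_io_end *completed;
};

/* Called when the I/O finishes: record the request, defer the conversion. */
static void toy_end_io(struct toy_dio_inode *inode, struct toy_io_end *io)
{
    assert(inode && io);
    io->converted = 0;
    io->next = inode->completed;
    inode->completed = io;
}

/* Walk the list and perform every pending conversion; returns the count. */
static int toy_flush_completed(struct toy_dio_inode *inode)
{
    int flushed = 0;

    assert(inode);
    while (inode->completed) {
        struct toy_io_end *io = inode->completed;

        inode->completed = io->next;
        io->next = NULL;
        io->converted = 1;
        flushed++;
    }
    return flushed;
}

/* fsync must not return before the pending conversions are done. */
static int toy_fsync(struct toy_dio_inode *inode)
{
    toy_flush_completed(inode);
    /* ... data and metadata would be written back here ... */
    return 0;
}
```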
    • ext4: Use end_io callback to avoid direct I/O fallback to buffered I/O · 4c0425ff
      Mingming Cao committed
      Currently the DIO VFS code passes create = 0 when writing to the
      middle of a file.  It does this to avoid block allocation for holes, so
      as not to expose stale data when there is a parallel buffered read
      (which does not hold the i_mutex lock).  Direct I/O writes into holes
      fall back to buffered I/O for this reason.
      
      Since preallocated extents are treated as holes when doing a
      get_block() lookup (the buffer is not mapped), direct I/O over fallocate
      also falls back to buffered I/O.  Thus ext4 silently falls
      back to buffered I/O in the above two cases, which is undesirable.
      
      To fix this, this patch creates uninitialized extents when a direct I/O
      write goes into holes in sparse files, and registers an end_io callback
      which converts the uninitialized extent to an initialized extent after
      the I/O is completed.
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: Split uninitialized extents for direct I/O · 0031462b
      Mingming Cao committed
      When writing into an uninitialized extent via direct I/O, and the direct
      I/O doesn't exactly cover the uninitialized extent, split the extent
      into uninitialized and initialized extents before submitting the I/O.
      This avoids needing to deal with an ENOSPC error in the end_io
      callback that gets used for direct I/O.
      
      When the IO is complete, the written extent will be marked as initialized.
      
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
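The split-before-submit idea in this commit can be sketched as plain range arithmetic. This is one reading of the message, not the ext4 code: the extent is cut into up to three pieces before the I/O, all still uninitialized, so that completion only has to flip a flag on the written piece and never needs to allocate. `toy_extent`, `toy_split_extent`, and `toy_end_io_convert` are hypothetical names.

```c
#include <assert.h>

/* Toy extent: a block range plus an "uninitialized" flag. */
struct toy_extent {
    unsigned int start;
    unsigned int len;
    int uninit;
};

/* Split ext around the write range [wstart, wstart + wlen).
 * Fills out[] with 1 to 3 pieces, stores the index of the piece that
 * exactly covers the write in *written, and returns the piece count. */
static int toy_split_extent(struct toy_extent ext,
                            unsigned int wstart, unsigned int wlen,
                            struct toy_extent out[3], int *written)
{
    unsigned int ext_end = ext.start + ext.len;
    unsigned int wend = wstart + wlen;
    int n = 0;

    assert(ext.uninit && wlen > 0);
    assert(wstart >= ext.start && wend <= ext_end);

    if (wstart > ext.start)
        out[n++] = (struct toy_extent){ ext.start, wstart - ext.start, 1 };
    *written = n;
    out[n++] = (struct toy_extent){ wstart, wlen, 1 };
    if (wend < ext_end)
        out[n++] = (struct toy_extent){ wend, ext_end - wend, 1 };
    return n;
}

/* Completion only flips a flag, so it has no way to fail with ENOSPC. */
static void toy_end_io_convert(struct toy_extent *written)
{
    written->uninit = 0;
}
```

For an extent covering blocks 100 to 149 and a write to blocks 110 to 129, the sketch yields three pieces (100 to 109, 110 to 129, 130 to 149), and only the middle one is converted on completion.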
    • ext4: release reserved quota when block reservation for delalloc retry · 9f0ccfd8
      Mingming Cao committed
      ext4_da_reserve_space() can reserve quota blocks multiple times if
      ext4_claim_free_blocks() fails and we retry the allocation. We should
      release the quota reservation before restarting.
      
      Bug found by Jan Kara.
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
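The retry bug described here (the quota reservation accumulating on every attempt) can be shown with a small counter model. This is a sketch under assumptions, not the ext4 code path: `toy_sb`, `toy_claim_free_blocks`, and `toy_da_reserve_space` are hypothetical stand-ins, and the `claim_failures` field exists only to force a failed first attempt.

```c
#include <assert.h>
#include <errno.h>

/* Toy superblock state. */
struct toy_sb {
    long free_blocks;
    long quota_reserved;
    int claim_failures;   /* how many upcoming claims should fail */
};

static int toy_claim_free_blocks(struct toy_sb *sb, long nblocks)
{
    if (sb->claim_failures > 0) {
        sb->claim_failures--;
        return -ENOSPC;
    }
    if (sb->free_blocks < nblocks)
        return -ENOSPC;
    sb->free_blocks -= nblocks;
    return 0;
}

static int toy_da_reserve_space(struct toy_sb *sb, long nblocks,
                                int max_retries)
{
    int retries = 0;

    assert(sb && nblocks > 0);
    for (;;) {
        sb->quota_reserved += nblocks;
        if (toy_claim_free_blocks(sb, nblocks) == 0)
            return 0;

        /* Release the quota before retrying or giving up; without this
         * line each retry would reserve the same blocks again. */
        sb->quota_reserved -= nblocks;
        if (retries++ >= max_retries)
            return -ENOSPC;
    }
}
```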
  5. 30 Sep, 2009 1 commit
    • ext4: Adjust ext4_da_writepages() to write out larger contiguous chunks · 55138e0b
      Theodore Ts'o committed
      Work around problems in the writeback code to force out writebacks in
      larger chunks than just 4 MB, which is just too small.  This also works
      around limitations in the ext4 block allocator, which can't allocate
      more than 2048 blocks at a time.  So we need to defeat the round-robin
      characteristics of the writeback code and try to write out as many
      blocks as possible in one inode before allowing the writeback code to
      move on to another inode.  We add a new per-filesystem tunable,
      max_writeback_mb_bump, which caps this to a default of 128 MB per
      inode.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
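The sizes mentioned in this commit are easier to compare once converted to pages. The sketch below shows only the capping arithmetic under stated assumptions: 4 KiB pages, and one plausible reading of the tunable in which the page budget for an inode may be raised by at most `max_writeback_mb_bump` megabytes. `TOY_PAGE_SHIFT`, `toy_mb_to_pages`, and `toy_bump_nr_to_write` are hypothetical names, not the logic of ext4_da_writepages().

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12   /* assumes 4 KiB pages */

/* Convert megabytes to pages: 1 MB is 2^20 bytes. */
static long toy_mb_to_pages(long mb)
{
    return mb << (20 - TOY_PAGE_SHIFT);
}

/* Raise the page budget toward `desired`, but by no more than bump_mb. */
static long toy_bump_nr_to_write(long nr_to_write, long desired, long bump_mb)
{
    long cap;

    assert(nr_to_write >= 0 && desired >= 0 && bump_mb >= 0);

    cap = nr_to_write + toy_mb_to_pages(bump_mb);
    if (desired <= nr_to_write)
        return nr_to_write;          /* never shrink the budget */
    return desired < cap ? desired : cap;
}
```

Under the 4 KiB assumption, 4 MB is 1024 pages and the 128 MB default is 32768 pages, so a starting budget of 1024 pages could grow to at most 33792 pages in this model.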
  6. 28 Sep, 2009 2 commits
  7. 27 Sep, 2009 1 commit
  8. 26 Sep, 2009 13 commits
  9. 25 Sep, 2009 7 commits
  10. 24 Sep, 2009 2 commits