1. 09 Jan, 2012 1 commit
    • jbd: Remove j_barrier mutex · 00482785
      Jan Kara committed
      The j_barrier mutex is used for serializing different journal lock operations.  The
      problem with it is that e.g. the FIFREEZE ioctl results in the process leaving the
      kernel with the j_barrier mutex held, which makes lockdep complain. Also, the
      hibernation code wants to freeze the filesystem but cannot do so, because with the
      mutex locked it then cannot hibernate the system.
      
      So we remove the j_barrier mutex and wait directly on j_barrier_count instead.
      Since locking the journal is a rare operation, we don't have to care about fairness
      or such things.
      
      CC: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Jan Kara <jack@suse.cz>
  2. 22 Nov, 2011 1 commit
    • freezer: unexport refrigerator() and update try_to_freeze() slightly · a0acae0e
      Tejun Heo committed
      There is no reason to export two functions for entering the
      refrigerator.  Calling refrigerator() instead of try_to_freeze()
      doesn't save anything noticeable or remove any race condition.
      
      * Rename refrigerator() to __refrigerator() and make it return bool
        indicating whether it scheduled out for freezing.
      
      * Update try_to_freeze() to return bool and relay the return value of
        __refrigerator() if freezing().
      
      * Convert all refrigerator() users to try_to_freeze().
      
      * Update documentation accordingly.
      
      * While at it, add might_sleep() to try_to_freeze().
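
The call shape described in the list above can be sketched in userspace as follows. This is only an illustration under stated assumptions: `freeze_requested`, `refrigerator_entries`, `freezing_sketch()`, `refrigerator_sketch()` and `try_to_freeze_sketch()` are hypothetical stand-ins, not the kernel implementation. It shows a single entry point that relays whether the task was actually frozen.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task "freeze requested" state. */
bool freeze_requested;
/* Counts how often the sketch entered the refrigerator. */
int refrigerator_entries;

/* Stand-in for freezing(current): is a freeze pending for this task? */
bool freezing_sketch(void)
{
    return freeze_requested;
}

/*
 * Stand-in for __refrigerator(). The real function sleeps until the task
 * is thawed; here it only records the visit, clears the request, and
 * reports that the task was scheduled out for freezing.
 */
bool refrigerator_sketch(void)
{
    refrigerator_entries++;
    freeze_requested = false;
    return true;
}

/* Single entry point: returns true only if the task actually froze. */
bool try_to_freeze_sketch(void)
{
    /* The kernel helper also calls might_sleep() at this point. */
    if (!freezing_sketch())
        return false;
    return refrigerator_sketch();
}
```

A caller that previously called refrigerator() directly would call the single entry point instead and can use the boolean result to learn whether it slept.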
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Samuel Ortiz <samuel@sortiz.org>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
      Cc: Christoph Hellwig <hch@infradead.org>
  3. 02 Nov, 2011 1 commit
    • jbd/jbd2: validate sb->s_first in journal_get_superblock() · 8762202d
      Eryu Guan committed
      I hit a J_ASSERT(blocknr != 0) failure in cleanup_journal_tail() when
      mounting a fsfuzzed ext3 image. It turns out that the corrupted ext3
      image has s_first = 0 in the journal superblock. The 0 is passed to
      journal->j_head in journal_reset(), then to blocknr in
      cleanup_journal_tail(), where the J_ASSERT finally fails.
      
      So validate s_first after reading journal superblock from disk in
      journal_get_superblock() to ensure s_first is valid.
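
The validation described above can be sketched as follows. This is a userspace illustration, not the patch itself: `be32_to_host()` and `journal_first_block_valid()` are hypothetical names, and rejecting values at or beyond the journal length (`maxlen`) is an assumption that goes beyond the zero case described in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Convert a 32-bit big-endian on-disk field to host byte order. */
uint32_t be32_to_host(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Decide whether the on-disk s_first field is usable.
 * - 0 is rejected: it is the value that later tripped the assertion.
 * - Values >= maxlen are rejected too (assumption: the first log block
 *   must lie inside the journal).
 */
bool journal_first_block_valid(const unsigned char *s_first_be,
                               uint32_t maxlen)
{
    uint32_t first = be32_to_host(s_first_be);

    return first != 0 && first < maxlen;
}
```

With a check of this kind, a corrupted superblock would make the mount fail with an error instead of reaching the assertion.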
      
      The following script could reproduce it:
      
      fstype=ext3
      blocksize=1024
      img=$fstype.img
      offset=0
      found=0
      magic="c0 3b 39 98"
      
      dd if=/dev/zero of=$img bs=1M count=8
      mkfs -t $fstype -b $blocksize -F $img
      filesize=`stat -c %s $img`
      while [ $offset -lt $filesize ]
      do
              if od -j $offset -N 4 -t x1 $img | grep -i "$magic";then
                      echo "Found journal: $offset"
                      found=1
                      break
              fi
              offset=`echo "$offset+$blocksize" | bc`
      done
      
      if [ $found -ne 1 ];then
              echo "Magic \"$magic\" not found"
              exit 1
      fi
      
      dd if=/dev/zero of=$img seek=$(($offset+23)) conv=notrunc bs=1 count=1
      
      mkdir -p ./mnt
      mount -o loop $img ./mnt
      
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Eryu Guan <guaneryu@gmail.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  4. 27 Jun, 2011 1 commit
    • jbd: Fix oops in journal_remove_journal_head() · bb189247
      Jan Kara committed
      journal_remove_journal_head() can oops when trying to access journal_head
      returned by bh2jh(). This is caused for example by the following race:
      
      	TASK1					TASK2
        journal_commit_transaction()
          ...
          processing t_forget list
            __journal_refile_buffer(jh);
            if (!jh->b_transaction) {
              jbd_unlock_bh_state(bh);
      					journal_try_to_free_buffers()
      					  journal_grab_journal_head(bh)
      					  jbd_lock_bh_state(bh)
      					  __journal_try_to_free_buffer()
      					  journal_put_journal_head(jh)
              journal_remove_journal_head(bh);
      
      journal_put_journal_head() in TASK2 sees that b_jcount == 0 and buffer is not
      part of any transaction and thus frees journal_head before TASK1 gets to doing
      so. Note that even buffer_head can be released by try_to_free_buffers() after
      journal_put_journal_head() which adds even larger opportunity for oops (but I
      didn't see this happen in reality).
      
      Fix the problem by making transactions hold their own journal_head reference
      (in b_jcount). That way we don't have to remove journal_head explicitly via
      journal_remove_journal_head() and instead just remove journal_head when
      b_jcount drops to zero. The result of this is that [__]journal_refile_buffer(),
      [__]journal_unfile_buffer(), and __journal_remove_checkpoint() can free
      journal_head which needs modification of a few callers. Also we have to be
      careful because once journal_head is removed, buffer_head might be freed as
      well. So we have to get our own buffer_head reference where it matters.
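
The reference-counting idea in the paragraph above can be sketched in userspace as follows. It is a single-threaded illustration with no locking, and `struct jh_sketch`, `jh_new()`, `jh_get()`, `jh_put()`, `file_buffer()` and `unfile_buffer()` are hypothetical names rather than the jbd functions. The point it shows: when the transaction holds its own reference, a short-lived user dropping its reference can no longer free the object underneath the transaction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a journal_head with its reference count. */
struct jh_sketch {
    int b_jcount;
};

/* Allocate an object with no references yet. */
struct jh_sketch *jh_new(void)
{
    return calloc(1, sizeof(struct jh_sketch));
}

/* Take a reference. */
void jh_get(struct jh_sketch *jh)
{
    jh->b_jcount++;
}

/* Drop a reference; free on the last one. Returns true if it was freed. */
bool jh_put(struct jh_sketch *jh)
{
    assert(jh->b_jcount > 0);
    if (--jh->b_jcount > 0)
        return false;
    free(jh);
    return true;
}

/* Filing a buffer into a transaction takes the transaction's own reference. */
void file_buffer(struct jh_sketch *jh)
{
    jh_get(jh);
}

/* Unfiling drops that reference, so the caller must not touch jh afterwards. */
bool unfile_buffer(struct jh_sketch *jh)
{
    return jh_put(jh);
}
```

Because `unfile_buffer()` may free the object, callers in this model have to treat the pointer as dead after the call, which mirrors the caller changes the message mentions.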
      
      Signed-off-by: Jan Kara <jack@suse.cz>
  5. 25 Jun, 2011 1 commit
    • jbd: Add fixed tracepoints · 99cb1a31
      Lukas Czerner committed
      This commit adds fixed tracepoints for jbd. They are based on the fixed
      tracepoints for jbd2, except that the ones for collecting statistics are
      missing: I think they would require a more intrusive patch, so they
      should get their own commit if someone decides they are needed. There
      are also new tracepoints in __journal_drop_transaction() and
      journal_update_superblock().
      
      The list of jbd tracepoints:
      
      jbd_checkpoint
      jbd_start_commit
      jbd_commit_locking
      jbd_commit_flushing
      jbd_commit_logging
      jbd_drop_transaction
      jbd_end_commit
      jbd_do_submit_data
      jbd_cleanup_journal_tail
      jbd_update_superblock_end
      
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Jan Kara <jack@suse.cz>
  6. 17 May, 2011 1 commit
    • jbd: fix fsync() tid wraparound bug · d9b01934
      Ted Ts'o committed
      If an application program does not make any changes to the indirect
      blocks or extent tree, i_datasync_tid will not get updated.  If there
      are enough commits (i.e., 2**31) such that tid_geq()'s calculations
      wrap, and there isn't a currently active transaction at the time of
      the fdatasync() call, this can end up triggering a BUG_ON in
      fs/jbd/commit.c:
      
      	J_ASSERT(journal->j_running_transaction != NULL);
      
      It's pretty rare that this can happen, since it requires the use of
      fdatasync() plus *very* frequent and excessive use of fsync().  But
      with the right workload, it can.
      
      We fix this by replacing the use of tid_geq() with an equality test,
      since there's only one transaction id that is valid for us to
      start: namely, the currently running transaction (if it exists).
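
The wraparound can be illustrated with a small userspace sketch. `sketch_tid_t`, `old_needs_commit()` and `new_needs_commit()` are hypothetical names, and the shape of the old check (comparing the last requested commit against the target with a signed difference) is an assumption about the code this message refers to. The sketch only shows why a signed-difference comparison misjudges a tid that is more than 2**31 commits old, while an equality test against the running transaction does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t sketch_tid_t;

/*
 * Wrap-aware "x is at or after y": subtract, then look at the sign.
 * Only meaningful while the two ids are less than 2**31 apart.
 * (The unsigned-to-signed conversion wraps on two's complement targets.)
 */
bool tid_geq_sketch(sketch_tid_t x, sketch_tid_t y)
{
    int32_t difference = (int32_t)(x - y);

    return difference >= 0;
}

/* Old-style decision: commit if the last request looks older than target. */
bool old_needs_commit(sketch_tid_t commit_request, sketch_tid_t target)
{
    return !tid_geq_sketch(commit_request, target);
}

/* New-style decision: only the currently running transaction can be started. */
bool new_needs_commit(bool has_running, sketch_tid_t running_tid,
                      sketch_tid_t target)
{
    return has_running && running_tid == target;
}
```

For a stale target such as 5 and a last request 0x80000001 commits later, the old-style check treats the stale id as being in the future, whereas the equality test simply finds no match.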
      
      CC: stable@kernel.org
      Reported-by: Martin_Zielinski@McAfee.com
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Jan Kara <jack@suse.cz>
  7. 31 Mar, 2011 1 commit
  8. 01 Mar, 2011 1 commit
  9. 28 Oct, 2010 4 commits
  10. 18 Aug, 2010 1 commit
    • remove SWRITE* I/O types · 9cb569d6
      Christoph Hellwig committed
      These flags aren't real I/O types, but tell ll_rw_block to always
      lock the buffer instead of giving up on a failed trylock.
      
      Instead add a new write_dirty_buffer helper that implements this semantic
      and use it from the existing SWRITE* callers.  Note that the ll_rw_block
      code had a bug where it didn't promote WRITE_SYNC_PLUG properly, which
      this patch fixes.
      
      In the ufs code clean up the helper that used to call ll_rw_block
      to mirror sync_dirty_buffer, which is the function it implements for
      compound buffers.
      
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  11. 21 Jul, 2010 1 commit
  12. 22 May, 2010 2 commits
  13. 23 Dec, 2009 1 commit
  14. 12 Nov, 2009 1 commit
  15. 11 Nov, 2009 1 commit
  16. 16 Sep, 2009 1 commit
  17. 21 Jul, 2009 1 commit
  18. 16 Jul, 2009 1 commit
  19. 03 Apr, 2009 1 commit
  20. 12 Feb, 2009 1 commit
    • jbd: fix return value of journal_start_commit() · 8fe4cd0d
      Jan Kara committed
      journal_start_commit() returns 1 if either a transaction is committing or
      the function has queued a transaction commit.  But it returns 0 if we
      raced with somebody queueing the transaction commit as well.  This
      resulted in ext3_sync_fs() not functioning correctly (description from
      Arthur Jones): In the case of a data=ordered umount with pending long
      symlinks which are delayed due to a long list of other I/O on the backing
      block device, this causes the buffer associated with the long symlinks to
      not be moved to the inode dirty list in the second phase of fsync_super.
      Then, before they can be dirtied again, kjournald exits, seeing the UMOUNT
      flag and the dirty pages are never written to the backing block device,
      causing long symlink corruption and exposing new or previously freed block
      data to userspace.
      
      This can be reproduced with a script created by Eric Sandeen
      <sandeen@redhat.com>:
      
              #!/bin/bash
      
              umount /mnt/test2
              mount /dev/sdb4 /mnt/test2
              rm -f /mnt/test2/*
              dd if=/dev/zero of=/mnt/test2/bigfile bs=1M count=512
              touch /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
              ln -s /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename \
                  /mnt/test2/link
              umount /mnt/test2
              mount /dev/sdb4 /mnt/test2
              ls /mnt/test2/
      
      This patch fixes journal_start_commit() to always return 1 when there's
      a transaction committing or queued for commit.
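
The intended return-value contract can be sketched as follows. This is a simplified userspace model without locking: `struct journal_sketch` and `start_commit_sketch()` are hypothetical, and the real function works on the journal's transaction pointers under a lock. The case of interest is the race, where another caller has already queued the commit of the running transaction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t sketch_tid_t;

/* Hypothetical, flattened view of the journal state relevant here. */
struct journal_sketch {
    bool has_running;            /* a transaction is currently running   */
    sketch_tid_t running_tid;
    bool has_committing;         /* a transaction is being committed     */
    sketch_tid_t committing_tid;
    sketch_tid_t commit_request; /* last tid whose commit was requested  */
};

/*
 * Returns 1 whenever a commit is in progress or queued, including the
 * case where someone else queued it first; returns 0 only when there is
 * nothing to wait for. If ptid is non-NULL it receives the tid to wait on.
 */
int start_commit_sketch(struct journal_sketch *j, sketch_tid_t *ptid)
{
    if (j->has_running) {
        /* Requesting an already requested commit is harmless. */
        j->commit_request = j->running_tid;
        if (ptid)
            *ptid = j->running_tid;
        return 1;
    }
    if (j->has_committing) {
        if (ptid)
            *ptid = j->committing_tid;
        return 1;
    }
    return 0;
}
```

A caller such as a sync path can then rely on a return value of 1 to mean "there is a tid to wait for", regardless of who queued the commit.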
      
      Cc: Eric Sandeen <sandeen@redhat.com>
      Cc: Mike Snitzer <snitzer@gmail.com>
      Cc: <linux-ext4@vger.kernel.org>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  21. 23 Oct, 2008 1 commit
    • jbd: fix error handling for checkpoint io · 4afe9785
      Hidehiro Kawai committed
      When a checkpointing IO fails, current JBD code doesn't check the error
      and continues journaling.  This means the latest metadata can be lost from
      both the journal and the filesystem.
      
      This patch leaves the failed metadata blocks in the journal space and
      aborts journaling in the case of log_do_checkpoint().  To achieve this, we
      need to do:
      
      1. don't remove the failed buffer from the checkpoint list in
         __try_to_free_cp_buf(), because it may be released or
         overwritten by a later transaction
      2. log_do_checkpoint() is the last chance, so remove the failed buffer
         from the checkpoint list there and abort the journal
      3. when checkpointing fails, don't update the journal super block, to
         prevent the journaled contents from being cleaned.  For safety,
         don't update j_tail and j_tail_sequence either
      4. when checkpointing fails, report the error to the ext3 layer so
         that ext3 doesn't clear the needs_recovery flag; otherwise the
         journaled contents are ignored and cleaned in the recovery phase
      5. if the recovery fails, keep the needs_recovery flag
      6. prevent cleanup_journal_tail() from being called between
         __journal_drop_transaction() and journal_abort() (a race
         between journal_flush() and __log_wait_for_space())
      
      Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Acked-by: Jan Kara <jack@suse.cz>
      Cc: <linux-ext4@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  22. 26 Jul, 2008 2 commits
  23. 28 Apr, 2008 1 commit
  24. 31 Mar, 2008 1 commit
  25. 20 Mar, 2008 1 commit
  26. 07 Feb, 2008 1 commit
  27. 20 Oct, 2007 2 commits
  28. 19 Oct, 2007 1 commit
  29. 18 Oct, 2007 2 commits
  30. 17 Oct, 2007 1 commit
    • Group short-lived and reclaimable kernel allocations · e12ba74d
      Mel Gorman committed
      This patch marks a number of allocations that are either short-lived such as
      network buffers or are reclaimable such as inode allocations.  When something
      like updatedb is called, long-lived and unmovable kernel allocations tend to
      be spread throughout the address space which increases fragmentation.
      
      This patch groups these allocations together as much as possible by adding a
      new MIGRATE_TYPE.  The MIGRATE_RECLAIMABLE type is for allocations that can be
      reclaimed on demand, but not moved.  i.e.  they can be migrated by deleting
      them and re-reading the information from elsewhere.
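
The grouping idea can be sketched as a mapping from allocation flags to a group. Everything here is hypothetical: the flag bits, the `SKETCH_*` names and the per-group counters stand in for kernel definitions whose real values and data structures differ. The sketch only shows how tagging a request lets the allocator keep reclaimable allocations apart from unmovable ones.

```c
#include <assert.h>

/* Hypothetical flag bits; the kernel's real GFP flag values differ. */
#define SKETCH_GFP_RECLAIMABLE 0x1u
#define SKETCH_GFP_MOVABLE     0x2u

enum sketch_migratetype {
    SKETCH_UNMOVABLE = 0,   /* long-lived, cannot be moved or reclaimed */
    SKETCH_RECLAIMABLE,     /* can be dropped and re-read on demand     */
    SKETCH_MOVABLE,         /* can be migrated to another page          */
    SKETCH_NR_TYPES
};

/* Map allocation flags to the group the memory should come from. */
enum sketch_migratetype sketch_migratetype_of(unsigned int flags)
{
    unsigned int both = SKETCH_GFP_RECLAIMABLE | SKETCH_GFP_MOVABLE;

    /* A request is not expected to claim both properties at once. */
    assert((flags & both) != both);

    if (flags & SKETCH_GFP_MOVABLE)
        return SKETCH_MOVABLE;
    if (flags & SKETCH_GFP_RECLAIMABLE)
        return SKETCH_RECLAIMABLE;
    return SKETCH_UNMOVABLE;
}

/* Per-group counters standing in for per-group free lists. */
unsigned int sketch_allocs[SKETCH_NR_TYPES];

/* Account one allocation to the group selected by its flags. */
void sketch_account_alloc(unsigned int flags)
{
    sketch_allocs[sketch_migratetype_of(flags)]++;
}
```

In this model an inode-like allocation would pass the reclaimable flag and land in its own group, away from untagged long-lived allocations.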
      
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  31. 20 Jul, 2007 1 commit
    • mm: Remove slab destructors from kmem_cache_create(). · 20c2df83
      Paul Mundt committed
      Slab destructors were no longer supported after Christoph's
      c59def9f change. They've been
      BUGs for both slab and slub, and slob never supported them
      either.
      
      This rips out support for the dtor pointer from kmem_cache_create()
      completely and fixes up every single callsite in the kernel (there were
      about 224, not including the slab allocator definitions themselves,
      or the documentation references).
      
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
  32. 09 May, 2007 2 commits