1. 15 7月, 2010 3 次提交
    • B
      GFS2: Fix kernel NULL pointer dereference by dlm_astd · b1becbde
      Bob Peterson 提交于
      This patch fixes a problem in an error path when looking
      up dinodes.  There are two sister-functions, gfs2_inode_lookup
      and gfs2_process_unlinked_inode.  Both functions acquire and
      hold the i_iopen glock for the dinode being looked up. The last
      thing they try to do is hold the i_gl glock for the dinode.
      If that glock fails for some reason, the error path was
      incorrectly calling gfs2_glock_put for the i_iopen glock twice.
      This resulted in the glock being prematurely freed.  The
      "minimum hold time" usually kept the glock in memory, but the
      lock interface to dlm (aka lock_dlm) freed its memory for the
      glock.  In some circumstances, it would cause dlm's dlm_astd daemon
      to try to call the bast function for the freed lock_dlm memory,
      which resulted in a NULL pointer dereference.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      b1becbde
    • B
      GFS2: recovery stuck on transaction lock · b7dc2df5
      Bob Peterson 提交于
      This patch fixes bugzilla bug #590878: GFS2: recovery stuck on
      transaction lock.  We set the frozen flag on the glock when we receive
      a completion that cannot be delivered due to blocked locks. At that
      point we check to see whether the first waiting holder has the noexp
      flag set. If the noexp lock is queued later, then we need to unfreeze
      the glock at that point in time, namely, in the glock work function.
      
      This patch was originally written by Steve Whitehouse, but since
      he's on holiday, I'm submitting it.  It's been well tested with a
      complex recovery test called revolver.
      Signed-off-by: NSteve Whitehouse <swhiteho@redhat.com>
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      b7dc2df5
    • B
      GFS2: O_TRUNC not working on stuffed files across cluster · a8bf2bc2
      Bob Peterson 提交于
      This patch replaces a statement that got dropped out by accident.
      Without the patch, truncates on stuffed (very small) files cause
      those files to have an unpredictable size.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      a8bf2bc2
  2. 06 7月, 2010 4 次提交
    • C
      writeback: simplify the write back thread queue · 83ba7b07
      Christoph Hellwig 提交于
      First remove items from work_list as soon as we start working on them.  This
      means we don't have to track any pending or visited state and can get
      rid of all the RCU magic freeing the work items - we can simply free
      them once the operation has finished.  Second use a real completion for
      tracking synchronous requests - if the caller sets the completion pointer
      we complete it, otherwise use it as a boolean indicator that we can free
      the work item directly.  Third unify struct wb_writeback_args and struct
      bdi_work into a single data structure, wb_writeback_work.  Previous we
      set all parameters into a struct wb_writeback_args, copied it into
      struct bdi_work, copied it again on the stack to use it there.  Instead
      of just allocate one structure dynamically or on the stack and use it
      all the way through the stack.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      83ba7b07
    • C
      writeback: split writeback_inodes_wb · edadfb10
      Christoph Hellwig 提交于
      The case where we have a superblock doesn't require a loop here as we scan
      over all inodes in writeback_sb_inodes. Split it out into a separate helper
      to make the code simpler.  This also allows to get rid of the sb member in
      struct writeback_control, which was rather out of place there.
      
      Also update the comments in writeback_sb_inodes that explain the handling
      of inodes from wrong superblocks.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      edadfb10
    • C
      writeback: remove writeback_inodes_wbc · 9c3a8ee8
      Christoph Hellwig 提交于
      This was just an odd wrapper around writeback_inodes_wb.  Removing this
      also allows to get rid of the bdi member of struct writeback_control
      which was rather out of place there.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      9c3a8ee8
    • S
      ceph: fix crush device 'out' threshold to 1.0, not 0.1 · 153a1093
      Sage Weil 提交于
      Fix a typo that made any OSD weighted between 0.1 and 1.0 effectively
      weighted as 1.0 (fully in).
      Signed-off-by: NSage Weil <sage@newdream.net>
      153a1093
  3. 01 7月, 2010 1 次提交
  4. 30 6月, 2010 9 次提交
  5. 28 6月, 2010 1 次提交
  6. 25 6月, 2010 5 次提交
  7. 24 6月, 2010 3 次提交
    • D
      xfs: remove block number from inode lookup code · 7b6259e7
      Dave Chinner 提交于
      The block number comes from bulkstat based inode lookups to shortcut
      the mapping calculations. We ar enot able to trust anything from
      bulkstat, so drop the block number as well so that the correct
      lookups and mappings are always done.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      7b6259e7
    • D
      xfs: rename XFS_IGET_BULKSTAT to XFS_IGET_UNTRUSTED · 1920779e
      Dave Chinner 提交于
      Inode numbers may come from somewhere external to the filesystem
      (e.g. file handles, bulkstat information) and so are inherently
      untrusted. Rename the flag we use for these lookups to make it
      obvious we are doing a lookup of an untrusted inode number and need
      to verify it completely before trying to read it from disk.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      1920779e
    • D
      xfs: validate untrusted inode numbers during lookup · 7124fe0a
      Dave Chinner 提交于
      When we decode a handle or do a bulkstat lookup, we are using an
      inode number we cannot trust to be valid. If we are deleting inode
      chunks from disk (default noikeep mode), then we cannot trust the on
      disk inode buffer for any given inode number to correctly reflect
      whether the inode has been unlinked as the di_mode nor the
      generation number may have been updated on disk.
      
      This is due to the fact that when we delete an inode chunk, we do
      not write the clusters back to disk when they are removed - instead
      we mark them stale to avoid them being written back potentially over
      the top of something that has been subsequently allocated at that
      location. The result is that we can have locations of disk that look
      like they contain valid inodes but in reality do not. Hence we
      cannot simply convert the inode number to a block number and read
      the location from disk to determine if the inode is valid or not.
      
      As a result, and XFS_IGET_BULKSTAT lookup needs to actually look the
      inode up in the inode allocation btree to determine if the inode
      number is valid or not.
      
      It should be noted even on ikeep filesystems, there is the
      possibility that blocks on disk may look like valid inode clusters.
      e.g. if there are filesystem images hosted on the filesystem. Hence
      even for ikeep filesystems we really need to validate that the inode
      number is valid before issuing the inode buffer read.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      7124fe0a
  8. 23 6月, 2010 1 次提交
  9. 24 6月, 2010 1 次提交
  10. 23 6月, 2010 4 次提交
  11. 22 6月, 2010 2 次提交
    • S
      ceph: delay umount until all mds requests drop inode+dentry refs · 17c688c3
      Sage Weil 提交于
      This fixes a race between handle_reply finishing an mds request, signalling
      completion, and then dropping the request structing and its dentry+inode
      refs, and pre_umount function waiting for requests to finish before
      letting the vfs tear down the dcache.  If umount was delayed waiting for
      mds requests, we could race and BUG in shrink_dcache_for_umount_subtree
      because of a slow dput.
      
      This delays umount until the msgr queue flushes, which means handle_reply
      will exit and will have dropped the ceph_mds_request struct.  I'm assuming
      the VFS has already ensured that its calls have all completed and those
      request refs have thus been dropped as well (I haven't seen that race, at
      least).
      Signed-off-by: NSage Weil <sage@newdream.net>
      17c688c3
    • S
      ceph: handle splice_dentry/d_materialize_unique error in readdir_prepopulate · d69ed05a
      Sage Weil 提交于
      Handle a splice_dentry failure (due to a d_materialize_unique error)
      without crashing.  (Also, report the error code.)
      Signed-off-by: NSage Weil <sage@newdream.net>
      d69ed05a
  12. 18 6月, 2010 1 次提交
  13. 17 6月, 2010 5 次提交