1. 22 5月, 2009 3 次提交
  2. 15 4月, 2009 1 次提交
  3. 24 3月, 2009 1 次提交
    • S
      GFS2: Merge lock_dlm module into GFS2 · f057f6cd
      Steven Whitehouse 提交于
      This is the big patch that I've been working on for some time
      now. There are many reasons for wanting to make this change
      such as:
       o Reducing overhead by eliminating duplicated fields between structures
       o Simplifcation of the code (reduces the code size by a fair bit)
       o The locking interface is now the DLM interface itself as proposed
         some time ago.
       o Fewer lookups of glocks when processing replies from the DLM
       o Fewer memory allocations/deallocations for each glock
       o Scope to do further optimisations in the future (but this patch is
         more than big enough for now!)
      
      Please note that (a) this patch relates to the lock_dlm module and
      not the DLM itself, that is still a separate module; and (b) that
      we retain the ability to build GFS2 as a standalone single node
      filesystem with out requiring the DLM.
      
      This patch needs a lot of testing, hence my keeping it I restarted
      my -git tree after the last merge window. That way, this has the maximum
      exposure before its merged. This is (modulo a few minor bug fixes) the
      same patch that I've been posting on and off the the last three months
      and its passed a number of different tests so far.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      f057f6cd
  4. 05 1月, 2009 7 次提交
  5. 14 11月, 2008 1 次提交
  6. 18 9月, 2008 1 次提交
    • S
      GFS2: high time to take some time over atime · 719ee344
      Steven Whitehouse 提交于
      Until now, we've used the same scheme as GFS1 for atime. This has failed
      since atime is a per vfsmnt flag, not a per fs flag and as such the
      "noatime" flag was not getting passed down to the filesystems. This
      patch removes all the "special casing" around atime updates and we
      simply use the VFS's atime code.
      
      The net result is that GFS2 will now support all the same atime related
      mount options of any other filesystem on a per-vfsmnt basis. We do lose
      the "lazy atime" updates, but we gain "relatime". We could add lazy
      atime to the VFS at a later date, if there is a requirement for that
      variant still - I suspect relatime will be enough.
      
      Also we lose about 100 lines of code after this patch has been applied,
      and I have a suspicion that it will speed things up a bit, even when
      atime is "on". So it seems like a nice clean up as well.
      
      From a user perspective, everything stays the same except the loss of
      the per-fs atime quantum tweekable (ought to be per-vfsmnt at the very
      least, and to be honest I don't think anybody ever used it) and that a
      number of options which were ignored before now work correctly.
      
      Please let me know if you've got any comments. I'm pushing this out
      early so that you can all see what my plans are.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      719ee344
  7. 05 9月, 2008 1 次提交
  8. 27 8月, 2008 1 次提交
    • S
      GFS2: Fix & clean up GFS2 rename · 0188d6c5
      Steven Whitehouse 提交于
      This patch fixes a locking issue in the rename code by ensuring that we hold
      the per sb rename lock over both directory and "other" renames which involve
      different parent directories.
      
      At the same time, this moved the (only called from one place) function
      gfs2_ok_to_move into the file that its called from, so we can mark it
      static. This should make a code a bit easier to follow.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      Cc: Peter Staubach <staubach@redhat.com>
      0188d6c5
  9. 27 7月, 2008 1 次提交
  10. 10 7月, 2008 1 次提交
  11. 03 7月, 2008 1 次提交
    • M
      [GFS2] don't call permission() · f58ba889
      Miklos Szeredi 提交于
      GFS2 calls permission() to verify permissions after locks on the files
      have been taken.
      
      For this it's sufficient to call gfs2_permission() instead.  This
      results in the following changes:
      
        - IS_RDONLY() check is not performed
        - IS_IMMUTABLE() check is not performed
        - devcgroup_inode_permission() is not called
        - security_inode_permission() is not called
      
      IS_RDONLY() should be unnecessary anyway, as the per-mount read-only
      flag should provide protection against read-only remounts during
      operations.  do_gfs2_set_flags() has been fixed to perform
      mnt_want_write()/mnt_drop_write() to protect against remounting
      read-only.
      
      IS_IMMUTABLE has been added to gfs2_permission()
      
      Repeating the security checks seems to be pointless, as they don't
      normally change, and if they do, it's independent of the filesystem
      state.
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      f58ba889
  12. 12 5月, 2008 1 次提交
  13. 10 4月, 2008 1 次提交
  14. 31 3月, 2008 9 次提交
    • C
      [GFS2] possible null pointer dereference fixup · 182fe5ab
      Cyrill Gorcunov 提交于
      gfs2_alloc_get may fail so we have to check it to prevent
      NULL pointer dereference.
      Signed-off-by: NCyrill Gorcunov <gorcunov@gamil.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      182fe5ab
    • D
      [GFS2] re-support special inode · 43a33c53
      Denis Cheng 提交于
      a previous commit removed call to
      init_special_inode from inode lookuping, this cause problems as:
      
       # mknod /mnt/gfs2/dev/null c 1 3
       # cat /mnt/gfs2/dev/null
       cat: /mnt/gfs2/dev/null: Invalid argument
      
      without special inode, GFS2 cannot support char device file,
      block device file, fifo pipe, and socket file, lose many important
      features as a common file system.
      
      this one line patch re add special inode support.
      Signed-off-by: NDenis Cheng <crquan@gmail.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      43a33c53
    • D
      [GFS2] remove gfs2_dev_iops · d83225d4
      Denis Cheng 提交于
      struct inode_operations gfs2_dev_iops is always the same as gfs2_file_iops,
      since Jan 2006, when GFS2 merged into mainstream kernel.
      
      So one of them could be removed.
      Signed-off-by: NDenis Cheng <crquan@gmail.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      d83225d4
    • S
      [GFS2] Fix a page lock / glock deadlock · 7afd88d9
      Steven Whitehouse 提交于
      We've previously been using a "try lock" in readpage on the basis that
      it would prevent deadlocks due to the inverted lock ordering (our normal
      lock ordering is glock first and then page lock). Unfortunately tests
      have shown that this isn't enough. If the glock has a demote request
      queued such that run_queue() in the glock code tries to do a demote when
      its called under readpage then it will try and write out all the dirty
      pages which requires locking them. This then deadlocks with the page
      locked by readpage.
      
      The solution is to always require two calls into readpage. The first
      unlocks the page, gets the glock and returns AOP_TRUNCATED_PAGE, the
      second does the actual readpage and unlocks the glock & page as
      required.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      7afd88d9
    • S
      [GFS2] Eliminate (almost) duplicate field from gfs2_inode · 77658aad
      Steven Whitehouse 提交于
      The blocks counter is almost a duplicate of the i_blocks
      field in the VFS inode. The only difference is that i_blocks
      can be only 32bits long for 32bit arch without large single file
      support. Since GFS2 doesn't handle the non-large single file
      case (for 32 bit anyway) this adds a new config dependency on
      64BIT || LSF. This has always been the case, however we've never
      explicitly said so before.
      
      Even if we do add support for the non-LSF case, we will still
      not require this field to be duplicated since we will not be
      able to access oversized files anyway.
      
      So the net result of all this is that we shave 8 bytes from a gfs2_inode
      and get our config deps correct.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      77658aad
    • S
      [GFS2] Reduce inode size by merging fields · ce276b06
      Steven Whitehouse 提交于
      There were three fields being used to keep track of the location
      of the most recently allocated block for each inode. These have
      been merged into a single field in order to better keep the
      data and metadata for an inode close on disk, and also to reduce
      the space required for storage.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      ce276b06
    • S
      [GFS2] Shrink & rename di_depth · 9a004508
      Steven Whitehouse 提交于
      This patch forms a pair with the previous patch which shrunk
      di_height. Like that patch di_depth is renamed i_depth and moved
      into struct gfs2_inode directly. Also the field goes from 16 bits
      to 8 bits since it is also limited to a max value which is rather
      small (17 in this case). In addition we also now validate the field
      against this maximum value when its read in.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      9a004508
    • B
      [GFS2] Fix debug inode printing · ca390601
      Bob Peterson 提交于
      I noticed that the latest change to i_height got rid of the
      value from the inode dump.  This patch adds it back.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      ca390601
    • S
      [GFS2] Streamline indirect pointer tree height calculation · ecc30c79
      Steven Whitehouse 提交于
      This patch improves the calculation of the tree height in order to reduce
      the number of operations which are carried out on each call to gfs2_block_map.
      In the common case, we now make a single comparison, rather than calculating
      the required tree height from scratch each time. Also in the case that the
      tree does need some extra height, we start from the current height rather from
      zero when we work out what the new height ought to be.
      
      In addition the di_height field is moved into the inode proper and reduced
      in size to a u8 since the value must be between 0 and GFS2_MAX_META_HEIGHT (10).
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      ecc30c79
  15. 08 2月, 2008 1 次提交
  16. 25 1月, 2008 7 次提交
    • B
      [GFS2] Lockup on error · 1b8177ec
      Bob Peterson 提交于
      I spotted this bug while I was digging around.  Looks like it could cause
      a lockup in some rare error condition.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      1b8177ec
    • S
      [GFS2] Reduce inode size by moving i_alloc out of line · 6dbd8224
      Steven Whitehouse 提交于
      It is possible to reduce the size of GFS2 inodes by taking the i_alloc
      structure out of the gfs2_inode. This patch allocates the i_alloc
      structure whenever its needed, and frees it afterward. This decreases
      the amount of low memory we use at the expense of requiring a memory
      allocation for each page or partial page that we write. A quick test
      with postmark shows that the overhead is not measurable and I also note
      that OCFS2 use the same approach.
      
      In the future I'd like to solve the problem by shrinking down the size
      of the members of the i_alloc structure, but for now, this reduces the
      immediate problem of using too much low-memory on x86 and doesn't add
      too much overhead.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      6dbd8224
    • W
      [GFS2] Remove lock methods for lock_nolock protocol · c97bfe43
      Wendy Cheng 提交于
      GFS2 supports two modes of locking - lock_nolock for single node filesystem
      and lock_dlm for cluster mode locking. The gfs2 lock methods are removed from
      file operation table for lock_nolock protocol. This would allow VFS to handle
      posix lock and flock logics just like other in-tree filesystems without
      duplication.
      Signed-off-by: NS. Wendy Cheng <wcheng@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      c97bfe43
    • S
      [GFS2] Don't add glocks to the journal · 2bcd610d
      Steven Whitehouse 提交于
      The only reason for adding glocks to the journal was to keep track
      of which locks required a log flush prior to release. We add a
      flag to the glock to allow this check to be made in a simpler way.
      
      This reduces the size of a glock (by 12 bytes on i386, 24 on x86_64)
      and means that we can avoid extra work during the journal flush.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      2bcd610d
    • S
      [GFS2] Introduce gfs2_set_aops() · 5561093e
      Steven Whitehouse 提交于
      Just like ext3 we now have three sets of address space operations
      to cover the cases of writeback, ordered and journalled data
      writes. This means that the individual operations can now become
      less complicated as we are able to remove some of the tests for
      file data mode from the code.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      5561093e
    • S
      [GFS2] Remove useless i_cache from inodes · f91a0d3e
      Steven Whitehouse 提交于
      The i_cache was designed to keep references to the indirect blocks
      used during block mapping so that they didn't have to be looked
      up continually. The idea failed because there are too many places
      where the i_cache needs to be freed, and this has in the past been
      the cause of many bugs.
      
      In addition there was no performance benefit being gained since the
      disk blocks in question were cached anyway. So this patch removes
      it in order to simplify the code to prepare for other changes which
      would otherwise have had to add further support for this feature.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      f91a0d3e
    • S
      [GFS2] Clean up internal read function · 51ff87bd
      Steven Whitehouse 提交于
      As requested by Christoph, this patch cleans up GFS2's internal
      read function so that it no longer uses the do_generic_mapping_read
      function. This function is obsolete and GFS2 is the last user of it.
      
      As a side effect the internal read code gets smaller and easier
      to read and gfs2_readpage is split into two. One function has the locking
      and the other function has the rest of the logic.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      51ff87bd
  17. 10 10月, 2007 2 次提交
    • B
      [GFS2] Alternate gfs2_iget to avoid looking up inodes being freed · 7a9f53b3
      Benjamin Marzinski 提交于
      There is a possible deadlock between two processes on the same node, where one
      process is deleting an inode, and another process is looking for allocated but
      unused inodes to delete in order to create more space.
      
      process A does an iput() on inode X, and it's i_count drops to 0. This causes
      iput_final() to be called, which puts an inode into state I_FREEING at
      generic_delete_inode(). There no point between when iput_final() is called, and
      when I_FREEING is set where GFS2 could acquire any glocks. Once I_FREEING is
      set, no other process on that node can successfully look up that inode until
      the delete finishes.
      
      process B locks the the resource group for the same inode in get_local_rgrp(),
      which is called by gfs2_inplace_reserve_i()
      
      process A tries to lock the resource group for the inode in
      gfs2_dinode_dealloc(), but it's already locked by process B
      
      process B waits in find_inode for the inode to have the I_FREEING state cleared.
      
      Deadlock.
      
      This patch solves the problem by adding an alternative to gfs2_iget(),
      gfs2_iget_skip(), that simply skips any inodes that are in the I_FREEING
      state.o The alternate test function is just like the original one, except that
      it fails if the inode is being freed, and sets a skipped flag. The alternate
      set function is just like the original, except that it fails if the skipped
      flag is set. Only try_rgrp_unlink() calls gfs2_iget_skip() instead of
      gfs2_iget().
      Signed-off-by: NBenjamin E. Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      7a9f53b3
    • W
      [GFS2] fix inode meta data corruption · e9bd2b3b
      Wendy Cheng 提交于
      Fix a nasty inode meta data corruption issue by keeping the buffer head in
      icache array. This buffer needs to stay in memory until journal flush occurs
      Otherwise, gfs2_meta_inode_buffer could do a disk read before the inode hits
      disk. It ends up with meta data corruptions. The buffer will be released as
      part of the existing journal flush logic.
      Signed-off-by: NS. Wendy Cheng <wcheng@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      e9bd2b3b