1. 16 1月, 2014 1 次提交
    • S
      GFS2: Don't use ENOBUFS when ENOMEM is the correct error code · ac3beb6a
      Steven Whitehouse 提交于
      Al Viro has tactfully pointed out that we are using the incorrect
      error code in some cases. This patch fixes that, and also removes
      the (unused) return value for glock dumping.
      
      >        * gfs2_iget() - ENOBUFS instead of ENOMEM.  ENOBUFS is
      > "No buffer space available (POSIX.1 (XSI STREAMS option))" and since
      > we don't support STREAMS it's probably fair game, but... what the hell?
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      ac3beb6a
  2. 15 10月, 2013 1 次提交
    • S
      GFS2: Use lockref for glocks · e66cf161
      Steven Whitehouse 提交于
      Currently glocks have an atomic reference count and also a spinlock
      which covers various internal fields, such as the state. This intent of
      this patch is to replace the spinlock and the atomic reference count
      with a lockref structure. This contains a spinlock which we can continue
      to use as before, and a reference counter which is used in conjuction
      with the spinlock to replace the previous atomic counter.
      
      As a result of this there are some new rules for reference counting on
      glocks. We need to distinguish between reference count changes under
      gl_spin (which are now just increment or decrement of the new counter,
      provided the count cannot hit zero) and those which are outside of
      gl_spin, but which now take gl_spin internally.
      
      The conversion is relatively straight forward. There is probably some
      further clean up which can be done, but the priority at this stage is to
      make the change in as simple a manner as possible.
      
      A consequence of this change is that the reference count is being
      decoupled from the lru list processing. This should allow future
      adoption of the lru_list code with glocks in due course.
      
      The reason for using the "dead" state and not just relying on 0 being
      the "invalid state" is so that in due course 0 ref counts can be
      allowable. The intent is to eventually be able to remove the ref count
      changes which are currently hidden away in state_change().
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      e66cf161
  3. 08 4月, 2013 1 次提交
    • S
      GFS2: Remove gfs2_refresh_inode from inode creation path · 28fb3027
      Steven Whitehouse 提交于
      The original method for creating inodes used in GFS2 was to fill
      out a buffer, with all the information, and then to read that
      buffer into the in-core inode, using gfs2_refresh_inode()
      
      The problem with this approach is that all the inode's fields
      need to be calculated ahead of time, and were stored in various
      variables making the code rather complicated.
      
      The new approach is simply to allocate the in-core inode earlier
      and fill in as many fields as possible ahead of time. These can
      then be used to initilise the on disk representation. The
      code has been working towards the point where it is possible
      to remove gfs2_refresh_inode() because all the fields are
      correctly initialised ahead of time. We've now reached that
      milestone, and have reversed the order of setting up the in
      core and on disk inodes.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      28fb3027
  4. 07 11月, 2012 1 次提交
  5. 11 1月, 2012 1 次提交
  6. 01 11月, 2011 1 次提交
  7. 15 7月, 2011 1 次提交
  8. 20 4月, 2011 1 次提交
    • S
      GFS2: Alter point of entry to glock lru list for glocks with an address_space · 29687a2a
      Steven Whitehouse 提交于
      Rather than allowing the glocks to be scheduled for possible
      reclaim as soon as they have exited the journal, this patch
      delays their entry to the list until the glocks in question
      are no longer in use.
      
      This means that we will rely on the vm for writeback of all
      dirty data and metadata from now on. When glocks are added
      to the lru list they should be freeable much faster since all
      the I/O required to free them should have already been completed.
      
      This should lead to much better I/O patterns under low memory
      conditions.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      29687a2a
  9. 09 3月, 2011 1 次提交
    • S
      GFS2: Fix glock deallocation race · fc0e38da
      Steven Whitehouse 提交于
      This patch fixes a race in deallocating glocks which was introduced
      in the RCU glock patch. We need to ensure that the glock count is
      kept correct even in the case that there is a race to add a new
      glock into the hash table. Also, to avoid having to wait for an
      RCU grace period, the glock counter can be decremented before
      call_rcu() is called.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      fc0e38da
  10. 21 1月, 2011 1 次提交
    • S
      GFS2: Use RCU for glock hash table · bc015cb8
      Steven Whitehouse 提交于
      This has a number of advantages:
      
       - Reduces contention on the hash table lock
       - Makes the code smaller and simpler
       - Should speed up glock dumps when under load
       - Removes ref count changing in examine_bucket
       - No longer need hash chain lock in glock_put() in common case
      
      There are some further changes which this enables and which
      we may do in the future. One is to look at using SLAB_RCU,
      and another is to look at using a per-cpu counter for the
      per-sb glock counter, since that is touched twice in the
      lifetime of each glock (but only used at umount time).
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      bc015cb8
  11. 30 11月, 2010 3 次提交
  12. 18 10月, 2010 1 次提交
  13. 01 3月, 2010 1 次提交
    • S
      GFS2: Metadata address space clean up · 009d8518
      Steven Whitehouse 提交于
      Since the start of GFS2, an "extra" inode has been used to store
      the metadata belonging to each inode. The only reason for using
      this inode was to have an extra address space, the other fields
      were unused. This means that the memory usage was rather inefficient.
      
      The reason for keeping each inode's metadata in a separate address
      space is that when glocks are requested on remote nodes, we need to
      be able to efficiently locate the data and metadata which relating
      to that glock (inode) in order to sync or sync and invalidate it
      (depending on the remotely requested lock mode).
      
      This patch adds a new type of glock, which has in addition to
      its normal fields, has an address space. This applies to all
      inode and rgrp glocks (but to no other glock types which remain
      as before). As a result, we no longer need to have the second
      inode.
      
      This results in three major improvements:
       1. A saving of approx 25% of memory used in caching inodes
       2. A removal of the circular dependency between inodes and glocks
       3. No confusion between "normal" and "metadata" inodes in super.c
      
      Although the first of these is the more immediately apparent, the
      second is just as important as it now enables a number of clean
      ups at umount time. Those will be the subject of future patches.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      009d8518
  14. 03 2月, 2010 1 次提交
    • S
      GFS2: Extend umount wait coverage to full glock lifetime · 8f05228e
      Steven Whitehouse 提交于
      Although all glocks are, by the time of the umount glock wait,
      scheduled for demotion, some of them haven't made it far
      enough through the process for the original set of waiting
      code to wait for them.
      
      This extends the ref count to the whole glock lifetime in order
      to ensure that the waiting does catch all glocks. It does make
      it a bit more invasive, but it seems the only sensible solution
      at the moment.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      8f05228e
  15. 03 12月, 2009 1 次提交
  16. 30 7月, 2009 1 次提交
  17. 24 3月, 2009 1 次提交
    • S
      GFS2: Merge lock_dlm module into GFS2 · f057f6cd
      Steven Whitehouse 提交于
      This is the big patch that I've been working on for some time
      now. There are many reasons for wanting to make this change
      such as:
       o Reducing overhead by eliminating duplicated fields between structures
       o Simplifcation of the code (reduces the code size by a fair bit)
       o The locking interface is now the DLM interface itself as proposed
         some time ago.
       o Fewer lookups of glocks when processing replies from the DLM
       o Fewer memory allocations/deallocations for each glock
       o Scope to do further optimisations in the future (but this patch is
         more than big enough for now!)
      
      Please note that (a) this patch relates to the lock_dlm module and
      not the DLM itself, that is still a separate module; and (b) that
      we retain the ability to build GFS2 as a standalone single node
      filesystem with out requiring the DLM.
      
      This patch needs a lot of testing, hence my keeping it I restarted
      my -git tree after the last merge window. That way, this has the maximum
      exposure before its merged. This is (modulo a few minor bug fixes) the
      same patch that I've been posting on and off the the last three months
      and its passed a number of different tests so far.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      f057f6cd
  18. 05 1月, 2009 4 次提交
    • S
      Revert "GFS2: Fix use-after-free bug on umount" · fefc03bf
      Steven Whitehouse 提交于
      This reverts commit 78802499912f1ba31ce83a94c55b5a980f250a43.
      
      The original patch is causing problems in relation to order of
      operations at umount in relation to jdata files. I need to fix
      this a different way.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      fefc03bf
    • S
      GFS2: Fix use-after-free bug on umount · 3af165ac
      Steven Whitehouse 提交于
      There was a use-after-free with the GFS2 super block during
      umount. This patch moves almost all of the umount code from
      ->put_super into ->kill_sb, the only bit that cannot be moved
      being the glock hash clearing which has to remain as ->put_super
      due to umount ordering requirements. As a result its now obvious
      that the kfree is the final operation, whereas before it was
      hidden in ->put_super.
      
      Also gfs2_jindex_free is then only referenced from a single file
      so thats moved and marked static too.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      3af165ac
    • S
      GFS2: Kill two daemons with one patch · 97cc1025
      Steven Whitehouse 提交于
      This patch removes the two daemons, gfs2_scand and gfs2_glockd
      and replaces them with a shrinker which is called from the VM.
      
      The net result is that GFS2 responds better when there is memory
      pressure, since it shrinks the glock cache at the same rate
      as the VFS shrinks the dcache and icache. There are no longer
      any time based criteria for shrinking glocks, they are kept
      until such time as the VM asks for more memory and then we
      demote just as many glocks as required.
      
      There are potential future changes to this code, including the
      possibility of sorting the glocks which are to be written back
      into inode number order, to get a better I/O ordering. It would
      be very useful to have an elevator based workqueue implementation
      for this, as that would automatically deal with the read I/O cases
      at the same time.
      
      This patch is my answer to Andrew Morton's remark, made during
      the initial review of GFS2, asking why GFS2 needs so many kernel
      threads, the answer being that it doesn't :-) This patch is a
      net loss of about 200 lines of code.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      97cc1025
    • S
      GFS2: Fix "truncate in progress" hang · 813e0c46
      Steven Whitehouse 提交于
      Following on from the recent clean up of gfs2_quotad, this patch moves
      the processing of "truncate in progress" inodes from the glock workqueue
      into gfs2_quotad. This fixes a hang due to the "truncate in progress"
      processing requiring glocks in order to complete.
      
      It might seem odd to use gfs2_quotad for this particular item, but
      we have to use a pre-existing thread since creating a thread implies
      a GFP_KERNEL memory allocation which is not allowed from the glock
      workqueue context. Of the existing threads, gfs2_logd and gfs2_recoverd
      may deadlock if used for this operation. gfs2_scand and gfs2_glockd are
      both scheduled for removal at some (hopefully not too distant) future
      point. That leaves only gfs2_quotad whose workload is generally fairly
      light and is easily adapted for this extra task.
      
      Also, as a result of this change, it opens the way for a future patch to
      make the reading of the inode's information asynchronous with respect to
      the glock workqueue, which is another improvement that has been on the list
      for some time now.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      813e0c46
  19. 18 9月, 2008 1 次提交
    • S
      GFS2: high time to take some time over atime · 719ee344
      Steven Whitehouse 提交于
      Until now, we've used the same scheme as GFS1 for atime. This has failed
      since atime is a per vfsmnt flag, not a per fs flag and as such the
      "noatime" flag was not getting passed down to the filesystems. This
      patch removes all the "special casing" around atime updates and we
      simply use the VFS's atime code.
      
      The net result is that GFS2 will now support all the same atime related
      mount options of any other filesystem on a per-vfsmnt basis. We do lose
      the "lazy atime" updates, but we gain "relatime". We could add lazy
      atime to the VFS at a later date, if there is a requirement for that
      variant still - I suspect relatime will be enough.
      
      Also we lose about 100 lines of code after this patch has been applied,
      and I have a suspicion that it will speed things up a bit, even when
      atime is "on". So it seems like a nice clean up as well.
      
      From a user perspective, everything stays the same except the loss of
      the per-fs atime quantum tweekable (ought to be per-vfsmnt at the very
      least, and to be honest I don't think anybody ever used it) and that a
      number of options which were ignored before now work correctly.
      
      Please let me know if you've got any comments. I'm pushing this out
      early so that you can all see what my plans are.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      719ee344
  20. 27 6月, 2008 2 次提交
    • S
      [GFS2] Remove remote lock dropping code · 1bdad606
      Steven Whitehouse 提交于
      There are several reasons why this is undesirable:
      
       1. It never happens during normal operation anyway
       2. If it does happen it causes performance to be very, very poor
       3. It isn't likely to solve the original problem (memory shortage
          on remote DLM node) it was supposed to solve
       4. It uses a bunch of arbitrary constants which are unlikely to be
          correct for any particular situation and for which the tuning seems
          to be a black art.
       5. In an N node cluster, only 1/N of the dropped locked will actually
          contribute to solving the problem on average.
      
      So all in all we are better off without it. This also makes merging
      the lock_dlm module into GFS2 a bit easier.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      1bdad606
    • S
      [GFS2] Clean up the glock core · 6802e340
      Steven Whitehouse 提交于
      This patch implements a number of cleanups to the core of the
      GFS2 glock code. As a result a lot of code is removed. It looks
      like a really big change, but actually a large part of this patch
      is either removing or moving existing code.
      
      There are some new bits too though, such as the new run_queue()
      function which is considerably streamlined. Highlights of this
      patch include:
      
       o Fixes a cluster coherency bug during SH -> EX lock conversions
       o Removes the "glmutex" code in favour of a single bit lock
       o Removes the ->go_xmote_bh() for inodes since it was duplicating
         ->go_lock()
       o We now only use the ->lm_lock() function for both locks and
         unlocks (i.e. unlock is a lock with target mode LM_ST_UNLOCKED)
       o The fast path is considerably shortly, giving performance gains
         especially with lock_nolock
       o The glock_workqueue is now used for all the callbacks from the DLM
         which allows us to simplify the lock_dlm module (see following patch)
       o The way is now open to make further changes such as eliminating the two
         threads (gfs2_glockd and gfs2_scand) in favour of a more efficient
         scheme.
      
      This patch has undergone extensive testing with various test suites
      so it should be pretty stable by now.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      Cc: Bob Peterson <rpeterso@redhat.com>
      6802e340
  21. 31 3月, 2008 2 次提交
    • S
      [GFS2] Fix a page lock / glock deadlock · 7afd88d9
      Steven Whitehouse 提交于
      We've previously been using a "try lock" in readpage on the basis that
      it would prevent deadlocks due to the inverted lock ordering (our normal
      lock ordering is glock first and then page lock). Unfortunately tests
      have shown that this isn't enough. If the glock has a demote request
      queued such that run_queue() in the glock code tries to do a demote when
      its called under readpage then it will try and write out all the dirty
      pages which requires locking them. This then deadlocks with the page
      locked by readpage.
      
      The solution is to always require two calls into readpage. The first
      unlocks the page, gets the glock and returns AOP_TRUNCATED_PAGE, the
      second does the actual readpage and unlocks the glock & page as
      required.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      7afd88d9
    • A
      [GFS2] make gfs2_glock_hold() static · 048786f1
      Adrian Bunk 提交于
      gfs2_glock_hold() can now become static.
      Signed-off-by: NAdrian Bunk <bunk@kernel.org>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      048786f1
  22. 08 2月, 2008 1 次提交
  23. 10 10月, 2007 2 次提交
    • A
      [GFS2] flocks from same process trip kernel BUG at fs/gfs2/glock.c:1118! · b4c20166
      Abhijith Das 提交于
      This patch adds a new flag to the gfs2_holder structure GL_FLOCK.
      It is set on holders of glocks representing flocks. This flag is
      checked in add_to_queue() and a process is permitted to queue more
      than one holder onto a glock if it is set. This solves the issue
      of a process not being able to do multiple flocks on the same file.
      Through a single descriptor, a process can now promote and demote
      flocks. Through multiple descriptors a process can now queue
      multiple flocks on the same file. There's still the problem of
      a process deadlocking itself (because gfs2 blocking locks are not
      interruptible) by queueing incompatible deadlock.
      Signed-off-by: NAbhijith Das <adas@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      b4c20166
    • S
      [GFS2] Reduce number of gfs2_scand processes to one · 8fbbfd21
      Steven Whitehouse 提交于
      We only need a single gfs2_scand process rather than the one
      per filesystem which we had previously. As a result the parameter
      determining the frequency of gfs2_scand runs becomes a module
      parameter rather than a mount parameter as it was before.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      8fbbfd21
  24. 09 7月, 2007 1 次提交
    • A
      [GFS2] Fix deallocation issues · d93cfa98
      Abhijith Das 提交于
      There were two issues during deallocation of unlinked inodes. The
      first was relating to the use of a "try" lock which in the case of
      the inode lock wasn't trying hard enough to deallocate in all
      circumstances (now changed to a normal glock) and in the case of
      the iopen lock didn't wait for the demotion of the shared lock before
      attempting to get the exclusive lock, and thereby sometimes (timing dependent)
      not completing the deallocation when it should have done.
      
      The second issue related to the lack of a way to invalidate dcache entries
      on remote nodes (now fixed by this patch) which meant that unlinks were
      taking a long time to return disk space to the fs. By adding some code to
      invalidate the dcache entries across the cluster for unlinked inodes, that
      is now fixed.
      
      This patch was written jointly by Abhijith Das and Steven Whitehouse.
      Signed-off-by: NAbhijith Das <adas@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      d93cfa98
  25. 22 5月, 2007 1 次提交
    • A
      Detach sched.h from mm.h · e8edc6e0
      Alexey Dobriyan 提交于
      First thing mm.h does is including sched.h solely for can_do_mlock() inline
      function which has "current" dereference inside. By dealing with can_do_mlock()
      mm.h can be detached from sched.h which is good. See below, why.
      
      This patch
      a) removes unconditional inclusion of sched.h from mm.h
      b) makes can_do_mlock() normal function in mm/mlock.c
      c) exports can_do_mlock() to not break compilation
      d) adds sched.h inclusions back to files that were getting it indirectly.
      e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
         getting them indirectly
      
      Net result is:
      a) mm.h users would get less code to open, read, preprocess, parse, ... if
         they don't need sched.h
      b) sched.h stops being dependency for significant number of files:
         on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
         after patch it's only 3744 (-8.3%).
      
      Cross-compile tested on
      
      	all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
      	alpha alpha-up
      	arm
      	i386 i386-up i386-defconfig i386-allnoconfig
      	ia64 ia64-up
      	m68k
      	mips
      	parisc parisc-up
      	powerpc powerpc-up
      	s390 s390-up
      	sparc sparc-up
      	sparc64 sparc64-up
      	um-x86_64
      	x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig
      
      as well as my two usual configs.
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e8edc6e0
  26. 01 5月, 2007 3 次提交
    • R
      [GFS2] Red Hat bz 228540: owner references · 04b933f2
      Robert Peterson 提交于
      In Testing the previously posted and accepted patch for
      https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=228540
      I uncovered some gfs2 badness.  It turns out that the current
      gfs2 code saves off a process pointer when glocks is taken
      in both the glock and glock holder structures.  Those
      structures will persist in memory long after the process has
      ended; pointers to poisoned memory.
      
      This problem isn't caused by the 228540 fix; the new capability
      introduced by the fix just uncovered the problem.
      
      I wrote this patch that avoids saving process pointers
      and instead saves off the process pid.  Rather than
      referencing the bad pointers, it now does process lookups.
      There is special code that makes the output nicer for
      printing holder information for processes that have ended.
      
      This patch also adds a stub for the new "sprint_symbol"
      function that exists in Andrew Morton's -mm patch set, but
      won't go into the base kernel until 2.6.22, since it adds
      functionality but doesn't fix a bug.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      04b933f2
    • S
      [GFS2] Fix bz 224480 and cleanup glock demotion code · 3b8249f6
      Steven Whitehouse 提交于
      This patch prevents the printing of a warning message in cases where
      the fs is functioning normally by handing off responsibility for
      unlinked, but still open inodes, to another node for eventual deallocation.
      Also, there is now an improved system for ensuring that such requests
      to other nodes do not get lost. The callback on the iopen lock is
      only ever called when i_nlink == 0 and when a node is unable to deallocate
      it due to it still being in use on another node. When a node receives
      the callback therefore, it knows that i_nlink must be zero, so we mark
      it as such (in gfs2_drop_inode) in order that it will then attempt
      deallocation of the inode itself.
      
      As an additional benefit, queuing a demote request no longer requires
      a memory allocation. This simplifies the code for dealing with gfs2_holders
      as it removes one special case.
      
      There are two new fields in struct gfs2_glock. gl_demote_state is the
      state which the remote node has requested and gl_demote_time is the
      time when the request came in. Both fields are only valid when the
      GLF_DEMOTE flag is set in gl_flags.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      3b8249f6
    • R
      [GFS2] Add gfs2_tool lockdump support to gfs2 (bz 228540) · 7c52b166
      Robert Peterson 提交于
      The attached patch resolves bz 228540.  This adds the capability
      for gfs2 to dump gfs2 locks through the debugfs file system.
      This used to exist in gfs1 as "gfs_tool lockdump" but it's missing from
      gfs2 because all the ioctls were stripped out.  Please see the bugzilla
      for more history about the fix.  This patch is also attached to the bugzilla
      record.
      
      The patch is against Steve Whitehouse's latest nmw git tree kernel
      (2.6.21-rc1) and has been tested on system trin-10.
      Signed-off-by: NRobert Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      7c52b166
  27. 06 2月, 2007 4 次提交
    • S
      [GFS2] Tidy up glops calls · b5d32bea
      Steven Whitehouse 提交于
      This patch doesn't make any changes to the ordering of the various
      operations related to glocking, but it does tidy up the calls to the
      glops.c functions to make the structure more obvious.
      
      The two functions: gfs2_glock_xmote_th() and gfs2_glock_drop_th() can be
      made static within glock.c since they are called by every set of glock
      operations. The xmote_th and drop_th glock operations are then made
      conditional upon those two routines existing and called from the
      previously mentioned functions in glock.c respectively.
      
      Also it can be seen that the go_sync operation isn't needed since it can
      easily be replaced by calls to xmote_bh and drop_bh respectively. This
      results in no longer (confusingly) calling back into routines in glock.c
      from glops.c and also reducing the glock operations by one member.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      b5d32bea
    • S
      [GFS2] Remove local exclusive glock mode · 1c0f4872
      Steven Whitehouse 提交于
      Here is a patch for GFS2 to remove the local exclusive flag. In
      the places it was used, mutex's are always held earlier in the
      call path, so it appears redundant in the LM_ST_SHARED case.
      
      Also, the GFS2 holders were setting local exclusive in any case where
      the requested lock was LM_ST_EXCLUSIVE. So the other places in the glock
      code where the flag was tested have been replaced with tests for the
      lock state being LM_ST_EXCLUSIVE in order to ensure the logic is the
      same as before (i.e. LM_ST_EXCLUSIVE is always locally exclusive as well
      as globally exclusive).
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      1c0f4872
    • S
      [GFS2] Remove the "greedy" function from glock.[ch] · e5dab552
      Steven Whitehouse 提交于
      The "greedy" code was an attempt to retain glocks for a minimum length
      of time when they relate to mmap()ed files. The current implementation
      of this feature is not, however, ideal in that it required allocating
      memory in order to do this and its overly complicated.
      
      It also misses the mark by ignoring the other I/O operations which are
      just as likely to suffer from the same problem. So the plan is to remove
      this now and then add the functionality back as part of the glock state
      machine at a later date (and thus take into account all the possible
      users of this feature)
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      e5dab552
    • S
      [GFS2] Clean up/speed up readdir · 3699e3a4
      Steven Whitehouse 提交于
      This removes the extra filldir callback which gfs2 was using to
      enclose an attempt at readahead for inodes during readdir. The
      code was too complicated and also hurts performance badly in the
      case that the getdents64/readdir call isn't being followed by
      stat() and it wasn't even getting it right all the time when it
      was.
      
      As a result, on my test box an "ls" of a directory containing 250000
      files fell from about 7mins (freshly mounted, so nothing cached) to
      between about 15 to 25 seconds. When the directory content was cached,
      the time taken fell from about 3mins to about 4 or 5 seconds.
      
      Interestingly in the cached case, running "ls -l" once reduced the time
      taken for subsequent runs of "ls" to about 6 secs even without this
      patch. Now it turns out that there was a special case of glocks being
      used for prefetching the metadata, but because of the timeouts for these
      locks (set to 10 secs) the metadata was being timed out before it was
      being used and this the prefetch code was constantly trying to prefetch
      the same data over and over.
      
      Calling "ls -l" meant that the inodes were brought into memory and once
      the inodes are cached, the glocks are not disposed of until the inodes
      are pushed out of the cache, thus extending the lifetime of the glocks,
      and thus bringing down the time for subsequent runs of "ls"
      considerably.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      3699e3a4