1. 12 Oct 2011, 1 commit
  2. 13 Aug 2011, 1 commit
    • xfs: remove subdirectories · c59d87c4
      Committed by Christoph Hellwig
      Use the move from Linux 2.6 to Linux 3.x as an excuse to kill the
      annoying subdirectories in the XFS source code.  Besides the large
      number of file renames, the only changes are to the Makefile, a few
      files including headers with the subdirectory prefix, and the binary
      sysctl compat code that includes a header under fs/xfs/ from
      kernel/.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      c59d87c4
  3. 26 Jul 2011, 1 commit
  4. 21 Jul 2011, 2 commits
  5. 08 Jul 2011, 1 commit
    • xfs: improve sync behaviour in the face of aggressive dirtying · 33b8f7c2
      Committed by Christoph Hellwig
      The following script from Wu Fengguang shows very bad behaviour in
      XFS when data is aggressively dirtied during a sync, with sync times
      up to almost 10 times as long as ext4.
      
      A large part of the issue is that XFS writes data out itself two times
      in the ->sync_fs method, overriding the livelock protection in the core
      writeback code.  Another issue is the lock-less xfs_ioend_wait call,
      which doesn't prevent new ioends from being queued up while waiting for
      the count to reach zero.
      
      This patch removes the XFS-internal sync calls and relies on the VFS
      to do its work just like all other filesystems do.  Note that the
      rather suboptimal i_iocount wait is simply removed here.  We already
      do it in ->write_inode, which keeps the current suboptimal
      behaviour.  We'll eventually need to remove that as well, but that's
      material for a separate commit.
      
      ------------------------------ snip ------------------------------
      #!/bin/sh
      
      umount /dev/sda7
      mkfs.xfs -f /dev/sda7
      # mkfs.ext4 /dev/sda7
      # mkfs.btrfs /dev/sda7
      mount /dev/sda7 /fs
      
      echo $((50<<20)) > /proc/sys/vm/dirty_bytes
      
      pid=
      for i in `seq 10`
      do
      	dd if=/dev/zero of=/fs/zero-$i bs=1M count=1000 &
      	pid="$pid $!"
      done
      
      sleep 1
      
      tic=$(date +'%s')
      sync
      tac=$(date +'%s')
      
      echo
      echo sync time: $((tac-tic))
      egrep '(Dirty|Writeback|NFS_Unstable)' /proc/meminfo
      
      pidof dd > /dev/null && { kill -9 $pid; echo sync NOT livelocked; }
      ------------------------------ snip ------------------------------
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reported-by: Wu Fengguang <fengguang.wu@intel.com>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      33b8f7c2
  6. 25 May 2011, 1 commit
  7. 20 May 2011, 1 commit
    • xfs: avoid getting stuck during async inode flushes · ee58abdf
      Committed by Dave Chinner
      When the underlying inode buffer is locked and xfs_sync_inode_attr()
      is doing a non-blocking flush, xfs_iflush() can return EAGAIN.  When
      this happens, clear the error rather than returning it to
      xfs_inode_ag_walk(), as returning EAGAIN will result in the AG walk
      delaying for a short while and trying again. This can result in
      background walks getting stuck on the one AG until the inode buffer
      is unlocked by some other means.
      
      This behaviour was noticed when analysing event traces; the fix was
      confirmed by code inspection and verified via further traces.
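      
      A minimal C sketch of the idea, assuming this era's positive XFS
      error codes and the SYNC_WAIT flag; the surrounding logic is
      illustrative, not the exact upstream diff:
      
      	error = xfs_iflush(ip, flags);
      	/*
      	 * Don't retry non-blocking flushes that hit a locked inode
      	 * buffer; clear EAGAIN so the AG walk moves on instead of
      	 * stalling on this AG.
      	 */
      	if (error == EAGAIN) {
      		ASSERT(!(flags & SYNC_WAIT));
      		error = 0;
      	}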
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      ee58abdf
  8. 10 May 2011, 2 commits
    • xfs: ensure reclaim cursor is reset correctly at end of AG · 228d62dd
      Committed by Dave Chinner
      On a 32 bit highmem PowerPC machine, the XFS inode cache was growing
      without bound and exhausting low memory causing the OOM killer to be
      triggered. After some effort, the problem was reproduced on a 32 bit
      x86 highmem machine.
      
      The problem is that the per-ag inode reclaim index cursor was not
      getting reset to the start of the AG if the radix tree tag lookup
      found no more reclaimable inodes. Hence every further reclaim
      attempt started at the same index beyond where any reclaimable
      inodes lay, and no further background reclaim ever occurred from the
      AG.
      
      Without background inode reclaim the VM driven cache shrinker
      simply cannot keep up with cache growth, and OOM is the result.
      
      While the change that exposed the problem was the conversion of the
      inode reclaim to use work queues for background reclaim, it was not
      the cause of the bug. The bug was introduced when the cursor code
      was added, just waiting for some weird configuration to strike....
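      
      A hedged sketch of the cursor handling in the per-AG reclaim walk
      (field and variable names approximate this era's code; the fix is
      the reset-to-zero path when the tag lookup finds nothing):
      
      	if (!nr_found) {
      		done = 1;	/* no reclaimable inodes left in this AG */
      		break;
      	}
      	...
      	/* at the end of the AG walk, reset the cursor for the next pass */
      	if (done)
      		pag->pag_ici_reclaim_cursor = 0;
      	else
      		pag->pag_ici_reclaim_cursor = first_index;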
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Tested-by: Christian Kujau <lists@nerdbynature.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      
      (cherry picked from commit b2232219)
      228d62dd
    • xfs: ensure reclaim cursor is reset correctly at end of AG · b2232219
      Committed by Dave Chinner
      On a 32 bit highmem PowerPC machine, the XFS inode cache was growing
      without bound and exhausting low memory causing the OOM killer to be
      triggered. After some effort, the problem was reproduced on a 32 bit
      x86 highmem machine.
      
      The problem is that the per-ag inode reclaim index cursor was not
      getting reset to the start of the AG if the radix tree tag lookup
      found no more reclaimable inodes. Hence every further reclaim
      attempt started at the same index beyond where any reclaimable
      inodes lay, and no further background reclaim ever occurred from the
      AG.
      
      Without background inode reclaim the VM driven cache shrinker
      simply cannot keep up with cache growth, and OOM is the result.
      
      While the change that exposed the problem was the conversion of the
      inode reclaim to use work queues for background reclaim, it was not
      the cause of the bug. The bug was introduced when the cursor code
      was added, just waiting for some weird configuration to strike....
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Tested-by: Christian Kujau <lists@nerdbynature.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      b2232219
  9. 08 Apr 2011, 4 commits
    • xfs: push the AIL from memory reclaim and periodic sync · fd074841
      Committed by Dave Chinner
      When we are short on memory, we want to expedite the cleaning of
      dirty objects.  Hence when we run short on memory, we need to kick
      the AIL flushing into action to clean as many dirty objects as
      quickly as possible.  To implement this, sample the lsn of the log
      item at the head of the AIL and use that as the push target for the
      AIL flush.
      
      Further, we keep dirty items in the AIL that are not tracked any
      other way, so we can get objects sitting in the AIL that don't get
      written back until the AIL is pushed. Hence to get the filesystem to
      the idle state, we might need to push the AIL to flush out any
      remaining dirty objects sitting in the AIL. This requires the same
      push mechanism as the reclaim push.
      
      This patch also renames xfs_trans_ail_tail() to xfs_ail_min_lsn() to
      match the new xfs_ail_max_lsn() function introduced in this patch.
      Similarly for xfs_trans_ail_push -> xfs_ail_push.
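      
      A short sketch of the mechanism, using the function names this
      commit introduces (the wrapper shown is illustrative of the
      push-everything helper):
      
      	/* push the AIL to the lsn of its current head item */
      	void
      	xfs_ail_push_all(
      		struct xfs_ail	*ailp)
      	{
      		xfs_lsn_t	threshold_lsn = xfs_ail_max_lsn(ailp);
      
      		if (threshold_lsn)
      			xfs_ail_push(ailp, threshold_lsn);
      	}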
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      fd074841
    • xfs: introduce background inode reclaim work · a7b339f1
      Committed by Dave Chinner
      Background inode reclaim needs to run more frequently than the XFS
      syncd work is run, as 30s is too long between optimal reclaim runs.
      Add a new periodic work item to the xfs syncd workqueue to run a
      fast, non-blocking inode reclaim scan.
      
      Background inode reclaim is kicked by the act of marking inodes for
      reclaim.  When an AG is first marked as having reclaimable inodes,
      the background reclaim work is kicked. It will continue to run
      periodically until it detects that there are no more reclaimable
      inodes. It will be kicked again when the first inode is queued for
      reclaim.
      
      To ensure shrinker-based inode reclaim throttles to the inode
      cleaning and reclaim rate but still reclaims inodes efficiently,
      make it kick the background inode reclaim so that when we are low on
      memory we are trying to reclaim inodes as efficiently as possible.
      This kick should not be necessary, but it will protect against
      failures to kick the background reclaim when inodes are first
      dirtied.
      
      To provide the rate throttling, make the shrinker pass do
      synchronous inode reclaim so that it blocks on inodes under IO. This
      means that the shrinker will reclaim inodes rather than just
      skipping over them, but it does not adversely affect the rate of
      reclaim because most dirty inodes are already under IO due to the
      background reclaim work the shrinker kicked.
      
      These two modifications solve one of the two OOM killer invocations
      Chris Mason reported recently when running a stress testing script.
      The particular workload trigger for the OOM killer invocation is
      where there are more threads than CPUs all unlinking files in an
      extremely memory constrained environment. Unlike other solutions,
      this one does not impact performance when memory is not constrained
      or when the number of concurrent threads operating is less than or
      equal to the number of CPUs.
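      
      A hedged sketch of the kick: the m_reclaim_work item and the
      interval derivation are assumptions approximating this commit's
      code, but the queue-only-if-tagged shape is the point:
      
      	static void
      	xfs_syncd_queue_reclaim(
      		struct xfs_mount	*mp)
      	{
      		/* only queue the work if there are reclaimable inodes */
      		rcu_read_lock();
      		if (radix_tree_tagged(&mp->m_perag_tree, XFS_ICI_RECLAIM_TAG)) {
      			queue_delayed_work(xfs_syncd_wq, &mp->m_reclaim_work,
      				msecs_to_jiffies(xfs_syncd_centisecs / 6 * 10));
      		}
      		rcu_read_unlock();
      	}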
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      a7b339f1
    • xfs: convert ENOSPC inode flushing to use new syncd workqueue · 89e4cb55
      Committed by Dave Chinner
      One of the problems with the current inode flush at ENOSPC is that
      we queue a flush per ENOSPC event, regardless of how many are
      already queued. This can result in hundreds of queued flushes, most
      of which simply burn CPU scanning and do no real work. This simply
      slows down allocation at ENOSPC.
      
      We really only need one active flush at a time, and we can easily
      implement that via the new xfs_syncd_wq. All we need to do is queue
      a flush if one is not already active, then block waiting for the
      currently active flush to complete. The result is that we only ever
      have a single ENOSPC inode flush active at a time and this greatly
      reduces the overhead of ENOSPC processing.
      
      On my 2p test machine, this results in tests exercising ENOSPC
      conditions running significantly faster - 042 halves execution time,
      083 drops from 60s to 5s, etc - while not introducing test
      regressions.
      
      This allows us to remove the old xfssyncd threads and infrastructure
      as they are no longer used.
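      
      A minimal sketch of the single-active-flush pattern on the new
      workqueue, assuming an m_flush_work item on the mount: queue_work()
      is a no-op if the work is already queued, and the caller then blocks
      until the active flush completes:
      
      	void
      	xfs_flush_inodes(
      		struct xfs_inode	*ip)
      	{
      		struct xfs_mount	*mp = ip->i_mount;
      
      		queue_work(xfs_syncd_wq, &mp->m_flush_work);
      		flush_work_sync(&mp->m_flush_work);
      	}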
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      89e4cb55
    • xfs: introduce a xfssyncd workqueue · c6d09b66
      Committed by Dave Chinner
      All of the work xfssyncd does is background functionality. There is
      no need for a thread per filesystem to do this work - it can all be
      managed by a global workqueue now that workqueues manage concurrency
      effectively.
      
      Introduce a new global xfssyncd workqueue, and convert the periodic
      work to use this new functionality. To do this, use a delayed work
      construct to schedule the next running of the periodic sync work
      for the filesystem. When the sync work is complete, queue a new
      delayed work for the next running of the sync work.
      
      For laptop mode, we wait on completion of the sync work, so ensure
      that the sync work queuing interface can flush and wait for work to
      complete.  This enables the workqueue infrastructure to replace the
      sequence number and wakeup mechanism currently used.
      
      Because the sync work does non-trivial amounts of work, mark the
      new work queue as CPU intensive.
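      
      A sketch of the re-arming pattern, shown as fragments: the
      CPU-intensive workqueue allocation follows this commit, while the
      periodic work body is elided:
      
      	xfs_syncd_wq = alloc_workqueue("xfssyncd", WQ_CPU_INTENSIVE, 8);
      
      	static void
      	xfs_syncd_queue_sync(
      		struct xfs_mount	*mp)
      	{
      		queue_delayed_work(xfs_syncd_wq, &mp->m_sync_work,
      			msecs_to_jiffies(xfs_syncd_centisecs * 10));
      	}
      
      	static void
      	xfs_sync_worker(
      		struct work_struct	*work)
      	{
      		struct xfs_mount *mp = container_of(to_delayed_work(work),
      						struct xfs_mount, m_sync_work);
      
      		/* ... periodic sync and log work elided ... */
      		xfs_syncd_queue_sync(mp);	/* schedule the next run */
      	}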
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      c6d09b66
  10. 31 Mar 2011, 1 commit
  11. 26 Mar 2011, 1 commit
    • xfs: introduce inode cluster buffer trylocks for xfs_iflush · 1bfd8d04
      Committed by Dave Chinner
      There is an ABBA deadlock between synchronous inode flushing in
      xfs_reclaim_inode and xfs_ifree_cluster. xfs_ifree_cluster locks the
      buffer, then takes inode ilocks, whilst synchronous reclaim takes
      the ilock followed by the buffer lock in xfs_iflush().
      
      To avoid this deadlock, separate the inode cluster buffer locking
      semantics from the synchronous inode flush semantics, allowing
      callers to attempt to lock the buffer but still issue synchronous IO
      if it can get the buffer. This requires xfs_iflush() calls that
      currently use non-blocking semantics to pass SYNC_TRYLOCK rather
      than 0 as the flags parameter.
      
      This allows xfs_reclaim_inode to avoid the deadlock on the buffer
      lock and detect the failure so that it can drop the inode ilock and
      restart the reclaim attempt on the inode. This allows
      xfs_ifree_cluster to obtain the inode lock, mark the inode stale and
      release it, and hence defuse the deadlock situation. It also has the
      pleasant side effect of avoiding IO in xfs_reclaim_inode when it
      next tries to reclaim the inode, as it is now marked stale.
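      
      A hedged sketch of the reclaim-side handling (error codes are this
      era's positive XFS errnos; the backoff and restart label are
      illustrative):
      
      	error = xfs_iflush(ip, SYNC_TRYLOCK | sync_mode);
      	if (error == EAGAIN) {
      		/*
      		 * Couldn't get the cluster buffer: drop the ilock so
      		 * xfs_ifree_cluster can make progress, back off, and
      		 * restart reclaim of this inode.
      		 */
      		xfs_iunlock(ip, XFS_ILOCK_EXCL);
      		delay(2);
      		goto restart;
      	}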
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      1bfd8d04
  12. 07 Mar 2011, 1 commit
  13. 12 Jan 2011, 1 commit
    • xfs: ensure log covering transactions are synchronous · c58efdb4
      Committed by Dave Chinner
      To ensure the log is covered and the filesystem idles correctly, we
      need to ensure that dummy transactions hit the disk and do not stay
      pinned in memory.  If the superblock is pinned in memory, it can't
      be flushed so the log covering cannot make progress. The result is
      dependent on timing - more often than not we continue to issue a
      log covering transaction every 36s rather than idling after ~90s.
      
      Fix this by making the log covering transaction synchronous. To
      avoid additional log force from xfssyncd, make the log covering
      transaction take the place of the existing log force in the xfssyncd
      background sync process.
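      
      The essence of the change is one call in the dummy transaction path;
      a sketch, assuming the dummy transaction helper of this era:
      
      	/* make the log covering transaction synchronous */
      	xfs_trans_set_sync(tp);
      	error = xfs_trans_commit(tp, 0);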
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      c58efdb4
  14. 16 Dec 2010, 1 commit
  15. 17 Dec 2010, 1 commit
    • xfs: convert inode cache lookups to use RCU locking · 1a3e8f3d
      Committed by Dave Chinner
      With delayed logging greatly increasing the sustained parallelism of
      inode operations, the inode cache locking is showing significant
      read vs write contention when inode reclaim runs at the same time as
      lookups. There are also a lot more write lock acquisitions than read
      locks (a 4:1 ratio), so the read locking is not really buying us
      much in the way of parallelism.
      
      To avoid the read vs write contention, change the cache to use RCU
      locking on the read side. To avoid needing to RCU free every single
      inode, use the built-in slab RCU freeing mechanism. This requires us
      to be able to detect lookups of freed inodes, so ensure that every
      freed inode has an inode number of zero and the XFS_IRECLAIM flag
      set. We already check the XFS_IRECLAIM flag in the cache hit lookup
      path, but also add a check for a zero inode number as well.
      
      We can then convert all the read locking lookups to use RCU read
      side locking and hence remove all read side locking.
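      
      A hedged sketch of the read-side lookup (reference counting under
      ip->i_flags_lock is elided for brevity):
      
      	rcu_read_lock();
      	ip = radix_tree_lookup(&pag->pag_ici_root, agino);
      	if (ip && (ip->i_ino != ino ||
      		   __xfs_iflags_test(ip, XFS_IRECLAIM))) {
      		/*
      		 * Inode was freed via slab RCU (i_ino zeroed) or is
      		 * being reclaimed: treat the lookup as a cache miss.
      		 */
      		ip = NULL;
      	}
      	rcu_read_unlock();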
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      1a3e8f3d
  16. 11 Nov 2010, 1 commit
  17. 19 Oct 2010, 6 commits
    • xfs: serialise inode reclaim within an AG · 69b491c2
      Committed by Dave Chinner
      Memory reclaim via shrinkers has a terrible habit of having N+M
      concurrent shrinker executions (N = num CPUs, M = num kswapds) all
      trying to shrink the same cache. When the cache they are all working
      on is protected by a single spinlock, massive contention and
      slowdowns occur.
      
      Wrap the per-ag inode caches with a reclaim mutex to serialise
      reclaim access to the AG. This will block concurrent reclaim in each
      AG but still allow reclaim to scan multiple AGs concurrently. Allow
      shrinkers to move on to the next AG if they can't get the lock, and
      if we can't get any AG, then start blocking on locks.
      
      To prevent reclaimers from continually scanning the same inodes in
      each AG, add a cursor that tracks where the last reclaim got up to
      and start from that point on the next reclaim. This should avoid
      only ever scanning a small number of inodes at the start of each AG
      and not making progress. If we have a non-shrinker based reclaim
      pass, ignore the cursor and reset it to zero once we are done.
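      
      A sketch of the per-AG serialisation in the reclaim walk (field
      names approximate this era's code):
      
      	if (trylock) {
      		/* shrinkers skip busy AGs rather than blocking */
      		if (!mutex_trylock(&pag->pag_ici_reclaim_lock)) {
      			skipped++;
      			continue;
      		}
      		first_index = pag->pag_ici_reclaim_cursor;
      	} else {
      		mutex_lock(&pag->pag_ici_reclaim_lock);
      	}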
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      69b491c2
    • xfs: batch inode reclaim lookup · e3a20c0b
      Committed by Dave Chinner
      Batch and optimise the per-ag inode lookup for reclaim to minimise
      scanning overhead. This involves gang lookups on the radix trees to
      get multiple inodes during each tree walk, and tighter validation of
      what inodes can be reclaimed without blocking before we take any
      locks.
      
      This is based on ideas suggested in a proof-of-concept patch
      posted by Nick Piggin.
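      
      A sketch of the gang lookup at the heart of the batching (the batch
      size constant and array follow this commit's scheme):
      
      	struct xfs_inode	*batch[XFS_LOOKUP_BATCH];
      	int			nr_found;
      
      	/* grab up to a batch of reclaim-tagged inodes per tree walk */
      	nr_found = radix_tree_gang_lookup_tag(&pag->pag_ici_root,
      			(void **)batch, first_index,
      			XFS_LOOKUP_BATCH, XFS_ICI_RECLAIM_TAG);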
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      e3a20c0b
    • xfs: implement batched inode lookups for AG walking · 78ae5256
      Committed by Dave Chinner
      With the reclaim code separated from the generic walking code, it is
      simple to implement batched lookups for the generic walk code.
      Separate out the inode validation from the execute operations and
      modify the tree lookups to get a batch of inodes at a time.
      
      Reclaim operations will be optimised separately.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      78ae5256
    • xfs: split out inode walk inode grabbing · e13de955
      Committed by Dave Chinner
      When doing read side inode cache walks, the code to validate and
      grab an inode is common to all callers. Split it out of the execute
      callbacks in preparation for batching lookups. Similarly, split out
      the inode reference dropping from the execute callbacks into the
      main lookup loop to be symmetric with the grab.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      e13de955
    • xfs: split inode AG walking into separate code for reclaim · 65d0f205
      Committed by Dave Chinner
      The reclaim walk requires different locking and has a slightly
      different walk algorithm, so separate it out so that it can be
      optimised separately.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      65d0f205
    • xfs: lockless per-ag lookups · e176579e
      Committed by Dave Chinner
      When we start taking a reference to the per-ag for every cached
      buffer in the system, kernel lockstat profiling on an 8-way create
      workload shows the mp->m_perag_lock has higher acquisition rates
      than the inode lock and has significantly more contention. That is,
      it becomes the highest contended lock in the system.
      
      The perag lookup is trivial to convert to lock-less RCU lookups
      because perag structures never go away. Hence the only thing we need
      to protect against is tree structure changes during a grow. This can
      be done simply by replacing the locking in xfs_perag_get() with RCU
      read locking. This removes the mp->m_perag_lock completely from this
      path.
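      
      A sketch of the lock-less lookup this enables, close to the shape of
      xfs_perag_get() after the change (tracing and assertions elided):
      
      	struct xfs_perag *
      	xfs_perag_get(
      		struct xfs_mount	*mp,
      		xfs_agnumber_t		agno)
      	{
      		struct xfs_perag	*pag;
      
      		rcu_read_lock();	/* replaces mp->m_perag_lock */
      		pag = radix_tree_lookup(&mp->m_perag_tree, agno);
      		if (pag)
      			atomic_inc(&pag->pag_ref);
      		rcu_read_unlock();
      		return pag;
      	}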
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      e176579e
  18. 07 Oct 2010, 1 commit
    • xfs: properly account for reclaimed inodes · 081003ff
      Committed by Johannes Weiner
      When marking an inode reclaimable, a per-AG counter is increased, the
      inode is tagged reclaimable in its per-AG tree, and, when this is the
      first reclaimable inode in the AG, the AG entry in the per-mount tree
      is also tagged.
      
      When an inode is finally reclaimed, however, it is only deleted from
      the per-AG tree.  Neither the counter is decreased, nor is the parent
      tree's AG entry untagged properly.
      
      Since the tags in the per-mount tree are not cleared, the inode
      shrinker iterates over all AGs that have had reclaimable inodes at one
      point in time.
      
      The counters on the other hand signal an increasing amount of slab
      objects to reclaim.  Since "70e60ce7 xfs: convert inode shrinker to
      per-filesystem contexts" this is not a real issue anymore because
      the shrinker bails out after one iteration.
      
      But the problem was observable on a machine running v2.6.34, where the
      reclaimable work increased and each process going into direct reclaim
      eventually got stuck on the xfs inode shrinking path, trying to scan
      several million objects.
      
      Fix this by properly unwinding the reclaimable-state tracking of an
      inode when it is reclaimed.
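      
      A hedged sketch of the unwind at reclaim time; the clear-reclaim
      helper name and the lock type are assumptions approximating this
      era's code:
      
      	write_lock(&pag->pag_ici_lock);
      	if (!radix_tree_delete(&pag->pag_ici_root, agino))
      		ASSERT(0);
      	/* drop pag_ici_reclaimable and untag the AG when it hits zero */
      	__xfs_inode_clear_reclaim(pag, ip);
      	write_unlock(&pag->pag_ici_lock);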
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: stable@kernel.org
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      081003ff
  19. 24 Aug 2010, 1 commit
    • xfs: dummy transactions should not dirty VFS state · 1a387d3b
      Committed by Dave Chinner
      When we need to cover the log, we issue dummy transactions to ensure
      the current log tail is on disk. Unfortunately we currently use the
      root inode in the dummy transaction, and the act of committing the
      transaction dirties the inode at the VFS level.
      
      As a result, the VFS writeback of the dirty inode will prevent the
      filesystem from idling long enough for the log covering state
      machine to complete. The state machine gets stuck in a loop issuing
      new dummy transactions to cover the log and never makes progress.
      
      To avoid this problem, the dummy transactions should not cause
      externally visible state changes. To ensure this occurs, make sure
      that dummy transactions log an unchanging field in the superblock,
      as its state is never propagated outside the filesystem. This allows
      the log covering state machine to complete successfully and the
      filesystem now correctly enters a fully idle state about 90s after
      the last modification was made.
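      
      A hedged sketch of the dummy transaction body; the choice of the
      superblock UUID as the unchanging field is my assumption about this
      change, so treat the field as illustrative:
      
      	/* log a superblock field whose value never changes on disk */
      	xfs_mod_sb(tp, XFS_SB_UUID);
      	return xfs_trans_commit(tp, 0);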
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      1a387d3b
  20. 27 Jul 2010, 5 commits
  21. 20 Jul 2010, 2 commits
    • xfs: track AGs with reclaimable inodes in per-ag radix tree · 16fd5367
      Committed by Dave Chinner
      https://bugzilla.kernel.org/show_bug.cgi?id=16348
      
      When the filesystem grows to a large number of allocation groups,
      the summing of reclaimable inodes gets expensive. In many cases,
      most AGs won't have any reclaimable inodes and so we are wasting CPU
      time aggregating over these AGs. This is particularly important for
      the inode shrinker that gets called frequently under memory
      pressure.
      
      To avoid the overhead, track AGs with reclaimable inodes in the
      per-ag radix tree so that we can find all the AGs with reclaimable
      inodes via a simple gang tag lookup. This involves setting the tag
      when the first reclaimable inode is tracked in the AG, and removing
      the tag when the last reclaimable inode is removed from the tree.
      Then the summation process becomes a loop walking the radix tree
      summing AGs with the reclaim tag set.
      
      This significantly reduces the overhead of scanning - a 6400 AG
      filesystem now only uses about 25% of a CPU in kswapd while slab
      reclaim progresses, instead of being permanently stuck at 100% CPU
      and making little progress. Clean filesystems will see no overhead,
      and the overhead only increases linearly with the number of dirty
      AGs.
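      
      A sketch of the resulting summation loop, assuming a gang-tag lookup
      helper in the shape of xfs_perag_get_tag():
      
      	int
      	xfs_reclaim_inodes_count(
      		struct xfs_mount	*mp)
      	{
      		struct xfs_perag	*pag;
      		xfs_agnumber_t		ag = 0;
      		int			reclaimable = 0;
      
      		/* only visit AGs tagged as having reclaimable inodes */
      		while ((pag = xfs_perag_get_tag(mp, ag, XFS_ICI_RECLAIM_TAG))) {
      			ag = pag->pag_agno + 1;
      			reclaimable += pag->pag_ici_reclaimable;
      			xfs_perag_put(pag);
      		}
      		return reclaimable;
      	}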
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      16fd5367
    • xfs: convert inode shrinker to per-filesystem contexts · 70e60ce7
      Committed by Dave Chinner
      Now that the shrinker passes us a context, wire up a shrinker context
      per filesystem. This allows us to remove the global mount list and
      the locking problems it introduced. It also means that a shrinker call
      does not need to traverse clean filesystems before finding a
      filesystem with reclaimable inodes.  This significantly reduces
      scanning overhead when lots of filesystems are present.
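      
      A sketch of the per-filesystem wiring; the m_inode_shrink field name
      is an assumption, but embedding the shrinker in the mount is the
      point of the change:
      
      	/* at mount: register a shrinker embedded in the xfs_mount */
      	mp->m_inode_shrink.shrink = xfs_reclaim_inode_shrink;
      	mp->m_inode_shrink.seeks = DEFAULT_SEEKS;
      	register_shrinker(&mp->m_inode_shrink);
      
      	/* at unmount */
      	unregister_shrinker(&mp->m_inode_shrink);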
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      70e60ce7
  22. 19 Jul 2010, 1 commit
    • mm: add context argument to shrinker callback · 7f8275d0
      Committed by Dave Chinner
      The current shrinker implementation requires the registered callback
      to have global state to work from. This makes it difficult to shrink
      caches that are not global (e.g. per-filesystem caches). Pass the shrinker
      structure to the callback so that users can embed the shrinker structure
      in the context the shrinker needs to operate on and get back to it in the
      callback via container_of().
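      
      A sketch of the callback shape after this change; the XFS callback
      name matches the per-filesystem conversion above, and the body is
      elided:
      
      	static int
      	xfs_reclaim_inode_shrink(
      		struct shrinker	*shrink,
      		int		nr_to_scan,
      		gfp_t		gfp_mask)
      	{
      		/* recover the per-filesystem context from the shrinker */
      		struct xfs_mount *mp = container_of(shrink,
      					struct xfs_mount, m_inode_shrink);
      
      		/* ... scan mp's per-AG inode caches ... */
      		return 0;
      	}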
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      7f8275d0
  23. 29 May 2010, 1 commit
    • xfs: fix access to upper inodes without inode64 · fb3b504a
      Committed by Christoph Hellwig
      If a filesystem is mounted without the inode64 mount option we
      should still be able to access inodes not fitting into 32 bits, just
      not create new ones.  For this to work we need to make sure the
      inode cache radix tree is initialized for all allocation groups, not
      just those we plan to allocate inodes from.  This patch makes sure
      we initialize the inode cache radix tree for all allocation groups,
      and also cleans xfs_initialize_perag up a bit to separate the
      inode32 logic from the general perag structure setup.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      fb3b504a
  24. 19 May 2010, 2 commits
    • xfs: enforce synchronous writes in xfs_bwrite · 8c38366f
      Committed by Christoph Hellwig
      xfs_bwrite is used with the intention of synchronously writing out
      buffers, but currently it does not actually clear the async flag if
      it is left over from previous writes; instead it implements async
      behaviour if it finds the flag set.  Remove the code handling
      asynchronous writes, as we've got rid of those entirely outside of
      the log and delwri buffers, and make sure that we clear the async
      and read flags before writing the buffer.
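      
      A hedged sketch of the flag handling; the buffer flag and submission
      helper names approximate this era's buffer layer:
      
      	/* force synchronous write semantics on this buffer */
      	bp->b_flags |= XBF_WRITE;
      	bp->b_flags &= ~(XBF_ASYNC | XBF_READ);
      
      	xfs_bdstrat_cb(bp);
      	error = xfs_buf_iowait(bp);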
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      8c38366f
    • xfs: remove periodic superblock writeback · df308bcf
      Committed by Christoph Hellwig
      All modifications to the superblock are done transactionally through
      xfs_trans_log_buf, so there is no reason to initiate periodic
      asynchronous writeback.  Doing so only removes the superblock from
      the delwri list and will lead to sub-optimal I/O scheduling.
      
      Cut down xfs_sync_fsdata now that it's only used for synchronous
      superblock writes and move the log coverage checks into the two
      callers.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      df308bcf