1. 25 May, 2011 7 commits
  2. 20 May, 2011 6 commits
    • D
      xfs: obey minleft values during extent allocation correctly · bf59170a
      Dave Chinner committed
      When allocating an extent that is long enough to consume the
      remaining free space in an AG, we need to ensure that the allocation
      leaves enough space in the AG for any subsequent bmap btree blocks
      that are needed to track the new extent. These have to be allocated
      in the same AG as we only reserve enough blocks in an allocation
      transaction for modification of the freespace trees in a single AG.
      
      xfs_alloc_fix_minleft() has been considering blocks on the AGFL as
      free blocks available for extent and bmbt block allocation, which is
      not correct - blocks on the AGFL are there exclusively for the use
      of the free space btrees. As a result, when minleft is less than the
      number of blocks on the AGFL, xfs_alloc_fix_minleft() does not trim
      the given extent to leave minleft blocks available for bmbt
      allocation, and hence we can fail allocation during bmbt record
      insertion.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      bf59170a
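
The minleft rule described in bf59170a can be illustrated with a minimal userspace sketch. This is not the kernel's xfs_alloc_fix_minleft(): the `ag_model` struct, its fields, and `trim_for_minleft()` are hypothetical names made up for the example. The only point it tries to show is that blocks on the AGFL are left out of the space counted as available when trimming the extent.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical, simplified model of an allocation group (AG).
 * These fields are illustrative and are not the kernel's structures.
 */
struct ag_model {
	uint32_t freeblks;   /* free blocks tracked by the free space btrees */
	uint32_t agfl_count; /* AGFL blocks: reserved for the free space btrees,
	                        so never added to the space counted below */
};

/*
 * Trim a candidate extent of 'len' blocks so that at least 'minleft'
 * blocks remain free in the AG afterwards. Returns the usable length,
 * or 0 if 'minlen' blocks cannot be allocated while honouring minleft.
 */
static uint32_t trim_for_minleft(const struct ag_model *ag, uint32_t len,
				 uint32_t minlen, uint32_t minleft)
{
	uint32_t avail;

	assert(minlen <= len);

	/* Only freeblks counts as available; agfl_count is ignored on purpose. */
	if (ag->freeblks < minleft)
		return 0;
	avail = ag->freeblks - minleft;

	if (len > avail)
		len = avail;
	return len >= minlen ? len : 0;
}
```

With 100 free blocks, 6 AGFL blocks, and minleft of 4, a request for all 100 blocks is trimmed to 96. Counting the AGFL blocks as free would have left the extent untrimmed, which is the failure mode the commit message describes.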
    • D
      xfs: reset buffer pointers before freeing them · 44396476
      Dave Chinner committed
      When we free a vmapped buffer, we need to ensure the vmap address
      and length we free is the same as when it was allocated. In various
      places in the log code we change the memory the buffer is pointing
      to before issuing IO, but we never reset the buffer to point back to
      its original memory (or no memory, if that is the case for the
      buffer).
      
      As a result, when we free the buffer it points to memory that is
      owned by something else and attempts to unmap and free it. Because
      the range does not match any known mapped range, it can trigger
      BUG_ON() traps in the vmap code, and potentially corrupt the vmap
      area tracking.
      
      Fix this by always resetting these buffers to their original state
      before freeing them.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      44396476
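
The pattern behind 44396476 (restore a buffer to its original memory before freeing it) can be sketched in plain userspace C. This is only an analogy under stated assumptions: `io_buf` and its helper functions are hypothetical, and `malloc()`/`free()` stand in for the vmap allocation and unmapping done by the real log code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer that can be temporarily pointed at other memory. */
struct io_buf {
	void   *addr;      /* memory the buffer currently points at */
	size_t  len;
	void   *orig_addr; /* memory the buffer owns (NULL if it owns none) */
	size_t  orig_len;
};

static int io_buf_alloc(struct io_buf *bp, size_t len)
{
	bp->orig_addr = len ? malloc(len) : NULL;
	if (len && !bp->orig_addr)
		return -1;
	bp->orig_len = len;
	bp->addr = bp->orig_addr;
	bp->len = bp->orig_len;
	return 0;
}

/* Point the buffer at someone else's memory before issuing IO. */
static void io_buf_retarget(struct io_buf *bp, void *mem, size_t len)
{
	bp->addr = mem;
	bp->len = len;
}

/* Put the buffer back into the state it had when it was allocated. */
static void io_buf_reset(struct io_buf *bp)
{
	bp->addr = bp->orig_addr;
	bp->len = bp->orig_len;
}

static void io_buf_free(struct io_buf *bp)
{
	/* Always reset first, so we never free memory owned by something else. */
	io_buf_reset(bp);
	assert(bp->addr == bp->orig_addr && bp->len == bp->orig_len);
	free(bp->addr);
	bp->addr = bp->orig_addr = NULL;
	bp->len = bp->orig_len = 0;
}
```

Doing the reset inside the free path means a caller that forgets to restore the buffer after IO still frees the address and length that were originally allocated.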
    • D
      xfs: avoid getting stuck during async inode flushes · ee58abdf
      Dave Chinner committed
      When the underlying inode buffer is locked and xfs_sync_inode_attr()
      is doing a non-blocking flush, xfs_iflush() can return EAGAIN.  When
      this happens, clear the error rather than returning it to
      xfs_inode_ag_walk(), as returning EAGAIN will result in the AG walk
      delaying for a short while and trying again. This can result in
      background walks getting stuck on the one AG until the inode buffer
      is unlocked by some other means.
      
      This behaviour was noticed when analysing event traces followed by
      code inspection and verification of the fix via further traces.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      ee58abdf
    • D
      xfs: fix xfs_itruncate_start tracing · e5737515
      Dave Chinner committed
      Variables are ordered incorrectly in trace call.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      e5737515
    • D
      xfs: fix duplicate workqueue initialisation · 1beb65ad
      Dave Chinner committed
      The workqueue initialisation function is called twice when
      initialising the XFS subsystem. Remove the second initialisation
      call.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      1beb65ad
    • J
      xfs: kill off xfs_printk() · e69522a8
      Joe Perches committed
      xfs_alert_tag() can be defined using xfs_alert(), and thereby avoid
      using xfs_printk() altogether.  This is the only remaining use of
      xfs_printk(), so changing it this way means xfs_printk() can simply
      be eliminated.
      
      Also add format checking to the non-debug inline function xfs_debug.
      Miscellaneous function prototype argument alignment.
      
      (Updated to delete the definition of xfs_printk(), which is
      no longer used or needed.)
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      e69522a8
  3. 10 May, 2011 5 commits
    • D
      xfs: fix race condition in AIL push trigger · e4d3c4a4
      Dave Chinner committed
      The recent conversion of the xfsaild functionality to a work queue
      introduced a hard-to-hit log space grant hang. One cause is a race
      condition in determining whether there is a push in progress or
      not.
      
      The XFS_AIL_PUSHING_BIT is used to determine whether a push is
      currently in progress.  When the AIL push work completes, it checked
      whether the target changed and cleared the PUSHING bit to allow a
      new push to be requeued. The race condition is as follows:
      
      	Thread 1		push work
      
      	smp_wmb()
      				smp_rmb()
      				check ailp->xa_target unchanged
      	update ailp->xa_target
      	test/set PUSHING bit
      	does not queue
      				clear PUSHING bit
      				does not requeue
      
      Now that the push target is updated, new attempts to push the AIL
      will not trigger as the push target will be the same, and hence
      despite trying to push the AIL we won't ever wake it again.
      
      The fix is to ensure that the AIL push work clears the PUSHING bit
      before it checks if the target is unchanged.
      
      As a result, both push triggers operate on the same test/set bit
      criteria, so even if we race in the push work and miss the target
      update, the thread requesting the push will still set the PUSHING
      bit and queue the push work to occur. For safety's sake, the same
      queue check is done if the push work detects the target change,
      though only one of the two will queue new work due to the use
      of test_and_set_bit() checks.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      e4d3c4a4
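
The ordering fix described in e4d3c4a4 (clear the PUSHING bit first, then check whether the target changed) can be modelled with C11 atomics. This is a rough sketch, not the kernel code: `ail_model`, `ail_request_push()`, and `ail_push_work_finish()` are hypothetical names, `atomic_exchange()` stands in for test_and_set_bit(), and the `queued` counter stands in for queueing work.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the AIL push state. */
struct ail_model {
	_Atomic uint64_t target;  /* push target */
	atomic_bool      pushing; /* models XFS_AIL_PUSHING_BIT */
	unsigned int     queued;  /* times push work was queued (demo only) */
};

static void ail_init(struct ail_model *ail)
{
	atomic_init(&ail->target, 0);
	atomic_init(&ail->pushing, false);
	ail->queued = 0;
}

/* A thread asks for the AIL to be pushed to new_target. */
static void ail_request_push(struct ail_model *ail, uint64_t new_target)
{
	if (new_target <= atomic_load(&ail->target))
		return;
	atomic_store(&ail->target, new_target);

	/* test-and-set: only the caller that flips the bit queues work */
	if (!atomic_exchange(&ail->pushing, true))
		ail->queued++;
}

/* The push work finishes after having pushed up to pushed_to. */
static void ail_push_work_finish(struct ail_model *ail, uint64_t pushed_to)
{
	/* Clear PUSHING *before* looking at the target. */
	bool was_pushing = atomic_exchange(&ail->pushing, false);

	assert(was_pushing);
	(void)was_pushing;

	/*
	 * If the target moved, requeue using the same test-and-set rule
	 * as the requester, so exactly one side queues the new work.
	 */
	if (atomic_load(&ail->target) != pushed_to &&
	    !atomic_exchange(&ail->pushing, true))
		ail->queued++;
}
```

Because the bit is already clear when the work checks the target, a requester that updates the target at any point either sees the bit clear and queues the work itself, or has its update noticed by the finishing work.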
    • D
      xfs: make AIL target updates and compares 32bit safe. · fd5670f2
      Dave Chinner committed
      The recent conversion of the xfsaild functionality to a work queue
      introduced a hard-to-hit log space grant hang. One of the problems
      noticed was that updates of the push target are not 32 bit safe as
      the target is a 64 bit value.
      
      We cannot copy a 64 bit LSN without the possibility of corrupting
      the result when racing with another updating thread. We have a
      function to do this update safely without needing to care about
      32/64 bit issues - xfs_trans_ail_copy_lsn() - so use that when
      updating the AIL push target.
      
      Also move the reading of the target in the push work inside the AIL
      lock, and use XFS_LSN_CMP() for the unlocked comparison during work
      termination to close read holes as well.
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      fd5670f2
    • D
      xfs: always push the AIL to the target · cb64026b
      Dave Chinner committed
      The recent conversion of the xfsaild functionality to a work queue
      introduced a hard-to-hit log space grant hang. One of the problems
      discovered is a target mismatch between the item pushing loop and
      the target itself.
      
      The push trigger checks for the target increasing (i.e. new target >
      current) while the push loop only pushes items that have a LSN <
      current. As a result, we can get the situation where the push target
      is X, the items at the tail of the AIL have LSN X and they don't get
      pushed. The push work then completes thinking it is done, and cannot
      be restarted until the push target increases to >= X + 1. If the
      push target then never increases (because the tail is not moving),
      then we never run the push work again and we stall.
      
      Fix it by making sure log items with a LSN that matches the target
      exactly are pushed during the loop.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      cb64026b
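
The off-by-one described in cb64026b comes down to whether the push loop compares with "less than" or "less than or equal". The sketch below assumes a sorted array of LSNs in place of the AIL and uses hypothetical helpers `lsn_cmp()` and `push_to_target()`; it is meant only to show the inclusive comparison, not the kernel's push loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compare two LSNs, treating the upper 32 bits as the cycle and the
 * lower 32 bits as the block. Returns <0, 0 or >0. (Illustrative
 * helper; the kernel uses XFS_LSN_CMP() for this.)
 */
static int lsn_cmp(uint64_t a, uint64_t b)
{
	uint32_t ca = (uint32_t)(a >> 32), cb = (uint32_t)(b >> 32);
	uint32_t ba = (uint32_t)a, bb = (uint32_t)b;

	if (ca != cb)
		return ca < cb ? -1 : 1;
	if (ba != bb)
		return ba < bb ? -1 : 1;
	return 0;
}

/*
 * Walk items from the tail (lowest LSN first) and count those that
 * would be pushed. Items whose LSN equals the target are included.
 */
static size_t push_to_target(const uint64_t *lsns, size_t n, uint64_t target)
{
	size_t pushed = 0;

	assert(lsns != NULL || n == 0);

	while (pushed < n && lsn_cmp(lsns[pushed], target) <= 0)
		pushed++;
	return pushed;
}
```

For items at LSNs 5, 7, 7, 9 and a target of 7, the inclusive comparison pushes three items. A strict "less than" comparison would push only the first and leave both items at LSN 7 on the list, which is the stall the commit message describes.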
    • D
      xfs: exit AIL push work correctly when AIL is empty · ea35a200
      Dave Chinner committed
      The recent conversion of the xfsaild functionality to a work queue
      introduced a hard-to-hit log space grant hang. The main cause is a
      regression where a work exit path fails to clear the PUSHING state
      and recheck the target correctly.
      
      Make both exit paths do the same PUSHING bit clearing and target
      checking when the "no more work to be done" condition is hit.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      ea35a200
    • D
      xfs: ensure reclaim cursor is reset correctly at end of AG · b2232219
      Dave Chinner committed
      On a 32 bit highmem PowerPC machine, the XFS inode cache was growing
      without bound and exhausting low memory causing the OOM killer to be
      triggered. After some effort, the problem was reproduced on a 32 bit
      x86 highmem machine.
      
      The problem is that the per-ag inode reclaim index cursor was not
      getting reset to the start of the AG if the radix tree tag lookup
      found no more reclaimable inodes. Hence every further reclaim
      attempt started at the same index beyond where any reclaimable
      inodes lay, and no further background reclaim ever occurred from the
      AG.
      
      Without background inode reclaim the VM driven cache shrinker
      simply cannot keep up with cache growth, and OOM is the result.
      
      While the change that exposed the problem was the conversion of the
      inode reclaim to use work queues for background reclaim, it was not
      the cause of the bug. The bug was introduced when the cursor code
      was added, just waiting for some weird configuration to strike....
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Tested-By: Christian Kujau <lists@nerdbynature.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      b2232219
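
The cursor bug described in b2232219 can be illustrated with a toy model. The sketch assumes a fixed-size array of "reclaimable" flags in place of the per-AG radix tree, and `ag_reclaim_model` and `reclaim_pass()` are hypothetical names. What it shows is the cursor being reset to the start of the AG once a scan runs off the end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define AG_INODES 8 /* arbitrary size for the toy model */

/* Hypothetical per-AG reclaim state. */
struct ag_reclaim_model {
	bool   reclaimable[AG_INODES]; /* stands in for the radix tree tag */
	size_t cursor;                 /* index where the next pass starts */
};

/*
 * Reclaim up to 'batch' inodes starting at the cursor and return how
 * many were reclaimed.
 */
static size_t reclaim_pass(struct ag_reclaim_model *ag, size_t batch)
{
	size_t done = 0;
	size_t i;

	assert(batch > 0);

	for (i = ag->cursor; i < AG_INODES && done < batch; i++) {
		if (ag->reclaimable[i]) {
			ag->reclaimable[i] = false;
			done++;
		}
	}

	/*
	 * If the scan reached the end of the AG, wrap the cursor back to
	 * the start. Without this, later passes would keep starting past
	 * every reclaimable inode and never reclaim anything.
	 */
	ag->cursor = (i >= AG_INODES) ? 0 : i;
	return done;
}
```

Starting with the cursor at index 5 and a single reclaimable inode at index 1, the first pass finds nothing and resets the cursor, and the second pass then reclaims the inode.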
  4. 29 Apr, 2011 7 commits
  5. 21 Apr, 2011 1 commit
  6. 19 Apr, 2011 1 commit
  7. 18 Apr, 2011 2 commits
  8. 16 Apr, 2011 8 commits
  9. 15 Apr, 2011 3 commits
    • L
      vfs: fix incorrect dentry_update_name_case() BUG_ON() test · 7ebfa57f
      Linus Torvalds committed
      The case we should be verifying when updating the dentry name is that
      the _parent_ inode (the directory) semaphore is held, not the semaphore
      for the dentry itself.  It's the directory locking that rename and
      readdir() etc all care about.
      
      The comment just above even says so - but then the BUG_ON() still
      checked the dentry inode itself.
      
      Very few people noticed, because this helper function really isn't used
      for very much, so you had to be using ncpfs to ever hit it.
      
      I think I should just remove the BUG_ON (the function really has just
      one user), but let's run with it fixed for a while before getting rid of
      it entirely.
      Reported-and-tested-by: Bongani Hlope <bonganih@bankservafrica.com>
      Reported-and-tested-by: Bernd Feige <bernd.feige@uniklinik-freiburg.de>
      Cc: Petr Vandrovec <petr@vandrovec.name>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7ebfa57f
    • B
      ramfs: fix memleak on no-mmu arch · b836aec5
      Bob Liu committed
      On no-mmu arch, there is a memleak during shmem test.  The cause of this
      memleak is that ramfs_nommu_expand_for_mapping() raises the page refcount
      to 2, which prevents iput() from freeing those pages.
      
      The simple test file is like this:
      
        #include <stdio.h>
        #include <sys/ipc.h>
        #include <sys/shm.h>

        int main(void)
        {
      	int i;
      	key_t k = ftok("/etc", 42);
      
      	for ( i=0; i<100; ++i) {
      		int id = shmget(k, 10000, 0644|IPC_CREAT);
      		if (id == -1) {
      			printf("shmget error\n");
      		}
      		if(shmctl(id, IPC_RMID, NULL ) == -1) {
      			printf("shm  rm error\n");
      			return -1;
      		}
      	}
      	printf("run ok...\n");
      	return 0;
        }
      
      And the result:
      
        root:/> free
                     total         used         free       shared      buffers
        Mem:         60320        17912        42408            0            0
        -/+ buffers:              17912        42408
        root:/> shmem
        run ok...
        root:/> free
                     total         used         free       shared      buffers
        Mem:         60320        19096        41224            0            0
        -/+ buffers:              19096        41224
        root:/> shmem
        run ok...
        root:/> free
                     total         used         free       shared      buffers
        Mem:         60320        20296        40024            0            0
        -/+ buffers:              20296        40024
        ...
      
      After this patch the test result is (no memleak anymore):
      
        root:/> free
                     total         used         free       shared      buffers
        Mem:         60320        16668        43652            0            0
        -/+ buffers:              16668        43652
        root:/> shmem
        run ok...
        root:/> free
                     total         used         free       shared      buffers
        Mem:         60320        16668        43652            0            0
        -/+ buffers:              16668        43652
      Signed-off-by: Bob Liu <lliubbo@gmail.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b836aec5
    • J
      fs/fhandle.c: add <linux/personality.h> for ia64 · ed5afeaf
      Jeff Mahoney committed
      force_o_largefile() on ia64 is defined in <asm/fcntl.h> and requires
      <linux/personality.h>.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ed5afeaf