提交 · c11554104f4dcb509fd43973389b097a04b9d51d · openeuler / raspberrypi-kernel

24 5月, 2010 1 次提交

xfs: Clean up XFS_BLI_* flag namespace · c1155410

由 Dave Chinner 提交于 5月 07, 2010

Clean up the buffer log format (XFS_BLI_*) flags because they have a
polluted namespace. They XFS_BLI_ prefix is used for both in-memory
and on-disk flag feilds, but have overlapping values for different
flags. Rename the buffer log format flags to use the XFS_BLF_*
prefix to avoid confusing them with the in-memory XFS_BLI_* prefixed
flags.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

c1155410

19 5月, 2010 20 次提交

C
xfs: clean up end index calculation in xfs_page_state_convert · bd1556a1
由 Christoph Hellwig 提交于 4月 28, 2010
```
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>
```
bd1556a1
C
xfs: clean up mapping size calculation in __xfs_get_blocks · 2b8f12b7
由 Christoph Hellwig 提交于 4月 28, 2010
```
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>
```
2b8f12b7

xfs: clean up xfs_iomap_valid · 558e6891

由 Christoph Hellwig 提交于 4月 28, 2010

Rename all iomap_valid identifiers to imap_valid to fit the new
world order, and clean up xfs_iomap_valid to convert the passed in
offset to blocks instead of the imap values to bytes.  Use the
simpler inode->i_blkbits instead of the XFS macros for this.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

558e6891

xfs: move I/O type flags into xfs_aops.c · 34a52c6c

由 Christoph Hellwig 提交于 4月 28, 2010

The IOMAP_ flags are now only used inside xfs_aops.c for extent
probing and I/O completion tracking, so more them here, and rename
them to IO_* as there's no mapping involved at all.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

34a52c6c

xfs: kill struct xfs_iomap · 207d0416

由 Christoph Hellwig 提交于 4月 28, 2010

Now that struct xfs_iomap contains exactly the same units as struct
xfs_bmbt_irec we can just use the latter directly in the aops code.
Replace the missing IOMAP_NEW flag with a new boolean output
parameter to xfs_iomap.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

207d0416

xfs: report iomap_bn in block base · e513182d

由 Christoph Hellwig 提交于 4月 28, 2010

Report the iomap_bn field of struct xfs_iomap in terms of filesystem
blocks instead of in terms of bytes.  Shift the byte conversions
into the caller, and replace the IOMAP_DELAY and IOMAP_HOLE flag
checks with checks for HOLESTARTBLOCK and DELAYSTARTBLOCK.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

e513182d

xfs: report iomap_offset and iomap_bsize in block base · 8699bb0a

由 Christoph Hellwig 提交于 4月 28, 2010

Report the iomap_offset and iomap_bsize fields of struct xfs_iomap
in terms of fsblocks instead of in terms of disk blocks.  Shift the
byte conversions into the callers temporarily, but they will
disappear or get cleaned up later.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

8699bb0a

xfs: remove iomap_delta · 9563b3d8

由 Christoph Hellwig 提交于 4月 28, 2010

The iomap_delta field in struct xfs_iomap just contains the
difference between the offset passed to xfs_iomap and the
iomap_offset.  Just calculate it in the only caller that cares.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

9563b3d8

xfs: remove iomap_target · 046f1685

由 Christoph Hellwig 提交于 4月 28, 2010

Instead of using the iomap_target field in struct xfs_iomap
and the IOMAP_REALTIME flag just use the already existing
xfs_find_bdev_for_inode helper.  There's some fallout as we
need to pass the inode in a few more places, which we also
use to sanitize some calling conventions.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

046f1685

xfs: Make fiemap work in query mode. · 2d1ff3c7

由 Tao Ma 提交于 4月 29, 2010

According to Documentation/filesystems/fiemap.txt, If fm_extent_count
is zero, then the fm_extents[] array is ignored (no extents will be
returned), and the fm_mapped_extents count will hold the number of
extents needed.

But as the commit 97db39a1 has changed
bmv_count to the caller's input buffer, this number query function can't
work any more. As this commit is written to change bmv_count from
MAXEXTNUM because of ENOMEM.

This patch just try to  set bm.bmv_count to something sane.
Thanks to Dave Chinner <david@fromorbit.com> for the suggestion.

Cc: Eric Sandeen <sandeen@redhat.com>
Cc: Alex Elder <aelder@sgi.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NTao Ma <tao.ma@oracle.com>

2d1ff3c7

xfs: wait for direct I/O to complete in fsync and write_inode · 37bc5743

由 Christoph Hellwig 提交于 4月 20, 2010

We need to wait for all pending direct I/O requests before taking care of
metadata in fsync and write_inode.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDave Chinner <david@fromorbit.com>

37bc5743

xfs: xfs_trace.c: duplicated include · fce1cad6

由 Andrea Gelmini 提交于 3月 25, 2010

fs/xfs/linux-2.6/xfs_trace.c: xfs_attr_sf.h is included more than once.
Signed-off-by: NAndrea Gelmini <andrea.gelmini@gelma.net>
Signed-off-by: NAlex Elder <aelder@sgi.com>

fce1cad6

xfs: enforce synchronous writes in xfs_bwrite · 8c38366f

由 Christoph Hellwig 提交于 3月 12, 2010

xfs_bwrite is used with the intention of synchronously writing out
buffers, but currently it does not actually clear the async flag if
that's left from previous writes but instead implements async
behaviour if it finds it.  Remove the code handling asynchronous
writes as we've got rid of those entirely outside of the log and
delwri buffers, and make sure that we clear the async and read flags
before writing the buffer.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

8c38366f

xfs: remove periodic superblock writeback · df308bcf

由 Christoph Hellwig 提交于 3月 12, 2010

All modifications to the superblock are done transactional through
xfs_trans_log_buf, so there is no reason to initiate periodic
asynchronous writeback.  This only removes the superblock from the
delwri list and will lead to sub-optimal I/O scheduling.

Cut down xfs_sync_fsdata now that it's only used for synchronous
superblock writes and move the log coverage checks into the two
callers.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

df308bcf

xfs: convert the dquot hash list to use list heads · e6a81f13

由 Dave Chinner 提交于 4月 13, 2010

Convert the dquot hash list on the filesystem to use listhead
infrastructure rather than the roll-your-own in the quota code.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

e6a81f13

xfs: remove duplicate code from dquot reclaim · 368e1361

由 Dave Chinner 提交于 4月 13, 2010

The dquot shaker and the free-list reclaim code use exactly the same
algorithm but the code is duplicated and slightly different in each
case. Make the shaker code use the single dquot reclaim code to
remove the code duplication.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

368e1361

xfs: add log item recovery tracing · 9abbc539

由 Dave Chinner 提交于 4月 13, 2010

Currently there is no tracing in log recovery, so it is difficult to
determine what is going on when something goes wrong.

Add tracing for log item recovery to provide visibility into the log
recovery process. The tracing added shows regions being extracted
from the log transactions and added to the transaction hash forming
recovery items, followed by the reordering, cancelling and finally
recovery of the items.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

9abbc539

xfs: Add inode pin counts to traces · 4aaf15d1

由 Dave Chinner 提交于 3月 08, 2010

We don't record pin counts in inode events right now, and this makes
it difficult to track down problems related to pinning inodes. Add
the pin count to the inode trace class and add trace events for
pinning and unpinning inodes.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

4aaf15d1

xfs: add blockdev name to kthreads · e2a07812

由 Jan Engelhardt 提交于 3月 23, 2010

This allows to see in `ps` and similar tools which kthreads are
allotted to which block device/filesystem, similar to what jbd2
does. As the process name is a fixed 16-char array, no extra
space is needed in tasks.

  PID TTY      STAT   TIME COMMAND
    2 ?        S      0:00 [kthreadd]
  197 ?        S      0:00  \_ [jbd2/sda2-8]
  198 ?        S      0:00  \_ [ext4-dio-unwrit]
  204 ?        S      0:00  \_ [flush-8:0]
 2647 ?        S      0:00  \_ [xfs_mru_cache]
 2648 ?        S      0:00  \_ [xfslogd/0]
 2649 ?        S      0:00  \_ [xfsdatad/0]
 2650 ?        S      0:00  \_ [xfsconvertd/0]
 2651 ?        S      0:00  \_ [xfsbufd/ram0]
 2652 ?        S      0:00  \_ [xfsaild/ram0]
 2653 ?        S      0:00  \_ [xfssyncd/ram0]
Signed-off-by: NJan Engelhardt <jengelh@medozas.de>
Reviewed-by: NDave Chinner <david@fromorbit.com>

e2a07812

xfs: Fix integer overflow in fs/xfs/linux-2.6/xfs_ioctl*.c · fda168c2

由 Zhitong Wang 提交于 3月 23, 2010

The am_hreq.opcount field in the xfs_attrmulti_by_handle() interface
is not bounded correctly. The opcount is used to determine the size
of the buffer required. The size is bounded, but can overflow and so
the size checks may not be sufficient to catch invalid opcounts.
Fix it by catching opcount values that would cause overflows before
calculating the size.
Signed-off-by: NZhitong Wang <zhitong.wangzt@alibaba-inc.com>
Reviewed-by: NDave Chinner <david@fromorbit.com>

fda168c2

30 4月, 2010 1 次提交

xfs: add a shrinker to background inode reclaim · 9bf729c0

由 Dave Chinner 提交于 4月 29, 2010

On low memory boxes or those with highmem, kernel can OOM before the
background reclaims inodes via xfssyncd. Add a shrinker to run inode
reclaim so that it inode reclaim is expedited when memory is low.

This is more complex than it needs to be because the VM folk don't
want a context added to the shrinker infrastructure. Hence we need
to add a global list of XFS mount structures so the shrinker can
traverse them.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

9bf729c0

17 4月, 2010 1 次提交

xfs: don't warn on EAGAIN in inode reclaim · f1d486a3

由 Dave Chinner 提交于 4月 13, 2010

Any inode reclaim flush that returns EAGAIN will result in the inode
reclaim being attempted again later. There is no need to issue a
warning into the logs about this situation.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NAlex Elder <aelder@sgi.com>
Signed-off-by: NAlex Elder <aelder@sgi.com>

f1d486a3

30 3月, 2010 1 次提交

include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6

由 Tejun Heo 提交于 3月 24, 2010

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: NTejun Heo <tj@kernel.org>
Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

5a0e3ad6

17 3月, 2010 3 次提交

xfs: don't warn about page discards on shutdown · e8c3753c

由 Dave Chinner 提交于 3月 15, 2010

If we are doing a forced shutdown, we can get lots of noise about
delalloc pages being discarded. This is happens by design during a
forced shutdown, so don't spam the logs with these messages.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

e8c3753c

xfs: use scalable vmap API · 8a262e57

由 Alex Elder 提交于 3月 16, 2010

Re-apply a commit that had been reverted due to regressions
that have since been fixed.

    From 95f8e302 Mon Sep 17 00:00:00 2001
    From: Nick Piggin <npiggin@suse.de>
    Date: Tue, 6 Jan 2009 14:43:09 +1100

    Implement XFS's large buffer support with the new vmap APIs. See the vmap
    rewrite (db64fe02) for some numbers. The biggest improvement that comes from
    using the new APIs is avoiding the global KVA allocation lock on every call.
Signed-off-by: NNick Piggin <npiggin@suse.de>
Reviewed-by: NChristoph Hellwig <hch@infradead.org>
Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>

Only modifications here were a minor reformat, plus making the patch
apply given the new use of xfs_buf_is_vmapped().
Modified-by: NAlex Elder <aelder@sgi.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

8a262e57

xfs: remove old vmap cache · cd9640a7

由 Alex Elder 提交于 3月 16, 2010

Re-apply a commit that had been reverted due to regressions
that have since been fixed.

    Original commit: d2859751
    Author: Nick Piggin <npiggin@suse.de>
    Date: Tue, 6 Jan 2009 14:40:44 +1100

    XFS's vmap batching simply defers a number (up to 64) of vunmaps,
    and keeps track of them in a list. To purge the batch, it just goes
    through the list and calls vunamp on each one. This is pretty poor:
    a global TLB flush is generally still performed on each vunmap, with
    the most expensive parts of the operation being the broadcast IPIs
    and locking involved in the SMP callouts, and the locking involved
    in the vmap management -- none of these are avoided by just batching
    up the calls. I'm actually surprised it ever made much difference.
    (Now that the lazy vmap allocator is upstream, this description is
    not quite right, but the vunmap batching still doesn't seem to do
    much).

    Rip all this logic out of XFS completely. I will improve vmap
    performance and scalability directly in subsequent patch.
Signed-off-by: NNick Piggin <npiggin@suse.de>
Reviewed-by: NChristoph Hellwig <hch@infradead.org>
Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>

The only change I made was to use the "new" xfs_buf_is_vmapped()
function in a place it had been open-coded in the original.
Modified-by: NAlex Elder <aelder@sgi.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

cd9640a7

06 3月, 2010 5 次提交

pass writeback_control to ->write_inode · a9185b41

由 Christoph Hellwig 提交于 3月 05, 2010

This gives the filesystem more information about the writeback that
is happening.  Trond requested this for the NFS unstable write handling,
and other filesystems might benefit from this too by beeing able to
distinguish between the different callers in more detail.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

a9185b41

make sure data is on disk before calling ->write_inode · 26821ed4

由 Christoph Hellwig 提交于 3月 05, 2010

Similar to the fsync issue fixed a while ago in commit
2daea67e we need to write for data to
actually hit the disk before writing out the metadata to guarantee
data integrity for filesystems that modify the inode in the data I/O
completion path.  Currently XFS and NFS handle this manually, and AFS
has a write_inode method that does nothing but waiting for data, while
others are possibly missing out on this.

Fortunately this change has a lot less impact than the fsync change
as none of the write_inode methods starts data writeout of any form
by itself.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

26821ed4

xfs: truncate delalloc extents when IO fails in writeback · 3ed3a434

由 Dave Chinner 提交于 3月 05, 2010

We currently use block_invalidatepage() to clean up pages where I/O
fails in ->writepage(). Unfortunately, if the page has delalloc
regions on it, we fail to remove the delalloc regions when we
invalidate the page.  This can result in tripping a BUG() in
xfs_get_blocks() later on if a direct IO read is done on that same
region - the delalloc extent is returned when none is supposed to be
there.

Fix this by truncating away the delalloc regions on the page before
invalidating it. Because they are delalloc, we can do this without
needing a transaction. Indeed - if we get ENOSPC errors, we have to
be able to do this truncation without a transaction as there is
no space left for block reservation (typically why we see a ENOSPC
in writeback).
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

3ed3a434

xfs: check for more work before sleeping in xfssyncd · 20f6b2c7

由 Dave Chinner 提交于 3月 04, 2010

xfssyncd processes a queue of work by detaching the queue and
then iterating over all the work items. It then sleeps for a
time period or until new work comes in. If new work is queued
while xfssyncd is actively processing the detached work queue,
it will not process that new work until after a sleep timeout
or the next work event queued wakes it.

Fix this by checking the work queue again before going to sleep.
Signed-off-by: NDave Chinner <david@fromorbit.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

20f6b2c7

xfs: Fix a build warning in xfs_aops.c · 69418932

由 Dave Chinner 提交于 3月 04, 2010

Fix a build warning that slipped through.  Dave Chinner had posted
an updated version of his patch but the previous version--without
this fix--was what got committed.
Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NDave Chinner <david@fromorbit.com>
Signed-off-by: NAlex Elder <aelder@sgi.com>

69418932

05 3月, 2010 2 次提交

quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota · ac0e7737

由 Christoph Hellwig 提交于 2月 16, 2010

We already do these checks in the generic code.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJan Kara <jack@suse.cz>

ac0e7737

quota: clean up Q_XQUOTASYNC · 8c4e4acd

由 Christoph Hellwig 提交于 2月 16, 2010

Currently Q_XQUOTASYNC calls into the quota_sync method, but XFS does something
entirely different in it than the rest of the filesystems.  xfs_quota which
calls Q_XQUOTASYNC expects an asynchronous data writeout to flush delayed
allocations, while the "VFS" quota support wants to flush changes to the quota
file.

So make Q_XQUOTASYNC call into the writeback code directly and make the
quota_sync method optional as XFS doesn't need in the sense expected by the
rest of the quota code.

GFS2 was using limited XFS-style quota and has a quota_sync method fitting
neither the style used by vfs_quota_sync nor xfs_fs_quota_sync.  I left it
in for now as per discussion with Steve it expects to be called from the
sync path this way.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJan Kara <jack@suse.cz>

8c4e4acd

02 3月, 2010 6 次提交

xfs: fix locking for inode cache radix tree tag updates · f1f724e4

由 Christoph Hellwig 提交于 3月 01, 2010

The radix-tree code requires it's users to serialize tag updates
against other updates to the tree.  While XFS protects tag updates
against each other it does not serialize them against updates of the
tree contents, which can lead to tag corruption.  Fix the inode
cache to always take pag_ici_lock in exclusive mode when updating
radix tree tags.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reported-by: NPatrick Schreurs <patrick@news-service.com>
Tested-by: NPatrick Schreurs <patrick@news-service.com>
Signed-off-by: NAlex Elder <aelder@sgi.com>

f1f724e4

xfs: kill xfs_lrw.h · d7658d48

由 Christoph Hellwig 提交于 2月 17, 2010

Move the two declarations to better fitting headers now that
xfs_lrw.c is gone.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDave Chinner <david@fromorbit.com>
Signed-off-by: NAlex Elder <aelder@sgi.com>

d7658d48

xfs: fix xfs_fsblock_t tracing · f7008d0a

由 Christoph Hellwig 提交于 2月 15, 2010

Using a static buffer in xfs_fmtfsblock means we can corrupt traces if
multiple CPUs hit this code path at the same.  Just remove xfs_fmtfsblock
for now and print the block number purely numerical.  If we want the
NULLFSBLOCK and NULLSTARTBLOCK formatting back the best way would be
a decoding plugin in the trace-cmd userspace command.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDave Chinner <david@fromorbit.com>
Signed-off-by: NAlex Elder <aelder@sgi.com>

f7008d0a

xfs: fix inode pincount check in fsync · 024910cb

由 Christoph Hellwig 提交于 2月 17, 2010

We need to hold the ilock to check the inode pincount safely.  While
we're at it also remove the check for ip->i_itemp->ili_last_lsn, a
pinned inode always has it set.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDave Chinner <david@fromorbit.com>
Signed-off-by: NAlex Elder <aelder@sgi.com>

024910cb

xfs: Non-blocking inode locking in IO completion · 77d7a0c2

由 Dave Chinner 提交于 2月 17, 2010

The introduction of barriers to loop devices has created a new IO
order completion dependency that XFS does not handle. The loop
device implements barriers using fsync and so turns a log IO in the
XFS filesystem on the loop device into a data IO in the backing
filesystem. That is, the completion of log IOs in the loop
filesystem are now dependent on completion of data IO in the backing
filesystem.

This can cause deadlocks when a flush daemon issues a log force with
an inode locked because the IO completion of IO on the inode is
blocked by the inode lock. This in turn prevents further data IO
completion from occuring on all XFS filesystems on that CPU (due to
the shared nature of the completion queues). This then prevents the
log IO from completing because the log is waiting for data IO
completion as well.

The fix for this new completion order dependency issue is to make
the IO completion inode locking non-blocking. If the inode lock
can't be grabbed, simply requeue the IO completion back to the work
queue so that it can be processed later. This prevents the
completion queue from being blocked and allows data IO completion on
other inodes to proceed, hence avoiding completion order dependent
deadlocks.
Signed-off-by: NDave Chinner <david@fromorbit.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

77d7a0c2

xfs: implement optimized fdatasync · 66d834ea

由 Christoph Hellwig 提交于 2月 15, 2010

Allow us to track the difference between timestamp and size updates
by using mark_inode_dirty from the I/O completion code, and checking
the VFS inode flags in xfs_file_fsync.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDave Chinner <david@fromorbit.com>
Signed-off-by: NAlex Elder <aelder@sgi.com>

66d834ea