1. 12 Oct 2011 (2 commits)
    • xfs: don't serialise adjacent concurrent direct IO appending writes · 7271d243
      Authored by Dave Chinner
      For append write workloads, extending the file requires a certain
      amount of exclusive locking to be done up front to ensure sanity in
      things like ensuring that we've zeroed any allocated regions
      between the old EOF and the start of the new IO.
      
      For single threads, this typically isn't a problem, and for large
      IOs we don't serialise enough for it to be a problem for two
      threads on really fast block devices. However for smaller IO and
      larger thread counts we have a problem.
      
      Take 4 concurrent sequential, single block sized and aligned IOs.
      After the first IO is submitted but before it completes, we end up
      with this state:
      
              IO 1    IO 2    IO 3    IO 4
            +-------+-------+-------+-------+
            ^       ^
            |       |
            |       |
            |       |
            |       \- ip->i_new_size
            \- ip->i_size
      
      And the IO is done without exclusive locking because offset <=
      ip->i_size. When we submit IO 2, we see offset > ip->i_size, and
      grab the IO lock exclusive, because there is a chance we need to do
      EOF zeroing. However, there is already an IO in progress that avoids
      the need for EOF zeroing because offset <= ip->i_new_size, so we
      could avoid holding the IO lock exclusive for this. Hence after
      submission of the second IO, we'd end up in this state:
      
              IO 1    IO 2    IO 3    IO 4
            +-------+-------+-------+-------+
            ^               ^
            |               |
            |               |
            |               |
            |               \- ip->i_new_size
            \- ip->i_size
      
      There is no need to grab the i_mutex or the IO lock in exclusive
      mode if we don't need to invalidate the page cache. Taking these
      locks on every direct IO effectively serialises them, as taking the
      IO lock in exclusive mode has to wait for all shared holders to drop
      the lock. That only happens when IO is complete, so effectively it
      prevents dispatch of concurrent direct IO writes to the same inode.
      
      And so you can see that for the third concurrent IO, we could avoid
      exclusive locking for the same reason we could have avoided it for
      the second IO.
      
      Fixing this is a bit more complex than that, because we need to hold
      a write-submission local value of ip->i_new_size so that clearing
      the value is only done if no other thread has updated it before our
      IO completes.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
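      The last paragraph of that message is the subtle part: each appending write
      has to remember the ip->i_new_size value it published at submission, and at
      completion may only clear the field if no later write has raised it further.
      Below is a minimal, self-contained model of that compare-and-clear idea. It
      is a sketch, not the patch: the names (model_inode, submit_append,
      complete_append) are invented, and C11 atomics stand in for the inode
      locking the real XFS code uses to protect i_size and i_new_size.

          #include <stdatomic.h>
          #include <stdint.h>
          #include <stdio.h>

          struct model_inode {
              _Atomic uint64_t i_size;      /* visible EOF */
              _Atomic uint64_t i_new_size;  /* highest in-flight write end, 0 = none */
          };

          /* At submission: publish our new in-flight EOF, remember what we wrote. */
          static uint64_t submit_append(struct model_inode *ip, uint64_t new_end)
          {
              uint64_t prev = atomic_load(&ip->i_new_size);

              /* Only ever raise i_new_size; retry if another writer races us. */
              while (prev < new_end &&
                     !atomic_compare_exchange_weak(&ip->i_new_size, &prev, new_end))
                  ;
              return new_end;               /* the write-submission local value */
          }

          /* At completion: move i_size forward, then clear i_new_size only if
           * no other thread has updated it since our submission. */
          static void complete_append(struct model_inode *ip, uint64_t my_new_size)
          {
              uint64_t cur = atomic_load(&ip->i_size);

              while (cur < my_new_size &&
                     !atomic_compare_exchange_weak(&ip->i_size, &cur, my_new_size))
                  ;
              atomic_compare_exchange_strong(&ip->i_new_size, &my_new_size, 0);
          }

          int main(void)
          {
              struct model_inode ip = { 0, 0 };
              uint64_t io1 = submit_append(&ip, 4096);
              uint64_t io2 = submit_append(&ip, 8192);

              complete_append(&ip, io1);    /* must not clear: IO 2 raised it */
              printf("after IO 1: i_new_size = %llu\n",
                     (unsigned long long)atomic_load(&ip.i_new_size));
              complete_append(&ip, io2);    /* last completer clears it */
              printf("after IO 2: i_new_size = %llu\n",
                     (unsigned long long)atomic_load(&ip.i_new_size));
              return 0;
          }

      With this rule, the write that finishes last is the one that clears
      i_new_size, which is exactly why an earlier completion must not wipe the
      value another in-flight write is still relying on.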
    • xfs: don't serialise direct IO reads on page cache checks · 0c38a251
      Authored by Dave Chinner
      There is no need to grab the i_mutex or the IO lock in exclusive
      mode if we don't need to invalidate the page cache. Taking these
      locks on every direct IO effectively serialises them, as taking the
      IO lock in exclusive mode has to wait for all shared holders to drop
      the lock. That only happens when IO is complete, so effectively it
      prevents dispatch of concurrent direct IO reads to the same inode.
      
      Fix this by taking the IO lock shared to check the page cache state,
      and only then drop it and take the IO lock exclusively if there is
      work to be done. Hence for the normal direct IO case, no exclusive
      locking will occur.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Tested-by: Joern Engel <joern@logfs.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
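      The fix described above is a common locking pattern: check state under a
      shared lock, and fall back to the exclusive lock only when there is real
      work to do. The sketch below models it with a POSIX rwlock standing in for
      the XFS IO lock; the names (model_inode, dio_read, cached_pages) are
      invented for illustration and are not the functions the patch touches.

          #include <pthread.h>
          #include <stdbool.h>
          #include <stdio.h>

          struct model_inode {
              pthread_rwlock_t iolock;       /* stands in for the XFS IO lock */
              bool             cached_pages; /* stands in for mapping->nrpages != 0 */
          };

          static void dio_read(struct model_inode *ip)
          {
              /* Common case: the shared lock is enough to check the page cache. */
              pthread_rwlock_rdlock(&ip->iolock);

              if (ip->cached_pages) {
                  /* Rare case: drop the shared lock and retake it exclusively so
                   * the cached pages can be flushed and invalidated.  Real code
                   * must recheck the state here, since it may have changed while
                   * no lock was held. */
                  pthread_rwlock_unlock(&ip->iolock);
                  pthread_rwlock_wrlock(&ip->iolock);
                  ip->cached_pages = false;  /* stands in for the invalidation */
              }

              /* ... submit the direct IO, then drop whichever lock is held. */
              pthread_rwlock_unlock(&ip->iolock);
          }

          int main(void)
          {
              struct model_inode ip = { .cached_pages = false };

              pthread_rwlock_init(&ip.iolock, NULL);
              dio_read(&ip);                 /* stays on the shared lock */
              ip.cached_pages = true;
              dio_read(&ip);                 /* takes the exclusive lock once */
              pthread_rwlock_destroy(&ip.iolock);
              return 0;
          }

      Because concurrent readers can all hold the shared lock at once, the common
      case (no cached pages) no longer serialises direct IO reads against each
      other.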
  2. 14 Sep 2011 (1 commit)
  3. 01 Sep 2011 (2 commits)
    • xfs: fix ->write_inode return values · 58d84c4e
      Authored by Christoph Hellwig
      Currently we always redirty an inode that was attempted to be written out
      synchronously but has already been cleaned by an AIL push internally,
      which is rather bogus.  Fix that by doing the i_update_core check early
      on and returning 0 for it.  Also do the same for async calls, as doing
      any work for those is just as pointless.  While we're at it, also fix
      the sign of the EIO return in case of a filesystem shutdown, and fix the
      completely nonsensical locking around xfs_log_inode.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      (cherry picked from commit 297db93bb74cf687510313eb235a7aec14d67e97)
      Signed-off-by: Alex Elder <aelder@sgi.com>
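      A compact model of the control flow that message describes follows. It is
      an interpretation of the text, not the patch itself: the helper and field
      names (model_write_inode, update_core, fs_shut_down) are invented, and only
      the decisions mirror the message, namely return a negative errno on
      shutdown, and return 0 early for a clean in-core inode and for async
      writeback, where doing any work is pointless.

          #include <errno.h>
          #include <stdbool.h>
          #include <stdio.h>

          enum sync_mode { WB_SYNC_NONE, WB_SYNC_ALL };

          struct model_inode {
              bool update_core;   /* models i_update_core: in-core inode changed */
              bool fs_shut_down;  /* models a forced filesystem shutdown */
          };

          /* Returns 0 or a negative errno, the convention the commit fixes. */
          static int model_write_inode(const struct model_inode *ip, enum sync_mode mode)
          {
              if (ip->fs_shut_down)
                  return -EIO;            /* negative, not positive, EIO */

              /* Clean in-core inode: nothing to write back, don't redirty it. */
              if (!ip->update_core)
                  return 0;

              /* Async writeback: doing any work here is just as pointless. */
              if (mode != WB_SYNC_ALL)
                  return 0;

              /* Sync case: this is where the inode would be logged/flushed. */
              return 0;
          }

          int main(void)
          {
              struct model_inode clean = { .update_core = false };
              struct model_inode dead  = { .update_core = true, .fs_shut_down = true };

              printf("clean inode, sync : %d\n", model_write_inode(&clean, WB_SYNC_ALL));
              printf("shut-down fs      : %d\n", model_write_inode(&dead, WB_SYNC_ALL));
              return 0;
          }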
    • xfs: fix xfs_mark_inode_dirty during umount · 866e4ed7
      Authored by Christoph Hellwig
      During umount we do not add a dirty inode to the lru and wait for it to
      become clean first, but force writeback of data and metadata with
      I_WILL_FREE set.  Currently there is no way for XFS to detect that the
      inode has been redirtied for metadata operations, as we skip the
      mark_inode_dirty call during teardown.  Fix this by setting i_update_core
      manually in that case, so that the inode gets flushed during inode reclaim.
      
      Alternatively we could enable calling mark_inode_dirty for inodes in
      I_WILL_FREE state, and let the VFS dirty tracking handle this.  I decided
      against this as we will get better I/O patterns from reclaim compared to
      the synchronous writeout in write_inode_now, and always marking the inode
      dirty in some way from xfs_mark_inode_dirty is a better safety net in
      either case.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      (cherry picked from commit da6742a5a4cc844a9982fdd936ddb537c0747856)
      Signed-off-by: Alex Elder <aelder@sgi.com>
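      A small model of the behaviour that message describes follows. It is a
      sketch of the idea rather than the patch: the structure and helper names
      are invented, I_WILL_FREE here is a local stand-in for the VFS flag of the
      same name, and the point is simply that the XFS-side dirty flag is always
      recorded while the VFS dirtying is skipped for an inode about to be freed.

          #include <stdbool.h>
          #include <stdio.h>

          #define I_WILL_FREE (1u << 0)      /* local stand-in for the VFS flag */

          struct model_inode {
              unsigned int i_state;          /* models inode->i_state */
              bool         i_update_core;    /* models the XFS in-core dirty flag */
              bool         vfs_dirty;        /* models VFS dirty-list membership */
          };

          static void model_mark_inode_dirty(struct model_inode *inode)
          {
              /* Always record that the in-core inode changed, so inode reclaim
               * will flush it even if the VFS never sees it as dirty. */
              inode->i_update_core = true;

              /* During the final umount writeback the inode is I_WILL_FREE and
               * will not go back on the dirty list, so skip the VFS dirtying. */
              if (inode->i_state & I_WILL_FREE)
                  return;

              inode->vfs_dirty = true;
          }

          int main(void)
          {
              struct model_inode live  = { 0 };
              struct model_inode dying = { .i_state = I_WILL_FREE };

              model_mark_inode_dirty(&live);
              model_mark_inode_dirty(&dying);
              printf("live : update_core=%d vfs_dirty=%d\n",
                     live.i_update_core, live.vfs_dirty);
              printf("dying: update_core=%d vfs_dirty=%d\n",
                     dying.i_update_core, dying.vfs_dirty);
              return 0;
          }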
  4. 25 Aug 2011 (1 commit)
  5. 23 Aug 2011 (1 commit)
  6. 13 Aug 2011 (4 commits)
  7. 11 Aug 2011 (1 commit)
  8. 10 Aug 2011 (1 commit)
  9. 01 Aug 2011 (3 commits)
  10. 30 Jul 2011 (1 commit)
  11. 27 Jul 2011 (6 commits)
  12. 26 Jul 2011 (17 commits)