提交 · 9467c4fdd66f6810cecef0f1173330f3c6e67d45 · openeuler / raspberrypi-kernel

06 3月, 2010 2 次提交

pass writeback_control to ->write_inode · a9185b41

由 Christoph Hellwig 提交于 3月 05, 2010

This gives the filesystem more information about the writeback that
is happening.  Trond requested this for the NFS unstable write handling,
and other filesystems might benefit from this too by beeing able to
distinguish between the different callers in more detail.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

a9185b41

make sure data is on disk before calling ->write_inode · 26821ed4

由 Christoph Hellwig 提交于 3月 05, 2010

Similar to the fsync issue fixed a while ago in commit
2daea67e we need to write for data to
actually hit the disk before writing out the metadata to guarantee
data integrity for filesystems that modify the inode in the data I/O
completion path.  Currently XFS and NFS handle this manually, and AFS
has a write_inode method that does nothing but waiting for data, while
others are possibly missing out on this.

Fortunately this change has a lot less impact than the fsync change
as none of the write_inode methods starts data writeout of any form
by itself.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

26821ed4

13 2月, 2010 1 次提交

xfs: only clear the suid bit once in xfs_write · 87185517

由 Christoph Hellwig 提交于 2月 03, 2010

file_remove_suid already calls into ->setattr to clear the suid and
sgid bits if needed, no need to start a second transaction to do it
ourselves.

Note that xfs_write_clear_setuid issues a sync transaction while the
path through ->setattr doesn't, but that is consistant with the
other filesystems.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NAlex Elder <aelder@sgi.com>
Signed-off-by: NAlex Elder <aelder@sgi.com>

87185517

06 2月, 2010 1 次提交

xfs: fix xfs to work with Virtually Indexed architectures · 73c77e2c

由 James Bottomley 提交于 1月 25, 2010

xfs_buf.c includes what is essentially a hand rolled version of
blk_rq_map_kern().  In order to work properly with the vmalloc buffers
that xfs uses, this hand rolled routine must also implement the flushing
API for vmap/vmalloc areas.

[style updates from hch@lst.de]
Acked-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJames Bottomley <James.Bottomley@suse.de>

73c77e2c

04 2月, 2010 1 次提交

xfs: kill xfs_bawrite · 5322892d

由 Dave Chinner 提交于 2月 04, 2010

There are no more users of this function left in the XFS code
now that we've switched everything to delayed write flushing.
Remove it.
Signed-off-by: NDave Chinner <david@fromorbit.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

5322892d

09 2月, 2010 1 次提交

xfs: log changed inodes instead of writing them synchronously · 07fec736

由 Christoph Hellwig 提交于 2月 09, 2010

When an inode has already be flushed delayed write,
xfs_inode_clean() returns true and hence xfs_fs_write_inode() can
return on a synchronous inode write without having written the
inode. Currently these sycnhronous writes only come sync(1),
unmount, a sycnhronous NFS export and cachefiles so should be
relatively rare and out of common performance paths.

Realistically, a synchronous inode write is not necessary here; we
can avoid writing the inode by logging any non-transactional changes
that are pending.  This needs to be done with synchronous
transactions, but it avoids seeking between the log and inode
clusters as we do now. We don't force the log if the inode is
pinned, though, so this differs from the fsync case.  For normal
sys_sync and unmount behaviour this is fine because we do a
synchronous log force in xfs_sync_data which is called from the
->sync_fs code.

It does however break the NFS synchronous export guarantees for now,
but work is under way to fix this at a higher level or for the
higher level to provide an additional flag in the writeback control
to tell us that a log force is needed.

Portions of this patch are based on work from Dave Chinner.
Signed-off-by: NChristoph Hellwig <hch@infradead.org>
Reviewed-by: NDave Chinner <david@fromorbit.com>
Reviewed-by: NAlex Elder <aelder@sgi.com>

07fec736

26 1月, 2010 1 次提交

xfs: Sort delayed write buffers before dispatch · 089716aa

由 Dave Chinner 提交于 1月 26, 2010

Currently when the xfsbufd writes delayed write buffers, it pushes
them to disk in the order they come off the delayed write list. If
there are lots of buffers ѕpread widely over the disk, this results
in overwhelming the elevator sort queues in the block layer and we
end up losing the posibility of merging adjacent buffers to minimise
the number of IOs.

Use the new generic list_sort function to sort the delwri dispatch
queue before issue to ensure that the buffers are pushed in the most
friendly order possible to the lower layers.
Signed-off-by: NDave Chinner <david@fromorbit.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

089716aa

02 2月, 2010 1 次提交

xfs: Don't issue buffer IO direct from AIL push V2 · d808f617

由 Dave Chinner 提交于 2月 02, 2010

All buffers logged into the AIL are marked as delayed write.
When the AIL needs to push the buffer out, it issues an async write of the
buffer. This means that IO patterns are dependent on the order of
buffers in the AIL.

Instead of flushing the buffer, promote the buffer in the delayed
write list so that the next time the xfsbufd is run the buffer will
be flushed by the xfsbufd. Return the state to the xfsaild that the
buffer was promoted so that the xfsaild knows that it needs to cause
the xfsbufd to run to flush the buffers that were promoted.

Using the xfsbufd for issuing the IO allows us to dispatch all
buffer IO from the one queue. This means that we can make much more
enlightened decisions on what order to flush buffers to disk as
we don't have multiple places issuing IO. Optimisations to xfsbufd
will be in a future patch.

Version 2
- kill XFS_ITEM_FLUSHING as it is now unused.
Signed-off-by: NDave Chinner <david@fromorbit.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

d808f617

06 2月, 2010 2 次提交

xfs: Use delayed write for inodes rather than async V2 · c854363e

由 Dave Chinner 提交于 2月 06, 2010

We currently do background inode flush asynchronously, resulting in
inodes being written in whatever order the background writeback
issues them. Not only that, there are also blocking and non-blocking
asynchronous inode flushes, depending on where the flush comes from.

This patch completely removes asynchronous inode writeback. It
removes all the strange writeback modes and replaces them with
either a synchronous flush or a non-blocking delayed write flush.
That is, inode flushes will only issue IO directly if they are
synchronous, and background flushing may do nothing if the operation
would block (e.g. on a pinned inode or buffer lock).

Delayed write flushes will now result in the inode buffer sitting in
the delwri queue of the buffer cache to be flushed by either an AIL
push or by the xfsbufd timing out the buffer. This will allow
accumulation of dirty inode buffers in memory and allow optimisation
of inode cluster writeback at the xfsbufd level where we have much
greater queue depths than the block layer elevators. We will also
get adjacent inode cluster buffer IO merging for free when a later
patch in the series allows sorting of the delayed write buffers
before dispatch.

This effectively means that any inode that is written back by
background writeback will be seen as flush locked during AIL
pushing, and will result in the buffers being pushed from there.
This writeback path is currently non-optimal, but the next patch
in the series will fix that problem.

A side effect of this delayed write mechanism is that background
inode reclaim will no longer directly flush inodes, nor can it wait
on the flush lock. The result is that inode reclaim must leave the
inode in the reclaimable state until it is clean. Hence attempts to
reclaim a dirty inode in the background will simply skip the inode
until it is clean and this allows other mechanisms (i.e. xfsbufd) to
do more optimal writeback of the dirty buffers. As a result, the
inode reclaim code has been rewritten so that it no longer relies on
the ambiguous return values of xfs_iflush() to determine whether it
is safe to reclaim an inode.

Portions of this patch are derived from patches by Christoph
Hellwig.

Version 2:
- cleanup reclaim code as suggested by Christoph
- log background reclaim inode flush errors
- just pass sync flags to xfs_iflush
Signed-off-by: NDave Chinner <david@fromorbit.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

c854363e

xfs: Make inode reclaim states explicit · 777df5af

由 Dave Chinner 提交于 2月 06, 2010

A.K.A.: don't rely on xfs_iflush() return value in reclaim

We have gradually been moving checks out of the reclaim code because
they are duplicated in xfs_iflush(). We've had a history of problems
in this area, and many of them stem from the overloading of the
return values from xfs_iflush() and interaction with inode flush
locking to determine if the inode is safe to reclaim.

With the desire to move to delayed write flushing of inodes and
non-blocking inode tree reclaim walks, the overloading of the
return value of xfs_iflush makes it very difficult to determine
the correct thing to do next.

This patch explicitly re-adds the checks to the inode reclaim code,
removing the reliance on the return value of xfs_iflush() to
determine what to do next. It also means that we can clearly
document all the inode states that reclaim must handle and hence
we can easily see that we handled all the necessary cases.

This also removes the need for the xfs_inode_clean() check in
xfs_iflush() as all callers now check this first (safely).
Signed-off-by: NDave Chinner <david@fromorbit.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

777df5af

09 2月, 2010 1 次提交

xfs: more reserved blocks fixups · d5db0f97

由 Eric Sandeen 提交于 2月 05, 2010

This mangles the reserved blocks counts a little more.

1) add a helper function for the default reserved count
2) add helper functions to save/restore counts on ro/rw
3) save/restore reserved blocks on freeze/thaw
4) disallow changing reserved count while readonly

V2: changed field name to match Dave's changes
Signed-off-by: NEric Sandeen <sandeen@sandeen.net>
Signed-off-by: NAlex Elder <aelder@sgi.com>

d5db0f97

26 1月, 2010 1 次提交

xfs: don't hold onto reserved blocks on remount,ro · cbe132a8

由 Dave Chinner 提交于 1月 26, 2010

If we hold onto reserved blocks when doing a remount,ro we end
up writing the blocks used count to disk that includes the reserved
blocks. Reserved blocks are not actually used, so this results in
the values in the superblock being incorrect.

Hence if we run xfs_check or xfs_repair -n while the filesystem is
mounted remount,ro we end up with an inconsistent filesystem being
reported. Also, running xfs_copy on the remount,ro filesystem will
result in an inconsistent image being generated.

To fix this, unreserve the blocks when doing the remount,ro, and
reserved them again on remount,rw. This way a remount,ro filesystem
will appear consistent on disk to all utilities.
Signed-off-by: NDave Chinner <david@fromorbit.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

cbe132a8

22 1月, 2010 3 次提交

xfs: replace KM_LARGE with explicit vmalloc use · bdfb0430

由 Christoph Hellwig 提交于 1月 20, 2010

We use the KM_LARGE flag to make kmem_alloc and friends use vmalloc
if necessary.  As we only need this for a few boot/mount time
allocations just switch to explicit vmalloc calls there.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

bdfb0430

xfs: cleanup up xfs_log_force calling conventions · a14a348b

由 Christoph Hellwig 提交于 1月 19, 2010

Remove the XFS_LOG_FORCE argument which was always set, and the
XFS_LOG_URGE define, which was never used.

Split xfs_log_force into a two helpers - xfs_log_force which forces
the whole log, and xfs_log_force_lsn which forces up to the
specified LSN.  The underlying implementations already were entirely
separate, as were the users.

Also re-indent the new _xfs_log_force/_xfs_log_force which
previously had a weird coding style.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

a14a348b

xfs: remove duplicate buffer flags · 0cadda1c

由 Christoph Hellwig 提交于 1月 19, 2010

Currently we define aliases for the buffer flags in various
namespaces, which only adds confusion.  Remove all but the XBF_
flags to clean this up a bit.

Note that we still abuse XFS_B_ASYNC/XBF_ASYNC for some non-buffer
uses, but I'll clean that up later.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDave Chinner <david@fromorbit.com>
Signed-off-by: NAlex Elder <aelder@sgi.com>

0cadda1c

20 1月, 2010 2 次提交

xfs: convert attr to use unsigned names · a9273ca5

由 Dave Chinner 提交于 1月 20, 2010

To be consistent with the directory code, the attr code should use
unsigned names. Convert the names from the vfs at the highest level
to unsigned, and ænsure they are consistenly used as unsigned down
to disk.
Signed-off-by: NDave Chinner <david@fromorbit.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

a9273ca5

xfs: xfs_buf_iomove() doesn't care about signedness · b9c48649

由 Dave Chinner 提交于 1月 20, 2010

xfs_buf_iomove() uses xfs_caddr_t as it's parameter types, but it doesn't
care about the signedness of the variables as it is just copying the
data. Change the prototype to use void * so that we don't get sign
warnings at call sites.
Signed-off-by: NDave Chinner <david@fromorbit.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

b9c48649

16 1月, 2010 13 次提交

xfs: move more buffer helpers into xfs_buf.c · 4e23471a

由 Christoph Hellwig 提交于 1月 13, 2010

Move xfsbdstrat and xfs_bdstrat_cb from xfs_lrw.c and xfs_bioerror
and xfs_bioerror_relse from xfs_rw.c into xfs_buf.c.  This also
means xfs_bioerror and xfs_bioerror_relse can be marked static now.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

4e23471a

xfs: clean up xfs_bwrite · 64e0bc7d

由 Christoph Hellwig 提交于 1月 13, 2010

Fold XFS_bwrite into it's only caller, xfs_bwrite and move it into
xfs_buf.c instead of leaving it as a fairly large inline function.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

64e0bc7d

xfs: clean up log buffer writes · 873ff550

由 Christoph Hellwig 提交于 1月 13, 2010

Don't bother using XFS_bwrite as it doesn't provide much code for
our use case.  Instead opencode it and fold xlog_bdstrat_cb into the
new xlog_bdstrat helper.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

873ff550

xfs: Kill filestreams cache flush · b657fc82