Commits · 4e94b71b7068b4bd9c615301197e09dbf0c3b770 · openeuler / raspberrypi-kernel

15 May, 2012 · 5 commits

xfs: use blocks for counting length of buffers · 4e94b71b

Authored by Dave Chinner on Apr 23, 2012

Now that we pass block counts everywhere, and index buffers by block
number, track the length of the buffer in units of blocks rather
than bytes. Convert the code to use block counts, and those that
need byte counts get converted at the time of use.

Also, remove the XFS_BUF_{SET_}SIZE() macros that are just wrappers
around the buffer length. They only serve to make the code shouty
loud and don't actually add any real value.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
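
The unit change described in this commit can be illustrated with a small userspace sketch. It assumes 512-byte basic blocks (a shift of 9); the `sketch_` names and the struct layout are hypothetical and are not the kernel's definitions. In the kernel itself, conversions of this kind are done with the `BBTOB()` and `BTOBB()` macros.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: a basic block is 512 bytes (shift of 9). */
#define SKETCH_BBSHIFT 9
#define SKETCH_BBSIZE  (1u << SKETCH_BBSHIFT)

/* Hypothetical buffer descriptor: length is kept in blocks, not bytes. */
struct sketch_buf {
    uint64_t blkno;   /* starting disk address, in basic blocks */
    size_t   length;  /* buffer length, in basic blocks */
};

/* Convert a block count to bytes only where a byte count is needed. */
static inline size_t sketch_blocks_to_bytes(size_t blocks)
{
    return blocks << SKETCH_BBSHIFT;
}

/* Round a byte count up to a whole number of basic blocks. */
static inline size_t sketch_bytes_to_blocks(size_t bytes)
{
    return (bytes + SKETCH_BBSIZE - 1) >> SKETCH_BBSHIFT;
}

/* Byte length of a buffer, computed at the point of use. */
static inline size_t sketch_buf_bytes(const struct sketch_buf *bp)
{
    return sketch_blocks_to_bytes(bp->length);
}
```

Keeping one canonical unit in the structure and converting at the point of use means there is a single place where rounding can happen, rather than one at every call site.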

xfs: kill b_file_offset · de1cbee4

Authored by Dave Chinner on Apr 23, 2012

Seeing as we pass block numbers around everywhere in the buffer
cache now, it makes no sense to index everything by byte offset.
Replace all the byte offset indexing with block number based
indexing, and replace all uses of the byte offset with direct
conversion from the block index.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

xfs: clean up buffer get/read call API · e70b73f8

Authored by Dave Chinner on Apr 23, 2012

The xfs_buf_get/read API is not consistent in the units it uses, and
does not use appropriate or consistent units/types for the
variables.

Convert the API to use disk addresses and block counts for all
buffer get and read calls. Use consistent naming for all the
functions and their declarations, and convert the internal functions
to use disk addresses and block counts to avoid the need to convert
them from one type to another and back again.

Fix all the callers to use disk addresses and block counts. In many
cases, this removes an additional conversion from the function call
as the callers already have a block count.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

xfs: check for buffer errors before waiting · 0e95f19a

Authored by Dave Chinner on Apr 23, 2012

If we call xfs_buf_iowait() on a buffer that failed dispatch due to
an IO error, it will wait forever for an IO that does not exist.
This is handled in xfs_buf_read, but there is other code that calls
xfs_buf_iowait directly that doesn't.

Rather than make the call sites have to handle checking for dispatch
errors and then checking for completion errors, make
xfs_buf_iowait() check for dispatch errors on the buffer before
waiting. This means we handle both dispatch and completion errors
with one set of error handling at the caller sites.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
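
The pattern this commit describes, checking for an already-recorded error before blocking, can be sketched in userspace as follows. This is only an illustration: the `io_buf` type and its helpers are hypothetical, and the "wait" is a stand-in that completes immediately rather than a real kernel completion.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical buffer state for this sketch. */
struct io_buf {
    int  error;       /* 0, or a negative errno set at dispatch or completion */
    bool io_pending;  /* true while an I/O is actually in flight */
    int  waits;       /* number of times the waiter blocked (illustration only) */
};

/* Stand-in for blocking on a completion; here the I/O finishes at once. */
static void io_buf_wait_for_completion(struct io_buf *bp)
{
    bp->waits++;
    bp->io_pending = false;
}

/*
 * Wait for I/O on the buffer and return its error state.
 * If dispatch already failed, no I/O exists, so waiting would never return;
 * skip the wait and report the dispatch error instead.
 */
static int io_buf_iowait(struct io_buf *bp)
{
    if (!bp->error)
        io_buf_wait_for_completion(bp);
    return bp->error;
}
```

With this shape, a caller checks one return value and covers both the dispatch-failure and completion-failure cases.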

xfs: on-stack delayed write buffer lists · 43ff2122

Authored by Christoph Hellwig on Apr 23, 2012

Queue delwri buffers on a local on-stack list instead of a per-buftarg one,
and write back the buffers per-process instead of by waking up xfsbufd.

This is now easily doable given that we have very few places left that write
delwri buffers:

 - log recovery:
	Only done at mount time, and already forcing out the buffers
	synchronously using xfs_flush_buftarg

 - quotacheck:
	Same story.

 - dquot reclaim:
	Writes out dirty dquots on the LRU under memory pressure.  We might
	want to look into doing more of this via xfsaild, but it's already
	more optimal than the synchronous inode reclaim that writes each
	buffer synchronously.

 - xfsaild:
	This is the main beneficiary of the change.  By keeping a local list
	of buffers to write we reduce latency of writing out buffers, and
	more importantly we can remove all the delwri list promotions which
	were hitting the buffer cache hard under sustained metadata loads.

The implementation is very straightforward - xfs_buf_delwri_queue now gets
a new list_head pointer that it adds the delwri buffers to, and all callers
need to eventually submit the list using xfs_buf_delwri_submit or
xfs_buf_delwri_submit_nowait.  Buffers that already are on a delwri list are
skipped in xfs_buf_delwri_queue, assuming they already are on another delwri
list.  The biggest change to pass down the buffer list was done to the AIL
pushing. Now that we operate on buffers the trylock, push and pushbuf log
item methods are merged into a single push routine, which tries to lock the
item, and if possible add the buffer that needs writeback to the buffer list.
This leads to much simpler code than the previous split but requires the
individual IOP_PUSH instances to unlock and reacquire the AIL around calls
to blocking routines.

Given that xfsailds now also handle writing out buffers, the conditions for
log forcing and the sleep times needed some small changes.  The most
important one is that we consider an AIL busy as long as we still have buffers
to push, and the other one is that we do increment the pushed LSN for
buffers that are under flushing at this moment, but still count them towards
the stuck items for restart purposes.  Without this we could hammer on stuck
items without ever forcing the log and not make progress under heavy random
delete workloads on fast flash storage devices.

[ Dave Chinner:
	- rebase on previous patches.
	- improved comments for XBF_DELWRI_Q handling
	- fix XBF_ASYNC handling in queue submission (test 106 failure)
	- rename delwri submit function buffer list parameters for clarity
	- xfs_efd_item_push() should return XFS_ITEM_PINNED ]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
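
The queue-then-submit pattern from this commit can be sketched in plain C. Everything here is a simplified assumption: the `dw_` types and functions are hypothetical, a singly linked list stands in for the kernel's `list_head`, and "writing" a buffer just sets a flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer for this sketch. */
struct dw_buf {
    int            id;
    bool           queued;   /* plays the role of a "delwri queued" flag */
    bool           written;  /* set when the stand-in write is issued */
    struct dw_buf *next;
};

/* Caller-owned list, typically a local variable on the caller's stack. */
struct dw_list {
    struct dw_buf *head;
    struct dw_buf *tail;
};

/* Add bp to the caller's list; return false if it is already on a list. */
static bool dw_queue(struct dw_buf *bp, struct dw_list *list)
{
    if (bp->queued)
        return false;
    bp->queued = true;
    bp->next = NULL;
    if (list->tail)
        list->tail->next = bp;
    else
        list->head = bp;
    list->tail = bp;
    return true;
}

/* Write out every buffer on the list, empty the list, return the count. */
static int dw_submit(struct dw_list *list)
{
    int count = 0;
    struct dw_buf *bp = list->head;

    while (bp) {
        struct dw_buf *next = bp->next;

        bp->written = true;   /* stand-in for issuing the write */
        bp->queued = false;
        bp->next = NULL;
        count++;
        bp = next;
    }
    list->head = list->tail = NULL;
    return count;
}
```

Because the list belongs to the caller, the caller decides when the writes are issued, instead of handing the buffers to a shared per-target queue and waking a background thread.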

29 Mar, 2012 · 1 commit

Remove all #inclusions of asm/system.h · 9ffc93f2

Authored by David Howells on Mar 28, 2012

Remove all #inclusions of asm/system.h preparatory to splitting and killing
it. Performed with the following command:

perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`

Signed-off-by: David Howells <dhowells@redhat.com>

17 Dec, 2011 · 1 commit

xfs: remove unused XBT_FORCE_SLEEP bit · 687d1c5e

Authored by Eric Sandeen on Dec 13, 2011

XBT_FORCE_SLEEP is no longer ever tested; it is only set
and cleared.  Remove it.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

12 Oct, 2011 · 12 commits

xfs: remove XFS_bflush · a9add83e

Authored by Christoph Hellwig on Oct 10, 2011

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: remove xfs_buf_target_name · 02b102df

Authored by Christoph Hellwig on Oct 10, 2011

The calling convention that returns a pointer to a static buffer is
fairly nasty, so just opencode it in the only caller that is left.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: clean up xfs_ioerror_alert · 901796af

Authored by Christoph Hellwig on Oct 10, 2011

Instead of passing the block number and mount structure explicitly,
get them off the bp and make the argument order more natural.

Also move it to xfs_buf.c and stop printing the device name, given
that we already get the fs name as part of xfs_alert and we know
what device it operates on because of the caller that gets printed.
Finally, rename it to xfs_buf_ioerror_alert and pass __func__ as
an argument where it makes sense.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: clean up buffer allocation · 4347b9d7

Authored by Christoph Hellwig on Oct 10, 2011

Change _xfs_buf_initialize to allocate the buffer directly and rename it to
xfs_buf_alloc now that it is the only buffer allocation routine.  Also remove
the xfs_buf_deallocate wrapper around the kmem_zone_free calls for buffers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: remove XFS_BUF_STALE and XFS_BUF_SUPER_STALE · c867cb61

Authored by Christoph Hellwig on Oct 10, 2011

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: remove XFS_BUF_SET_VTYPE and XFS_BUF_SET_VTYPE_REF · 38f23232

Authored by Christoph Hellwig on Oct 10, 2011

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: remove XFS_BUF_FINISH_IOWAIT · 5fde0326

Authored by Christoph Hellwig on Oct 10, 2011

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: remove xfs_get_buftarg_list · b17b8334

Authored by Christoph Hellwig on Oct 10, 2011

The code is unused and under a config option that doesn't exist, so remove it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: fix buffer flushing during unmount · 87c7bec7

Authored by Christoph Hellwig on Sep 14, 2011

The code to flush buffers in the umount code is a bit iffy: we first
flush all delwri buffers out, but then might be able to queue up a
new one when logging the sb counts.  On a normal shutdown that one
would get flushed out when doing the synchronous superblock write in
xfs_unmountfs_writesb, but we skip that one if the filesystem has
been shut down.

Fix this by moving the delwri list flushing until just before unmounting
the log, and while we're at it also remove the superfluous delwri list
and buffer lru flushing for the rt and log device that can never have
cached or delwri buffers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Amit Sahrawat <amit.sahrawat83@gmail.com>
Tested-by: Amit Sahrawat <amit.sahrawat83@gmail.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: use the "delwri" terminology consistently · c4e1c098

Authored by Christoph Hellwig on Aug 23, 2011

And also remove the strange local lock and delwri list pointers in a few
functions.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: let xfs_bwrite callers handle the xfs_buf_relse · c2b006c1

Authored by Christoph Hellwig on Aug 23, 2011

Remove the xfs_buf_relse from xfs_bwrite and let the caller handle it to
mirror the delwri and read paths.

Also remove the mount pointer passed to xfs_bwrite, which is superfluous now
that we have a mount pointer in the buftarg.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: call xfs_buf_delwri_queue directly · 61551f1e

Authored by Christoph Hellwig on Aug 23, 2011

Unify the ways we add buffers to the delwri queue by always calling
xfs_buf_delwri_queue directly.  The xfs_bdwrite function is removed and
opencoded in its callers, and the two places setting XBF_DELWRI while a
buffer is locked and expecting xfs_buf_unlock to pick it up are converted
to call xfs_buf_delwri_queue directly, too.  Also replace the
XFS_BUF_UNDELAYWRITE macro with direct calls to xfs_buf_delwri_dequeue
to make the explicit queuing/dequeuing more obvious.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

13 Aug, 2011 · 1 commit

xfs: remove subdirectories · c59d87c4

Authored by Christoph Hellwig on Aug 12, 2011

Use the move from Linux 2.6 to Linux 3.x as an excuse to kill the
annoying subdirectories in the XFS source code.  Besides the large
number of file renames, the only changes are to the Makefile, a few
files including headers with the subdirectory prefix, and the binary
sysctl compat code that includes a header under fs/xfs/ from
kernel/.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

26 Jul, 2011 · 11 commits

xfs: Remove the macro XFS_BUFTARG_NAME · c35a549c

Authored by Chandra Seetharaman on Jul 22, 2011

Remove the definition and usages of the macro XFS_BUFTARG_NAME.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: Remove the macro XFS_BUF_TARGET · 49074c06

Authored by Chandra Seetharaman on Jul 22, 2011

Remove the definition and usages of the macro XFS_BUF_TARGET.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: Remove the macro XFS_BUF_SET_TARGET · e38c9b87

Authored by Chandra Seetharaman on Jul 22, 2011

Remove the macro XFS_BUF_SET_TARGET.

hch: As all the buffer allocators already set ->b_target it should be safe
to simply remove these calls.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

Replace the macro XFS_BUF_ISPINNED with helper xfs_buf_ispinned · 811e64c7

Authored by Chandra Seetharaman on Jul 22, 2011

Replace the macro XFS_BUF_ISPINNED with an inline helper function
xfs_buf_ispinned, and change all its usages.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: Remove the macro XFS_BUF_SET_PTR · 02fe03d9

Authored by Chandra Seetharaman on Jul 22, 2011

Remove the definition and usages of the macro XFS_BUF_SET_PTR.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: Remove the macro XFS_BUF_PTR · 62926044

Authored by Chandra Seetharaman on Jul 22, 2011

Remove the definition and usages of the macro XFS_BUF_PTR.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: Remove macro XFS_BUF_SET_START · 0095a21e

Authored by Chandra Seetharaman on Jul 22, 2011

Remove the definition and usage of the macro XFS_BUF_SET_START.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: Remove macro XFS_BUF_HOLD · 72790aa1

Authored by Chandra Seetharaman on Jul 22, 2011

Remove the definition and usage of the macro XFS_BUF_HOLD.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: Remove macro XFS_BUF_BUSY and family · b75e40a4

Authored by Chandra Seetharaman on Jul 22, 2011

Remove the definitions and uses of the macros XFS_BUF_BUSY,
XFS_BUF_UNBUSY, and XFS_BUF_ISBUSY.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: Remove the macro XFS_BUF_ERROR and family · 5a52c2a5

Authored by Chandra Seetharaman on Jul 22, 2011

Remove the definitions and usage of the macros XFS_BUF_ERROR,
XFS_BUF_GETERROR and XFS_BUF_ISERROR.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: Remove the macro XFS_BUF_BFLAGS · ed43233b

Authored by Chandra Seetharaman on Jul 22, 2011

Remove the definition of the macro XFS_BUF_BFLAGS and its usage.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

13 Jul, 2011 · 3 commits

xfs: remove wrappers around b_iodone · cb669ca5

Authored by Christoph Hellwig on Jul 13, 2011

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

xfs: remove wrappers around b_fspriv · adadbeef

Authored by Christoph Hellwig on Jul 13, 2011

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

xfs: add a proper transaction pointer to struct xfs_buf · bf9d9013

Authored by Christoph Hellwig on Jul 13, 2011

Replace the typeless b_fspriv2 and the ugly macros around it with a properly
typed transaction pointer.  As fallout, the log buffer state debug checks
are also removed.  We could have kept them using casts, but as they do
not have a real purpose we may as well just remove them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

08 Jul, 2011 · 3 commits

xfs: cleanup I/O-related buffer flags · 1d5ae5df

Authored by Christoph Hellwig on Jul 08, 2011

Remove the unused and misnamed _XBF_RUN_QUEUES flag, rename XBF_LOG_BUFFER
to the more fitting XBF_SYNCIO, and split XBF_ORDERED into XBF_FUA and
XBF_FLUSH to allow more fine grained control over the bio flags. Also
clean up processing of the flags in _xfs_buf_ioapply to make more sense,
and renumber the sparse flag number space to group flags by purpose.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

xfs: clean up buffer locking helpers · 0c842ad4

Authored by Christoph Hellwig on Jul 08, 2011

Rename xfs_buf_cond_lock and reverse its return value to fit most other
trylock operations in the kernel and XFS (with the exception of down_trylock,
after which xfs_buf_cond_lock was modelled), and replace xfs_buf_lock_val
with an xfs_buf_islocked for use in asserts, or an opencoded variant in
tracing. Remove the XFS_BUF_* wrappers for all the locking helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
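
The two return conventions mentioned in this commit can be contrasted with a short userspace sketch. The `tl_` names are hypothetical and the counter is not a real semaphore; the only facts relied on are that the kernel's `down_trylock()` returns 0 when the semaphore was acquired, while most other trylock operations return nonzero on success.

```c
#include <stdbool.h>

/* Toy counting semaphore for this sketch; not thread safe. */
struct tl_sema {
    int count;
};

/* down_trylock() style: 0 means the lock was taken, 1 means it was not. */
static int tl_down_trylock(struct tl_sema *s)
{
    if (s->count > 0) {
        s->count--;
        return 0;
    }
    return 1;
}

/* Conventional trylock style: true means the lock was taken. */
static bool tl_buf_trylock(struct tl_sema *s)
{
    return tl_down_trylock(s) == 0;
}

/* Release the lock taken by either of the functions above. */
static void tl_buf_unlock(struct tl_sema *s)
{
    s->count++;
}
```

With the conventional form, call sites read as `if (!tl_buf_trylock(...))` for the failure path, which matches how spinlock and mutex trylocks are normally tested.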

xfs: remove the unused xfs_bufhash structure · bbb4197c

Authored by Christoph Hellwig on Jul 08, 2011

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

20 May, 2011 · 1 commit

xfs: reset buffer pointers before freeing them · 44396476

Authored by Dave Chinner on Apr 21, 2011

When we free a vmapped buffer, we need to ensure the vmap address
and length we free is the same as when it was allocated. In various
places in the log code we change the memory the buffer is pointing
to before issuing IO, but we never reset the buffer to point back to
its original memory (or no memory, if that is the case for the
buffer).

As a result, when we free the buffer it points to memory that is
owned by something else and attempts to unmap and free it. Because
the range does not match any known mapped range, it can trigger
BUG_ON() traps in the vmap code, and potentially corrupt the vmap
area tracking.

Fix this by always resetting these buffers to their original state
before freeing them.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
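
The failure mode and the fix can be shown with a userspace analogue. This is a sketch under stated assumptions: the `rb_` names are hypothetical, `malloc`/`free` stand in for the vmap allocation, and recording the original address inside the buffer is a choice made for this illustration, not necessarily how the kernel code tracks it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer that can be temporarily pointed at other memory. */
struct rb_buf {
    void  *addr;       /* memory the buffer currently points at */
    size_t len;
    void  *orig_addr;  /* memory the buffer was allocated with */
    size_t orig_len;
};

static struct rb_buf *rb_alloc(size_t len)
{
    struct rb_buf *bp = malloc(sizeof(*bp));

    if (!bp)
        return NULL;
    bp->orig_addr = malloc(len);
    if (!bp->orig_addr) {
        free(bp);
        return NULL;
    }
    bp->orig_len = len;
    bp->addr = bp->orig_addr;
    bp->len = len;
    return bp;
}

/* Temporarily point the buffer at memory owned by someone else. */
static void rb_associate(struct rb_buf *bp, void *mem, size_t len)
{
    bp->addr = mem;
    bp->len = len;
}

/* Restore the address and length recorded at allocation time. */
static void rb_reset(struct rb_buf *bp)
{
    bp->addr = bp->orig_addr;
    bp->len = bp->orig_len;
}

/* Free the buffer; reset first so only memory it owns is released. */
static void rb_free(struct rb_buf *bp)
{
    rb_reset(bp);
    free(bp->addr);
    free(bp);
}
```

Without the reset in `rb_free()`, the final `free()` would be handed memory belonging to another owner, which is the userspace counterpart of the mismatched vmap range described above.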

26 Mar, 2011 · 1 commit

xfs: stop using the page cache to back the buffer cache · 0e6e847f

Authored by Dave Chinner on Mar 26, 2011

Now that the buffer cache has its own LRU, we do not need to use
the page cache to provide persistent caching and reclaim
infrastructure. Convert the buffer cache to use alloc_pages()
instead of the page cache. This will remove all the overhead of page
cache management from setup and teardown of the buffers, as well as
needing to mark pages accessed as we find buffers in the buffer
cache.

By avoiding the page cache, we also remove the need to keep state in
the page_private(page) field for persistent storage across buffer
free/buffer rebuild and so all that code can be removed. This also
fixes the long-standing problem of not having enough bits in the
page_private field to track all the state needed for a 512 byte
sector/64k page setup.

It also removes the need for page locking during reads as the pages
are unique to the buffer and nobody else will be attempting to
access them.

Finally, it removes the buftarg address space lock as a point of
global contention on workloads that allocate and free buffers
quickly such as when creating or removing large numbers of inodes in
parallel. This also removes the 16TB limit on filesystem size on 32 bit
machines as the page index (32 bit) is no longer used for lookups
of metadata buffers - the buffer cache is now solely indexed by disk
address which is stored in a 64 bit field in the buffer.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Alex Elder <aelder@sgi.com>
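
The 16TB figure follows from simple arithmetic once a page size is fixed. The sketch below assumes 4 KiB pages (a shift of 12), which the commit message does not state; the `pg_` helper names are hypothetical.

```c
#include <stdint.h>

/*
 * Largest byte range addressable through a page index of the given width:
 * (number of distinct indices) * (bytes per page).
 */
static uint64_t pg_index_limit_bytes(unsigned index_bits, unsigned page_shift)
{
    return (UINT64_C(1) << index_bits) << page_shift;
}

/* Convert a byte count to whole tebibytes (2^40 bytes each). */
static uint64_t pg_bytes_to_tib(uint64_t bytes)
{
    return bytes >> 40;
}
```

For a 32-bit index and 4 KiB pages this gives 2^32 × 2^12 = 2^44 bytes, that is `pg_bytes_to_tib(pg_index_limit_bytes(32, 12)) == 16`. Indexing by a 64-bit disk address removes the page index from the calculation.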

12 Jan, 2011 · 1 commit

xfs: fix error handling for synchronous writes · bfc60177

Authored by Christoph Hellwig on Jan 07, 2011

If we get an IO error on a synchronous superblock write, we attach an
error release function to it so that when the last reference goes away
the release function is called and the buffer is invalidated and
unlocked. The buffer is left locked until the release function is
called so that other concurrent users of the buffer will be locked out
until the buffer error is fully processed.

Unfortunately, for the superblock buffer the filesystem itself holds a
reference to the buffer which prevents the reference count from
dropping to zero and the release function being called. As a result,
once an IO error occurs on a sync write, the buffer will never be
unlocked and all future attempts to lock the buffer will hang.

To make matters worse, this problem is not unique to such buffers;
if there is a concurrent _xfs_buf_find() running, the lookup will grab
a reference to the buffer and then wait on the buffer lock, preventing
the reference count from ever falling to zero and hence unlocking the
buffer.

As such, the whole b_relse function implementation is broken because it
cannot rely on the buffer reference count falling to zero to unlock the
errored buffer. The synchronous write error path is the only path that
uses this callback - it is used to ensure that the synchronous waiter
gets the buffer error before the error state is cleared from the buffer
by the release function.

Given that the only synchronous buffer writes now go through xfs_bwrite
and the error path in question can only occur for a write of a dirty,
logged buffer, we can move most of the b_relse processing to happen
inline in xfs_buf_iodone_callbacks, just like a normal I/O completion.
In addition to that we make sure the error is not cleared in
xfs_buf_iodone_callbacks, so that xfs_bwrite can reliably check it.
Given that xfs_bwrite keeps the buffer locked until it has waited for
it and checked the error this allows us to reliably propagate the error
to the caller, and make sure that the buffer is reliably unlocked.

Given that xfs_buf_iodone_callbacks was the only instance of the
b_relse callback we can remove it entirely.

Based on earlier patches by Dave Chinner and Ajeet Yadav.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Ajeet Yadav <ajeet.yadav.77@gmail.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>