提交 · 552ef8024f909d9b3a7442d0ab0d48a22de24e9e · openeuler / raspberrypi-kernel

27 7月, 2010 1 次提交

direct-io: move aio_complete into ->end_io · 552ef802

由 Christoph Hellwig 提交于 7月 27, 2010

Filesystems with unwritten extent support must not complete an AIO request
until the transaction to convert the extent has been commited.  That means
the aio_complete calls needs to be moved into the ->end_io callback so
that the filesystem can control when to call it exactly.

This makes a bit of a mess out of dio_complete and the ->end_io callback
prototype even more complicated. 
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz> 
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

552ef802

09 6月, 2010 1 次提交

xfs: remove nr_to_write writeback windup. · 254c8c2d

由 Dave Chinner 提交于 6月 09, 2010

Now that the background flush code has been fixed, we shouldn't need to
silently multiply the wbc->nr_to_write to get good writeback. Remove
that code.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

254c8c2d

03 6月, 2010 1 次提交

xfs: skip writeback from reclaim context · 070ecdca

由 Christoph Hellwig 提交于 6月 03, 2010

Allowing writeback from reclaim context causes massive problems with stack
overflows as we can call into the writeback code which tends to be a heavy
stack user both in the generic code and XFS from random contexts that
perform memory allocations.

Follow the example of btrfs (and in slightly different form ext4) and refuse
to write out data from reclaim context.  This issue should really be handled
by the VM so that we can tune better for this case, but until we get it
sorted out there we have to hack around this in each filesystem with a
complex writeback path.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDave Chinner <dchinner@redhat.com>

070ecdca

29 5月, 2010 5 次提交

xfs: fix access to upper inodes without inode64 · fb3b504a

由 Christoph Hellwig 提交于 5月 28, 2010

If a filesystem is mounted without the inode64 mount option we
should still be able to access inodes not fitting into 32 bits, just
not created new ones.  For this to work we need to make sure the
inode cache radix tree is initialized for all allocation groups, not
just those we plan to allocate inodes from.  This patch makes sure
we initialize the inode cache radix tree for all allocation groups,
and also cleans xfs_initialize_perag up a bit to separate the
inode32 logical from the general perag structure setup.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

fb3b504a

xfs: remove duplicated #include · 3bd0946e

由 Huang Weiyi 提交于 5月 25, 2010

Remove duplicated #include('s) in
  fs/xfs/linux-2.6/xfs_quotaops.c
Signed-off-by: NHuang Weiyi <weiyi.huang@gmail.com>
Reviewed-by: NDave Chinner <david@fromorbit.com>
Signed-off-by: NAlex Elder <aelder@sgi.com>

3bd0946e

xfs: convert more trace events to DEFINE_EVENT · f8adb4d5

由 Li Zefan 提交于 5月 24, 2010

Use DECLARE_EVENT_CLASS, and save ~15K:

   text    data     bss     dec     hex filename
 171949   43028      48  215025   347f1 fs/xfs/linux-2.6/xfs_trace.o.orig
 156521   43028      36  199585   30ba1 fs/xfs/linux-2.6/xfs_trace.o

No change in functionality.
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NAlex Elder <aelder@sgi.com>

f8adb4d5

xfs: xfs_trace.c: remove duplicated #include · 292ec4cf

由 Huang Weiyi 提交于 5月 22, 2010

Remove duplicated #include('s) in
  fs/xfs/linux-2.6/xfs_trace.c
Signed-off-by: NHuang Weiyi <weiyi.huang@gmail.com>
Reviewed-by: NDave Chinner <david@fromorbit.com>
Signed-off-by: NAlex Elder <aelder@sgi.com>

292ec4cf

xfs: Check new inode size is OK before preallocating · 07f1a4f5

由 Dave Chinner 提交于 5月 21, 2010

The new xfsqa test 228 tries to preallocate more space than the
filesystem contains. it should fail, but instead triggers an assert
about lock flags.  The failure is due to the size extension failing
in vmtruncate() due to rlimit being set. Check this before we start
the preallocation to avoid allocating space that will never be used.

Also the path through xfs_vn_allocate already holds the IO lock, so
it should not be present in the lock flags when the setattr fails.
Hence the assert needs to take this into account. This will prevent
other such callers from hitting this incorrect ASSERT.

(Fixed a reference to "newsize" to read "new_size". -Alex)
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

07f1a4f5

28 5月, 2010 1 次提交

drop unused dentry argument to ->fsync · 7ea80859

由 Christoph Hellwig 提交于 5月 26, 2010

Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

7ea80859

24 5月, 2010 3 次提交

xfs: Introduce delayed logging core code · 71e330b5

由 Dave Chinner 提交于 5月 21, 2010

The delayed logging code only changes in-memory structures and as
such can be enabled and disabled with a mount option. Add the mount
option and emit a warning that this is an experimental feature that
should not be used in production yet.

We also need infrastructure to track committed items that have not
yet been written to the log. This is what the Committed Item List
(CIL) is for.

The log item also needs to be extended to track the current log
vector, the associated memory buffer and it's location in the Commit
Item List. Extend the log item and log vector structures to enable
this tracking.

To maintain the current log format for transactions with delayed
logging, we need to introduce a checkpoint transaction and a context
for tracking each checkpoint from initiation to transaction
completion.  This includes adding a log ticket for tracking space
log required/used by the context checkpoint.

To track all the changes we need an io vector array per log item,
rather than a single array for the entire transaction. Using the new
log vector structure for this requires two passes - the first to
allocate the log vector structures and chain them together, and the
second to fill them out.  This log vector chain can then be passed
to the CIL for formatting, pinning and insertion into the CIL.

Formatting of the log vector chain is relatively simple - it's just
a loop over the iovecs on each log vector, but it is made slightly
more complex because we re-write the iovec after the copy to point
back at the memory buffer we just copied into.

This code also needs to pin log items. If the log item is not
already tracked in this checkpoint context, then it needs to be
pinned. Otherwise it is already pinned and we don't need to pin it
again.

The only other complexity is calculating the amount of new log space
the formatting has consumed. This needs to be accounted to the
transaction in progress, and the accounting is made more complex
becase we need also to steal space from it for log metadata in the
checkpoint transaction. Calculate all this at insert time and update
all the tickets, counters, etc correctly.

Once we've formatted all the log items in the transaction, attach
the busy extents to the checkpoint context so the busy extents live
until checkpoint completion and can be processed at that point in
time. Transactions can then be freed at this point in time.

Now we need to issue checkpoints - we are tracking the amount of log space
used by the items in the CIL, so we can trigger background checkpoints when the
space usage gets to a certain threshold. Otherwise, checkpoints need ot be
triggered when a log synchronisation point is reached - a log force event.

Because the log write code already handles chained log vectors, writing the
transaction is trivial, too. Construct a transaction header, add it
to the head of the chain and write it into the log, then issue a
commit record write. Then we can release the checkpoint log ticket
and attach the context to the log buffer so it can be called during
Io completion to complete the checkpoint.

We also need to allow for synchronising multiple in-flight
checkpoints. This is needed for two things - the first is to ensure
that checkpoint commit records appear in the log in the correct
sequence order (so they are replayed in the correct order). The
second is so that xfs_log_force_lsn() operates correctly and only
flushes and/or waits for the specific sequence it was provided with.

To do this we need a wait variable and a list tracking the
checkpoint commits in progress. We can walk this list and wait for
the checkpoints to change state or complete easily, an this provides
the necessary synchronisation for correct operation in both cases.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

71e330b5

xfs: Improve scalability of busy extent tracking · ed3b4d6c

由 Dave Chinner 提交于 5月 21, 2010

When we free a metadata extent, we record it in the per-AG busy
extent array so that it is not re-used before the freeing
transaction hits the disk. This array is fixed size, so when it
overflows we make further allocation transactions synchronous
because we cannot track more freed extents until those transactions
hit the disk and are completed. Under heavy mixed allocation and
freeing workloads with large log buffers, we can overflow this array
quite easily.

Further, the array is sparsely populated, which means that inserts
need to search for a free slot, and array searches often have to
search many more slots that are actually used to check all the
busy extents. Quite inefficient, really.

To enable this aspect of extent freeing to scale better, we need
a structure that can grow dynamically. While in other areas of
XFS we have used radix trees, the extents being freed are at random
locations on disk so are better suited to being indexed by an rbtree.

So, use a per-AG rbtree indexed by block number to track busy
extents.  This incures a memory allocation when marking an extent
busy, but should not occur too often in low memory situations. This
should scale to an arbitrary number of extents so should not be a
limitation for features such as in-memory aggregation of
transactions.

However, there are still situations where we can't avoid allocating
busy extents (such as allocation from the AGFL). To minimise the
overhead of such occurences, we need to avoid doing a synchronous
log force while holding the AGF locked to ensure that the previous
transactions are safely on disk before we use the extent. We can do
this by marking the transaction doing the allocation as synchronous
rather issuing a log force.

Because of the locking involved and the ordering of transactions,
the synchronous transaction provides the same guarantees as a
synchronous log force because it ensures that all the prior
transactions are already on disk when the synchronous transaction
hits the disk. i.e. it preserves the free->allocate order of the
extent correctly in recovery.

By doing this, we avoid holding the AGF locked while log writes are
in progress, hence reducing the length of time the lock is held and
therefore we increase the rate at which we can allocate and free
from the allocation group, thereby increasing overall throughput.

The only problem with this approach is that when a metadata buffer is
marked stale (e.g. a directory block is removed), then buffer remains
pinned and locked until the log goes to disk. The issue here is that
if that stale buffer is reallocated in a subsequent transaction, the
attempt to lock that buffer in the transaction will hang waiting
the log to go to disk to unlock and unpin the buffer. Hence if
someone tries to lock a pinned, stale, locked buffer we need to
push on the log to get it unlocked ASAP. Effectively we are trading
off a guaranteed log force for a much less common trigger for log
force to occur.

Ideally we should not reallocate busy extents. That is a much more
complex fix to the problem as it involves direct intervention in the
allocation btree searches in many places. This is left to a future
set of modifications.

Finally, now that we track busy extents in allocated memory, we
don't need the descriptors in the transaction structure to point to
them. We can replace the complex busy chunk infrastructure with a
simple linked list of busy extents. This allows us to remove a large
chunk of code, making the overall change a net reduction in code
size.
Signed-off-by: NDave Chinner <david@fromorbit.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

ed3b4d6c

xfs: Clean up XFS_BLI_* flag namespace · c1155410

由 Dave Chinner 提交于 5月 07, 2010

Clean up the buffer log format (XFS_BLI_*) flags because they have a
polluted namespace. They XFS_BLI_ prefix is used for both in-memory
and on-disk flag feilds, but have overlapping values for different
flags. Rename the buffer log format flags to use the XFS_BLF_*
prefix to avoid confusing them with the in-memory XFS_BLI_* prefixed
flags.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

c1155410

22 5月, 2010 3 次提交

xfs: constify xattr_handler · 46e58764

由 Stephen Hemminger 提交于 5月 13, 2010

Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

46e58764

quota: unify ->set_dqblk · c472b432

由 Christoph Hellwig 提交于 5月 06, 2010

Pass the larger struct fs_disk_quota to the ->set_dqblk operation so
that the Q_SETQUOTA and Q_XSETQUOTA operations can be implemented
with a single filesystem operation and we can retire the ->set_xquota
operation.  The additional information (RT-subvolume accounting and
warn counts) are left zero for the VFS quota implementation.

Add new fieldmask values for setting the numer of blocks and inodes
values which is required for the VFS quota, but wasn't for XFS.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJan Kara <jack@suse.cz>

c472b432

quota: unify ->get_dqblk · b9b2dd36

由 Christoph Hellwig 提交于 5月 06, 2010

Pass the larger struct fs_disk_quota to the ->get_dqblk operation so
that the Q_GETQUOTA and Q_XGETQUOTA operations can be implemented
with a single filesystem operation and we can retire the ->get_xquota
operation.  The additional information (RT-subvolume accounting and
warn counts) are left zero for the VFS quota implementation.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJan Kara <jack@suse.cz>

b9b2dd36

19 5月, 2010 20 次提交

C
xfs: clean up end index calculation in xfs_page_state_convert · bd1556a1
由 Christoph Hellwig 提交于 4月 28, 2010
```
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>
```
bd1556a1
C
xfs: clean up mapping size calculation in __xfs_get_blocks · 2b8f12b7
由 Christoph Hellwig 提交于 4月 28, 2010
```
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>
```
2b8f12b7

xfs: clean up xfs_iomap_valid · 558e6891

由 Christoph Hellwig 提交于 4月 28, 2010

Rename all iomap_valid identifiers to imap_valid to fit the new
world order, and clean up xfs_iomap_valid to convert the passed in
offset to blocks instead of the imap values to bytes.  Use the
simpler inode->i_blkbits instead of the XFS macros for this.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

558e6891

xfs: move I/O type flags into xfs_aops.c · 34a52c6c

由 Christoph Hellwig 提交于 4月 28, 2010

The IOMAP_ flags are now only used inside xfs_aops.c for extent
probing and I/O completion tracking, so more them here, and rename
them to IO_* as there's no mapping involved at all.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

34a52c6c

xfs: kill struct xfs_iomap · 207d0416

由 Christoph Hellwig 提交于 4月 28, 2010

Now that struct xfs_iomap contains exactly the same units as struct
xfs_bmbt_irec we can just use the latter directly in the aops code.
Replace the missing IOMAP_NEW flag with a new boolean output
parameter to xfs_iomap.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

207d0416

xfs: report iomap_bn in block base · e513182d

由 Christoph Hellwig 提交于 4月 28, 2010

Report the iomap_bn field of struct xfs_iomap in terms of filesystem
blocks instead of in terms of bytes.  Shift the byte conversions
into the caller, and replace the IOMAP_DELAY and IOMAP_HOLE flag
checks with checks for HOLESTARTBLOCK and DELAYSTARTBLOCK.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

e513182d

xfs: report iomap_offset and iomap_bsize in block base · 8699bb0a

由 Christoph Hellwig 提交于 4月 28, 2010

Report the iomap_offset and iomap_bsize fields of struct xfs_iomap
in terms of fsblocks instead of in terms of disk blocks.  Shift the
byte conversions into the callers temporarily, but they will
disappear or get cleaned up later.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

8699bb0a

xfs: remove iomap_delta · 9563b3d8

由 Christoph Hellwig 提交于 4月 28, 2010

The iomap_delta field in struct xfs_iomap just contains the
difference between the offset passed to xfs_iomap and the
iomap_offset.  Just calculate it in the only caller that cares.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

9563b3d8

xfs: remove iomap_target · 046f1685

由 Christoph Hellwig 提交于 4月 28, 2010

Instead of using the iomap_target field in struct xfs_iomap
and the IOMAP_REALTIME flag just use the already existing
xfs_find_bdev_for_inode helper.  There's some fallout as we
need to pass the inode in a few more places, which we also
use to sanitize some calling conventions.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

046f1685

xfs: Make fiemap work in query mode. · 2d1ff3c7

由 Tao Ma 提交于 4月 29, 2010

According to Documentation/filesystems/fiemap.txt, If fm_extent_count
is zero, then the fm_extents[] array is ignored (no extents will be
returned), and the fm_mapped_extents count will hold the number of
extents needed.

But as the commit 97db39a1 has changed
bmv_count to the caller's input buffer, this number query function can't
work any more. As this commit is written to change bmv_count from
MAXEXTNUM because of ENOMEM.

This patch just try to  set bm.bmv_count to something sane.
Thanks to Dave Chinner <david@fromorbit.com> for the suggestion.

Cc: Eric Sandeen <sandeen@redhat.com>
Cc: Alex Elder <aelder@sgi.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NTao Ma <tao.ma@oracle.com>

2d1ff3c7

xfs: wait for direct I/O to complete in fsync and write_inode · 37bc5743

由 Christoph Hellwig 提交于 4月 20, 2010

We need to wait for all pending direct I/O requests before taking care of
metadata in fsync and write_inode.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDave Chinner <david@fromorbit.com>

37bc5743

xfs: xfs_trace.c: duplicated include · fce1cad6

由 Andrea Gelmini 提交于 3月 25, 2010

fs/xfs/linux-2.6/xfs_trace.c: xfs_attr_sf.h is included more than once.
Signed-off-by: NAndrea Gelmini <andrea.gelmini@gelma.net>
Signed-off-by: NAlex Elder <aelder@sgi.com>

fce1cad6

xfs: enforce synchronous writes in xfs_bwrite · 8c38366f

由 Christoph Hellwig 提交于 3月 12, 2010

xfs_bwrite is used with the intention of synchronously writing out
buffers, but currently it does not actually clear the async flag if
that's left from previous writes but instead implements async
behaviour if it finds it.  Remove the code handling asynchronous
writes as we've got rid of those entirely outside of the log and
delwri buffers, and make sure that we clear the async and read flags
before writing the buffer.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

8c38366f

xfs: remove periodic superblock writeback · df308bcf

由 Christoph Hellwig 提交于 3月 12, 2010

All modifications to the superblock are done transactional through
xfs_trans_log_buf, so there is no reason to initiate periodic
asynchronous writeback.  This only removes the superblock from the
delwri list and will lead to sub-optimal I/O scheduling.

Cut down xfs_sync_fsdata now that it's only used for synchronous
superblock writes and move the log coverage checks into the two
callers.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

df308bcf

xfs: convert the dquot hash list to use list heads · e6a81f13

由 Dave Chinner 提交于 4月 13, 2010

Convert the dquot hash list on the filesystem to use listhead
infrastructure rather than the roll-your-own in the quota code.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

e6a81f13

xfs: remove duplicate code from dquot reclaim · 368e1361

由 Dave Chinner 提交于 4月 13, 2010

The dquot shaker and the free-list reclaim code use exactly the same
algorithm but the code is duplicated and slightly different in each
case. Make the shaker code use the single dquot reclaim code to
remove the code duplication.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

368e1361

xfs: add log item recovery tracing · 9abbc539

由 Dave Chinner 提交于 4月 13, 2010

Currently there is no tracing in log recovery, so it is difficult to
determine what is going on when something goes wrong.

Add tracing for log item recovery to provide visibility into the log
recovery process. The tracing added shows regions being extracted
from the log transactions and added to the transaction hash forming
recovery items, followed by the reordering, cancelling and finally
recovery of the items.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

9abbc539

xfs: Add inode pin counts to traces · 4aaf15d1

由 Dave Chinner 提交于 3月 08, 2010

We don't record pin counts in inode events right now, and this makes
it difficult to track down problems related to pinning inodes. Add
the pin count to the inode trace class and add trace events for
pinning and unpinning inodes.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

4aaf15d1

xfs: add blockdev name to kthreads · e2a07812

由 Jan Engelhardt 提交于 3月 23, 2010

This allows to see in `ps` and similar tools which kthreads are
allotted to which block device/filesystem, similar to what jbd2
does. As the process name is a fixed 16-char array, no extra
space is needed in tasks.

  PID TTY      STAT   TIME COMMAND
    2 ?        S      0:00 [kthreadd]
  197 ?        S      0:00  \_ [jbd2/sda2-8]
  198 ?        S      0:00  \_ [ext4-dio-unwrit]
  204 ?        S      0:00  \_ [flush-8:0]
 2647 ?        S      0:00  \_ [xfs_mru_cache]
 2648 ?        S      0:00  \_ [xfslogd/0]
 2649 ?        S      0:00  \_ [xfsdatad/0]
 2650 ?        S      0:00  \_ [xfsconvertd/0]
 2651 ?        S      0:00  \_ [xfsbufd/ram0]
 2652 ?        S      0:00  \_ [xfsaild/ram0]
 2653 ?        S      0:00  \_ [xfssyncd/ram0]
Signed-off-by: NJan Engelhardt <jengelh@medozas.de>
Reviewed-by: NDave Chinner <david@fromorbit.com>

e2a07812

xfs: Fix integer overflow in fs/xfs/linux-2.6/xfs_ioctl*.c · fda168c2

由 Zhitong Wang 提交于 3月 23, 2010

The am_hreq.opcount field in the xfs_attrmulti_by_handle() interface
is not bounded correctly. The opcount is used to determine the size
of the buffer required. The size is bounded, but can overflow and so
the size checks may not be sufficient to catch invalid opcounts.
Fix it by catching opcount values that would cause overflows before
calculating the size.
Signed-off-by: NZhitong Wang <zhitong.wangzt@alibaba-inc.com>
Reviewed-by: NDave Chinner <david@fromorbit.com>

fda168c2

30 4月, 2010 1 次提交

xfs: add a shrinker to background inode reclaim · 9bf729c0

由 Dave Chinner 提交于 4月 29, 2010

On low memory boxes or those with highmem, kernel can OOM before the
background reclaims inodes via xfssyncd. Add a shrinker to run inode
reclaim so that it inode reclaim is expedited when memory is low.

This is more complex than it needs to be because the VM folk don't
want a context added to the shrinker infrastructure. Hence we need
to add a global list of XFS mount structures so the shrinker can
traverse them.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

9bf729c0

29 4月, 2010 1 次提交

blkdev: generalize flags for blkdev_issue_fn functions · fbd9b09a

由 Dmitry Monakhov 提交于 4月 28, 2010

The patch just convert all blkdev_issue_xxx function to common
set of flags. Wait/allocation semantics preserved.
Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

fbd9b09a

17 4月, 2010 1 次提交

xfs: don't warn on EAGAIN in inode reclaim · f1d486a3

由 Dave Chinner 提交于 4月 13, 2010

Any inode reclaim flush that returns EAGAIN will result in the inode
reclaim being attempted again later. There is no need to issue a
warning into the logs about this situation.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NAlex Elder <aelder@sgi.com>
Signed-off-by: NAlex Elder <aelder@sgi.com>

f1d486a3

30 3月, 2010 1 次提交

include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6

由 Tejun Heo 提交于 3月 24, 2010

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: NTejun Heo <tj@kernel.org>
Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

5a0e3ad6

17 3月, 2010 1 次提交

xfs: don't warn about page discards on shutdown · e8c3753c

由 Dave Chinner 提交于 3月 15, 2010

If we are doing a forced shutdown, we can get lots of noise about
delalloc pages being discarded. This is happens by design during a
forced shutdown, so don't spam the logs with these messages.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

e8c3753c