Commits · 4b8879df8c21bed3efd1eb2da5d72501199aba29 · openanolis / cloud-kernel

18 Apr, 2008 40 commits

[XFS] Propagate xfs_qm_dqflush_all() errors. · 4b8879df

Committed by David Chinner on Apr 10, 2008

xfs_qm_dqflush_all() can return flush errors. Ensure they are propagated
into the quotacheck code to determine if the quotacheck succeeded or not.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30786a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

4b8879df

[XFS] xfs_qm_reset_dqcounts() does not return errors. · 5b139738

Committed by David Chinner on Apr 10, 2008

Declare it void.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30785a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

5b139738

[XFS] Report errors from xfs_reserve_blocks(). · 714082bc

Committed by David Chinner on Apr 10, 2008

xfs_reserve_blocks() can fail in interesting ways. In neither case is it a
fatal error, but the result can lead to sub-optimal behaviour. Warn to the
syslog if the call fails but otherwise continue.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30784a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

714082bc

[XFS] xfs_icsb_counter_disabled() never returns an error. · 36fbe6e6

Committed by David Chinner on Apr 10, 2008

Mark it void.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30782a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

36fbe6e6

[XFS] Remove useless whitespace in function prototypes · a414047f

Committed by David Chinner on Apr 10, 2008

Makes it simpler to annotate function prototypes with __must_check via sed
scripts.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30781a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

a414047f

[XFS] xfs_quiesce_fs() never returns an error. Mark it void. · 3c85c36c

Committed by David Chinner on Apr 10, 2008

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30780a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

3c85c36c

[XFS] Don't validate symlink target component length · b6ddc4e6

Committed by Christoph Hellwig on Apr 10, 2008

This target component validation is not POSIX conformant and it is not
done by any other Linux filesystem so remove it from XFS.

SGI-PV: 980080
SGI-Modid: xfs-linux-melb:xfs-kern:30776a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

b6ddc4e6

[XFS] replace remaining __FUNCTION__ occurrences · 34a622b2

Committed by Harvey Harrison on Apr 10, 2008

__FUNCTION__ is gcc-specific, use __func__

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30775a
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

34a622b2
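
As a small illustration of the replacement described in the entry above: __func__ is the identifier standardised in C99, so it works on any conforming compiler. The function name whoami() is hypothetical and not taken from the XFS sources.

```c
/* Return the name of the enclosing function using the standard C99
 * identifier __func__ rather than the gcc-specific __FUNCTION__.
 * __func__ has static storage duration, so returning it is safe. */
static const char *whoami(void)
{
    return __func__;
}
```

A caller would typically pass the result to a logging call, for example printf("%s: failed\n", whoami()).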

[XFS] Replace __inline with inline · 0225da1f

Committed by Harvey Harrison on Apr 10, 2008

Remove the remaining uses of __inline in the XFS code base.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30774a
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

0225da1f

[XFS] Fix lock inversion in forced shutdown. · 6b1d1a73

Committed by David Chinner on Apr 10, 2008

Recent changes to xlog_state_release_iclog() placed the grant_lock inside
the icloglock. Forced unmount of the log does this the opposite way
around, but does not depend on the order for correct working. Fix the
inversion by changing the order locks are gained in
xfs_log_force_umount().

SGI-PV: 979661
SGI-Modid: xfs-linux-melb:xfs-kern:30773a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

6b1d1a73
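
The lock-ordering rule behind the entry above can be sketched in user space. This is only an analogue under stated assumptions: pthread mutexes stand in for the kernel spinlocks, the names icloglock and grant_lock are borrowed from the commit message, and release_path() and shutdown_path() are hypothetical stand-ins for the two code paths. The point is that both paths take the locks in the same order, so neither can hold one lock while waiting for the other in reverse.

```c
#include <pthread.h>

static pthread_mutex_t icloglock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t grant_lock = PTHREAD_MUTEX_INITIALIZER;
static int released, shutdowns;

/* Path 1: takes icloglock first, then grant_lock nested inside it. */
static void release_path(void)
{
    pthread_mutex_lock(&icloglock);
    pthread_mutex_lock(&grant_lock);
    released++;
    pthread_mutex_unlock(&grant_lock);
    pthread_mutex_unlock(&icloglock);
}

/* Path 2: uses the same order as path 1. Taking grant_lock first here
 * would be the inversion that can deadlock against path 1. */
static void shutdown_path(void)
{
    pthread_mutex_lock(&icloglock);
    pthread_mutex_lock(&grant_lock);
    shutdowns++;
    pthread_mutex_unlock(&grant_lock);
    pthread_mutex_unlock(&icloglock);
}
```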

[XFS] Reorganise xlog_t for better cacheline isolation of contention · 4679b2d3

Committed by David Chinner on Apr 10, 2008

To reduce contention on the log on large CPU count machines, separate out different
parts of the xlog_t structure onto different cachelines. Move each lock
onto a different cacheline along with all the members that are
accessed/modified while that lock is held.

Also, move the debugging code into debug code.

SGI-PV: 978729
SGI-Modid: xfs-linux-melb:xfs-kern:30772a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

4679b2d3
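
The layout idea in the entry above, each lock on its own cacheline together with the fields it protects, can be sketched with C11 alignas. The struct and field names are hypothetical, and the 64-byte line size is an assumption that varies by hardware; this is not the actual xlog_t layout.

```c
#include <stdalign.h>
#include <stddef.h>

#define CACHELINE 64   /* assumed cacheline size; real hardware varies */

struct demo_log {
    /* Lock A and the data modified while it is held share one line. */
    alignas(CACHELINE) int lock_a;
    long head;

    /* Lock B and its data start on a fresh line, so CPUs contending
     * on lock A do not bounce the line that lock B lives on. */
    alignas(CACHELINE) int lock_b;
    long tail;
};
```

The trade-off is size: each aligned group is padded out to a full line, which is usually acceptable for a structure that exists once per log.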

[XFS] Remove the xlog_ticket allocator · eb01c9cd

Committed by David Chinner on Apr 10, 2008

The ticket allocator is just a simple slab implementation internal to the
log. It requires the icloglock to be held when manipulating it and this
contributes to contention on that lock.

Just kill the entire allocator and use a memory zone instead. While there,
allow us to gracefully fail allocation with ENOMEM.

SGI-PV: 978729
SGI-Modid: xfs-linux-melb:xfs-kern:30771a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

eb01c9cd

[XFS] Per iclog callback chain lock · 114d23aa

Committed by David Chinner on Apr 10, 2008

Rather than use the icloglock for protecting the iclog completion callback
chain, use a new per-iclog lock so that walking the callback chain doesn't
require holding a global lock.

This reduces contention on the icloglock during transaction commit and log
I/O completion by reducing the number of times we need to hold the global
icloglock during these operations.

SGI-PV: 978729
SGI-Modid: xfs-linux-melb:xfs-kern:30770a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

114d23aa

[XFS] Prevent xfs_bmap_check_leaf_extents() referencing unmapped memory. · 2abdb8c8

Committed by Lachlan McIlroy on Mar 27, 2008

While investigating the extent corruption bug I ran into this bug in debug
only code. xfs_bmap_check_leaf_extents() loops through the leaf blocks of
the extent btree checking that every extent is entirely before the next
extent. It also compares the last extent in the previous block to the
first extent in the current block when the previous block has been
released and potentially unmapped. So take a copy of the last extent
instead of a pointer. Also move the last extent check out of the loop
because we only need to do it once.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30718a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>

2abdb8c8

[XFS] remove most calls to VN_RELE · 43355099

Committed by Christoph Hellwig on Mar 27, 2008

Most VN_RELE calls either directly contain a XFS_ITOV or have the
corresponding xfs_inode already in scope. Use the IRELE helper instead of
VN_RELE to clarify the code. With a little more work we can kill VN_RELE
altogether and define IRELE in terms of iput directly.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30710a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

43355099

[XFS] split xfs_ioc_xattr · df26cfe8

Committed by Lachlan McIlroy on Apr 18, 2008

The three subcases of xfs_ioc_xattr don't share any semantics and almost
no code, so split it into three separate helpers.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30709a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

df26cfe8

[XFS] cleanup root inode handling in xfs_fs_fill_super · f3dcc13f

Committed by Christoph Hellwig on Mar 27, 2008

- rename rootvp to root for clarity
- remove useless vn_to_inode call
- check is_bad_inode before calling d_alloc_root
- use iput instead of VN_RELE in the error case

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30708a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

f3dcc13f

[XFS] Ensure a btree insert returns a valid cursor. · 59a33f9f

Committed by David Chinner on Mar 27, 2008

When writing into preallocated regions there is a case where XFS can oops
or hang doing the unwritten extent conversion on I/O completion. It turns
out that the problem is related to the btree cursor being invalid.

When we do an insert into the tree, we may need to split blocks in the
tree. When we only split at the leaf level (i.e. level 0), everything
works just fine. However, if we have a multi-level split in the btree,
the cursor passed to the insert function is no longer valid once the
insert is complete.

The leaf level split is handled correctly because all the operations at
level 0 are done using the original cursor, hence it is updated correctly.
However, when we need to update the next level up the tree, we don't use
that cursor - we use a cloned cursor that points to the index in the next
level up where we need to do the insert.

Hence if we need to split a second level, the changes to the tree are
reflected in the cloned cursor and not the original cursor. This
clone-and-move-up-a-level-on-split behaviour recurses all the way to the
top of the tree.

The complexity here is that these cloned cursors do not point to the
original index that was inserted - they point to the newly allocated block
(the right block) and the original cursor pointer to that level may still
point to the left block. Hence, without deep examination of the cloned
cursor and buffers, we cannot update the original cursor with the new path
from the cloned cursor.

In these cases the original cursor could be pointing to the wrong block(s)
and hence a subsequent modification to the tree using that cursor will
lead to corruption of the tree.

The crash case occurs when the tree changes height - we insert a new level
in the tree, and the cursor does not have a buffer in its path for that
level. Hence any attempt to walk back up the cursor to the root block will
result in a null pointer dereference.

To make matters even more complex, the BMAP BT is rooted in an inode, so
we can have a change of height in the btree *without a root split*. That
is, if the root block in the inode is full when we split a leaf node, we
cannot fit the pointer to the new block in the root, so we allocate a new
block, migrate all the ptrs out of the inode into the new block and point
the inode root block at the newly allocated block. This changes the height
of the tree without a root split having occurred and hence invalidates the
path in the original cursor.

The patch below prevents xfs_bmbt_insert() from returning with an invalid
cursor by detecting the cases that invalidate the original cursor and
refreshing it by doing a lookup into the btree for the original index we were
inserting at.

Note that the INOBT, AGFBNO and AGFCNT btree implementations also have
this bug, but the cursor is currently always destroyed or revalidated
after an insert for those trees. Hence this patch only addresses the problem
in the BMBT code.

SGI-PV: 979339
SGI-Modid: xfs-linux-melb:xfs-kern:30701a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

59a33f9f

[XFS] Account for inode cluster alignment in all allocations · 75de2a91

Committed by David Chinner on Mar 27, 2008

At ENOSPC, we can get a filesystem shutdown due to cancelling a dirty
transaction in xfs_mkdir or xfs_create. This is due to the initial
allocation attempt not taking into account inode alignment and hence we
can prepare the AGF freelist for allocation when it's not actually
possible to do an allocation. This results in inode allocation returning
ENOSPC with a dirty transaction, and hence we shut down the filesystem.

Because the first allocation is an exact allocation attempt, we must tell
the allocator that the alignment does not affect the allocation attempt.
i.e. we will accept any extent alignment as long as the extent starts at
the block we want. Unfortunately, this means that if the longest free
extent is less than the length + alignment necessary for fallback
allocation attempts but is long enough to attempt a non-aligned
allocation, we will modify the free list.

If we then have the exact allocation fail, all other allocation attempts
will also fail due to the alignment constraint being taken into account.
Hence the initial attempt needs to set the "alignment slop" field so that
alignment, while not required, must be taken into account when determining
if there is enough space left in the AG to do the allocation.

That means if the exact allocation fails, we will not dirty the freelist
if there is not enough space available for a subsequent allocation to
succeed. Hence we get an ENOSPC error back to userspace without shutting
down the filesystem.

SGI-PV: 978886
SGI-Modid: xfs-linux-melb:xfs-kern:30699a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

75de2a91

[XFS] Replace custom AIL linked-list code with struct list_head · 535f6b37

Committed by Josef 'Jeff' Sipek on Mar 27, 2008

Replace the xfs_ail_entry_t with a struct list_head and clean the
surrounding code up. Also fixes a livelock in xfs_trans_first_push_ail()
by terminating the loop at the head of the list correctly.

SGI-PV: 978682
SGI-Modid: xfs-linux-melb:xfs-kern:30636a
Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

535f6b37
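
The list pattern named in the entry above can be sketched in user space as follows. This is a minimal re-creation of a circular doubly linked list in the style of the kernel's struct list_head, not the kernel header itself; struct item, item_of() and first_at_or_after() are hypothetical names. The loop condition p != head is the piece that matters: on a circular list, a walk that fails to stop when it arrives back at the head never terminates.

```c
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* An empty list is a head that points at itself in both directions. */
static void list_init(struct list_head *h)
{
    h->next = h->prev = h;
}

/* Link n in just before the head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

struct item {
    long lsn;                  /* hypothetical sort key */
    struct list_head link;     /* embedded list linkage */
};

/* Recover the containing item from a pointer to its embedded link. */
#define item_of(p) ((struct item *)((char *)(p) - offsetof(struct item, link)))

/* Return the first item with lsn >= target, or NULL if none exists.
 * The walk ends when it arrives back at the head. */
static struct item *first_at_or_after(struct list_head *head, long target)
{
    for (struct list_head *p = head->next; p != head; p = p->next)
        if (item_of(p)->lsn >= target)
            return item_of(p);
    return NULL;
}
```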

[XFS] Remove superflous xfs_readsb call in xfs_mountfs. · a45c7968

Committed by Christoph Hellwig on Mar 06, 2008

When xfs_mountfs is called by xfs_mount xfs_readsb was called 35 lines
above unconditionally, so there is no need to try to read the superblock
if it's not present. If any other port doesn't have the superblock read at
this point it should just call it directly from its xfs_mount equivalent.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30603a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

a45c7968

[XFS] kill t_sema member of struct xfs_trans · dfa18b11

Committed by Niv Sardi on Mar 06, 2008

It's completely unused so we might as well kill it. Note that there is
another t_sema in struct xlog_ticket, which is used and actually an sv_t
despite the name. That one is left untouched by this patch.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30591a
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

dfa18b11

[XFS] cleanup vnode use in xfs_bmap.c · 5f90150a

Committed by Christoph Hellwig on Mar 06, 2008

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30553a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

5f90150a

[XFS] cleanup vnode use in xfs_iops.c · af048193

Committed by Christoph Hellwig on Mar 06, 2008

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30552a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

af048193

[XFS] cleanup vnode use in xfs_lrw.c · dcf49cc5

Committed by Christoph Hellwig on Mar 06, 2008

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30551a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

dcf49cc5

[XFS] cleanup vnode use in xfs_lookup · ef1f5e7a

Committed by Christoph Hellwig on Mar 06, 2008

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30550a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

ef1f5e7a

[XFS] cleanup vnode use in xfs_symlink and xfs_rename · 3937be5b

Committed by Christoph Hellwig on Mar 06, 2008

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30548a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

3937be5b

[XFS] cleanup vnode use in xfs_link · a3da7896

Committed by Christoph Hellwig on Mar 06, 2008

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30547a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

a3da7896

[XFS] cleanup vnode use in xfs_create/mknod/mkdir · 979ebab1

Committed by Christoph Hellwig on Mar 06, 2008

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30546a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

979ebab1

[XFS] cleanup vnode use in dmapi calls · bc4ac74a

Committed by Christoph Hellwig on Mar 06, 2008

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30545a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

bc4ac74a

[XFS] Use power-of-2 sized buffers to reduce overhead · d2341541

Committed by David Chinner on Mar 06, 2008

Now that the ktrace_enter() code is using atomics, the non-power-of-2
buffer sizes - which require modulus operations to get the index - are
showing up as using substantial CPU in the profiles.

Force the buffer sizes to be rounded up to the nearest power of two and
use masking rather than modulus operations to convert the index counter to
the buffer index. This reduces ktrace_enter overhead to 8% of CPU time,
and again almost halves the trace intensive test runtime.

SGI-PV: 977546
SGI-Modid: xfs-linux-melb:xfs-kern:30538a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

d2341541
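
The masking trick described in the entry above can be sketched as follows. Both helper names are hypothetical and not taken from the ktrace code. Once the size is a power of two, counter & (size - 1) gives the same result as counter % size without a division.

```c
#include <stddef.h>

/* Round n up to the nearest power of two that is >= n (for n >= 1). */
static size_t round_up_pow2(size_t n)
{
    size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Convert a monotonically increasing counter into a buffer index.
 * size must be a power of two, so (size - 1) is a mask of low bits. */
static size_t ring_index(size_t counter, size_t size)
{
    return counter & (size - 1);
}
```

For example, a requested size of 100 entries would be rounded up to 128, and counter value 130 then maps to slot 2.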

[XFS] Use atomic counters for ktrace buffer indexes · 6ee4752f

Committed by David Chinner on Mar 06, 2008

ktrace_enter() is consuming vast amounts of CPU time due to the use of a
single global lock for protecting buffer index increments. Change it to
use per-buffer atomic counters - this reduces ktrace_enter() overhead
during a trace intensive test on a 4p machine from 58% of all CPU time to
12% and halves test runtime.

SGI-PV: 977546
SGI-Modid: xfs-linux-melb:xfs-kern:30537a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

6ee4752f
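
A user-space sketch of the idea in the entry above, using C11 atomics in place of the kernel's atomic_t. The struct, the slot count and the function names are hypothetical. Each buffer carries its own counter, so writers to different buffers share no lock, and writers to the same buffer claim distinct slots with a single atomic increment.

```c
#include <stdatomic.h>

#define TRACE_SLOTS 6              /* hypothetical size, not a power of two */

struct trace_buf {
    atomic_uint next;              /* per-buffer counter, no shared lock */
    const void *slots[TRACE_SLOTS];
};

static void trace_init(struct trace_buf *tb)
{
    atomic_init(&tb->next, 0);
    for (int i = 0; i < TRACE_SLOTS; i++)
        tb->slots[i] = 0;
}

/* Claim the next slot with one atomic increment and store the entry.
 * The modulus maps the ever-growing counter onto the ring. */
static unsigned trace_enter(struct trace_buf *tb, const void *entry)
{
    unsigned idx = atomic_fetch_add(&tb->next, 1) % TRACE_SLOTS;
    tb->slots[idx] = entry;
    return idx;
}
```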

[XFS] Update c/mtime correctly on truncates · 44d814ce

Committed by David Chinner on Mar 06, 2008

XFS changes the c/mtime of an inode when truncating it to the same size.
The c/mtime is only supposed to change if the size is changed. Not to be
confused with ftruncate, where the c/mtime is supposed to be changed even
if the size is not changed.

The Linux VFS encodes this semantic difference in the flags it sends down
to ->setattr, which XFS currently ignores. We need to make XFS pay
attention to the VFS flags and hence Do The Right Thing.

SGI-PV: 977547
SGI-Modid: xfs-linux-melb:xfs-kern:30536a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

44d814ce

[XFS] don't encode parent in nfs filehandles unless nessecary · 24bd861d

Committed by Christoph Hellwig on Mar 06, 2008

As Dave pointed out after the export ops changes we now always encode the
parent into the filehandle for regular files, but it's not actually needed
when the filesystem is exported with no_subtree_check. This one-liner fixes
xfs_fs_encode_fh to skip encoding the parent unless necessary.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30535a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

24bd861d

[XFS] kill xfs_rwlock/xfs_rwunlock · 126468b1

Committed by Christoph Hellwig on Mar 06, 2008

We can just use xfs_ilock/xfs_iunlock instead and get rid of the ugly
bhv_vrwlock_t.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30533a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

126468b1

[XFS] kill xfs_get_dir_entry · 43973964

Committed by Christoph Hellwig on Mar 06, 2008

Instead of xfs_get_dir_entry use a macro to get the xfs_inode from the
dentry in the callers and grab the reference manually.

Only grab the reference once as it's fine to keep it over the dmapi calls.
(And even that reference is actually superfluous in Linux but I'll leave
that for another patch)

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30531a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

43973964

[XFS] vnode cleanup in xfs_fs_subr.c · a8b3acd5

Committed by Christoph Hellwig on Mar 06, 2008

Cleanup the unneeded intermediate vnode step in the flushing helpers and
go directly from the xfs_inode to the struct address_space.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30530a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

a8b3acd5

[XFS] cleanup xfs_vn_mknod · db0bb7ba

Committed by Christoph Hellwig on Mar 06, 2008

- use proper goto based unwinding instead of the current mess of
  multiple conditionals
- rename ip to inode because that's the normal convention for Linux
  inodes while ip is the convention for xfs_inodes
- remove unlikely checks for the default_acl - branches marked unlikely
  might lead to extreme branch predictor slowdowns if taken and for some
  workloads a default acl is quite common
- properly indent the switch statements
- remove xfs_has_fs_struct as nfsd has a fs_struct in any semi-recent
  kernel

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30529a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

db0bb7ba
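
The "goto based unwinding" mentioned in the first bullet of the entry above is a common kernel idiom. A generic sketch, unrelated to the real xfs_vn_mknod code: struct node and node_create() are hypothetical, and treating n == 0 as an error is only there to exercise the unwind path. Each failure jumps to a label that releases exactly what has been acquired so far, in reverse order.

```c
#include <stdlib.h>
#include <string.h>

struct node {
    char *name;
    int  *data;
};

/* Build a node in steps. On failure, fall through the labels below so
 * that earlier acquisitions are released in reverse order. */
static int node_create(const char *name, size_t n, struct node **out)
{
    int error = -1;
    struct node *nd = malloc(sizeof(*nd));

    if (!nd)
        goto out;

    nd->name = malloc(strlen(name) + 1);
    if (!nd->name)
        goto out_free_node;
    strcpy(nd->name, name);

    if (n == 0)                     /* illustrative invalid argument */
        goto out_free_name;
    nd->data = calloc(n, sizeof(*nd->data));
    if (!nd->data)
        goto out_free_name;

    *out = nd;
    return 0;

out_free_name:
    free(nd->name);
out_free_node:
    free(nd);
out:
    return error;
}
```

Compared with nested conditionals, the success path reads straight down the function and each cleanup step appears exactly once.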

[XFS] Use atomics for iclog reference counting · 155cc6b7

Committed by David Chinner on Mar 06, 2008

Now that we update the log tail LSN less frequently on transaction
completion, we pass the contention straight to the global log state lock
(l_iclog_lock) during transaction completion.

We currently have to take this lock to decrement the iclog reference
count. There is a reference count on each iclog, so we need to take the
global lock for all refcount changes.

When large numbers of processes are all doing small transactions, the iclog
reference counts will be quite high, and the state change that absolutely
requires the l_iclog_lock is the exception rather than the norm.

Change the reference counting on the iclogs to use atomic_inc/dec so that
we can use atomic_dec_and_lock during transaction completion and avoid the
need for grabbing the l_iclog_lock for every reference count decrement
except the one that matters - the last.

SGI-PV: 975671
SGI-Modid: xfs-linux-melb:xfs-kern:30505a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

155cc6b7
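
The semantics of atomic_dec_and_lock referred to in the entry above can be approximated in user space. This is a sketch under stated assumptions, not the kernel implementation: C11 atomics and a pthread mutex stand in for atomic_t and the spinlock, and dec_and_lock() is a hypothetical name. The function drops one reference and takes the lock only when the count might reach zero. It returns true with the lock held if this was the last reference, and false with the lock not held otherwise.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static bool dec_and_lock(atomic_int *ref, pthread_mutex_t *lock)
{
    int old = atomic_load(ref);

    /* Fast path: while we are clearly not the last holder, drop the
     * reference with a compare-and-swap and never touch the lock. */
    while (old > 1) {
        if (atomic_compare_exchange_weak(ref, &old, old - 1))
            return false;
        /* A failed exchange reloaded old; loop and re-check it. */
    }

    /* Slow path: we may be the last holder, so decide under the lock. */
    pthread_mutex_lock(lock);
    if (atomic_fetch_sub(ref, 1) == 1)
        return true;                /* count hit zero; lock stays held */
    pthread_mutex_unlock(lock);
    return false;
}
```

A caller that gets true performs the final state change and then unlocks; every other caller returns without ever contending on the lock.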

[XFS] Prevent AIL lock contention during transaction completion · b589334c

Committed by David Chinner on Mar 06, 2008

When hundreds of processors attempt to commit transactions at the same
time, they can contend on the AIL lock when updating the tail LSN held in
the in-core log structure.

At the moment, the tail LSN is only needed when actually writing out an
iclog, so it really does not need to be updated on every single
transaction completion - only those that result in switching iclogs and
flushing them to disk.

The result is that we reduce the number of times we need to grab the AIL
lock and the log grant lock by up to two orders of magnitude on large
processor count machines. The problem has previously been hidden by AIL
lock contention walking the AIL list which was recently solved and
uncovered this issue.

SGI-PV: 975671
SGI-Modid: xfs-linux-melb:xfs-kern:30504a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

b589334c

openanolis / cloud-kernel synced successfully more than 1 year ago