Commits · 1c1c6ebcf5284aee4910f3b906ac90c20e510c82 · openanolis / cloud-kernel

16 Jan, 2010 2 commits

xfs: Replace per-ag array with a radix tree · 1c1c6ebc

Authored by Dave Chinner on Jan 11, 2010

The use of an array for the per-ag structures requires reallocation
of the array when growing the filesystem. This requires locking
access to the array to avoid use after free situations, and the
locking is difficult to get right. To avoid needing to reallocate an
array, change the per-ag structures to an allocated object per ag
and index them using a tree structure.

The AGs are always densely indexed (hence the use of an array), but
the number supported is 2^32 and lookups tend to be random and hence
indexing needs to scale. A simple choice is a radix tree - it works
well with this sort of index. This change also removes another
large contiguous allocation from the mount/growfs path in XFS.

The growing process now needs to change to only initialise the new
AGs required for the extra space, and as such only needs to
exclusively lock the tree for inserts. The rest of the code only
needs to lock the tree while doing lookups, and hence this will
remove all the deadlocks that currently occur on the m_perag_lock as
it is now an innermost lock. The lock is also changed to a spinlock
from a read/write lock as the hold time is now extremely short.

To complete the picture, the per-ag structures will need to be
reference counted to ensure that we don't free/modify them while
they are still in use. This will be done in a subsequent patch.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
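
The indexing idea in this commit message can be illustrated with a small user-space sketch. It is not the kernel radix tree API and not the code from the patch: the names `perag_tree`, `perag_insert` and `perag_lookup` are hypothetical, the fan-out of 16 is an arbitrary choice, and locking is left out (the comments mark where the short-hold lock described above would be taken).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PERAG_BITS   4                      /* index bits consumed per level */
#define PERAG_FANOUT (1u << PERAG_BITS)
#define PERAG_LEVELS (32 / PERAG_BITS)      /* enough levels for a 32-bit AG number */

struct perag { uint32_t agno; };            /* one allocated object per AG */

struct perag_node { void *slot[PERAG_FANOUT]; }; /* children, or perag objects at the last level */

struct perag_tree { struct perag_node root; };

/* Slot index used by AG number agno at the given tree level. */
static unsigned perag_slot(uint32_t agno, int level)
{
    int shift = (PERAG_LEVELS - 1 - level) * PERAG_BITS;
    return (agno >> shift) & (PERAG_FANOUT - 1);
}

void perag_tree_init(struct perag_tree *t)
{
    assert(t != NULL);
    for (unsigned i = 0; i < PERAG_FANOUT; i++)
        t->root.slot[i] = NULL;
}

/* Insert a new per-AG object. A real implementation would hold the tree
 * lock exclusively here. Returns 0 on success, -1 if the AG already
 * exists or memory runs out. */
int perag_insert(struct perag_tree *t, uint32_t agno)
{
    struct perag_node *n = &t->root;
    for (int level = 0; level < PERAG_LEVELS - 1; level++) {
        unsigned i = perag_slot(agno, level);
        if (n->slot[i] == NULL) {
            n->slot[i] = calloc(1, sizeof(struct perag_node));
            if (n->slot[i] == NULL)
                return -1;
        }
        n = n->slot[i];
    }
    unsigned last = perag_slot(agno, PERAG_LEVELS - 1);
    if (n->slot[last] != NULL)
        return -1;
    struct perag *pag = malloc(sizeof(*pag));
    if (pag == NULL)
        return -1;
    pag->agno = agno;
    n->slot[last] = pag;
    return 0;
}

/* Look up an AG. A real implementation would hold the lock only for the
 * duration of this walk. Returns NULL if the AG is not present. */
struct perag *perag_lookup(struct perag_tree *t, uint32_t agno)
{
    struct perag_node *n = &t->root;
    for (int level = 0; n != NULL && level < PERAG_LEVELS - 1; level++)
        n = n->slot[perag_slot(agno, level)];
    return n ? n->slot[perag_slot(agno, PERAG_LEVELS - 1)] : NULL;
}
```

Growing the filesystem in this model only calls `perag_insert` for the new AG numbers; existing objects never move, so nothing needs to be reallocated. The pointer returned by `perag_lookup` is unprotected once the walk ends, which is the gap the reference counting mentioned above is meant to close.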

xfs: Don't directly reference m_perag in allocation code · a862e0fd

Authored by Dave Chinner on Jan 11, 2010

Start abstracting the perag references so that the indexing of the
structures is not directly coded into all the places that use the
perag structures. This will allow us to separate the use of the
perag structure and the way it is indexed and hence avoid the known
deadlocks related to growing a busy filesystem.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

11 Jan, 2010 1 commit

xfs: Ensure we force all busy extents in range to disk · fd45e478

Authored by Dave Chinner on Jan 02, 2010

When we search for and find a busy extent during allocation we
force the log out to ensure the extent free transaction is on
disk before the allocation transaction. The current implementation
has a subtle bug in it--it does not handle multiple overlapping
ranges.

That is, if we free lots of little extents into a single
contiguous extent, then allocate the contiguous extent, the busy
search code stops searching at the first extent it finds that
overlaps the allocated range. It then uses the commit LSN of the
transaction to force the log out to.

Unfortunately, the other busy ranges might have more recent
commit LSNs than the first busy extent that is found, and this
results in xfs_alloc_search_busy() returning before all the
extent free transactions are on disk for the range being
allocated. This can lead to potential metadata corruption or
stale data exposure after a crash because log replay won't replay
all the extent free transactions that cover the allocation range.
Modified-by: Alex Elder <aelder@sgi.com>

(Dropped the "found" argument from the xfs_alloc_busysearch trace
event.)
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
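
The fix described above amounts to scanning every busy extent that overlaps the allocated range and forcing the log to the most recent commit LSN among them, instead of stopping at the first hit. A minimal user-space sketch follows. The structure and function names are hypothetical, LSNs are treated as plain increasing integers, and 0 is assumed to mean "no busy extent found"; the kernel's real LSN comparison is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct busy_extent {
    uint32_t start;        /* first block of the freed extent */
    uint32_t len;          /* length in blocks */
    uint64_t commit_lsn;   /* LSN of the transaction that freed it */
};

/* Return the highest commit LSN among all busy extents overlapping
 * [start, start + len), or 0 if none overlap. */
uint64_t busy_search_max_lsn(const struct busy_extent *busy, size_t n,
                             uint32_t start, uint32_t len)
{
    assert(busy != NULL || n == 0);
    uint64_t end = (uint64_t)start + len;
    uint64_t max_lsn = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t bstart = busy[i].start;
        uint64_t bend = bstart + busy[i].len;

        /* Keep scanning after a match: a later entry may be newer. */
        if (bstart < end && start < bend && busy[i].commit_lsn > max_lsn)
            max_lsn = busy[i].commit_lsn;
    }
    return max_lsn;
}
```

With three small extents freed at LSNs 10, 30 and 20 and then allocated as one contiguous range, stopping at the first match would force the log only to LSN 10, while this scan yields 30.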

15 Dec, 2009 1 commit

xfs: event tracing support · 0b1b213f

Authored by Christoph Hellwig on Dec 14, 2009

Convert the old xfs tracing support that could only be used with the
out of tree kdb and xfsidbg patches to use the generic event tracer.

To use it make sure CONFIG_EVENT_TRACING is enabled and then enable
all xfs trace channels by:

   echo 1 > /sys/kernel/debug/tracing/events/xfs/enable

or alternatively enable single events by just doing the same in one
event subdirectory, e.g.

   echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ihold/enable

or set more complex filters, etc. In Documentation/trace/events.txt
all this is described in more detail.  To read the events do a

   cat /sys/kernel/debug/tracing/trace

Compared to the last posting this patch converts the tracing mostly to
the one tracepoint per callsite model that other users of the new
tracing facility also employ.  This allows a very fine-grained control
of the tracing, a cleaner output of the traces and also enables the
perf tool to use each tracepoint as a virtual performance counter,
allowing us to e.g. count how often certain workloads hit various
spots in XFS.  Take a look at

    http://lwn.net/Articles/346470/

for some examples.

Also the btree tracing isn't included at all yet, as it will require
additional core tracing features not in mainline yet, I plan to
deliver it later.

And the really nice thing about this patch is that it actually removes
many lines of code while adding this nice functionality:

 fs/xfs/Makefile                |    8
 fs/xfs/linux-2.6/xfs_acl.c     |    1
 fs/xfs/linux-2.6/xfs_aops.c    |   52 -
 fs/xfs/linux-2.6/xfs_aops.h    |    2
 fs/xfs/linux-2.6/xfs_buf.c     |  117 +--
 fs/xfs/linux-2.6/xfs_buf.h     |   33
 fs/xfs/linux-2.6/xfs_fs_subr.c |    3
 fs/xfs/linux-2.6/xfs_ioctl.c   |    1
 fs/xfs/linux-2.6/xfs_ioctl32.c |    1
 fs/xfs/linux-2.6/xfs_iops.c    |    1
 fs/xfs/linux-2.6/xfs_linux.h   |    1
 fs/xfs/linux-2.6/xfs_lrw.c     |   87 --
 fs/xfs/linux-2.6/xfs_lrw.h     |   45 -
 fs/xfs/linux-2.6/xfs_super.c   |  104 ---
 fs/xfs/linux-2.6/xfs_super.h   |    7
 fs/xfs/linux-2.6/xfs_sync.c    |    1
 fs/xfs/linux-2.6/xfs_trace.c   |   75 ++
 fs/xfs/linux-2.6/xfs_trace.h   | 1369 +++++++++++++++++++++++++++++++++++++++++
 fs/xfs/linux-2.6/xfs_vnode.h   |    4
 fs/xfs/quota/xfs_dquot.c       |  110 ---
 fs/xfs/quota/xfs_dquot.h       |   21
 fs/xfs/quota/xfs_qm.c          |   40 -
 fs/xfs/quota/xfs_qm_syscalls.c |    4
 fs/xfs/support/ktrace.c        |  323 ---------
 fs/xfs/support/ktrace.h        |   85 --
 fs/xfs/xfs.h                   |   16
 fs/xfs/xfs_ag.h                |   14
 fs/xfs/xfs_alloc.c             |  230 +-----
 fs/xfs/xfs_alloc.h             |   27
 fs/xfs/xfs_alloc_btree.c       |    1
 fs/xfs/xfs_attr.c              |  107 ---
 fs/xfs/xfs_attr.h              |   10
 fs/xfs/xfs_attr_leaf.c         |   14
 fs/xfs/xfs_attr_sf.h           |   40 -
 fs/xfs/xfs_bmap.c              |  507 +++------------
 fs/xfs/xfs_bmap.h              |   49 -
 fs/xfs/xfs_bmap_btree.c        |    6
 fs/xfs/xfs_btree.c             |    5
 fs/xfs/xfs_btree_trace.h       |   17
 fs/xfs/xfs_buf_item.c          |   87 --
 fs/xfs/xfs_buf_item.h          |   20
 fs/xfs/xfs_da_btree.c          |    3
 fs/xfs/xfs_da_btree.h          |    7
 fs/xfs/xfs_dfrag.c             |    2
 fs/xfs/xfs_dir2.c              |    8
 fs/xfs/xfs_dir2_block.c        |   20
 fs/xfs/xfs_dir2_leaf.c         |   21
 fs/xfs/xfs_dir2_node.c         |   27
 fs/xfs/xfs_dir2_sf.c           |   26
 fs/xfs/xfs_dir2_trace.c        |  216 ------
 fs/xfs/xfs_dir2_trace.h        |   72 --
 fs/xfs/xfs_filestream.c        |    8
 fs/xfs/xfs_fsops.c             |    2
 fs/xfs/xfs_iget.c              |  111 ---
 fs/xfs/xfs_inode.c             |   67 --
 fs/xfs/xfs_inode.h             |   76 --
 fs/xfs/xfs_inode_item.c        |    5
 fs/xfs/xfs_iomap.c             |   85 --
 fs/xfs/xfs_iomap.h             |    8
 fs/xfs/xfs_log.c               |  181 +----
 fs/xfs/xfs_log_priv.h          |   20
 fs/xfs/xfs_log_recover.c       |    1
 fs/xfs/xfs_mount.c             |    2
 fs/xfs/xfs_quota.h             |    8
 fs/xfs/xfs_rename.c            |    1
 fs/xfs/xfs_rtalloc.c           |    1
 fs/xfs/xfs_rw.c                |    3
 fs/xfs/xfs_trans.h             |   47 +
 fs/xfs/xfs_trans_buf.c         |   62 -
 fs/xfs/xfs_vnodeops.c          |    8
 70 files changed, 2151 insertions(+), 2592 deletions(-)
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

01 Sep, 2009 2 commits

un-static xfs_read_agf · fef1111e

Authored by Eric Sandeen on Jul 02, 2009

CONFIG_XFS_DEBUG builds still need xfs_read_agf to be
non-static, oops.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>

xfs: add more statics & drop some unused functions · d96f8f89

Authored by Eric Sandeen on Jul 02, 2009

A lot more functions could be made static, but they need
forward declarations; this does some easy ones, and also
found a few unused functions in the process.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Felix Blyakher <felixb@sgi.com>

03 Jul, 2009 1 commit

un-static xfs_read_agf · ab8b9baa

Authored by Eric Sandeen on Jul 02, 2009

CONFIG_XFS_DEBUG builds still need xfs_read_agf to be
non-static, oops.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>

02 Jul, 2009 1 commit

xfs: add more statics & drop some unused functions · 370f0482

Authored by Eric Sandeen on Jul 02, 2009

A lot more functions could be made static, but they need
forward declarations; this does some easy ones, and also
found a few unused functions in the process.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Felix Blyakher <felixb@sgi.com>

16 Mar, 2009 1 commit

xfs: factor out code to find the longest free extent in the AG · 6cc87645

Authored by Dave Chinner on Mar 16, 2009

Signed-off-by: Dave Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

01 Dec, 2008 1 commit

[XFS] factor out xfs_read_agf helper · 4805621a

Authored by Christoph Hellwig on Nov 28, 2008

Add a helper to read the AGF header and perform basic verification.
Based on hunks from a larger patch from Dave Chinner.

(First sent on July 23rd)
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>

30 Oct, 2008 10 commits

[XFS] Always use struct xfs_btree_block instead of short / longform · 7cc95a82

Authored by Christoph Hellwig on Oct 30, 2008

structures.

Always use the generic xfs_btree_block type instead of the short / long
structures. Add XFS_BTREE_SBLOCK_LEN / XFS_BTREE_LBLOCK_LEN defines for
the length of a short / long form block. The rationale for this is that we
will grow more btree block header variants to support CRCs and other RAS
information, and always accessing them through the same datatype with
unions for the short / long form pointers makes implementing this much
easier.

SGI-PV: 988146

SGI-Modid: xfs-linux-melb:xfs-kern:32300a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
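
A rough sketch of what "one block type with a union for the short / long form pointers" can look like is given below. The field names and the `SKETCH_*` macros are assumptions made for illustration, and plain integer types stand in for the on-disk big-endian types; this is not the header layout from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One generic block header; only the sibling pointers differ by form. */
struct btree_block_sketch {
    uint32_t magic;      /* identifies the btree type */
    uint16_t level;      /* 0 for a leaf block */
    uint16_t numrecs;    /* number of records in this block */
    union {
        struct { uint32_t leftsib, rightsib; } s;   /* short form pointers */
        struct { uint64_t leftsib, rightsib; } l;   /* long form pointers */
    } u;
};

/* The layout below is what common ABIs produce; fail the build otherwise. */
static_assert(offsetof(struct btree_block_sketch, u) == 8,
              "common fields are expected to occupy the first 8 bytes");

/* Header length of each form: common fields plus that form's pointers. */
#define SKETCH_SBLOCK_LEN \
    (offsetof(struct btree_block_sketch, u) + \
     sizeof(((struct btree_block_sketch *)0)->u.s))
#define SKETCH_LBLOCK_LEN \
    (offsetof(struct btree_block_sketch, u) + \
     sizeof(((struct btree_block_sketch *)0)->u.l))
```

Code that only touches the common fields can take a `struct btree_block_sketch *` regardless of form, and a new header variant becomes another union member rather than another block type.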

[XFS] Check agf_btreeblks is valid when reading in the AGF · 89b28393

Authored by Barry Naujok on Oct 30, 2008

SGI-PV: 987683

SGI-Modid: xfs-linux-melb:xfs-kern:32232a
Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

[XFS] implement generic xfs_btree_get_rec · 8cc938fe

Authored by Christoph Hellwig on Oct 30, 2008

Not really much reason to make it generic given that it's so small, but
this is the last non-method in xfs_alloc_btree.c and xfs_ialloc_btree.c,
so it makes the whole btree implementation more structured.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32206a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>

[XFS] implement generic xfs_btree_delete/delrec · 91cca5df

Authored by Christoph Hellwig on Oct 30, 2008

Make the btree delete code generic. Based on a patch from David Chinner
with lots of changes to follow the original btree implementations more
closely. While this loses some of the generic helper routines for
inserting/moving/removing records it also solves some of the one off bugs
in the original code and makes it easier to verify.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32205a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>

[XFS] implement generic xfs_btree_insert/insrec · 4b22a571

Authored by Christoph Hellwig on Oct 30, 2008

Make the btree insert code generic. Based on a patch from David Chinner
with lots of changes to follow the original btree implementations more
closely. While this loses some of the generic helper routines for
inserting/moving/removing records it also solves some of the one off bugs
in the original code and makes it easier to verify.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32202a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>

[XFS] implement generic xfs_btree_update · 278d0ca1

Authored by Christoph Hellwig on Oct 30, 2008

From: Dave Chinner <dgc@sgi.com>

The most complicated part here is the lastrec tracking for the alloc
btree. Most logic is in the update_lastrec method which has to do some
hopefully good enough dirty magic to maintain it.

[hch: split out from bigger patch and a rework of the lastrec
logic]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32194a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>

[XFS] implement generic xfs_btree_lookup · fe033cc8

Authored by Christoph Hellwig on Oct 30, 2008

From: Dave Chinner <dgc@sgi.com>

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32192a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>

[XFS] implement generic xfs_btree_decrement · 8df4da4a

Authored by Christoph Hellwig on Oct 30, 2008

From: Dave Chinner <dgc@sgi.com>

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32191a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>

[XFS] implement generic xfs_btree_increment · 637aa50f

Authored by Christoph Hellwig on Oct 30, 2008

From: Dave Chinner <dgc@sgi.com>

Because this is the first major generic btree routine this patch includes
some infrastructure, first a few routines to deal with a btree block that
can be either in short or long form, second xfs_btree_read_buf_block,
which is the new central routine to read a btree block given a cursor, and
third the new xfs_btree_ptr_addr routine to calculate the address for a
given btree pointer record.

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32190a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>

[XFS] split up xfs_btree_init_cursor · 561f7d17

Authored by Christoph Hellwig on Oct 30, 2008

xfs_btree_init_cursor contains very little shared code for the
different btrees and will get even more non-common code in the future.
Split it up into one routine per btree type.

Because xfs_btree_dup_cursor needs to call the init routine for a generic
btree cursor add a new btree operation vector that contains a dup_cursor
method that initializes a new cursor based on an existing one.

The btree operations vector is based on an idea and code from Dave Chinner
and will grow more entries later during this series.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32176a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
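
The operations-vector arrangement described above can be sketched in user space as follows. Every name here (`btree_ops`, `allocbt_init_cursor`, `btree_dup_cursor`, the `agno` field) is a simplified stand-in chosen for illustration, not the kernel's definition: each btree type has its own init routine, and the generic duplicate routine reaches it through a `dup_cursor` method.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct btree_cur;

/* Per-btree-type method table; only dup_cursor is modelled here. */
struct btree_ops {
    struct btree_cur *(*dup_cursor)(struct btree_cur *cur);
};

struct btree_cur {
    const struct btree_ops *ops;   /* methods for this cursor's btree type */
    uint32_t agno;                 /* AG the cursor walks (hypothetical field) */
};

struct btree_cur *allocbt_init_cursor(uint32_t agno);

/* Type-specific duplicate: build a fresh cursor from the old one's state. */
static struct btree_cur *allocbt_dup_cursor(struct btree_cur *cur)
{
    return allocbt_init_cursor(cur->agno);
}

static const struct btree_ops allocbt_ops = {
    .dup_cursor = allocbt_dup_cursor,
};

/* One init routine per btree type; this one is for the allocation btree. */
struct btree_cur *allocbt_init_cursor(uint32_t agno)
{
    struct btree_cur *cur = malloc(sizeof(*cur));
    if (cur == NULL)
        return NULL;
    cur->ops = &allocbt_ops;
    cur->agno = agno;
    return cur;
}

/* Generic code never needs to know which btree type it is duplicating. */
struct btree_cur *btree_dup_cursor(struct btree_cur *cur)
{
    assert(cur != NULL && cur->ops != NULL);
    return cur->ops->dup_cursor(cur);
}
```

Adding another btree type in this model means adding another init routine and another `btree_ops` instance, while `btree_dup_cursor` stays unchanged.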

18 Apr, 2008 4 commits

[XFS] fix logic error in xfs_alloc_ag_vextent_near() · e6430037

Authored by David Chinner on Apr 17, 2008

Fix a logic error in xfs_alloc_ag_vextent_near(). This is a regression
introduced by the error handling changes.

SGI-PV: 890084
SGI-Modid: xfs-linux-melb:xfs-kern:30838a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

[XFS] Make xfs_alloc_compute_aligned() void. · 12375c82

Authored by David Chinner on Apr 10, 2008

xfs_alloc_compute_aligned() returns a value based on a comparison of the
computed extent length and the minimum length allowed. This is only used
by some callers - the other four return parameters are used more often.
Hence move the comparison to the code that actually needs to do it and
make xfs_alloc_compute_aligned() a void function.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30797a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

[XFS] Clean up xfs_alloc_search_busy() return values. · f4586e40

Authored by David Chinner on Apr 10, 2008

xfs_alloc_search_busy() returns an index into the busy array if the extent
was found in the array. This is never checked, and
xfs_alloc_search_busy() does a log force to prevent reuse of the extent
before the free transaction hits the disk. Hence the return value is
useless. Declare the function void and remove the slot number from the
tracing as well.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30796a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

[XFS] replace remaining __FUNCTION__ occurrences · 34a622b2

Authored by Harvey Harrison on Apr 10, 2008

__FUNCTION__ is gcc-specific, use __func__

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30775a
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
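
For readers unfamiliar with the identifier, `__func__` is the standard C99 predefined name for the enclosing function, so a trace macro can pick up the function name without it being passed in by hand. The macro and function below are hypothetical and only illustrate the idea; they are not taken from XFS.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical trace macro: formats "<function>: <msg>" into buf and
 * evaluates to the number of characters snprintf wanted to write. */
#define SKETCH_TRACE(buf, size, msg) \
    snprintf((buf), (size), "%s: %s", __func__, (msg))

/* __func__ expands inside this function, so the trace line is
 * prefixed with "alloc_extent_sketch". */
int alloc_extent_sketch(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return SKETCH_TRACE(buf, size, "enter");
}
```

Because the macro body is expanded at the call site, `__func__` names the caller of the macro rather than any helper.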

14 Feb, 2008 1 commit

xfs: convert beX_add to beX_add_cpu (new common API) · 413d57c9

Authored by Marcin Slusarz on Feb 13, 2008

remove beX_add functions and replace all uses with beX_add_cpu
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Reviewed-by: Dave Chinner <dgc@sgi.com>
Cc: Timothy Shimmin <tes@sgi.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

07 Feb, 2008 2 commits

[XFS] Remove spin.h · 007c61c6

Authored by Eric Sandeen on Oct 11, 2007

remove spinlock init abstraction macro in spin.h, remove the callers, and
remove the file. Move no-op spinlock_destroy to xfs_linux.h. Clean up
spinlock locals in xfs_mount.c

SGI-PV: 970382
SGI-Modid: xfs-linux-melb:xfs-kern:29751a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>

[XFS] Unwrap pagb_lock. · 64137e56

Authored by Eric Sandeen on Oct 11, 2007

Un-obfuscate pagb_lock, remove mutex_lock->spin_lock macros, call
spin_lock directly, remove extraneous cookie holdover from old xfs code,
and change lock type to spinlock_t.

SGI-PV: 970382
SGI-Modid: xfs-linux-melb:xfs-kern:29743a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>

14 Jul, 2007 2 commits

[XFS] Clean up function name handling in tracing code · 3a59c94c

Authored by Eric Sandeen on Jul 11, 2007

Remove the hardcoded "fnames" for tracing, and just embed them in tracing
macros via __FUNCTION__. Kills a lot of #ifdefs too.

SGI-PV: 967353
SGI-Modid: xfs-linux-melb:xfs-kern:29099a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>

[XFS] Lazy Superblock Counters · 92821e2b

Authored by David Chinner on May 24, 2007

When we have a couple of hundred transactions on the fly at once, they all
typically modify the on disk superblock in some way.
create/unlink/mkdir/rmdir modify inode counts, allocation/freeing modify
free block counts.

When these counts are modified in a transaction, they must eventually lock
the superblock buffer and apply the mods. The buffer then remains locked
until the transaction is committed into the incore log buffer. The result
of this is that with enough transactions on the fly the incore superblock
buffer becomes a bottleneck.

The result of contention on the incore superblock buffer is that
transaction rates fall - the more pressure that is put on the superblock
buffer, the slower things go.

The key to removing the contention is to not require the superblock fields
in question to be locked. We do that by not marking the superblock dirty
in the transaction. IOWs, we modify the incore superblock but do not
modify the cached superblock buffer. In short, we do not log superblock
modifications to critical fields in the superblock on every transaction.
In fact we only do it just before we write the superblock to disk every
sync period or just before unmount.

This creates an interesting problem - if we don't log or write out the
fields in every transaction, then how do the values get recovered after a
crash? The answer is simple - we keep enough duplicate, logged information
in other structures that we can reconstruct the correct count after log
recovery has been performed.

It is the AGF and AGI structures that contain the duplicate information;
after recovery, we walk every AGI and AGF and sum their individual
counters to get the correct value, and we do a transaction into the log to
correct them. An optimisation of this is that if we have a clean unmount
record, we know the value in the superblock is correct, so we can avoid
the summation walk under normal conditions and so mount/recovery times do
not change under normal operation.

One wrinkle that was discovered during development was that the blocks
used in the freespace btrees are never accounted for in the AGF counters.
This was once a valid optimisation to make; when the filesystem is full,
the free space btrees are empty and consume no space. Hence when it
matters, the "accounting" is correct. But that means that when we do the
AGF summations, we would not have a correct count and xfs_check would
complain. Hence a new counter was added to track the number of blocks used
by the free space btrees. This is an *on-disk format change*.

As a result of this, lazy superblock counters are a mkfs option and at the
moment on linux there is no way to convert an old filesystem. This is
possible - xfs_db can be used to twiddle the right bits and then
xfs_repair will do the format conversion for you. Similarly, you can
convert backwards as well. At some point we'll add functionality to
xfs_admin to do the bit twiddling easily....

SGI-PV: 964999
SGI-Modid: xfs-linux-melb:xfs-kern:28652a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
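
The recovery-time summation described above can be sketched as a simple walk over per-AG counters. The structures and field names below are simplified assumptions for illustration, and treating the free-block total as the sum of free blocks, freelist blocks and free space btree blocks is likewise an assumption made for this sketch rather than something stated in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct agf_sketch {
    uint32_t freeblks;    /* free blocks tracked in this AG */
    uint32_t flcount;     /* blocks on the AG freelist */
    uint32_t btreeblks;   /* blocks used by the free space btrees (the new counter) */
};

struct agi_sketch {
    uint32_t count;       /* allocated inodes in this AG */
    uint32_t freecount;   /* free inodes in this AG */
};

struct sb_counts {
    uint64_t icount;      /* total allocated inodes */
    uint64_t ifree;       /* total free inodes */
    uint64_t fdblocks;    /* total free data blocks */
};

/* Rebuild the superblock counters by summing every AG's logged counters. */
struct sb_counts rebuild_counts(const struct agf_sketch *agf,
                                const struct agi_sketch *agi, size_t agcount)
{
    assert(agcount == 0 || (agf != NULL && agi != NULL));
    struct sb_counts sb = {0, 0, 0};

    for (size_t i = 0; i < agcount; i++) {
        sb.icount += agi[i].count;
        sb.ifree += agi[i].freecount;
        /* Widen before adding so large AGs cannot overflow 32 bits. */
        sb.fdblocks += (uint64_t)agf[i].freeblks + agf[i].flcount +
                       agf[i].btreeblks;
    }
    return sb;
}
```

After a clean unmount this walk would be skipped, since the superblock values are already known to be correct.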

08 May, 2007 1 commit

[XFS] reducing the number of random number functions. · e7a23a9b

Authored by Joe Perches on May 08, 2007

Patch provided by Joe Perches

SGI-PV: 961696
SGI-Modid: xfs-linux-melb:xfs-kern:28209a
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>

28 Sep, 2006 2 commits

[XFS] Minor code rearranging and cleanup to prevent some coverity false · d432c80e

Authored by Nathan Scott on Sep 28, 2006

positives.

SGI-PV: 955502
SGI-Modid: xfs-linux-melb:xfs-kern:26805a
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>

[XFS] endianess annotation for xfs_agfl_t. Trivial, xfs_agfl_t is always · e2101005

Authored by Christoph Hellwig on Sep 28, 2006

used for ondisk values.

SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26553a
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>

10 Aug, 2006 1 commit

[XFS] Fix xfs_free_extent related NULL pointer dereference. · 0e1edbd9

Authored by Nathan Scott on Aug 10, 2006

We recently fixed an out-of-space deadlock in XFS, and part of that fix
involved the addition of the XFS_ALLOC_FLAG_FREEING flag to some of the
space allocator calls to indicate they're freeing space, not allocating
it. There was a missed xfs_alloc_fix_freelist condition test that did not
correctly test "flags". The same test would also test an uninitialised
structure field (args->userdata) and depending on its value either would
or would not return early with a critical buffer pointer set to NULL.

This fixes that up, adds asserts to several places to catch future botches
of this nature, and skips sections of xfs_alloc_fix_freelist that are
irrelevant for the space-freeing case.

SGI-PV: 955303
SGI-Modid: xfs-linux-melb:xfs-kern:26743a
Signed-off-by: Nathan Scott <nathans@sgi.com>

20 Jun, 2006 1 commit

[XFS] Remove version 1 directory code. Never functioned on Linux, just · f6c2d1fa

Authored by Nathan Scott on Jun 20, 2006

pure bloat.

SGI-PV: 952969
SGI-Modid: xfs-linux-melb:xfs-kern:26251a
Signed-off-by: Nathan Scott <nathans@sgi.com>

09 Jun, 2006 1 commit

[XFS] In actual allocation of file system blocks and freeing extents, the · d210a28c

Authored by Yingping Lu on Jun 09, 2006

transaction within each such operation may lock the AGF buffer multiple
times. The extent-freeing function sorts the extents by AGF number before
entering the transaction. However, when file system space is very limited,
allocation tries every AGF to find space, which can cause out-of-order
locking and hence deadlock. This fix mitigates the space scarcity by
setting aside a few blocks without reservation, and avoids deadlock by
maintaining ascending order of AGF locking.

SGI-PV: 947395
SGI-Modid: xfs-linux-melb:xfs-kern:210801a
Signed-off-by: Yingping Lu <yingping@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>

08 May, 2006 1 commit

[XFS] Fix a possible metadata buffer (AGFL) refcount leak when fixing an · e63a3690

Authored by Nathan Scott on May 08, 2006

AG freelist.

SGI-PV: 952681
SGI-Modid: xfs-linux-melb:xfs-kern:25902a
Signed-off-by: Nathan Scott <nathans@sgi.com>

29 Mar, 2006 1 commit

[XFS] We really suck at spulling. Thanks to Chris Pascoe for fixing all · c41564b5

Authored by Nathan Scott on Mar 29, 2006

these typos.

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:25539a
Signed-off-by: Nathan Scott <nathans@sgi.com>

02 Nov, 2005 3 commits

[XFS] Endianess annotations for various allocator data structures · 16259e7d

Authored by Christoph Hellwig on Nov 02, 2005

SGI-PV: 943272
SGI-Modid: xfs-linux:xfs-kern:201006a
Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>

[XFS] Update license/copyright notices to match the prefered SGI · 7b718769

Authored by Nathan Scott on Nov 02, 2005

boilerplate.

SGI-PV: 913862
SGI-Modid: xfs-linux:xfs-kern:23903a
Signed-off-by: Nathan Scott <nathans@sgi.com>

[XFS] Remove xfs_macros.c, xfs_macros.h, rework headers a whole lot. · a844f451

Authored by Nathan Scott on Nov 02, 2005

SGI-PV: 943122
SGI-Modid: xfs-linux:xfs-kern:23901a
Signed-off-by: Nathan Scott <nathans@sgi.com>

openanolis / cloud-kernel synced successfully about 2 years ago
