Commit 0cbbc422, authored by Linus Torvalds

Merge tag 'xfs-rmap-for-linus-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs

Pull more xfs updates from Dave Chinner:
 "This is the second part of the XFS updates for this merge cycle, and
  contains the new reverse block mapping feature for XFS.

  Reverse mapping allows us to track the owner of a specific block on
  disk precisely.  It is implemented as a set of btrees (one per
  allocation group) that track the owners of allocated extents.
  Effectively it is a "used space tree" that is updated when we allocate
  or free extents.  i.e. it is coherent with the free space btrees we
  already maintain and never overlaps with them.

  This reverse mapping infrastructure is the building block of several
  upcoming features - reflink, copy-on-write data, dedupe, online
  metadata and data scrubbing, highly accurate bad sector/data loss
  reporting to users, and significantly improved reconstruction of
  damaged and corrupted filesystems.  There's a lot of new stuff coming
  along in the next couple of cycles, and it all builds on the rmap
  infrastructure.

  As such, it's a huge chunk of new code with new on-disk format
  features and internal infrastructure.  It warns at mount time that it
  is an experimental feature and that it may eat data (as we do with all new
  on-disk features until they stabilise).  We have not released
  userspace support for it yet - userspace support currently requires
  downloading Darrick's xfsprogs repo and building from source, so
  access to this feature is really developer/tester only at this point.
  Initial userspace support will be released at the same time as the
  kernel with this code in it.

  The new rmap enabled code regresses 3 xfstests - all are ENOSPC
  related corner cases, one of which Darrick posted a fix for a few
  hours ago.  The other two are fixed by infrastructure that is part of
  the upcoming reflink patchset.  This new ENOSPC infrastructure
  requires an on-disk format tweak to keep mount times in
  check - we need to keep an on-disk count of allocated rmapbt blocks so
  we don't have to scan the entire btrees at mount time to count them.

  This is currently being tested and will be part of the fixes sent in
  the next week or two so users will not be exposed to this change"
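
The "used space tree" idea in the pull message can be illustrated with a toy model. This is a minimal sketch, not the kernel's implementation: `struct toy_extent`, `toy_rmap_owner()` and `toy_overlaps()` are hypothetical names, a flat array stands in for the per-AG btree, and owner value 0 is an assumed sentinel meaning "no record".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy record; this is not the on-disk XFS rmap layout. */
struct toy_extent {
	uint32_t start;	/* first block within the allocation group */
	uint32_t len;	/* number of blocks in the extent */
	uint64_t owner;	/* owning inode or metadata tag; 0 for free space */
};

/* Return the owner of block bno, or 0 if no used-space record covers it. */
static uint64_t toy_rmap_owner(const struct toy_extent *rmap, size_t n,
			       uint32_t bno)
{
	assert(rmap != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (bno >= rmap[i].start && bno - rmap[i].start < rmap[i].len)
			return rmap[i].owner;
	return 0;
}

/* True if any used-space record overlaps any free-space record. */
static bool toy_overlaps(const struct toy_extent *used, size_t nu,
			 const struct toy_extent *free_sp, size_t nf)
{
	for (size_t i = 0; i < nu; i++)
		for (size_t j = 0; j < nf; j++) {
			/* Widen to 64 bits so start + len cannot wrap. */
			uint64_t u_end = (uint64_t)used[i].start + used[i].len;
			uint64_t f_end = (uint64_t)free_sp[j].start + free_sp[j].len;

			if (used[i].start < f_end && free_sp[j].start < u_end)
				return true;
		}
	return false;
}
```

Given used extents [0,8), [8,24) and [40,44) and free extents [24,40) and [44,64), a lookup of block 10 returns the owner of the second used extent and the overlap check reports false, which is the coherence property the message describes.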

* tag 'xfs-rmap-for-linus-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (52 commits)
  xfs: move (and rename) the deferred bmap-free tracepoints
  xfs: collapse single use static functions
  xfs: remove unnecessary parentheses from log redo item recovery functions
  xfs: remove the extents array from the rmap update done log item
  xfs: in btree_lshift, only allocate temporary cursor when needed
  xfs: remove unnecesary lshift/rshift key initialization
  xfs: remove the get*keys and update_keys btree ops pointers
  xfs: enable the rmap btree functionality
  xfs: don't update rmapbt when fixing agfl
  xfs: disable XFS_IOC_SWAPEXT when rmap btree is enabled
  xfs: add rmap btree block detection to log recovery
  xfs: add rmap btree geometry feature flag
  xfs: propagate bmap updates to rmapbt
  xfs: enable the xfs_defer mechanism to process rmaps to update
  xfs: log rmap intent items
  xfs: create rmap update intent log items
  xfs: add rmap btree insert and delete helpers
  xfs: convert unwritten status of reverse mappings
  xfs: remove an extent from the rmap btree
  xfs: add an extent to the rmap btree
  ...
......@@ -39,6 +39,7 @@ xfs-y += $(addprefix libxfs/, \
xfs_btree.o \
xfs_da_btree.o \
xfs_da_format.o \
xfs_defer.o \
xfs_dir2.o \
xfs_dir2_block.o \
xfs_dir2_data.o \
......@@ -51,6 +52,8 @@ xfs-y += $(addprefix libxfs/, \
xfs_inode_fork.o \
xfs_inode_buf.o \
xfs_log_rlimit.o \
xfs_rmap.o \
xfs_rmap_btree.o \
xfs_sb.o \
xfs_symlink_remote.o \
xfs_trans_resv.o \
......@@ -100,11 +103,13 @@ xfs-y += xfs_log.o \
xfs_extfree_item.o \
xfs_icreate_item.o \
xfs_inode_item.o \
xfs_rmap_item.o \
xfs_log_recover.o \
xfs_trans_ail.o \
xfs_trans_buf.o \
xfs_trans_extfree.o \
xfs_trans_inode.o \
xfs_trans_rmap.o \
# optional features
xfs-$(CONFIG_XFS_QUOTA) += xfs_dquot.o \
......
......@@ -24,8 +24,10 @@
#include "xfs_bit.h"
#include "xfs_sb.h"
#include "xfs_mount.h"
#include "xfs_defer.h"
#include "xfs_inode.h"
#include "xfs_btree.h"
#include "xfs_rmap.h"
#include "xfs_alloc_btree.h"
#include "xfs_alloc.h"
#include "xfs_extent_busy.h"
......@@ -49,6 +51,81 @@ STATIC int xfs_alloc_ag_vextent_size(xfs_alloc_arg_t *);
STATIC int xfs_alloc_ag_vextent_small(xfs_alloc_arg_t *,
xfs_btree_cur_t *, xfs_agblock_t *, xfs_extlen_t *, int *);
xfs_extlen_t
xfs_prealloc_blocks(
struct xfs_mount *mp)
{
if (xfs_sb_version_hasrmapbt(&mp->m_sb))
return XFS_RMAP_BLOCK(mp) + 1;
if (xfs_sb_version_hasfinobt(&mp->m_sb))
return XFS_FIBT_BLOCK(mp) + 1;
return XFS_IBT_BLOCK(mp) + 1;
}
/*
* In order to avoid ENOSPC-related deadlock caused by out-of-order locking of
* AGF buffer (PV 947395), we place constraints on the relationship among
* actual allocations for data blocks, freelist blocks, and potential file data
* bmap btree blocks. However, these restrictions may result in no actual space
* allocated for a delayed extent, for example, a data block in a certain AG is
* allocated but there is no additional block for the additional bmap btree
* block due to a split of the bmap btree of the file. The result of this may
* lead to an infinite loop when the file gets flushed to disk and all delayed
* extents need to be actually allocated. To get around this, we explicitly set
* aside a few blocks which will not be reserved in delayed allocation.
*
* When rmap is disabled, we need to reserve 4 fsbs _per AG_ for the freelist
* and 4 more to handle a potential split of the file's bmap btree.
*
* When rmap is enabled, we must also be able to handle two rmap btree inserts
* to record both the file data extent and a new bmbt block. The bmbt block
* might not be in the same AG as the file data extent. In the worst case
* the bmap btree splits multiple levels and all the new blocks come from
* different AGs, so set aside enough to handle rmap btree splits in all AGs.
*/
unsigned int
xfs_alloc_set_aside(
struct xfs_mount *mp)
{
unsigned int blocks;
blocks = 4 + (mp->m_sb.sb_agcount * XFS_ALLOC_AGFL_RESERVE);
if (xfs_sb_version_hasrmapbt(&mp->m_sb))
blocks += mp->m_sb.sb_agcount * mp->m_rmap_maxlevels;
return blocks;
}
/*
* When deciding how much space to allocate out of an AG, we limit the
* allocation maximum size to the size the AG. However, we cannot use all the
* blocks in the AG - some are permanently used by metadata. These
* blocks are generally:
* - the AG superblock, AGF, AGI and AGFL
* - the AGF (bno and cnt) and AGI btree root blocks, and optionally
* the AGI free inode and rmap btree root blocks.
* - blocks on the AGFL according to xfs_alloc_set_aside() limits
* - the rmapbt root block
*
* The AG headers are sector sized, so the amount of space they take up is
* dependent on filesystem geometry. The others are all single blocks.
*/
unsigned int
xfs_alloc_ag_max_usable(
struct xfs_mount *mp)
{
unsigned int blocks;
blocks = XFS_BB_TO_FSB(mp, XFS_FSS_TO_BB(mp, 4)); /* ag headers */
blocks += XFS_ALLOC_AGFL_RESERVE;
blocks += 3; /* AGF, AGI btree root blocks */
if (xfs_sb_version_hasfinobt(&mp->m_sb))
blocks++; /* finobt root block */
if (xfs_sb_version_hasrmapbt(&mp->m_sb))
blocks++; /* rmap root block */
return mp->m_sb.sb_agblocks - blocks;
}
/*
* Lookup the record equal to [bno, len] in the btree given by cur.
*/
......@@ -636,6 +713,14 @@ xfs_alloc_ag_vextent(
ASSERT(!args->wasfromfl || !args->isfl);
ASSERT(args->agbno % args->alignment == 0);
/* if not file data, insert new block into the reverse map btree */
if (args->oinfo.oi_owner != XFS_RMAP_OWN_UNKNOWN) {
error = xfs_rmap_alloc(args->tp, args->agbp, args->agno,
args->agbno, args->len, &args->oinfo);
if (error)
return error;
}
if (!args->wasfromfl) {
error = xfs_alloc_update_counters(args->tp, args->pag,
args->agbp,
......@@ -1577,14 +1662,15 @@ xfs_alloc_ag_vextent_small(
/*
* Free the extent starting at agno/bno for length.
*/
STATIC int /* error */
STATIC int
xfs_free_ag_extent(
xfs_trans_t *tp, /* transaction pointer */
xfs_buf_t *agbp, /* buffer for a.g. freelist header */
xfs_agnumber_t agno, /* allocation group number */
xfs_agblock_t bno, /* starting block number */
xfs_extlen_t len, /* length of extent */
int isfl) /* set if is freelist blocks - no sb acctg */
xfs_trans_t *tp,
xfs_buf_t *agbp,
xfs_agnumber_t agno,
xfs_agblock_t bno,
xfs_extlen_t len,
struct xfs_owner_info *oinfo,
int isfl)
{
xfs_btree_cur_t *bno_cur; /* cursor for by-block btree */
xfs_btree_cur_t *cnt_cur; /* cursor for by-size btree */
......@@ -1601,12 +1687,19 @@ xfs_free_ag_extent(
xfs_extlen_t nlen; /* new length of freespace */
xfs_perag_t *pag; /* per allocation group data */
bno_cur = cnt_cur = NULL;
mp = tp->t_mountp;
if (oinfo->oi_owner != XFS_RMAP_OWN_UNKNOWN) {
error = xfs_rmap_free(tp, agbp, agno, bno, len, oinfo);
if (error)
goto error0;
}
/*
* Allocate and initialize a cursor for the by-block btree.
*/
bno_cur = xfs_allocbt_init_cursor(mp, tp, agbp, agno, XFS_BTNUM_BNO);
cnt_cur = NULL;
/*
* Look for a neighboring block on the left (lower block numbers)
* that is contiguous with this space.
......@@ -1875,6 +1968,11 @@ xfs_alloc_min_freelist(
/* space needed by-size freespace btree */
min_free += min_t(unsigned int, pag->pagf_levels[XFS_BTNUM_CNTi] + 1,
mp->m_ag_maxlevels);
/* space needed reverse mapping used space btree */
if (xfs_sb_version_hasrmapbt(&mp->m_sb))
min_free += min_t(unsigned int,
pag->pagf_levels[XFS_BTNUM_RMAPi] + 1,
mp->m_rmap_maxlevels);
return min_free;
}
......@@ -1992,21 +2090,34 @@ xfs_alloc_fix_freelist(
* anything other than extra overhead when we need to put more blocks
* back on the free list? Maybe we should only do this when space is
* getting low or the AGFL is more than half full?
*
* The NOSHRINK flag prevents the AGFL from being shrunk if it's too
* big; the NORMAP flag prevents AGFL expand/shrink operations from
* updating the rmapbt. Both flags are used in xfs_repair while we're
* rebuilding the rmapbt, and neither are used by the kernel. They're
* both required to ensure that rmaps are correctly recorded for the
* regenerated AGFL, bnobt, and cntbt. See repair/phase5.c and
* repair/rmap.c in xfsprogs for details.
*/
while (pag->pagf_flcount > need) {
memset(&targs, 0, sizeof(targs));
if (flags & XFS_ALLOC_FLAG_NORMAP)
xfs_rmap_skip_owner_update(&targs.oinfo);
else
xfs_rmap_ag_owner(&targs.oinfo, XFS_RMAP_OWN_AG);
while (!(flags & XFS_ALLOC_FLAG_NOSHRINK) && pag->pagf_flcount > need) {
struct xfs_buf *bp;
error = xfs_alloc_get_freelist(tp, agbp, &bno, 0);
if (error)
goto out_agbp_relse;
error = xfs_free_ag_extent(tp, agbp, args->agno, bno, 1, 1);
error = xfs_free_ag_extent(tp, agbp, args->agno, bno, 1,
&targs.oinfo, 1);
if (error)
goto out_agbp_relse;
bp = xfs_btree_get_bufs(mp, tp, args->agno, bno, 0);
xfs_trans_binval(tp, bp);
}
memset(&targs, 0, sizeof(targs));
targs.tp = tp;
targs.mp = mp;
targs.agbp = agbp;
......@@ -2271,6 +2382,10 @@ xfs_agf_verify(
be32_to_cpu(agf->agf_levels[XFS_BTNUM_CNT]) > XFS_BTREE_MAXLEVELS)
return false;
if (xfs_sb_version_hasrmapbt(&mp->m_sb) &&
be32_to_cpu(agf->agf_levels[XFS_BTNUM_RMAP]) > XFS_BTREE_MAXLEVELS)
return false;
/*
* during growfs operations, the perag is not fully initialised,
* so we can't use it for any useful checking. growfs ensures we can't
......@@ -2402,6 +2517,8 @@ xfs_alloc_read_agf(
be32_to_cpu(agf->agf_levels[XFS_BTNUM_BNOi]);
pag->pagf_levels[XFS_BTNUM_CNTi] =
be32_to_cpu(agf->agf_levels[XFS_BTNUM_CNTi]);
pag->pagf_levels[XFS_BTNUM_RMAPi] =
be32_to_cpu(agf->agf_levels[XFS_BTNUM_RMAPi]);
spin_lock_init(&pag->pagb_lock);
pag->pagb_count = 0;
pag->pagb_tree = RB_ROOT;
......@@ -2691,7 +2808,8 @@ int /* error */
xfs_free_extent(
struct xfs_trans *tp, /* transaction pointer */
xfs_fsblock_t bno, /* starting block number of extent */
xfs_extlen_t len) /* length of extent */
xfs_extlen_t len, /* length of extent */
struct xfs_owner_info *oinfo) /* extent owner */
{
struct xfs_mount *mp = tp->t_mountp;
struct xfs_buf *agbp;
......@@ -2701,6 +2819,11 @@ xfs_free_extent(
ASSERT(len != 0);
if (XFS_TEST_ERROR(false, mp,
XFS_ERRTAG_FREE_EXTENT,
XFS_RANDOM_FREE_EXTENT))
return -EIO;
error = xfs_free_extent_fix_freelist(tp, agno, &agbp);
if (error)
return error;
......@@ -2712,7 +2835,7 @@ xfs_free_extent(
agbno + len <= be32_to_cpu(XFS_BUF_TO_AGF(agbp)->agf_length),
err);
error = xfs_free_ag_extent(tp, agbp, agno, agbno, len, 0);
error = xfs_free_ag_extent(tp, agbp, agno, agbno, len, oinfo, 0);
if (error)
goto err;
......
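
The set-aside and per-AG usable-space calculations in the hunks above can be checked by hand with a standalone model. This is a minimal sketch, not kernel code: `struct toy_geom` and the `toy_` functions are hypothetical stand-ins for the `xfs_mount` fields, and the header term assumes the four sector-sized AG headers are rounded up to whole filesystem blocks.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_AGFL_RESERVE 4u	/* mirrors XFS_ALLOC_AGFL_RESERVE in the diff */

/* Hypothetical geometry summary; field names are illustrative only. */
struct toy_geom {
	unsigned int agcount;		/* number of allocation groups */
	unsigned int agblocks;		/* filesystem blocks per AG */
	unsigned int sectsize;		/* sector size in bytes */
	unsigned int blocksize;		/* filesystem block size in bytes */
	unsigned int rmap_maxlevels;	/* assumed maximum rmap btree height */
	bool has_finobt;
	bool has_rmapbt;
};

/* Blocks withheld from delayed allocation across the whole filesystem. */
static unsigned int toy_set_aside(const struct toy_geom *g)
{
	unsigned int blocks = 4 + g->agcount * TOY_AGFL_RESERVE;

	if (g->has_rmapbt)
		blocks += g->agcount * g->rmap_maxlevels;
	return blocks;
}

/* Largest allocation a single AG can satisfy after fixed metadata. */
static unsigned int toy_ag_max_usable(const struct toy_geom *g)
{
	unsigned int blocks;

	assert(g->sectsize > 0 && g->blocksize >= g->sectsize);
	/* four sector-sized AG headers, rounded up to whole blocks */
	blocks = (4 * g->sectsize + g->blocksize - 1) / g->blocksize;
	blocks += TOY_AGFL_RESERVE;
	blocks += 3;			/* bno, cnt and inode btree roots */
	if (g->has_finobt)
		blocks++;		/* finobt root block */
	if (g->has_rmapbt)
		blocks++;		/* rmap root block */
	return g->agblocks - blocks;
}
```

With 4 AGs of 65536 blocks, 512-byte sectors, 4096-byte blocks and an assumed rmap height of 5, the model gives a set-aside of 40 blocks and 65526 usable blocks per AG. With finobt and rmapbt both off it gives 20 and 65528, which agrees with the removed macros (`4 + 4*agcount` and `agblocks - headers - 7`).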
......@@ -54,41 +54,8 @@ typedef unsigned int xfs_alloctype_t;
*/
#define XFS_ALLOC_FLAG_TRYLOCK 0x00000001 /* use trylock for buffer locking */
#define XFS_ALLOC_FLAG_FREEING 0x00000002 /* indicate caller is freeing extents*/
/*
* In order to avoid ENOSPC-related deadlock caused by
* out-of-order locking of AGF buffer (PV 947395), we place
* constraints on the relationship among actual allocations for
* data blocks, freelist blocks, and potential file data bmap
* btree blocks. However, these restrictions may result in no
* actual space allocated for a delayed extent, for example, a data
* block in a certain AG is allocated but there is no additional
* block for the additional bmap btree block due to a split of the
* bmap btree of the file. The result of this may lead to an
* infinite loop in xfssyncd when the file gets flushed to disk and
* all delayed extents need to be actually allocated. To get around
* this, we explicitly set aside a few blocks which will not be
* reserved in delayed allocation. Considering the minimum number of
* needed freelist blocks is 4 fsbs _per AG_, a potential split of file's bmap
* btree requires 1 fsb, so we set the number of set-aside blocks
* to 4 + 4*agcount.
*/
#define XFS_ALLOC_SET_ASIDE(mp) (4 + ((mp)->m_sb.sb_agcount * 4))
/*
* When deciding how much space to allocate out of an AG, we limit the
* allocation maximum size to the size the AG. However, we cannot use all the
* blocks in the AG - some are permanently used by metadata. These
* blocks are generally:
* - the AG superblock, AGF, AGI and AGFL
* - the AGF (bno and cnt) and AGI btree root blocks
* - 4 blocks on the AGFL according to XFS_ALLOC_SET_ASIDE() limits
*
* The AG headers are sector sized, so the amount of space they take up is
* dependent on filesystem geometry. The others are all single blocks.
*/
#define XFS_ALLOC_AG_MAX_USABLE(mp) \
((mp)->m_sb.sb_agblocks - XFS_BB_TO_FSB(mp, XFS_FSS_TO_BB(mp, 4)) - 7)
#define XFS_ALLOC_FLAG_NORMAP 0x00000004 /* don't modify the rmapbt */
#define XFS_ALLOC_FLAG_NOSHRINK 0x00000008 /* don't shrink the freelist */
/*
......@@ -123,6 +90,7 @@ typedef struct xfs_alloc_arg {
char isfl; /* set if is freelist blocks - !acctg */
char userdata; /* mask defining userdata treatment */
xfs_fsblock_t firstblock; /* io first block allocated */
struct xfs_owner_info oinfo; /* owner of blocks being allocated */
} xfs_alloc_arg_t;
/*
......@@ -132,6 +100,11 @@ typedef struct xfs_alloc_arg {
#define XFS_ALLOC_INITIAL_USER_DATA (1 << 1)/* special case start of file */
#define XFS_ALLOC_USERDATA_ZERO (1 << 2)/* zero extent on allocation */
/* freespace limit calculations */
#define XFS_ALLOC_AGFL_RESERVE 4
unsigned int xfs_alloc_set_aside(struct xfs_mount *mp);
unsigned int xfs_alloc_ag_max_usable(struct xfs_mount *mp);
xfs_extlen_t xfs_alloc_longest_free_extent(struct xfs_mount *mp,
struct xfs_perag *pag, xfs_extlen_t need);
unsigned int xfs_alloc_min_freelist(struct xfs_mount *mp,
......@@ -210,7 +183,8 @@ int /* error */
xfs_free_extent(
struct xfs_trans *tp, /* transaction pointer */
xfs_fsblock_t bno, /* starting block number of extent */
xfs_extlen_t len); /* length of extent */
xfs_extlen_t len, /* length of extent */
struct xfs_owner_info *oinfo);/* extent owner */
int /* error */
xfs_alloc_lookup_ge(
......@@ -232,4 +206,6 @@ int xfs_alloc_fix_freelist(struct xfs_alloc_arg *args, int flags);
int xfs_free_extent_fix_freelist(struct xfs_trans *tp, xfs_agnumber_t agno,
struct xfs_buf **agbp);
xfs_extlen_t xfs_prealloc_blocks(struct xfs_mount *mp);
#endif /* __XFS_ALLOC_H__ */
......@@ -211,17 +211,6 @@ xfs_allocbt_init_key_from_rec(
key->alloc.ar_blockcount = rec->alloc.ar_blockcount;
}
STATIC void
xfs_allocbt_init_rec_from_key(
union xfs_btree_key *key,
union xfs_btree_rec *rec)
{
ASSERT(key->alloc.ar_startblock != 0);
rec->alloc.ar_startblock = key->alloc.ar_startblock;
rec->alloc.ar_blockcount = key->alloc.ar_blockcount;
}
STATIC void
xfs_allocbt_init_rec_from_cur(
struct xfs_btree_cur *cur,
......@@ -406,7 +395,6 @@ static const struct xfs_btree_ops xfs_allocbt_ops = {
.get_minrecs = xfs_allocbt_get_minrecs,
.get_maxrecs = xfs_allocbt_get_maxrecs,
.init_key_from_rec = xfs_allocbt_init_key_from_rec,
.init_rec_from_key = xfs_allocbt_init_rec_from_key,
.init_rec_from_cur = xfs_allocbt_init_rec_from_cur,
.init_ptr_from_cur = xfs_allocbt_init_ptr_from_cur,
.key_diff = xfs_allocbt_key_diff,
......
......@@ -23,6 +23,7 @@
#include "xfs_trans_resv.h"
#include "xfs_bit.h"
#include "xfs_mount.h"
#include "xfs_defer.h"
#include "xfs_da_format.h"
#include "xfs_da_btree.h"
#include "xfs_attr_sf.h"
......@@ -203,7 +204,7 @@ xfs_attr_set(
{
struct xfs_mount *mp = dp->i_mount;
struct xfs_da_args args;
struct xfs_bmap_free flist;
struct xfs_defer_ops dfops;
struct xfs_trans_res tres;
xfs_fsblock_t firstblock;
int rsvd = (flags & ATTR_ROOT) != 0;
......@@ -221,7 +222,7 @@ xfs_attr_set(
args.value = value;
args.valuelen = valuelen;
args.firstblock = &firstblock;
args.flist = &flist;
args.dfops = &dfops;
args.op_flags = XFS_DA_OP_ADDNAME | XFS_DA_OP_OKNOENT;
args.total = xfs_attr_calc_size(&args, &local);
......@@ -316,13 +317,13 @@ xfs_attr_set(
* It won't fit in the shortform, transform to a leaf block.
* GROT: another possible req'mt for a double-split btree op.
*/
xfs_bmap_init(args.flist, args.firstblock);
xfs_defer_init(args.dfops, args.firstblock);
error = xfs_attr_shortform_to_leaf(&args);
if (!error)
error = xfs_bmap_finish(&args.trans, args.flist, dp);
error = xfs_defer_finish(&args.trans, args.dfops, dp);
if (error) {
args.trans = NULL;
xfs_bmap_cancel(&flist);
xfs_defer_cancel(&dfops);
goto out;
}
......@@ -382,7 +383,7 @@ xfs_attr_remove(
{
struct xfs_mount *mp = dp->i_mount;
struct xfs_da_args args;
struct xfs_bmap_free flist;
struct xfs_defer_ops dfops;
xfs_fsblock_t firstblock;
int error;
......@@ -399,7 +400,7 @@ xfs_attr_remove(
return error;
args.firstblock = &firstblock;
args.flist = &flist;
args.dfops = &dfops;
/*
* we have no control over the attribute names that userspace passes us
......@@ -584,13 +585,13 @@ xfs_attr_leaf_addname(xfs_da_args_t *args)
* Commit that transaction so that the node_addname() call
* can manage its own transactions.
*/
xfs_bmap_init(args->flist, args->firstblock);
xfs_defer_init(args->dfops, args->firstblock);
error = xfs_attr3_leaf_to_node(args);
if (!error)
error = xfs_bmap_finish(&args->trans, args->flist, dp);
error = xfs_defer_finish(&args->trans, args->dfops, dp);
if (error) {
args->trans = NULL;
xfs_bmap_cancel(args->flist);
xfs_defer_cancel(args->dfops);
return error;
}
......@@ -674,15 +675,15 @@ xfs_attr_leaf_addname(xfs_da_args_t *args)
* If the result is small enough, shrink it all into the inode.
*/
if ((forkoff = xfs_attr_shortform_allfit(bp, dp))) {
xfs_bmap_init(args->flist, args->firstblock);
xfs_defer_init(args->dfops, args->firstblock);
error = xfs_attr3_leaf_to_shortform(bp, args, forkoff);
/* bp is gone due to xfs_da_shrink_inode */
if (!error)
error = xfs_bmap_finish(&args->trans,
args->flist, dp);
error = xfs_defer_finish(&args->trans,
args->dfops, dp);
if (error) {
args->trans = NULL;
xfs_bmap_cancel(args->flist);
xfs_defer_cancel(args->dfops);
return error;
}
}
......@@ -737,14 +738,14 @@ xfs_attr_leaf_removename(xfs_da_args_t *args)
* If the result is small enough, shrink it all into the inode.
*/
if ((forkoff = xfs_attr_shortform_allfit(bp, dp))) {
xfs_bmap_init(args->flist, args->firstblock);
xfs_defer_init(args->dfops, args->firstblock);
error = xfs_attr3_leaf_to_shortform(bp, args, forkoff);
/* bp is gone due to xfs_da_shrink_inode */
if (!error)
error = xfs_bmap_finish(&args->trans, args->flist, dp);
error = xfs_defer_finish(&args->trans, args->dfops, dp);
if (error) {
args->trans = NULL;
xfs_bmap_cancel(args->flist);
xfs_defer_cancel(args->dfops);
return error;
}
}
......@@ -863,14 +864,14 @@ xfs_attr_node_addname(xfs_da_args_t *args)
*/
xfs_da_state_free(state);
state = NULL;
xfs_bmap_init(args->flist, args->firstblock);
xfs_defer_init(args->dfops, args->firstblock);
error = xfs_attr3_leaf_to_node(args);
if (!error)
error = xfs_bmap_finish(&args->trans,
args->flist, dp);
error = xfs_defer_finish(&args->trans,
args->dfops, dp);
if (error) {
args->trans = NULL;
xfs_bmap_cancel(args->flist);
xfs_defer_cancel(args->dfops);
goto out;
}
......@@ -891,13 +892,13 @@ xfs_attr_node_addname(xfs_da_args_t *args)
* in the index/blkno/rmtblkno/rmtblkcnt fields and
* in the index2/blkno2/rmtblkno2/rmtblkcnt2 fields.
*/
xfs_bmap_init(args->flist, args->firstblock);
xfs_defer_init(args->dfops, args->firstblock);
error = xfs_da3_split(state);
if (!error)
error = xfs_bmap_finish(&args->trans, args->flist, dp);
error = xfs_defer_finish(&args->trans, args->dfops, dp);
if (error) {
args->trans = NULL;
xfs_bmap_cancel(args->flist);
xfs_defer_cancel(args->dfops);
goto out;
}
} else {
......@@ -990,14 +991,14 @@ xfs_attr_node_addname(xfs_da_args_t *args)
* Check to see if the tree needs to be collapsed.
*/
if (retval && (state->path.active > 1)) {
xfs_bmap_init(args->flist, args->firstblock);
xfs_defer_init(args->dfops, args->firstblock);
error = xfs_da3_join(state);
if (!error)
error = xfs_bmap_finish(&args->trans,
args->flist, dp);
error = xfs_defer_finish(&args->trans,
args->dfops, dp);
if (error) {
args->trans = NULL;
xfs_bmap_cancel(args->flist);
xfs_defer_cancel(args->dfops);
goto out;
}
}
......@@ -1113,13 +1114,13 @@ xfs_attr_node_removename(xfs_da_args_t *args)
* Check to see if the tree needs to be collapsed.
*/
if (retval && (state->path.active > 1)) {
xfs_bmap_init(args->flist, args->firstblock);
xfs_defer_init(args->dfops, args->firstblock);
error = xfs_da3_join(state);
if (!error)
error = xfs_bmap_finish(&args->trans, args->flist, dp);
error = xfs_defer_finish(&args->trans, args->dfops, dp);
if (error) {
args->trans = NULL;
xfs_bmap_cancel(args->flist);
xfs_defer_cancel(args->dfops);
goto out;
}
/*
......@@ -1146,15 +1147,15 @@ xfs_attr_node_removename(xfs_da_args_t *args)
goto out;
if ((forkoff = xfs_attr_shortform_allfit(bp, dp))) {
xfs_bmap_init(args->flist, args->firstblock);
xfs_defer_init(args->dfops, args->firstblock);
error = xfs_attr3_leaf_to_shortform(bp, args, forkoff);
/* bp is gone due to xfs_da_shrink_inode */
if (!error)
error = xfs_bmap_finish(&args->trans,
args->flist, dp);
error = xfs_defer_finish(&args->trans,
args->dfops, dp);
if (error) {
args->trans = NULL;
xfs_bmap_cancel(args->flist);
xfs_defer_cancel(args->dfops);
goto out;
}
} else
......
......@@ -792,7 +792,7 @@ xfs_attr_shortform_to_leaf(xfs_da_args_t *args)
nargs.dp = dp;
nargs.geo = args->geo;
nargs.firstblock = args->firstblock;
nargs.flist = args->flist;
nargs.dfops = args->dfops;
nargs.total = args->total;
nargs.whichfork = XFS_ATTR_FORK;
nargs.trans = args->trans;
......@@ -922,7 +922,7 @@ xfs_attr3_leaf_to_shortform(
nargs.geo = args->geo;
nargs.dp = dp;
nargs.firstblock = args->firstblock;
nargs.flist = args->flist;
nargs.dfops = args->dfops;
nargs.total = args->total;
nargs.whichfork = XFS_ATTR_FORK;
nargs.trans = args->trans;
......
......@@ -24,6 +24,7 @@
#include "xfs_trans_resv.h"
#include "xfs_bit.h"
#include "xfs_mount.h"
#include "xfs_defer.h"
#include "xfs_da_format.h"
#include "xfs_da_btree.h"
#include "xfs_inode.h"
......@@ -460,16 +461,16 @@ xfs_attr_rmtval_set(
* extent and then crash then the block may not contain the
* correct metadata after log recovery occurs.
*/
xfs_bmap_init(args->flist, args->firstblock);
xfs_defer_init(args->dfops, args->firstblock);
nmap = 1;
error = xfs_bmapi_write(args->trans, dp, (xfs_fileoff_t)lblkno,
blkcnt, XFS_BMAPI_ATTRFORK, args->firstblock,
args->total, &map, &nmap, args->flist);
args->total, &map, &nmap, args->dfops);
if (!error)
error = xfs_bmap_finish(&args->trans, args->flist, dp);
error = xfs_defer_finish(&args->trans, args->dfops, dp);
if (error) {
args->trans = NULL;
xfs_bmap_cancel(args->flist);
xfs_defer_cancel(args->dfops);
return error;
}
......@@ -503,7 +504,7 @@ xfs_attr_rmtval_set(
ASSERT(blkcnt > 0);
xfs_bmap_init(args->flist, args->firstblock);
xfs_defer_init(args->dfops, args->firstblock);
nmap = 1;
error = xfs_bmapi_read(dp, (xfs_fileoff_t)lblkno,
blkcnt, &map, &nmap,
......@@ -603,16 +604,16 @@ xfs_attr_rmtval_remove(
blkcnt = args->rmtblkcnt;
done = 0;
while (!done) {
xfs_bmap_init(args->flist, args->firstblock);
xfs_defer_init(args->dfops, args->firstblock);
error = xfs_bunmapi(args->trans, args->dp, lblkno, blkcnt,
XFS_BMAPI_ATTRFORK, 1, args->firstblock,
args->flist, &done);
args->dfops, &done);
if (!error)
error = xfs_bmap_finish(&args->trans, args->flist,
error = xfs_defer_finish(&args->trans, args->dfops,
args->dp);
if (error) {
args->trans = NULL;
xfs_bmap_cancel(args->flist);
xfs_defer_cancel(args->dfops);
return error;
}
......
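
The attr hunks above all repeat one call shape: initialise the deferred-ops structure, do the work, finish on success, cancel on error. The toy below sketches that shape only. Every `toy_` name is hypothetical, the list holds nothing but extent frees, and no transaction rolling or logging is modelled, so it should not be read as how `xfs_defer_finish()` works internally.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical deferred item: one extent waiting to be freed. */
struct toy_deferred {
	unsigned long start;
	unsigned long len;
	struct toy_deferred *next;
};

struct toy_defer_ops {
	struct toy_deferred *head;	/* pending items */
	int count;			/* number of pending items */
};

typedef int (*toy_apply_fn)(unsigned long start, unsigned long len, void *ctx);

static void toy_defer_init(struct toy_defer_ops *d)
{
	d->head = NULL;
	d->count = 0;
}

/* Queue an extent free; returns 0 on success, -1 if out of memory. */
static int toy_defer_add(struct toy_defer_ops *d, unsigned long start,
			 unsigned long len)
{
	struct toy_deferred *item = malloc(sizeof(*item));

	if (!item)
		return -1;
	item->start = start;
	item->len = len;
	item->next = d->head;
	d->head = item;
	d->count++;
	return 0;
}

/* Drop every pending item without applying it. */
static void toy_defer_cancel(struct toy_defer_ops *d)
{
	while (d->head) {
		struct toy_deferred *item = d->head;

		d->head = item->next;
		free(item);
		d->count--;
	}
	assert(d->count == 0);
}

/* Apply pending items; on error the remainder is left for cancel. */
static int toy_defer_finish(struct toy_defer_ops *d, toy_apply_fn apply,
			    void *ctx)
{
	while (d->head) {
		struct toy_deferred *item = d->head;
		int error = apply(item->start, item->len, ctx);

		if (error)
			return error;
		d->head = item->next;
		free(item);
		d->count--;
	}
	return 0;
}

/* The call shape seen in the hunks: init, work, finish, cancel on error. */
static int toy_update(int (*work)(struct toy_defer_ops *),
		      toy_apply_fn apply, void *ctx)
{
	struct toy_defer_ops dfops;
	int error;

	toy_defer_init(&dfops);
	error = work(&dfops);
	if (!error)
		error = toy_defer_finish(&dfops, apply, ctx);
	if (error) {
		toy_defer_cancel(&dfops);
		return error;
	}
	return 0;
}

/* Example callbacks: queue two frees, then total the blocks released. */
static int toy_work_two_frees(struct toy_defer_ops *d)
{
	int error = toy_defer_add(d, 100, 8);

	if (!error)
		error = toy_defer_add(d, 200, 4);
	return error;
}

static int toy_sum_blocks(unsigned long start, unsigned long len, void *ctx)
{
	(void)start;
	*(unsigned long *)ctx += len;
	return 0;
}

static int toy_fail(unsigned long start, unsigned long len, void *ctx)
{
	(void)start; (void)len; (void)ctx;
	return -5;
}
```

Calling `toy_update(toy_work_two_frees, toy_sum_blocks, &freed)` applies both queued frees, while passing `toy_fail` instead takes the cancel path and returns the error with nothing left queued.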
This diff is collapsed.
......@@ -32,7 +32,7 @@ extern kmem_zone_t *xfs_bmap_free_item_zone;
*/
struct xfs_bmalloca {
xfs_fsblock_t *firstblock; /* i/o first block allocated */
struct xfs_bmap_free *flist; /* bmap freelist */
struct xfs_defer_ops *dfops; /* bmap freelist */
struct xfs_trans *tp; /* transaction pointer */
struct xfs_inode *ip; /* incore inode pointer */
struct xfs_bmbt_irec prev; /* extent before the new one */
......@@ -62,34 +62,14 @@ struct xfs_bmalloca {
* List of extents to be free "later".
* The list is kept sorted on xbf_startblock.
*/
struct xfs_bmap_free_item
struct xfs_extent_free_item
{
xfs_fsblock_t xbfi_startblock;/* starting fs block number */
xfs_extlen_t xbfi_blockcount;/* number of blocks in extent */
struct list_head xbfi_list;
xfs_fsblock_t xefi_startblock;/* starting fs block number */
xfs_extlen_t xefi_blockcount;/* number of blocks in extent */
struct list_head xefi_list;
struct xfs_owner_info xefi_oinfo; /* extent owner */
};
/*
* Header for free extent list.
*
* xbf_low is used by the allocator to activate the lowspace algorithm -
* when free space is running low the extent allocator may choose to
* allocate an extent from an AG without leaving sufficient space for
* a btree split when inserting the new extent. In this case the allocator
* will enable the lowspace algorithm which is supposed to allow further
* allocations (such as btree splits and newroots) to allocate from
* sequential AGs. In order to avoid locking AGs out of order the lowspace
* algorithm will start searching for free space from AG 0. If the correct
* transaction reservations have been made then this algorithm will eventually
* find all the space it needs.
*/
typedef struct xfs_bmap_free
{
struct list_head xbf_flist; /* list of to-be-free extents */
int xbf_count; /* count of items on list */
int xbf_low; /* alloc in low mode */
} xfs_bmap_free_t;
#define XFS_BMAP_MAX_NMAP 4
/*
......@@ -139,14 +119,6 @@ static inline int xfs_bmapi_aflag(int w)
#define DELAYSTARTBLOCK ((xfs_fsblock_t)-1LL)
#define HOLESTARTBLOCK ((xfs_fsblock_t)-2LL)
static inline void xfs_bmap_init(xfs_bmap_free_t *flp, xfs_fsblock_t *fbp)
{
INIT_LIST_HEAD(&flp->xbf_flist);
flp->xbf_count = 0;
flp->xbf_low = 0;
*fbp = NULLFSBLOCK;
}
/*
* Flags for xfs_bmap_add_extent*.
*/
......@@ -193,11 +165,9 @@ void xfs_bmap_trace_exlist(struct xfs_inode *ip, xfs_extnum_t cnt,
int xfs_bmap_add_attrfork(struct xfs_inode *ip, int size, int rsvd);
void xfs_bmap_local_to_extents_empty(struct xfs_inode *ip, int whichfork);
void xfs_bmap_add_free(struct xfs_mount *mp, struct xfs_bmap_free *flist,
xfs_fsblock_t bno, xfs_filblks_t len);
void xfs_bmap_cancel(struct xfs_bmap_free *flist);
int xfs_bmap_finish(struct xfs_trans **tp, struct xfs_bmap_free *flist,
struct xfs_inode *ip);
void xfs_bmap_add_free(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
xfs_fsblock_t bno, xfs_filblks_t len,
struct xfs_owner_info *oinfo);
void xfs_bmap_compute_maxlevels(struct xfs_mount *mp, int whichfork);
int xfs_bmap_first_unused(struct xfs_trans *tp, struct xfs_inode *ip,
xfs_extlen_t len, xfs_fileoff_t *unused, int whichfork);
......@@ -218,18 +188,18 @@ int xfs_bmapi_write(struct xfs_trans *tp, struct xfs_inode *ip,
xfs_fileoff_t bno, xfs_filblks_t len, int flags,
xfs_fsblock_t *firstblock, xfs_extlen_t total,
struct xfs_bmbt_irec *mval, int *nmap,
struct xfs_bmap_free *flist);
struct xfs_defer_ops *dfops);
int xfs_bunmapi(struct xfs_trans *tp, struct xfs_inode *ip,
xfs_fileoff_t bno, xfs_filblks_t len, int flags,
xfs_extnum_t nexts, xfs_fsblock_t *firstblock,
struct xfs_bmap_free *flist, int *done);
struct xfs_defer_ops *dfops, int *done);
int xfs_check_nostate_extents(struct xfs_ifork *ifp, xfs_extnum_t idx,
xfs_extnum_t num);
uint xfs_default_attroffset(struct xfs_inode *ip);
int xfs_bmap_shift_extents(struct xfs_trans *tp, struct xfs_inode *ip,
xfs_fileoff_t *next_fsb, xfs_fileoff_t offset_shift_fsb,
int *done, xfs_fileoff_t stop_fsb, xfs_fsblock_t *firstblock,
struct xfs_bmap_free *flist, enum shift_direction direction,
struct xfs_defer_ops *dfops, enum shift_direction direction,
int num_exts);
int xfs_bmap_split_extent(struct xfs_inode *ip, xfs_fileoff_t split_offset);
......
......@@ -23,6 +23,7 @@
#include "xfs_trans_resv.h"
#include "xfs_bit.h"
#include "xfs_mount.h"
#include "xfs_defer.h"
#include "xfs_inode.h"
#include "xfs_trans.h"
#include "xfs_inode_item.h"
......@@ -34,6 +35,7 @@
#include "xfs_quota.h"
#include "xfs_trace.h"
#include "xfs_cksum.h"
#include "xfs_rmap.h"
/*
* Determine the extent state.
......@@ -406,11 +408,11 @@ xfs_bmbt_dup_cursor(
cur->bc_private.b.ip, cur->bc_private.b.whichfork);
/*
* Copy the firstblock, flist, and flags values,
* Copy the firstblock, dfops, and flags values,
* since init cursor doesn't get them.
*/
new->bc_private.b.firstblock = cur->bc_private.b.firstblock;
new->bc_private.b.flist = cur->bc_private.b.flist;
new->bc_private.b.dfops = cur->bc_private.b.dfops;
new->bc_private.b.flags = cur->bc_private.b.flags;
return new;
......@@ -423,7 +425,7 @@ xfs_bmbt_update_cursor(
{
ASSERT((dst->bc_private.b.firstblock != NULLFSBLOCK) ||
(dst->bc_private.b.ip->i_d.di_flags & XFS_DIFLAG_REALTIME));
ASSERT(dst->bc_private.b.flist == src->bc_private.b.flist);
ASSERT(dst->bc_private.b.dfops == src->bc_private.b.dfops);
dst->bc_private.b.allocated += src->bc_private.b.allocated;
dst->bc_private.b.firstblock = src->bc_private.b.firstblock;
......@@ -446,6 +448,8 @@ xfs_bmbt_alloc_block(
args.mp = cur->bc_mp;
args.fsbno = cur->bc_private.b.firstblock;
args.firstblock = args.fsbno;
xfs_rmap_ino_bmbt_owner(&args.oinfo, cur->bc_private.b.ip->i_ino,
cur->bc_private.b.whichfork);
if (args.fsbno == NULLFSBLOCK) {
args.fsbno = be64_to_cpu(start->l);
......@@ -462,7 +466,7 @@ xfs_bmbt_alloc_block(
* block allocation here and corrupt the filesystem.
*/
args.minleft = args.tp->t_blk_res;
} else if (cur->bc_private.b.flist->xbf_low) {
} else if (cur->bc_private.b.dfops->dop_low) {
args.type = XFS_ALLOCTYPE_START_BNO;
} else {
args.type = XFS_ALLOCTYPE_NEAR_BNO;
......@@ -490,7 +494,7 @@ xfs_bmbt_alloc_block(
error = xfs_alloc_vextent(&args);
if (error)
goto error0;
cur->bc_private.b.flist->xbf_low = 1;
cur->bc_private.b.dfops->dop_low = true;
}
if (args.fsbno == NULLFSBLOCK) {
XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT);
......@@ -525,8 +529,10 @@ xfs_bmbt_free_block(
struct xfs_inode *ip = cur->bc_private.b.ip;
struct xfs_trans *tp = cur->bc_tp;
xfs_fsblock_t fsbno = XFS_DADDR_TO_FSB(mp, XFS_BUF_ADDR(bp));
struct xfs_owner_info oinfo;
xfs_bmap_add_free(mp, cur->bc_private.b.flist, fsbno, 1);
xfs_rmap_ino_bmbt_owner(&oinfo, ip->i_ino, cur->bc_private.b.whichfork);
xfs_bmap_add_free(mp, cur->bc_private.b.dfops, fsbno, 1, &oinfo);
ip->i_d.di_nblocks--;
xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
......@@ -599,17 +605,6 @@ xfs_bmbt_init_key_from_rec(
cpu_to_be64(xfs_bmbt_disk_get_startoff(&rec->bmbt));
}
STATIC void
xfs_bmbt_init_rec_from_key(
union xfs_btree_key *key,
union xfs_btree_rec *rec)
{
ASSERT(key->bmbt.br_startoff != 0);
xfs_bmbt_disk_set_allf(&rec->bmbt, be64_to_cpu(key->bmbt.br_startoff),
0, 0, XFS_EXT_NORM);
}
STATIC void
xfs_bmbt_init_rec_from_cur(
struct xfs_btree_cur *cur,
......@@ -760,7 +755,6 @@ static const struct xfs_btree_ops xfs_bmbt_ops = {
.get_minrecs = xfs_bmbt_get_minrecs,
.get_dmaxrecs = xfs_bmbt_get_dmaxrecs,
.init_key_from_rec = xfs_bmbt_init_key_from_rec,
.init_rec_from_key = xfs_bmbt_init_rec_from_key,
.init_rec_from_cur = xfs_bmbt_init_rec_from_cur,
.init_ptr_from_cur = xfs_bmbt_init_ptr_from_cur,
.key_diff = xfs_bmbt_key_diff,
......@@ -800,7 +794,7 @@ xfs_bmbt_init_cursor(
cur->bc_private.b.forksize = XFS_IFORK_SIZE(ip, whichfork);
cur->bc_private.b.ip = ip;
cur->bc_private.b.firstblock = NULLFSBLOCK;
cur->bc_private.b.flist = NULL;
cur->bc_private.b.dfops = NULL;
cur->bc_private.b.allocated = 0;
cur->bc_private.b.flags = 0;
cur->bc_private.b.whichfork = whichfork;
......
This diff is collapsed.
......@@ -19,7 +19,7 @@
#define __XFS_BTREE_H__
struct xfs_buf;
struct xfs_bmap_free;
struct xfs_defer_ops;
struct xfs_inode;
struct xfs_mount;
struct xfs_trans;
......@@ -38,17 +38,37 @@ union xfs_btree_ptr {
};
union xfs_btree_key {
xfs_bmbt_key_t bmbt;
struct xfs_bmbt_key bmbt;
xfs_bmdr_key_t bmbr; /* bmbt root block */
xfs_alloc_key_t alloc;
xfs_inobt_key_t inobt;
struct xfs_inobt_key inobt;
struct xfs_rmap_key rmap;
};
/*
* In-core key that holds both low and high keys for overlapped btrees.
* The two keys are packed next to each other on disk, so do the same
* in memory. Preserve the existing xfs_btree_key as a single key to
* avoid the mental model breakage that would happen if we passed a
* bigkey into a function that operates on a single key.
*/
union xfs_btree_bigkey {
struct xfs_bmbt_key bmbt;
xfs_bmdr_key_t bmbr; /* bmbt root block */
xfs_alloc_key_t alloc;
struct xfs_inobt_key inobt;
struct {
struct xfs_rmap_key rmap;
struct xfs_rmap_key rmap_hi;
};
};
union xfs_btree_rec {
xfs_bmbt_rec_t bmbt;
struct xfs_bmbt_rec bmbt;
xfs_bmdr_rec_t bmbr; /* bmbt root block */
xfs_alloc_rec_t alloc;
xfs_inobt_rec_t inobt;
struct xfs_alloc_rec alloc;
struct xfs_inobt_rec inobt;
struct xfs_rmap_rec rmap;
};
/*
......@@ -63,6 +83,7 @@ union xfs_btree_rec {
#define XFS_BTNUM_BMAP ((xfs_btnum_t)XFS_BTNUM_BMAPi)
#define XFS_BTNUM_INO ((xfs_btnum_t)XFS_BTNUM_INOi)
#define XFS_BTNUM_FINO ((xfs_btnum_t)XFS_BTNUM_FINOi)
#define XFS_BTNUM_RMAP ((xfs_btnum_t)XFS_BTNUM_RMAPi)
/*
* For logging record fields.
......@@ -95,6 +116,7 @@ do { \
case XFS_BTNUM_BMAP: __XFS_BTREE_STATS_INC(__mp, bmbt, stat); break; \
case XFS_BTNUM_INO: __XFS_BTREE_STATS_INC(__mp, ibt, stat); break; \
case XFS_BTNUM_FINO: __XFS_BTREE_STATS_INC(__mp, fibt, stat); break; \
case XFS_BTNUM_RMAP: __XFS_BTREE_STATS_INC(__mp, rmap, stat); break; \
case XFS_BTNUM_MAX: ASSERT(0); /* fucking gcc */ ; break; \
} \
} while (0)
......@@ -115,11 +137,13 @@ do { \
__XFS_BTREE_STATS_ADD(__mp, ibt, stat, val); break; \
case XFS_BTNUM_FINO: \
__XFS_BTREE_STATS_ADD(__mp, fibt, stat, val); break; \
case XFS_BTNUM_RMAP: \
__XFS_BTREE_STATS_ADD(__mp, rmap, stat, val); break; \
case XFS_BTNUM_MAX: ASSERT(0); /* fucking gcc */ ; break; \
} \
} while (0)
#define XFS_BTREE_MAXLEVELS 8 /* max of all btrees */
#define XFS_BTREE_MAXLEVELS 9 /* max of all btrees */
struct xfs_btree_ops {
/* size of the key and record structures */
......@@ -158,17 +182,25 @@ struct xfs_btree_ops {
/* init values of btree structures */
void (*init_key_from_rec)(union xfs_btree_key *key,
union xfs_btree_rec *rec);
void (*init_rec_from_key)(union xfs_btree_key *key,
union xfs_btree_rec *rec);
void (*init_rec_from_cur)(struct xfs_btree_cur *cur,
union xfs_btree_rec *rec);
void (*init_ptr_from_cur)(struct xfs_btree_cur *cur,
union xfs_btree_ptr *ptr);
void (*init_high_key_from_rec)(union xfs_btree_key *key,
union xfs_btree_rec *rec);
/* difference between key value and cursor value */
__int64_t (*key_diff)(struct xfs_btree_cur *cur,
union xfs_btree_key *key);
/*
* Difference key1 - key2 -- positive if key1 > key2,
* negative if key1 < key2, and zero if equal.
*/
__int64_t (*diff_two_keys)(struct xfs_btree_cur *cur,
union xfs_btree_key *key1,
union xfs_btree_key *key2);
const struct xfs_buf_ops *buf_ops;
#if defined(DEBUG) || defined(XFS_WARN)
......@@ -192,6 +224,13 @@ struct xfs_btree_ops {
#define LASTREC_DELREC 2
union xfs_btree_irec {
struct xfs_alloc_rec_incore a;
struct xfs_bmbt_irec b;
struct xfs_inobt_rec_incore i;
struct xfs_rmap_irec r;
};
/*
* Btree cursor structure.
* This collects all information needed by the btree code in one place.
......@@ -202,11 +241,7 @@ typedef struct xfs_btree_cur
struct xfs_mount *bc_mp; /* file system mount struct */
const struct xfs_btree_ops *bc_ops;
uint bc_flags; /* btree features - below */
union {
xfs_alloc_rec_incore_t a;
xfs_bmbt_irec_t b;
xfs_inobt_rec_incore_t i;
} bc_rec; /* current insert/search record value */
union xfs_btree_irec bc_rec; /* current insert/search record value */
struct xfs_buf *bc_bufs[XFS_BTREE_MAXLEVELS]; /* buf ptr per level */
int bc_ptrs[XFS_BTREE_MAXLEVELS]; /* key/record # */
__uint8_t bc_ra[XFS_BTREE_MAXLEVELS]; /* readahead bits */
......@@ -218,11 +253,12 @@ typedef struct xfs_btree_cur
union {
struct { /* needed for BNO, CNT, INO */
struct xfs_buf *agbp; /* agf/agi buffer pointer */
struct xfs_defer_ops *dfops; /* deferred updates */
xfs_agnumber_t agno; /* ag number */
} a;
struct { /* needed for BMAP */
struct xfs_inode *ip; /* pointer to our inode */
struct xfs_bmap_free *flist; /* list to free after */
struct xfs_defer_ops *dfops; /* deferred updates */
xfs_fsblock_t firstblock; /* 1st blk allocated */
int allocated; /* count of alloced */
short forksize; /* fork's inode space */
......@@ -238,6 +274,7 @@ typedef struct xfs_btree_cur
#define XFS_BTREE_ROOT_IN_INODE (1<<1) /* root may be variable size */
#define XFS_BTREE_LASTREC_UPDATE (1<<2) /* track last rec externally */
#define XFS_BTREE_CRC_BLOCKS (1<<3) /* uses extended btree blocks */
#define XFS_BTREE_OVERLAPPING (1<<4) /* overlapping intervals */
#define XFS_BTREE_NOERROR 0
......@@ -477,4 +514,19 @@ bool xfs_btree_sblock_verify(struct xfs_buf *bp, unsigned int max_recs);
uint xfs_btree_compute_maxlevels(struct xfs_mount *mp, uint *limits,
unsigned long len);
/* return codes */
#define XFS_BTREE_QUERY_RANGE_CONTINUE 0 /* keep iterating */
#define XFS_BTREE_QUERY_RANGE_ABORT 1 /* stop iterating */
typedef int (*xfs_btree_query_range_fn)(struct xfs_btree_cur *cur,
union xfs_btree_rec *rec, void *priv);
int xfs_btree_query_range(struct xfs_btree_cur *cur,
union xfs_btree_irec *low_rec, union xfs_btree_irec *high_rec,
xfs_btree_query_range_fn fn, void *priv);
typedef int (*xfs_btree_visit_blocks_fn)(struct xfs_btree_cur *cur, int level,
void *data);
int xfs_btree_visit_blocks(struct xfs_btree_cur *cur,
xfs_btree_visit_blocks_fn fn, void *data);
#endif /* __XFS_BTREE_H__ */
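The query-range interface declared above takes a callback that returns XFS_BTREE_QUERY_RANGE_CONTINUE or XFS_BTREE_QUERY_RANGE_ABORT. The following is a minimal userspace sketch of that callback contract, not the kernel implementation: a start-sorted array stands in for the btree, and every `demo_` name and the `DEMO_` constants are hypothetical. It assumes that a non-zero callback return stops the walk and is passed back to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from the XFS_BTREE_QUERY_RANGE_* defines in the header. */
#define DEMO_QUERY_RANGE_CONTINUE	0	/* keep iterating */
#define DEMO_QUERY_RANGE_ABORT		1	/* stop iterating */

/* Hypothetical record: an extent covering [start, start + len). */
struct demo_rec {
	unsigned int	start;
	unsigned int	len;
};

typedef int (*demo_query_range_fn)(const struct demo_rec *rec, void *priv);

/*
 * Call fn for every record in a start-sorted array that overlaps
 * [low, high].  The array is a stand-in for the btree walk.
 */
int demo_query_range(const struct demo_rec *recs, size_t nr,
		     unsigned int low, unsigned int high,
		     demo_query_range_fn fn, void *priv)
{
	size_t	i;
	int	ret;

	for (i = 0; i < nr; i++) {
		if (recs[i].start > high)
			break;		/* sorted: nothing later can overlap */
		if (recs[i].start + recs[i].len <= low)
			continue;	/* ends before the range begins */
		ret = fn(&recs[i], priv);
		if (ret != DEMO_QUERY_RANGE_CONTINUE)
			return ret;	/* ABORT (or an error) ends the walk */
	}
	return DEMO_QUERY_RANGE_CONTINUE;
}

/* Example callback: accumulate the length of each visited record. */
int demo_sum_len(const struct demo_rec *rec, void *priv)
{
	*(unsigned int *)priv += rec->len;
	return DEMO_QUERY_RANGE_CONTINUE;
}

/* Example callback: remember the first visited record, then stop. */
int demo_stop_first(const struct demo_rec *rec, void *priv)
{
	*(unsigned int *)priv = rec->start;
	return DEMO_QUERY_RANGE_ABORT;
}
```

The `priv` pointer carries caller state into the callback, which is the same role the `void *priv` argument plays in `xfs_btree_query_range_fn`.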
......@@ -2029,7 +2029,7 @@ xfs_da_grow_inode_int(
error = xfs_bmapi_write(tp, dp, *bno, count,
xfs_bmapi_aflag(w)|XFS_BMAPI_METADATA|XFS_BMAPI_CONTIG,
args->firstblock, args->total, &map, &nmap,
args->flist);
args->dfops);
if (error)
return error;
......@@ -2052,7 +2052,7 @@ xfs_da_grow_inode_int(
error = xfs_bmapi_write(tp, dp, b, c,
xfs_bmapi_aflag(w)|XFS_BMAPI_METADATA,
args->firstblock, args->total,
&mapp[mapi], &nmap, args->flist);
&mapp[mapi], &nmap, args->dfops);
if (error)
goto out_free_map;
if (nmap < 1)
......@@ -2362,7 +2362,7 @@ xfs_da_shrink_inode(
*/
error = xfs_bunmapi(tp, dp, dead_blkno, count,
xfs_bmapi_aflag(w), 0, args->firstblock,
args->flist, &done);
args->dfops, &done);
if (error == -ENOSPC) {
if (w != XFS_DATA_FORK)
break;
......
......@@ -19,7 +19,7 @@
#ifndef __XFS_DA_BTREE_H__
#define __XFS_DA_BTREE_H__
struct xfs_bmap_free;
struct xfs_defer_ops;
struct xfs_inode;
struct xfs_trans;
struct zone;
......@@ -70,7 +70,7 @@ typedef struct xfs_da_args {
xfs_ino_t inumber; /* input/output inode number */
struct xfs_inode *dp; /* directory inode to manipulate */
xfs_fsblock_t *firstblock; /* ptr to firstblock for bmap calls */
struct xfs_bmap_free *flist; /* ptr to freelist for bmap_finish */
struct xfs_defer_ops *dfops; /* ptr to freelist for bmap_finish */
struct xfs_trans *trans; /* current trans (changes over time) */
xfs_extlen_t total; /* total blocks needed, for 1st bmap */
int whichfork; /* data or attribute fork */
......
......@@ -629,6 +629,7 @@ typedef struct xfs_attr_shortform {
struct xfs_attr_sf_hdr { /* constant-structure header block */
__be16 totsize; /* total bytes in shortform list */
__u8 count; /* count of active entries */
__u8 padding;
} hdr;
struct xfs_attr_sf_entry {
__uint8_t namelen; /* actual length of name (no NULL) */
......
/*
* Copyright (C) 2016 Oracle. All Rights Reserved.
*
* Author: Darrick J. Wong <darrick.wong@oracle.com>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version 2
* of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it would be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
*/
#include "xfs.h"
#include "xfs_fs.h"
#include "xfs_shared.h"
#include "xfs_format.h"
#include "xfs_log_format.h"
#include "xfs_trans_resv.h"
#include "xfs_bit.h"
#include "xfs_sb.h"
#include "xfs_mount.h"
#include "xfs_defer.h"
#include "xfs_trans.h"
#include "xfs_trace.h"
/*
* Deferred Operations in XFS
*
* Due to the way locking rules work in XFS, certain transactions (block
* mapping and unmapping, typically) have permanent reservations so that
* we can roll the transaction to adhere to AG locking order rules and
* to unlock buffers between metadata updates. Prior to rmap/reflink,
* the mapping code had a mechanism to perform these deferrals for
* extents that were going to be freed; this code makes that facility
* more generic.
*
* When adding the reverse mapping and reflink features, it became
* necessary to perform complex remapping multi-transactions to comply
* with AG locking order rules, and to be able to spread a single
* refcount update operation (an operation on an n-block extent can
* update as many as n records!) among multiple transactions. XFS can
* roll a transaction to facilitate this, but using this facility
* requires us to log "intent" items in case log recovery needs to
* redo the operation, and to log "done" items to indicate that redo
* is not necessary.
*
* Deferred work is tracked in xfs_defer_pending items. Each pending
* item tracks one type of deferred work. Incoming work items (which
* have not yet had an intent logged) are attached to a pending item
* on the dop_intake list, where they wait for the caller to finish
* the deferred operations.
*
* Finishing a set of deferred operations is an involved process. To
* start, we define "rolling a deferred-op transaction" as follows:
*
* > For each xfs_defer_pending item on the dop_intake list,
* - Sort the work items in AG order. XFS locking
* order rules require us to lock buffers in AG order.
* - Create a log intent item for that type.
* - Attach it to the pending item.
* - Move the pending item from the dop_intake list to the
* dop_pending list.
* > Roll the transaction.
*
* NOTE: To avoid exceeding the transaction reservation, we limit the
* number of items that we attach to a given xfs_defer_pending.
*
* The actual finishing process looks like this:
*
* > For each xfs_defer_pending in the dop_pending list,
* - Roll the deferred-op transaction as above.
* - Create a log done item for that type, and attach it to the
* log intent item.
* - For each work item attached to the log intent item,
* * Perform the described action.
* * Attach the work item to the log done item.
*
* The key here is that we must log an intent item for all pending
* work items every time we roll the transaction, and that we must log
* a done item as soon as the work is completed. With this mechanism
* we can perform complex remapping operations, chaining intent items
* as needed.
*
* This is an example of remapping the extent (E, E+B) into file X at
* offset A and dealing with the extent (C, C+B) already being mapped
* there:
* +-------------------------------------------------+
* | Unmap file X startblock C offset A length B | t0
* | Intent to reduce refcount for extent (C, B) |
* | Intent to remove rmap (X, C, A, B) |
* | Intent to free extent (D, 1) (bmbt block) |
* | Intent to map (X, A, B) at startblock E |
* +-------------------------------------------------+
* | Map file X startblock E offset A length B | t1
* | Done mapping (X, E, A, B) |
* | Intent to increase refcount for extent (E, B) |
* | Intent to add rmap (X, E, A, B) |
* +-------------------------------------------------+
* | Reduce refcount for extent (C, B) | t2
* | Done reducing refcount for extent (C, B) |
* | Increase refcount for extent (E, B) |
* | Done increasing refcount for extent (E, B) |
* | Intent to free extent (C, B) |
* | Intent to free extent (F, 1) (refcountbt block) |
* | Intent to remove rmap (F, 1, REFC) |
* +-------------------------------------------------+
* | Remove rmap (X, C, A, B) | t3
* | Done removing rmap (X, C, A, B) |
* | Add rmap (X, E, A, B) |
* | Done adding rmap (X, E, A, B) |
* | Remove rmap (F, 1, REFC) |
* | Done removing rmap (F, 1, REFC) |
* +-------------------------------------------------+
* | Free extent (C, B) | t4
* | Done freeing extent (C, B) |
* | Free extent (D, 1) |
* | Done freeing extent (D, 1) |
* | Free extent (F, 1) |
* | Done freeing extent (F, 1) |
* +-------------------------------------------------+
*
* If we should crash before t2 commits, log recovery replays
* the following intent items:
*
* - Intent to reduce refcount for extent (C, B)
* - Intent to remove rmap (X, C, A, B)
* - Intent to free extent (D, 1) (bmbt block)
* - Intent to increase refcount for extent (E, B)
* - Intent to add rmap (X, E, A, B)
*
* In the process of recovering, it should also generate and take care
* of these intent items:
*
* - Intent to free extent (C, B)
* - Intent to free extent (F, 1) (refcountbt block)
* - Intent to remove rmap (F, 1, REFC)
*/
static const struct xfs_defer_op_type *defer_op_types[XFS_DEFER_OPS_TYPE_MAX];
/*
* For each pending item in the intake list, log its intent item and the
* associated extents, then add the entire intake list to the end of
* the pending list.
*/
STATIC void
xfs_defer_intake_work(
struct xfs_trans *tp,
struct xfs_defer_ops *dop)
{
struct list_head *li;
struct xfs_defer_pending *dfp;
list_for_each_entry(dfp, &dop->dop_intake, dfp_list) {
trace_xfs_defer_intake_work(tp->t_mountp, dfp);
dfp->dfp_intent = dfp->dfp_type->create_intent(tp,
dfp->dfp_count);
list_sort(tp->t_mountp, &dfp->dfp_work,
dfp->dfp_type->diff_items);
list_for_each(li, &dfp->dfp_work)
dfp->dfp_type->log_item(tp, dfp->dfp_intent, li);
}
list_splice_tail_init(&dop->dop_intake, &dop->dop_pending);
}
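xfs_defer_intake_work() sorts each pending item's work list with the type's ->diff_items hook so that buffers are later locked in AG order. A rough userspace sketch of that ordering step follows. It uses qsort() on an array in place of list_sort() on a list, and `struct demo_work`, `demo_diff_items` and `demo_sort_by_ag` are hypothetical names, not part of this patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical work item: only the AG number matters for ordering. */
struct demo_work {
	unsigned int	agno;	/* allocation group number */
	unsigned int	agbno;	/* block within the AG */
};

/* Comparator playing the role of a ->diff_items hook: order by AG. */
static int demo_diff_items(const void *a, const void *b)
{
	const struct demo_work	*wa = a;
	const struct demo_work	*wb = b;

	if (wa->agno != wb->agno)
		return wa->agno < wb->agno ? -1 : 1;
	return 0;
}

/* Sort the items so later processing visits AGs in ascending order. */
void demo_sort_by_ag(struct demo_work *items, size_t nr)
{
	qsort(items, nr, sizeof(*items), demo_diff_items);
}
```

One difference worth keeping in mind: the kernel's list_sort() is a stable merge sort, while qsort() makes no stability guarantee, so items within the same AG may be reordered in this sketch.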
/* Abort all the intents that were committed. */
STATIC void
xfs_defer_trans_abort(
struct xfs_trans *tp,
struct xfs_defer_ops *dop,
int error)
{
struct xfs_defer_pending *dfp;
trace_xfs_defer_trans_abort(tp->t_mountp, dop);
/*
* If the transaction was committed, drop the intent reference
* since we're bailing out of here. The other reference is
* dropped when the intent hits the AIL. If the transaction
* was not committed, the intent is freed by the intent item
* unlock handler on abort.
*/
if (!dop->dop_committed)
return;
/* Abort intent items. */
list_for_each_entry(dfp, &dop->dop_pending, dfp_list) {
trace_xfs_defer_pending_abort(tp->t_mountp, dfp);
if (dfp->dfp_committed)
dfp->dfp_type->abort_intent(dfp->dfp_intent);
}
/* Shut down FS. */
xfs_force_shutdown(tp->t_mountp, (error == -EFSCORRUPTED) ?
SHUTDOWN_CORRUPT_INCORE : SHUTDOWN_META_IO_ERROR);
}
/* Roll a transaction so we can do some deferred op processing. */
STATIC int
xfs_defer_trans_roll(
struct xfs_trans **tp,
struct xfs_defer_ops *dop,
struct xfs_inode *ip)
{
int i;
int error;
/* Log all the joined inodes except the one we passed in. */
for (i = 0; i < XFS_DEFER_OPS_NR_INODES && dop->dop_inodes[i]; i++) {
if (dop->dop_inodes[i] == ip)
continue;
xfs_trans_log_inode(*tp, dop->dop_inodes[i], XFS_ILOG_CORE);
}
trace_xfs_defer_trans_roll((*tp)->t_mountp, dop);
/* Roll the transaction. */
error = xfs_trans_roll(tp, ip);
if (error) {
trace_xfs_defer_trans_roll_error((*tp)->t_mountp, dop, error);
xfs_defer_trans_abort(*tp, dop, error);
return error;
}
dop->dop_committed = true;
/* Rejoin the joined inodes except the one we passed in. */
for (i = 0; i < XFS_DEFER_OPS_NR_INODES && dop->dop_inodes[i]; i++) {
if (dop->dop_inodes[i] == ip)
continue;
xfs_trans_ijoin(*tp, dop->dop_inodes[i], 0);
}
return error;
}
/* Do we have any work items to finish? */
bool
xfs_defer_has_unfinished_work(
struct xfs_defer_ops *dop)
{
return !list_empty(&dop->dop_pending) || !list_empty(&dop->dop_intake);
}
/*
* Add this inode to the deferred op. Each joined inode is relogged
* each time we roll the transaction, in addition to any inode passed
* to xfs_defer_finish().
*/
int
xfs_defer_join(
struct xfs_defer_ops *dop,
struct xfs_inode *ip)
{
int i;
for (i = 0; i < XFS_DEFER_OPS_NR_INODES; i++) {
if (dop->dop_inodes[i] == ip)
return 0;
else if (dop->dop_inodes[i] == NULL) {
dop->dop_inodes[i] = ip;
return 0;
}
}
return -EFSCORRUPTED;
}
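The slot logic in xfs_defer_join() can be tried out in isolation. This is a small userspace sketch under stated assumptions: `struct demo_inode`, `struct demo_defer_ops` and `demo_defer_join` are hypothetical stand-ins, and -1 replaces -EFSCORRUPTED as the "no free slot" result.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_INODES	2	/* mirrors XFS_DEFER_OPS_NR_INODES */

struct demo_inode {
	int	ino;
};

struct demo_defer_ops {
	struct demo_inode	*inodes[DEMO_NR_INODES];
};

/*
 * Join an inode: succeed if it is already joined or a slot is free,
 * fail with -1 once every slot holds a different inode.
 */
int demo_defer_join(struct demo_defer_ops *dop, struct demo_inode *ip)
{
	int	i;

	for (i = 0; i < DEMO_NR_INODES; i++) {
		if (dop->inodes[i] == ip)
			return 0;	/* already joined */
		if (dop->inodes[i] == NULL) {
			dop->inodes[i] = ip;
			return 0;	/* took the first free slot */
		}
	}
	return -1;			/* table is full */
}
```

Because slots are filled front to back and never cleared individually, the roll loops can stop at the first NULL entry, which matches the `dop->dop_inodes[i]` loop condition in xfs_defer_trans_roll().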
/*
* Finish all the pending work. This involves logging intent items for
* any work items that wandered in since the last transaction roll (if
* one has even happened), rolling the transaction, and finishing the
* work items in the first item on the logged-and-pending list.
*
* If an inode is provided, relog it to the new transaction.
*/
int
xfs_defer_finish(
struct xfs_trans **tp,
struct xfs_defer_ops *dop,
struct xfs_inode *ip)
{
struct xfs_defer_pending *dfp;
struct list_head *li;
struct list_head *n;
void *done_item = NULL;
void *state;
int error = 0;
void (*cleanup_fn)(struct xfs_trans *, void *, int);
ASSERT((*tp)->t_flags & XFS_TRANS_PERM_LOG_RES);
trace_xfs_defer_finish((*tp)->t_mountp, dop);
/* Until we run out of pending work to finish... */
while (xfs_defer_has_unfinished_work(dop)) {
/* Log intents for work items sitting in the intake. */
xfs_defer_intake_work(*tp, dop);
/* Roll the transaction. */
error = xfs_defer_trans_roll(tp, dop, ip);
if (error)
goto out;
/* Mark all pending intents as committed. */
list_for_each_entry_reverse(dfp, &dop->dop_pending, dfp_list) {
if (dfp->dfp_committed)
break;
trace_xfs_defer_pending_commit((*tp)->t_mountp, dfp);
dfp->dfp_committed = true;
}
/* Log an intent-done item for the first pending item. */
dfp = list_first_entry(&dop->dop_pending,
struct xfs_defer_pending, dfp_list);
trace_xfs_defer_pending_finish((*tp)->t_mountp, dfp);
done_item = dfp->dfp_type->create_done(*tp, dfp->dfp_intent,
dfp->dfp_count);
cleanup_fn = dfp->dfp_type->finish_cleanup;
/* Finish the work items. */
state = NULL;
list_for_each_safe(li, n, &dfp->dfp_work) {
list_del(li);
dfp->dfp_count--;
error = dfp->dfp_type->finish_item(*tp, dop, li,
done_item, &state);
if (error) {
/*
* Clean up after ourselves and jump out.
* xfs_defer_cancel will take care of freeing
* all these lists and stuff.
*/
if (cleanup_fn)
cleanup_fn(*tp, state, error);
xfs_defer_trans_abort(*tp, dop, error);
goto out;
}
}
/* Done with the dfp, free it. */
list_del(&dfp->dfp_list);
kmem_free(dfp);
if (cleanup_fn)
cleanup_fn(*tp, state, error);
}
out:
if (error)
trace_xfs_defer_finish_error((*tp)->t_mountp, dop, error);
else
trace_xfs_defer_finish_done((*tp)->t_mountp, dop);
return error;
}
/*
* Free up any items left in the list.
*/
void
xfs_defer_cancel(
struct xfs_defer_ops *dop)
{
struct xfs_defer_pending *dfp;
struct xfs_defer_pending *pli;
struct list_head *pwi;
struct list_head *n;
trace_xfs_defer_cancel(NULL, dop);
/*
* Free the pending items. Caller should already have arranged
* for the intent items to be released.
*/
list_for_each_entry_safe(dfp, pli, &dop->dop_intake, dfp_list) {
trace_xfs_defer_intake_cancel(NULL, dfp);
list_del(&dfp->dfp_list);
list_for_each_safe(pwi, n, &dfp->dfp_work) {
list_del(pwi);
dfp->dfp_count--;
dfp->dfp_type->cancel_item(pwi);
}
ASSERT(dfp->dfp_count == 0);
kmem_free(dfp);
}
list_for_each_entry_safe(dfp, pli, &dop->dop_pending, dfp_list) {
trace_xfs_defer_pending_cancel(NULL, dfp);
list_del(&dfp->dfp_list);
list_for_each_safe(pwi, n, &dfp->dfp_work) {
list_del(pwi);
dfp->dfp_count--;
dfp->dfp_type->cancel_item(pwi);
}
ASSERT(dfp->dfp_count == 0);
kmem_free(dfp);
}
}
/* Add an item for later deferred processing. */
void
xfs_defer_add(
struct xfs_defer_ops *dop,
enum xfs_defer_ops_type type,
struct list_head *li)
{
struct xfs_defer_pending *dfp = NULL;
/*
* Add the item to a pending item at the end of the intake list.
* If the last pending item has the same type, reuse it. Else,
* create a new pending item at the end of the intake list.
*/
if (!list_empty(&dop->dop_intake)) {
dfp = list_last_entry(&dop->dop_intake,
struct xfs_defer_pending, dfp_list);
if (dfp->dfp_type->type != type ||
(dfp->dfp_type->max_items &&
dfp->dfp_count >= dfp->dfp_type->max_items))
dfp = NULL;
}
if (!dfp) {
dfp = kmem_alloc(sizeof(struct xfs_defer_pending),
KM_SLEEP | KM_NOFS);
dfp->dfp_type = defer_op_types[type];
dfp->dfp_committed = false;
dfp->dfp_intent = NULL;
dfp->dfp_count = 0;
INIT_LIST_HEAD(&dfp->dfp_work);
list_add_tail(&dfp->dfp_list, &dop->dop_intake);
}
list_add_tail(li, &dfp->dfp_work);
dfp->dfp_count++;
}
/* Register a deferred operation type. */
void
xfs_defer_init_op_type(
const struct xfs_defer_op_type *type)
{
defer_op_types[type->type] = type;
}
/* Initialize a deferred operation. */
void
xfs_defer_init(
struct xfs_defer_ops *dop,
xfs_fsblock_t *fbp)
{
dop->dop_committed = false;
dop->dop_low = false;
memset(&dop->dop_inodes, 0, sizeof(dop->dop_inodes));
*fbp = NULLFSBLOCK;
INIT_LIST_HEAD(&dop->dop_intake);
INIT_LIST_HEAD(&dop->dop_pending);
trace_xfs_defer_init(NULL, dop);
}
/*
* Copyright (C) 2016 Oracle. All Rights Reserved.
*
* Author: Darrick J. Wong <darrick.wong@oracle.com>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version 2
* of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it would be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
*/
#ifndef __XFS_DEFER_H__
#define __XFS_DEFER_H__
struct xfs_defer_op_type;
/*
* Save a log intent item and a list of extents, so that we can replay
* whatever action had to happen to the extent list and file the log done
* item.
*/
struct xfs_defer_pending {
const struct xfs_defer_op_type *dfp_type; /* function pointers */
struct list_head dfp_list; /* pending items */
bool dfp_committed; /* committed trans? */
void *dfp_intent; /* log intent item */
struct list_head dfp_work; /* work items */
unsigned int dfp_count; /* # extent items */
};
/*
* Header for deferred operation list.
*
* dop_low is used by the allocator to activate the lowspace algorithm -
* when free space is running low the extent allocator may choose to
* allocate an extent from an AG without leaving sufficient space for
* a btree split when inserting the new extent. In this case the allocator
* will enable the lowspace algorithm which is supposed to allow further
* allocations (such as btree splits and newroots) to allocate from
* sequential AGs. In order to avoid locking AGs out of order the lowspace
* algorithm will start searching for free space from AG 0. If the correct
* transaction reservations have been made then this algorithm will eventually
* find all the space it needs.
*/
enum xfs_defer_ops_type {
XFS_DEFER_OPS_TYPE_RMAP,
XFS_DEFER_OPS_TYPE_FREE,
XFS_DEFER_OPS_TYPE_MAX,
};
#define XFS_DEFER_OPS_NR_INODES 2 /* join up to two inodes */
struct xfs_defer_ops {
bool dop_committed; /* did any trans commit? */
bool dop_low; /* alloc in low mode */
struct list_head dop_intake; /* unlogged pending work */
struct list_head dop_pending; /* logged pending work */
/* relog these inodes with each roll */
struct xfs_inode *dop_inodes[XFS_DEFER_OPS_NR_INODES];
};
void xfs_defer_add(struct xfs_defer_ops *dop, enum xfs_defer_ops_type type,
struct list_head *h);
int xfs_defer_finish(struct xfs_trans **tp, struct xfs_defer_ops *dop,
struct xfs_inode *ip);
void xfs_defer_cancel(struct xfs_defer_ops *dop);
void xfs_defer_init(struct xfs_defer_ops *dop, xfs_fsblock_t *fbp);
bool xfs_defer_has_unfinished_work(struct xfs_defer_ops *dop);
int xfs_defer_join(struct xfs_defer_ops *dop, struct xfs_inode *ip);
/* Description of a deferred type. */
struct xfs_defer_op_type {
enum xfs_defer_ops_type type;
unsigned int max_items;
void (*abort_intent)(void *);
void *(*create_done)(struct xfs_trans *, void *, unsigned int);
int (*finish_item)(struct xfs_trans *, struct xfs_defer_ops *,
struct list_head *, void *, void **);
void (*finish_cleanup)(struct xfs_trans *, void *, int);
void (*cancel_item)(struct list_head *);
int (*diff_items)(void *, struct list_head *, struct list_head *);
void *(*create_intent)(struct xfs_trans *, uint);
void (*log_item)(struct xfs_trans *, void *, struct list_head *);
};
void xfs_defer_init_op_type(const struct xfs_defer_op_type *type);
#endif /* __XFS_DEFER_H__ */
......@@ -21,6 +21,7 @@
#include "xfs_log_format.h"
#include "xfs_trans_resv.h"
#include "xfs_mount.h"
#include "xfs_defer.h"
#include "xfs_da_format.h"
#include "xfs_da_btree.h"
#include "xfs_inode.h"
......@@ -259,7 +260,7 @@ xfs_dir_createname(
struct xfs_name *name,
xfs_ino_t inum, /* new entry inode number */
xfs_fsblock_t *first, /* bmap's firstblock */
xfs_bmap_free_t *flist, /* bmap's freeblock list */
struct xfs_defer_ops *dfops, /* bmap's freeblock list */
xfs_extlen_t total) /* bmap's total block count */
{
struct xfs_da_args *args;
......@@ -286,7 +287,7 @@ xfs_dir_createname(
args->inumber = inum;
args->dp = dp;
args->firstblock = first;
args->flist = flist;
args->dfops = dfops;
args->total = total;
args->whichfork = XFS_DATA_FORK;
args->trans = tp;
......@@ -436,7 +437,7 @@ xfs_dir_removename(
struct xfs_name *name,
xfs_ino_t ino,
xfs_fsblock_t *first, /* bmap's firstblock */
xfs_bmap_free_t *flist, /* bmap's freeblock list */
struct xfs_defer_ops *dfops, /* bmap's freeblock list */
xfs_extlen_t total) /* bmap's total block count */
{
struct xfs_da_args *args;
......@@ -458,7 +459,7 @@ xfs_dir_removename(
args->inumber = ino;
args->dp = dp;
args->firstblock = first;
args->flist = flist;
args->dfops = dfops;
args->total = total;
args->whichfork = XFS_DATA_FORK;
args->trans = tp;
......@@ -498,7 +499,7 @@ xfs_dir_replace(
struct xfs_name *name, /* name of entry to replace */
xfs_ino_t inum, /* new inode number */
xfs_fsblock_t *first, /* bmap's firstblock */
xfs_bmap_free_t *flist, /* bmap's freeblock list */
struct xfs_defer_ops *dfops, /* bmap's freeblock list */
xfs_extlen_t total) /* bmap's total block count */
{
struct xfs_da_args *args;
......@@ -523,7 +524,7 @@ xfs_dir_replace(
args->inumber = inum;
args->dp = dp;
args->firstblock = first;
args->flist = flist;
args->dfops = dfops;
args->total = total;
args->whichfork = XFS_DATA_FORK;
args->trans = tp;
......@@ -680,7 +681,7 @@ xfs_dir2_shrink_inode(
/* Unmap the fsblock(s). */
error = xfs_bunmapi(tp, dp, da, args->geo->fsbcount, 0, 0,
args->firstblock, args->flist, &done);
args->firstblock, args->dfops, &done);
if (error) {
/*
* ENOSPC actually can happen if we're in a removename with no
......
......@@ -18,7 +18,7 @@
#ifndef __XFS_DIR2_H__
#define __XFS_DIR2_H__
struct xfs_bmap_free;
struct xfs_defer_ops;
struct xfs_da_args;
struct xfs_inode;
struct xfs_mount;
......@@ -129,18 +129,18 @@ extern int xfs_dir_init(struct xfs_trans *tp, struct xfs_inode *dp,
extern int xfs_dir_createname(struct xfs_trans *tp, struct xfs_inode *dp,
struct xfs_name *name, xfs_ino_t inum,
xfs_fsblock_t *first,
struct xfs_bmap_free *flist, xfs_extlen_t tot);
struct xfs_defer_ops *dfops, xfs_extlen_t tot);
extern int xfs_dir_lookup(struct xfs_trans *tp, struct xfs_inode *dp,
struct xfs_name *name, xfs_ino_t *inum,
struct xfs_name *ci_name);
extern int xfs_dir_removename(struct xfs_trans *tp, struct xfs_inode *dp,
struct xfs_name *name, xfs_ino_t ino,
xfs_fsblock_t *first,
struct xfs_bmap_free *flist, xfs_extlen_t tot);
struct xfs_defer_ops *dfops, xfs_extlen_t tot);
extern int xfs_dir_replace(struct xfs_trans *tp, struct xfs_inode *dp,
struct xfs_name *name, xfs_ino_t inum,
xfs_fsblock_t *first,
struct xfs_bmap_free *flist, xfs_extlen_t tot);
struct xfs_defer_ops *dfops, xfs_extlen_t tot);
extern int xfs_dir_canenter(struct xfs_trans *tp, struct xfs_inode *dp,
struct xfs_name *name);
......
......@@ -455,8 +455,10 @@ xfs_sb_has_compat_feature(
}
#define XFS_SB_FEAT_RO_COMPAT_FINOBT (1 << 0) /* free inode btree */
#define XFS_SB_FEAT_RO_COMPAT_RMAPBT (1 << 1) /* reverse map btree */
#define XFS_SB_FEAT_RO_COMPAT_ALL \
(XFS_SB_FEAT_RO_COMPAT_FINOBT)
(XFS_SB_FEAT_RO_COMPAT_FINOBT | \
XFS_SB_FEAT_RO_COMPAT_RMAPBT)
#define XFS_SB_FEAT_RO_COMPAT_UNKNOWN ~XFS_SB_FEAT_RO_COMPAT_ALL
static inline bool
xfs_sb_has_ro_compat_feature(
......@@ -538,6 +540,12 @@ static inline bool xfs_sb_version_hasmetauuid(struct xfs_sb *sbp)
(sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_META_UUID);
}
static inline bool xfs_sb_version_hasrmapbt(struct xfs_sb *sbp)
{
return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5) &&
(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_RMAPBT);
}
/*
* end of superblock version macros
*/
......@@ -598,10 +606,10 @@ xfs_is_quota_inode(struct xfs_sb *sbp, xfs_ino_t ino)
#define XFS_AGI_GOOD_VERSION(v) ((v) == XFS_AGI_VERSION)
/*
* Btree number 0 is bno, 1 is cnt. This value gives the size of the
* Btree number 0 is bno, 1 is cnt, 2 is rmap. This value gives the size of the
* arrays below.
*/
#define XFS_BTNUM_AGF ((int)XFS_BTNUM_CNTi + 1)
#define XFS_BTNUM_AGF ((int)XFS_BTNUM_RMAPi + 1)
/*
* The second word of agf_levels in the first a.g. overlaps the EFS
......@@ -618,12 +626,10 @@ typedef struct xfs_agf {
__be32 agf_seqno; /* sequence # starting from 0 */
__be32 agf_length; /* size in blocks of a.g. */
/*
* Freespace information
* Freespace and rmap information
*/
__be32 agf_roots[XFS_BTNUM_AGF]; /* root blocks */
__be32 agf_spare0; /* spare field */
__be32 agf_levels[XFS_BTNUM_AGF]; /* btree levels */
__be32 agf_spare1; /* spare field */
__be32 agf_flfirst; /* first freelist block's index */
__be32 agf_fllast; /* last freelist block's index */
......@@ -1308,17 +1314,118 @@ typedef __be32 xfs_inobt_ptr_t;
#define XFS_FIBT_BLOCK(mp) ((xfs_agblock_t)(XFS_IBT_BLOCK(mp) + 1))
/*
* The first data block of an AG depends on whether the filesystem was formatted
* with the finobt feature. If so, account for the finobt reserved root btree
* block.
* Reverse mapping btree format definitions
*
* There is a btree for the reverse map per allocation group
*/
#define XFS_RMAP_CRC_MAGIC 0x524d4233 /* 'RMB3' */
/*
* Ownership info for an extent. This is used to create reverse-mapping
* entries.
*/
#define XFS_PREALLOC_BLOCKS(mp) \
#define XFS_OWNER_INFO_ATTR_FORK (1 << 0)
#define XFS_OWNER_INFO_BMBT_BLOCK (1 << 1)
struct xfs_owner_info {
uint64_t oi_owner;
xfs_fileoff_t oi_offset;
unsigned int oi_flags;
};
/*
* Special owner types.
*
* Seeing as we only support up to 8EB, we have the upper bit of the owner field
* to tell us we have a special owner value. We use these for static metadata
* allocated at mkfs/growfs time, as well as for freespace management metadata.
*/
#define XFS_RMAP_OWN_NULL (-1ULL) /* No owner, for growfs */
#define XFS_RMAP_OWN_UNKNOWN (-2ULL) /* Unknown owner, for EFI recovery */
#define XFS_RMAP_OWN_FS (-3ULL) /* static fs metadata */
#define XFS_RMAP_OWN_LOG (-4ULL) /* static fs metadata */
#define XFS_RMAP_OWN_AG (-5ULL) /* AG freespace btree blocks */
#define XFS_RMAP_OWN_INOBT (-6ULL) /* Inode btree blocks */
#define XFS_RMAP_OWN_INODES (-7ULL) /* Inode chunk */
#define XFS_RMAP_OWN_MIN (-8ULL) /* guard */
#define XFS_RMAP_NON_INODE_OWNER(owner) (!!((owner) & (1ULL << 63)))
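The special owner values above are negative numbers cast to 64 bits, so each has bit 63 set, which is what XFS_RMAP_NON_INODE_OWNER tests. A minimal userspace sketch of that check follows. The helper names `rmap_non_inode_owner` and `rmap_is_special_owner` are hypothetical and are not part of this patch; the second one assumes that "special" means strictly above the XFS_RMAP_OWN_MIN guard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the definitions above. */
#define XFS_RMAP_OWN_NULL	(-1ULL)
#define XFS_RMAP_OWN_AG		(-5ULL)
#define XFS_RMAP_OWN_INODES	(-7ULL)
#define XFS_RMAP_OWN_MIN	(-8ULL)	/* guard */

/* Same test as XFS_RMAP_NON_INODE_OWNER: is bit 63 of the owner set? */
static bool rmap_non_inode_owner(uint64_t owner)
{
	return (owner & (1ULL << 63)) != 0;
}

/*
 * Hypothetical helper (assumption, not in the patch): treat every value
 * strictly above the guard as one of the named special owners. As
 * unsigned 64-bit numbers, -7ULL .. -1ULL are the seven largest values.
 */
static bool rmap_is_special_owner(uint64_t owner)
{
	return owner > XFS_RMAP_OWN_MIN;
}
```

An ordinary inode number such as 128 fails both checks, while XFS_RMAP_OWN_AG passes both.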
/*
* Data record structure
*/
struct xfs_rmap_rec {
__be32 rm_startblock; /* extent start block */
__be32 rm_blockcount; /* extent length */
__be64 rm_owner; /* extent owner */
__be64 rm_offset; /* offset within the owner */
};
/*
* rmap btree record
* rm_offset:63 is the attribute fork flag
* rm_offset:62 is the bmbt block flag
* rm_offset:61 is the unwritten extent flag (same as l0:63 in bmbt)
* rm_offset:54-60 aren't used and should be zero
* rm_offset:0-53 is the block offset within the inode
*/
#define XFS_RMAP_OFF_ATTR_FORK ((__uint64_t)1ULL << 63)
#define XFS_RMAP_OFF_BMBT_BLOCK ((__uint64_t)1ULL << 62)
#define XFS_RMAP_OFF_UNWRITTEN ((__uint64_t)1ULL << 61)
#define XFS_RMAP_LEN_MAX ((__uint32_t)~0U)
#define XFS_RMAP_OFF_FLAGS (XFS_RMAP_OFF_ATTR_FORK | \
XFS_RMAP_OFF_BMBT_BLOCK | \
XFS_RMAP_OFF_UNWRITTEN)
#define XFS_RMAP_OFF_MASK ((__uint64_t)0x3FFFFFFFFFFFFFULL)
#define XFS_RMAP_OFF(off) ((off) & XFS_RMAP_OFF_MASK)
#define XFS_RMAP_IS_BMBT_BLOCK(off) (!!((off) & XFS_RMAP_OFF_BMBT_BLOCK))
#define XFS_RMAP_IS_ATTR_FORK(off) (!!((off) & XFS_RMAP_OFF_ATTR_FORK))
#define XFS_RMAP_IS_UNWRITTEN(off) (!!((off) & XFS_RMAP_OFF_UNWRITTEN))
#define RMAPBT_STARTBLOCK_BITLEN 32
#define RMAPBT_BLOCKCOUNT_BITLEN 32
#define RMAPBT_OWNER_BITLEN 64
#define RMAPBT_ATTRFLAG_BITLEN 1
#define RMAPBT_BMBTFLAG_BITLEN 1
#define RMAPBT_EXNTFLAG_BITLEN 1
#define RMAPBT_UNUSED_OFFSET_BITLEN 7
#define RMAPBT_OFFSET_BITLEN 54
#define XFS_RMAP_ATTR_FORK (1 << 0)
#define XFS_RMAP_BMBT_BLOCK (1 << 1)
#define XFS_RMAP_UNWRITTEN (1 << 2)
#define XFS_RMAP_KEY_FLAGS (XFS_RMAP_ATTR_FORK | \
XFS_RMAP_BMBT_BLOCK)
#define XFS_RMAP_REC_FLAGS (XFS_RMAP_UNWRITTEN)
struct xfs_rmap_irec {
xfs_agblock_t rm_startblock; /* extent start block */
xfs_extlen_t rm_blockcount; /* extent length */
__uint64_t rm_owner; /* extent owner */
__uint64_t rm_offset; /* offset within the owner */
unsigned int rm_flags; /* state flags */
};
/*
* Key structure
*
* We don't use the length for lookups
*/
struct xfs_rmap_key {
__be32 rm_startblock; /* extent start block */
__be64 rm_owner; /* extent owner */
__be64 rm_offset; /* offset within the owner */
} __attribute__((packed));
/* btree pointer type */
typedef __be32 xfs_rmap_ptr_t;
#define XFS_RMAP_BLOCK(mp) \
(xfs_sb_version_hasfinobt(&((mp)->m_sb)) ? \
XFS_FIBT_BLOCK(mp) + 1 : \
XFS_IBT_BLOCK(mp) + 1)
/*
* BMAP Btree format definitions
*
......
......@@ -206,6 +206,7 @@ typedef struct xfs_fsop_resblks {
#define XFS_FSOP_GEOM_FLAGS_FTYPE 0x10000 /* inode directory types */
#define XFS_FSOP_GEOM_FLAGS_FINOBT 0x20000 /* free inode btree */
#define XFS_FSOP_GEOM_FLAGS_SPINODES 0x40000 /* sparse inode chunks */
#define XFS_FSOP_GEOM_FLAGS_RMAPBT 0x80000 /* Reverse mapping btree */
/*
* Minimum and maximum sizes needed for growth checks.
......
......@@ -24,6 +24,7 @@
#include "xfs_bit.h"
#include "xfs_sb.h"
#include "xfs_mount.h"
#include "xfs_defer.h"
#include "xfs_inode.h"
#include "xfs_btree.h"
#include "xfs_ialloc.h"
......@@ -39,6 +40,7 @@
#include "xfs_icache.h"
#include "xfs_trace.h"
#include "xfs_log.h"
#include "xfs_rmap.h"
/*
......@@ -614,6 +616,7 @@ xfs_ialloc_ag_alloc(
args.tp = tp;
args.mp = tp->t_mountp;
args.fsbno = NULLFSBLOCK;
xfs_rmap_ag_owner(&args.oinfo, XFS_RMAP_OWN_INODES);
#ifdef DEBUG
/* randomly do sparse inode allocations */
......@@ -1817,19 +1820,21 @@ xfs_difree_inode_chunk(
struct xfs_mount *mp,
xfs_agnumber_t agno,
struct xfs_inobt_rec_incore *rec,
struct xfs_bmap_free *flist)
struct xfs_defer_ops *dfops)
{
xfs_agblock_t sagbno = XFS_AGINO_TO_AGBNO(mp, rec->ir_startino);
int startidx, endidx;
int nextbit;
xfs_agblock_t agbno;
int contigblk;
struct xfs_owner_info oinfo;
DECLARE_BITMAP(holemask, XFS_INOBT_HOLEMASK_BITS);
xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_INODES);
if (!xfs_inobt_issparse(rec->ir_holemask)) {
/* not sparse, calculate extent info directly */
xfs_bmap_add_free(mp, flist, XFS_AGB_TO_FSB(mp, agno, sagbno),
mp->m_ialloc_blks);
xfs_bmap_add_free(mp, dfops, XFS_AGB_TO_FSB(mp, agno, sagbno),
mp->m_ialloc_blks, &oinfo);
return;
}
......@@ -1872,8 +1877,8 @@ xfs_difree_inode_chunk(
ASSERT(agbno % mp->m_sb.sb_spino_align == 0);
ASSERT(contigblk % mp->m_sb.sb_spino_align == 0);
xfs_bmap_add_free(mp, flist, XFS_AGB_TO_FSB(mp, agno, agbno),
contigblk);
xfs_bmap_add_free(mp, dfops, XFS_AGB_TO_FSB(mp, agno, agbno),
contigblk, &oinfo);
/* reset range to current bit and carry on... */
startidx = endidx = nextbit;
......@@ -1889,7 +1894,7 @@ xfs_difree_inobt(
struct xfs_trans *tp,
struct xfs_buf *agbp,
xfs_agino_t agino,
struct xfs_bmap_free *flist,
struct xfs_defer_ops *dfops,
struct xfs_icluster *xic,
struct xfs_inobt_rec_incore *orec)
{
......@@ -1976,7 +1981,7 @@ xfs_difree_inobt(
goto error0;
}
xfs_difree_inode_chunk(mp, agno, &rec, flist);
xfs_difree_inode_chunk(mp, agno, &rec, dfops);
} else {
xic->deleted = 0;
......@@ -2121,7 +2126,7 @@ int
xfs_difree(
struct xfs_trans *tp, /* transaction pointer */
xfs_ino_t inode, /* inode to be freed */
struct xfs_bmap_free *flist, /* extents to free */
struct xfs_defer_ops *dfops, /* extents to free */
struct xfs_icluster *xic) /* cluster info if deleted */
{
/* REFERENCED */
......@@ -2173,7 +2178,7 @@ xfs_difree(
/*
* Fix up the inode allocation btree.
*/
error = xfs_difree_inobt(mp, tp, agbp, agino, flist, xic, &rec);
error = xfs_difree_inobt(mp, tp, agbp, agino, dfops, xic, &rec);
if (error)
goto error0;
......
......@@ -95,7 +95,7 @@ int /* error */
xfs_difree(
struct xfs_trans *tp, /* transaction pointer */
xfs_ino_t inode, /* inode to be freed */
struct xfs_bmap_free *flist, /* extents to free */
struct xfs_defer_ops *dfops, /* extents to free */
struct xfs_icluster *ifree); /* cluster info if deleted */
/*
......
......@@ -32,6 +32,7 @@
#include "xfs_trace.h"
#include "xfs_cksum.h"
#include "xfs_trans.h"
#include "xfs_rmap.h"
STATIC int
......@@ -96,6 +97,7 @@ xfs_inobt_alloc_block(
memset(&args, 0, sizeof(args));
args.tp = cur->bc_tp;
args.mp = cur->bc_mp;
xfs_rmap_ag_owner(&args.oinfo, XFS_RMAP_OWN_INOBT);
args.fsbno = XFS_AGB_TO_FSB(args.mp, cur->bc_private.a.agno, sbno);
args.minlen = 1;
args.maxlen = 1;
......@@ -125,8 +127,12 @@ xfs_inobt_free_block(
struct xfs_btree_cur *cur,
struct xfs_buf *bp)
{
struct xfs_owner_info oinfo;
xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_INOBT);
return xfs_free_extent(cur->bc_tp,
XFS_DADDR_TO_FSB(cur->bc_mp, XFS_BUF_ADDR(bp)), 1);
XFS_DADDR_TO_FSB(cur->bc_mp, XFS_BUF_ADDR(bp)), 1,
&oinfo);
}
STATIC int
......@@ -145,14 +151,6 @@ xfs_inobt_init_key_from_rec(
key->inobt.ir_startino = rec->inobt.ir_startino;
}
STATIC void
xfs_inobt_init_rec_from_key(
union xfs_btree_key *key,
union xfs_btree_rec *rec)
{
rec->inobt.ir_startino = key->inobt.ir_startino;
}
STATIC void
xfs_inobt_init_rec_from_cur(
struct xfs_btree_cur *cur,
......@@ -314,7 +312,6 @@ static const struct xfs_btree_ops xfs_inobt_ops = {
.get_minrecs = xfs_inobt_get_minrecs,
.get_maxrecs = xfs_inobt_get_maxrecs,
.init_key_from_rec = xfs_inobt_init_key_from_rec,
.init_rec_from_key = xfs_inobt_init_rec_from_key,
.init_rec_from_cur = xfs_inobt_init_rec_from_cur,
.init_ptr_from_cur = xfs_inobt_init_ptr_from_cur,
.key_diff = xfs_inobt_key_diff,
......@@ -336,7 +333,6 @@ static const struct xfs_btree_ops xfs_finobt_ops = {
.get_minrecs = xfs_inobt_get_minrecs,
.get_maxrecs = xfs_inobt_get_maxrecs,
.init_key_from_rec = xfs_inobt_init_key_from_rec,
.init_rec_from_key = xfs_inobt_init_rec_from_key,
.init_rec_from_cur = xfs_inobt_init_rec_from_cur,
.init_ptr_from_cur = xfs_finobt_init_ptr_from_cur,
.key_diff = xfs_inobt_key_diff,
......
......@@ -22,6 +22,7 @@
#include "xfs_log_format.h"
#include "xfs_trans_resv.h"
#include "xfs_mount.h"
#include "xfs_defer.h"
#include "xfs_inode.h"
#include "xfs_error.h"
#include "xfs_cksum.h"
......
......@@ -110,7 +110,9 @@ static inline uint xlog_get_cycle(char *ptr)
#define XLOG_REG_TYPE_COMMIT 18
#define XLOG_REG_TYPE_TRANSHDR 19
#define XLOG_REG_TYPE_ICREATE 20
#define XLOG_REG_TYPE_MAX 20
#define XLOG_REG_TYPE_RUI_FORMAT 21
#define XLOG_REG_TYPE_RUD_FORMAT 22
#define XLOG_REG_TYPE_MAX 22
/*
* Flags to log operation header
......@@ -227,6 +229,8 @@ typedef struct xfs_trans_header {
#define XFS_LI_DQUOT 0x123d
#define XFS_LI_QUOTAOFF 0x123e
#define XFS_LI_ICREATE 0x123f
#define XFS_LI_RUI 0x1240 /* rmap update intent */
#define XFS_LI_RUD 0x1241
#define XFS_LI_TYPE_DESC \
{ XFS_LI_EFI, "XFS_LI_EFI" }, \
......@@ -236,7 +240,9 @@ typedef struct xfs_trans_header {
{ XFS_LI_BUF, "XFS_LI_BUF" }, \
{ XFS_LI_DQUOT, "XFS_LI_DQUOT" }, \
{ XFS_LI_QUOTAOFF, "XFS_LI_QUOTAOFF" }, \
{ XFS_LI_ICREATE, "XFS_LI_ICREATE" }
{ XFS_LI_ICREATE, "XFS_LI_ICREATE" }, \
{ XFS_LI_RUI, "XFS_LI_RUI" }, \
{ XFS_LI_RUD, "XFS_LI_RUD" }
/*
* Inode Log Item Format definitions.
......@@ -603,6 +609,59 @@ typedef struct xfs_efd_log_format_64 {
xfs_extent_64_t efd_extents[1]; /* array of extents freed */
} xfs_efd_log_format_64_t;
/*
* RUI/RUD (reverse mapping) log format definitions
*/
struct xfs_map_extent {
__uint64_t me_owner;
__uint64_t me_startblock;
__uint64_t me_startoff;
__uint32_t me_len;
__uint32_t me_flags;
};
/* rmap me_flags: upper bits are flags, lower byte is type code */
#define XFS_RMAP_EXTENT_MAP 1
#define XFS_RMAP_EXTENT_UNMAP 3
#define XFS_RMAP_EXTENT_CONVERT 5
#define XFS_RMAP_EXTENT_ALLOC 7
#define XFS_RMAP_EXTENT_FREE 8
#define XFS_RMAP_EXTENT_TYPE_MASK 0xFF
#define XFS_RMAP_EXTENT_ATTR_FORK (1U << 31)
#define XFS_RMAP_EXTENT_BMBT_BLOCK (1U << 30)
#define XFS_RMAP_EXTENT_UNWRITTEN (1U << 29)
#define XFS_RMAP_EXTENT_FLAGS (XFS_RMAP_EXTENT_TYPE_MASK | \
XFS_RMAP_EXTENT_ATTR_FORK | \
XFS_RMAP_EXTENT_BMBT_BLOCK | \
XFS_RMAP_EXTENT_UNWRITTEN)
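The me_flags word packs a type code into the low byte and state flags into the top bits. The following sketch shows one way to split and validate such a word in userspace. The function names `map_extent_type` and `map_extent_flags_valid` are hypothetical; only the constants come from the definitions above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Constants copied from the definitions above. */
#define XFS_RMAP_EXTENT_MAP		1
#define XFS_RMAP_EXTENT_UNMAP		3
#define XFS_RMAP_EXTENT_TYPE_MASK	0xFF
#define XFS_RMAP_EXTENT_ATTR_FORK	(1U << 31)
#define XFS_RMAP_EXTENT_BMBT_BLOCK	(1U << 30)
#define XFS_RMAP_EXTENT_UNWRITTEN	(1U << 29)
#define XFS_RMAP_EXTENT_FLAGS		(XFS_RMAP_EXTENT_TYPE_MASK | \
					 XFS_RMAP_EXTENT_ATTR_FORK | \
					 XFS_RMAP_EXTENT_BMBT_BLOCK | \
					 XFS_RMAP_EXTENT_UNWRITTEN)

/* Hypothetical helper: extract the type code from the low byte. */
static unsigned int map_extent_type(uint32_t me_flags)
{
	return me_flags & XFS_RMAP_EXTENT_TYPE_MASK;
}

/* Hypothetical helper: reject words that set bits outside the known set. */
static bool map_extent_flags_valid(uint32_t me_flags)
{
	return (me_flags & ~XFS_RMAP_EXTENT_FLAGS) == 0;
}
```

For example, an unmap of an attribute-fork extent would be encoded as `XFS_RMAP_EXTENT_UNMAP | XFS_RMAP_EXTENT_ATTR_FORK`, and bits 8 through 28 are expected to stay clear.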
/*
* This is the structure used to lay out an rui log item in the
* log. The rui_extents field is a variable size array whose
* size is given by rui_nextents.
*/
struct xfs_rui_log_format {
__uint16_t rui_type; /* rui log item type */
__uint16_t rui_size; /* size of this item */
__uint32_t rui_nextents; /* # extents to rmap */
__uint64_t rui_id; /* rui identifier */
struct xfs_map_extent rui_extents[1]; /* array of extents to rmap */
};
/*
* This is the structure used to lay out an rud log item in the
* log. Unlike the rui, it carries no extent array; rud_rui_id
* holds the id of the corresponding rui.
*/
struct xfs_rud_log_format {
__uint16_t rud_type; /* rud log item type */
__uint16_t rud_size; /* size of this item */
__uint32_t __pad;
__uint64_t rud_rui_id; /* id of corresponding rui */
};
/*
* Dquot Log format definitions.
*
......
/*
* Copyright (C) 2016 Oracle. All Rights Reserved.
*
* Author: Darrick J. Wong <darrick.wong@oracle.com>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version 2
* of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it would be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
*/
#ifndef __XFS_RMAP_H__
#define __XFS_RMAP_H__
static inline void
xfs_rmap_ag_owner(
struct xfs_owner_info *oi,
uint64_t owner)
{
oi->oi_owner = owner;
oi->oi_offset = 0;
oi->oi_flags = 0;
}
static inline void
xfs_rmap_ino_bmbt_owner(
struct xfs_owner_info *oi,
xfs_ino_t ino,
int whichfork)
{
oi->oi_owner = ino;
oi->oi_offset = 0;
oi->oi_flags = XFS_OWNER_INFO_BMBT_BLOCK;
if (whichfork == XFS_ATTR_FORK)
oi->oi_flags |= XFS_OWNER_INFO_ATTR_FORK;
}
static inline void
xfs_rmap_ino_owner(
struct xfs_owner_info *oi,
xfs_ino_t ino,
int whichfork,
xfs_fileoff_t offset)
{
oi->oi_owner = ino;
oi->oi_offset = offset;
oi->oi_flags = 0;
if (whichfork == XFS_ATTR_FORK)
oi->oi_flags |= XFS_OWNER_INFO_ATTR_FORK;
}
static inline void
xfs_rmap_skip_owner_update(
struct xfs_owner_info *oi)
{
oi->oi_owner = XFS_RMAP_OWN_UNKNOWN;
}
/* Reverse mapping functions. */
struct xfs_buf;
static inline __u64
xfs_rmap_irec_offset_pack(
const struct xfs_rmap_irec *irec)
{
__u64 x;
x = XFS_RMAP_OFF(irec->rm_offset);
if (irec->rm_flags & XFS_RMAP_ATTR_FORK)
x |= XFS_RMAP_OFF_ATTR_FORK;
if (irec->rm_flags & XFS_RMAP_BMBT_BLOCK)
x |= XFS_RMAP_OFF_BMBT_BLOCK;
if (irec->rm_flags & XFS_RMAP_UNWRITTEN)
x |= XFS_RMAP_OFF_UNWRITTEN;
return x;
}
static inline int
xfs_rmap_irec_offset_unpack(
__u64 offset,
struct xfs_rmap_irec *irec)
{
if (offset & ~(XFS_RMAP_OFF_MASK | XFS_RMAP_OFF_FLAGS))
return -EFSCORRUPTED;
irec->rm_offset = XFS_RMAP_OFF(offset);
if (offset & XFS_RMAP_OFF_ATTR_FORK)
irec->rm_flags |= XFS_RMAP_ATTR_FORK;
if (offset & XFS_RMAP_OFF_BMBT_BLOCK)
irec->rm_flags |= XFS_RMAP_BMBT_BLOCK;
if (offset & XFS_RMAP_OFF_UNWRITTEN)
irec->rm_flags |= XFS_RMAP_UNWRITTEN;
return 0;
}
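The pack and unpack helpers above move three state flags between the in-core rm_flags field and the top bits of the 64-bit on-disk offset. The standalone sketch below mirrors that round trip in userspace. The struct `irec`, the shortened constant names, and the -1 error return are simplifications for illustration; unlike the helper above, this version clears the flags before setting them, so the caller does not need to pre-zero the record.

```c
#include <assert.h>
#include <stdint.h>

/* On-disk offset layout: bits 63..61 are flags, bits 53..0 the offset. */
#define OFF_ATTR_FORK	(1ULL << 63)
#define OFF_BMBT_BLOCK	(1ULL << 62)
#define OFF_UNWRITTEN	(1ULL << 61)
#define OFF_FLAGS	(OFF_ATTR_FORK | OFF_BMBT_BLOCK | OFF_UNWRITTEN)
#define OFF_MASK	0x3FFFFFFFFFFFFFULL	/* low 54 bits */

/* In-core flag bits, as in XFS_RMAP_ATTR_FORK and friends. */
#define F_ATTR_FORK	(1u << 0)
#define F_BMBT_BLOCK	(1u << 1)
#define F_UNWRITTEN	(1u << 2)

/* Simplified stand-in for the offset-related part of xfs_rmap_irec. */
struct irec {
	uint64_t	offset;
	unsigned int	flags;
};

static uint64_t offset_pack(const struct irec *r)
{
	uint64_t x = r->offset & OFF_MASK;

	if (r->flags & F_ATTR_FORK)
		x |= OFF_ATTR_FORK;
	if (r->flags & F_BMBT_BLOCK)
		x |= OFF_BMBT_BLOCK;
	if (r->flags & F_UNWRITTEN)
		x |= OFF_UNWRITTEN;
	return x;
}

/* Returns 0 on success, -1 if any of the unused bits 54..60 is set. */
static int offset_unpack(uint64_t disk, struct irec *r)
{
	if (disk & ~(OFF_MASK | OFF_FLAGS))
		return -1;
	r->offset = disk & OFF_MASK;
	r->flags = 0;	/* sketch-only: start from a clean flag word */
	if (disk & OFF_ATTR_FORK)
		r->flags |= F_ATTR_FORK;
	if (disk & OFF_BMBT_BLOCK)
		r->flags |= F_BMBT_BLOCK;
	if (disk & OFF_UNWRITTEN)
		r->flags |= F_UNWRITTEN;
	return 0;
}
```

Packing offset 100 with the attr-fork and unwritten flags sets bits 63 and 61 on top of the value 100, and unpacking that word restores the original record.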
static inline void
xfs_owner_info_unpack(
struct xfs_owner_info *oinfo,
uint64_t *owner,
uint64_t *offset,
unsigned int *flags)
{
unsigned int r = 0;
*owner = oinfo->oi_owner;
*offset = oinfo->oi_offset;
if (oinfo->oi_flags & XFS_OWNER_INFO_ATTR_FORK)
r |= XFS_RMAP_ATTR_FORK;
if (oinfo->oi_flags & XFS_OWNER_INFO_BMBT_BLOCK)
r |= XFS_RMAP_BMBT_BLOCK;
*flags = r;
}
static inline void
xfs_owner_info_pack(
struct xfs_owner_info *oinfo,
uint64_t owner,
uint64_t offset,
unsigned int flags)
{
oinfo->oi_owner = owner;
oinfo->oi_offset = XFS_RMAP_OFF(offset);
oinfo->oi_flags = 0;
if (flags & XFS_RMAP_ATTR_FORK)
oinfo->oi_flags |= XFS_OWNER_INFO_ATTR_FORK;
if (flags & XFS_RMAP_BMBT_BLOCK)
oinfo->oi_flags |= XFS_OWNER_INFO_BMBT_BLOCK;
}
int xfs_rmap_alloc(struct xfs_trans *tp, struct xfs_buf *agbp,
xfs_agnumber_t agno, xfs_agblock_t bno, xfs_extlen_t len,
struct xfs_owner_info *oinfo);
int xfs_rmap_free(struct xfs_trans *tp, struct xfs_buf *agbp,
xfs_agnumber_t agno, xfs_agblock_t bno, xfs_extlen_t len,
struct xfs_owner_info *oinfo);
int xfs_rmap_lookup_le(struct xfs_btree_cur *cur, xfs_agblock_t bno,
xfs_extlen_t len, uint64_t owner, uint64_t offset,
unsigned int flags, int *stat);
int xfs_rmap_lookup_eq(struct xfs_btree_cur *cur, xfs_agblock_t bno,
xfs_extlen_t len, uint64_t owner, uint64_t offset,
unsigned int flags, int *stat);
int xfs_rmap_insert(struct xfs_btree_cur *rcur, xfs_agblock_t agbno,
xfs_extlen_t len, uint64_t owner, uint64_t offset,
unsigned int flags);
int xfs_rmap_get_rec(struct xfs_btree_cur *cur, struct xfs_rmap_irec *irec,
int *stat);
typedef int (*xfs_rmap_query_range_fn)(
struct xfs_btree_cur *cur,
struct xfs_rmap_irec *rec,
void *priv);
int xfs_rmap_query_range(struct xfs_btree_cur *cur,
struct xfs_rmap_irec *low_rec, struct xfs_rmap_irec *high_rec,
xfs_rmap_query_range_fn fn, void *priv);
enum xfs_rmap_intent_type {
XFS_RMAP_MAP,
XFS_RMAP_MAP_SHARED,
XFS_RMAP_UNMAP,
XFS_RMAP_UNMAP_SHARED,
XFS_RMAP_CONVERT,
XFS_RMAP_CONVERT_SHARED,
XFS_RMAP_ALLOC,
XFS_RMAP_FREE,
};
struct xfs_rmap_intent {
struct list_head ri_list;
enum xfs_rmap_intent_type ri_type;
__uint64_t ri_owner;
int ri_whichfork;
struct xfs_bmbt_irec ri_bmap;
};
/* functions for updating the rmapbt based on bmbt map/unmap operations */
int xfs_rmap_map_extent(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
struct xfs_inode *ip, int whichfork,
struct xfs_bmbt_irec *imap);
int xfs_rmap_unmap_extent(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
struct xfs_inode *ip, int whichfork,
struct xfs_bmbt_irec *imap);
int xfs_rmap_convert_extent(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
struct xfs_inode *ip, int whichfork,
struct xfs_bmbt_irec *imap);
int xfs_rmap_alloc_extent(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
xfs_agnumber_t agno, xfs_agblock_t bno, xfs_extlen_t len,
__uint64_t owner);
int xfs_rmap_free_extent(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
xfs_agnumber_t agno, xfs_agblock_t bno, xfs_extlen_t len,
__uint64_t owner);
void xfs_rmap_finish_one_cleanup(struct xfs_trans *tp,
struct xfs_btree_cur *rcur, int error);
int xfs_rmap_finish_one(struct xfs_trans *tp, enum xfs_rmap_intent_type type,
__uint64_t owner, int whichfork, xfs_fileoff_t startoff,
xfs_fsblock_t startblock, xfs_filblks_t blockcount,
xfs_exntst_t state, struct xfs_btree_cur **pcur);
#endif /* __XFS_RMAP_H__ */
/*
* Copyright (c) 2014 Red Hat, Inc.
* All Rights Reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it would be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include "xfs.h"
#include "xfs_fs.h"
#include "xfs_shared.h"
#include "xfs_format.h"
#include "xfs_log_format.h"
#include "xfs_trans_resv.h"
#include "xfs_bit.h"
#include "xfs_sb.h"
#include "xfs_mount.h"
#include "xfs_defer.h"
#include "xfs_inode.h"
#include "xfs_trans.h"
#include "xfs_alloc.h"
#include "xfs_btree.h"
#include "xfs_rmap.h"
#include "xfs_rmap_btree.h"
#include "xfs_trace.h"
#include "xfs_cksum.h"
#include "xfs_error.h"
#include "xfs_extent_busy.h"
/*
* Reverse map btree.
*
* This is a per-ag tree used to track the owner(s) of a given extent. With
* reflink it is possible for there to be multiple owners, which is a departure
* from classic XFS. Owner records for data extents are inserted when the
* extent is mapped and removed when an extent is unmapped. Owner records for
* all other block types (i.e. metadata) are inserted when an extent is
* allocated and removed when an extent is freed. There can only be one owner
* of a metadata extent, usually an inode or some other metadata structure like
* an AG btree.
*
* The rmap btree is part of the free space management, so blocks for the tree
* are sourced from the agfl. Hence we need transaction reservation support for
* this tree so that the freelist is always large enough. This also impacts on
* the minimum space we need to leave free in the AG.
*
* The tree is ordered by [ag block, owner, offset]. This is a large key size,
* but it is the only way to enforce unique keys when a block can be owned by
* multiple files at any offset. There's no need to order/search by extent
* size for online updating/management of the tree. It is intended that most
* reverse lookups will be to find the owner(s) of a particular block, or to
* try to recover tree and file data from corrupt primary metadata.
*/
static struct xfs_btree_cur *
xfs_rmapbt_dup_cursor(
struct xfs_btree_cur *cur)
{
return xfs_rmapbt_init_cursor(cur->bc_mp, cur->bc_tp,
cur->bc_private.a.agbp, cur->bc_private.a.agno);
}
STATIC void
xfs_rmapbt_set_root(
struct xfs_btree_cur *cur,
union xfs_btree_ptr *ptr,
int inc)
{
struct xfs_buf *agbp = cur->bc_private.a.agbp;
struct xfs_agf *agf = XFS_BUF_TO_AGF(agbp);
xfs_agnumber_t seqno = be32_to_cpu(agf->agf_seqno);
int btnum = cur->bc_btnum;
struct xfs_perag *pag = xfs_perag_get(cur->bc_mp, seqno);
ASSERT(ptr->s != 0);
agf->agf_roots[btnum] = ptr->s;
be32_add_cpu(&agf->agf_levels[btnum], inc);
pag->pagf_levels[btnum] += inc;
xfs_perag_put(pag);
xfs_alloc_log_agf(cur->bc_tp, agbp, XFS_AGF_ROOTS | XFS_AGF_LEVELS);
}
STATIC int
xfs_rmapbt_alloc_block(
struct xfs_btree_cur *cur,
union xfs_btree_ptr *start,
union xfs_btree_ptr *new,
int *stat)
{
int error;
xfs_agblock_t bno;
XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY);
/* Allocate the new block from the freelist. If we can't, give up. */
error = xfs_alloc_get_freelist(cur->bc_tp, cur->bc_private.a.agbp,
&bno, 1);
if (error) {
XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR);
return error;
}
trace_xfs_rmapbt_alloc_block(cur->bc_mp, cur->bc_private.a.agno,
bno, 1);
if (bno == NULLAGBLOCK) {
XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT);
*stat = 0;
return 0;
}
xfs_extent_busy_reuse(cur->bc_mp, cur->bc_private.a.agno, bno, 1,
false);
xfs_trans_agbtree_delta(cur->bc_tp, 1);
new->s = cpu_to_be32(bno);
XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT);
*stat = 1;
return 0;
}
STATIC int
xfs_rmapbt_free_block(
struct xfs_btree_cur *cur,
struct xfs_buf *bp)
{
struct xfs_buf *agbp = cur->bc_private.a.agbp;
struct xfs_agf *agf = XFS_BUF_TO_AGF(agbp);
xfs_agblock_t bno;
int error;
bno = xfs_daddr_to_agbno(cur->bc_mp, XFS_BUF_ADDR(bp));
trace_xfs_rmapbt_free_block(cur->bc_mp, cur->bc_private.a.agno,
bno, 1);
error = xfs_alloc_put_freelist(cur->bc_tp, agbp, NULL, bno, 1);
if (error)
return error;
xfs_extent_busy_insert(cur->bc_tp, be32_to_cpu(agf->agf_seqno), bno, 1,
XFS_EXTENT_BUSY_SKIP_DISCARD);
xfs_trans_agbtree_delta(cur->bc_tp, -1);
return 0;
}
STATIC int
xfs_rmapbt_get_minrecs(
struct xfs_btree_cur *cur,
int level)
{
return cur->bc_mp->m_rmap_mnr[level != 0];
}
STATIC int
xfs_rmapbt_get_maxrecs(
struct xfs_btree_cur *cur,
int level)
{
return cur->bc_mp->m_rmap_mxr[level != 0];
}
STATIC void
xfs_rmapbt_init_key_from_rec(
union xfs_btree_key *key,
union xfs_btree_rec *rec)
{
key->rmap.rm_startblock = rec->rmap.rm_startblock;
key->rmap.rm_owner = rec->rmap.rm_owner;
key->rmap.rm_offset = rec->rmap.rm_offset;
}
/*
* The high key for a reverse mapping record can be computed by shifting
* the startblock and offset to the highest value that would still map
* to that record. In practice this means that we add blockcount-1 to
* the startblock for all records, and if the record is for a data/attr
* fork mapping, we add blockcount-1 to the offset too.
*/
STATIC void
xfs_rmapbt_init_high_key_from_rec(
union xfs_btree_key *key,
union xfs_btree_rec *rec)
{
__uint64_t off;
int adj;
adj = be32_to_cpu(rec->rmap.rm_blockcount) - 1;
key->rmap.rm_startblock = rec->rmap.rm_startblock;
be32_add_cpu(&key->rmap.rm_startblock, adj);
key->rmap.rm_owner = rec->rmap.rm_owner;
key->rmap.rm_offset = rec->rmap.rm_offset;
if (XFS_RMAP_NON_INODE_OWNER(be64_to_cpu(rec->rmap.rm_owner)) ||
XFS_RMAP_IS_BMBT_BLOCK(be64_to_cpu(rec->rmap.rm_offset)))
return;
off = be64_to_cpu(key->rmap.rm_offset);
off = (XFS_RMAP_OFF(off) + adj) | (off & ~XFS_RMAP_OFF_MASK);
key->rmap.rm_offset = cpu_to_be64(off);
}
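The high-key rule described in the comment above can be tried out in host byte order without the be32/be64 conversions. The sketch below is an illustration only: the `rmap_rec_host` and `rmap_key_host` structs and the function name `rmap_high_key` are hypothetical, and the offset field is assumed to hold the packed on-disk value with flags in its top bits.

```c
#include <assert.h>
#include <stdint.h>

#define OFF_BMBT_BLOCK	(1ULL << 62)
#define OFF_MASK	0x3FFFFFFFFFFFFFULL	/* low 54 bits */

/* Host-endian stand-ins for the on-disk record and key (assumptions). */
struct rmap_rec_host {
	uint32_t	startblock;
	uint32_t	blockcount;
	uint64_t	owner;
	uint64_t	offset;		/* packed: flags in the top bits */
};

struct rmap_key_host {
	uint32_t	startblock;
	uint64_t	owner;
	uint64_t	offset;
};

static struct rmap_key_host rmap_high_key(const struct rmap_rec_host *rec)
{
	struct rmap_key_host	key;
	uint32_t		adj = rec->blockcount - 1;

	/* The last block covered by the record. */
	key.startblock = rec->startblock + adj;
	key.owner = rec->owner;
	key.offset = rec->offset;

	/* Metadata owners and bmbt blocks have no meaningful file offset. */
	if ((rec->owner & (1ULL << 63)) || (rec->offset & OFF_BMBT_BLOCK))
		return key;

	/* Advance the offset bits only and keep the flag bits as they are. */
	key.offset = ((rec->offset & OFF_MASK) + adj) |
		     (rec->offset & ~OFF_MASK);
	return key;
}
```

For a 5-block file extent at AG block 10 and file offset 100, the high key is block 14 at offset 104. For an extent owned by XFS_RMAP_OWN_AG, only the start block moves.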
STATIC void
xfs_rmapbt_init_rec_from_cur(
struct xfs_btree_cur *cur,
union xfs_btree_rec *rec)
{
rec->rmap.rm_startblock = cpu_to_be32(cur->bc_rec.r.rm_startblock);
rec->rmap.rm_blockcount = cpu_to_be32(cur->bc_rec.r.rm_blockcount);
rec->rmap.rm_owner = cpu_to_be64(cur->bc_rec.r.rm_owner);
rec->rmap.rm_offset = cpu_to_be64(
xfs_rmap_irec_offset_pack(&cur->bc_rec.r));
}
STATIC void
xfs_rmapbt_init_ptr_from_cur(
struct xfs_btree_cur *cur,
union xfs_btree_ptr *ptr)
{
struct xfs_agf *agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp);
ASSERT(cur->bc_private.a.agno == be32_to_cpu(agf->agf_seqno));
ASSERT(agf->agf_roots[cur->bc_btnum] != 0);
ptr->s = agf->agf_roots[cur->bc_btnum];
}
STATIC __int64_t
xfs_rmapbt_key_diff(
struct xfs_btree_cur *cur,
union xfs_btree_key *key)
{
struct xfs_rmap_irec *rec = &cur->bc_rec.r;
struct xfs_rmap_key *kp = &key->rmap;
__u64 x, y;
__int64_t d;
d = (__int64_t)be32_to_cpu(kp->rm_startblock) - rec->rm_startblock;
if (d)
return d;
x = be64_to_cpu(kp->rm_owner);
y = rec->rm_owner;
if (x > y)
return 1;
else if (y > x)
return -1;
x = XFS_RMAP_OFF(be64_to_cpu(kp->rm_offset));
y = rec->rm_offset;
if (x > y)
return 1;
else if (y > x)
return -1;
return 0;
}
STATIC __int64_t
xfs_rmapbt_diff_two_keys(
struct xfs_btree_cur *cur,
union xfs_btree_key *k1,
union xfs_btree_key *k2)
{
struct xfs_rmap_key *kp1 = &k1->rmap;
struct xfs_rmap_key *kp2 = &k2->rmap;
__int64_t d;
__u64 x, y;
d = (__int64_t)be32_to_cpu(kp1->rm_startblock) -
be32_to_cpu(kp2->rm_startblock);
if (d)
return d;
x = be64_to_cpu(kp1->rm_owner);
y = be64_to_cpu(kp2->rm_owner);
if (x > y)
return 1;
else if (y > x)
return -1;
x = XFS_RMAP_OFF(be64_to_cpu(kp1->rm_offset));
y = XFS_RMAP_OFF(be64_to_cpu(kp2->rm_offset));
if (x > y)
return 1;
else if (y > x)
return -1;
return 0;
}
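Both comparison functions above order keys by start block first, then owner, then the offset with its flag bits masked off. A host-endian sketch of the same three-level comparison follows. The struct and function names are hypothetical, and the endian conversions are left out.

```c
#include <assert.h>
#include <stdint.h>

#define OFF_MASK	0x3FFFFFFFFFFFFFULL	/* low 54 bits */

/* Host-endian stand-in for struct xfs_rmap_key (assumption). */
struct rmap_key_host {
	uint32_t	startblock;
	uint64_t	owner;
	uint64_t	offset;		/* packed: flags in the top bits */
};

/* Negative if a sorts before b, positive if after, zero if equal. */
static int64_t rmap_key_cmp(const struct rmap_key_host *a,
			    const struct rmap_key_host *b)
{
	int64_t		d;
	uint64_t	x, y;

	/* 32-bit block numbers always fit in a signed 64-bit difference. */
	d = (int64_t)a->startblock - (int64_t)b->startblock;
	if (d)
		return d;

	/* Owners are full 64-bit values, so compare rather than subtract. */
	if (a->owner != b->owner)
		return a->owner > b->owner ? 1 : -1;

	/* Flag bits do not take part in the ordering. */
	x = a->offset & OFF_MASK;
	y = b->offset & OFF_MASK;
	if (x != y)
		return x > y ? 1 : -1;
	return 0;
}
```

The explicit comparison matters for the owner field: special owners have bit 63 set, so a signed subtraction such as `(int64_t)(a->owner - b->owner)` would sort XFS_RMAP_OWN_AG below inode 128 even though it is the larger unsigned value.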
static bool
xfs_rmapbt_verify(
struct xfs_buf *bp)
{
struct xfs_mount *mp = bp->b_target->bt_mount;
struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp);
struct xfs_perag *pag = bp->b_pag;
unsigned int level;
/*
* magic number and level verification
*
* During growfs operations, we can't verify the exact level or owner as
* the perag is not fully initialised and hence not attached to the
* buffer. In this case, check against the maximum tree depth.
*
* Similarly, during log recovery we will have a perag structure
* attached, but the agf information will not yet have been initialised
* from the on disk AGF. Again, we can only check against maximum limits
* in this case.
*/
if (block->bb_magic != cpu_to_be32(XFS_RMAP_CRC_MAGIC))
return false;
if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
return false;
if (!xfs_btree_sblock_v5hdr_verify(bp))
return false;
level = be16_to_cpu(block->bb_level);
if (pag && pag->pagf_init) {
if (level >= pag->pagf_levels[XFS_BTNUM_RMAPi])
return false;
} else if (level >= mp->m_rmap_maxlevels)
return false;
return xfs_btree_sblock_verify(bp, mp->m_rmap_mxr[level != 0]);
}
static void
xfs_rmapbt_read_verify(
struct xfs_buf *bp)
{
if (!xfs_btree_sblock_verify_crc(bp))
xfs_buf_ioerror(bp, -EFSBADCRC);
else if (!xfs_rmapbt_verify(bp))
xfs_buf_ioerror(bp, -EFSCORRUPTED);
if (bp->b_error) {
trace_xfs_btree_corrupt(bp, _RET_IP_);
xfs_verifier_error(bp);
}
}
static void
xfs_rmapbt_write_verify(
struct xfs_buf *bp)
{
if (!xfs_rmapbt_verify(bp)) {
trace_xfs_btree_corrupt(bp, _RET_IP_);
xfs_buf_ioerror(bp, -EFSCORRUPTED);
xfs_verifier_error(bp);
return;
}
xfs_btree_sblock_calc_crc(bp);
}
const struct xfs_buf_ops xfs_rmapbt_buf_ops = {
.name = "xfs_rmapbt",
.verify_read = xfs_rmapbt_read_verify,
.verify_write = xfs_rmapbt_write_verify,
};
#if defined(DEBUG) || defined(XFS_WARN)
STATIC int
xfs_rmapbt_keys_inorder(
struct xfs_btree_cur *cur,
union xfs_btree_key *k1,
union xfs_btree_key *k2)
{
__uint32_t x;
__uint32_t y;
__uint64_t a;
__uint64_t b;
x = be32_to_cpu(k1->rmap.rm_startblock);
y = be32_to_cpu(k2->rmap.rm_startblock);
if (x < y)
return 1;
else if (x > y)
return 0;
a = be64_to_cpu(k1->rmap.rm_owner);
b = be64_to_cpu(k2->rmap.rm_owner);
if (a < b)
return 1;
else if (a > b)
return 0;
a = XFS_RMAP_OFF(be64_to_cpu(k1->rmap.rm_offset));
b = XFS_RMAP_OFF(be64_to_cpu(k2->rmap.rm_offset));
if (a <= b)
return 1;
return 0;
}
STATIC int
xfs_rmapbt_recs_inorder(
struct xfs_btree_cur *cur,
union xfs_btree_rec *r1,
union xfs_btree_rec *r2)
{
__uint32_t x;
__uint32_t y;
__uint64_t a;
__uint64_t b;
x = be32_to_cpu(r1->rmap.rm_startblock);
y = be32_to_cpu(r2->rmap.rm_startblock);
if (x < y)
return 1;
else if (x > y)
return 0;
a = be64_to_cpu(r1->rmap.rm_owner);
b = be64_to_cpu(r2->rmap.rm_owner);
if (a < b)
return 1;
else if (a > b)
return 0;
a = XFS_RMAP_OFF(be64_to_cpu(r1->rmap.rm_offset));
b = XFS_RMAP_OFF(be64_to_cpu(r2->rmap.rm_offset));
if (a <= b)
return 1;
return 0;
}
#endif /* DEBUG || XFS_WARN */
static const struct xfs_btree_ops xfs_rmapbt_ops = {
.rec_len = sizeof(struct xfs_rmap_rec),
.key_len = 2 * sizeof(struct xfs_rmap_key),
.dup_cursor = xfs_rmapbt_dup_cursor,
.set_root = xfs_rmapbt_set_root,
.alloc_block = xfs_rmapbt_alloc_block,
.free_block = xfs_rmapbt_free_block,
.get_minrecs = xfs_rmapbt_get_minrecs,
.get_maxrecs = xfs_rmapbt_get_maxrecs,
.init_key_from_rec = xfs_rmapbt_init_key_from_rec,
.init_high_key_from_rec = xfs_rmapbt_init_high_key_from_rec,
.init_rec_from_cur = xfs_rmapbt_init_rec_from_cur,
.init_ptr_from_cur = xfs_rmapbt_init_ptr_from_cur,
.key_diff = xfs_rmapbt_key_diff,
.buf_ops = &xfs_rmapbt_buf_ops,
.diff_two_keys = xfs_rmapbt_diff_two_keys,
#if defined(DEBUG) || defined(XFS_WARN)
.keys_inorder = xfs_rmapbt_keys_inorder,
.recs_inorder = xfs_rmapbt_recs_inorder,
#endif
};
/*
* Allocate a new rmap btree cursor.
*/
struct xfs_btree_cur *
xfs_rmapbt_init_cursor(
struct xfs_mount *mp,
struct xfs_trans *tp,
struct xfs_buf *agbp,
xfs_agnumber_t agno)
{
struct xfs_agf *agf = XFS_BUF_TO_AGF(agbp);
struct xfs_btree_cur *cur;
cur = kmem_zone_zalloc(xfs_btree_cur_zone, KM_NOFS);
cur->bc_tp = tp;
cur->bc_mp = mp;
/* Overlapping btree; 2 keys per pointer. */
cur->bc_btnum = XFS_BTNUM_RMAP;
cur->bc_flags = XFS_BTREE_CRC_BLOCKS | XFS_BTREE_OVERLAPPING;
cur->bc_blocklog = mp->m_sb.sb_blocklog;
cur->bc_ops = &xfs_rmapbt_ops;
cur->bc_nlevels = be32_to_cpu(agf->agf_levels[XFS_BTNUM_RMAP]);
cur->bc_private.a.agbp = agbp;
cur->bc_private.a.agno = agno;
return cur;
}
/*
* Calculate number of records in an rmap btree block.
*/
int
xfs_rmapbt_maxrecs(
struct xfs_mount *mp,
int blocklen,
int leaf)
{
blocklen -= XFS_RMAP_BLOCK_LEN;
if (leaf)
return blocklen / sizeof(struct xfs_rmap_rec);
return blocklen /
(2 * sizeof(struct xfs_rmap_key) + sizeof(xfs_rmap_ptr_t));
}
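As a worked example of the calculation above, the sketch below redoes the arithmetic with plain constants. The sizes are assumptions derived from the structures in this patch: a 24-byte record (4 + 4 + 8 + 8), a 20-byte packed key (4 + 8 + 8), a 4-byte pointer, and a 56-byte short-form CRC block header for XFS_BTREE_SBLOCK_CRC_LEN. The function name `rmapbt_maxrecs_sketch` is hypothetical.

```c
#include <assert.h>

/* Assumed sizes, see the lead-in; none of these names are in the patch. */
#define RMAP_BLOCK_LEN	56	/* short-form btree block header with CRC */
#define RMAP_REC_SIZE	24	/* struct xfs_rmap_rec */
#define RMAP_KEY_SIZE	20	/* struct xfs_rmap_key, packed */
#define RMAP_PTR_SIZE	4	/* xfs_rmap_ptr_t */

static int rmapbt_maxrecs_sketch(int blocklen, int leaf)
{
	blocklen -= RMAP_BLOCK_LEN;
	if (leaf)
		return blocklen / RMAP_REC_SIZE;
	/* Node blocks store a low and a high key for every pointer. */
	return blocklen / (2 * RMAP_KEY_SIZE + RMAP_PTR_SIZE);
}
```

Under these assumptions a 4096-byte block leaves 4040 usable bytes, giving 168 records per leaf block and 91 key pairs plus pointers per node block. Halving those, as xfs_sb_mount_common does for m_rmap_mnr, gives minimums of 84 and 45.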
/* Compute the maximum height of an rmap btree. */
void
xfs_rmapbt_compute_maxlevels(
struct xfs_mount *mp)
{
mp->m_rmap_maxlevels = xfs_btree_compute_maxlevels(mp,
mp->m_rmap_mnr, mp->m_sb.sb_agblocks);
}
/*
* Copyright (c) 2014 Red Hat, Inc.
* All Rights Reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it would be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#ifndef __XFS_RMAP_BTREE_H__
#define __XFS_RMAP_BTREE_H__
struct xfs_buf;
struct xfs_btree_cur;
struct xfs_mount;
/* rmaps only exist on crc enabled filesystems */
#define XFS_RMAP_BLOCK_LEN XFS_BTREE_SBLOCK_CRC_LEN
/*
* Record, key, and pointer address macros for btree blocks.
*
* (note that some of these may appear unused, but they are used in userspace)
*/
#define XFS_RMAP_REC_ADDR(block, index) \
((struct xfs_rmap_rec *) \
((char *)(block) + XFS_RMAP_BLOCK_LEN + \
(((index) - 1) * sizeof(struct xfs_rmap_rec))))
#define XFS_RMAP_KEY_ADDR(block, index) \
((struct xfs_rmap_key *) \
((char *)(block) + XFS_RMAP_BLOCK_LEN + \
((index) - 1) * 2 * sizeof(struct xfs_rmap_key)))
#define XFS_RMAP_HIGH_KEY_ADDR(block, index) \
((struct xfs_rmap_key *) \
((char *)(block) + XFS_RMAP_BLOCK_LEN + \
sizeof(struct xfs_rmap_key) + \
((index) - 1) * 2 * sizeof(struct xfs_rmap_key)))
#define XFS_RMAP_PTR_ADDR(block, index, maxrecs) \
((xfs_rmap_ptr_t *) \
((char *)(block) + XFS_RMAP_BLOCK_LEN + \
(maxrecs) * 2 * sizeof(struct xfs_rmap_key) + \
((index) - 1) * sizeof(xfs_rmap_ptr_t)))
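The address macros above imply a node block layout in which each 1-based index owns a low key immediately followed by its high key, with the pointer array starting after room for maxrecs key pairs. The sketch below computes the same byte offsets with plain integers. The function names are hypothetical, and the 56-byte header and 20-byte key sizes are assumptions about XFS_RMAP_BLOCK_LEN and sizeof(struct xfs_rmap_key).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes; see the lead-in. */
#define RMAP_BLOCK_LEN	56	/* btree block header */
#define RMAP_KEY_SIZE	20	/* packed key */
#define RMAP_PTR_SIZE	4	/* 32-bit AG block pointer */

/* Byte offset of the low key for a 1-based index. */
static size_t rmap_key_off(int index)
{
	return RMAP_BLOCK_LEN + (size_t)(index - 1) * 2 * RMAP_KEY_SIZE;
}

/* The high key sits directly after the low key of the same index. */
static size_t rmap_high_key_off(int index)
{
	return rmap_key_off(index) + RMAP_KEY_SIZE;
}

/* Pointers start after space reserved for maxrecs key pairs. */
static size_t rmap_ptr_off(int index, int maxrecs)
{
	return RMAP_BLOCK_LEN + (size_t)maxrecs * 2 * RMAP_KEY_SIZE +
	       (size_t)(index - 1) * RMAP_PTR_SIZE;
}
```

With maxrecs of 91 (the value these sizes give for a 4096-byte block), the first pointer sits at byte 3696 and the last one ends at byte 4060, inside the block.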
struct xfs_btree_cur *xfs_rmapbt_init_cursor(struct xfs_mount *mp,
struct xfs_trans *tp, struct xfs_buf *bp,
xfs_agnumber_t agno);
int xfs_rmapbt_maxrecs(struct xfs_mount *mp, int blocklen, int leaf);
extern void xfs_rmapbt_compute_maxlevels(struct xfs_mount *mp);
#endif /* __XFS_RMAP_BTREE_H__ */
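The address macros in this header imply a block layout: leaf blocks hold a header followed by records, while node blocks hold a header, then `maxrecs` low/high key pairs, then the pointers. The sketch below restates that arithmetic as byte offsets. The sizes (56-byte header, 24-byte record, 20-byte key, 4-byte pointer) are assumptions for illustration and do not come from this diff; all `layout_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes, for illustration only. */
enum {
	LAYOUT_BLOCK_LEN = 56,	/* stands in for XFS_RMAP_BLOCK_LEN */
	LAYOUT_REC_SIZE  = 24,	/* stands in for sizeof(struct xfs_rmap_rec) */
	LAYOUT_KEY_SIZE  = 20,	/* stands in for sizeof(struct xfs_rmap_key) */
	LAYOUT_PTR_SIZE  = 4,	/* stands in for sizeof(xfs_rmap_ptr_t) */
};

/* Indices are 1-based, as in the macros. */
static size_t layout_rec_off(unsigned int index)
{
	assert(index >= 1);
	return LAYOUT_BLOCK_LEN + (size_t)(index - 1) * LAYOUT_REC_SIZE;
}

/* Low key of entry 'index'; each entry stores a low key then a high key. */
static size_t layout_key_off(unsigned int index)
{
	assert(index >= 1);
	return LAYOUT_BLOCK_LEN + (size_t)(index - 1) * 2 * LAYOUT_KEY_SIZE;
}

/* High key sits one key past the low key of the same entry. */
static size_t layout_high_key_off(unsigned int index)
{
	return layout_key_off(index) + LAYOUT_KEY_SIZE;
}

/* Pointers start after space for maxrecs key pairs. */
static size_t layout_ptr_off(unsigned int index, unsigned int maxrecs)
{
	assert(index >= 1);
	return LAYOUT_BLOCK_LEN + (size_t)maxrecs * 2 * LAYOUT_KEY_SIZE +
	       (size_t)(index - 1) * LAYOUT_PTR_SIZE;
}
```

Because the pointer array is placed after room for `maxrecs` key pairs rather than after the keys currently in use, inserting a key never requires moving the pointers.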
......@@ -24,6 +24,7 @@
#include "xfs_bit.h"
#include "xfs_sb.h"
#include "xfs_mount.h"
#include "xfs_defer.h"
#include "xfs_inode.h"
#include "xfs_ialloc.h"
#include "xfs_alloc.h"
......@@ -36,6 +37,7 @@
#include "xfs_alloc_btree.h"
#include "xfs_ialloc_btree.h"
#include "xfs_log.h"
#include "xfs_rmap_btree.h"
/*
* Physical superblock buffer manipulations. Shared with libxfs in userspace.
......@@ -729,6 +731,11 @@ xfs_sb_mount_common(
mp->m_bmap_dmnr[0] = mp->m_bmap_dmxr[0] / 2;
mp->m_bmap_dmnr[1] = mp->m_bmap_dmxr[1] / 2;
mp->m_rmap_mxr[0] = xfs_rmapbt_maxrecs(mp, sbp->sb_blocksize, 1);
mp->m_rmap_mxr[1] = xfs_rmapbt_maxrecs(mp, sbp->sb_blocksize, 0);
mp->m_rmap_mnr[0] = mp->m_rmap_mxr[0] / 2;
mp->m_rmap_mnr[1] = mp->m_rmap_mxr[1] / 2;
mp->m_bsize = XFS_FSB_TO_BB(mp, 1);
mp->m_ialloc_inos = (int)MAX((__uint16_t)XFS_INODES_PER_CHUNK,
sbp->sb_inopblock);
......@@ -738,6 +745,8 @@ xfs_sb_mount_common(
mp->m_ialloc_min_blks = sbp->sb_spino_align;
else
mp->m_ialloc_min_blks = mp->m_ialloc_blks;
mp->m_alloc_set_aside = xfs_alloc_set_aside(mp);
mp->m_ag_max_usable = xfs_alloc_ag_max_usable(mp);
}
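The body of `xfs_rmapbt_maxrecs` is not visible in this part of the diff. A plausible sketch, consistent with the address macros in `xfs_rmap_btree.h`, divides the space left after the block header by the per-entry size: one record per leaf entry, or two keys plus one pointer per node entry. The sizes and the `mxr_*` names below are assumptions for illustration.

```c
#include <assert.h>

/* Assumed sizes, for illustration only. */
enum {
	MXR_BLOCK_LEN = 56,	/* btree block header */
	MXR_REC_SIZE  = 24,	/* one rmap record */
	MXR_KEY_SIZE  = 20,	/* one rmap key */
	MXR_PTR_SIZE  = 4,	/* one child pointer */
};

/* Hypothetical: how many entries fit in a block of 'blocklen' bytes. */
static int mxr_maxrecs(int blocklen, int leaf)
{
	assert(blocklen >= MXR_BLOCK_LEN);
	blocklen -= MXR_BLOCK_LEN;

	if (leaf)
		return blocklen / MXR_REC_SIZE;
	/* Node entries carry a low key, a high key, and a pointer. */
	return blocklen / (2 * MXR_KEY_SIZE + MXR_PTR_SIZE);
}

/* Minimum fill is half the maximum, as in xfs_sb_mount_common(). */
static int mxr_minrecs(int blocklen, int leaf)
{
	return mxr_maxrecs(blocklen, leaf) / 2;
}
```

Under these assumed sizes, a 4096-byte block would hold up to 168 leaf records or 91 node entries, giving minimums of 84 and 45.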
/*
......
......@@ -38,6 +38,7 @@ extern const struct xfs_buf_ops xfs_agi_buf_ops;
extern const struct xfs_buf_ops xfs_agf_buf_ops;
extern const struct xfs_buf_ops xfs_agfl_buf_ops;
extern const struct xfs_buf_ops xfs_allocbt_buf_ops;
extern const struct xfs_buf_ops xfs_rmapbt_buf_ops;
extern const struct xfs_buf_ops xfs_attr3_leaf_buf_ops;
extern const struct xfs_buf_ops xfs_attr3_rmt_buf_ops;
extern const struct xfs_buf_ops xfs_bmbt_buf_ops;
......@@ -116,6 +117,7 @@ int xfs_log_calc_minimum_size(struct xfs_mount *);
#define XFS_INO_BTREE_REF 3
#define XFS_ALLOC_BTREE_REF 2
#define XFS_BMAP_BTREE_REF 2
#define XFS_RMAP_BTREE_REF 2
#define XFS_DIR_BTREE_REF 2
#define XFS_INO_REF 2
#define XFS_ATTR_BTREE_REF 1
......
This diff is collapsed.
......@@ -67,16 +67,6 @@ struct xfs_trans_resv {
/* shorthand way of accessing reservation structure */
#define M_RES(mp) (&(mp)->m_resv)
/*
* Per-extent log reservation for the allocation btree changes
* involved in freeing or allocating an extent.
* 2 trees * (2 blocks/level * max depth - 1) * block size
*/
#define XFS_ALLOCFREE_LOG_RES(mp,nx) \
((nx) * (2 * XFS_FSB_TO_B((mp), 2 * (mp)->m_ag_maxlevels - 1)))
#define XFS_ALLOCFREE_LOG_COUNT(mp,nx) \
((nx) * (2 * (2 * (mp)->m_ag_maxlevels - 1)))
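The formula in the comment above, 2 trees * (2 blocks/level * max depth - 1) * block size, can be worked through with plain integers. This sketch assumes `XFS_FSB_TO_B(mp, n)` amounts to `n` multiplied by the filesystem block size; the function names and the sample values are hypothetical.

```c
#include <assert.h>

/* Bytes of log space for nx extents: both free space btrees, worst case. */
static unsigned long long sketch_allocfree_log_res(unsigned int blocksize,
						   unsigned int maxlevels,
						   unsigned int nx)
{
	assert(maxlevels >= 1);
	/* 2 * maxlevels - 1 blocks per tree, two trees, nx extents. */
	return (unsigned long long)nx *
	       (2ULL * blocksize * (2 * maxlevels - 1));
}

/* Number of blocks logged for nx extents. */
static unsigned int sketch_allocfree_log_count(unsigned int maxlevels,
					       unsigned int nx)
{
	assert(maxlevels >= 1);
	return nx * (2 * (2 * maxlevels - 1));
}
```

For example, with 4096-byte blocks and a maximum depth of 5, one extent would reserve 2 * 9 * 4096 = 73728 bytes across 18 blocks.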
/*
* Per-directory log reservation for any directory change.
* dir blocks: (1 btree block per level + data block + free block) * dblock size
......
......@@ -108,8 +108,8 @@ typedef enum {
} xfs_lookup_t;
typedef enum {
-XFS_BTNUM_BNOi, XFS_BTNUM_CNTi, XFS_BTNUM_BMAPi, XFS_BTNUM_INOi,
-XFS_BTNUM_FINOi, XFS_BTNUM_MAX
+XFS_BTNUM_BNOi, XFS_BTNUM_CNTi, XFS_BTNUM_RMAPi, XFS_BTNUM_BMAPi,
+XFS_BTNUM_INOi, XFS_BTNUM_FINOi, XFS_BTNUM_MAX
} xfs_btnum_t;
struct xfs_name {
......
This diff is collapsed.
......@@ -21,7 +21,7 @@
/* Kernel only BMAP related definitions and functions */
struct xfs_bmbt_irec;
-struct xfs_bmap_free_item;
+struct xfs_extent_free_item;
struct xfs_ifork;
struct xfs_inode;
struct xfs_mount;
......@@ -40,8 +40,6 @@ int xfs_getbmap(struct xfs_inode *ip, struct getbmapx *bmv,
xfs_bmap_format_t formatter, void *arg);
/* functions in xfs_bmap.c that are only needed by xfs_bmap_util.c */
void xfs_bmap_del_free(struct xfs_bmap_free *flist,
struct xfs_bmap_free_item *free);
int xfs_bmap_extsize_align(struct xfs_mount *mp, struct xfs_bmbt_irec *gotp,
struct xfs_bmbt_irec *prevp, xfs_extlen_t extsz,
int rt, int eof, int delay, int convert,
......
......@@ -179,7 +179,7 @@ xfs_ioc_trim(
* matter as trimming blocks is an advisory interface.
*/
if (range.start >= XFS_FSB_TO_B(mp, mp->m_sb.sb_dblocks) ||
-range.minlen > XFS_FSB_TO_B(mp, XFS_ALLOC_AG_MAX_USABLE(mp)) ||
+range.minlen > XFS_FSB_TO_B(mp, mp->m_ag_max_usable) ||
range.len < mp->m_sb.sb_blocksize)
return -EINVAL;
......
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
......@@ -98,4 +98,7 @@ int xfs_efi_copy_format(xfs_log_iovec_t *buf,
void xfs_efi_item_free(xfs_efi_log_item_t *);
void xfs_efi_release(struct xfs_efi_log_item *);
int xfs_efi_recover(struct xfs_mount *mp,
struct xfs_efi_log_item *efip);
#endif /* __XFS_EXTFREE_ITEM_H__ */
This diff is collapsed.