Commits · 4f6ae1a49ed5c81501d6f7385416bb4e07289e99 · openanolis / cloud-kernel

08 Jul, 2011 16 commits

xfs: avoid usage of struct xfs_dir2_block · 4f6ae1a4

Committed by Christoph Hellwig on Jul 08, 2011

In most places we can simply pass around and use the struct xfs_dir2_data_hdr,
which is the first and most important member of struct xfs_dir2_block, instead
of the full structure.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

4f6ae1a4

xfs: cleanup the definition of struct xfs_dir2_sf_entry · 78f70cd7

Committed by Christoph Hellwig on Jul 08, 2011

Remove the inumber member which is at a variable offset after the actual
name, and make name a real variable sized C99 array instead of the incorrect
one-sized array which confuses (not only) gcc.  Based on this, clean up
the helpers to calculate the entry size.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

78f70cd7
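
The layout change described in this commit can be illustrated with a small user-space sketch. The names `sf_entry`, `sf_entry_size` and `sf_entry_inumber` are hypothetical stand-ins, not the kernel's definitions, and the width of the trailing inode number is passed in as a parameter rather than derived from a directory header.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a shortform directory entry. */
struct sf_entry {
	uint8_t namelen;	/* length of the name in bytes */
	uint8_t offset[2];	/* saved offset tag */
	uint8_t name[];		/* C99 flexible array member, not name[1] */
	/* the inode number follows the name at a variable offset */
};

/* Size of one entry: fixed header + name + trailing inode number. */
static size_t sf_entry_size(uint8_t namelen, size_t ino_bytes)
{
	return offsetof(struct sf_entry, name) + namelen + ino_bytes;
}

/* Locate the inode number stored directly after the name. */
static uint8_t *sf_entry_inumber(struct sf_entry *sfep)
{
	return &sfep->name[sfep->namelen];
}
```

Because the inode number is no longer a struct member, its position is always computed from `namelen`, so the compiler never sees an access through a member whose declared offset is wrong.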

xfs: kill struct xfs_dir2_sf · ac8ba50f

Committed by Christoph Hellwig on Jul 08, 2011

The list field of it is never actually used, so all uses can simply be
replaced with the xfs_dir2_sf_hdr_t type that it has as first member.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

ac8ba50f

xfs: cleanup shortform directory inode number handling · 8bc38787

Committed by Christoph Hellwig on Jul 08, 2011

Refactor the shortform directory helpers that deal with the 32-bit vs
64-bit wide inode numbers into more sensible helpers, and kill the
xfs_intino_t typedef that is now superfluous.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

8bc38787

xfs: factor out xfs_dir2_leaf_find_entry · 4fb44c82

Committed by Christoph Hellwig on Jul 08, 2011

Add a new xfs_dir2_leaf_find_entry helper to factor out some duplicate code
from xfs_dir2_leaf_addname and xfs_dir2_leafn_add.  Found by Eric Sandeen using
an automated code duplication checker.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

4fb44c82

xfs: kill the unused struct xfs_sync_work · 29d104af

Committed by Christoph Hellwig on Jul 08, 2011

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

29d104af

xfs: remove i_transp · f3ca8738

Committed by Christoph Hellwig on Jul 08, 2011

Remove the transaction pointer in the inode.  It's only used to avoid
passing down an argument in the bmap code, and for a few asserts in
the transaction code right now.

Also use the local variable ip in a few more places in xfs_inode_item_unlock,
so that it isn't only used for debug builds after the above change.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

f3ca8738

xfs: fix filesystem freeze race in xfs_trans_alloc · 7a249cf8

Committed by Christoph Hellwig on Jul 08, 2011

As pointed out by Jan, xfs_trans_alloc can race with a concurrent filesystem
freeze when it sleeps during the memory allocation. Fix this by moving the
wait_for_freeze call after the memory allocation. This means moving the
freeze into the low-level _xfs_trans_alloc helper, which thus grows a new
argument. Also fix up some comments in that area while at it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <david@fromorbit.com>

7a249cf8

xfs: improve sync behaviour in the face of aggressive dirtying · 33b8f7c2

Committed by Christoph Hellwig on Jul 08, 2011

The following script from Wu Fengguang shows very bad behaviour in XFS
when aggressively dirtying data during a sync on XFS, with sync times
up to almost 10 times as long as ext4.

A large part of the issue is that XFS writes data out itself two times
in the ->sync_fs method, overriding the livelock protection in the core
writeback code, and another issue is the lock-less xfs_ioend_wait call,
which doesn't prevent new ioends from being queued up while waiting for
the count to reach zero.

This patch removes the XFS-internal sync calls and relies on the VFS
to do its work just like all other filesystems do.  Note that the
i_iocount wait, which is rather suboptimal, is simply removed here.
We already do it in ->write_inode, which keeps the current suboptimal
behaviour.  We'll eventually need to remove that as well, but that's
material for a separate commit.

------------------------------ snip ------------------------------
#!/bin/sh

umount /dev/sda7
mkfs.xfs -f /dev/sda7
# mkfs.ext4 /dev/sda7
# mkfs.btrfs /dev/sda7
mount /dev/sda7 /fs

echo $((50<<20)) > /proc/sys/vm/dirty_bytes

pid=
for i in `seq 10`
do
	dd if=/dev/zero of=/fs/zero-$i bs=1M count=1000 &
	pid="$pid $!"
done

sleep 1

tic=$(date +'%s')
sync
tac=$(date +'%s')

echo
echo sync time: $((tac-tic))
egrep '(Dirty|Writeback|NFS_Unstable)' /proc/meminfo

pidof dd > /dev/null && { kill -9 $pid; echo sync NOT livelocked; }
------------------------------ snip ------------------------------
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

33b8f7c2

xfs: split xfs_itruncate_finish · 8f04c47a

Committed by Christoph Hellwig on Jul 08, 2011

Split the guts of xfs_itruncate_finish that loops over the existing extents
and calls xfs_bunmapi on them into a new helper, xfs_itruncate_extents.
Make xfs_attr_inactive call it directly instead of xfs_itruncate_finish,
which allows us to simplify the latter a lot, by only letting it deal with
the data fork.  As a result xfs_itruncate_finish is renamed to
xfs_itruncate_data to make its use case more obvious.

Also remove the sync parameter from xfs_itruncate_data, which has been
unnecessary since the introduction of the busy extent list in 2002, and
completely dead code since 2003 when the XFS_BMAPI_ASYNC parameter was
made a no-op.

I can't actually see why xfs_attr_inactive needs to set the transaction
sync, but let's keep this patch simple and without changes in behaviour.

Also avoid passing a useless argument to xfs_isize_check, and make it
private to xfs_inode.c.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

8f04c47a

xfs: kill xfs_itruncate_start · 857b9778

Committed by Christoph Hellwig on Jul 08, 2011

xfs_itruncate_start is a rather lengthy wrapper that evaluates to a call
to xfs_ioend_wait and xfs_tosspages, and only has two callers.

Instead of using the complicated checks left over from IRIX where we
can to truncate the pagecache just call xfs_tosspages
(aka truncate_inode_pages) directly as we want to get rid of all data
after i_size, and truncate_inode_pages handles incorrect alignments
and too large offsets just fine.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

857b9778

xfs: always log timestamp updates in xfs_setattr_size · 681b1200

Committed by Christoph Hellwig on Jul 08, 2011

Get rid of the special case where we use unlogged timestamp updates for
a truncate to the current inode size, and just call xfs_setattr_nonsize
for it to treat it like a utimes call.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

681b1200

xfs: split xfs_setattr · c4ed4243

Committed by Christoph Hellwig on Jul 08, 2011

Split up xfs_setattr into two functions, one for the complex truncate
handling, and one for the trivial attribute updates.  Also move both
new routines to xfs_iops.c as they are fairly Linux-specific.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

c4ed4243

xfs: work around bogus gcc warning in xfs_allocbt_init_cursor · dec58f1d

Committed by Christoph Hellwig on Jul 08, 2011

GCC 4.6 complains that an "array subscript is above array bounds" when
using the btree index to index into the agf_levels array.  The only
two indices passed in are 0 and 1, and we have an assert ensuring that.

Replace the trick of using the array index directly with using constants
in the already existing branch for assigning the XFS_BTREE_LASTREC_UPDATE
flag.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

dec58f1d
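
The workaround described in this commit can be sketched in user space as follows. All identifiers here (`BTNUM_BNO`, `BTNUM_CNT`, `LASTREC_UPDATE`, `struct agf`, `struct cursor`, `init_cursor`) are hypothetical simplifications, not the kernel's definitions; the point is only that the array is indexed with compile-time constants inside a branch that already exists for setting the flag.

```c
#include <assert.h>

/* Hypothetical btree numbers; only the first two are valid indices. */
enum btnum { BTNUM_BNO = 0, BTNUM_CNT = 1, BTNUM_OTHER = 2 };

#define LASTREC_UPDATE 0x1u	/* hypothetical cursor flag */

struct agf { unsigned levels[2]; };
struct cursor { unsigned nlevels; unsigned flags; };

static void init_cursor(struct cursor *cur, const struct agf *agf,
			enum btnum btnum)
{
	assert(btnum == BTNUM_BNO || btnum == BTNUM_CNT);

	cur->flags = 0;
	if (btnum == BTNUM_CNT) {
		/* constant index: the compiler can prove it is in bounds */
		cur->nlevels = agf->levels[BTNUM_CNT];
		cur->flags |= LASTREC_UPDATE;
	} else {
		cur->nlevels = agf->levels[BTNUM_BNO];
	}
}
```

Indexing with `agf->levels[btnum]` would be equally correct at run time given the assert, but the compiler cannot rely on an assert that may be compiled out, which is what a bounds warning of this kind reacts to.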

xfs: re-enable non-blocking behaviour in xfs_map_blocks · dbcdde3e

Committed by Christoph Hellwig on Jul 08, 2011

The non-blocking behaviour in xfs_vm_writepage currently is conditional on
having both the WB_SYNC_NONE sync_mode and the nonblocking flag set.
The latter used to be used by pdflush, kswapd and a few other places
in older kernels, but has been fading out starting with the introduction
of the per-bdi flusher threads.

Enable the non-blocking behaviour for all WB_SYNC_NONE calls to get back
the behaviour we want.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

dbcdde3e

xfs: PF_FSTRANS should never be set in ->writepage · 680a647b

Committed by Christoph Hellwig on Jul 08, 2011

Now that we reject direct reclaim in addition to always using GFP_NOFS
allocation there's no chance we'll ever end up in ->writepage with
PF_FSTRANS set.  Add a WARN_ON if we hit this case, and stop checking
if we'd actually need to start a transaction.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

680a647b

07 Jul, 2011 1 commit

xfs: unpin stale inodes directly in IOP_COMMITTED · 1316d4da

Committed by Dave Chinner on Jul 04, 2011

When inodes are marked stale in a transaction, they are treated
specially when the inode log item is being inserted into the AIL.
It tries to avoid moving the log item forward in the AIL due to a
race condition with writing the underlying buffer back to disk.
This was "fixed" in commit de25c181 ("xfs: avoid moving stale inodes
in the AIL").

To avoid moving the item forward, we return a LSN smaller than the
commit_lsn of the completing transaction, thereby trying to trick
the commit code into not moving the inode forward at all. I'm not
sure this ever worked as intended - it assumes the inode is already
in the AIL, but I don't think the returned LSN would have been small
enough to prevent moving the inode. It appears that the reason it
worked is that the lower LSN of the inodes meant they were inserted
into the AIL and flushed before the inode buffer (which was moved to
the commit_lsn of the transaction).

The big problem is that with delayed logging, the returning of the
different LSN means insertion takes the slow, non-bulk path.  Worse
yet is that insertion is to a position -before- the commit_lsn so it
is doing an AIL traversal on every insertion, and has to walk over
all the items that have already been inserted into the AIL. It's
expensive.

To compound the matter further, with delayed logging inodes are
likely to go from clean to stale in a single checkpoint, which means
they aren't even in the AIL at all when we come across them at AIL
insertion time. Hence these were all getting inserted into the AIL
when they simply do not need to be as inodes marked XFS_ISTALE are
never written back.

Transactional/recovery integrity is maintained in this case by the
other items in the unlink transaction that were modified (e.g. the
AGI btree blocks) and committed in the same checkpoint.

So to fix this, simply unpin the stale inodes directly in
xfs_inode_item_committed() and return -1 to indicate that the AIL
insertion code does not need to do any further processing of these
inodes.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

1316d4da
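
The control flow described in the final paragraph of this commit message can be sketched in user space. This is a simplified model under stated assumptions: `struct log_item`, `item_committed` and `bulk_insert` are hypothetical names, the pin count is a plain integer, and "insertion" is reduced to counting the items that would be inserted.

```c
#include <stddef.h>

typedef long long lsn_t;

/* Hypothetical, heavily simplified log item. */
struct log_item {
	int stale;	/* item was marked stale in the transaction */
	int pin_count;	/* number of outstanding pins */
};

/*
 * Committed callback: a stale item is unpinned on the spot and -1 is
 * returned so the caller skips it; otherwise the commit LSN is returned.
 */
static lsn_t item_committed(struct log_item *lip, lsn_t commit_lsn)
{
	if (lip->stale) {
		lip->pin_count--;
		return -1;
	}
	return commit_lsn;
}

/* Caller: count the items that would go through bulk insertion. */
static size_t bulk_insert(struct log_item *items, size_t n, lsn_t commit_lsn)
{
	size_t inserted = 0;

	for (size_t i = 0; i < n; i++) {
		if (item_committed(&items[i], commit_lsn) == -1)
			continue;	/* nothing more to do for this item */
		inserted++;
	}
	return inserted;
}
```

Since no item returns an LSN different from the commit LSN in this model, every inserted item can stay on the bulk path, which is the cost the commit message is concerned with.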

24 Jun, 2011 3 commits

xfs: prevent bogus assert when trying to remove non-existent attribute · 4a338212

Committed by Dave Chinner on Jun 23, 2011

If the attribute fork on an inode is in btree format and has
multiple levels (i.e. node format rather than leaf format), then a
lookup failure will trigger an assert failure in xfs_da_path_shift
if the flag XFS_DA_OP_OKNOENT is not set. This flag is used to
indicate to the directory btree code that not finding an entry is
not a fatal error. In the case of doing a lookup for a directory
name removal, this is valid as a user cannot insert an arbitrary
name to remove from the directory btree.

However, in the case of the attribute tree, a user has direct
control over the attribute name and can ask for any random name to
be removed without any validation. In this case, fsstress is asking
for a non-existent user.selinux attribute to be removed, and that is
causing xfs_da_path_shift() to fall off the bottom of the tree where
it asserts that a lookup failure is allowed. Because the flag is not
set, we die a horrible death on a debug-enabled kernel.

Prevent this assert from firing on attribute removes by adding the
op_flag XFS_DA_OP_OKNOENT to attribute removal operations.

Discovered when testing on a SELinux enabled system by fsstress in
test 070 by trying to remove a non-existent user.selinux attribute.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

4a338212

xfs: clear XFS_IDIRTY_RELEASE on truncate down · df4368a1

Committed by Dave Chinner on Jun 23, 2011

When an inode is truncated down, speculative preallocation is
removed from the inode. This should also reset the state bits for
controlling whether preallocation is subsequently removed when the
file is next closed. The flag is not being cleared, so repeated
operations on a file that first involve a truncate (e.g. multiple
repeated dd invocations on a file) give different file layouts for
the second and subsequent invocations.

Fix this by clearing the XFS_IDIRTY_RELEASE state bit when the
XFS_ITRUNCATED bit is detected in xfs_release() and hence ensure
that speculative delalloc is removed on files that have been
truncated down.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

df4368a1

xfs: reset inode per-lifetime state when recycling it · 778e24bb

Committed by Dave Chinner on Jun 23, 2011

XFS inodes have several per-lifetime state fields that determine the
behaviour of the inode. These state fields are not all reset when an
inode is reused from the reclaimable state.

This can lead to unexpected behaviour of the new inode such as
speculative preallocation not being truncated away in the expected
manner for local files until the inode is subsequently truncated,
freed or cycles out of the cache. It can also lead to an inode being
considered to be a filestream inode or having been truncated when
that is not the case.

Rework the reinitialisation of the inode when it is recycled to
ensure that it is pristine before it is reused. While there, also
fix the resetting of state flags in the recycling error paths so the
inode does not become unreclaimable.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

778e24bb

16 Jun, 2011 1 commit

xfs: make log devices with write back caches work · a27a263b

Committed by Christoph Hellwig on Jun 16, 2011

There's no reason not to support cache flushing on external log devices.
The only thing this really requires is flushing the data device first
both in fsync and log commits.  A side effect is that we also have to
remove the barrier write test during mount, which has been superfluous
since the new FLUSH+FUA code anyway.  Also use the chance to flush the
RT subvolume write cache before the fsync commit, which is required
for correct semantics.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

a27a263b

15 Jun, 2011 1 commit

xfs: fix ->mknod() return value on xfs_get_acl() failure · c46a131c

Committed by Al Viro on Jun 05, 2011

->mknod() should return a negative value on errors, and PTR_ERR() already
gives a negative value...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

c46a131c
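
The sign convention behind this fix can be shown with a user-space imitation of the kernel's error-pointer idiom. The helpers `err_ptr`, `ptr_err` and `is_err` below are lowercase re-implementations written for this sketch, and `get_acl_stub` and `mknod_sketch` are hypothetical; none of this is the actual XFS code.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode it again; the result is already negative. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO addresses. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical ACL lookup that always fails with -EIO. */
static void *get_acl_stub(void)
{
	return err_ptr(-EIO);
}

static long mknod_sketch(void)
{
	void *acl = get_acl_stub();

	if (is_err(acl))
		return ptr_err(acl);	/* negating this would yield +EIO */
	return 0;
}
```

A caller that tests `ret < 0` to detect failure would treat a positive `EIO` as success, which is why the extra negation matters.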

04 Jun, 2011 11 commits

btrfs: fix uninitialized variable warning · aa0467d8

Committed by David Sterba on Jun 03, 2011

With Linus' tree, today's linux-next build (powerpc ppc64_defconfig)
produced this warning:

fs/btrfs/delayed-inode.c: In function 'btrfs_delayed_update_inode':
fs/btrfs/delayed-inode.c:1598:6: warning: 'ret' may be used
uninitialized in this function

Introduced by commit 16cdcec7 ("btrfs: implement delayed inode items
operation").

This fixes a bug in btrfs_update_inode(): if the value returned from
btrfs_delayed_update_inode is nonzero garbage, inode stat data are not
updated and several call paths may hit a BUG_ON or fail with a strange
error code.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David Sterba <dsterba@suse.cz>

aa0467d8

btrfs: add helper for fs_info->closing · 7841cb28

Committed by David Sterba on May 31, 2011

Wrap checking of the filesystem 'closing' flag and fix a few missing memory
barriers.
Signed-off-by: David Sterba <dsterba@suse.cz>

7841cb28

Btrfs: add mount -o inode_cache · 4b9465cb

Committed by Chris Mason on Jun 03, 2011

This makes the inode map cache default to off until we
fix the overflow problem when the free space crcs don't fit
inside a single page.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

4b9465cb

btrfs: scrub: add explicit plugging · e7786c3a

Committed by Arne Jansen on May 28, 2011

With the removal of the implicit plugging, scrub ends up doing more and
smaller I/O than necessary. This patch adds explicit plugging per chunk.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e7786c3a

btrfs: use btrfs_ino to access inode number · a4689d2b

Committed by David Sterba on May 31, 2011

Commit 4cb5300b ("Btrfs: add mount -o auto_defrag") accesses the inode
number directly while it should use the helper with the new inode
number allocator.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

a4689d2b

Btrfs: don't save the inode cache if we are deleting this root · d132a538

Committed by Josef Bacik on May 31, 2011

With xfstest 254 I can panic the box every time with the inode number caching
stuff on. This is because we clean the inodes out when we delete the subvolume,
but then we write out the inode cache which adds an inode to the subvolume inode
tree, and then when it gets evicted again the root gets added back on the dead
roots list and is deleted again, so we have a double free. To stop this from
happening just return 0 if refs is 0 (and we're not the tree root since tree
root always has refs of 0). With this fix 254 no longer panics. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Tested-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

d132a538

btrfs: false BUG_ON when degraded · 5f3f302a

Committed by Arne Jansen on May 30, 2011

In degraded mode the struct btrfs_device of a missing dev doesn't have
device->name set. A kstrdup of NULL correctly returns NULL. Don't
BUG in this case.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

5f3f302a
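
The distinction this commit draws, between a NULL result caused by a NULL input and one caused by allocation failure, can be sketched in user space. `dup_name` is a hypothetical helper that mimics the NULL-in, NULL-out behaviour the message attributes to kstrdup, and `copy_device_name` is likewise made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Duplicate a string; a NULL input yields NULL without being an error. */
static char *dup_name(const char *name)
{
	size_t len;
	char *copy;

	if (!name)
		return NULL;
	len = strlen(name) + 1;
	copy = malloc(len);
	if (copy)
		memcpy(copy, name, len);
	return copy;
}

/*
 * Returns 0 on success and -1 only when a name existed but could not
 * be copied. A missing name (src == NULL) is not treated as a failure.
 */
static int copy_device_name(const char *src, char **dst)
{
	*dst = dup_name(src);
	if (src && !*dst)
		return -1;
	return 0;
}
```

Checking only `!*dst` would flag every missing device as a failed allocation, which corresponds to the false BUG_ON the commit title refers to.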

Btrfs: don't save the inode cache in non-FS roots · ca456ae2

Committed by liubo on Jun 01, 2011

This adds extra checks to make sure the inode map we are caching really
belongs to a FS root instead of a special relocation tree.  It
prevents crashes during balancing operations.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

ca456ae2

Btrfs: make sure we don't overflow the free space cache crc page · 211f96c2

Committed by Chris Mason on Jun 03, 2011

The free space cache uses only one page for crcs right now,
which means we can't have a cache file bigger than the
crcs we can fit in the first page.  This adds a check to
enforce that restriction.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

211f96c2
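
As a rough illustration of the restriction, the arithmetic can be written out as below. This is a sketch under assumptions that the commit message does not state: a 4096-byte page, one 32-bit crc per cached page, and no other data sharing the first page. `SKETCH_PAGE_SIZE` and `crcs_fit_in_first_page` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */

/*
 * Nonzero when one 32-bit crc per cached page still fits into a single
 * page. Any additional header data in that page is ignored here.
 */
static int crcs_fit_in_first_page(size_t num_pages)
{
	return num_pages * sizeof(uint32_t) <= SKETCH_PAGE_SIZE;
}
```

Under these assumptions the limit is 1024 cached pages, i.e. a 4 MiB cache file; the real limit depends on the actual on-disk layout.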

Btrfs: fix uninit variable in the delayed inode code · 17aca1c9

Committed by Chris Mason on Jun 03, 2011

The nitems counter needs to start at zero
Signed-off-by: Chris Mason <chris.mason@oracle.com>

17aca1c9

btrfs: scrub: don't reuse bios and pages · 1bc87793

Committed by Arne Jansen on May 28, 2011

The current scrub implementation reuses bios and pages as often as possible,
allocating them only on start and releasing them when finished. This leads
to more problems with the block layer than it's worth. The elevator gets
confused when there are more pages added to the bio than bi_size suggests.
This patch completely rips out the reuse of bios and pages and allocates
them freshly for each submit.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

1bc87793

03 Jun, 2011 6 commits

UBIFS: fix-up free space earlier · 09801194

Committed by Ben Gardiner on May 30, 2011

The free space fixup is currently initiated during mount after the call to
ubifs_write_master() which results in a write to PEBs; this has been observed
with the patch 'assert no fixup when writing a node' applied:

Move the free space fixup on mount to before the calls to
ubifs_recover_inl_heads() and ubifs_write_master(). This results in no
assertions with the previously mentioned patch applied.

Artem: tweaked the patch a bit
Signed-off-by: Ben Gardiner <bengardiner@nanometrics>
Reviewed-by: Matthew L. Creech <mlcreech@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

09801194

UBIFS: initialize LPT earlier · 781c5717

Committed by Ben Gardiner on May 30, 2011

The current 'mount_ubifs()' implementation does not initialize the LPT until
the master node is marked dirty. Move the LPT initialization to before marking
the master node dirty. This is a preparation for the next patch which will move
the free-space-fixup check to before marking the master node dirty, because we
have to fix-up the free space before doing any writes.

Artem: massaged the patch and commit message.
Signed-off-by: Ben Gardiner <bengardiner@nanometrics.ca>
Reviewed-by: Matthew L. Creech <mlcreech@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

781c5717

UBIFS: assert no fixup when writing a node · 4f1ab9b0

Committed by Ben Gardiner on May 30, 2011

The current free space fixup can result in some writing to the UBI volume
when the space_fixup flag is set.

To catch instances where UBIFS is writing to the NAND while the space_fixup
flag is set, add an assert to ubifs_write_node().

Artem: tweaked the patch, added similar assertion to the write buffer
       write path.
Signed-off-by: Ben Gardiner <bengardiner@nanometrics.ca>
Reviewed-by: Matthew L. Creech <mlcreech@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

4f1ab9b0

UBIFS: fix clean znode counter corruption in error cases · 83707237

Committed by Artem Bityutskiy on May 31, 2011

UBIFS maintains per-filesystem and global clean znode counters
('c->clean_zn_cnt' and 'ubifs_clean_zn_cnt'). It is important to maintain
correct values there since the shrinker relies on 'ubifs_clean_zn_cnt'.

However, in case of failures during commit the counters were corrupted. E.g.,
if a failure happens in the middle of 'write_index()', then some nodes in the
commit list ('c->cnext') are marked as clean, and some are marked as dirty. And
'ubifs_destroy_tnc_subtree()' does not return the correct count, and we
end up with non-zero 'c->clean_zn_cnt' when unmounting. This means that if we
have 2 file-systems and one of them fails, and we unmount it,
'ubifs_clean_zn_cnt' stays incorrect and confuses the shrinker.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

83707237

UBIFS: fix memory leak on error path · 812eb258

Committed by Artem Bityutskiy on May 31, 2011

UBIFS leaks memory on the error path in 'ubifs_jnl_update()' in case of write
failure because it forgets to free the 'struct ubifs_dent_node *dent' object.
Although the object is small, the alignment can make it large - e.g., 2KiB
if the min. I/O unit is 2KiB.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: stable@kernel.org

812eb258

UBIFS: fix shrinker object count reports · cf610bf4

Committed by Artem Bityutskiy on May 31, 2011

Sometimes the VM asks the shrinker to return the number of objects it can shrink,
and we return the ubifs_clean_zn_cnt in that case. However, it is possible
that this counter is negative for a short period of time, due to the way
UBIFS TNC code updates it. And I can observe the following warnings sometimes:

shrink_slab: ubifs_shrinker+0x0/0x2b7 [ubifs] negative objects to delete nr=-8541616642706119788

This patch makes sure UBIFS never returns a negative count of objects.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: stable@kernel.org

cf610bf4
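
A minimal sketch of the guard described in this commit, in user space. The names `clean_zn_cnt` and `shrinker_object_count` are stand-ins for this example, and clamping to zero is an assumption made here; the commit message only says that a negative count is never returned, not which non-negative value is reported instead.

```c
/* Stand-in for a global counter that may be transiently negative. */
static long clean_zn_cnt;

/* Report how many objects could be shrunk, never a negative number. */
static long shrinker_object_count(void)
{
	long cnt = clean_zn_cnt;	/* read once so check and return agree */

	return cnt >= 0 ? cnt : 0;
}
```

Reading the counter into a local first matters when it can change concurrently: testing and returning the global separately could still let a negative value through.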

01 Jun, 2011 1 commit

UBIFS: fix recovery broken by the previous recovery fix · da8b94ea

Committed by Artem Bityutskiy on May 26, 2011

Unfortunately, the recovery fix d1606a59b6be4ea392eabd40d1250aa1eeb19efb
(UBIFS: fix extremely rare mount failure) broke recovery. That commit makes
UBIFS drop the last min. I/O unit in all journal heads, but this is needed only
for the GC head. And this does not work for non-GC heads. For example,
suppose we have min. I/O units A and B, and A contains a valid node X, which
was fsynced, and then a group of nodes Y which spans the rest of A and B. In
this case we'll drop not only Y, but also X, which is obviously incorrect.

This patch fixes the issue and additionally makes recovery drop the last min.
I/O unit only for the GC head, and leave things as they have been for ages for
the other heads - this is safer.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

da8b94ea

openanolis / cloud-kernel synced successfully more than 1 year ago