Commits · 5477d0face8a3ba4e9a1e7283692fff9c92f8e5e · openanolis / cloud-kernel

30 Apr, 2010 1 commit

fs: fs/super.c needs to include backing-dev.h for !CONFIG_BLOCK · 5477d0fa

Authored by Jens Axboe on Apr 29, 2010

When CONFIG_BLOCK is set, it ends up getting backing-dev.h included.
But for !CONFIG_BLOCK, it isn't so lucky. The proper thing to do is
include <linux/backing-dev.h> directly from the file it's used from,
so do that.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

5477d0fa

29 Apr, 2010 1 commit

nfs d_revalidate() is too trigger-happy with d_drop() · d9e80b7d

Authored by Al Viro on Apr 29, 2010

If dentry found stale happens to be a root of disconnected tree, we
can't d_drop() it; its d_hash is actually part of s_anon and d_drop()
would simply hide it from shrink_dcache_for_umount(), leading to
all sorts of fun, including busy inodes on umount and oopsen after
that.

Bug had been there since at least 2006 (commit c636eb already has it),
so it's definitely -stable fodder.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d9e80b7d

28 Apr, 2010 2 commits

procfs: fix tid fdinfo · 3835541d

Authored by Jerome Marchand on Apr 27, 2010

Correct the file_operations struct in fdinfo entry of tid_base_stuff[].

Presently /proc/*/task/*/fdinfo contains symlinks to opened files like
/proc/*/fd/.
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Miklos Szeredi <mszeredi@suse.cz>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3835541d

Remove redundant check for CONFIG_MMU · 16a5b3c4

Authored by Christoph Egger on Apr 26, 2010

The checks for CONFIG_MMU at this location are duplicated as all the code is
located inside a #ifndef CONFIG_MMU block. So the first conditional block will
always be included while the second never will.
Signed-off-by: Christoph Egger <siccegge@stud.informatik.uni-erlangen.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

16a5b3c4
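
As an illustration of the redundancy that 16a5b3c4 describes, the standalone sketch below repeats the same preprocessor test inside an outer `#ifndef CONFIG_MMU` block. It is not the kernel code the commit touches; the function name `which_branch` is hypothetical, and `CONFIG_MMU` is deliberately left undefined to mimic a no-MMU build.

```c
/* Sketch only: CONFIG_MMU is left undefined, as in a no-MMU build.
 * which_branch() is a hypothetical name used for illustration. */
#ifndef CONFIG_MMU

static int which_branch(void)
{
#ifndef CONFIG_MMU
	/* Always compiled: the outer block already guarantees !CONFIG_MMU. */
	return 1;
#endif
#ifdef CONFIG_MMU
	/* Never compiled inside the outer #ifndef block. */
	return 2;
#endif
}

#endif /* !CONFIG_MMU */
```

Both inner tests can be dropped without changing what the compiler sees: the first branch stays unconditionally and the second disappears.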

27 Apr, 2010 2 commits

nfsd4: bug in read_buf · 2bc3c117

Authored by Neil Brown on Apr 20, 2010

When read_buf is called to move over to the next page in the pagelist
of an NFSv4 request, it sets argp->end to essentially a random
number, certainly not an address within the page which argp->p now
points to.  So subsequent calls to READ_BUF will think there is much
more than a page of spare space (the cast to u32 ensures an unsigned
comparison) so we can expect to fall off the end of the second
page.

We never encountered this in testing because typically the only
operations which use more than two pages are write-like operations,
which have their own decoding logic.  Something like a getattr after a
write may cross a page boundary, but it would be very unusual for it to
cross another boundary after that.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

2bc3c117
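
The effect of a bad end pointer on an unsigned space check, as described in 2bc3c117, can be sketched in user space. This is a simplified model and not the nfsd decoder: the struct `decode_pos` and the function `has_space` are hypothetical, and addresses are held as integers so the arithmetic is well defined.

```c
#include <stdint.h>

/* Hypothetical stand-in for the decoder state: p is the current position,
 * and end should be one past the last valid byte of the same page. */
struct decode_pos {
	uintptr_t p;
	uintptr_t end;
};

/* Space check in the style the commit message describes: the difference
 * is cast to an unsigned 32-bit value before the comparison. */
static int has_space(const struct decode_pos *d, uint32_t nbytes)
{
	return nbytes <= (uint32_t)(d->end - d->p);
}
```

When `end` lies below `p`, the subtraction wraps and the check accepts requests far larger than a page, which is how a decoder could run off the end of the buffer.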

xfs: more swap extent fixes for dynamic fork offsets · dd77ef92

Authored by Dave Chinner on Apr 20, 2010

A new xfsqa test (226) with a prototype xfs_fsr change to try to
handle dynamic fork offsets better triggers an assertion failure
where the inode data fork is in btree format, yet there is room in
the inode for it to be in extent format. The two inodes look like:

before: ino 0x101 (target), num_extents 11, Max in-fork extents 6, broot size 40, fork offset 96
before: ino 0x115 (temp), num_extents 5, Max in-fork extents 3, broot size 40, fork offset 56
after: ino 0x101 (target), num_extents 5, Max in-fork extents 6, broot size 40, fork offset 96
after: ino 0x115 (temp), num_extents 11, Max in-fork extents 3, broot size 40, fork offset 56

Basically the target inode ends up with 5 extents in btree format,
but it had space for 6 extents in extent format, so ends up
incorrect. Notably here the broot size is the same, and that is
where the kernel code is going wrong - the btree root will fit, so
it lets the swap go ahead.

The check should not allow the swap to take place if the number of
extents while in btree format is less than the number of extents
that can fit in the inode in extent format. Adding that check will
prevent this swap and corruption from occurring.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

dd77ef92
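
The extra condition described in dd77ef92 can be expressed as a small predicate. The sketch below is an assumption-laden model and not the XFS implementation: `struct fork_state` and `swap_allowed` are hypothetical names, and only the one rule stated in the commit message is encoded.

```c
#include <stdbool.h>

/* Hypothetical summary of one inode's data fork after a proposed swap. */
struct fork_state {
	bool btree_format;            /* fork would be in btree format */
	unsigned num_extents;         /* extents the fork would hold */
	unsigned max_in_fork_extents; /* extents that fit in extent format */
};

/* Reject the swap when a btree-format fork would hold fewer extents than
 * would fit in the inode in extent format. */
static bool swap_allowed(const struct fork_state *after)
{
	if (after->btree_format &&
	    after->num_extents < after->max_in_fork_extents)
		return false;
	return true;
}
```

With the numbers from the log above, the target inode would end up with 5 extents in btree format while 6 fit in the fork, so this predicate refuses the swap.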

26 Apr, 2010 1 commit

btrfs: convert to using bdi_setup_and_register() · e6d086d8

Authored by Jens Axboe on Apr 26, 2010

It's now a provided helper, so get rid of the internal setup
and btrfs atomic_t bdi enumerator.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

e6d086d8

25 Apr, 2010 7 commits

Catch filesystems lacking s_bdi · 5129a469

Authored by Jörn Engel on Apr 25, 2010

noop_backing_dev_info is used only as a flag to mark filesystems that
don't have any backing store, like tmpfs, procfs, spufs, etc.
Signed-off-by: Joern Engel <joern@logfs.org>

Changed the BUG_ON() to a WARN_ON(). Note that adding dirty inodes
to the noop_backing_dev_info is not legal and will not result in
them being flushed, but we already catch this condition in
__mark_inode_dirty() when checking for a registered bdi.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

5129a469

squashfs: fix potential buffer over-run on 4K block file systems · e0d1f700

Authored by Phillip Lougher on Apr 23, 2010

Sizing the buffer based on block size is incorrect, leading
to a potential buffer over-run on 4K block size file systems
(because the metadata block size is always 8K).  This bug
doesn't seem to have triggered because 4K block size file systems
are not default, and also because metadata blocks after
compression tend to be less than 4K.
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>

e0d1f700
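
The sizing problem in e0d1f700 comes down to a buffer that must hold the larger of two block sizes. The following is a minimal sketch under that reading, not the actual squashfs change: `read_buffer_size` is a hypothetical helper, and the 8K constant is taken from the commit message.

```c
#include <stddef.h>

/* 8K metadata block size, as stated in the commit message. */
#define METADATA_SIZE 8192

/* Hypothetical helper: choose a buffer size that can hold either a data
 * block or a metadata block, whichever is larger. */
static size_t read_buffer_size(size_t block_size)
{
	return block_size > METADATA_SIZE ? block_size : METADATA_SIZE;
}
```

Sizing purely from a 4K block size would leave the buffer 4K short for an uncompressed metadata block, which is the over-run the commit describes.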

squashfs: add missing buffer free · 370ec3d1

Authored by Phillip Lougher on Apr 23, 2010

Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>

370ec3d1

squashfs: fix warn_on when root inode is corrupted · 1cb08e97

Authored by Phillip Lougher on Apr 16, 2010

Fix warn_on triggered by mounting a fsfuzzer corrupted file system, where
the root inode has been corrupted.
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Reported-by: Steve Grubb <sgrubb@redhat.com>

1cb08e97

fs/block_dev.c: fix performance regression in O_DIRECT|O_SYNC writes to block devices · b8af67e2

Authored by Anton Blanchard on Apr 23, 2010

We are seeing a large regression in database performance on recent
kernels.  The database opens a block device with O_DIRECT|O_SYNC and a
number of threads write to different regions of the file at the same time.

A simple test case is below.  I haven't defined DEVICE since getting it
wrong will destroy your data :) On a 3 disk LVM with a 64k chunk size we
see about 17MB/sec and only a few threads in IO wait:

procs  -----io---- -system-- -----cpu------
 r  b     bi    bo   in   cs us sy id wa st
 0  3      0 16170  656 2259  0  0 86 14  0
 0  2      0 16704  695 2408  0  0 92  8  0
 0  2      0 17308  744 2653  0  0 86 14  0
 0  2      0 17933  759 2777  0  0 89 10  0

Most threads are blocking in vfs_fsync_range, which has:

        mutex_lock(&mapping->host->i_mutex);
        err = fop->fsync(file, dentry, datasync);
        if (!ret)
                ret = err;
        mutex_unlock(&mapping->host->i_mutex);

commit 148f948b (vfs: Introduce new
helpers for syncing after writing to O_SYNC file or IS_SYNC inode) offers
some explanation of what is going on:

    Use these new helpers for syncing from generic VFS functions. This makes
    O_SYNC writes to block devices acquire i_mutex for syncing. If we really
    care about this, we can make block_fsync() drop the i_mutex and reacquire
    it before it returns.

Thanks Jan for such a good commit message!  As well as dropping i_mutex,
Christoph suggests we should remove the call to sync_blockdev():

> sync_blockdev is an overcomplicated alias for filemap_write_and_wait on
> the block device inode, which is exactly what we did just before calling
> into ->fsync

The patch below incorporates both suggestions. With it the testcase improves
from 17MB/sec to 68MB/sec:

procs  -----io---- -system-- -----cpu------
 r  b     bi    bo   in   cs us sy id wa st
 0  7      0 65536 1000 3878  0  0 70 30  0
 0 34      0 69632 1016 3921  0  1 46 53  0
 0 57      0 69632 1000 3921  0  0 55 45  0
 0 53      0 69640  754 4111  0  0 81 19  0

Testcase:

#define _GNU_SOURCE
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define NR_THREADS 64
#define BUFSIZE (64 * 1024)

#define DEVICE "/dev/mapper/XXXXXX"

#define ALIGN(VAL, SIZE) (((VAL)+(SIZE)-1) & ~((SIZE)-1))

static int fd;

static void *doit(void *arg)
{
	unsigned long offset = (long)arg;
	char *b, *buf;

	b = malloc(BUFSIZE + 1024);
	buf = (char *)ALIGN((unsigned long)b, 1024);
	memset(buf, 0, BUFSIZE);

	while (1)
		pwrite(fd, buf, BUFSIZE, offset);
}

int main(int argc, char *argv[])
{
	int flags = O_RDWR|O_DIRECT;
	int i;
	unsigned long offset = 0;

	if (argc > 1 && !strcmp(argv[1], "O_SYNC"))
		flags |= O_SYNC;

	fd = open(DEVICE, flags);
	if (fd == -1) {
		perror("open");
		exit(1);
	}

	for (i = 0; i < NR_THREADS-1; i++) {
		pthread_t tid;
		pthread_create(&tid, NULL, doit, (void *)offset);
		offset += BUFSIZE;
	}
	doit((void *)offset);

	return 0;
}
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b8af67e2
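
The locking idea behind b8af67e2, dropping a lock around slow work and retaking it before returning to a caller that expects it held, can be sketched in user space with pthreads. This is an analogy only and not the kernel patch: `inode_lock`, `flush_device` and `fsync_dropping_lock` are hypothetical names, and the flush is a stub.

```c
#include <pthread.h>

static pthread_mutex_t inode_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for the slow part of the sync path. */
static int flush_device(void)
{
	return 0;
}

/* Called with inode_lock held; returns with inode_lock held.
 * The lock is released while the slow work runs so that other
 * writers are not serialized behind it. */
static int fsync_dropping_lock(void)
{
	int err;

	pthread_mutex_unlock(&inode_lock);
	err = flush_device();
	pthread_mutex_lock(&inode_lock);
	return err;
}
```

The caller's locking contract is unchanged, which is why the change can be confined to the callee as Jan's quoted commit message suggests.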

reiserfs: fix corruption during shrinking of xattrs · fb2162df

Authored by Jeff Mahoney on Apr 23, 2010

Commit 48b32a35 ("reiserfs: use generic
xattr handlers") introduced a problem that causes corruption when extended
attributes are replaced with a smaller value.

The issue is that the reiserfs_setattr to shrink the xattr file was moved
from before the write to after the write.

The root issue has always been in the reiserfs xattr code, but was papered
over by the fact that in the shrink case, the file would just be expanded
again while the xattr was written.

The end result is that the last 8 bytes of xattr data are lost.

This patch fixes it to use new_size.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=14826
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reported-by: Christian Kujau <lists@nerdbynature.de>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Cc: Edward Shishkin <edward.shishkin@gmail.com>
Cc: Jethro Beekman <kernel@jbeekman.nl>
Cc: Greg Surbey <gregsurbey@hotmail.com>
Cc: Marco Gatti <marco.gatti@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fb2162df

reiserfs: fix permissions on .reiserfs_priv · cac36f70

Authored by Jeff Mahoney on Apr 23, 2010

Commit 677c9b2e ("reiserfs: remove
privroot hiding in lookup") removed the magic from the lookup code to hide
the .reiserfs_priv directory since it was getting loaded at mount-time
instead.  The intent was that the entry would be hidden from the user via
a poisoned d_compare, but this was faulty.

This introduced a security issue where unprivileged users could access and
modify extended attributes or ACLs belonging to other users, including
root.

This patch resolves the issue by properly hiding .reiserfs_priv.  This was
the intent of the xattr poisoning code, but it appears to have never
worked as expected.  This is fixed by using d_revalidate instead of
d_compare.

This patch makes -oexpose_privroot a no-op.  I'm fine leaving it this way.
The effort involved in working out the corner cases wrt permissions and
caching outweighs the benefit of the feature.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Acked-by: Edward Shishkin <edward.shishkin@gmail.com>
Reported-by: Matt McCutchen <matt@mattmccutchen.net>
Tested-by: Matt McCutchen <matt@mattmccutchen.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cac36f70

24 Apr, 2010 1 commit

Cleanup generic block based fiemap · 3a3076f4

Authored by Josef Bacik on Apr 23, 2010

This cleans up a few of the complaints of __generic_block_fiemap. I've
fixed all the typing stuff, used inline functions instead of macros,
gotten rid of a couple of variables, and made sure the size and block
requests are all block aligned. It also fixes a problem where sometimes
FIEMAP_EXTENT_LAST wasn't being set properly.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3a3076f4
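
One of the points in 3a3076f4 is that the requested start and size are made block aligned. A minimal sketch of that kind of rounding follows; it is not the `__generic_block_fiemap` code, the names `byte_range` and `align_to_blocks` are hypothetical, and overflow of `start + len` is not handled.

```c
#include <stdint.h>

/* Hypothetical byte range, widened to whole blocks. */
struct byte_range {
	uint64_t start;
	uint64_t len;
};

/* blkbits is log2 of the block size, e.g. 12 for 4096-byte blocks. */
static struct byte_range align_to_blocks(uint64_t start, uint64_t len,
					 unsigned blkbits)
{
	uint64_t blksize = (uint64_t)1 << blkbits;
	/* Round the start down and the end up to block boundaries. */
	uint64_t first = start & ~(blksize - 1);
	uint64_t last = (start + len + blksize - 1) & ~(blksize - 1);
	struct byte_range r = { first, last - first };

	return r;
}
```

For example, a 100-byte request at offset 5000 with 4096-byte blocks widens to the single block starting at 4096.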

23 Apr, 2010 1 commit

squashfs: fix locking bug in zlib wrapper · 792590c7

Authored by Phillip Lougher on Apr 04, 2010

Fix locking bug in zlib wrapper introduced by recent decompressor changes.
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>

792590c7

22 Apr, 2010 9 commits

smbfs: add bdi backing to mount session · 424264b7

Authored by Jens Axboe on Apr 22, 2010

This ensures that dirty data gets flushed properly.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

424264b7

ncpfs: add bdi backing to mount session · f1970c73

Authored by Jens Axboe on Apr 22, 2010

This ensures that dirty data gets flushed properly.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

f1970c73

exofs: add bdi backing to mount session · b3d0ab7e

Authored by Jens Axboe on Apr 22, 2010

This ensures that dirty data gets flushed properly.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

b3d0ab7e

ecryptfs: add bdi backing to mount session · 9df9c8b9

Authored by Jens Axboe on Apr 22, 2010

This ensures that dirty data gets flushed properly.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

9df9c8b9

coda: add bdi backing to mount session · 5163d900

Authored by Jens Axboe on Apr 22, 2010

This ensures that dirty data gets flushed properly.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

5163d900

cifs: add bdi backing to mount session · 8044f7f4

Authored by Jens Axboe on Apr 22, 2010

This ensures that dirty data gets flushed properly.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

8044f7f4

afs: add bdi backing to mount session. · e1da0222

Authored by Jens Axboe on Apr 22, 2010

This ensures that dirty data gets flushed properly.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

e1da0222

9p: add bdi backing to mount session · 0ed07ddb

Authored by Jens Axboe on Apr 22, 2010

This ensures that dirty data gets flushed properly.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

0ed07ddb

AFS: Don't pass error value to page_cache_release() in error handling · 083fd8b2

Authored by David Howells on Apr 21, 2010

In the error handling in afs_mntpt_do_automount(), we pass an error
pointer to page_cache_release() if read_mapping_page() failed. Instead,
we should extend the gotos around the error handling we don't need.
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

083fd8b2
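
The pattern behind 083fd8b2, jumping past a release step when the matching acquisition failed, can be shown with a user-space analogue. This is not the AFS code: `acquire_buf`, `release_buf` and `do_mount_step` are hypothetical, and a NULL return stands in for the kernel's error pointer.

```c
#include <stdlib.h>

static int releases; /* counts calls to release_buf(), for illustration */

static void release_buf(void *buf)
{
	free(buf);
	releases++;
}

/* Hypothetical acquisition step that can fail, returning NULL. */
static void *acquire_buf(int fail)
{
	return fail ? NULL : malloc(16);
}

static int do_mount_step(int fail_acquire, int fail_use)
{
	int ret = 0;
	void *buf = acquire_buf(fail_acquire);

	if (!buf) {
		ret = -1;
		goto out;         /* nothing acquired: skip the release */
	}
	if (fail_use) {
		ret = -2;
		goto out_release; /* acquired: must release */
	}
out_release:
	release_buf(buf);
out:
	return ret;
}
```

Each failure jumps to the label that undoes only what has actually been set up, so the release function never sees an invalid handle.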

21 Apr, 2010 4 commits

uclinux: error message when FLAT reloc symbol is invalid, v2 · d7dfee3f

Authored by Jun Sun on Dec 31, 2009

This patch fixes a cosmetic error in printk. Text segment and data/bss
segment are allocated from two different areas. It is not meaningful to
give the diff between them in the error reporting messages.
Signed-off-by: Jun Sun <jsun@junsun.net>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>

d7dfee3f

ext4: Issue the discard operation *before* releasing the blocks to be reused · b90f6870

Authored by Theodore Ts'o on Apr 20, 2010

Otherwise, we can end up having data corruption because the blocks
could get reused and then discarded!

https://bugzilla.kernel.org/show_bug.cgi?id=15579
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

b90f6870

[LogFS] Split large truncated into smaller chunks · b6349ac8

Authored by Joern Engel on Apr 20, 2010

Truncate would do an almost limitless amount of work without invoking
the garbage collector in between.  Split it up into more manageable,
though still large, chunks.
Signed-off-by: Joern Engel <joern@logfs.org>

b6349ac8
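
The approach in b6349ac8, doing a large truncate in bounded steps so garbage collection can run in between, can be sketched as a loop. Everything here is hypothetical and not LogFS code: the chunk size, `run_gc`, `truncate_to` and `truncate_chunked` are illustrative stand-ins.

```c
#include <stdint.h>

#define TRUNCATE_CHUNK (8u * 1024 * 1024) /* hypothetical chunk size */

static unsigned gc_runs; /* counts garbage-collection passes */

static void run_gc(void)
{
	gc_runs++;
}

/* Hypothetical single-step truncate; a real one would free blocks. */
static void truncate_to(uint64_t *size, uint64_t new_size)
{
	*size = new_size;
}

/* Shrink *size to target in bounded steps, letting GC run between steps. */
static void truncate_chunked(uint64_t *size, uint64_t target)
{
	while (*size > target) {
		uint64_t step = *size - target;

		if (step > TRUNCATE_CHUNK)
			step = TRUNCATE_CHUNK;
		truncate_to(size, *size - step);
		run_gc();
	}
}
```

The amount of work between two garbage-collection passes is then bounded by the chunk size rather than by the size of the file.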

quota: Convert __DQUOT_PARANOIA symbol to standard config option · 62af9b52

Authored by Jan Kara on Apr 19, 2010

Make __DQUOT_PARANOIA define from the old days a standard config option
and turn it off by default.

This gets rid of a quota warning about writes before quota is turned on
for systems with ext4 root filesystem. Currently there's no way to legally
solve this because /etc/mtab has to be written before quota is turned on
on most systems.
Signed-off-by: Jan Kara <jack@suse.cz>

62af9b52

20 Apr, 2010 5 commits

eCryptfs: Turn lower lookup error messages into debug messages · 9f37622f

Authored by Tyler Hicks on Mar 25, 2010

Vague warnings about ENAMETOOLONG errors when looking up an encrypted
file name have caused many users to become concerned about their data.
Since this is a rather harmless condition, I'm moving this warning to
only be printed when the ecryptfs_verbosity module param is 1.
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

9f37622f

eCryptfs: Copy lower directory inode times and size on link · 3a8380c0

Authored by Tyler Hicks on Mar 23, 2010

The timestamps and size of a lower inode involved in a link() call were
being copied to the upper parent inode. Instead, we should be
copying the lower parent inode's timestamps and size to the upper parent
inode. I discovered this bug using the POSIX test suite at Tuxera.
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

3a8380c0

ecryptfs: fix use with tmpfs by removing d_drop from ecryptfs_destroy_inode · 133b8f9d

Authored by Jeff Mahoney on Mar 19, 2010

Since tmpfs has no persistent storage, it pins all its dentries in memory
so they have d_count=1 when other file systems would have d_count=0.
->lookup is only used to create new dentries. If the caller doesn't
instantiate it, it's freed immediately at dput(). ->readdir reads
directly from the dcache and depends on the dentries being hashed.

When an ecryptfs mount is mounted, it associates the lower file and dentry
with the ecryptfs files as they're accessed. When it's umounted and
destroys all the in-memory ecryptfs inodes, it fput's the lower_files and
d_drop's the lower_dentries. Commit 4981e081 added this and a d_delete in
2008 and several months later commit caeeeecf removed the d_delete. I
believe the d_drop() needs to be removed as well.

The d_drop effectively hides any file that has been accessed via ecryptfs
from the underlying tmpfs since it depends on it being hashed for it to
be accessible. I've removed the d_drop on my development node and see no
ill effects with basic testing on both tmpfs and persistent storage.

As a side effect, after ecryptfs d_drops the dentries on tmpfs, tmpfs
BUGs on umount. This is due to the dentries being unhashed.
tmpfs->kill_sb is kill_litter_super which calls d_genocide to drop
the reference pinning the dentry. It skips unhashed and negative dentries,
but shrink_dcache_for_umount_subtree doesn't. Since those dentries
still have an elevated d_count, we get a BUG().

This patch removes the d_drop call and fixes both issues.

This issue was reported at:
https://bugzilla.novell.com/show_bug.cgi?id=567887
Reported-by: Árpád Bíró <biroa@demasz.hu>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Dustin Kirkland <kirkland@canonical.com>
Cc: stable@kernel.org
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

133b8f9d

ecryptfs: fix error code for missing xattrs in lower fs · cfce08c6

Authored by Christian Pulvermacher on Mar 23, 2010

If the lower file system driver has extended attributes disabled,
ecryptfs' own access functions return -ENOSYS instead of -EOPNOTSUPP.
This breaks execution of programs in the ecryptfs mount, since the
kernel expects the latter error when checking for security
capabilities in xattrs.
Signed-off-by: Christian Pulvermacher <pulvermacher@gmx.de>
Cc: stable@kernel.org
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

cfce08c6
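
The error-code choice in cfce08c6 can be modelled with a stacked wrapper that reports a missing lower handler. The sketch below is a user-space model and not the eCryptfs source: `struct lower_inode` and `stacked_getxattr` are hypothetical names.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a lower inode whose xattr handler may be absent. */
struct lower_inode {
	long (*getxattr)(const char *name, void *buf, size_t size);
};

static long stacked_getxattr(const struct lower_inode *lower,
			     const char *name, void *buf, size_t size)
{
	if (!lower->getxattr)
		return -EOPNOTSUPP; /* not -ENOSYS: callers test for this code */
	return lower->getxattr(name, buf, size);
}
```

A caller that treats only `-EOPNOTSUPP` as "no xattr support, carry on" would otherwise take `-ENOSYS` as a hard failure, which matches the breakage the commit describes.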

eCryptfs: Decrypt symlink target for stat size · 3a60a168

Authored by Tyler Hicks on Mar 22, 2010

Create a getattr handler for eCryptfs symlinks that is capable of
reading the lower target and decrypting its path.  Prior to this patch,
a stat's st_size field would represent the strlen of the encrypted path,
while readlink() would return the strlen of the decrypted path.  This
could lead to confusion in some userspace applications, since the two
values should be equal.

https://bugs.launchpad.net/bugs/524919
Reported-by: Loïc Minier <loic.minier@canonical.com>
Cc: stable@kernel.org
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

3a60a168
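
The consistency that 3a60a168 restores can be checked from user space by comparing the two values directly. This is a small sketch using standard POSIX calls; the function name `symlink_size_consistent` is hypothetical, and targets of `PATH_MAX` bytes or longer would be truncated by the fixed buffer.

```c
#define _XOPEN_SOURCE 700
#include <limits.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if lstat() and readlink() agree on the target length,
 * 0 if they differ, and -1 on error or if path is not a symlink. */
static int symlink_size_consistent(const char *path)
{
	struct stat st;
	char buf[PATH_MAX];
	ssize_t n;

	if (lstat(path, &st) != 0 || !S_ISLNK(st.st_mode))
		return -1;
	n = readlink(path, buf, sizeof(buf));
	if (n < 0)
		return -1;
	return (off_t)n == st.st_size;
}
```

Run against a symlink on an eCryptfs mount, a result of 0 would indicate the mismatch between encrypted and decrypted path lengths that the commit message describes.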

18 Apr, 2010 1 commit

[LogFS] Set s_bdi · b8639077

Authored by Joern Engel on Apr 17, 2010

Since 32a88aa1 sync() was turned into a NOP for logfs.  Worse, sync()
would not return an error, giving the illusion that writeout had
actually happened.

Afaics jffs2 was broken as well.
Signed-off-by: Joern Engel <joern@logfs.org>

b8639077

17 Apr, 2010 2 commits

xfs: don't warn on EAGAIN in inode reclaim · f1d486a3

Authored by Dave Chinner on Apr 13, 2010

Any inode reclaim flush that returns EAGAIN will result in the inode
reclaim being attempted again later. There is no need to issue a
warning into the logs about this situation.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

f1d486a3

xfs: ensure that sync updates the log tail correctly · b6f8dd49

Authored by Dave Chinner on Apr 13, 2010

Updates to the VFS layer removed an extra ->sync_fs call into the
filesystem during the sync process (from the quota code).
Unfortunately the sync code was unknowingly relying on this call to
make sure metadata buffers were flushed via a xfs_buftarg_flush()
call to move the tail of the log forward in memory before the final
transactions of the sync process were issued.

As a result, the old code would write a very recent log tail value
to the log by the end of the sync process, and so a subsequent crash
would leave nothing for log recovery to do. Hence in qa test 182,
log recovery only replayed a small handful of inode fsync
transactions in this case.

However, with the removal of the extra ->sync_fs call, the log tail
was now not moved forward with the inode fsync transactions near the
end of the sync process, because the first (and only) buftarg flush occurred
after these transactions went to disk. The result is that log
recovery now sees a large number of transactions for metadata that
is already on disk.

This usually isn't a problem, but when the transactions include
inode chunk allocation, the inode create transactions and all
subsequent changes are replayed as we cannot rely on what is on disk
being valid. As a result, if the inode was written and contains
unlogged changes, the unlogged changes are lost, thereby violating
sync semantics.

The fix is to always issue a transaction after the buftarg flush
occurs if the log is not idle or covered. This results in a dummy
transaction being written that contains the up-to-date log tail
value, which will be very recent. Indeed, it will be at least as
recent as the old code would have left on disk, so log recovery
will behave exactly as it used to in this situation.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

b6f8dd49

16 Apr, 2010 2 commits

jfs: add jfs specific ->setattr call · c7f2e1f0

Authored by Dmitry Monakhov on Apr 16, 2010

Generic setattr is no longer responsible for quota transfer.
Use jfs_setattr for all jfs inodes.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>

c7f2e1f0

jfs: fix diAllocExt error in resizing filesystem · 2b0b3951

Authored by Bill Pemberton on Apr 16, 2010

Resizing the filesystem would result in a diAllocExt error in some
instances because changes in bmp->db_agsize would not get noticed if
goto extendBmap was called.
Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: jfs-discussion@lists.sourceforge.net
Cc: linux-kernel@vger.kernel.org

2b0b3951

15 Apr, 2010 1 commit

[LogFS] Prevent mempool_destroy NULL pointer dereference · 1f1b0008

Authored by Joern Engel on Apr 15, 2010

It would probably be better to just accept NULL pointers in
mempool_destroy().  But for the current -rc series let's keep things
simple.

This patch was lost in the cracks for a while.
Kevin Cernekee <cernekee@gmail.com> had to rediscover the problem and
send a similar patch because of it. :(
Signed-off-by: Joern Engel <joern@logfs.org>

1f1b0008
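
The trade-off mentioned in 1f1b0008, guarding at the call site instead of teaching the destroy function to accept NULL, can be illustrated with a user-space stand-in. This is not the kernel mempool API: `struct pool`, `pool_destroy` and `pool_destroy_checked` are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical pool type standing in for a kernel mempool. */
struct pool {
	void *slab;
};

/* Destroy routine that assumes a valid pool, as the commit implies. */
static void pool_destroy(struct pool *p)
{
	free(p->slab); /* dereferences p: would crash if p were NULL */
	free(p);
}

/* Caller-side guard for a teardown path that may run after a partially
 * failed initialisation, where the pool was never created. */
static void pool_destroy_checked(struct pool *p)
{
	if (p)
		pool_destroy(p);
}
```

Keeping the check in the caller leaves the shared destroy routine untouched, which fits the commit's wish to keep things simple for the current -rc series.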

openanolis / cloud-kernel synced successfully about 1 year ago