Commits · f25b874d39461935b1b5bbffaa622e735e79d49e · xiphi1978 / linux

08 Oct, 2008 15 commits

NFS: missing nfs_fattr_init in nfs3_proc_getacl and nfs3_proc_setacls (resend #2) · f25b874d

Authored by Jeff Layton on Aug 18, 2008

The fattrs used in the NFSv3 getacl/setacl calls are not being properly
initialized. This occasionally causes nfs_update_inode to fall into
NFSv4 specific codepaths when handling post-op attrs from these calls.

Thanks to Cai Qian for noticing the spurious NFSv4 messages in debug
output from a v3 mount...
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

f25b874d

nfs: remove an obsolete nfs_flock comment · f200c11c

Authored by J. Bruce Fields on Aug 14, 2008

We *do* now allow bsd flocks over nfs.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

f200c11c

nfs: BUG_ON in nfs_follow_mountpoint · 44d5759d

Authored by Denis V. Lunev on Aug 11, 2008

Unfortunately, BUG_ON(IS_ROOT(dentry)) can happen inside
nfs_follow_mountpoint with NFS running Fedora 8 using a
specific setup.
https://bugzilla.redhat.com/show_bug.cgi?id=458622

So the NFS client should handle this situation gracefully.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Trond Myklebust <Trond.Myklebust@netapp.com>
CC: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

44d5759d

nfs: ERR_PTR is expected on failure from nfs_do_clone_mount · fd08d7e9

Authored by Denis V. Lunev on Jul 31, 2008

Replace NULL with ERR_PTR(-EINVAL).
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

fd08d7e9
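
The convention behind this change can be illustrated in userspace. The sketch below re-creates simplified versions of the kernel's ERR_PTR(), PTR_ERR() and IS_ERR() helpers; struct mount_stub and clone_mount_stub() are hypothetical stand-ins, not the real nfs_do_clone_mount(). A caller that only tests IS_ERR() would treat a NULL return as success, which is presumably why the failure path needs to return an encoded errno.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace imitation of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True when ptr lies in the top MAX_ERRNO bytes of the address space. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct vfsmount. */
struct mount_stub {
	int id;
};

/*
 * Hypothetical callee: on failure it returns an encoded errno rather
 * than NULL, so callers that check IS_ERR() notice the error.
 */
static struct mount_stub *clone_mount_stub(struct mount_stub *parent, int ok)
{
	assert(parent != NULL);
	if (!ok)
		return ERR_PTR(-EINVAL);
	return parent;
}
```

Note that IS_ERR(NULL) is false in this scheme, so a NULL return would slip past an IS_ERR() check and be dereferenced later.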

fix fs/nfs/nfsroot.c compilation · bb8a3b53

Authored by Adrian Bunk on Jul 25, 2008

This patch fixes the following compile error caused by
commit f9247273
(UFS: add const to parser token tabl):

<--  snip  -->

...
  CC      fs/nfs/nfsroot.o
/home/bunk/linux/kernel-2.6/git/linux-2.6/fs/nfs/nfsroot.c:130: error: tokens causes a section type conflict
make[3]: *** [fs/nfs/nfsroot.o] Error 1

<--  snip  -->
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

bb8a3b53

NFS: Allow concurrent inode revalidation · 691beb13

Authored by Trond Myklebust on Oct 05, 2008

Currently, if two processes are both trying to revalidate metadata for the
same inode, they will find themselves being serialised. There is no good
justification for this now that we have improved our ability to detect
stale attribute data, so we should remove that serialisation.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

691beb13

NFS: Fix up nfs_setattr_update_inode() · 2f28ea61

Authored by Trond Myklebust on Oct 05, 2008

Ensure that it sets the inode metadata under the correct spinlock.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

2f28ea61

NFS: Don't clear nfsi->cache_validity in nfs_check_inode_attributes() · 076f1fc9

Authored by Trond Myklebust on Oct 05, 2008

If we're merely checking the inode attributes because we suspect that the
'updated' attributes returned by the RPC call are stale, then we shouldn't
be doing weak cache consistency updates or clearing the cache_validity
flags.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

076f1fc9

NFS: Convert __nfs_revalidate_inode() to use nfs_refresh_inode() · 4dc05efb

Authored by Trond Myklebust on Sep 23, 2008

In the case where there are parallel RPC calls to the same inode, we may
receive stale metadata due to the lack of ordering, hence the sanity
checking of metadata in nfs_refresh_inode().
Currently, __nfs_revalidate_inode() is calling nfs_update_inode() directly,
without any further sanity checks, and hence may end up setting the inode
up with stale metadata.

The fix is to use nfs_refresh_inode() instead of nfs_update_inode().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

4dc05efb

NFS: Fix nfs_post_op_update_inode_force_wcc() · d65f557f

Authored by Trond Myklebust on Oct 05, 2008

If we believe that the attributes are old (see nfs_refresh_inode()), then
we shouldn't force an update.
Also ensure that we hold the inode->i_lock across attribute checks and the
call to nfs_refresh_inode_locked() to ensure that we don't race with other
attribute updates.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

d65f557f

NFS: Fix the NFS attribute update · a10ad176

Authored by Trond Myklebust on Sep 23, 2008

Currently nfs_refresh_inode() will only update the inode metadata if it
sees that the RPC call that returned the nfs_fattr was started
after the last update of the inode. This means that if we have parallel
RPC calls to the same inode (when sending WRITE calls, for instance), we
may often miss updates.

This patch attempts to recover those missed updates by also accepting
them if the ctime in the nfs_fattr is more recent than the inode's
cached ctime.
It also recovers the case where the file size has increased, but the
ctime has not been updated due to limited ctime resolution.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

a10ad176
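
The acceptance rule described in this commit message can be sketched roughly as follows. This is a simplified userspace model, not the kernel code: struct cached_inode, struct reply_attrs and attrs_look_newer() are hypothetical names, timestamps are plain integers, and the real implementation would need wraparound-safe time comparisons and proper locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the cached inode state. */
struct cached_inode {
	uint64_t last_updated;  /* time of the last attribute update */
	int64_t  ctime;         /* cached change time */
	uint64_t size;          /* cached file size */
};

/* Hypothetical, simplified view of attributes returned by an RPC. */
struct reply_attrs {
	uint64_t rpc_start;     /* time the RPC call was started */
	int64_t  ctime;
	uint64_t size;
};

/*
 * Decide whether returned attributes should be applied:
 *  1. the RPC started after the last update (the original rule), or
 *  2. the returned ctime is more recent than the cached ctime, or
 *  3. the file size grew even though ctime did not visibly change.
 */
static bool attrs_look_newer(const struct cached_inode *ci,
			     const struct reply_attrs *fa)
{
	assert(ci != NULL && fa != NULL);
	if (fa->rpc_start > ci->last_updated)
		return true;
	if (fa->ctime > ci->ctime)
		return true;
	if (fa->size > ci->size)
		return true;
	return false;
}
```

Rules 2 and 3 are what let a reply from an older, parallel RPC (such as one of several in-flight WRITE calls) still update the inode.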

NFS: Clean up nfs_refresh_inode() and nfs_post_op_update_inode() · 870a5be8

Authored by Trond Myklebust on Oct 05, 2008

Try to avoid taking and dropping the inode->i_lock more than once. Do so by
moving the code in nfs_refresh_inode() that needs to be done under the
spinlock into a function nfs_refresh_inode_locked(), and then having both
nfs_refresh_inode() and nfs_post_op_update_inode() call it directly.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

870a5be8

NFS: Add mount options for controlling the lookup cache · 7973c1f1

Authored by Trond Myklebust on Jul 15, 2008

Add the following NFS-specific mount options to the parser.

    -o lookupcache=all          /* Default: cache positive & negative
                                   dentries */
    -o lookupcache=pos[itive]   /* Don't cache negative dentries */
    -o lookupcache=none         /* Strict revalidation of all dentries */
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

7973c1f1
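
As an illustration of the option values listed above, here is a minimal userspace sketch of mapping the lookupcache= argument to a mode. The enum and parse_lookupcache() are hypothetical names, and the sketch assumes that "pos[itive]" means both "pos" and "positive" are accepted; the kernel's actual parser is table-driven and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical modes corresponding to the three documented values. */
enum lookup_cache_mode {
	LOOKUP_CACHE_ALL,       /* cache positive and negative dentries */
	LOOKUP_CACHE_POSITIVE,  /* do not cache negative dentries */
	LOOKUP_CACHE_NONE,      /* strict revalidation of all dentries */
	LOOKUP_CACHE_INVALID    /* unrecognised value */
};

/* Map the text after "lookupcache=" to a mode. */
static enum lookup_cache_mode parse_lookupcache(const char *value)
{
	assert(value != NULL);
	if (strcmp(value, "all") == 0)
		return LOOKUP_CACHE_ALL;
	if (strcmp(value, "pos") == 0 || strcmp(value, "positive") == 0)
		return LOOKUP_CACHE_POSITIVE;
	if (strcmp(value, "none") == 0)
		return LOOKUP_CACHE_NONE;
	return LOOKUP_CACHE_INVALID;
}
```

On the command line this corresponds to something like "mount -o lookupcache=none ..." for the strictest behaviour.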

NFS: Don't apply NFS_MOUNT_FLAGMASK to text-based mounts · ff3525a5

Authored by Trond Myklebust on Aug 15, 2008

The point of introducing text-based mounts was to allow us to add
functionality without having to worry about legacy binary mount formats.
The mask should be there in order to ensure that binary formats don't start
enabling features that they cannot support. There is no justification for
applying it to the text mount path.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

ff3525a5

NFS: Add options for finer control of the lookup cache · 4eec952e

Authored by Trond Myklebust on Jul 15, 2008

Add the flag NFS_MOUNT_LOOKUP_CACHE_NONEG to turn off the caching of
negative dentries. In reality what we do is to force
nfs_lookup_revalidate() to always discard negative dentries.

Add the flag NFS_MOUNT_LOOKUP_CACHE_NONE for enforcing stricter
revalidation of dentries. It forces the revalidate code to always do a
lookup instead of just checking the cached mtime of the parent directory.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

4eec952e
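
The effect of the two flags on dentry revalidation might be modelled like this. It is a deliberately simplified sketch: the flag values, the enum and revalidate_sketch() are hypothetical, and the real nfs_lookup_revalidate() handles many more cases than the three shown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values; the kernel's actual bit assignments differ. */
#define SKETCH_LOOKUP_CACHE_NONEG 0x1u
#define SKETCH_LOOKUP_CACHE_NONE  0x2u

enum reval_action {
	REVAL_KEEP,     /* trust the cached dentry */
	REVAL_DISCARD,  /* drop the dentry */
	REVAL_LOOKUP    /* ask the server with a fresh lookup */
};

/*
 * negative:         the dentry records that the name does not exist
 * parent_unchanged: the parent's cached mtime still matches
 */
static enum reval_action revalidate_sketch(unsigned int flags, bool negative,
					   bool parent_unchanged)
{
	/* NONEG: negative dentries are always discarded. */
	if (negative && (flags & SKETCH_LOOKUP_CACHE_NONEG))
		return REVAL_DISCARD;
	/* NONE: never rely on the parent's mtime, always look up. */
	if (flags & SKETCH_LOOKUP_CACHE_NONE)
		return REVAL_LOOKUP;
	/* Default: the cheap check on the parent's cached mtime. */
	return parent_unchanged ? REVAL_KEEP : REVAL_LOOKUP;
}
```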

07 Oct, 2008 2 commits

NFS: Clean up nfs_sb_active/nfs_sb_deactive · 1daef0a8

Authored by Trond Myklebust on Jul 27, 2008

Instead of causing umount requests to block on server->active_wq while the
asynchronous sillyrename deletes are executing, we can use the sb->s_active
counter to obtain a reference to the super_block, and then release that
reference in nfs_async_unlink_release().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

1daef0a8

NFS: Fix nfs_file_llseek() · d5e66348

Authored by Trond Myklebust on Sep 23, 2008

After the BKL removal patches were applied to the rest of the NFS code, the
BKL protection in nfs_file_llseek() is no longer sufficient to ensure that
inode->i_size is read safely in generic_file_llseek_unlocked().

In order to fix the situation, we either have to replace the naked read of
inode->i_size in generic_file_llseek_unlocked() with i_size_read(), or the
whole thing needs to be executed under the inode->i_lock.
In order to avoid disrupting other filesystems, avoid touching
generic_file_llseek_unlocked() for now...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

d5e66348

03 Oct, 2008 2 commits

mm: tiny-shmem nommu fix · 4b19de6d

Authored by Nick Piggin on Oct 02, 2008

The previous patch db203d53 ("mm:
tiny-shmem fix lock ordering: mmap_sem vs i_mutex") to fix the lock
ordering in tiny-shmem breaks shared anonymous and IPC memory on NOMMU
architectures because it was using the expanding truncate to signal ramfs
to allocate a physically contiguous RAM backing the inode (otherwise it is
unusable for "memory mapping" it to userspace).

However do_truncate is what caused the lock ordering error, due to it
taking i_mutex.  In this case, we can actually just call ramfs directly to
allocate memory for the mapping, rather than go via truncate.
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

4b19de6d

inotify: fix lock ordering wrt do_page_fault's mmap_sem · 16dbc6c9

Authored by Nick Piggin on Oct 02, 2008

Fix inotify lock order reversal with mmap_sem due to holding locks over
copy_to_user.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Reported-by: "Daniel J Blueman" <daniel.blueman@gmail.com>
Tested-by: "Daniel J Blueman" <daniel.blueman@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

16dbc6c9

29 Sep, 2008 2 commits

mm owner: fix race between swapoff and exit · 31a78f23

Authored by Balbir Singh on Sep 28, 2008

There's a race between mm->owner assignment and swapoff, more easily
seen when task slab poisoning is turned on.  The condition occurs when
try_to_unuse() runs in parallel with an exiting task.  A similar race
can occur with callers of get_task_mm(), such as /proc/<pid>/<mmstats>
or ptrace or page migration.

CPU0                                    CPU1
                                        try_to_unuse
                                        looks at mm = task0->mm
                                        increments mm->mm_users
task 0 exits
mm->owner needs to be updated, but no
new owner is found (mm_users > 1, but
no other task has task->mm = task0->mm)
mm_update_next_owner() leaves
                                        mmput(mm) decrements mm->mm_users
task0 freed
                                        dereferencing mm->owner fails

The fix is to notify the subsystem via mm_owner_changed callback(),
if no new owner is found, by specifying the new task as NULL.

Jiri Slaby:
mm->owner was set to NULL prior to calling cgroup_mm_owner_callbacks(), but
must be set after that, so as not to pass NULL as old owner causing oops.

Daisuke Nishimura:
mm_update_next_owner() may set mm->owner to NULL, but mem_cgroup_from_task()
and its callers need to take account of this situation to avoid oops.

Hugh Dickins:
Lockdep warning and hang below exec_mmap() when testing these patches.
exit_mm() up_reads mmap_sem before calling mm_update_next_owner(),
so exec_mmap() now needs to do the same.  And with that repositioning,
there's now no point in mm_need_new_owner() allowing for NULL mm.
Reported-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

31a78f23

Fix NULL pointer dereference in proc_sys_compare · d0185c08

Authored by Linus Torvalds on Sep 29, 2008

The VFS interface for the 'd_compare()' is a bit special (read: 'odd'),
because it really just essentially replaces a memcmp().  The filesystem
is supposed to just compare the two names with whatever case-independent
or other function.

And when I say 'is supposed to', I obviously mean that 'procfs does odd
things, and actually looks at the dentry that we don't even pass down,
rather than just the name'.  Which results in problems, because we
actually call d_compare before we have even verified that the dentry is
still hashed at all.

And that causes a problem since the inode that procfs looks at may have
been free'd and the d_inode pointer is NULL.  procfs just assumes that
all dentries are positive, since procfs itself never generates a
negative one.  But memory pressure will still result in the dentry
getting torn down, and as it is removed by RCU, it still remains visible
on some lists - and to d_compare.

If the filesystem just did a name comparison, we wouldn't care.  And we
could just fix procfs to know about negative dentries too.  But rather
than have the low-level filesystems know about internal VFS details,
just move the check for an unhashed dentry up a bit, so that we will only
call d_compare on dentries that are still active.

The actual oops this caused didn't look like a NULL pointer dereference
because procfs did a 'container_of(inode, struct proc_inode, vfs_inode)'
to get at its internal proc_inode information from the inode pointer,
and accessed a field below the inode. So the oops would look something
like

	BUG: unable to handle kernel paging request at fffffffffffffff0
	IP: [<ffffffff802bc6c6>] proc_sys_compare+0x36/0x50

and was seen on both x86-64 (Alexey Dobriyan and Hugh Dickins) and
ppc64 (Hugh Dickins).
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d0185c08
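
To see why a NULL d_inode produces a fault near the top of the address space rather than near zero, consider what container_of() computes. The sketch below uses hypothetical stub structures (the real struct proc_inode has a different layout) and does the arithmetic on integers so that no invalid pointer is ever formed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the VFS inode. */
struct vfs_inode_stub {
	long i_ino;
};

/*
 * Hypothetical container: some private fields sit before the embedded
 * inode, as the commit message implies for procfs.
 */
struct proc_inode_stub {
	void *private_a;
	void *private_b;
	struct vfs_inode_stub vfs_inode;
};

/*
 * Address that container_of(member, struct proc_inode_stub, vfs_inode)
 * would yield for a member located at member_addr.
 */
static uintptr_t container_addr(uintptr_t member_addr)
{
	return member_addr - offsetof(struct proc_inode_stub, vfs_inode);
}
```

With member_addr equal to 0 (a NULL inode pointer) the subtraction wraps around, so any field in front of the embedded inode ends up at an address just below the top of the address space, which is consistent with the fffffffffffffff0 seen in the oops.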

26 Sep, 2008 2 commits

[XFS] Remove xfs_iext_irec_compact_full() · 71a8c87f

Authored by Lachlan McIlroy on Sep 26, 2008

Yet another bug was found in xfs_iext_irec_compact_full() and while the
source of the bug was found it wasn't an easy task to track it down
because the conditions are very difficult to reproduce.

A HUGE thank-you goes to Russell Cattelan and Eric Sandeen for their
significant effort in tracking down the source of this corruption.

xfs_iext_irec_compact_full() and xfs_iext_irec_compact_pages() are almost
identical - they both compact indirect extent lists by moving extents from
subsequent buffers into earlier ones. xfs_iext_irec_compact_pages() only
moves extents if all of the extents in the next buffer will fit into the
empty space in the buffer before it. xfs_iext_irec_compact_full() will go
a step further and move part of the next buffer if all the extents won't
fit. It will then shift the remaining extents in the next buffer up to the
start of the buffer. The bug here was that we did not update er_extoff and
this caused extent list corruption.

It does not appear that this extra functionality gains us much. Calling
xfs_iext_irec_compact_pages() instead will do a good enough job at
compacting the indirect list and will be quicker too.

For the case in xfs_iext_indirect_to_direct() the total number of extents
in the indirect list will fit into one buffer so we will never need the
extra functionality of xfs_iext_irec_compact_full() there.

Also xfs_iext_irec_compact_pages() doesn't need to do a memmove() (the
buffers will never overlap) so we don't want the performance hit that can
incur.

SGI-PV: 987159

SGI-Modid: xfs-linux-melb:xfs-kern:32166a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

71a8c87f

[XFS] Fix extent list corruption in xfs_iext_irec_compact_full(). · f1ccd295

Authored by Lachlan McIlroy on Sep 26, 2008

If we don't move all the records from the next buffer into the current
buffer then we need to update the er_extoff field of the next buffer as we
shift the remaining records to the start of the buffer.

SGI-PV: 987159

SGI-Modid: xfs-linux-melb:xfs-kern:32165a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Russell Cattelan <cattelan@thebarn.com>

f1ccd295

25 Sep, 2008 1 commit

9p: use an IS_ERR test rather than a NULL test · 62aa528e

Authored by Julien Brunel on Sep 24, 2008

In case of error, the function p9_client_walk returns an ERR pointer, but
never returns a NULL pointer.  So a NULL test that comes after an IS_ERR
test should be deleted.

The semantic match that finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@match_bad_null_test@
expression x, E;
statement S1,S2;
@@
x = p9_client_walk(...)
... when != x = E
*  if (x != NULL)
S1 else S2
// </smpl>
Signed-off-by: Julien Brunel <brunel@diku.dk>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

62aa528e

18 Sep, 2008 1 commit

UBIFS: fix printk format warnings · 7424bac8

Authored by Alexander Beregalov on Sep 17, 2008

fs/ubifs/dir.c:428: warning: format '%llu' expects type 'long long
unsigned int', but argument 5 has type 'long unsigned int'

fs/ubifs/debug.c:541: warning: format '%llu' expects type 'long long
unsigned int', but argument 2 has type 'long unsigned int'
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

7424bac8
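
Warnings of this kind arise when a value whose type varies between architectures is printed with a fixed-width format. One common way to silence them is an explicit cast to the type the format expects; the sketch below shows that approach in userspace with snprintf(). The commit message does not say which fix the patch actually used, and format_ino() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format an inode number with "%llu". The argument is declared as
 * unsigned long here to mirror the warning; the cast makes the
 * argument type match the format on every architecture.
 */
static int format_ino(char *buf, size_t len, unsigned long ino)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "ino %llu", (unsigned long long)ino);
}
```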

17 Sep, 2008 10 commits

UBIFS: remove incorrect assert · 6e14968c

Authored by Adrian Hunter on Sep 17, 2008

The assert was not valid because one of the variables
'taken_empty_lebs' has transient values out of sync
with the other variables.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

6e14968c

UBIFS: TNC / GC race fixes · 6dcfac4f

Authored by Adrian Hunter on Sep 12, 2008

- update GC sequence number if any nodes may have been moved,
  even if GC did not finish the LEB
- don't ignore error return when reading
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

6dcfac4f

UBIFS: create the name of the background thread in every case · 0855f310

Authored by Sebastian Siewior on Sep 09, 2008

If the ubifs partition is mounted RO and then remounted RW we end
up with no thread name in ubifs_remount_rw() and the thread appears
nameless.
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

0855f310

[XFS] Don't do I/O beyond eof when unreserving space · 2fd6f6ec

Authored by Lachlan McIlroy on Sep 17, 2008

When unreserving space with boundaries that are not block aligned we round
up the start and round down the end boundaries and then use this function,
xfs_zero_remaining_bytes(), to zero the parts of the blocks that got
dropped during the rounding. The problem is we don't consider if these
blocks are beyond eof. Worse still is if we encounter delayed allocations
beyond eof we will try to use the magic delayed allocation block number as
a real block number. If the file size is ever extended to expose these
blocks then we'll go through xfs_zero_eof() to zero them anyway.

SGI-PV: 983683

SGI-Modid: xfs-linux-melb:xfs-kern:32055a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>

2fd6f6ec
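
The rounding described above leaves a partial block at each end of the range that has to be zeroed by hand. The following sketch shows that arithmetic and one possible way to keep the zeroing inside eof by clamping. All names are hypothetical and the clamping is only an illustration of the idea; the commit message does not describe how xfs_zero_remaining_bytes() was actually changed.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open byte range [start, end). */
struct byte_range {
	uint64_t start;
	uint64_t end;
};

/* Round x up to the next multiple of block_size. */
static uint64_t round_up_to(uint64_t x, uint64_t block_size)
{
	assert(block_size > 0);
	return (x + block_size - 1) / block_size * block_size;
}

/* Round x down to the previous multiple of block_size. */
static uint64_t round_down_to(uint64_t x, uint64_t block_size)
{
	assert(block_size > 0);
	return x / block_size * block_size;
}

/*
 * Restrict a range that needs zeroing so that it never extends past
 * eof. The result is empty (start == end) when the whole range lies
 * at or beyond eof.
 */
static struct byte_range clamp_to_eof(struct byte_range r, uint64_t eof)
{
	if (r.start > eof)
		r.start = eof;
	if (r.end > eof)
		r.end = eof;
	return r;
}
```

For a request covering bytes 100 to 10000 with 4096-byte blocks, whole blocks are unreserved over [4096, 8192), and the leftover pieces [100, 4096) and [8192, 10000) are the candidates for zeroing, each clamped against eof first.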

[XFS] Fix use-after-free with buffers · e1f5dbd7

Authored by Lachlan McIlroy on Sep 17, 2008

We have a use-after-free issue where log completions access buffers via
the buffer log item and the buffer has already been freed. Fix this by
taking a reference on the buffer when attaching the buffer log item and
release the hold when the buffer log item is detached and we no longer
need the buffer. Also create a new function xfs_buf_item_free() to combine
some common code.

SGI-PV: 985757

SGI-Modid: xfs-linux-melb:xfs-kern:32025a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>

e1f5dbd7

[XFS] Prevent lockdep false positives when locking two inodes. · f9114eba

Authored by David Chinner on Sep 17, 2008

If we call xfs_lock_two_inodes() to grab both the iolock and the ilock,
then drop the ilocks on both inodes, then grab them again (as
xfs_swap_extents() does) then lockdep will report a locking order problem.
This is a false positive.

To avoid this, disallow xfs_lock_two_inodes() from locking both inode locks
at once - force callers to make two separate calls. This means that nested
dropping and regaining of the ilocks will retain the same lockdep subclass
and so lockdep will not see anything wrong with this code.

SGI-PV: 986238

SGI-Modid: xfs-linux-melb:xfs-kern:31999a
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

f9114eba

[XFS] Fix barrier status change detection. · b5b8c9ac

Authored by David Chinner on Sep 17, 2008

The current code in xlog_iodone() uses the wrong macro to check if the
barrier has been cleared due to an EOPNOTSUPP error from the lower layer.

SGI-PV: 986143

SGI-Modid: xfs-linux-melb:xfs-kern:31984a
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Nathaniel W. Turner <nate@houseofnate.net>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

b5b8c9ac

[XFS] Prevent direct I/O from mapping extents beyond eof · 364f358a

Authored by Lachlan McIlroy on Sep 17, 2008

With the help from some tracing I found that we try to map extents beyond
eof when doing a direct I/O read. It appears that the way to inform the
generic direct I/O path (ie do_direct_IO()) that we have breached eof is
to return an unmapped buffer from xfs_get_blocks_direct(). This will cause
do_direct_IO() to jump to the hole handling code where it will check for
eof and then abort.

This problem was found because a direct I/O read was trying to map beyond
eof and was encountering delayed allocations. The delayed allocations
beyond eof are speculative allocations and they didn't get converted when
the direct I/O flushed the file because there was only enough space in the
current AG to convert and write out the dirty pages within eof. Note that
xfs_iomap_write_allocate() won't necessarily convert all the delayed
allocation passed to it - it will return after allocating the first extent
- so if the delayed allocation extends beyond eof then it will stay that
way.

SGI-PV: 983683

SGI-Modid: xfs-linux-melb:xfs-kern:31929a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>

364f358a

[XFS] Fix regression introduced by remount fixup · 6efdf281

Authored by Christoph Hellwig on Sep 17, 2008

Logically we would return an error in xfs_fs_remount code to prevent users
from believing they might have changed mount options using remount which
can't be changed.

But unfortunately mount(8) adds all options from mtab and fstab to the
mount arguments in some cases so we can't blindly reject options, but have
to check for each specified option if it actually differs from the
currently set option and only reject it if that's the case.

Until that is implemented we return success for every remount request, and
silently ignore all options that we can't actually change.

SGI-PV: 985710

SGI-Modid: xfs-linux-melb:xfs-kern:31908a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

6efdf281

[XFS] Move memory allocations for log tracing out of the critical path · 31bd61f2

Authored by Lachlan McIlroy on Sep 17, 2008

Memory allocations for log->l_grant_trace and iclog->ic_trace are done on
demand when the first event is logged. In xlog_state_get_iclog_space() we
call xlog_trace_iclog() under a spinlock and allocating memory here can
cause us to sleep with a spinlock held and deadlock the system.

For the log grant tracing we use KM_NOSLEEP but that means we can lose
trace entries. Since there is no locking to serialize the log grant
tracing we could race and have multiple allocations and leak memory.

So move the allocations to where we initialize the log/iclog structures.
Use KM_NOFS to avoid recursing into the filesystem and drop log->l_trace
since it's not even used.

SGI-PV: 983738

SGI-Modid: xfs-linux-melb:xfs-kern:31896a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>

31bd61f2

14 Sep, 2008 4 commits

rescan_partitions(): make device capacity errors non-fatal · 8d99f83b

Authored by Andrew Morton on Sep 13, 2008

Herton Krzesinski reports that the error-checking changes in
04ebd4ae ("block/ioctl.c and
fs/partition/check.c: check value returned by add_partition") cause his
buggy USB camera to no longer mount.  "The camera is an Olympus X-840.
The original issue comes from the camera itself: its format program
creates a partition with an off by one error".

Buggy devices happen.  It is better for the kernel to warn and to proceed
with the mount.
Reported-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Cc: Abdel Benamrouche <draconux@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8d99f83b

mm: ifdef Quicklists in /proc/meminfo · d7a3e495

Authored by Hugh Dickins on Sep 13, 2008

A "Quicklists:          0 kB" line has just started appearing in
/proc/meminfo, but most architectures (including x86) don't have
them configured, so #ifdef it, like the highmem lines.

And those architectures which do have quicklists configured are
using them for page tables: so let's place it next to PageTables.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d7a3e495

bfs: fix Lockdep warning · 1558182f

Authored by Eric Sesterhenn on Sep 13, 2008

This fixes:

  =============================================
  [ INFO: possible recursive locking detected ]
  2.6.27-rc5-00283-g70bb0896 #68
  ---------------------------------------------
  touch/6855 is trying to acquire lock:
   (&info->bfs_lock){--..}, at: [<c02262f5>] bfs_delete_inode+0x9e/0x18c

  but task is already holding lock:
   (&info->bfs_lock){--..}, at: [<c0226c00>] bfs_create+0x45/0x187

  other info that might help us debug this:
  2 locks held by touch/6855:
   #0:  (&type->i_mutex_dir_key#5){--..}, at: [<c018ad13>] do_filp_open+0x10b/0x62f
   #1:  (&info->bfs_lock){--..}, at: [<c0226c00>] bfs_create+0x45/0x187

  stack backtrace:
  Pid: 6855, comm: touch Not tainted 2.6.27-rc5-00283-g70bb0896 #68
   [<c013e769>] validate_chain+0x458/0x9f4
   [<c013bece>] ? trace_hardirqs_off+0xb/0xd
   [<c013f36b>] __lock_acquire+0x666/0x6e0
   [<c013f440>] lock_acquire+0x5b/0x77
   [<c02262f5>] ? bfs_delete_inode+0x9e/0x18c
   [<c06aab74>] mutex_lock_nested+0xbc/0x234
   [<c02262f5>] ? bfs_delete_inode+0x9e/0x18c
   [<c02262f5>] ? bfs_delete_inode+0x9e/0x18c
   [<c02262f5>] bfs_delete_inode+0x9e/0x18c
   [<c0226257>] ? bfs_delete_inode+0x0/0x18c
   [<c01925e1>] generic_delete_inode+0x94/0xfe
   [<c019265d>] generic_drop_inode+0x12/0x12f
   [<c0191b7e>] iput+0x4b/0x4e
   [<c0226d1e>] bfs_create+0x163/0x187
   [<c0188b42>] vfs_create+0xa6/0x114
   [<c018adb5>] do_filp_open+0x1ad/0x62f
   [<c0107cdc>] ? native_sched_clock+0x82/0x96
   [<c06ac309>] ? _spin_unlock+0x27/0x3c
   [<c019379e>] ? alloc_fd+0xbf/0xc9
   [<c06ae2f4>] ? sub_preempt_count+0x9d/0xab
   [<c019379e>] ? alloc_fd+0xbf/0xc9
   [<c0180391>] do_sys_open+0x42/0xb8
   [<c041d564>] ? trace_hardirqs_on_thunk+0xc/0x10
   [<c0180449>] sys_open+0x1e/0x26
   [<c01038bd>] sysenter_do_call+0x12/0x31
   =======================

The problem is that we don't unlock the bfs->lock mutex before calling
iput (we do in the other cases).
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1558182f

proc: more debugging for "already registered" case · 665020c3

Authored by Alexey Dobriyan on Sep 13, 2008

Print parent directory name as well.

The aim is to catch non-creation of parent directory when proc_mkdir will
return NULL and all subsequent registrations go directly in /proc instead
of intended directory.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Fixed insane printk string while at it.  - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

665020c3

10 Sep, 2008 1 commit

ocfs2: Fix a bug in direct IO read. · 0e116227

Authored by Tao Ma on Sep 03, 2008

ocfs2 will become read-only if we try to read bytes past the end of
i_size. This can be easily reproduced with the following steps:
1. mkfs an ocfs2 volume with bs=4k cs=4k and nosparse.
2. create a small file (say less than 100 bytes), which will be
   allocated 1 cluster.
3. read 8196 bytes from the kernel using O_DIRECT which exceeds the limit.
4. The ocfs2 volume becomes read-only and dmesg shows:
OCFS2: ERROR (device sda13): ocfs2_direct_IO_get_blocks:
Inode 66010 has a hole at block 1
File system is now read-only due to the potential of on-disk corruption.
Please run fsck.ocfs2 once the file system is unmounted.

So suppress the ERROR message.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

0e116227