Commits · 158a5d61f780b707da0b559cf60d72294006aa97 · openeuler / raspberrypi-kernel

26 Feb, 2011 5 commits

aio: fix race between io_destroy() and io_submit() · 7137c6bd

Authored by Jan Kara on Feb 25, 2011

A race can occur when io_submit() races with io_destroy():

 CPU1						CPU2
io_submit()
  do_io_submit()
    ...
    ctx = lookup_ioctx(ctx_id);
						io_destroy()
    Now do_io_submit() holds the last reference to ctx.
    ...
    queue new AIO
    put_ioctx(ctx) - frees ctx with active AIOs

We solve this issue by checking whether ctx is being destroyed in AIO
submission path after adding new AIO to ctx.  Then we are guaranteed that
either io_destroy() waits for new AIO or we see that ctx is being
destroyed and bail out.

Cc: Nick Piggin <npiggin@kernel.dk>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7137c6bd

aio: fix rcu ioctx lookup · 3bd9a5d7

Authored by Nick Piggin on Feb 25, 2011

aio-dio-invalidate-failure GPFs in aio_put_req from io_submit.

lookup_ioctx doesn't implement the rcu lookup pattern properly.
rcu_read_lock does not prevent refcount going to zero, so we might take
a refcount on a zero count ioctx.

Fix the bug by atomically testing for zero refcount before incrementing.

[jack@suse.cz: added comment into the code]
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3bd9a5d7
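
The "test for zero before incrementing" idea in the commit above can be illustrated in user space with C11 atomics. This is only a sketch of the general pattern, not the kernel code: `struct ctx`, `users` and `ctx_try_get()` are hypothetical names, and the kernel uses its own atomic helpers rather than `<stdatomic.h>`.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's ioctx structure. */
struct ctx {
    atomic_int users;   /* reference count; 0 means the object is dying */
};

/*
 * Take a reference only if the count is still non-zero.
 * Returns true on success, false if the object is already being freed.
 */
static bool ctx_try_get(struct ctx *c)
{
    int old = atomic_load(&c->users);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&c->users, &old, old + 1))
            return true;
    }
    return false;
}
```

A lookup that finds an object under a read-side lock would call `ctx_try_get()` and treat a `false` result as "not found", so a count that has already dropped to zero is never resurrected.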

ldm: corrupted partition table can cause kernel oops · 294f6cf4

Authored by Timo Warns on Feb 25, 2011

The kernel automatically evaluates partition tables of storage devices.
The code for evaluating LDM partitions (in fs/partitions/ldm.c) contains
a bug that causes a kernel oops on certain corrupted LDM partitions.  A
kernel subsystem seems to crash, because, after the oops, the kernel no
longer recognizes newly connected storage devices.

The patch changes ldm_parse_vmdb() to validate the value of vblk_size.
Signed-off-by: Timo Warns <warns@pre-sense.de>
Cc: Eugene Teo <eugeneteo@kernel.sg>
Acked-by: Richard Russon <ldm@flatcap.org>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

294f6cf4

epoll: prevent creating circular epoll structures · 22bacca4

Authored by Davide Libenzi on Feb 25, 2011

In several places, an epoll fd can call another file's ->f_op->poll()
method with ep->mtx held.  This is in general unsafe, because that other
file could itself be an epoll fd that contains the original epoll fd.

The code defends against this possibility in its own ->poll() method using
ep_call_nested, but there are several other unsafe calls to ->poll
elsewhere that can be made to deadlock.  For example, the following simple
program causes the call in ep_insert to recursively call the original fd's
->poll, leading to deadlock:

 #include <unistd.h>
 #include <sys/epoll.h>

 int main(void) {
     int e1, e2, p[2];
     struct epoll_event evt = {
         .events = EPOLLIN
     };

     e1 = epoll_create(1);
     e2 = epoll_create(2);
     pipe(p);

     epoll_ctl(e2, EPOLL_CTL_ADD, e1, &evt);
     epoll_ctl(e1, EPOLL_CTL_ADD, p[0], &evt);
     write(p[1], p, sizeof p);
     epoll_ctl(e1, EPOLL_CTL_ADD, e2, &evt);

     return 0;
 }

On insertion, check whether the inserted file is itself a struct epoll,
and if so, do a recursive walk to detect whether inserting this file would
create a loop of epoll structures, which could lead to deadlock.

[nelhage@ksplice.com: Use epmutex to serialize concurrent inserts]
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Reported-by: Nelson Elhage <nelhage@ksplice.com>
Tested-by: Nelson Elhage <nelhage@ksplice.com>
Cc: <stable@kernel.org>		[2.6.34+, possibly earlier]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

22bacca4
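
The recursive walk described in the commit above amounts to a reachability check before adding an edge. The sketch below models that idea in plain C; `struct ep_node`, `ep_reaches()` and `ep_add()` are hypothetical names for illustration and do not mirror the kernel's data structures or locking.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 8   /* arbitrary limit for this sketch */

/* Hypothetical model: an epoll instance and the epoll instances it watches. */
struct ep_node {
    struct ep_node *children[MAX_CHILDREN];
    size_t nchildren;
};

/* Return true if 'target' can be reached by walking down from 'from'. */
static bool ep_reaches(const struct ep_node *from, const struct ep_node *target)
{
    if (from == target)
        return true;
    for (size_t i = 0; i < from->nchildren; i++)
        if (ep_reaches(from->children[i], target))
            return true;
    return false;
}

/*
 * Add 'child' to 'parent' unless that would close a loop.
 * Returns 0 on success, -1 if the insert is refused.
 */
static int ep_add(struct ep_node *parent, struct ep_node *child)
{
    if (ep_reaches(child, parent))
        return -1;               /* parent is already below child: loop */
    if (parent->nchildren == MAX_CHILDREN)
        return -1;
    parent->children[parent->nchildren++] = child;
    return 0;
}
```

Replaying the program from the commit message against this model, adding `e1` to `e2` succeeds and the later attempt to add `e2` to `e1` is refused. Because every insert is checked, the graph stays acyclic and the recursion terminates.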

afs: Fix oops in afs_unlink_writeback · f129ccc9

Authored by Anton Blanchard on Feb 25, 2011

I'm seeing the following oops when testing afs:

  Unable to handle kernel paging request for data at address 0x00000008
  ...
  NIP [c0000000003393b0] .afs_unlink_writeback+0x38/0xc0
  LR [c00000000033987c] .afs_put_writeback+0x98/0xec
  Call Trace:
  [c00000000345f600] [c00000000033987c] .afs_put_writeback+0x98/0xec
  [c00000000345f690] [c00000000033ae80] .afs_write_begin+0x6a4/0x75c
  [c00000000345f790] [c00000000012b77c] .generic_file_buffered_write+0x148/0x320
  [c00000000345f8d0] [c00000000012e1b8] .__generic_file_aio_write+0x37c/0x3e4
  [c00000000345f9d0] [c00000000012e2a8] .generic_file_aio_write+0x88/0xfc
  [c00000000345fa90] [c0000000003390a8] .afs_file_write+0x10c/0x178
  [c00000000345fb40] [c000000000188788] .do_sync_write+0xc4/0x128
  [c00000000345fcc0] [c000000000189658] .vfs_write+0xe8/0x1d8
  [c00000000345fd70] [c000000000189884] .SyS_write+0x68/0xb0
  [c00000000345fe30] [c000000000008564] syscall_exit+0x0/0x40

afs_write_begin hits an error and calls afs_unlink_writeback. In there
we do list_del_init on an uninitialised list.

The patch below initialises ->link when creating the afs_writeback struct.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f129ccc9
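
Why an uninitialised list node is dangerous can be seen from a minimal user-space imitation of the kernel's circular list. The names below (`init_list_head()`, `struct wb`, `wb_init()`) are assumptions made for this sketch; the real helpers live in the kernel's list header and differ in detail.

```c
#include <stddef.h>

/* Minimal imitation of a circular doubly linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

/* An empty list (or an unlinked node) points at itself. */
static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Unlink 'entry' and leave it as an empty, self-pointing node. */
static void list_del_init(struct list_head *entry)
{
    entry->next->prev = entry->prev;   /* dereferences entry->next */
    entry->prev->next = entry->next;   /* dereferences entry->prev */
    init_list_head(entry);
}

/* Hypothetical stand-in for the writeback record. */
struct wb {
    struct list_head link;
};

/* Initialise the link at creation so an early error path can unlink safely. */
static void wb_init(struct wb *w)
{
    init_list_head(&w->link);
}
```

With `wb_init()` in place, `list_del_init()` on a record that was never queued is a harmless no-op. Without it, a zero-filled `link` would make `entry->next->prev` a write through a NULL pointer at the offset of `prev`, which would be consistent with the fault address 0x8 in the trace on a 64-bit machine.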

25 Feb, 2011 1 commit

block: bd_link_disk_holder() should hold on to holder_dir · e7407d16

Authored by Tejun Heo on Feb 24, 2011

The new implementation of bd_link_disk_holder() added by 49731baa
(block: restore multiple bd_link_disk_holder() support) didn't get an
extra reference for the holder_dir kobject of the slave bdev; however,
bdev kills holder_dir on removal, not release, so if the slave bdev is
removed while there are holder links, the holder_dir will be destroyed
while there still are holder links, which leads to oops later when
bd_unlink_disk_holder() tries to remove those links.

Make bd_link_disk_holder() grab an extra reference for the slave's
holder_dir and put it in bd_unlink_disk_holder().
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: "Hawrylewicz Czarnowski, Przemyslaw" <przemyslaw.hawrylewicz.czarnowski@intel.com>
Tested-by: "Hawrylewicz Czarnowski, Przemyslaw" <przemyslaw.hawrylewicz.czarnowski@intel.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e7407d16

24 Feb, 2011 4 commits

Unlock vfsmount_lock in do_umount · bf9faa2a

Authored by J. R. Okajima on Feb 23, 2011

By the commit
	b3e19d92 2011-01-07 fs: scale mntget/mntput
vfsmount_lock was introduced around testing mnt_count.
Fix the mis-typed 'unlock'.
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

bf9faa2a

Fix over-zealous flush_disk when changing device size. · 93b270f7

Authored by NeilBrown on Feb 24, 2011

There are two cases when we call flush_disk.
In one, the device has disappeared (check_disk_change) so any
data we hold becomes irrelevant.
In the other, the device has changed size (check_disk_size_change)
so data we hold may be irrelevant.

In both cases it makes sense to discard any 'clean' buffers,
so they will be read back from the device if needed.

In the former case it makes sense to discard 'dirty' buffers
as there will never be anywhere safe to write the data.  In the
second case it *does*not* make sense to discard dirty buffers
as that will lead to file system corruption when you simply enlarge
the containing devices.

flush_disk calls __invalidate_device.
__invalidate_device calls both invalidate_inodes and invalidate_bdev.

invalidate_inodes *does* discard I_DIRTY inodes and this does lead
to fs corruption.

invalidate_bdev *does*not* discard dirty pages, but I don't really care
about that at present.

So this patch adds a flag to __invalidate_device (calling it
__invalidate_device2) to indicate whether dirty buffers should be
killed, and this is passed to invalidate_inodes which can choose to
skip dirty inodes.

flush_disk then passes true from check_disk_change and false from
check_disk_size_change.

dm avoids tripping over this problem by calling i_size_write directly
rather than using check_disk_size_change.

md does use check_disk_size_change and so is affected.

This regression was introduced by commit 608aeef1 which causes
check_disk_size_change to call flush_disk, so it is suitable for any
kernel since 2.6.27.

Cc: stable@kernel.org
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Andrew Patterson <andrew.patterson@hp.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: NeilBrown <neilb@suse.de>

93b270f7
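
The flag described in the commit above can be pictured with a small model: one invalidation routine, and two callers that differ only in whether dirty state may be thrown away. Everything here (`struct cached_inode`, `invalidate_inodes_sketch()`, `on_disk_gone()`, `on_disk_resized()`) is a hypothetical simplification, not the kernel's interface.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-reduced model of a cached inode. */
struct cached_inode {
    bool present;   /* still held in the cache */
    bool dirty;     /* has data not yet written back */
};

/*
 * Drop cached inodes. Dirty inodes are dropped only when kill_dirty is true.
 * Returns the number of inodes that remain cached.
 */
static size_t invalidate_inodes_sketch(struct cached_inode *v, size_t n,
                                       bool kill_dirty)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (!v[i].present)
            continue;
        if (v[i].dirty && !kill_dirty) {
            kept++;              /* keep: the data can still be written back */
            continue;
        }
        v[i].present = false;
    }
    return kept;
}

/* Device disappeared: nothing can be written back, so drop everything. */
static size_t on_disk_gone(struct cached_inode *v, size_t n)
{
    return invalidate_inodes_sketch(v, n, true);
}

/* Device changed size: drop clean inodes only. */
static size_t on_disk_resized(struct cached_inode *v, size_t n)
{
    return invalidate_inodes_sketch(v, n, false);
}
```

In this model a resize leaves dirty entries in place, which corresponds to the corruption the commit is meant to avoid when a device is merely enlarged.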

mm: prevent concurrent unmap_mapping_range() on the same inode · 2aa15890

Authored by Miklos Szeredi on Feb 23, 2011

Michael Leun reported that running parallel opens on a fuse filesystem
can trigger a "kernel BUG at mm/truncate.c:475"

Gurudas Pai reported the same bug on NFS.

The reason is, unmap_mapping_range() is not prepared for more than
one concurrent invocation per inode.  For example:

  thread1: going through a big range, stops in the middle of a vma and
     stores the restart address in vm_truncate_count.

  thread2: comes in with a small (e.g. single page) unmap request on
     the same vma, somewhere before restart_address, finds that the
     vma was already unmapped up to the restart address and happily
     returns without doing anything.

Another scenario would be two big unmap requests, both having to
restart the unmapping and each one setting vm_truncate_count to its
own value.  This could go on forever without any of them being able to
finish.

Truncate and hole punching already serialize with i_mutex.  Other
callers of unmap_mapping_range() do not, and it's difficult to get
i_mutex protection for all callers.  In particular ->d_revalidate(),
which calls invalidate_inode_pages2_range() in fuse, may be called
with or without i_mutex.

This patch adds a new mutex to 'struct address_space' to prevent
running multiple concurrent unmap_mapping_range() on the same mapping.

[ We'll hopefully get rid of all this with the upcoming mm
  preemptibility series by Peter Zijlstra, the "mm: Remove i_mmap_mutex
  lockbreak" patch in particular.  But that is for 2.6.39 ]
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reported-by: Michael Leun <lkml20101129@newton.leun.net>
Reported-by: Gurudas Pai <gurudas.pai@oracle.com>
Tested-by: Gurudas Pai <gurudas.pai@oracle.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2aa15890

Btrfs: fix fiemap bugs with delalloc · ec29ed5b

Authored by Chris Mason on Feb 23, 2011

The Btrfs fiemap code wasn't properly returning delalloc extents,
so applications that trust fiemap to decide if there are holes in the
file see holes instead of delalloc.

This reworks the btrfs fiemap code, adding a get_extent helper that
searches for delalloc ranges and also adding a helper for extent_fiemap
that skips past holes in the file.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

ec29ed5b

23 Feb, 2011 2 commits

xfs: check if device support discard in xfs_ioc_trim() · be715140

Authored by Lukas Czerner on Feb 15, 2011

Right now we are relying on the fact that when we attempt to
actually do the discard, blkdev_issue_discard() returns -EOPNOTSUPP
and the user is informed that the device does not support discard.

However, in the case where we do not hit any suitable free
extent to trim in the FITRIM code, it will finish without any error.
This is very confusing, because it seems that FITRIM was successful
even though the device does not actually support discard.

Solution: Check for discard support before attempting to search for
free extents.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

be715140

xfs: prevent leaking uninitialized stack memory in FSGEOMETRY_V1 · 3a3675b7

Authored by Dan Rosenberg on Feb 14, 2011

The FSGEOMETRY_V1 ioctl (and its compat equivalent) calls out to
xfs_fs_geometry() with a version number of 3.  This code path does not
fill in the logsunit member of the passed xfs_fsop_geom_t, leading to
the leaking of four bytes of uninitialized stack data to potentially
unprivileged callers.

v2 switches to memset() to avoid future issues if structure members
change, on suggestion of Dave Chinner.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Reviewed-by: Eugene Teo <eugeneteo@kernel.org>
Signed-off-by: Alex Elder <aelder@sgi.com>

3a3675b7
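
The memset() approach mentioned in the commit above is a general defence against leaking stack contents: clear the whole structure first, then fill in whatever the requested version covers. The sketch below uses a hypothetical two-field `struct geom` and made-up version thresholds and values; the real xfs_fsop_geom_t has many more members.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical reduced geometry structure for illustration only. */
struct geom {
    uint32_t blocksize;
    uint32_t logsunit;   /* assumed to be filled in only for newer versions */
};

/* Fill *g for the requested version; members not covered stay zero. */
static void fill_geometry(struct geom *g, int version)
{
    memset(g, 0, sizeof(*g));   /* clears every member and any padding */

    g->blocksize = 4096;        /* placeholder value */
    if (version >= 4)           /* placeholder threshold */
        g->logsunit = 1;
}
```

Because the clear happens before any conditional fill, adding a member to the structure later cannot reintroduce the leak, which is the motivation given in the commit message for preferring memset() over initialising individual fields.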

22 Feb, 2011 6 commits

Docbook: add fs/eventfd.c and fix typos in it · 36182185

Authored by Randy Dunlap on Feb 20, 2011

Add fs/eventfd.c to filesystems docbook.
Make typo corrections in fs/eventfd.c.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

36182185

[CIFS] update cifs version · eed9e830

Authored by Steve French on Feb 21, 2011

Update version to 1.71 so we can more easily spot modules with the last two fixes.
Signed-off-by: Steve French <sfrench@us.ibm.com>

eed9e830

cifs: Fix regression in LANMAN (LM) auth code · 5e640927

Authored by Shirish Pargaonkar on Feb 17, 2011

LANMAN response length was changed to 16 bytes instead of 24 bytes.
Revert it back to 24 bytes.
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
CC: stable@kernel.org
Signed-off-by: Steve French <sfrench@us.ibm.com>

5e640927

eCryptfs: Copy up lower inode attrs in getattr · 55f9cf6b

Authored by Tyler Hicks on Jan 11, 2011

The lower filesystem may do some type of inode revalidation during a
getattr call. eCryptfs should take advantage of that by copying the
lower inode attributes to the eCryptfs inode after a call to
vfs_getattr() on the lower inode.

I originally wrote this fix while working on eCryptfs on nfsv3 support,
but discovered it also fixed an eCryptfs on ext4 nanosecond timestamp
bug that was reported.

https://bugs.launchpad.net/bugs/613873

Cc: <stable@kernel.org>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

55f9cf6b

ecryptfs: read on a directory should return EISDIR if not supported · 323ef68f

Authored by Andy Whitcroft on Feb 16, 2011

read() calls against a file descriptor connected to a directory are
incorrectly returning EINVAL rather than EISDIR:

  [EISDIR]
    [XSI] [Option Start] The fildes argument refers to a directory and the
    implementation does not allow the directory to be read using read()
    or pread(). The readdir() function should be used instead. [Option End]

This occurs because we do not have a .read operation defined for
ecryptfs directories.  Connect this up to generic_read_dir().

BugLink: http://bugs.launchpad.net/bugs/719691
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

323ef68f
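
The behaviour the commit above restores can be observed from user space. The helper below is a hypothetical probe (the name `read_dir_errno()` is an assumption of this sketch): it opens a directory read-only, calls read() on it, and reports the resulting errno. On Linux filesystems that route directory reads to generic_read_dir() the expected value is EISDIR, though the result depends on the filesystem under test.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Return the errno produced by read() on the directory at 'path',
 * 0 if read() unexpectedly succeeded, or -1 if the directory could
 * not be opened at all.
 */
static int read_dir_errno(const char *path)
{
    char buf[16];
    int err;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    /* Capture errno before close() has a chance to overwrite it. */
    err = (read(fd, buf, sizeof buf) < 0) ? errno : 0;
    close(fd);
    return err;
}
```

Pointing the probe at a directory on an affected eCryptfs mount would, per the commit message, have reported EINVAL before the fix.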

eCryptfs: Handle NULL nameidata pointers · 70b89021

Authored by Tyler Hicks on Feb 17, 2011

Allow for NULL nameidata pointers in eCryptfs create, lookup, and
d_revalidate functions.
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

70b89021

20 Feb, 2011 4 commits

ocfs2: Check heartbeat mode for kernel stacks only · 52c303c5

Authored by Mark Fasheh on Jan 31, 2011

Commit 2c442719 added some checks for proper
heartbeat mode when the o2cb stack is running.  Unfortunately, it didn't
take into account that a userspace stack could be running. Fix this by only
doing the check if o2cb is in use. This patch allows userspace stacks to
mount the fs again.

Cc: stable@kernel.org
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <jlbec@evilplan.org>

52c303c5

Ocfs2/refcounttree: Fix a bug for refcounttree to writeback clusters in a right number. · acf3bb00

Authored by Tristan Ye on Jan 21, 2011

The current refcounttree code does not actually write back the new pages
in write-back mode, because it always passes a cluster count of zero to
'ocfs2_cow_sync_writeback'.  This patch passes the correct count.
Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
Cc: stable@kernel.org
Signed-off-by: Joel Becker <jlbec@evilplan.org>

acf3bb00

ocfs2: Fix estimate of necessary credits for mkdir · 705773a6

Authored by Jan Kara on Feb 03, 2011

In the rare case that INLINE_DATA, INDEX_DIR, QUOTA, XATTR features are
disabled and both the allocation of the directory inode and the allocation
of the first directory block need to relink allocation group, there need
not be enough credits reserved in a transaction. Fix the estimate.

CC: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Joel Becker <jlbec@evilplan.org>

705773a6

ceph: keep reference to parent inode on ceph_dentry · 97d79b40

Authored by Yehuda Sadeh on Jan 18, 2011

When creating a new dentry we now hold a reference to the parent
inode in the ceph_dentry.  This is required due to the new RCU
changes from 949854d0, which set dentry->d_parent to NULL in d_kill before
calling the ->release() callback.  If/when that behavior is changed, we can
revert this hack.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

97d79b40

18 Feb, 2011 2 commits

eCryptfs: Revert "dont call lookup_one_len to avoid NULL nameidata" · 8787c7a3

Authored by Tyler Hicks on Feb 17, 2011

This reverts commit 21edad32 and commit
93c3fe40, which fixed a regression by
the former.

Al Viro pointed out bypassed dcache lookups in
ecryptfs_new_lower_dentry(), misuse of vfs_path_lookup() in
ecryptfs_lookup_one_lower() and a dislike of passing nameidata to the
lower filesystem.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

8787c7a3

fs/partitions: Validate map_count in Mac partition tables · fa7ea87a

Authored by Timo Warns on Feb 17, 2011

Validate number of blocks in map and remove redundant variable.
Signed-off-by: Timo Warns <warns@pre-sense.de>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fa7ea87a

17 Feb, 2011 11 commits

cifs: fix handling of scopeid in cifs_convert_address · 96161256

Authored by Jeff Layton on Feb 16, 2011

The code finds the '%' sign in an ipv6 address and copies that to a
buffer allocated on the stack. It then ignores that buffer, and passes
'pct' to simple_strtoul(), which doesn't work right because we're
comparing 'endp' against a completely different string.

Fix it by passing the correct pointer. While we're at it, this is a
good candidate for conversion to strict_strtoul as well.

Cc: stable@kernel.org
Cc: David Howells <dhowells@redhat.com>
Reported-by: Björn JACKE <bj@sernet.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

96161256
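
The pitfall in the commit above is checking an end pointer against a different string from the one that was parsed. A user-space sketch of the safe pattern, using the standard strtoul() rather than the kernel helpers named in the commit, is shown below. `parse_scope_id()` is a hypothetical function, and accepting only a decimal scope id is an assumption of this sketch.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse the numeric scope id that follows '%' in an IPv6 literal such as
 * "fe80::1%2". Returns 0 and stores the id on success, -1 on error.
 */
static int parse_scope_id(const char *addr, unsigned long *scope)
{
    const char *pct = strchr(addr, '%');
    char *endp;
    unsigned long val;

    if (pct == NULL || pct[1] == '\0')
        return -1;               /* no scope id present */

    errno = 0;
    val = strtoul(pct + 1, &endp, 10);

    /* endp is only meaningful relative to the string passed to strtoul(). */
    if (errno != 0 || endp == pct + 1 || *endp != '\0')
        return -1;

    *scope = val;
    return 0;
}
```

The final check rejects both an empty number and trailing garbage, which is the kind of strictness the commit message has in mind when it suggests a strict conversion helper.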

block: revert block_dev read-only check · e51900f7

Authored by Chuck Ebbert on Feb 16, 2011

This reverts commit 75f1dc0d ("block: check bdev_read_only() from
blkdev_get()").  That commit added stricter checking to make sure
devices that were being used read-only were actually opened in that
mode.

It turns out that the change breaks a bunch of kernel code that opens
block devices.  Affected systems include dm, md, and the loop device.
Because strict checking for read-only opens of block devices was not
done before this, the code that opens the devices was opening them
read-write even if they were being used read-only.  Auditing all that
code will take time, and new userspace packages for dm, mdadm, etc.
will also be required.
Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e51900f7

nfsd: correctly handle return value from nfsd_map_name_to_* · 47c85291

Authored by NeilBrown on Feb 16, 2011

These functions return an nfs status, not a host_err.  So don't
try to convert before returning.

This is a regression introduced by
3c726023; I fixed up two of the callers,
but missed these two.

Cc: stable@kernel.org
Reported-by: Herbert Poetzl <herbert@13thfloor.at>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

47c85291

Btrfs: set FMODE_EXCL in btrfs_device->mode · fb01aa85

Authored by Ilya Dryomov on Feb 15, 2011

This fixes a bug introduced in d4d77629, where the device added online
(and therefore initialized via btrfs_init_new_device()) would be left
with the positive bdev->bd_holders after unmount.  Since d4d77629 we no
longer OR FMODE_EXCL explicitly on blkdev_put(), set it in
btrfs_device->mode.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

fb01aa85

Btrfs: make btrfs_rm_device() fail gracefully · 9b3517e9

Authored by Ilya Dryomov on Feb 15, 2011

If the shrinking done as part of online device removal fails, add that
device back to the allocation list and increment the rw_devices counter.
This fixes two bugs:

1) we could have a perfectly good device out of alloc list for no good
reason;

2) in the btrfs consisting of two devices, failure in btrfs_rm_device()
could lead to a situation where it was impossible to remove any of the
devices because of the "unable to remove the only writeable device"
error.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

9b3517e9

Btrfs: Avoid accessing unmapped kernel address · ca9b688c

Authored by Li Zefan on Feb 16, 2011

When decompressing a chunk of data, we'll copy the data out to
a working buffer if the data is stored in more than one page,
otherwise we'll use the mapped page directly to avoid memory
copy.

In the latter case, we'll end up accessing the kernel address
after we've unmapped the page in a corner case.
Reported-by: Juan Francisco Cantero Hurtado <iam@juanfra.info>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

ca9b688c

Btrfs: Fix BTRFS_IOC_SUBVOL_SETFLAGS ioctl · b4dc2b8c

Authored by Li Zefan on Feb 16, 2011

- Check user-specified flags correctly
- Check the inode ownership
- Search root item in root tree but not fs tree
Reported-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

b4dc2b8c

Btrfs: allow balance to explicitly allocate chunks as it relocates · c87f08ca

Authored by Chris Mason on Feb 16, 2011

Btrfs device shrinking and balancing ends up reallocating all the blocks
in order to allow COW to move them to new destinations. It is somewhat
awkward in terms of ENOSPC because most of the enospc code is built
around the idea that some operation on a reference counted tree triggers
allocations in the non-reference counted trees.

This commit changes the balancing code to deal with enospc by trying to
allocate a new chunk. If that allocation succeeds, we go ahead and
retry whatever failed due to enospc.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

c87f08ca

Btrfs: put ENOSPC debugging under a mount option · 91435650

Authored by Chris Mason on Feb 16, 2011

ENOSPC in btrfs is getting to the point where the extra debugging isn't
required.  I've put it under mount -o enospc_debug just in case someone
is having difficult problems.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

91435650

vfs: fix BUG_ON() in fs/namei.c:1461 · 3abb17e8

Authored by Linus Torvalds on Feb 16, 2011

When Al moved the nameidata_dentry_drop_rcu_maybe() call into the
do_follow_link function in commit 844a3917 ("nothing in
do_follow_link() is going to see RCU"), he mistakenly left the

	BUG_ON(inode != path->dentry->d_inode);

behind.  Which would otherwise be ok, but that BUG_ON() really needs to
be _after_ dropping RCU, since the dentry isn't necessarily stable
otherwise.

So complete the code movement in that commit, and move the BUG_ON() into
do_follow_link() too.  This means that we need to pass in 'inode' as an
argument (just for this one use), but that's a small thing.  And
eventually we may be confident enough in our path lookup that we can
just remove the BUG_ON() and the unnecessary inode argument.
Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3abb17e8

workqueue, freezer: unify spelling of 'freeze' + 'able' to 'freezable' · 58a69cb4

Authored by Tejun Heo on Feb 16, 2011

There are two spellings in use for 'freeze' + 'able' - 'freezable' and
'freezeable'.  The former is the more prominent one.  The latter is
mostly used by workqueue and in a few other odd places.  Unify the
spelling to 'freezable'.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Dmitry Torokhov <dtor@mail.ru>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Steven Whitehouse <swhiteho@redhat.com>

58a69cb4

15 Feb, 2011 5 commits

s390: remove task_show_regs · 261cd298

Authored by Martin Schwidefsky on Feb 15, 2011

task_show_regs used to be a debugging aid in the early bringup days
of Linux on s390. /proc/<pid>/status is a world-readable file, so it
is not a good idea to show the registers of a process there. The only
correct fix is to remove task_show_regs.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

261cd298

get rid of nameidata_dentry_drop_rcu() calling nameidata_drop_rcu() · 4e924a4f

Authored by Al Viro on Feb 15, 2011

can't happen anymore and didn't work right anyway
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4e924a4f

drop out of RCU in return_reval · f60aef7e

Authored by Al Viro on Feb 15, 2011

... thus killing the need to handle drop-from-RCU in d_revalidate()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

f60aef7e

split do_revalidate() into RCU and non-RCU cases · f5e1c1c1

Authored by Al Viro on Feb 15, 2011

fixing oopsen in lookup_one_len()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

f5e1c1c1

in do_lookup() split RCU and non-RCU cases of need_revalidate · 24643087

Authored by Al Viro on Feb 15, 2011

and use unlikely() instead of gotos, for fsck sake...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

24643087