Commits · 4f1772645296a230e04f5c53e79cfb6f841ce634 · openanolis / cloud-kernel

27 Jul, 2011 18 commits

ceph: document locking for ceph_set_dentry_offset · 4f177264

Committed by Sage Weil on Jul 26, 2011

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: avoid d_parent in ceph_dentry_hash; fix ceph_encode_fh() hashing bug · e5f86dc3

Committed by Sage Weil on Jul 26, 2011

Have caller pass in a safely-obtained reference to the parent directory
for calculating a dentry's hash value.

While we're here, simplify the flow through ceph_encode_fh() so that there
is a single exit point and cleanup.

Also fix a bug with the dentry hash calculation: calculate the hash for the
dentry we were given, not its parent.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: protect d_parent access in ceph_d_revalidate · bf1c6aca

Committed by Sage Weil on Jul 26, 2011

Protect d_parent with d_lock.  Carry a reference.  Simplify the flow so
that there is a single exit point and cleanup.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: protect access to d_parent · 5f21c96d

Committed by Sage Weil on Jul 26, 2011

d_parent is protected by d_lock: use it when looking up a dentry's parent
directory inode.  Also take a reference and drop it in the caller to avoid
a use-after-free.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

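As a rough userspace analogy (not the ceph code itself), the pattern this commit describes might be sketched as follows. The `struct node` type, its fields, and the helpers `node_init`, `node_get_parent`, and `node_put` are hypothetical stand-ins: the mutex plays the role of `d_lock`, the `parent` pointer plays the role of `d_parent`, and the counter plays the role of the reference held on the parent directory inode.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a dentry; not a kernel structure. */
struct node {
    pthread_mutex_t lock;  /* plays the role of d_lock */
    struct node *parent;   /* plays the role of d_parent */
    atomic_int refcount;   /* plays the role of the object's reference count */
};

void node_init(struct node *n, struct node *parent)
{
    pthread_mutex_init(&n->lock, NULL);
    n->parent = parent;
    atomic_init(&n->refcount, 1);
}

/* Read the parent pointer under the lock and pin it before unlocking,
 * so the caller never holds an unreferenced parent pointer. */
struct node *node_get_parent(struct node *n)
{
    struct node *p;

    pthread_mutex_lock(&n->lock);
    p = n->parent;
    if (p)
        atomic_fetch_add(&p->refcount, 1);
    pthread_mutex_unlock(&n->lock);
    return p;
}

/* The caller drops the reference once it is done with the parent. */
void node_put(struct node *n)
{
    atomic_fetch_sub(&n->refcount, 1);
}
```

In this sketch a caller pairs each successful `node_get_parent()` with one `node_put()`. Freeing the object when the count reaches zero is left out to keep the example short.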

ceph: handle racing calls to ceph_init_dentry · 48d0cbd1

Committed by Sage Weil on Jul 26, 2011

The ->lookup() and prepopulate_readdir() callers are working with unhashed
dentries, so we don't have to worry.  The export.c callers, though, need
to initialize something they got back from d_obtain_alias() and are
potentially racing with other callers.  Make sure we don't return unless
the dentry is properly initialized (by us or someone else).
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: set dir complete frag after adding capability · dfabbed6

Committed by Sage Weil on Jul 26, 2011

Currently ceph_add_cap clears the complete bit if we are newly issued the
FILE_SHARED cap, which is normally the case for a newly issued cap on a new
directory.  That means we clear the just-set bit.  Move the check that sets
the flag to after the cap is added/updated.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: set up readahead size when rsize is not passed · e9852227

Committed by Yehuda Sadeh on Jul 22, 2011

This should improve the default read performance, as without it
readahead is practically disabled.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>

ceph: ignore lease mask · 2f90b852

Committed by Sage Weil on Jul 26, 2011

The lease mask is no longer used (and it changed a while back).  Instead,
use a non-zero duration to indicate that there is a lease being issued.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: fix ceph_lookup_open intent usage · 468640e3

Committed by Sage Weil on Jul 26, 2011

We weren't properly calling lookup_instantiate_filp when setting up the
lookup intent, which could lead to file leakage on errors.  So:

 - use separate helper for the hidden snapdir translation, immediately
   following the mds request
 - use ceph_finish_lookup for the final dentry/return value dance in the
   exit path
 - lookup_instantiate_filp on success
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: only link open operations to directory unsafe list if O_CREAT|O_TRUNC · 9bae113a

Committed by Sage Weil on Jul 26, 2011

We only need to put these on the directory unsafe list if they have
side effects that fsync(2) should flush out.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: fix bad parent_inode calc in ceph_lookup_open · acda7657

Committed by Sage Weil on Jul 26, 2011

We were always getting NULL here because the intent file f_dentry is always
NULL at this point, which means we were always passing NULL to
ceph_mdsc_do_request.  In reality, this was fine, since this isn't
currently ever a write operation that needs to get strung on the dir's
unsafe list.

Use the dir explicitly, and only pass it if this open has side-effects that
a dir fsync should flush.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: avoid carrying Fw cap during write into page cache · d8de9ab6

Committed by Sage Weil on Jul 26, 2011

The generic_file_aio_write call may block on balance_dirty_pages while we
flush data to the OSDs.  If we hold a reference to the FILE_WR cap during
that interval, revocation by the MDS (e.g., to do a stat(2)) may be very
slow.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: report f_bfree based on kb_avail rather than diffing. · 8f04d422

Committed by Greg Farnum on Jul 26, 2011

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

ceph: only queue capsnap if caps are dirty · e77dc3e9

Committed by Sage Weil on Jul 26, 2011

We used to go into this branch if i_wrbuffer_ref_head was non-zero.  This
was an ancient check from before we were careful about dealing with all
kinds of caps (and not just dirty pages).  It is cleaner to only queue a
capsnap if there is an actual dirty cap.  If we are racing with...
something...we will end up here with ci->i_wrbuffer_refs but no dirty
caps.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: fix snap writeback when racing with writes · af0ed569

Committed by Sage Weil on Jul 26, 2011

There are two problems that come up when we try to queue a capsnap while a
write is in progress:

 - The FILE_WR cap is held, but not yet dirty, so we may queue a capsnap
   with dirty == 0.  That will crash later in __ceph_flush_snaps().  Or
   on the FILE_WR cap if a write is in progress.
 - We may not have i_head_snapc set, which causes problems pretty quickly.
   Look to the snaprealm in this case.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: use flag bit for at_end readdir flag · 9cfa1098

Committed by Sage Weil on Jul 26, 2011

This saves us a word of memory per file.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

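The saving comes from folding a boolean into a shared flags word instead of giving it a field of its own. A minimal sketch of that idea, assuming hypothetical names (`demo_file_info`, `DEMO_F_SYNC`, `DEMO_F_ATEND`) rather than the actual ceph definitions:

```c
#include <stdbool.h>

/* Hypothetical flag bits; the real ceph names and values may differ. */
#define DEMO_F_SYNC  (1u << 0)  /* force synchronous I/O */
#define DEMO_F_ATEND (1u << 1)  /* readdir has reached the end */

/* Hypothetical per-file state: one flags word instead of a separate
 * integer field for each boolean. */
struct demo_file_info {
    unsigned int flags;
};

/* Set or clear the at-end bit without disturbing the other flags. */
void demo_set_at_end(struct demo_file_info *fi, bool at_end)
{
    if (at_end)
        fi->flags |= DEMO_F_ATEND;
    else
        fi->flags &= ~DEMO_F_ATEND;
}

/* Test the at-end bit. */
bool demo_at_end(const struct demo_file_info *fi)
{
    return (fi->flags & DEMO_F_ATEND) != 0;
}
```

Each further boolean then costs one bit of the existing word rather than another field.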

ceph: add F_SYNC file flag to force sync (non-O_DIRECT) io · 4918b6d1

Committed by Sage Weil on Jul 26, 2011

This allows us to force IO through the sync path which you normally only
get when multiple clients are reading/writing to the same file or by
mounting with -o sync.  Among other things, this lets test programs verify
correctness with a single mount.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: add flags field to file_info · 252c6728

Committed by Sage Weil on Jul 26, 2011

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

22 Jul, 2011 2 commits

vfs: drop conditional inode prefetch in __do_lookup_rcu · b91da88f

Committed by Linus Torvalds on Jul 21, 2011

It seems to hurt performance in real life.  Yes, the inode will be used
later, but the conditional doesn't seem to predict all that well
(negative dentries are not uncommon) and it looks like the cost of
prefetching is simply higher than depending on the cache doing the right
thing.

As usual.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

FS-Cache: Fix __fscache_uncache_all_inode_pages()'s outer loop · b307d465

Committed by Jan Beulich on Jul 21, 2011

The compiler, at least for ix86 and m68k, validly warns that the
comparison:

	next <= (loff_t)-1

is always true (and it's always true also for x86-64 and probably all
other arches - as long as pgoff_t isn't wider than loff_t).  The
intention appears to be to avoid wrapping of "next", so rather than
eliminating the pointless comparison, fix the loop to indeed get exited
when "next" would otherwise wrap.

On m68k the following warning is observed:

  fs/fscache/page.c: In function '__fscache_uncache_all_inode_pages':
  fs/fscache/page.c:979: warning: comparison is always false due to limited range of data type
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Suresh Jayaraman <sjayaraman@suse.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

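The general idea of leaving a loop when an unsigned index would wrap, instead of comparing it against the maximum value, can be illustrated with a small sketch. It does not reproduce the FS-Cache code: `visit_until_wrap` is a hypothetical helper, and `uint8_t` stands in for `pgoff_t` so that the wrap is easy to reach.

```c
#include <stdint.h>

/* Hypothetical helper: visits every index from start up to UINT8_MAX and
 * returns how many indices were visited.
 *
 * A loop of the form "for (next = start; next <= UINT8_MAX; next++)" would
 * never terminate, because the condition holds for every uint8_t value.
 * Testing the incremented index against zero exits exactly at the wrap. */
unsigned int visit_until_wrap(uint8_t start)
{
    unsigned int visited = 0;
    uint8_t next = start;

    do {
        visited++;          /* stand-in for the per-index work */
    } while (++next != 0);  /* stop once the index wraps around to 0 */

    return visited;
}
```

Starting from 250, this sketch visits indices 250 through 255 and then stops.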

21 Jul, 2011 1 commit

CIFS: Fix wrong length in cifs_iovec_read · 2cebaa58

Committed by Pavel Shilovsky on Jul 20, 2011

Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

20 Jul, 2011 2 commits

fs/libfs.c: fix simple_attr_write() on 32bit machines · f7b88631

Committed by Akinobu Mita on Jul 19, 2011

Assume that /sys/kernel/debug/dummy64 is a debugfs file created by
debugfs_create_x64().

	# cd /sys/kernel/debug
	# echo 0x1234567812345678 > dummy64
	# cat dummy64
	0x0000000012345678

	# echo 0x80000000 > dummy64
	# cat dummy64
	0xffffffff80000000

A value larger than INT_MAX cannot be written to the debugfs file created
by debugfs_create_u64 or debugfs_create_x64 on a 32bit machine, because
simple_attr_write() uses simple_strtol() for the conversion.

To fix this, use simple_strtoll() instead.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

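The two results shown in the commit message can be reproduced in userspace with a small model. This is only a sketch under stated assumptions: `store_via_32bit_long` and `store_via_long_long` are hypothetical helpers, the standard `strtoull()` and `strtoll()` stand in for the kernel's `simple_strtol()` and `simple_strtoll()`, and the narrowing cast to `int32_t` relies on the wrap-around behaviour that common compilers such as GCC and Clang provide (the C standard leaves it implementation-defined).

```c
#include <stdint.h>
#include <stdlib.h>

/* Model of parsing through a 32-bit long: only the low 32 bits survive,
 * and the value is sign-extended when widened back to 64 bits. */
uint64_t store_via_32bit_long(const char *s)
{
    unsigned long long full = strtoull(s, NULL, 0);
    int32_t narrow = (int32_t)(uint32_t)full;  /* what a 32-bit long keeps */

    return (uint64_t)(int64_t)narrow;          /* sign extension to 64 bits */
}

/* Model of the fix: parse with a 64-bit type so nothing is lost. */
uint64_t store_via_long_long(const char *s)
{
    return (uint64_t)strtoll(s, NULL, 0);
}
```

With these helpers, `store_via_32bit_long("0x80000000")` yields `0xffffffff80000000`, matching the second `cat` output above, while `store_via_long_long("0x80000000")` preserves `0x80000000`.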

vfs: fix race in rcu lookup of pruned dentry · 59430262

Committed by Linus Torvalds on Jul 18, 2011

Don't update *inode in __follow_mount_rcu() until we've verified that
there is a mountpoint there.  Kudos to Hugh Dickins for catching that
one in the first place and eventually figuring out the solution (and
catching a braino in the earlier version of patch).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

19 Jul, 2011 1 commit

Fix cifs_get_root() · fec11dd9

Committed by Al Viro on Jul 18, 2011

Add missing ->i_mutex, convert to lookup_one_len() instead of
(broken) open-coded analog, cope with getting something like
a//b as relative pathname.  Simplify the hell out of it, while
we are there...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Jeff Layton <jlayton@redhat.com>

18 Jul, 2011 5 commits

hppfs_lookup(): don't open-code lookup_one_len() · 0916a5e4

Committed by Al Viro on Jul 17, 2011

... and it's getting it wrong, too - missing ->d_revalidate() calls when
it's dealing with filesystem (procfs) that has non-trivial ->d_revalidate()...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

hppfs: fix dentry leak · 3cc0658e

Committed by Al Viro on Jul 17, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

cramfs: get_cramfs_inode() returns ERR_PTR() on failure · 0577d1ba

Committed by Al Viro on Jul 17, 2011

... and we want to report these failures in ->lookup() anyway.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ufs should use d_splice_alias() · 642c937b

Committed by Al Viro on Jul 17, 2011

it's NFS-exportable, so...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fix exofs ->get_parent() · a803b806

Committed by Al Viro on Jul 08, 2011

NULL is not a possible return value for that method, TYVM...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

17 Jul, 2011 2 commits

ceph analog of cifs build_path_from_dentry() race fix · 1b71fe2e

Committed by Al Viro on Jul 16, 2011

... unfortunately, cifs bug got copied.  Fix is essentially the same.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

cifs: build_path_from_dentry() race fix · dc137bf5

Committed by Al Viro on Jul 16, 2011

deal with d_move() races properly; rename_lock read-retry loop,
rcu_read_lock() held while walking to root, d_lock held over
subtraction from namelen and copying the component to stabilize
->d_name.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

15 Jul, 2011 1 commit

fix loop checks in d_materialise_unique() · 18367501

Committed by Al Viro on Jul 12, 2011

Both __d_unalias() and __d_materialise_dentry() need loop prevention.
Grab rename_lock in caller, check for loops there...

As a side benefit, we have dentry_lock_for_move() called only under
rename_lock, which seriously reduces deadlock potential of the
execrable "locking order" used for ->d_lock.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

14 Jul, 2011 1 commit

GFS2: Resolve inode eviction and ail list interaction bug · 380f7c65

Committed by Steven Whitehouse on Jul 14, 2011

This patch contains a few misc fixes which resolve a recently
reported issue. This patch has been a real team effort and has
received a lot of testing.

The first issue is that the ail lock needs to be held over a few
more operations. The lock that's added into gfs2_releasepage() may
possibly be a candidate for replacing with RCU at some future
point, but at this stage we've gone for the obvious fix.

The second issue is that gfs2_write_inode() can end up calling
a glock recursively when called from gfs2_evict_inode() via the
syncing code, so it needs a guard added.

The third issue is that we either need to not truncate the metadata
pages of inodes which have zero link count, but which we cannot
deallocate due to them still being in use by other nodes, or we need
to ensure that those pages have all made it through the journal and
ail lists first. This patch takes the former approach, but the
latter has also been tested and there is nothing to choose between
them performance-wise. So again, we could revise that decision
in the future.

Also, the inode eviction process is now better documented.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Tested-by: Bob Peterson <rpeterso@redhat.com>
Tested-by: Abhijith Das <adas@redhat.com>
Reported-by: Barry J. Marson <bmarson@redhat.com>
Reported-by: David Teigland <teigland@redhat.com>

13 Jul, 2011 4 commits

Fix ->d_lock locking order in unlazy_walk() · 94c0d4ec

Committed by Al Viro on Jul 12, 2011

Make sure that child is still a child of parent before nested locking
of child->d_lock in unlazy_walk(); otherwise we are risking a violation
of locking order and deadlocks.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

[CIFS] update cifs to version 1.74 · c2ec9471

Committed by Steve French on Jul 12, 2011

Signed-off-by: Steve French <sfrench@us.ibm.com>

[CIFS] update limit for snprintf in cifs_construct_tcon · ea1be1a3

Committed by Steve French on Jul 12, 2011

In 34c87901 "Shrink stack space usage in cifs_construct_tcon" we
changed the size of the username buffer from MAX_USERNAME_SIZE
(256) to 28.  This call to snprintf() needs to be updated as well.
Reported by Dan Carpenter.
Reviewed-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

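One common way to keep an `snprintf()` limit from going stale when a buffer shrinks is to derive the limit from the buffer itself with `sizeof`. The sketch below illustrates that habit only; it is not the cifs code, and `demo_format_name`, `DEMO_NAME_LEN`, and the format string are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_NAME_LEN 28  /* same size as the shrunken buffer in the commit */

/* Format a hypothetical user name into a small buffer, passing the real
 * buffer size as the limit; returns the length actually stored. */
size_t demo_format_name(unsigned int uid)
{
    char name[DEMO_NAME_LEN];

    /* sizeof(name) follows the declaration, so shrinking the buffer later
     * cannot leave a larger, outdated limit behind. */
    snprintf(name, sizeof(name),
             "a-deliberately-long-user-name-prefix-0x%x", uid);

    return strlen(name);
}
```

Because the formatted text is longer than the buffer, `snprintf()` truncates it and the stored string is `DEMO_NAME_LEN - 1` characters long, with the final byte reserved for the terminator.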

cifs: Fix signing failure when server mandates signing for NTLMSSP · 62411ab2

Committed by Shirish Pargaonkar on Jul 10, 2011

When using the NTLMSSP authentication mechanism, if the server mandates
signing, keep the flags in type 3 messages of the NTLMSSP exchange the
same as in type 1 messages (i.e. keep the indicated capabilities the same).

Some servers, such as Samba, expect flags such as
Negotiate_Key_Exchange in the type 3 message of the NTLMSSP exchange as well.
Some servers, like Windows, do not.

https://bugzilla.samba.org/show_bug.cgi?id=8212
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

12 Jul, 2011 3 commits

GFS2: Fix race during filesystem mount · 3942ae53

Committed by Steven Whitehouse on Jul 11, 2011

There is a potential race during filesystem mounting which has recently
been reported. It occurs when the userland gfs_controld is able to
process requests fast enough that it tries to use the sysfs interface
before the lock module is properly initialised. This is a pretty
unusual case as normally the lock module initialisation is very quick
compared with gfs_controld.

This patch adds an interruptible completion which is used to ensure that
userland will wait for the initialisation of the lock module to
complete.

There are other potential solutions to this problem, but this is the
quickest at this stage and has been tested both with and without
mount.gfs2 present in the system.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: David Booher <dbooher@adams.net>

GFS2: force a log flush when invalidating the rindex glock · 1ce53368

Committed by Benjamin Marzinski on Jun 13, 2011

Right now, there is nothing that forces the log to get flushed when a node
drops its rindex glock so that another node can grow the filesystem. If the
log doesn't get flushed, GFS2 can corrupt the sd_log_le_rg list in the
following way.

A node puts an rgd on the list in rg_lo_add(), and then the rindex glock is
dropped so the other node can grow the filesystem. When the node reacquires the
rindex glock, that rgd gets deleted in clear_rgrpdi() before ever being
removed from the list by gfs2_log_flush().

This code simply forces a log flush when the rindex glock is invalidated,
solving the problem.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

NFSv4.1: update nfs4_fattr_bitmap_maxsz · e5012d1f

Committed by Andy Adamson on Jul 11, 2011

Attribute IDs assigned in RFC 5661 now require three bitmaps.
Fixes hitting a BUG_ON in xdr_shrink_bufhead when getting ACLs.
Signed-off-by: Andy Adamson <andros@netapp.com>
Cc: stable@kernel.org [2.6.39]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

openanolis / cloud-kernel synced successfully about 1 year ago