Commits · ad1fee96cbaf873520064252c5dc3212c9844861 · openeuler / raspberrypi-kernel

22 Mar, 2011 2 commits

Authored by Yehuda Sadeh on Jan 21, 2011

The ino32 mount option forces the ceph fs to report 32-bit
ino values.  This is useful for 64-bit kernels with 32-bit userspace.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>

ad1fee96
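
One way to picture "reporting 32-bit ino values" is folding the 64-bit inode number into 32 bits. The following is a minimal sketch under stated assumptions: the helper name `fold_ino32`, the XOR fold, and the nonzero fallback value are all hypothetical and are not taken from the commit itself.

```c
#include <stdint.h>

/*
 * Hypothetical helper: fold a 64-bit inode number into 32 bits by
 * XOR-ing the high and low halves. Zero is avoided because an inode
 * number of 0 is commonly treated as "no inode" by userspace.
 */
static uint32_t fold_ino32(uint64_t ino)
{
    uint32_t folded = (uint32_t)(ino & 0xffffffffu) ^ (uint32_t)(ino >> 32);

    if (folded == 0)
        folded = 2; /* arbitrary nonzero fallback, an assumption of this sketch */
    return folded;
}
```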

ceph: remove debugfs debug cruft · 21f3b5f1

Authored by Sage Weil on Jan 19, 2011

Whoops!
Signed-off-by: Sage Weil <sage@newdream.net>

21f3b5f1

16 Mar, 2011 1 commit

ceph: preserve I_COMPLETE across rename · 09adc80c

Authored by Sage Weil on Feb 04, 2011

d_move puts the renamed dentry at the end of d_subdirs, screwing with our
cached dentry directory offsets. We were just clearing I_COMPLETE to avoid
any possibility of trouble. However, assigning the renamed dentry an
offset at the end of the directory (to match its new d_subdirs position)
is sufficient to maintain correct behavior and hold onto I_COMPLETE.

This is especially important for workloads like rsync, which renames files
into place. Before, we would lose I_COMPLETE and do MDS lookups for each
file. With this patch we only talk to the MDS on create and rename.
Signed-off-by: Sage Weil <sage@newdream.net>

09adc80c

10 Mar, 2011 1 commit

ceph: fix d_revalidate oopsen on NFS exports · 0eb980e3

Authored by Al Viro on Mar 10, 2011

can't blindly check nd->flags in ->d_revalidate()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

0eb980e3
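
The message above is terse. Commit 34286d66 in this log made ->d_revalidate() implementations return -ECHILD when nd->flags has LOOKUP_RCU set; the title here suggests the oops came from nd not being usable on the NFS export path. A minimal sketch of a guarded check, assuming nd may be NULL in that case. The struct, the macro value, and the function name are hypothetical stand-ins, not the kernel's definitions.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct nameidata and LOOKUP_RCU. */
#define SKETCH_LOOKUP_RCU 0x0040u

struct sketch_nameidata {
    unsigned int flags;
};

/*
 * Returns -ECHILD when called in rcu-walk mode, 1 (dentry valid) otherwise.
 * nd is checked for NULL before its flags are read.
 */
static int sketch_d_revalidate(const struct sketch_nameidata *nd)
{
    if (nd != NULL && (nd->flags & SKETCH_LOOKUP_RCU))
        return -ECHILD;
    return 1;
}
```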

05 Mar, 2011 1 commit

ceph: no .snap inside of snapped namespace · 455cec0a

Authored by Sage Weil on Mar 03, 2011

Otherwise you can do things like

# mkdir .snap/foo
# cd .snap/foo/.snap
# ls
<badness>
Signed-off-by: Sage Weil <sage@newdream.net>

455cec0a

04 Mar, 2011 3 commits

ceph: do not clear I_COMPLETE from d_release · 16a8b70a

Authored by Sage Weil on Feb 28, 2011

First, this was racy anyway: d_release isn't called until well after the
dentry is unhashed.  Second, this runs afoul of the recent dcache change
that clears d_parent prior to calling d_release (949854d0), causing a NULL
pointer dereference.
Signed-off-by: Sage Weil <sage@newdream.net>

16a8b70a

ceph: do not set I_COMPLETE · b545cc15

Authored by Sage Weil on Feb 28, 2011

Do not set the I_COMPLETE flag on directories until we resolve races with
dcache pruning.
Signed-off-by: Sage Weil <sage@newdream.net>

b545cc15

Revert "ceph: keep reference to parent inode on ceph_dentry" · 9bde178d

Authored by Sage Weil on Feb 28, 2011

This reverts commit 97d79b40.

This fails to account for d_parent changes due to rename or disconnected
dentries due to submounts or NFS reexports.
Signed-off-by: Sage Weil <sage@newdream.net>

9bde178d

20 Feb, 2011 1 commit

ceph: keep reference to parent inode on ceph_dentry · 97d79b40

Authored by Yehuda Sadeh on Jan 18, 2011

When creating a new dentry we now hold a reference to the parent
inode in the ceph_dentry.  This is required due to the new RCU
changes from 949854d0, which set dentry->d_parent to NULL in d_kill before
calling the ->release() callback.  If/when that behavior is changed, we can
revert this hack.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

97d79b40

05 Feb, 2011 1 commit

ceph: queue cap_snaps once per realm · e8e1ba96

Authored by Sage Weil on Feb 04, 2011

We were forming a dirty list, and then queueing cap_snaps for each realm
_and_ its children, regardless of whether the children were already in the
dirty list.  This meant we did it twice for some realms.  Which in turn
meant we corrupted mdsc->snap_flush_list when the cap_snap was re-added to
the list it was already on, and could trigger an infinite loop.

We were also using recursion to reach all the children, a no-no when
stack is limited.

Instead, (re)queue any children on the dirty list, avoiding processing
anything twice and avoiding any recursion.
Signed-off-by: Sage Weil <sage@newdream.net>

e8e1ba96
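
The approach described in e8e1ba96 (append children to the dirty list instead of recursing, and never add a realm twice) can be sketched as a simple worklist. This is an illustration only: `struct realm`, `dirty_add`, `process_dirty`, and the fixed-size arrays are hypothetical and do not mirror the kernel's snap realm structures.

```c
#include <stddef.h>

#define MAX_REALMS   16
#define MAX_CHILDREN 4

/* Hypothetical, simplified realm: a node in a tree of snapshot realms. */
struct realm {
    int id;
    struct realm *children[MAX_CHILDREN];
    size_t nchildren;
    int queued;     /* nonzero once the realm is on the dirty list */
    int processed;  /* how many times cap_snaps were queued for it */
};

/* Append r to the dirty list unless it is already there. Returns the new length. */
static size_t dirty_add(struct realm **dirty, size_t n, struct realm *r)
{
    if (r->queued || n >= MAX_REALMS)
        return n;
    r->queued = 1;
    dirty[n] = r;
    return n + 1;
}

/*
 * Walk the dirty list front to back. Children are appended to the tail,
 * so every descendant is reached without recursion and each realm is
 * handled exactly once. Returns the final list length.
 */
static size_t process_dirty(struct realm **dirty, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct realm *r = dirty[i];

        r->processed++; /* stands in for "queue cap_snaps for this realm" */
        for (size_t c = 0; c < r->nchildren; c++)
            n = dirty_add(dirty, n, r->children[c]);
    }
    return n;
}
```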

26 Jan, 2011 1 commit

ceph: avoid picking MDS that is not active · d66bbd44

Authored by Sage Weil on Jan 21, 2011

Ignore replication or auth frag data if it indicates an MDS that is not
active.  This can happen if the MDS shuts down and the client has stale
data about the namespace distribution across the MDS cluster.  If that's
the case, fall back to directing the request based on the auth cap (which
should always be accurate).
Signed-off-by: Sage Weil <sage@newdream.net>

d66bbd44
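
The fallback rule in d66bbd44 can be sketched as follows. The enum, the function name `choose_mds`, and the flat state array are assumptions made for this illustration; the kernel client's MDS map and request routing are more involved.

```c
/* Hypothetical, simplified MDS states. */
enum mds_state { MDS_DOWN, MDS_ACTIVE };

/*
 * Prefer the MDS suggested by (possibly stale) frag/replication data,
 * but only if that MDS is currently active. Otherwise fall back to the
 * MDS that holds the auth cap.
 */
static int choose_mds(const enum mds_state *states, int nmds,
                      int frag_mds, int auth_cap_mds)
{
    if (frag_mds >= 0 && frag_mds < nmds && states[frag_mds] == MDS_ACTIVE)
        return frag_mds;
    return auth_cap_mds;
}
```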

20 Jan, 2011 4 commits

ceph: avoid immediate cap check after import · 7e57b81c

Authored by Sage Weil on Jan 18, 2011

The NODELAY flag avoids the heuristics that delay cap (issued/wanted)
release.  There's no reason for that after we import a cap, and it kills
whatever benefit we get from those delays.
Signed-off-by: Sage Weil <sage@newdream.net>

7e57b81c

ceph: fix flushing of caps vs cap import · 088b3f5e

Authored by Sage Weil on Jan 18, 2011

If we are mid-flush and a cap is migrated to another node, we need to
resend the cap flush message to the new MDS, and do so with the original
flush_seq to avoid leaking across a sync boundary.  Previously we didn't
redo the flush (we only flushed newly dirty data), which would cause a
later sync to hang forever.
Signed-off-by: Sage Weil <sage@newdream.net>

088b3f5e

ceph: fix erroneous cap flush to non-auth mds · 24be0c48

Authored by Sage Weil on Jan 18, 2011

The int flushing is global and not cleared on each iteration of the loop,
which can cause a second flush of caps to any MDSs with ids greater than
the auth.
Signed-off-by: Sage Weil <sage@newdream.net>

24be0c48
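
The bug class described in 24be0c48 is a loop variable that keeps its value from a previous iteration. A minimal sketch of the corrected shape, assuming caps are visited in order of MDS id; `struct cap` and `send_cap_updates` are hypothetical names used only here.

```c
#include <stddef.h>

/* Hypothetical, simplified cap: which MDS it belongs to and whether it is the auth cap. */
struct cap {
    int mds;
    int is_auth;
};

/*
 * Records in flushed_to[] the MDS ids that receive a flush and returns
 * how many there were. Because 'flushing' is declared inside the loop,
 * it starts at 0 for every cap. If it were declared outside the loop,
 * every cap visited after the auth cap would also see a nonzero value
 * and get an erroneous flush.
 */
static size_t send_cap_updates(const struct cap *caps, size_t ncaps,
                               int dirty, int *flushed_to, size_t max)
{
    size_t nflushed = 0;

    for (size_t i = 0; i < ncaps; i++) {
        int flushing = 0; /* reset on each iteration */

        if (caps[i].is_auth && dirty)
            flushing = dirty;
        if (flushing && nflushed < max)
            flushed_to[nflushed++] = caps[i].mds;
    }
    return nflushed;
}
```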

ceph: fix cap_wanted_delay_{min,max} mount option initialization · 50aac4fe

Authored by Sage Weil on Jan 18, 2011

These were initialized to 0 instead of the default, fallout from the RBD
refactor in 3d14c5d2.
Signed-off-by: Sage Weil <sage@newdream.net>

50aac4fe

14 Jan, 2011 2 commits

ceph: fix xattr rbtree search · 17db143f

Authored by Sage Weil on Jan 13, 2011

Fix xattr name comparison in rbtree search for strings that share a prefix.
The *name argument is null terminated, but the xattr name is not, so we
need to use strncmp, but that means adjusting for the case where name is
a prefix of xattr->name.

The corresponding case in __set_xattr() already handles this properly
(although in that case *name is also not null terminated).
Reported-by: Sergiy Kibrik <sakib@meta.ua>
Signed-off-by: Sage Weil <sage@newdream.net>

17db143f
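
The prefix problem in 17db143f can be illustrated with a standalone comparison helper. This is a sketch under stated assumptions, not the kernel's code: `xattr_name_cmp` is a hypothetical name, and the stored name is passed as a pointer plus length because it is not null terminated.

```c
#include <stddef.h>
#include <string.h>

/*
 * Compare a null-terminated search key against a stored name that is
 * NOT null terminated (only xname_len bytes are valid). strncmp over
 * the shorter length alone would report "equal" when one name is a
 * prefix of the other, so the lengths break the tie.
 */
static int xattr_name_cmp(const char *name, const char *xname, size_t xname_len)
{
    size_t name_len = strlen(name);
    size_t n = name_len < xname_len ? name_len : xname_len;
    int c = strncmp(name, xname, n);

    if (c == 0) {
        if (name_len < xname_len)
            c = -1; /* key is a proper prefix of the stored name */
        else if (name_len > xname_len)
            c = 1;  /* stored name is a proper prefix of the key */
    }
    return c;
}
```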

ceph: fix getattr on directory when using norbytes · 1c1266bb

Authored by Yehuda Sadeh on Jan 12, 2011

The norbytes mount option was broken, and when doing getattr
on a directory it returned the rbytes instead of the number of
entities. This commit fixes it.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

1c1266bb

13 Jan, 2011 6 commits

ceph: fsc->*_wq's aren't used in memory reclaim path · 01e6acc4

Authored by Tejun Heo on Jan 03, 2011

fsc->*_wq's aren't depended upon during memory reclaim.  Convert to
alloc_workqueue() w/o WQ_MEM_RECLAIM.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Sage Weil <sage@newdream.net>
Cc: ceph-devel@vger.kernel.org
Signed-off-by: Sage Weil <sage@newdream.net>

01e6acc4

ceph: Makefile: Remove unnessary code · 582c86e6

Authored by Tracey Dent on Dec 14, 2010

Remove the if and else conditional because the code is in mainline and there
is no need for it.

Also, changed the Makefile to use <modules>-y instead of <modules>-objs
because -objs is deprecated and not mentioned in
Documentation/kbuild/makefiles.txt.
Signed-off-by: Tracey Dent <tdent48227@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>

582c86e6

ceph: associate requests with opening sessions · dc69e2e9

Authored by Sage Weil on Nov 02, 2010

Associate requests with sessions that aren't yet open.  This makes the
debugfs mdsc request list more informative.
Signed-off-by: Sage Weil <sage@newdream.net>

dc69e2e9

ceph: drop redundant r_mds field · 4af25fdd

Authored by Sage Weil on Nov 02, 2010

The r_mds field is redundant, since we can find the same information at
r_session->s_mds, and when r_session is NULL then r_mds is meaningless.
Signed-off-by: Sage Weil <sage@newdream.net>

4af25fdd

ceph: implement DIRLAYOUTHASH feature to get dir layout from MDS · 14303d20

Authored by Sage Weil on Dec 14, 2010

This implements the DIRLAYOUTHASH protocol feature, which passes the dir
layout over the wire from the MDS. This gives the client knowledge
of the correct hash function to use for mapping dentries among dir
fragments.

Note that if this feature is _not_ present on the client but is on the
MDS, the client may misdirect requests. This will result in a forward
and degrade performance. It may also result in inaccurate NFS filehandle
generation, which will prevent fh resolution when the inode is not present
in the client cache and the parent directories have been fragmented.
Signed-off-by: Sage Weil <sage@newdream.net>

14303d20

ceph: add dir_layout to inode · 6c0f3af7

Authored by Sage Weil on Nov 16, 2010

Add a ceph_dir_layout to the inode, and calculate dentry hash values based
on the parent directory's specified dir_hash function. This is needed
because the old default Linux dcache hash function is extremely weak and
leads to a poor distribution of files among dir fragments.
Signed-off-by: Sage Weil <sage@newdream.net>

6c0f3af7

07 Jan, 2011 8 commits

fs: provide rcu-walk aware permission i_ops · b74c79e9

Authored by Nick Piggin on Jan 07, 2011

Signed-off-by: Nick Piggin <npiggin@kernel.dk>

b74c79e9

fs: rcu-walk aware d_revalidate method · 34286d66

Authored by Nick Piggin on Jan 07, 2011

Require filesystems be aware of .d_revalidate being called in rcu-walk
mode (nd->flags & LOOKUP_RCU). For now do a simple push down, returning
-ECHILD from all implementations.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>

34286d66

fs: dcache reduce branches in lookup path · fb045adb

Authored by Nick Piggin on Jan 07, 2011

Reduce some branches and memory accesses in dcache lookup by adding dentry
flags to indicate common d_ops are set, rather than having to check them.
This saves a pointer memory access (dentry->d_op) in common path lookup
situations, and saves another pointer load and branch in cases where we
have d_op but not the particular operation.

Patched with:

git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i
Signed-off-by: Nick Piggin <npiggin@kernel.dk>

fb045adb

fs: icache RCU free inodes · fa0d7e3d

Authored by Nick Piggin on Jan 07, 2011

RCU free the struct inode. This will allow:

- Subsequent store-free path walking patch. The inode must be consulted for
  permissions when walking, so an RCU inode reference is a must.
- sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
  to take i_lock no longer need to take sb_inode_list_lock to walk the list in
  the first place. This will simplify and optimize locking.
- Could remove some nested trylock loops in dcache code
- Could potentially simplify things a bit in VM land. Do not need to take the
  page lock to follow page->mapping.

The downside of this is the performance cost of using RCU. In a simple
creat/unlink microbenchmark, performance drops by about 10% due to inability to
reuse cache-hot slab objects. As iterations increase and RCU freeing starts
kicking over, this increases to about 20%.

In cases where inode lifetimes are longer (ie. many inodes may be allocated
during the average life span of a single inode), a lot of this cache reuse is
not applicable, so the regression caused by this patch is smaller.

The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
however this adds some complexity to list walking and store-free path walking,
so I prefer to implement this at a later date, if it is shown to be a win in
real situations. I haven't found a regression in any non-micro benchmark so I
doubt it will be a problem.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>

fa0d7e3d

fs: dcache remove dcache_lock · b5c84bf6

Authored by Nick Piggin on Jan 07, 2011

dcache_lock no longer protects anything. Remove it.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>

b5c84bf6

fs: dcache scale subdirs · 2fd6b7f5

Authored by Nick Piggin on Jan 07, 2011

Protect d_subdirs and d_child with d_lock, except in filesystems that aren't
using dcache_lock for these anyway (eg. using i_mutex).

Note: if we change the locking rule in future so that ->d_child protection is
provided only with ->d_parent->d_lock, it may allow us to reduce some locking.
But it would be an exception to an otherwise regular locking scheme, so we'd
have to see some good results. Probably not worthwhile.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>

2fd6b7f5

fs: dcache scale d_unhashed · da502956

Authored by Nick Piggin on Jan 07, 2011

Protect d_unhashed(dentry) condition with d_lock. This means keeping
DCACHE_UNHASHED bit in sync with hash manipulations.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>

da502956

fs: dcache scale dentry refcount · b7ab39f6

Authored by Nick Piggin on Jan 07, 2011

Make d_count non-atomic and protect it with d_lock. This allows us to ensure a
0 refcount dentry remains 0 without dcache_lock. It is also fairly natural when
we start protecting many other dentry members with d_lock.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>

b7ab39f6

18 Dec, 2010 2 commits

ceph: mark user pages dirty on direct-io reads · b6aa5901

Authored by Henry C Chang on Dec 15, 2010

For read operations, we have to set the argument _write_ of get_user_pages
to 1 since we will write data to pages. Also, we need to SetPageDirty before
releasing these pages.
Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
Signed-off-by: Sage Weil <sage@newdream.net>

b6aa5901

ceph: fix null pointer dereference in ceph_init_dentry for nfs reexport · 92cf7652

Authored by Sage Weil on Dec 17, 2010

The fh_to_dentry etc. methods use ceph_init_dentry(), which assumes that
d_parent is defined.  It isn't for those callers, so check!
Signed-off-by: Sage Weil <sage@newdream.net>

92cf7652

16 Dec, 2010 1 commit

ceph: fix direct-io on non-page-aligned buffers · ab226e21

Authored by Henry C Chang on Dec 15, 2010

The user buffer may be 512-byte aligned, not page-aligned.  We were
assuming the buffer was page-aligned and only accounting for
non-page-aligned io offsets.
Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
Signed-off-by: Sage Weil <sage@newdream.net>

ab226e21
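
Why buffer alignment matters for ab226e21: a buffer that starts mid-page can touch one more page than its length alone suggests. A minimal sketch of the page-count arithmetic, assuming 4096-byte pages; `pages_spanned` and `SKETCH_PAGE_SIZE` are hypothetical names, not kernel helpers.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size for this sketch */

/*
 * Number of pages touched by a buffer of len bytes starting at addr.
 * The start address's offset within its page has to be taken into
 * account, not just the length.
 */
static size_t pages_spanned(uintptr_t addr, size_t len)
{
    uintptr_t first, last;

    if (len == 0)
        return 0;
    first = addr / SKETCH_PAGE_SIZE;
    last = (addr + len - 1) / SKETCH_PAGE_SIZE;
    return (size_t)(last - first + 1);
}
```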

07 Dec, 2010 1 commit

ceph: fix ioctl magic · 1cd275f6

Authored by Sage Weil on Dec 06, 2010

The ioctl magic was inadvertently changed in 571dba52.
Signed-off-by: Sage Weil <sage@newdream.net>

1cd275f6

02 Dec, 2010 4 commits

ceph: Behave better when handling file lock replies. · a5b10629

Authored by Herb Shiu on Nov 23, 2010

Fill in the local lock with response data if appropriate,
and don't call posix_lock_file when reading locks.
Signed-off-by: Herb Shiu <herb_shiu@tcloudcomputing.com>
Acked-by: Greg Farnum <gregf@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

a5b10629

ceph: pass lock information by struct file_lock instead of as individual params. · 637ae8d5

Authored by Herb Shiu on Nov 23, 2010

Signed-off-by: Herb Shiu <herb_shiu@tcloudcomputing.com>
Acked-by: Greg Farnum <gregf@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

637ae8d5

ceph: Handle file locks in replies from the MDS. · 25933abd

Authored by Herb Shiu on Dec 01, 2010

Previously the kernel client incorrectly assumed everything was a directory.
Signed-off-by: Herb Shiu <herb_shiu@tcloudcomputing.com>
Acked-by: Greg Farnum <gregf@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

25933abd

ceph: avoid possible null deref in readdir after dir llseek · 884ea892

Authored by Sage Weil on Nov 22, 2010

last may be NULL, but we dereference it in the else branch without
checking.  Normally it doesn't trigger because last == NULL when fpos == 2,
but it could happen on a newly opened dir if the user seeks forward.
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>

884ea892

19 Nov, 2010 1 commit

ceph: fix readdir EOVERFLOW on 32-bit archs · 3105c19c

Authored by Sage Weil on Nov 18, 2010

One of the readdir filldir_t callers was passing the raw ceph 64-bit ino
instead of the hashed 32-bit one, producing an EOVERFLOW in the filler
callback.  Fix this by calling the ceph_vino_to_ino() helper to do the
conversion.
Reported-by: Jan Smets <jan.smets@alcatel-lucent.com>
Tested-by: Jan Smets <jan.smets@alcatel-lucent.com>
Signed-off-by: Sage Weil <sage@newdream.net>

3105c19c