Commits · 3469ac1aa3a2f1e2586a412923c414779a0af854 · openeuler / raspberrypi-kernel

08 May, 2012 1 commit

ceph: drop support for preferred_osd pgs · 3469ac1a

Authored by Sage Weil on May 07, 2012

This was an ill-conceived feature that has been removed from Ceph.  Do
this gracefully:

 - reject attempts to specify a preferred_osd via the ioctl
 - stop exposing this information via virtual xattrs
 - always fill in -1 for requests, in case we talk to an older server
 - don't calculate preferred_osd placements/pgids

Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>

3469ac1a

22 Mar, 2012 15 commits

ceph: fix three bugs, two in ceph_vxattrcb_file_layout() · 3489b42a

Authored by Alex Elder on Mar 08, 2012

In ceph_vxattrcb_file_layout(), there is a check to determine
whether a preferred PG should be formatted into the output buffer.
That check assumes that a preferred PG number of 0 indicates "no
preference," but that is wrong.  No preference is indicated by a
negative (specifically, -1) PG number.

In addition, if that condition yields true, the preferred value
is formatted into a sized buffer, but the size consumed by the
earlier snprintf() call is not accounted for, opening up the
possibility of a buffer overrun.

Finally, in ceph_vxattrcb_dir_rctime() the nanoseconds part of
the time displayed did not include leading 0's, which led to
erroneous (sub-second portion of) time values being shown.

This fixes these three issues:
    http://tracker.newdream.net/issues/2155
    http://tracker.newdream.net/issues/2156
    http://tracker.newdream.net/issues/2157

Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>

3489b42a
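
The three fixes described in 3489b42a can be illustrated in userspace. The following is a minimal sketch, not the kernel code: `demo_format_layout()`, `demo_format_rctime()`, and the output field names are hypothetical stand-ins. It tests the preferred value against -1 rather than 0, subtracts what the first snprintf() consumed before the second call, and zero-pads the nanoseconds to nine digits.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a layout formatter (not the kernel function).
 * Appends "preferred=..." only when a preference is set; -1 means none.
 * Returns the number of characters the complete output requires.
 */
static int demo_format_layout(char *buf, size_t size, unsigned int stripe_unit,
                              unsigned int stripe_count, long long preferred)
{
    int ret, more;
    size_t used;

    assert(buf != NULL && size > 0);

    ret = snprintf(buf, size, "stripe_unit=%u\nstripe_count=%u\n",
                   stripe_unit, stripe_count);
    if (ret < 0)
        return ret;

    /* 0 is a valid value, so test for a negative number, not for 0. */
    if (preferred >= 0) {
        /* Skip what the first call produced and shrink the space left. */
        used = (size_t)ret < size ? (size_t)ret : size;
        more = snprintf(buf + used, size - used, "preferred=%lld\n", preferred);
        if (more < 0)
            return more;
        ret += more;
    }
    return ret;
}

/* Hypothetical stand-in: %09ld keeps the leading zeros of the nanoseconds. */
static int demo_format_rctime(char *buf, size_t size, long sec, long nsec)
{
    return snprintf(buf, size, "%ld.%09ld", sec, nsec);
}
```

Without the `%09ld` width, 42 nanoseconds would be printed as ".42", which reads as 420 milliseconds.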

ceph: ensure Boolean options support both senses · cffaba15

Authored by Alex Elder on Feb 15, 2012

Many ceph-related Boolean options offer the ability to both enable
and disable a feature.  For all those that don't offer this, add
a new option so that they do.

Note that ceph_show_options()--which reports mount options currently
in effect--only reports the option if it is different from the
default value.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

cffaba15

rbd: make ceph_parse_options() return a pointer · ee57741c

Authored by Alex Elder on Jan 24, 2012

ceph_parse_options() takes the address of a pointer as an argument
and uses it to return the address of an allocated structure if
successful.  With this interface it is not evident at call sites that
the pointer is always initialized.  Change the interface to return
the address instead (or a pointer-coded error code) to make the
validity of the returned pointer obvious.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

ee57741c
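
The "pointer-coded error code" convention mentioned in ee57741c can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel's implementation: the `demo_*` helpers imitate the idea behind the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() macros, and `struct demo_options` and `demo_parse_options()` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: the top 4095 addresses are never valid pointers. */
#define DEMO_MAX_ERRNO 4095

static void *demo_err_ptr(long error) { return (void *)(intptr_t)error; }
static long demo_ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static int demo_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Hypothetical options structure; it only stores a copy of the input. */
struct demo_options {
    char *spec;
};

/* Returns a valid pointer on success or an encoded negative errno. */
static struct demo_options *demo_parse_options(const char *str)
{
    struct demo_options *opt;
    size_t len;

    if (str == NULL || *str == '\0')
        return demo_err_ptr(-EINVAL);

    opt = malloc(sizeof(*opt));
    if (opt == NULL)
        return demo_err_ptr(-ENOMEM);

    len = strlen(str) + 1;
    opt->spec = malloc(len);
    if (opt->spec == NULL) {
        free(opt);
        return demo_err_ptr(-ENOMEM);
    }
    memcpy(opt->spec, str, len);
    return opt;
}

static void demo_destroy_options(struct demo_options *opt)
{
    assert(opt != NULL && !demo_is_err(opt));
    free(opt->spec);
    free(opt);
}
```

A caller checks `demo_is_err()` on the single return value, so there is no separate out-parameter whose initialization has to be reasoned about.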

ceph: make ceph_setxattr() and ceph_removexattr() more alike · 18fa8b3f

Authored by Alex Elder on Jan 23, 2012

This patch just rearranges a few bits of code to make more
portions of ceph_setxattr() and ceph_removexattr() identical.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

18fa8b3f

ceph: avoid repeatedly computing the size of constant vxattr names · 3ce6cd12

Authored by Alex Elder on Jan 23, 2012

All names defined in the directory and file virtual extended
attribute tables are constant, and the size of each is known at
compile time.  So there's no need to compute their length every
time any file's attribute is listed.

Record the length of each string and use it when needed to determine
the space needed to represent them.  In addition, compute the
aggregate size of strings in each table just once at initialization
time.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

3ce6cd12
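
The idea in 3ce6cd12 is that `sizeof` on a string literal is a compile-time constant, so the per-name size can be stored in the table and summed once. A minimal sketch follows; the structure, macro, and function names (`demo_vxattr`, `DEMO_VXATTR`, `demo_vxattrs_init`) are hypothetical and the table holds only two names that appear elsewhere in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_vxattr {
    const char *name;
    size_t name_size;   /* strlen(name) + 1, fixed at compile time */
};

/* sizeof on a string literal counts the terminating '\0' as well. */
#define DEMO_VXATTR(_name) { .name = _name, .name_size = sizeof(_name) }

static const struct demo_vxattr demo_file_vxattrs[] = {
    DEMO_VXATTR("ceph.file.layout"),
    DEMO_VXATTR("ceph.layout"),
    { .name = NULL, .name_size = 0 }    /* terminating entry */
};

/* Aggregate size of all names in the table, computed once at init. */
static size_t demo_file_names_size;

static void demo_vxattrs_init(void)
{
    const struct demo_vxattr *v;
    size_t total = 0;

    for (v = demo_file_vxattrs; v->name != NULL; v++) {
        assert(v->name_size == strlen(v->name) + 1);
        total += v->name_size;
    }
    demo_file_names_size = total;
}
```

After `demo_vxattrs_init()` has run, code that lists attributes can read `demo_file_names_size` directly instead of calling strlen() on every name for every request.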

ceph: encode type in vxattr callback routines · aa4066ed

Authored by Alex Elder on Jan 23, 2012

The names of the callback functions used for virtual extended
attributes are based only on the last component of the attribute
name.  Because of the way these are defined, this precludes allowing
a single (lowest) attribute name for different callbacks, dependent
on the type of file being operated on.  (For example, it might be
nice to support both "ceph.dir.layout" and "ceph.file.layout".)

Just change the callback names to avoid this problem.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

aa4066ed

ceph: drop "_cb" from name of struct ceph_vxattr_cb · 881a5fa2

Authored by Alex Elder on Jan 23, 2012

A struct ceph_vxattr_cb does not represent a callback at all, but
rather a virtual extended attribute itself.  Drop the "_cb" suffix
from its name to reflect that.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

881a5fa2

ceph: use macros to normalize vxattr table definitions · eb788084

Authored by Alex Elder on Jan 23, 2012

Entries in the ceph virtual extended attribute tables all follow a
distinct pattern in their definition.  Enforce this pattern through
the use of a macro.

Also, a null name field signals the end of the table, so make that
be the first field in the ceph_vxattr_cb structure.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

eb788084

ceph: use a symbolic name for "ceph." extended attribute namespace · 22891907

Authored by Alex Elder on Jan 23, 2012

Use symbolic constants to define the top-level prefix for "ceph."
extended attribute names.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

22891907

ceph: pass inode rather than table to ceph_match_vxattr() · 06476a69

Authored by Alex Elder on Jan 23, 2012

All callers of ceph_match_vxattr() determine what to pass as the
first argument by calling ceph_inode_vxattrs(inode).  Just do that
inside ceph_match_vxattr() itself, changing it to take an inode
rather than the vxattr pointer as its first argument.

Also ensure the function works correctly for an empty table (i.e.,
containing only a terminating null entry).

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

06476a69
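
The interface change in 06476a69 can be sketched like this. All `demo_*` names are hypothetical, and the empty directory table exists only to exercise the empty-table case, not to describe the real tables. The lookup selects the table from the inode itself and tests the terminating entry before comparing, so a table holding only the terminator is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_vxattr {
    const char *name;   /* NULL marks the end of a table */
};

struct demo_inode {
    int is_dir;
};

static const struct demo_vxattr demo_file_vxattrs[] = {
    { "ceph.file.layout" },
    { "ceph.layout" },
    { NULL }
};

/* Deliberately empty: only the terminating entry. */
static const struct demo_vxattr demo_dir_vxattrs[] = {
    { NULL }
};

static const struct demo_vxattr *demo_inode_vxattrs(const struct demo_inode *inode)
{
    return inode->is_dir ? demo_dir_vxattrs : demo_file_vxattrs;
}

/* Takes the inode; callers no longer look up the table themselves. */
static const struct demo_vxattr *demo_match_vxattr(const struct demo_inode *inode,
                                                   const char *name)
{
    const struct demo_vxattr *v;

    assert(inode != NULL && name != NULL);

    /* The terminator is checked before any name is dereferenced. */
    for (v = demo_inode_vxattrs(inode); v->name != NULL; v++)
        if (strcmp(v->name, name) == 0)
            return v;
    return NULL;
}
```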

ceph: don't null-terminate xattr values · b829c195

Authored by Alex Elder on Jan 23, 2012

For some reason, ceph_setxattr() allocates an extra byte in which a
'\0' is stored past the end of an extended attribute value.  This is
not needed, and is potentially misleading, so get rid of it.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

b829c195

ceph: fix overflow check in build_snap_context() · 80834312

Authored by Xi Wang on Feb 16, 2012

The overflow check for a + n * b should be (n > (ULONG_MAX - a) / b),
rather than (n > ULONG_MAX / b - a).

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>

80834312
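
The corrected check from 80834312 can be written as a small standalone helper. This is a sketch with a hypothetical name (`demo_would_overflow`); it assumes b is nonzero. The rejected form subtracts a after the division, so its bound is wrong whenever a is nonzero, and it wraps around to a huge value when a exceeds ULONG_MAX / b.

```c
#include <assert.h>
#include <limits.h>

/*
 * Returns nonzero if a + n * b cannot be represented in an unsigned long.
 * Rearranged from a + n * b > ULONG_MAX so that no intermediate step
 * can itself overflow.
 */
static int demo_would_overflow(unsigned long a, unsigned long n, unsigned long b)
{
    assert(b != 0);
    return n > (ULONG_MAX - a) / b;
}
```

In an allocation path, a would typically be a header size, b an element size, and n an element count taken from untrusted input.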

ceph: avoid panic with mismatched symlink sizes in fill_inode() · 810339ec

Authored by Xi Wang on Feb 03, 2012

Return -EINVAL rather than panic if iinfo->symlink_len and inode->i_size
do not match.

Also use kstrndup rather than kmalloc/memcpy.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>

810339ec
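
A userspace sketch of the behavior described in 810339ec is shown below. `demo_copy_symlink()` is hypothetical and its bounded copy only imitates what kstrndup() does in the kernel. A length mismatch is reported as -EINVAL instead of aborting.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copies a symlink target of symlink_len bytes into a freshly allocated,
 * NUL-terminated string. Returns 0 on success or a negative errno.
 */
static int demo_copy_symlink(const char *src, size_t symlink_len,
                             size_t i_size, char **out)
{
    const char *nul;
    size_t n;
    char *sym;

    assert(src != NULL && out != NULL);

    /* Mismatched sizes are an input error, not a reason to crash. */
    if (symlink_len != i_size)
        return -EINVAL;

    /* Bounded copy: stop at an embedded NUL or after symlink_len bytes. */
    nul = memchr(src, '\0', symlink_len);
    n = nul ? (size_t)(nul - src) : symlink_len;

    sym = malloc(n + 1);
    if (sym == NULL)
        return -ENOMEM;
    memcpy(sym, src, n);
    sym[n] = '\0';

    *out = sym;
    return 0;
}
```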

ceph: use 2 instead of 1 as fallback for 32-bit inode number · a661fc56

Authored by Amon Ott on Jan 23, 2012

The root directory of the Ceph mount has inode number 1, so falling back
to 1 always creates a collision. 2 is unused on my test systems and seems
less likely to collide.

Signed-off-by: Amon Ott <ao@m-privacy.de>
Signed-off-by: Sage Weil <sage@newdream.net>

a661fc56
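
The fallback described in a661fc56 might look like the sketch below. The commit message only states the fallback value; the XOR folding of the two 32-bit halves and the name `demo_ino_to_ino32()` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Folds a 64-bit inode number into 32 bits (assumed scheme: XOR of the
 * two halves). A result of 0 is replaced by the fallback value 2,
 * because 1 is already taken by the root directory of the mount.
 */
static uint32_t demo_ino_to_ino32(uint64_t ino64)
{
    uint32_t ino = (uint32_t)(ino64 & 0xffffffffu);

    ino ^= (uint32_t)(ino64 >> 32);
    if (ino == 0)
        ino = 2;

    assert(ino != 0);
    return ino;
}
```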

ceph: don't reset s_cap_ttl to zero · 1ce208a6

Authored by Alex Elder on Jan 12, 2012

Avoid the need to check for a special zero s_cap_ttl value by just
using (jiffies - 1) as the value assigned to indicate "sometime in
the past."

Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>

1ce208a6
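
Why (jiffies - 1) works as "sometime in the past" can be seen in a small userspace model. The names below are hypothetical; `demo_time_after()` imitates the wraparound-safe comparison style of the kernel's time_after() macro, and the current tick count is passed in as a parameter instead of being read from a global.

```c
#include <assert.h>
#include <stddef.h>

/* True if tick a is later than tick b, even across counter wraparound. */
static int demo_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

struct demo_session {
    unsigned long cap_ttl;  /* tick at which the caps stop being valid */
};

/* Mark the TTL as already expired without a special zero value. */
static void demo_expire(struct demo_session *s, unsigned long now)
{
    assert(s != NULL);
    s->cap_ttl = now - 1;
}

/* Valid only while the TTL still lies in the future. */
static int demo_ttl_valid(const struct demo_session *s, unsigned long now)
{
    assert(s != NULL);
    return demo_time_after(s->cap_ttl, now);
}
```

Because the comparison is done on the signed difference, `now - 1` is treated as the past even when `now` is 0 and the subtraction wraps.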

03 Feb, 2012 3 commits

ceph: create a new session lock to avoid lock inversion · d8fb02ab

Authored by Alex Elder on Jan 12, 2012

Lockdep was reporting a possible circular lock dependency in
dentry_lease_is_valid().  That function needs to sample the
session's s_cap_gen and s_cap_ttl fields coherently, but needs
to do so while holding a dentry lock.  The s_cap_lock field was
being used to protect the two fields, but that can't be taken while
holding a lock on a dentry within the session.

In most cases, the s_cap_gen and s_cap_ttl fields only get operated
on separately.  But in three cases they need to be updated together.
Implement a new lock to protect the spots where updating both fields
atomically is required.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>

d8fb02ab

ceph: fix length validation in parse_reply_info() · 32852a81

Authored by Xi Wang on Jan 14, 2012

"len" is read from network and thus needs validation.  Otherwise, given
a bogus "len" value, p+len could be an out-of-bounds pointer, which is
used in further parsing.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>

32852a81
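
The kind of validation described in 32852a81 can be sketched as follows. `demo_parse_blob()` is a hypothetical parser, not the kernel function, and it reads the length in host byte order for simplicity. The point is that a length taken from the wire is compared against the bytes actually remaining before any pointer is advanced by it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Parses a 32-bit length followed by that many payload bytes.
 * On success, *blob points at the payload and *p is advanced past it.
 */
static int demo_parse_blob(const uint8_t **p, const uint8_t *end,
                           const uint8_t **blob, uint32_t *blob_len)
{
    uint32_t len;

    assert(p != NULL && *p != NULL && *p <= end);

    /* Is there room for the length field itself? */
    if ((size_t)(end - *p) < sizeof(len))
        return -EIO;
    memcpy(&len, *p, sizeof(len));
    *p += sizeof(len);

    /* Compare against the remaining space; never compute *p + len first. */
    if (len > (size_t)(end - *p))
        return -EIO;

    *blob = *p;
    *blob_len = len;
    *p += len;
    return 0;
}
```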

ceph: change "ceph.layout" xattr to be "ceph.file.layout" · 114fc474

Authored by Alex Elder on Jan 11, 2012

The virtual extended attribute named "ceph.layout" is meaningful
only for regular files.  Change its name to be "ceph.file.layout" to
more directly reflect that in the ceph xattr namespace.  Preserve
the old "ceph.layout" name for the time being (until we decide it's
safe to get rid of it entirely).

Add a missing initializer for "readonly" in the terminating entry.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>

114fc474

13 Jan, 2012 2 commits

ceph: ensure prealloc_blob is in place when removing xattr · 83eb26af

Authored by Alex Elder on Jan 11, 2012

In __ceph_build_xattrs_blob(), if a ceph inode's extended attributes
are marked dirty, all attributes recorded in its rb_tree index are
formatted into a "blob" buffer.  The target buffer is recorded in
ceph_inode->i_xattrs.prealloc_blob, and it is expected to exist and
be of sufficient size to hold the attributes.

The extended attributes are marked dirty in two cases: when a new
attribute is added to the inode; or when one is removed.  In the
former case work is done to ensure the prealloc_blob buffer is
properly set up, but in the latter it is not.

Change the logic in ceph_removexattr() so it matches what is
done in ceph_setxattr().  Note that this is done in a way that
keeps the two blocks of code nearly identical, in anticipation
of a subsequent patch that encapsulates some of this logic into
one or more helper routines.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

83eb26af

ceph: enable/disable dentry complete flags via mount option · a40dc6cc

Authored by Sage Weil on Jan 10, 2012

Enable/disable use of the dentry dir 'complete' flag via a mount option.
This lets the admin control whether ceph uses the dcache to satisfy
negative lookups or readdir when it has the entire directory contents in
its cache.

This is purely a performance optimization; correctness is guaranteed
whether it is enabled or not.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sage Weil <sage@newdream.net>

a40dc6cc

12 Jan, 2012 1 commit

ceph: always initialize the dentry in open_root_dentry() · d46cfba5

Authored by Alex Elder on Jan 04, 2012

When open_root_dentry() gets a dentry via d_obtain_alias() it does
not get initialized.  If the dentry obtained came from the cache,
this is OK.  But if not, the result is an improperly initialized
dentry.

To fix this, call ceph_init_dentry() regardless of which path
produced the dentry.  That function returns immediately for a dentry
that is already initialized, so it is safe to use either way.

(Credit to Sage, who suggested this fix.)

Signed-off-by: Alex Elder <aelder@sgi.com>

d46cfba5

11 Jan, 2012 4 commits

ceph: avoid iput() while holding spinlock in ceph_dir_fsync · 2ff179e6

Authored by Sage Weil on Jan 03, 2012

ceph_mdsc_put_request() can call iput(), which can sleep.  Don't do that.

Fixes: #1812
Signed-off-by: Sage Weil <sage@newdream.net>

2ff179e6

ceph: avoid useless dget/dput in encode_fh · ee6b1baf

Authored by Sage Weil on Jan 03, 2012

Nothing we do here sleeps, so just do it under d_lock and avoid the
dget/dput entirely.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Sage Weil <sage@newdream.net>

ee6b1baf

ceph: dereference pointer after checking for NULL · b8cd952b

Authored by Yehuda Sadeh on Dec 13, 2011

moved dereference after BUG_ON

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>

b8cd952b

ceph: remove unnecessary d_fsdata conditional checks · 3d8eb7a9

Authored by Sage Weil on Nov 11, 2011

We now set d_fsdata unconditionally on all dentries prior to setting up
the d_ops, so all of these checks are unnecessary.

Signed-off-by: Sage Weil <sage@newdream.net>

3d8eb7a9

10 Jan, 2012 1 commit

ceph: d_alloc_root() may fail · 3c5184ef

Authored by Al Viro on Jan 09, 2012

... and ceph_init_dentry(NULL) will oops

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

3c5184ef

07 Jan, 2012 1 commit

vfs: switch ->show_options() to struct dentry * · 34c80b1d

Authored by Al Viro on Dec 08, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

34c80b1d

04 Jan, 2012 6 commits

ceph: propagate umode_t · 5706b27d

Authored by Al Viro on Jul 26, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

5706b27d

get rid of open-coded S_ISREG(), etc. · dba19c60

Authored by Al Viro on Jul 25, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dba19c60

switch ->mknod() to umode_t · 1a67aafb

Authored by Al Viro on Jul 26, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

1a67aafb

switch ->create() to umode_t · 4acdaf27

Authored by Al Viro on Jul 26, 2011

vfs_create() ignores everything outside of 16bit subset of its
mode argument; switching it to umode_t is obviously equivalent
and it's the only caller of the method

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4acdaf27

switch vfs_mkdir() and ->mkdir() to umode_t · 18bb1db3

Authored by Al Viro on Jul 26, 2011

vfs_mkdir() gets int, but immediately drops everything that might not
fit into umode_t and that's the only caller of ->mkdir()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

18bb1db3

vfs: fix the stupidity with i_dentry in inode destructors · 6b520e05

Authored by Al Viro on Dec 12, 2011

Seeing that just about every destructor got that INIT_LIST_HEAD() copied into
it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once();
the cost of taking it into inode_init_always() will be negligible for pipes
and sockets and negative for everything else. Not to mention the removal of
boilerplate code from ->destroy_inode() instances...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

6b520e05

30 Dec, 2011 1 commit

ceph: disable use of dcache for readdir etc. · a4d46363

Authored by Sage Weil on Dec 29, 2011

Ceph attempts to use the dcache to satisfy negative lookups and readdir
when the entire directory contents are in cache.  Disable this behavior
until lingering bugs in this code are shaken out; we'll re-enable these
hooks once things are fully stable.

Signed-off-by: Sage Weil <sage@newdream.net>

a4d46363

14 Dec, 2011 2 commits

ceph: add missing spin_unlock at ceph_mdsc_build_path() · 9d5a09e6

Authored by Yehuda Sadeh on Dec 13, 2011

one of the paths was missing spin_unlock

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>

9d5a09e6

ceph: fix SEEK_CUR, SEEK_SET regression · 6a82c47a

Authored by Sage Weil on Dec 13, 2011

Commit 06222e49 got the if wrong so that
it always evaluates as true.  This is semantically harmless, but makes
SEEK_CUR and SEEK_SET needlessly query the server.

Rewrite the if to explicitly enumerate the cases we DO need a valid i_size
to make this code less fragile.

Reported-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>

6a82c47a
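
The approach described in 6a82c47a, enumerating the cases that do need a valid size instead of excluding the ones that don't, can be modeled in userspace. Everything here is hypothetical (`struct demo_file`, `demo_needs_size()`, `demo_llseek()`), and the commit message does not quote the faulty condition; a test such as "whence is not SEEK_SET or whence is not SEEK_CUR" is one example of a condition that is always true.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

struct demo_file {
    long pos;
    long size;
    int size_valid;      /* is the cached size known to be current? */
    int server_queries;  /* counts simulated round trips */
};

/* List the cases that need the size explicitly; everything else does not. */
static int demo_needs_size(int whence)
{
    switch (whence) {
    case SEEK_END:
        return 1;
    default:
        return 0;
    }
}

static long demo_llseek(struct demo_file *f, long offset, int whence)
{
    assert(f != NULL);

    if (demo_needs_size(whence) && !f->size_valid) {
        f->server_queries++;    /* stands in for asking the server */
        f->size_valid = 1;
    }

    switch (whence) {
    case SEEK_END:
        offset += f->size;
        break;
    case SEEK_CUR:
        offset += f->pos;
        break;
    case SEEK_SET:
        break;
    default:
        return -EINVAL;
    }
    if (offset < 0)
        return -EINVAL;

    f->pos = offset;
    return offset;
}
```

With this shape, adding a new whence value that needs the size means adding one `case` label, and forgetting to do so fails safe by not querying.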

08 Dec, 2011 1 commit

ceph: use i_ceph_lock instead of i_lock · be655596

Authored by Sage Weil on Nov 30, 2011

We have been using i_lock to protect all kinds of data structures in the
ceph_inode_info struct, including lists of inodes that we need to iterate
over while avoiding races with inode destruction.  That requires grabbing
a reference to the inode with the list lock protected, but igrab() now
takes i_lock to check the inode flags.

Changing the list lock ordering would be a painful process.

However, using a ceph-specific i_ceph_lock in the ceph inode instead of
i_lock is a simple mechanical change and avoids the ordering constraints
imposed by igrab().

Reported-by: Amon Ott <a.ott@m-privacy.de>
Signed-off-by: Sage Weil <sage@newdream.net>

be655596

03 Dec, 2011 1 commit

ceph: fix rasize reporting by ceph_show_options · 2151937d

Authored by Sage Weil on Dec 01, 2011

Fix typo.

Reported-by: mowang da <whooya.xxl@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>

2151937d

12 Nov, 2011 1 commit

ceph: initialize root dentry · 774ac21d

Authored by Sage Weil on Nov 11, 2011

Set up d_fsdata on the root dentry.  This fixes a NULL pointer dereference
in ceph_d_prune on umount.  It also means we can eventually strip out all
of the conditional checks on d_fsdata because it is now set unconditionally
(prior to setting up the d_ops).

Fix the ceph_d_prune debug print while we're here.

Signed-off-by: Sage Weil <sage@newdream.net>

774ac21d