Commits · 6070e0c1e2b515ad5edc2f8224031b051bd08109 · openeuler / Kernel

02 May, 2013 · 7 commits

ceph: don't early drop Fw cap · 6070e0c1

Authored by Yan, Zheng on Mar 01, 2013

ceph_aio_write() has an optimization that marks the CEPH_CAP_FILE_WR
cap dirty before data is copied to the page cache and the inode size
is updated. The optimization avoids slow cap revocation caused by
balance_dirty_pages(), but introduces an inode size update race. If
ceph_check_caps() flushes the dirty cap before the inode size is
updated, the MDS can miss the new inode size. So just remove the
optimization.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>

ceph: revert commit · 7971bd92

Authored by Sage Weil on May 01, 2013

Commit 22cddde1 breaks the atomicity of write operations; it also
introduces a deadlock between write and truncate.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>

Conflicts:
	fs/ceph/addr.c

ceph: use I_COMPLETE inode flag instead of D_COMPLETE flag · a8673d61

Authored by Yan, Zheng on Feb 18, 2013

Commit c6ffe100 moved the flag that tracks whether the dcache contents
for a directory are complete to the dentry. The problem is that there
are lots of places that use ceph_dir_{set,clear,test}_complete() while
holding i_ceph_lock, but ceph_dir_{set,clear,test}_complete() may
sleep because they call dput().

This patch basically reverts that commit. ceph_d_prune() is called
with both the dentry to prune and the parent dentry locked, so it is
safe to access the parent dentry's d_inode and clear the I_COMPLETE
flag.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

ceph: set mds_want according to cap import message · 964266cc

Authored by Yan, Zheng on Feb 27, 2013

The MDS ignores a cap update message if migrate_seq does not match, so
when receiving a cap import message with a higher migrate_seq, set
mds_want according to the cap import message.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>

ceph: queue cap release when trimming cap · d40ee0dc

Authored by Yan, Zheng on Feb 18, 2013

So the client will later send a cap release message to the MDS.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>

ceph: fix LSSNAP regression · 8a034497

Authored by Yan, Zheng on Feb 21, 2013

Commit 6e8575fa makes parse_reply_info_extra() return -EIO for LSSNAP.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>

libceph: distinguish page array and pagelist count · d4b515fa

Authored by Alex Elder on Feb 25, 2013

Use distinct fields for tracking the number of pages in a message's
page array and in a message's page list.  Currently only one or the
other is used at a time, but that will be changing soon.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

04 Mar, 2013 · 1 commit

fs: Limit sys_mount to only request filesystem modules. · 7f78e035

Authored by Eric W. Biederman on Mar 02, 2013

Modify the request_module to prefix the file system type with "fs-"
and add aliases to all of the filesystems that can be built as modules
to match.

A common practice is to build all of the kernel code and leave code
that is not commonly needed as modules, with the result that many
users are exposed to any bug anywhere in the kernel.

Looking for filesystems with a fs- prefix limits the pool of possible
modules that can be loaded by mount to just filesystems, trivially
making things safer with no real cost.

Using aliases means user space can control the policy of which
filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
with blacklist and alias directives.  This allows simple, safe,
well understood work-arounds to known problematic software.

This also addresses a rare but unfortunate problem where the filesystem
name is not the same as its module name and module auto-loading
would not work.  While writing this patch I saw a handful of such
cases.  The most significant is autofs, which lives in the module
autofs4.

This is relevant to user namespaces because we can reach the request
module in get_fs_type() without having any special permissions, and
people get uncomfortable when a user specified string (in this case
the filesystem type) goes all of the way to request_module.

After having looked at this issue I don't think there is any
particular reason to perform any filtering or permission checks beyond
making it clear in the module request that we want a filesystem
module.  The common pattern in the kernel is to call request_module()
without regard to the user's permissions.  In general all a filesystem
module does once loaded is call register_filesystem() and go to sleep,
which means there is not much attack surface exposed by loading a
filesystem module unless the filesystem is mounted.  In a user
namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
which most filesystems do not set today.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reported-by: Kees Cook <keescook@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
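The "fs-" prefixing described in this commit can be pictured with a small user-space sketch. This is not the kernel code: `fs_module_name()` is a hypothetical helper invented here, and it only shows how a user-supplied filesystem type is turned into a module request name that can match nothing but a filesystem alias.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel API): build the module name that a
 * mount request for filesystem type `fstype` would ask for.
 * Returns 0 on success, -1 if the result does not fit in `buf`.
 */
int fs_module_name(const char *fstype, char *buf, size_t buflen)
{
    /* Prefixing with "fs-" keeps the request inside the filesystem
     * alias namespace, whatever string the caller supplied. */
    int n = snprintf(buf, buflen, "fs-%s", fstype);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With this shape, a request for type "autofs" asks for "fs-autofs", and an alias can point that name at whichever module actually implements the filesystem.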

27 Feb, 2013 · 5 commits

libceph: update osd request/reply encoding · 1b83bef2

Authored by Sage Weil on Feb 25, 2013

Use the new version of the encoding for osd requests and replies.  In the
process, update the way we are tracking request ops and reply lengths and
results in the struct ceph_osd_request.  Update the rbd and fs/ceph users
appropriately.

The main changes are:
 - we keep pointers into the request memory for fields we need to update
   each time the request is sent out over the wire
 - we keep information about the result in an array in the request struct
   where the users can easily get at it.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

libceph: calculate placement based on the internal data types · 2169aea6

Authored by Sage Weil on Feb 25, 2013

Instead of using the old ceph_object_layout struct, update our internal
ceph_calc_object_layout method to use the ceph_pg type.  This allows us to
pass the full 32-bit precision of the pgid.seed to the callers.  It also
allows some callers to avoid reaching into the request structures for the
struct ceph_object_layout fields.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

ceph: update support for PGID64, PGPOOL3, OSDENC protocol features · 4f6a7e5e

Authored by Sage Weil on Feb 23, 2013

Support (and require) the PGID64, PGPOOL3, and OSDENC protocol features.
These have been present in ceph.git since v0.42, Feb 2012.  Require these
features to simplify support; nobody is running older userspace.

Note that the new request and reply encoding is still not in place, so the new
code is not yet functional.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

libceph: decode into cpu-native ceph_pg type · 5b191d99

Authored by Sage Weil on Feb 23, 2013

Always decode data into our cpu-native ceph_pg type that has the correct
field widths.  Limit any remaining uses of ceph_pg_v1 to dealing with the
legacy protocol.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
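As a rough illustration of decoding a legacy wire value into a wider CPU-native type, consider the sketch below. The field layout (a 16-bit "preferred" value, a 16-bit placement seed, and a 32-bit pool, all little-endian) and the names `pg_native` and `decode_pg_v1()` are assumptions made for this example, not definitions quoted from the commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed native form: wide enough for a 64-bit pool and a 32-bit seed. */
struct pg_native {
    uint64_t pool;
    uint32_t seed;
};

/* Read an n-byte little-endian integer (n <= 4). */
static uint32_t get_le(const uint8_t *p, size_t n)
{
    uint32_t v = 0;

    for (size_t i = 0; i < n; i++)
        v |= (uint32_t)p[i] << (8 * i);
    return v;
}

/*
 * Decode an 8-byte legacy value from `buf` into the native struct.
 * Returns 0 on success, -1 if the buffer is too short.
 */
int decode_pg_v1(const uint8_t *buf, size_t len, struct pg_native *pg)
{
    if (len < 8)
        return -1;
    /* bytes 0-1: "preferred", ignored in this sketch */
    pg->seed = get_le(buf + 2, 2);   /* 16-bit seed widened to 32 bits */
    pg->pool = get_le(buf + 4, 4);   /* 32-bit pool widened to 64 bits */
    return 0;
}
```

Decoding once at the protocol boundary means the rest of the code only ever sees the native field widths.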

libceph: rename ceph_pg -> ceph_pg_v1 · 12979354

Authored by Sage Weil on Jan 08, 2013

Rename the old version of this type to distinguish it from the new version.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

26 Feb, 2013 · 3 commits

fs: encode_fh: return FILEID_INVALID if invalid fid_type · 94e07a75

Authored by Namjae Jeon on Feb 17, 2013

This patch is a follow-up to the patch below:

[PATCH] exportfs: add FILEID_INVALID to indicate invalid fid_type
commit: 216b6cbd

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Vivek Trivedi <t.vivek@samsung.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: Sage Weil <sage@inktank.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ceph: prepopulate inodes only when request is aborted · 79f9f99a

Authored by Sage Weil on Jan 29, 2013

If r_aborted is true, we do not hold the dir i_mutex, and cannot touch
the dcache.  However, we still need to update the inodes with the state
returned by the MDS.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ceph: eliminate sparse warnings in fs code · 2c3dd4ff

Authored by Alex Elder on Feb 19, 2013

Fix the causes for sparse warnings reported in the ceph file system
code.  Here there are only two (and they're sort of silly but
they're easy to fix).

This partially resolves:
    http://tracker.ceph.com/issues/4184

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

23 Feb, 2013 · 2 commits

new helper: file_inode(file) · 496ad9aa

Authored by Al Viro on Jan 23, 2013

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ceph: fix statvfs fr_size · 92a49fb0

Authored by Sage Weil on Feb 22, 2013

Different versions of glibc are broken in different ways, but the short of
it is that for the time being, frsize should == bsize, and be used as the
multiple for the blocks, free, and available fields. This mirrors what is
done for NFS. The previous reporting of the page size for frsize meant
that newer glibc and df would report a very small value for the fs size.

Fixes http://tracker.ceph.com/issues/3793.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
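The effect of a mismatched frsize can be seen with a little arithmetic. In POSIX statvfs, the block counts are expressed in units of f_frsize, so a consumer computes the filesystem size as blocks times frsize. The struct and numbers below are hypothetical stand-ins chosen for the example, not values taken from the Ceph code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the statvfs fields that matter here. */
struct fs_stat {
    uint64_t bsize;   /* preferred I/O block size */
    uint64_t frsize;  /* unit in which `blocks` is counted */
    uint64_t blocks;  /* total blocks, in units of frsize */
};

/* Total size in bytes, the way a df-like tool would compute it. */
uint64_t fs_total_bytes(const struct fs_stat *st)
{
    return st->blocks * st->frsize;
}
```

If the block count was produced in 4 MiB units but frsize reports a 4 KiB page size, the computed total is 1024 times too small, which matches the "very small value for the fs size" symptom described above.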

20 Feb, 2013 · 1 commit

ceph: remove a few bogus declarations · 9e0eb85d

Authored by Alex Elder on Feb 06, 2013

There are three ceph page vector functions declared in
"fs/ceph/super.h" that don't belong there.  They're
probably left over from some long-ago code reorganization.

They're properly declared in "include/linux/ceph/libceph.h"
so just delete the ones in "super.h".

This and the next few commits resolve:
    http://tracker.ceph.com/issues/4053

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

19 Feb, 2013 · 5 commits

libceph: update ceph_mds_state_name() and ceph_mds_op_name() · 0eb40bf6

Authored by Alex Elder on Feb 15, 2013

Update ceph_mds_state_name() and ceph_mds_op_name() to include the
newly-added definitions in "ceph_fs.h", and to match their counterparts
in the user space code.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

ceph: kill ceph_osdc_new_request() "num_reply" parameter · a3bea47e

Authored by Alex Elder on Feb 15, 2013

The "num_reply" parameter to ceph_osdc_new_request() is never
used inside that function, so get rid of it.

Note that ceph_sync_write() passes 2 for that argument, while all
other callers pass 1.  It doesn't matter, but perhaps someone should
verify this doesn't indicate a problem.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

ceph: kill ceph_osdc_writepages() "flags" parameter · 24808826

Authored by Alex Elder on Feb 15, 2013

There is only one caller of ceph_osdc_writepages(), and it always
passes 0 as its "flags" argument.  Get rid of that argument and
replace its use in ceph_osdc_writepages() with 0.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

ceph: kill ceph_osdc_writepages() "dosync" parameter · fbf8685f

Authored by Alex Elder on Feb 15, 2013

There is only one caller of ceph_osdc_writepages(), and it always
passes 0 as its "dosync" argument.  Get rid of that argument and
replace its use in ceph_osdc_writepages() with 0.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

ceph: kill ceph_osdc_writepages() "nofail" parameter · 87f979d3

Authored by Alex Elder on Feb 15, 2013

There is only one caller of ceph_osdc_writepages(), and it always
passes the value true as its "nofail" argument.  Get rid of that
argument and replace its use in ceph_osdc_writepages() with the
constant value true.

This and a number of cleanup patches that follow resolve:
    http://tracker.ceph.com/issues/4126

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

14 Feb, 2013 · 10 commits

ceph: implement hidden per-field ceph.*.layout.* vxattrs · 695b7119

Authored by Sage Weil on Jan 20, 2013

Allow individual fields of the layout to be fetched via getxattr.
The ceph.dir.layout.* vxattrs will "disappear" if the exists_cb
indicates there is no dir layout set.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>

ceph: add ceph.dir.layout vxattr · 1f08f2b0

Authored by Sage Weil on Jan 20, 2013

This virtual xattr will only appear when there is a dir layout policy
set on the directory.  It can be set via setxattr and removed via
removexattr (implemented by the MDS).

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>

ceph: change ceph.file.layout.* implementation, content · 32ab0bd7

Authored by Sage Weil on Jan 19, 2013

Implement a new method to generate the ceph.file.layout vxattr using
the new framework.

Use 'stripe_unit' instead of 'chunk_size'.

Include pool name, either as a string or as an integer.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>

ceph: fix listxattr handling for vxattrs · b65917dd

Authored by Sage Weil on Jan 20, 2013

Only include vxattrs in the result if they are not hidden and exist
(as determined by the exists_cb callback).

Note that the buffer size we return when 0 is passed in always includes
vxattrs that *might* exist, forming an upper bound.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>
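The listing rule described here (skip hidden entries, skip entries whose exists_cb says no, but report an upper bound on a size query) can be sketched in user space as follows. The struct layout, `list_vxattrs()`, and `has_layout()` are hypothetical names for illustration; the real kernel structures and signatures may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry for one virtual xattr. */
struct vxattr {
    const char *name;
    bool hidden;                           /* never shown by listxattr */
    bool (*exists_cb)(const void *inode);  /* NULL: always exists */
};

/* Demo callback: the "inode" is just a flag saying a layout is set. */
bool has_layout(const void *inode)
{
    return *(const bool *)inode;
}

/*
 * Copy the visible names into `buf` as NUL-terminated strings.
 * With size == 0 nothing is copied and the result is an upper bound
 * that also counts entries that merely might exist.
 * Returns the byte count, or -1 if `buf` is too small.
 */
long list_vxattrs(const struct vxattr *tab, size_t n, const void *inode,
                  char *buf, size_t size)
{
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        size_t len;

        if (tab[i].hidden)
            continue;
        if (size != 0 && tab[i].exists_cb && !tab[i].exists_cb(inode))
            continue;
        len = strlen(tab[i].name) + 1;
        if (size != 0) {
            if (used + len > size)
                return -1;
            memcpy(buf + used, tab[i].name, len);
        }
        used += len;
    }
    return (long)used;
}
```

A caller that first asks for the size and then allocates that much is therefore never short, even if a conditional vxattr appears between the two calls.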

ceph: fix getxattr vxattr handling · 0bee82fb

Authored by Sage Weil on Jan 20, 2013

Change the vxattr handling for getxattr so that vxattrs are checked
prior to any xattr content, and never after.  Enforce vxattr existence
via the exists_cb callback.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>

ceph: add exists_cb to vxattr struct · f36e4472

Authored by Sage Weil on Jan 20, 2013

Allow for a callback to dynamically determine if a vxattr exists for
the given inode.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>

ceph: pass ceph.* removexattrs through to MDS · d421acb1

Authored by Sage Weil on Jan 20, 2013

If we do not explicitly recognize a vxattr (e.g., as readonly), pass
the request through to the MDS and deal with it there.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>

ceph: pass unhandled ceph.* setxattrs through to MDS · 3adf654d

Authored by Sage Weil on Jan 31, 2013

If we do not specifically understand a setxattr on a ceph.* virtual
xattr, send it through to the MDS.  This allows us to implement new
functionality via the MDS without direct support on the client side.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>

ceph: support hidden vxattrs · 8860147a

Authored by Sage Weil on Jan 31, 2013

Add the ability to flag virtual xattrs as hidden, such that you can
getxattr them but they do not appear in listxattr.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>

ceph: remove 'ceph.layout' virtual xattr · 39b648d9

Authored by Sage Weil on Jan 31, 2013

This has been deprecated since v3.3, 114fc474.  Kill it.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>

12 Feb, 2013 · 4 commits

ceph: Convert kuids and kgids before printing them. · bd2bae6a

Authored by Eric W. Biederman on Jan 31, 2013

Before printing kuid and kgid values convert them into
the initial user namespace.

Cc: Sage Weil <sage@inktank.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

ceph: Convert struct ceph_mds_request to use kuid_t and kgid_t · ff3d0046

Authored by Eric W. Biederman on Jan 31, 2013

Hold the uid and gid for a pending ceph mds request using the types
kuid_t and kgid_t.  When a request message is finally created convert
the kuid_t and kgid_t values into uids and gids in the initial user
namespace.

Cc: Sage Weil <sage@inktank.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
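The idea of holding an id in a kernel-internal type and converting it to a namespace-relative number only at the boundary can be modelled in user space. Everything below is a simplified mock: the `demo_` names, the single-range mapping, and the invalid-id value are assumptions for illustration and are not the kernel's kuid_t implementation.

```c
#include <stdint.h>

#define DEMO_INVALID_ID UINT32_MAX

/* Wrapping the id in a struct stops it being mixed up with plain
 * integers by accident. */
typedef struct { uint32_t val; } demo_kuid_t;

/* Mock namespace: one contiguous range of ids mapped onto kernel ids. */
struct demo_user_ns {
    uint32_t first_inside;  /* first id as seen inside the namespace */
    uint32_t first_kernel;  /* kernel-internal id it maps to */
    uint32_t count;         /* number of ids in the range */
};

/* The initial namespace maps every valid id to itself. */
const struct demo_user_ns demo_init_ns = { 0, 0, UINT32_MAX };

/* Namespace-relative uid -> kernel-internal id. */
demo_kuid_t demo_make_kuid(const struct demo_user_ns *ns, uint32_t uid)
{
    demo_kuid_t k = { DEMO_INVALID_ID };

    if (uid >= ns->first_inside && uid - ns->first_inside < ns->count)
        k.val = ns->first_kernel + (uid - ns->first_inside);
    return k;
}

/* Kernel-internal id -> uid as seen from `ns`, or DEMO_INVALID_ID. */
uint32_t demo_from_kuid(const struct demo_user_ns *ns, demo_kuid_t k)
{
    if (k.val >= ns->first_kernel && k.val - ns->first_kernel < ns->count)
        return ns->first_inside + (k.val - ns->first_kernel);
    return DEMO_INVALID_ID;
}
```

In this model, a request would store the wrapped id while pending and call the conversion against the initial namespace only when the wire message is built, which is the pattern the commit message describes.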

ceph: Translate inode uid and gid attributes to/from kuids and kgids. · ab871b90

Authored by Eric W. Biederman on Jan 31, 2013

- In fill_inode() translate uids and gids in the initial user namespace
  into kuids and kgids stored in inode->i_uid and inode->i_gid.

- In ceph_setattr() if they have changed convert inode->i_uid and
  inode->i_gid into initial user namespace uids and gids for
  transmission.

Cc: Sage Weil <sage@inktank.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

ceph: Translate between uid and gids in cap messages and kuids and kgids · 05cb11c1

Authored by Eric W. Biederman on Jan 31, 2013

- Make the uid and gid arguments of send_cap_msg() used to compose
  ceph_mds_caps messages of type kuid_t and kgid_t.

- Pass inode->i_uid and inode->i_gid in __send_cap to send_cap_msg()
  through variables of type kuid_t and kgid_t.

- Modify struct ceph_cap_snap to store uids and gids in types kuid_t
  and kgid_t.  This allows capturing inode->i_uid and inode->i_gid in
  ceph_queue_cap_snap() without loss and passing them to
  __ceph_flush_snaps() where they are removed from struct
  ceph_cap_snap and passed to send_cap_msg().

- In handle_cap_grant translate uid and gids in the initial user
  namespace stored in struct ceph_mds_cap into kuids and kgids
  before setting inode->i_uid and inode->i_gid.

Cc: Sage Weil <sage@inktank.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

18 Jan, 2013 · 2 commits

libceph: pass length to ceph_calc_file_object_mapping() · e8afad65

Authored by Alex Elder on Nov 14, 2012

ceph_calc_file_object_mapping() takes (among other things) a "file"
offset and length, and based on the layout, determines the object
number ("bno") backing the affected portion of the file's data and
the offset into that object where the desired range begins.  It also
computes the size that should be used for the request--either the
amount requested or something less if that would exceed the end of
the object.

This patch changes the input length parameter in this function so it
is used only for input.  That is, the argument will be passed by
value rather than by address, so the value provided won't get
updated by the function.

The value would only get updated if the length would surpass the
current object, and in that case the value it got updated to would
be exactly that returned in *oxlen.

Only one of the two callers is affected by this change.  Update
ceph_calc_raw_layout() so it records any updated value.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
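The mapping this commit describes (file offset and length in, object number, in-object offset, and clipped length out) can be sketched with ordinary striping arithmetic. The struct, the function name `calc_mapping()`, and the validation rules are assumptions for this example. Note that the sketch clips the length at the end of the current stripe unit, which coincides with the end of the object only when stripe_unit equals object_size.

```c
#include <stdint.h>

/* Hypothetical layout description for this sketch. */
struct layout {
    uint32_t stripe_unit;   /* bytes per stripe unit */
    uint32_t stripe_count;  /* objects striped across */
    uint32_t object_size;   /* bytes per object, multiple of stripe_unit */
};

/*
 * Map (off, len) in the file to an object number, an offset within that
 * object, and a length clipped so it does not run past the current
 * stripe unit.  `len` is passed by value, so the caller's copy is never
 * modified; the clipped value comes back only through *oxlen.
 * Returns 0 on success, -1 on an invalid layout.
 */
int calc_mapping(const struct layout *l, uint64_t off, uint64_t len,
                 uint64_t *ono, uint64_t *oxoff, uint64_t *oxlen)
{
    uint64_t su = l->stripe_unit, sc = l->stripe_count;
    uint64_t os = l->object_size;
    uint64_t su_per_object, bl, su_offset, stripeno, stripepos, room;

    if (su == 0 || sc == 0 || os < su || os % su != 0)
        return -1;

    su_per_object = os / su;
    bl = off / su;            /* index of the stripe unit in the file */
    su_offset = off % su;     /* offset inside that stripe unit */
    stripeno = bl / sc;       /* which stripe (row) */
    stripepos = bl % sc;      /* which object within the object set */

    *ono = (stripeno / su_per_object) * sc + stripepos;
    *oxoff = (stripeno % su_per_object) * su + su_offset;

    room = su - su_offset;
    *oxlen = len < room ? len : room;
    return 0;
}
```

Because the length is an input only, a caller that needs the clipped value reads it from the output parameter, which is the bookkeeping the commit moves into ceph_calc_raw_layout().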

ceph: check mds_wanted for imported cap · 390306c3

Authored by Yan, Zheng on Jan 04, 2013

The MDS may have incorrect wanted caps after importing caps. So the
client should check the value the MDS has and send a cap update if
necessary.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

openeuler / Kernel: mirror last synced successfully more than 1 year ago