- 02 5月, 2013 7 次提交
-
-
由 Yan, Zheng 提交于
ceph_aio_write() has an optimization that marks CEPH_CAP_FILE_WR cap dirty before data is copied to page cache and inode size is updated. The optimization avoids slow cap revocation caused by balance_dirty_pages(), but introduces inode size update race. If ceph_check_caps() flushes the dirty cap before the inode size is updated, MDS can miss the new inode size. So just remove the optimization. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NGreg Farnum <greg@inktank.com>
-
由 Sage Weil 提交于
commit 22cddde1 breaks the atomicity of write operation, it also introduces a deadlock between write and truncate. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NGreg Farnum <greg@inktank.com> Conflicts: fs/ceph/addr.c
-
由 Yan, Zheng 提交于
commit c6ffe100 moved the flag that tracks if the dcache contents for a directory are complete to dentry. The problem is there are lots of places that use ceph_dir_{set,clear,test}_complete() while holding i_ceph_lock. but ceph_dir_{set,clear,test}_complete() may sleep because they call dput(). This patch basically reverts that commit. For ceph_d_prune(), it's called with both the dentry to prune and the parent dentry are locked. So it's safe to access the parent dentry's d_inode and clear I_COMPLETE flag. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NGreg Farnum <greg@inktank.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
由 Yan, Zheng 提交于
MDS ignores cap update message if migrate_seq mismatch, so when receiving a cap import message with higher migrate_seq, set mds_want according to the cap import message. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NGreg Farnum <greg@inktank.com>
-
由 Yan, Zheng 提交于
So the client will later send cap release message to MDS Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NGreg Farnum <greg@inktank.com>
-
由 Yan, Zheng 提交于
commit 6e8575fa makes parse_reply_info_extra() return -EIO for LSSNAP Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NGreg Farnum <greg@inktank.com>
-
由 Alex Elder 提交于
Use distinct fields for tracking the number of pages in a message's page array and in a message's page list. Currently only one or the other is used at a time, but that will be changing soon. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
- 04 3月, 2013 1 次提交
-
-
由 Eric W. Biederman 提交于
Modify the request_module to prefix the file system type with "fs-" and add aliases to all of the filesystems that can be built as modules to match. A common practice is to build all of the kernel code and leave code that is not commonly needed as modules, with the result that many users are exposed to any bug anywhere in the kernel. Looking for filesystems with a fs- prefix limits the pool of possible modules that can be loaded by mount to just filesystems trivially making things safer with no real cost. Using aliases means user space can control the policy of which filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf with blacklist and alias directives. Allowing simple, safe, well understood work-arounds to known problematic software. This also addresses a rare but unfortunate problem where the filesystem name is not the same as it's module name and module auto-loading would not work. While writing this patch I saw a handful of such cases. The most significant being autofs that lives in the module autofs4. This is relevant to user namespaces because we can reach the request module in get_fs_type() without having any special permissions, and people get uncomfortable when a user specified string (in this case the filesystem type) goes all of the way to request_module. After having looked at this issue I don't think there is any particular reason to perform any filtering or permission checks beyond making it clear in the module request that we want a filesystem module. The common pattern in the kernel is to call request_module() without regards to the users permissions. In general all a filesystem module does once loaded is call register_filesystem() and go to sleep. Which means there is not much attack surface exposed by loading a filesytem module unless the filesystem is mounted. In a user namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT, which most filesystems do not set today. Acked-by: NSerge Hallyn <serge.hallyn@canonical.com> Acked-by: NKees Cook <keescook@chromium.org> Reported-by: NKees Cook <keescook@google.com> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
- 27 2月, 2013 5 次提交
-
-
由 Sage Weil 提交于
Use the new version of the encoding for osd requests and replies. In the process, update the way we are tracking request ops and reply lengths and results in the struct ceph_osd_request. Update the rbd and fs/ceph users appropriately. The main changes are: - we keep pointers into the request memory for fields we need to update each time the request is sent out over the wire - we keep information about the result in an array in the request struct where the users can easily get at it. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NAlex Elder <elder@inktank.com>
-
由 Sage Weil 提交于
Instead of using the old ceph_object_layout struct, update our internal ceph_calc_object_layout method to use the ceph_pg type. This allows us to pass the full 32-bit precision of the pgid.seed to the callers. It also allows some callers to avoid reaching into the request structures for the struct ceph_object_layout fields. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NAlex Elder <elder@inktank.com>
-
由 Sage Weil 提交于
Support (and require) the PGID64, PGPOOL3, and OSDENC protocol features. These have been present in ceph.git since v0.42, Feb 2012. Require these features to simplify support; nobody is running older userspace. Note that the new request and reply encoding is still not in place, so the new code is not yet functional. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NAlex Elder <elder@inktank.com>
-
由 Sage Weil 提交于
Always decode data into our cpu-native ceph_pg type that has the correct field widths. Limit any remaining uses of ceph_pg_v1 to dealing with the legacy protocol. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NAlex Elder <elder@inktank.com>
-
由 Sage Weil 提交于
Rename the old version this type to distinguish it from the new version. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NAlex Elder <elder@inktank.com>
-
- 26 2月, 2013 3 次提交
-
-
由 Namjae Jeon 提交于
This patch is a follow up on below patch: [PATCH] exportfs: add FILEID_INVALID to indicate invalid fid_type commit: 216b6cbdSigned-off-by: NNamjae Jeon <namjae.jeon@samsung.com> Signed-off-by: NVivek Trivedi <t.vivek@samsung.com> Acked-by: NSteven Whitehouse <swhiteho@redhat.com> Acked-by: NSage Weil <sage@inktank.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Sage Weil 提交于
If r_aborted is true, we do not hold the dir i_mutex, and cannot touch the dcache. However, we still need to update the inodes with the state returned by the MDS. Reported-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NSage Weil <sage@inktank.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Alex Elder 提交于
Fix the causes for sparse warnings reported in the ceph file system code. Here there are only two (and they're sort of silly but they're easy to fix). This partially resolves: http://tracker.ceph.com/issues/4184Reported-by: NFengguang Wu <fengguang.wu@intel.com> Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
- 23 2月, 2013 2 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Sage Weil 提交于
Different versions of glibc are broken in different ways, but the short of it is that for the time being, frsize should == bsize, and be used as the multiple for the blocks, free, and available fields. This mirrors what is done for NFS. The previous reporting of the page size for frsize meant that newer glibc and df would report a very small value for the fs size. Fixes http://tracker.ceph.com/issues/3793. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NGreg Farnum <greg@inktank.com>
-
- 20 2月, 2013 1 次提交
-
-
由 Alex Elder 提交于
There are three ceph page vector functions declared in "fs/ceph/super.h" that don't belong there. They're probably left over from some long-ago code reorganization. They're properly declared in "include/linux/ceph/libceph.h" so just delete the ones in "super.h". This and the next few commits resolve: http://tracker.ceph.com/issues/4053Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
- 19 2月, 2013 5 次提交
-
-
由 Alex Elder 提交于
Update ceph_mds_state_name() and ceph_mds_op_name() to include the newly-added definitions in "ceph_fs.h", and to match its counterpart in the user space code. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
The "num_reply" parameter to ceph_osdc_new_request() is never used inside that function, so get rid of it. Note that ceph_sync_write() passes 2 for that argument, while all other callers pass 1. It doesn't matter, but perhaps someone should verify this doesn't indicate a problem. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
There is only one caller of ceph_osdc_writepages(), and it always passes 0 as its "flags" argument. Get rid of that argument and replace its use in ceph_osdc_writepages() with 0. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
There is only one caller of ceph_osdc_writepages(), and it always passes 0 as its "dosync" argument. Get rid of that argument and replace its use in ceph_osdc_writepages() with 0. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Alex Elder 提交于
There is only one caller of ceph_osdc_writepages(), and it always passes the value true as its "nofail" argument. Get rid of that argument and replace its use in ceph_osdc_writepages() with the constant value true. This and a number of cleanup patches that follow resolve: http://tracker.ceph.com/issues/4126Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
- 14 2月, 2013 10 次提交
-
-
由 Sage Weil 提交于
Allow individual fields of the layout to be fetched via getxattr. The ceph.dir.layout.* vxattr with "disappear" if the exists_cb indicates there no dir layout set. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NSam Lang <sam.lang@inktank.com>
-
由 Sage Weil 提交于
This virtual xattr will only appear when there is a dir layout policy set on the directory. It can be set via setxattr and removed via removexattr (implemented by the MDS). Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NSam Lang <sam.lang@inktank.com>
-
由 Sage Weil 提交于
Implement a new method to generate the ceph.file.layout vxattr using the new framework. Use 'stripe_unit' instead of 'chunk_size'. Include pool name, either as a string or as an integer. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NSam Lang <sam.lang@inktank.com>
-
由 Sage Weil 提交于
Only include vxattrs in the result if they are not hidden and exist (as determined by the exists_cb callback). Note that the buffer size we return when 0 is passed in always includes vxattrs that *might* exist, forming an upper bound. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NSam Lang <sam.lang@inktank.com>
-
由 Sage Weil 提交于
Change the vxattr handling for getxattr so that vxattrs are checked prior to any xattr content, and never after. Enforce vxattr existence via the exists_cb callback. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NSam Lang <sam.lang@inktank.com>
-
由 Sage Weil 提交于
Allow for a callback to dynamically determine if a vxattr exists for the given inode. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NSam Lang <sam.lang@inktank.com>
-
由 Sage Weil 提交于
If we do not explicitly recognized a vxattr (e.g., as readonly), pass the request through to the MDS and deal with it there. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NSam Lang <sam.lang@inktank.com>
-
由 Sage Weil 提交于
If we do not specifically understand a setxattr on a ceph.* virtual xattr, send it through to the MDS. This allows us to implement new functionality via the MDS without direct support on the client side. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NSam Lang <sam.lang@inktank.com>
-
由 Sage Weil 提交于
Add ability to flag virtual xattrs as hidden, such that you can getxattr them but they do not appear in listxattr. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NSam Lang <sam.lang@inktank.com>
-
由 Sage Weil 提交于
This has been deprecated since v3.3, 114fc474. Kill it. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NSam Lang <sam.lang@inktank.com>
-
- 12 2月, 2013 4 次提交
-
-
由 Eric W. Biederman 提交于
Before printing kuid and kgids values convert them into the initial user namespace. Cc: Sage Weil <sage@inktank.com> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
由 Eric W. Biederman 提交于
Hold the uid and gid for a pending ceph mds request using the types kuid_t and kgid_t. When a request message is finally created convert the kuid_t and kgid_t values into uids and gids in the initial user namespace. Cc: Sage Weil <sage@inktank.com> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
由 Eric W. Biederman 提交于
- In fill_inode() transate uids and gids in the initial user namespace into kuids and kgids stored in inode->i_uid and inode->i_gid. - In ceph_setattr() if they have changed convert inode->i_uid and inode->i_gid into initial user namespace uids and gids for transmission. Cc: Sage Weil <sage@inktank.com> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
由 Eric W. Biederman 提交于
- Make the uid and gid arguments of send_cap_msg() used to compose ceph_mds_caps messages of type kuid_t and kgid_t. - Pass inode->i_uid and inode->i_gid in __send_cap to send_cap_msg() through variables of type kuid_t and kgid_t. - Modify struct ceph_cap_snap to store uids and gids in types kuid_t and kgid_t. This allows capturing inode->i_uid and inode->i_gid in ceph_queue_cap_snap() without loss and pssing them to __ceph_flush_snaps() where they are removed from struct ceph_cap_snap and passed to send_cap_msg(). - In handle_cap_grant translate uid and gids in the initial user namespace stored in struct ceph_mds_cap into kuids and kgids before setting inode->i_uid and inode->i_gid. Cc: Sage Weil <sage@inktank.com> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
- 18 1月, 2013 2 次提交
-
-
由 Alex Elder 提交于
ceph_calc_file_object_mapping() takes (among other things) a "file" offset and length, and based on the layout, determines the object number ("bno") backing the affected portion of the file's data and the offset into that object where the desired range begins. It also computes the size that should be used for the request--either the amount requested or something less if that would exceed the end of the object. This patch changes the input length parameter in this function so it is used only for input. That is, the argument will be passed by value rather than by address, so the value provided won't get updated by the function. The value would only get updated if the length would surpass the current object, and in that case the value it got updated to would be exactly that returned in *oxlen. Only one of the two callers is affected by this change. Update ceph_calc_raw_layout() so it records any updated value. Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Yan, Zheng 提交于
The MDS may have incorrect wanted caps after importing caps. So the client should check the value mds has and send cap update if necessary. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NSage Weil <sage@inktank.com>
-