- 20 2月, 2017 1 次提交
-
-
由 Arnd Bergmann 提交于
I ran into this compile warning, which is the result of BUG_ON(1) not always leading to the compiler treating the code path as unreachable: include/linux/ceph/osdmap.h: In function 'ceph_can_shift_osds': include/linux/ceph/osdmap.h:62:1: error: control reaches end of non-void function [-Werror=return-type] Using BUG() here avoids the warning. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 15 12月, 2016 1 次提交
-
-
由 Ilya Dryomov 提交于
r_safe_completion is currently, and has always been, signaled only if on-disk ack was requested. It's there for fsync and syncfs, which wait for in-flight writes to flush - all data write requests set ONDISK. However, the pool perm check code introduced in 4.2 sends a write request with only ACK set. An unfortunately timed syncfs can then hang forever: r_safe_completion won't be signaled because only an unsafe reply was requested. We could patch ceph_osdc_sync() to skip !ONDISK write requests, but that is somewhat incomplete and yet another special case. Instead, rename this completion to r_done_completion and always signal it when the OSD client is done with the request, whether unsafe, safe, or error. This is a bit cleaner and helps with the cancellation code. Reported-by: NYan, Zheng <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 13 12月, 2016 3 次提交
-
-
由 Jeff Layton 提交于
Add a flags parameter to send_cap_msg, so we can request expedited service from the MDS when we know we'll be waiting on the result. Set that flag in the case of try_flush_caps. The callers of that function generally wait synchronously on the result, so it's beneficial to ask the server to expedite it. Signed-off-by: NJeff Layton <jlayton@redhat.com> Reviewed-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zyan@redhat.com> -
由 Ilya Dryomov 提交于
The length of the reply is protocol-dependent - for cephx it's ceph_x_authorize_reply. Nothing sensible can be passed from the messenger layer anyway. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NSage Weil <sage@redhat.com>
-
- 11 11月, 2016 1 次提交
-
-
由 Ilya Dryomov 提交于
osdc->last_linger_id is a counter for lreq->linger_id, which is used for watch cookies. Starting with a large integer should ease the task of telling apart kernel and userspace clients. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 01 11月, 2016 1 次提交
-
-
由 Christoph Hellwig 提交于
The file only needs the struct bvec_iter delcaration, which is available from bvec.h. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 03 10月, 2016 1 次提交
-
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
- 25 8月, 2016 8 次提交
-
-
由 Ilya Dryomov 提交于
Export client addr/nonce, so userspace can check if a image is being blacklisted. Signed-off-by: NMike Christie <mchristi@redhat.com> [idryomov@gmail.com: ceph_client_addr(), endianess fix] Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
It's gid / global_id in other places. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Douglas Fuller 提交于
Reuse ceph_mon_generic_request infrastructure for sending monitor commands. In particular, add support for 'blacklist add' to prevent other, non-responsive clients from making further updates. Signed-off-by: NDouglas Fuller <dfuller@redhat.com> [idryomov@gmail.com: refactor, misc fixes throughout] Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Douglas Fuller 提交于
Add an interface for the Ceph OSD lock.lock_info method and associated data structures. Based heavily on code by Mike Christie <michaelc@cs.wisc.edu>. Signed-off-by: NDouglas Fuller <dfuller@redhat.com> [idryomov@gmail.com: refactor, misc fixes throughout] Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Douglas Fuller 提交于
This patch adds support for rados lock, unlock and break lock. Based heavily on code by Mike Christie <michaelc@cs.wisc.edu>. Signed-off-by: NDouglas Fuller <dfuller@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Douglas Fuller 提交于
Add a convenience function to osd_client to send Ceph OSD 'class' ops. The interface assumes that the request and reply data each consist of single pages. Signed-off-by: NDouglas Fuller <dfuller@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Douglas Fuller 提交于
Add support for this Ceph OSD op, needed to support the RBD exclusive lock feature. Signed-off-by: NDouglas Fuller <dfuller@redhat.com> [idryomov@gmail.com: refactor, misc fixes throughout] Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Ilya Dryomov 提交于
Clear up EntityName vs entity_name_t confusion. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
- 28 7月, 2016 9 次提交
-
-
由 Arnd Bergmann 提交于
The genksyms helper in the kernel cannot parse a type definition like "typeof(((type *)0)->keyfld)" that is used in the DEFINE_RB_FUNCS helper, causing the following EXPORT_SYMBOL() statement to be ignored when computing the crcs, and triggering a warning about this: WARNING: "ceph_monc_do_statfs" [fs/ceph/ceph.ko] has no CRC To work around the problem, we can rewrite the type to reference an undefined 'extern' symbol instead of a NULL pointer. This is evidently ok for genksyms, and it no longer complains about the line when calling it with 'genksyms -w'. I've looked briefly into extending genksyms instead, but it seems really hard to do. Jan Beulich introduced basic support for 'typeof' a while ago in dc533240 ("genksyms: fix typeof() handling"), but that is not sufficient for the expression we have here. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Fixes: fcd00b68 ("libceph: DEFINE_RB_FUNCS macro") Cc: Jan Beulich <jbeulich@suse.com> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Yan, Zheng 提交于
Track usage count for individual fmode bit. This can reduce the array size by half. Signed-off-by: NYan, Zheng <zyan@redhat.com> -
由 Yan, Zheng 提交于
Add pool namesapce pointer to struct ceph_file_layout and struct ceph_object_locator. Pool namespace is used by when mapping object to PG, it's also used when composing OSD request. The namespace pointer in struct ceph_file_layout is RCU protected. So libceph can read namespace without taking lock. Signed-off-by: NYan, Zheng <zyan@redhat.com> [idryomov@gmail.com: ceph_oloc_destroy(), misc minor changes] Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Yan, Zheng 提交于
The data structure is for storing namesapce string. It allows namespace string to be shared between cephfs inodes with same layout. This data structure can also be referenced by OSD request. Signed-off-by: NYan, Zheng <zyan@redhat.com> -
由 Yan, Zheng 提交于
Define new ceph_file_layout structure and rename old ceph_file_layout to ceph_file_layout_legacy. This is preparation for adding namespace to ceph_file_layout structure. Signed-off-by: NYan, Zheng <zyan@redhat.com> -
由 Ilya Dryomov 提交于
Add ceph_start_encoding() and ceph_start_decoding(), the equivalent of ENCODE_START and DECODE_START in the userspace ceph code. This is based on a patch from Mike Christie <michaelc@cs.wisc.edu>. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> -
由 Ilya Dryomov 提交于
An on-stack oid in ceph_ioctl_get_dataloc() is not initialized, resulting in a WARN and a NULL pointer dereference later on. We will have more of these on-stack in the future, so fix it with a convenience macro. Fixes: d30291b9 ("libceph: variable-sized ceph_object_id") Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
- decode.h needs slab.h for kmalloc() - osd_client.h needs msgpool.h for struct ceph_msgpool - msgpool.h doesn't need messenger.h Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 31 5月, 2016 1 次提交
-
-
由 Ilya Dryomov 提交于
For the benefit of every single caller, take osdc instead of map. Also, now that osdc->osdmap can't ever be NULL, drop the check. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 26 5月, 2016 14 次提交
-
-
由 Zhang Zhuoyu 提交于
This patch makes serverl logical caculation functions return bool to improve readability due to these particular functions only using 0/1 as their return value. No functional change. Signed-off-by: NZhang Zhuoyu <zhangzhuoyu@cmss.chinamobile.com> -
由 Yan, Zheng 提交于
If MDS sorts dentries in dirfrag in hash order, we use hash value to compose dentry offset. dentry offset is: (0xff << 52) | ((24 bits hash) << 28) | (the nth entry hash hash collision) This offset is stable across directory fragmentation. This alos means there is no need to reset readdir offset if directory get fragmented in the middle of readdir. Signed-off-by: NYan, Zheng <zyan@redhat.com> -
由 Yan, Zheng 提交于
Set a flag in readdir request, which indicates that client interprets 'end/complete' as bit flags. So that mds can reply additional flags in readdir reply. Signed-off-by: NYan, Zheng <zyan@redhat.com> -
由 Ilya Dryomov 提交于
Signed-off-by: NIlya Dryomov <idryomov@gmail.com> -
由 Ilya Dryomov 提交于
... with a wrapper around maybe_request_map() - no need for two osdmap-specific functions. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> -
由 Ilya Dryomov 提交于
This adds the "map check" infrastructure for sending osdmap version checks on CALC_TARGET_POOL_DNE and completing in-flight requests with -ENOENT if the target pool doesn't exist or has just been deleted. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> -
由 Ilya Dryomov 提交于
For map check, we are going to need to send CEPH_MSG_MON_GET_VERSION messages asynchronously and get a callback on completion. Refactor MON client to allow firing off generic requests asynchronously and add an async variant of ceph_monc_get_version(). ceph_monc_do_statfs() is switched over and remains sync. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> -
由 Ilya Dryomov 提交于
Implement ceph_osdc_watch_check() to be able to check on status of watch. Note that the time it takes for a watch/notify event to get delivered through the notify_wq is taken into account. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> -
由 Ilya Dryomov 提交于
Implement ceph_osdc_notify() for sending notifies. Due to the fact that the current messenger can't do read-in into pagelists (it can only do write-out from them), I had to go with a page vector for a NOTIFY_COMPLETE payload, for now. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> -
由 Ilya Dryomov 提交于
This adds support and switches rbd to a new, more reliable version of watch/notify protocol. As with the OSD client update, this is mostly about getting the right structures linked into the right places so that reconnects are properly sent when needed. watch/notify v2 also requires sending regular pings to the OSDs - send_linger_ping(). A major change from the old watch/notify implementation is the introduction of ceph_osd_linger_request - linger requests no longer piggy back on ceph_osd_request. ceph_osd_event has been merged into ceph_osd_linger_request. All the details are now hidden within libceph, the interface consists of a simple pair of watch/unwatch functions and ceph_osdc_notify_ack(). ceph_osdc_watch() does return ceph_osd_linger_request, but only to keep the lifetime management simple. ceph_osdc_notify_ack() accepts an optional data payload, which is relayed back to the notifier. Portions of this patch are loosely based on work by Douglas Fuller <dfuller@redhat.com> and Mike Christie <michaelc@cs.wisc.edu>. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> -
由 Ilya Dryomov 提交于
This is a major sync up, up to ~Jewel. The highlights are: - per-session request trees (vs a global per-client tree) - per-session locking (vs a global per-client rwlock) - homeless OSD session - no ad-hoc global per-client lists - support for pool quotas - foundation for watch/notify v2 support - foundation for map check (pool deletion detection) support The switchover is incomplete: lingering requests can be setup and teared down but aren't ever reestablished. This functionality is restored with the introduction of the new lingering infrastructure (ceph_osd_linger_request, linger_work, etc) in a later commit. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> -
由 Ilya Dryomov 提交于
OSD client is getting moved from the big per-client lock to a set of per-session locks. The big rwlock would only be held for read most of the time, so a global osdc->osd_lru needs additional protection. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> -
由 Ilya Dryomov 提交于
Separate osdmap handling from decoding and iterating over a bag of maps in a fresh MOSDMap message. This sets up the scene for the updated OSD client. Of particular importance here is the addition of pi->was_full, which can be used to answer "did this pool go full -> not-full in this map?". This is the key bit for supporting pool quotas. We won't be able to downgrade map_sem for much longer, so drop downgrade_write(). Signed-off-by: NIlya Dryomov <idryomov@gmail.com> -
由 Ilya Dryomov 提交于
This leads to a simpler osdmap handling code, particularly when dealing with pi->was_full, which is introduced in a later commit. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-