- 15 10月, 2014 6 次提交
-
-
由 Yan, Zheng 提交于
this allow pagelist to present data that may be sent multiple times. Signed-off-by: NYan, Zheng <zyan@redhat.com> Reviewed-by: NSage Weil <sage@redhat.com>
-
由 John Spray 提交于
Implement version 2 of CEPH_MSG_CLIENT_SESSION syntax, which includes additional client metadata to allow the MDS to report on clients by user-sensible names like hostname. Signed-off-by: NJohn Spray <john.spray@redhat.com> Reviewed-by: NYan, Zheng <zyan@redhat.com>
-
由 Yan, Zheng 提交于
ceph_find_inode() may wait on freeing inode, using it inside the s_mutex may cause deadlock. (the freeing inode is waiting for OSD read reply, but dispatch thread is blocked by the s_mutex) Signed-off-by: NYan, Zheng <zyan@redhat.com> Reviewed-by: NSage Weil <sage@redhat.com>
-
由 Yan, Zheng 提交于
we may corrupt waiting list if a request in the waiting list is kicked. Signed-off-by: NYan, Zheng <zyan@redhat.com> Reviewed-by: NSage Weil <sage@redhat.com>
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zyan@redhat.com> Reviewed-by: NSage Weil <sage@redhat.com>
-
由 Yan, Zheng 提交于
So the recovering MDS does not need to fetch these ununsed inodes during cache rejoin. This may reduce MDS recovery time. Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
- 07 8月, 2014 1 次提交
-
-
由 Yan, Zheng 提交于
__do_request() may unregister the request. So we should update iterator 'p' before calling __do_request() Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
-
- 14 7月, 2014 1 次提交
-
-
由 Yan, Zheng 提交于
this makes __choose_mds() choose mds according caps Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
-
- 08 7月, 2014 1 次提交
-
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
-
- 07 6月, 2014 1 次提交
-
-
由 Fabian Frederick 提交于
Update the last pr_warning callsites in fs branch Signed-off-by: NFabian Frederick <fabf@skynet.be> Cc: Sage Weil <sage@inktank.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 6月, 2014 1 次提交
-
-
由 Sage Weil 提交于
We recently modified the client/MDS protocol to include a timestamp in the client request. This allows ctime updates to follow the client's clock in most cases, which avoids subtle problems when clocks are out of sync and timestamps are updated sometimes by the MDS clock (for most requests) and sometimes by the client clock (for cap writeback). Signed-off-by: NSage Weil <sage@inktank.com>
-
- 05 4月, 2014 3 次提交
-
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
-
由 Yan, Zheng 提交于
Preallocate buffer for readdir reply. Limit number of entries in readdir reply according to the buffer size. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
-
由 Yan, Zheng 提交于
send_mds_reconnect() may call discard_cap_releases() after all release messages have been dropped by cleanup_cap_releases() Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
- 03 4月, 2014 1 次提交
-
-
由 Sage Weil 提交于
Do not assume that r_old_dentry implies that r_old_dentry_dir is also true. Separate out the ref cleanup and make the debugs dump behave when it is NULL. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NYan, Zheng <zheng.z.yan@intel.com>
-
- 21 1月, 2014 4 次提交
-
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
-
由 Yan, Zheng 提交于
Send requests that operate on path to directory's auth MDS if mode == USE_AUTH_MDS. Always retry using the auth MDS if got -ESTALE reply from non-auth MDS. Also clean up the code that handles auth MDS change. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
-
由 Yan, Zheng 提交于
- don't trim auth cap if there are flusing caps - don't trim auth cap if any 'write' cap is wanted - allow trimming non-auth cap even if the inode is dirty Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
-
- 01 1月, 2014 1 次提交
-
-
由 Ilya Dryomov 提交于
In preparation for ceph_features.h update, change all features fields from unsigned int/u32 to u64. (ceph.git has ~40 feature bits at this point.) Signed-off-by: NIlya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
- 24 11月, 2013 5 次提交
-
-
由 Yan, Zheng 提交于
We also need to wake up 'safe' waiters if error occurs or request aborted. Otherwise sync(2)/fsync(2) may hang forever. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Signed-off-by: NSage Weil <sage@inktank.com>
-
由 Yan, Zheng 提交于
Aborted requests usually get cleared when the reply is received. If MDS crashes, no reply will be received. So we need to cleanup aborted requests when re-sending requests. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NGreg Farnum <greg@inktank.com> Signed-off-by: NSage Weil <sage@inktank.com>
-
由 Yan, Zheng 提交于
When a cap get released while composing the cap reconnect message. We should skip queuing the release message if the cap hasn't been added to the cap reconnect message. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
由 Yan, Zheng 提交于
It's possible that some caps get released while composing the cap reconnect message. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
由 Yan, Zheng 提交于
call __queue_cap_release() in __ceph_remove_cap(), this avoids acquiring s_cap_lock twice. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
- 01 10月, 2013 1 次提交
-
-
由 Yan, Zheng 提交于
If client has outdated directory fragments information, it may request readdir an non-existent directory fragment. In this case, the MDS finds an approximate directory fragment and sends its contents back to the client. When receiving a reply with fragment that is different than the requested one, the client need to reset the 'readdir offset'. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
- 07 9月, 2013 1 次提交
-
-
由 Yan, Zheng 提交于
commit 6f60f889 (ceph: fix freeing inode vs removing session caps race) introduced ceph_lookup_inode(). But there is already a ceph_find_inode() which provides similar function. So remove ceph_lookup_inode(), use ceph_find_inode() instead. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NAlex Elder <alex.elder@linary.org> Reviewed-by: NSage Weil <sage@inktank.com>
-
- 10 8月, 2013 2 次提交
-
-
由 Yan, Zheng 提交于
remove_session_caps() uses iterate_session_caps() to remove caps, but iterate_session_caps() skips inodes that are being deleted. So session->s_nr_caps can be non-zero after iterate_session_caps() return. We can fix the issue by waiting until deletions are complete. __wait_on_freeing_inode() is designed for the job, but it is not exported, so we use lookup inode function to access it. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
-
由 Nathaniel Yazdani 提交于
When register_session() is given an out-of-range argument for mds, ceph_mdsmap_get_addr() will return a null pointer, which would be given to ceph_con_open() & be dereferenced, causing a kernel oops. This fixes bug #4685 in the Ceph bug tracker <http://tracker.ceph.com/issues/4685>. Signed-off-by: NNathaniel Yazdani <n1ght.4nd.d4y@gmail.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
- 05 7月, 2013 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 04 7月, 2013 3 次提交
-
-
由 majianpeng 提交于
Signed-off-by: NJianpeng Ma <majianpeng@gmail.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com> Reviewed-by: NSage Weil <sage@inktank.com>
-
- 29 6月, 2013 1 次提交
-
-
由 Jeff Layton 提交于
Having a global lock that protects all of this code is a clear scalability problem. Instead of doing that, move most of the code to be protected by the i_lock instead. The exceptions are the global lists that the ->fl_link sits on, and the ->fl_block list. ->fl_link is what connects these structures to the global lists, so we must ensure that we hold those locks when iterating over or updating these lists. Furthermore, sound deadlock detection requires that we hold the blocked_list state steady while checking for loops. We also must ensure that the search and update to the list are atomic. For the checking and insertion side of the blocked_list, push the acquisition of the global lock into __posix_lock_file and ensure that checking and update of the blocked_list is done without dropping the lock in between. On the removal side, when waking up blocked lock waiters, take the global lock before walking the blocked list and dequeue the waiters from the global list prior to removal from the fl_block list. With this, deadlock detection should be race free while we minimize excessive file_lock_lock thrashing. Finally, in order to avoid a lock inversion problem when handling /proc/locks output we must ensure that manipulations of the fl_block list are also protected by the file_lock_lock. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 18 5月, 2013 2 次提交
-
-
由 Jim Schutt 提交于
Ceph's encode_caps_cb() worked hard to not call __page_cache_alloc() while holding a lock, but it's spoiled because ceph_pagelist_addpage() always calls kmap(), which might sleep. Here's the result: [13439.295457] ceph: mds0 reconnect start [13439.300572] BUG: sleeping function called from invalid context at include/linux/highmem.h:58 [13439.309243] in_atomic(): 1, irqs_disabled(): 0, pid: 12059, name: kworker/1:1 . . . [13439.376225] Call Trace: [13439.378757] [<ffffffff81076f4c>] __might_sleep+0xfc/0x110 [13439.384353] [<ffffffffa03f4ce0>] ceph_pagelist_append+0x120/0x1b0 [libceph] [13439.391491] [<ffffffffa0448fe9>] ceph_encode_locks+0x89/0x190 [ceph] [13439.398035] [<ffffffff814ee849>] ? _raw_spin_lock+0x49/0x50 [13439.403775] [<ffffffff811cadf5>] ? lock_flocks+0x15/0x20 [13439.409277] [<ffffffffa045e2af>] encode_caps_cb+0x41f/0x4a0 [ceph] [13439.415622] [<ffffffff81196748>] ? igrab+0x28/0x70 [13439.420610] [<ffffffffa045e9f8>] ? iterate_session_caps+0xe8/0x250 [ceph] [13439.427584] [<ffffffffa045ea25>] iterate_session_caps+0x115/0x250 [ceph] [13439.434499] [<ffffffffa045de90>] ? set_request_path_attr+0x2d0/0x2d0 [ceph] [13439.441646] [<ffffffffa0462888>] send_mds_reconnect+0x238/0x450 [ceph] [13439.448363] [<ffffffffa0464542>] ? ceph_mdsmap_decode+0x5e2/0x770 [ceph] [13439.455250] [<ffffffffa0462e42>] check_new_map+0x352/0x500 [ceph] [13439.461534] [<ffffffffa04631ad>] ceph_mdsc_handle_map+0x1bd/0x260 [ceph] [13439.468432] [<ffffffff814ebc7e>] ? mutex_unlock+0xe/0x10 [13439.473934] [<ffffffffa043c612>] extra_mon_dispatch+0x22/0x30 [ceph] [13439.480464] [<ffffffffa03f6c2c>] dispatch+0xbc/0x110 [libceph] [13439.486492] [<ffffffffa03eec3d>] process_message+0x1ad/0x1d0 [libceph] [13439.493190] [<ffffffffa03f1498>] ? read_partial_message+0x3e8/0x520 [libceph] . . . [13439.587132] ceph: mds0 reconnect success [13490.720032] ceph: mds0 caps stale [13501.235257] ceph: mds0 recovery completed [13501.300419] ceph: mds0 caps renewed Fix it up by encoding locks into a buffer first, and when the number of encoded locks is stable, copy that into a ceph_pagelist. [elder@inktank.com: abbreviated the stack info a bit.] Cc: stable@vger.kernel.org # 3.4+ Signed-off-by: NJim Schutt <jaschut@sandia.gov> Reviewed-by: NAlex Elder <elder@inktank.com>
-
由 Jim Schutt 提交于
In his review, Alex Elder mentioned that he hadn't checked that num_fcntl_locks and num_flock_locks were properly decoded on the server side, from a le32 over-the-wire type to a cpu type. I checked, and AFAICS it is done; those interested can consult Locker::_do_cap_update() in src/mds/Locker.cc and src/include/encoding.h in the Ceph server code (git://github.com/ceph/ceph). I also checked the server side for flock_len decoding, and I believe that also happens correctly, by virtue of having been declared __le32 in struct ceph_mds_cap_reconnect, in src/include/ceph_fs.h. Cc: stable@vger.kernel.org # 3.4+ Signed-off-by: NJim Schutt <jaschut@sandia.gov> Reviewed-by: NAlex Elder <elder@inktank.com>
-
- 02 5月, 2013 4 次提交
-
-
由 Alex Elder 提交于
Change the names of the functions that put data on a pagelist to reflect that we're adding to whatever's already there rather than just setting it to the one thing. Currently only one data item is ever added to a message, but that's about to change. This resolves: http://tracker.ceph.com/issues/2770Signed-off-by: NAlex Elder <elder@inktank.com> Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
-
由 Sage Weil 提交于
Use wrapper functions that check whether the auth op exists so that callers do not need a bunch of conditional checks. Simplifies the external interface. Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NAlex Elder <elder@inktank.com>
-
由 Sage Weil 提交于
Currently the messenger calls out to a get_authorizer con op, which will create a new authorizer if it doesn't yet have one. In the meantime, when we rotate our service keys, the authorizer doesn't get updated. Eventually it will be rejected by the server on a new connection attempt and get invalidated, and we will then rebuild a new authorizer, but this is not ideal. Instead, if we do have an authorizer, call a new update_authorizer op that will verify that the current authorizer is using the latest secret. If it is not, we will build a new one that does. This avoids the transient failure. This fixes one of the sorry sequence of events for bug http://tracker.ceph.com/issues/4282Signed-off-by: NSage Weil <sage@inktank.com> Reviewed-by: NAlex Elder <elder@inktank.com>
-
由 Yan, Zheng 提交于
Current ceph code tracks directory's completeness in two places. ceph_readdir() checks i_release_count to decide if it can set the I_COMPLETE flag in i_ceph_flags. All other places check the I_COMPLETE flag. This indirection introduces locking complexity. This patch adds a new variable i_complete_count to ceph_inode_info. Set i_release_count's value to it when marking a directory complete. By comparing the two variables, we know if a directory is complete Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
-