- 20 5月, 2011 3 次提交
-
-
由 Sage Weil 提交于
Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
If there is no get_authorizer method we set the out_kvec to a bogus pointer. The length is also zero in that case, so it doesn't much matter, but it's better not to add the empty item in the first place. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
If a connection is closed and/or reopened (ceph_con_close, ceph_con_open) it can race with a callback. con_work does various state checks for closed or reopened sockets at the beginning, but drops con->mutex before making callbacks. We need to check for state bit changes after retaking the lock to ensure we restart con_work and execute those CLOSED/OPENING tests or else we may end up operating under stale assumptions. In Jim's case, this was causing 'bad tag' errors. There are four cases where we re-take the con->mutex inside con_work: catch them all and return EAGAIN from try_{read,write} so that we can restart con_work. Reported-by: NJim Schutt <jaschut@sandia.gov> Tested-by: NJim Schutt <jaschut@sandia.gov> Signed-off-by: NSage Weil <sage@newdream.net>
-
- 04 5月, 2011 1 次提交
-
-
由 Henry C Chang 提交于
If memory allocation failed, calling ceph_msg_put() will cause GPF since some of ceph_msg variables are not initialized first. Fix Bug #970. Signed-off-by: NHenry C Chang <henry_c_chang@tcloudcomputing.com> Signed-off-by: NSage Weil <sage@newdream.net>
-
- 05 3月, 2011 3 次提交
-
-
由 Sage Weil 提交于
The standby logic used to be pretty dependent on the work requeueing behavior that changed when we switched to WQ_NON_REENTRANT. It was also very fragile. Restructure things so that: - We clear WRITE_PENDING when we set STANDBY. This ensures we will requeue work when we wake up later. - con_work backs off if STANDBY is set. There is nothing to do if we are in standby. - clear_standby() helper is called by both con_send() and con_keepalive(), the two actions that can wake us up again. Move the connect_seq++ logic here. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
There was some broken keepalive code using a dead variable. Shift to using the proper bit flag. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
With commit f363e45f we replaced a bunch of hacky workqueue mutual exclusion logic with the WQ_NON_REENTRANT flag. One pieces of fallout is that the exponential backoff breaks in certain cases: * con_work attempts to connect. * we get an immediate failure, and the socket state change handler queues immediate work. * con_work calls con_fault, we decide to back off, but can't queue delayed work. In this case, we add a BACKOFF bit to make con_work reschedule delayed work next time it runs (which should be immediately). Signed-off-by: NSage Weil <sage@newdream.net>
-
- 04 3月, 2011 1 次提交
-
-
由 Sage Weil 提交于
If we mark the connection CLOSED we will give up trying to reconnect to this server instance. That is appropriate for things like a protocol version mismatch that won't change until the server is restarted, at which point we'll get a new addr and reconnect. An authorization failure like this is probably due to the server not properly rotating it's secret keys, however, and should be treated as transient so that the normal backoff and retry behavior kicks in. Signed-off-by: NSage Weil <sage@newdream.net>
-
- 26 1月, 2011 2 次提交
-
-
由 Sage Weil 提交于
Pass errors from writing to the socket up the stack. If we get -EAGAIN, return 0 from the helper to simplify the callers' checks. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
If we get EAGAIN when trying to read from the socket, it is not an error. Return 0 from the helper in this case to simplify the error handling cases in the caller (indirectly, try_read). Fix try_read to pass any error to it's caller (con_work) instead of almost always returning 0. This let's us respond to things like socket disconnects. Signed-off-by: NSage Weil <sage@newdream.net>
-
- 13 1月, 2011 1 次提交
-
-
由 Tejun Heo 提交于
ceph messenger code does a rather complex dancing around multithread workqueue to make sure the same work item isn't executed concurrently on different CPUs. This restriction can be provided by workqueue with WQ_NON_REENTRANT. Make ceph_msgr_wq non-reentrant workqueue with the default concurrency level and remove the QUEUED/BUSY logic. * This removes backoff handling in con_work() but it couldn't reliably block execution of con_work() to begin with - queue_con() can be called after the work started but before BUSY is set. It seems that it was an optimization for a rather cold path and can be safely removed. * The number of concurrent work items is bound by the number of connections and connetions are independent from each other. With the default concurrency level, different connections will be executed independently. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Sage Weil <sage@newdream.net> Cc: ceph-devel@vger.kernel.org Signed-off-by: NSage Weil <sage@newdream.net>
-
- 14 12月, 2010 1 次提交
-
-
由 Sage Weil 提交于
create_workqueue() returns NULL on failure. Signed-off-by: NSage Weil <sage@newdream.net>
-
- 10 11月, 2010 1 次提交
-
-
由 Sage Weil 提交于
The alignment used for reading data into or out of pages used to be taken from the data_off field in the message header. This only worked as long as the page alignment matched the object offset, breaking direct io to non-page aligned offsets. Instead, explicitly specify the page alignment next to the page vector in the ceph_msg struct, and use that instead of the message header (which probably shouldn't be trusted). The alloc_msg callback is responsible for filling in this field properly when it sets up the page vector. Signed-off-by: NSage Weil <sage@newdream.net>
-
- 02 11月, 2010 1 次提交
-
-
由 Sage Weil 提交于
If the client gets out of sync with the server message sequence number, we normally skip low seq messages (ones we already received). The skip code was also incrementing the expected seq, such that all subsequent messages also appeared old and got skipped, and an eventual timeout on the osd connection. This resulted in some lagging requests and console messages like [233480.882885] ceph: skipping osd22 10.138.138.13:6804 seq 2016, expected 2017 [233480.882919] ceph: skipping osd22 10.138.138.13:6804 seq 2017, expected 2018 [233480.882963] ceph: skipping osd22 10.138.138.13:6804 seq 2018, expected 2019 [233480.883488] ceph: skipping osd22 10.138.138.13:6804 seq 2019, expected 2020 [233485.219558] ceph: skipping osd22 10.138.138.13:6804 seq 2020, expected 2021 [233485.906595] ceph: skipping osd22 10.138.138.13:6804 seq 2021, expected 2022 [233490.379536] ceph: skipping osd22 10.138.138.13:6804 seq 2022, expected 2023 [233495.523260] ceph: skipping osd22 10.138.138.13:6804 seq 2023, expected 2024 [233495.923194] ceph: skipping osd22 10.138.138.13:6804 seq 2024, expected 2025 [233500.534614] ceph: tid 6023602 timed out on osd22, will reset osd Reported-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NSage Weil <sage@newdream.net>
-
- 21 10月, 2010 2 次提交
-
-
由 Yehuda Sadeh 提交于
This factors out protocol and low-level storage parts of ceph into a separate libceph module living in net/ceph and include/linux/ceph. This is mostly a matter of moving files around. However, a few key pieces of the interface change as well: - ceph_client becomes ceph_fs_client and ceph_client, where the latter captures the mon and osd clients, and the fs_client gets the mds client and file system specific pieces. - Mount option parsing and debugfs setup is correspondingly broken into two pieces. - The mon client gets a generic handler callback for otherwise unknown messages (mds map, in this case). - The basic supported/required feature bits can be expanded (and are by ceph_fs_client). No functional change, aside from some subtle error handling cases that got cleaned up in the refactoring process. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Yehuda Sadeh 提交于
Allow the messenger to send/receive data in a bio. This is added so that we wouldn't need to copy the data into pages or some other buffer when doing IO for an rbd block device. We can now have trailing variable sized data for osd ops. Also osd ops encoding is more modular. Signed-off-by: NYehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: NSage Weil <sage@newdream.net>
-
- 04 8月, 2010 1 次提交
-
-
由 Sage Weil 提交于
Signed-off-by: NSage Weil <sage@newdream.net>
-
- 02 8月, 2010 2 次提交
-
-
由 Sage Weil 提交于
Specify the supported/required feature bits in super.h client code instead of using the definitions from the shared kernel/userspace headers (which will go away shortly). Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Yehuda Sadeh 提交于
Mainly fixing minor issues reported by sparse. Signed-off-by: NYehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: NSage Weil <sage@newdream.net>
-
- 10 7月, 2010 2 次提交
-
-
由 Sage Weil 提交于
Use the address family from the peer address instead of assuming IPv4. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
Check for brackets around the ipv6 address to avoid ambiguity with the port number. Signed-off-by: NSage Weil <sage@newdream.net>
-
- 09 7月, 2010 1 次提交
-
-
由 Sage Weil 提交于
The buffer was too small. Make it bigger, use snprintf(), put brackets around the ipv6 address to avoid mixing it up with the :port, and use the ever-so-handy %pI[46] formats. Signed-off-by: NSage Weil <sage@newdream.net>
-
- 06 7月, 2010 1 次提交
-
-
由 Sage Weil 提交于
A message can be on a queue (pending or sent), or out_msg (sending), or both. We were assuming that if it's not on a queue it couldn't be out_msg, but that was false in the case of lossy connections like the OSD. Fix ceph_con_revoke() to treat these cases independently. Also, fix the out_kvec_is_message check to only trigger if we are currently sending _this_ message. This fixes a GPF in tcp_sendpage, triggered by OSD restarts. Signed-off-by: NSage Weil <sage@newdream.net>
-
- 14 6月, 2010 2 次提交
-
-
由 Sage Weil 提交于
We need to properly initialize skip, as not all alloc_msg op instances set it. Also, BUG if someone says skip but also allocates a message. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Yehuda Sadeh 提交于
Fix some problems that came up with sparse. Signed-off-by: NYehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: NSage Weil <sage@newdream.net>
-
- 30 5月, 2010 1 次提交
-
-
由 Sage Weil 提交于
The auth module (part of the mon_client) is needed to free any ceph_authorizer(s) used by the mds and osd connections. Flush the msgr workqueue before stopping monc to ensure that the destroy_authorizer auth op is available when those connections are closed out. Signed-off-by: NSage Weil <sage@newdream.net>
-
- 18 5月, 2010 9 次提交
-
-
由 Yehuda Sadeh 提交于
This is essential, as for the rados block device we'll need to run in different contexts that would need flags that are other than GFP_NOFS. Signed-off-by: NYehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
These are used for adjusting behavior, such as conditionally encoding a newer message format. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
Notable changes include pool op defines and types, FLOCK feature bit, and new CMPXATTR osd ops. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
The CEPH_FEATURE_NOSRCADDR protocol feature avoids putting the full source address in each message header (twice). This patch switches the client to the new scheme, and _requires_ this feature on the server. The server will support both the old and new schemes. That means an old client will work with a new server, but a new client will not work with an old server. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
Simplify messenger locking, and close race between ceph_con_close() setting the CLOSED bit and con_work() checking the bit, then taking the mutex. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
Reset out_keepalive_pending and peer_global_seq, and drop unused var. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
We only need to pass in front_len. Callers can attach any other payload pieces (middle, data) as they see fit. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
Returning ERR_PTR(-ENOMEM) is useless extra work. Return NULL on failure instead, and fix up the callers (about half of which were wrong anyway). Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Yehuda Sadeh 提交于
Following Nick Piggin patches in btrfs, pagecache pages should be allocated with __page_cache_alloc, so they obey pagecache memory policies. Also, using add_to_page_cache_lru instead of using a private pagevec where applicable. Signed-off-by: NYehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: NSage Weil <sage@newdream.net>
-
- 12 5月, 2010 2 次提交
-
-
由 Sage Weil 提交于
If the tcp connection drops and we reconnect to reestablish a stateful session (with the mds), we need to resend previously sent (and possibly received) messages with the _same_ seq # so that they can be dropped on the other end if needed. Only assign a new seq once after the message is queued. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
We shouldn't leak any prior memory contents to other parties. And random data, particularly in the 'version' field, can cause problems down the line. Signed-off-by: NSage Weil <sage@newdream.net>
-
- 04 5月, 2010 2 次提交
-
-
由 Sage Weil 提交于
We can get old message seq #'s after a tcp reconnect for stateful sessions (i.e., the MDS). If we get a higher seq #, that is an error, and we shouldn't see any bad seq #'s for stateless (mon, osd) connections. Signed-off-by: NSage Weil <sage@newdream.net>
-
由 Sage Weil 提交于
Increment in_seq even when the message is skipped for some reason. Signed-off-by: NSage Weil <sage@newdream.net>
-
- 14 4月, 2010 1 次提交
-
-
由 Sage Weil 提交于
Use a separate class for ceph sockets to prevent lockdep confusion. Because ceph sockets only get passed kernel pointers, there is no dependency from sk_lock -> mmap_sem. If we share the same class as other sockets, lockdep detects a circular dependency from mmap_sem (page fault) -> fs mutex -> sk_lock -> mmap_sem because dependencies are noted from both ceph and user contexts. Using a separate class prevents the sk_lock(ceph) -> mmap_sem dependency and makes lockdep happy. Signed-off-by: NSage Weil <sage@newdream.net>
-