- 22 Mar, 2012 3 commits
-
-
Committed by Alex Elder
Change the name (and type) of a few CRC-related Boolean local variables so they contain the word "do", to distinguish their purpose from variables used for holding an actual CRC value. Note that in the process of doing this I identified a fairly serious logic error in write_partial_msg_pages(): the value of "do_crc" assigned appears to be the opposite of what it should be. No attempt to fix this is made here; this change preserves the erroneous behavior. The problem I found is documented here: http://tracker.newdream.net/issues/2064 Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Alex Elder
The messenger workqueue has no need to be public, so give it static scope. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Alex Elder
Each messenger allocates a page to be used when writing zeroes out in the event of error or other abnormal condition. Instead, use the kernel ZERO_PAGE() for that purpose. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
-
- 26 Oct, 2011 1 commit
-
-
Committed by Sage Weil
The pool allocation failures are masked by the pool; there is no need to spam the console about them. (That's the whole point of having the pool in the first place.) Mark msg allocations whose failure is safely handled as such. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 15 Sep, 2011 1 commit
-
-
Committed by Jesper Juhl
It was pointed out by 'make versioncheck' that some includes of linux/version.h are not needed in include/. This patch removes them. When I last posted the patch, the ceph bit was ACK'ed by Sage Weil, so I've added that below. The pwc-ioctl change generated quite a bit of discussion about V4L version numbers in general, but as far as I can tell, no consensus was reached on what the long-term solution should be, so in the meantime I think we could start by just removing the unneeded include, which is why I'm resending the patch with that hunk still included. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Acked-by: Sage Weil <sage@newdream.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
-
- 27 Jul, 2011 1 commit
-
-
Committed by Sage Weil
Keep track of when an outgoing message is ACKed (i.e., the server fully received it and, presumably, queued it for processing). Time out OSD requests only if it's been too long since they've been received. This prevents timeouts and connection thrashing when the OSDs are simply busy and are throttling the requests they read off the network. Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
-
- 05 Mar, 2011 2 commits
-
-
Committed by Sage Weil
There was some broken keepalive code using a dead variable. Shift to using the proper bit flag. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
With commit f363e45f we replaced a bunch of hacky workqueue mutual exclusion logic with the WQ_NON_REENTRANT flag. One piece of fallout is that the exponential backoff breaks in certain cases:
* con_work attempts to connect.
* we get an immediate failure, and the socket state change handler queues immediate work.
* con_work calls con_fault, we decide to back off, but can't queue delayed work.
In this case, we add a BACKOFF bit to make con_work reschedule delayed work next time it runs (which should be immediately).
Signed-off-by: Sage Weil <sage@newdream.net>
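The BACKOFF idea described above can be sketched in user-space C. This is only a model under stated assumptions: the struct, the function names, and the delay constants are hypothetical and are not the kernel's actual definitions. It shows a fault handler that grows the delay exponentially and, when delayed work cannot be queued, records a pending flag that the next run of the work function turns into a reschedule.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constants for illustration; not taken from the kernel. */
#define BASE_DELAY_MS 250UL
#define MAX_DELAY_MS  (5UL * 60UL * 1000UL)

/* Hypothetical model of the per-connection backoff state. */
struct conn_model {
    unsigned long delay_ms;  /* current backoff delay; 0 means none yet */
    bool backoff_pending;    /* models the BACKOFF bit */
};

/* On a fault: double the delay (capped). If delayed work cannot be
 * queued right now (queue_ok == false), remember that a backoff is owed. */
static void conn_fault(struct conn_model *c, bool queue_ok)
{
    if (c->delay_ms == 0) {
        c->delay_ms = BASE_DELAY_MS;
    } else {
        c->delay_ms *= 2;
        if (c->delay_ms > MAX_DELAY_MS)
            c->delay_ms = MAX_DELAY_MS;
    }
    if (!queue_ok)
        c->backoff_pending = true;
}

/* At the top of the work function: if a backoff is owed, clear the flag
 * and return the delay to reschedule with; otherwise return 0 (proceed). */
static unsigned long conn_work_check_backoff(struct conn_model *c)
{
    if (c->backoff_pending) {
        c->backoff_pending = false;
        return c->delay_ms;
    }
    return 0;
}
```

In this model the work function would call conn_work_check_backoff() first and, on a nonzero result, requeue itself with that delay instead of attempting another connect.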
-
- 13 Jan, 2011 1 commit
-
-
Committed by Tejun Heo
The ceph messenger code does a rather complex dance around the multithreaded workqueue to make sure the same work item isn't executed concurrently on different CPUs. This restriction can be provided by the workqueue itself with WQ_NON_REENTRANT. Make ceph_msgr_wq a non-reentrant workqueue with the default concurrency level and remove the QUEUED/BUSY logic.
* This removes backoff handling in con_work(), but it couldn't reliably block execution of con_work() to begin with: queue_con() can be called after the work started but before BUSY is set. It seems that it was an optimization for a rather cold path and can be safely removed.
* The number of concurrent work items is bounded by the number of connections, and connections are independent of each other. With the default concurrency level, different connections will be executed independently.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Sage Weil <sage@newdream.net> Cc: ceph-devel@vger.kernel.org Signed-off-by: Sage Weil <sage@newdream.net>
-
- 10 Nov, 2010 1 commit
-
-
Committed by Sage Weil
The alignment used for reading data into or out of pages used to be taken from the data_off field in the message header. This only worked as long as the page alignment matched the object offset, breaking direct io to non-page aligned offsets. Instead, explicitly specify the page alignment next to the page vector in the ceph_msg struct, and use that instead of the message header (which probably shouldn't be trusted). The alloc_msg callback is responsible for filling in this field properly when it sets up the page vector. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 21 Oct, 2010 2 commits
-
-
Committed by Yehuda Sadeh
This factors out protocol and low-level storage parts of ceph into a separate libceph module living in net/ceph and include/linux/ceph. This is mostly a matter of moving files around. However, a few key pieces of the interface change as well:
- ceph_client becomes ceph_fs_client and ceph_client, where the latter captures the mon and osd clients, and the fs_client gets the mds client and file system specific pieces.
- Mount option parsing and debugfs setup is correspondingly broken into two pieces.
- The mon client gets a generic handler callback for otherwise unknown messages (mds map, in this case).
- The basic supported/required feature bits can be expanded (and are by ceph_fs_client).
No functional change, aside from some subtle error handling cases that got cleaned up in the refactoring process.
Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Yehuda Sadeh
Allow the messenger to send/receive data in a bio. This is added so that we don't need to copy the data into pages or some other buffer when doing IO for an rbd block device. We can now have trailing variable-sized data for osd ops. Also, osd ops encoding is more modular. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
-
- 30 May, 2010 1 commit
-
-
Committed by Sage Weil
The auth module (part of the mon_client) is needed to free any ceph_authorizer(s) used by the mds and osd connections. Flush the msgr workqueue before stopping monc to ensure that the destroy_authorizer auth op is available when those connections are closed out. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 18 May, 2010 5 commits
-
-
Committed by Yehuda Sadeh
This is essential because the rados block device will need to run in different contexts that require flags other than GFP_NOFS. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
These are used for adjusting behavior, such as conditionally encoding a newer message format. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
Notable changes include pool op defines and types, FLOCK feature bit, and new CMPXATTR osd ops. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
Reset out_keepalive_pending and peer_global_seq, and drop unused var. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
We only need to pass in front_len. Callers can attach any other payload pieces (middle, data) as they see fit. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 12 May, 2010 1 commit
-
-
Committed by Sage Weil
If the tcp connection drops and we reconnect to reestablish a stateful session (with the mds), we need to resend previously sent (and possibly received) messages with the _same_ seq # so that they can be dropped on the other end if needed. Only assign a new seq once after the message is queued. Signed-off-by: Sage Weil <sage@newdream.net>
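A minimal user-space sketch of the "assign the seq once" rule follows. The type and function names are hypothetical and simplified, and using 0 to mean "not yet assigned" is an assumption of this model rather than something the commit states. It illustrates why a resent message keeps its number and how a receiver that tracks the highest seq it has seen could recognize the duplicate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the message and connection. */
struct msg_model { uint64_t seq; };      /* 0 means "not yet assigned" */
struct con_model { uint64_t out_seq; };  /* last seq handed out */

/* Assign a seq only the first time the message is prepared for sending.
 * A resend after reconnect finds seq already set and reuses it. */
static uint64_t prepare_send(struct con_model *con, struct msg_model *m)
{
    if (m->seq == 0)
        m->seq = ++con->out_seq;
    return m->seq;
}

/* Receiver side of the model: anything at or below the highest seq
 * already received is a duplicate and can be dropped. */
static bool is_duplicate(uint64_t highest_received, uint64_t seq)
{
    return seq <= highest_received;
}
```

Had prepare_send() assigned a fresh number on every call, the resent copy would look like a new message to the peer and could be processed twice.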
-
- 23 Mar, 2010 1 commit
-
-
Committed by Sage Weil
We get a fault callback on _every_ tcp connection fault. Normally, we want to reopen the connection when that happens. If the address we have is bad, however, and connection attempts always result in a connection refused or similar error, explicitly closing and reopening the msgr connection just prevents the messenger's backoff logic from kicking in. The result can be a console full of
[ 3974.417106] ceph: osd11 10.3.14.138:6800 connection failed
[ 3974.423295] ceph: osd11 10.3.14.138:6800 connection failed
[ 3974.429709] ceph: osd11 10.3.14.138:6800 connection failed
Instead, if we get a fault, and have outstanding requests, but the osd address hasn't changed and the connection never successfully connected in the first place, do nothing to the osd connection. The messenger layer will back off and retry periodically, because we never connected and thus the lossy bit is not set. Instead, touch each request's r_stamp so that handle_timeout can tell the request is still alive and kicking.
Signed-off-by: Sage Weil <sage@newdream.net>
-
- 02 Mar, 2010 1 commit
-
-
Committed by Sage Weil
Clear LOSSYTX bit, so that if/when we reconnect, said reconnect will retry on failure. Clear _PENDING bits too, to avoid polluting subsequent connection state. Drop unused REGISTERED bit. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 11 Feb, 2010 1 commit
-
-
Committed by Sage Weil
Add infrastructure to allow the mon_client to periodically renew its auth credentials. Also add a messenger callback that will force such a renewal if a peer rejects our authenticator. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
-
- 26 Jan, 2010 3 commits
-
-
Committed by Yehuda Sadeh
This includes handling all the data preallocation and revocation in the same place, without needing a special case for the reserved pages. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
-
Committed by Yehuda Sadeh
This is now done in the same callback that is also responsible for allocating the 'front' part of the message. If we get a message that we haven't got a corresponding tid for, mark it for skipping. The mutex unlock/lock moves from the osd alloc_msg callback to the calling function in the messenger. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
-
Committed by Yehuda Sadeh
Both the front and middle parts of the message are now allocated in ceph_alloc_msg(). Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
-
- 24 Dec, 2009 3 commits
-
-
Committed by Sage Weil
The ceph_pagelist is a simple list of whole pages, strung together via their lru list_head. It facilitates encoding to a "buffer" of unknown size. Allow its use in place of the ceph_msg page vector. This will be used to fix the huge buffer preallocation woes of MDS reconnection. Signed-off-by: Sage Weil <sage@newdream.net>
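To illustrate how a list of whole pages lets an encoder append to a "buffer" of unknown size, here is a user-space sketch. It is not the kernel's ceph_pagelist: the names, the singly linked list (instead of the pages' lru list_head), the fixed 4096-byte page size, and the use of malloc() are all assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u  /* assumed page size for this model */

struct page_node {
    struct page_node *next;
    unsigned char data[MODEL_PAGE_SIZE];
};

struct pagelist_model {
    struct page_node *head, *tail;
    size_t length;  /* total bytes appended so far */
    size_t room;    /* free bytes left in the tail page */
};

static void pagelist_init(struct pagelist_model *pl)
{
    memset(pl, 0, sizeof(*pl));
}

/* Append len bytes, adding pages on demand. Returns 0, or -1 if a page
 * could not be allocated (bytes appended before that stay in the list). */
static int pagelist_append(struct pagelist_model *pl, const void *buf, size_t len)
{
    const unsigned char *src = buf;

    while (len > 0) {
        if (pl->room == 0) {
            struct page_node *p = malloc(sizeof(*p));
            if (!p)
                return -1;
            p->next = NULL;
            if (pl->tail)
                pl->tail->next = p;
            else
                pl->head = p;
            pl->tail = p;
            pl->room = MODEL_PAGE_SIZE;
        }
        size_t n = len < pl->room ? len : pl->room;
        memcpy(pl->tail->data + (MODEL_PAGE_SIZE - pl->room), src, n);
        pl->room -= n;
        pl->length += n;
        src += n;
        len -= n;
    }
    return 0;
}

/* Free every page and reset the list to its empty state. */
static void pagelist_release(struct pagelist_model *pl)
{
    struct page_node *p = pl->head;

    while (p) {
        struct page_node *next = p->next;
        free(p);
        p = next;
    }
    pagelist_init(pl);
}
```

The caller never has to guess the final size up front: memory grows one page at a time as data is appended, which is the property the commit relies on to avoid a huge preallocated buffer.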
-
Committed by Sage Weil
When we issue an OSD read, we specify a vector of pages that the data is to be read into. The request may be sent multiple times, to multiple OSDs, if the osdmap changes, which means we can get more than one reply. Only read data into the page vector if the reply is coming from the OSD we last sent the request to. Keep track of which connection is using the vector by taking a reference. If another connection was already using the vector before and a new reply comes in on the right connection, revoke the pages from the other connection. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
Use a single mutex (previously out_mutex) to protect both read and write activity from concurrent ceph_con_* calls. Drop the mutex when doing callbacks to avoid nested locking (the callback may need to call something like ceph_con_close). Signed-off-by: Sage Weil <sage@newdream.net>
-
- 22 Dec, 2009 2 commits
-
-
Committed by Sage Weil
Also, print fsid using standard format, NOT hex dump. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
Carry a ceph_msg reference for connection->out_msg. This will allow us to make out_sent optional. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 08 Dec, 2009 1 commit
-
-
Committed by Sage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
-
- 19 Nov, 2009 2 commits
-
-
Committed by Sage Weil
When we open a monitor session, we send an initial AUTH message listing the auth protocols we support, our entity name, and (possibly) a previously assigned global_id. The monitor chooses a protocol and responds with an initial message. Initially implement AUTH_NONE, a dummy protocol that provides no security, but works within the new framework. It generates 'authorizers' that are used when connecting to (mds, osd) services that simply state our entity name and global_id. This is a wire protocol change. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
We want to call ceph_con_close when we're done with the connection, before the ref count reaches 0. Once it does, do not call ceph_con_shutdown, as that takes the con mutex and may sleep, and besides, it is unnecessary. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 11 Nov, 2009 1 commit
-
-
Committed by Sage Weil
We need to make sure we only swab the address during the banner once. So break process_banner out of process_connect, and clean up the surrounding code so that these are distinct phases of the handshake. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 04 Nov, 2009 1 commit
-
-
Committed by Sage Weil
We exchange struct ceph_entity_addr over the wire and store it on disk. The sockaddr_storage.ss_family field, however, is host endianness. So, fix ss_family endianness to big endian when sending/receiving over the wire. Signed-off-by: Sage Weil <sage@newdream.net>
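The endianness issue can be shown with a small, portable sketch. The helper names below are hypothetical and this is not the kernel's code; the point is only that a 16-bit family value written to the wire byte by byte in big-endian order decodes to the same number on any host, whereas copying the host's in-memory representation would not.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: encode/decode a 16-bit value in big-endian
 * (network) byte order, independent of the host's own endianness. */
static void put_be16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v >> 8);    /* most significant byte first */
    p[1] = (unsigned char)(v & 0xff);
}

static uint16_t get_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}
```

For example, encoding the value 2 (AF_INET on common platforms) always produces the bytes 00 02, and get_be16() recovers 2 from them regardless of which kind of machine wrote or read the buffer.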
-
- 07 Oct, 2009 1 commit
-
-
Committed by Sage Weil
A generic message passing library is used to communicate with all other components in the Ceph file system. The messenger library provides ordered, reliable delivery of messages between two nodes in the system. This implementation is based on TCP. Signed-off-by: Sage Weil <sage@newdream.net>
-