- 17 May, 2012 1 commit
-
-
Committed by Alex Elder
The definitions of ceph_mds_session and ceph_osd both contain five fields related only to "authorizers." Encapsulate those fields in their own struct type, allowing for better isolation in some upcoming patches. Fix the #includes in "linux/ceph/osd_client.h" to use their complete canonical paths.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
-
- 12 Nov, 2011 1 commit
-
-
Committed by Stratos Psomadakis
The ceph_osd_request struct allocates a 40-byte buffer for object names. RBD image names can be up to 96 characters long (100 with the .rbd suffix), so the object name for the image is truncated and the subsequent map fails. Increase the oid buffer in request messages to avoid the truncation.
Signed-off-by: Stratos Psomadakis <psomas@grnet.gr>
Signed-off-by: Sage Weil <sage@newdream.net>
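The truncation described above can be reproduced with a minimal userspace sketch. This is not the kernel code: the 40-byte buffer and the 96-character name limit come from the commit message, while the helper `copy_oid` and the macro names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define OLD_OID_BUF  40  /* old buffer size, per the commit message */
#define MAX_IMG_NAME 96  /* longest RBD image name, per the commit message */

/*
 * Hypothetical helper: format "<image>.rbd" into dst and report whether
 * the whole name fit. snprintf returns the length the complete string
 * would have had, so a result >= dst_size means the name was truncated.
 */
static int copy_oid(char *dst, size_t dst_size, const char *image)
{
    assert(dst != NULL && dst_size > 0);
    int needed = snprintf(dst, dst_size, "%s.rbd", image);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

With a 96-character image name, a 40-byte destination keeps only the first 39 characters, whereas a buffer of at least 101 bytes holds the full 100-character object name plus the terminating NUL.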
-
- 23 Mar, 2011 1 commit
-
-
Committed by Yehuda Sadeh
Lingering requests are sent to the OSD normally but remain tracked after we receive a successful reply. This keeps the OSD connection open and resends the original request if the object moves to another OSD. The OSD can then send notification messages back to us if another client initiates a notify. This framework will be used by RBD so that the client is notified when a snapshot is created by another node or tool.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
-
- 22 Mar, 2011 1 commit
-
-
Committed by Sage Weil
If we send a request to osd A, and the request's pg remaps to osd B and then back to A in quick succession, we need to resend the request to A. The old code only called kick_requests after processing all incremental maps in a message, so it could easily fail to resend a request that needed to be resent. This would make the osd eventually time out (at least with the current default of osd timeouts enabled).
The correct approach is to scan requests on every map incremental. This patch refactors the kick code in a few ways:
- all requests are either on req_lru (in flight), req_unsent (ready to send), or req_notarget (currently map to no up osd)
- mapping is always done by map_request (previously map_osds)
- if the mapping changes, we requeue; requests are resent only after all map incrementals are processed
- some osd reset code is moved out of kick_requests into a separate function
- the "kick this osd" functionality is moved to kick_osd_requests, as it is unrelated to scanning for request->pg->osd mapping changes
Signed-off-by: Sage Weil <sage@newdream.net>
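The A, B, A remap case can be illustrated with a simplified userspace sketch. The queue names mirror req_lru, req_unsent and req_notarget from the message, but the struct layout, the flat `pg_to_osd` array, and `send_queued` are assumptions made for illustration and do not reflect the kernel data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Which list a request sits on, mirroring the three lists in the message. */
enum req_queue { REQ_LRU, REQ_UNSENT, REQ_NOTARGET };

struct request {
    int pg;               /* placement group index (hypothetical flat id) */
    int osd;              /* OSD the request currently maps to, -1 if none */
    enum req_queue queue; /* list the request is on */
};

/*
 * Remap one request against the current map, where pg_to_osd[pg] is the
 * target OSD or -1 when no OSD is up. Returns 1 if the mapping changed,
 * in which case the request is requeued rather than sent immediately.
 */
static int map_request(struct request *req, const int *pg_to_osd, size_t npgs)
{
    assert(req->pg >= 0 && (size_t)req->pg < npgs);
    int target = pg_to_osd[req->pg];
    if (target == req->osd)
        return 0;
    req->osd = target;
    req->queue = target < 0 ? REQ_NOTARGET : REQ_UNSENT;
    return 1;
}

/* After all incrementals are processed, send everything that was requeued. */
static size_t send_queued(struct request *reqs, size_t n)
{
    size_t sent = 0;
    for (size_t i = 0; i < n; i++) {
        if (reqs[i].queue == REQ_UNSENT) {
            reqs[i].queue = REQ_LRU; /* now in flight again */
            sent++;
        }
    }
    return sent;
}
```

Calling `map_request` once per incremental means a request that moves from A to B and back to A ends up on the unsent list and is resent. Comparing only the state before the first incremental with the state after the last one would see the same OSD and skip the resend.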
-
- 10 Nov, 2010 1 commit
-
-
Committed by Sage Weil
We used to infer the alignment of IOs within a page from the file offset, which assumed the two matched. This broke with direct IO that was not aligned to pages (e.g., 512-byte aligned IO). We were also trusting the alignment specified in the OSD reply, which could have been adjusted by the server. Explicitly specify the page alignment when setting up OSD IO requests.
Signed-off-by: Sage Weil <sage@newdream.net>
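A small sketch shows why inferring alignment from the file offset goes wrong for direct IO. It assumes 4 KiB pages; the helper names and the `SKETCH_` macros are hypothetical and are not the kernel's API.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12                       /* assumes 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/* Old approach: guess the in-page offset of the data from the file offset. */
static unsigned int align_from_file_offset(uint64_t file_off)
{
    return (unsigned int)(file_off & (SKETCH_PAGE_SIZE - 1));
}

/* Explicit approach: take the in-page offset from the buffer address itself. */
static unsigned int align_from_buffer(uintptr_t buf_addr)
{
    return (unsigned int)(buf_addr & (SKETCH_PAGE_SIZE - 1));
}

/* Number of pages touched by len bytes that start align bytes into a page. */
static unsigned int pages_spanned(unsigned int align, uint64_t len)
{
    assert(len > 0 && align < SKETCH_PAGE_SIZE);
    return (unsigned int)((align + len + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT);
}
```

For a 4096-byte direct write at a page-aligned file offset whose user buffer starts 512 bytes into a page, the file offset suggests an alignment of 0 and one page, while the data actually starts at offset 512 and spans two pages.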
-
- 21 Oct, 2010 4 commits
-
-
Committed by Yehuda Sadeh
This factors out the protocol and low-level storage parts of ceph into a separate libceph module living in net/ceph and include/linux/ceph. This is mostly a matter of moving files around. However, a few key pieces of the interface change as well:
- ceph_client becomes ceph_fs_client and ceph_client, where the latter captures the mon and osd clients, and the fs_client gets the mds client and file system specific pieces.
- Mount option parsing and debugfs setup are correspondingly broken into two pieces.
- The mon client gets a generic handler callback for otherwise unknown messages (mds map, in this case).
- The basic supported/required feature bits can be expanded (and are by ceph_fs_client).
No functional change, aside from some subtle error handling cases that got cleaned up in the refactoring process.
Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Yehuda Sadeh
This will be used for rbd snapshot administration.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
-
Committed by Yehuda Sadeh
Allow the messenger to send/receive data in a bio. This is added so that we do not need to copy the data into pages or some other buffer when doing IO for an rbd block device. We can now have trailing variable-sized data for osd ops. The osd op encoding is also more modular.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Yehuda Sadeh
The creation of osd requests is decoupled from the vino parameter, allowing clients of the osd client to use arbitrary object names that are not necessarily vino based. Also, calc_raw_layout now takes a snap id.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
-
- 12 May, 2010 1 commit
-
-
Committed by Sage Weil
OSD requests need to be resubmitted on any pg mapping change, not just when the pg primary changes. Resending only when the primary changes results in occasional 'hung' requests during osd cluster recovery or rebalancing.
Signed-off-by: Sage Weil <sage@newdream.net>
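The difference between the two checks can be sketched as follows. The sketch assumes a pg mapping is represented as an ordered array of OSD ids with the primary first; that representation and both function names are illustrative assumptions, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Old check: only notice a change of the first (primary) OSD. */
static int primary_changed(const int *old_osds, size_t old_n,
                           const int *new_osds, size_t new_n)
{
    int old_primary = old_n ? old_osds[0] : -1;
    int new_primary = new_n ? new_osds[0] : -1;
    return old_primary != new_primary;
}

/* New check: any difference in the ordered OSD set counts as a change. */
static int mapping_changed(const int *old_osds, size_t old_n,
                           const int *new_osds, size_t new_n)
{
    if (old_n != new_n)
        return 1;
    if (old_n == 0)
        return 0;
    assert(old_osds != NULL && new_osds != NULL);
    return memcmp(old_osds, new_osds, old_n * sizeof(*old_osds)) != 0;
}
```

When a replica is swapped but the primary stays the same, for example {3, 5, 7} becoming {3, 5, 9}, the primary-only check reports no change while the full comparison does, and the request is resubmitted.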
-
- 06 May, 2010 1 commit
-
-
Committed by Sage Weil
The ->writepages writeback_control is no longer valid in the writepages completion. We were touching it solely to adjust pages_skipped when there was a writeback error (EIO, ENOSPC, EPERM due to bad osd credentials), causing an oops in the writeback code shortly thereafter. Updating pages_skipped on error isn't correct anyway, so let's just rip out the (clearly broken) code that passes the wbc to the completion.
Signed-off-by: Sage Weil <sage@newdream.net>
-
- 23 Mar, 2010 1 commit
-
-
Committed by Sage Weil
Make the variable name slightly more generic, since it will (soon) reflect either the time the request was sent OR the time it was last determined to be still retrying.
Signed-off-by: Sage Weil <sage@newdream.net>
-
- 05 Mar, 2010 1 commit
-
-
Committed by Yehuda Sadeh
This simplifies the process of timing out messages. We keep an LRU of the messages currently in flight. If a timeout has passed, we reset the osd connection so that messages will be retransmitted. This is a failsafe in case we hit some sort of problem sending a message to the OSD. Normally, we'll get notification via an updated osdmap if there are problems. If a request is older than the keepalive timeout, send a keepalive to ensure we detect any breaks in the TCP connection.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
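The two thresholds can be pictured with a short sketch. It assumes the in-flight list is an array of send times ordered oldest first, and that times are plain counters in arbitrary units; the enum, the function names and the parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum osd_action { ACT_NONE, ACT_KEEPALIVE, ACT_RESET };

/* Decide what to do for one request, given how long it has been in flight. */
static enum osd_action check_request(unsigned long now, unsigned long sent_at,
                                     unsigned long keepalive, unsigned long timeout)
{
    assert(keepalive <= timeout && sent_at <= now);
    unsigned long age = now - sent_at;
    if (age > timeout)
        return ACT_RESET;     /* reset the connection so messages are resent */
    if (age > keepalive)
        return ACT_KEEPALIVE; /* probe for a broken TCP connection */
    return ACT_NONE;
}

/*
 * Walk the list from the oldest entry and count timed-out requests.
 * Because the list is ordered by send time, the walk can stop at the
 * first request that has not timed out.
 */
static size_t count_timed_out(const unsigned long *sent_at, size_t n,
                              unsigned long now, unsigned long keepalive,
                              unsigned long timeout)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (check_request(now, sent_at[i], keepalive, timeout) != ACT_RESET)
            break;
        count++;
    }
    return count;
}
```

Keeping the list ordered by send time is what makes the scan cheap: only the head of the list needs to be examined until a request younger than the timeout is found.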
-
- 02 Mar, 2010 1 commit
-
-
Committed by Sage Weil
Use a single ceph_msg for the osd reply, even when we are getting multiple replies.
Signed-off-by: Sage Weil <sage@newdream.net>
-
- 12 Feb, 2010 1 commit
-
-
Committed by Yehuda Sadeh
Instead of removing the osd connection immediately when its request list is empty, put the osd connection on an LRU. It is removed only if that osd has not been used for more than a specified time.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
-
- 26 Jan, 2010 1 commit
-
-
Committed by Yehuda Sadeh
This includes handling all data preallocation and revocation in the same place, without needing a special case for the reserved pages.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
-
- 15 Jan, 2010 1 commit
-
-
Committed by Sage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
-
- 24 Dec, 2009 1 commit
-
-
Committed by Sage Weil
When we issue an OSD read, we specify a vector of pages that the data is to be read into. The request may be sent multiple times, to multiple OSDs, if the osdmap changes, which means we can get more than one reply. Only read data into the page vector if the reply is coming from the OSD we last sent the request to. Keep track of which connection is using the vector by taking a reference. If another connection was already using the vector and a new reply comes in on the right connection, revoke the pages from the other connection.
Signed-off-by: Sage Weil <sage@newdream.net>
-
- 22 Dec, 2009 1 commit
-
-
Committed by Yehuda Sadeh
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
-
- 08 Dec, 2009 1 commit
-
-
Committed by Sage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
-
- 19 Nov, 2009 1 commit
-
-
Committed by Sage Weil
When we open a monitor session, we send an initial AUTH message listing the auth protocols we support, our entity name, and (possibly) a previously assigned global_id. The monitor chooses a protocol and responds with an initial message. Initially implement AUTH_NONE, a dummy protocol that provides no security but works within the new framework. It generates 'authorizers', used when connecting to (mds, osd) services, that simply state our entity name and global_id. This is a wire protocol change.
Signed-off-by: Sage Weil <sage@newdream.net>
-
- 13 Nov, 2009 1 commit
-
-
Committed by Sage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
-
- 07 Oct, 2009 1 commit
-
-
Committed by Sage Weil
The OSD client is responsible for reading and writing data from/to the object storage pool. This includes determining where objects are stored in the cluster, and ensuring that requests are retried or redirected in the event of a node failure or data migration. If an OSD does not respond before a timeout expires, keepalive messages are sent across the lossless, ordered communications channel to ensure that any break in the TCP connection is discovered. If the session does reset, a reconnection is attempted and affected requests are resent (by the message transport layer).
Signed-off-by: Sage Weil <sage@newdream.net>
-