Commits · a5988c490ef66cb04ea2f610681949b25c773b3c · openanolis / cloud-kernel

01 Jun, 2012 10 commits

libceph: set CLOSED state bit in con_init · a5988c49

Committed by Alex Elder on May 29, 2012

Once a connection is fully initialized, it is really in a CLOSED
state, so make that explicit by setting the bit in its state field.

It is possible for a connection in NEGOTIATING state to get a
failure, leading to ceph_fault() and ultimately ceph_con_close().
Clear that bit if it is set in that case, to reflect that the
connection truly is closed and is no longer participating in a
connect sequence.

Issue a warning if ceph_con_open() is called on a connection that
is not in CLOSED state.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

a5988c49

libceph: provide osd number when creating osd · e10006f8

Committed by Alex Elder on May 26, 2012

Pass the osd number to the create_osd() routine, and move the
initialization of fields that depend on it therein.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

e10006f8

libceph: start tracking connection socket state · ce2c8903

Committed by Alex Elder on May 22, 2012

Start explicitly keeping track of the state of a ceph connection's
socket, separate from the state of the connection itself.  Create
placeholder functions to encapsulate the state transitions.

    --------
    | NEW* |  transient initial state
    --------
        | con_sock_state_init()
        v
    ----------
    | CLOSED |  initialized, but no socket (and no
    ----------  TCP connection)
     ^      \
     |       \ con_sock_state_connecting()
     |        ----------------------
     |                              \
     + con_sock_state_closed()       \
     |\                               \
     | \                               \
     |  -----------                     \
     |  | CLOSING |  socket event;       \
     |  -----------  await close          \
     |       ^                            |
     |       |                            |
     |       + con_sock_state_closing()   |
     |      / \                           |
     |     /   ---------------            |
     |    /                   \           v
     |   /                    --------------
     |  /    -----------------| CONNECTING |  socket created, TCP
     |  |   /                 --------------  connect initiated
     |  |   | con_sock_state_connected()
     |  |   v
    -------------
    | CONNECTED |  TCP connection established
    -------------

Make the socket state an atomic variable, reinforcing that it's a
distinct transition with no possible "intermediate/both" states.
This is almost certainly overkill at this point, though the
transitions into CONNECTED and CLOSING state do get called via
socket callback (the rest of the transitions occur with the
connection mutex held).  We can back out the atomicity later.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

ce2c8903
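
The transitions in the diagram above can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel code: C11 `<stdatomic.h>` stands in for the kernel's atomic type, and the `sock_conn` struct, the `SOCK_*` values, and the `sock_state_set()` helper are hypothetical names. Only the `con_sock_state_*()` function names come from the commit message. Each transition swaps in the new state atomically and reports whether the state it replaced is one the diagram allows.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical state values mirroring the diagram in the commit message. */
enum sock_state {
    SOCK_NEW,        /* transient initial state */
    SOCK_CLOSED,     /* initialized, but no socket */
    SOCK_CONNECTING, /* socket created, TCP connect initiated */
    SOCK_CONNECTED,  /* TCP connection established */
    SOCK_CLOSING     /* socket event seen; awaiting close */
};

/* Hypothetical stand-in for the socket-state part of a connection. */
struct sock_conn {
    _Atomic int sock_state;
};

/* Swap in the new state atomically and return the state it replaced. */
static int sock_state_set(struct sock_conn *con, int new_state)
{
    return atomic_exchange(&con->sock_state, new_state);
}

/* Each function returns 1 if the previous state was expected, else 0. */
int con_sock_state_init(struct sock_conn *con)
{
    return sock_state_set(con, SOCK_CLOSED) == SOCK_NEW;
}

int con_sock_state_connecting(struct sock_conn *con)
{
    return sock_state_set(con, SOCK_CONNECTING) == SOCK_CLOSED;
}

int con_sock_state_connected(struct sock_conn *con)
{
    return sock_state_set(con, SOCK_CONNECTED) == SOCK_CONNECTING;
}

int con_sock_state_closing(struct sock_conn *con)
{
    int old = sock_state_set(con, SOCK_CLOSING);

    return old == SOCK_CONNECTING || old == SOCK_CONNECTED;
}

int con_sock_state_closed(struct sock_conn *con)
{
    int old = sock_state_set(con, SOCK_CLOSED);

    return old == SOCK_CONNECTED || old == SOCK_CLOSING;
}
```

Returning the check result keeps the sketch easy to test; a real implementation might instead log a warning when the previous state is unexpected.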

libceph: start separating connection flags from state · 928443cd

Committed by Alex Elder on May 22, 2012

A ceph_connection holds a mixture of connection state (as in "state
machine" state) and connection flags in a single "state" field.  To
make the distinction more clear, define a new "flags" field and use
it rather than the "state" field to hold Boolean flag values.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

928443cd

libceph: embed ceph messenger structure in ceph_client · 15d9882c

Committed by Alex Elder on May 26, 2012

A ceph client has a pointer to a ceph messenger structure in it.
There is always exactly one ceph messenger for a ceph client, so
there is no need to allocate it separate from the ceph client
structure.

Switch the ceph_client structure to embed its ceph_messenger
structure.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

15d9882c

libceph: rename kvec_reset and kvec_add functions · e2200423

Committed by Alex Elder on May 23, 2012

The functions ceph_con_out_kvec_reset() and ceph_con_out_kvec_add()
are entirely private functions, so drop the "ceph_" prefix from their
names to make them slightly more wieldy.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

e2200423

libceph: rename socket callbacks · 327800bd

Committed by Alex Elder on May 22, 2012

Change the names of the three socket callback functions to make it
more obvious they're specifically associated with a connection's
socket (not the ceph connection that uses it).
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

327800bd

libceph: kill bad_proto ceph connection op · 6384bb8b

Committed by Alex Elder on May 29, 2012

No code sets a bad_proto method in its ceph connection operations
vector, so just get rid of it.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>

6384bb8b

libceph: eliminate connection state "DEAD" · e5e372da

Committed by Alex Elder on May 22, 2012

The ceph connection state "DEAD" is never set and is therefore not
needed.  Eliminate it.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>

e5e372da

ceph: check PG_Private flag before accessing page->private · 28c0254e

Committed by Yan, Zheng on May 28, 2012

I got lots of NULL pointer dereference oopses when compiling a kernel on ceph.
The bug occurs because the kernel page migration routine replaces some pages
in the page cache with new pages, and these new pages' private fields can be non-zero.
Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>

28c0254e

22 May, 2012 1 commit

libceph: fix pg_temp updates · 6bd9adbd

Committed by Sage Weil on May 21, 2012

Usually, we are adding pg_temp entries or removing them. Occasionally they
update. In that case, osdmap_apply_incremental() was failing because the
rbtree entry already exists.

Fix by removing the existing entry before inserting a new one.

Fixes http://tracker.newdream.net/issues/2446
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

6bd9adbd
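
The fix can be illustrated with a small user-space sketch. A flat table stands in for the rbtree, and every name here (`pg_table`, `pg_find()`, `pg_insert()`, `pg_remove()`, `pg_apply_update()`) is hypothetical rather than taken from libceph. The point is only the ordering: an insert that refuses duplicate keys succeeds for updates once the old entry has been removed first.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_ENTRIES 8

/* Hypothetical flat table standing in for the rbtree of pg_temp mappings. */
struct pg_entry {
    int used;
    unsigned int pgid;
    int osd;
};

struct pg_table {
    struct pg_entry e[MAX_ENTRIES];
};

/* Return the entry for pgid, or NULL if there is none. */
struct pg_entry *pg_find(struct pg_table *t, unsigned int pgid)
{
    for (int i = 0; i < MAX_ENTRIES; i++)
        if (t->e[i].used && t->e[i].pgid == pgid)
            return &t->e[i];
    return NULL;
}

/* Like a strict tree insert: refuses to overwrite an existing key. */
int pg_insert(struct pg_table *t, unsigned int pgid, int osd)
{
    if (pg_find(t, pgid))
        return -EEXIST;
    for (int i = 0; i < MAX_ENTRIES; i++) {
        if (!t->e[i].used) {
            t->e[i].used = 1;
            t->e[i].pgid = pgid;
            t->e[i].osd = osd;
            return 0;
        }
    }
    return -ENOMEM;
}

/* Remove the entry for pgid if present; a missing entry is not an error. */
void pg_remove(struct pg_table *t, unsigned int pgid)
{
    struct pg_entry *e = pg_find(t, pgid);

    if (e)
        e->used = 0;
}

/* Apply an add-or-update: drop any old mapping, then insert the new one. */
int pg_apply_update(struct pg_table *t, unsigned int pgid, int osd)
{
    pg_remove(t, pgid);
    return pg_insert(t, pgid, osd);
}
```

Calling `pg_insert()` directly for an existing key reproduces the failure mode described above, while `pg_apply_update()` handles both the add and the update case.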

19 May, 2012 2 commits

libceph: avoid unregistering osd request when not registered · 35f9f8a0

Committed by Sage Weil on May 16, 2012

There is a race between two __unregister_request() callers: the
reply path and the ceph_osdc_wait_request().  If we get a reply
*and* the timeout expires at roughly the same time, both callers
will try to unregister the request, and the second one will do bad
things.

Simply check if the request is already unregistered; if so,
return immediately and do nothing.

Fixes http://tracker.newdream.net/issues/2420
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

35f9f8a0
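
A minimal sketch of the guard described above, in user-space C. The structs and the `registered` flag are hypothetical stand-ins for however the real request records its registration, and the sketch assumes both callers are serialized by the same lock, so only the "second caller does nothing" logic is shown.

```c
#include <assert.h>

/* Hypothetical, reduced client: just counts registered requests. */
struct osd_client_sketch {
    int num_requests;
};

/* Hypothetical, reduced request: a flag records whether it is registered. */
struct osd_request_sketch {
    int registered;
};

void register_request(struct osd_client_sketch *osdc,
                      struct osd_request_sketch *req)
{
    req->registered = 1;
    osdc->num_requests++;
}

void unregister_request(struct osd_client_sketch *osdc,
                        struct osd_request_sketch *req)
{
    /* Second caller (reply path or timeout path): nothing left to do. */
    if (!req->registered)
        return;

    req->registered = 0;
    osdc->num_requests--;
}
```

Without the early return, the second call would decrement the counter again and leave the client's bookkeeping inconsistent.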

ceph: add auth buf in prepare_write_connect() · 3da54776

Committed by Alex Elder on May 16, 2012

Move the addition of the authorizer buffer to a connection's
out_kvec out of get_connect_authorizer() and into its caller. This
way, the caller--prepare_write_connect()--can avoid adding the
connect header to out_kvec before it has been fully initialized.

Prior to this patch, it was possible for a connect header to be
sent over the wire before the authorizer protocol or buffer length
fields were initialized. An authorizer buffer associated with that
header could also be queued to send only after the connection header
that describes it was on the wire.

Fixes http://tracker.newdream.net/issues/2424
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

3da54776

17 May, 2012 17 commits

ceph: rename prepare_connect_authorizer() · dac1e716

Committed by Alex Elder on May 16, 2012

Change the name of prepare_connect_authorizer().  The next
patch is going to make this function no longer add anything to the
connection's out_kvec, so it will no longer fit the pattern of
the rest of the prepare_connect_*() functions.

In addition, pass the address of a variable that will hold the
authorization protocol to use.  Move the assignment of that to the
connection's out_connect structure into prepare_write_connect().
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

dac1e716

ceph: return pointer from prepare_connect_authorizer() · 729796be

Committed by Alex Elder on May 16, 2012

Change prepare_connect_authorizer() so it returns a pointer (or
pointer-coded error).
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

729796be

ceph: use info returned by get_authorizer · 8f43fb53

Committed by Alex Elder on May 16, 2012

Rather than passing a bunch of arguments to be filled in with the
content of the ceph_auth_handshake buffer now returned by the
get_authorizer method, just use the returned information in the
caller, and drop the unnecessary arguments.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

8f43fb53

ceph: have get_authorizer methods return pointers · a3530df3

Committed by Alex Elder on May 16, 2012

Have the get_authorizer auth_client method return a ceph_auth
pointer rather than an integer, pointer-encoding any returned
error value.  This is to pave the way for making use of the
returned value in an upcoming patch.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

a3530df3
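
For readers unfamiliar with pointer-encoded errors, the idiom can be sketched in user-space C as follows. The `ERR_PTR()`, `PTR_ERR()`, and `IS_ERR()` helpers below are simplified re-implementations modeled on the kernel helpers of the same names, and `auth_handshake_sketch` and `get_authorizer_sketch()` are hypothetical, not the actual ceph types or methods.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Error values are assumed to occupy the top MAX_ERRNO addresses. */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an encoded pointer. */
static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the range reserved for encoded errors. */
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, reduced handshake structure. */
struct auth_handshake_sketch {
    int protocol;
};

/*
 * Hypothetical getter: returns the handshake on success, or a
 * pointer-encoded -ENOMEM when asked to simulate a failure.
 */
struct auth_handshake_sketch *
get_authorizer_sketch(struct auth_handshake_sketch *auth, int simulate_failure)
{
    if (simulate_failure)
        return ERR_PTR(-ENOMEM);
    return auth;
}
```

A caller checks `IS_ERR()` first and only then either dereferences the result or extracts the error code with `PTR_ERR()`, so a single return value carries both outcomes.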

ceph: ensure auth ops are defined before use · a255651d

Committed by Alex Elder on May 16, 2012

In the create_authorizer method for both the mds and osd clients,
the auth_client->ops pointer is blindly dereferenced.  There is no
obvious guarantee that this pointer has been assigned.  And
furthermore, even if the ops pointer is non-null there is definitely
no guarantee that the create_authorizer or destroy_authorizer
methods are defined.

Add checks in both routines to make sure they are defined (non-null)
before use.  Add similar checks in a few other spots in these files
while we're at it.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

a255651d
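
The kind of check this commit describes might look like the following user-space sketch. The struct layouts, the `*_checked()` wrappers, and the `demo_create()` callback are all hypothetical; only the idea of testing both the ops pointer and the individual method before calling through it comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

struct auth_client_sketch;

/* Hypothetical, reduced operations table. */
struct auth_client_ops_sketch {
    int (*create_authorizer)(struct auth_client_sketch *ac);
    void (*destroy_authorizer)(struct auth_client_sketch *ac);
};

struct auth_client_sketch {
    const struct auth_client_ops_sketch *ops;
    int authorizers; /* count of live authorizers, for the demo only */
};

/* Returns the method's result, or -1 if no create method is available. */
int create_authorizer_checked(struct auth_client_sketch *ac)
{
    if (!ac->ops || !ac->ops->create_authorizer)
        return -1;
    return ac->ops->create_authorizer(ac);
}

/* Calls the destroy method only if both the table and the method exist. */
void destroy_authorizer_checked(struct auth_client_sketch *ac)
{
    if (ac->ops && ac->ops->destroy_authorizer)
        ac->ops->destroy_authorizer(ac);
}

/* Demo callback used to exercise the wrappers. */
int demo_create(struct auth_client_sketch *ac)
{
    ac->authorizers++;
    return 0;
}
```

The wrappers tolerate a missing table as well as a table with a missing method, which are the two cases the commit message calls out.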

ceph: messenger: reduce args to create_authorizer · 74f1869f

Committed by Alex Elder on May 16, 2012

Make use of the new ceph_auth_handshake structure in order to reduce
the number of arguments passed to the create_authorizer method in
ceph_auth_client_ops.  Use a local variable of that type as a
shorthand in the get_authorizer method definitions.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

74f1869f

ceph: define ceph_auth_handshake type · 6c4a1915

Committed by Alex Elder on May 16, 2012

The definitions for the ceph_mds_session and ceph_osd both contain
five fields related only to "authorizers."  Encapsulate those fields
into their own struct type, allowing for better isolation in some
upcoming patches.

Fix the #includes in "linux/ceph/osd_client.h" to lay out their more
complete canonical path.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

6c4a1915

ceph: messenger: check return from get_authorizer · ed96af64

Committed by Alex Elder on May 16, 2012

In prepare_connect_authorizer(), a connection's get_authorizer
method is called but its return value is ignored.  This method can
return an error, so check for it and return it if that ever occurs.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

ed96af64

ceph: messenger: rework prepare_connect_authorizer() · b1c6b980

Committed by Alex Elder on May 16, 2012

Change prepare_connect_authorizer() so it returns without dropping
the connection mutex if the connection has no get_authorizer method.

Use the symbolic CEPH_AUTH_UNKNOWN instead of 0 when assigning
authorization protocols.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

b1c6b980

ceph: messenger: check prepare_write_connect() result · 5a0f8fdd

Committed by Alex Elder on May 16, 2012

prepare_write_connect() can return an error, but only one of its
callers checks for it.  All the rest are in functions that already
return errors, so it should be fine to return the error if one
gets returned.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

5a0f8fdd

ceph: don't set WRITE_PENDING too early · e10c758e

Committed by Alex Elder on May 16, 2012

prepare_write_connect() prepares a connect message, then sets
WRITE_PENDING on the connection.  Then *after* this, it calls
prepare_connect_authorizer(), which updates the content of the
connection buffer already queued for sending.  It's also possible it
will result in prepare_write_connect() returning -EAGAIN despite the
WRITE_PENDING bit getting set.

Fix this by preparing the connect authorizer first, setting the
WRITE_PENDING bit only after that is done.

Partially addresses http://tracker.newdream.net/issues/2424
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

e10c758e

ceph: drop msgr argument from prepare_write_connect() · e825a66d

Committed by Alex Elder on May 16, 2012

In all cases, the value passed as the msgr argument to
prepare_write_connect() is just con->msgr.  Just get the msgr
value from the ceph connection and drop the unneeded argument.

The only msgr passed to prepare_write_banner() is also therefore
just the one from con->msgr, so change that function to drop the
msgr argument as well.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

e825a66d

ceph: messenger: send banner in process_connect() · 41b90c00

Committed by Alex Elder on May 16, 2012

prepare_write_connect() has an argument indicating whether a banner
should be sent out before sending out a connection message.  It's
only ever set in one of its callers, so move the code that arranges
to send the banner into that caller and drop the "include_banner"
argument from prepare_write_connect().
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

41b90c00

ceph: messenger: reset connection kvec caller · 84fb3adf

Committed by Alex Elder on May 16, 2012

Reset a connection's kvec fields in the caller rather than in
prepare_write_connect().  This ends up repeating a few lines of
code, but it improves the separation between distinct operations
on the connection, which we can take advantage of later.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

84fb3adf

libceph: don't reset kvec in prepare_write_banner() · d329156f

Committed by Alex Elder on May 16, 2012

Move the kvec reset for a connection out of prepare_write_banner and
into its only caller.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

d329156f

ceph: ignore preferred_osd field · c047be09

Committed by Sage Weil on May 14, 2012

Old users may not expect EINVAL, and there is no clear user-visible
behavior change now that we ignore it.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

c047be09

ceph: fully initialize new layout · 702aeb1f

Committed by Sage Weil on May 14, 2012

When we are setting a new layout, fully initialize the structure:
 - zero it out
 - always set preferred_osd to -1
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

702aeb1f
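
The two steps listed in this commit can be sketched as follows. The `layout_sketch` struct is a hypothetical, reduced stand-in (the real layout structure has different fields and types), and `layout_init()` is an illustrative helper, not a function from the ceph code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, reduced layout; field names are illustrative only. */
struct layout_sketch {
    unsigned int stripe_unit;
    unsigned int stripe_count;
    unsigned int object_size;
    unsigned int pool;
    int preferred_osd;
};

/* Fully initialize a new layout from the caller-supplied striping values. */
void layout_init(struct layout_sketch *l, unsigned int stripe_unit,
                 unsigned int stripe_count, unsigned int object_size)
{
    memset(l, 0, sizeof(*l));   /* zero it out, including unset fields */
    l->stripe_unit = stripe_unit;
    l->stripe_count = stripe_count;
    l->object_size = object_size;
    l->preferred_osd = -1;      /* always -1 */
}
```

Zeroing first means any field the caller does not set explicitly, such as `pool` here, starts from a known value rather than stale memory.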

15 May, 2012 10 commits

ceph: messenger: change read_partial() to take "end" arg · fd51653f

Committed by Alex Elder on May 10, 2012

Make the second argument to read_partial() be the ending input byte
position rather than the beginning offset it now represents.  This
amounts to moving the addition "to + size" into the caller.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

fd51653f
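
A user-space sketch of a `read_partial()` that takes an "end" position may make the change easier to picture. The `conn_sketch` struct and the `recv_sketch()` byte source are hypothetical stand-ins for the connection and its socket, and the third and fourth parameters (`size`, `object`) are assumptions; the commit message only states that the second argument becomes the ending input byte position.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical connection: an in-memory byte source replaces the socket. */
struct conn_sketch {
    const char *src;  /* bytes "arriving" from the peer */
    int src_len;
    int src_pos;
    int in_base_pos;  /* input bytes consumed so far */
    int chunk;        /* maximum bytes delivered per receive call */
};

/* Stand-in for a socket receive: copies at most `chunk` bytes. */
static int recv_sketch(struct conn_sketch *con, void *buf, int len)
{
    int avail = con->src_len - con->src_pos;
    int n = len < con->chunk ? len : con->chunk;

    if (n > avail)
        n = avail;
    memcpy(buf, con->src + con->src_pos, (size_t)n);
    con->src_pos += n;
    return n;
}

/*
 * Read until in_base_pos reaches `end`.  The object occupies input bytes
 * [end - size, end).  Returns 1 when complete, or 0 if no data is available.
 */
int read_partial(struct conn_sketch *con, int end, int size, void *object)
{
    while (con->in_base_pos < end) {
        int left = end - con->in_base_pos;
        int have = size - left;
        int ret = recv_sketch(con, (char *)object + have, left);

        if (ret <= 0)
            return ret;
        con->in_base_pos += ret;
    }
    return 1;
}
```

The caller keeps a running `end` value and adds each object's size to it before the call, which is the "to + size" addition the commit moves out of `read_partial()`.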

ceph: messenger: update "to" in read_partial() caller · e6cee71f

Committed by Alex Elder on May 10, 2012

read_partial() always increases whatever "to" value is supplied by
adding the requested size to it, and that's the only thing it does
with that pointed-to value.

Do that pointer advance in the caller (and then only when the
updated value will be subsequently used), and change the "to"
parameter to be an in-only and non-pointer value.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

e6cee71f

ceph: messenger: use read_partial() in read_partial_message() · 57dac9d1

Committed by Alex Elder on May 10, 2012

There are two blocks of code in read_partial_message()--those that
read the header and footer of the message--that can be replaced by a
call to read_partial().  Do that.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

57dac9d1

rbd: correct sysfs snap attribute documentation · b7f6519e

Committed by Josh Durgin on Dec 01, 2011

Each attribute is prefixed with "snap_".
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>

b7f6519e

rbd: rename __rbd_update_snaps to __rbd_refresh_header · 263c6ca0

Committed by Josh Durgin on Dec 05, 2011

This function rereads the entire header and handles any changes in
it, not just changes in snapshots.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>

263c6ca0

rbd: fix snapshot size type · 3591538f

Committed by Josh Durgin on Dec 05, 2011

Snapshot sizes should be the same type as regular image sizes. This
only affects their displayed size in sysfs, not the reported size of
an actual block device.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>

3591538f

rbd: remove conditional snapid parameters · b06e6a6b

Committed by Josh Durgin on Nov 21, 2011

The snapid parameters passed to rbd_do_op() and rbd_req_sync_op()
are now always either a valid snapid or an explicit CEPH_NOSNAP.

[elder@dreamhost.com: Rephrased the description]
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>

b06e6a6b

rbd: store snapshot id instead of index · 77dfe99f

Committed by Josh Durgin on Nov 21, 2011

When a device was open at a snapshot, and snapshots were deleted or
added, data from the wrong snapshot could be read. Instead of
assuming the snap context is constant, store the actual snap id when
the device is initialized, and rely on the OSDs to signal an error
if we try reading from a snapshot that was deleted.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>

77dfe99f

rbd: protect read of snapshot sequence number · 403f24d3

Committed by Josh Durgin on Dec 05, 2011

This is updated whenever a snapshot is added or deleted, and the
snapc pointer is changed with every refresh of the header.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>

403f24d3

rbd: fix integer overflow in rbd_header_from_disk() · 50f7c4c9

Committed by Xi Wang on Apr 20, 2012

ondisk->snap_count is read from disk via rbd_req_sync_read() and thus
needs validation.  Otherwise, a bogus `snap_count' could overflow the
kmalloc() size, leading to memory corruption.

Also use `u32' consistently for `snap_count'.

[elder@dreamhost.com: changed to use UINT_MAX rather than ULONG_MAX]
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>

50f7c4c9
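
The validation this commit describes can be sketched in user-space C. The `snap_context_sketch` struct and `alloc_snap_context()` are hypothetical and much reduced; `calloc()` stands in for `kmalloc()`, and the `UINT_MAX` bound follows the bracketed note in the commit message. The idea is to reject a `snap_count` whose allocation size would not fit before computing that size.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, reduced snapshot context: a count plus an array of ids. */
struct snap_context_sketch {
    uint32_t num_snaps;
    uint64_t snaps[];
};

/*
 * Allocate a context with room for snap_count ids.  Returns 0 and stores
 * the allocation in *out, -EINVAL if the size would exceed UINT_MAX, or
 * -ENOMEM if the allocation fails.
 */
int alloc_snap_context(uint32_t snap_count, struct snap_context_sketch **out)
{
    struct snap_context_sketch *ctx;
    size_t size;

    /* Validate the untrusted count before it is used in a size. */
    if (snap_count > (UINT_MAX - sizeof(*ctx)) / sizeof(uint64_t))
        return -EINVAL;

    size = sizeof(*ctx) + snap_count * sizeof(uint64_t);
    ctx = calloc(1, size);
    if (!ctx)
        return -ENOMEM;

    ctx->num_snaps = snap_count;
    *out = ctx;
    return 0;
}
```

Dividing the limit by the element size, rather than multiplying the count, keeps the check itself free of overflow.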

openanolis / cloud-kernel · synced successfully more than 1 year ago
