Commits · 4e891e0af0f0011c90067373c46d7228568ec079 · openeuler / Kernel

31 Jul, 2012 40 commits

rbd: have __rbd_add_snap_dev() return a pointer · 4e891e0a

Committed by Alex Elder on Jul 10, 2012

It's not obvious whether the snapshot pointer whose address is
provided to __rbd_add_snap_dev() will be assigned by that function.
Change it to return the snapshot, or a pointer-coded errno in the
event of a failure.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

4e891e0a
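
The pointer-coded errno convention described in this commit can be sketched in plain userspace C. This is a minimal illustration, not the rbd code: `err_ptr()`, `is_err()`, and `ptr_err()` are stand-ins modeled on the kernel's `ERR_PTR()`, `IS_ERR()`, and `PTR_ERR()` helpers, and `struct snap`, `add_snap()`, and its `simulate_oom` flag are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Errno values are small, so the top 4095 addresses can encode them. */
#define MAX_ERRNO 4095

/* Encode a negative errno as a pointer value. */
static void *err_ptr(long error)
{
    assert(error < 0 && error >= -MAX_ERRNO);
    return (void *)(intptr_t)error;
}

/* Decode a pointer-coded errno back into a negative integer. */
static long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if the pointer falls in the range reserved for error codes. */
static int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical snapshot record; not the kernel's struct rbd_snap. */
struct snap {
    unsigned long long id;
};

/*
 * Return the new snapshot, or a pointer-coded errno on failure.
 * The caller never has to guess whether an output argument was assigned.
 */
static struct snap *add_snap(unsigned long long id, int simulate_oom)
{
    struct snap *s;

    if (simulate_oom)
        return err_ptr(-ENOMEM);
    s = malloc(sizeof(*s));
    if (!s)
        return err_ptr(-ENOMEM);
    s->id = id;
    return s;
}
```

A caller checks the result with `is_err()` before using it and recovers the error code with `ptr_err()`.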

libceph: recheck con state after allocating incoming message · 61399191

Committed by Sage Weil on Jul 30, 2012

We drop the lock when calling the ->alloc_msg() con op, which means
we need to (a) not clobber con->in_msg without the mutex held, and (b)
we need to verify that we are still in the OPEN state when we retake
it to avoid causing any mayhem.  If the state does change, -EAGAIN
will get us back to con_work() and loop.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

61399191
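
The drop-lock, allocate, retake-lock, recheck pattern described in this commit can be sketched in userspace C with a pthread mutex. This is only an illustration under simplifying assumptions: the two-value `enum con_state`, `struct con`, and the hooks `alloc_ok()` and `alloc_then_close()` are hypothetical, and the second hook simulates another thread closing the connection while the lock is dropped.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

enum con_state { CON_CLOSED, CON_OPEN };    /* simplified state set */

struct con {
    pthread_mutex_t mutex;
    enum con_state state;
    void *in_msg;
    void *(*alloc_msg)(struct con *con);    /* called without the mutex */
};

/* Hook that simply allocates a message buffer. */
static void *alloc_ok(struct con *con)
{
    (void)con;
    return malloc(16);
}

/* Hook that simulates a concurrent close while the mutex is dropped. */
static void *alloc_then_close(struct con *con)
{
    void *msg = malloc(16);

    pthread_mutex_lock(&con->mutex);
    con->state = CON_CLOSED;
    pthread_mutex_unlock(&con->mutex);
    return msg;
}

/*
 * Caller holds con->mutex on entry and on return.
 * Returns 0 on success, -ENOMEM if allocation failed, or -EAGAIN if
 * the connection left the OPEN state while the mutex was dropped.
 */
static int con_in_msg_alloc(struct con *con)
{
    void *msg;

    assert(con->state == CON_OPEN);
    pthread_mutex_unlock(&con->mutex);      /* the hook may block */
    msg = con->alloc_msg(con);
    pthread_mutex_lock(&con->mutex);

    if (con->state != CON_OPEN) {           /* recheck after retaking */
        free(msg);
        return -EAGAIN;
    }
    if (!msg)
        return -ENOMEM;
    con->in_msg = msg;                      /* assigned only under the mutex */
    return 0;
}
```

On `-EAGAIN` the caller is expected to go back to its work loop and re-evaluate the connection state rather than continue reading.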

libceph: change ceph_con_in_msg_alloc convention to be less weird · 4740a623

Committed by Sage Weil on Jul 30, 2012

This function's calling convention is very limiting.  In particular,
we can't return any error other than ENOMEM (and only implicitly),
which is a problem (see next patch).

Instead, return a normal 0 or error code, and make the skip a pointer
output parameter.  Drop the useless in_hdr argument (we have the con
pointer).
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

4740a623

libceph: avoid dropping con mutex before fault · 8636ea67

Committed by Sage Weil on Jul 30, 2012

The ceph_fault() function takes the con mutex, so we should avoid
dropping it before calling it.  This fixes a potential race with
another thread calling ceph_con_close(), or _open(), or similar (we
don't reverify con->state after retaking the lock).

Add annotation so that lockdep realizes we will drop the mutex before
returning.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

8636ea67

libceph: verify state after retaking con lock after dispatch · 7b862e07

Committed by Sage Weil on Jul 30, 2012

We drop the con mutex when delivering a message.  When we retake the
lock, we need to verify we are still in the OPEN state before
preparing to read the next tag, or else we risk stepping on a
connection that has been closed.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

7b862e07

libceph: revoke mon_client messages on session restart · 4f471e4a

Committed by Sage Weil on Jul 30, 2012

Revoke all mon_client messages when we shut down the old connection.
This is mostly moot since we are re-using the same ceph_connection,
but it is cleaner.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

4f471e4a

libceph: fix handling of immediate socket connect failure · 8007b8d6

Committed by Sage Weil on Jul 30, 2012

If the connect() call immediately fails such that sock == NULL, we
still need con_close_socket() to reset our socket state to CLOSED.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

8007b8d6

ceph: update MAINTAINERS file · 09d90327

Committed by Sage Weil on Jul 30, 2012

 * shiny new inktank.com email addresses
 * add include/linux/crush directory (previous oversight)
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

09d90327

libceph: be less chatty about stray replies · 756a16a5

Committed by Sage Weil on Jul 30, 2012

There are many (normal) conditions that can lead to us getting
unexpected replies, including cluster topology changes, osd failures,
and timeouts.  There's no need to spam the console about it.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

756a16a5

libceph: clear all flags on con_close · 43c7427d

Committed by Sage Weil on Jul 20, 2012

Signed-off-by: Sage Weil <sage@inktank.com>

43c7427d

libceph: clean up con flags · 4a861692

Committed by Sage Weil on Jul 20, 2012

Rename flags with CON_FLAG prefix, move the definitions into the c file,
and (better) document their meaning.
Signed-off-by: Sage Weil <sage@inktank.com>

4a861692

libceph: replace connection state bits with states · 8dacc7da

Committed by Sage Weil on Jul 20, 2012

Use a simple set of 6 enumerated values for the socket states (CON_STATE_*)
and use those instead of the state bits. All of the con->state checks are
now under the protection of the con mutex, so this is safe. It also
simplifies many of the state checks because we can check for anything other
than the expected state instead of various bits for races we can think of.

This appears to hold up well to stress testing both with and without socket
failure injection on the server side.
Signed-off-by: Sage Weil <sage@inktank.com>

8dacc7da

libceph: drop unnecessary CLOSED check in socket state change callback · d7353dd5

Committed by Sage Weil on Jul 20, 2012

If we are CLOSED, the socket is closed and we won't get these.
Signed-off-by: Sage Weil <sage@inktank.com>

d7353dd5

libceph: close socket directly from ceph_con_close() · ee76e073

Committed by Sage Weil on Jul 20, 2012

It is simpler to do this immediately, since we already hold the con mutex.
It also avoids the need to deal with a not-quite-CLOSED socket in con_work.
Signed-off-by: Sage Weil <sage@inktank.com>

ee76e073

libceph: drop gratuitous socket close calls in con_work · 2e8cb100

Committed by Sage Weil on Jul 20, 2012

If the state is CLOSED or OPENING, we shouldn't have a socket.
Signed-off-by: Sage Weil <sage@inktank.com>

2e8cb100

libceph: move ceph_con_send() closed check under the con mutex · a59b55a6

Committed by Sage Weil on Jul 20, 2012

Take the con mutex before checking whether the connection is closed to
avoid racing with someone else closing it.
Signed-off-by: Sage Weil <sage@inktank.com>

a59b55a6

libceph: move msgr clear_standby under con mutex protection · 00650931

Committed by Sage Weil on Jul 20, 2012

Avoid dropping and retaking con->mutex in the ceph_con_send() case by
leaving locking up to the caller.
Signed-off-by: Sage Weil <sage@inktank.com>

00650931

libceph: fix fault locking; close socket on lossy fault · 3b5ede07

Committed by Sage Weil on Jul 20, 2012

If we fault on a lossy connection, we should still close the socket
immediately, and do so under the con mutex.

We should also take the con mutex before printing out the state bits in
the debug output.
Signed-off-by: Sage Weil <sage@inktank.com>

3b5ede07

rbd: drop "object_name" from rbd_req_sync_unwatch() · 070c633f

Committed by Alex Elder on Jul 25, 2012

rbd_req_sync_unwatch() only ever uses rbd_dev->header_name as the
value of its "object_name" parameter, and that value is available
within the function already.  So get rid of the parameter.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

070c633f

rbd: drop "object_name" from rbd_req_sync_notify_ack() · 7f0a24d8

Committed by Alex Elder on Jul 25, 2012

rbd_req_sync_notify_ack() only ever uses rbd_dev->header_name as the
value of its "object_name" parameter, and that value is available
within the function already.  So get rid of the parameter.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

7f0a24d8

rbd: drop "object_name" from rbd_req_sync_notify() · 4cb16250

Committed by Alex Elder on Jul 25, 2012

rbd_req_sync_notify() only ever uses rbd_dev->header_name as the
value of its "object_name" parameter, and that value is available
within the function already.  So get rid of the parameter.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

4cb16250

rbd: drop "object_name" from rbd_req_sync_watch() · 0e6f322d

Committed by Alex Elder on Jul 25, 2012

rbd_req_sync_watch() is only called in one place, and in that place
it passes rbd_dev->header_name as the value of the "object_name"
parameter.  This value is available within the function already.

Having the extra parameter leaves the impression the object name
could take on different values, but it does not.

So get rid of the parameter.  We can always add it back again if
we find we want to watch some other object in the future.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

0e6f322d

rbd: drop rbd_dev parameter in snap functions · 14e7085d

Committed by Alex Elder on Jul 19, 2012

Both rbd_register_snap_dev() and __rbd_remove_snap_dev() have
rbd_dev parameters that are unused.  Remove them.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

14e7085d

rbd: drop rbd_header_from_disk() gfp_flags parameter · ed63f4fd

Committed by Alex Elder on Jul 19, 2012

The function rbd_header_from_disk() is only called in one spot, and
it passes GFP_KERNEL as its value for the gfp_flags parameter.

Just drop that parameter and substitute GFP_KERNEL everywhere within
that function it had been used.  (If we find we need the parameter
again in the future it's easy enough to add back again.)
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

ed63f4fd

rbd: snapc is unused in rbd_req_sync_read() · 9a5d690b

Committed by Alex Elder on Jul 19, 2012

The "snapc" parameter to rbd_req_sync_read() is not used, so
get rid of it.
Reported-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

9a5d690b

rbd: rename rbd_device->id · de71a297

Committed by Alex Elder on Jul 03, 2012

The "id" field of an rbd device structure represents the unique
client-local device id mapped to the underlying rbd image.  Each rbd
image will have another id--the image id--and each snapshot has its
own id as well.  The simple name "id" no longer conveys the
information one might like to have.

Rename the device "id" field in struct rbd_dev to be "dev_id" to
make it a little more obvious what we're dealing with without having
to think more about context.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

de71a297

rbd: encapsulate header validity test · 8e94af8e

Committed by Alex Elder on Jul 25, 2012

If an rbd image header is read and it doesn't begin with the
expected magic information, a warning is displayed.  This is
a fairly simple test, but it could be extended at some point.
Fix the comparison so it actually looks at the "text" field
rather than the front of the structure.

In any case, encapsulate the validity test in its own function.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

8e94af8e
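
Encapsulating the validity test described in this commit might look like the sketch below. It is an assumption-laden illustration rather than the rbd source: `struct image_header_ondisk` is a cut-down hypothetical layout, `header_valid()` is a made-up name, and the magic string `HDR_TEXT` is assumed to match the on-disk header text.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed magic text at the start of a valid image header. */
#define HDR_TEXT "<<< Rados Block Device Image >>>\n"

/* Hypothetical, simplified on-disk header; only "text" matters here. */
struct image_header_ondisk {
    char text[40];
    char version[8];
};

/*
 * Single place for the validity test, so it can be extended later.
 * The comparison names the "text" field explicitly instead of relying
 * on it happening to sit at the front of the structure.
 */
static bool header_valid(const struct image_header_ondisk *ondisk)
{
    assert(sizeof(HDR_TEXT) <= sizeof(ondisk->text));
    return memcmp(ondisk->text, HDR_TEXT, sizeof(HDR_TEXT)) == 0;
}
```

Callers then issue the warning only when `header_valid()` returns false, and any future checks (for example on a version field) have an obvious home.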

ceph: define snap counts as u32 everywhere · aa711ee3

Committed by Alex Elder on Jul 13, 2012

There are two structures in which a count of snapshots is
maintained:

    struct ceph_snap_context {
        ...
        u32 num_snaps;
        ...
    }

and

    struct ceph_snap_realm {
        ...
        u32 num_prior_parent_snaps;   /*  had prior to parent_since */
        ...
        u32 num_snaps;
        ...
    }

These fields never take on negative values (e.g., to hold special
meaning), and so are really inherently unsigned.  Furthermore they
take their value from over-the-wire or on-disk formatted 32-bit
values.

So change their definition to have type u32, and change some spots
elsewhere in the code to account for this change.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

aa711ee3

rbd: clean up a few dout() calls · bd919d45

Committed by Alex Elder on Jul 13, 2012

There was a dout() call in rbd_do_request() that was reporting
the offset as the length and vice versa.  While
fixing that I did a quick scan of other dout() calls and fixed
a couple of other minor things.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

bd919d45

rbd: simplify __rbd_remove_all_snaps() · a0593290

Committed by Alex Elder on Jul 19, 2012

This just replaces a while loop with list_for_each_entry_safe()
in __rbd_remove_all_snaps().
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

a0593290
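
The idea behind `list_for_each_entry_safe()` is that the iterator remembers the next entry before the current one is removed. The sketch below shows that idea with a hand-rolled singly linked list in userspace C; it does not use the kernel's `list_head` API, and `struct snap`, `snap_push()`, and `remove_all_snaps()` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical snapshot node on a singly linked list. */
struct snap {
    unsigned long long id;
    struct snap *next;
};

/* Push a new node on the front; returns the new head (or the old one on OOM). */
static struct snap *snap_push(struct snap *head, unsigned long long id)
{
    struct snap *s = malloc(sizeof(*s));

    if (!s)
        return head;
    s->id = id;
    s->next = head;
    return s;
}

/*
 * Remove every node. The "safe" part is saving s->next before s is
 * freed, so the loop never reads memory it has already released.
 * Returns the number of nodes removed.
 */
static int remove_all_snaps(struct snap **head)
{
    struct snap *s, *next;
    int removed = 0;

    assert(head != NULL);
    for (s = *head; s; s = next) {
        next = s->next;
        free(s);
        removed++;
    }
    *head = NULL;
    return removed;
}
```

A plain `for (s = head; s; s = s->next)` loop would dereference `s` after freeing it, which is the bug the safe form avoids.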

rbd: drop extra header_rwsem init · a66f8c97

Committed by Alex Elder on Jul 19, 2012

In commit c666601a there was inadvertently added an extra
initialization of rbd_dev->header_rwsem.  This gets rid of the
duplicate.
Reported-by: Guangliang Zhao <gzhao@suse.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

a66f8c97

rbd: kill rbd_image_header->snap_seq · 9e15dc73

Committed by Alex Elder on Jul 19, 2012

The snap_seq field in an rbd_image_header structure held the value
from the rbd image header when it was last refreshed.  We now
maintain this value in the snapc->seq field.  So get rid of the
other one.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

9e15dc73

rbd: set snapc->seq only when refreshing header · 505cbb9b

Committed by Alex Elder on Jul 19, 2012

In rbd_header_add_snap() there is code to set snapc->seq to the
just-added snapshot id.  This is the only remnant left of the
use of that field for recording which snapshot an rbd_dev was
associated with.  That functionality is no longer supported,
so get rid of that final bit of code.

Doing so means we never actually set snapc->seq any more.  On the
server, the snapshot context's sequence value represents the highest
snapshot id ever issued for a particular rbd image.  So we'll make
it have that meaning here as well.  To do so, set this value
whenever the rbd header is (re-)read.  That way it will always be
consistent with the rest of the snapshot context we maintain.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

505cbb9b

rbd: preserve snapc->seq in rbd_header_set_snap() · 78dc447d

Committed by Alex Elder on Jul 19, 2012

In rbd_header_set_snap(), there is logic to make the snap context's
seq field get set to a particular snapshot id, or 0 if there is no
snapshot for the rbd image.

This seems to be an artifact of how the current snapshot id for an
rbd_dev was recorded before the rbd_dev->snap_id field began to be
used for that purpose.

There's no need to update the value of snapc->seq here any more, so
stop doing it.  Tidy up a few local variables in that function
while we're at it.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

78dc447d

rbd: don't use snapc->seq that way · 75fe9e19

Committed by Alex Elder on Jul 19, 2012

In what appears to be an artifact of a different way of encoding
whether an rbd image maps a snapshot, __rbd_refresh_header() has
code that arranges to update the seq value in an rbd image's
snapshot context to point to the first entry in its snapshot
array if that's where it was pointing initially.

We now use rbd_dev->snap_id to record the snapshot id--using the
special value CEPH_NOSNAP to indicate the rbd_dev is not mapping a
snapshot at all.

There is therefore no need to check for this case, nor to update the
seq value, in __rbd_refresh_header().  Just preserve the seq value
that rbd_read_header() provides (which, at the moment, is nothing).
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

75fe9e19

rbd: send header version when notifying · a71b891b

Committed by Josh Durgin on Dec 05, 2011

Previously the original header version was sent. Now, we update it
when the header changes.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@inktank.com>

a71b891b

rbd: use reference counting for the snap context · d1d25646

Committed by Josh Durgin on Dec 05, 2011

This prevents a race between requests with a given snap context and
header updates that free it. The osd client was already expecting the
snap context to be reference counted, since it get()s it in
ceph_osdc_build_request and put()s it when the request completes.

Also remove the second down_read()/up_read() on header_rwsem in
rbd_do_request, which wasn't actually preventing this race or
protecting any other data.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@inktank.com>

d1d25646
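
The race this commit closes can be illustrated with a small reference-counting sketch: an in-flight request holds its own reference to the snap context, so a header update that drops the old context cannot free it underneath the request. The code below is a userspace approximation using C11 atomics; `struct snapc`, `snapc_create()`, `snapc_get()`, and `snapc_put()` are hypothetical names, not the ceph API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical reference-counted snapshot context. */
struct snapc {
    atomic_int refs;
    unsigned long long seq;
};

/* Create a context holding one reference, owned by the creator. */
static struct snapc *snapc_create(unsigned long long seq)
{
    struct snapc *sc = malloc(sizeof(*sc));

    if (!sc)
        return NULL;
    atomic_init(&sc->refs, 1);
    sc->seq = seq;
    return sc;
}

/* Take an extra reference, e.g. for the lifetime of one request. */
static struct snapc *snapc_get(struct snapc *sc)
{
    atomic_fetch_add(&sc->refs, 1);
    return sc;
}

/* Drop a reference; frees the context and returns 1 when it was the last. */
static int snapc_put(struct snapc *sc)
{
    int old = atomic_fetch_sub(&sc->refs, 1);

    assert(old > 0);
    if (old == 1) {
        free(sc);
        return 1;
    }
    return 0;
}
```

In this model the header owner calls `snapc_put()` when it replaces the context, and each request calls `snapc_put()` on completion; whichever drops the last reference frees the memory.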

rbd: set image size when header is updated · 93a24e08

Committed by Josh Durgin on Dec 05, 2011

The image may have been resized.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@inktank.com>

93a24e08

rbd: expose the correct size of the device in sysfs · a51aa0c0

Committed by Josh Durgin on Dec 05, 2011

If an image was mapped to a snapshot, the size of the head version
would be shown. Protect capacity with header_rwsem, since it may
change.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@inktank.com>

a51aa0c0

rbd: only reset capacity when pointing to head · 474ef7ce

Committed by Josh Durgin on Nov 21, 2011

Snapshots cannot be resized, and the new capacity of head should not
be reflected by the snapshot.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

474ef7ce

openeuler / Kernel synced successfully more than 1 year ago
