- 02 May 2013, 26 commits
-
-
Submitted by Alex Elder
Since we no longer drop the request mutex between registering and mapping an osd request in ceph_osdc_start_request(), there is no chance of a race with kick_requests(). We can now therefore map and send the new request unconditionally (but we'll issue a warning should it ever occur).
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
-
Submitted by Alex Elder
One of the first things ceph_osdc_start_request() does is register the request. It then acquires the osd client's map semaphore and request mutex and proceeds to map and send the request. There is no reason the request has to be registered before acquiring the map semaphore, so hold off doing so until after the map semaphore is held. Since register_request() is nothing more than a wrapper around __register_request(), call the latter function instead, after acquiring the request mutex. That leaves register_request() unused, so get rid of it.
This partially resolves: http://tracker.ceph.com/issues/4392
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
-
Submitted by Sage Weil
Use wrapper functions that check whether the auth op exists, so that callers do not need a bunch of conditional checks. This simplifies the external interface.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
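A rough sketch of the wrappers this describes; the structures below are trimmed stand-ins for the libceph auth types, and the two wrappers shown are representative rather than the complete set:

```c
struct ceph_auth_client;
struct ceph_auth_handshake;

struct ceph_auth_client_ops {
	/* NULL when the auth method has no such op */
	int (*create_authorizer)(struct ceph_auth_client *ac, int peer_type,
				 struct ceph_auth_handshake *auth);
	void (*invalidate_authorizer)(struct ceph_auth_client *ac,
				      int peer_type);
};

struct ceph_auth_client {
	const struct ceph_auth_client_ops *ops;
};

/*
 * Wrappers: callers invoke these unconditionally; the "does the op
 * exist?" check lives in exactly one place.
 */
static int ceph_auth_create_authorizer(struct ceph_auth_client *ac,
				       int peer_type,
				       struct ceph_auth_handshake *auth)
{
	if (ac->ops && ac->ops->create_authorizer)
		return ac->ops->create_authorizer(ac, peer_type, auth);
	return 0;
}

static void ceph_auth_invalidate_authorizer(struct ceph_auth_client *ac,
					    int peer_type)
{
	if (ac->ops && ac->ops->invalidate_authorizer)
		ac->ops->invalidate_authorizer(ac, peer_type);
}
```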
-
Submitted by Sage Weil
Currently the messenger calls out to a get_authorizer con op, which will create a new authorizer if it doesn't yet have one. In the meantime, when we rotate our service keys, the authorizer doesn't get updated. Eventually it will be rejected by the server on a new connection attempt and get invalidated, and we will then rebuild a new authorizer, but this is not ideal. Instead, if we do have an authorizer, call a new update_authorizer op that will verify that the current authorizer is using the latest secret. If it is not, we will build a new one that does. This avoids the transient failure.
This fixes one step in the sorry sequence of events for bug http://tracker.ceph.com/issues/4282
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
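A minimal, self-contained sketch of the flow under discussion, using illustrative stand-in types rather than the real messenger/auth code; the idea is that an existing authorizer is refreshed against the current secret up front instead of only after the server rejects it:

```c
#include <stdlib.h>

static int current_secret_epoch;	/* bumped when service keys rotate */

struct authorizer {
	int secret_epoch;		/* epoch of the key it was built with */
};

struct connection {
	struct authorizer *auth;	/* NULL until first built */
};

static struct authorizer *build_authorizer(void)
{
	struct authorizer *auth = malloc(sizeof(*auth));

	if (auth)
		auth->secret_epoch = current_secret_epoch;
	return auth;
}

/* The "update_authorizer" step: refresh a stale credential in place. */
static void update_authorizer(struct authorizer *auth)
{
	if (auth->secret_epoch != current_secret_epoch)
		auth->secret_epoch = current_secret_epoch;
}

/*
 * Called on a (re)connect.  Previously an existing authorizer was handed
 * back unchanged and only rebuilt after the server rejected it; updating
 * it here avoids that transient failure.
 */
static struct authorizer *get_authorizer(struct connection *con)
{
	if (!con->auth)
		con->auth = build_authorizer();
	else
		update_authorizer(con->auth);
	return con->auth;
}
```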
-
Submitted by Alex Elder
The osd trail is a pagelist, used only for a CALL osd operation to hold the class and method names, along with any input data for the call. It is currently used only by the rbd client, and when it's used it is the only bit of outbound data in the osd request. Since we already support (non-trail) pagelist data in a message, we can just save this outbound CALL data in the "normal" pagelist rather than the trail, and get rid of the trail entirely.
The existing pagelist support depends on the pagelist being dynamically allocated, and ownership of it is passed to the messenger once it's been attached to a message. (That is to say, the messenger releases and frees the pagelist when it's done with it.) That means we need to dynamically allocate the pagelist here as well. Note that we simply assert that the allocation of a pagelist structure succeeds. Appending to a pagelist might require a dynamic allocation, so we're already assuming we won't run into trouble doing so (we just ignore any failures, and that should be fixed at some point).
This resolves: http://tracker.ceph.com/issues/4407
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
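A sketch of the allocation/append pattern described here, assuming the standard ceph_pagelist_init()/ceph_pagelist_append() helpers; the wrapper function and its parameters are illustrative, not the actual rbd code:

```c
#include <linux/ceph/pagelist.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/bug.h>

/*
 * Build a pagelist holding the CALL op's class and method names plus any
 * outbound call data.  Ownership of the pagelist later passes to the
 * messenger, which frees it when the message is done.
 */
static struct ceph_pagelist *build_call_pagelist(const char *class_name,
						 const char *method_name,
						 const void *outbound,
						 size_t outbound_size)
{
	struct ceph_pagelist *pagelist;

	pagelist = kmalloc(sizeof(*pagelist), GFP_NOFS);
	BUG_ON(!pagelist);	/* allocation failure is simply asserted */
	ceph_pagelist_init(pagelist);

	/* Appends may also allocate; as noted, failures are ignored for now. */
	ceph_pagelist_append(pagelist, class_name, strlen(class_name));
	ceph_pagelist_append(pagelist, method_name, strlen(method_name));
	if (outbound_size)
		ceph_pagelist_append(pagelist, outbound, outbound_size);

	return pagelist;
}
```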
-
Submitted by Alex Elder
Add support for recording a ceph pagelist as data associated with an osd request.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
The length of outgoing data in an osd request is dependent on the osd ops that are embedded in that request. Each op is encoded into a request message using osd_req_encode_op(), so that function should be used to determine the amount of outgoing data implied by the op as it is encoded. Have osd_req_encode_op() return the number of bytes of outgoing data implied by the op being encoded, and accumulate and use that sum in ceph_osdc_build_request(). As a result, ceph_osdc_build_request() no longer requires its "len" parameter, so get rid of it.
Using the sum of the op lengths rather than the length provided is a valid change because:
- The only callers of ceph_osdc_build_request() are rbd and the osd client (in ceph_osdc_new_request(), on behalf of the file system).
- When rbd calls it, the length provided is only non-zero for write requests, and in that case the single op has the same length value as what was passed here.
- When called from ceph_osdc_new_request(), (it's not all that easy to see, but) the length passed is also always the same as the extent length encoded in its (single) write op, if present.
This resolves: http://tracker.ceph.com/issues/4406
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
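The accumulation pattern this describes might look roughly like the following (simplified stand-in types, not the actual osd client code):

```c
/*
 * The request's outbound data length is now the sum of what each encoded
 * op reports, not a value passed in by the caller.
 */
struct op {
	unsigned long long outgoing_bytes;	/* data this op will send */
};

/* Encode one op into the request message; return its outbound data length. */
static unsigned long long encode_op(const struct op *op)
{
	/* ... write the op into the message ... */
	return op->outgoing_bytes;
}

static unsigned long long build_request(const struct op *ops, int num_ops)
{
	unsigned long long data_len = 0;
	int i;

	for (i = 0; i < num_ops; i++)
		data_len += encode_op(&ops[i]);

	return data_len;	/* recorded in the message header */
}
```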
-
Submitted by Alex Elder
When an incoming message is destined for the osd client, the messenger calls the osd client's alloc_msg method. That function looks up which request has a tid matching that of the incoming message, and returns the request message that was preallocated to receive the response. The response message is therefore known before the request is even started. Between the start of the request and the receipt of the response, the request and its data fields will not change, so there's no reason to hold off setting them. In fact it's preferable to set them just once, because it's more obvious that they're unchanging.
So set up the fields describing where incoming data is to land in a response message at the beginning of ceph_osdc_start_request(). Define a helper function that sets these fields, and use it to set the fields for both outgoing data in the request message and incoming data in the response.
This resolves: http://tracker.ceph.com/issues/4284
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
Change it so we only assign outgoing data information for messages if there is outgoing data to send. This then allows us to add a few more (currently commented-out) assertions.
This is related to: http://tracker.ceph.com/issues/4284
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
-
Submitted by Alex Elder
Define ceph_msg_data_set_pagelist(), ceph_msg_data_set_bio(), and ceph_msg_data_set_trail() to clearly abstract the assignment of the remaining data-related fields in a ceph message structure. Use the new functions in the osd client and mds client.
This partially resolves: http://tracker.ceph.com/issues/4263
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
When setting page array information for message data, provide the byte length rather than the page count to ceph_msg_data_set_pages().
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
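A rough sketch of what such a setter looks like once it takes a byte length; the message structure and field names below are trimmed stand-ins, not the real ceph_msg definition:

```c
struct page;

struct msg_page_data {
	struct page **pages;		/* page array holding the data */
	unsigned long long length;	/* data length in bytes, not pages */
	unsigned int alignment;		/* data offset within the first page */
};

/* Set the page-array fields in one place (ideally exactly once per message). */
static void msg_data_set_pages(struct msg_page_data *data, struct page **pages,
			       unsigned long long length,
			       unsigned int alignment)
{
	data->pages = pages;
	data->length = length;
	data->alignment = alignment;
}
```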
-
Submitted by Alex Elder
Define a function ceph_msg_data_set_pages(), which more clearly abstracts the assignment of page-related fields for data in a ceph message structure. Use this new function in the osd client and mds client.
Ideally, these fields would never be set more than once (with BUG_ON() calls to guarantee that). At the moment, though, the osd client sets these every time it receives a message, and in the event of a communication problem this can happen more than once. (This will be resolved shortly, but setting up these helpers first makes it all a bit easier to work with.)
Rearrange the field order in a ceph_msg structure to group those that are used to define the possible data payloads.
This partially resolves: http://tracker.ceph.com/issues/4263
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
Record the byte count for an osd request rather than the page count. The number of pages can always be derived from the byte count (and alignment/offset), but the reverse is not true.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
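Deriving the page count from a byte count and starting offset is straightforward; libceph has a calc_pages_for() helper for exactly this, and a simplified standalone version (assuming a fixed page size) looks like:

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* illustrative; the kernel uses PAGE_SIZE */

/*
 * Pages needed to hold "len" bytes starting "off" bytes into the first
 * page -- the bytes-to-pages direction that is always derivable, unlike
 * the reverse.
 */
static uint64_t pages_for(uint64_t off, uint64_t len)
{
	uint64_t first, last;

	if (!len)
		return 0;
	first = off / SKETCH_PAGE_SIZE;
	last = (off + len - 1) / SKETCH_PAGE_SIZE;
	return last - first + 1;
}
```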
-
Submitted by Alex Elder
An osd request defines information about where data to be read should be placed as well as where data to be written comes from. Currently these are represented by common fields. Keep information about data for writing separate from data to be read by splitting these into data_in and data_out fields.
This is the key patch in this whole series, in that it actually identifies which osd requests generate outgoing data and which generate incoming data. It's less obvious (currently) that an osd CALL op generates both outgoing and incoming data; that's the focus of some upcoming work.
This resolves: http://tracker.ceph.com/issues/4127
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
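In outline, the split might look like this (simplified stand-in types; the real field names and contents differ):

```c
struct page;

/* One descriptor for what the request sends, one for what the reply fills. */
struct osd_data_sketch {
	struct page **pages;
	unsigned long long length;	/* bytes */
};

struct osd_request_sketch {
	struct osd_data_sketch data_out;	/* outgoing: written to the osd */
	struct osd_data_sketch data_in;		/* incoming: read back from it */
};
```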
-
Submitted by Alex Elder
An osd request uses either pages or a bio list for its data. Use a union to record information about the two, and add a data type tag to select between them.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
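A sketch of the tagged-union layout this describes; type and field names are simplified stand-ins for the real osd request definitions:

```c
struct page;
struct bio;

enum osd_data_type {
	OSD_DATA_TYPE_NONE,
	OSD_DATA_TYPE_PAGES,
	OSD_DATA_TYPE_BIO,
};

struct osd_data {
	enum osd_data_type type;	/* selects the valid union member */
	union {
		struct {
			struct page **pages;
			unsigned long long length;	/* bytes */
			unsigned int alignment;		/* offset in first page */
		};
		struct bio *bio;
	};
};
```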
-
Submitted by Alex Elder
Pull the fields in an osd request structure that define the data for the request out into a separate structure.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
Currently ceph_osdc_new_request() assigns an osd request's r_num_pages and r_alignment fields. The only thing it does after that is call ceph_osdc_build_request(), and that doesn't need those fields to be assigned. Move the assignment of those fields out of ceph_osdc_new_request() and into its callers. As a result, the page_align parameter is no longer used, so get rid of it.
Note that in ceph_sync_write(), the value for req->r_num_pages had already been calculated earlier (as num_pages, and fortunately it was computed the same way). So don't bother recomputing it, but because it's not needed earlier, move that calculation after the call to ceph_osdc_new_request(). Hold off making the assignment to r_alignment, doing it instead where r_pages and r_num_pages are getting set.
Similarly, in start_read(), nr_pages already holds the number of pages in the array (and is calculated the same way), so there's no need to recompute it. Move the assignment of the page alignment down with the others there as well.
This and the next few patches are preparation work for: http://tracker.ceph.com/issues/4127
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
The purpose of ceph_calc_object_layout() is to fill in the pool number and seed for a ceph_pg structure it is provided, based on a given osd map and target object id. Currently that function takes a file layout parameter, but the only thing used out of that is its pool number.
Change the function so it takes a pool number rather than the full file layout structure. Only update the ceph_pg if the pool is found in the osd map. Get rid of a few useless lines of code from the function while we're at it.
Since the function now very clearly just fills in the ceph_pg structure it's provided, rename it ceph_calc_ceph_pg().
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
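One way to picture the reshaped helper; everything below is a simplified, self-contained stand-in (a toy pool table and hash) rather than the real osd map lookup and object-name hashing:

```c
#include <stdint.h>
#include <stddef.h>

struct pg_id {
	uint64_t pool;
	uint32_t seed;
};

static const uint64_t pools_in_map[] = { 0, 1, 2 };

static int pool_exists(uint64_t pool)
{
	size_t i;

	for (i = 0; i < sizeof(pools_in_map) / sizeof(pools_in_map[0]); i++)
		if (pools_in_map[i] == pool)
			return 1;
	return 0;
}

static uint32_t hash_object_name(const char *oid)
{
	uint32_t h = 0;

	while (*oid)
		h = h * 31 + (unsigned char)*oid++;
	return h;
}

/* Takes a bare pool id, not a whole file layout. */
static int calc_pg(struct pg_id *pg, const char *oid, uint64_t pool)
{
	if (!pool_exists(pool))
		return -1;		/* unknown pool: leave *pg untouched */

	pg->pool = pool;
	pg->seed = hash_object_name(oid);
	return 0;
}
```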
-
Submitted by Alex Elder
The new cases added to osd_req_encode_op() caused a new sparse error, which highlighted an existing problem that had been overlooked since the code was originally checked in. When an unsupported opcode is found, the destination rather than the source opcode was being used in the error message. The two differ in their byte order, and we want to be using the one in the source. Fix the problem in both spots.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
An osd request marked to linger will be re-submitted in the event a connection to the target osd gets dropped. Currently, if there is a callback function associated with a request, it will be called each time a request is submitted, which for lingering requests can be more than once. Change it so a request, including a lingering one, will get completed (from the perspective of the user of the osd client) exactly once.
This resolves: http://tracker.ceph.com/issues/3967
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
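The guard this implies can be sketched as follows (stand-in types; in the real code the completion state is tracked on the osd request itself):

```c
#include <stdbool.h>

/*
 * A lingering request may be re-submitted and answered several times,
 * but the user-visible completion must fire only once.
 */
struct request {
	bool completed;				/* set on first completion */
	void (*callback)(struct request *req);	/* user's completion hook */
};

static void complete_request_once(struct request *req)
{
	if (req->completed)
		return;
	req->completed = true;
	if (req->callback)
		req->callback(req);
}
```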
-
Submitted by Alex Elder
The page alignment field for a request is currently set in ceph_osdc_build_request(). It's not needed at that point, nor do either of its callers need that value assigned at any point before they call ceph_osdc_start_request(). So move that assignment into ceph_osdc_start_request().
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
Use distinct fields for tracking the number of pages in a message's page array and in a message's page list. Currently only one or the other is used at a time, but that will be changing soon.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
The only remaining reason to pass the osd request to calc_layout() is to fill in its r_num_pages and r_page_alignment fields. Once it fills those in, it doesn't do anything more with them. We can therefore move those assignments into the caller, and get rid of the "req" parameter entirely.
Note, however, that the only caller is ceph_osdc_new_request(), and that it immediately overwrites those fields with values based on its passed-in page offset. So the assignment inside calc_layout() was redundant anyway.
This resolves: http://tracker.ceph.com/issues/4262
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
Move the formatting of the object name (oid) to use for an object request into the caller of calc_layout(). This makes the "vino" parameter no longer necessary, so get rid of it.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
Have calc_layout() pass the computed object number back to its caller. (This is a small step to simplify review.)
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
If an invalid layout is provided to ceph_osdc_new_request(), its call to calc_layout() might return an error. At that point in the function we've already allocated an osd request structure, so we need to free it (drop a reference) in the event such an error occurs. The only other value calc_layout() will return is 0, so make that explicit in the successful case.
This resolves: http://tracker.ceph.com/issues/4240
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
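The shape being fixed is the classic allocate-then-fail pattern; a self-contained stand-in with illustrative names, not the osd client code itself:

```c
#include <stdlib.h>

struct request {
	int layout_valid;
};

static int calc_layout(struct request *req, int layout_ok)
{
	if (!layout_ok)
		return -1;		/* invalid layout from the caller */
	req->layout_valid = 1;
	return 0;			/* success is an explicit 0 */
}

static struct request *new_request(int layout_ok)
{
	struct request *req = calloc(1, sizeof(*req));

	if (!req)
		return NULL;

	if (calc_layout(req, layout_ok)) {
		free(req);		/* drop the request we just allocated */
		return NULL;
	}
	return req;
}
```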
-
- 27 Feb 2013, 4 commits
-
-
Submitted by Sage Weil
Use the new version of the encoding for osd requests and replies. In the process, update the way we are tracking request ops and reply lengths and results in the struct ceph_osd_request. Update the rbd and fs/ceph users appropriately.
The main changes are:
- we keep pointers into the request memory for fields we need to update each time the request is sent out over the wire
- we keep information about the result in an array in the request struct where the users can easily get at it
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
-
Submitted by Sage Weil
Instead of using the old ceph_object_layout struct, update our internal ceph_calc_object_layout method to use the ceph_pg type. This allows us to pass the full 32-bit precision of the pgid.seed to the callers. It also allows some callers to avoid reaching into the request structures for the struct ceph_object_layout fields.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
-
Submitted by Sage Weil
Always decode data into our cpu-native ceph_pg type, which has the correct field widths. Limit any remaining uses of ceph_pg_v1 to dealing with the legacy protocol.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
-
Submitted by Sage Weil
Rename the old version of this type to distinguish it from the new version.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
-
- 20 Feb 2013, 2 commits
-
-
Submitted by Alex Elder
Add support for CEPH_OSD_OP_STAT operations in the osd client and in rbd. This operation sends no data to the osd; everything required is encoded in the identity of the target object. The result will be ENOENT if the object doesn't exist. If it does exist and no other error occurs, the server returns the size and last modification time of the target object as output data (in little-endian format). The size is a 64-bit unsigned value and the time is a ceph_timespec structure (two unsigned 32-bit integers, representing seconds and nanoseconds).
This resolves: http://tracker.ceph.com/issues/4007
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
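Decoding that reply payload amounts to reading a little-endian u64 followed by two little-endian u32s; a self-contained sketch (the struct and helper names are illustrative, not kernel code):

```c
#include <stdint.h>
#include <stddef.h>

struct object_stat {
	uint64_t size;		/* object size in bytes */
	uint32_t tv_sec;	/* last modification: seconds */
	uint32_t tv_nsec;	/* last modification: nanoseconds */
};

static uint64_t get_le64(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

static uint32_t get_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the reply is too short to hold the payload. */
static int decode_stat_reply(const unsigned char *buf, size_t len,
			     struct object_stat *st)
{
	if (len < 8 + 4 + 4)
		return -1;
	st->size = get_le64(buf);
	st->tv_sec = get_le32(buf + 8);
	st->tv_nsec = get_le32(buf + 12);
	return 0;
}
```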
-
Submitted by Alex Elder
Simplify the way the data length recorded in a message header is calculated in ceph_osdc_build_request().
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
- 19 Feb 2013, 8 commits
-
-
Submitted by Alex Elder
In osd_req_encode_op() there are a few cases that handle osd opcodes that are never used in the kernel. The presence of this code gives the impression it's correct (which really can't be assumed), and may impose some unnecessary restrictions on some upcoming refactoring of this code. So delete this effectively dead code, and report uses of the previously handled cases as unsupported.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
If osd_req_encode_op() is given any opcode it doesn't recognize, it reports an error. This patch fleshes out that routine to distinguish between well-defined but unsupported values and values that are simply bogus.
This and the next commit are related to: http://tracker.ceph.com/issues/4126
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
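The three-way distinction might be structured like this (a trimmed, illustrative opcode list rather than the real CEPH_OSD_OP_* set):

```c
#include <stdio.h>

enum {
	OP_READ   = 1,
	OP_WRITE  = 2,
	OP_STAT   = 3,
	OP_NOTIFY = 4,		/* defined by the protocol, unused here */
};

static int op_is_defined(int op)
{
	return op == OP_READ || op == OP_WRITE || op == OP_STAT ||
	       op == OP_NOTIFY;
}

static int encode_op(int op)
{
	switch (op) {
	case OP_READ:
	case OP_WRITE:
	case OP_STAT:
		/* ... encode the op ... */
		return 0;
	default:
		if (op_is_defined(op))
			fprintf(stderr, "unsupported osd opcode %d\n", op);
		else
			fprintf(stderr, "bogus osd opcode %d\n", op);
		return -1;
	}
}
```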
-
Submitted by Alex Elder
There are no actual users of ceph_osdc_wait_event(). It would only have been used for one-shot events, but we no longer support those, so just get rid of this function. Since this leaves nothing else that waits for the completion of an event, we can also get rid of the completion in a struct ceph_osd_event.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
There is only one caller of ceph_osdc_create_event(), and it provides 0 as its "one_shot" argument. Get rid of that argument and just use 0 in its place. Replace the code in handle_watch_notify() that executes if one_shot is nonzero in the event with a BUG_ON() call. While modifying "osd_client.c", give handle_watch_notify() static scope.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
There is no caller of ceph_calc_raw_layout() outside of libceph, so there's no need to export it from the module. Furthermore, there is only one caller, in calc_layout(), and it is not much more than a simple wrapper for that function. So get rid of ceph_calc_raw_layout() and embed it instead within calc_layout(). While touching "osd_client.c", get rid of the unnecessary forward declaration of __send_request().
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
The only callers of ceph_osdc_init() and ceph_osdc_stop() are ceph_create_client() and ceph_destroy_client() (respectively), and they are in the same kernel module as those two functions. There's therefore no need to export those interfaces, so don't.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Submitted by Alex Elder
Two of the three callers of the osd client's send_queued() function already hold the osd client mutex and drop it before the call. Change send_queued() so it assumes the caller holds the mutex, and update all callers accordingly. Rename it __send_queued() to match the convention used elsewhere in the file with respect to the lock.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
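The convention is the usual "double underscore means the caller holds the lock" pattern; a self-contained illustration using pthreads (the kernel code uses the osd client's mutex, and the caller name here is illustrative):

```c
#include <pthread.h>

static pthread_mutex_t client_mutex = PTHREAD_MUTEX_INITIALIZER;

/* The leading double underscore signals the caller already holds the lock. */
static void __send_queued(void)
{
	/* Assumes client_mutex is held; works on the queue directly. */
}

static void kick_requests(void)
{
	pthread_mutex_lock(&client_mutex);
	/* ... requeue requests ... */
	__send_queued();	/* no unlock/relock dance around the call */
	pthread_mutex_unlock(&client_mutex);
}
```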
-
Submitted by Alex Elder
The "num_reply" parameter to ceph_osdc_new_request() is never used inside that function, so get rid of it. Note that ceph_sync_write() passes 2 for that argument, while all other callers pass 1. It doesn't matter, but perhaps someone should verify this doesn't indicate a problem.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-