提交 · b92eded4b7e6facc4e4ba07de89a3fc24de99552 · openanolis / cloud-kernel

30 3月, 2013 1 次提交

rbd: don't zero-fill non-image object requests · 6e2a4505

由 Alex Elder 提交于 3月 27, 2013

A result of ENOENT from a read request for an object that's part of
an rbd image indicates that there is a hole in that portion of the
image.  Similarly, a short read for such an object indicates that
the remainder of the read should be interpreted a full read with
zeros filling out the end of the request.

This behavior is not correct for objects that are not backing rbd
image data.  Currently rbd_img_obj_request_callback() assumes it
should be done for all objects.

Change rbd_img_obj_request_callback() so it only does this zeroing
for image objects.  Encapsulate that special handling in its own
function.  Add an assertion that the image object request is a bio
request, since we assume that (and we currently don't support any
other types).

This resolves a problem identified here:
    http://tracker.ceph.com/issues/4559

The regression was introduced by bf0d5f50.
Reported-by: NDan van der Ster <dan@vanderster.com>
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-off-by: NSage Weil <sage@inktank.com>

6e2a4505

27 2月, 2013 3 次提交

libceph: update osd request/reply encoding · 1b83bef2

由 Sage Weil 提交于 2月 25, 2013

Use the new version of the encoding for osd requests and replies.  In the
process, update the way we are tracking request ops and reply lengths and
results in the struct ceph_osd_request.  Update the rbd and fs/ceph users
appropriately.

The main changes are:
 - we keep pointers into the request memory for fields we need to update
   each time the request is sent out over the wire
 - we keep information about the result in an array in the request struct
   where the users can easily get at it.
Signed-off-by: NSage Weil <sage@inktank.com>
Reviewed-by: NAlex Elder <elder@inktank.com>

1b83bef2

rbd: pass length, not op for osd completions · c47f9371

由 Alex Elder 提交于 2月 26, 2013

The only thing type-specific osd completion functions do with their
osd op parameter is (in some cases) extract the number of bytes
transferred from it.  In the other cases, the xferred bytes field
is not used, and total message data transfer byte count (which may
well be zero) is used.

Just set the object request transfer count in the main osd request
callback function and provide that to the other routines.  There is
then no longer any need to pass the op pointer to the type-specific
completion routines, so drop those parameters.

Stop doing anything with the total message data length.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NSage Weil <sage@inktank.com>

c47f9371

rbd: move rbd_osd_trivial_callback() · 39bf2c5d

由 Alex Elder 提交于 2月 26, 2013

This function is slightly out of place, probably the result
of an errant automatic merge or something.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NSage Weil <sage@inktank.com>

39bf2c5d

26 2月, 2013 4 次提交

rbd: eliminate sparse warnings · cc344fa1

由 Alex Elder 提交于 2月 19, 2013

Fengguang Wu reminded me that there were outstanding sparse reports
in the ceph and rbd code.  This patch fixes these problems in rbd
that lead to those reports:
    - Convert functions that are never referenced externally to have
      static scope.
    - Add a lockdep annotation to rbd_request_fn(), because it
      releases a lock before acquiring it again.

This partially resolves:
    http://tracker.ceph.com/issues/4184Reported-by: NFengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

cc344fa1

rbd: normalize dout() calls · 37206ee5

由 Alex Elder 提交于 2月 20, 2013

Add dout() calls to facilitate tracing of image and object requests.
Change a few existing calls so they use __func__ rather than the
hard-coded function name.  Have calls always add ":" after the name
of the function, and prefix pointer values with a consistent tag
indicating what it represents.  (Note that there remain some older
dout() calls that are left untouched by this patch.)

Issue a warning if rbd_osd_write_callback() ever gets a short write.

This resolves:
    http://tracker.ceph.com/issues/4235Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

37206ee5

rbd: barriers are hard · 632b88ca

由 Alex Elder 提交于 2月 21, 2013

Let's go shopping!

I'm afraid this may not have gotten it right:
    07741308  rbd: add barriers near done flag operations

The smp_wmb() should have been done *before* setting the done flag,
to ensure all other data was valid before marking the object request
done.

Switch to use atomic_inc_return() here to set the done flag, which
allows us to verify we don't mark something done more than once.
Doing this also implies general barriers before and after the call.

And although a read memory barrier might have been sufficient before
reading the done flag, convert this to a full memory barrier just
to put this issue to bed.

This resolves:
    http://tracker.ceph.com/issues/4238Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

632b88ca

rbd: ignore zero-length requests · 4dda41d3

由 Alex Elder 提交于 2月 20, 2013

The old request code simply ignored zero-length requests.  We should
still operate that same way to avoid any changes in behavior.  We
can implement handling for special zero-length requests separately
(see http://tracker.ceph.com/issues/4236).

Add some assertions based on this new constraint.

This resolves:
    http://tracker.ceph.com/issues/4237Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

4dda41d3

20 2月, 2013 5 次提交

libceph: drop return value from page vector copy routines · 903bb32e

由 Alex Elder 提交于 2月 06, 2013

The return values provided for ceph_copy_to_page_vector() and
ceph_copy_from_page_vector() serve no purpose, so get rid of them.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

903bb32e

rbd: ignore result of ceph_copy_from_page_vector() · 23ed6e13

由 Alex Elder 提交于 2月 06, 2013

The result of ceph_copy_from_page_vector() is simply the length
argument it is provided.

This is called by rbd_obj_method_sync(), which returns the result if
it's non-negative.  But we always either ignore or overwrite that
return value.  So explicitly ignore what's returned by the copy
function, and have rbd_obj_method_sync() always return either a
negative errno or 0.

We also return the result of ceph_copy_from_page_vector() in
rbd_obj_read_sync().  There we still want to return the number of
bytes transferred, but we can use the value we already have in hand
rather than what ceph_copy_from_page_vector() provides.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

23ed6e13

rbd: prevent bytes transferred overflow · 1ceae7ef

由 Alex Elder 提交于 2月 06, 2013

In rbd_obj_read_sync(), verify the number of bytes transferred won't
exceed what can be represented by a size_t before using it to
indicate the number of bytes to copy to the result buffer.

(The real motivation for this is to prepare for the next patch.)
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

1ceae7ef

libceph: allow STAT osd operations · fbfab539

由 Alex Elder 提交于 2月 08, 2013

Add support for CEPH_OSD_OP_STAT operations in the osd client
and in rbd.

This operation sends no data to the osd; everything required is
encoded in identity of the target object.

The result will be ENOENT if the object doesn't exist.  If it does
exist and no other error occurs the server returns the size and last
modification time of the target object as output data (in little
endian format).  The size is a 64 bit unsigned and the time is
ceph_timespec structure (two unsigned 32-bit integers, representing
a seconds and nanoseconds value).

This resolves:
    http://tracker.ceph.com/issues/4007Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

fbfab539

rbd: add parentheses to object request iterator macros · ef06f4d3

由 Alex Elder 提交于 2月 08, 2013

The for_each_obj_request*() macros should parenthesize their uses of
the ireq parameter.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

ef06f4d3

19 2月, 2013 1 次提交

libceph: kill ceph_osdc_create_event() "one_shot" parameter · 3c663bbd

由 Alex Elder 提交于 2月 15, 2013

There is only one caller of ceph_osdc_create_event(), and it
provides 0 as its "one_shot" argument.  Get rid of that argument and
just use 0 in its place.

Replace the code in handle_watch_notify() that executes if one_shot
is nonzero in the event with a BUG_ON() call.

While modifying "osd_client.c", give handle_watch_notify() static
scope.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

3c663bbd

14 2月, 2013 22 次提交

rbd: add barriers near done flag operations · 07741308

由 Alex Elder 提交于 2月 05, 2013

Somehow, I missed this little item in Documentation/atomic_ops.txt:
    *** WARNING: atomic_read() and atomic_set() DO NOT IMPLY BARRIERS! ***

Create and use some helper functions that include the proper memory
barriers for manipulating the done field.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

07741308

rbd: turn off interrupts for open/remove locking · a14ea269

由 Alex Elder 提交于 2月 05, 2013

This commit:
    bc7a62ee5 rbd: prevent open for image being removed
added checking for removing rbd before allowing an open, and used
the same request spinlock for protecting that and updating the open
count as is used for the request queue.

However it used the non-irq protected version of the spinlocks.
Fix that.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

a14ea269

libceph: don't require r_num_pages for bio requests · 9cbb1d72

由 Alex Elder 提交于 1月 31, 2013

There is a check in the completion path for osd requests that
ensures the number of pages allocated is enough to hold the amount
of incoming data expected.

For bio requests coming from rbd the "number of pages" is not really
meaningful (although total length would be).  So stop requiring that
nr_pages be supplied for bio requests.  This is done by checking
whether the pages pointer is null before checking the value of
nr_pages.

Note that this value is passed on to the messenger, but there it's
only used for debugging--it's never used for validation.

While here, change another spot that used r_pages in a debug message
inappropriately, and also invalidate the r_con_filling_msg pointer
after dropping a reference to it.

This resolves:
    http://tracker.ceph.com/issues/3875Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

9cbb1d72

rbd: don't take extra bio reference for osd client · 1e32d34c

由 Alex Elder 提交于 1月 30, 2013

Currently, if the OSD client finds an osd request has had a bio list
attached to it, it drops a reference to it (or rather, to the first
entry on that list) when the request is released.

The code that added that reference (i.e., the rbd client) is
therefore required to take an extra reference to that first bio
structure.

The osd client doesn't really do anything with the bio pointer other
than transfer it from the osd request structure to outgoing (for
writes) and ingoing (for reads) messages.  So it really isn't the
right place to be taking or dropping references.

Furthermore, the rbd client already holds references to all bio
structures it passes to the osd client, and holds them until the
request is completed.  So there's no need for this extra reference
whatsoever.

So remove the bio_put() call in ceph_osdc_release_request(), as
well as its matching bio_get() call in rbd_osd_req_create().

This change could lead to a crash if old libceph.ko was used with
new rbd.ko.  Add a compatibility check at rbd initialization time to
avoid this possibilty.

This resolves:
    http://tracker.ceph.com/issues/3798    and
    http://tracker.ceph.com/issues/3799Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

1e32d34c

rbd: prevent open for image being removed · b82d167b

由 Alex Elder 提交于 1月 14, 2013

An open request for a mapped rbd image can arrive while removal of
that mapping is underway.  We need to prevent such an open request
from succeeding.  (It appears that Maciej Galkiewicz ran into this
problem.)

Define and use a "removing" flag to indicate a mapping is getting
removed.  Set it in the remove path after verifying nothing holds
the device open.  And check it in the open path before allowing the
open to proceed.  Acquire the rbd device's lock around each of these
spots to avoid any races accessing the flags and open_count fields.

This addresses:
    http://tracker.newdream.net/issues/3427Reported-by: NMaciej Galkiewicz <maciejgalkiewicz@ragnarson.com>
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

b82d167b

rbd: define flags field, use it for exists flag · 6d292906

由 Alex Elder 提交于 1月 14, 2013

Define a new rbd device flags field, manipulated using bit
operations.  Replace the use of the current "exists" flag with a bit
in this new "flags" field.  Add a little commentary about the
"exists" flag, which does not need to be manipulated atomically.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

6d292906

rbd: don't drop watch requests on completion · 8eb87565

由 Alex Elder 提交于 1月 25, 2013

When we register an osd request to linger, it means that request
will stay around (under control of the osd client) until we've
unregistered it.  We do that for an rbd image's header object, and
we keep a pointer to the object request associated with it.

Keep a reference to the watch object request for as long as it is
registered to linger.  Drop it again after we've removed the linger
registration.

This resolves:
    http://tracker.ceph.com/issues/3937

(Note: this originally came about because the osd client was
issuing a callback more than once.  But that behavior will be
changing soon, documented in tracker issue 3967.)
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

8eb87565

rbd: decrement obj request count when deleting · 25dcf954

由 Alex Elder 提交于 1月 25, 2013

Decrement the obj_request_count value when deleting an object
request from its image request's list.  Rearrange a few lines
in the surrounding code.

This resolves:
    http://tracker.ceph.com/issues/3940Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

25dcf954

rbd: track object rather than osd request for watch · 975241af

由 Alex Elder 提交于 1月 25, 2013

Switch to keeping track of the object request pointer rather than
the osd request used to watch the rbd image header object.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

975241af

rbd: unregister linger in watch sync routine · 6977c3f9

由 Alex Elder 提交于 1月 25, 2013

Move the code that unregisters an rbd device's lingering header
object watch request into rbd_dev_header_watch_sync(), so it
occurs in the same function that originally sets up that request.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

6977c3f9

rbd: get rid of rbd_req_sync_exec() · 9f20e02a

由 Alex Elder 提交于 1月 20, 2013

Get rid rbd_req_sync_exec() because it is no longer used.  That
eliminates the last use of rbd_req_sync_op(), so get rid of that
too.  And finally, that leaves rbd_do_request() unreferenced, so get
rid of that.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

9f20e02a

rbd: implement sync method with new code · 36be9a76

由 Alex Elder 提交于 1月 19, 2013

Reimplement synchronous object method calls using the new request
tracking code.  Use the name rbd_obj_method_sync()
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

36be9a76

rbd: send notify ack asynchronously · cf81b60e

由 Alex Elder 提交于 1月 17, 2013

When we receive notification of a change to an rbd image's header
object we need to refresh our information about the image (its
size and snapshot context).  Once we have refreshed our rbd image
we need to acknowledge the notification.

This acknowledgement was previously done synchronously, but there's
really no need to wait for it to complete.

Change it so the caller doesn't wait for the notify acknowledgement
request to complete.  And change the name to reflect it's no longer
synchronous.

This resolves:
    http://tracker.newdream.net/issues/3877Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

cf81b60e

rbd: get rid of rbd_req_sync_notify_ack() · 5ae9db81

由 Alex Elder 提交于 1月 20, 2013

Get rid rbd_req_sync_notify_ack() because it is no longer used.
As a result rbd_simple_req_cb() becomes unreferenced, so get rid
of that too.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

5ae9db81

rbd: use new code for notify ack · b8d70035

由 Alex Elder 提交于 11月 30, 2012

Use the new object request tracking mechanism for handling a
notify_ack request.

Move the callback function below the definition of this so we don't
have to do a pre-declaration.

This resolves:
    http://tracker.newdream.net/issues/3754Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

b8d70035

rbd: get rid of rbd_req_sync_watch() · ecf7a031

由 Alex Elder 提交于 1月 20, 2013

Get rid of rbd_req_sync_watch(), because it is no longer used.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

ecf7a031

rbd: implement watch/unwatch with new code · 9969ebc5

由 Alex Elder 提交于 1月 18, 2013

Implement a new function to set up or tear down a watch event
for an mapped rbd image header using the new request code.

Create a new object request type "nodata" to handle this.  And
define rbd_osd_trivial_callback() which simply marks a request done.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

9969ebc5

rbd: get rid of rbd_req_sync_read() · 86ea43bf

由 Alex Elder 提交于 1月 20, 2013

Delete rbd_req_sync_read() is no longer used, so get rid of it.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

86ea43bf

rbd: implement sync object read with new code · 788e2df3

由 Alex Elder 提交于 1月 17, 2013

Reimplement the synchronous read operation used for reading a
version 1 header using the new request tracking code.  Name the
resulting function rbd_obj_read_sync() to better reflect that
it's a full object operation, not an object request.  To do this,
implement a new OBJ_REQUEST_PAGES object request type.

This implements a new mechanism to allow the caller to wait for
completion for an rbd_obj_request by calling rbd_obj_request_wait().

This partially resolves:
    http://tracker.newdream.net/issues/3755Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

788e2df3

rbd: kill rbd_req_coll and rbd_request · 7d250b94

由 Alex Elder 提交于 11月 30, 2012

The two remaining callers of rbd_do_request() always pass a null
collection pointer, so the "coll" and "coll_index" parameters are
not needed.  There is no other use of that data structure, so it
can be eliminated.

Deleting them means there is no need to allocate a rbd_request
structure for the callback function.  And since that's the only use
of *that* structure, it too can be eliminated.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

7d250b94

rbd: kill rbd_rq_fn() and all other related code · 2250a71b

由 Alex Elder 提交于 11月 30, 2012

Now that the request function has been replaced by one using the new
request management data structures the old one can go away.
Deleting it makes rbd_dev_do_request() no longer needed, and
deleting that makes other functions unneeded, and so on.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

2250a71b

rbd: new request tracking code · bf0d5f50

由 Alex Elder 提交于 11月 22, 2012

This patch fully implements the new request tracking code for rbd
I/O requests.

Each I/O request to an rbd image will get an rbd_image_request
structure allocated to track it.  This provides access to all
information about the original request, as well as access to the
set of one or more object requests that are initiated as a result
of the image request.

An rbd_obj_request structure defines a request sent to a single osd
object (possibly) as part of an rbd image request.  An rbd object
request refers to a ceph_osd_request structure built up to represent
the request; for now it will contain a single osd operation.  It
also provides space to hold the result status and the version of the
object when the osd request completes.

An rbd_obj_request structure can also stand on its own.  This will
be used for reading the version 1 header object, for issuing
acknowledgements to event notifications, and for making object
method calls.

All rbd object requests now complete asynchronously with respect
to the osd client--they supply a common callback routine.

This resolves:
    http://tracker.newdream.net/issues/3741Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

bf0d5f50

26 1月, 2013 3 次提交

rbd: don't retry setting up header watch · c0430647

由 Alex Elder 提交于 1月 18, 2013

When an rbd image is initially mapped a watch event is registered so
we can do something if the header object changes.

The code that does this currently loops if initiating the watch
request results in an ERANGE error.  The osds will never return
ERANGE, so there's no reason to do this loop, so get rid of it.

This resolves:
    http://tracker.newdream.net/issues/3860

Note that the problem this loop was intended to solve is a race
between collecting image header information and setting up the watch
on the header object.  The real fix for that problem is described
here:
    http://tracker.newdream.net/issues/3871Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

c0430647

rbd: check for overflow in rbd_get_num_segments() · 38901e0f

由 Alex Elder 提交于 1月 10, 2013

The return type of rbd_get_num_segments() is int, but the values it
operates on are u64.  Although it's not likely, there's no guarantee
the result won't exceed what can be respresented in an int.  The
function is already designed to return -ERANGE on error, so just add
this possible overflow as another reason to return that.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NDan Mick <dan.mick@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

38901e0f

rbd: small changes · 98571b5a

由 Alex Elder 提交于 1月 20, 2013

A few very minor changes to the rbd code:
    - RBD_MAX_OPT_LEN is unused, so get rid of it
    - Consolidate rbd options definitions
    - Make rbd_segment_name() return pointer to const char
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NDan Mick <dan.mick@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

98571b5a

18 1月, 2013 1 次提交

rbd: fix type of snap_id in rbd_dev_v2_snap_info() · e0b49868

由 Alex Elder 提交于 1月 09, 2013

The type of the snap_id local variable is defined with the
wrong byte order.  Fix that.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

e0b49868

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功