1. 02 5月, 2013 8 次提交
    • A
      libceph: let osd ops determine request data length · 175face2
      Alex Elder 提交于
      The length of outgoing data in an osd request is dependent on the
      osd ops that are embedded in that request.  Each op is encoded into
      a request message using osd_req_encode_op(), so that should be used
      to determine the amount of outgoing data implied by the op as it
      is encoded.
      
      Have osd_req_encode_op() return the number of bytes of outgoing data
      implied by the op being encoded, and accumulate and use that in
      ceph_osdc_build_request().
      
      As a result, ceph_osdc_build_request() no longer requires its "len"
      parameter, so get rid of it.
      
      Using the sum of the op lengths rather than the length provided is
      a valid change because:
          - The only callers of osd ceph_osdc_build_request() are
            rbd and the osd client (in ceph_osdc_new_request() on
            behalf of the file system).
          - When rbd calls it, the length provided is only non-zero for
            write requests, and in that case the single op has the
            same length value as what was passed here.
          - When called from ceph_osdc_new_request(), (it's not all that
            easy to see, but) the length passed is also always the same
            as the extent length encoded in its (single) write op if
            present.
      
      This resolves:
          http://tracker.ceph.com/issues/4406Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      175face2
    • A
      libceph: record byte count not page count · e0c59487
      Alex Elder 提交于
      Record the byte count for an osd request rather than the page count.
      The number of pages can always be derived from the byte count (and
      alignment/offset) but the reverse is not true.
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      e0c59487
    • A
      libceph: separate read and write data · 0fff87ec
      Alex Elder 提交于
      An osd request defines information about where data to be read
      should be placed as well as where data to write comes from.
      Currently these are represented by common fields.
      
      Keep information about data for writing separate from data to be
      read by splitting these into data_in and data_out fields.
      
      This is the key patch in this whole series, in that it actually
      identifies which osd requests generate outgoing data and which
      generate incoming data.  It's less obvious (currently) that an osd
      CALL op generates both outgoing and incoming data; that's the focus
      of some upcoming work.
      
      This resolves:
          http://tracker.ceph.com/issues/4127Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      0fff87ec
    • A
      libceph: distinguish page and bio requests · 2ac2b7a6
      Alex Elder 提交于
      An osd request uses either pages or a bio list for its data.  Use a
      union to record information about the two, and add a data type
      tag to select between them.
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      2ac2b7a6
    • A
      libceph: separate osd request data info · 2794a82a
      Alex Elder 提交于
      Pull the fields in an osd request structure that define the data for
      the request out into a separate structure.
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      2794a82a
    • A
      libceph: don't assign page info in ceph_osdc_new_request() · 153e5167
      Alex Elder 提交于
      Currently ceph_osdc_new_request() assigns an osd request's
      r_num_pages and r_alignment fields.  The only thing it does
      after that is call ceph_osdc_build_request(), and that doesn't
      need those fields to be assigned.
      
      Move the assignment of those fields out of ceph_osdc_new_request()
      and into its caller.  As a result, the page_align parameter is no
      longer used, so get rid of it.
      
      Note that in ceph_sync_write(), the value for req->r_num_pages had
      already been calculated earlier (as num_pages, and fortunately
      it was computed the same way).  So don't bother recomputing it,
      but because it's not needed earlier, move that calculation after the
      call to ceph_osdc_new_request().  Hold off making the assignment to
      r_alignment, doing it instead r_pages and r_num_pages are
      getting set.
      
      Similarly, in start_read(), nr_pages already holds the number of
      pages in the array (and is calculated the same way), so there's no
      need to recompute it.  Move the assignment of the page alignment
      down with the others there as well.
      
      This and the next few patches are preparation work for:
          http://tracker.ceph.com/issues/4127Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      153e5167
    • A
      libceph: use (void *) for untyped data in osd ops · 2a24d1f4
      Alex Elder 提交于
      Two of the fields defining osd operations are defined using (char *)
      while the data they represent are really untyped, not character
      strings.  Change them to have type (void *).
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      2a24d1f4
    • A
      libceph: complete lingering requests only once · 0d5af164
      Alex Elder 提交于
      An osd request marked to linger will be re-submitted in the event
      a connection to the target osd gets dropped.  Currently, if there
      is a callback function associated with a request it will be called
      each time a request is submitted--which for lingering requests can
      be more than once.
      
      Change it so a request--including lingering ones--will get completed
      (from the perspective of the user of the osd client) exactly once.
      
      This resolves:
          http://tracker.ceph.com/issues/3967Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      0d5af164
  2. 27 2月, 2013 4 次提交
  3. 19 2月, 2013 8 次提交
  4. 18 1月, 2013 9 次提交
    • A
      rbd: kill ceph_osd_req_op->flags · 2b5fc648
      Alex Elder 提交于
      The flags field of struct ceph_osd_req_op is never used, so just get
      rid of it.
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      2b5fc648
    • A
      libceph: pass num_op with ops · ae7ca4a3
      Alex Elder 提交于
      Both ceph_osdc_alloc_request() and ceph_osdc_build_request() are
      provided an array of ceph osd request operations.  Rather than just
      passing the number of operations in the array, the caller is
      required append an additional zeroed operation structure to signal
      the end of the array.
      
      All callers know the number of operations at the time these
      functions are called, so drop the silly zero entry and supply that
      number directly.  As a result, get_num_ops() is no longer needed.
      This also means that ceph_osdc_alloc_request() never uses its ops
      argument, so that can be dropped.
      
      Also rbd_create_rw_ops() no longer needs to add one to reserve room
      for the additional op.
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      ae7ca4a3
    • A
      libceph: don't set pages or bio in ceph_osdc_alloc_request() · 54a54007
      Alex Elder 提交于
      Only one of the two callers of ceph_osdc_alloc_request() provides
      page or bio data for its payload.  And essentially all that function
      was doing with those arguments was assigning them to fields in the
      osd request structure.
      
      Simplify ceph_osdc_alloc_request() by having the caller take care of
      making those assignments
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      54a54007
    • A
      libceph: don't set flags in ceph_osdc_alloc_request() · d178a9e7
      Alex Elder 提交于
      The only thing ceph_osdc_alloc_request() really does with the
      flags value it is passed is assign it to the newly-created
      osd request structure.  Do that in the caller instead.
      
      Both callers subsequently call ceph_osdc_build_request(), so have
      that function (instead of ceph_osdc_alloc_request()) issue a warning
      if a request comes through with neither the read nor write flags set.
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      d178a9e7
    • A
      libceph: drop osdc from ceph_calc_raw_layout() · e75b45cf
      Alex Elder 提交于
      The osdc parameter to ceph_calc_raw_layout() is not used, so get rid
      of it.  Consequently, the corresponding parameter in calc_layout()
      becomes unused, so get rid of that as well.
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      e75b45cf
    • A
      libceph: drop snapid in ceph_calc_raw_layout() · 4d6b250b
      Alex Elder 提交于
      A snapshot id must be provided to ceph_calc_raw_layout() even though
      it is not needed at all for calculating the layout.
      
      Where the snapshot id *is* needed is when building the request
      message for an osd operation.
      
      Drop the snapid parameter from ceph_calc_raw_layout() and pass
      that value instead in ceph_osdc_build_request().
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      4d6b250b
    • A
      libceph: pass length to ceph_osdc_build_request() · 0120be3c
      Alex Elder 提交于
      The len argument to ceph_osdc_build_request() is set up to be
      passed by address, but that function never updates its value
      so there's no need to do this.  Tighten up the interface by
      passing the length directly.
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      0120be3c
    • A
      libceph: always allow trail in osd request · c885837f
      Alex Elder 提交于
      An osd request structure contains an optional trail portion, which
      if present will contain data to be passed in the payload portion of
      the message containing the request.  The trail field is a
      ceph_pagelist pointer, and if null it indicates there is no trail.
      
      A ceph_pagelist structure contains a length field, and it can
      legitimately hold value 0.  Make use of this to change the
      interpretation of the "trail" of an osd request so that every osd
      request has trailing data, it just might have length 0.
      
      This means we change the r_trail field in a ceph_osd_request
      structure from a pointer to a structure that is always initialized.
      
      Note that in ceph_osdc_start_request(), the trail pointer (or now
      address of that structure) is assigned to a ceph message's trail
      field.  Here's why that's still OK (looking at net/ceph/messenger.c):
          - What would have resulted in a null pointer previously will now
            refer to a 0-length page list.  That message trail pointer
            is used in two functions, write_partial_msg_pages() and
            out_msg_pos_next().
          - In write_partial_msg_pages(), a null page list pointer is
            handled the same as a message with 0-length trail, and both
            result in a "in_trail" variable set to false.  The trail
            pointer is only used if in_trail is true.
          - The only other place the message trail pointer is used is
            out_msg_pos_next().  That function is only called by
            write_partial_msg_pages() and only touches the trail pointer
            if the in_trail value it is passed is true.
      Therefore a null ceph_msg->trail pointer is equivalent to a non-null
      pointer referring to a 0-length page list structure.
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      c885837f
    • A
      rbd: drop oid parameters from ceph_osdc_build_request() · af77f26c
      Alex Elder 提交于
      The last two parameters to ceph_osd_build_request() describe the
      object id, but the values passed always come from the osd request
      structure whose address is also provided.  Get rid of those last
      two parameters.
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>
      af77f26c
  5. 02 10月, 2012 1 次提交
  6. 17 5月, 2012 1 次提交
  7. 12 11月, 2011 1 次提交
  8. 23 3月, 2011 1 次提交
  9. 22 3月, 2011 1 次提交
    • S
      libceph: fix osd request queuing on osdmap updates · 6f6c7006
      Sage Weil 提交于
      If we send a request to osd A, and the request's pg remaps to osd B and
      then back to A in quick succession, we need to resend the request to A. The
      old code was only calling kick_requests after processing all incremental
      maps in a message, so it was very possible to not resend a request that
      needed to be resent.  This would make the osd eventually time out (at least
      with the current default of osd timeouts enabled).
      
      The correct approach is to scan requests on every map incremental.  This
      patch refactors the kick code in a few ways:
       - all requests are either on req_lru (in flight), req_unsent (ready to
         send), or req_notarget (currently map to no up osd)
       - mapping always done by map_request (previous map_osds)
       - if the mapping changes, we requeue.  requests are resent only after all
         map incrementals are processed.
       - some osd reset code is moved out of kick_requests into a separate
         function
       - the "kick this osd" functionality is moved to kick_osd_requests, as it
         is unrelated to scanning for request->pg->osd mapping changes
      Signed-off-by: NSage Weil <sage@newdream.net>
      6f6c7006
  10. 10 11月, 2010 1 次提交
    • S
      ceph: make page alignment explicit in osd interface · b7495fc2
      Sage Weil 提交于
      We used to infer alignment of IOs within a page based on the file offset,
      which assumed they matched.  This broke with direct IO that was not aligned
      to pages (e.g., 512-byte aligned IO).  We were also trusting the alignment
      specified in the OSD reply, which could have been adjusted by the server.
      
      Explicitly specify the page alignment when setting up OSD IO requests.
      Signed-off-by: NSage Weil <sage@newdream.net>
      b7495fc2
  11. 21 10月, 2010 4 次提交
  12. 12 5月, 2010 1 次提交