1. 26 May, 2016 4 commits
    • libceph: async MON client generic requests · d0b19705
      Ilya Dryomov committed
      For map check, we are going to need to send CEPH_MSG_MON_GET_VERSION
      messages asynchronously and get a callback on completion.  Refactor MON
      client to allow firing off generic requests asynchronously and add an
      async variant of ceph_monc_get_version().  ceph_monc_do_statfs() is
      switched over and remains sync.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: handle_one_map() · 42c1b124
      Ilya Dryomov committed
      Separate osdmap handling from decoding and iterating over a bag of maps
      in a fresh MOSDMap message.  This sets up the scene for the updated OSD
      client.
      
      Of particular importance here is the addition of pi->was_full, which
      can be used to answer "did this pool go full -> not-full in this map?".
      This is the key bit for supporting pool quotas.
      
      We won't be able to downgrade map_sem for much longer, so drop
      downgrade_write().
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: DEFINE_RB_FUNCS macro · fcd00b68
      Ilya Dryomov committed
      Given
      
          struct foo {
              u64 id;
              struct rb_node bar_node;
          };
      
      generate insert_bar(), erase_bar() and lookup_bar() functions with
      
          DEFINE_RB_FUNCS(bar, struct foo, id, bar_node)
      
      The key is assumed to be an integer (u64, int, etc), compared with
      < and >.  nodefld has to be initialized with RB_CLEAR_NODE().
      
      Start using it for MDS, MON and OSD requests and OSD sessions.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: nuke unused fields and functions · 0c0a8de1
      Ilya Dryomov committed
      Either unused or useless:
      
          osdmap->mkfs_epoch
          osd->o_marked_for_keepalive
          monc->num_generic_requests
          osdc->map_waiters
          osdc->last_requested_map
          osdc->timeout_tid
      
          osd_req_op_cls_response_data()
      
          osdmap_apply_incremental() @msgr arg
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
  2. 26 Mar, 2016 9 commits
    • libceph: behave in mon_fault() if cur_mon < 0 · b5d91704
      Ilya Dryomov committed
      This can happen if __close_session() in ceph_monc_stop() races with
      a connection reset.  We need to ignore such faults, otherwise it's
      likely we would take !hunting, call __schedule_delayed() and end up
      with delayed_work() executing on invalid memory, among other things.
      
      The (two!) con->private tests are useless, as nothing ever clears
      con->private.  Nuke them.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: reschedule tick in mon_fault() · bee3a37c
      Ilya Dryomov committed
      Doing __schedule_delayed() in the hunting branch is pointless, as the
      tick will have already been scheduled by then.
      
      What we need to do instead is *reschedule* it in the !hunting branch,
      after reopen_session() changes hunt_mult, which affects the delay.
      This helps with spacing out connection attempts and avoiding things
      like two back-to-back attempts followed by a longer period of waiting
      around.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: introduce and switch to reopen_session() · 1752b50c
      Ilya Dryomov committed
      hunting is now set in __open_session() and cleared in finish_hunting(),
      instead of all around.  The "session lost" message is printed not only
      on connection resets, but also on keepalive timeouts.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: monc hunt rate is 3s with backoff up to 30s · 168b9090
      Ilya Dryomov committed
      Unless we are in the process of setting up a client (i.e. connecting to
      the monitor cluster for the first time), apply a backoff: every time we
      want to reopen a session, increase our timeout by a multiple (currently
      2); when we complete the connection, reduce that multiplier by 50%.
      
      Mirrors ceph.git commit 794c86fd289bd62a35ed14368fa096c46736e9a2.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
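The backoff described in this commit message can be sketched in plain userspace C. This is only an illustration: the struct, the function names and the constants (3 s base interval, factor 2, cap of 10 giving 30 s) are assumptions derived from the commit title and text, not the actual mon_client code.

```c
#include <assert.h>

/* Assumed values: 3 s base, x2 backoff, capped at 10 * 3 s = 30 s. */
#define HUNT_INTERVAL_MS 3000
#define HUNT_BACKOFF     2
#define HUNT_MAX_MULT    10

/* Hypothetical state, loosely modelled on the monitor client. */
struct hunt_state {
    int hunt_mult;        /* current delay multiplier, always >= 1 */
    int had_a_connection; /* 0 while the client is still being set up */
};

static void hunt_init(struct hunt_state *s)
{
    s->hunt_mult = 1;
    s->had_a_connection = 0;
}

/* Called every time a session has to be reopened. */
static void hunt_reopen(struct hunt_state *s)
{
    assert(s->hunt_mult >= 1);
    if (!s->had_a_connection)
        return;                       /* initial setup: no backoff */
    s->hunt_mult *= HUNT_BACKOFF;
    if (s->hunt_mult > HUNT_MAX_MULT)
        s->hunt_mult = HUNT_MAX_MULT; /* cap the delay at 30 s */
}

/* Called once the connection is established: halve the multiplier. */
static void hunt_finish(struct hunt_state *s)
{
    s->had_a_connection = 1;
    s->hunt_mult /= 2;
    if (s->hunt_mult < 1)
        s->hunt_mult = 1;
}

/* Delay before the next connection attempt, in milliseconds. */
static int hunt_delay_ms(const struct hunt_state *s)
{
    return HUNT_INTERVAL_MS * s->hunt_mult;
}
```

Halving on success rather than resetting to 1 means a client that keeps losing its session stays on a longer delay for a while, which appears to be the intent of "reduce that multiplier by 50%".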
    • libceph: monc ping rate is 10s · 58d81b12
      Ilya Dryomov committed
      Split ping interval and ping timeout: ping interval is 10s; keepalive
      timeout is 30s.
      
      Make monc_ping_timeout a constant while at it - it's not actually
      exported as a mount option (and the rest of tick-related settings won't
      be either), so it's got no place in ceph_options.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: pick a different monitor when reconnecting · 0e04dc26
      Ilya Dryomov committed
      Don't try to reconnect to the same monitor when we fail to establish
      a session within a timeout or it's lost.
      
      For that, pick_new_mon() needs to see the old value of cur_mon, so
      don't clear it in __close_session() - all calls to __close_session()
      but one are followed by __open_session() anyway.  __open_session() is
      only called when a new session needs to be established, so the "already
      open?" branch, which is now in the way, is simply dropped.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
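One way to pick a monitor that is guaranteed to differ from the current one is to draw from a range that is one smaller and skip over the excluded slot. The sketch below is a userspace illustration under that assumption; pick_other_mon() and its caller-supplied rnd argument are hypothetical and are not the kernel's pick_new_mon().

```c
#include <assert.h>

/*
 * Return a monitor index in [0, num_mon) that differs from cur_mon
 * whenever more than one monitor exists.  cur_mon < 0 means "no
 * current monitor".  rnd is any random number supplied by the caller.
 */
static int pick_other_mon(int num_mon, int cur_mon, unsigned int rnd)
{
    int have_cur = cur_mon >= 0 && cur_mon < num_mon;
    int max = num_mon;
    int n;

    assert(num_mon > 0);
    if (num_mon == 1)
        return 0;           /* nothing else to choose from */

    if (have_cur)
        max--;              /* exclude the current monitor */

    n = (int)(rnd % (unsigned int)max);
    if (have_cur && n >= cur_mon)
        n++;                /* skip over the excluded slot */
    return n;
}
```

This only works if the old value of cur_mon is still available at selection time, which is why the commit stops clearing it in __close_session().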
    • libceph: revamp subs code, switch to SUBSCRIBE2 protocol · 82dcabad
      Ilya Dryomov committed
      It is currently hard-coded in the mon_client that mdsmap and monmap
      subs are continuous, while osdmap sub is always "onetime".  To better
      handle full clusters/pools in the osd_client, we need to be able to
      issue continuous osdmap subs.  Revamp subs code to allow us to specify
      for each sub whether it should be continuous or not.
      
      Although not strictly required for the above, switch to SUBSCRIBE2
      protocol while at it, eliminating the ambiguity between a request for
      "every map since X" and a request for "just the latest" when we don't
      have a map yet (i.e. have epoch 0).  SUBSCRIBE2 feature bit is now
      required - it's been supported since pre-argonaut (2010).
      
      Move "got mdsmap" call to the end of ceph_mdsc_handle_map() - calling
      it before we validate the epoch and successfully install the new map
      can mess up mon_client sub state.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: decouple hunting and subs management · 0f9af169
      Ilya Dryomov committed
      Coupling hunting state with subscribe state is not a good idea.  Clear
      hunting when we complete the authentication handshake.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: move debugfs initialization into __ceph_open_session() · 02ac956c
      Ilya Dryomov committed
      Our debugfs dir name is a concatenation of cluster fsid and client
      unique ID ("global_id").  It used to be the case that we learned
      global_id first, nowadays we always learn fsid first - the monmap is
      sent before any auth replies are.  ceph_debugfs_client_init() call in
      ceph_monc_handle_map() is therefore never executed and can be removed.
      
      Its counterpart in handle_auth_reply() doesn't really belong there
      either: having to do monc->client and unlocking early to work around
      lockdep is a testament to that.  Move it into __ceph_open_session(),
      where it can be called unconditionally.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
  3. 22 Jan, 2016 1 commit
  4. 09 Sep, 2015 1 commit
  5. 25 Jun, 2015 2 commits
    • libceph: a couple tweaks for wait loops · 216639dd
      Ilya Dryomov committed
      - return -ETIMEDOUT instead of -EIO in case of timeout
      - wait_event_interruptible_timeout() returns time left until timeout
        and since it can be almost LONG_MAX we had better assign it to long
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Alex Elder <elder@linaro.org>
    • libceph: store timeouts in jiffies, verify user input · a319bf56
      Ilya Dryomov committed
      There are currently three libceph-level timeouts that the user can
      specify on mount: mount_timeout, osd_idle_ttl and osdkeepalive.  All of
      these are in seconds and no checking is done on user input: negative
      values are accepted, we multiply them all by HZ which may or may not
      overflow, arbitrarily large jiffies then get added together, etc.
      
      There is also a bug in the way mount_timeout=0 is handled.  It's
      supposed to mean "infinite timeout", but that's not how wait.h APIs
      treat it and so __ceph_open_session() for example will busy loop
      without much chance of being interrupted if none of ceph-mons are
      there.
      
      Fix all this by verifying user input, storing timeouts capped by
      msecs_to_jiffies() in jiffies and using the new ceph_timeout_jiffies()
      helper for all user-specified waits to handle infinite timeouts
      correctly.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Alex Elder <elder@linaro.org>
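The two ideas in this commit message, validating user-supplied seconds before conversion and mapping a stored 0 to "wait forever", can be sketched in userspace C. Everything here is an assumption made for illustration: the tick rate, the upper bound and both function names are hypothetical stand-ins, not the kernel's option parser or its ceph_timeout_jiffies() helper.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_HZ          250L      /* assumed tick rate */
#define SKETCH_MAX_TIMEOUT LONG_MAX  /* stands in for an infinite wait */

/*
 * Validate a timeout given in seconds and convert it to ticks.
 * Negative values and values large enough to risk overflow in a
 * later milliseconds conversion are rejected with -1.
 */
static long secs_to_ticks_checked(long secs)
{
    if (secs < 0 || secs > INT_MAX / 1000)
        return -1;
    return secs * SKETCH_HZ;
}

/*
 * Map a stored timeout to the value handed to a wait primitive:
 * 0 means "no timeout", so substitute the largest possible wait
 * instead of passing 0, which would time out immediately.
 */
static long timeout_ticks(long stored)
{
    assert(stored >= 0);
    return stored ? stored : SKETCH_MAX_TIMEOUT;
}
```

Routing every user-specified wait through a helper like timeout_ticks() keeps the "0 means infinite" convention in one place rather than at each call site.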
  6. 19 Feb, 2015 2 commits
  7. 09 Jan, 2015 1 commit
  8. 15 Oct, 2014 1 commit
  9. 11 Sep, 2014 1 commit
  10. 06 Jun, 2014 2 commits
  11. 14 Jan, 2014 1 commit
  12. 02 May, 2013 1 commit
  13. 26 Feb, 2013 1 commit
  14. 02 Oct, 2012 2 commits
  15. 21 Aug, 2012 1 commit
  16. 31 Jul, 2012 1 commit
  17. 06 Jul, 2012 2 commits
  18. 20 Jun, 2012 1 commit
    • libceph: flush msgr queue during mon_client shutdown · 642c0dbd
      Sage Weil committed
      We need to flush the msgr workqueue during mon_client shutdown to
      ensure that any work affecting our embedded ceph_connection is
      finished so that we can be safely destroyed.
      
      Previously, we were flushing the work queue after osd_client
      shutdown and before mon_client shutdown to ensure that any osd
      connection refs to authorizers are flushed.  Remove the redundant
      flush, and document in the comment that the mon_client flush is
      needed to cover that case as well.
      Signed-off-by: Sage Weil <sage@inktank.com>
      Reviewed-by: Alex Elder <elder@inktank.com>
      (cherry picked from commit f3dea7ed)
  19. 16 Jun, 2012 1 commit
    • libceph: flush msgr queue during mon_client shutdown · f3dea7ed
      Sage Weil committed
      We need to flush the msgr workqueue during mon_client shutdown to
      ensure that any work affecting our embedded ceph_connection is
      finished so that we can be safely destroyed.
      
      Previously, we were flushing the work queue after osd_client
      shutdown and before mon_client shutdown to ensure that any osd
      connection refs to authorizers are flushed.  Remove the redundant
      flush, and document in the comment that the mon_client flush is
      needed to cover that case as well.
      Signed-off-by: Sage Weil <sage@inktank.com>
      Reviewed-by: Alex Elder <elder@inktank.com>
  20. 06 Jun, 2012 5 commits
    • libceph: make ceph_con_revoke() a msg operation · 6740a845
      Alex Elder committed
      ceph_con_revoke() is passed both a message and a ceph connection.
      Now that any message associated with a connection holds a pointer
      to that connection, there's no need to provide the connection when
      revoking a message.
      
      This has the added benefit of precluding the possibility of
      providing the wrong connection pointer.  If the message's connection
      pointer is null, it is not being tracked by any connection, so
      revoking it is a no-op.  This is supported as a convenience for
      upper layers, so they can revoke a message that is not actually
      "in flight."
      
      Rename the function ceph_msg_revoke() to reflect that it is really
      an operation on a message, not a connection.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Sage Weil <sage@inktank.com>
    • libceph: tweak ceph_alloc_msg() · 1c20f2d2
      Alex Elder committed
      The function ceph_alloc_msg() is only used to allocate a message
      that will be assigned to a connection's in_msg pointer.  Rename the
      function so this implied usage is more clear.
      
      In addition, make that assignment inside the function (again, since
      that's precisely what it's intended to be used for).  This allows us
      to return what is now provided via the passed-in address of a "skip"
      variable.  The return type is now Boolean to be explicit that there
      are only two possible outcomes.
      
      Make sure the result of an ->alloc_msg method call always sets the
      value of *skip properly.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Sage Weil <sage@inktank.com>
    • libceph: fully initialize connection in con_init() · 1bfd89f4
      Alex Elder committed
      Move the initialization of a ceph connection's private pointer,
      operations vector pointer, and peer name information into
      ceph_con_init().  Rearrange the arguments so the connection pointer
      is first.  Hide the byte-swapping of the peer entity number inside
      ceph_con_init().
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Sage Weil <sage@inktank.com>
    • libceph: init monitor connection when opening · 20581c1f
      Alex Elder committed
      Hold off initializing a monitor client's connection until just
      before it gets opened for use.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Sage Weil <sage@inktank.com>
    • libceph: drop connection refcounting for mon_client · ec87ef43
      Sage Weil committed
      All references to the embedded ceph_connection come from the msgr
      workqueue, which is drained prior to mon_client destruction.  That
      means we can ignore con refcounting entirely.
      Signed-off-by: Sage Weil <sage@newdream.net>
      Reviewed-by: Alex Elder <elder@inktank.com>