1. 20 Feb, 2017 1 commit
  2. 15 Dec, 2016 1 commit
    • libceph: always signal completion when done · c297eb42
      Committed by Ilya Dryomov
      r_safe_completion is currently, and has always been, signaled only if
      on-disk ack was requested.  It's there for fsync and syncfs, which wait
      for in-flight writes to flush - all data write requests set ONDISK.
      
      However, the pool perm check code introduced in 4.2 sends a write
      request with only ACK set.  An unfortunately timed syncfs can then hang
      forever: r_safe_completion won't be signaled because only an unsafe
      reply was requested.
      
      We could patch ceph_osdc_sync() to skip !ONDISK write requests, but
      that is somewhat incomplete and yet another special case.  Instead,
      rename this completion to r_done_completion and always signal it when
      the OSD client is done with the request, whether unsafe, safe, or
      error.  This is a bit cleaner and helps with the cancellation code.
      Reported-by: Yan, Zheng <zyan@redhat.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
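The difference between the old and new signaling rules can be sketched as a small user-space model. This is an illustration only: the flag values, the `reply_kind` enum, and both helper functions are hypothetical and simplified, and none of them mirror the actual libceph code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits; the real CEPH_OSD_FLAG_* values may differ. */
#define FLAG_ACK    0x1u  /* caller wants the unsafe (in-memory) ack */
#define FLAG_ONDISK 0x2u  /* caller wants the safe (on-disk) ack */

enum reply_kind { REPLY_UNSAFE, REPLY_SAFE, REPLY_ERROR };

/*
 * Simplified old rule: the "safe" completion fires only when an on-disk
 * ack was requested and the safe reply has arrived.
 */
static inline bool old_signals_completion(unsigned flags, enum reply_kind reply)
{
    return (flags & FLAG_ONDISK) && reply == REPLY_SAFE;
}

/*
 * Simplified new rule: the "done" completion fires as soon as the client
 * is finished with the request, whether the outcome is unsafe, safe, or
 * an error.
 */
static inline bool new_signals_completion(unsigned flags, enum reply_kind reply)
{
    assert(flags & (FLAG_ACK | FLAG_ONDISK)); /* some ack must be requested */

    if (reply == REPLY_ERROR || reply == REPLY_SAFE)
        return true;
    /* Unsafe reply: finished only if no on-disk ack is still expected. */
    return !(flags & FLAG_ONDISK);
}
```

In this model, an ACK-only write that receives its unsafe reply never signals under the old rule, which is the situation in which a waiter such as syncfs could hang; under the new rule it signals immediately.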
  3. 13 Dec, 2016 3 commits
  4. 11 Nov, 2016 1 commit
  5. 01 Nov, 2016 1 commit
  6. 03 Oct, 2016 1 commit
  7. 25 Aug, 2016 8 commits
  8. 28 Jul, 2016 9 commits
  9. 31 May, 2016 1 commit
  10. 26 May, 2016 14 commits
    • ceph: make logical calculation functions return bool · 3b33f692
      Committed by Zhang Zhuoyu
      This patch makes several logical calculation functions return bool to
      improve readability, since these functions only ever return 0 or 1.
      
      No functional change.
      Signed-off-by: Zhang Zhuoyu <zhangzhuoyu@cmss.chinamobile.com>
    • ceph: using hash value to compose dentry offset · f3c4ebe6
      Committed by Yan, Zheng
      If the MDS sorts dentries in a dirfrag in hash order, we use the hash
      value to compose the dentry offset. The dentry offset is:

        (0xff << 52) | ((24 bits hash) << 28) |
        (the nth entry that has a hash collision)

      This offset is stable across directory fragmentation. This also means
      there is no need to reset the readdir offset if the directory gets
      fragmented in the middle of readdir.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
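The offset layout described in this commit can be sketched in plain C. The macro and function names below are hypothetical and chosen for illustration; only the bit layout (a 0xff marker at bit 52, a 24-bit hash at bit 28, and the collision index in the low 28 bits) comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define OFFSET_BITS 28  /* low bits: index among entries sharing a hash */
#define HASH_BITS   24  /* middle bits: 24-bit dentry name hash */
#define HASH_ORDER_MARK (UINT64_C(0xff) << (OFFSET_BITS + HASH_BITS))

/* Compose a hash-ordered dentry offset from a 24-bit hash and an index. */
static inline uint64_t make_hash_order_offset(uint32_t hash24, uint32_t nth)
{
    assert(hash24 < (UINT32_C(1) << HASH_BITS));
    assert(nth < (UINT32_C(1) << OFFSET_BITS));
    return HASH_ORDER_MARK | ((uint64_t)hash24 << OFFSET_BITS) | nth;
}

/* Recover the 24-bit hash from a composed offset. */
static inline uint32_t offset_hash(uint64_t off)
{
    return (uint32_t)((off >> OFFSET_BITS) & ((UINT64_C(1) << HASH_BITS) - 1));
}

/* Recover the collision index from a composed offset. */
static inline uint32_t offset_nth(uint64_t off)
{
    return (uint32_t)(off & ((UINT64_C(1) << OFFSET_BITS) - 1));
}
```

Because the offset depends only on the name hash and the position among colliding entries, not on which fragment currently holds the dentry, the same value remains valid after the directory is split or merged.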
    • ceph: define 'end/complete' in readdir reply as bit flags · 956d39d6
      Committed by Yan, Zheng
      Set a flag in the readdir request to indicate that the client
      interprets 'end/complete' as bit flags, so that the MDS can return
      additional flags in the readdir reply.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
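Treating 'end/complete' as bit flags leaves room for further flags in the same field. A minimal sketch of decoding such a field follows; the macro names, the bit positions, and the `readdir_state` struct are assumptions made for illustration and are not taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, chosen for illustration only. */
#define READDIR_FRAG_END      (1u << 0)  /* no more entries in this dirfrag */
#define READDIR_FRAG_COMPLETE (1u << 8)  /* reply covers the whole dirfrag */

struct readdir_state {
    bool end;
    bool complete;
    uint16_t extra;  /* any other flag bits the server chose to set */
};

/* Split a 16-bit reply flags field into the known bits and the remainder. */
static inline void parse_readdir_flags(uint16_t flags, struct readdir_state *out)
{
    assert(out);
    out->end = (flags & READDIR_FRAG_END) != 0;
    out->complete = (flags & READDIR_FRAG_COMPLETE) != 0;
    out->extra = (uint16_t)(flags & ~(READDIR_FRAG_END | READDIR_FRAG_COMPLETE));
}
```

A client that only understands the two original bits can ignore `extra`, while a newer client can test additional bits there without any change to the reply layout.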
    • 737cc81e
    • libceph: replace ceph_monc_request_next_osdmap() · 7cca78c9
      Committed by Ilya Dryomov
      ... with a wrapper around maybe_request_map() - no need for two
      osdmap-specific functions.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: pool deletion detection · 4609245e
      Committed by Ilya Dryomov
      This adds the "map check" infrastructure for sending osdmap version
      checks on CALC_TARGET_POOL_DNE and completing in-flight requests with
      -ENOENT if the target pool doesn't exist or has just been deleted.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
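One way to picture the map check is as a decision made once the monitor reports the newest osdmap epoch. The sketch below is a hypothetical user-space model: the `map_check` struct, its fields, and the use of `-EAGAIN` to mean "keep waiting for a newer map" are assumptions for illustration, not the libceph implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of what the client knows when the check returns. */
struct map_check {
    bool pool_exists;       /* target pool is present in the client's osdmap */
    uint32_t have_epoch;    /* epoch of the osdmap the client holds */
    uint32_t newest_epoch;  /* newest epoch reported by the monitor */
};

/*
 * Returns 0 if the request can proceed, -ENOENT if the pool is known not
 * to exist as of the newest map, or -EAGAIN if the client's map is stale
 * and it must wait for a newer one before deciding.
 */
static inline int map_check_verdict(const struct map_check *mc)
{
    assert(mc);
    if (mc->pool_exists)
        return 0;
    if (mc->have_epoch >= mc->newest_epoch)
        return -ENOENT;
    return -EAGAIN;
}
```

The version check matters because a missing pool in a stale map is ambiguous: the pool may have been created in a later epoch, so the request is failed only once the client's map is at least as new as the monitor's.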
    • libceph: async MON client generic requests · d0b19705
      Committed by Ilya Dryomov
      For map check, we are going to need to send CEPH_MSG_MON_GET_VERSION
      messages asynchronously and get a callback on completion.  Refactor MON
      client to allow firing off generic requests asynchronously and add an
      async variant of ceph_monc_get_version().  ceph_monc_do_statfs() is
      switched over and remains sync.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: support for checking on status of watch · b07d3c4b
      Committed by Ilya Dryomov
      Implement ceph_osdc_watch_check() to be able to check on status of
      watch.  Note that the time it takes for a watch/notify event to get
      delivered through the notify_wq is taken into account.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: support for sending notifies · 19079203
      Committed by Ilya Dryomov
      Implement ceph_osdc_notify() for sending notifies.
      
      Due to the fact that the current messenger can't do read-in into
      pagelists (it can only do write-out from them), I had to go with a page
      vector for a NOTIFY_COMPLETE payload, for now.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph, rbd: ceph_osd_linger_request, watch/notify v2 · 922dab61
      Committed by Ilya Dryomov
      This adds support and switches rbd to a new, more reliable version of
      watch/notify protocol.  As with the OSD client update, this is mostly
      about getting the right structures linked into the right places so that
      reconnects are properly sent when needed.  watch/notify v2 also
      requires sending regular pings to the OSDs - send_linger_ping().
      
      A major change from the old watch/notify implementation is the
      introduction of ceph_osd_linger_request - linger requests no longer
      piggy back on ceph_osd_request.  ceph_osd_event has been merged into
      ceph_osd_linger_request.
      
      All the details are now hidden within libceph, the interface consists
      of a simple pair of watch/unwatch functions and ceph_osdc_notify_ack().
      ceph_osdc_watch() does return ceph_osd_linger_request, but only to keep
      the lifetime management simple.
      
      ceph_osdc_notify_ack() accepts an optional data payload, which is
      relayed back to the notifier.
      
      Portions of this patch are loosely based on work by Douglas Fuller
      <dfuller@redhat.com> and Mike Christie <michaelc@cs.wisc.edu>.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: a major OSD client update · 5aea3dcd
      Committed by Ilya Dryomov
      This is a major sync up, up to ~Jewel.  The highlights are:
      
      - per-session request trees (vs a global per-client tree)
      - per-session locking (vs a global per-client rwlock)
      - homeless OSD session
      - no ad-hoc global per-client lists
      - support for pool quotas
      - foundation for watch/notify v2 support
      - foundation for map check (pool deletion detection) support
      
      The switchover is incomplete: lingering requests can be set up and
      torn down but aren't ever reestablished.  This functionality is
      restored with the introduction of the new lingering infrastructure
      (ceph_osd_linger_request, linger_work, etc) in a later commit.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: protect osdc->osd_lru list with a spinlock · 9dd2845c
      Committed by Ilya Dryomov
      OSD client is getting moved from the big per-client lock to a set of
      per-session locks.  The big rwlock would only be held for read most of
      the time, so a global osdc->osd_lru needs additional protection.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: handle_one_map() · 42c1b124
      Committed by Ilya Dryomov
      Separate osdmap handling from decoding and iterating over a bag of maps
      in a fresh MOSDMap message.  This sets up the scene for the updated OSD
      client.
      
      Of particular importance here is the addition of pi->was_full, which
      can be used to answer "did this pool go full -> not-full in this map?".
      This is the key bit for supporting pool quotas.
      
      We won't be able to downgrade map_sem for much longer, so drop
      downgrade_write().
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
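The role of `pi->was_full` can be sketched with a small model that records the previous full state each time a new map is applied. The struct and helper names here are hypothetical; only the idea of remembering whether the pool was full before the current map comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down pool info: only the two bits of interest. */
struct pool_info {
    bool full;      /* pool is full according to the current map */
    bool was_full;  /* pool was full according to the previous map */
};

/* Apply the full state from a newly received map, remembering the old one. */
static inline void pool_apply_new_map(struct pool_info *pi, bool full_in_new_map)
{
    assert(pi);
    pi->was_full = pi->full;
    pi->full = full_in_new_map;
}

/* "Did this pool go full -> not-full in this map?" */
static inline bool pool_cleared_full(const struct pool_info *pi)
{
    assert(pi);
    return pi->was_full && !pi->full;
}
```

A full to not-full transition is the moment at which writes that were held back because of a pool quota could be resubmitted, which is why this single bit is the key to quota support.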
    • libceph: allocate dummy osdmap in ceph_osdc_init() · e5253a7b
      Committed by Ilya Dryomov
      This leads to a simpler osdmap handling code, particularly when dealing
      with pi->was_full, which is introduced in a later commit.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>