1. 23 Mar, 2010 3 commits
    • ceph: propagate mds session allocation failures to caller · 9c423956
      Sage Weil committed
      Return error to original caller if register_session() fails.
      Signed-off-by: Sage Weil <sage@newdream.net>
    • ceph: prevent dup stale messages to console for restarting mds · e4cb4cb8
      Sage Weil committed
      Prevent duplicate 'mds0 caps stale' message from spamming the console every
      few seconds while the MDS restarts.  Set s_renew_requested earlier, so that
      we only print the message once, even if we don't send an actual request.
      Signed-off-by: Sage Weil <sage@newdream.net>
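The ordering described in this commit can be sketched in userspace C. This is a minimal illustration, not the kernel code: the struct, its fields other than the idea of `s_renew_requested`, and the plain tick comparisons are all simplifying assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified session: times are plain tick counts. */
struct session {
    int mds;
    unsigned long cap_ttl;          /* tick at which our caps expire */
    unsigned long renew_requested;  /* tick of the last renewal attempt */
    bool mds_ready;                 /* false while the MDS is restarting */
    int renews_sent;
};

/* Returns true if the "caps stale" message was printed on this call. */
static bool send_renew_caps(struct session *s, unsigned long now)
{
    bool printed = false;

    /* Stale, and no renewal attempted since the caps expired. */
    if (now >= s->cap_ttl && s->cap_ttl >= s->renew_requested) {
        printf("mds%d caps stale\n", s->mds);
        printed = true;
    }

    /* Record the attempt before the early return below, so the next
     * call sees renew_requested > cap_ttl and stays quiet. */
    s->renew_requested = now;

    if (!s->mds_ready)
        return printed;     /* MDS still restarting: nothing is sent */

    s->renews_sent++;       /* stand-in for sending the real request */
    return printed;
}
```

If the timestamp were only updated after the readiness check, every call made while the MDS restarts would take the stale branch again and print the message.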
    • ceph: fix mds sync() race with completing requests · 80fc7314
      Sage Weil committed
      The wait_unsafe_requests() helper dropped the mdsc mutex to wait
      for each request to complete, and then examined r_node to get the
      next request after retaking the lock.  But the request completion
      removes the request from the tree, so r_node was always undefined
      at this point.  Since it's a small race, it usually led to a
      valid request, but not always.  The result was an occasional
      crash in rb_next() while dereferencing node->rb_left.
      
      Fix this by clearing the rb_node when removing the request from
      the request tree, and not walking off into the weeds when we
      are done waiting for a request.  Since the request we waited on
      will _always_ be out of the request tree, take a ref on the next
      request, in the hopes that it won't be.  But if it is, it's ok:
      we can start over from the beginning (and traverse over older read
      requests again).
      Signed-off-by: Sage Weil <sage@newdream.net>
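The traversal pattern this commit describes might look roughly like the following userspace sketch. It makes several assumptions: a doubly linked list ordered by tid stands in for the rbtree, the `linked` flag stands in for a cleared `rb_node`, `wait_for_safe()` replaces dropping the mutex and blocking, and `races_with` exists only to simulate a second request completing during the wait. All names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct request {
    unsigned long tid;
    bool is_write;
    bool linked;                 /* cleared on removal, like a cleared rb_node */
    int refs;
    int waited;                  /* times we blocked on this request */
    struct request *races_with;  /* demo only: completes at the same time */
    struct request *prev, *next;
};

struct client {
    struct request *oldest;
};

static void register_request(struct client *c, struct request *r)
{
    struct request *tail = c->oldest;

    while (tail && tail->next)
        tail = tail->next;
    r->prev = tail;
    r->next = NULL;
    if (tail)
        tail->next = r;
    else
        c->oldest = r;
    r->linked = true;
    r->refs = 1;                 /* reference held by the list */
    r->waited = 0;
}

static void unregister_request(struct client *c, struct request *r)
{
    if (!r->linked)
        return;
    if (r->prev)
        r->prev->next = r->next;
    else
        c->oldest = r->next;
    if (r->next)
        r->next->prev = r->prev;
    r->prev = r->next = NULL;
    r->linked = false;           /* lets a waiter detect the removal */
    r->refs--;
}

/* Stand-in for: unlock, wait for completion, lock again. */
static void wait_for_safe(struct client *c, struct request *r)
{
    r->waited++;
    unregister_request(c, r);
    if (r->races_with)
        unregister_request(c, r->races_with);
}

static void wait_unsafe_requests(struct client *c, unsigned long want_tid)
{
    struct request *req, *next;

restart:
    req = c->oldest;
    while (req && req->tid <= want_tid) {
        next = req->next;        /* read while req is still linked */
        if (req->is_write) {
            req->refs++;
            if (next)
                next->refs++;    /* keep next alive across the wait */
            wait_for_safe(c, req);
            req->refs--;
            if (!next)
                break;           /* there was nothing after req */
            next->refs--;
            if (!next->linked)
                goto restart;    /* next was removed too: start over */
        }
        req = next;
    }
}
```

The successor is read before the wait, never from the request that was waited on, because that request is always unlinked by the time the wait returns.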
  2. 27 Feb, 2010 1 commit
  3. 24 Feb, 2010 2 commits
  4. 18 Feb, 2010 1 commit
    • ceph: fix iterate_caps removal race · 7c1332b8
      Sage Weil committed
      We need to be able to iterate over all caps on a session with a
      possibly slow callback on each cap.  To allow this, we used to
      prevent cap reordering while we were iterating.  However, we were
      not safe from races with removal: removing the 'next' cap would
      make the next pointer from list_for_each_entry_safe be invalid,
      and cause a lock up or similar badness.
      
      Instead, we keep an iterator pointer in the session pointing to
      the current cap.  As before, we avoid reordering.  For removal,
      if the cap isn't the current cap we are iterating over, we are
      fine.  If it is, we clear cap->ci (to mark the cap as pending
      removal) but leave it in the session list.  In iterate_caps, we
      can safely finish removal and get the next cap pointer.
      
      While we're at it, clean up put_cap to not take a cap reservation
      context, as it was never used.
      Signed-off-by: Sage Weil <sage@newdream.net>
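A rough userspace sketch of the iterator-pointer scheme follows. It is an assumption-laden illustration: locking is omitted, `has_inode` stands in for `cap->ci` being non-NULL, and `demo_cb()` is a made-up callback that removes both the cap being visited and its successor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct cap {
    int id;
    bool has_inode;              /* stand-in for cap->ci != NULL */
    struct cap *prev, *next;
};

struct session {
    struct cap head;             /* sentinel of a circular list */
    struct cap *iter;            /* cap currently being visited, or NULL */
    int nr_caps;
};

static void session_init(struct session *s)
{
    s->head.prev = s->head.next = &s->head;
    s->iter = NULL;
    s->nr_caps = 0;
}

static void add_cap(struct session *s, struct cap *c, int id)
{
    c->id = id;
    c->has_inode = true;
    c->prev = s->head.prev;
    c->next = &s->head;
    s->head.prev->next = c;
    s->head.prev = c;
    s->nr_caps++;
}

static void unlink_cap(struct session *s, struct cap *c)
{
    c->prev->next = c->next;
    c->next->prev = c->prev;
    c->prev = c->next = c;
    s->nr_caps--;
}

static void remove_cap(struct session *s, struct cap *c)
{
    if (s->iter != c)
        unlink_cap(s, c);        /* not being visited: unlink right away */
    c->has_inode = false;        /* otherwise only mark it as pending removal */
}

typedef int (*cap_cb)(struct session *s, struct cap *c, void *arg);

static int iterate_caps(struct session *s, cap_cb cb, void *arg)
{
    struct cap *p = s->head.next;
    int ret = 0;

    while (p != &s->head) {
        struct cap *c = p;

        s->iter = c;
        ret = cb(s, c, arg);     /* may be slow and may remove caps */
        p = c->next;             /* valid: c itself stayed on the list */
        if (!c->has_inode)
            unlink_cap(s, c);    /* finish the deferred removal */
        if (ret < 0)
            break;
    }
    s->iter = NULL;
    return ret;
}

/* Demo callback: counts visits; on cap 1 it removes the next cap and cap 1. */
static int demo_cb(struct session *s, struct cap *c, void *arg)
{
    int *visited = arg;

    (*visited)++;
    if (c->id == 1) {
        if (c->next != &s->head)
            remove_cap(s, c->next);
        remove_cap(s, c);
    }
    return 0;
}
```

Because the next pointer is read from the current cap only after the callback returns, removing the old successor during the callback cannot leave the iterator holding a dangling pointer.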
  5. 17 Feb, 2010 2 commits
  6. 11 Feb, 2010 1 commit
  7. 30 Jan, 2010 1 commit
  8. 26 Jan, 2010 2 commits
    • ceph: allocate middle of message before starting to read · 2450418c
      Yehuda Sadeh committed
      Both the front and middle parts of the message are now
      allocated in ceph_alloc_msg().
      Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
    • ceph: properly handle aborted mds requests · 5b1daecd
      Sage Weil committed
      Previously, if the MDS request was interrupted, we would unregister the
      request and ignore any reply.  This could cause the caps or other cache
      state to become out of sync.  (For instance, aborting dbench and doing
      rm -r on clients would complain about a non-empty directory because the
      client didn't realize its aborted file create request had completed.)
      
      Even if we don't unregister, we still can't process the reply normally because
      we are no longer holding the caller's locks (like the dir i_mutex).
      
      So, mark aborted operations with r_aborted, and in the reply handler, be
      sure to process all the caps.  Do not process the namespace changes,
      though, since we no longer will hold the dir i_mutex.  The dentry lease
      state can also be ignored as it's more forgiving.
      Signed-off-by: Sage Weil <sage@newdream.net>
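The split between what is and is not processed for an aborted request can be illustrated with a small sketch. Only the `r_aborted` idea comes from the commit message; the structs and the `handle_reply()` shape below are hypothetical simplifications.

```c
#include <assert.h>
#include <stdbool.h>

struct request {
    bool aborted;            /* mirrors the r_aborted flag */
};

struct reply {
    int caps_issued;         /* capability bits carried by the reply */
    bool creates_dentry;     /* namespace change carried by the reply */
};

struct client_state {
    int caps;                /* caps we believe we hold */
    int dentries;            /* namespace entries we have instantiated */
};

static void handle_reply(struct client_state *st,
                         const struct request *req,
                         const struct reply *rep)
{
    /* Caps are always applied so client and MDS agree on what is held. */
    st->caps |= rep->caps_issued;

    /* Namespace updates need the caller's directory lock, which an
     * aborted caller has already dropped, so skip them. */
    if (req->aborted)
        return;

    if (rep->creates_dentry)
        st->dentries++;
}
```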
  9. 24 Dec, 2009 3 commits
  10. 22 Dec, 2009 2 commits
  11. 08 Dec, 2009 1 commit
  12. 21 Nov, 2009 2 commits
    • ceph: reset requested max_size after mds reconnect · 0dc2570f
      Sage Weil committed
      The max_size increase request to the MDS can get lost during an MDS
      restart and reconnect.  Reset our requested value after the MDS recovers,
      so that any blocked writes will re-request a larger max_size upon waking.
      
      Also, explicitly wake session caps after the reconnect.  Normally the cap
      renewal catches this, but not in the cases where the caps didn't go stale
      in the first place, which would leave writers waiting on max_size asleep.
      Signed-off-by: Sage Weil <sage@newdream.net>
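Why a stale "requested" value blocks writers can be seen in a small sketch. The field names and both helpers are assumptions chosen for illustration; the commit message only says that the requested value is reset after the MDS recovers.

```c
#include <assert.h>
#include <stdbool.h>

struct inode_caps {
    unsigned long long max_size;            /* limit granted by the MDS */
    unsigned long long wanted_max_size;     /* what writers need */
    unsigned long long requested_max_size;  /* largest value already asked for */
};

/* Returns true if a max_size request should be sent now. */
static bool maybe_request_max_size(struct inode_caps *ci)
{
    if (ci->wanted_max_size <= ci->max_size)
        return false;        /* the current limit is already enough */
    if (ci->wanted_max_size <= ci->requested_max_size)
        return false;        /* already asked; wait for the reply */
    ci->requested_max_size = ci->wanted_max_size;
    return true;
}

/* Called once the MDS has recovered: an in-flight request may be lost. */
static void reset_after_reconnect(struct inode_caps *ci)
{
    ci->requested_max_size = 0;
}
```

Without the reset, the second check keeps suppressing the request, and a writer woken after the reconnect would go back to sleep waiting for a reply that will never arrive.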
    • ceph: fix debugfs entry, simplify fsid checks · 0743304d
      Sage Weil committed
      We may first learn our fsid from any of the mon, osd, or mds maps
      (whichever the monitor sends first).  Consolidate checks in a single
      helper.  Initialize the client debugfs entry then, since we need the
      fsid (and global_id) for the directory name.
      
      Also remove dead mount code.
      Signed-off-by: Sage Weil <sage@newdream.net>
  13. 19 Nov, 2009 3 commits
  14. 12 Nov, 2009 1 commit
  15. 11 Nov, 2009 1 commit
  16. 10 Nov, 2009 1 commit
    • ceph: do not confuse stale and dead (unreconnected) caps · 685f9a5d
      Sage Weil committed
      We were using the cap_gen to track both stale caps (caps that timed out
      due to temporarily losing touch with the mds) and dead caps that did not
      reconnect after an MDS failure.  Introduce a recon_gen counter to track
      reconnections to restarted MDSs and kill dead caps based on that instead.
      
      Rename gen to cap_gen while we're at it to make it more clear which is
      which.
      Signed-off-by: Sage Weil <sage@newdream.net>
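One plausible reading of the two counters is sketched below. The commit message does not show how the kernel client compares them, so the comparison rules, the `cap_state` enum, and `classify_cap()` are assumptions made purely for illustration.

```c
#include <assert.h>

enum cap_state { CAP_VALID, CAP_STALE, CAP_DEAD };

struct session {
    unsigned cap_gen;    /* bumped each time the session's caps go stale */
    unsigned recon_gen;  /* bumped each time we reconnect to a restarted MDS */
};

struct cap {
    unsigned cap_gen;    /* session cap_gen when the cap was last issued */
    unsigned recon_gen;  /* session recon_gen when the cap last reconnected */
};

static enum cap_state classify_cap(const struct session *s, const struct cap *c)
{
    if (c->recon_gen != s->recon_gen)
        return CAP_DEAD;     /* missed the reconnect: must be dropped */
    if (c->cap_gen < s->cap_gen)
        return CAP_STALE;    /* timed out: a renewal can revive it */
    return CAP_VALID;
}
```

With a single counter the two conditions are indistinguishable, so a cap that merely timed out could be killed, or a cap that never reconnected could be treated as renewable.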
  17. 28 Oct, 2009 1 commit
  18. 16 Oct, 2009 1 commit
    • ceph: flush dirty caps via the cap_dirty list · afcdaea3
      Sage Weil committed
      Previously we were flushing dirty caps by passing an extra flag
      when traversing the delayed caps list.  Besides being a bit ugly,
      that can also miss caps that are dirty but didn't result in a
      cap requeue: notably, mark_caps_dirty().
      
      Separate the flushing into a separate helper, and traverse the
      cap_dirty list.
      
      This also brings i_dirty_item in line with i_dirty_caps: we are
      on the list IFF caps != 0.  We carry an inode ref IFF
      dirty_caps|flushing_caps != 0.
      
      Lose the unused return value from __ceph_mark_caps_dirty().
      Signed-off-by: Sage Weil <sage@newdream.net>
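The two "IFF" invariants can be written down as a tiny state sketch. It assumes that "caps != 0" in the message refers to the dirty caps mask; the struct and helper names are hypothetical, and real list and refcount handling is reduced to two booleans.

```c
#include <assert.h>
#include <stdbool.h>

struct inode_state {
    unsigned dirty_caps;
    unsigned flushing_caps;
    bool on_dirty_list;      /* stand-in for being linked on cap_dirty */
    bool holds_ref;          /* stand-in for the extra inode reference */
};

/* Re-establish both invariants after any change to the masks. */
static void sync_invariants(struct inode_state *i)
{
    i->on_dirty_list = i->dirty_caps != 0;
    i->holds_ref = (i->dirty_caps | i->flushing_caps) != 0;
}

static void mark_caps_dirty(struct inode_state *i, unsigned mask)
{
    i->dirty_caps |= mask;
    sync_invariants(i);
}

/* Moves all dirty bits to flushing and returns the bits being flushed. */
static unsigned start_flush(struct inode_state *i)
{
    unsigned flushing = i->dirty_caps;

    i->flushing_caps |= flushing;
    i->dirty_caps = 0;
    sync_invariants(i);
    return flushing;
}

static void finish_flush(struct inode_state *i, unsigned mask)
{
    i->flushing_caps &= ~mask;
    sync_invariants(i);
}
```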
  19. 15 Oct, 2009 1 commit
  20. 07 Oct, 2009 1 commit
    • ceph: MDS client · 2f2dc053
      Sage Weil committed
      The MDS (metadata server) client is responsible for submitting
      requests to the MDS cluster and parsing the response.  We decide which
      MDS to submit each request to based on cached information about the
      current partition of the directory hierarchy across the cluster.  A
      stateful session is opened with each MDS before we submit requests to
      it, and a mutex is used to control the ordering of messages within
      each session.
      
      An MDS request may generate two responses.  The first indicates the
      operation was a success and returns any result.  A second reply is
      sent when the operation commits to disk.  Note that locking on the MDS
      ensures that the results of updates are visible only to the updating
      client before the operation commits.  Requests are linked to the
      containing directory so that an fsync will wait for them to commit.
      
      If an MDS fails and/or recovers, we resubmit requests as needed.  We
      also reconnect existing capabilities to a recovering MDS to
      reestablish that shared session state.  Old dentry leases are
      invalidated.
      Signed-off-by: Sage Weil <sage@newdream.net>
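The two-reply lifecycle and its link to the containing directory can be sketched as a small state machine. Everything here is a hypothetical simplification: the state names, the singly linked per-directory list, and `dir_fsync_would_block()` are illustrative and not taken from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum req_state { REQ_SENT, REQ_UNSAFE, REQ_SAFE };

struct request {
    enum req_state state;
    int result;
    struct request *next_in_dir;   /* link in the parent directory's list */
};

struct dir {
    struct request *unsafe_ops;    /* requests not yet committed to disk */
};

static void submit(struct dir *d, struct request *r)
{
    r->state = REQ_SENT;
    r->result = 0;
    r->next_in_dir = d->unsafe_ops;
    d->unsafe_ops = r;
}

/* First reply: the result is usable by the caller but not yet durable. */
static void handle_unsafe_reply(struct request *r, int result)
{
    r->state = REQ_UNSAFE;
    r->result = result;
}

/* Second reply: the operation committed, so unlink it from the directory. */
static void handle_safe_reply(struct dir *d, struct request *r)
{
    struct request **pp = &d->unsafe_ops;

    while (*pp && *pp != r)
        pp = &(*pp)->next_in_dir;
    if (*pp)
        *pp = r->next_in_dir;
    r->next_in_dir = NULL;
    r->state = REQ_SAFE;
}

/* An fsync on the directory has to wait while this returns true. */
static bool dir_fsync_would_block(const struct dir *d)
{
    return d->unsafe_ops != NULL;
}
```

The caller can return to userspace as soon as the first reply arrives, while a later fsync on the directory is what waits for the second one.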