1. 07 Jan, 2011 3 commits
    • fs: dcache remove dcache_lock · b5c84bf6
      Committed by Nick Piggin
      dcache_lock no longer protects anything. Remove it.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
    • fs: dcache scale subdirs · 2fd6b7f5
      Committed by Nick Piggin
      Protect d_subdirs and d_child with d_lock, except in filesystems that aren't
      using dcache_lock for these anyway (e.g. using i_mutex).
      
      Note: if we change the locking rule in future so that ->d_child protection is
      provided only with ->d_parent->d_lock, it may allow us to reduce some locking.
      But it would be an exception to an otherwise regular locking scheme, so we'd
      have to see some good results. Probably not worthwhile.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
    • fs: dcache scale dentry refcount · b7ab39f6
      Committed by Nick Piggin
      Make d_count non-atomic and protect it with d_lock. This allows us to ensure a
      0 refcount dentry remains 0 without dcache_lock. It is also fairly natural when
      we start protecting many other dentry members with d_lock.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
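The locking rule in this commit can be illustrated in userspace. The following is a minimal sketch, not kernel code: `toy_dentry`, `toy_get`, `toy_put`, and `toy_lock_if_unused` are hypothetical names, and a C11 `atomic_flag` spinlock stands in for d_lock. It shows why a plain counter under a per-object lock lets a reclaimer observe a zero count and rely on it staying zero for as long as the lock is held.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dentry: a plain counter guarded by a lock. */
struct toy_dentry {
	atomic_flag lock;   /* plays the role of d_lock */
	unsigned int count; /* plays the role of d_count; touched only under lock */
};

static void toy_lock(struct toy_dentry *d)
{
	while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void toy_unlock(struct toy_dentry *d)
{
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* Take a reference. */
static void toy_get(struct toy_dentry *d)
{
	toy_lock(d);
	d->count++;
	toy_unlock(d);
}

/* Drop a reference; returns true if the count reached zero. */
static bool toy_put(struct toy_dentry *d)
{
	bool last;

	toy_lock(d);
	assert(d->count > 0);
	last = (--d->count == 0);
	toy_unlock(d);
	return last;
}

/*
 * Returns true with the lock still held if the object is unused.
 * Nobody can raise the count until the caller calls toy_unlock().
 */
static bool toy_lock_if_unused(struct toy_dentry *d)
{
	toy_lock(d);
	if (d->count == 0)
		return true;
	toy_unlock(d);
	return false;
}
```

With an atomic counter and no lock, a check for zero could be invalidated by a concurrent increment immediately afterwards; here the check and the teardown sit in one critical section.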
  2. 18 Nov, 2010 1 commit
  3. 10 Nov, 2010 1 commit
    • ceph: make page alignment explicit in osd interface · b7495fc2
      Committed by Sage Weil
      We used to infer alignment of IOs within a page based on the file offset,
      which assumed they matched.  This broke with direct IO that was not aligned
      to pages (e.g., 512-byte aligned IO).  We were also trusting the alignment
      specified in the OSD reply, which could have been adjusted by the server.
      
      Explicitly specify the page alignment when setting up OSD IO requests.
      Signed-off-by: Sage Weil <sage@newdream.net>
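To see why inferring alignment from the file offset breaks for unaligned direct IO, consider the arithmetic below. This is a sketch under stated assumptions: a 4096-byte page, and hypothetical helpers `inferred_alignment` and `pages_needed` that are not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u /* assumed page size for the example */

/* Old approach: guess the in-page offset of the data from the file offset. */
static uint32_t inferred_alignment(uint64_t file_off)
{
	return (uint32_t)(file_off & (TOY_PAGE_SIZE - 1));
}

/*
 * Number of pages spanned by len bytes that start page_align bytes into
 * the first page. With an explicit interface the caller passes the real
 * in-page offset of its buffer instead of a guess.
 */
static uint32_t pages_needed(uint32_t page_align, uint64_t len)
{
	return (uint32_t)((page_align + len + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE);
}
```

For a 4096-byte direct IO at file offset 512 whose buffer starts on a page boundary, the inferred alignment is 512 and implies two pages, while the buffer really occupies one page at offset 0. Passing the alignment explicitly removes that mismatch.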
  4. 09 Nov, 2010 2 commits
    • ceph: fix update of ctime from MDS · d8672d64
      Committed by Sage Weil
      The client can have a newer ctime than the MDS due to AUTH_EXCL and
      XATTR_EXCL caps as well; update the check in ceph_fill_file_time
      appropriately.
      
      This fixes cases where ctime/mtime goes backward under the right sequence
      of local updates (e.g. chmod) and mds replies (e.g. subsequent stat that
      goes to the MDS).
      Signed-off-by: Sage Weil <sage@newdream.net>
    • ceph: fix version check on racing inode updates · 8bd59e01
      Committed by Sage Weil
      We may get updates on the same inode from multiple MDSs; generally we only
      pay attention if the update is newer than what we already have.  The
      exception is when an MDS sends unstable information, in which case we
      always update.
      
      The old > check got this wrong when our version was odd (e.g. 3) and the
      reply version was even (e.g. 2): the older stale (v2) info would be
      applied.  Fixed and clarified the comment.
      Signed-off-by: Sage Weil <sage@newdream.net>
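The off-by-one described in this commit can be reproduced with a small comparison. The sketch below assumes that the low bit of the version marks unstable information (odd) versus stable (even) and that the check masks that bit off our own version before comparing; both points are inferred from the example in the message, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the reply should be ignored as not newer than ours. */

/* Old check (as inferred): strict greater-than after masking the low bit. */
static bool skip_update_old(uint64_t have, uint64_t got)
{
	return (have & ~(uint64_t)1) > got;
}

/* Fixed check (as inferred): greater-or-equal after masking the low bit. */
static bool skip_update_new(uint64_t have, uint64_t got)
{
	return (have & ~(uint64_t)1) >= got;
}
```

With ours at 3 and the reply at 2, the masked value is 2. The strict comparison 2 > 2 is false, so the stale v2 reply was applied; 2 >= 2 is true, so it is now skipped. An odd reply at version 3 is still applied because 2 >= 3 is false.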
  5. 08 Nov, 2010 3 commits
    • ceph: fix rdcache_gen usage and invalidate · cd045cb4
      Committed by Sage Weil
      We used to use rdcache_gen to indicate whether we "might" have cached
      pages.  Now we just look at the mapping to determine that.  However, some
      old behavior remains from that transition.
      
      First, rdcache_gen == 0 no longer means we have no pages.  That can happen
      at any time (presumably when we carry FILE_CACHE).  We should not reset it
      to zero, and we should not check that it is zero.
      
      That means that the only purpose for rdcache_revoking is to resolve races
      between new issues of FILE_CACHE and an async invalidate.  If they are
      equal, we should invalidate.  On success, we decrement rdcache_revoking,
      so that it is no longer equal to rdcache_gen.  Similarly, if we succeed
      in doing a sync invalidate, set revoking = gen - 1.  (This is a small
      optimization to avoid doing unnecessary invalidate work and does not
      affect correctness.)
      Signed-off-by: Sage Weil <sage@newdream.net>
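The interplay of the two counters can be modelled as a small state machine. This is only a sketch of the rules stated in the message: `toy_inode` and the helper names are hypothetical, and two details are assumptions not spelled out above, namely that rdcache_gen is bumped on each new FILE_CACHE issue and that queuing an async invalidate records the current generation in rdcache_revoking.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_inode {
	unsigned int rdcache_gen;      /* assumed: bumped on each FILE_CACHE issue */
	unsigned int rdcache_revoking; /* assumed: gen recorded when invalidate was queued */
	bool has_pages;                /* stand-in for "the mapping has cached pages" */
};

static void issue_file_cache(struct toy_inode *ci)
{
	ci->rdcache_gen++;
}

static void queue_async_invalidate(struct toy_inode *ci)
{
	ci->rdcache_revoking = ci->rdcache_gen;
}

/* Deferred worker; returns true if it dropped the cached pages. */
static bool async_invalidate_work(struct toy_inode *ci)
{
	if (ci->rdcache_revoking != ci->rdcache_gen)
		return false;       /* FILE_CACHE was reissued meanwhile; keep pages */
	ci->has_pages = false;
	ci->rdcache_revoking--; /* no longer equal to rdcache_gen */
	return true;
}

/* Successful synchronous invalidate makes a pending async pass a no-op. */
static void sync_invalidate(struct toy_inode *ci)
{
	ci->has_pages = false;
	ci->rdcache_revoking = ci->rdcache_gen - 1;
}
```

Neither counter is ever compared with zero, which matches the point that rdcache_gen == 0 no longer carries any meaning.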
    • ceph: only let auth caps update max_size · 912a9b03
      Committed by Sage Weil
      Only the auth MDS has a meaningful max_size value for us, so only update it
      in fill_inode if we're being issued an auth cap.  Otherwise, a random
      stat result from a non-auth MDS can clobber a meaningful max_size, get
      the client<->mds cap state out of sync, and make writes hang.
      
      Specifically, even if the client re-requests a larger max_size (which it
      will), the MDS won't respond because as far as it knows we already have a
      sufficiently large value.
      Signed-off-by: Sage Weil <sage@newdream.net>
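The guard described here amounts to a single condition. A minimal sketch follows, with a hypothetical `toy_inode_state` and `fill_max_size` standing in for the relevant part of fill_inode:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_inode_state {
	uint64_t max_size; /* largest size the client may write up to */
};

/* Apply max_size from a reply only when it comes with an auth cap. */
static void fill_max_size(struct toy_inode_state *ci,
			  uint64_t reply_max_size, bool is_auth_cap)
{
	if (is_auth_cap)
		ci->max_size = reply_max_size;
	/* otherwise keep the value previously granted by the auth MDS */
}
```

A stat reply from a non-auth MDS then leaves the previously granted max_size untouched, so client and MDS keep the same view of it.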
    • ceph: fix bad pointer dereference in ceph_fill_trace · d8b16b3d
      Committed by Sage Weil
      We dereference *in a few lines down, but only set it on rename.  It is
      apparently pretty rare for this to trigger, but I have been hitting it
      with clustered MDSs.
      Signed-off-by: Sage Weil <sage@newdream.net>
  6. 21 Oct, 2010 1 commit
    • ceph: factor out libceph from Ceph file system · 3d14c5d2
      Committed by Yehuda Sadeh
      This factors out protocol and low-level storage parts of ceph into a
      separate libceph module living in net/ceph and include/linux/ceph.  This
      is mostly a matter of moving files around.  However, a few key pieces
      of the interface change as well:
      
       - ceph_client becomes ceph_fs_client and ceph_client, where the latter
         captures the mon and osd clients, and the fs_client gets the mds client
         and file system specific pieces.
       - Mount option parsing and debugfs setup is correspondingly broken into
         two pieces.
       - The mon client gets a generic handler callback for otherwise unknown
         messages (mds map, in this case).
       - The basic supported/required feature bits can be expanded (and are by
         ceph_fs_client).
      
      No functional change, aside from some subtle error handling cases that got
      cleaned up in the refactoring process.
      Signed-off-by: Sage Weil <sage@newdream.net>
  7. 14 Sep, 2010 1 commit
  8. 26 Aug, 2010 1 commit
  9. 23 Aug, 2010 1 commit
  10. 02 Aug, 2010 1 commit
  11. 28 Jul, 2010 1 commit
  12. 24 Jul, 2010 1 commit
  13. 22 Jun, 2010 1 commit
  14. 02 Jun, 2010 1 commit
  15. 30 May, 2010 1 commit
    • fs/ceph: Use ERR_CAST · 7e34bc52
      Committed by Julia Lawall
      Use ERR_CAST(x) rather than ERR_PTR(PTR_ERR(x)).  The former makes the
      purpose of the operation clearer; the latter otherwise looks like a
      no-op.
      
      In the case of fs/ceph/inode.c, ERR_CAST is not needed, because the type of
      the returned value is the same as the type of the enclosing function.
      
      The semantic patch that makes this change is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      type T;
      T x;
      identifier f;
      @@
      
      T f (...) { <+...
      - ERR_PTR(PTR_ERR(x))
      + x
       ...+> }
      
      @@
      expression x;
      @@
      
      - ERR_PTR(PTR_ERR(x))
      + ERR_CAST(x)
      // </smpl>
      Signed-off-by: Julia Lawall <julia@diku.dk>
      Signed-off-by: Sage Weil <sage@newdream.net>
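For readers unfamiliar with the error-pointer idiom, the following userspace sketch mimics it. The `toy_*` helpers are hypothetical re-creations written for this example, not the definitions from the kernel's err.h; they assume errno values fit in the top 4095 addresses of the pointer range.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno as a pointer value. */
static inline void *toy_err_ptr(long err) { return (void *)err; }

/* Decode the errno from an error pointer. */
static inline long toy_ptr_err(const void *p) { return (long)p; }

/* Error pointers occupy the last TOY_MAX_ERRNO addresses. */
static inline bool toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Re-type an error pointer without decoding and re-encoding it. */
static inline void *toy_err_cast(const void *p) { return (void *)p; }

struct toy_inode { int ino; };
struct toy_dentry { int id; };

/* Hypothetical helper that always fails, to exercise the error path. */
static struct toy_inode *toy_get_inode(int ino)
{
	(void)ino;
	return toy_err_ptr(-ENOMEM);
}

static struct toy_dentry *toy_lookup(int ino)
{
	static struct toy_dentry d;
	struct toy_inode *in = toy_get_inode(ino);

	if (toy_is_err(in))
		return toy_err_cast(in); /* rather than toy_err_ptr(toy_ptr_err(in)) */
	d.id = in->ino;
	return &d;
}
```

The cast form states the intent directly: the error value is passed through unchanged and only the pointer type differs. When the types already match, as in the first rule of the semantic patch, returning x itself is enough.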
  16. 18 May, 2010 7 commits
  17. 12 May, 2010 1 commit
  18. 04 May, 2010 1 commit
  19. 31 Mar, 2010 1 commit
    • ceph: fix dentry rehashing on virtual .snap dir · 9358c6d4
      Committed by Sage Weil
      If a lookup fails on the magic .snap directory, we bind it to a magic
      snap directory inode in ceph_lookup_finish().  That code assumes the dentry
      is unhashed, but a recent server-side change started returning NULL leases
      on lookup failure, causing the .snap dentry to be hashed and NULL by
      ceph_fill_trace().
      
      This causes dentry hash chain corruption, or a crash when d_rehash()
      includes
      	BUG_ON(!d_unhashed(entry));
      
      So, avoid processing the NULL dentry lease if the dentry matches the
      snapdir name in ceph_fill_trace().  That allows the lookup completion to
      properly bind it to the snapdir inode.  BUG there if dentry is hashed to
      be sure.
      Signed-off-by: Sage Weil <sage@newdream.net>
  20. 21 Mar, 2010 1 commit
    • ceph: fix inode removal from snap realm when racing with migration · 8b218b8a
      Committed by Sage Weil
      When an inode was dropped while being migrated between two MDSs,
      i_cap_exporting_issued was non-zero such that issued caps were non-zero and
      __ceph_is_any_caps(ci) was true.  This prevented the inode from being
      removed from the snap realm, even as it was dropped from the cache.
      
      Fix this by dropping any residual i_snap_realm ref in destroy_inode.
      Signed-off-by: Sage Weil <sage@newdream.net>
  21. 20 Feb, 2010 1 commit
  22. 18 Feb, 2010 1 commit
  23. 12 Feb, 2010 2 commits
  24. 30 Jan, 2010 1 commit
  25. 26 Jan, 2010 1 commit
    • ceph: properly handle aborted mds requests · 5b1daecd
      Committed by Sage Weil
      Previously, if the MDS request was interrupted, we would unregister the
      request and ignore any reply.  This could cause the caps or other cache
      state to become out of sync.  (For instance, aborting dbench and doing
      rm -r on clients would complain about a non-empty directory because the
      client didn't realize its aborted file create request completed.)
      
      Even if we don't unregister, we still can't process the reply normally because
      we are no longer holding the caller's locks (like the dir i_mutex).
      
      So, mark aborted operations with r_aborted, and in the reply handler, be
      sure to process all the caps.  Do not process the namespace changes,
      though, since we no longer will hold the dir i_mutex.  The dentry lease
      state can also be ignored as it's more forgiving.
      Signed-off-by: Sage Weil <sage@newdream.net>
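The split between work that must always happen and work that requires the caller's locks can be sketched as follows. Everything except the r_aborted flag name is hypothetical: the struct, the counters, and the function names exist only to make the control flow visible.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_request {
	bool r_aborted;        /* set when the waiting caller was interrupted */
	int caps_processed;    /* counts cap updates applied from replies */
	int namespace_updates; /* counts dentry/namespace changes applied */
};

/* Called when the caller stops waiting; the request stays registered. */
static void toy_abort_request(struct toy_request *req)
{
	req->r_aborted = true;
}

/* Reply handler: caps are always applied, namespace changes only if safe. */
static void toy_handle_reply(struct toy_request *req)
{
	req->caps_processed++; /* keep cap state in sync with the MDS */

	if (req->r_aborted)
		return; /* the caller no longer holds the dir i_mutex */

	req->namespace_updates++;
}
```

Keeping the request registered is what lets the reply reach the handler at all; the flag then limits the handler to the part that does not depend on the caller's locks.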
  26. 15 Jan, 2010 1 commit
  27. 22 Dec, 2009 1 commit
  28. 08 Dec, 2009 1 commit