1. 29 Apr, 2014 1 commit
    • Y
      ceph: clear directory's completeness when creating file · 0a8a70f9
      Yan, Zheng committed
      When creating a file, ceph_set_dentry_offset() puts the new dentry
      at the end of the directory's d_subdirs, then sets the dentry's offset
      based on the directory's max offset. The offset does not reflect the
      real position of the dentry in the directory. A later readdir reply from
      the MDS may change the dentry's position/offset. This inconsistency
      can cause missing or duplicate entries in the readdir result if readdir
      is partly satisfied by dcache_readdir().
      
      The fix is to clear the directory's completeness after creating or
      renaming a file. This prevents later readdir calls from using
      dcache_readdir().
      
      Fixes: http://tracker.ceph.com/issues/8025
      Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
      Reviewed-by: Sage Weil <sage@inktank.com>
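The completeness rule described in this commit can be pictured with a small toy model. This is only a sketch, not the kernel code: `toy_dir`, `toy_readdir()` and `toy_create()` are hypothetical names, and the real client tracks completeness on the directory inode rather than in a plain struct.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not kernel code: every name here is hypothetical. */
struct toy_dir {
    bool complete;   /* cached entries are trusted to match the MDS view */
    size_t ncached;  /* number of dentries currently cached */
};

enum toy_readdir_source { TOY_FROM_DCACHE, TOY_FROM_MDS };

/* readdir may be answered from the dcache only while the directory is complete. */
static enum toy_readdir_source toy_readdir(const struct toy_dir *d)
{
    return d->complete ? TOY_FROM_DCACHE : TOY_FROM_MDS;
}

/*
 * Creating (or renaming) an entry gives the new dentry a locally chosen
 * offset that may not match the MDS ordering, so completeness is cleared.
 */
static void toy_create(struct toy_dir *d)
{
    d->ncached++;
    d->complete = false;
}
```

After `toy_create()`, the next `toy_readdir()` falls back to the MDS instead of mixing cached offsets with offsets from a later MDS reply.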
  2. 05 Apr, 2014 3 commits
  3. 03 Apr, 2014 2 commits
  4. 30 Jan, 2014 1 commit
  5. 29 Jan, 2014 1 commit
    • L
      ceph: Fix up after semantic merge conflict · 4db658ea
      Linus Torvalds committed
      The previous ceph-client merge resulted in ceph not even building,
      because there was a merge conflict that wasn't visible as an actual data
      conflict: commit 7221fe4c ("ceph: add acl for cephfs") added support
      for POSIX ACL's into Ceph, but unluckily we also had the VFS tree change
      a lot of the POSIX ACL helper functions to be much more helpful to
      filesystems (see for example commits 2aeccbe9 "fs: add generic
      xattr_acl handlers", 5bf3258f "fs: make posix_acl_chmod more useful"
      and 37bc1539 "fs: make posix_acl_create more useful")
      
      The reason this conflict wasn't obvious was many-fold: because it was a
      semantic conflict rather than a data conflict, it wasn't visible in the
      git merge as a conflict.  And because the VFS tree hadn't been in
      linux-next, people hadn't become aware of it that way.  And because I
      was at jury duty this morning, I was using my laptop and as a result not
      doing constant "allmodconfig" builds.
      
      Anyway, this fixes the build and generally removes a fair chunk of the
      Ceph POSIX ACL support code, since the improved helpers seem to match
      really well for Ceph too.  But I don't actually have any way to *test*
      the end result, and I was really hoping for some ACK's for this.  Oh,
      well.
      
      Not compiling certainly doesn't make things easier to test, so I'm
      committing this without the acks after having waited for four hours...
      Plus it's what I would have done for the merge had I noticed the
      semantic conflict..
      Reported-by: Dave Jones <davej@redhat.com>
      Cc: Sage Weil <sage@inktank.com>
      Cc: Guangliang Zhao <lucienchao@gmail.com>
      Cc: Li Wang <li.wang@ubuntykylin.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 21 Jan, 2014 2 commits
    • Y
      ceph: add imported caps when handling cap export message · 11df2dfb
      Yan, Zheng committed
      Version 3 cap export message includes information about the imported
      caps. It allows us to add the imported caps if the corresponding cap
      import message still hasn't been received.
      
      This allows us to handle the situation where the importer MDS crashes
      and the cap import message is missing.
      Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
    • Y
      ceph: fix cache revoke race · 9563f88c
      Yan, Zheng committed
      Handle the following sequence of events:
      
      - non-auth MDS revokes Fc cap. queue invalidate work
      - auth MDS issues Fc cap through request reply. i_rdcache_gen gets
        increased.
      - invalidate work runs. it finds i_rdcache_revoking != i_rdcache_gen,
        so it does nothing.
      Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
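The event sequence in this commit can be replayed with a toy generation counter. The sketch below is a simplified model that only reproduces the race; it does not show the fix. The field names mirror `i_rdcache_gen` and `i_rdcache_revoking` from the message, while the `toy_` functions and the `cache_valid` flag are hypothetical.

```c
#include <stdbool.h>

/* Toy model of the generation check, not kernel code. */
struct toy_inode {
    unsigned rdcache_gen;       /* bumped whenever the Fc cap is (re)issued */
    unsigned rdcache_revoking;  /* generation recorded when a revoke was queued */
    bool cache_valid;           /* stands in for the cached pages */
};

/* Non-auth MDS revokes Fc: remember the generation and queue the work. */
static void toy_queue_invalidate(struct toy_inode *in)
{
    in->rdcache_revoking = in->rdcache_gen;
}

/* Auth MDS issues Fc through a request reply: the generation increases. */
static void toy_issue_fc(struct toy_inode *in)
{
    in->rdcache_gen++;
}

/* The queued work runs only if the generation has not moved meanwhile. */
static bool toy_invalidate_work(struct toy_inode *in)
{
    if (in->rdcache_revoking != in->rdcache_gen)
        return false;           /* generations differ: do nothing */
    in->cache_valid = false;
    return true;
}
```

Calling `toy_queue_invalidate()`, then `toy_issue_fc()`, then `toy_invalidate_work()` leaves `cache_valid` set, which matches the last step of the sequence above.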
  7. 01 Jan, 2014 1 commit
  8. 14 Dec, 2013 2 commits
  9. 01 Oct, 2013 2 commits
  10. 07 Sep, 2013 2 commits
  11. 16 Aug, 2013 1 commit
    • Y
      ceph: introduce i_truncate_mutex · b0d7c223
      Yan, Zheng committed
      I encountered the deadlock below when running fsstress:
      
      wmtruncate work      truncate                 MDS
      ---------------  ------------------  --------------------------
                         lock i_mutex
                                            <- truncate file
      lock i_mutex (blocked)
                                            <- revoking Fcb (filelock to MIX)
                         send request ->
                                               handle request (xlock filelock)
      
      At the initial time, there are some dirty pages in the page cache.
      When the kclient receives the truncate message, it reduces inode size
      and creates some 'out of i_size' dirty pages. wmtruncate work can't
      truncate these dirty pages because it's blocked by the i_mutex. Later
      when the kclient receives the cap message that revokes Fcb caps, it
      can't flush all dirty pages because writepages() only flushes dirty
      pages within the inode size.
      
      When the MDS handles the 'truncate' request from kclient, it waits
      for the filelock to become stable. But the filelock is stuck in
      unstable state because it can't finish revoking kclient's Fcb caps.
      
      The truncate pagecache locking has already caused lots of trouble
      for us. I think it's time to simplify it by introducing a new mutex.
      We use the new mutex to prevent concurrent truncate_inode_pages().
      There is no need to worry about race between buffered write and
      truncate_inode_pages(), because our "get caps" mechanism prevents
      them from concurrent execution.
      Reviewed-by: Sage Weil <sage@inktank.com>
      Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
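The idea of a dedicated truncate mutex can be sketched in user space with pthreads. This is an illustration under assumptions, not the kernel patch: `toy_inode`, `truncate_mutex` and `toy_truncate_pagecache()` are hypothetical stand-ins, and page-cache state is reduced to a byte count.

```c
#include <pthread.h>

/* Toy model, not kernel code: field and function names are hypothetical. */
struct toy_inode {
    pthread_mutex_t i_mutex;         /* held by the truncate caller */
    pthread_mutex_t truncate_mutex;  /* serializes page-cache truncation only */
    long size;                       /* current inode size in bytes */
    long cached_bytes;               /* extent of data held in the page cache */
};

/*
 * Drop cached data beyond the inode size. Only truncate_mutex is taken,
 * so this can make progress while another context holds i_mutex.
 */
static void toy_truncate_pagecache(struct toy_inode *in)
{
    pthread_mutex_lock(&in->truncate_mutex);
    if (in->cached_bytes > in->size)
        in->cached_bytes = in->size;
    pthread_mutex_unlock(&in->truncate_mutex);
}
```

Because the worker never waits on `i_mutex`, the 'out of i_size' dirty pages from the scenario above can be dropped even while a truncate caller is blocked waiting for the MDS.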
  12. 10 Aug, 2013 2 commits
  13. 05 Jul, 2013 1 commit
  14. 04 Jul, 2013 1 commit
  15. 02 May, 2013 4 commits
  16. 26 Feb, 2013 1 commit
  17. 12 Feb, 2013 2 commits
  18. 13 Dec, 2012 1 commit
  19. 27 Sep, 2012 1 commit
  20. 22 Aug, 2012 1 commit
  21. 22 Mar, 2012 1 commit
  22. 11 Jan, 2012 1 commit
  23. 04 Jan, 2012 1 commit
    • A
      vfs: fix the stupidity with i_dentry in inode destructors · 6b520e05
      Al Viro committed
      Seeing that just about every destructor got that INIT_LIST_HEAD() copied into
      it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once();
      the cost of taking it into inode_init_always() will be negligible for pipes
      and sockets and negative for everything else.  Not to mention the removal of
      boilerplate code from ->destroy_inode() instances...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
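The move of the list initialisation from the once-per-object constructor to the per-allocation initialiser can be sketched as follows. This is a toy model with hypothetical `toy_` names; the real functions are `inode_init_once()`, `inode_init_always()` and the filesystems' `->destroy_inode()` instances, and the real list type is the kernel's `struct list_head`.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct toy_list_head { struct toy_list_head *next, *prev; };

static void toy_init_list_head(struct toy_list_head *h) { h->next = h; h->prev = h; }
static bool toy_list_empty(const struct toy_list_head *h) { return h->next == h; }
static void toy_list_add(struct toy_list_head *n, struct toy_list_head *h)
{
    n->next = h->next;
    n->prev = h;
    h->next->prev = n;
    h->next = n;
}

struct toy_inode { struct toy_list_head i_dentry; };

/* Runs once when the object is first constructed: no longer touches i_dentry. */
static void toy_inode_init_once(struct toy_inode *in) { (void)in; }

/* Runs on every allocation: now owns the i_dentry initialisation. */
static void toy_inode_init_always(struct toy_inode *in)
{
    toy_init_list_head(&in->i_dentry);
}

/* Destructor: needs no list re-initialisation boilerplate any more. */
static void toy_destroy_inode(struct toy_inode *in) { (void)in; }
```

With this split, a recycled object always starts with an empty `i_dentry` list regardless of what its previous user left behind.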
  24. 08 Dec, 2011 1 commit
    • S
      ceph: use i_ceph_lock instead of i_lock · be655596
      Sage Weil committed
      We have been using i_lock to protect all kinds of data structures in the
      ceph_inode_info struct, including lists of inodes that we need to iterate
      over while avoiding races with inode destruction.  That requires grabbing
      a reference to the inode with the list lock protected, but igrab() now
      takes i_lock to check the inode flags.
      
      Changing the list lock ordering would be a painful process.
      
      However, using a ceph-specific i_ceph_lock in the ceph inode instead of
      i_lock is a simple mechanical change and avoids the ordering constraints
      imposed by igrab().
      Reported-by: Amon Ott <a.ott@m-privacy.de>
      Signed-off-by: Sage Weil <sage@newdream.net>
  25. 06 Nov, 2011 2 commits
  26. 02 Nov, 2011 1 commit
  27. 26 Oct, 2011 1 commit
    • S
      Revert "ceph: don't truncate dirty pages in invalidate work thread" · 83eaea22
      Sage Weil committed
      This reverts commit c9af9fb6.
      
      We need to block and truncate all pages in order to reliably invalidate
      them.  Otherwise, we could:
      
       - have some uptodate pages in the cache
       - queue an invalidate
       - write(2) locks some pages
       - invalidate_work skips them
       - write(2) only overwrites part of the page
       - page now dirty and uptodate
       -> partial leakage of invalidated data
      
      It's not entirely clear why we started skipping locked pages in the first
      place.  I just ran this through fsx and didn't see any problems.
      Signed-off-by: Sage Weil <sage@newdream.net>
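The leak scenario listed in this revert can be replayed with a toy page. The sketch is single-threaded and purely illustrative: `toy_page`, `toy_invalidate_skip_locked()` and `toy_write_partial()` are hypothetical names, and an 8-byte buffer stands in for a real page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 8

/* Toy page, not kernel code: all names are hypothetical. */
struct toy_page {
    char data[TOY_PAGE_SIZE];
    bool uptodate;  /* contents are considered valid */
    bool locked;    /* a writer currently holds the page */
    bool dirty;     /* contents must be written back */
};

/* The reverted behaviour: invalidation leaves locked pages untouched. */
static void toy_invalidate_skip_locked(struct toy_page *p)
{
    if (p->locked)
        return;
    p->uptodate = false;
    memset(p->data, 0, TOY_PAGE_SIZE);
}

/*
 * Partial overwrite of a page. The caller is responsible for setting and
 * clearing p->locked around the write. The page is already uptodate, so
 * the bytes past len are kept as they are.
 */
static void toy_write_partial(struct toy_page *p, const char *buf, size_t len)
{
    if (len > TOY_PAGE_SIZE)
        len = TOY_PAGE_SIZE;
    memcpy(p->data, buf, len);
    p->dirty = true;
}
```

If the write locks the page before the invalidation runs, the page ends up dirty and uptodate while its tail still holds the data that should have been invalidated.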