1. 20 Feb, 2017 2 commits
    • ceph: clean up unsafe d_parent accesses in build_dentry_path · c6b0b656
      Committed by Jeff Layton
      While we hold a reference to the dentry when build_dentry_path is
      called, we could end up racing with a rename that changes d_parent.
      Handle that situation correctly, by using the rcu_read_lock to
      ensure that the parent dentry and inode stick around long enough
      to safely check ceph_snap and ceph_ino.
      
      Link: http://tracker.ceph.com/issues/18148
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Reviewed-by: Yan, Zheng <zyan@redhat.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • ceph: clean up unsafe d_parent access in __choose_mds · 30c71233
      Committed by Jeff Layton
      __choose_mds exists to pick an MDS to use when issuing a call. Doing
      that typically involves picking an inode and using the authoritative
      MDS for it. In most cases, that's pretty straightforward, as we are
      using an inode to which we hold a reference (usually represented by
      r_dentry or r_inode in the request).
      
      In the case of a snapshotted directory however, we need to fetch
      the non-snapped parent, which involves walking back up the parents
      in the tree. The dentries in the snapshot dir are effectively frozen
      but the overall parent is _not_, and could vanish if a concurrent
      rename were to occur.
      
      Clean this code up and take special care to ensure the validity of
      the entries we're working with. First, try to use the inode in
      r_locked_dir if one exists. If not and all we have is r_dentry,
      then we have to walk back up the tree. Use the rcu_read_lock for
      this so we can ensure that any d_parent we find won't go away, and
      take extra care to deal with the possibility that the dentries could
      go negative.
      
      Change get_nonsnap_parent to return an inode, and take a reference to
      that inode before returning (if any). Change all of the other places
      where we set "inode" in __choose_mds to also take a reference, and then
      call iput on that inode before exiting the function.
      
      Link: http://tracker.ceph.com/issues/18148
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Reviewed-by: Yan, Zheng <zyan@redhat.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
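The reference-taking walk described in the __choose_mds commit can be modeled in plain userspace C. This is only a sketch under stated assumptions, not the kernel code: `snap_inode`, `snap_dentry`, `snap_igrab`, `snap_iput`, and `snap_get_nonsnap_parent` are hypothetical stand-ins, the reference count is a plain integer, and the RCU read-side section that the commit relies on is represented only by a comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel's inode and dentry. */
struct snap_inode {
    int refcount;  /* simplified reference count */
    int is_snap;   /* nonzero if the inode belongs to a snapshot */
};

struct snap_dentry {
    struct snap_dentry *parent;  /* the root points to itself */
    struct snap_inode *inode;    /* NULL models a negative dentry */
};

/* Take a reference; tolerates NULL so callers can pass results through. */
static struct snap_inode *snap_igrab(struct snap_inode *inode)
{
    if (inode)
        inode->refcount++;
    return inode;
}

/* Drop a reference taken with snap_igrab(). */
static void snap_iput(struct snap_inode *inode)
{
    if (inode) {
        assert(inode->refcount > 0);
        inode->refcount--;
    }
}

/*
 * Walk up from dentry until a non-snapshot inode is found and return it
 * with a reference held. Returns NULL if a negative dentry is met or the
 * root is reached first. In the kernel, a walk like this would have to
 * run under rcu_read_lock() so that each parent stays valid while it is
 * being inspected; this model has no concurrency.
 */
static struct snap_inode *snap_get_nonsnap_parent(struct snap_dentry *dentry)
{
    while (dentry && dentry->parent != dentry) {
        struct snap_inode *inode = dentry->inode;

        if (!inode)
            return NULL;               /* dentry went negative */
        if (!inode->is_snap)
            return snap_igrab(inode);  /* caller must snap_iput() */
        dentry = dentry->parent;
    }
    return NULL;
}
```

The point of returning a referenced inode is that the caller can keep using it after the walk ends, and must balance it with a single put before leaving the function.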
  2. 19 Jan, 2017 4 commits
  3. 13 Jan, 2017 2 commits
  4. 15 Dec, 2016 2 commits
    • libceph: always signal completion when done · c297eb42
      Committed by Ilya Dryomov
      r_safe_completion is currently, and has always been, signaled only if
      on-disk ack was requested.  It's there for fsync and syncfs, which wait
      for in-flight writes to flush - all data write requests set ONDISK.
      
      However, the pool perm check code introduced in 4.2 sends a write
      request with only ACK set.  An unfortunately timed syncfs can then hang
      forever: r_safe_completion won't be signaled because only an unsafe
      reply was requested.
      
      We could patch ceph_osdc_sync() to skip !ONDISK write requests, but
      that is somewhat incomplete and yet another special case.  Instead,
      rename this completion to r_done_completion and always signal it when
      the OSD client is done with the request, whether unsafe, safe, or
      error.  This is a bit cleaner and helps with the cancellation code.
      
      Reported-by: Yan, Zheng <zyan@redhat.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
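The difference between the old and new signaling rules in the libceph commit can be illustrated with a small userspace model. This is a sketch, not the libceph implementation: `osd_request`, `osd_finish_request`, and the `OSD_FLAG_*` values are hypothetical, and the completions are reduced to booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values; the real kernel constants are not reproduced. */
enum {
    OSD_FLAG_ACK    = 1 << 0,  /* "unsafe" ack requested */
    OSD_FLAG_ONDISK = 1 << 1   /* on-disk ("safe") ack requested */
};

struct osd_request {
    unsigned int flags;
    bool safe_signaled;  /* models the old r_safe_completion */
    bool done_signaled;  /* models the new r_done_completion */
};

/* Called once the client is finished with the request, for any outcome. */
static void osd_finish_request(struct osd_request *req)
{
    assert(req != NULL);

    /* Old rule: signal only if an on-disk ack was requested. */
    if (req->flags & OSD_FLAG_ONDISK)
        req->safe_signaled = true;

    /* New rule: always signal, whether unsafe, safe, or error. */
    req->done_signaled = true;
}
```

In this model, a waiter on `safe_signaled` never wakes for an ACK-only write, which corresponds to the syncfs hang described above, while a waiter on `done_signaled` always does.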
    • ceph: avoid creating orphan object when checking pool permission · 80e80fbb
      Committed by Yan, Zheng
      The pool permission check needs to write to the first object. For a
      snapshot, however, the head of the first object may already have been
      deleted. Skip the check for snapshot inodes to avoid creating an
      orphan object.
      
      Link: http://tracker.ceph.com/issues/18211
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
  5. 13 Dec, 2016 13 commits
  6. 11 Dec, 2016 1 commit
  7. 09 Dec, 2016 1 commit
  8. 08 Dec, 2016 1 commit
    • ceph: don't set req->r_locked_dir in ceph_d_revalidate · c3f4688a
      Committed by Jeff Layton
      This function sets req->r_locked_dir which is supposed to indicate to
      ceph_fill_trace that the parent's i_rwsem is locked for write.
      Unfortunately, there is no guarantee that the dir will be locked when
      d_revalidate is called, so we really don't want ceph_fill_trace to do
      any dcache manipulation from this context. Clear req->r_locked_dir since
      it's clearly not safe to do that.
      
      What we really want to know with d_revalidate is whether the dentry
      still points to the same inode. ceph_fill_trace installs a pointer to
      the inode in req->r_target_inode, so we can just compare that to
      d_inode(dentry) to see if it's the same one after the lookup.
      
      Also, since we aren't generally interested in the parent here, we can
      switch to using a GETATTR to hint that to the MDS, which also means that
      we only need to reserve one cap.
      
      Finally, just remove the d_unhashed check. That's really outside the
      purview of a filesystem's d_revalidate. If the thing became unhashed
      while we're checking it, then that's up to the VFS to handle anyway.
      
      Fixes: 200fd27c ("ceph: use lookup request to revalidate dentry")
      Link: http://tracker.ceph.com/issues/18041
      Reported-by: Donatas Abraitis <donatas.abraitis@gmail.com>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
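The revalidation rule from the ceph_d_revalidate commit, comparing the inode the lookup returned against the one the dentry already points to, can be sketched in userspace C. All names here (`reval_inode`, `reval_dentry`, `reval_request`, `reval_check`) are hypothetical, and the handling of `-ENOENT` for a negative dentry is an assumption added for illustration rather than something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct reval_inode   { unsigned long ino; };
struct reval_dentry  { struct reval_inode *inode; };         /* NULL = negative */
struct reval_request { struct reval_inode *target_inode; };  /* set by the reply */

/*
 * Return 1 if the dentry is still valid after the request completed with
 * status err, 0 otherwise. No dcache manipulation happens here; the
 * function only compares pointers.
 */
static int reval_check(const struct reval_dentry *dentry,
                       const struct reval_request *req, int err)
{
    assert(dentry != NULL && req != NULL);

    if (err == 0)
        /* Valid only if the reply names the inode the dentry points to. */
        return dentry->inode != NULL && dentry->inode == req->target_inode;

    if (err == -ENOENT)
        /* Assumption: a negative dentry stays valid if the name is absent. */
        return dentry->inode == NULL;

    return 0;
}
```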
  9. 11 Nov, 2016 1 commit
    • ceph: use default file splice read callback · 8a8d5617
      Committed by Yan, Zheng
      Splice read/write implementation changed recently. When using
      generic_file_splice_read(), iov_iter with type == ITER_PIPE is
      passed to filesystem's read_iter callback. But ceph_sync_read()
      can't serve ITER_PIPE iov_iter correctly (ITER_PIPE iov_iter
      expects pages from page cache).
      
      Fixing ceph_sync_read() requires a big patch. So use default
      splice read callback for now.
      
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
  10. 29 Oct, 2016 2 commits
  11. 18 Oct, 2016 3 commits
  12. 16 Oct, 2016 1 commit
  13. 08 Oct, 2016 1 commit
  14. 03 Oct, 2016 6 commits