1. 28 Jul, 2016 3 commits
  2. 26 May, 2016 9 commits
  3. 24 Apr, 2016 1 commit
    • ceph: Switch to generic xattr handlers · 2cdeb1e4
      Committed by Andreas Gruenbacher
      Add a catch-all xattr handler at the end of ceph_xattr_handlers.  Check
      for valid attribute names there, and remove those checks from
      __ceph_{get,set,remove}xattr instead.  No "system.*" xattrs need to be
      handled by the catch-all handler anymore.
      
      The set xattr handler is called with a NULL value to indicate that the
      attribute should be removed; __ceph_setxattr already handles that case
      correctly (ceph_set_acl could already call __ceph_setxattr with a NULL
      value).
      
      Move the check for snapshots from ceph_{set,remove}xattr into
      __ceph_{set,remove}xattr.  With that, ceph_{get,set,remove}xattr can be
      replaced with the generic iops.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
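The dispatch order described in this commit (prefix-specific handlers first, a catch-all entry last that validates the name, and a NULL value meaning removal) can be sketched in userspace C. This is only an illustrative model: every `demo_*` name, the one-slot store, and the prefix whitelist are assumptions made for the sketch, not the kernel's or ceph's actual code.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; this is not the kernel's struct xattr_handler. */
struct demo_handler {
    const char *prefix;                               /* "" marks the catch-all */
    int (*set)(const char *name, const char *value);  /* value == NULL removes */
};

static char demo_value[64];     /* one-slot attribute store, enough for a demo */
static int demo_present;        /* 1 while the slot holds a value */
static int demo_system_calls;   /* counts hits on the "system." handler */

/* Assumed whitelist of prefixes the catch-all accepts. */
static int demo_name_valid(const char *name)
{
    static const char *const ok[] = { "user.", "trusted.", "security.", "ceph." };
    for (size_t i = 0; i < sizeof(ok) / sizeof(ok[0]); i++)
        if (strncmp(name, ok[i], strlen(ok[i])) == 0)
            return 1;
    return 0;
}

/* Dedicated handler: "system.*" names never reach the catch-all. */
static int demo_system_set(const char *name, const char *value)
{
    (void)name;
    (void)value;
    demo_system_calls++;
    return 0;
}

/* Catch-all: validates the name, then sets or removes the single slot. */
static int demo_other_set(const char *name, const char *value)
{
    if (!demo_name_valid(name))
        return -EOPNOTSUPP;
    if (value == NULL) {                 /* NULL value means "remove" */
        if (!demo_present)
            return -ENODATA;
        demo_present = 0;
        return 0;
    }
    strncpy(demo_value, value, sizeof(demo_value) - 1);
    demo_value[sizeof(demo_value) - 1] = '\0';
    demo_present = 1;
    return 0;
}

static const struct demo_handler demo_handlers[] = {
    { "system.", demo_system_set },
    { "",        demo_other_set },       /* catch-all must come last */
};

/* Walk the table in order and dispatch to the first matching prefix. */
int demo_setxattr(const char *name, const char *value)
{
    for (size_t i = 0; i < sizeof(demo_handlers) / sizeof(demo_handlers[0]); i++) {
        const struct demo_handler *h = &demo_handlers[i];
        if (strncmp(name, h->prefix, strlen(h->prefix)) == 0)
            return h->set(name, value);
    }
    return -EOPNOTSUPP;                  /* not reached while a catch-all exists */
}
```

Because the empty prefix matches every name, placing the catch-all last is what lets the earlier, more specific handlers take precedence.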
  4. 05 Apr, 2016 1 commit
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Committed by Kirill A. Shutemov
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
      ago with the promise that one day it would be possible to implement the
      page cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized, and it is unlikely that it ever will.
      
      We have many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE.  And it's a constant source of confusion whether a
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      
      There are a few places in the code that coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation will
      also be addressed in a separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
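To see why the rewrite rules in this commit are behavior-preserving, here is a small userspace sketch. It assumes 4 KiB pages and uses hypothetical `DEMO_*` names in place of the kernel macros; the point is only that a shift by `PAGE_CACHE_SHIFT - PAGE_SHIFT` is a shift by zero once the two constants are equal.

```c
#include <stddef.h>

#define DEMO_PAGE_SHIFT        12                        /* assumed 4 KiB pages */
#define DEMO_PAGE_SIZE         (1UL << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_CACHE_SHIFT  DEMO_PAGE_SHIFT           /* the old alias */

/* Before: a page-cache index converted with a shift that is always zero bits. */
unsigned long demo_index_old(unsigned long idx)
{
    return idx << (DEMO_PAGE_CACHE_SHIFT - DEMO_PAGE_SHIFT);
}

/* After: the conversion disappears because it never did anything. */
unsigned long demo_index_new(unsigned long idx)
{
    return idx;
}

/* Byte offset to page index, written with the PAGE_* style constant only. */
unsigned long demo_offset_to_index(unsigned long long pos)
{
    return (unsigned long)(pos >> DEMO_PAGE_SHIFT);
}

/* Round a length up to the next page boundary, in the spirit of PAGE_ALIGN(). */
unsigned long demo_page_align(unsigned long len)
{
    return (len + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);
}
```

With the alias gone, only one family of constants remains, so there is no longer a question of which one applies at the fs/mm boundary.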
  5. 26 Mar, 2016 4 commits
    • ceph: use kmem_cache_zalloc · 99ec2697
      Committed by Geliang Tang
      Use kmem_cache_zalloc() instead of kmem_cache_alloc() with the __GFP_ZERO flag.
      Signed-off-by: Geliang Tang <geliangtang@163.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • ceph: use lookup request to revalidate dentry · 200fd27c
      Committed by Yan, Zheng
      If a dentry has no lease, ceph_d_revalidate() previously returned 0.
      This causes the VFS to invalidate the dentry and create a new dentry
      for later lookups. Invalidating a dentry also detaches any mount points
      underneath it, so a mount point inside cephfs can disappear mysteriously
      (even if the mount point is not modified by other hosts).
      
      The fix is to use a lookup request to revalidate a dentry without a lease.
      This partly solves the disappearing mount point issue (as long as
      the mount point is not modified by other hosts).
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
    • ceph: kill ceph_get_dentry_parent_inode() · 641235d8
      Committed by Yan, Zheng
      Use the VFS helper dget_parent() instead.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
    • ceph: fix security xattr deadlock · 315f2408
      Committed by Yan, Zheng
      When security is enabled, the security module can call the filesystem's
      getxattr/setxattr callbacks during d_instantiate(). For cephfs,
      d_instantiate() is usually called by the MDS dispatch thread while
      handling an MDS reply. If the MDS reply does not include xattrs and the
      corresponding caps, getxattr/setxattr needs to send a new request
      to the MDS and wait for the reply. This puts the MDS dispatch thread to
      sleep, so nobody handles later MDS replies.
      
      The fix is to make sure lookup/atomic_open replies include xattrs and the
      corresponding caps, so getxattr can be served from cached xattrs.
      This requires some modification to both the MDS and the request message.
      (The client tells the MDS what caps it wants; the MDS encodes the proper
      caps in the reply.)
      
      The Smack security module may call setxattr during d_instantiate().
      Unlike getxattr, we can't force the MDS to issue CEPH_CAP_XATTR_EXCL
      to us. So just make setxattr return an error when called by the MDS
      dispatch thread.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
  6. 23 Jan, 2016 1 commit
    • wrappers for ->i_mutex access · 5955102c
      Committed by Al Viro
      parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
      inode_foo(inode) being mutex_foo(&inode->i_mutex).
      
      Please, use those for access to ->i_mutex; over the coming cycle
      ->i_mutex will become rwsem, with ->lookup() done with it held
      only shared.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  7. 25 Jun, 2015 5 commits
    • ceph: rework dcache readdir · fdd4e158
      Committed by Yan, Zheng
      Previously our dcache readdir code relied on the child dentries in a
      directory dentry's d_subdir list being sorted by dentry offset in
      descending order. When adding dentries to the dcache, if a dentry
      already existed, our readdir code moved it to the head of the directory
      dentry's d_subdir list. This design relies on dcache internals.
      Al Viro suggests using ncpfs's approach: keeping an array of pointers
      to dentries in the page cache of the directory inode. The validity of
      those pointers is represented by the directory inode's complete and
      ordered flags. When a dentry gets pruned, we clear the directory inode's
      complete flag in the d_prune() callback. Before moving a dentry to another
      directory, we clear the ordered flag for both the old and new directory.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
    • ceph: switch some GFP_NOFS memory allocation to GFP_KERNEL · 687265e5
      Committed by Yan, Zheng
      GFP_NOFS memory allocation is required for the page writeback path,
      but there is no need to use GFP_NOFS in the syscall path or the readpage
      path.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
    • ceph: fix directory fsync · da819c81
      Committed by Yan, Zheng
      fsync() on a directory should flush dirty caps and wait for any
      uncommitted directory operations to commit. But ceph_dir_fsync()
      only waits for uncommitted directory operations.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
    • ceph: simplify two mount_timeout sites · 5be73034
      Committed by Ilya Dryomov
      No need to bifurcate wait now that we've got ceph_timeout_jiffies().
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Alex Elder <elder@linaro.org>
      Reviewed-by: Yan, Zheng <zyan@redhat.com>
    • libceph: store timeouts in jiffies, verify user input · a319bf56
      Committed by Ilya Dryomov
      There are currently three libceph-level timeouts that the user can
      specify on mount: mount_timeout, osd_idle_ttl and osdkeepalive.  All of
      these are in seconds and no checking is done on user input: negative
      values are accepted, we multiply them all by HZ which may or may not
      overflow, arbitrarily large jiffies then get added together, etc.
      
      There is also a bug in the way mount_timeout=0 is handled.  It's
      supposed to mean "infinite timeout", but that's not how wait.h APIs
      treat it and so __ceph_open_session() for example will busy loop
      without much chance of being interrupted if none of ceph-mons are
      there.
      
      Fix all this by verifying user input, storing timeouts capped by
      msecs_to_jiffies() in jiffies and using the new ceph_timeout_jiffies()
      helper for all user-specified waits to handle infinite timeouts
      correctly.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Alex Elder <elder@linaro.org>
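The two steps this commit describes (reject bad input when parsing, then translate a stored 0 into "wait forever" at the point of use) can be sketched as follows. This is a userspace approximation: the `demo_*` names, the tick rate of 250, the use of `LONG_MAX` as the infinite value, and the exact upper bound are assumptions for illustration, not the libceph code.

```c
#include <errno.h>
#include <limits.h>

#define DEMO_HZ 250L                        /* assumed tick rate */
#define DEMO_MAX_SCHEDULE_TIMEOUT LONG_MAX  /* stand-in for "wait forever" */

/*
 * Validate a user-supplied timeout in seconds and convert it to jiffies.
 * Returns -EINVAL for negative or oversized input; 0 is kept as 0 and
 * means "no timeout" until it is passed through demo_timeout_jiffies().
 */
long demo_seconds_to_jiffies(long seconds)
{
    if (seconds < 0 || seconds > INT_MAX / 1000)
        return -EINVAL;
    return seconds * DEMO_HZ;               /* cannot overflow after the check */
}

/* Map the stored value to what a wait primitive expects. */
long demo_timeout_jiffies(long timeout)
{
    return timeout ? timeout : DEMO_MAX_SCHEDULE_TIMEOUT;
}
```

Keeping the "0 means infinite" translation in one helper is what lets every wait site treat the stored value uniformly instead of special-casing zero.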
  8. 24 Jun, 2015 1 commit
  9. 22 Apr, 2015 1 commit
  10. 20 Apr, 2015 3 commits
  11. 16 Apr, 2015 1 commit
  12. 23 Feb, 2015 1 commit
    • VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) · e36cb0b8
      Committed by David Howells
      Convert the following where appropriate:
      
       (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).
      
       (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).
      
       (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry).  This is actually more
           complicated than it appears as some calls should be converted to
           d_can_lookup() instead.  The difference is whether the directory in
           question is a real dir with a ->lookup op or whether it's a fake dir with
           a ->d_automount op.
      
      In some circumstances, we can subsume checks for dentry->d_inode not being
      NULL into this, provided the code isn't in a filesystem that expects
      d_inode to be NULL if the dirent really *is* negative (i.e. if we're going to
      use d_inode() rather than d_backing_inode() to get the inode pointer).
      
      Note that the dentry type field may be set to something other than
      DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
      manages the fall-through from a negative dentry to a lower layer.  In such a
      case, the dentry type of the negative union dentry is set to the same as the
      type of the lower dentry.
      
      However, if you know d_inode is not NULL at the call site, then you can use
      the d_is_xxx() functions even in a filesystem.
      
      There is one further complication: a 0,0 chardev dentry may be labelled
      DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE.  Strictly, this was
      intended for special directory entry types that don't have attached inodes.
      
      The following perl+coccinelle script was used:
      
      use strict;
      
      my @callers;
      my $fd;
      open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
          die "Can't grep for S_ISDIR and co. callers";
      @callers = <$fd>;
      close($fd);
      unless (@callers) {
          print "No matches\n";
          exit(0);
      }
      
      my @cocci = (
          '@@',
          'expression E;',
          '@@',
          '',
          '- S_ISLNK(E->d_inode->i_mode)',
          '+ d_is_symlink(E)',
          '',
          '@@',
          'expression E;',
          '@@',
          '',
          '- S_ISDIR(E->d_inode->i_mode)',
          '+ d_is_dir(E)',
          '',
          '@@',
          'expression E;',
          '@@',
          '',
          '- S_ISREG(E->d_inode->i_mode)',
          '+ d_is_reg(E)' );
      
      my $coccifile = "tmp.sp.cocci";
      open($fd, ">$coccifile") || die $coccifile;
      print($fd "$_\n") || die $coccifile foreach (@cocci);
      close($fd);
      
      foreach my $file (@callers) {
          chomp $file;
          print "Processing ", $file, "\n";
          system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
      	die "spatch failed";
      }
      
      [AV: overlayfs parts skipped]
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  13. 19 Feb, 2015 3 commits
    • ceph: return error for traceless reply race · 4d41cef2
      Committed by Yan, Zheng
      When we receive a traceless reply for a request that created a new inode,
      we re-send a lookup request to the MDS to get information about the newly
      created inode. (The VFS expects the FS callback to return an inode in the
      create case.) This breaks one request into two requests. Another client
      may modify or move the new inode in the middle.
      
      When the race happens, ceph_handle_notrace_create() unconditionally
      links the dentry for the 'create' operation to the inode returned by the
      lookup. This may confuse the VFS when the inode is a directory (the VFS
      does not allow multiple linkages for a directory inode).
      
      This patch makes ceph_handle_notrace_create() return an error when it
      detects a race. This event should be rare and happens only when we talk
      to an old MDS. Recent MDS versions do not send a traceless reply for a
      request that creates a new inode.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
    • ceph: fix dentry leaks · 5cba372c
      Committed by Yan, Zheng
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
    • ceph: provide separate {inode,file}_operations for snapdir · 38c48b5f
      Committed by Yan, Zheng
      Remove all unsupported operations from {inode,file}_operations.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
  14. 18 Dec, 2014 2 commits
    • ceph: fix mksnap crash · 275dd19e
      Committed by Yan, Zheng
      A mksnap reply contains only 'target'; it does not contain 'dentry'. So
      it's wrong to use req->r_reply_info.head->is_dentry to detect a traceless
      reply.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
    • ceph: introduce a new inode flag indicating if cached dentries are ordered · 70db4f36
      Committed by Yan, Zheng
      After creating/deleting/renaming a file, the offsets of sibling dentries
      may change, so we cannot use cached dentries to satisfy readdir. But we
      can still use the cached dentries to conclude -ENOENT for lookup.
      
      This patch introduces a new inode flag indicating whether child dentries
      are ordered. The flag is set at the same time a directory is marked
      complete. After creating/deleting/renaming a file, we clear the flag on
      the directory inode. This prevents ceph_readdir() from using cached
      dentries to satisfy the readdir syscall.
      Signed-off-by: Yan, Zheng <zyan@redhat.com>
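The relationship between the two flags in this commit can be illustrated with a small sketch that treats the flags as plain bits. The `DEMO_*` constants and `demo_*` functions are hypothetical names chosen for the example; they are not the identifiers used in the ceph client.

```c
#include <stdbool.h>

#define DEMO_I_COMPLETE 0x1u   /* every child dentry is known to be cached */
#define DEMO_I_ORDERED  0x2u   /* cached children are still in readdir order */

/* Marking a directory complete also marks it ordered. */
unsigned demo_mark_complete(unsigned flags)
{
    return flags | DEMO_I_COMPLETE | DEMO_I_ORDERED;
}

/* create/delete/rename may shift sibling offsets: drop only the ordered bit. */
unsigned demo_dir_changed(unsigned flags)
{
    return flags & ~DEMO_I_ORDERED;
}

/* readdir may be served from the cache only if both bits are set. */
bool demo_can_readdir_from_cache(unsigned flags)
{
    unsigned need = DEMO_I_COMPLETE | DEMO_I_ORDERED;
    return (flags & need) == need;
}

/* A lookup miss can still be answered with -ENOENT if the dir is complete. */
bool demo_can_conclude_enoent(unsigned flags)
{
    return (flags & DEMO_I_COMPLETE) != 0;
}
```

Splitting the state into two bits is what lets a namespace change disable cached readdir without giving up the cheaper negative-lookup shortcut.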
  15. 20 Nov, 2014 2 commits
  16. 04 Nov, 2014 1 commit
  17. 15 Oct, 2014 1 commit