1. 07 1月, 2011 1 次提交
    • N
      fs: dcache reduce branches in lookup path · fb045adb
      Nick Piggin 提交于
      Reduce some branches and memory accesses in dcache lookup by adding dentry
      flags to indicate common d_ops are set, rather than having to check them.
      This saves a pointer memory access (dentry->d_op) in common path lookup
      situations, and saves another pointer load and branch in cases where we
      have d_op but not the particular operation.
      
      Patched with:
      
      git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i
      Signed-off-by: NNick Piggin <npiggin@kernel.dk>
      fb045adb
  2. 10 8月, 2010 2 次提交
    • C
      check ATTR_SIZE contraints in inode_change_ok · 2c27c65e
      Christoph Hellwig 提交于
      Make sure we check the truncate constraints early on in ->setattr by adding
      those checks to inode_change_ok.  Also clean up and document inode_change_ok
      to make this obvious.
      
      As a fallout we don't have to call inode_newsize_ok from simple_setsize and
      simplify it down to a truncate_setsize which doesn't return an error.  This
      simplifies a lot of setattr implementations and means we use truncate_setsize
      almost everywhere.  Get rid of fat_setsize now that it's trivial and mark
      ext2_setsize static to make the calling convention obvious.
      
      Keep the inode_newsize_ok in vmtruncate for now as all callers need an
      audit for its removal anyway.
      
      Note: setattr code in ecryptfs doesn't call inode_change_ok at all and
      needs a deeper audit, but that is left for later.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      2c27c65e
    • C
      always call inode_change_ok early in ->setattr · db78b877
      Christoph Hellwig 提交于
      Make sure we call inode_change_ok before doing any changes in ->setattr,
      and make sure to call it even if our fs wants to ignore normal UNIX
      permissions, but use the ATTR_FORCE to skip those.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      db78b877
  3. 02 8月, 2010 1 次提交
    • E
      vfs: re-introduce MAY_CHDIR · 9cfcac81
      Eric Paris 提交于
      Currently MAY_ACCESS means that filesystems must check the permissions
      right then and not rely on cached results or the results of future
      operations on the object.  This can be because of a call to sys_access() or
      because of a call to chdir() which needs to check search without relying on
      any future operations inside that dir.  I plan to use MAY_ACCESS for other
      purposes in the security system, so I split the MAY_ACCESS and the
      MAY_CHDIR cases.
      Signed-off-by: NEric Paris <eparis@redhat.com>
      Acked-by: NStephen D. Smalley <sds@tycho.nsa.gov>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      9cfcac81
  4. 28 5月, 2010 1 次提交
  5. 27 11月, 2009 1 次提交
    • C
      fuse: reject O_DIRECT flag also in fuse_create · 1b732396
      Csaba Henk 提交于
      The comment in fuse_open about O_DIRECT:
      
        "VFS checks this, but only _after_ ->open()"
      
      also holds for fuse_create, however, the same kind of check was missing there.
      
      As an impact of this bug, open(newfile, O_RDWR|O_CREAT|O_DIRECT) fails, but a
      stub newfile will remain if the fuse server handled the implied FUSE_CREATE
      request appropriately.
      
      Other impact: in the above situation ima_file_free() will complain to open/free
      imbalance if CONFIG_IMA is set.
      Signed-off-by: NCsaba Henk <csaba@gluster.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      Cc: Harshavardhana <harsha@gluster.com>
      Cc: stable@kernel.org
      1b732396
  6. 04 11月, 2009 1 次提交
  7. 24 9月, 2009 1 次提交
  8. 01 7月, 2009 2 次提交
    • J
      fuse: invalidation reverse calls · 3b463ae0
      John Muir 提交于
      Add notification messages that allow the filesystem to invalidate VFS
      caches.
      
      Two notifications are added:
      
       1) inode invalidation
      
         - invalidate cached attributes
         - invalidate a range of pages in the page cache (this is optional)
      
       2) dentry invalidation
      
         - try to invalidate a subtree in the dentry cache
      
      Care must be taken while accessing the 'struct super_block' for the
      mount, as it can go away while an invalidation is in progress.  To
      prevent this, introduce a rw-semaphore, that is taken for read during
      the invalidation and taken for write in the ->kill_sb callback.
      
      Cc: Csaba Henk <csaba@gluster.com>
      Cc: Anand Avati <avati@zresearch.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      3b463ae0
    • M
      fuse: allow umask processing in userspace · e0a43ddc
      Miklos Szeredi 提交于
      This patch lets filesystems handle masking the file mode on creation.
      This is needed if filesystem is using ACLs.
      
       - The CREATE, MKDIR and MKNOD requests are extended with a "umask"
         parameter.
      
       - A new FUSE_DONT_MASK flag is added to the INIT request/reply.  With
         this the filesystem may request that the create mode is not masked.
      
      CC: Jean-Pierre André <jean-pierre.andre@wanadoo.fr>
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      e0a43ddc
  9. 28 4月, 2009 4 次提交
  10. 02 4月, 2009 1 次提交
    • M
      fuse: allow kernel to access "direct_io" files · f4975c67
      Miklos Szeredi 提交于
      Allow the kernel read and write on "direct_io" files.  This is
      necessary for nfs export and execute support.
      
      The implementation is simple: if an access from the kernel is
      detected, don't perform get_user_pages(), just use the kernel address
      provided by the requester to copy from/to the userspace filesystem.
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      f4975c67
  11. 28 3月, 2009 1 次提交
  12. 26 11月, 2008 3 次提交
  13. 14 11月, 2008 2 次提交
  14. 27 7月, 2008 2 次提交
    • A
      [PATCH] fix MAY_CHDIR/MAY_ACCESS/LOOKUP_ACCESS mess · a110343f
      Al Viro 提交于
      * MAY_CHDIR is redundant - it's an equivalent of MAY_ACCESS
      * MAY_ACCESS on fuse should affect only the last step of pathname resolution
      * fchdir() and chroot() should pass MAY_ACCESS, for the same reason why
        chdir() needs that.
      * now that we pass MAY_ACCESS explicitly in all cases, LOOKUP_ACCESS can be
        removed; it has no business being in nameidata.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      a110343f
    • A
      [PATCH] sanitize ->permission() prototype · e6305c43
      Al Viro 提交于
      * kill nameidata * argument; map the 3 bits in ->flags anybody cares
        about to new MAY_... ones and pass with the mask.
      * kill redundant gfs2_iop_permission()
      * sanitize ecryptfs_permission()
      * fix remaining places where ->permission() instances might barf on new
        MAY_... found in mask.
      
      The obvious next target in that direction is permission(9)
      
      folded fix for nfs_permission() breakage from Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      e6305c43
  15. 26 7月, 2008 3 次提交
  16. 30 4月, 2008 2 次提交
    • M
      fuse: update file size on short read · 5c5c5e51
      Miklos Szeredi 提交于
      If the READ request returned a short count, then either
      
        - cached size is incorrect
        - filesystem is buggy, as short reads are only allowed on EOF
      
      So assume that the size is wrong and refresh it, so that cached read() doesn't
      zero fill the missing chunk.
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5c5c5e51
    • M
      fuse: support writable mmap · 3be5a52b
      Miklos Szeredi 提交于
      Quoting Linus (3 years ago, FUSE inclusion discussions):
      
        "User-space filesystems are hard to get right. I'd claim that they
         are almost impossible, unless you limit them somehow (shared
         writable mappings are the nastiest part - if you don't have those,
         you can reasonably limit your problems by limiting the number of
         dirty pages you accept through normal "write()" calls)."
      
      Instead of attempting the impossible, I've just waited for the dirty page
      accounting infrastructure to materialize (thanks to Peter Zijlstra and
      others).  This nicely solved the biggest problem: limiting the number of pages
      used for write caching.
      
      Some small details remained, however, which this largish patch attempts to
      address.  It provides a page writeback implementation for fuse, which is
      completely safe against VM related deadlocks.  Performance may not be very
      good for certain usage patterns, but generally it should be acceptable.
      
      It has been tested extensively with fsx-linux and bash-shared-mapping.
      
      Fuse page writeback design
      --------------------------
      
      fuse_writepage() allocates a new temporary page with GFP_NOFS|__GFP_HIGHMEM.
      It copies the contents of the original page, and queues a WRITE request to the
      userspace filesystem using this temp page.
      
      The writeback is finished instantly from the MM's point of view: the page is
      removed from the radix trees, and the PageDirty and PageWriteback flags are
      cleared.
      
      For the duration of the actual write, the NR_WRITEBACK_TEMP counter is
      incremented.  The per-bdi writeback count is not decremented until the actual
      write completes.
      
      On dirtying the page, fuse waits for a previous write to finish before
      proceeding.  This makes sure, there can only be one temporary page used at a
      time for one cached page.
      
      This approach is wasteful in both memory and CPU bandwidth, so why is this
      complication needed?
      
      The basic problem is that there can be no guarantee about the time in which
      the userspace filesystem will complete a write.  It may be buggy or even
      malicious, and fail to complete WRITE requests.  We don't want unrelated parts
      of the system to grind to a halt in such cases.
      
      Also a filesystem may need additional resources (particularly memory) to
      complete a WRITE request.  There's a great danger of a deadlock if that
      allocation may wait for the writepage to finish.
      
      Currently there are several cases where the kernel can block on page
      writeback:
      
        - allocation order is larger than PAGE_ALLOC_COSTLY_ORDER
        - page migration
        - throttle_vm_writeout (through NR_WRITEBACK)
        - sync(2)
      
      Of course in some cases (fsync, msync) we explicitly want to allow blocking.
      So for these cases new code has to be added to fuse, since the VM is not
      tracking writeback pages for us any more.
      
      As an extra safetly measure, the maximum dirty ratio allocated to a single
      fuse filesystem is set to 1% by default.  This way one (or several) buggy or
      malicious fuse filesystems cannot slow down the rest of the system by hogging
      dirty memory.
      
      With appropriate privileges, this limit can be raised through
      '/sys/class/bdi/<bdi>/max_ratio'.
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3be5a52b
  17. 24 2月, 2008 1 次提交
  18. 08 2月, 2008 1 次提交
  19. 07 2月, 2008 1 次提交
  20. 30 11月, 2007 4 次提交
  21. 19 10月, 2007 5 次提交